Andy Brave

Who's that bug?

Updated: Apr 23, 2022




Have you ever been stuck on a bug with no idea how to deal with it?


Unfortunately, this entry won’t solve your specific problem, but maybe it can help you with new ideas.

First things first: before starting the quest, let’s categorize bugs so you can recognize them and get new ideas for how to get rid of them. Here we go.


Heisenbug

It appears when you are not looking into the code and disappears when you start to debug.


Known causes

Sometimes, when you are debugging, you introduce unwanted side effects, deliberately or not. For example, some compilers don’t apply the same memory optimizations in debug builds, or the difference could be your own doing if you are not debugging in the same environment. For example, you could have fewer threads than in production, or your collections may not hold the same data. [1]


How to deal with it?

“Reduce the problem code.”

Try to eliminate sections of the code by any means necessary. Don’t assume the error is in the section you are looking at. Frequently, you will be surprised by the things that happen in parts of the code you consider correct.


I know there is a temptation to skip these checks, but some extra scaffolding to narrow down the possible problems is almost always valuable on hard debugging problems. You can never be too paranoid. Check all passed parameters for validity. Do range checks on indices. Verify that a pointer to foo actually points to an object of type foo. Check for things that “can’t happen.”
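Here is a minimal sketch of that kind of paranoid scaffolding in Python. The class Foo and the function read_item are hypothetical names made up for this illustration; the point is only to show a type check, a parameter check, and a range check in one place.

```python
class Foo:
    """Hypothetical type used only for this illustration."""

    def __init__(self, values):
        self.values = list(values)


def read_item(obj, index):
    """Return obj.values[index], checking the things that "can't happen"."""
    # Verify that the object really is a Foo, not just something Foo-like.
    if not isinstance(obj, Foo):
        raise TypeError(f"expected Foo, got {type(obj).__name__}")
    # Check the parameter type (bool is a subclass of int, so reject it too).
    if not isinstance(index, int) or isinstance(index, bool):
        raise TypeError(f"index must be an int, got {type(index).__name__}")
    # Range check: without it, Python would quietly accept negative indices.
    if not 0 <= index < len(obj.values):
        raise IndexError(f"index {index} is outside 0..{len(obj.values) - 1}")
    return obj.values[index]
```

Failing loudly at the boundary means the error shows up next to its cause instead of three function calls later.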

Reduce the amount of data shared between threads. We are using a message passing interface, where each thread more or less has its own data. This has been a big win. We often copy data before passing it on to the next thread, just to make sure.
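As a rough illustration of the copy-before-passing idea, here is a sketch using Python's standard queue and threading modules. The names worker and run_demo and the message layout are assumptions for this example, not the setup described in the quote above.

```python
import copy
import queue
import threading


def worker(inbox, outbox):
    """Consume messages until the None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        # The worker owns this copy, so mutating it cannot affect the sender.
        message["values"].append(sum(message["values"]))
        outbox.put(message)


def run_demo():
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()

    original = {"values": [1, 2, 3]}
    # Copy before handing over: the two threads never share the same object.
    inbox.put(copy.deepcopy(original))
    inbox.put(None)  # sentinel: tell the worker to stop
    thread.join()

    return original, outbox.get()
```

The copy costs some memory and time, but the sender's data stays untouched no matter what the worker does with its message.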

“Heisenbugs” are almost always the result of memory management bugs. Array boundaries are being overrun, null pointers are being accessed, freed memory is being used, or some other memory is being misused. It’s a prevalent problem.

Another solution, and one I have found to be *incredibly* useful, is to use a memory profiler. There are loads of memory profilers on the market today.[2]


An excerpt from the book Python Tricks that may help


if cond == 'x':
    do_x()
elif cond == 'y':
    do_y()
else:
    # Adjacent string literals are joined as-is, so each piece
    # needs its own trailing space.
    assert False, (
        'This should never happen, but it does '
        'occasionally. We are currently trying to '
        'figure out why. Email Andy if you '
        'encounter this in the wild. Thanks!')

Is this ugly? Well, yes. But it’s definitely a valid and helpful technique if you’re faced with a Heisenbug in one of your applications.


Bohrbug

A Bohrbug is a classification of an unusual software bug that always produces a failure on retrying the operation that caused the failure. The Bohrbug was named after the Bohr atom because it represents a solid and easily detectable bug that can be isolated by standard debugging techniques. Contrast this with the Heisenbug.[3]


A Bohrbug is pretty easy to localize and pinpoint to a specific part of a codebase. This is a massive boon for developers since we can reliably find and then fix the Bohrbug, as annoying as it might be![4]


Known causes

Your own code. Examples: an if-else block with an untested path, a loop that goes outside its boundaries or ends prematurely, etc. This is the kind of stuff you can control.


How to deal with it?

Generally, these types of bugs are easily detected by your unit tests, since they are within the scope of your code and your environment.
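To make that concrete, here is a small sketch with the standard unittest module. The functions classify and running_total are hypothetical stand-ins for "an if-else block" and "a loop with boundaries"; the tests simply visit every branch and the edge cases of the loop.

```python
import unittest


def classify(n):
    """Hypothetical function with three paths; each one needs a test."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


def running_total(values):
    """Return the cumulative sums of values."""
    totals = []
    current = 0
    for i in range(len(values)):  # an off-by-one here would be a Bohrbug
        current += values[i]
        totals.append(current)
    return totals


class BohrbugTests(unittest.TestCase):
    def test_every_branch(self):
        self.assertEqual(classify(-5), "negative")
        self.assertEqual(classify(0), "zero")
        self.assertEqual(classify(7), "positive")

    def test_loop_boundaries(self):
        self.assertEqual(running_total([]), [])
        self.assertEqual(running_total([4]), [4])
        self.assertEqual(running_total([1, 2, 3]), [1, 3, 6])
```

Saved in a file, this can be run with python -m unittest. Because a Bohrbug fails every time, a test that reaches the broken path fails every time too.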


Mandelbugs

Mandelbugs are faults triggered by complex conditions, such as interaction with hardware and other software and timing or order of events. These faults are considerably challenging to detect with traditional testing techniques since controlling their complex triggering conditions in a testing environment can be challenging. Therefore, it is necessary to adopt specific verification and/or fault-tolerance strategies to deal with them cost-effectively.


Known causes

Interaction with other services, methods, APIs, or systems.

Sequencing of inputs, events, and operations (the inputs could have been run in a different order, and at least one of the other orders would not have led to a failure).

Timing of inputs, events, and operations (relative to each other, the system runtime, or calendar time).

In most cases, the complexity of the triggering conditions of Mandelbugs makes them difficult to isolate and significantly increases the efforts for systematically reproducing the failures caused by these faults.[5]


How to deal with it?

This bug is a really hard one. In my experience, once you have checked all your code and you are sure things are happening correctly in your local or QA environment, you must try all the external sources. Try to disconnect services outside of your code, reboot while your code is running, and send SIGKILL and SIGTERM signals to your system to learn how your program responds to these events.

As Dr. House would say, try to make the patient sicker.

Also, you must check on your infrastructure. Do your disks have enough space? Does your server have enough available memory? What about the threads on your DB?
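The disk question is easy to automate with the standard library. This is a small sketch; the function name disk_has_room and the 10% threshold are arbitrary assumptions. Memory and database threads need tools specific to your platform and your DB, so they are left out here.

```python
import shutil


def disk_has_room(path=".", min_free_fraction=0.10):
    """Return True if the filesystem holding path has enough free space.

    The 10% default threshold is an arbitrary assumption for this sketch.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return False  # cannot tell, so treat it as a problem
    return usage.free / usage.total >= min_free_fraction
```

A check like this can run at startup or on a schedule and log a warning before a full disk turns into a mystery failure.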


Hindenbug

A Hindenbug is a catastrophic bug that destroys data and may shut down systems or cause significant problems with an IT system. It is a general IT slang word for a big bug that does more than just create a nuisance or an annoyance for users. This kind of bug can end in the loss of lives and money.[6]


Known causes

Poor exception handling and error propagation, which ends with transactions being committed prematurely or data not being preserved correctly. Sometimes, systems are not good at dealing with crashes, which causes miscommunication with databases or incorrect end signals being sent. Generally, once this bug occurs, you can diagnose what happened and prevent it from happening again.


How to deal with it?

Remember Murphy’s law, “Anything that can go wrong will go wrong,” so try to prevent it in as many ways as you can: always have a fail-safe for your essential processes. Don’t forget to catch all the errors, send warnings, and monitor them.
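As one possible fail-safe against premature commits, here is a sketch using the standard sqlite3 and logging modules. The accounts table, the function names, and the logger name are all assumptions for this example. The idea is that the transaction commits only if every step succeeds, and the failure is logged so monitoring can pick it up.

```python
import logging
import sqlite3

logger = logging.getLogger("transfers")  # hypothetical logger name


def transfer(conn, src, dst, amount):
    """Move amount between accounts; commit only if every step succeeds."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            balance = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()[0]
            if balance < 0:
                raise ValueError(f"insufficient funds in {src!r}")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except (sqlite3.Error, ValueError):
        # Record the failure so monitoring can pick it up.
        logger.exception("transfer %s -> %s failed; rolled back", src, dst)
        return False


def make_demo_db():
    """Build an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn
```

With the demo database, a transfer of 30 from alice to bob goes through, while a transfer of 500 is rolled back and leaves both balances as they were.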


Higgs-bugson

A hypothetical bug predicted to exist based on a small number of possibly related event log entries and vague anecdotal reports from users. Still, it is difficult (if not impossible) to reproduce on a dev machine because you don’t really know if it’s there and, if it is there, what is causing it.[7]


Known causes

Software engineers often deny the existence of the Higgs-Bugson and offer alternative theories that often blame the user. Engineers, after all, don’t write bugs.


How to deal with it?

Check the user's behavior. Sometimes users can teach you how to reproduce the bug and when it occurs. Ask them to recall all the steps, the data consulted, and the values used in the operation.

If possible, also check what else was running at the same time: maybe a process that runs concurrently touches the same place and generates the error.


Final thoughts

Although there isn’t a silver bullet for every bug, try to be consistent and orderly in your debugging process. This will help you narrow the scope and finally find the actual line of code causing the bug.

If you can, talk to another developer. This will help you get new ideas that lead to the solution.


If you come here after a break, it’s time to go for that bug. You’ll do great, traveler!


Sources:

[1] Definitions. https://en.wikipedia.org/wiki/Heisenbug
[2] Experience of many developers: https://ask.slashdot.org/story/01/04/11/2017242/how-do-you-deal-w-heisenbugs
[3] https://www.webopedia.com/definitions/bohrbug/
[4] https://medium.com/baseds/weeding-out-distributed-system-bugs-28a01e37f70c
[5] Carrozza, Gabriella & Cotroneo, Domenico & Natella, Roberto & Pietrantuono, Roberto & Russo, Stefano. (2013). Analysis and Prediction of Mandelbugs in an Industrial Software System. Proceedings - IEEE 6th International Conference on Software Testing, Verification and Validation, ICST 2013. 10.1109/ICST.2013.21.
[6] https://www.techopedia.com/definition/31877/hindenbug
[7] https://blog.codinghorror.com/new-programming-jargon/

