Doodles And Twister
Documents
v 2.10

Regular Bugs
Easy bugs are just that, easy. Easy to produce, easy to find and easy to fix. These are the bugs you find in development and before release. They don't require much thought to correct, as they just kind of jump out at you and say "silly you, you indexed by the wrong value here".

There is not much of a mindset or philosophy required to find and correct these types of issues.
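
As a small illustration of the "indexed by the wrong value" kind of bug, here is a minimal sketch in Python. The function names and data are hypothetical, made up for this example. The buggy version fails with an exception whose traceback points straight at the offending line, which is what makes this class of bug easy.

```python
prices = [4.50, 3.25, 7.00]
quantities = [2, 1, 3]


def total_buggy(prices, quantities):
    total = 0.0
    for i in range(len(prices)):
        # Bug: indexes quantities by i + 1 instead of i.
        total += prices[i] * quantities[i + 1]
    return total


def total_fixed(prices, quantities):
    # Pair each price with its own quantity; no manual indexing to get wrong.
    return sum(p * q for p, q in zip(prices, quantities))


try:
    total_buggy(prices, quantities)
except IndexError as err:
    # The exception (and its traceback) names the exact line at fault.
    print("easy bug found:", err)

print(total_fixed(prices, quantities))
```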

Nasty Bugs
Really hard bugs are the bugs created either by the coding done on the project or by the interaction of the product with the computer system (and all its processes) that it runs on. These are the bugs that development can and should do something about. That being said, there is still much that affects the debugging process, both positively and negatively.

A mindset that something is wrong
When something goes wrong, the first step is acknowledging that something is wrong. The report could have come from the field, a web user in Bulgaria or the 6-year-old son of a tester, but something did or did not occur as expected. Whatever the source, the impact on the user was great enough that it was reported.

If the reported bug cannot be duplicated, it should be researched a little deeper. You never know when an opportunity will present itself to correct an error at the earliest possible time. Just as important is gathering information about the configurations users have on their systems and discovering issues that may never have been thought of before. Users are more than happy to provide information that will make the product better. On the plus side is the personal rapport a user will feel, knowing that the product's company is concerned enough about the average user to research any problems they are having. Even if the issue is never resolved, this concern and goodwill will replicate itself in the product's community.

Downstream
The response back to the user should never be that it's "not a bug". It may not be a code bug but rather a lack of clarity in the documentation, support, wording, placement in the window, etc., but it is still an issue for that user. This person has spent part of their precious time reporting what to them is a bug. This behavior should be rewarded, as it improves insight into what the users' expectations are and how the product is used. Any lesser response will reduce feedback from users and injure future sales prospects.

Fault of the unknown
To paraphrase a political statement: "Don't fault you, don't fault me, fault that fella behind that tree". Programmers are confident about their code. They have to be, otherwise great software development could not be accomplished. There is a downside to this confidence though, and it shows when bugs are found in a system during integration testing or later. If the issue is one that cannot be tracked to a specific area (memory leak, delayed crashing, general instability), then this mindset of confidence can mislead and affect the debugging process. Logic is replaced by personal feeling and the wild goose chase begins.

Confidence allows one to overcome struggles and difficulties to produce great code. It also clouds the vision of our own weaknesses, of which fallibility is one. Programmers can be so confident in their own code that it's difficult for them to comprehend that they may be the source of current issues.

Run-of-the-mill bugs can be easily tracked by stack tracing, exception information or stepping through the code while it runs. But general instability, intermittent failures and random faults are another matter. These are the unknown faults that programmers have a hard time dealing with.

It's during these times of stress and difficulty in debugging that differences in mindsets are magnified. Normally everyone is coding happily until all code needs to be inspected for faults in logic or coding. In a perfect world, everyone's code would go through code reviews on a regular basis, but in small shops or when delivery pressure is mounting, reviews are one of the first processes to get dropped. The code appears to be working and progress is being made, so the thought is "why disrupt coding when everything is working?". The reason is that code may work fine in unit testing but fail in mysterious places during integration testing or, worse, after a release into the wild.

The programmer responses to these types of "unknown" errors range from denial, to "it's not my code", to "I think it's an OS error", to "it must be something we did". Only the last response is the logical one and the only one that will lead to a solution. The others will eventually get to the same destination, but only after long delays, bruised egos and the creation or enlargement of chasms between staff members.

Denial is the most unfortunate response. People have reported one of these unknown faults and unless it is found, understood and corrected, it will continue to dog the product. Denial does nothing but increase the cost of resolution and undermine confidence in the product. This mindset will continue until the evidence mounts and the continuing pain forces a rethinking of the denial strategy. At this point the shift can be to the "fault the unknown" or to the "something we did" mindset.

The "fault the unknown" mindset is still a viewpoint that the product code is OK, but that there is a mystery bug in some code (OS, development tools, etc.) outside the development team's control. It's not that this is an invalid occurrence (it can happen), but more likely this mindset is just a temporary stop on the way to the "something we did" realization. It's hard to justify this position without some hard evidence. Hearsay (newsgroups, forums, etc.), rumors and hunches are not hard evidence for blaming another company's product for your problem. This strategy will continue until the programmer's statements get so unbelievable that even they no longer believe what they are suggesting. At this point there is only one stop left in the mindset shuffle, and that's the "something we did" mindset.

Only at the "something we did" mindset will progress be made on the seemingly unknown error. But this will only work if all involved are at this level. Because of the unknown nature of the bug, every programmer who has a piece of code in the project must be sincere in re-evaluating their own and each other's code. A programmer who is not of this viewpoint will only be evaluating their code for the aspects that work and not for side effects that could lead to instability or possible errors. Heck, some programmers don't even know how their own code works, let alone what ripple effects it may cause.

At this level, each programmer should actively evaluate all code in the product. Some programmers may still hold on to the "it's not my code" mindset, but when things really go wrong, everyone's code must be evaluated for errors.

Whom do you trust?
For most errors the debugging process is just to find the line of code and fix it, then repeat for all bugs listed. Most bugs are easily fixed: a check here, more logic there and that's all there is. Larger bugs that take a day or longer to understand will expose the thought process of the developer. This thought process I call "whom do you trust".

The debugging mindset listed above is really a thought process of looking at all of the code (yours, others, OS, drivers, etc.) and developing a level of trust of each of those pieces of code that interact with the code being debugged.

The logical questions that should be asked of each section of code (including your own) are:

  • What is my confidence that this code is free of the defect I am looking for?

  • What is my confidence that this code has been completely tested?

  • Are others confident of this code?

When you answer these questions about your code, the operating system and the language runtime, the answers should point to your own code as the likely source of any errors. But it's amazing how many programmers will fault the OS or language runtime as the possible source over their own.

Logic would seem to state that new code is the least tested and most likely to contain errors, rather than, say, the current Java VM. The order for searching for your bug should flow from least tested (newest code) to most tested (mature code).
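
The three questions and the search order can be sketched in a few lines of Python. This is only an illustration: the `Suspect` class, the 0 to 1 confidence scores and the plain average used for `trust` are all assumptions made for this sketch, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    name: str
    defect_free: float      # confidence this code lacks the defect (0 to 1)
    tested: float           # confidence it has been completely tested (0 to 1)
    peer_confidence: float  # how confident others are in it (0 to 1)

    def trust(self) -> float:
        # Simple average of the three answers; the weighting is an assumption.
        return (self.defect_free + self.tested + self.peer_confidence) / 3


def search_order(suspects):
    """Return suspects from least trusted (search first) to most trusted."""
    return sorted(suspects, key=lambda s: s.trust())


# Hypothetical scores, for illustration only.
suspects = [
    Suspect("operating system", 0.95, 0.95, 0.95),
    Suspect("language runtime", 0.95, 0.90, 0.95),
    Suspect("our code from last year", 0.80, 0.70, 0.70),
    Suspect("our code from this week", 0.50, 0.20, 0.30),
]

for rank, suspect in enumerate(search_order(suspects), start=1):
    print(rank, suspect.name)
```

With scores like these, this week's code lands at the top of the search list and the operating system at the bottom. The exact numbers matter less than being honest when assigning them to your own code.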

Example
A programmer had just switched the web server from Tomcat to BEA's WebLogic for the first time. There were some port conflicts in the default settings, so the programmer went to work editing the configuration files to get things to work. The application seemed to work, but then started to hang in random places after a few minutes. This was quite puzzling and soon the statement arose, "maybe it's a bug in the WebLogic server".

Hmm: new code, the first time ever installing WebLogic, a move from a JSP/Servlet server to a leading J2EE application server, and you really think your code is more trustworthy? Whom do you trust more?

(c) 2008 Doodles & Twister. Questions?