Look Where You Touched

Software has a peculiar property rarely seen in the physical world — if you get it working once, it’ll just keep on working. The closest we get in the physical world is perhaps a high quality clock.

This has one important consequence. If a given piece of code is working to spec, it will not suddenly, of its own volition, become buggy. When a bug does show up, it is most likely due to one or more of:

  • a change in the code itself
  • a change in some other part of the system that affects how this code is used
  • a latent bug within the code that has never manifested until now because the full range of possible inputs was never encountered

This implies that whenever a bug comes up in code that until now was working just fine, the question to ask is “what changed?”

For the first two causes of bugs above, git history is a good start. If you’re in the middle of making changes and a bug comes up, most likely something in your changes caused it. It is tempting to think that you have discovered a bug in a library, but libraries tend to be well-used and well-tested, so the odds are against the library being at fault.
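When the history is long, the "what changed?" question can be answered by binary search over the changes. The sketch below is only an illustration: `first_bad_change`, the `history` list, and the `is_bad` check are hypothetical names, and it assumes the changes are ordered oldest to newest, the newest one shows the bug, and the bug persists once introduced.

```python
from typing import Callable, Sequence


def first_bad_change(changes: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest change for which is_bad() is true.

    Assumes changes are ordered oldest to newest, the last one is bad,
    and every change after the culprit is bad as well.
    """
    lo, hi = 0, len(changes) - 1  # invariant: changes[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(changes[mid]):
            hi = mid        # bug already present here; look earlier
        else:
            lo = mid + 1    # still working here; the culprit is later
    return changes[lo]


# Hypothetical history in which the bug was introduced by "c5".
history = ["c1", "c2", "c3", "c4", "c5", "c6", "c7"]
culprit = first_bad_change(history, lambda c: int(c[1:]) >= 5)
print(culprit)  # c5
```

In practice `git bisect` automates the same search over real commits; the point here is only that a reliable "is it broken at this change?" check narrows the suspect down in a logarithmic number of steps.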

If the first two cases are the kinds of unfortunate bugs that you bring upon yourself, the third case is the “interesting” case. Running into a latent bug that’s been there for weeks or months, but never manifested because of some chance restriction of the input domain, can be fairly frustrating. In large part this is because you’ve trained yourself to expect that most bugs happen as a result of your own changes[1], so it takes a while before you convince yourself to actually check the implementation of some part of the code you haven’t touched.
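A small, made-up illustration of such a latent bug: all function names below are hypothetical. `average` has always been wrong for empty input, but its only caller happened to filter that case out, so the bug stayed hidden until a new caller arrived.

```python
def average(values):
    # Latent bug: raises ZeroDivisionError when values is empty.
    return sum(values) / len(values)


def report(values):
    # The original caller. Its guard accidentally restricts the inputs
    # that average() ever sees, so the bug never manifests.
    if not values:
        return "no data"
    return f"avg={average(values):.1f}"


def summarize(batches):
    # Code added later. It calls average() directly, without the guard,
    # and so widens the input domain to include empty batches.
    return [average(batch) for batch in batches]


print(report([]))         # no data
print(report([1, 2, 3]))  # avg=2.0

try:
    summarize([[1, 2], []])
except ZeroDivisionError:
    print("latent bug in average() surfaced")
```

The failure shows up right after `summarize` is written, yet the faulty line lives in `average`, which nobody touched. Looking where you touched still helps: the new call site shows which assumption was removed.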

One important corollary: do not touch code if you can avoid it. In particular, do not upgrade libraries unless you actually need something from the new version, since touching code necessarily increases the risk of bugs. Note that this is not a license to avoid cleanup, refactoring, and all the other things that keep the code base healthy and fear-free. These two pieces of advice are at odds, so weigh the return on investment to choose which one to follow in each situation.

[1] In this case it may indirectly be a result of your own changes: the change you made may have removed an artificial restriction on the input domain of a function, and thus exposed the latent, until now undiscovered, bug.
