Sunday, June 15, 2014

Bad Code Considered Harmful

If you are not familiar with the very brief article, "with Statement Considered Harmful", this post will not make much sense to you. While not particularly well-known, it has been one of the more influential "Considered Harmful" articles. The JSLint tool forbids the use of with without giving any option to permit it as a direct result of this article.

For the most part, javascript conventions don't have much of an impact on how I write code. I have yet to encounter anything that can be written better in pure javascript than it can in CoffeeScript, LiveScript, or clojurejs (or even TypeScript, though I am emphatically not a fan of explicitly statically typed languages -- implicit static typing like Haskell's is great, though). For the most part, I am convinced that you can get much cleaner, better code by precompiling something else to js than you can by actually writing code in js. In addition, every application I have spent much time working on sent a lot of data over the wire and did more of its processing on the server than in the client. The slightly larger files and slower performance you get from these languages are tiny costs compared to the differences that come from, for example, making sure that image resolution is no bigger than it needs to be.

If you can't write code where you can keep track of what properties you know an object has, you probably don't have much of a future in programming. After all, if ooo.eee.oo.ah_ah.ting.tang doesn't have a walla property, you're going to get an error when you try to access ooo.eee.oo.ah_ah.ting.tang.walla.walla, because you can't access properties of undefined.
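
To make that concrete (the names here are obviously made up):

var tang = { ting: {} };  // no walla property anywhere in sight
tang.ting.walla;          // undefined -- no error yet
tang.ting.walla.walla;    // TypeError: Cannot read property 'walla' of undefined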

Secondly, one of the great misfortunes of javascript in general is that it does a very poor job of handling scope. This isn't a deficiency of the with statement; it's a problem with javascript in general. Variables default to global scope when you practically always want them to be local; there are no nested scopes for conditions and loops when declaring variables; there's no way to isolate a namespace to a file; etc. (That last one applies to browser javascript; Node has cleaned it up.)
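
A couple of quick illustrations of those first two complaints (the names are made up for illustration):

function count_to(n) {
    for (i = 0; i < n; i++) {  // forgot var: i silently becomes a global
        // ...
    }
}

if (true) {
    var leaky = "inner";  // var is function-scoped, not block-scoped
}
console.log(leaky);       // "inner" -- the declaration escaped the block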

There are two fundamental problems with javascript's scoping. One is that it enables you to easily make mistakes; the other is that it enables you to write bad code. A mistake is code you wrote that behaves differently from what you expected it to. If you forget to specify a local scope for your variable, you will end up with a global variable, and possibly with bad behavior. This kind of thing is in general pretty easy to test for, pretty easy to catch, and pretty easy to correct. The ease of debugging becomes double plus true in this sort of situation when it is based on a known pitfall of the language that you routinely check for. Bad code is code from which it is very difficult or impossible to discern from reading it what you intended the code to do.

This is exactly the thing that the rant I linked to was complaining about. However, there is a key distinction between what I'm saying and what that rant was saying.

The key distinction is the word enables. Javascript does not force you to write bad code, any more than it forces you to misscope your variables. Some languages do force you to write bad code. I don't want that last sentence to turn into a debate, so I'll use INTERCAL as an example of a programming language that forces the programmer to write exclusively bad code. It does so by design, because it was created for entertainment value rather than usefulness. In my opinion, plenty of other languages accidentally force you to write bad code as well, but that's another rant for another day.

If you find yourself needing to set a global variable while programming in Python, you are forced to use a syntax that makes that intention clear. In javascript, you have the freedom to just set it. However, you also have the freedom to clearly indicate that you mean to set a global variable.

If you are programming for a browser, you can use the following function to clearly indicate when you intend to set a global variable.

function SET__GLOBAL(var_name, value) {
    // in the browser, globals are properties of window,
    // so this assignment makes the intent explicit
    window[var_name] = value;
}
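
With that helper, the intent is obvious at the call site (app_config is just a made-up example name):

SET__GLOBAL("app_config", { debug: true });  // clearly deliberate
app_config = { debug: true };                // a typo? a forgotten var? who knows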

(Or you could use a comment!)

Is this something you should do regularly? Probably not. Are there cases when you should do it? Almost certainly.

There is one good reason to set global variables. There is also one good reason to use the with statement, and it's the same as the only good reason to use practically any feature of any language. It's also the only consideration that actually matters 90+% of the time you are picking what language or toolset to code something in: choosing that language, that toolset, or that feature makes your code easier to write, easier to read, and easier to maintain.

For picking a language, the libraries and tools available do make a little difference, but not much. You will be hard pressed to find a readable language in reasonably common use today that doesn't have good libraries for practically every task you want a good library for. If the syntax of a language is well-suited to your task, you can rest assured that somebody else has noticed the same thing and coded a library to help with whatever you are trying to do. (This isn't to say that you should necessarily use that library; if the syntax is really well-suited to what you are trying to do, then sometimes it's a lot easier to build something up from scratch than it is to learn how to use somebody else's library, especially when that library approximates your goals but doesn't perfectly address them.)

When does the with statement make code better? It can in situations like the following. Suppose you are writing a lot of functions that make extensive use of trigonometry; you might want to do something like this:

var MathFunctions = {};
with (Math) {
    // inside this block, sin and cos resolve to Math.sin and Math.cos
    MathFunctions.should_I_trust_floating_point_equality = function(x) {
        var one = sin(x) * sin(x) + cos(x) * cos(x);
        return (one === 1);
    };
}
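
For contrast, here is the same function without with, where every trigonometric reference has to be fully qualified:

MathFunctions.should_I_trust_floating_point_equality = function(x) {
    var one = Math.sin(x) * Math.sin(x) + Math.cos(x) * Math.cos(x);
    return (one === 1);
};

With one short function the difference is cosmetic; across a file full of trigonometry, it adds up.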

Incidentally, I've mentioned a way to clearly identify that you intend to set a global variable, but that's only half the problem that can arise when you are using the with statement. What if you need to set an attribute of Math? You can use Math.attribute. What if Math has a Math attribute? Then before your with statement you can do Math__Accessor = Math;. For the sake of consistency, that last option should probably be the default (or you could turn it into a function like the one mentioned previously). I can't necessarily think of a reason why you would ever want to do some of these things, but in general, I think having the freedom to do them, along with the discipline to do them right, results in much better code than proscribing their use altogether.
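
Here is a sketch of those escape hatches side by side (fudge_factor is a made-up property):

var Math__Accessor = Math;  // grab an unambiguous reference before entering the with
with (Math) {
    Math.fudge_factor = 0.001;            // works, unless Math ever gains a Math property
    Math__Accessor.fudge_factor = 0.002;  // unambiguous no matter what Math contains
}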



I cannot think of any reason you would ever want to write javascript without surrounding it with

with (_) { ... }

(referring to either lodash or underscore). Of course, if you're going to do that, you might as well just go a step further and use CoffeeScript or LiveScript, unless you have some performance-critical processing to do client-side. I do write javascript, and I don't do this. I do want to, at least assuming I can't use either of the languages I just mentioned. For a substantial part of every week, I am paid to write code that conforms to given instructions. Those instructions have technical constraints due to "business considerations." I personally believe that good developers tend to be fluent enough in coding, in general, that they have an easier time working with a codebase that is halfway well written because half of it is written with modern tools and practices than with a codebase that is entirely poorly written for the sake of maintaining consistency with the half that was written badly a few years ago.
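
For instance, a sketch of what that looks like (note that with is disallowed in strict mode, so this only works in sloppy-mode code):

with (_) {
    // filter and map resolve to _.filter and _.map via the with scope
    var evens   = filter([1, 2, 3, 4, 5, 6], function(n) { return n % 2 === 0; });
    var doubled = map(evens, function(n) { return n * 2; });  // [4, 8, 12]
}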


I don't think I write lousy code for fun (when I'm writing code without being constrained by a lot of rules), but I do get paid to write lousy code. So I write a lot of lousy code. I go back to management routinely and say that, given the spec, which always tells me what language I must use and frequently tells me what third party libraries I must use to accomplish the things specified, the only conforming code I can write will be bad. Then I do as I'm told, and rant about it later.

Management isn't just making up rules for no reason. It is enforcing rules suggested by programming blogs about how to avoid common pitfalls of programming. I work for a Microsoft shop at the moment, and, as such, most of the rules and difficulties I'm dealing with come from what I will call the Microsoft philosophy.



My general understanding of the Microsoft philosophy is that it consists mainly of tools and guidelines intended to permit even poor programmers to slowly build out and maintain an application. These tools and philosophies come at the cost of forcing even good programmers to slowly build out and maintain an application.


This causes the enormous problem of creating a situation in which the optimal strategy for building out an application is to hire an army of mediocre or poor programmers, thus ensuring that rather than having a few problems created by a few bad practices, your codebase as a whole is plagued with issues arising from general incompetence. This problem arises even if the core development is carried out by a team of very good programmers and has been for years. Firstly, as I mentioned earlier, development is slow. I have yet to encounter a problem that can be solved more succinctly and quickly in C# than it can in Ruby or Python. As a result, mercenaries occasionally descend upon the project to fill it with thousands of lines of the most unreadable, unmaintainable garbage I have ever seen, so that some objective can be achieved a month sooner, at the expense of every other deadline that will arise in the future. The resulting vicious cycle is only checked by the fact that the code eventually becomes so bad that the product stops working. Then the money dries up, and armies of mercenaries cannot be afforded until the code and the product are salvaged again.

There are a huge number of patterns that radically hurt the quality of code. Religious observance of rules that someone came up with because of the misuse of a code pattern in one place ranks among the worst. Defensive programming also ransacks the readability of code. In my experience, explicit static typing is also something that tends to hurt badly.

People do all of these things intentionally. People advocate all of these patterns. The above are optimized for code safety, a concept I don't fully understand. What is code safety? It clearly doesn't mean maintainability. The aforementioned things all make the code less maintainable. It is also difficult to see how they reduce bugs. Indeed, defensive programming introduces a whole new category of bugs. The-overzealous-error-checking-resulting-in-errors-being-thrown-when-they-really-should-not-have-been-thrown-because-now-you've-got-edge-cases-in-both-directions-instead-of-just-one category of bugs. Good testing finds anything that you can find through defensive programming. In both cases you can find all of the errors that you expected. (Actually, in good testing, you find many errors you didn't anticipate in addition to the ones you expected because your old unit tests continually validate that you didn't actually change old behavior. Whereas with defensive programming you tend only to find the errors you expected. The old checks don't provide much value because they aren't set up to check for the ordinary case.)
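
To make that category concrete, here is a hypothetical sketch (sum and its guard are made up for illustration):

function sum(xs) {
    if (!xs || !xs.length) {
        // "defensive" guard: rejects an input the loop below handles fine
        throw new Error("sum: expected a non-empty array");
    }
    var total = 0;
    for (var i = 0; i < xs.length; i++) {
        total += xs[i];
    }
    return total;
}

sum([]);  // throws, even though returning 0 is a perfectly sensible answer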

There are two other patterns that, in my opinion, are mostly harmful and that, again, are explicitly aimed at improving the quality of the code. These two are especially pernicious because they aim to simplify. The first is the intention to prevent complicated lines; I won't discuss it now, because I plan to discuss it in depth as my next topic. The second is a foolish consistency.

Consistency is good at least in part because a deliberate consistency is usually based on a standard that has been well thought out and, many times, incorporates things you had not yet planned to do. When I began using vim, I immediately regretted the fact that I had previously disregarded 79-character line length rules. I hadn't intended for my code to be edited on a computer so old that it lacked resizeable editors, and I had not yet come to appreciate the fact that there may be good reasons to prefer not to resize your editor even when you have a choice.

The very worst form of foolish consistency, in my opinion, is the restriction of third party libraries to a core few so that people can "learn it quickly" (as if well-documented third party libraries provide a steeper learning curve than in-house code written without them, or worse, with a different third party library that does not actually suit the task at hand and must therefore be continually hacked to accomplish its objectives).

All of the things I'm mentioning are policies that are in place because some programmer advocated them to avoid some pitfall. My counterargument, which I will be presenting throughout this blog, is that these pitfalls don't produce problems. Bad code produces problems. Bad code is just code that isn't good. And good code only results from giving good programmers the freedom to program the way they think they should. It helps to have a few practices in place like unit testing (which doesn't interfere with how the programmer goes about writing code) and accountability via code reviews with other good programmers (which improves communication and the general cohesiveness of the project without directly impacting the way code is being written while it's being written).

I realize my overall thesis has a major point of circular ambiguity: I haven't defined good programmers, but I do mean good, not great. My general belief is that most programmers are good enough to be considered good programmers. However, most code is bad, especially when developed by a team of people, partly because of poor communication (which has little to do with how programming itself is done) and largely because one or two bad programmers (and outside consultants who have no incentive to care about long-term maintainability) can do damage much more quickly than five or six good programmers can fix it.