Bad Code Considered Harmful: April 2015

Friday, April 3, 2015

Language-independent programming (Part 1; the basic motivation)

I have been gradually coming to believe that one of the biggest problems with programming is that code lives in text files that are written in a particular programming language, and that programmers directly manipulate these text files.

The improved alternative that I would like to propose is for code to live in abstract trees with reverse-compilation hints that allow it to reverse-compile differently depending on context.

Let me give you a little bit of motivation for this idea.

Let's say I'm building a complicated object for an end-user to manipulate and interact with via a computer, and I want be able to improve my end users experience of interacting with this object over time. For example, let's say I've created a computer game. If I were to write a computer game, I wouldn't give my users tools to directly modify the files that their computer stores to keep track of their progress and let them quit the game and reload it later. Instead, I would give the user the ability to play the game, and I would write utilities to allow the user to save the game... i.e. to abstract from the game state sufficient information that the program I wrote could recreate the game-state that the user had reached the next time the user loads the game.

This process of saving an abstraction of what the user was interacting with and then converting that abstraction back into game state later allows the game creators a lot more freedom to update the game. Let's say a particular gun is overpowered and the game creators decide that they need to nurf that gun to make the game more fair. Since the saved game-state only has a reference to that gun rather than a full description of that gun, the programmers only need to update the game, and they don't have to update the saved file at all.

Now, switch over to programming languages. Let's say I wrote a programming language in which I had statements like 'print' or 'exec' that were a little bit different from functions, mostly with regard to their syntax but that otherwise behaved a lot like functions. Let's say that after twenty years of developing this language, I decided that having these things be statements instead of functions was a bad idea for several reasons, both in terms of the way the language internally distinguishes between functions and statements (statements cannot be overridden with an alternative implementation but functions can), and how the syntax works. It's a bit clunky to have these particular concepts be key words or to have them have different syntax rules from functions.

Now, I want to release an update to the programming language, but I really can't do it because doing so would involve requiring people who have been using the language for the past twenty years to change a whole bunch of files that use syntax that is no longer consistent with what the update would require, so instead I create a backwards-incompatible update and cease to actively update the old programming language and cause there to be a great schism between code compatible with the old way of doing things and code compatible with the new way of doing things.

It's a step forward, but it isn't a very tidy one.

It would be a lot nicer if instead of having to update these files, we could just have them reverse-compile to the appropriate syntax when we decided that the change needed to be made. (And forward compile to behaving like functions instead of behaving like statements. To satisfy both constraints we would need something that is not truly compiled, but capable of compiling in either direction.)

Now, let's consider a second point: programmer preference.

Let's say I really don't like programs to have more than four spaces worth of indentation per line, and in fact I prefer two. My coworker would rather have eight spaces worth of indentation, and he would rather have that spaces-worth of indentation be encoded in tabs than in spaces (we're working in a language that doesn't really care which one we use). So we pick one of our preferences and stick to it, but both of us are working on other projects that we code according to our own preferences, so we occasionally get confused and write code that mixes the two.

Again, it would be a lot nicer if somehow we had the code stored in a way that would cause it to be written back to us in a manner consistent with our preferences instead of a way that forced us to look at the same file of text even though we would both rather that it be formatted in different ways from each other.

For an even stronger disagreement of preferences, we use different editors. In my co-workers editor having 100 characters to a line makes a ton of sense, but in mine, it's best to keep it to 80. Again, the same thing.

We can go even further. After we start working on this project, I realize that having
3+5 * 4
represent
32
makes a whole lot more sense to me than having it represent
23
and that furthermore, I find this representation to be significantly more readable to me than
(3+5) * 4
or
(3 + 5) * 4
My co-worker is a purist who wants to keep the traditional interpretation of parentheses and thinks that that having whitespace sensitivity for that sort of thing in a programming language is a BAD idea.

So I install a module that lets me represent the code the way I want it represented, and he leaves it set up the default way.

And all of this is just the beginning of what you could begin to do differently (and better) if you found a way to store code in something abstract with guidance for how to be reverse-compile back to what the programmer wanted.

If you were going to build this system you would start by doing something that doesn't do too much in the way of producing abstractions, but you could systematically build out and maintain this sort of system much more easily than you can systematically build out and maintain the sorts of things that people currently write.

Thursday, April 2, 2015

lang: Part 0, the motivation

[Another old post I didn't want to delete. It's mostly tongue-in-cheek. I don't stand by what it says.]

Let's design a programming language. Do let's!

Actually, let's not. Instead let's try to describe lang.

lang, as its name suggests, is a programming language that seeks to incorporate as little originality as possible. The goal is to avoid designing it so that no one will ever have to learn it.

The designers of Go said they would know they'd done something wrong if printf was magic. I think that that's admirable, but it's also original, so it's not what we're going for. In lang, we know we've messed up if we finish it and some piece of it seems brilliant, cute, or original.

Hmm... actually, in lang we've messed up if we have to write any official documentation. It should all be unofficial and casual. Besides, we are intending that no one will have to learn lang.

It's 2013, so lang will be object-oriented. At this point, what could possibly be less original than yet another object-oriented programming language?

So what sort of object-oriented language is lang? Obviously, it's inspired by C, really there aren't other options when we are trying to avoid being original, but explicitly typed languages are so disco-era. Retro is a form of self-expression, and we want lang to just blend in. So it shouldn't be explicitly typed, not in 2013. So let's plan, for now, to go with dynamic typing.

So now we're heading into the territory between Python and C. Do we go with elif or else if?

Going with either would be making a decision, so let's go with both. In lang, the statement else if is the same as the statement elif. Is that cute? Personally, I think it seems lazy and tacky, which is about as far from cute as you can possibly get. So it tentatively gets my stamp of approval. (Also, as someone who switches back and forth between Python and C family languages on a regular basis, I mess this up more than I should, and I always get worried that someone might be looking over my shoulder thinking, "Does this guy seriously not know how to write conditional statements in a programming language he uses pretty much every day?" So lang just won't have that problem.)

There are occasions when somebody just gets something right. Guido van Rossum got white space right, really right, except for allowing tabs. At least, he got it exactly right for functions. I think the forced correlation between visual layout and logical layout is very nice especially given that the correlation is a matter of conventional redundancy for programming in most languages. I don't like writing redundantly when I'm programming. I've also quite fond of how it handles semicolons. Line breaks should mean something, but occasionally, there are reasons that line-breaking is stylistically undesirable -- or simply inconvenient. I mostly use semicolons in the interpreter when I'm editing a file. I type something like:

>>> reload(Module); c = Module.Class(stuff); c.dothathingimdebugging()

All in all, Python's handling of white space leaves almost nothing to be desired. I just occasionally wish that I didn't have to nest all of my sub-declarations, especially when I have a class with a function that produces classes, something I've only done once, because it made something that seemed complicated at first much easier. I forget what it was. So let's treat C-style brackets like C-style semicolons. Something like

class Class { 

x = 30

y = 45


def function(self):

if y > x:
return 'well this is a dumb example'

is perfectly valid code in lang.

I also wouldn't mind at all if the colons became optional. Something primarily used like the semicolon for one liners, but that don't break anything when they are included. They simply are not strictly necessary. Getting a text editor to do smart auto-formatting might be slightly harder, but we do have a list of relevant keywords: if, for, else, elif, else if, def, class, while, try, except, switch, case, etc. (Yes, we are picking up case and switch from C.)

For the sake of consistency, let's revisit our type system. Dynamic typing is nice, but it shouldn't preempt the possibility of explicit static typing. So in lang, we permit

int x = 15

This creates x, a statically typed int. Obviously, the goal of static typing is not just to produce new compiler errors, and force you to recast your variables before passing them into functions. A better purpose might be for the static type to take the form of an inline unit test... but never mind that would be original. More compiler errors it is!

[Now, if I could remember what I was trying to parody (Was it Ruby?), I might be able to complete the post, but I don't remember that anymore well enough to maintain a consistent trail of sarcasm...]

Complicatedness

[This is another unpublished old post that I decided to edit, finish, and post instead of delete.]

Every project will rapidly build up complexity until it reaches the point where the most skilled developer working on the project routinely finds his work somewhat challenging.

Human's thrive when they are working on problems that are challenging but solvable. Read virtually any book about design from the last twenty years, and it will tell you that the flow state is real and that it matters. The flow state is the state in which people are most productive and happiest to be working. They achieve it only when they are working on something that challenges them.

What happens naturally when people are maintaining code is that the most adept programmer tends to have the biggest individual impact on the codebase. Others may have equally or more drastic impact on the overall project by introducing problems that everyone has to spend time cleaning up or by setting the tone for some piece of the project that ends up becoming a central piece, but in the average day, the smartest developer is probably the one writing the most lines that are being kept and the one doing most of the prototyping, factoring, and refactoring that create the context in which everyone else is doing their work. S/he won't slow down until s/he finds the complexity challenging, and in in fact her/his subjective evaluation of the quality of the codebase will be that it is improving with each increase in complexity until s/he begins to find managing the code somewhat challenging.

Not all challenges are the same. Witty Haskell-style one liners that showcase clever tricks are a very different kind of challenge than the kind of challenge that gets created by a nebulous distributed cloud of little intricacies and corrections that have to do pieces of code separated by 30 lines of other code interacting with each other. The witty one-liners are a lot less time-consuming to debug and maintain even though they pose a few skull scratching challenges when people look at them and try to figure out what makes them work.

If it is true that the complexity in the codebase is going to climb to some limit either way, it's much better to have complicated lines of code than it is to have complicated patterns of interaction between different lines of code. So you want to keep your codebase full of pithy one liners.

productive vs consumer ui

[This is an old unpublished post that I decided I would rather post than delete.]

tldr: The concept of "good UI" is too narrow. People want to develop optimal UI should distinguish between "good UI to optimize productivity for power users" and "good UI to maximize curbside appeal and minimize learning curve for entry users." Since these two concepts are inherently opposes, a product cannot be both. If you want to compete in a relatively empty space, design for power users, since most people design for curb-side appeal. (Designing for power users isn't creating feature bloat, it's minimizing the amount of time required to complete certain routine tasks for people who bother to learn the shortcuts. vim and emacs are the only two programs I know of that were truly built for power users.)

The greatest strides in technological improvement are those that improve productivity. Individual companies make and lose temporary fortunes with changes to consumer experience, but in the long run, these changes are inconsequential. Having the world's flashiest interface today does not necessarily make it any easier to have the world's flashiest interface tomorrow — especially for a company that consumes its own products.

Interfaces that look nice, frequently do so at the cost of efficiency. Animations between screens take time, not just the imaginary time that my computer uses, its clock cycles that are mostly spent idle anyways. But my time, the time you and I spend sitting at our desks waiting for our computer to let us implement the things we are thinking about implementing. The mouse was a great step forward for consumer UI. For many applications, it's point and click functionality was also a step forward in productivity. But for many other purposes it was a step backwards. It is simply true that you can work more quickly if you can leave your hands in place than if you have to move them between the keyboard and the mouse.

Hence, I take Vim as my example of the greatest achievement in productive UI. It has a huge learning curve, but it pays handsomely for the effort that the user takes to learn it. Firstly, it serves as practically the only example of a productive interface that has seen several radical improvements and rebirths. ed begat vi begat Vim. With each of these changes, the scope of the tool increased radically, and with the scope, so too did the amount that the user had to learn to take full advantage of the program.

Consumer UIs tend to move in the opposite direction. The manual shrinks away to nothing over time. Both trends make sense, and learning to contribute to the continuation of both trends is certainly worthwhile. They both make sense in their own context. Those who expect to use a program regularly for the next ten or twenty years can afford to spend a few hours a day for several months or even years learning how to take advantage of the features available in that program. Whereas, consumers typically want to pull up a product and have it work instantly. Very often, they have no plans to stick with that particular product long into the future. Its instantaneous accessibility is what brings in new users, and its inability to offer more efficient use with practice is not really a defect because nobody wants to put the practice into learning how to use the product efficiently.

Unfortunately (or fortunately depending on how well you can position yourself to seize the opportunity the situation presents), trends in optimizing for consumers tend to infect the practices of developing for productivity. People who aren't thinking attentively are likely to evaluate a productivity tool based on its consumer appeal even though that appeal is often inversely corilated with increased productivity. IDEs are almost always too GUI.

[When I originally wrote the post two years ago, I had rants about Xcode here. I haven't used Xcode in a long time, so I have no idea if those complaints still apply.]