Tuesday, May 26, 2020

Coding in Anger

The most skilled senior developer I worked with when I was still a pretty junior developer had a couple stock phrases that I think ended up teaching me a lot about the right way to write code. One of those phrases was "coding in anger."

When he was asked to give his opinion on a new tool or library, he would often say, "I don't know yet. I've just played with it for a couple days, but I haven't really used it in anger yet."

This, to me, is one of the most fundamentally human lessons of programming. Good patterns are the ones you reach for when you are exasperated with your code base. Good libraries are the ones you reach for when you are infuriated with the extant solution to a particular problem. You learn to write code well by writing it in anger.

Your code ends up being personal, opinionated, and judgmental.

If you are not putting your personality and your judgment into your code, you haven't figured out what you're doing yet. When I'm working in a codebase that has been maintained by a small enough team of bad developers, I can sometimes infer who did what without looking at git history, just by the smell. But this is an artifact of small team size.

Oh, this query is pointlessly and inefficiently over-complicated and uses CTEs everywhere instead of temp tables even though temp tables are much more efficient in this case. I know who wrote it. But that only works if I'm on a team with only one bad developer whose form of badness is taking pride in making things pointlessly overcomplicated. If there are two or more of them, they feed off each other in their attempts to write ever more elaborate code.

There are about eight approaches to bad development that I've seen.

1) The person who only knows the minimum set of knowledge required to write something, who turns every screw into a nail because the hammer is the only tool they have in their toolbelt.
2) The person who uses the most elaborate technique possible to solve anything to prove that they know esoteric things even when those things are completely irrelevant and even counterproductive to the problem at hand.
3) The person who experiments with a different library every project.
4) The person who just doesn't care, takes no pride in anything, and has no consistency. They might mix different forms of capitalization for things that aren't case sensitive (SQL, yuck!), mix tabs and spaces, put no thought into where their line breaks are supposed to go, or have no consistency about whether they put operators at the start of the new line or the end of the old one. Etc.
5) The person who actually can't code and just does crazy stuff by trial and error until it works.
6) The person who thinks verbose, defensive code is optimal and writes as many try/catch statements as possible and includes extra branches in their code that end up doing the same thing just to prove that they considered that branch too.
7) The person who copies and pastes spaghetti code from elsewhere in the repository and strings it out further instead of integrating anything.
8) The person who constantly googles everything and copy-pastes from stack overflow.

When there are two or more people on a team who do any one of these things, everything gets worse at a rate that scales superlinearly with the number of problem children, and it becomes impossible to tell who wrote what. All of these different approaches illustrate a lack of opinion about what good code looks like. The code reflects the personality rather than the values of the author.

In contrast, when you are working with a really good developer, and you read code that only they wrote, you know exactly who wrote it, and you feel judged for not writing all of your code the same way they write theirs. There are forms of consistency you didn't realize could exist. All classes are always instantiated by passing in arguments as kwargs, and functions and methods besides constructors are always invoked by passing in any arguments that can be passed in positionally, positionally. You eventually realize that there's a reason for this pattern, and unless you have a similarly esoteric pattern that is equally well-thought out, you adopt it. There's partial convergence among good developers, but they all come with their own opinions too.

There are only so many personalities, but there are endlessly many different sets of values.

As long as two or more really good developers are on the same team working closely together, they are able to negotiate a truce. They stake out their respective hills to die on, and they figure out where and how they can agree to disagree, and they assert themselves in the places where no one else has defined a standard.

One says metaclasses are good and codegen is bad. The other says the opposite. The compromise is that neither metaclasses nor codegen will be used because they both think that this is a hill to die on. They have different opinions about how linebreaks should be handled in a function's signature, and they each write it their own way because it's ultimately not that important. (But, if one of them has to take over the other's code and become the primary maintainer, the line breaks get changed every time a function's signature needs to be updated, until finally all the remaining functions also need to have their line breaks updated "for consistency.")

When they cease working closely together, they become rivals. Their code diverges in ways that make them disapprove of each other's approach.

This is just what happens because they are both opinionated and both write code in anger. And it's fine. Actually, it's good. This is what's supposed to happen. It helps them keep a clean separation between their duties. Neither of them wants to touch the other's code base anymore, and if they are going to collaborate with another new developer, they will find someone junior and impressionable who has a lot of raw talent but hasn't figured out that good developers code in anger yet, and life goes on with each team developing its own culture and its own ways. It motivates them to write better code to prove that their approach is the superior approach.

The same thing happens in music. Collaborators can come together and form their own style. Bands can sound like they belong together. The Beatles and The Velvet Underground both always sounded like coherent bands. But when Paul McCartney goes off and does his own thing and John Lennon does his own thing, there's no common ground between them anymore. Same thing with John Cale and Lou Reed. They might get back together for a concert in Central Park and find that they can recreate the magic they once shared for one night, or they might be able to set aside their differences permanently for the sake of collaboration. Most likely, though, they've grown too distinct by that point.

There's a widespread notion that goodness converges, and all the really smart people will find common ground they can agree on if they get together. But that's not what really happens. Goodness diverges, and all the really smart people define their own world with its own culture, where each practice only makes sense in light of all the other practices.

Some of their disagreements stem from their choice of editor. Some of their disagreements stem from their preferred line width. Some of their disagreements come from the patterns of the languages they use on the side to solve the problems that, for whatever reason, their primary language isn't particularly well-suited to address. Etc. And this whole ecosystem, with all of its differences, keeps growing and keeps diverging.

And this is good. This is what's supposed to happen. This creates the kind of diversity of opinion and skill set that is actually valuable, instead of the arbitrary divisions created by pointless separations of duties.

So go forth, and code in anger.

Saturday, May 23, 2020

Bureaucracy Considered Harmful

Okay, I’ve written this essay a hundred times.


It’s always true. It’s always necessary.


Sometimes, I get stuck. Writers talk about writer’s block. I’m not a writer. I don’t get that, not in the normal course of life.


I’m an engineer. I get engineer’s block.


The solution is always to do something, to do anything. For me, the solution is usually writing.


I need to write something. To put words on a page, to get used to making progress on something again without feeling constantly overwhelmed by the approach of obstacles.


By bureaucracy and its hideous head.


I hate paperwork, and I hate testing (not unit testing, manual testing), and those are the two rewards for completing the work that’s assigned to me.


This is my problem.


I’m sure that this is what causes me to get blocked.


What I need to do to overcome it is to do work that goes unpunished.


I want to write code, really I do. But I also really, really don’t. I’ve developed a resistance to the idea. A mind block, maybe even a phobia.


I bet the writers who get the worst writer’s block are the ones who dislike editing.


When work is its own punishment, you develop an aversion to doing it.


I like writing. I like editing. I'm afraid of publishing, sharing, having a world that can in principle respond. Trying to publish and thinking about publishing are the things that have given me writer’s block in the past. (I finally got over this about three weeks ago, I think and hope.)


JIRA tickets give me engineer’s block.


Sometimes, I want to get buried in a project and not come up for air for a month or two. Always, I want that. There is some semblance of peace in that, some escape from drudgery.


Instead I have to edit JIRA tickets constantly, go through the experience of annoying a coworker until he or she finally reviews my code, and then do a bunch of manual testing every couple of days as soon as I finish a little project, and I hate it. I detest it. I feel like I’m suffocating.


"Come up for air" is a strange analogy. I think I’m a fish and writing code is my water, and every time I finish a project I get dragged back up onto the boat. I just wish I could write code and never finish projects.


Actually, I don't wish that. I just wish the release process wasn't so horrible, and I could go back to working in a code base that relies on unit tests instead of manual processes for quality control.

Test Your Public API

The thing I've encountered as a software developer that people are least likely to do, even though they should, is write good tests.

I've worked on several code bases that had no tests when I started working in them, and the ones that had tests usually had bad tests. Bad tests differ from good tests in many ways, but the one that I want to focus on today is the layer of abstraction being tested.

Bad tests test implementation details. Good tests test your API.

The implementation details should be expected to change and evolve constantly throughout the lifecycle of a project. By the time a project is delivered, the API should be stable. Good unit tests assert that stable things remain stable. Bad unit tests assert that nothing is changing.

Your public API is the part of your code that you expect users to use, invoked the way you expect users to invoke it. If your code is a script, your API is the shell command that runs that script. You test it by running the script. If your code is a web service, your API is the collection of endpoints that your service provides, and you test them by making HTTP requests against them. You spin up the service in the setup of your unit tests and you shut it down in the teardown.
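
Here's a minimal sketch of what that looks like for a web service, using only the standard library. The my_service module and its create_app function are invented for illustration; the point is just the spin-up and tear-down shape.

    import json
    import threading
    import unittest
    import urllib.request
    from wsgiref.simple_server import make_server

    from my_service import create_app  # hypothetical WSGI app with a /health endpoint


    class TestHealthEndpoint(unittest.TestCase):
        def setUp(self):
            # Spin the service up on an ephemeral port before each test.
            self.server = make_server("127.0.0.1", 0, create_app())
            self.port = self.server.server_port
            self.thread = threading.Thread(target=self.server.serve_forever)
            self.thread.start()

        def tearDown(self):
            # Shut the service down so tests stay isolated.
            self.server.shutdown()
            self.thread.join()
            self.server.server_close()

        def test_health_returns_ok(self):
            url = f"http://127.0.0.1:{self.port}/health"
            with urllib.request.urlopen(url) as resp:
                body = json.loads(resp.read())
            self.assertEqual(body["status"], "ok")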

I do most of my coding in Python, and if I'm writing a web service, I use it for both the service and the clients. But in principle there is no reason this can't be done for everything. A few languages are much better suited for one piece of this than another, but you can still test them the same way. As discussed above, you can shell out as part of running unit tests, so there's no reason you can't write a service in one language and test it using a different language.

Isn't this integration testing?

No, it's not. Integration testing is testing that all of your code works together the way that it's supposed to work together and that it properly integrates with third party APIs. What makes it unit testing is that you mock out everything that isn't specifically what that API is doing to ensure that each individual component of your API works as intended when isolated.

But, but, (someone says), you can't mock out the things you need to mock out and properly isolate your tests if you shell out to it as a script.

Yes, you can.

It's not hard to let your scripts mock out what they need to mock out. They can accept an argument called --mock-context that takes two additional values: one to specify the module from which the mock context is imported, and a second to specify the name of the context manager in that module that you are using to isolate this test.
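
A minimal sketch of how a script can support that, assuming its real work lives in a run() function (the names here are invented; only the --mock-context handling is the point):

    import argparse
    import contextlib
    import importlib


    def run(args):
        ...  # the script's actual behavior


    def main(argv=None):
        parser = argparse.ArgumentParser()
        parser.add_argument(
            "--mock-context",
            nargs=2,
            metavar=("MODULE", "CONTEXT_MANAGER"),
            help="import MODULE and enter CONTEXT_MANAGER around the script body",
        )
        args = parser.parse_args(argv)

        if args.mock_context:
            module_name, manager_name = args.mock_context
            manager = getattr(importlib.import_module(module_name), manager_name)
        else:
            manager = contextlib.nullcontext  # no-op outside of tests

        with manager():
            run(args)


    if __name__ == "__main__":
        main()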

Mocking is beautifully supported in Python. I know other languages don't support it quite as well, but the concept still exists everywhere. It's just harder to use in some languages than others. Unit testing is testing where you've isolated each individual thing you are delivering and test it on its own; in Python, the main things that separate unit testing from integration testing are that you use mock extensively in unit testing, and unit tests run quickly. Integration tests run something end-to-end and make sure that your code continues to integrate with the rest of your stack correctly. Integration tests end up having to run for however long they have to run, which is often a long time.

But, but, (someone says), it's bad practice to insert things into your public API that only exist for testing.

I disagree, but that's a topic for another day. Either way, you don't have to make these command-line arguments part of your public API.

You can define the script as a class and subclass it so that it accepts the one additional argument it needs. Then your tests point explicitly at that subclass. Unit tests are still supposed to be testing individual features in isolation, so a unit test that checks that the scripts properly get installed should just be testing installation. As long as you aren't implicitly testing installation as part of the script, you should be able to subclass it to take whatever additional testing-specific arguments it needs.
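
Roughly, with invented names, the shape looks like this: the installed script stays clean, and the test suite points at the subclass that knows about --mock-context.

    import argparse
    import contextlib
    import importlib


    class ExportReport:
        """The script users actually install; no testing flags in its public API."""

        def add_arguments(self, parser):
            parser.add_argument("--output", required=True)

        def context(self, args):
            return contextlib.nullcontext()

        def run(self, args):
            ...  # the script's real work

        def main(self, argv=None):
            parser = argparse.ArgumentParser()
            self.add_arguments(parser)
            args = parser.parse_args(argv)
            with self.context(args):
                self.run(args)


    class ExportReportForTests(ExportReport):
        """Used only by the test suite; adds the mocking hook."""

        def add_arguments(self, parser):
            super().add_arguments(parser)
            parser.add_argument("--mock-context", nargs=2, metavar=("MODULE", "NAME"))

        def context(self, args):
            if args.mock_context:
                module_name, name = args.mock_context
                return getattr(importlib.import_module(module_name), name)()
            return super().context(args)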

(Again this is easy to do in Python. It's harder in a lot of other languages, but it's still doable in most languages I've used.)

I don't consider this ideal, but it is good enough. (I like having the ability to mock on the fly. I think it makes it easy to test all sorts of things. The --mock-context you pass in as I described it doesn't have to be part of the given library. It can be used to test subbing out one backend for another pretty easily. Not a good long-term solution, but mock is also useful for testing out proofs of concept, not just for unit testing. Suddenly your script is a lot more flexible than it was originally intended to be. Is v2 supposed to be backwards compatible? Don't update the client code just yet. Just spin up v2 and run the script that runs the client code mocked out to hit the v2 API instead of v1.)

By the way, a unit test of installation of a script ensures that it can output its version when asked to do so, and that the version it outputs is the one that was just installed. A side effect of good unit testing is that you end up writing scripts correctly.
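
A minimal sketch of that installation test, assuming the package installs a console script called export-report and exposes __version__ (both names invented):

    import subprocess
    import unittest

    import my_package  # hypothetical package under test


    class TestInstalledScript(unittest.TestCase):
        def test_version_matches_installed_package(self):
            result = subprocess.run(
                ["export-report", "--version"],
                capture_output=True,
                text=True,
                check=True,
            )
            # Assumes --version prints the bare version string on stdout.
            self.assertEqual(result.stdout.strip(), my_package.__version__)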

The help message that your script outputs should almost certainly be considered part of its public API.

That said, you should only be testing that the stable portion of the API remains stable. Since you write your scripts as well-factored classes, you often add features through mixins that the scripts you actually use inherit from. You don't want to have to update the help-message test of every script that inherits from one of these mixins every time you add a new feature to that mixin.

Your unit tests should not ensure that no new features are added, only that no existing features are broken or deleted. Testing your help message should ensure that all of the lines that you expect to see in the help message are printed in the help message, and whenever order matters, they are printed in the appropriate order.
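
So the help-message test looks something like this, assuming a hypothetical build_parser() that returns the script's argparse parser, with invented option names:

    import unittest

    from my_package.cli import build_parser  # hypothetical


    class TestHelpMessage(unittest.TestCase):
        def test_expected_options_appear_in_order(self):
            help_text = build_parser().format_help()
            expected = ["--output", "--verbose"]
            # Every expected option is documented...
            for option in expected:
                self.assertIn(option, help_text)
            # ...and where order matters, it is preserved. Adding a new
            # option later does not break this test.
            positions = [help_text.index(option) for option in expected]
            self.assertEqual(positions, sorted(positions))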

Since all good code is self-documenting, your web service auto-generates its own documentation just like your scripts generate their own help text, and you test this too.

If your public API is just a class, then the self-documentation is the code itself because the only people who need to be able to read the documentation already know how to program in your programming language, so they don't need the behavior translated. Where necessary, you still have comments in the code to help other programmers understand how to use these classes. The comments typically point them towards your unit tests where you are using them to assert that they behave as expected.

I've mostly been focused on testing the public API as a form of ensuring that the stable part of your code remains stable, but there are other advantages.

It expands your coverage to actually test everything that's important. Bugs anywhere can be catastrophic. Bugs in an argument parser can be catastrophic. I've seen a lot of projects where the difference between a script having major side effects and just printing output without doing anything was entirely determined by whether or not someone passed in an argument named --dry, or something similar. I've seen these scripts used to test scenarios that were not real and that under no circumstances should have been treated as real.

What happens if everything in the code works exactly as intended except there's a typo in the argument parser? Do you know the difference between how Python's argparse treats action="store_true" and action="store-true"? I don't know if this is still the case, but "store-true" used to just be ignored. It didn't raise an error telling you it was an invalid action. It was just silently ignored. You can do a lot of damage very quickly if the only bug in your code is that hyphen. I've also written bugs where I got confused and made an argument "store_true" when I really wanted it to be "store_false" or vice versa. This can all be catastrophic, and it's not necessarily the case that these things would get caught. The existence of a script to expose functionality is sometimes an afterthought. Sometimes, the code is thoroughly tested in its integration with something else, and the script is just pieced onto it.

(This might sound like I've been burned, and I'm trying to make excuses. The opposite is actually the case. I've never caused a catastrophic failure. I have from time to time said, "Oh, thank goodness I wrote that test," or "Thank goodness I decided that I needed to check one more thing before pushing this commit," because I was working in a codebase with woefully inadequate test coverage.)
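
That's the kind of bug a test of the parser itself catches. A minimal sketch, again assuming a hypothetical build_parser() with a --dry flag; a mistyped action or a flipped default fails here instead of in production:

    import unittest

    from my_package.cli import build_parser  # hypothetical


    class TestDryFlag(unittest.TestCase):
        def test_dry_defaults_to_false(self):
            args = build_parser().parse_args([])
            self.assertFalse(args.dry)

        def test_dry_flag_sets_true(self):
            # Fails loudly if the action was mistyped or flipped.
            args = build_parser().parse_args(["--dry"])
            self.assertTrue(args.dry)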

There are still more benefits to good testing. If you test at the right abstraction layer, you are ensuring that the code you deliver is usable. If you write automated tests to hit each part of your API and test the functionality of each thing in isolation, you are eating your own dog food. It's one of the easiest ways to eat your own dog food, and one of the ways that gives you the most immediate returns. If you find the tests frustrating to write, your users will find the API frustrating to use.

I could give several more reasons, but I've covered the major ones, and I think that's enough to say that unit tests should be testing your public API.