Crafting Software: Writing Maintainable Code
Maintainable code can easily be the difference between long-lived, profitable software and short-lived money pits.
Sometimes, it is necessary to sacrifice maintainability for shipping quickly. When this happens, technical debt is incurred because you are borrowing from your future development capacity. It is important to pay these debts down quickly so the burden does not compound.
Even better, though, is to avoid the debt and focus on writing lean, maintainable code as you go.
Slow is smooth, smooth is fast...
In this article you'll find a non-exhaustive list of techniques to keep in mind as you code. They will go a long way toward keeping your software maintainable.
Commenting the Why
Commenting code is one of the easiest things to do, and it is also one of the easiest things to do wrongly.
There are two main audiences for comments:
- External developers who use your code as a library. Here, good docstrings help a developer understand the API, especially in editors that support "IntelliSense"-like help.
- Other developers within the same codebase, including your future self, who maintain the code a long time into the future.
For the purposes of this article, we are ignoring the first use case and focusing solely on comments that support maintenance.
Ideally, we want the code to be as comment-free as possible so that the code speaks for itself. Comments also need to be maintained. When they are not, they tend to drift out of sync with the code. This can be worse than no comments at all, causing confusion and wild-goose chases.
One good rule of thumb is to reserve comments for explaining why and/or to provide context for design decisions that isn't obvious from the code.
For example, maybe the current version of a library you are working with has a known (at the time) limitation with a certain API and there is a suggested workaround. Adding a comment in the code with a link to the relevant PR or issue discussion would help someone coming in some years later wondering why some awkward use of the library was in play. And maybe there is now a fix, and with that context they can clean up the code.
Another good prompt is when, during code review, one of your peers asks why something was done a particular way. That might be a good opportunity to answer the question in a comment rather than in a reply on the PR.
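Here is a rough sketch of what a "why" comment can look like. The scenario is made up for illustration: parse_export and the reporting tool it mentions are hypothetical, and the link is left as a placeholder. The one real behavior it leans on is that Python's json.loads refuses text that starts with a byte-order mark.

```python
import json

BOM = "\ufeff"


def parse_export(raw_text: str) -> dict:
    # WHY: the (hypothetical) reporting tool that produces these exports
    # prepends a byte-order mark, and json.loads() refuses text that starts
    # with one. Stripping it here is a workaround, not a preference.
    # Context: <link to the issue or PR discussion goes here>
    # Revisit: drop this once the exporter stops emitting the BOM.
    if raw_text.startswith(BOM):
        raw_text = raw_text[len(BOM):]
    return json.loads(raw_text)
```

Notice that the comment says nothing about what the code does. The code already covers that. It records the reason, where to find more context, and when the workaround can go away.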
The Single Responsibility principle is a pretty well-known one stemming from, but not limited to, object-oriented design.
The idea is pretty simple: a unit of code (function, class, module) should have a single point of responsibility. Martin talks about actors and relationships to them, but I think it is simple enough to think about a "single concern".
When a function or class has multiple concerns coupled together it can introduce edge-case bugs that are hard to track down and fix. It can also make the code harder to read and grok as a newcomer to the code base.
Resist the urge to add a quick code branch inside a function during maintenance to add a new side effect. You could very well be unwittingly adding new concerns to a single-concern function. Take the time to factor out code to keep things single-concerned and decoupled as much as possible.
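As a minimal sketch of what "single concern" can look like in practice, here is a made-up receipt example. All of the names (Invoice, calculate_total, format_receipt, send_receipt) are hypothetical, and floats are used for money only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    customer_email: str
    subtotal: float
    tax_rate: float


def calculate_total(invoice: Invoice) -> float:
    """One concern: the arithmetic."""
    return round(invoice.subtotal * (1 + invoice.tax_rate), 2)


def format_receipt(invoice: Invoice, total: float) -> str:
    """One concern: the presentation."""
    return f"Receipt for {invoice.customer_email}: total due {total:.2f}"


def send_receipt(invoice: Invoice, deliver) -> str:
    """Coordinates the steps; `deliver` is whatever actually sends the message."""
    total = calculate_total(invoice)
    message = format_receipt(invoice, total)
    deliver(invoice.customer_email, message)
    return message
```

If a new requirement shows up later, say logging every receipt, it gets its own function and a line in send_receipt, rather than a branch buried inside calculate_total.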
Variable names like "a" are fine in quick one-off scripts, and maybe even in short-lived loops where it is obvious that the variable is holding some index because the loop is only 2-3 lines long.
When in doubt though, use names that are meaningful, but short.
Think more Hemingway and less Faulkner.
The goal here is to make skimming a class you've written quick and easy while minimizing the chance that the reader missed something.
Variable, class, and property names should be short nouns.
Methods should be verbs describing the action they perform, maybe sometimes
with a hint at what they return (e.g.
If your classes, methods and variables are named well, your code will be easier, perhaps even a delight, to read. Code with good names doesn't come easy.
There is a lot of thought and care that goes into it and it is well worth the investment.
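To make the difference concrete, here is the same function written twice. Both versions are invented for this illustration. The logic is identical and only the names change.

```python
# Hard to skim: the reader has to reverse-engineer what l, n, x and r mean.
def f(l, n):
    r = []
    for x in l:
        if x[1] > n:
            r.append(x[0])
    return r


# Same logic. The verb says what it does and hints at what comes back,
# and the nouns say what each value holds.
def find_overdue_customers(invoices, max_days_late):
    overdue = []
    for customer, days_late in invoices:
        if days_late > max_days_late:
            overdue.append(customer)
    return overdue
```

Both return the same result for the same input, but only the second can be understood without reading the body.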
No Magic Numbers
Similar in motivation to having meaningful names is getting rid of any magic numbers. By "numbers" I mean strings and other types too. There is never a good argument for unnamed constants in the code.
Instead of pasting the URL for the API endpoint your client code uses straight into the requests.post call, set an API_ENDPOINT constant and reference that named constant. The meaning can be clear enough with something like a URL, but not with many of the other values you could very well end up using.
Collecting all these into a single constants.py module, depending on the size of your project, will keep things even tidier.
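A small sketch of the idea follows. The endpoint URL, the other two constants, and submit_order are all made up for illustration. To keep the example runnable without a network or a third-party install, the function takes the posting callable as a parameter; in real client code that would be requests.post.

```python
# These would live in constants.py in a larger project (hypothetical values).
API_ENDPOINT = "https://api.example.com/v1/orders"
REQUEST_TIMEOUT_SECONDS = 10
MAX_ORDER_QUANTITY = 500


def submit_order(post, quantity: int):
    """Send an order; `post` is any callable shaped like requests.post."""
    if quantity > MAX_ORDER_QUANTITY:
        raise ValueError(f"quantity may not exceed {MAX_ORDER_QUANTITY}")
    return post(
        API_ENDPOINT,
        json={"quantity": quantity},
        timeout=REQUEST_TIMEOUT_SECONDS,
    )
```

Compare that with a bare 10 and 500 sitting inline. The named versions tell the reader what the values mean and give you one place to change them.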
I'm not a test-driven development champion. I know. We aren't supposed to admit this. But in the 20+ years I've written software professionally, I can count the number of times that TDD felt worthwhile on one, maybe two hands.
I mostly write tests after the fact and not for every line of code that I produce, generally just the trickier parts. Sometimes, if I'm working on something really tricky that I need some iterations in the code to help me think through, then I might write some test harnesses to help execute the code.
That said, as I code I try to keep top of mind just how testable the thing I'm writing is: Am I using services that will have to be mocked in a test? Am I coupled to dependencies that I can't influence through injection? Can I decompose what I am writing into smaller units that would be easier to test if I get around to writing the tests? Can I isolate the nasty bits that will need to be mocked into a smaller piece?
Thinking in terms of single responsibility helps with this mental framing of the problem.
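Here is one way those questions can play out, sketched with a made-up SessionChecker class. The dependency that is awkward in tests, the current time, is passed in rather than fetched inside the method, so a test can supply a fixed clock without any mocking library. The class and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    """The real clock, used by default in production code."""
    return datetime.now(timezone.utc)


class SessionChecker:
    def __init__(self, max_age: timedelta,
                 clock: Callable[[], datetime] = utc_now):
        self._max_age = max_age
        self._clock = clock  # injected, so tests can control "now"

    def is_expired(self, started_at: datetime) -> bool:
        return self._clock() - started_at > self._max_age


# In a test, hand in a clock that always returns the same instant.
fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
checker = SessionChecker(timedelta(hours=1), clock=lambda: fixed_now)
stale_session_expired = checker.is_expired(fixed_now - timedelta(hours=2))
fresh_session_expired = checker.is_expired(fixed_now - timedelta(minutes=30))
```

Even if the test never gets written, the code ends up less coupled for having been shaped this way.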
Easy to Read
You've made sure things all have a single concern, are well named, are testable, and have any relevant context commented. Still, there might be room to make the code easier to read.
Yes, this one is very subjective, but I think we all generally know it when we see it. Two easy, fairly objective things to watch for, on top of the previous techniques, are:
- code-block lengths
- too much branching
Generally speaking, we should refactor any function or method so that it fits on a typical display without scrolling. Code blocks longer than that are hard to read.
Likewise, heavy or deeply nested branching can be hard to follow and makes it easy for bugs to sneak in. Refactoring this out to strategy patterns and/or named functions will yield a lot of readability benefits.
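As an illustration of the branching point, here is a made-up shipping calculation written first with nested branches and then flattened with early returns and a lookup table, which is a lightweight stand-in for a full strategy pattern. The function names, rates, and order shape are all hypothetical.

```python
STANDARD_RATE = 5.0
EXPRESS_RATE = 15.0


def shipping_cost_nested(order):
    if order is not None:
        if order["items"]:
            if order["method"] == "standard":
                return STANDARD_RATE
            else:
                if order["method"] == "express":
                    return EXPRESS_RATE
                else:
                    raise ValueError("unknown shipping method")
        else:
            return 0.0
    else:
        raise ValueError("no order")


# Same behavior: each special case exits early, and the per-method
# choice becomes a lookup instead of another level of if/else.
SHIPPING_RATES = {"standard": STANDARD_RATE, "express": EXPRESS_RATE}


def shipping_cost(order):
    if order is None:
        raise ValueError("no order")
    if not order["items"]:
        return 0.0
    method = order["method"]
    if method not in SHIPPING_RATES:
        raise ValueError("unknown shipping method")
    return SHIPPING_RATES[method]
```

Adding a third shipping method to the flat version is one new entry in SHIPPING_RATES. In the nested version it is yet another level of indentation.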
Don't Repeat Yourself is a classic engineering principle made well known by The Pragmatic Programmer (highly recommend this book!).
This is as simple as it sounds.
If you find yourself copying and pasting code, stop. Make a function or class.
If you find yourself writing very similar pieces of code, stop. Consider how, with the right abstractions, you could reuse code that might behave a bit differently depending on the inputs.
Getting really good at keeping your code DRY will make your code more readable, more bug free, and easier to maintain.
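A tiny sketch of the second case: two near-identical functions that differ only in a separator, collapsed into one function that takes the difference as an input. The names are made up, and the joining is deliberately naive (no quoting or escaping); for real CSV output you would reach for Python's csv module.

```python
# Before: two copies that differ by a single character.
def render_comma_rows_copy(rows):
    return "\n".join(",".join(str(cell) for cell in row) for row in rows)


def render_tab_rows_copy(rows):
    return "\n".join("\t".join(str(cell) for cell in row) for row in rows)


# After: the difference becomes a parameter, and the variants become
# thin, well-named wrappers around a single implementation.
def render_rows(rows, separator):
    return "\n".join(separator.join(str(cell) for cell in row) for row in rows)


def render_comma_rows(rows):
    return render_rows(rows, ",")


def render_tab_rows(rows):
    return render_rows(rows, "\t")
```

A bug fix to the row rendering now happens in one place instead of two.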
Lastly, as you work on any bit of code, always consider what state you would want to leave it in if you knew you were going to come back to it some number of years later to fix something under a tight deadline.
What could you do now, given all the context you have in your head about the weaknesses, edge cases, etc., that would make your job easier in that future scenario?
It might just be leaving some comments. It might be refactoring or buttoning up a known weak point.
Only you really know what things you could best do to help your future self.