out with the old, in with the less
It’s probably worth preemptively citing jwz’s “Cascade of Attention-Deficit Teenagers” model. It certainly is appealing to throw everything away as a bug disposal mechanism. As noted, this rarely has the intended effect and just replaces one set of bugs with another set. The rewrites mentioned here have a slightly different motivation. Instead of trying to fix known bugs, we’re trying to fix unknown bugs. It’s not based on the current buggy state of the code, but the anticipated future buggy state of the code. Past bugs are a bigger factor than current bugs.
An example that didn’t receive much attention because it’s not user visible is the replacement of the isp SCSI driver. The isp driver supported a wide variety of QLogic interface cards. Unfortunately, support for the newest generation of hardware didn’t work, and efforts to improve it frequently broke other hardware models. The driver tried to do too much and as a result was difficult to maintain. Not all similar looking code is really meant to be shared. Eventually, the isp driver was replaced with three different drivers: qla, qle, and qlw. Each driver can now be modified in isolation without unintentional side effects on other hardware, or the need to consider if and where further special cases need to be added. Despite the fact that these three drivers duplicate all the common boilerplate code, combined they only amount to about half as much code as the old driver. More is less.
The reverse effect can be seen in the evolution of mandoc. Ingo’s talk about mandoc becoming the main BSD manual toolbox has all the details, but it’s interesting to see one of the most unixy parts of the system succeed by following a very nonunix path. What could be more unix than having to fork and exec and pipe the output of a half dozen programs to put some text on the screen? Now we’ve got this slicing, dicing wunderprogram. Even so, it’s smaller and simpler than the accreted mass of what it replaces, but it does more and does it better. Less is more.
Much has been said about httpd already, and how it’s either the best or worst. Overlooked in Reyk’s httpd paper is a quote from another OpenBSD developer regarding SPDY support in nginx. “Is there a reason to believe this specific piece of code is worse than any other?” The unnamed developer was actually me. Not too long after I asked that came the 1.5.10 SPDY memory corruption bug, followed by the 1.5.11 SPDY buffer overflow. To their credit, nginx did mark the feature experimental, and it was OpenBSD that enabled it, but the lesson for me was to ask the right question. Is there a reason to believe that if a feature exists, it has bugs? Yes.
(I asked the question without knowing SPDY was experimental. “It’s experimental” would certainly have been a good answer, but communicating was hard. Inside OpenBSD, there was a fair amount of confusion about what features were enabled, or should be enabled. Once you know a feature exists and can be enabled, how can you not choose all the Pikachus?)
From the beginning, the most commonly requested httpd feature has been rewriting. Reyk resisted adding it, though, since it more or less required regex support, and now you have two problems. At BSDCan it came up again, and I mentioned Lua patterns. They’re much simpler than regular expressions and can be learned quickly, but in my experience are more than powerful enough for most tasks. True, if you’re already a regex swashbuckler, it’s yet another thing to learn, but it’s really just a matter of looking up the syntax for character classes. On the other hand, for a user setting up a web site with simple needs... Compare re_format(7) and patterns(7). It only took Reyk about a day to hack something together, and he committed it shortly after.
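To make the "just look up the character classes" point concrete, here is a rough sketch that maps a small subset of Lua pattern syntax onto regex syntax. It is not how httpd or patterns(7) work internally; the helper name lua_to_regex and the table LUA_CLASSES are made up for illustration, and anything outside the subset (sets, the lazy `-` quantifier, `%b`, `%f`, back references, complemented classes) is rejected rather than guessed at.

```python
import re

# A few Lua character classes and their ASCII regex equivalents.
# Only this subset is handled; the table is illustrative, not complete.
LUA_CLASSES = {
    "a": "[A-Za-z]",
    "d": "[0-9]",
    "l": "[a-z]",
    "u": "[A-Z]",
    "w": "[A-Za-z0-9]",
    "x": "[0-9A-Fa-f]",
}


def lua_to_regex(pattern):
    """Translate a simple Lua pattern into a Python regex string.

    Hypothetical helper: handles %-classes from LUA_CLASSES and
    %-escaped punctuation, and refuses constructs it does not understand.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            if i + 1 >= len(pattern):
                raise ValueError("pattern ends with a bare %")
            nxt = pattern[i + 1]
            if nxt in LUA_CLASSES:
                out.append(LUA_CLASSES[nxt])
            elif nxt.isalnum():
                # %b, %f, %1, %D and friends are outside this sketch.
                raise ValueError("unsupported Lua pattern item: %" + nxt)
            else:
                # %. or %- is a literal punctuation character in Lua.
                out.append(re.escape(nxt))
            i += 2
        elif ch in "[-":
            # Sets and the lazy '-' quantifier are not translated here.
            raise ValueError("unsupported Lua pattern item: " + ch)
        elif ch in "|{}\\":
            # Special in regex, but plain characters in Lua patterns.
            out.append(re.escape(ch))
            i += 1
        else:
            # ^ $ ( ) . * + ? are passed through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    rx = lua_to_regex("^/page/(%d+)$")
    print(rx)
    print(re.match(rx, "/page/42").group(1))
```

The translation table is most of the story: for simple URL matching, the difference is largely `%d` versus `[0-9]` and `%.` versus `\.`, minus alternation and counted repetition, which Lua patterns do not have.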
The file utility is a curious beast. Almost by definition, its sole input will be untrusted input. Perversely, people will then trust what file tells them and then go about using that input, as if file somehow sanitized it. You don’t trust the source to tell you what something is, but nevertheless trust what’s inside it? Regardless, if a program should be hardened against attack, file would be a good choice. So Nicholas rewrote it. The new version endeavors to be safe first, accurate second. Given the ease with which chimeric files can be created, I’m not even sure accuracy belongs on the scoring rubric.
These examples have been about code, but as often as not it’s about user experience as well. Stop and consider XKCD: Manuals. The best tools are the ones which do not require (extensive) manuals. I like to think signify is in this category. The xkcd hovertext regarding sudoers is certainly appropriate, since sudo is moving to ports. As hinted later in the thread, I have a replacement utility doas which is very tiny. Perhaps too tiny. We’ll see. This is not the doas retrospective post.
Also popularly cited is Joel’s never rewrite article. A few comparisons. First, business customers may revolt if future releases have fewer features, but many OpenBSD users welcome less if it means simpler. Second, “they had never stopped working on the old code base, so they had something to ship, making it merely a financial disaster, not a strategic one.” Excellent point. Most OpenBSD rewrites are skunkworks kinds of projects, at least to start. Todd didn’t stop updating sendmail while smtpd was in development. New code can reduce the enthusiasm for maintaining the old, but frequently the old and new are worked on by different developers.
Joel’s warning anecdote about the untidy two page long function does parallel something that happened early in doas development. Instead of copying the sudo privilege switching code, I wrote my own. As a result, there was a bug because I forgot to mess with the supplemental groups. I should perhaps have copied the sudo function, but I couldn’t find it. The good news is that the replacement code was easily read and the bug quickly identified. Code does accumulate fixes for obscure edge cases over time, and there is a real cost when a rewrite loses them.
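For readers who haven’t written this kind of code, a minimal sketch of the usual ordering for switching from root to another user follows. This is not the code from doas or sudo; the function name switch_user and the osapi parameter (there only so the sequence can be exercised without root) are assumptions made for illustration. The calls themselves, setgroups, setgid, and setuid, are the standard ones exposed by Python’s os module.

```python
import os


def switch_user(uid, gid, groups, osapi=os):
    """Switch the process from root to (uid, gid, groups).

    Illustrative sketch only. The order matters: once the user ID is
    dropped, the process no longer has permission to change its groups.
    """
    # 1. Supplemental groups first. This is the easy step to forget;
    #    skipping it leaves the process in root's supplemental groups.
    osapi.setgroups(list(groups))
    # 2. Primary group next, while we still have the privilege to set it.
    osapi.setgid(gid)
    # 3. User ID last.
    osapi.setuid(uid)
    # 4. Check that the switch actually took effect before continuing.
    if osapi.getuid() != uid or osapi.getgid() != gid:
        raise OSError("privilege switch did not take effect")
```

Leaving out the first step still produces a program that appears to work, which is why that kind of bug survives casual testing.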
As with jwz’s post, Joel’s article focuses on feature equivalent rewrites. One can try simply reducing functionality in an existing codebase, but then there’s a lot of scaffolding left over. YAGNI (you aren’t gonna need it) in reverse. You did need it, but now you don’t. Like extreme liposuction, the end result is better than before, but it’s still not pretty. Start from scratch, add things back, and maybe find even more unexpected fat.
Pruning and Polishing covered some complementary material. In particular, the polishing section mentions the benefits of homogenization. All of the above examples (isp, nginx, file, sudo) were stylistic outsiders. Let’s pretend rewrites are extreme polishing, like sandblasting. The replacement source code now feels like OpenBSD code.