books chapter fifteen
At last we come to Donald Knuth, who wrote some books. He first programmed on an IBM 650, a decimal computer. Unlike other schools, Case allowed students to actually interact with the machine, instead of simply passing cards to technicians. He contrasts this experience with Dijkstra’s advice to not let students touch a machine at first. Don’s view is that it’s a common error for someone at the top of their field to propose a new, better way of teaching while forgetting that they themselves learned the old way. I have to agree that it’s difficult for us to know what we know, or how, and it’s hard to account for the true value of the time wasted, to use a word, learning things we don’t need.
Seibel asks his second favorite question, does everybody need to read all of The Art of Computer Programming? Don somewhat jokingly admits that maybe even he can’t read them, and he keeps an answer key to the exercises because it would take a really long time to work things out from scratch. I thought that was pretty refreshing. He wrote things down as a way to compile important information, but not necessarily because everything is required knowledge. The information is accessible if you need it, and especially so because it’s written in a consistent voice with consistent terminology. Publishing a great many papers by various authors together wouldn’t be so effective because it would require decoding of each paper’s unique jargon.
This segues into TeX and literate programming, which is probably Seibel’s very favorite topic. Don is a little uncomfortable being an advocate for literate programming, or anything. He’ll tell you what’s good about it, and why he thinks it works, but you should make your own decision.
One rule for technical writing is to say everything twice in complementary ways, formally and informally. You might include a prose version and then a proof, or a proof and then the prose. This helps the reader have a general idea of what’s happening, and then also a precise understanding. And so it should be with programming, with an informal description in English and a formal description in code. I’m not sure how much I agree. There’s a serious risk that the code and comments diverge, which is perhaps less likely with a mathematical proof. Also, I think mathematicians have a culture of studying proofs to verify they prove what is claimed, but we don’t have quite as rigid a culture in software. If the comment says something, how carefully is the code reviewed to verify that? But this may be a matter of perception and perspective.
The literate style does allow one to write what are effectively inline subroutines. So this is an interesting point. Weave and tangle may provide a better means of organizing a program. We spend a lot of effort manually decomposing programs and inventing calling conventions, but this is very much tied to machine calling conventions. It’s not necessarily how we would think about the problem.
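To make the inline subroutine idea concrete for myself, here is a toy sketch of what a tangle step does: named chunks are spliced into place wherever they are referenced. The chunk names, the <<name>> reference syntax, and the tangle function are all my own inventions for illustration; this is not the real WEB or CWEB tool or its syntax.

```python
import re

# Hypothetical chunks: each name maps to a body that may reference other chunks.
CHUNKS = {
    "main": "total = 0\nfor x in data:\n    <<accumulate x>>\n<<report total>>",
    "accumulate x": "total += x",
    "report total": "result = total",
}

# A reference is a line holding only <<chunk name>>, possibly indented.
REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def tangle(name, chunks, indent=""):
    """Expand a named chunk into a list of lines, splicing references in place."""
    lines = []
    for line in chunks[name].split("\n"):
        match = REF.match(line)
        if match:
            # Replace the reference with the chunk body, keeping its indentation.
            lines.extend(tangle(match.group(2), chunks, indent + match.group(1)))
        else:
            lines.append(indent + line)
    return lines
```

Joining the lines of tangle("main", CHUNKS) gives an ordinary loop with no function calls in it, so the decomposition exists only for the reader. The sketch does not guard against chunks that reference each other in a cycle.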
Don read a review for a book about programming tricks that said “I’d fire anyone caught using these tricks.” He thought this would be a great book to learn from, but unfortunately none of the tricks were very good. Coding should be fun. It shouldn’t consist of just pasting libraries together. It’s great that libraries exist, but they shouldn’t be black boxes we use without changing. We should open them up and adapt them to our needs. To paraphrase his point a bit, it’s rare for generic software to fit your needs, or if it does it’s because it’s a gigantic package.
“To me one of the most important revolutions in programming languages was the use of pointers in the C language.” In particular, he’s talking about pointer arithmetic. Of course, it gets a bad rap for allowing bad things to happen, but the idea that p + 1 refers to the next thing is great. “That, to me, is one of the most amazing improvements in notation.”
“Change files are something that I’ve got with literate programming that I don’t know of corresponding tools in any other programmers’ toolkits, so let me explain them to you.” Don then goes on to exactly describe diff and patch. He writes a master copy of a program, then hands it out, and users keep a diff, er ahem change, file that they apply to adapt it to their needs. I remarked last time that keeping a program as an original plus a pile of patches seems like a recipe for catastrophe, but I guess he likes it.
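A rough sketch of how I picture a change file being applied, assuming it boils down to a list of "these lines in the master become those lines" pairs. The apply_changes function and its data layout are hypothetical, made up for illustration, and not the actual format or behavior of Don's tools.

```python
def apply_changes(master, changes):
    """Apply (old_lines, new_lines) pairs to master, in order of appearance."""
    out = []
    pos = 0
    for old, new in changes:
        # Copy master lines through until the old lines match exactly.
        while pos < len(master) and master[pos:pos + len(old)] != old:
            out.append(master[pos])
            pos += 1
        if master[pos:pos + len(old)] != old:
            # The master no longer contains what the change expected.
            raise ValueError("change does not match master: %r" % (old,))
        out.extend(new)
        pos += len(old)
    # Whatever follows the last change is copied unchanged.
    out.extend(master[pos:])
    return out
```

The ValueError branch is the part that worries me. Once the master copy moves on, every user's change file has to be checked against it again.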
Mostly irrelevant, but Don mentions that GDB can debug CWEB programs because they use __LINE__ directives. I suspect that’s a transcription error for #line directives.
On formal verification. “Or TeX, for example, is a formal mess. It was intended to be for human use, not for computer use. To define what it means for TeX to be correct would be incomprehensible. Some methods for formal semantics are so complicated that nobody can comprehend the definition of correctness.”
When testing a program, change mentality. Forget you wrote it. It’s somebody else’s program and you’re trying to find as many bugs as you can. Break it every way possible. A lot of people get stuck doing tests that only show a program works, not tests that break it.
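Here is a small sketch of the difference, using a made-up mean function as the program under test. The function, the hostile inputs, and the find_breakage helper are all hypothetical, chosen only to show the change in mentality.

```python
import math


def mean(values):
    """Hypothetical function under test: arithmetic mean of a sequence."""
    return sum(values) / len(values)


def happy_path_test():
    # Only confirms the one case the author already had in mind.
    return mean([1, 2, 3]) == 2


def hostile_inputs():
    """Inputs picked the way an adversary would pick them."""
    return [[], [float("inf"), float("-inf")], ["1", "2"], [1e308, 1e308]]


def find_breakage(func, inputs):
    """Return (input, problem) pairs for every input that misbehaves."""
    failures = []
    for value in inputs:
        try:
            result = func(value)
        except Exception as exc:
            failures.append((value, type(exc).__name__))
            continue
        if not math.isfinite(result):
            failures.append((value, "non-finite result"))
    return failures
```

The happy path test passes, and every one of the hostile inputs turns up a problem. Whether each of those counts as a bug depends on what mean is supposed to promise, which is a question the happy path test never raises.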
In 1974, Don predicted that in 1984 we’d have the perfect programming language, Utopia 84. We don’t. “I think what happens is that every time a new language comes out it cleans up what’s understood about the old languages and then adds something new, experimental and so on, and nobody has ever come to the point where they have a new language and then they want to stop at what’s understood. They’re always wanting to push further.” I wonder if we’re getting closer to settling down, or if we’ll look back in another ten years and say, nope. “In a way I resent having every language be universal because they’ll be universal in a different way.” Heh. Accidentally Turing Complete is one of my peeves, too.
Go back and read old papers. People are familiar with some of the work of Boyer and Moore, but there’s more to their work than you might get from Wikipedia. I’ve noticed this, that original papers often present variations of algorithms, but only one form gets preserved in the lore. A lot of time is spent rediscovering this knowledge. “The idea that people knew a thing or two in the 70s is strange to a lot of young programmers.”
Read lots of source code, but don’t only read people who code like you.
Blake Ross created Firefox. “Firefox has cut into the formerly overwhelming market share of Internet Explorer, and dominates among technical users.” So this is a bit of a time capsule. Netscape kept adding buttons to the browser to drive traffic to their portal, and wouldn’t let Mozilla include a popup blocker, and so it was time for a fresh start with a small team. Blake wanted to focus on end users, but this was a struggle at times. “We had to transform the culture at Mozilla because it was all based around open source ethos, which says programmers are kings, marketers are sleaze, and everyone else can read the manual.”
“Whereas Microsoft, they win a browser war, so in 2001, they bow out. Which is completely irresponsible, because this is the most used software application in the world, and they just stopped developing it.” This is an interesting point. It’s very true, and it’s generally accepted as a bad thing, but is it? People liked Windows 7, but Microsoft didn’t stop there and we got Windows 8 and 10. There was probably a version of Word that people liked, long ago, but development never stopped there either. And maybe I’m just really cranky, but I would love it if browsers stopped adding more and more features. Can’t anything ever be done?
“It’s hard to convince 500 flesh-and-blood developers that their pet feature may not be desirable to 500 million imaginary users.” I think this is very true, and I see a lot of it. One approach, which I like because it’s pretty easy, is to focus on an audience more like oneself. But if you’re going to target other people, yeah, they may not all be just like you.
After Firefox, Blake went on to found Parakey, although I think they were swallowed by Facebook by the time the book was published.
Mena Trott founded Six Apart. She had a blog, but didn’t like the software available. Enter Movable Type. Originally meant to be donationware, it was surprisingly popular right from the launch. So she and her husband Ben set their sights a little higher and created TypePad, a hosted blog platform. They’d heard bad stories about what happened to companies that took investment money, so they founded their company as an LLC because LLCs are restricted from taking investments. Kind of a surprising decision. Of course, you can change corporate structure, and eventually they did.
One of their early missteps was changing the pricing model of Movable Type to collect more money from commercial users, and suddenly they were the bad guys. “It wasn’t that we didn’t want people to make a living off of Movable Type; it’s just that, if we weren’t making a living, we didn’t think that other people should be making money.”
Livingston asks why there aren’t more female founders, but Mena turns it around. We want more women to go into engineering, but maybe there’s an assumption that engineering is the smart field because all the men work there. Mena thinks design, where a lot of women work, is much harder to learn than engineering. So maybe we should recognize that instead? Kind of a second order sexism, to assume that whatever jobs men have are the good ones?
It’s hard blogging when you’re famous. No joke goes uncritiqued, no statement unanalyzed. Of the interviews so far, I didn’t find this one particularly insightful, but it was an enjoyable read. There’s much more of a human aspect that comes across.
It’s been twenty years since The Mythical Man-Month was originally published. (And now another twenty years since the anniversary edition.) Brooks was sitting next to someone on a plane reading the original, and after the flight asked what he thought. “Hmph! Nothing in it I didn’t know already.” Heh. As ever it seems the people who need to read a book and the people who do read it don’t overlap. But how relevant is such an old book now? That’s what Brooks seeks to answer in this chapter, though I think I can weigh in and say quite a bit because we’ve not done a very good job of assimilating and integrating knowledge in the field.
Conceptual integrity of a product is the most important factor in ease of use, and probably success. These are products designed by a single mind. A lot of Brooks’s chapters focus on how to enable one person to oversee and direct a large team working on a large product. He indirectly makes a point which is worth emphasizing. The user only has a single mind with which to understand the product.
Second system effect redux. As more people use software, software packages grow from hundreds to millions of users, with diverse needs. The easy path is to keep adding features. “The loss of ease of use sneaks up insidiously, as features are added in little increments, and the manuals wax fatter and fatter.” Speaking of ongoing development, he cites a review of Microsoft Word 6.0 that says it has too many features, is slow and bloated. Requires 4MB of RAM!
One might imagine that Brooks contradicted himself, with simultaneous advice to avoid the second system effect and also to build one to throw away. The key difference is whether it happens before or after shipping, and how the scope changes. You’ll need two attempts to build the first version of a product, even with a limited feature set, so plan for that. It’s when designing the successor system, after successfully shipping, that things really go off the rails.
Brooks loves the WIMP interface; windows, icons, menus, pointers. This is a little less revolutionary today, but he has some interesting observations about how it enables great efficiency, but also impedes it. Why is there only one mouse? We have two hands. The desktop metaphor is familiar to users, but in the real world I can pick up a file folder in each hand. His observations seem primarily based on the Mac interface, with the menu at the top, which requires constantly moving the mouse back and forth. Today we have the right click context menu, which addresses this by bringing the verbs closer to the noun. An interface that properly supported two mice would be pretty exciting, though. I spend a lot of time moving windows around Towers of Hanoi style because I’m limited to one operation at a time.
Along those lines, a shoutout to hotkeys. Power users will learn shortcuts for their favorite commands as long as such are discoverable. The menus are always there for options you’re not familiar with, but you’re constantly reminded of a faster way to do things because the hotkey is right there. I appreciate this. Software should teach you as you use it, but of course it gets taken way too far by wizards and tutorials that pop up when I’m trying to get something done. Never have I once started a program after an update and wanted to take a tour of its new features before doing what I wanted to do. There’s a bright line here between revealing a better workflow and imposing it. Or should be.
Brooks predicts the demise of WIMP within a generation, giving rise to speech interfaces. Siri, has it been a generation? Alexa, has speech taken over? Okily Dokily, Googley.
Returning to the throw one away plan, that assumes a simplistic waterfall model. It’s a good plan, but for a broken model. Brooks notes that a lot of his advice assumed a waterfall model. Rapid prototyping and incremental building are better approaches. Get a do nothing version of the program running first, and then add to it, but always have an integrated something to run and test. I am reminded of the tracer bullets approach.
Brooks retracts his advice that every programmer be given access to see the internals of all components. He is now very much in favor of well defined interfaces, with shielded internals. I don’t know that we’ve really resolved this debate. Knuth is obviously at the other end. Brooks advocates for a style of development that’s almost exactly what Knuth decries, gluing together a series of libraries. In Brooks’s world these are all well tested and documented. Maybe that could work, but it really seems like a lot to ask. I’ve invariably been let down by libraries, but have had better, or at least less frustrating, experiences taking some sample code and banging it into shape. Brooks is convinced this is the next step, that will let us make another exponential improvement to tackle complexity, but man, it mostly seems like we’re just stacking the leaning tower higher and higher.
Some other books to read. Barry Boehm and Software Engineering Economics, although the COCOMO model appears obsoleted by a followup, and in any case it’s expensive (the book; I have no idea about the model). Peopleware says you should provide people with spacious, quiet offices. Crazy!
Some more thoughts on shrink wrapped software and attacking the essence of complexity. But still, too much software is provided in the form of applications and not libraries, making it difficult to script and automate.
Finally, he draws a metaphor comparing software engineering to chemical engineering. The field of chemical engineering is rather new, and quite different from chemistry. It’s not about fundamental principles, but about scaling them up to industrial levels. Good point.
A cosmic ray induced off by one meant I skipped a chapter. Going back to the middle, chapter 5.
26. Minimize coupling. They propose the Law of Demeter: only call methods on self, parameters, and objects created within a function. The counterexample would apparently be forest->gorilla->banana->eat(). If you want the gorilla to eat the banana, you have to pass banana as a parameter, not forest. This seems like useful advice, but perhaps difficult to follow. The law is seemingly applied locally to this function, but really it’s about the API of all the other objects and functions. You can detect the violation locally, but not fix it.
27. Enable dynamic configuration. There’s some helpful advice here about the difficulty of changing code vs changing config, but they seem to go overboard. Everything must be configurable? If everything in a program can be this way or that way, that obviously doubles the amount of code. They’re pretty close to advocating putting the entire app in the config, at which point you need another config file to configure the configuration.
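Going back to the Law of Demeter in tip 26, here is a minimal sketch of the gorilla example as I understand it. The classes and both snack functions are hypothetical, written only to contrast the chained call with the version that takes the banana as a parameter.

```python
class Banana:
    def __init__(self):
        self.eaten = False

    def eat(self):
        self.eaten = True


class Gorilla:
    def __init__(self, banana):
        self.banana = banana


class Forest:
    def __init__(self, gorilla):
        self.gorilla = gorilla


def snack_via_forest(forest):
    # Violation: this function has to know how Forest and Gorilla are laid out.
    forest.gorilla.banana.eat()


def snack(banana):
    # Conforming: only calls a method on its own parameter.
    banana.eat()
```

Note that somebody still has to dig the banana out of the forest before calling snack. The coupling moves to the caller unless Forest and Gorilla grow methods to hand it over, which is why the fix is not local.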
“Programming in machine code is like eating with a toothpick.” Instead we’d prefer to write assembly code, which is a text version of the code. Then a program called an assembler turns our text into the appropriate binary for a computer. But this doesn’t help us to write code for a different computer. For that, we want a high level language, so we can express a concept in one language, then have a compiler translate it to machine code for multiple computers. However, this means giving up some capabilities. “Many processors have bit-shifting instructions. As you’ll recall, these instructions shift the bits of the accumulator to the right or left. But almost no high-level programming languages include such operations.” Really?
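For the record, plenty of high-level languages do have shift operators. Python, like C, spells them << and >>. A minimal sketch, with pack_rgb and unpack_rgb as made-up helper names:

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channels into one integer using left shifts."""
    return (r << 16) | (g << 8) | b


def unpack_rgb(value):
    """Recover the three channels with right shifts and an 8-bit mask."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
```

Packing 0x12, 0x34, 0x56 this way gives 0x123456, and unpacking reverses it.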
Anyway, we’re going to take a tour of ALGOL, which nobody uses, but will conveniently let us avoid certain arguments about what’s best. We multiply some numbers and print the result. “These days, the familiar x multiplication symbol is usually not allowed in programming languages because it’s not part of the ASCII and EBCDIC character sets.” I guess there’s another typographic failure here, because that sure looks like the x character in ASCII. “ALGOL also defines an arrow (↑), another non-ASCII character, for exponentiation.” The arrow came through ok, though.
COBOL, PL/I, BASIC, all caps oh my. Pascal, Ada, C. Lots of languages out there. Everything is kind of ALGOL like and runs on von Neumann computers, except for LISP. APL is unusual.
I’m still pondering how valuable reusable software components are. Brooks and Knuth make compelling but contrasting arguments. If we assume the availability of robust libraries, that surely seems convenient, but can we assume that?