stylish bugs

Whenever a bug is discovered, it’s immediately apparent that three things have gone wrong. They used the wrong language, and this never would have happened if they’d written it in the bugfree language. They used the wrong development methodology, and this never would have happened if they’d used the prevent this bug technique. And finally, they ignored best practice and didn’t use the proper coding style. The bug would have been extremely obvious if only they’d replaced all v_w_ls with underscores in variable names. Just look at the word v_w_ls. It pops right out.

I recently contemplated a new style which seemed intriguing, but let’s begin with a survey of a few coding styles. How will our new contender rank?

yoda

The prototypical error free style is putting the constant first in a condition. if (0 == x) prevents mistakes like assignment in the condition if (x = 0). I’ve never used a compiler that didn’t warn on this, however. And the warning also catches mistakes like if (x = y) where swapping if (y = x) doesn’t prevent the problem.

I think this is common to coding styles. They only prevent a limited subset of the error class and are typically subsumed by better checks.

It’s also interesting that this bug may occur as a result of another style requirement, to always write out expressions. I’d probably just write if (!x) in many cases, but oh no, that’s not explicit, you’re only one character away from doom, we need to write out the full thing.

There was an attempt to slip such a bug into linux in 2003. And of course, there’s a comment that you should switch the conditional around, although it’s less clear why a hacker writing directly to the cvs repo would bother following your style.

Another security bug was in the X server because they check if (geteuid == 0) without calling the function. Sure enough, all the very smarts had the bright idea that this wouldn’t have happened if only they had followed proper style and written if (0 == geteuid). Nope, that does nothing.

So another observation is that the set of bugs actually preventable by a coding style is likely much smaller than the set of bugs its advocates pattern match on. Zero in if I see, yoda fix it would I say.

ASI

Javascript will automatically insert semicolons into code based on some rules about the next token. As a result, you can usually omit them and get the same result. There are two gotchas however. Sometimes the compiler inserts a semicolon where you don’t expect it, and sometimes the compiler doesn’t insert a semicolon where you do expect it.

Surprise number two leads to the common style that one should always type out the semicolon. Fair enough, but weirdly people believe this will save them from surprise number one. I’ve been told hundreds of times that since the rules are difficult to understand, they will always add semicolons and then they don’t have to learn the rules. But that’s not at all how the compiler works. The compiler is not going to look over your code and notice all the manually placed semicolons, and then turn off ASI. The compiler will always insert semicolons where it wants to.

This is another common theme to style guides. There’s a language feature which we don’t have time to learn, so we avoid it, and tell everyone to do things this other way. But the feature is still there! Anything the compiler will do implicitly or automatically is going to happen even if you don’t intentionally rely on it.

braces

    if ((err = sha1update(&ctx, &client)))
        goto fail;
        goto fail;
    if ((err = sha1update(&ctx, &server)))
        goto fail;

No discussion of proper style would be complete without the all time record holder for most intense forum reaction. Modern languages don’t even let you program like this anymore. Every language also now comes with a fmt utility.

    if ((err = sha1update(&ctx, &client))) {
        goto fail;
    }
    goto fail;
    if ((err = sha1update(&ctx, &server))) {
        goto fail;
    }

I don’t have too much to say that hasn’t already been said, except that the first example is more glaringly wrong to me than the second. Even the most obvious example of the most obviously correct style contains a lot of suppositions about how exactly the error is introduced and detected. (“This is why I always commit my code first, then run go fmt, then diff the result and inspect the changes before rebasing.” Oh, really? I doubt I’m alone in never inspecting the changes go fmt makes, but perhaps I am mistaken.) I would advise caution when reviewing merges, but I guess as long as you use the proper style, it can’t happen to you.

Why are we even looking at the return value of a function that can’t fail? Oh right, every error must be checked. Says so in the rulebook.

const

There was a bug, and a reply, and a taunt, but somewhere in there might be a really good idea. What if we make every intermediate value const? I kinda think the SSA transform is the compiler’s job, but at the same time, compilers use SSA because it makes reasoning about the code easier. So why not use it if we want to reason about our code?

I think the style here was incidental to detecting the bug, but certainly many forms of incorrect variable use could be identified if variables can’t be reused. There’s no obvious downside, except for a bit more typing. Well, maybe. If you end up with 47 variables in scope at the end of the function, does that increase the difficulty of choosing the right one? What happens if your code has branches? After thinking it over for a little while, I decided to investigate further and try things out. A little field experiment.

For completeness, we should consider an example that wouldn’t be helped.

void setpsize(dev_t *dv)
{
    dv->asize = db->asize << dv->ashift;
}

I’ve seen many variations on this bug, so again, we need to be cautious about overfitting our style to catch one bug, yet I was curious to see what would fall out by shaking this particular tree.

With that in mind, I set out converting my favorite testing project to use lots of const. And made it about two minutes.

const n, err = strconv.Atoi("10")

Doesn’t compile. It will compile if I change the const to var (it was originally := shorthand), but that’s obviously not what we want. From reading the go spec for constant declarations, I can’t see why it would be different from variable declarations. Nevertheless, the compiler is the ultimate arbiter of what the compiler accepts. Given the prevalence of multiple return functions in go, this is a pretty big roadblock. One weird trick to eliminate arguments about favored styles.

Go also lacks a ternary operator (or expression yielding if) which really puts a damper on one’s ability to adopt this style universally. Sometimes language decisions can show up as unexpected limitations.

The good news is I also have some C code I’ve been poking. Let’s give it a whirl. I’d describe myself as a moderately enthusiastic const user, but this will be my first attempt really pushing const onto every variable. A few trivial functions were easily converted, but without meaningful improvement. Does tossing const onto the fd returned by open really make a difference? Fortunately, I happened upon a function performing some calculations fairly similar to the original bug hiding function.

void
yuv2rgb(float Y, float u, float v, uint8_t *rgb)
{
    Y /= 255.0;
    u /= 255.0;
    v /= 255.0;
    Y = 1.1643 * (Y - 0.0625);
    u = u - 0.5;
    v = v - 0.5;

    float r = Y + 1.5958 * v;
    float g = Y - 0.39173 * u - 0.81290 * v;
    float b = Y + 2.017 * u;
    rgb[0] = r * 255;
    rgb[1] = g * 255;
    rgb[2] = b * 255;
}

We’ve got ourselves a handful of operations and potential intermediate values. The code can look correct, but are we sure we’ve used the inputs and assigned to the outputs correctly? I have a vague understanding of the algorithm, though to be honest, I just cribbed it from elsewhere, and I’m not sure what to call the intermediates, but we work with what we’re given.

void
yuv2rgb(const float Y, const float u, const float v, uint8_t *rgb)
{
    const float Yscaled = Y / 255.0;
    const float uscaled = u / 255.0;
    const float vscaled = v / 255.0;
    const float Yadjusted = 1.1643 * (Yscaled - 0.0625);
    const float uadjusted = uscaled - 0.5;
    const float vadjusted = vscaled - 0.5;

    const float r = Yadjusted + 1.5958 * vadjusted;
    const float g = Yadjusted - 0.39173 * uadjusted - 0.81290 * vadjusted;
    const float b = Yadjusted + 2.017 * uadjusted;
    rgb[0] = r * 255;
    rgb[1] = g * 255;
    rgb[2] = b * 255;
}

I was able to detect some bugs caused by using the wrong variable, like accidentally using Y instead of Yadjusted in the RGB assignment, thanks to the unused variable warning. But this bug was actually introduced by creating the new variables. The original function doesn’t have multiple Y variables to choose from. What would really concern me in a function like this is mixing up u and v. What if I accidentally set b using v? There’s no warning, because all the variables are used at least once somewhere.

Now I’m less impressed with the prospects of const first programming. Maybe we’ve fallen victim to overfitting on a lucky first hit. There’s likely some other way to write yuv2rgb, or arguments to be made about whether it’s representative of C code in general, but in my opinion it’s a pretty good match for the kind of function that contains the errors we’re trying to flush out.

The best prevention for assigning the wrong type of int to the wrong variable remains type checking. The C language won’t do this automatically, but we can use tooling that does.

conclusion

Write the code you like to write in a way that feels natural? That seems like the best way to understand it and avoid mistakes. Recognize that it’s more a personal preference than a correctness issue.

I’ll toss in one more, very personal, style. I nearly always name functions in go, and thus avoid closing over loop variables, but that’s not why I picked that style. I only made the connection much later, after many missed opportunities to deride clown developers for defining functions inside for loops. My delightfully bug free code was not the result of skillful intention or refined technique, just happy coincidence.

And that about sums it up. Sometimes we get lucky, and sometimes other people get unlucky, and there’s always something to point to, haha, that never happens here, but we need consider that it could be more luck than skill when discussing isolated incidents. May as well attribute it to keyboard layout. Well actually, by separating comma and semicolon... There’s always somebody who gets the green jelly bean.

Posted 12 Aug 2025 08:13 by tedu Updated: 12 Aug 2025 08:13
Tagged: programming thoughts