invented by openbsd
The primary product of the OpenBSD project is the OpenBSD operating system, but sometimes other artifacts are produced as byproducts. Avant-garde web site design, funny email threads. Also, reusable code that can be beneficial to other developers, outside the strict confines of OpenBSD.
Unfortunately, sometimes this code doesn’t see the widest distribution. Often this can be the result of Not Invented Here syndrome, though other times it takes the appearance of a more pernicious problem. Invented by OpenBSD.
It was brought to my attention that NetBSD recently imported two OpenBSD functions, but reimplemented them in such a way as to be dangerously incompatible.
reallocarray
reallocarray
was introduced in OpenBSD to reduce (though not solve) the problem of integer overflows. More about reallocarray in OpenBSD. The OpenBSD version is pretty short. And, as the initial commit says, it’s in a separate file so others can avoid reinventing the wheel.
Turns out reallocarray is pretty good at preventing integer overflows. So good, in fact, NetBSD imported it to fix an overflow in the regex library. This bug had been fixed in OpenBSD some months prior as part of a general sweep to replace all suspect looking allocations.
But then NetBSD decided to improve their manpage by adding an incorrect statement about zero sized allocations. The OpenBSD man page is quite clear:
If size or nmemb is equal to 0, a unique pointer to an access protected,
zero sized object is returned.
There is no ambiguity. In fact, the initially imported implementation worked the same way on NetBSD. And so, in order to prevent anyone from pointing out that all implementations work the same way, NetBSD was forced to change their implementation. First, they added a reallocarr function with different semantics. Then they changed reallocarray to be a wrapper around that function, with newly incompatible semantics.
What’s different? With a zero sized allocation, the new and “improved” reallocarray will free the input pointer (something the original never did) and return NULL. At which time the caller will free the pointer again. That’s bad.
Was it really worth replacing one ten line function with another ten line function just to be different?
Oh, hey, look, reallocarray was fixed to behave like OpenBSD. Now it’s a 13 line replacement for a 10 line function. But at least it doesn’t have any code written by an OpenBSD developer.
strtonum
strtonum
was introduced in OpenBSD a while back to solve a more mundane problem, so there aren’t any cool blog posts showing you how to use it. I did, however, write up some notes about the design of strtonum when I got tired of people misunderstanding it. The OpenBSD version has some subtleties, but you can generally work out what it does. Or, as a last resort, read the man page.
NetBSD also imported this function, but again as a wrapper around another function, strtoi. I’ll spare you the trouble of looking up what strtoi does and paste the code here:
__TYPE
INT_FUNCNAME(, _FUNCNAME, _l)(const char * __restrict nptr,
char ** __restrict endptr, int base,
__TYPE lo, __TYPE hi, int * rstatus, locale_t loc)
{
return INT_FUNCNAME(_int_, _FUNCNAME, _l)(nptr, endptr, base, lo, hi,
rstatus, loc);
}
In case you were wondering, yes, you can read the NetBSD man page for this function, but that would be incorrect. The man page text was copied from OpenBSD and describes the behavior of original implementation, not the behavior of the function on NetBSD.
And as with reallocarray, it’s not like strtoi had any precedent. It was introduced in the same commit as strtonum. But why have one function when you can have two?
portability
The OpenBSD implementations of reallocarray and strtonum are deliberately designed to only depend on other standard C functions. The functions themselves are not widely portable, but the implementations are, so that developers are free to take the code and incorporate it. You aren’t required to also take a pile of other code because they aren’t wrappers built upon wrappers. The implementations are expressly permissively licensed to make this even easier, because multiple incompatible implementations damages the ecosystem.
Oh, and it’s not like this hasn’t happened before. Want to know a good way to get ssh to segfault? Pick a string function used by OpenSSH and add it to your libc, but reverse the arguments.
Long ago, the tactic used to keep OpenBSD functions like strlcpy out of the ecosystem was to simply not import them. But eventually, that tactic was overrun by all of the software including its own copy. The new tactic seems to be introducing incompatible versions of the same functions, so that nobody knows which ones are safe to use.
others
When writing this post, I’d forgotten about a few other functions. For completeness, I’ll mention that NetBSD also has consttime_memequal
and explicit_memset
functions, clear analogs to the timingsafe_bcmp
and explicit_bzero
functions in OpenBSD. That’s ok; I’m less troubled by them. I think it’s a tad NIH, but at least the functions have different names. Different names can cause confusion and more work when trying to write portable software, but it’s not nearly as toxic as introducing implementations with the same name but different behaviors.