sometimes the knote comes early
Some bugs, some ambiguities, some assumptions, some bad results. Nothing went too seriously wrong, but it seems like an interesting case study in code evolution. I had nothing to do with finding or resolving the issues, I’m just commenting.
I first learned about the problem when some OpenBSD developers reported that cmake was hanging during builds. This led to a few issues on the libuv GitHub, pointing to a bug in kevent, which was eventually fixed.
Backing up, when you want to deal with multiple files without using threads, you use something like select or poll. This works great because everything is a file. Except the things that aren’t, like signals and processes. Enter event libraries which abstract over this. I believe libevent was among the first; libuv is another.
At the same time, kernel developers were observing the problem and thought maybe there should be a better systems level solution. Enter kevent. So now why bother with the event library? Well, if you ask me, using kevent is easier than using libevent, so I wouldn’t bother. But maybe you’ve already written the code to use the library, or you care about systems that don’t have kevent. And the event library itself can be updated to use the new interface as a backend.
And that’s where we are with libuv. It was recently updated to use kevent to watch for when a child process exits, instead of the old SIGCHLD signal based approach. However, kevent doesn’t wait for the child, you still have to do that. And now we’re getting really close to the matter at hand.
Why would libuv programs like neovim and cmake hang when using kevent? Well, what happens is that libuv calls kevent, kevent says a child exited, libuv calls waitpid with WNOHANG, gets back 0 indicating no child to wait for, and libuv returns to kevent for the next event. Except there are no more events. The child exiting was the last event. And now we’re stuck. Here’s a good investigation.
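To make the failure concrete, here’s a minimal sketch in C of what a nonblocking waitpid says about a child it can’t collect yet. This is not libuv’s code. The name wnohang_on_live_child is made up for illustration, and the child here is simply still running (parked on a pipe), which stands in for the short window where kevent has announced the exit but the process isn’t waitable.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not libuv code. Forks a child that stays alive
 * until the parent closes a pipe, then asks waitpid(WNOHANG) about it
 * while it is still running. Returns what that nonblocking call said,
 * or -1 on setup failure.
 */
pid_t
wnohang_on_live_child(void)
{
    int fds[2], status;
    pid_t pid, early;
    char c;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: block until the parent closes the write end */
        close(fds[1]);
        if (read(fds[0], &c, 1) == -1)
            _exit(1);
        _exit(0);
    }
    close(fds[0]);

    /* The child is not waitable yet, so this returns 0, not an error. */
    early = waitpid(pid, &status, WNOHANG);

    /* Clean up: release the child, then block until it is collected. */
    close(fds[1]);
    while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
        continue;
    return early;
}
```

The return value of 0 is the trap. It isn’t a failure, so there’s nothing to retry on, and an event loop that treats it as “nothing to do” goes back to sleep with the child still uncollected.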
Later it was determined this problem exists with other kqueue systems. Instead of sticking with signals just for OpenBSD, the more complete fix is to call waitpid without WNOHANG. Block until the child is ready, then collect it. kevent has told us it’s exiting, so we won’t block for long. And now everything works. The event doesn’t get lost; the event loop doesn’t get stuck. Here’s the fix for that.
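Here’s a rough sketch of the pattern the fix lands on: wait for the exit notification, then collect with a blocking waitpid. It’s written from the kqueue manual, not copied from libuv, and run_and_collect is a hypothetical name. The pipe exists only so the child can’t exit before the event is registered. On systems without kqueue the sketch assumes there’s no notification to wait for and goes straight to the blocking wait.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

#if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1
#endif

/*
 * Hypothetical helper: fork a child that exits with `code`, wait for the
 * kevent exit notification where kqueue exists, then collect the child
 * with a blocking waitpid. Returns the exit code, or -1 on failure.
 */
int
run_and_collect(int code)
{
    int fds[2], status, ok = 1;
    pid_t pid, r;
    char c;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: stay alive until the parent has registered its event */
        close(fds[1]);
        if (read(fds[0], &c, 1) == -1)
            _exit(127);
        _exit(code);
    }
    close(fds[0]);

#ifdef HAVE_KQUEUE
    struct kevent kev;
    int n, kq = kqueue();

    if (kq == -1) {
        ok = 0;
    } else {
        /* ask to be told once when this pid exits */
        EV_SET(&kev, pid, EVFILT_PROC, EV_ADD | EV_ONESHOT, NOTE_EXIT, 0, NULL);
        if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
            ok = 0;
    }
#endif

    close(fds[1]); /* release the child so it can exit */

#ifdef HAVE_KQUEUE
    if (ok) {
        /* sleep until the exit notification arrives */
        do {
            n = kevent(kq, NULL, 0, &kev, 1, NULL);
        } while (n == -1 && errno == EINTR);
        if (n != 1)
            ok = 0;
    }
    if (kq != -1)
        close(kq);
#endif

    /* No WNOHANG: the exit was announced, so this will not block for long. */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    if (!ok || r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A call like run_and_collect(7) should hand back 7. The part that matters is the last loop: the notification only says the child is on its way out, and the blocking wait absorbs whatever gap is left.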
For good measure, OpenBSD has now moved the knote call to close the race. Here’s that commit.
Working bottom up, is the race condition a bug? The kevent manual doesn’t literally guarantee that waitpid(WNOHANG) will work. But come on, that’s a reasonable assumption, right? Probably. What does kevent really (really) mean when it says something is ready? Does it mean that a nonblocking call will return data? Or does it mean that a blocking call will return data without blocking? It can’t totally guarantee the latter. There may be various locks to be acquired, etc., which will prevent instantaneous success. And in the extreme, are nonblocking calls required to return data ever? I think a low quality but still conforming implementation could always just return EAGAIN. (Check my facts!)
This seems like needless pedantry, but I would argue that real systems do not behave ideally, and that should factor into our assumptions. Our model of kevent should be that when it says something is ready, that means we can perform a blocking operation and it will complete “soon enough”. And kevent should endeavor to keep that gap as small as possible. This set of assumptions seems like it leads to the most robust software.
The problem libuv experienced is one I like to call accidentally nonblocking. It’s easy to imagine how this happened. When you get SIGCHLD, the traditional way to handle that is to call waitpid in a loop, because you don’t know how many or which children exited, using WNOHANG so you don’t get stuck. At least not without further complications involving sigaction and siginfo_t. (Exercise for the adventurer to assess the reliability of this technique.) So switching the event source from signal to kevent but leaving the waitpid call as is seems reasonable. And it works. Well enough, anyway.
Normally the result of accidentally nonblocking software is excessive system calls, possibly busy spinning, as it futilely searches for the file descriptor that’s ready to proceed. The situation unfolded a little differently here. The nonblocking call only happened once, but resulted in a dropped event, and then the libuv state machine and the kevent state machine became desynced. This is one of the common hazards with kevent. Older functions like poll and select are generally resilient to missed events, because you just poll again and you get a second chance. An exit notification from kevent behaves like an edge triggered event: once kevent gives it to you, that’s it, and now it’s your responsibility to see that it’s handled properly.
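The second chance you get from poll is easy to demonstrate. This is a small sketch with a hypothetical helper, count_ready_polls, that leaves one unread byte in a pipe and asks poll about it repeatedly.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: put one byte in a pipe, then poll the read end
 * `rounds` times without ever reading it. Returns how many of those
 * polls reported the descriptor readable, or -1 on failure.
 */
int
count_ready_polls(int rounds)
{
    struct pollfd pfd;
    int fds[2], i, hits = 0;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    for (i = 0; i < rounds; i++) {
        pfd.revents = 0;
        /* zero timeout: just ask, never wait */
        if (poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN))
            hits++;
    }

    close(fds[0]);
    close(fds[1]);
    return hits;
}
```

Every round reports the descriptor ready, because poll describes the current state and doesn’t hand over a one time message. Ignore the answer and it’s still there next time you ask.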
Of course, libuv had no way of knowing the race condition bug was lurking in kevent. There’s a lot of surface to kevent, and some of it is exercised more thoroughly than other parts. I’ve personally used the EVFILT_PROC facility, but only with blocking waitpid. I would not describe kevent as fragile, but it is precise, and not resilient to misuse or mistakes. But ultimately, we achieve reliable software by fixing bugs, not by papering over them.
This is probably a common occurrence when switching over from an old API to a new one. There are two happy paths, the old way tested by all the old software, and the new way tested by all the new software. Despite apparent compatibility, the half complete conversion is the unhappy middle ground with unforeseen edge cases.
Everything is a little bit better now.