All the way back in 2003, I was at the hackathon in Calgary and on the last day was fooling around trying to get xmms to stop skipping. xmms used threads to play music while also processing UI events and updating the playlist, but if the thread reading MP3 tags from disk blocked for too long, the music would also stop. Despite the appearance of multiple threads, that was just some fakery performed by the userland library; there was still only one real thread of execution running. The obvious solution was to get more than one real thread going.
I knew the rfork system call could be used to create processes which shared an address space, so that’s what I started with. I built a few functions to wrap wait() with pthread functions, added some crude mutexes that used nanosleep() to avoid spinning too much, and basically had enough going to run xmms. It was really rough though. Exiting xmms would only stop the first thread; the music-playing thread would continue running until the end of the song, then crash. But it proved the concept could work. The name rthreads, of course, meant “rfork threads”. The very first rthreads commit, to make rfork a little more useful, happened April 2, 2004. I’m not sure if this means I sat on the diff for a year or if I misremember and only started work on rthreads in 2004, but in either case, the prototype was running at least that long ago.
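For the curious, a “crude mutex that uses nanosleep()” can be sketched in a few lines. This is not the original prototype code, which isn’t shown here; it’s a rough reconstruction of the idea, assuming C11 atomics and plain pthreads for the demo. All of the names (crude_mutex, crude_lock, crude_unlock, crude_demo) and the 1 ms nap are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical names throughout; not the original rthreads code. */
struct crude_mutex {
	atomic_flag locked;
};

static void
crude_lock(struct crude_mutex *m)
{
	/* Nap 1 ms between attempts; the interval is an arbitrary choice. */
	const struct timespec nap = { 0, 1000000 };

	/* test_and_set returns the old value: true means somebody holds it. */
	while (atomic_flag_test_and_set_explicit(&m->locked,
	    memory_order_acquire))
		nanosleep(&nap, NULL);
}

static void
crude_unlock(struct crude_mutex *m)
{
	atomic_flag_clear_explicit(&m->locked, memory_order_release);
}

static struct crude_mutex demo_lock = { ATOMIC_FLAG_INIT };
static long demo_counter;

static void *
demo_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 1000; i++) {
		crude_lock(&demo_lock);
		demo_counter++;		/* protected by demo_lock */
		crude_unlock(&demo_lock);
	}
	return NULL;
}

/* Run four workers against one counter and return the final value. */
static long
crude_demo(void)
{
	pthread_t t[4];

	demo_counter = 0;
	for (int i = 0; i < 4; i++) {
		int rc = pthread_create(&t[i], NULL, demo_worker, NULL);
		assert(rc == 0);
	}
	for (int i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return demo_counter;
}
```

Sleeping instead of spinning keeps a waiter from burning the CPU, at the price of waking up late: a thread that just missed the lock naps for the whole interval even if the lock is released right away.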
That experiment was thrown away, but in 2005 I came back to the idea. This time, I added some basic kernel code to support threading, so that all the threads in a process would at least exit cleanly at the same time. This led to a talk at EuroBSDCon in Basel in November 2005. Finally, rthreads was committed on Dec 3, 2005. The library was a little more complete, but just by glancing at the cvs logs for rthread.c, you can see we made it to revision 1.25 by the end of the month.
Lots of people helped out in the early days once the code was in the tree and available for hacking. Things moved pretty quickly, fleshing out the library functions and fixing up the obvious bugs, but we weren’t quite done. Then came the long pause. No commits to rthread.c from January 2006 until May 2007, and then not a second commit until June 2008. I had somewhat lost interest (and had long since stopped using xmms), and the many remaining problems were fairly large and vague, and involved stuff like signals which aren’t very fun to play with. So ends chapter one.
The man who really brought rthreads back to life was guenther@. He came along and made it work, fixing all those pesky corner cases to actually do what the POSIX standard dictates. He’s done a ton of work, definitely the hard part of the 80/20 work split. And he got the honor of flipping the switch, making rthreads the default implementation. The switch to rthreads was not entirely painless. Once the change was made, all sorts of minor incompatibilities were revealed, but they were all resolved in short order.
One nice thing I’ll say about the pthreads API, as an implementer, is that it provides a fairly solid abstraction. Programmers still find ways to intentionally or unintentionally depend on implementation-specific details, but the internals are secure against fiddling. librthread was generally a binary-compatible drop-in for libpthread.
One interesting change is that to really support thread local storage, we had to move beyond rfork, and a new system call, tfork(), was introduced. rfork itself has been totally removed. At this point, I think rthreads stands for “real threads”, as opposed to the “pseudo threads” provided by the previous libpthread.
Sadly, I didn’t get to go to the rthreads hackathon, but a bunch of people did and worked on some finishing polish. (OK, I could have gone; it’s only my own fault I wasn’t present.) Here are links to their writeups. As is typical at a hackathon, lots of unrelated stuff also happened, but some important threading work was in the mix.
Finally, here’s some more commentary from me on the state of threaded programming in general.
I never fully elaborated on the threads as state machine comment. I’ve seen a few programs designed very similarly to the traditional state machine model, where the program does only one thing at a time, but instead of a loop around a switch, it uses threads. It doesn’t use concurrent threads, only one at a time, but after each event is processed, it’s placed on the appropriate queue for the next thread. For some problems, I think this is conceptually a little easier to understand, and with a userspace threading library it’s about as fast. I believe it can also solve some bugs where a typical async main event loop has to decide which event to process next and does so poorly under certain conditions.
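To make the threads-as-state-machine idea a bit more concrete, here’s a minimal sketch. It isn’t taken from any of the programs I’m thinking of; the two states (“parse” and “handle”), the integer events, and every name in it are hypothetical. Each state is a thread blocked on its own queue, and finishing an event just means putting it on the next state’s queue. The demo feeds one event at a time and waits for it to come out the other end, so only one thread is ever doing work.

```c
#include <assert.h>
#include <pthread.h>

#define QCAP 8

/* A tiny bounded queue of int events; each state reads from its own. */
struct queue {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	int items[QCAP];
	int head, count;
};

static void
queue_init(struct queue *q)
{
	pthread_mutex_init(&q->mtx, NULL);
	pthread_cond_init(&q->cv, NULL);
	q->head = q->count = 0;
}

static void
queue_put(struct queue *q, int ev)
{
	pthread_mutex_lock(&q->mtx);
	while (q->count == QCAP)
		pthread_cond_wait(&q->cv, &q->mtx);
	q->items[(q->head + q->count++) % QCAP] = ev;
	pthread_cond_broadcast(&q->cv);
	pthread_mutex_unlock(&q->mtx);
}

static int
queue_get(struct queue *q)
{
	int ev;

	pthread_mutex_lock(&q->mtx);
	while (q->count == 0)
		pthread_cond_wait(&q->cv, &q->mtx);
	ev = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->count--;
	pthread_cond_broadcast(&q->cv);
	pthread_mutex_unlock(&q->mtx);
	return ev;
}

/* One state: take an event, do this state's work, pass it along. */
struct stage {
	struct queue *in, *out;
	int (*step)(int);
};

static int parse_step(int ev) { return ev + 1; }	/* stand-in work */
static int handle_step(int ev) { return ev * 2; }	/* stand-in work */

static void *
stage_main(void *arg)
{
	struct stage *s = arg;

	for (;;) {
		int ev = queue_get(s->in);
		if (ev < 0) {			/* negative event: shut down */
			queue_put(s->out, ev);
			return NULL;
		}
		queue_put(s->out, s->step(ev));
	}
}

/* Push events 1..3 through both states, one at a time; return the sum. */
static int
pipeline_demo(void)
{
	struct queue q_parse, q_handle, q_done;
	struct stage parse = { &q_parse, &q_handle, parse_step };
	struct stage handle = { &q_handle, &q_done, handle_step };
	pthread_t t1, t2;
	int rc, sum = 0;

	queue_init(&q_parse);
	queue_init(&q_handle);
	queue_init(&q_done);
	rc = pthread_create(&t1, NULL, stage_main, &parse);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, stage_main, &handle);
	assert(rc == 0);
	for (int i = 1; i <= 3; i++) {
		queue_put(&q_parse, i);
		sum += queue_get(&q_done);	/* wait before sending the next */
	}
	queue_put(&q_parse, -1);
	(void)queue_get(&q_done);		/* drain the shutdown marker */
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return sum;
}
```

The “which event next?” decision that a main event loop has to make by hand is left to the queues here: each state simply takes whatever is at the head of its own queue.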