syzkaller found a bug
A common problem for operating system fuzzers is breaking the system they’re running on. Some forms of damage are expected, some are not, and sometimes it’s difficult to tell which is which.
panic: bad dir
There are a few possible causes for this, but inspection revealed that the most likely one was a directory entry missing its nul terminator. The timing certainly seemed suspicious. Could there be an off by one?
memset(newdirp->d_name + (cnp->cn_namelen & ~(DIR_ROUNDUP-1)), 0, DIR_ROUNDUP);
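A quick way to check the off by one theory is to replay the arithmetic in userland. This is only a sketch: it assumes DIR_ROUNDUP is 4 and that the name is copied in after the memset. The buffer size and the helper name_is_terminated are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define DIR_ROUNDUP 4   /* assumed value; must be a power of two */
#define NAME_BUF 256    /* hypothetical buffer size for this sketch */

/*
 * Hypothetical helper: zero one DIR_ROUNDUP-sized chunk starting at the
 * name length rounded down, then copy the name, and report whether the
 * byte just past the name is nul.
 */
static int
name_is_terminated(const char *name, size_t namelen)
{
    char d_name[NAME_BUF + DIR_ROUNDUP];

    if (namelen >= NAME_BUF)
        return 0;

    /* Poison the buffer so a missing terminator is visible. */
    memset(d_name, 'X', sizeof(d_name));

    /* Same expression as the kernel line, with local names. */
    memset(d_name + (namelen & ~(size_t)(DIR_ROUNDUP - 1)), 0, DIR_ROUNDUP);

    /* Assumption: the name is copied after the zeroing. */
    memcpy(d_name, name, namelen);

    return d_name[namelen] == '\0';
}
```

The rounded down length is never more than DIR_ROUNDUP - 1 below namelen, so the zeroed chunk always covers the byte at namelen. Under these assumptions there is no off by one.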
Actually no. syzkaller had found a way to create filesystem corruption through one of the “expected” damage paths, but the test case was a little obfuscated. More study revealed it was calling mknod to create a new device that happened to be equal to /dev/sd0c, opening it, and then calling pwrite to write some garbage to a random spot.
mknod("banana", 0777, 0x0402);
open("banana");
pwrite(3, "oops", 4, 0x9000);
Further complicating the matter was that syzkaller didn’t know that pwrite is one of the magic syscalls that takes a padding argument before off_t. This didn’t affect the test, per se, but makes it harder to interpret because syzkaller calls things directly. (The actual syscall in use was the iovec variant, pwritev.)
syscall(SYS_pwritev, r, 0x200002c0, 1, 0);
If you read the man page for pwritev, that looks correct. But inspecting src/sys/kern/syscalls.master reveals that the fourth argument is actually a pad argument, and the offset is the fifth argument. So the call above was writing to an offset that was not zero.
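To make the shift concrete, here is a toy model of the argument slots. It is not kernel code: the struct, the helper names, and the stale value are all invented for illustration, and the only thing taken from the text is the slot order (fd, iov, iovcnt, pad, offset).

```c
#include <stdint.h>

/*
 * Toy model of the five argument slots, in the order the text gives:
 * fd, iov, iovcnt, pad, offset. Not the real kernel structure.
 */
struct pwritev_slots {
    int64_t slot[5];
};

/* Hypothetical decoder: the offset is read from the fifth slot. */
static int64_t
decoded_offset(const struct pwritev_slots *a)
{
    return a->slot[4];
}

/*
 * Four arguments, as the man page prototype suggests. The offset lands
 * in the pad slot and the fifth slot keeps whatever it held before.
 */
static void
fill_like_manpage(struct pwritev_slots *a, int fd, uint64_t iov,
    int iovcnt, int64_t off)
{
    a->slot[0] = fd;
    a->slot[1] = (int64_t)iov;
    a->slot[2] = iovcnt;
    a->slot[3] = off;
}

/* Five arguments, with an explicit zero pad before the offset. */
static void
fill_with_pad(struct pwritev_slots *a, int fd, uint64_t iov,
    int iovcnt, int64_t off)
{
    a->slot[0] = fd;
    a->slot[1] = (int64_t)iov;
    a->slot[2] = iovcnt;
    a->slot[3] = 0;
    a->slot[4] = off;
}
```

If the fifth slot starts out holding some stale value, fill_like_manpage with an offset of 0 leaves decoded_offset returning that stale value, while fill_with_pad returns 0 as intended.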
Not the first fuzzer to encounter this oddity. More details here.
In the end, it was just coincidence that syzkaller found a new way to corrupt its filesystem a few days after a filesystem commit.