syzkaller found a bug

Common problem for operating system fuzzers is breaking the system they’re running on. Some forms of damage are expected, some are not, and sometimes it’s difficult to tell which is which.

A few days ago, a stack leak bug was fixed in FreeBSD. A similar fix for OpenBSD was committed. And then syzkaller came kalling just a few days later.

panic: bad dir

There’s a few possible causes for this, but inspection revealed that the most likely case might be a directory entry missing the nul terminator. The timing certainly seemed suspicious. Could there be an off by one?

memset(newdirp->d_name + (cnp->cn_namelen & ~(DIR_ROUNDUP-1)), 0, DIR_ROUNDUP);

Actually no. syzkaller had found a way to create filesystem corruption through one of the “expected” damage paths, but the test case was a little obfuscated. More study revealed it was calling mknod to create a new device that happened to be equal to /dev/sd0c and opening it, and then calling pwrite to write some garbage to a random spot.

mknod("banana", 0777, 0x0402);
open("banana")
pwrite(3, "oops", 4, 0x9000);

Not recommended.

Further complicating the matter was that syzkaller didn’t know that pwrite is one of the magic syscalls that takes a padding argument before off_t. This didn’t affect the test, per se, but makes it harder to interpret because syzkaller calls things directly. (The actual syscall in use was the iovec variant, pwritev.)

syscall(SYS_pwritev, r[0], 0x200002c0, 1, 0);

If you read the man page for pwritev that looks correct. But inspecting src/sys/kern/syscalls.master reveals that the fourth argument is actually a pad argument, and the offset is the fifth argument. So the call above was writing to an offset that was not zero.

Not the first fuzzer to encounter this oddity. More details here.

In the end, it was just coincidence that syzkaller found a new way to corrupt its filesystem a few days after a filesystem commit.

Posted 10 May 2019 16:02 by tedu Updated: 10 May 2019 16:02
Tagged: openbsd