commands without magic
Is a magic command without magic still a command? And if a feature was a bug, can a new bug be a feature?
It was recently reported that the behavior of doas has changed in OpenBSD 6.4. Shell scripts used to work, but now in some cases they don’t. Specifically, they work with a leading #! line, but fail otherwise, even though the same scripts run fine from the shell and ran under doas in previous releases.
My first reaction was to say nothing has changed, of course this is how it’s always been because it’s how it always should have been. But a little testing revealed that indeed the report was correct, and the behavior changed. I was as surprised by this as anybody.
doas executes commands using the execvpe function, which is a libc wrapper around the execve system call. The p means it checks the PATH for commands instead of requiring a full path. There’s an additional wrinkle, but first let’s step back a bit.
What indicates that a file can be executed? The obvious answer is the x permission bit in the file system. That only indicates that you are allowed to execute the command, though. Not that it can be executed. What does it mean to execute a jpeg with an x bit? So in addition to file system permissions, the kernel checks for a magic header indicating a known executable format. Probably an ELF header. Maybe a Linux ELF header, sometime in the past. Or if that doesn’t work, the kernel checks for the magic #! line at the top of the file, which contains the name of an interpreter to exec. If there’s no magic header, then the kernel returns an error, ENOEXEC.
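To make the magic check concrete, here is a minimal sketch of the kind of test involved. It is not the kernel’s code. The names classify_magic and magic_kind are invented for illustration, and a real kernel validates far more than the first few bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum magic_kind { MAGIC_NONE, MAGIC_ELF, MAGIC_SCRIPT };

/*
 * Look at the first bytes of a file and guess how it could be executed.
 * MAGIC_NONE corresponds to the case where the kernel gives up with ENOEXEC.
 */
enum magic_kind
classify_magic(const unsigned char *buf, size_t len)
{
	static const unsigned char elf[4] = { 0x7f, 'E', 'L', 'F' };

	if (len >= sizeof(elf) && memcmp(buf, elf, sizeof(elf)) == 0)
		return MAGIC_ELF;
	if (len >= 2 && buf[0] == '#' && buf[1] == '!')
		return MAGIC_SCRIPT;
	return MAGIC_NONE;
}
```

A file that starts with echo hi lands in MAGIC_NONE, which is exactly the case the rest of this post is about.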
But people are lazy, and sometimes they just dump shell commands in a file without adding the appropriate header. sh accommodates this behavior by attempting to interpret any executable file that lacks magic as a shell script. With that in mind, we return to execvpe.
The name of execvpe would imply that it might replicate the shell’s PATH searching behavior. That’s not the only thing it does, though. It also replicates the shell’s ENOEXEC handling. If you call execvpe on a magic-free file with the appropriate permissions, libc will start a sh process with the assumed script as an argument. The function might be better named execlikeshdoes.
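The fallback looks roughly like the following. This is a sketch of the idea and not the libc source: exec_like_sh and build_sh_argv are made-up names, the PATH search is left out entirely, and /bin/sh is assumed to be the shell’s location.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Build { "sh", path, argv[1], ..., NULL } from the original argv.
 * The caller frees the returned array (the strings are not copied).
 */
char **
build_sh_argv(const char *path, char *const argv[])
{
	size_t n = 0, i, k = 0;
	char **shargv;

	while (argv[n] != NULL)
		n++;
	/* room for "sh", path, the remaining arguments, and NULL */
	shargv = calloc(n + 3, sizeof(*shargv));
	if (shargv == NULL)
		return NULL;
	shargv[k++] = "sh";
	shargv[k++] = (char *)path;
	for (i = 1; i < n; i++)
		shargv[k++] = argv[i];
	shargv[k] = NULL;
	return shargv;
}

/* Try a direct exec; on ENOEXEC, retry with sh as the interpreter. */
int
exec_like_sh(const char *path, char *const argv[], char *const envp[])
{
	char **shargv;
	int saved;

	execve(path, argv, envp);
	if (errno != ENOEXEC)
		return -1;
	shargv = build_sh_argv(path, argv);
	if (shargv == NULL)
		return -1;
	execve("/bin/sh", shargv, envp);	/* assumed shell path */
	saved = errno;
	free(shargv);
	errno = saved;
	return -1;
}
```

Note the second execve. It needs sh to be reachable, which is an easy requirement to forget when the caller only ever asked to run one command.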
This behavior is documented in the man page, though I’m pretty sure I overlooked it. If it hadn’t been part of libc, I certainly would not have attempted to emulate it myself. All I wanted was a function that would search the path, and I found a function that looked like what I wanted, not suspecting some bonus features would come along for the ride.
So the presence of this feature might be considered a bug.
Now, in 6.4, things are different. This is the result of the new unveil system call. A series of unveil calls were added to doas. At first it doesn’t seem like unveiling the exact command one is about to exec would do much. The visibility of other files hardly matters if we’re going straight into exec. But sometimes adding seemingly redundant checks catches things one hadn’t considered.
As a result of the unveil calls, the command to be executed was visible, but sh was not. Therefore, if the command lacked magic, the kernel would refuse to execute it. execvpe in libc would try again with sh, but there was no sh, and the exec would fail.
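For illustration, unveiling only the places a command might live could look something like this. It is a sketch under assumptions, not the code that went into doas: unveil_candidates is a hypothetical name, empty PATH entries are skipped for simplicity, and the unveil call is only compiled on OpenBSD so the rest can be tried elsewhere.

```c
#include <stdio.h>
#include <string.h>
#ifdef __OpenBSD__
#include <unistd.h>
#endif

/*
 * Walk a colon-separated PATH and visit each "dir/cmd" candidate.
 * On OpenBSD each candidate is unveiled with "x" (execute only).
 * Returns the number of candidates visited, or -1 if one is too long.
 */
int
unveil_candidates(const char *pathenv, const char *cmd)
{
	char buf[1024];
	const char *p = pathenv;
	int count = 0;

	for (;;) {
		size_t len = strcspn(p, ":");

		if (len > 0) {
			int r = snprintf(buf, sizeof(buf), "%.*s/%s",
			    (int)len, p, cmd);
			if (r < 0 || (size_t)r >= sizeof(buf))
				return -1;
#ifdef __OpenBSD__
			/* failures (such as a missing directory) are ignored here */
			(void)unveil(buf, "x");
#endif
			count++;
		}
		if (p[len] == '\0')
			break;
		p += len + 1;
	}
	/* Nothing here unveils /bin/sh, so a later exec of sh cannot find it. */
	return count;
}
```

With a PATH of /bin:/usr/bin and a command of ls, the two candidates are /bin/ls and /usr/bin/ls. Everything else, the shell included, drops out of view.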
A thing which used to work stopped working, the unintended result of what should have been a harmless change. Ordinarily we’d call that a bug.
Except this new behavior is actually what I wanted all along. So could the bug be a bug fix? After some consideration, yes. We probably would not have intentionally changed the magicless command behavior, in either direction. Probably not the best way to design software, but such is life on the fringes.
The interesting bit is that unveil caught an unknown behavior. doas is a pretty small program, small enough I thought I knew what it did. It makes a few calls to the OpenBSD libc, which rarely does things that surprise me. Except when it does.