
another subtle string function

Recently was reminded of an old string handling function I used for programming interviews.

My original programming interview question started with a short C function that did something a little unusual to a string. When asked to describe its behavior, many candidates initially saw the general outline of the function and then had difficulty seeing what the real code did where it differed from their expectations.

Reactions from candidates varied, although they were usually pretty muted, given they wanted the job and weren’t likely to complain too loudly. Other interviewers, however, usually gave feedback ranging from “That’s mean.” to “What? Why?”. The why is answered in the previous post, although opinions varied as to how likely one was to encounter such code in the wild. In refactoring some code, I ran into something fairly similar, and then had the exact issue I was testing for, only seeing what I was hoping to see.

int
itsabbr(const char *abbr, const char *word)
{
	if (lowerit(*abbr) != lowerit(*word))
		return FALSE;
	while (*++abbr != '\0')
		do {
			if (*word == '\0')
				return FALSE;
		} while (lowerit(*word++) != lowerit(*abbr));
	return TRUE;
}

Given a function lowerit that lowercases a single letter, much like tolower, what does this function do? I was pretty sure it was a hand-rolled version of strncasecmp given the larger context (which did include an exact strcasecmp workalike). But making that conversion resulted in runtime failures. My first guess was that I had forgotten to invert the true/false values by comparing the strncasecmp result with 0, but that was caught by reading the code before testing. Maybe I’d gotten the caller wrong and passed the wrong length? Not that either.

Nope, this function was definitely not quite strncasecmp, although it looked like it and served a very similar purpose. Finally sat down and did things the hard way, working through the code and identifying what it did do and not what I expected it to do. I’m still not sure the actual behavior is the intended behavior because some edge cases are handled poorly in my opinion, but the input data now depends on this function working exactly like it does.
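Working through a few inputs by hand is easier with a small harness. The following is only a sketch under stated assumptions: lowerit is replaced by a stand-in built on tolower, TRUE and FALSE are defined as 1 and 0, the int return type is assumed, and isprefix is a hypothetical name for the strncasecmp conversion I had in mind. None of those details come from the original code base.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

#define TRUE	1	/* assumed values */
#define FALSE	0

/* Assumed stand-in for lowerit; the real one is not shown in the post. */
int
lowerit(int c)
{
	return tolower((unsigned char)c);
}

/* The function from the post; return type is an assumption. */
int
itsabbr(const char *abbr, const char *word)
{
	if (lowerit(*abbr) != lowerit(*word))
		return FALSE;
	while (*++abbr != '\0')
		do {
			if (*word == '\0')
				return FALSE;
		} while (lowerit(*word++) != lowerit(*abbr));
	return TRUE;
}

/* Hypothetical replacement: case insensitive prefix test. */
int
isprefix(const char *abbr, const char *word)
{
	return strncasecmp(abbr, word, strlen(abbr)) == 0;
}

void
check_examples(void)
{
	/* A real prefix: both functions accept it. */
	assert(itsabbr("jan", "January") == TRUE);
	assert(isprefix("jan", "January"));

	/* Same first letter, second letter found later in the word:
	 * itsabbr accepts it, the prefix test does not. */
	assert(itsabbr("jy", "January") == TRUE);
	assert(!isprefix("jy", "January"));

	/* A letter that never appears in the word: both reject it. */
	assert(itsabbr("jx", "January") == FALSE);
	assert(!isprefix("jx", "January"));
}
```

With these inputs, only the middle case separates the two functions, which is the kind of input that broke my conversion. Strings that look like ordinary prefixes will never reveal the difference.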

Bonus question: What could be wrong with converting the above to use tolower?
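One possible answer, offered as a guess rather than the only one: tolower takes an int that must be either EOF or a value representable as unsigned char. Where plain char is signed, passing *abbr directly can hand it a negative value for bytes above 0x7f, which is undefined behavior. A hedged sketch of the usual workaround follows; lower_char is a hypothetical name, not something from the original code. A second possibility is that tolower follows the current locale, which may or may not match whatever lowerit does.

```c
#include <ctype.h>

/*
 * Hypothetical wrapper: convert to unsigned char first so that
 * tolower never receives a negative value other than EOF.
 */
int
lower_char(char c)
{
	return tolower((unsigned char)c);
}
```

In the default "C" locale this lowercases only A through Z and returns every other byte unchanged.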

Posted 10 Feb 2015 23:31 by tedu Updated: 10 Feb 2015 23:40
Tagged: c programming