strict structs

Contrary to popular belief, C does have types. It even has type qualifiers. Unfortunately, the selection is somewhat limited and there are several implicit conversions that may lead to less than robust code. The good news is that with a little effort we can define our own types and enforce our own rules. I’ve forgotten where I first saw this, and don’t really have a good name for it.

Here’s a common bug which is fun to consider.

        memset(ptr, sizeof(*ptr), 0);

The problem is that integer types readily turn into other integer types. The size and the fill value are both integers, so the swapped call compiles cleanly. It may be possible to coerce a warning out of the compiler, but it’s rare. Ideally we would like a size type that’s completely incompatible with other integers.

If we look at code, a fairly common pattern emerges. There’s a size, it’s used to allocate memory, passed around to various functions, etc., but rarely manipulated or operated on as an integer.

        size_t size = 1024;
        char *ptr = malloc(size);
        memset(ptr, 0, size);
        snprintf(ptr, size, "hello");

Let’s make a new size type.

typedef struct {
        size_t s;
} size;
void *mallocS(size size);
void *memsetS(void *ptr, int val, size size);
int snprintfS(char *buf, size size, const char *fmt, ...);

Now our example looks like this.

        size size = { 1024 };
        char *ptr = mallocS(size);
        memsetS(ptr, 0, size);
        snprintfS(ptr, size, "hello");

Practically identical. However, attempting to reintroduce our original bug will fail.

        memsetS(ptr, size, 0);

Compile that and we see an error.

error: passing 'size' to parameter of incompatible type 'int'
        memsetS(ptr, size, 0);
                     ^~~~

Size is no longer an integer.
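
The wrappers themselves are not shown above, but they can be trivial. Here’s a minimal sketch, assuming each one does nothing more than unwrap the struct and forward to the libc function. The function bodies and the demo_size helper are illustrative assumptions, not code from any real library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A size that is not an integer: the struct blocks implicit conversions. */
typedef struct {
        size_t s;
} size;

void *
mallocS(size size)
{
        return malloc(size.s);
}

void *
memsetS(void *ptr, int val, size size)
{
        return memset(ptr, val, size.s);
}

int
snprintfS(char *buf, size size, const char *fmt, ...)
{
        va_list ap;
        int rv;

        /* forward the variable arguments to the va_list flavor */
        va_start(ap, fmt);
        rv = vsnprintf(buf, size.s, fmt, ap);
        va_end(ap);
        return rv;
}

/* Hypothetical self-checking demo; returns 0 on success, -1 if malloc fails. */
int
demo_size(void)
{
        size size = { 1024 };
        char *ptr = mallocS(size);

        if (ptr == NULL)
                return -1;
        memsetS(ptr, 0, size);
        snprintfS(ptr, size, "hello");
        /* memsetS(ptr, size, 0); would not compile: size is not an int */
        assert(strcmp(ptr, "hello") == 0);
        free(ptr);
        return 0;
}
```

Only the wrappers ever touch the .s member. Everything else passes the struct around untouched.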

A typical program may have a variety of integers that should not mix and match, except under special circumstances. Height, width, x position, y position, etc. Unfortunately, we often see functions like moveto(int, int, int, int) where it’s very easy to pass arguments in the wrong order. We can make them all different types to prevent this.
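
As a rough sketch of what that might look like, here is a strict variant of moveto. The type names, the rect struct, and movetoS are all made up for illustration. Each typedef wraps an anonymous struct, and every struct declaration is its own type, so none of them convert to each other.

```c
#include <assert.h>

/* One struct per quantity. They all hold an int, but they are distinct types. */
typedef struct { int v; } xpos;
typedef struct { int v; } ypos;
typedef struct { int v; } width;
typedef struct { int v; } height;

typedef struct {
        int x, y, w, h;
} rect;

/* Hypothetical strict replacement for moveto(int, int, int, int). */
rect
movetoS(xpos x, ypos y, width w, height h)
{
        rect r = { x.v, y.v, w.v, h.v };
        return r;
}

/* Hypothetical self-checking demo; returns 0 on success. */
int
demo_moveto(void)
{
        xpos x = { 10 };
        ypos y = { 20 };
        width w = { 640 };
        height h = { 480 };
        rect r = movetoS(x, y, w, h);

        /* movetoS(y, x, w, h); would be rejected: a ypos is not an xpos */
        assert(r.x == 10 && r.y == 20 && r.w == 640 && r.h == 480);
        return 0;
}
```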

Another thing we may consider is adding custom qualifiers to types. C provides const to make a variable read-only. What we would like is a notnull qualifier to deal with all those pesky functions that return null. This is a fact of life: sometimes they simply don’t have a value to return. But we don’t want the null to accidentally flow into other parts of our program expecting valid data.

Fairly typical API.

thing *getthing(const char *name); /* may return NULL */
void printthing(thing *thing); /* must not be NULL */

Fairly typical code.

        thing *thing = getthing("it");
        printthing(thing);

That’s a bug, and a common one. It’s just too easy to take the thing and toss it around.

Now imagine we have these types.

typedef struct maybething {
        thing *ptr;
} maybething;
typedef struct notnullthing {
        thing *ptr;
} notnullthing;

maybething getthingX(const char *name); /* may return NULL */
void printthingX(notnullthing thing); /* must not be NULL */

The compiler will enforce that the return of getthingX does not immediately flow to printthingX because they have incompatible types. The user must convert them, hopefully after checking for null.

        maybething maybe = getthingX("it");
        if (maybe.ptr) {
                notnullthing notnull = { maybe.ptr };
                printthingX(notnull);
        }

Hardly foolproof, since we can always force the conversion without a null check, but that requires a deliberate act of foolishness, not mere carelessness or forgetfulness.
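
One way to make carelessness even harder is to put the conversion in a single helper, so there is exactly one place where a maybething becomes a notnullthing. A sketch, assuming a thing with a name field and a helper called checkthing. Both are invented for this example, as is the toy getthingX body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct thing {
        const char *name;       /* assumed field, for illustration only */
} thing;

typedef struct maybething {
        thing *ptr;
} maybething;
typedef struct notnullthing {
        thing *ptr;
} notnullthing;

static thing it = { "it" };

/* may return NULL: this toy version only knows about one thing */
maybething
getthingX(const char *name)
{
        maybething maybe = { strcmp(name, it.name) == 0 ? &it : NULL };
        return maybe;
}

/* must not be NULL */
void
printthingX(notnullthing thing)
{
        puts(thing.ptr->name);
}

/*
 * Hypothetical helper: the only place a maybething becomes a notnullthing.
 * Returns 1 and fills *out if the pointer is valid, 0 otherwise.
 */
int
checkthing(maybething maybe, notnullthing *out)
{
        if (maybe.ptr == NULL)
                return 0;
        out->ptr = maybe.ptr;
        return 1;
}

/* Hypothetical self-checking demo; returns how many things were printed. */
int
demo_check(void)
{
        notnullthing notnull;
        int printed = 0;

        if (checkthing(getthingX("it"), &notnull)) {
                printthingX(notnull);
                printed++;
        }
        if (checkthing(getthingX("missing"), &notnull)) {
                printthingX(notnull);
                printed++;
        }
        assert(printed == 1);
        return printed;
}
```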

With some magic macros and perhaps unwise cleverness, we can also clean it up a bit.

        maybething thing = getthingX("it");
        with (thing) {
                printthingX(thing);
        }

This code only runs if thing exists, and removing the if-like with will cause a compile failure, because outside of it thing is still a maybething.
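
The macro isn’t shown here, so what follows is only one possible way to build it, not necessarily the original. It assumes C99 declarations in for loops. The outer loop copies the maybething and tests it, and the inner loop declares a notnullthing that shadows the original name. The thing struct, the toy function bodies, and the printed counter are assumptions for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct thing { const char *name; } thing;
typedef struct maybething { thing *ptr; } maybething;
typedef struct notnullthing { thing *ptr; } notnullthing;

static thing it = { "it" };
static int printed;     /* counts calls so the demo can check them */

maybething
getthingX(const char *name)
{
        maybething maybe = { strcmp(name, it.name) == 0 ? &it : NULL };
        return maybe;
}

void
printthingX(notnullthing thing)
{
        puts(thing.ptr->name);
        printed++;
}

/*
 * Hypothetical with(): the body runs at most once, and only if .ptr is set.
 * name##_once is cleared after the first pass to end both loops.
 */
#define with(name) \
        for (maybething name##_maybe = name, *name##_once = &name##_maybe; \
            name##_once && name##_maybe.ptr; name##_once = NULL) \
                for (notnullthing name = { name##_maybe.ptr }; \
                    name##_once; name##_once = NULL)

/* Hypothetical self-checking demo; returns how many things were printed. */
int
demo_with(void)
{
        maybething thing = getthingX("it");
        maybething nothing = getthingX("missing");

        printed = 0;
        with (thing) {
                printthingX(thing);     /* thing is a notnullthing in here */
        }
        with (nothing) {
                printthingX(nothing);   /* never runs: the pointer is NULL */
        }
        /* printthingX(thing); out here would not compile: maybething */
        assert(printed == 1);
        return printed;
}
```

The copy into name##_maybe is needed because the inner declaration can’t refer to the outer variable of the same name in its own initializer.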

At the processor level, all of this indirection is completely eliminated. One-word structs get passed in registers on common ABIs, and an optimizing compiler will eliminate the redundant temporaries and conversions. Zero runtime overhead.

Posted 14 Nov 2018 15:45 by tedu Updated: 14 Nov 2018 15:45
Tagged: c programming