timing attacks vs interned strings
Some experiments with trying to extract strings from a Lua process via timing attacks.
one
For the first test, let’s just verify that simple == equality testing doesn’t produce measurable differences. Create a file with 2155582 lines of “aaaaaaaaaaaaaaaaaaaab” and then run this script.
local fd = io.open("a")
while true do
	local line = fd:read()
	if not line then break end
	if line == "aaaaaaaaaaaaaaaaaaaaa" then
		print("match")
	end
end
Next we run the script again, but change the magic string to be “baaaaaaaaaaaaaaaaaaaa”. The run time is the same, because we’re not actually comparing strings byte for byte, only their post-intern pointer values.
two
For comparison, I wrote the same test in C.
#include <stdio.h>
#include <string.h>

int
main(int argc, char **argv)
{
	char buf[256];
	FILE *fp;

	fp = fopen("a", "r");
	while (fgets(buf, sizeof(buf), fp)) {
		if (memcmp(buf, "aaaaaaaaaaaaaaaaaaaaa", 21) == 0)
			printf("match\n");
	}
	fclose(fp);
	return 0;
}
Then I changed the string to mismatch on the first letter, and reran both variants with timingsafe_bcmp and timingsafe_memcmp in place of memcmp. Results:
        memcmp    timingsafe_bcmp    timingsafe_memcmp
a...a   0.19      0.14               0.19
b...a   0.15      0.14               0.19
The timing safe functions are constant time, as expected. Interesting to note that timingsafe_bcmp is faster than memcmp.
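For reference, timingsafe_bcmp is only a few lines; the sketch below captures the idea (it’s my rendition, not the verbatim libc source). Every byte is examined unconditionally and the differences are ORed together, so the running time depends only on the length, never on where the first mismatch falls.

```c
#include <stddef.h>

/* constant-time comparison: no early exit on the first mismatch */
int
ct_bcmp(const void *b1, const void *b2, size_t n)
{
	const unsigned char *p1 = b1, *p2 = b2;
	unsigned char ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		ret |= p1[i] ^ p2[i];
	return ret != 0;
}
```

The loop body is also branch-free, which perhaps explains beating memcmp here: a dumb byte loop with no early-exit comparisons can be cheap.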
three
Since straightforward attacks don’t work, my next approach was to try to detect the intern operation itself. If a string already exists, no new copy is made (fast). If it doesn’t exist, a new copy is made (slow).
I created a new file with 2284880 lines consisting of every word from “aaaa” to “zzzz” repeated five times, and a similar file with 2284880 lines of just “aaaa”. Running the same script as above, the first file took twice as long to process as the second. So interning distinct strings (and the consequent garbage collection) measurably slows things down locally, but can we observe the difference remotely?
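For reproducibility, the word file can be generated with a short C function (the writewords name is mine): 26^4 = 456976 four-letter words per pass, times five passes = 2284880 lines.

```c
#include <stdio.h>

/* write every four-letter word from "aaaa" to "zzzz", reps times over */
void
writewords(FILE *fp, int reps)
{
	int r, a, b, c, d;

	for (r = 0; r < reps; r++)
		for (a = 'a'; a <= 'z'; a++)
			for (b = 'a'; b <= 'z'; b++)
				for (c = 'a'; c <= 'z'; c++)
					for (d = 'a'; d <= 'z'; d++)
						fprintf(fp, "%c%c%c%c\n", a, b, c, d);
}
```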
four
Start with a simple server. Because the correct magic string is always in memory, we know those guesses are fast.
Obviously we could cheat and look at the answer, but in a more likely scenario a web server will have cookies for many recent users in memory. Being able to remotely guess a session token, even if it’s not for our user, is a big deal, because we can try it out for all the users without much trouble.
Lua server:
local skt = require "socket"
local server = skt.bind("localhost", 9999)
while true do
	local client = server:accept()
	while true do
		local line, err = client:receive()
		if not line then
			break
		end
		if line == "apez" then
			client:send("yup!")
		else
			client:send("nope")
		end
	end
	client:close()
end
And C attack code:
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <machine/pctr.h>

int
hostconnect(const char *host, const char *port)
{
	int s;
	struct addrinfo *res, *origres, hints;

	memset(&hints, 0, sizeof(hints));
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_protocol = IPPROTO_TCP;
	if (getaddrinfo(host, port, &hints, &res))
		return -1;
	s = -1;
	origres = res;
	while (res) {
		s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
		if (s != -1 && connect(s, res->ai_addr, res->ai_addrlen) == 0)
			break;
		if (s != -1)
			close(s);
		s = -1;
		res = res->ai_next;
	}
	freeaddrinfo(origres);
	return s;
}

int
main(int argc, char **argv)
{
	uint64_t before, after;
	char buf[64];
	int i, s;

	s = hostconnect("localhost", "9999");
	if (s == -1)
		return 1;
	for (i = 0; i < 4; i++) {
		snprintf(buf, sizeof(buf), "%.4s\n", argv[1]);
		before = rdtsc();
		send(s, buf, 5, 0);
		recv(s, buf, 20, 0);
		after = rdtsc();
		printf("timing %llu\n", after - before);
	}
	return 0;
}
We run this on the command line a few times, with various guesses. For instance, we might know that “nope” will be in memory. We also know that “1234” is not in memory, because only letters form tokens in this universe. That gives us a baseline for comparison with other guesses, like “boom” and “yawn”.
Unfortunately, I couldn’t detect any difference between new strings and old strings. Even after lots of samples, I couldn’t average them out to make a difference show up.
five
If we can’t detect interning, maybe we can detect GC pauses? Yes! New attack code (same includes and hostconnect as before):
int
main(int argc, char **argv)
{
	uint64_t before, after;
	char buf[64], rbuf[64];
	int count, i, s;

	s = hostconnect("localhost", "9999");
	if (s == -1)
		return 1;
	snprintf(buf, sizeof(buf), "aaaa\n");
	count = 0;
	while (1) {
		before = rdtsc();
		send(s, buf, 5, 0);
		recv(s, rbuf, 20, 0);
		after = rdtsc();
		count++;
		if (after - before > 500000) {
			printf("after %d guesses, took a long time\n", count);
			count = 0;
		}
		/* advance the guess like an odometer: aaaa, baaa, ..., zaaa, abaa, ... */
		for (i = 0; i < 4; i++) {
			if (buf[i] != 'z') {
				buf[i]++;
				break;
			}
			buf[i] = 'a';
		}
	}
}
My theory is that sending an existing string will never cause a garbage collection. We bombard the server with garbage strings until we think it’s about to collect. Then we send our target string. Pause -> miss. No pause -> hit!
This produces some reasonable output:
after 70551 guesses, took a long time
after 52204 guesses, took a long time
after 72026 guesses, took a long time
after 122641 guesses, took a long time
after 72217 guesses, took a long time
after 72731 guesses, took a long time
after 76937 guesses, took a long time
after 76836 guesses, took a long time
after 75308 guesses, took a long time
after 73700 guesses, took a long time
after 77875 guesses, took a long time
Reasonable, but still quite variable. Just glancing at the numbers suggests it would take millions of probes to confirm or deny whether a single string is in memory. If the tokens in question are more than even a few bytes long, it would take forever to identify one. I don’t think this is actually going to work.