timing attacks vs interned strings
Some experiments with trying to extract strings from a Lua process via timing attacks.
one
For the first test, let’s just verify that simple == equality testing doesn’t produce measurable differences. Create a file with 2155582 lines of “aaaaaaaaaaaaaaaaaaaab” and then run this script.
local fd = io.open("a")
while true do
	local line = fd:read()
	if not line then break end
	if line == "aaaaaaaaaaaaaaaaaaaaa" then
		print("match")
	end
end
Next we run the script again, but change the magic string to be “baaaaaaaaaaaaaaaaaaaa”. The run time is the same, because we’re not actually comparing strings byte for byte, only their post-intern pointer values.
two
For comparison, I wrote the same test in C.
#include <stdio.h>
#include <string.h>

int
main(int argc, char **argv)
{
	char buf[256];
	FILE *fp;

	fp = fopen("a", "r");
	while (fgets(buf, sizeof(buf), fp)) {
		if (memcmp(buf, "aaaaaaaaaaaaaaaaaaaaa", 21) == 0)
			printf("match\n");
	}
	fclose(fp);
	return 0;
}
Then I changed the string to mismatch on the first letter, and reran both variants with timingsafe_bcmp and timingsafe_memcmp in place of memcmp. Results:
        memcmp    timingsafe_bcmp    timingsafe_memcmp
a...a   0.19      0.14               0.19
b...a   0.15      0.14               0.19
The timing safe functions are constant time, as expected. Interesting to note that timingsafe_bcmp is faster than memcmp.
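For reference, timingsafe_bcmp is only a few lines; the sketch below captures the idea (it’s my rendition, not the verbatim libc source). Every byte is examined unconditionally and the differences are ORed together, so the running time depends only on the length, never on where the first mismatch falls.

```c
#include <stddef.h>

/* constant-time comparison: no early exit on the first mismatch */
int
ct_bcmp(const void *b1, const void *b2, size_t n)
{
	const unsigned char *p1 = b1, *p2 = b2;
	unsigned char ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		ret |= p1[i] ^ p2[i];
	return ret != 0;
}
```

The loop body is also branch-free, which perhaps explains beating memcmp here: a dumb byte loop with no early-exit comparisons can be cheap.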
three
Since straightforward attacks don’t work, my next approach was to try to detect the intern operation itself. If a string already exists, no new copy is made (fast). If it doesn’t exist, a new copy is made (slow).
I created a new file with 2284880 lines consisting of every word from “aaaa” to “zzzz” repeated five times, and a similar file with 2284880 lines of just “aaaa”. Running the same script as above, the first file took twice as long to process as the second. So interning distinct strings (and the consequent garbage collection) measurably slows things down locally, but can we observe the difference remotely?
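For reproducibility, the word file can be generated with a short C function (the writewords name is mine): 26^4 = 456976 four-letter words per pass, times five passes = 2284880 lines.

```c
#include <stdio.h>

/* write every four-letter word from "aaaa" to "zzzz", reps times over */
void
writewords(FILE *fp, int reps)
{
	int r, a, b, c, d;

	for (r = 0; r < reps; r++)
		for (a = 'a'; a <= 'z'; a++)
			for (b = 'a'; b <= 'z'; b++)
				for (c = 'a'; c <= 'z'; c++)
					for (d = 'a'; d <= 'z'; d++)
						fprintf(fp, "%c%c%c%c\n", a, b, c, d);
}
```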
four
Start with a simple server. Because the correct magic string is always in memory, we know those guesses are fast.
Obviously we could cheat and look at the answer, but in a more likely scenario a web server will have cookies for many recent users in memory. Being able to remotely guess a session token, even if it’s not for our user, is a big deal, because we can try it out for all the users without much trouble.
Lua server:
local skt = require "socket"
local server = skt.bind("localhost", 9999)
while true do
	local client = server:accept()
	while true do
		local line, err = client:receive()
		if not line then
			break
		end
		if line == "apez" then
			client:send("yup!")
		else
			client:send("nope")
		end
	end
	client:close()
end
And C attack code:
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <machine/pctr.h>

int
hostconnect(const char *host, const char *port)
{
	int s;
	struct addrinfo *res, *origres, hints;

	memset(&hints, 0, sizeof(hints));
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_protocol = IPPROTO_TCP;
	if (getaddrinfo(host, port, &hints, &res))
		return -1;
	s = -1;
	origres = res;
	while (res) {
		s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
		if (s != -1 && connect(s, res->ai_addr, res->ai_addrlen) == 0)
			break;
		if (s != -1)
			close(s);
		s = -1;
		res = res->ai_next;
	}
	freeaddrinfo(origres);
	return s;
}

int
main(int argc, char **argv)
{
	uint64_t before, after;
	char buf[64];
	int i, s;

	s = hostconnect("localhost", "9999");
	if (s == -1)
		return 1;
	for (i = 0; i < 4; i++) {
		snprintf(buf, sizeof(buf), "%.4s\n", argv[1]);
		before = rdtsc();
		send(s, buf, 5, 0);
		recv(s, buf, 20, 0);
		after = rdtsc();
		printf("timing %llu\n", after - before);
	}
	return 0;
}
We run this on the command line a few times, with various guesses. For instance, we might know that “nope” will be in memory. We also know that “1234” is not in memory, because only letters form tokens in this universe. That gives us a baseline for comparison with other guesses, like “boom” and “yawn”.
Unfortunately, I couldn’t detect any difference between new strings and old strings. Even after lots of samples, I couldn’t average them out to make a difference show up.
five
If we can’t detect interning, maybe we can detect GC pauses? Yes! New attack code (same includes and hostconnect as before):
int
main(int argc, char **argv)
{
	uint64_t before, after;
	char buf[64], rbuf[64];
	int count, i, s;

	s = hostconnect("localhost", "9999");
	if (s == -1)
		return 1;
	snprintf(buf, sizeof(buf), "aaaa\n");
	count = 0;
	while (1) {
		before = rdtsc();
		send(s, buf, 5, 0);
		recv(s, rbuf, 20, 0);
		after = rdtsc();
		count++;
		if (after - before > 500000) {
			printf("after %d guesses, took a long time\n", count);
			count = 0;
		}
		/* advance the guess like an odometer: aaaa, baaa, ..., zaaa, abaa, ... */
		for (i = 0; i < 4; i++) {
			if (buf[i] != 'z') {
				buf[i]++;
				break;
			}
			buf[i] = 'a';
		}
	}
}
My theory is that sending an existing string will never cause a garbage collection. We bombard the server with garbage strings until we think it’s about to collect. Then we send our target string. Pause -> miss. No pause -> hit!
This produces some reasonable output:
after 70551 guesses, took a long time
after 52204 guesses, took a long time
after 72026 guesses, took a long time
after 122641 guesses, took a long time
after 72217 guesses, took a long time
after 72731 guesses, took a long time
after 76937 guesses, took a long time
after 76836 guesses, took a long time
after 75308 guesses, took a long time
after 73700 guesses, took a long time
after 77875 guesses, took a long time
Reasonable, but still quite variable. Just glancing at the numbers suggests it would take millions of probes to confirm or deny whether a single string is in memory. If the tokens in question are more than even a few bytes long, it would take forever to identify one. I don’t think this is actually going to work.