random in the wild
A bit of commentary for some selected examples from Theo’s random hunt. Mostly a post commit justification for the great posix violation.
Before we get to the fun parts, though, a bit of serious commentary. The rand
and random
functions are specified by C and POSIX, respectively, to have initial seeds of 1 and to be reseedable via srand
and srandom
. That has been interpreted to mean that the same seed must always produce the same sequence, but the actual guarantee is quite a bit weaker than that. The C standard says very little about the rand algorithm, and POSIX doesn’t add much except random must be “a non-linear additive feedback random-number generator”. The key point, however, is that there are many such generators, all of which produce different sequences even with identical seeds. Different operating systems are not guaranteed to produce the same sequence. In fact, repeated executions of the same program are not guaranteed to produce the same sequence. Each execution could pick different internal multipliers. Or a fully conforming implementation may well contain seed ^= getpid()
at the top of srandom. Repeatable within the program, but not afterwards. The same sequence language only applies within the execution of a single process. Any program which depends on generating the same sequence with the same seed after multiple executions is in fact depending on undocumented, nonstandard, implementation specific behavior. Here’s a nickel kid; get a better language lawyer.
code
But who cares about standards? Let’s look at some code.
A popular choice for a seed is 0. This is not so very different from the default of 1, but some people want their numbers to be a little different. But not too different.
srandom(0);
Considering these two lines from the srandom implementation, not too different at all.
/* A seed of 0 would result in state[] always being zero. */
state[0] = x ? x : 1;
(Now may be a good time to point out that the relevant standards do not specify that different seeds produce different sequences.)
Some people have read a book.
srand(42);
Using your birthday is another choice.
srand(1969);
Younger programmers prefer srand48.
srand48(1978);
Bigger numbers are also common.
srandom(12346);
Careful, that’s a 6 not a 5. Unpredictable, no?
Hexadecimal is the preferred numeric notation of the serious programmer.
srand(0x1234);
Humor is common.
srand(0xabad1dea);
When a fixed seed doesn’t work, maybe the current time will.
srandom(time(NULL));
Old timers are old school.
srand(time(0));
If seconds aren’t precise enough, try microseconds.
srandom(tv.tv_sec * 1000000 + tv.tv_usec);
Or seconds and microseconds, all together.
srandom(tv.tv_sec ^ tv.tv_usec);
If performance is important, cache the time.
srandom(cachedTime.usec());
Maybe you want your time to be different from other people’s time.
srand(time(NULL) ^ 3141592654UL);
For extra variation, try adding in the pid.
srandom(time(NULL) ^ getpid());
Or literally adding in the pid.
srandom(((int)time(NULL))+((int)getpid()));
Add literally oring in the pid.
srandom(time(NULL)| getpid());
All together now.
srand((unsigned int)(
((unsigned int)pid << 16)
^ (unsigned int)pid
^ (unsigned int)tv.tv_sec
^ (unsigned int)tv.tv_usec));
Perhaps multiplication with a side of completely unnecessary modulus?
srand(getpid() * time(NULL) % ((unsigned int) -1));
Here’s one with a completely necessary modulus. We wouldn’t want our random numbers to get too random.
srand((unsigned int)(tv.tv_usec + pid) % 991);
The one operation that was not observed was substracting the pid from the time. More research into this subject is warranted.
Another approach one can take is self seeding. It’s the same, but different.
srandom((unsigned)random());
Future proof your code by allowing the seed values to increase over time.
srand (rand () % r_sys_now ());
A little of this, a little of that.
srand((time(NULL) + buf_len) ^ rand());
Much more sophisticated seeding methods are also possible.
ptr=&cp[0];
strcpy(cp,THISVERSION);
x=strlen(THISVERSION);
for(i=0;i<x;i++)
y+=*ptr++;
srand(y);
Remember to always initialize the seed from a copy of the version string, not the original.
If you don’t have a version string, pretend you do.
char stackdata[256];
unsigned int c = 1234, n;
c = time(NULL);
c *= getpid();
/* Referenced before set... DELIBERATELY */
for (n = 0; n < sizeof(stackdata); n++) /* IGNORE WARNING */
c += stackdata[n]; /* DELIBERATELY USED BEFORE SET */
srand((unsigned int)c);
Don’t be afraid to reseed too frequently. Once per random number is good if your program runs slowly.
int getrand(int toprange)
{
srand((int)(time(NULL) * getpid()));
return (rand() % toprange) +1;
}
The multitude of random number generators also permits cross seeding.
seedbufptr[0] = (unsigned short)rand();
seedbufptr[1] = (unsigned short)rand();
seedbufptr[2] = (unsigned short)rand();
(void)seed48(seedbufptr);
If you need a really good seed, look to the OpenSSL RAND functions.
unsigned char s[16] = { 0 };
int seed = 0;
RAND_bytes(s, 16);
s[15] = '\0';
seed = ElfHash(s);
srand48((long) seed);
Take 16 bytes of random data. No, wait, make that 15 bytes. Then hash it to four bytes to really squeeze the entropy in. Then seed.
Of course, this can be done both ways. The output of rand can be used to seed OpenSSL RAND. Or attempted, anyway.
char buffer[256]="";
char randombytes[256];
time_t t = time(NULL);
int n;
srand((unsigned int)t);
sprintf(buffer, "%.0f", (((double)(rand()%RAND_MAX)/RAND_MAX)*
(sizeof(randombytes)-128-1)));
n = (atoi(buffer)+1)%(sizeof(randombytes)-128-1);
RAND_seed(randombytes, 128);
Yes, you are reading that correctly. Seed rand with the current time, then print a random number into a buffer. Convert the buffer back into a number. Discard the number. Seed RAND with a different, uninitialized array. I don’t use this program. Maybe you do.
The deficiencies of rand have been known for some time, leading some software to work around them.
/* First generate RAND_HEAD_LEN-2 random bytes. We encrypt the
* output of rand() to get less predictability, since rand() is
* often poorly implemented.
One can also try varying the output a bit.
return (rand() * rand() + (rand() >> (sizeof(int) * 4)) + (cnt++)) & INT_MAX;
Apart from the fact that signed integer overflow is undefined behavior, that’s a great way to turn a uniform random number generator into a biased random number generator.
I ran out of funny trying to describe what’s happening here.
hptr->id = htons(ntohs(hptr->id) + k + lrand48() & 0xffff);
Another amusing code snippet observed.
#define arc4random() random()
At least this way the code looks like it’s using unpredictable numbers. Another form of deception is to leave misleading comments.
/* true_random -- generate a crypto-quality random number. */
static int true_random(void)
{
/* crap. this isn't crypto quality, but it will be Good Enough */
srand((unsigned int)(((time_now >> 32) ^ time_now) & 0xffffffff));
return rand() & 0x0FFFF;
}
Much better without the comments.
static unsigned long
generate_hash_secret_salt(void)
{
unsigned int seed = time(NULL) % UINT_MAX;
srand(seed);
return rand();
}
appendix
Some random gold from Daily WTF.
Now, of course, maybe all these examples are quite safe in context and I’m over reacting because nobody in real life would ever use rand to generate disk encryption keys.
TIFU by using Math.random has some interesting details about how just how bad random can be.
I guess it’s good news that malware authors repeatedly make the same mistakes?
Voting machines make it easy to guess the seed time by printing it out for you.
It was noted in 2012 that libexpat random is weak. And then nothing happened for three years. Then came a 2015 redhat bug. Now in 2016 it’s not one but two more CVEs. One could use arc4random like OpenBSD, but then your secret hashes would be unpredictable.
7z uses the tried and true time and pid technique.
Nobody will ever guess your 32-bit time seeded bitcoin wallet.