random ip id comments
There’s a new paper, From IP ID to Device ID and KASLR Bypass, which I liked. It’s at the intersection of networking, old but not obsolete standards, random, security, and implementation defined behavior. By all means, read the paper, but the really short version is they accomplished two things. They reverse engineered a per host random seed from network traffic on Windows and Linux, allowing fingerprinting, and more surprising, turned this into a KASLR break on Linux. Pretty wild.
prior
IP ID is a field in IP packets to handle fragmentation. When a packet is fragmented, the receiving host needs to know which fragments belong to which packet. Much like IP address (and probably port) identify a packet. The IP ID is just a 16 bit field, however. It’s important to avoid collisions, or fragments won’t be assembled properly. In the early days, the simplest thing to do was to simply increment a counter for each fragmented packet. The assumption is that by the time the counter wraps, all the old fragments will have reached their destination.
This simplicity enables a few tricks. One early paper is A Technique for Counting NATted Hosts which uses global linear IP IDs to count hosts. A later paper Counting NATted Hosts by Observing TCP/IP Field Behaviors, doi:10.1109/icc.2012.6364596, improved on this by including more fields. Finally, there’s A closer look at IP-ID behavior in the Wild which measured the number of hosts utilizing various IP ID generation techniques.
OpenBSD uses a random generation technique, with a twist, which I’ll get to later. Other systems use a variation of the counter technique, but instead of just a global counter, they calculate an offset derived from the connection to prevent the above identification techniques and other tomfoolery.
algos
There are three algorithms described in the paper for IP ID generation. Global counter. Global counter with local offsets. Random generation with lookback.
This last technique, used by Apple, involves maintaining a queue of recently used IDs, generating new ones randomly, and then checking they are not reused. This solves several problems. There’s no global counter and no offset to reverse. Unlike pure random algorithms, the probability of reuse is reduced.
OpenBSD, using an algorithm originally from Dragonfly, does a variation of this third algorithm. There’s an array of all 64K possible IDs. It’s shuffled. The next entry is selected, and then shuffled back into the array at an offset of 32K. This prevents it from being recycled too quickly.
derezzing
Klein and Pinkas, in the linked paper, look at the IP ID from many different connections, do some whiz bang math, and calculate the original global counter. And just like that, partying like it’s 1999.
What’s even more interesting is that part of the calculation used in Linux includes the address of a global variable, and thus calculating the global seed also reveals the kernel load address.
KASLR
There’s some debate about the benefits of fine grained versus coarse grained ASLR. Coarsely, we can randomize the load address of a program (or the kernel). Or we can chop the program up into pieces, and randomize all of them. The theory behind fine grained ASLR is it makes the attacker guess more addresses. The argument against the additional complexity is that all it takes is an infoleak to reveal as many addresses as the attacker needs.
This bug provides an interesting couterpoint to that argument. Only one address is leaked here. The attacker cannot send more packets to learn more addresses.
The implication is that if one can identify an Android device, one knows the kernel layout, and using this IP ID derezzing technique, can learn the load address, fully derandomizing the kernel. If on the other hand, different devices used different linkages of the same kernel, with symbols arranged in a different layout, then leaking a single kernel address would be of little use.
16bits
Just a small footnote to say that DNS also relies on 16 bit IDs to identify requests.