some gripes about nacl
Making encryption easier and accessible is all the rage. From a programming perspective, one of the most frequent suggestions is to use nacl. I have a few gripes with it.
The good news is that what works, works robustly. The algorithms included were generally designed with solid implementations in mind. nacl certainly raises the bar in terms of making it difficult or impossible to do bad things, but it’s still rather difficult to work with.
The build system is a little weird. It builds static libraries and benchmarks them to find the one that performs best on the build host. Unfortunately, this means it’s tied to the build host. If nacl picks up that your new CPU has some advanced instructions, it will use them, resulting in a library that won’t work on older CPUs. It’s hard to ship binaries when you’re uncertain of what CPUs they will run on.
The C API is a little wonky.
Consider crypto_box.
WARNING: Messages in the C NaCl API are 0-padded versions of messages in the C++ NaCl API. Specifically: The caller must ensure, before calling the C NaCl crypto_box function, that the first crypto_box_ZEROBYTES bytes of the message m are all 0.
Major pain in the butt. Understandable that the function is trying to avoid allocating memory, but making the caller deal with this particular detail is harsh. The C++ API doesn’t have this problem, but it also creates some rather large stack buffers. One for the incoming message and one for the outgoing message. Dealing with even moderately large messages could overflow the stack.
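Hiding the padding from callers without touching the stack doesn't take much. Here is a minimal sketch of a helper that builds the zero-padded buffer on the heap. The names pad_for_c_api and kZeroBytes are mine, not nacl's, and I'm assuming crypto_box_ZEROBYTES is 32. Real code should use the constant from the nacl header rather than a hardcoded number.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: stands in for crypto_box_ZEROBYTES, which real code
// would take from the nacl header instead of hardcoding.
constexpr std::size_t kZeroBytes = 32;

// Hypothetical helper: returns a heap-allocated copy of m with
// kZeroBytes zero bytes in front, the layout the C API expects.
std::vector<unsigned char> pad_for_c_api(const std::string& m) {
    // The vector constructor zero-fills every byte, including the prefix.
    std::vector<unsigned char> padded(kZeroBytes + m.size(), 0);
    // Copy the message in after the zero prefix.
    std::copy(m.begin(), m.end(),
              padded.begin() + static_cast<std::ptrdiff_t>(kZeroBytes));
    return padded;
}
```

The result's data() and size() could then be handed to the C function as m and mlen. Since the buffer lives on the heap, a large message costs memory but not stack.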
Speaking of large messages, although the C API is defined in terms of long long and the C++ API uses std::string, internally the implementation uses int in many places. Not good things will happen if you operate on messages larger than 2GB on a 64-bit machine.
size_t mlen = m.size() + crypto_box_ZEROBYTES;
unsigned char mpad[mlen];
for (int i = 0;i < crypto_box_ZEROBYTES;++i) mpad[i] = 0;
for (int i = crypto_box_ZEROBYTES;i < mlen;++i) mpad[i] = m[i - crypto_box_ZEROBYTES];
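As long as int counters like these are in the library, about all a caller can do is refuse lengths that won't survive the trip. A rough sketch of such a guard follows. The function name checked_padded_length and the constant kZeroBytes are hypothetical, and the value 32 is my assumption for crypto_box_ZEROBYTES.

```cpp
#include <climits>
#include <cstddef>
#include <stdexcept>

// Assumption: stands in for crypto_box_ZEROBYTES from the nacl header.
constexpr std::size_t kZeroBytes = 32;

// Hypothetical guard: returns the padded length, or throws if the
// padded length would not fit in an int counter.
std::size_t checked_padded_length(std::size_t message_len) {
    const std::size_t limit = static_cast<std::size_t>(INT_MAX);
    // Compare against limit - kZeroBytes so the addition below can never wrap.
    if (message_len > limit - kZeroBytes) {
        throw std::length_error("message too large for int-based loops");
    }
    return message_len + kZeroBytes;
}
```

Calling this before any padding or encryption turns a silent truncation into a loud error, which is about the best one can do from outside the library.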
At times, the documentation still recommends a few practices that could lead to mistakes. The discussion of nonces, which are pretty critical to the security of a function like crypto_box, is saved for a later security model section, not the API documentation. Almost like an afterthought.
Nonces are long enough that randomly generated nonces have negligible risk of collision.
This is great. But why do I need to generate it myself? Asking for trouble. If one follows just the C API documentation, it can be very easy to accidentally omit the critical randombytes call to initialize the nonce. Or...
There is no harm in having the same nonce for different messages if the {sender, receiver} sets are different. This is true even if the sets overlap.
Telling people they can reuse a nonce if they’re very careful is really asking for it. I’d wager that anybody building a complicated sender/receiver nonce database is far more likely to screw it up entirely.
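What I'd rather have is an interface where a fresh random nonce is the only kind you can get. A minimal sketch of that follows. The name fresh_nonce and the constant kNonceBytes are mine, I'm assuming crypto_box_NONCEBYTES is 24, and reading /dev/urandom is a stand-in for the library's own random byte source so the example needs nothing but the standard library.

```cpp
#include <array>
#include <cstddef>
#include <fstream>
#include <stdexcept>

// Assumption: stands in for crypto_box_NONCEBYTES from the nacl header.
constexpr std::size_t kNonceBytes = 24;

// Hypothetical helper: returns a nonce filled from the OS random device.
// Throws rather than ever handing back an all-zero or partly filled nonce.
std::array<unsigned char, kNonceBytes> fresh_nonce() {
    std::array<unsigned char, kNonceBytes> nonce{};
    // Assumption: a Unix-like system that provides /dev/urandom.
    std::ifstream urandom("/dev/urandom", std::ios::in | std::ios::binary);
    if (!urandom.read(reinterpret_cast<char*>(nonce.data()),
                      static_cast<std::streamsize>(nonce.size()))) {
        throw std::runtime_error("could not read random bytes for nonce");
    }
    return nonce;
}
```

A wrapper around crypto_box could call something like this itself and send the nonce along with the ciphertext, so the caller never has the chance to forget it or reuse one.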
It’s missing some of the newest hotness.
crypto_sign says it’s just a prototype. It’s said that for three years now. In fact, the last release of nacl was four years ago. This wouldn’t be a problem if it were done, but the web site has lots of plans for future releases.
The encryption cipher used is XSalsa20, which isn’t a problem per se, but most other applications (TLS, SSH) are using the slightly newer ChaCha20. Building something with XSalsa20 makes it more annoying to build an interoperable version with another library that may only include ChaCha20.
It’s also missing a few key features useful in a library like this, such as a key derivation function.
More good news is that many of these gripes are addressed by libsodium. Better build system. Shared libraries. Various new functions (easy and detached variants of many functions). Preliminary password hashing.