Saturday, 14 September 2013

Let's Build a Crypto Ecosystem


Let's pretend for a moment that we didn't have such fantastic crypto primitives such as AES, SHA3 and so on, and we'd not (for some reason) never needed crypto, how would we go about building a 'modern' crypto system?

Firstly, this is meant as a basic overview of how most crypto primitives that are in use today are put together. Not the "most interesting" bits, but the pieces that do all the heavy lifting -- the symmetric ciphers.

What is a cipher suite, and what exactly is AES-128-CBC-HMAC-SHA1 ?  We'll get into what these sort things mean, but first we have to start somewhere.

Jumping Off

Block ciphers typically take as input a key and a "block". A block is just a fixed-width collection of bits. The Data Encryption Standard (DES) takes 64-bits, and the Advanced Encryption Standard (AES) takes 128-bits. The great thing about this is that they are reasonably easy to reason about, lend themselves to parallel usage (depending on the usage) and also relatively easy to implement in hardware.

For a really basic block cipher, we'll be pinching Heys' SPN cipher from his Linear & Differential Cryptanalysis Tutorial. I've chosen it because it's reasonably easy to understand, implement and the design is small and freely available.

Heys' paper does not specify a key schedule. I am going to decree that the key is 80-bits long, which gets broken into 5 16-bit subkeys to be fed wholesale into the relevant parts of the cipher.

I have put my implementation up at on a Gist. We can use this a a basic implementation for some fun work.

Note, I've used this cipher for an additional reason: to drive home the fact that you never, ever write your own crypto. That includes plugging together AES into AES-CBC mode, or making your own C-MAC from a block cipher. Just don't do it, this blog post is for demonstration purposes only.

If you write your own crypto in production, you will lose the security game, your data is not safe.

With that out of the way, let's move on.


If we want to go down the route of academic correctness, we must first convince ourselves that our fancy new cipher is a Psuedo-Random Permutation (PRP).  That basically means that an expected adversary (i.e. one with normal, classical computers) cannot distinguish the cipher from a random permutation function.

We aren't going to go into what that exactly means, but a decent way to see how good a PRP our cipher makes is to turn it into a PRNG and see how well it handles the FIPS 140-2 PRNG tests. These just throw a small battery of statistical tests at our random data, and see how it fares. For reference, let's see about /dev/urandom:

[Sat 13/09/14 21:52 BST][pts/3][x86_64/linux-gnu/3.5.0-37-generic][5.0.0]

zsh/2 936  (git)-[master]-% cat /dev/urandom | rngtest
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABIL

rngtest: starting FIPS tests...
rngtest: bits received from input: 109576192
rngtest: FIPS 140-2 successes: 5474
rngtest: FIPS 140-2 failures: 4
rngtest: FIPS 140-2(2001-10-10) Monobit: 0
rngtest: FIPS 140-2(2001-10-10) Poker: 0
rngtest: FIPS 140-2(2001-10-10) Runs: 3
rngtest: FIPS 140-2(2001-10-10) Long run: 1
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=1.051; avg=42.559; max=19073.486)Mibits/s
rngtest: FIPS tests speed: (min=956.476; avg=53182.979; max=71806.066)Kibits/s
rngtest: Program run time: 4499000 microseconds

A few failures are to be expected; it's supposed to be random, so "extreme events" do occasionally occur; they just should occur too regularly ;-)

Let's make our cipher into a PRNG. We're going to simply write to a file E(0,i) where i is [1..b] so we end up with ~b random bytes. A quick implementation of this is available

Let's run our small test:

[Sat 13/09/14 21:51 BST][pts/3][x86_64/linux-gnu/3.5.0-37-generic][5.0.0]

zsh/2 934  (git)-[master]-% cat randomBytes.txt | rngtest 
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABIL

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 2097152
rngtest: FIPS 140-2 successes: 0
rngtest: FIPS 140-2 failures: 104
rngtest: FIPS 140-2(2001-10-10) Monobit: 104
rngtest: FIPS 140-2(2001-10-10) Poker: 104
rngtest: FIPS 140-2(2001-10-10) Runs: 104
rngtest: FIPS 140-2(2001-10-10) Long run: 71
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=3.328; avg=74.461; max=19073.486)Mibits/s
rngtest: FIPS tests speed: (min=1.056; avg=30.812; max=97.813)Mibits/s
rngtest: Program run time: 93448 microseconds

Note the line "rngtest: FIPS 140-2 failures: 104" and how many failures are listed, now back to /dev/urandom. Take that as a warning; this isn't even the DieHarder tests.


A note on notation: E(k,P) means encryption a plaintext P under key k. D(k,C) means decrypting ciphertext C under key k. So for example, the statement D(k,E(k,P)) means encrypting and then decrypting a plaintext, P, under the same key. Incidentally, for any cipher, this should always result in P.

Now, assuming we have our Heys' Cipher implemented, we can just start throwing bytes at it, right? So we have plaintext P = P1, P2, ..., PN (our plaintext is composed of several plaintext "blocks", each of length 16 bits), so we can just compute C = E(k, P1), E(k, P2), ..., E(k, PN) and have a sufficiently scrambled message. Decryption in this case is obvious.

Well, you'd think, but it's actually a really, really Bad Idea. This is a cipher mode all to itself, and it is known as Electronic Code Book, or ECB.

This is not so good. We know that E(k, p) is deterministic. This means that if C1 = C, and the attacker knows they were both encrypted under the same key, then the attacker knows that E(k, P1) = E(k, P) => P1 = P. The attacker can then do all sorts of fun things, such as re-sending the same block to cause havoc.

So, how do we modify things so that we don't get this issue? First, let's modify our notation slighty, so that P[i] means the i'th block of the plaintext P, and we'll also use '+' to mean xor.

A common way so to use something called Chained Block Cipher (CBC) mode. This is, in notation, expressed as C[i] = E(k, C[i-1] + P[i]).

This basically means that for each block you wish to encrypt, you xor the result of encrypting the previous block with the current plaintext block. This works surprisingly well, but it has a fatal problem. How do you do P[1], the first block?

Well, the normal way is to just generate a completely random block and to package that up in the clear with your message and send it on. This is known as an Initialisation Vector, or IV.

There are also other issues with this, one of them is performance. You can't access C[i] until C[i-1] is computed. This mode is inherently serial, you can't get any faster by throwing processors at it, you have to simply buy a faster processor. Not good in today's CPU market.

How do you get around this? Well, another mode of operation is Counter (CTR)  mode. Conceptually, this is much simpler, and very similar to how I turned the Heys Cipher into a PRNG. It is expressed simply as C[i] = E(k, f(IV, i)) + P[i].

In words, this is encrypting an IV (often just 'nonce') with a counter of some sort. The resulting stream is then xor'd with each plaintext block. This is basically turning the block cipher into a stream cipher, and can be used as such. It should be noted that the function f in the above expression is almost always simple addition, however this was not always the case.
This has a few benefits, the most obvious being that it is easily parallelised. 

So, we can easily construct a working cipher system, right?

Well, not quite. In my next post, I'll go over authentication. Authentication is a crucial part of any modern cipher system. If you're involved in designing a cryptosystem, and you're not thinking about authentication, you probably should not be designing a cryptosystem.