Friday, 2 November 2012

Memory-Hard Password Hashing in Java


As mentioned in my last post, password hashing is about trying to use your resources to outwit an attacker. Attackers have lots of CPU power to throw at the problem of guessing our users' passwords, and even have statistical methods for determining a user's most likely passwords.

What we need to do is slow the rate at which the attacker can make guesses. Sadly, slowing them down directly isn't feasible, as CPU power only ever increases. Instead, we need to increase the cost of guessing a password. For that to happen, we need to make the hardware required to guess a password as complex as possible, to impede attackers implementing our algorithm in hardware (e.g. on FPGAs) or farming it out to GPUs.

Any source code in this article is instructional only: use the real scrypt in production!

Increasing Complexity

In order to increase the complexity of the hardware that an attacker needs to attack the system, we must not rely too heavily on the CPU, or the GPU, or any other single piece of hardware; we must use them in concert. For general purpose computing machinery, this means putting the server's memory to use as well.

But, you say, what about in 5 years' time, when GPUs have 10 GiB of onboard RAM, and FPGAs have logic cell counts in the billions? As in the previous post, we need a way to increase the complexity of the password function as time goes on.

The heart of the scrypt algorithm is referred to as "SMix", which is actually an instance of ROMix (also defined in Percival's paper) with some specified parameters.

The Heart of The Matter

ROMix is quite a simple algorithm, once you get it.

You fill an array with the repeated hash of the password, then you repeatedly hash the running result xor'd with pseudo-randomly chosen elements of that array.

In Java, this looks something like this:

import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;

public class ROMix {
  private final MessageDigest hash;

  public ROMix(MessageDigest hash) {
    assert hash != null : "ROMix cannot work with no hash algorithm.";
    this.hash = hash;
  }

  public byte[] mix(byte[] input, int strength) {
    int n = 1 << strength; // n = 2^strength
    List<byte[]> hashes = new ArrayList<>(n); // A list of n hashes.
    byte[] digested = hash.digest(input);

    // First, we iterate the hash, saving each result into the list.
    for (int i = 0; i < n; i++) {
      hashes.add(digested);
      digested = hash.digest(digested);
    }

    // Next, we access the list of hashes in a pseudo-random (but
    // predictable) fashion.
    byte[] previous = hashes.get(n - 1);
    byte[] chosen = hashes.get(Math.floorMod(integerify(previous), n));

    for (int i = 0; i < n; i++) {
      digested = hash.digest(xor(previous, chosen));
      // Here is where we access the list of hashes in a random-looking
      // way, forcing either expensive recomputation, or storing the
      // hashes in RAM (also expensive).
      chosen = hashes.get(Math.floorMod(integerify(digested), n));
      previous = digested;
    }
    return digested;
  }

  private byte[] xor(byte[] a, byte[] b) {
    assert a.length == b.length : "Cannot xor two byte arrays of dissimilar length.";
    byte[] res = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
      res[i] = (byte) (a[i] ^ b[i]);
    }
    return res;
  }

  /**
   * This method is designed to take a byte[] object (e.g. a hash) and
   * return an integer based on the contents of that byte[].
   * It does not need to represent the contents of the byte[] exactly in
   * one int.
   * @param a The byte[] that you wish to turn into an int.
   * @return A fuzzy integer-ised version of the byte[].
   */
  private int integerify(byte[] a) {
    assert a.length >= 4 : "Cannot integerify something with fewer than four bytes.";
    int res = 0;
    // Combine the last four bytes.
    for (int i = 0; i < 4; i++) {
      res |= (a[a.length - 1 - i] & 0xFF) << (8 * i);
    }
    return res;
  }
}

Now, this is a fairly complex class, and frankly, the mathematical expression of it is far simpler: just two basic recurrence rules (and in my notes I assume that xor and integerify are well defined, rather than providing a definition).
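As a sketch of those two recurrences (following the ROMix definition in Percival's paper; here H is the hash, P the password, and N = 2^strength):

```latex
\begin{aligned}
V_0 &= H(P), & V_{i+1} &= H(V_i) && \text{for } 0 \le i < N - 1 \\
X_0 &= V_{N-1}, & X_{k+1} &= H\!\left(X_k \oplus V_{\,\mathrm{Integerify}(X_k) \bmod N}\right) && \text{for } 0 \le k < N
\end{aligned}
```

The first line fills the array; the second mixes, and X_N is the output.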

This class guarantees that all 2n iterations of the hash are needed (since the memory-intensive second loop requires the final hash of the first loop to start), and that it is difficult to predict which of the hashes from the first loop will be required in the second. Therefore, you have to store them all, and then perform n difficult-to-predict memory accesses. These memory accesses should be uniformly distributed over the first array.

It is these properties that make the function so expensive to compute: you need to be able to perform hashing operations quickly, and you need to be able to store the results of those computations and access them quickly.

Even if someone does manage to implement this on an FPGA, all you (the defender who has not yet lost all the data) have to do is increase the parameters to make it infeasible for that particular implementation to attack your hashes. You should be increasing the strength parameter regularly to ensure that cutting-edge hardware cannot guess your passwords at a reasonable rate.
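To illustrate how the strength parameter scales the cost, here is a small sketch. The 32-byte digest size (e.g. SHA-256) and the strength values are illustrative assumptions: each increment of strength doubles both the number of hash iterations and the RAM required.

```java
public class ScryptScaling {
  public static void main(String[] args) {
    int digestBytes = 32; // assumed digest size, e.g. SHA-256
    for (int strength = 14; strength <= 20; strength += 2) {
      long n = 1L << strength;          // number of stored hashes
      long memKiB = n * digestBytes / 1024;
      System.out.printf("strength=%d -> %d hashes, %d KiB of RAM%n",
          strength, n, memKiB);
    }
  }
}
```

At strength 14 that is only 512 KiB, but by strength 20 it is already 32 MiB per guess, which is what squeezes GPU and FPGA attackers.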

Still Alive

This method does not render you (or your users) completely immune to brute force attacks. If an end user picks a 5 character password, then you don't need many guesses (especially with the statistical methods mentioned earlier) to work it out.

With a 5 character (random) password, the worst-case brute force attack on a well-configured scrypt will take around 25 years (assuming a constant cost of guessing, which is false in reality). In practice, the attacker will be able to guess human-generated secrets with shocking efficiency, as well as parallelize the attack. If the attacker can get 1000 instances of scrypt running, the brute force attack takes only a week or so. 1000 instances is a small botnet these days.
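For concreteness, here is the rough arithmetic behind estimates like these. The 62-character alphabet and the one-second-per-guess cost are assumptions for illustration, not measurements:

```java
public class BruteForceEstimate {
  public static void main(String[] args) {
    double guesses = Math.pow(62, 5);   // 5 random chars from [a-zA-Z0-9]
    double secondsPerGuess = 1.0;       // assumed cost of one scrypt evaluation
    double secondsPerYear = 60.0 * 60 * 24 * 365;

    // Worst case on one machine: roughly 29 years.
    System.out.printf("One machine: %.1f years%n",
        guesses * secondsPerGuess / secondsPerYear);

    // The same search spread over a 1000-machine botnet: about 10 days.
    System.out.printf("1000 machines: %.1f days%n",
        guesses * secondsPerGuess / 1000 / (60.0 * 60 * 24));
  }
}
```

Note how the per-guess cost is the only lever the defender controls; the alphabet size belongs to the user.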

These methods (e.g. bcrypt, scrypt, etc.) still do not replace a good password policy. Sadly, the better your policy, the more difficult it is to get users through sign-up, and the more heavily your password reset tool will be used.

A better method is to have a relaxed baseline (say, nine characters) combined with a good strength meter that only reads strong when the password is truly strong: a pass phrase that is statistically unlikely given a large corpus of text, including upper case, lower case, digits and symbols. Even this doesn't render you immune to a user generating one really good pass phrase and using it everywhere, including low-security sites. Low-security sites which then suffer a breach and spill their users' credentials everywhere.

Where We're Heading

Well, where we should be heading, in my opinion.

I think that users should provide the sites they wish to authenticate against with a public key, and store the private key on a smart card or a modified YubiKey (See the clarification for a suggestion of that modification). If it's a smart card, the key should be encrypted using a good PIN.

Sadly, this is unlikely to happen. The YubiKey idea is the most implementable; the smart card could provide something similar.

But neither protects against a man-in-the-middle: both devices will hand over an authentication token even if the request is not legitimate. A demand for authentication could (theoretically) be passed to the smart card along with a nonce, timestamp and digital signature. If the signature is valid (as determined by an internal certificate, perhaps), the card gives out the authentication token; otherwise it is withheld.
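As a sketch of that verification step using the standard java.security APIs (the class and method names here are hypothetical, and a real token would perform this check in hardware against a certificate it holds internally):

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

public class SignedAuthDemand {
  // Only release the authentication token if the demand (nonce + timestamp)
  // was signed by a key the token trusts.
  static boolean demandIsValid(byte[] nonce, byte[] timestamp,
                               byte[] signature, PublicKey trustedKey)
      throws Exception {
    Signature verifier = Signature.getInstance("SHA256withRSA");
    verifier.initVerify(trustedKey);
    verifier.update(nonce);
    verifier.update(timestamp);
    return verifier.verify(signature);
  }

  public static void main(String[] args) throws Exception {
    KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
    gen.initialize(2048);
    KeyPair serverKey = gen.generateKeyPair();

    byte[] nonce = new byte[16];
    new SecureRandom().nextBytes(nonce);
    byte[] timestamp = Long.toString(System.currentTimeMillis()).getBytes();

    // The server signs the demand...
    Signature signer = Signature.getInstance("SHA256withRSA");
    signer.initSign(serverKey.getPrivate());
    signer.update(nonce);
    signer.update(timestamp);
    byte[] sig = signer.sign();

    // ...and the token only answers if the signature checks out.
    System.out.println(demandIsValid(nonce, timestamp, sig, serverKey.getPublic()));
  }
}
```

The nonce and timestamp stop a captured demand from being replayed later.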

Until then, as users of shared-secret based authentication systems, ensure that you use a good pass phrase on every site you visit, and that the pass phrase is unique to that site. A really good way to do that is to use something like LastPass or KeePass, which will generate a good, unique password for every site you visit, and store it encrypted on your machine.
