Tuesday 18 December 2012

Overcoming The Ban

Overview

I am a member of the Pirate Party UK, and there are a number of very good reasons to support them. This blog post isn't about that, though; it's about censorship on the web.

The Pirate Bay has been blocked by major ISPs in the UK since about February of this year, and recently, the power to block sites was abused, shutting off access to the Promo Bay.

This article is about how to keep your connection to the network free and unhindered during these troubling times.

Why Bother?

This is absolutely necessary, as giving the government or corporations the power to silence people at will is a really bad idea, for numerous reasons. 

First and foremost is that freedom of association and speech is vital for a civilized society. The correct response to speech that you don't like is more speech, not an attempt to silence the speaker.

Secondly, giving these powers to people is poor civil hygiene. Why give someone a power that can be abused? You don't know the kind of people that will be in power in the future. 

Let's have a thought experiment. Assume the BNP get into power. They decree that all immigrants are bad, and that sites associated with foreigners should be banned. The result of this is that perfectly legitimate sites will be banned; for example, many sites about Islam would be illegal, as would many sites critical of Christianity (there are a lot of atheist videos on YouTube!). This brings sites like Reddit, YouTube, Wikipedia, etc. into the firing line.

Thirdly, a broken business model is not an excuse to infringe on a person's rights. Times have moved on, and as yet, the publishing industry is failing to do so. The copyright lobby represents a government-backed monopoly that morally should not exist, so why are we doing all that is in our power to strengthen their monopoly?

Beat The Ban

There are three broad methods for beating the ban. The first is to rely on web proxies, like the Pirate Party's; the second is to use a VPN service in a more free country; and finally, you can use The Onion Router (Tor) for the ultimate power.

Web Proxies

These are the simplest, easiest-to-use methods of overcoming the ban. Instead of navigating to The Pirate Bay directly, you navigate to a web page which displays the contents of The Pirate Bay to you.

You can find a list of such proxies on Pirate Reverse, but Google will serve you well too.

Problems?

If your ISP is very much against all of this, you can bet that they'll be adding proxies to the list of banned sites very soon. 

Not to mention the BPI, MPAA & RIAA going after the proxy hosters directly and taking them offline. They're far too centralised.

Your ISP can also see that you are visiting a known proxy, and potentially take actions based on that fact. Not good.

Virtual Private Networks

By using a Virtual Private Network (VPN) provider in a more free country, you can easily get around the ban. 

Using a VPN means that you basically make an encrypted connection to the VPN provider, and then you appear to be using their network. This can also be achieved by using a virtual private hosting service, and forwarding all of your traffic through that (usually using SSH).

These are traditionally used for pushing large amounts of "illegal" data through a connection, but can be used specifically to circumvent the ban. 

One which is advertised by TorrentFreak is Private Internet Access. I've never used it, so I can't vouch for it. Have a look around for one that suits you if you choose this route.

Problems?

Usually, VPN services cost money or are severely limited, and the VPN provider itself is a single point of failure. If you're only using it to access the front page of The Pirate Bay, it's also a little bit overkill.

The Onion Router

The Onion Router (Tor) is a fantastic bit of technology. By wrapping up your data in multiple layers of encryption and sending it into the network to be bounced around, you can be sure that no single node can determine both the source and destination of the data. Most of the exit nodes (the places where your data leaves the network and enters the rest of the internet) are in relatively free countries, so accessing The Pirate Bay via this method is extremely effective.

It also means that your ISP can't see what you're browsing. All they can see is "Tor user". You could be looking at literally anything: from kittens to The Pirate Bay, from medical information to bomb-making instructions.

Tor is the ultimate in censorship-free browsing, but it does not guarantee anonymity. Your movements online can be tracked by things like tracking cookies and other techniques, so be sure to use the full Tor Browser Bundle, which is pre-configured to get around these issues.

Problems?

Tor can be a pain to set up, but it should be very easy when using the Tor Browser Bundle. 

It can also be quite slow to use, but if all you're doing is loading up your favorite torrent sites to view some magnet links, you're basically sorted.

Do not push BitTorrent traffic through Tor. It was not designed for this, and it is unfair to the operators of the relay and exit nodes who make their bandwidth available for those who really need it; for example, Syrian dissidents and the like.

Conclusion

The ban is bad, but more than that, it has the potential to spiral out of control and seriously damage the ecosystem of the internet. We should reject censorship unconditionally, and take up our tools to assert our rights.

Information is always going to make its way in and out of countries, through blockades and around damage. The only question is whether it arrives in a timely and convenient manner.

Saturday 17 November 2012

Suicide

Anger

Long story short, I was in a queue for food in the cafeteria at work, with a postgrad behind me. She was loudly wittering on the phone to someone, complaining about everything. I wrote her off as just a little bit self centered and left it at that.

Right up until she said her train was late, "because some woman was walking on the tracks" shortly followed by "If they want to kill themselves, we should just let them!"

At that point, I started to get angry. However, I did not say anything. I bit my tongue.

For reference, this blog post is not directed at those with mental health issues, per se. It is more directed at those who judge them.

Just Let Them

This is the feeling that a lot of the general populace have about people with mental health issues. It is, at its core, an argument based on a fundamental misconception about mental health issues and the people who suffer with them. Not only is the argument flawed, it also precludes the option of treatment in so many cases.

As a quick disclaimer, I am not a mental health care professional, and I do not suffer with any mental health issues to the best of my knowledge. I therefore fill the middle ground; the lay folk, the ones farthest from the problems in most cases. Despite this, I've had a lot of exposure to mental health issues: from close friends and lovers suffering with bipolar disorder, PTSD, anxiety, depression, self-harm, paranoia and suicidal tendencies, to actively working on a helpline during my undergraduate days.

The Faulty Premise

There is one core, faulty premise at the heart of the "just let them" argument. That premise is that people with mental health issues somehow chose their suffering. That is to say, they're either responsible for starting the illness, responsible for not fixing it, or somehow deserve it because of how the illness makes them behave.

These people are usually suffering because of a genetic predisposition, which they didn't choose; environmental conditions during their childhood, which they didn't choose; or a traumatic experience during their lifetime, which they often did not choose.

Notice how not one of those criteria is something that was chosen by the sufferer. 

What this means is that the sufferers are not choosing to end their lives. They are choosing to end their suffering. Suicide is the last way out that they can see. If every day you were in agony, and your doctors had told you that they'll see what they can do, but it might never work, wouldn't you look for other options?

I'm choosing to ignore the "what about the sufferer's family and friends?" retort here, as it devalues those who have no family or friends. Think of an old person who has outlived their friends and family, or many children who are in care. Those are some of the most vulnerable in our society, and to devalue them in such a way is to put them at further risk.

Treatment

Since the "just let them" argument simply shifts the blame to the sufferer, it's not difficult to make they leap to "they did it to themselves, they can fix it themselves".

If I became injured, the NHS would treat me to the best of their abilities. Even if it was in a car accident that was my fault, the NHS would patch me up. I'm not precluded from treatment because of how I became injured. I would even get on-going physiotherapy if I needed it. Why is this ethos not extended to those with mental health issues?

Epilogue

I'd like to state that I am not against euthanasia. If this seems like a contradiction, I'll explain it. I believe that every person has the right to decide their own fate. However, I feel that the best way to deal with mental health issues is preventative. Ensure that the person has the best treatment available to them to improve their quality of life. 

Just because a person's quality of life has fallen, doesn't mean that it has to remain low -- there may be an option to improve their quality of life. However, if it is not an option, or it is not working, then the person should be able to decide their own fate, safely, quickly, as painlessly as possible.

Wednesday 7 November 2012

OATH Analysis

Overview

Here, we go over the OATH specification, and point out some potential problems.

The OATH specification details a couple of methods of producing One-Time Passwords (OTPs) and verifying them.

I am going to leave the significantly more complex OCRA challenge/response algorithm for another blog post.

HOTP Algorithm

The HMAC-Based One-Time Password (HOTP) scheme allows for the generation of OTPs that can be verified by a server. They are only properly suited for authenticating to one verifier, as they contain a counter which can become out of sync between the HOTP generator and the verifier.

The algorithm is as follows: the generator has stored on it a key, K, and a counter, C. In order to generate an OTP, the generator produces HMAC-SHA-1(K, C), truncates it to not less than 6 decimal digits and displays it. Mathematically, this can be expressed as:

HOTP(K, C) = StringToNumber(DynamicTruncate(HMAC-SHA-1(K, C)))

DynamicTruncate is an interesting function. It takes the last nibble (that is, the bitwise AND of the final byte and 0xF) and uses it as an offset into the output of HMAC-SHA-1(K, C), selects 4 sequential bytes starting at that offset and treats them as a 31-bit integer (by taking the bitwise AND of the most significant byte and 0x7F).

StringToNumber then reduces the dynamically truncated value mod 10^{Digits}, where Digits is the number of digits you want the output to be. Digits is expected to be 6, 7, or 8.
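
To make those steps concrete, here is a minimal HOTP sketch in Java, following the description above. The key, counter and digit count are illustrative inputs, and the class and method names are mine, not part of the specification.

import java.nio.ByteBuffer;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class Hotp {

  public static int hotp(byte[] key, long counter, int digits) throws Exception {
    // Encode the counter as an 8-byte, big-endian value.
    byte[] counterBytes = ByteBuffer.allocate(8).putLong(counter).array();

    // HMAC-SHA-1(K, C)
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(key, "HmacSHA1"));
    byte[] hmac = mac.doFinal(counterBytes); // 20 bytes

    // DynamicTruncate: the low nibble of the final byte picks an offset,
    // and the 4 bytes from that offset form a 31-bit integer.
    int offset = hmac[hmac.length - 1] & 0x0F;
    int binary = ((hmac[offset] & 0x7F) << 24)
               | ((hmac[offset + 1] & 0xFF) << 16)
               | ((hmac[offset + 2] & 0xFF) << 8)
               |  (hmac[offset + 3] & 0xFF);

    // StringToNumber: reduce mod 10^Digits.
    return binary % (int) Math.pow(10, digits);
  }
}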

Analysis

The key is a shared secret and is required to be retrievable as plain text. This suffers from many of the same issues covered in a previous blog post, primarily credential leaks.

The algorithm isn't stateless with respect to its previous executions; that is to say, if the counter gets out of sync, the server should stop accepting the OTPs.

TOTP Algorithm

The Time-based One-Time Password (TOTP) scheme is largely the same as the HOTP algorithm, except that the counter is replaced by a time-based parameter, T, often constructed so that it updates once a minute. Mathematically, this is:

TOTP(K) = HOTP(K,T)

T = floor((CurrentUnixTime - T0) / TimeStep)

Where T0 is the time to start counting steps. One can tweak the StringToNumber function to give 6 to 8 digits, just the same as in the HOTP scheme.
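
As a sketch of how T plugs into the HOTP function, reusing the hotp() method from the earlier sketch. A T0 of zero and a 60-second time step are illustrative choices, not fixed by the specification.

public class Totp {

  private static final long T0 = 0L;          // When we started counting steps, in seconds.
  private static final long TIME_STEP = 60L;  // One-minute time slices.

  public static int totp(byte[] key, int digits) throws Exception {
    long currentUnixTime = System.currentTimeMillis() / 1000L;
    long t = (currentUnixTime - T0) / TIME_STEP; // floor((CurrentUnixTime - T0) / TimeStep)
    return Hotp.hotp(key, t, digits);
  }
}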

Analysis

There is no counter based on the number of times the algorithm has been executed, which means that, given the key is shared between multiple verifiers, they can all independently verify the OTPs without needing to sync their counters. This is nice in some ways, but very bad in others, as it could be seen to be an encouragement to share keys.

There is also the issue of needing an OTP twice during the same time slice. What becomes of the old one? Do you just accept it? This makes it susceptible to fast replay or MITM attacks, and a person who realises they've given away an OTP they shouldn't have must wait for it to become invalid, rather than being able to manually invalidate it.

Conclusion

Broadly speaking, criticisms that can be levelled at one can be levelled at the other.

Output Format

The output format that was decided on in the introductory parts of the paper makes some sense: a 6 to 8 digit OTP can be entered on almost any device, and 6 to 8 digit LCD displays are cheap, making it possible to produce these devices cheaply in hardware.

It also facilitates an 'air-gap' device, which is where the code is displayed on a screen. A human can be expected to reliably type 6 to 8 digits; a 50-character digital signature, not so much. This means that the OTPs can be used everywhere: home computers, laptops, touch screen devices, mobile phones, locked-down internet cafes with no USB ports, and so on.

However, such a limited output format completely precludes the possibility of a stronger form of authentication. For example, outputting an ID, counter/timestamp and a digital signature would be very good, and prevent the need for shared secrets.

Shared-Secrets

If you read this blog regularly, you are probably well aware that I find the idea of shared secrets to be quite unnerving.

Not only that, but the requirement to have the keys in plain text for computing HMAC-SHA-1(K, C) is somewhat problematic. You can't protect them cryptographically. What that means is that serious institutions would have to invest in a tamper-resistant hardware key storage module. This makes a wide deployment of these for a high-security system slightly more expensive, and any software would need to interface with it.

This means that for most purposes, you should generate a new key for every service that you wished to authenticate against, as no-one can reliably be trusted to protect your key. This is no better than needing a unique pass phrase for each service that you authenticate with in many respects. Yes, the secret will be hard to guess, but spilling it will be awful.

I can envision a smart phone app which supports OATH. It would store a unique key for every service that you want to authenticate with, and only provide an OTP from an encrypted keystore when a master pass phrase was presented. I am tempted to write one just for the hell of it and get back into Programming Android.

Such an application would be great as a second layer of authentication for most services.

Other Attacks

Brute-Force

Given that the shared-secrets are recommended to be 160 bits, brute forcing the secret is not likely to be an option.

However, if a server does not rate limit my guesses, I only have to enumerate 10^{Digits} worth of codes to have a verification server deem my OTP valid. Typically, this is only 1,000,000 requests to a server, which modern hardware can do easily -- especially with a botnet.

Replay

Broadly speaking, a correctly implemented HOTP verification server is not vulnerable to a replay attack. However, a TOTP verification server may be.

Man-in-the-Middle (MITM)

All of the OTPs are vulnerable to relatively quick MITM attacks. Phishing attacks and the like are a real danger to this system.

Overall

Overall, I would say that an HOTP/TOTP capable device is more versatile than a YubiKey, but also less secure, as it is expensive for an organisation to protect its shared secrets, and shared secrets cannot be cryptographically hashed to secure them.

Hopefully, OCRA may solve some of the above problems without impeding the usability of the device, but I've yet to look into it.

Saturday 3 November 2012

Shared-Secret Authentication

Overview


Authentication is difficult. Barring the issues of lost passwords, weak passwords, the philosophical concept of "identity", and a host of other issues, we'll be looking at performing authentication "correctly" for users.

Issues

Credential Leaks

This is a simple attack. An attacker somehow gets their grubby little mitts on a set of (username, secret) pairs and the jig is up.

But even if you're storing a hash of the user's secret, there are still pit-falls.

Timing Attacks

Timing attacks are often overlooked, but they're usually only a problem for high-value targets. They almost always require a high level of knowledge about the internal system, but can be used to do things like enumerate all of the usernames on your system.

This can be a problem: what if, like Facebook, your username is the user's email address? This is bad, because it means that you are responsible for leaking the data to an attacker, who may just add the addresses to a spam list, or use them as another step in an attack on a user.

Online Brute-force Attacks

These are the simplest to imagine. Just start guessing the credentials and sending them to the server for authentication. You have to know nothing about the system to do this, with the exception of where it is.

These attacks can become more sophisticated, by using a botnet to distribute the authentication requests around the world, and by statistically guessing credentials. When combined with a timing attack to enumerate usernames and good statistical guessing of passwords, this actually becomes quite a formidable threat.

Replay Attacks

A replay attack is where an attacker captures a valid (usually encrypted) request to a system. The system doesn't distinguish one request from another, so when the attacker re-issues the request, the system gladly carries out the request again. This could be as simple as logging the user in, or moving money from one account to another.

Defences

Credential Leaks

These come in many forms. 
  • You leak your database, and the shared-secrets (i.e. passwords) are not protected by a hash.
  • You leak your database, and the shared-secrets are protected by a weak hash algorithm.
  • The credentials are passed in plain text over the wire on the way to your server.
  • The credentials might be extracted from a "secure" tunnel. 
Given the broad nature of this issue, I will issue a few simple golden rules.

Ensure that any shared secrets are stored in your database using a strong hashing algorithm. I recommend scrypt and a reasonable password policy. With reasonable parameters, this ensures that any attacker has to spend a lot of cash to break a scrypt-hashed password. This covers the first two bullet points above.

And don't forget that, when hashing the password, you need to hash it with a salt which is cryptographically random and fairly long.
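
A minimal sketch of both points, assuming the lambdaworks scrypt library is on the classpath. The parameters (N = 2^15, r = 8, p = 1, a 16-byte salt and a 32-byte derived hash) are illustrative, not a recommendation for your particular threat model.

import java.security.SecureRandom;
import com.lambdaworks.crypto.SCrypt;

public class PasswordStore {

  private static final SecureRandom RANDOM = new SecureRandom();

  // A cryptographically random, reasonably long salt, unique per user.
  public static byte[] newSalt() {
    byte[] salt = new byte[16];
    RANDOM.nextBytes(salt);
    return salt;
  }

  // scrypt(password, salt, N, r, p, derivedKeyLength)
  public static byte[] hashPassword(String password, byte[] salt) throws Exception {
    return SCrypt.scrypt(password.getBytes("UTF-8"), salt, 1 << 15, 8, 1, 32);
  }
}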

For the last two, on a web application, always use SSL, and turn compression off. Do not store shared secrets in cookies or anywhere else on the client. If you're not writing a web application, ensure that you use something equivalent to SSL on the connection between the server and the client. If you can't have this, you're going to need something other than shared secrets for your authentication.

In general (for a web application), you want to ensure that your application is immune to XSS attacks, as it may actually be very easy for an attacker to use your site to send credentials elsewhere. This could be done by injecting a JavaScript keylogger into an otherwise perfectly valid page. If you're doing all this over SSL, then the subject of the attack will fully trust all of the script on the page to be from you, not an attacker, and will probably let the keylogger do its business. Not cool.

Timing Attacks

This is quite feasible. Given a login procedure something along these lines:
  1. Get the user's hash from the database. If there is no hash, return false.
  2. Hash the given secret and compare it to the one in the database. Return the appropriate value.
Given that a good hash function should take ~100ms per password hash, an attacker can send hundreds of username guesses and random passwords down the wire. The attacker then measures the response time. The ones with a large response time are the ones which had their random passwords hashed, which means that a username was present.

There are several ways to defend; the most common is to attempt to engineer your solution to take the same amount of time regardless of the path it takes. This conflates your authentication code with code which tries to defeat a timing attack.

A better method is to leave your authentication code as-is, and wrap your authentication in an object which forces all requests to have the same average response time, regardless of whether or not the authentication was successful.

This has the benefit that, if your authentication process needs to query an external system, as well as internal systems, the external system need not be timing-attack resilient, and can be as simple as possible. It also keeps any code for checking locally-managed passwords simple.

The wrapper would simply keep track of the average time that it took to perform a successful authentication. In the event that the authentication was unsuccessful, the wrapper would force the authentication to take the average time in total. Some code which has this idea can be found on my GitHub.
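
A minimal sketch of that wrapping idea. The Authenticator interface and all of the names here are illustrative; this is not the code from the GitHub repository.

public class ConstantTimeAuthenticator {

  public interface Authenticator {
    boolean authenticate(String username, String secret);
  }

  private final Authenticator inner;
  private long successes = 0;
  private long totalSuccessNanos = 0;

  public ConstantTimeAuthenticator(Authenticator inner) {
    this.inner = inner;
  }

  public synchronized boolean authenticate(String username, String secret) throws InterruptedException {
    long start = System.nanoTime();
    boolean ok = inner.authenticate(username, secret);
    long elapsed = System.nanoTime() - start;

    if (ok) {
      // Keep a running average of how long a successful authentication takes.
      successes++;
      totalSuccessNanos += elapsed;
    } else if (successes > 0) {
      // Pad failures out to that average, hiding which code path was taken
      // from an attacker who is watching the clock.
      long remaining = (totalSuccessNanos / successes) - elapsed;
      if (remaining > 0) {
        Thread.sleep(remaining / 1000000, (int) (remaining % 1000000));
      }
    }
    return ok;
  }
}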

The source code on GitHub is in Java, and can be built using

ant dist

and run using

java -jar ./dist/lib/TimingAttack.jar

It will print out a series of authentication results and times to reach the result. You should see that the time on each should be (on average) the same, regardless of the result.

This doesn't protect against other side-channel attacks. If, for example, a malicious application was running on the same CPU as your application, there are a host of attacks (e.g. cache-miss side-channel attacks) that can leak information about what your application is doing. 

Online Brute-force Attacks

This is actually a very tough nut to crack. If we assume that the attacker is as sophisticated as possible (getting a botnet to do statistically very good guessing for them), then we have a serious problem on our hands.

The first way to help deal with this is to force (or encourage) your users to pick very strong, difficult to guess passwords.

The second way is to rate limit any guesses. In general, this is A Good Thing™ as it prevents an attacker from using up system resources at an unreasonable rate, and thus protects against a simple denial of service attack.

This could be achieved by keeping a global (for Java, read singleton) RateLimiter class which keeps a record of IP addresses and the most recent time each sent a request. If it's been too recent, make them wait a long time.
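
A minimal sketch of that idea; the one-second minimum gap is an illustrative choice, and this is not the code from the GitHub repository mentioned below.

import java.util.concurrent.ConcurrentHashMap;

public class RateLimiter {

  private static final RateLimiter INSTANCE = new RateLimiter();
  private static final long MIN_GAP_MILLIS = 1000;

  private final ConcurrentHashMap<String, Long> lastRequest = new ConcurrentHashMap<String, Long>();

  public static RateLimiter getInstance() {
    return INSTANCE;
  }

  // Returns true if this IP should be allowed to attempt authentication right now.
  public boolean allow(String ipAddress) {
    long now = System.currentTimeMillis();
    Long previous = lastRequest.put(ipAddress, now);
    return previous == null || (now - previous) >= MIN_GAP_MILLIS;
  }
}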

An example of such a rate limiting can be found in the same GitHub repository as the previous example. It can be run in a very similar way, first build it:


ant dist

and run using

java -jar ./dist/lib/OnlineBruteForceAttack.jar


A potential modification would be to slow down the requests for each user account individually, as opposed to each IP address. That way, each IP gets some number of guesses with a reasonable response time for each user, thus helping any folks behind NATs. However, this means that an attacker can DoS a user. Some combination of IP address and account would be in order, but even that won't slow an attacker down much.


Replay Attacks

These are usually dealt with by correctly using SSL. SSL uses nonces (cryptographically strong random numbers) to ensure that two requests can be distinguished.

In a system which does not use SSL, you basically have to implement nonces yourself. This is not hard, but I won't cover it fully here. Basically, you stick a random number into the request. If you've seen that number previously, discard the request. If you didn't generate that number, discard the request.
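
A toy sketch of server-issued nonces along those lines. It is in-memory only; a real system would persist and expire the outstanding nonces rather than keeping them forever.

import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class NonceRegistry {

  private final SecureRandom random = new SecureRandom();
  private final Set<String> outstanding = Collections.synchronizedSet(new HashSet<String>());

  // Generate a fresh nonce to embed in an outgoing form or request.
  public String issue() {
    String nonce = new BigInteger(128, random).toString(16);
    outstanding.add(nonce);
    return nonce;
  }

  // Accept a request only if we issued its nonce and it hasn't been used yet.
  // remove() consumes the nonce, so a replayed request is rejected.
  public boolean accept(String nonce) {
    return outstanding.remove(nonce);
  }
}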

Conclusion

The "best" [Edit: shared-secret basedauthentication system requires good passwords (for some value of good), hashes them with a really good hashing algorithm (my opinion is that a correctly-configured scrypt is the way forward), and defends against timing-attacks and simple brute-force attacks is the best.

However, you need to balance this against the needs of your user. How sensitive is the data you're storing? How well compartmentalised is your application? (i.e. can a compromise of one account lead to another user's data being exposed?) Do you have the time to implement all of this, or do you just want to scrypt the password and call it a day -- for now?

Either way, as I've mentioned before on this blog, I want shared-secret authentication to die.

Friday 2 November 2012

Memory-Hard Password Hashing in Java

Overview

As mentioned in my last post, password hashing is about trying to use your resources to out-wit an attacker. Attackers have lots of CPU power to throw at the problem of guessing our user's passwords, and even have statistical methods for determining a user's most likely passwords.

What we need to do is slow the rate at which the attacker can make guesses. Sadly, that isn't all that feasible, as CPU power only ever increases. Instead, we need to increase the cost of guessing a password. For that to happen, we need to make any hardware needed to guess a password as complex as possible, to impede attackers implementing our algorithm in hardware (e.g. FPGAs) or farming it out to GPUs.

Any source code in this article is instructional only, use the real scrypt in production!

Increasing Complexity

In order to increase the complexity of the hardware that an attacker needs to attack the system, we must not rely too heavily on the CPU, or the GPU, or any other single piece of hardware -- we must use them in concert. For general purpose computing machinery, this means putting the server's memory to use as well.

But, you say, what about in 5 years' time, when GPUs have 10GiB of onboard RAM and FPGAs have logic cell counts in the billions? As in the previous post, we need a way to increase the complexity of the password function as time goes on.

The heart of the scrypt algorithm is referred to as "SMix", which is actually an instance of ROMix (also defined in Percival's paper) with some specified parameters.

The Heart of The Matter

ROMix is quite a simple algorithm, once you get it.

You fill an array with the repeated hash of the password, then you hash the password xor'd with various elements of that array.

In Java, this looks something like this:

import java.security.MessageDigest;
import java.util.ArrayList;

public class ROMix {

  private final MessageDigest hash;

  public ROMix(MessageDigest hash) {
    assert hash != null : "ROMix cannot work with no hash algorithm.";
    this.hash = hash;
  }

  public byte[] mix(byte[] input, int strength) {
    final int n = (int) Math.pow(2, strength); // The size of the array of hashes.
    ArrayList<byte[]> byteArr = new ArrayList<byte[]>(n);
    byte[] digested = hash.digest(input);

    // First, we iterate the hash, saving each result into the array.
    for (int i = 0; i < n; i++) {
      byteArr.add(digested);
      digested = hash.digest(digested);
    }

    // Next, we access the array of hashes in a pseudo-random (but predictable) fashion.
    // n is a power of two, so masking with (n - 1) gives the value mod n.
    byte[] previous = byteArr.get(n - 1);
    byte[] integerdPrevious = byteArr.get(integerify(previous) & (n - 1));

    for (int i = 0; i < n; i++) {
      digested = hash.digest(xor(previous, integerdPrevious));
      // Here is where we access the array of hashes in a difficult-to-predict way,
      // thus forcing either expensive recomputation, or storing the
      // hashes in RAM (also expensive).
      integerdPrevious = byteArr.get(integerify(digested) & (n - 1));
      previous = digested;
    }
    return digested;
  }

  private byte[] xor(byte[] a, byte[] b) {
    assert a.length == b.length : "Cannot xor two byte arrays of dissimilar length.";
    byte[] res = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
      res[i] = (byte) (a[i] ^ b[i]);
    }
    return res;
  }

  /**
   * This method is designed to take a byte[] object (e.g. a hash) and
   * return an integer based on the contents of that byte[].
   *
   * It does not need to represent the contents of the byte[] exactly in
   * one int.
   *
   * @param a The byte[] that you wish to turn into an int.
   * @return A fuzzy integer-ised version of the byte[]
   */
  private int integerify(byte[] a) {
    assert a.length >= 4 : "Cannot integerify something with fewer than four bytes.";
    int res = 0;
    // Combine the last 4 bytes into one int.
    for (int i = 0; i < 4; i++) {
      res |= (a[a.length - 1 - i] & 0xff) << (8 * i);
    }
    return res;
  }
}

Now, this is a very complex class, and frankly, the mathematical expression of it is far simpler, just two basic recurrence rules (and I do assume that xor and integerify are well-defined in my notes, rather than providing a definition).
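
For reference, one way to write those two recurrences (a sketch, using P for the input, H for the hash, n = 2^strength, and assuming xor and Integerify behave as in the code above):

\begin{align*}
V_0 &= H(P), & V_i &= H(V_{i-1}) \quad (1 \le i < n),\\
X_0 &= V_{n-1}, & X_{j+1} &= H\bigl(X_j \oplus V_{\mathrm{Integerify}(X_j) \bmod n}\bigr) \quad (0 \le j < n),
\end{align*}

with X_n being the value that mix() returns.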

This class guarantees that all 2n iterations of the hash are needed (since the memory-intensive second loop requires the final hash from the first loop to start), and that it is difficult to predict which of the hashes from the first loop will be required in the second loop. Therefore, you have to store them all, and then perform a long run of difficult-to-predict memory accesses. These memory accesses should be uniformly distributed across the first array.

It is these properties which make this function so expensive to compute: You need to be able to perform hashing operations fast, you need to be able to store the results of those computations and access them quickly.

Even if someone does manage to implement this for an FPGA, then all you (the defender who has not yet lost all the data) have to do is increase the parameters to make it infeasible for that particular implementation to attack your hashes. You should be increasing the strength parameter regularly to ensure that cutting-edge hardware cannot guess your passwords at a reasonable rate.

Still Alive

This method does not render you (or your users) completely immune to brute force attacks. If an end user picks a 5 character password, then you don't need many guesses (especially with the tools listed above) to work it out.

With a 5 character (random) password, the worst case brute-force attack on a well-configured scrypt will take about 25 years (assuming a constant cost of guessing; this assumption is false in reality). In reality, the attacker will be able to guess passwords with shocking efficiency for human-generated secrets, as well as parallelising the attack. If the attacker can get 1000 instances of scrypt running, then the brute-force attack takes only about a week. 1000 instances is a small botnet these days.
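
As a rough sanity check on those numbers, assuming about 95 printable characters per position and roughly 100ms per scrypt guess (both illustrative figures):

\begin{align*}
95^5 &\approx 7.7 \times 10^{9} \text{ guesses},\\
7.7 \times 10^{9} \times 0.1\,\text{s} &\approx 7.7 \times 10^{8}\,\text{s} \approx 24.5 \text{ years (worst case, one instance)},\\
24.5 \text{ years} \div 1000 &\approx 9 \text{ days (roughly a week).}
\end{align*}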

These methods (e.g. bcrypt, scrypt, etc.) still do not replace a good password policy. Sadly, the better your policy, the more difficult it is to get users through sign up, and the number of times your password reset tool would be used would become massive.

A better method is to have a relaxed baseline of 9 characters, plus a good strength meter which only reads strong when the password is truly strong (a pass phrase that is statistically unlikely given a large corpus of text, including upper case, lower case, digits and symbols). And this still doesn't render you immune to a user generating a really good pass phrase once and using it everywhere, including low-security sites; low-security sites which then suffer a breach and spill their users' credentials everywhere.

Where We're Heading

Well, where we should be heading, in my opinion.

I think that users should provide the sites they wish to authenticate against with a public key, and store the private key on a smart card or a modified YubiKey (See the clarification for a suggestion of that modification). If it's a smart card, the key should be encrypted using a good PIN.

Sadly, this is unlikely to happen. The YubiKey idea is most implementable. The smart card could provide something similar.

But they don't protect against a man-in-the-middle. Both devices will provide an authentication token even if the request is not valid. A demand for authentication could be (theoretically) passed to a smart card, along with a nonce, timestamp and digital signature. If the signature is valid (as determined by an internal certificate, perhaps), then it may give out the authentication token, otherwise it is withheld.

Until then, as users of shared-secret based authentication systems, ensure that you use a good pass phrase on every site you visit, and ensure that the pass phrase is unique to that site. A really good way to do that is to use something vastly similar to LastPass or KeePass, which will generate a good, unique password for every site you visit, and store it encrypted on your machine.

Tuesday 30 October 2012

YubiKey Key Storage Module Review

Overview

Today, I'm reviewing the YubiKey Key Storage Module code. It is a PHP application designed to be run on a locked down server.

Operation

The basic premise of its operation is this: there are 3 participants:
  • The end user
  • The YubiKey Key Storage Module (YubiKey-KSM) server
  • An application server
The YubiKey-KSM is assumed to be running on a 'locked-down' server, separate to the application server. An authentication looks something like this:

  1. The end user provides an OTP from their YubiKey to the application server
  2. The application server hands the OTP to the YubiKey-KSM sever
  3. The YubiKey-KSM server replies with an authentication success or failure message
  4. If the OTP validated, the application server performs any further authentication, checks that the counter values of the OTP are valid, and then grants access if all of the authentication is a success.

Attack

Worst Case

The attack is quite simple, if you assume the authentication server isn't completely impenetrable (it is, for example, backed by an ordinary MySQL database that can be compromised).

This means that an attacker can simply grab a user's OTP credentials and use them immediately, without any hindrance.

Logs

The logs contain OTPs and their associated plain-texts. Enough said. Even successful authentications are written out to the logs. These can be reused by an attacker.

Man-in-the-Middle (MITM)

A man in the middle can harvest OTPs from a user (if there is, for example, a poor application which doesn't use SSL on its sign-in page, or perhaps a phishing attack), and these harvested OTPs can then be reused against the YubiKey-KSM PHP implementation.

Improvements

Hashing

Protecting the user's internal name (the secret one) should be a top priority. Hashing it with a good configuration of scrypt should take ~100ms.

Cracking that hash would be difficult. If we assume that scrypt had been used correctly, and it takes one attacker's CPU 100ms to check one guess, it will take on average 1.5×10^5 CPU-years to guess an internal name. Even being able to concurrently check 10^6 guesses at once only reduces this to 57 days. That's a lot of CPU time and a hell of a lot of RAM (assuming decent scrypt settings).

This still doesn't stop an attacker who gains the AES keys for all the YubiKeys from just decrypting any OTPs that they see to extract the credentials. And bear in mind that OTPs can be printed to the YubiKey-KSM logs when a failure occurs.

Logs

Just stop writing the OTPs and their plain texts to logs, ever.  Just don't do it. You wouldn't steal a car, you wouldn't log a user's password in plain-text, don't log a user's plain-text OTP credentials.

Use The Counter

Storing the last-seen counter against the user's public name is an absolute must. Don't leave any step of the validation up to the application server. If you've seen that counter value before, just stop, declare it to be an authentication failure and move on with your life! This helps to protect against a MITM attack.
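
A small sketch of that check; the field names are illustrative, not taken from the YubiKey-KSM code.

public class CounterCheck {

  // Returns true only if the OTP's counters are strictly newer than the last
  // values stored against this public name.
  public static boolean counterIsFresh(int lastSessionCounter, int lastSessionUse,
                                       int otpSessionCounter, int otpSessionUse) {
    if (otpSessionCounter > lastSessionCounter) {
      return true; // The key has been re-plugged since we last saw it.
    }
    if (otpSessionCounter == lastSessionCounter && otpSessionUse > lastSessionUse) {
      return true; // Same session, but a later press of the button.
    }
    return false; // Seen before, or older: declare an authentication failure.
  }
}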



Attacking the Defences

A better way would be to simply retrieve the AES key (as outlined in the worst case) and attempt to sniff the incoming OTPs directly, thus breaking the secrecy of the internal name and counter. This allows an attacker to generate valid OTPs at will, even with all of the defences above implemented.


Conclusion

Using your own YubiKey-KSM is not an especially fantastic idea with the code as it stands. A few patches here and there, and it'll be much better. Some improvements might be:
  • Allowing for different databases much more easily, 
  • Hashing the credentials, 
  • Asserting that the request for validation is genuine, 
  • Rate limiting the number of requests for OTP validations that can be made in a given second,
  • Not writing sensitive information to log files, 
  • Checking the counter on the OTP before declaring an OTP valid.
These are quite substantial modifications to the system. However, excluding the third point, none of the above actually changes the interface that the YubiKey-KSM server presents to the outside world.

Monday 29 October 2012

Time-Hard Password Hashing in Java

Overview

So, you want to write your web application using, for example, Java, and you want to store users' credentials?

Well, the standard approach is to hash them with bcrypt or scrypt and a random salt which is unique to each user. But how do they actually work? Today, we'll cover hash stretching.

Any source code in this article is instructional only, use the real scrypt in production!

Onto The Rack

As mentioned above, what you should be doing is hashing the password with (for example) bcrypt and a really good unique salt. Here's a method for stretching the hash of a user's pass phrase.

The plan is to repeatedly hash the user's password and the resulting hash, so that to recover the user's password, an attacker has to do a very specific (read: high) amount of work.

import java.math.BigInteger;
import java.security.MessageDigest;

public class StretchedHash {
  private final MessageDigest hash;

  public StretchedHash(MessageDigest hash) {
    assert hash != null : "You cannot stretch a user's password with a null hashing algorithm.";
    this.hash = hash;
  }

  public String hash(String password, String salt, int stretch) {
    assert stretch > 0 : "You cannot negatively stretch the hash of credentials.";
    // We use funny concatenation (prefixing the salt's length) to force a
    // one-to-one mapping between (password, salt) pairs and the hashed string.
    byte[] digestBytes = hash.digest((salt.length() + password + salt).getBytes());
    // Iterate the hash 2^stretch times, so each guess costs a fixed, tunable amount of work.
    long iterations = (long) Math.pow(2, stretch);
    for (long i = 0; i < iterations; i++) {
      digestBytes = hash.digest(digestBytes);
    }
    // A positive signum stops BigInteger printing a negative hex string.
    return new BigInteger(1, digestBytes).toString(16);
  }
}
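
A quick usage sketch, assuming SHA-512 from the JDK and an illustrative stretch factor of 16; the pass phrase and salt here are placeholders.

import java.security.MessageDigest;

public class StretchedHashDemo {
  public static void main(String[] args) throws Exception {
    MessageDigest sha512 = MessageDigest.getInstance("SHA-512");
    StretchedHash stretcher = new StretchedHash(sha512);
    // 2^16 = 65,536 iterations of SHA-512 over the salted pass phrase.
    System.out.println(stretcher.hash("correct horse battery staple", "per-user-random-salt", 16));
  }
}
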
Problems?

This solution does not actually guard against GPUs very well. SHA-512 has a very small memory footprint, and so can be farmed out to GPUs nicely. Even stretching the hash doesn't help that much; the hashes can still be checked very, very fast on a rig with a few GPUs.

Even worse, many hashing algorithms (e.g. SHA-1) are very small in hardware, meaning that it is trivial to implement several on an FPGA, or have them fabricated in custom silicon.

The way to make this better is to force it to require a multi-megabyte amount of RAM. But not only that, this requirement should go up as time goes on -- just like the amount of CPU time required should go up.

This makes the problem evident. You are buying servers to serve your application to honest users. Your attacker is specialising in one thing: getting your users' credentials. Unless you specialise your hardware to defend well, you're always going to be at a disadvantage.

In my opinion, shared secrets (i.e. passwords) are a soon-to-be-dead form of authentication, as there are just too many vectors of attack, and the end user will often re-use the same shared secrets for different services. People are bad at strong authentication.

Thursday 25 October 2012

The state of Java in FreeBSD

Available Implementations

  • Sun's JDK
  • Sun's Linux JDK
  • OpenJDK
  • DiabloJDK

We're going to focus on OpenJDK, since Oracle is probably giving Sun's Java a very slow death. We know at least two things:
  1. OpenJDK is written in Java.
  2. BSD's ports system uses source code to build the packages locally
Points 1 and 2 together mean that I need Java to build Java. This is the same problem that Haskell has when you attempt to build it. There are a few ways of solving it:

  1. Have a cheap and cheerful Java compiler written in a language that you already have a compiler for.
  2. Use an existing Java compiler to build Java.
OpenJDK does not supply a compiler written in anything other than Java. This means that to install Java, you're relying on a non-free Java. 


Luckily, there are multiple solutions:
  • A bare-bones (aka bootstrap) Java compiler written in C/C++ to bootstrap OpenJDK into life
  • Binaries from OpenJDK
However, both of these have some major problems.

Writing a bootstrap compiler in C/C++

Java is a big language. I presume something like 95% - 99% of the language features in Java are in use in the OpenJDK compiler and virtual machine.

This means that any bootstrap compiler & associated VM will be very big, possibly warranting its own project.

Being produced by the OpenJDK community and written in something other than Java will also likely be an issue -- I assume the OpenJDK community is most proficient in, and prefers writing things in, Java.

Binaries from OpenJDK

This is a problem, since the binaries that OpenJDK sends out would have to cover every OS and every architecture that they want to support. This is a huge burden and simply not feasible.

Can you imagine having to either cross-compile and then test on, or directly compile OpenJDK on, ARM, AMD64, x86, MIPS and SPARC? Not to mention the array of OSes that may run on the above architectures.

FreeBSD's Solution

FreeBSD's solution attempts to neatly side-step the problem. Their port of OpenJDK relies on the "DiabloJDK". This is Sun's (now Oracle's) JDK, licensed by the FreeBSD Foundation for usage. I'm not sure what the exact terms are, but this makes me uneasy (personally).

To install OpenJDK, you grab the DiabloJDK package off Oracle's un-navigable site and stick it in a very particular directory. When OpenJDK is being built, it uses DiabloJDK to build itself. You are then free to remove the DiabloJDK.

This is, officially, a bitch. Oracle require you to sign up and agree to their Ts & Cs before allowing you to download stuff from their site. However, it does mean that OpenJDK don't actually have to bother very much about FreeBSD specifics.

That's right folks, to get an open implementation of Java on your FreeBSD box, the standard method is to get a non-free implementation of Java on your FreeBSD box.

My (Proposed) Solution(s)

PyPy

We write a small Java interpreter in PyPy (RPython), and directly run the un-compiled OpenJDK code on itself to compile it.

Interpreters are supposed to be quick and easy to write in PyPy and it should therefore be within someone's grasp to do so.

Obviously, you'd need to install PyPy before this, and I think that requires Python, which is a very odd dependency for OpenJDK to have.

LLVM

LLVM is also supposed to be nice for interpreters and the like. Why not have OpenJDK depend on the LLVM tool chain? Maybe we can have a Java-to-LLVM bitcode compiler?

This is (as in the PyPy idea) a little bit of a strange idea, but most languages eventually need a C compiler to bootstrap them, why not use something closely related?





Tuesday 23 October 2012

YubiKey Man-in-the-middle

Overview

This is just a brief outline of a man-in-the-middle attack on a YubiKey. It's already a widely known attack. [Edit 24th October 2012: As pointed out by Simon Josefsson this attack is valid for almost any OTP type device] 

Participants

  • Alice, the valid user
  • Bob the Bank, the person/service Alice wants to prove her identity to.
  • Mallory the attacker

Attack


Mallory goes to the Bank's website and copies their design to a new place. He then puts up the credible-looking-but-actually-malicious website. As an aside, The HTML5 fullscreen API actually makes this quite easy, as you could make something that looks a lot like the user's browser, complete with green padlock to signify a good SSL connection and the bank's URL.

He then convinces Alice to visit the bad website and enter her credentials. When she enters her username, password and YubiKey OTP, Mallory uses them to empty Alice's account and then redirects Alice to the bank home page, making it seem like she's entered a bad password.

Alternatively, Mallory calls Alice and claims to be Bob. Alice believes him and hands over credentials, including an OTP. Mallory immediately uses it to interfere with Alice's account.

This actually becomes even more problematic, as Mallory could then potentially add his own YubiKey as an authorised user of Alice's account and remove Alice's access. Mallory can now access Alice's account at will.

Defense 

Even with the hardening technique mentioned in the clarification of my previous YubiKey post, this technique still works! 

The "correct" defense is for the YubiKey to refuse to give out an OTP except to someone who can prove their identity to the YubiKey, e.g. by passing a signed message to the YubiKey. However, this massively decreases the usability of the system and increases the cost to deploy, as it will require system-specific drivers.

The softer version of this is for people to not reveal their OTPs until they have been invalidated by usage.

Sunday 21 October 2012

OpenPGP SmartCard

Overview

The OpenPGP SmartCard is a card (not unlike the EMV/Chip-n-PIN cards in the UK) for storing private keys. It provides cryptographic functions (signing, encryption and decryption) on board the card.

Cost

An OpenPGP SmartCard from Kernel Concepts costs 13.90EUR, and a card reader with integrated PIN pad is 50.00EUR, for a total cost of 63.90EUR. This is significantly pricier than a YubiKey.

Hardware Required

Computer, assumed to be Linux, but binaries probably (?) exist for other operating systems. Not to mention the driver for the card reader.

All of the functionality of the card can easily be replicated in software, and it effectively is done so widely.

Usages

  • GnuPG/PGP private key storage & usage
  • SSH private Key storage & usage

Since all cryptographic functions are carried out on the card, as opposed to the machine, the private key is never on a host system, allowing for significantly more secure PGP and SSH private keys.

Given co-operation of browser vendors and service implementers, the OpenPGP SmartCard could reasonably be used in a challenge/response protocol not unlike that implemented by SSH to verify an authentication request for a user.

Weaknesses


For this section, we assume a challenge/response protocol has been designed for the card, not unlike that used in SSH.

Server-Side

If the authentication server has been compromised, then no keys need to be revoked -- the authentication server only holds the public keys for each of the users. At worst, it could grant access to attackers and deny it to legitimate users.

If the implementation of RSA on the cards is not sufficiently hardened against timing attacks, then the server could mount a timing attack to discover the signing key stored on the card, given enough authentication attempts. Newer implementations of RSA are not vulnerable to this; by blind-signing a challenge token, the timing attack is defeated.

Client-Side

If the card can be stolen and disassembled, then the private key can be recovered, assuming it is stored unencrypted on the card. If the key is stored encrypted on the card (e.g. using the user's PIN), then the key cannot be recovered, assuming that the algorithm used to encrypt the key does not have any cryptographic breaks and the user's PIN remains secure.

Since cryptographic functions are supplied on-card, it should be relatively easy to strongly verify that the server requesting authentication is valid, thus eliminating many man-in-the-middle attacks.

Other

If the function used to generate the private keys is not sufficiently random, then the private keys of a given user could be guessed by an attacker. 

Hardening Techniques

I'm open to suggestions. The only real problem with this system vs. the YubiKey system is that the implementation is more often an exercise left to the reader.

Conclusion

An organisation with sufficient funds could use OpenPGP SmartCards to provide security in excess of that provided by a YubiKey, and long-term costs could potentially be lower, as a compromised authentication server does not mandate the re-issuing/re-burning of all cards.

However, it would be a significant investment for an organisation to deploy OpenPGP SmartCards, despite the additional security and functionality.

YubiKey

Overview

A YubiKey is a small USB key that is provided by Yubico [Edited to correct name]. When plugged into a computer, it provides a one-time password that uniquely identifies the YubiKey.

Cost

One YubiKey costs ~25USD (19.20EUR at the time of writing). Discounts are available for large purchase volumes.

Hardware Required

Computer of any major (Mac, Linux, Windows) OS. YubiKey token.

A software YubiKey could easily be implemented, e.g. as an application for a smart phone.

Operation

When plugged into a host system and a button on the YubiKey is pressed, a one-time password (OTP) is produced and "typed" (by virtue of the fact that it is a USB HID) into the host system.

The OTP is as follows:

  • Secret ID (6B)
  • Session Counter (2B)
  • Timestamp (3B)
  • Session Use (1B)
  • Output from an LFSR (a pseudo-random number, 2B)
  • Checksum, CRC-16 (2B)
This is for a total of 16 bytes. The output is encrypted using AES-ECB and a 128-bit key which is also presumed unique to the device. This is then encoded in BASE16  [Edit 25th Oct 2012: It's not quite BASE16, it's referred to as ModHex and is specific to YubiKey. Thanks to Simon Josefsson] and "typed" to the host system.
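
As a rough illustration of that layout in code. This is a sketch only: the values are placeholders, the field ordering and endianness are assumed from the list above, and a real YubiKey emits ModHex rather than raw bytes.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class OtpLayoutSketch {

  // Pack the 16-byte structure described above and encrypt it with AES-128 in ECB mode.
  public static byte[] buildOtp(byte[] secretId6, short sessionCounter, int timestamp24,
                                byte sessionUse, short prngOutput, short crc16,
                                byte[] aesKey16) throws Exception {
    ByteBuffer plaintext = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
    plaintext.put(secretId6);                         // 6B: secret ID
    plaintext.putShort(sessionCounter);               // 2B: session counter
    plaintext.put((byte) (timestamp24 & 0xFF));       // 3B: timestamp
    plaintext.put((byte) ((timestamp24 >> 8) & 0xFF));
    plaintext.put((byte) ((timestamp24 >> 16) & 0xFF));
    plaintext.put(sessionUse);                        // 1B: session use
    plaintext.putShort(prngOutput);                   // 2B: LFSR output
    plaintext.putShort(crc16);                        // 2B: CRC-16 checksum

    // Exactly one 16-byte block, so ECB with no padding is a single AES operation.
    Cipher aes = Cipher.getInstance("AES/ECB/NoPadding");
    aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(aesKey16, "AES"));
    return aes.doFinal(plaintext.array());
  }
}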

The server which authenticates the device retrieves the plain text of the OTP and verifies the secret ID. If the OTP decrypts successfully and the secret ID matches a given username [Edit 25th Oct 2012: The server also ensures that the session counter and session use have not already been seen and have increased monotonically. Thanks to Simon Josefsson], then the authentication was a success.

Weaknesses

Server-Side

If the server is compromised, all keys must be re-issued (keys can be "re-burnt" without re-purchasing). This is because the authenticating server will spill all of the YubiKeys' AES keys and secret IDs. 

Client-Side

If the client is compromised (i.e. the key falls into the hands of an attacker), the attacker could disassemble the device and retrieve the AES key and the secret ID. This would allow them to arbitrarily forge a user's YubiKey authentication. 

An attacker could also perform a live man-in-the-middle attack, by tricking or forcing the user to reveal credentials, and subsequently using them to login to services. This is because the YubiKey does not verify the authenticity of the authentication request before sending a token.

Other

If the procedure to generate the AES keys and secret IDs is not sufficiently random, an attacker could simply start guessing AES key/secret ID pairs and requesting authentication. I am not sufficiently familiar with the procedure for generating the AES keys or IDs, so it may be cryptographically strong (i.e. un-guessable without searching the whole space).

Hardening Techniques

First and foremost, protecting against man-in-the-middle attacks should be very high on the list, as it is a very simple attack to perform. This means forcing the YubiKey to verify the authenticity of the request before providing credentials at any point. There should be no way to have the YubiKey present an OTP without it first verifying the credentials of a server (e.g. via a digital-signature-based challenge/response protocol).

However, if you're implementing that, why not make it bi-directional (i.e. the server can authenticate the YubiKey using the exact same protocol)? Mostly because doing even single direction challenge/response authentication is very difficult, and breaks the assumption that the YubiKey is only a USB HID.

Secondly, if the server spills the AES keys and secret IDs, it is game over -- time to revoke the keys and force re-issuing or re-burning of the keys. A simple solution is to have the YubiKey sign the OTP and send the signature to the server. The server would hold the public key of the YubiKey. If this was spilt, it would not compromise the security of the system. This would, however, increase the length of the OTP substantially. It would also mean that secret IDs no longer need to be secret. It would not, however, protect against man-in-the-middle attacks.

Conclusion

Overall, the use of a YubiKey, combined with the use of a username and password (i.e. hardware and secret token for authentication) does make an authentication system significantly more secure. However, there is room to significantly increase the security of the YubiKey system, without major impact to the usability of the system.

Given that the YubiKey requires little extra software or hardware, it is a cost-effective and user-friendly method for increasing the amount of time, effort and skill an attacker must have to compromise a system.

Clarification, 2012-10-23 13:00


The hardening technique mentioned in the last paragraph is as follows:

  • The server maintains a list of YubiKey unique IDs/public keys.
  • The YubiKey writes out a message of the following format:
    • Unique ID
    • Session Counter
    • Session Use
    • Timestamp
    • PRNG output
    • Digital Signature of the above fields
  • The server can then verify the OTP by checking the following:
    • The ID is valid,
    • The Session Counter/Use is greater than previously seen counters
    • The digital signature is valid for the fields.

This output could optionally be encrypted using AES-ECB, but there's no need. The power of the system comes from the fact that the only place the private key is kept is the YubiKey (part of the assumption of this slightly modified system).

This would mean that, should the server be compromised and the unique IDs/keys be spilt, there would be no need to re-issue/re-burn the keys. This is useful when keys are used across multiple (>2 in the case of YubiKey 2.2) disconnected authentication servers (for example, if I used my YubiKey to gain access to my VPS host, to my banking and to several other websites).

The YubiKey would also need to be able to "type" its public key as part of registering itself with an authentication provider.

The draw backs of this modified system are as follows:

  1. Cost. You'd need to re-issue the keys for any system that already uses YubiKeys, not to mention that you need more computational power to perform a digital signature -- each YubiKey would be more expensive.
  2. You're relying on the digital signature algorithm. Depending on the algorithm chosen, it will rely on the discrete logarithm problem or integer factorisation. These are assumptions, not guarantees.
  3. The OTP would be much longer when typed. I don't actually think this is much of a problem, personally.