Coder in a World of Code, by Dan Turner
<h1>
Poor man's map-reduce</h1>
<i>2018-11-18</i><br />
<br />
Sometimes, you've just got to say "fuck it" and go with what you've got. I recently had such a night.<br />
<br />
I had a large line-oriented input file (think CSV), a slow processing script, and no time. The script couldn't easily be optimised because each record needed a network call, and I didn't have the time to rework it to use the language's built-in concurrency or parallelism. I needed answers sooner rather than later; every minute spent processing was a minute I was not in bed ahead of an important meeting where I would present the data and my analysis.<br />
<br />
I needed to run my script against two different back-end systems, and a quick back-of-the-envelope estimate put the runtime at over two hours. Two hours that I did not have.<br />
<div>
<br /></div>
<div>
Long story short, there was no time for caution, and it was do-or-die. The answer, of course, is <strike>improv everywhere</strike> shell scripting.</div>
<div>
<br /></div>
<div>
To sum up, I had:</div>
<div>
<ol>
<li>A large line-oriented input file</li>
<li>A slow line-oriented processing script</li>
<li>A *nix system</li>
<li>No time for caution</li>
</ol>
<h2>
Map-reduce</h2>
</div>
<div>
Map-reduce is a way of processing large input data quickly. The input is split into chunks, processed in parallel, then brought back together at the end.</div>
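As a rough illustration of the model (not the script from that night), the shape can be sketched in a few lines of Python; the per-line work function and chunk size here are placeholders:

```python
# Minimal sketch of the map-reduce shape: split the input into chunks,
# process the chunks in parallel, then combine the partial results.
# Threads suit this kind of work because each record is network-bound.
from concurrent.futures import ThreadPoolExecutor

def chunks(lines, size):
    """Split the input into fixed-size chunks (like split -l)."""
    for i in range(0, len(lines), size):
        yield lines[i:i + size]

def process_chunk(chunk):
    # Placeholder for the slow per-line work (a network call per record).
    return [len(line) for line in chunk]

def map_reduce(lines, chunk_size=4):
    with ThreadPoolExecutor() as pool:
        partials = pool.map(process_chunk, chunks(lines, chunk_size))
    return [result for part in partials for result in part]
```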
<div>
<br /></div>
<div>
My problem happened to fit this model, but I had no time to wrestle my script. I split up my input file, ran the script on each part, and then brought the answers back together manually.<br />
<br />
<h3>
shell to the rescue!</h3>
</div>
<div>
First, I used <span style="font-family: "courier new" , "courier" , monospace;">split -l</span> to break my input file up into chunks:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">split -l 459 input.txt</span><br />
<br />
This gave me ~48 files with names like <span style="font-family: "courier new" , "courier" , monospace;">xaa</span> and <span style="font-family: "courier new" , "courier" , monospace;">xab</span>, each with up to 459 lines. Each file would take a little over a minute to process at 6 TPS, which was about the maximum throughput of the processing script.<br />
<br />
Next, I launched my script 48 times:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">for i in x*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">do script.py "$i" > "output-$i" & done</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
Off it went, with 48 processes launched in parallel, each writing to its own output file.<br />
<br />
<h3>
Bringing it back together</h3>
<div>
The output files then needed to be brought back together. In my case, a colleague had written another script to analyse a single output file and generate a report, so I had to join my partial outputs into one file.</div>
<div>
<br /></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">cat output-x* | sort | uniq > aggregate-output.txt</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span></div>
<h3>
<span style="font-family: inherit;">Error handling</span></h3>
</div>
<div>
<span style="font-family: inherit;">Some lines in the input would cause my processing script to crash. Because I needed to do this process twice, I tried two different approaches:</span></div>
<div>
<ol>
<li>Let it crash and re-process missed lines later</li>
<li>Code up try/catch/continue.</li>
</ol>
<div>
The first approach turned out to be frustrating and time-consuming. I used <span style="font-family: "courier new" , "courier" , monospace;">wc -l</span><span style="font-family: inherit;"> to quickly work out which files had failed, then re-collected, re-split, and re-ran after removing some of the offending lines. This was especially difficult because a single chunk could contain multiple "poison" lines, so I ended up going down to 2-line chunks. Very annoying.</span></div>
</div>
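Checking which chunks failed can be mechanised. This is a sketch (not from the original night) that assumes split(1)'s default <span style="font-family: "courier new" , "courier" , monospace;">x??</span> chunk names and the <span style="font-family: "courier new" , "courier" , monospace;">output-</span> prefix used above:

```python
# Sketch: flag chunks whose output file has fewer lines than the input
# chunk, meaning the worker crashed partway through (or never ran).
# Assumes split(1)'s default "x??" chunk names and the "output-" prefix.
import glob
import os

def count_lines(path):
    with open(path) as f:
        return sum(1 for _ in f)

def incomplete_chunks(pattern="x??"):
    bad = []
    for chunk in sorted(glob.glob(pattern)):
        out = "output-" + chunk
        done = count_lines(out) if os.path.exists(out) else 0
        if done < count_lines(chunk):
            bad.append(chunk)
    return bad
```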
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">The second approach was <i>much</i> better and quicker, however, it did need a little extra </span><span style="font-family: "courier new" , "courier" , monospace;">print >>sys.stderr "msg"</span><span style="font-family: inherit;">" to list of poison lines. All in, the second approach was quicker in this case.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<h2>
<span style="font-family: inherit;">Conclusion</span></h2>
<div>
<span style="font-family: inherit;">In the end, I took the processing down from >2 hours to a couple of minutes with only a few moments 'investment' with the shell.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">I would not recommend this to any sane person, except in the fairly narrow circumstances listed above. It's difficult to test, requires manual shell work, and failures can really set you back.</span></div>
<div>
<br /></div>
<div>
From a software engineering perspective it's ridiculous. On the other hand, it's not stupid if it works.</div>
<h1>
systemd: OpenSSL of the future?</h1>
<i>2016-10-02</i><br />
<br />
OpenSSL has received a lot of flak over the years. Both cryptographic and implementation flaws have been plentiful, and researchers are all over it. Much has been said about the software engineering process that was used to put it all together.<br />
<br />
It appears to me that systemd is following a similar path, with <a href="https://www.agwa.name/blog/post/how_to_crash_systemd_in_one_tweet">another vulnerability</a> turning up in the last few days. A very quick assessment would place this as a 'medium' severity vulnerability (CVSSv3 between 5.5 and 6.5). Ayer's post goes on to use this example as another data point in favour of the "systemd is insecure" camp.<br />
<br />
OpenSSL is bad. It is security-critical software that has a constant stream of high and critical severity bugs. However, it's not too much of a problem for me. I update the library and restart the affected services on a test system, then move on to the production systems. Usually this is just reloading or restarting nginx and sshd. The impact to end-users is small or non-existent. For some people, an OpenSSL bug is much worse; it all depends on the exact nature of the system you run and the particular bug.<br />
<br />
On the other hand, we have systemd. We've not seen as many issues so far. But we can't just update and restart systemd, because it is tightly integrated with many components. I feel like this is similar to the libc vulnerabilities we saw a little while back. The maintenance impact is much bigger because we're forced to do a full system restart.<br />
<br />
This is one of the many reasons why privilege separation is such a good thing: every newly spawned process gets the updated code. The minimal, long-lived core of the application may still contain vulnerable code, but it is often easy to show that the vulnerable path is never reached. The privilege-separated workers, meanwhile, need no special attention; we just restart them and they pick up the upgrade. Further, each worker process can be individually sandboxed and allowed to fail.<br />
<br />
I agree with Ayer's perspective. Software is written by fallible humans who make mistakes with alarming regularity. When we dive into coding without thinking about the long-term implications, we set ourselves up for failure. If we build a monolith, any RCE bug is fatal. If we fail to sandbox our systems, any path traversal bug is fatal. We should avoid designs where common flaws are fatal.<br />
<br />
Engineers need to be thinking about a few core questions:<br />
<br />
<ul>
<li>How important is this system going to be?</li>
<li>How do we make bugs less likely?</li>
<li>How do we make bugs less problematic?</li>
</ul>
These are the central questions that a software development process answers. These are the questions which we frequently ignore. We purchase time-to-market with steep maintenance costs. If the systemd developers persist in ignoring these deeper issues, I think systemd will be the OpenSSL of the future: a constant stream of high and critical severity issues causing a never-ending headache for those who use it.<br />
<br />
If you are starting a project, or running a project, please look at the design of <a href="https://github.com/dagwieers/vsftpd/blob/master/SECURITY/DESIGN">vsftpd</a> and <a href="http://www.citi.umich.edu/u/provos/ssh/privsep.html">OpenSSH</a>. Look over the <a href="https://www.bsimm.com/">various</a> <a href="http://www.opensamm.org/">secure</a> <a href="https://www.microsoft.com/en-us/sdl/default.aspx">software development life-cycles</a>. Think about using a <a href="http://www.adacore.com/">language</a> <a href="https://golang.org/">that</a> <a href="https://www.rust-lang.org/en-US/">supports</a> <a href="http://ocaml.org/">you</a>. If you're going to push people to use your system, consider the security implications of your decisions and remember that hope is not a strategy.
<h1>
Top 5 Security Fails of 2015</h1>
<i>2016-01-02</i><br />
<br />
Much like every other year, 2015 had a veritable smorgasbord of security breaches and failures. This top 5 list, in chronological order, catalogues the trials and tribulations of security in 2015.<br />
<br />
Contributing author: <a href="https://twitter.com/SephyHallow">Sephy Hallow</a>.<br />
<br />
<div>
<h2>
1. GHOST</h2>
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitc0sJcrOhU-KXKQKZ-ROpLaejkmw6FJRa0YEgudQw_mHSXG_kcuNmOd5vPtCmUpqfQ3yYH9fuVIa33vwbrwvWIsJMLCyxfwBa-nGA04eOolCB-9oK-tuJWkGA93mexRe4uamPXSqwzLOe/s1600/ghost-penguin-300px.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="154" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitc0sJcrOhU-KXKQKZ-ROpLaejkmw6FJRa0YEgudQw_mHSXG_kcuNmOd5vPtCmUpqfQ3yYH9fuVIa33vwbrwvWIsJMLCyxfwBa-nGA04eOolCB-9oK-tuJWkGA93mexRe4uamPXSqwzLOe/s200/ghost-penguin-300px.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Source: <a href="https://openclipart.org/detail/174776/ghost-of-a-penguin">openclipart</a></td></tr>
</tbody></table>
<div>
This year opened with a bit of a bang, with CVE-2015-0235 being announced in January 2015. This vulnerability was branded as "GHOST." It was an issue in glibc, a core library which underlies almost every piece of Linux software. A successful attack would result in remote code execution on the target machine, gaining a CVSS score of 10.0 out of a possible 10.</div>
<div>
<br /></div>
<div>
The only saving grace was that it was difficult to determine if a particular piece of software actually used the library in a vulnerable way. As it turned out, very few pieces of software were actually vulnerable, but the difficulty in determining that led to a fair few people going into panic mode for a day or two.</div>
<br />
<b>Score: 1/5 - All Ghost and No Ghoulies</b><br />
<br />
<h2>
2. Office of Personnel Management</h2>
<div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4c_FaNm2Tar7KGIKptlLbviQIRpeEUVSgCrvXsABgVYf5XegRClunoZAfmHjDvi_i87RkdcAvI8IsjTZdW7kEV0WU_9-03aEbsjKlEGjZEQKj4sWKjduPWrNUQeR5iUM4d25lg5uqlKng/s1600/240px-US-OfficeOfPersonnelManagement-Seal.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4c_FaNm2Tar7KGIKptlLbviQIRpeEUVSgCrvXsABgVYf5XegRClunoZAfmHjDvi_i87RkdcAvI8IsjTZdW7kEV0WU_9-03aEbsjKlEGjZEQKj4sWKjduPWrNUQeR5iUM4d25lg5uqlKng/s200/240px-US-OfficeOfPersonnelManagement-Seal.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Source: <a href="https://commons.wikimedia.org/wiki/File:US-OfficeOfPersonnelManagement-Seal.svg">Office of Personnel Management Seal</a></td></tr>
</tbody></table>
The Office of Personnel Management (OPM) breach was announced in June of 2015. Although the number of records exposed was initially estimated at four million, this breach turned out to be a gift that kept on giving, with the estimate ballooning to 18 million and finally 21.5 million records. Even better, the records were said to contain highly sensitive information from background checks, including personally identifying information, social security numbers, and even security clearance data.</div>
<div>
<br /></div>
<div>
What made this a real showstopper was the inept response. Putting aside the inability to simply count the number of records compromised, this became a comedy of errors as it was eventually shown that the OPM had been warned several times regarding shoddy security practices. In the aftermath of the attack, OPM set about trying to spread the blame far and wide, and speculated on the identity of the perpetrators rather than fixing their systems.</div>
<br />
<b>Score: 5/5 - Bureau Prats</b><br />
<br />
<h2>
3. Stagefright</h2>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhic7tIMGfAIm4NWoSL9CIcIXxMUbafvSkxAfToDCd2Y3adnQFeC8aURfJx2LmLMD95OKWUikiPOjNZe3U-AnQEehT3EsHLctOacjsBat2lrIEsnWpimmTGQL6A1R0KGyB5jrWRji1qD43z/s1600/240px-Expression_of_the_Emotions_Figure_20.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhic7tIMGfAIm4NWoSL9CIcIXxMUbafvSkxAfToDCd2Y3adnQFeC8aURfJx2LmLMD95OKWUikiPOjNZe3U-AnQEehT3EsHLctOacjsBat2lrIEsnWpimmTGQL6A1R0KGyB5jrWRji1qD43z/s200/240px-Expression_of_the_Emotions_Figure_20.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Source: <a href="https://commons.wikimedia.org/wiki/File:Expression_of_the_Emotions_Figure_20.png">Charles Darwin</a></td></tr>
</tbody></table>
<div>
<br /></div>
<div>
This year, no stone was left unturned, with security researchers turning their ingenuity to Android. Their efforts uncovered a glorious bounty of not one, not two, but <i>eight</i> vulnerabilities in a single library. Six of the eight vulnerabilities scored the maximum CVSS of 10.0 out of 10, with a 9.3 and a 5.0 thrown in for good measure. The vulnerabilities manifested themselves in the library named <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">libstagefright</span>, which was used for handling media files. A proof-of-concept exploit was developed which triggered the issue by means of a crafted MMS message, and did not require user interaction.</div>
<div>
<br /></div>
<div>
Obviously, everyone quickly deployed the fix, right? Wrong. In reality we're talking about the Android ecosystem here, with multiple phone carriers who are well known for <i>not</i> pushing security updates out to users. Oh, and the carriers lock the devices so that users cannot apply the patches themselves. Seems like a winning combination.</div>
<br />
<b>Score: 3/5 - Phantom of the Opera-ting System</b><br />
<br />
<h2>
4. Ashley Madison</h2>
<div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq8diok-EtCVVBBhbh5cEuPE1ZZUV45hiim-5p-QEVgt7J8HMfG-uy8uHFpjfN5Aeq9XvHtHSvoebmOApcEetie2fkQQ4M-JGE3-toWR2hyphenhyphenTDK9PEbLMK3n3tmmO0iRPlSwRdPYXB_9lm0/s1600/NoWedding.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq8diok-EtCVVBBhbh5cEuPE1ZZUV45hiim-5p-QEVgt7J8HMfG-uy8uHFpjfN5Aeq9XvHtHSvoebmOApcEetie2fkQQ4M-JGE3-toWR2hyphenhyphenTDK9PEbLMK3n3tmmO0iRPlSwRdPYXB_9lm0/s200/NoWedding.jpg" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Source: <a href="https://commons.wikimedia.org/wiki/File:NoWedding.jpg">No Wedding</a></td></tr>
</tbody></table>
Not one to be outdone on the sensitivity of the information exposed, The Impact Team leaked some 25GB of customer data from Ashley Madison in August. Who is Ashley Madison, you ask? None other than that upstanding company whose motto is "Life is short. Have an affair." Clearly, their real motto was "Life is short. Security is for losers."</div>
<div>
<br /></div>
<div>
The data included roughly everything: financial information, names, addresses, and details of sexual fantasies. The internet took up harassing and bullying the victims whilst half the criminal underworld attempted to extort the victims. At least one person is known to have committed suicide, having directly cited the leak as their motivation for doing so.<br />
<br />
<b>Score: 5/5 - Security Blows</b><br />
<br /></div>
<h2>
5. TalkTalk</h2>
<div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEMW3BNZa6iGINep8oxOdP9NT03MXAH_EB78jD2t3CkBa4egAOfZwULoPi7BW4URAF22MZ1b8GrpUG8OV67iEOpXLtv_dDiac0jg5REjUu1YAKmM6S_ZXxQkI_RBnzV0Y34hIqvYOBbP7C/s1600/TalkTalk_logo.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="41" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEMW3BNZa6iGINep8oxOdP9NT03MXAH_EB78jD2t3CkBa4egAOfZwULoPi7BW4URAF22MZ1b8GrpUG8OV67iEOpXLtv_dDiac0jg5REjUu1YAKmM6S_ZXxQkI_RBnzV0Y34hIqvYOBbP7C/s200/TalkTalk_logo.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Source: <a href="https://commons.wikimedia.org/wiki/File:TalkTalk_logo.svg">TalkTalk Logo</a></td></tr>
</tbody></table>
In October this year, TalkTalk's defences crumbled after coming under an alleged "significant and sustained cyber-attack" and purportedly leaked the details of some four million customers. Back in reality, this was a simple SQL injection attack which could have been conducted by a relatively unskilled teenager in their bedroom. The attackers demanded a ransom of roughly £17 million. Eventually, TalkTalk revised their estimate of the number of records accessed down to approximately two hundred thousand, and the Metropolitan Police arrested a 15-year-old from Northern Ireland in connection with the breach.</div>
<div>
<br /></div>
After issuing a ransom demand and confusing the heck out of TalkTalk, the attackers either used or sold on the data, resulting in at least one victim losing nearly £3,000 to fraud. Finally, the CEO, Dido Harding, was hauled in front of a Home Affairs Select Committee and asked to account for TalkTalk's (in)actions regarding the incident, including allegations that they had "covered up both the scale and duration of this attack[.]" Nice.<br />
<br />
<b>Score: 3/5 - All Talk</b><br />
<h1>
Inferior Process and Incompetent Developers</h1>
<i>2015-11-24</i><br />
<br />
In Falhar's recent post, "<a href="http://euphoricus.blogspot.co.uk/2015/11/everybody-is-doing-tdd.html">Everybody is doing TDD</a>," they claim that every developer uses test-driven development (TDD), because they will either automate their tests, or they will manually test their application. They go on to state that those who are manually testing their applications are "fully incompetent." Whilst I agree that, with a sufficiently broad definition, almost anyone who tests their programs is undertaking TDD, whether that broadly-defined TDD matches the commonly accepted definition is a different matter. However, I want to argue that those who do not produce automated tests are not necessarily incompetent; rather, this is a matter of context.<div>
<br /></div>
<div>
Let's take three developers working on three separate projects.</div>
<div>
<br /></div>
<div>
Developer A is working on a security critical software library. The library implements a well-known cryptographic construction, which is defined in one or more RFC documents. Prior to development, this developer produces an automated test suite which consists of the test vectors from the RFC and property-based randomised tests. They work alone, so there is no code or design review, but they do use commonly available static analysis and code style tools to ensure that their work is consistent and free of "obvious" errors.</div>
<div>
<br /></div>
<div>
Developer B is a keen gardener, but is very forgetful. In order to ensure that they do not forget to tend their various plants according to a complex schedule, they write a program to help them remember. When run by cron, the program sends them an email with the names of the plants to water. There is no over-arching specification, the requirements are all encoded within the developer's head. If the program fails, the primary impact is that some plants are not watered for a day or two, or the schedule does not work out quite as planned. To develop this program, the developer uses some simple shell scripts, and a single crontab entry.</div>
<div>
<br /></div>
<div>
Finally, we have Developer C. Developer C is working on the control software for a turbofan engine (commonly called a jet engine). They are part of a large team, which includes safety managers, requirements engineers, and so on. The development time scale is on the order of a decade, and starts with requirements gathering, hazard analyses, risk assessments, and so on. Due to the fact that a failed engine could send searing hot fragments of turbine blade into the passenger cabin, the decision is made to formally verify the software. Developers are not expected to test their code; they're expected to write code which can be shown to be equivalent to the specification. Testing is handled by a large and dedicated assurance team, who test both the components, and the system as a whole. The closest to testing that developer C undertakes is checking that their code and associated proof holds according to the verifier.</div>
<div>
<br /></div>
<div>
It does not make sense to refer to any of the above developers as incompetent, despite the fact that only one of them is practising TDD. Each project calls for differing levels of assurance, and therefore different processes. Each process is completely adequate for the context, and further, it is possible that a single developer undertakes each of the projects outlined, some as part of their hobby, and some as part of their usual employment. There is no incompetence here, just different assurance levels.</div>
<div>
<br /></div>
<div>
TDD is a tool which is available to many developers. Not using TDD does not mark a developer as incompetent. Using a process which is inappropriate for the assurance level required by a project may well result in poor outcomes, but often developers do not decide on the process. Where developers do decide on the process, their choices may be guided by forces other than software correctness, such as market forces, management pressure, or team familiarity. Where the wrong process is knowingly used for the situation, that is better described as negligence, and may well amount to incompetence.</div>
<h1>
Ransomware on Linux</h1>
<i>2015-11-07</i><br />
<br />
Dr.WEB is reporting that <a href="https://news.drweb.com/show/?i=9686&lng=en&c=5">ransomware has come to the Linux ecosystem</a>. Fortunately, this has only affected "tens" of users thus far. In particular, this malware is targeting those with a lot to lose: web site administrators. This gives the malware a good chance of ensnaring some business-critical data or functionality, thereby giving the victim a bit more incentive to pay the ransom.<div>
<br /></div>
<div>
Ransomware has been around for some time in the Windows ecosystem. <a href="http://www.pcworld.com/article/204577/article.html">Previously</a> these programs would show a dialogue, claiming that the machine was locked and could be unlocked when a suitable payment was made. In reality, these were often just programs configured to run automatically on start-up, and did not directly endanger user data. In recent years, these have made attempts at encrypting the user's data and putting the key out of reach. A prompt payment promises to return the key, and thus the data, to the victim. These have had varying levels of success, with the "best" managing to pull in <a href="http://www.zdnet.com/article/cryptolockers-crimewave-a-trail-of-millions-in-laundered-bitcoin/">millions of dollars</a> for their creators. They have not been without their flaws which allowed the victims to recover their data without paying; some variants stored the key <a href="http://www.computerworld.com/article/2489311/encryption/cryptodefense-ransomware-leaves-decryption-key-accessible.html">locally on the machine</a>, some eventually had the <a href="https://noransom.kaspersky.com/">keys disclosed by security researchers</a>, and some have yet to be broken. Often, organisations have no option but to <a href="https://nakedsecurity.sophos.com/2013/11/19/us-local-police-department-pays-cryptolocker-ransom/">pay the ransom</a>.</div>
<div>
<br /></div>
<div>
Fortunately, this particular strain of malware requires extensive user interaction to run, including the granting of root privileges. This does not prevent future generations of this malware piggy-backing on other access vectors, such as vulnerable web browsers, email clients, web servers, and so on. I would predict that we will see this kind of malware attached to remote exploits in the moderately near future. Even using old exploits, or only encrypting a user's home directory, could turn up quite the bounty for the attacker, as those who don't update their systems may well not have suitable backup processes in place to recover from the attack, and many people store their valuable files in their home directory.</div>
<div>
<br /></div>
<div>
There are a few options to mitigate the risk posed by this threat. However, none will be wholly effective, so a combination may be required. For some organisations, this will simply be a strengthening or verification of existing defences. For others, this threat may call for entirely new defences to be deployed.</div>
<div>
<br /></div>
<div>
The first and most common would be to ensure that all systems under your control have all relevant security patches applied. This should limit the likelihood of an exploit being used to launch an attack without user interaction. A backup system which stores backups offline should be used. If an on-line backup system is in use, either deploy an offline system or ensure that a previously saved backup cannot be overwritten by a corrupted copy, or easily reached by an attacker. This will reduce the impact of a breach, as it should be possible to recover from relatively recent backups in the event of a compromise. Where possible, software which consumes untrusted input, such as web browsers, email clients, web servers, and so on, should be placed into a suitable sandbox environment. This should reduce the likelihood that the malware will be able to reach critical business data. Finally, better user education may reduce the likelihood of a breach, as users may be better able to detect social engineering attacks which might have otherwise led them to run the malware.</div>
<div>
<br /></div>
<div>
It is fortunate that Linux has several sandbox mechanisms available, and an appropriate one can be selected. Such mechanisms include chroots, <a href="http://www.selinuxproject.org/page/Main_Page">SELinux</a>, <a href="http://wiki.apparmor.net/index.php/Main_Page">AppArmor</a>, or <a href="https://en.wikipedia.org/wiki/Seccomp-bpf">seccomp-bpf</a>. Other systems, such as FreeBSD, should not be considered invulnerable, and similar mitigations applied, such as the use of <a href="https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html">jails</a> or <a href="http://www.cl.cam.ac.uk/research/security/capsicum/">Capsicum</a>. Unfortunately, restricting a complex web browser's access to the file system may have unexpected consequences, or simply be very time-consuming. Ubuntu provides an <a href="https://launchpad.net/apparmor-profiles">AppArmor profile</a> to do this for Chromium. However, it is not without its issues, such as not being able to <a href="https://bugs.launchpad.net/apparmor-profiles/+bug/1101298">determine if it is the default browser</a> on the system.</div>
<h1>
SQLite and Testing</h1>
<i>2015-08-08</i><br />
<br />
Categorical claims are often the source of faulty statements. <a href="http://michael.robellard.com/2015/07/dont-test-with-sqllite-when-you-use.html">"Don't test with SQLLite [sic] when you use Postgres in Production"</a> by Robellard is a fantastic example. I actually agree with a variant of this statement: "<i>If you need high levels of assurance,</i> don't test with SQLite <i>alone </i>when you use Postgres in production."<br />
<br />
Robellard bases his claim on several points, noting that "SQLite has different SQL semantics than Postgres," "SQLite has different bugs than Postgres," and "Postgres has way more features that SQLite." He has a couple more points, but all of these largely amount to a discrepancy between SQLite and Postgres, or between one Postgres version and another, leading to a defect. These points are a genuine concern, but his claim relies on using exactly one database back-end for testing, and exactly one risk profile for various applications.<br />
<br />
As a quick diversion, I am not using the common definition of risk which is synonymous with chance. I am using a more stringent definition: "the effect of uncertainty on objectives" as specified in <a href="http://www.iso.org/iso/catalogue_detail?csnumber=44651">ISO Guide 73:2009</a>. This definition often requires an assessment of both the <i>impact</i> and <i>likelihood</i> of some form of scenario to obtain a fuller picture of an "effect."<br />
<br />
If the risk posed by defects caused by an SQLite-Postgres discrepancy is too high, then you'll likely want to use Postgres as part of your testing strategy. If the risk posed is sufficiently low, then SQLite alone may be appropriate. Both judgements are predicated on the risk posed by defects, and the organisational appetite for risk.<br />
<br />
A testing strategy comprising several different testing methodologies can often be thought of as a filter of several layers. Different layers are variously better or worse at surfacing different types of defects. Some are more likely to surface defects within components, and others are better at locating defects in the interactions between components. Other "layers" might be useful for catching other classes of defects. Each layer reduces the likelihood of a defect reaching production, which reduces the risk that defects pose. Each layer also has a cost associated with writing and maintaining that layer.<br />
<br />
It's quite common for different layers to be run at different times. For instance, mock-based unit tests might be run very frequently by developers. This provides the developers with very quick feedback on their work. Integration tests backed by an in-memory database might be run prior to committing. These take a little longer to run and so might get run less often, but still catch most problems caused by erroneous component interactions. A continuous integration (CI) server might run integration tests backed by Postgres, and slower UI tests periodically. Finally, penetration tests might be conducted on a yearly or six-monthly basis.<br />
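The layering above is easy to wire up in practice. As a sketch, one common arrangement is a single test entry point whose database backend is selected by an environment variable; the <span style="font-family: "courier new" , "courier" , monospace;">TEST_DB</span> variable and the echoed messages here are hypothetical placeholders, not a real test runner:

```shell
#!/bin/sh
# Sketch: one test entry point, database backend chosen per layer.
# TEST_DB and the messages are hypothetical placeholders.
run_tests() {
    case "${TEST_DB:-sqlite}" in
        sqlite)   echo "integration tests against in-memory SQLite" ;;
        postgres) echo "integration tests against a real Postgres" ;;
        *)        echo "unknown backend: $TEST_DB" >&2; return 1 ;;
    esac
}

fast=$(TEST_DB=sqlite run_tests)     # pre-commit: quick feedback
slow=$(TEST_DB=postgres run_tests)   # CI: higher assurance
echo "$fast"
echo "$slow"
```

Developers get the fast SQLite layer locally, while the CI server pays the cost of the Postgres layer on their behalf.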
<br />
This sort of process aims to allow developers the flexibility to work with confidence by providing quick feedback. However, it also provides heavier-weight checking for the increased levels of assurance required for the risk-averse organisation. An organisation with a greater appetite for risk may remove one or more of those layers, such as in-memory integration tests, to speed development. This saves them money and time but increases their exposure to risk posed by defects.<br />
<br />
SQLite is just a tool which may be used as part of one's testing strategy. Declaring "Don't test with SQLLite [sic] when you use Postgres in Production" ignores how it may be usefully applied to reduce risk in a project. In many cases SQLite is entirely appropriate, as the situation simply does not require high levels of assurance. In other cases, it may form part of a more holistic approach alongside testing against other database backends, or be removed entirely.<br />
<br />
Not every organisation is NASA, and not every project handles secrets of national import. Most failures do not kill people. An honest assessment of the risks would ideally drive the selection of the testing strategy. Oftentimes this selection will be balanced against other concerns, such as time-to-market and budget. There is no silver bullet. A practical, well-rounded solution is often most appropriate.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-25748477824063153592015-07-25T23:23:00.001+01:002015-07-26T21:32:05.475+01:00Infosec's ability to quantify riskIn Robert Graham's latest post, "<a href="http://blog.erratasec.com/2015/07/infosecs-inability-to-quantify-risk.html">Infosec's inability to quantify risk</a>," Graham makes the following claim:<br />
<blockquote class="tr_bq">
"Infosec isn't a real profession. Among the things missing is proper "risk analysis". Instead of quantifying risk, we treat it as an absolute. Risk is binary, either there is risk or there isn't. We respond to risk emotionally rather than rationally, claiming all risk needs to be removed. This is why nobody listens to us. Business leaders quantify and prioritize risk, but we don't, so our useless advice is ignored."</blockquote>
<br />
<div>
I'm not going to get into a debate as to the legitimacy of infosec as a profession. My job entails an awful lot of infosec duties, and there are plenty of folks turning a pretty penny in the industry. It's simply not my place to tell people what they can and cannot define as a "profession."<br />
<br />
However, I do take issue with the claim that the infosec community lacks proper risk analysis tools. We have risk management tools coming out of our ears. We have risk management tools at every level: those used during design and implementation, those for assessing the risk a vulnerability poses to an organisation, and even tools for analysing risk at an organisational level.<br />
<br />
At the design and implementation level, we have software maturity models. Many common ones explicitly include threat modelling and other risk assessment and analysis activities.<br />
<br />
One of the explicit aims of the <a href="https://www.bsimm.com/">Building Security in Maturity Model (BSIMM)</a> is "Informed risk management decisions." Some activities in the model include "Identify PII obligations" (CP1.2) and "Identify potential attackers" (AM1.3). These are the basic building blocks of risk analysis activities.<br />
<br />
The <a href="http://www.opensamm.org/downloads/SAMM-1.0.pdf">Open Software Assurance Maturity Model (OpenSAMM)</a> follows a similar pattern, including a requirement to "Classify data and applications based on business risk" (SM2) and "Explicitly evaluate risk from third-party components" (TA3).<br />
<br />
Finally, the <a href="https://www.microsoft.com/en-us/sdl/default.aspx">Microsoft Security Development Lifecycle</a> requires that users "Use Threat Modelling" to "[...] determine risks from those threats, and establish appropriate mitigations." (SDL Practice #7).<br />
<br />
So, we can clearly see that risk analysis is required during the design and implementation of a system. Although no risk management methodology is prescribed by the maturity models, it's easy to see that we're clearly in an ecosystem that's not only acutely aware of risk, but also the way those risks will impact organisational objectives.<br />
<br />
When these maturity models fail to produce adequately secure software, we need to understand how bad the resulting vulnerability is. Put simply, statements like <a href="https://www.schneier.com/blog/archives/2014/04/heartbleed.html">"On the scale of 1 to 10, this is an 11"</a> are not useful. I understand why such statements are <i>sometimes</i> necessary, but I worry about the media becoming fatigued.<br />
<br />
Vulnerabilities are classified using one of several methods. Off the top of my head, I can think of three:<br />
<ol>
<li><a href="https://www.first.org/cvss">Common Vulnerability Scoring System (CVSS)</a></li>
<li><a href="https://en.wikipedia.org/wiki/DREAD:_Risk_assessment_model">DREAD Risk Assessment Model</a> (Wikipedia)</li>
<li><a href="https://en.wikipedia.org/wiki/STRIDE_(security)">STRIDE</a> (Wikipedia)</li>
</ol>
These allow those with infosec duties to roughly determine the risk that a vulnerability may pose to their organisation. Put simply, they allow for the <i>assessment </i>of the <i>risk</i> posed to one's systems. They are a (blunt) tool for risk assessment.<br />
<br />
Finally, there are whole-organisation mechanisms for managing risks, which are often built into an <i>Information Security Management System</i> (ISMS). One of the broadest ISMS standards is <a href="http://www.iso.org/iso/catalogue_detail?csnumber=54534">BS ISO/IEC 27001:2013</a>, which states:<br />
<blockquote class="tr_bq">
"The organization shall define and apply an information security risk assessment process [...]"</blockquote>
If this seems a bit general, you should be aware that an example of a risk management process (which includes mechanisms for risk assessment & analysis) is available in <a href="http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=56742">BS ISO/IEC 27005:2011</a>.<br />
<br />
Let's look at the <a href="https://www.cert.org/resilience/products-services/octave/">CERT Operationally Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE) Allegro</a> technical report:<br />
<blockquote class="tr_bq">
"OCTAVE Allegro is a methodology to streamline and optimize the process of assessing information security risks [...]"</blockquote>
Similarly, Appendix A provides guidance on risk management, which includes sections on risk assessment and analysis.<br />
<br />
Yet another standard is <a href="http://csrc.nist.gov/publications/nistpubs/800-30-rev1/sp800_30_r1.pdf">NIST SP800-30 Revision 1, "Guide for Conducting</a><br />
<a href="http://csrc.nist.gov/publications/nistpubs/800-30-rev1/sp800_30_r1.pdf">Risk Assessments"</a>. It states its purpose quite clearly in section 1.1, "Purpose and Applicability":<br />
<blockquote class="tr_bq">
"The purpose of Special Publication 800-30 is to provide guidance for conducting risk assessments [...]"</blockquote>
NIST SP800-30 Revision 1 also provides an example of how to conduct a risk assessment.<br />
<br />
As you can see, we in the infosec community have quite a few tools for risk assessment and analysis at our fingertips. From the design and implementation of software, through to the assessment of individual vulnerabilities, and even for assessing, analysing, and mitigating organisational risk, we're well equipped.<br />
<br />
The infosec community is often very bad at communicating, and the media likes a salacious story. How often have you heard that a cure for cancer has been found, sight returned to the blind, and teleportation achieved? Recently, members of the infosec community have played into this, but that does not eliminate the fact that we do have tools for proper risk management. Our field is not so naive that we blindly believe all risk to be unacceptable.</div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-3684794448963328042015-06-07T12:13:00.001+01:002015-06-07T12:13:17.831+01:00The Four Words of DistressThe title of this post is taken directly from The Codeless Code, "<a href="http://thecodelesscode.com/case/191">The Four Words of Distress</a>"<br />
<br />
I read it again yesterday, and today I have been bitten by it.<br />
<br />
I accidentally removed the content of a comment on this blog. I don't get many comments, so I feel that all of them are important contributions (except the incessant 50 Shades of Grey spam; urgh, do I regret writing about that). As such, removing a comment is "serious" to me.<br />
<br />
Upon realising my error, I immediately searched the page for an undo or a "restore" link of some sort, and eventually went to search for an answer to my problem. I found only unanswered questions on Google's Groups, and a Product Help page claiming that I should have been sent to a confirmation page (no such thing happened).<br />
<br />
Any number of mechanisms could have prevented or fixed this problem:<br />
<br />
<br />
<ol>
<li>A confirmation dialogue, alongside a "don't show this again" check box.</li>
<li>A trash can, which comments can go to for a fixed period of time.</li>
<li>An undo button.</li>
<li>A help guide.</li>
</ol>
<div>
As it stands, I accidentally clicked the "Remove Content" link, and now, without warning, the comment has gone. I worry that this is a black mark against the commenter's account, when it is a simple mistake.</div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-85673795071500681742015-06-06T21:14:00.003+01:002015-06-08T12:10:11.293+01:00Chrooting a Mumble Server on OpenBSDOne of my colleagues is starting remote working shortly. As such, we needed a VoIP solution that worked for everyone, Mac, Linux and FreeBSD. It was discovered that <a href="http://wiki.mumble.info/wiki/Main_Page">Mumble</a> provided ample quality and worked everywhere. Top it off with the fact that we could host it ourselves, and we looked to be set.<br />
<br />
However, being security conscious, I like to sandbox any service I have. Further, since this solution is slightly ad-hoc at the moment, it's being run off a personal <a href="https://www.bigv.io/">BigV</a> server, running <a href="http://www.openbsd.org/">OpenBSD</a>. So I set out to chroot the package-managed mumble server, <a href="https://github.com/umurmur/umurmur">umurmur</a>, which does <i>not</i> include sandboxing as default.<br />
<br />
Fortunately, if no logfile is specified, umurmurd will log to syslog, and it does not support config reloading, so I don't need to worry about that.<br />
<br />
Chrooting a service is not entirely simple, and a basic setup can be improved by "refreshing" the chroot every time the service starts. This means that if an attacker infects the binary in some way, it'll get cleared out and replaced after a restart. As another bonus, if a shared library is updated, the old copy simply won't get found, which tells you what to update! If the binary gets updated, it'll be copied in fresh when the service is restarted.<br />
<br />
To do this, we modify /etc/rc.d/umurmurd:<br />
<br />
<br />
<pre>#!/bin/sh
#
# $OpenBSD: umurmurd.rc,v 1.2 2011/07/08 09:09:43 dcoppa Exp $
#
# 2015-06-06, Turner:
#
# Jails the umurmurd daemon on boot, copying the daemon binary and
# libraries in each time.
#
# An adversary can still tamper with the logfiles, but everything
# else is transient.
#
chroot="/var/jails/umurmurd"
group="_umurmur"
original_daemon="/usr/local/bin/umurmurd"
daemon="chroot $chroot $original_daemon"

build_chroot() {
	# Locations of binaries and libraries.
	mkdir -p "$chroot/usr/local/bin"
	mkdir -p "$chroot/usr/lib"
	mkdir -p "$chroot/usr/local/lib"
	mkdir -p "$chroot/usr/libexec"

	# Copy in the binary.
	cp "$original_daemon" "$chroot/usr/local/bin/"

	# Copy in shared libraries (ldd on the binary lists these).
	cp "/usr/lib/libssl.so.27.0" "$chroot/usr/lib/"
	cp "/usr/lib/libcrypto.so.30.0" "$chroot/usr/lib/"
	cp "/usr/local/lib/libconfig.so.9.2" "$chroot/usr/local/lib/"
	cp "/usr/local/lib/libprotobuf-c.so.0.0" "$chroot/usr/local/lib/"
	cp "/usr/lib/libc.so.77.0" "$chroot/usr/lib/"

	# Copy in the dynamic linker and its hints.
	cp "/usr/libexec/ld.so" "$chroot/usr/libexec/ld.so"
	mkdir -p "$chroot/var/run/"
	cp "/var/run/ld.so.hints" "$chroot/var/run/ld.so.hints"

	# Setup /etc and copy in config, certificate, and key.
	mkdir -p "$chroot/etc/umurmur"
	cp "/etc/umurmur/umurmur.conf" "$chroot/etc/umurmur/umurmur.conf"
	cp "/etc/umurmur/certificate.crt" "$chroot/etc/umurmur/certificate.crt"
	cp "/etc/umurmur/private_key.key" "$chroot/etc/umurmur/private_key.key"

	# Copy the pwd.db password database in. This is less-than-ideal.
	cp "/etc/pwd.db" "$chroot/etc/"
	grep "$group" "/etc/group" > "$chroot/etc/group"

	# Setup /dev
	mkdir "$chroot/dev"
	mknod -m 644 "$chroot/dev/urandom" c 1 9
	mknod -m 644 "$chroot/dev/null" c 1 3
}

destroy_chroot() {
	if [ "$chroot" ]
	then
		rm -rf "$chroot"
	fi
}

case "$1" in
start)
	build_chroot
	;;
stop)
	destroy_chroot
	;;
esac
</pre>
<pre># Standard rc.d "stuff" here.
. /etc/rc.d/rc.subr
rc_reload=NO
rc_cmd $1
</pre>
<br />
So there we go! When /etc/rc.d/umurmurd start is called, the chroot is setup, and umurmurd started in there. When you kill it, the chroot jail is emptied.<br />
<br />
There are some limitations. For one, any private key (in the default config, private_key.key) can be compromised by an attacker who compromises umurmurd, and this can be used to impersonate the server long after the compromise. Secondly, if you <i>do</i> specify a log file in umurmur.conf, and you set up the relevant directory to log to, it will be trashed when you stop the daemon. This is a real problem if you're trying to work out what happened during a compromise.<br />
<br />
Finally, if umurmur is updated and the required libraries change, "ldd /usr/local/bin/umurmurd" will list the new shared objects to copy in.<br />
<br />
<h3>
Known Issues</h3>
<div>
This does not currently stop the umurmur daemon on stop. I'm not entirely sure why, but the workaround is to stop the service using /etc/rc.d/umurmurd stop, then find the process using ps A | grep umurmur and kill -15 it.</div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-91389641211930361902015-05-31T18:04:00.002+01:002015-05-31T18:15:09.870+01:00Refactoring & ReliabilityWe rely on so many systems that their reliability is becoming more and more important.<br />
<br />
Pay bonuses are determined based on the output of performance review systems; research grants are handed out based on researcher tracking systems; and entire institutions may put their faith for visa enforcement in yet more systems.<br />
<br />
The failure of these systems can lead to distress, financial loss, the closure of organisations, or even prosecution. Clearly, we want these systems to have a low rate of failures, be they design flaws or implementation defects.<br />
<br />
Unfortunately for the developers of the aforementioned systems, they all have a common (serious) problem: the business rules around them are often in flux. Therefore, the systems must have the dual properties of flexibility <i>and</i> reliability. Very often, these are in contradiction to one another.<br />
<br />
Reliability requires requirements, specification, design, test suites, design and code review, change control, monitoring, and many other processes to prevent, detect, and recover from failures in the system. Each step in the process is designed as a filter to deal with certain kinds of failures. Without them, these failures can start creeping into a production system. These filters also reduce the agility of a team, reducing their capability to respond to new opportunities and changing business rules.<br />
<br />
On the other hand, the flexibility demanded by the team's environment is often attained through the use of traditional object-orientated design. This is typically achieved by writing to specific <i>design patterns.</i> If a system is not already in a state that is considered to be "good design," a team will apply<i> refactorings</i>.<br />
<br />
Refactorings are small, <i>semantics-preserving</i> changes to the source of a system, with the goal of migrating towards a better design. This sounds perfect. Any analysis and testing which took place prior to the refactoring should still be valid! <a href="#1" name="footnote-1">[1]</a>.
<br />
However, even though the semantics of the source are preserved (although, humans do occasionally make mistakes!), other observable properties of the program are not preserved. Any formal argument that was made regarding the correctness, or time, space or power requirements may not be valid after the refactoring.<br />
<br />
Not only does the refactoring undermine any previous formal argument, it can often make it more difficult to construct a new argument for the new program. This is because many of the refactoring techniques given introduce additional indirection, duplicate loops, or use dynamically allocated objects. These are surprisingly difficult to deal with in a formal argument. So much so that many safety-critical environments, for example SPARK Ada, simply do not support them. In many common standards aimed at safety-critical systems, they are likewise banned.<br />
<br />
I am not arguing against refactoring. I think it's a great tool to have in one's toolbox. I also think that like any other tool, it needs to be used carefully and with prior thought. I'd also shy away from the idea that just because something's important, it is critical. With a suitable development process, a development team can remain agile whilst still reducing the risk of a serious failure to an acceptable level.<br />
<br />
In the end, it's exactly that -- the balance of risks. If a team is not responsive, they may miss out on significant opportunities. To mitigate this risk, teams introduce flexibility into their code through refactoring. To mitigate the risk of these refactorings causing a serious failure <a href="#2" name="footnote-2">[2]</a>, the team should employ other mitigations, for example, unit and integration testing, design and code review, static analysis, and so on. Ideally, to maintain the team's agility, these should be as automated and integrated into their standard development practice as possible. Each team and project is different, so they would need to assess which processes best mitigate the risks, whilst maintaining that flexibility and agility.<br />
<br />
<hr width="80%" />
<a name="1">[1]</a> Fowler states that for refactorings to be "safe", you should have (as a minimum) comprehensive unit tests.
<br />
<a name="2">[2]</a> Assuming that the "original" system wasn't going to cause a serious failure regardless.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com1tag:blogger.com,1999:blog-245662015774143454.post-32753775038626376972015-04-11T13:01:00.003+01:002015-04-11T13:05:05.081+01:00All White Labour?With the upcoming general election, we've been receiving election material.<br />
<div>
<br /></div>
<div>
When my other half mentioned that the leaflet we received from the Labour candidate for York Central looked a little bit "overly white" (paraphrasing), I decided to run the numbers.</div>
<div>
<br /></div>
<div>
The York Unitary Authority's demographics from the 2011 census show a population that is 94.3% "white" [1].</div>
<div>
<br /></div>
<div>
We sat down and counted the faces on the leaflet, excluding the candidate themselves. We came to a count of 14 faces, all of which were white.</div>
<div>
<br /></div>
<div>
The chance of picking 14 people at random from a population which is 94.3% white and getting 14 white people is 43.97%. That means that the chance of getting <i>at least</i> one non-white person on the leaflet would've been 56.03%.</div>
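That chance is just 0.943 raised to the 14th power, which is easy to check on any *nix box:

```shell
# P(all 14 faces white), picking independently from a 94.3% white population
awk 'BEGIN { printf "%.4f\n", 0.943 ^ 14 }'   # prints 0.4397, i.e. 43.97%
```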
<div>
<br /></div>
<div>
Obviously, this is quite close to a toss-up, but bear in mind that these people aren't usually selected at random. All sorts of biases go into selecting people for photo shoots: who turns out, who interacts with the candidate, who the photographer chooses to photograph, and who selects which photos from the shoots end up on the page, with their biases towards what will "look good" and what is expected to promote an image of diversity.</div>
<div>
<br /></div>
<div>
Anyway, I don't want to say one way or the other about what this does or does not mean; I just want the data to be available for people.</div>
<div>
<br /></div>
<div>
References:</div>
<div>
<br /></div>
<div>
[1] : <span style="font-family: inherit;">2011 Census: KS201EW Ethnic group, local authorities in England and Wales (Excel sheet 335Kb) (</span><a href="http://www.ons.gov.uk/ons/publications/re-reference-tables.html?newquery=*&newoffset=0&pageSize=25&edition=tcm%3A77-286262">http://www.ons.gov.uk/ons/publications/re-reference-tables.html?newquery=*&newoffset=0&pageSize=25&edition=tcm%3A77-286262</a>) Accessed: 2015-04-11</div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-86620332623121928852015-04-06T13:35:00.001+01:002015-04-06T13:35:11.675+01:00OpenBSD time_t UpgradeLast night I foolishly undertook the upgrade from 5.4 to 5.5, without properly reading the documentation. My login shell is zsh, which meant that, when the upgrade was complete, I couldn't login to my system.<br />
<br />
I'd get to the login prompt, enter my credentials, see the motd, and be kicked out as zsh crashed, due to the change from a 32-bit time_t to a 64-bit time_t. I'd also taken the security precaution of locking the root account.<br />
<br />
I fixed this as follows:<br />
<br />
<ol>
<li>Rebooted into single-user mode (boot -s at the boot> prompt)</li>
<li>Mounted my filesystems (they all needed fsck run on them before mount -a)</li>
<li>Changed my login shell to /bin/sh (chsh -s /bin/sh <<username>>)</li>
<li>Rebooted.</li>
</ol>
<div>
After that, it was a simple question of logging in and doing the usual: updating my PKG_PATH to point at a 5.4 mirror, and running "pkg_add -u" to upgrade all my affected packages.</div>
<div>
<br /></div>
<div>
I then continued on to upgrade my system to OpenBSD 5.5.</div>
<div>
<br /></div>
<div>
A quick warning: This is <i>one</i> particular failure mode arising from not reading the docs. It <i>may</i> get you back into your system, but it's unlikely to fix all your woes if you don't read the docs.</div>
<div>
<br /></div>
<div>
So, the moral of the story is: <b>ALWAYS READ THE DOCS.</b></div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-9318731586191995182015-02-16T22:22:00.003+00:002015-02-16T22:22:57.561+00:00Keeping a Web Service SecureThis post is aimed at those tasked with developing and maintaining a secure system, but who have not had to do so previously. It is primarily a sign-posting exercise, pointing initiates in the right direction. It is not gospel; many things require common sense, and there is rarely a right answer. Everything is a trade off; performance for security, usability for development time, backup size for recovery ability. To do this effectively, you need to consider your systems as a whole, including the people running and building them.<br />
<br />
Keeping a web service adequately secure is quite a task. Many of the things covered will simply not apply, depending on your business model. For instance, web services which use services such as Google App Engine, Heroku, or OpenShift will not need to keep up with OS patches. Those that are for intranet use only may be able to get away with weaker password policies if they're not usually exposed to the internet.<br />
<br />
Use your common sense, but be prepared to back up your decisions with evidence. When the executioner comes, you need to be able to <i>honestly</i> argue that you did the right thing, given the circumstances. If you can't do this honestly, you will get found out, your failure will be compounded, and the outcomes all the more severe.<br />
<br />
The whole aim of these recommendations is to give you a head start removing potential avenues of attack for your adversaries, providing some mitigation to recover from an attack, and giving you plenty of places to go do further research.<br />
<br />
You'll need to do the basics for securing your infrastructure. As with most other things, this is an ongoing process.<br />
<br />
You will likely need, as a bare minimum, a properly configured firewall. I'm a big fan of <a href="http://www.openbsd.org/faq/pf/">pf</a>, but <a href="http://netfilter.org/projects/iptables/index.html">IPTables</a> is probably more widely used. Use what's appropriate for your platform, do some reading, and make sure you've got it dropping as much unnecessary traffic as possible, in and out of your organisation.<br />
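By way of illustration, a minimal default-deny pf.conf might look like the following; the interface group and ports here are examples for a typical web host, not a drop-in config, so check pf.conf(5) before using anything like it:

```
# /etc/pf.conf -- minimal default-deny sketch (example ports)
set skip on lo                                       # don't filter loopback
block all                                            # default deny, in and out
pass out on egress                                   # allow outbound traffic
pass in on egress proto tcp to port { 22, 80, 443 }  # SSH and web only
```

The default-deny stance means forgetting a rule fails closed: a new service stays unreachable until you deliberately open its port.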
<br />
For remote access, most services go with SSH. <a href="http://www.openssh.com/">OpenSSH</a> is easily hardened. I'd recommend having a good read of Lucas' <a href="http://www.amazon.co.uk/SSH-Mastery-OpenSSH-PuTTY-Tunnels-ebook/dp/B006ZO9ULK/ref=sr_1_1?ie=UTF8&qid=1424124177&sr=8-1&keywords=SSH+Mastery">SSH Mastery</a> to familiarise yourself with the finer points of the service. If your platform doesn't have SSH, it's likely that you'll have something similar, but you'll have to do your research.<br />
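As a starting point, a few commonly recommended sshd_config hardening directives are shown below; the option names are from sshd_config(5), but the group name is a placeholder, so verify behaviour against your own version before deploying:

```
# /etc/ssh/sshd_config -- hardening sketch
PermitRootLogin no            # no direct root logins
PasswordAuthentication no     # keys only; disables password guessing
AllowGroups sshusers          # restrict who may log in (example group)
```

Remember to keep an existing session open while testing changes, so a typo doesn't lock you out.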
<br />
If your company has more than a small handful of employees, <a href="http://www.sudo.ws/">sudo</a> is an absolute life saver. Once again, Lucas provides a good book on utilising sudo for access control & auditing in <a href="http://www.amazon.co.uk/Sudo-Mastery-Access-Control-People-ebook/dp/B00GA2W47O/ref=sr_1_1?ie=UTF8&qid=1424124184&sr=8-1&keywords=Sudo+Mastery">Sudo Mastery</a>.<br />
<br />
You <i>must</i> keep any services you run up to date. This means everything: database, web server, remote access, kernel, etc. This will entail some downtime. A little and often goes a long way towards preventing major outages.<br />
<br />
Often, people have perfectly good intentions, but fail to keep up with patches. There are three main causes.<br />
<br />
The first is wanting to avoid any and all scheduled downtime. If you plan correctly, and notify your users ahead of time, this is no issue. People largely don't mind the service disappearing at a well-known time for an hour a month.<br />
<br />
The second is harder to combat: ignorance of an update's existence, of the security implications of not applying it, or of how to apply it. You need to collate a list of your dependencies (web servers, databases, OS patches, etc.) and their security announcement pages. This list then needs to be checked regularly, and any relevant issues assessed. You should also know how to actually update each piece of software. Many operating systems from the unix family make this easy with a package manager, but I don't know how the rest of the computing world fares in that arena.<br />
<br />
Typically, Mitre will provide an assessment of the risk posed to organisations by a vulnerability, which is normally a reasonable estimate of the risk posed to your organisation. High-risk vulnerabilities may need remediation outside your normal downtime cycle; lower-risk ones may be able to wait.<br />
<br />
The third is that many people fear updates as they can break things. This risk only gets worse with time. If you're not keeping up with the latest security patches, then when things break, you need to work out which of the 17 patches you just applied did it. With just one or two patches, you can file a bug report, or determine if you're mis-using the dependency much more easily.<br />
<br />
A hidden set of dependencies are those of your own bespoke software, which are pulled in by some form of build process. Many organisations simply ignore the dependencies of their bespoke software. For instance, with a large Java application, it's all too easy to get left behind when ORMs and IoC containers get updated regularly. You must regularly check the dependencies of your bespoke software for updates, and the same arguments regarding OS-level patching apply. Get too far behind, and you'll get bitten -- hard.<br />
<br />
I'd recommend turning on OS-level exploit mitigation techniques provided by your OS. This is typically things like <a href="https://en.wikipedia.org/wiki/Address_space_layout_randomization">address-space layout randomisation (ASLR)</a> and <a href="https://en.wikipedia.org/wiki/W%5EX">W^X</a>, but plenty of others exist. Should you be on holiday when a problem arises, these will buy you some time to either put a phone call in to someone to get the ball rolling, or get to an internet cafe and open a connection with a one-time password (don't trust the internet cafe!). They also tend to frustrate attackers, and may prevent some specific vulnerabilities from being exploitable at all.<br />
<br />
This is a huge field, and some systems don't have all of these protections turned on by default, or simply lack the feature at all. Look very closely at your OS, and the dials that you can turn up.<br />
<br />
Other systems of policy enforcement, such as <a href="http://www.selinuxproject.org/page/Main_Page">SELinux</a> may also help, but I've not had much chance to use them. Read up on them, both pros and cons, and work out if they're of much use to you, and if they're worth your time.<br />
<br />
The next class of problems is running obviously bad software. Even if you keep your dependencies up to date, lots of people will fall prey to this. One of the worst offenders is WordPress (and its plugin ecosystem), but other systems have not had a clean run either.<br />
<br />
Check out the history of a project before utilising it. If you've decided that it's bad, but nothing else works for you, segregate, segregate, segregate. Then monitor it for when it does get compromised. This can be done with a simple chroot, or for beefier solutions, a separate system with an aggressive firewall between it and any user data can help. For monitoring, you'll probably want something like <a href="http://www.nagios.org/">Nagios</a>, but there's plenty of alternatives.<br />
<br />
If possible, you should try to segregate and monitor your bespoke software as if it were in the class of services with a poor history, but this may not be possible.<br />
<br />
On the subject of monitoring, you should be monitoring lots of performance metrics, such as database load, queries per second, errors per second in the logs, and so on. These will help tell you if something funny (security related or otherwise) is up, before it becomes a major problem for you.<br />
<br />
You may also choose to deploy an intrusion detection system (IDS). <a href="https://www.snort.org/">Snort</a> is widely recommended, and comes with a whole bunch of basic rules. You'll need to tune this to filter out the noise and leave the signal. This is no small task; prepare yourself for some serious documentation divin' with most IDSs.<br />
<br />
Once you're at this point, you should have a relatively decent security posture. But there are two crucial things I've not covered, relating to recovery from an incident.<br />
<br />
The first is backups. Make them, store them <i>offline</i> (at least one business has gone under from neglecting this), and test them, and your restore processes, regularly. If something bad happens, you need these to be out of reach of the threat, whether it's deliberate or otherwise.<br />
<br />
The second is general incident response. The piece I'd most like to bring to the fore is a <i>breach communications package</i>. This is a set of templates covering several scenarios, which can be used to notify customers, fend off the press, and put on your website to explain the situation. If you're a big company and you expose customer data, the press will call. If you expose a lot of user data, the press will call. If you are compromised by a highly news-worthy adversary (e.g. ISIL), the press will call.<br />
<br />
Do not waste your time trying to talk to journalists, and do not waste time writing a press release under pressure. You'll do a bad job of it, and things will go from bad to worse very quickly, especially if reporters think you're available for comment.<br />
<br />
And finally, what's the point of all this if you're not going to be developing some bad-ass web app to scratch a particular itch?<br />
<br />
I strongly recommend that you use some variety of a secure development lifecycle, such as <a href="http://www.opensamm.org/">OpenSAMM</a>, <a href="http://www.bsimm.com/">BSIMM</a>, or <a href="http://www.microsoft.com/security/sdl/default.aspx">Microsoft's SDL</a>. Obviously, these won't solve your issues, but they should help alleviate them.<br />
<br />
One of the most important things in each lifecycle is developer training. Without it, you're leaving a very large, obvious and attractive target for attackers.<br />
<br />
Many of the things in a lifecycle will overlap with, and go beyond, these recommendations. That's good and fine, but most of them discuss things in very abstract terms, and hopefully this post puts some of the requirements down more concretely.<br />
<br />
Hopefully, that should give you a good place to start. The important thing is discipline: the discipline to check for and apply patches, follow a secure development lifecycle, and impose some reasonable restrictions on yourself and therefore your attackers.<br />
<br />
Developing software is very hard. Developing secure software is maddeningly hard. For this reason, there is no security silver bullet. Anyone trying to sell you a one-stop solution is full of it. Security requires careful research and implementation, and it will take time. It is easiest to do this from the start; by injecting quality control activities into an existing process you can still improve an organisation's security posture, but it will be slow. In many cases, bespoke software will have to be reviewed or simply thrown away, services migrated to better-configured infrastructure, and so on. It takes time, a lot of time.<br />
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-81258131401941944152015-02-10T12:00:00.000+00:002015-02-12T21:36:27.643+00:00Madness from the CSo, I've decided to take the plunge.<br />
<br />
C is so widely used that not being intimately acquainted with it is a definite hindrance. I can read C comfortably for the most part; ably wielding the source of a few kernels or utilities to track down bugs and determine <i>exactly</i> how features are used is not beyond my remit.<br />
<br />
I might actually try to write some C. Originally I was going to be all hipster and show how to build your web 2.0 service in C using FastCGI, because the 90s are still alive and well. Also, you can do some pretty awesome things relating to jailing strategies (e.g. chroot+systrace, chroot+seccomp or jail+capsicum), and the performance can be good.<br />
<br />
Unfortunately, when it comes to writing C, I freeze up. I know there is so much left to the imagination; so much to fear; so much lurking in the shadows, waiting to consume your first born. And I hate giving out advice that could lead to someone waking up to find ETIMEDOUT written in blood on the walls; even if the "advice" is given in jest. (Thanks to <a href="http://research.microsoft.com/en-us/people/mickens/thenightwatch.pdf">Mickens</a> for the wording)<br />
<br />
I speak of the dreaded undefined behaviour.<br />
<br />
Undefined behaviour is a double-edged sword. In tricky situations (such as evaluating INT_MAX + 1), it allows the compiler to do as it pleases, for the purposes of simplicity and performance. However, this often leads to "bad" things happening to those who transgress in the areas of undefined behaviour.<br />
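To make the INT_MAX + 1 situation concrete, here is a small sketch of my own (the helper name is invented, not a standard function): because signed overflow is undefined, an overflow check has to be phrased so that the overflowing addition is never actually evaluated.<br />

```c
#include <limits.h>

/* Hypothetical helper: decide whether a + b would overflow a signed
 * int, using only well-defined comparisons. The check is phrased so
 * that the overflowing addition itself is never performed. */
int add_would_overflow(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) return 1;   /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return 1;   /* would go below INT_MIN */
    return 0;
}
```

The "obvious" test, if (a + b < a), looks equivalent, but since the compiler may assume signed overflow never happens, it is entitled to fold that test to false.<br />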
<br />
I suggest that if you are a practicing C developer, and you don't think this is much of a problem, or you don't know much about it, you read both Regehr's "<a href="http://blog.regehr.org/archives/213">A Guide to Undefined Behaviour in C and C++</a>" and Lattner's "<a href="http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html">What Every C Programmer Should Know About Undefined Behaviour</a>" in full.<br />
<br />
I was in the "undefined behaviour's bad, but not too much of a problem since it can be easily avoided" camp, since I am more security-oriented than performance-oriented. I much prefer Ada to C.<br />
<br />
That was, until I started in on Hatton's "Safer C".<br />
<br />
The book is well written, clear, and direct.<br />
<br />
Broadly speaking, all was going well, until I got to page 49. Page 49 contains a listing of undefined behaviour, as do pages 50, 51, 52, 53, 54, 55, and about one third of 56. That's over 7 pages of undefined behaviour.<br />
<br />
It would be less bad if these were all weird corner cases that the compiler clearly could not detect and throw out as "obviously bad" -- not even just semantically bad (for example, dereferencing a null pointer). But many of them are <i>syntactically bad.</i><br />
<br />
Things like "The same identifier is used more than once as a label in the same function" (Table 2.2, entry 5) and "An unmatched ' or " character is encountered on a logical source line during tokenization" (Table 2.2, entry 4) and my current favourite, "A non-empty source file does not end in a new-line character, ends in a new-line character immediately preceded by a backslash character, or ends in partial preprocessing token or comment" (Table 2.2, entry 1).<br />
<br />
Just look at those three. <i>Syntactic</i> errors, which, according to the standard, could bring forth <a href="http://groups.google.com/groups?hl=en&selm=10195%40ksr.com">the nasal demons</a>.<br />
<br />
Let me put it more bluntly. A conforming compiler may accept <i>syntactically invalid C programs</i>, and then emit (in effect) <i>any code it wants</i>.<br />
<br />
Now, clearly, most compilers do not do this; they define these undefined behaviours as syntax errors. The thing which really scares me is that the errors presented are from the first page of the table, and there's another six pages like that to go.<br />
<br />
Further, I think I must come from a really luxurious background. I expect compilers to do everything reasonable to help program writers avoid obvious bugs, rather than simply making the compiler writer's life easier.<br />
<br />
A compiler is written for a finite set of hardware platforms. It needs to be used by countless other (probably less experienced) programmers to produce software which may then go on to be used by an unfathomable number of people. It's no exaggeration to claim that a large fraction of the world's population have probably interacted, in one way or another, with the OpenSSL code base, or the Linux kernel.<br />
<br />
The reliability of these programs is directly correlated to the "helpfulness" of the language they are written in, and as such, C needs to be revisited with a view to "pinning down" many of the undefined behaviours; especially those which commonly lead to serious safety or security vulnerabilities.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-23548055585360643262015-02-04T20:54:00.000+00:002015-02-04T20:54:59.992+00:00Last Christmas, I gave you my HeartBleedWith HeartBleed well and truly behind us, and entered into the history books, I want to tackle the idea that HeartBleed would've definitely been prevented with the application of formal methods -- specifically those that culminate in a proof of correctness regarding the system.<br />
<br />
Despite sounding really good, this is a <i>false</i> statement, and the reasoning isn't all that subtle.<br />
<br />
To demonstrate, I'm going to use a common example, the Bell and LaPadula model for mandatory access control.<br />
<br />
The model is, roughly, that everything is a resource or a subject. Both resources and subjects have classifications (we'll just limit ourselves to two), for example, "low" and "high". Subjects with low classification can only read low resources. Subjects with high classification may read both high and low resources.<br />
<br />
The specification of the write operations is not currently relevant.<br />
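Before reaching for any formal notation, the read rule above can be captured as a one-line predicate. A sketch in C (the names are my own, not part of any standard model implementation):<br />

```c
/* The two classification levels used in the text. */
enum level { LEVEL_LOW, LEVEL_HIGH };

/* The "read down" rule: a subject may read a resource if and only if
 * the subject is high, or the resource is low. */
int may_read(enum level subject, enum level resource)
{
    return subject == LEVEL_HIGH || resource == LEVEL_LOW;
}
```

The specification below says essentially this, but in a form that also fixes what the outputs of a successful and a failed read must be.<br />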
<br />
So, let's try and specify this in (something like) the Z notation. We need two operations, ReadOk and ReadError; the full read operation, ReadOp, is the combination of the two.<br />
<br />
I apologise in advance; this is the best bastardisation of Z that I could work into a blog post without including images.<br />
<br />
ReadOk:<br />
<br />
Variables:<br />
SubjectClassification?<br />
ResourceId?<br />
Result!<br />
<br />
Predicates:<br />
(SubjectClassification? = high) || (classificationOfResource(resourceFromId(ResourceId?)) = low)<br />
Result! = resourceFromId(ResourceId?)<br />
<br />
What this can be read as is: given inputs SubjectClassification? and ResourceId?, and a Result! output, if the predicate (SubjectClassification? = high) || (classificationOfResource(resourceFromId(ResourceId?)) = low) holds, then the Result! output is the resource.<br />
<br />
This embodies the "read down" property of Bell &amp; LaPadula. If a subject has high classification, then they can read; otherwise, the resource must have a low classification.<br />
<br />
The ReadError operation is similarly designed:<br />
<br />
ReadError:<br />
<br />
Variables:<br />
SubjectClassification?<br />
ResourceId?<br />
Result!<br />
<br />
Predicates:<br />
(SubjectClassification? = low) && (classificationOfResource(resourceFromId(ResourceId?)) = high)<br />
Result! = error<br />
<br />
This basically, roughly, very poorly, states that if you're a low classification subject, and you try to read a high classification resource, you'll get an error.<br />
<br />
And with that, for our purposes, the specification is done.<br />
<br />
This looks really good! A decent specification that we can really work to! Let's implement it!<br />
<br />
<pre>public class BLPFileSystem {
    public enum Classification { LOW, HIGH }

    private final Map<ResourceId, Resource> resourceMap;

    public Resource readById(final ResourceId id, final Classification subjectClassification)
            throws IllegalAccessException {
        assert id != null;
        assert resourceMap.containsKey(id);
        Resource r = resourceMap.get(id);
        Classification resClass = r.getClassification();
        r.setClassification(Classification.LOW);
        if (subjectClassification == Classification.HIGH) {
            return r;
        } else if (resClass == Classification.LOW) {
            return r;
        } else {
            throw new IllegalAccessException();
        }
    }
}
</pre>
Also, imagine the rest of the infrastructure is available and correct...<br />
<br />
Well, this is obviously broken. If you've skipped the code, the offending line is "r.setClassification(LOW);"<br />
<br />
That's right, this method declassifies everything as it goes along!<br />
<br />
Interestingly, this completely meets our specification. Now, if this were automatically verified (engage hand-waving/voodoo), I could push this to production with no hassle.<br />
<br />
This isn't just a contrived example, but a demonstration of a general issue with these sorts of things.<br />
<br />
A specification is usually a minimum of what your software must do -- it usually does not declare a maximum. In OpenSSL's case, the software did the minimum that it was supposed to, but it also went above and beyond its specification to work in new and interesting ways; which turned out to be really bad.<br />
<br />
Even with our file system, we can add predicates to ensure that the state of the read file is not changed by reading it; but then the state of <i>other </i>files could be modified. A bad service could declassify every other file when reading any file.<br />
<br />
It's not easy to just put "must not" into the spec. Many cryptography systems must run in adversarial situations, for example, virtual machines sharing hardware with adversaries. These systems must protect their key material despite a threat model in which the adversary can measure the timing, power draw and cache effects of the system.<br />
<br />
In our example, the system can <i>still </i>violate the "spirit" of the specification by leaking information through timing-dependent operations.<br />
<br />
In part, this exists because the specification is a "higher level" of abstraction, and the abstraction is not perfect.<br />
<br />
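To make the timing point concrete, here is a sketch of my own (not from any particular specification): two comparison routines that satisfy exactly the same input/output specification, yet only one keeps its running time independent of the data.<br />

```c
#include <stddef.h>

/* Returns early at the first mismatch: functionally correct, but the
 * running time reveals how many leading bytes matched. */
int leaky_equals(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Always inspects every byte, so the timing is independent of where
 * the first mismatch occurs. Same functional behaviour as above. */
int constant_time_equals(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned char diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;
}
```

A specification that only talks about inputs and outputs cannot distinguish these two functions; the abstraction has no vocabulary for the leak.<br />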
Now, that is not to say that we should abandon formal methods. Far from it, these, and related formalisms are the kinds of things which save lives, day in, day out. The overall quality of most projects would be vastly improved if some degree of formal methods had been applied from the start. Doubly so if it's something as unequivocally <i>bad</i> as the OpenSSL source. It's just that, in the face of security, our tools need some refinement.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-33552912720667547252015-01-26T21:30:00.002+00:002015-01-26T21:40:51.674+00:00Software Testing with RigourPreviously, I've heard a lot about test-driven development (TDD) and why TDD is dead. Gauntlets have been thrown down, holy wars waged, and internet blood spilled.<br />
<br />
This has resulted in widespread disaffection with TDD, and in part, some substantial disillusion with testing as a whole. I know that I suffer from this; some days, I just want to cast the spear and shield of TestNG and Cobertura aside, and take up the mighty bastard-sword of SPARK Ada.<br />
<br />
I can sympathise with the feeling that testing is ineffective, that testing is a waste of time, that testing just doesn't work. When I get another bug report that the bit of code that I worked so hard to test is "just broken," I want to find the user and shake them until <i>they</i> conform to the code.<br />
<br />
Clearly, assaulting end-users is not the way forwards. Unfortunately, tool support for the formal verification of Java code is also lacking in the extreme, so that route is right out too.<br />
My own testing had been... undisciplined. Every class had unit tests, and I even made quite a lot of integration tests. But it seemed lots of bugs were getting through.<br />
It seems to me, that there are two things that really need to be said:<br />
<ol>
<li>Testing is extremely important.</li>
<li>Creating tests up front was, at one point, basically common sense. Myers covers this in his 1979 book, "<i>The Art of Software Testing</i>"</li>
</ol>
<div>
Following in Myers' footsteps, I'd also like to make a big claim: <i>Most people who do TDD aren't doing testing right.</i></div>
<div>
<i><br /></i></div>
<div>
TDD is often used as a substitute for program requirements, or program specification. Unfortunately, since the tests are almost always code, when a test fails, how does one really decide in a consistent way if it's the tests or the program that's broken? What if a new feature is to be worked into the code base that, on the surface looks fine, but the test suite shows it to be mutually exclusive with another feature?</div>
<div>
<br /></div>
<div>
Agile purists take note; a user story can work as a piece of a system's specification or requirements, depending on the level of detail in the story. If you're practicing "agile", but you don't have some way of specifying features, you're actually practicing the "ad-hoc" software development methodology, and quality will likely fall.</div>
<div>
<br /></div>
<div>
Testing is "done right" when it is done with the intent of showing that the system is defective.</div>
<div>
<br /></div>
<h3>
Designing Test Cases</h3>
<div>
A good test case is one which stands a high chance of exposing a defect in the system.</div>
<div>
<br /></div>
<div>
Unlike Myers, I side with Regehr on this one: randomised testing of a system is a net good, if you have:</div>
<div>
<ol>
<li>A medium or high strength oracle.</li>
<li>The time to tweak your test case generator to your system.</li>
<li>A relatively well-specified system.</li>
</ol>
<div>
If you want to add <i>even more</i> strength to this method, <a href="http://csrc.nist.gov/groups/SNS/acts/documents/SP800-142-101006.pdf">combinatorial test case generation</a>, followed by randomised test case generation looks to be a very powerful methodology.</div>
<div>
<br /></div>
<div>
However, I also feel strongly that time and effort needs to be put into manually designing test cases. Specifically, designing test cases with a high degree of discipline, and the <i>honest intent to break your code.</i></div>
</div>
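A minimal sketch of that randomised recipe, in C; both the function under test and its "obviously correct" oracle here are hypothetical, invented purely to show the shape of the driver:<br />

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical system under test: a "clever" branch-light clamp. */
static int clamp_fast(int x, int lo, int hi)
{
    x = x < lo ? lo : x;
    return x > hi ? hi : x;
}

/* A strong oracle: the obviously-correct reference implementation. */
static int clamp_ref(int x, int lo, int hi)
{
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* Randomised driver: generate many inputs and compare the system
 * under test against the oracle on every one. */
static void random_clamp_test(unsigned trials)
{
    for (unsigned i = 0; i < trials; i++) {
        int lo = rand() % 200 - 100;
        int hi = lo + rand() % 100;   /* generator tweaked so lo <= hi */
        int x  = rand() % 400 - 200;
        assert(clamp_fast(x, lo, hi) == clamp_ref(x, lo, hi));
    }
}
```

Note the two ingredients from the list above in miniature: the oracle is strong (an exact reference answer), and the generator has been tweaked to respect the system's precondition (lo &lt;= hi).<br />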
<div>
<i><br /></i></div>
<div>
Myers recommends using boundary value analysis to partition <i>both</i> the input domain and output range of the system, and designing test cases which exercise those boundaries, as well as representative values from each range.</div>
<div>
<br /></div>
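For instance, take this made-up unit under test (invented purely to illustrate the partitioning): valid scores run from 0 to 100, with 40 as the pass mark. Boundary value analysis says the interesting inputs sit at and either side of 0, 40 and 100, not just somewhere comfortable in the middle of each range.<br />

```c
/* Hypothetical unit under test: classify an exam score. Valid scores
 * are 0..100 inclusive; 40 and above is a pass. */
enum verdict { VERDICT_INVALID, VERDICT_FAIL, VERDICT_PASS };

enum verdict grade(int score)
{
    if (score < 0 || score > 100)
        return VERDICT_INVALID;   /* outside the input domain */
    return score >= 40 ? VERDICT_PASS : VERDICT_FAIL;
}
```

The boundary-derived cases are -1, 0, 39, 40, 100 and 101; a representative mid-range value from each partition (say 20 and 70) rounds out the set.<br />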
<div>
He also discusses designing test cases which will raise most coverage metrics to close to 100%, which struck me as odd; although he tempers it by using boundary value analysis to design the high-coverage tests. I'm not sure I can recommend that technique, as it destroys coverage metrics as a valid proxy of testing effectiveness, and Myers acknowledges this earlier in the book.</div>
<div>
<br /></div>
<div>
For a really good rundown of how to apply boundary value analysis (and lots of other really interesting techniques!), I really can't do any better than referring you to Myers' book.</div>
<div>
<br /></div>
<h4>
Testing Oracles</h4>
<div>
Testing oracles are things which, given some system input and the system's output, decide if that output is "wrong".</div>
<div>
<br /></div>
<div>
These can be hand-written expectations, or simply waiting for exceptions to appear.</div>
<div>
<br /></div>
<div>
The former is a form of strong oracle, the latter is a very weak oracle. If your system has a high quantity of non-trivial assertions, you've got yourself a medium-strength oracle, and can rely on that to some degree.</div>
<h3>
What To Test</h3>
<div>
Myers and Regehr are in agreement: testing needs to be done at many levels, though it's not as clear in Myers' book.</div>
<div>
<br /></div>
<div>
Unit testing is a must on individual classes, but this is not enough. Components need to be tested with their collaborators to ensure that there are no miscommunications or misunderstandings of a unit's contract.</div>
<div>
<br /></div>
<div>
I guess the best way to put this across is to simply state this: <i>unit testing and integration testing are looking for different classes of errors</i>. It is not valid to have one without the other, and claim that you may have found a reasonable amount of errors, as there are errors which you are simply not looking for.</div>
<div>
<br /></div>
<div>
I, personally, am a big fan of unit testing, then bottom-up integration testing. That is, test all the modules individually, with all collaborators mocked out, then start plugging units together starting at the lowest level, culminating in the completed system.</div>
<div>
<br /></div>
<div>
Other methods may be more effective for you; see Myers' book for more.</div>
<div>
<br /></div>
<div>
This method allows you to look for logic errors in individual units, and when an integration test fails, you have a good idea of what the error is, and where it lies.</div>
<div>
<br /></div>
<h3>
How to Measure Testing Effectiveness</h3>
<div>
A test is effective if it has a high probability of finding errors. Measuring this is obviously very hard. One thing that you may need to do is work out an estimate of how buggy your code is to begin with. Myers has a rundown of some useful techniques for this.</div>
<div>
<br /></div>
<div>
Coverage is a reasonable proxy -- if and only if you have not produced test cases designed to maximise a coverage metric.</div>
<div>
<br /></div>
<div>
It would also seem that most coverage tools only measure a couple of very weak coverage metrics: statement coverage and branch coverage. I would like to see common coverage tools start offering condition coverage, multi-condition coverage, and so on.</div>
<div>
<br /></div>
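To see why statement and branch coverage are weak, consider this two-condition decision (a contrived example of my own): the two tests (1, 1) and (0, 1) already give 100% branch coverage, yet the second condition is never observed to be false on its own. Condition and multi-condition coverage would demand the missing combinations.<br />

```c
/* One decision, two conditions. With short-circuit evaluation,
 * (1, 1) takes the true branch and (0, 1) takes the false branch,
 * so branch coverage is "complete" after just those two tests --
 * (1, 0) and (0, 0) are never exercised. */
int should_page_admin(int disk_full, int admin_on_call)
{
    if (disk_full && admin_on_call)
        return 1;
    return 0;
}
```

Multi-condition coverage requires all four combinations of the two conditions, which is exactly where a bug that depends on admin_on_call alone would be flushed out.<br />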
<h3>
When to Stop Testing</h3>
<div>
When the rate at which you're finding defects becomes sufficiently low.</div>
<div>
<br /></div>
<div>
This is actually very hard, especially in an agile or TDD house, where tests are being run constantly, defects are patched relatively quickly with little monitoring, and all parts of development are expected to be a testing phase.</div>
<div>
<br /></div>
<div>
If your methodology has a testing phase, and you find that the end of your (for example) 4-week testing window is turning up more and more defects every week, <i>don't stop. </i>Only stop when the defect-detection rate has dropped to an acceptable level.</div>
<div>
<br /></div>
<div>
If your methodology doesn't have a testing phase, this is a much harder question. You have to rely on other proxy methods of whether your testing is effective, and if you've discovered most of the defects your end users are likely to see. Good luck.</div>
<div>
<br /></div>
<div>
I, unfortunately, am in the latter category. I just test as effectively as I can as I go along, and hope that my personal biases aren't sinking me too badly.</div>
<h3>
Conclusion</h3>
<div>
Do testing with the intent of breaking your code, otherwise you're not testing -- you're just stroking your ego.</div>
<div>
<br /></div>
<div>
If possible, get a copy of Myers' book and have a read. The edition I have access to is quite small, coming in at ~160 pages. I think that if you're serious about having a highly-effective test suite, you need to read this book.</div>
<div>
<br /></div>
<div>
Regehr's Udacity course, "<a href="https://www.udacity.com/course/cs258">Software Testing</a>", is also worth a plug, as he turns out to be capable of both effective systems testing and effective teaching; a rare combination. Take advantage of it. The course also provides a nice, more modern view on many of Myers' techniques. <a href="http://blog.regehr.org/">His blog is also pretty darn good</a>.</div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-3501953172932725412014-12-13T10:33:00.001+00:002014-12-13T10:33:17.527+00:00Securing StringsThis is not about String as in Java, or std::string in C++. This is about the program strings, part of the GNU binutils development tools.<br />
<br />
The strings program takes a file and prints out the printable strings.<br />
<br />
Recently, <span style="white-space: pre-wrap;">Michal Zalewski (aka lcamtuf) used his American Fuzzy Lop (afl) tool to fuzz a variety of GNU tools, one of which was the strings program. The outcome of this was that it's a <a href="http://lcamtuf.blogspot.co.uk/2014/10/psa-dont-run-strings-on-untrusted-files.html">very bad idea to run strings on untrusted input</a>.</span><br />
<span style="white-space: pre-wrap;"><br /></span>
<span style="white-space: pre-wrap;">I should make it clear: I don't think that the author of strings should've undertaken such measures when it was written. Most software starts off as a personal prototype or tool and grows. It's silly to start demanding the most rigorous secure software development methodology from one author and their pet project.</span><br />
<span style="white-space: pre-wrap;"><br /></span>
<span style="white-space: pre-wrap;">Once the pet project escapes and starts being relied on by other people, the dynamic obviously changes, and more questions about who is responsible for the correctness of the program start being asked -- even <i>if</i> anyone is responsible for it, since most software comes with a disclaimer of warranty.</span><br />
<span style="white-space: pre-wrap;"><br /></span>
<span style="white-space: pre-wrap;">I will <i>not</i> cover program verification tools. They're often just overkill for most problems, and I think this may well be one of them.</span><br />
<span style="white-space: pre-wrap;"><br /></span>
<span style="white-space: pre-wrap;">Anyways, onto my main point. How do we go about solving this problem once and for all?</span><br />
<span style="white-space: pre-wrap;"><br /></span>
<h3>
<span style="white-space: pre-wrap;">Audit ALL the things!</span></h3>
<div>
<span style="white-space: pre-wrap;">This is OpenBSD's primary approach. All commits must be reviewed, and almost all of the code base was reviewed somewhere around version 2 or 3, when Theo de Raadt's own OpenBSD machine was compromised (shock, horror!).</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">This is a time-consuming process, and there are no guarantees. If one person misses a subtle bug, what's the chance that the next person misses the bug? I'd wager that the chance is "high," but I'm mostly speculating.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">I'm actually very pro-code review/audit. I like that TrueCrypt (now VeraCrypt) is getting audited, and at my work, I'm pushing hard for code review before any code goes live. I'm also aware that it is by no means a perfect tool, and it does slow down the process to go from design to deployment.</span></div>
<h3>
<span style="white-space: pre-wrap;">Use a Memory Safe Language</span></h3>
<div>
<span style="white-space: pre-wrap;">We could use (for example) Java, or even Ada (with the right GNAT flags) to re-write the strings tool and completely avoid memory safety vulnerabilities.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">I like this idea for new projects; I would never suggest starting a new project in C, Objective-C or C++ because they all inherit C's badness with memory.</span></div>
<div>
<br /></div>
<div>
<span style="white-space: pre-wrap;">But... Java requires a runtime (the JVM) and its startup time is non-trivial. Most people don't know Ada, and Haskell has even fewer engineers to its name.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">Further, for Java especially, you're relying on the security of the underlying runtime, which hasn't had a great track record.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">I'd argue that Ada is the best choice out of the lot, but I'm biased. I really like Ada.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">Obviously, re-writing extant programs in an entirely new language is not the smartest idea, unless there's <i>really</i> good reason. It's time consuming, and you're likely to re-introduce bugs that you coded out in the original.</span></div>
<h3>
<span style="white-space: pre-wrap;">Apply Good Security Principles</span></h3>
<div>
<span style="white-space: pre-wrap;">What I actually mean by this is that you should ensure that you apply the principle of least privilege. That means restricting exactly what the program can do, so that if compromised, the program can't do much more harm, even if the attacker manages to gain complete control over the program.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">On Linux, this can be achieved to a very fine-grained level with the seccomp system call, and on FreeBSD, there is the Capsicum subsystem.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">What these allow you to do is to enter a "secure" mode, whereby all but a few specified system calls are completely disabled. An application attempting to run a banned system call is killed by the kernel. Often, you're allowed to carry already-open file descriptors with you into the secure mode.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">For strings, you'd read the command line, open your target file read only (check it exists, is readable, etc.) and then enter secure mode whereby you can only read the single opened file. Should an RCE be found, the adversary would be able to read as much of the open file as they like, but they would be contained within that single process. They could not open a new shell (that would involve a banned system call), and they could not open file descriptors to sensitive files (/etc/passwd, browser caches, etc.) since that would involve creating a new file descriptor, which is banned. It couldn't open a network socket and send any data to a C&amp;C host, as it would be banned from creating sockets.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">The only way out would be to find a privilege escalation exploit in the kernel using the system calls that aren't immediately filtered.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">I actually like this idea best, since it can easily be combined with code review. You aim to reduce the number of security relevant bugs using code review (and testing, but you're unlikely to cover a large amount of the state space). Any that slip through the net become safe crashes, not full compromises.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
<div>
<span style="white-space: pre-wrap;">First, you implement the minimal "jail" (not like chroot jails or FreeBSD jails, but seccomp or Capsicum per-process jails), and have the implementation use it. You then get your colleagues to review the jail implementation in your program.</span></div>
<div>
<span style="white-space: pre-wrap;"><br /></span></div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-46918049326472397272014-12-06T14:10:00.002+00:002014-12-06T14:14:23.070+00:00A Simple Mail Merge ApplicationMy other half has just finished writing their first full-length novel. As such, they'd like to send it off to agents.<br />
<br />
My first thought was OpenOffice Base. Have them enter the agent details into a table, and use OpenOffice Writer's mail merge facility. This, however, did not work, since Writer's mail merge facility lacked the all-important attachments functionality.<br />
<br />
If OpenOffice had this functionality, we'd have been up and running in about 15 minutes. I'm surprised it doesn't; it's the ultimate way to automate the job search, and surely an unemployed programmer would've provided the functionality at some point...<br />
<br />
But, leaving that by the wayside, I wasn't about to give up on OpenOffice just yet. I know that OpenOffice Base's files are just cunningly zipped HSQL databases, with some metadata surrounding them.<br />
<br />
So, I thought I'd unzip the OpenOffice Base file and have a small Java application read the HSQL database, put the results through the Velocity template engine and send off the email.<br />
<br />
This would've involved sneaker-netting the ODB file back and forth between my machine and my partner's, but that seemed ok. They'd enter many agents in during the day, and I'd "send" them all over night. No biggy.<br />
<br />
This was also a bust. Once my Java application with its almighty HSQL JDBC jar had touched the database, it seemed to taint it. I think it bumped a version field in the database. This meant that OpenOffice Base refused to open it after even one round with my Java program.<br />
<br />
So, plan C. SQLite is an amazing embedded database. Far faster and nicer than HSQL -- it even comes with a neat little command line interface.<br />
<br />
I set up some test data in an SQLite database and pointed my Java program at it. Success!<br />
<br />
So then I told OpenOffice Base to look at the file, so my partner could enter some data. Failure! OpenOffice Base had an issue with (I think) the scrolling modes available. So that was right out; it wouldn't even show the data in the OpenOffice interface. Sad times.<br />
<br />
Plan D. Remember, you've always got to have at least 3 fall-back plans when developing software, otherwise nothing will ever work.<br />
<br />
PostgreSQL to the rescue! I set up Postgres on my machine and opened a port for it. On my partner's machine, I tested that they could connect their OpenOffice Base to my Postgres. I then tested dropping some test data in the Postgres database and trying my Java program, configured to use Postgres... Success!!<br />
<br />
Now all I need to do is figure out how to send MIME multi-part HTML emails correctly ... ugh.<br />
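For the record, a multipart HTML email isn't magic: it's a multipart/alternative body carrying a plain-text part and an HTML part (attachments then wrap the lot in multipart/mixed). Here's a rough, hypothetical sketch of the wire format, built by hand just to show the shape — in practice a library like JavaMail assembles this for you, and the boundary value here is made up:<br />

```java
public class MimeSketch {
    public static void main(String[] args) {
        // multipart/alternative: clients render the *last* part they understand,
        // so the plain-text fallback goes first and the HTML version second.
        String boundary = "----=_Part_0_12345";
        String message =
              "MIME-Version: 1.0\r\n"
            + "Subject: Submission: My Novel\r\n"
            + "Content-Type: multipart/alternative; boundary=\"" + boundary + "\"\r\n"
            + "\r\n"
            + "--" + boundary + "\r\n"
            + "Content-Type: text/plain; charset=UTF-8\r\n"
            + "\r\n"
            + "Dear Agent,\r\n"
            + "\r\n"
            + "--" + boundary + "\r\n"
            + "Content-Type: text/html; charset=UTF-8\r\n"
            + "\r\n"
            + "<p>Dear Agent,</p>\r\n"
            + "--" + boundary + "--\r\n";  // closing boundary ends the multipart
        System.out.println(message);
    }
}
```

To add the manuscript attachments, this whole multipart/alternative body becomes one part of an outer multipart/mixed message.<br />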
<br />
Anyway, this has been a day or so worth of work on my part. I suspect I'll have another half-day to get HTML emails working correctly, and then I'll be sorted. Hopefully it'll enable my partner to effectively reach a whole host of agents without writing out the same damn cover letter, attaching various PDFs, and DOC files, etc. over and over.<br />
<br />
Once this is all wrapped up, I may open source it. I'll need to tidy the code, add tests and documentation, but it may be of use to someone.<br />
<br />
The moral of the story is that HSQL is a difficult database to work with, SQLite is always awesome but some things don't support it, and PostgreSQL is the best RDBMS since sliced bread.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-41534029348271133362014-09-28T10:58:00.002+01:002014-09-28T10:58:55.770+01:00How I fixed Shellshock on my OpenBSD BoxWell, I am on a "really old" OpenBSD, and I couldn't be bothered updating it right now. It was really arduous:<br />
<br />
<pre>[Sun 14/09/28 10:52 BST][p0][x86_64/openbsd5.2/5.2][4.3.17]&lt;turner@russell&gt;
zsh 1044 % sudo pkg_delete bash
bash-4.2.36: ok
Read shared items: ok
</pre>
<br />
In reality, this is only possible because, on a sane operating system, nothing depends on anything other than a POSIX-compliant sh, of which there are several available.<br />
<br />
To me, just another reason to avoid specific shells and just target POSIX. When the shit hits the fan, you'll have somewhere to hide. <br />
<br />
When I tried to do the same on my work (FreeBSD) box, I came up against several issues. The main one being that lots of packages specify bash as a dependency. <br />
<br />
At some point, I'll write a blog post about the functions of that server, and how I've hardened it.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-65377048520067787662014-09-26T13:29:00.001+01:002014-09-26T13:29:08.752+01:00Time for Regulation in the Software Industry?<h2>
</h2>
Many pieces of software have a clause similar to:<br />
<blockquote class="tr_bq">
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.</blockquote>
You may recognise that as a large portion of the <a href="http://opensource.org/licenses/BSD-2-Clause">BSD 2-Clause open source license</a>. Not to pick on BSD style licenses, section 15 of the <a href="http://opensource.org/licenses/GPL-3.0">GPL v3</a>, section 11 of the <a href="http://opensource.org/licenses/GPL-2.0">GPL v2</a>, section 7 of the <a href="http://opensource.org/licenses/Apache-2.0">Apache</a> license, and a substantial portion of the <a href="http://opensource.org/licenses/UoI-NCSA.php">NCSA</a> and <a href="http://opensource.org/licenses/MIT">MIT</a> licenses also have this clause. The wording is very similar from license to license. Let's look at how this plays out:<br />
<br />
<ul>
<li>Linux: GPL</li>
<li>FreeBSD: BSD Style</li>
<li>OpenBSD: BSD Style</li>
<li>Bash: GPL</li>
<li>GCC: GPL</li>
<li>Clang/LLVM: NCSA</li>
<li>Apache HTTPd: Apache</li>
<li>NGinx: BSD</li>
<li>Postgres: "BSD-Like", <a href="http://www.postgresql.org/about/licence/">Postgres License</a> </li>
<li>MySQL: GPL</li>
<li>Apache Struts: Apache </li>
</ul>
This represents a huge portion of internet-facing devices, including popular services such as <a href="https://www.netflix.com/openconnect/software">Netflix</a>, <a href="http://highscalability.com/blog/2013/5/20/the-tumblr-architecture-yahoo-bought-for-a-cool-billion-doll.html">tumblr</a>, <a href="https://www.facebook.com/MySQLatFacebook">Facebook</a>, and even the <a href="http://www.zdnet.com/blog/open-source/the-london-stock-exchange-moves-to-novell-linux/8285">London Stock Exchange</a>. Many devices run <a href="http://source.android.com/source/licenses.html">Android</a>, and lots of the routers separating home networks from the Internet are running variants of Linux.<br />
<br />
This is not to exclude commercial software, even Microsoft puts these clauses in their <a href="http://www.microsoft.com/en-us/legal/IntellectualProperty/UseTerms/Default.aspx">licenses</a>.<br />
<br />
I've even come across behaviour in The Co-operative Bank's online banking platform and TSB's Verified by Visa platform (URL structure, exception handling behaviour) that suggests they have a component which uses Apache Struts.<br />
<br />
The basic meaning (and I am not a lawyer by any stretch) is that, in the event of Bad Stuff, the person or entity who produced the software is, as far as legally possible, not Responsible for the Bad Stuff.<br />
<br />
So, the story so far:<br />
<ul>
<li>Developers kindly write software, entirely disclaim warranty or liability.</li>
<li>Organisations setup paid-for services off the back of this software, without verifying that the software is fit for purpose, since auditing software is expensive.</li>
<li>End users then entrust the organisations, via the un-audited software, with their money and personally identifying information (PII).</li>
</ul>
The end users -- the ones risking their bank deposits, their PII -- are, in many cases (banking has specific protections), the ones who are basically expected to evaluate the software which they are unknowingly going to use. They are in no position to assess or even understand the risks that they are taking.<br />
<br />
With cars, there are safety ratings, <a href="http://www.euroncap.com/home.aspx">Euro NCAP</a> and the like. Electric goods must meet a minimum safety standard (In the UK, most people look for the <a href="http://www.bsigroup.com/en-GB/our-services/product-certification/kitemark/">Kitemark</a>) before they can be brought to market. When these goods fail to meet this standard, the seller is held accountable, who in turn, may hold suppliers accountable for failing to supply components which meet the specified requirements. There is a well-known, well-exercised tradition of consumer protection, especially in the UK.<br />
<br />
On the other hand, should you provide a service which fails to deploy even basic security practices, you're likely to get, at most, a slap on the wrist from a toothless regulatory body (In the UK, this is the Information Commissioner's Office, ICO). For example, when <a href="http://www.troyhunt.com/2014/02/the-tesco-hack-heres-how-it-probably.html">Tesco failed to store passwords securely</a>, the only outcome was that the ICO was "unimpressed". There was likely no measurable outcome for Tesco.<br />
<br />
Banks, who usually are expected to be exemplary in this field, respond <a href="https://www.lightbluetouchpaper.org/2010/12/25/a-merry-christmas-to-all-bankers/">very poorly</a> to research that would allow consumers to differentiate between the "secure banks" and the "insecure banks". This is the exact opposite of what needs to happen.<br />
<br />
The lack of regulation, and the strangling-off of information to consumers, is leading to companies transferring ever larger risks to their clients. Often, the clients have no option but to accept this risk. How many people have a Facebook account because it's the only way to arrange social gatherings (because everyone's on Facebook)? How many people carry and use EMV (in the UK, Chip n' PIN) debit or credit cards? Banks are blindly rolling out NFC (contactless payments) to customers, who have no choice in the matter, and who, in many situations, simply do not know that this is an unsafe proposition, and could never really know.<br />
<br />
An illuminating example is <a href="http://www.bbc.co.uk/news/technology-29279213">eBay's recent antics</a>. The short version of the story is that a feature for sellers (and hence, a profit-making feature for eBay) has turned out to be a Bad Idea, exposing buyers to a very well-crafted credential-stealing attack. This has led to the exposure of many buyers' bank details to malicious third parties who have then used the details to commit fraud.<br />
<br />
In this situation, eBay is clearly gaining by providing a feature to the sellers, and by shifting the risk to the consumer. Further, because this is a profit-making hole, and closing it could break many sellers' pages (thus incurring a huge direct cost), eBay is spinning its wheels and doing nothing.<br />
<br />
The consumers gain little, if anything from this additional risk which they are taking on. Like a petrochemical company building plants in heavily populated areas, the local population bear the majority of the risk and do not share in the profits.<br />
<br />
This is an unethical situation, and regulatory bodies will normally step in to ensure that the most vulnerable are suitably protected. This is not the case in software & IT services: the regulatory bodies are toothless, and often do not have the expertise to determine which products are safe and which are not.<br />
<br />
For instance, OpenSSL and NSS (both open source cryptographic libraries) are used by The Onion Router (Tor) and the TorBrowser (effectively Firefox with Tor baked in), which are used by dissidents and whistleblowers the world over to protect their identity where their lives or livelihoods may be at risk.<br />
<br />
Both <a href="https://www.openssl.org/docs/fips/fipsvalidation.html">OpenSSL</a> and <a href="https://wiki.mozilla.org/FIPS_Validation">NSS</a> have won FIPS-140 (a federal standard in the US) approval in the past. Yet we have had <a href="http://heartbleed.com/">Heartbleed from OpenSSL</a>, and recently <a href="http://www.kb.cert.org/vuls/id/772676">signature forgeries in NSS</a>. Clearly, these bodies don't actually audit the code they certify, and when it does go catastrophically wrong, the libraries in question maintain their certifications.<br />
<br />
For reference, the academic community have been <a href="http://blog.cryptographyengineering.com/2013/02/cryptography-is-systems-problem-video.html">concerned with the state of the OpenSSL codebase for some time</a>. We've known that it was bad, we've shouted about it, and yet it retained its certified status.<br />
<br />
Individual governments often influence this by procuring only high-assurance software and demanding that certain products meet stringent standards; failures of those products can therefore be financially damaging to the suppliers.<br />
<br />
The UK government already has <a href="https://www.gov.uk/service-manual/technology/code-of-practice.html">Technology code of practice</a> which government departments must use to evaluate IT suppliers' offerings. However, there are many more fields which the government has little remit over, and no international remit. The US Government has similar standards processes embodied with the <a href="http://www.nist.gov/itl/">Federal Information Processing Standards</a> (FIPS, of which FIPS-140, mentioned above, is just one of many).<br />
<br />We also have existing standards processes, like the <a href="http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=54534">ISO 27000</a> series, which have a viral nature, in that the usage of an external service can only be considered if the organisation aiming for ISO 27000 certification can show that it has done due diligence on the supplier.<br />
<br />
However, as mentioned above, these standards rarely mean anything, as they rely on the evaluation of products which very few people understand, and are hard to test. Products that shouldn't achieve certification do, as with OpenSSL.<br />
<br />
Clearly, the current certification process is not deployed widely enough, and is not behaving as a certification process should, so we need something with more teeth. In the UK, we have the British Medical Association (BMA), which often takes its recommendations directly from the National Institute of Clinical Excellence (NICE). If a failure occurs, a health care provider (doctor, medical device supplier, etc.) will end up in court with the BMA present, and may lose their right to trade, as well as facing more serious consequences.<br />
<br />
There is a similar situation in the UK for car manufacture, where cars have a minimum safety expectation, and if the manufacturer's product doesn't meet that expectation, the company is held accountable.<br />
<br />
Another example is the US cars being sold into China. Many US cars don't meet the Chinese emissions standards, and hence cannot be sold into China. <br />
<br />
<br />
We need something similar in software and services: an agreement that, like in other forms of international trade, vendors and service providers are beholden to local law.<br />
<br />
We have existing legislation relating to the international provision of services. In many cases, this means that when a company (such as Google) violates <a href="http://www.bbc.co.uk/news/technology-29325580">EU anti-competition laws</a>, they are threatened with fines. The laws are in place, but need to be stronger, in terms of what constitutes a violation of the law, and the measures that can be applied to the companies in question.<br />
<br />
Currently, the internet is the Wild West, where you can be shot, mugged and assaulted all at once, and it's your own fault for not carrying a gun and wearing body armour. However, the general public are simply not equipped to acquire suitable body armour or fire a gun, so we need some form of "police" force to protect them.<br />
<br />
There will always be "bad guys" but we need reasonable protection from both the malicious and the incompetent. Anyone can set up an "encrypted chat program" or "secure social media platform", but actually delivering on those promises when people's real, live PII is on them is much harder, and should be regulated.<br />
<br />
<h2>
Acknowledgements</h2>
Many thanks to my other half, Persephone Hallow, for listening to my ranting on the subject, and inspiring or outright suggesting about half the ideas in this post, as well as proofreading & reviewing the post.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-83152543941437495182014-09-14T01:16:00.001+01:002014-09-14T12:01:37.442+01:00Achieving Low Defect Rates<h2>
Overview</h2>
Software defects, from null dereferences to array out-of-bounds accesses and concurrency errors, are a serious issue, especially in security-critical software. As such, minimising defects is often a stated goal of many projects.<br />
<br />
I am currently writing a <a href="https://github.com/TinnedTuna/otp-java">library</a> (OTP-Java) which provides several one-time password systems to Java applications, and this is a brief run-down of some of the steps I am taking to try to ensure that the library is as defect-free as possible. The library is not even alpha as yet. It still has several failing tests, but hopefully will be "completed" relatively soon.<br />
<br />
<h2>
Specification</h2>
Many products go forward without a specification. In many cases this is not a "bad thing" per se, but it can make testing more difficult.<br />
<br />
When surprising or unexpected behaviour is found, it should be classified as either a defect or simply a user without a complete understanding of the specification. With no specification, there can be no such classification. The best that can be done is to assess the behaviour and to determine if it is "wanted".<br />
<br />
As an example, I have seen a system where one part expected a user to have access to some data, and forwarded them on to it. The system for retrieving the data had more stringent requirements, and threw a security exception. Without a clear, unambiguous specification, there's no way of telling which part of the system is in error, and hence, no immediate way to tell which part of the system should be corrected.<br />
<br />
I would like to make it clear that I am not advocating for every system to have a large, unambiguous specification. If the product is security or safety critical, I would argue that it is an absolute must, and many share this view. For most other systems, a specification is an additional burden that prevents a product from getting to market. If a failure of your system will sink your business or kill people, then a specification is wise. Otherwise, just listen to your users.<br />
<br />
<h2>
Defects</h2>
Given a specification, a defect is often defined simply as a deviation from that specification. Many defects are benign, and will not cause any issues in production. However, some subset of defects will lead to failures -- these are actual problems encountered by users: exception messages, lost data, incorrect data, data integrity violations and so on. <br />
<br />
It is often seen to be most cost-effective to find and eliminate defects before deployment. Sometimes, especially in systems that do not have an unambiguous specification, this is extremely difficult to do, and in a sufficiently large system, this is often nearly impossible.<br />
<br />
For a large enough system, it's likely that the system will interact with itself, causing emergent behaviour in the end product. These odd interactions are what make the product versatile, but also what make eliminating surprising behaviour nearly impossible, and it may even be undesired for certain products.<br />
<br />
<h2>
Static Analysis</h2>
Tools that can provide feedback on the system without running it are often invaluable. Safety critical systems are expected to go through a battery of these tools, and to have no warnings or errors.<br />
<br />
I am using <a href="http://findbugs.sourceforge.net/">FindBugs</a> on my OTP-Java project to try to eliminate any performance or security issues. I have found that it provides valuable feedback on my code, pointing out some potential issues.<br />
<br />
There are also tools which will rate the cyclomatic complexity (CC) of any methods that I write. I believe that <a href="https://cobertura.github.io/cobertura/">Cobertura</a> will do this for me. This will be important, as a high CC is correlated with a high defect rate, and is also expected to make reading the code more difficult.<br />
<br />
<h2>
Testing</h2>
<h3>
Randomised Testing</h3>
Fortunately, generating and verifying one-time passwords (OTPs) is a problem space where there are clear measures of success, for example, if I generate a valid OTP, I must be able to validate it. Similarly, if I generate an OTP, modify it, the result should not be valid.<br />
<br />
This lends itself to randomised testing, where random "secrets" can be produced, and used to generate OTPs. These can then be verified or modified at will.<br />
<br />
Other properties can also be validated, such as, requesting a 6-digit OATH OTP actually does produce a 6-digit string, and that the output is entirely composed of digits.<br />
<br />
For the OTP-Java project, I am using the Java implementation of <a href="https://bitbucket.org/blob79/quickcheck/">QuickCheck</a>, driven by <a href="http://junit.org/">JUnit</a>.<br />
<br />
<h3>
Unit Testing</h3>
In OTP-Java, I've augmented the randomised testing with some test vectors extracted from the relevant RFCs. These test vectors, along with the randomised tests, should provide further confidence that the code meets the specification.<br />
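To make that concrete, here's a self-contained sketch of RFC 4226's HOTP computation — my own illustration for this post, not OTP-Java's actual code — checked against the first two Appendix D test vectors, plus the "right length, all digits" property discussed above:<br />

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;

public class HotpSketch {
    // RFC 4226 HOTP: HMAC-SHA1 over the 8-byte big-endian counter,
    // followed by dynamic truncation and reduction modulo 10^digits.
    static String hotp(byte[] secret, long counter, int digits) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret, "HmacSHA1"));
        byte[] h = mac.doFinal(ByteBuffer.allocate(8).putLong(counter).array());
        int offset = h[h.length - 1] & 0x0f;  // dynamic truncation offset
        int binary = ((h[offset] & 0x7f) << 24)
                   | ((h[offset + 1] & 0xff) << 16)
                   | ((h[offset + 2] & 0xff) << 8)
                   |  (h[offset + 3] & 0xff);
        return String.format("%0" + digits + "d", binary % (int) Math.pow(10, digits));
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "12345678901234567890".getBytes("US-ASCII");
        // First two RFC 4226 Appendix D test vectors for this shared secret:
        System.out.println(hotp(key, 0, 6));  // expected: 755224
        System.out.println(hotp(key, 1, 6));  // expected: 287082
        // Property: every generated OTP is exactly 6 characters, all digits.
        boolean ok = true;
        for (long c = 0; c < 1000; c++) {
            String otp = hotp(key, c, 6);
            ok &= otp.length() == 6 && otp.chars().allMatch(Character::isDigit);
        }
        System.out.println(ok ? "property holds" : "property violated");
    }
}
```

The RFC vectors pin down the algorithm; the property loop is the shape of check that QuickCheck then drives with randomised secrets instead of a fixed key.<br />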
<br />
Usually, unit testing only involves testing a single component or class. However, with such a small library, and with such well-defined behaviour, it makes sense to test the behaviour of several parts of the system at the same time.<br />
<br />
In my opinion, these tests are probably too big to be called unit tests, and too small to be called integration tests, so I've just lumped them together under "unit tests". Please don't quarrel over the naming. If there's an actual name for this style of testing, I'd like to know it.<br />
<br />
<h2>
Assertive Programming</h2>
I have tried to ensure that, as far as possible, the code's invariants are annotated using assertions.<br />
<br />
<br />
That way, when an invariant is violated under testing, the programmer (me!) is notified as soon as possible. This should help with debugging, and will hopefully avoid any doubt when a test fails as to whether it was a fluke (hardware, JVM failure, or other) or genuine failure on my part.<br />
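As an illustration — a hypothetical helper written for this post, not OTP-Java's actual code — here's what annotating the dynamic truncation step with its invariants looks like (the assertions fire when run with java -ea):<br />

```java
public class AssertSketch {
    // Hypothetical helper: dynamic truncation of an HMAC-SHA1 output, with
    // its pre- and post-conditions written down as assertions.
    static int dynamicTruncate(byte[] hmac) {
        assert hmac != null && hmac.length == 20 : "expected a 20-byte HMAC-SHA1 output";
        int offset = hmac[hmac.length - 1] & 0x0f;
        int value = ((hmac[offset] & 0x7f) << 24)
                  | ((hmac[offset + 1] & 0xff) << 16)
                  | ((hmac[offset + 2] & 0xff) << 8)
                  |  (hmac[offset + 3] & 0xff);
        assert value >= 0 : "masking the top bit must yield a non-negative value";
        return value;
    }

    public static void main(String[] args) {
        // All-zero input: offset is 0, and the extracted word is 0.
        System.out.println(dynamicTruncate(new byte[20]));
    }
}
```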
<br />
This has, so far, been of a lot of use in randomised testing: there have been test failures, and the assertions have shown exactly why.<br />
<br />
It also helps users of the library. If my input validation is not good enough, and the users subject the library to tests as part of their testing, they will also, hopefully, be helped by the assertions, as they may help explain the intent of the code.<br />
<br />
<h2>
Type System</h2>
While Java's type system does leave a lot to be desired, it is quite effective at communicating exactly what is required and may be returned from specific methods.<br />
<br />
I have, unlike the reference implementation in the RFC (Mmm, "String key", "String crypto", and so on), tried to use appropriate types in my code, requiring a SecureRandom instead of just byte[] or even Random, to convey the fact that this is a security-critical piece of code, and one shouldn't use "any old value", as has often happened with OpenSSL's API (See also, predictable IVs), which has led to real vulnerabilities in real products.<br />
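A tiny, hypothetical illustration of the idea (the method name is made up for this post): the signature itself documents that a cryptographically strong source is required, and passing a byte[] seed or a java.util.Random simply won't compile:<br />

```java
import java.security.SecureRandom;

public class ApiSketch {
    // Requiring SecureRandom in the signature makes the contract explicit:
    // callers must supply a CSPRNG, not "any old value".
    static byte[] newSharedSecret(SecureRandom rng, int lengthBytes) {
        byte[] secret = new byte[lengthBytes];
        rng.nextBytes(secret);
        return secret;
    }

    public static void main(String[] args) {
        // newSharedSecret(new java.util.Random(), 20) would be a compile error.
        System.out.println(newSharedSecret(new SecureRandom(), 20).length);
    }
}
```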
<br />
Shielding the user from common mistakes by the use of a "sane" or "obvious" API is as much my job as it is the final user's. The security of any product which relies on the library is formed by the library's correct specification and implementation, as well as its correct use. Encouraging and supporting both is very important.<br />
<br />
<h2>
Code Coverage</h2>
Code coverage is often a yard-stick for how effective your testing is. Code coverage of 100% is rarely possible. For instance, if you use Mac.getInstance("HmacSHA1"), it's nearly impossible to trigger the "NoSuchAlgorithmException".<br />
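Concretely, the uncoverable branch looks something like this: every compliant JRE is required to provide the HmacSHA1 Mac algorithm, so no test can ever drive execution into the catch arm:<br />

```java
import javax.crypto.Mac;
import java.security.NoSuchAlgorithmException;

public class CoverageSketch {
    static Mac hmacSha1() {
        try {
            return Mac.getInstance("HmacSHA1");
        } catch (NoSuchAlgorithmException e) {
            // Effectively unreachable: HmacSHA1 is a required JCA algorithm,
            // so this branch drags coverage below 100% forever.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hmacSha1().getAlgorithm());
    }
}
```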
<br />
However, many tools provide branch coverage as well as line coverage. Achieving high coverage can help your confidence, but when using complex decision cases (for example, if (a && b || !(c && (d || e)))), it's very hard to be sure that you've covered all cases for "entry" into a block.<br />
<br />
Cyclomatic complexity (CC) should help here. As a rough guide, if you have a CC of 2, you should have at least 2 tests. Although this is still just a rule of thumb, it does help me feel more confident that I've ensured that, to a reasonable level, all eventualities are accounted for.<br />
<h2>
Conclusion</h2>
Many products don't have a specification, which can make reducing surprising behaviours difficult. Similarly, not all defects lead to failures.<br />
<br />
However, even without a specification, some of the techniques listed in this post can be applied to try to lower defect rates. I've personally found that these increase my confidence when developing, but maybe that just increases my appetite for risk.<br />
<br />
I am by no means saying that all of the above tools and techniques must be used. Similarly, I will also not say that the above techniques will ensure that your code is defect free. All you can do is try to ensure that your defect rate is lowered. For me, I feel that the techniques and tools in this post help me to achieve that goal.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-76022959734345865052014-09-05T00:48:00.002+01:002014-09-05T00:50:12.507+01:00The Future of TLSThere <a href="https://www.openssl.org/">exist</a> <a href="http://gnutls.org/">many</a> <a href="https://polarssl.org/">implementations</a> <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS">of</a> <a href="http://www.libressl.org/">TLS</a>. Unfortunately, alongside the proliferation of implementations, there is a <a href="https://gotofail.com/">proliferation</a> <a href="http://www.gnutls.org/security.html#GNUTLS-SA-2014-2">of</a> <a href="http://heartbleed.com/">serious</a> <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security#BEAST_attack">defects</a>. These defects aren't just limited to the design of the protocol, they are endemic. The cipher choices, the ways in which the primitives are combined, everything.<br />
<br />
I want to lay out a roadmap to a situation where we can have an implementation of TLS that we can be confident in. Given the amount of commerce that is currently undertaken using TLS, and the fact that other, <a href="https://www.torproject.org/">life-and-death (for some) projects</a> pull in TLS libraries (don't roll your own!) and place their trust in them, I'd say that this is a noble goal.<br />
<br />
Let's imagine what would make for a "strong" TLS 1.3 & appropriate implementation.<br />
<br />
We start with primitive cipher and MAC selection. We must ensure that well-studied ciphers are chosen. No export ciphers, no ciphers from The Past (3DES), and the ciphers must not show any major weaknesses (RC4). Key sizes must be at least 128 bits. If we're looking to defend against quantum computers, the key size must be at least 256 bits. A similar line of reasoning should be applied to the asymmetric and authentication primitives.<br />
<br />
Further, the specification should aim to be as small, simple, self-contained and obvious as possible to facilitate review. It should also shy away from recommendations that are known to be difficult for programmers to use without a high probability of mistakes, for instance, CBC mode.<br />
<br />
Basic protocol selections should ensure that a known-good combination of primitives is used. For example, encrypt-then-MAC should be chosen. We head off any chosen-ciphertext attacks by rejecting any forged or modified ciphertexts before even attempting to decrypt the ciphertext. This should be a guiding principle: avoid doing anything with data that is not authenticated.<br />
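As a sketch of the principle — illustrative only, a real TLS record layer has far more moving parts — encrypt-then-MAC with verification before any decryption might look like this, assuming AES-CBC for the cipher and HMAC-SHA256 for the tag:<br />

```java
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;

public class EtmSketch {
    // Encrypt-then-MAC: encrypt first, then MAC the IV and ciphertext.
    static byte[] seal(SecretKeySpec encKey, SecretKeySpec macKey, byte[] plaintext) throws Exception {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, encKey, new IvParameterSpec(iv));
        byte[] ct = c.doFinal(plaintext);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(macKey);
        mac.update(iv);
        byte[] tag = mac.doFinal(ct);
        byte[] out = new byte[iv.length + ct.length + tag.length];  // iv || ct || tag
        System.arraycopy(iv, 0, out, 0, 16);
        System.arraycopy(ct, 0, out, 16, ct.length);
        System.arraycopy(tag, 0, out, 16 + ct.length, tag.length);
        return out;
    }

    static byte[] open(SecretKeySpec encKey, SecretKeySpec macKey, byte[] msg) throws Exception {
        byte[] iv = Arrays.copyOfRange(msg, 0, 16);
        byte[] ct = Arrays.copyOfRange(msg, 16, msg.length - 32);
        byte[] tag = Arrays.copyOfRange(msg, msg.length - 32, msg.length);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(macKey);
        mac.update(iv);
        // Constant-time comparison; reject before any decryption happens.
        if (!MessageDigest.isEqual(tag, mac.doFinal(ct)))
            throw new SecurityException("forged or modified ciphertext");
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, encKey, new IvParameterSpec(iv));
        return c.doFinal(ct);
    }

    public static void main(String[] args) throws Exception {
        SecretKeySpec ek = new SecretKeySpec(new byte[16], "AES");       // demo keys only
        SecretKeySpec mk = new SecretKeySpec(new byte[32], "HmacSHA256");
        byte[] msg = seal(ek, mk, "attack at dawn".getBytes("UTF-8"));
        System.out.println(new String(open(ek, mk, msg), "UTF-8"));
        msg[20] ^= 1;  // tamper with one ciphertext byte
        try { open(ek, mk, msg); }
        catch (SecurityException e) { System.out.println("rejected"); }
    }
}
```

The forged message is thrown out at the MAC check; the CBC decryption (and any padding oracle lurking inside it) never runs.<br />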
<br />
The mandatory cipher suite in TLS 1.2 is TLS_RSA_WITH_AES_128_CBC_SHA. While this can be a strong cipher suite, it's not guaranteed. It's quite easy to balls up AES-CBC, introducing padding oracles and the like. I would argue that the default should be based around AES-GCM, as this provides authenticated encryption without even so much as lifting a finger. Even the choice of MAC in the current default is looking wobbly. It's by no means gone, but people are migrating away from HMAC-SHA1 to better MAC constructions. I would recommend exploiting the parallelism of current-generation technologies by allowing a PMAC.<br />
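For comparison, here's a minimal AES-GCM sketch using the standard JCA API (illustrative only): the mode authenticates for you, and a single flipped bit makes decryption fail outright rather than yield attacker-controlled plaintext:<br />

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

public class GcmSketch {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        byte[] iv = new byte[12];              // 96-bit nonce; never reuse per key
        new SecureRandom().nextBytes(iv);

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = enc.doFinal("hello".getBytes("UTF-8"));  // ciphertext || 128-bit tag

        ct[0] ^= 1;  // flip one ciphertext bit
        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        try {
            dec.doFinal(ct);  // authenticates before releasing any plaintext
            System.out.println("decrypted tampered ciphertext (bad!)");
        } catch (javax.crypto.AEADBadTagException e) {
            System.out.println("tamper detected");
        }
    }
}
```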
<br />
There should also be allowances, and maybe even a preference towards well-studied ciphers that are designed to avoid side-channels, such as Salsa20 and ChaCha20. <br />
<br />
There should be no equivalent of the current "null" ciphers. What a terrible idea.<br />
<br />
I like that in the current RFC, the maximum datagram size is specified. That means that the memory usage per-connection can be bounded, and the potential impact of any denial of service attack better understood before deployment.<br />
<br />
For the implementation, I am not fussy. Upgrading an existing implementation is completely fine by me. However, it should be formally verified. <a href="http://libre.adacore.com/">SPARK Ada</a> may be a good choice here, but I have nothing against <a href="http://frama-c.com/">ACSL and C</a>. The latter "could" be applied to many existing projects, and so may be more applicable. There are also more C programmers than there are Ada programmers.<br />
<br />
Personally, I think SPARK Ada would be a fantastic choice, but the license scares me. Unfortunately, ACSL has its own host of problems, primarily that the proof annotations for ACSL are not "as good" as for SPARK, due to the much more relaxed language semantics of C.<br />
<br />
Any implementation which is formally verified should be kept as short and readable as reasonably possible, to facilitate formal review. The task of any reviewers would be to determine if any side-channels existed, and if so, what remediation actions could be taken. Further, the reviewers should generally be checking that the right proofs have been proven. That is, the specification of the program is in-line with the "new" TLS specification.<br />
<br />
Responsibilities should be shirked where it makes good sense. For instance, randomness shouldn't be "hoped for" by simply using the PID or the server boot time. Use the OS-provided randomness, and if performance is paramount, use a CSPRNG periodically reseeded from true random (on Linux, this is supposed to be provided by /dev/random).<br />
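As a concrete sketch of "use the OS-provided randomness" (my example, in Python rather than anything TLS-specific): the standard-library `secrets` module draws from the OS CSPRNG via `os.urandom`, which is exactly the kind of source to prefer over PIDs or boot times.

```python
import secrets

# Draw key material from the OS CSPRNG (os.urandom under the hood),
# rather than "hoping for" randomness from PIDs or boot times.
session_key = secrets.token_bytes(32)  # 256 bits of key material
nonce = secrets.token_bytes(12)        # e.g. a 96-bit AEAD nonce

print(len(session_key), len(nonce))
```

The sizes here (32- and 12-byte values) are illustrative choices, not anything mandated by a specification.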
<br />
The implementation should, as far as possible, aim to meet the specification, as well as providing hard bounds for the time and memory usage of each operation. This should help with avoiding denial of service attacks.<br />
<br />
The deployment would be toughest. One could suppose that a few extra cipher suites and a TLS 1.2 mode of operation could be added to ease deployment, but this would seriously burden any reviewers and implementors. The existing libraries could implement the new TLS specification without too much hassle, or even go through their existing code-bases, move to a formally verified model, and then add in the new version. Once a sufficient number of servers started accepting the new TLS, a movement in earnest towards a formally verified implementation could begin (unless this route was taken from the start by an existing library).<br />
<br />
Any suitable implementation that wants major uptake would ensure that it has a suitable license and gets itself out there. Current projects could advertise their use of formal methods to ensure the safety of their library over others to try to win market share.<br />
<br />
We had the opportunity to break backwards compatibility once and fix many of these issues. We did not, and we've been severely bitten for it. We should really go for it now, before the next Heartbleed. Then we can go back to the fun stuff, rather than forever looking over our shoulders for the next OpenSSL CVE.Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-70203317629980457842014-09-02T23:09:00.001+01:002014-09-02T23:16:18.023+01:00Lovelocks: The Gap Between Intimacy and SecurityLet me start by asking a question. One that has been thoroughly explored by others, but may not have occurred to you:<br />
<br />
<blockquote class="tr_bq">
Why do you lock your doors when you leave your house?</blockquote>
<br />
We know that almost all locks can be defeated, either surreptitiously (for example, by picking or bumping the lock) or destructively (for example, by snapping the lock). What does this say about your choice to lock the door?<br />
<br />
Many <a href="https://www.youtube.com/watch?v=3nROJz_UNQY">agree with me</a> when I say that a lock on a private residence is a social contract. To those who are painfully aware of how weak locks are, they represent a firm but polite "Keep Out" notice. They are there to mark your private space.<br />
<br />
Now, assume that you and your partner are in a long-distance relationship, and that you live in the 1950s. Love letters would have been common in that era, and you would likely have kept them in your house. I know that I did just this with previous partners.<br />
<br />
Imagine that someone breaks into your house, photographs or photocopies those letters, and publishes their contents in an international newspaper. Learning of the intrusion, and that you had been attacked on such a fundamental level, would be devastating.<br />
<br />
To my mind, that is a reasonable analogy for what has happened to those who have had their intimate photos taken and posted to 4chan. An attacker defeated <a href="https://www.apple.com/pr/library/2014/09/02Apple-Media-Advisory.html">clear security mechanisms</a> to gain access to information that was obviously, deeply, private and personal to the victims.<br />
<br />
These victims were not fools; they did the equivalent of locking their doors. Maybe they didn't buy bump-, snap-, drill- and pick-resistant locks for over £100 per entrance, but they still marked the door as the entrance to a private space.<br />
<br />
It is right that the police should be treating this breach as an extremely serious assault on these women's (and they are all women at this time) personal lives, and therefore pursuing the perpetrators to the <a href="http://www.usmagazine.com/celebrity-news/news/christopher-chaney-celebrity-computer-hacker-sentenced-to-10-years-in-jail-20121712">fullest extent allowable under law.</a><br />
<br />
Claiming that the victims should have practiced better security in this case is blaming the victim. If I go to a bank, and entrust my will to one of their safety deposit boxes (now a dying service), and it is stolen or altered in my absence, am I at fault? No, the bank should be investing in better locks -- that's what they're paid to do; and beyond that, people shouldn't be robbing banks. Bank robbers are not my fault, and iCloud hackers are not the fault of these women.<br />
<br />
Further, it is just plain daft to claim that these women could have protected themselves in this world while maintaining the intimate relationships that they wanted. Security is very hard; we know this because barely a week passes in which the members of some service are not urged to change their passwords due to a security breach. And these breaches affect entities of all sizes, from major corporations to individuals. Even those agencies with budgets the size of a small country's GDP, whose remit is to protect the lives of many people, have serious breaches, as evidenced by the ongoing Snowden leaks.<br />
<br />
Expecting anyone to be cognizant of the security implications and necessary precautions of sharing an intimate moment with a partner is to ask them to question their trust for that person and the openness of that moment. Security is the quintessential closing off of oneself from the world, of limiting trust, and exerting control. Intimacy, even at long distance, is about opening oneself up to another person, sharing yourself with them, extending to them a deep trust, and allowing them a spectacular amount of control over yourself, physically and emotionally. To place these two worlds together is cognitive dissonance made manifest.<br />
<br />
And yet, we use these flawed security devices to display and proclaim our love -- and hence intimacy -- with one another. Even as we put <a href="https://en.wikipedia.org/wiki/Love_lock">love locks</a> on bridges, we know that they can be removed. We acknowledge their physical limitations as we try to communicate our emotions to each other.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<span style="font-size: x-small;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEii2kkh_Pd-CJCVrc8LYpopQXKKm4BEd2MGLqNxFPydW-Qa8-InQ8sIIrORJd4HYHJDpqP1RL9uHYmPq1xC-ITJrRnasl24_rrSkfBOGuBXxOvjolJrG66pKFzdALKr0FuoxGqOyxq_eUc5/s1600/640px-Pont_des_Arts_+_canenas.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEii2kkh_Pd-CJCVrc8LYpopQXKKm4BEd2MGLqNxFPydW-Qa8-InQ8sIIrORJd4HYHJDpqP1RL9uHYmPq1xC-ITJrRnasl24_rrSkfBOGuBXxOvjolJrG66pKFzdALKr0FuoxGqOyxq_eUc5/s1600/640px-Pont_des_Arts_+_canenas.jpg" height="213" width="320" /></a></span></div>
<div style="text-align: center;">
<span style="font-size: x-small;">"<a href="https://commons.wikimedia.org/wiki/File:Pont_des_Arts_%2B_canenas.jpg#mediaviewer/File:Pont_des_Arts_%2B_canenas.jpg">Pont des Arts + canenas</a>" by <a href="https://commons.wikimedia.org/wiki/User:Inocybe" title="User:Inocybe">Inocybe/Piero d'Houin</a> - <span class="int-own-work">Own work</span>. Licensed under <a href="http://creativecommons.org/licenses/by-sa/3.0" title="Creative Commons Attribution-Share Alike 3.0">CC BY-SA 3.0</a> via <a href="https://commons.wikimedia.org/wiki/">Wikimedia Commons</a>.</span></div>
<div style="text-align: left;">
<br />
We are accepting of the persistent and somewhat insurmountable failings of physical security, and we do not blame the victim when their physical security is breached. It is also the case that physical and digital security are in many senses a good analogy of one another, but that we apply different standards. We need to start realising that digital security is also imperfect, and further, that it is not the victims' fault when that security fails.</div>
Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-61663728401712414462014-08-29T10:10:00.001+01:002014-08-29T10:50:43.159+01:00Biometric Authentication<h2>
Overview</h2>
Given that the BBC has recently published <a href="http://www.bbc.co.uk/news/technology-28891938">an article</a> promoting biometrics as the password-replacement technology, I'd like to beat this dead horse as to why biometrics are such a bad idea.<br />
<br />
<h2>
Duress</h2>
Many very secure systems which require a PIN, password or passphrase (henceforth, just a "secret") often have multiple secrets.<br />
<br />
These exist for the end user, should they ever be under duress, i.e. coercion, threats or intimidation. When entered, they give the adversaries what they want, but also alert the system that something is very wrong.<br />
<br />
There's no such thing as a "duress iris scan."<br />
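A duress secret is trivial to support with passwords or PINs. The sketch below is purely illustrative (the function, names and PINs are all invented here), using constant-time comparison for both checks:

```python
import hmac

# Hypothetical stored secrets, purely for illustration.
NORMAL_PIN = "4921"
DURESS_PIN = "4922"  # looks valid to an onlooker, silently raises the alarm

def check_pin(entered: str) -> tuple:
    """Return (access_granted, silent_alarm)."""
    # compare_digest avoids leaking which secret matched via timing.
    if hmac.compare_digest(entered, NORMAL_PIN):
        return (True, False)
    if hmac.compare_digest(entered, DURESS_PIN):
        return (True, True)   # let them in, but alert the system
    return (False, False)

print(check_pin("4921"))  # (True, False)
print(check_pin("4922"))  # (True, True)
print(check_pin("0000"))  # (False, False)
```

There is no analogous second secret for an iris: you only have the irises you have.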
<br />
<h2>
Recovery</h2>
We have methods of recovering fingerprints from objects -- part of our forensic system is based on exactly this -- and recovered prints can also be duplicated.<br />
<br />
Unfortunately for users of this system, when your adversaries do that, how exactly are you going to change your fingerprints to circumvent the issue?<br />
<br />
<h2>
Resistance</h2>
Take, for example, a DNA-based biometric. You often leave your DNA in places you go, and on objects you touch. It's often found on things you use to eat or drink, like drinks cans.<br />
<br />
That means that someone rummaging through your bins will likely get the key to impersonating you, with a 99.9% chance of success and a ~0% chance of detection. Good job we had something super secure!<br />
<br />
<h2>
Discriminatory</h2>
Biometrics are mostly based on you having certain body parts, with few exceptions.<br />
<br />
Lost your hands in an industrial accident? Sorry, you can't vote. Born without eyes? Not allowed a bank account. Mute? No collecting your pension from the post office.<br />
<br />
<h2>
Perverse Incentives</h2>
Fingerprints, retina scans and iris scans present the adversary with some pretty perverse incentives.<br />
<br />
Imagine that you are targeted for attack. Previously, with passwords, someone would research you and send a well-crafted email asking for your password (or other secret), or directing you to a website that would drop some malware on your machine. This would probably be the most effective route for an adversary, and is often called "spear phishing".<br />
<br />
Not nice, but nothing is physically threatening you.<br />
<br />
Now, many adversaries will think: "We need their fingerprints, fingerprints are kept on fingers, we need their fingers!" and come to visit you with some bolt cutters. How is this any better than someone deceiving you, where, when you fall for it and notice (for instance) bank funds missing, you can simply change your secrets?<br />
<br />
<h2>
Conclusion</h2>
Secure authentication systems don't come from verifying your identity to some ridiculously high degree of confidence. Secure systems in general accept that there will be failures -- passwords forgotten, sessions left open on public terminals, etc. -- and have systems in place to resist and recover from these scenarios.<br />
<br />
Biometric-based authentication systems take the verification step to its absolute maximum, but they provide none of the other extremely important features of other authentication systems.<br />
<br />
In short, <b>do not use biometrics for authentication.</b>Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0tag:blogger.com,1999:blog-245662015774143454.post-44726051206740930992014-07-10T21:01:00.000+01:002014-07-10T21:01:29.428+01:00Sortition Governence<h2>
Overview</h2>
Recently, I've been thinking about how our democracy works here in the UK. This has mainly been spurred on by a couple of things, but for this blog post, the main impetus was the strike action this morning.<br />
<br />
<h2>
Strike Action </h2>
Many public sector workers have gone on strike over pay, pensions and conditions. A discussion on BBC Radio 4 this morning, between a government official and an official from Unison, noted that many of the strike ballots had a turnout of only around 20%. This was claimed to be, at best, a failure of democracy and, at worst, a direct abuse of the system.<br />
<br />
When compared to the Conservative's PCC Election farce in 2012, with a <a href="https://en.wikipedia.org/wiki/England_and_Wales_police_and_crime_commissioner_elections,_2012#Turnout">turnout</a> of 15.1%, attacking a 20% turnout as not being legitimate or representative is pure hypocrisy.<br />
<br />
<h2>
Concentration of Power</h2>
Democracy is, in essence, a codified method of concentrating the power of the many in the hands of a trusted few with the aim of producing legislation that benefits the many.<br />
<br />
Many distributed systems suffer and become vulnerable when power is concentrated too strongly. In the case of Bitcoin, this concentration of power is often seen in large mining pools, leading to such issues as the 51% attack.<br />
<br />
In a democracy, this concentration of power is wielded by people -- fallible humans. People who can be swayed with bribery, extortion or simple cronyism. This leads to sub-par legislation, and outright exploitation of the vulnerable members of society.<br />
<br />
<h2>
Sortition</h2>
Sortition is a method of governance whereby the legislators are picked
from a pool of eligible people at random. This means that no matter how
wealthy you are, who your friends are, or what you do for a living, you
may be selected to serve, thus distributing power to the populace much more effectively, and removing power from a wealthy few.<br />
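Mechanically, sortition is just unbiased random sampling from the electoral roll. A toy sketch (the names are invented, and `SystemRandom` is one reasonable choice of unpredictable randomness source):

```python
import random

# Toy electoral roll; in reality this would be the full register.
eligible = [f"citizen-{i:05d}" for i in range(100_000)]

# SystemRandom draws from the OS CSPRNG, so the draw cannot be
# predicted or rigged by someone who merely knows the code.
rng = random.SystemRandom()
committee = rng.sample(eligible, k=650)  # e.g. the size of the Commons

print(len(committee))
print(len(set(committee)))  # sampling without replacement: no repeats
```

In practice the hard problems are the integrity of the roll and the auditability of the draw, not the sampling itself.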
<br />
<h3>
Sortition House of Lords</h3>
In our current system, legislation usually must pass the House of Lords. These are not elected officials, but they have often sent bad legislation back to the Commons to be "fixed".<br />
<br />
In this method, for <i>every</i> piece of legislation the Commons produces, to reach Royal Assent it must pass a vote by a suitably large sample of the population, selected at random. Anyone shirking their duty to vote would be fined or jailed.<br />
<br />
This vote should ensure that any legislation is suitably aligned with the views and needs of the public, and would entail the minimum of disruption to people's lives. However, if the Commons never suggests the legislation that the people require or desire, it can never come to fruition.<br />
<br />
It also does not provide fantastic protection against bribery.<br />
<br />
<h3>
Sortition Commons</h3>
A sortition Commons would apply this directly to the legislature: rather than electing MPs, the members of the Commons themselves would be drawn at random from the pool of eligible people.<br />
<br />
To ensure that the committee is representative of the average person, its members could be paid a wage close to what a common person could expect: for example, the median wage of the country, plus 10%. That way, on average, people neither lose out nor gain too much from serving on the committee. It would also enable the legislature to enforce service, threatening fines or jail for not serving, much like jury duty. Where a person felt they could not serve at the time they were selected -- for example, a new mother -- they could <i>defer</i> their service to a time when they were more able to serve.<br />
<br />
To ensure that a wealthy few could not simply prevent unfavourable laws from coming to fruition, this system would do away with the House of Lords and the process of Royal Assent. Any legislation produced by the sortition Commons would become law without any other outside interference.<br />
<br />
This system should ensure that the wealthy don't have too much say in the process (they're only the 1%, after all! They'd commonly make up 1% of the committee, and thus have little say) and that those making the legislation are insulated from the temptation of power. It would not, however, help prevent bribery; additional checks and balances may be required to ensure that no bribery occurred.<br />
<br />
<h2>
Issues</h2>
Sortition has several criticisms commonly levelled at it, many of which revolve around a fear of loss of control.<br />
<br />
<h3>
The Racist Committee</h3>
With a sufficiently large sample of people, having an overwhelming representation of extreme views, such as racism, is <i>highly</i> unlikely.<br />
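To put a number on "highly unlikely": suppose, hypothetically, that 5% of the eligible population holds some extreme view. The chance that a randomly drawn 100-person committee is majority-extremist is a binomial tail probability, computable with the standard library (the 5% and 100 figures are mine, chosen only for illustration):

```python
from math import comb

def binomial_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more
    'successes' in n independent draws with per-draw probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance a 100-person committee drawn from a population that is 5%
# extremist ends up with an extremist majority (51 or more members).
p_majority = binomial_tail(100, 0.05, 51)
print(p_majority < 1e-30)  # True: astronomically unlikely
```

The exact figure shifts with the assumed prevalence and committee size, but the tail shrinks exponentially as the committee grows.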
<br />
To ensure that, in the unlikely event that an extreme law is passed, it can be undone, the public could have a variant of the right of recall for any legislation. It could even be enshrined in a constitution which upholds human rights and the operation of the sortition committee. Any changes to the constitution would have to be put to a referendum.<br />
<br />
<h3>
The Bribery Problem</h3>
Many members of a sortition committee may be susceptible to bribery. A simple solution is to have the members of the sortition committee be completely anonymous until after their term has been served. If you can't find someone, you can't bribe them.<br />
<br />
There could also be the stipulation that their accounts be checked for irregularities and large payments, to ensure that if their anonymity is breached, they do not accept any bribes. If they are found to have accepted bribes, the member of the committee and the person paying the bribe should be subject to hefty fines and possibly jail.<br />
<br />
<h3>
Incompetence</h3>
Many people would worry about the members of the sortition committee not having the relevant skills to produce legislation or to respond adequately to the demands of public service. This could, in part, be addressed by adequate training, which should probably include some statistics, and especially its common misuses.<br />
<br />
Another potential way to deal with this would be to have a large body of experts available to answer the committee's questions and give expert opinions. Much like an expert witness may be called to a trial, a request could be made for (for example) experts in civil engineering; several could be found and asked for their opinions, or to produce independent reports for the committee to consider.<br />
<br />
<h2>
Conclusion</h2>
I'm not overly convinced of this idea. I keep asking those around me about it, and it usually receives mixed reactions, with some offering additions and modifications. Most feel that, in our current system, the idea is untenable; others feel that it's only achievable through incremental changes to the current system.<br />
<br />
I feel that sortition has a reasonable chance of working; however, it needs to prove its mettle on a small scale first. For example, unions, the ACM, the IEEE or NICE could use sortition to select their governing persons from their membership without discrimination. Even The Co-operative Group, UK could potentially use it to select its governance.<br />
<br />
Any issues found while "testing" the idea in the small could be used to refine the idea and re-test it. Dan Turnerhttp://www.blogger.com/profile/06837452458119663813noreply@blogger.com0