Saturday, 13 December 2014

Securing Strings

This is not about String as in Java, or std::string in C++. This is about the strings program, part of the GNU development tools (GNU Binutils).

The strings program takes a file and prints out the printable strings.
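To make that concrete, here is a minimal sketch of the core idea in Python. It is only an approximation, not how GNU strings is implemented: the function name printable_strings is made up for illustration, it handles plain ASCII only, and the default minimum run length of 4 mirrors the usual default of the real tool.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Yield runs of at least `min_len` printable ASCII bytes found in `data`."""
    # Printable here means space through tilde, plus tab (an assumption
    # made for this sketch; the real tool has more options and encodings).
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


if __name__ == "__main__":
    sample = b"\x00\x01hello\x00ab\x00world!\xff"
    for text in printable_strings(sample):
        print(text)
```

The point of the post is that even something this simple becomes risky in C once it starts parsing object file formats to decide which sections to scan.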

Recently, Michal Zalewski (aka lcamtuf) used his American Fuzzy Lop (afl) tool to fuzz a variety of GNU tools, one of which was the strings program. The upshot was that it's a very bad idea to run strings on untrusted input.

I should make it clear, I don't think that the author of strings should have taken any of the measures below when it was first written. Most software starts off as a personal prototype or tool and grows. It's silly to start demanding the most rigorous secure software development methodology from one author and their pet project.

Once the pet project escapes and other people start relying on it, the dynamic obviously changes. More questions get asked about who is responsible for the correctness of the program, or whether anyone is, since most software comes with a disclaimer of warranty.

I will not cover program verification tools. They're often just overkill for most problems, and I think this may well be one of them.

Anyway, on to my main point. How do we go about solving this problem once and for all?

Audit ALL the things!

This is OpenBSD's primary approach. All commits must be reviewed, and almost all of the code base was reviewed somewhere around version 2 or 3, when Theo de Raadt's own OpenBSD machine was compromised (Shock, horror!).

This is a time-consuming process, and there are no guarantees. If one person misses a subtle bug, what's the chance that the next person misses the bug? I'd wager that the chance is "high," but I'm mostly speculating.

I'm actually very pro-code review/audit. I like that TrueCrypt (now VeraCrypt) is getting audited, and at my work, I'm pushing hard for code review before any code goes live. I'm also aware that it is by no means a perfect tool, and it does slow down the process to go from design to deployment.

Use a Memory Safe Language

We could use (for example) Java, or even Ada (with the right GNAT flags) to re-write the strings tool and completely avoid memory safety vulnerabilities.

I like this idea for new projects; I would never suggest starting a new project in C, Objective-C or C++ because they all inherit C's badness with memory.

But... Java requires a runtime (the JVM) and its startup time is non-trivial. Most people don't know Ada, and Haskell has even fewer engineers to its name.

Further, for Java especially, you're relying on the security of the underlying runtime, which hasn't had a great track record.

I'd argue that Ada is the best choice out of the lot, but I'm biased. I really like Ada.

Obviously, re-writing extant programs in an entirely new language is not the smartest idea, unless there's a really good reason. It's time-consuming, and you're likely to re-introduce bugs that you coded out in the original.

Apply Good Security Principles

What I actually mean by this is that you should ensure that you apply the principle of least privilege. That means restricting exactly what the program can do, so that if compromised, the program can't do much more harm, even if the attacker manages to gain complete control over the program.

On Linux, this can be achieved to a very fine-grained level with the seccomp system call, and on FreeBSD, there is the Capsicum subsystem.

What these allow you to do is enter a "secure" mode, whereby all but a few specified system calls are completely disabled. Under seccomp, an application attempting a banned system call is typically killed by the kernel; under Capsicum, the call fails with an error instead. You can usually carry already-open file descriptors into the secure mode.

For strings, you'd read the command line, open your target file read-only (check it exists, is readable, etc.) and then enter secure mode, whereby you can only read the single opened file (and write the results to standard output). Should an RCE be found, the adversary would be able to read as much of the open file as they like, but they would be contained within that single process. They could not open a new shell (that would involve a banned system call), and they could not open sensitive files (/etc/passwd, browser caches, etc.), since that would involve creating a new file descriptor, which is banned. They couldn't open a network socket and send any data to a C&C host either, as they would be banned from creating sockets.

The only way out would be to find a privilege escalation exploit in the kernel using the system calls that aren't immediately filtered.
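The open-first, lock-down-second ordering described above can be sketched in a few lines. To be clear about the assumptions: this is Python, and it is not seccomp or Capsicum. As a much weaker stand-in it sets the POSIX RLIMIT_NOFILE resource limit to zero inside a forked child, which only stops the process from creating new file descriptors; it does nothing about other system calls. The function name read_in_sandbox is made up for illustration.

```python
import os
import resource


def read_in_sandbox(path, max_bytes=4096):
    """Open `path`, give up the ability to create descriptors, then read.

    Returns (data, escape_blocked): the bytes read from the already-open
    file, and whether a later attempt to open another file was refused.
    """
    result_r, result_w = os.pipe()           # set up before locking down
    pid = os.fork()
    if pid == 0:                              # child: the restricted worker
        status = 1
        try:
            os.close(result_r)
            fd = os.open(path, os.O_RDONLY)   # the privileged step comes first
            # From here on, anything needing a new descriptor fails.
            resource.setrlimit(resource.RLIMIT_NOFILE, (0, 0))
            data = os.read(fd, max_bytes)     # the existing descriptor still works
            try:
                os.open(os.devnull, os.O_RDONLY)
                blocked = b"0"
            except OSError:
                blocked = b"1"
            os.write(result_w, blocked + data)
            status = 0
        finally:
            os._exit(status)                  # never return into the parent's code
    # Parent: collect the child's report and reap it.
    os.close(result_w)
    chunks = []
    while True:
        chunk = os.read(result_r, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(result_r)
    os.waitpid(pid, 0)
    payload = b"".join(chunks)
    return payload[1:], payload[:1] == b"1"
```

A real jail would use seccomp or Capsicum at the same point in the program where this sketch calls setrlimit; the shape of the code (parse arguments, open the file, then drop privileges, then touch untrusted data) stays the same.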

I actually like this idea best, since it can easily be combined with code review. You aim to reduce the number of security-relevant bugs using code review (and testing, but you're unlikely to cover a large amount of the state space). Any that slip through the net become safe crashes, not full compromises.

First, you implement the minimal "jail" (not like chroot jails or FreeBSD jails, but seccomp or Capsicum per-process jails), and have the implementation use it. You then get your colleagues to review the jail implementation in your program.

Saturday, 6 December 2014

A Simple Mail Merge Application

My other half has just finished writing their first full-length novel. As such, they'd like to send it off to agents.

My first thought was OpenOffice Base. Have them enter the agent details into a table, and use OpenOffice Writer's mail merge facility. This, however, did not work, since Writer's mail merge facility lacked the all-important attachments functionality.

If OpenOffice had this functionality, we'd have been up and running in about 15 minutes. I'm surprised it doesn't; it's the ultimate way to automate the job search, and surely an unemployed programmer would've provided the functionality at some point...

But, leaving that by the wayside, I wasn't about to give up on OpenOffice just yet. I know that OpenOffice Base's files are just cunningly zipped HSQL databases, with some metadata surrounding them.

So, I thought I'd unzip the OpenOffice Base file and have a small Java application read the HSQL database, put the results through the Velocity template engine and send off the email.
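The templating step of that plan looks roughly like this. This is a sketch in Python rather than Java, purely for brevity, and it uses the standard library's string.Template as a stand-in for Velocity. The field names (name, email), the letter text and the function name render_letters are all made up for illustration.

```python
from string import Template

# Hypothetical cover letter; $name and $title are filled in per agent.
COVER_LETTER = Template(
    "Dear $name,\n\n"
    "Please find attached the opening chapters of $title.\n"
)


def render_letters(agents, title):
    """Return one (email_address, body) pair per agent record."""
    return [
        (agent["email"], COVER_LETTER.substitute(name=agent["name"], title=title))
        for agent in agents
    ]


if __name__ == "__main__":
    agents = [{"name": "Ann Example", "email": "ann@example.com"}]
    for address, body in render_letters(agents, "My Novel"):
        print(address)
        print(body)
```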

This would've involved sneaker-netting the ODB file back and forth between my machine and my partner's, but that seemed OK. They'd enter many agents in during the day, and I'd "send" them all overnight. No biggie.

This was also a bust. Once my Java application with its almighty HSQL JDBC jar had touched the database, it seemed to taint it. I think it bumped a version field in the database. This meant that OpenOffice Base refused to open it after even one round with my Java program.

So, plan C. SQLite is an amazing embedded database. Far faster and nicer than HSQL -- it even comes with a neat little command line interface.

I set up some test data in an SQLite database and pointed my Java program at it. Success!
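For a flavour of how little ceremony that involves, here is a sketch using Python's built-in sqlite3 module instead of Java and JDBC. The agents table, its columns and the sample rows are assumptions for illustration, not my actual schema.

```python
import sqlite3


def make_test_db():
    """Create an in-memory database holding a couple of made-up agents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE agents (name TEXT NOT NULL, email TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO agents (name, email) VALUES (?, ?)",
        [("Bob Sample", "bob@example.com"), ("Ann Example", "ann@example.com")],
    )
    conn.commit()
    return conn


def load_agents(conn):
    """Return every agent as a plain dict, sorted by name."""
    conn.row_factory = sqlite3.Row   # lets us address columns by name
    rows = conn.execute("SELECT name, email FROM agents ORDER BY name")
    return [dict(row) for row in rows]


if __name__ == "__main__":
    for agent in load_agents(make_test_db()):
        print(agent["name"], agent["email"])
```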

So then I told OpenOffice Base to look at the file, so my partner could enter some data. Failure! OpenOffice Base had an issue with (I think) the scrolling modes available. So that was right out; it wouldn't even show the data in the OpenOffice interface. Sad times.

Plan D. Remember, you've always got to have at least 3 fall-back plans when developing software, otherwise nothing will ever work.

PostgreSQL to the rescue! I set up Postgres on my machine and opened a port for it. On my partner's machine, I tested that they could connect their OpenOffice Base to my Postgres. I then tested dropping some test data in the Postgres database and trying my Java program, configured to use Postgres... Success!!

Now all I need to do is figure out how to send MIME multi-part HTML emails correctly ... ugh.
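For reference, the structure being aimed for is a multipart/mixed message containing a multipart/alternative part (plain text plus HTML) followed by the attachments. Here is a sketch of that layout using Python's standard email package rather than Java; the function name build_message, the addresses and the file name are placeholders, and nothing is actually sent.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, text_body, html_body,
                  attachment, filename):
    """Build a text + HTML email carrying one PDF attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text_body)                       # text/plain fallback
    msg.add_alternative(html_body, subtype="html")   # now multipart/alternative
    msg.add_attachment(attachment, maintype="application",
                       subtype="pdf", filename=filename)  # now multipart/mixed
    return msg


if __name__ == "__main__":
    message = build_message(
        "writer@example.com", "ann@example.com", "Submission: My Novel",
        "Dear Ann,\n\nPlease find my chapters attached.\n",
        "<p>Dear Ann,</p><p>Please find my chapters attached.</p>",
        b"%PDF-1.4 placeholder bytes", "chapters.pdf",
    )
    print(message.get_content_type())
```

Handing the finished message to an SMTP server would be a separate step (in Python, smtplib's send_message), which is left out here.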

Anyway, this has been a day or so's worth of work on my part. I suspect I'll need another half-day to get HTML emails working correctly, and then I'll be sorted. Hopefully it'll enable my partner to effectively reach a whole host of agents without writing out the same damn cover letter and attaching various PDFs, DOC files, etc. over and over.

Once this is all wrapped up, I may open source it. I'll need to tidy the code, add tests and documentation, but it may be of use to someone.

The moral of the story is that HSQL is a difficult database to work with, SQLite is always awesome but some things don't support it, and PostgreSQL is the best RDBMS since sliced bread.