tedu@openbsd.org
I noticed something a bit funny about the exploit mitigation material. On the one hand, it's very technical: how exploits work and how they are mitigated. On the other hand, the people most interested in exploit mitigation are more likely to be users, stuck running software they don't trust. I wanted to look at things from a different viewpoint. How can we help the good guys?
This is a talk about developing software. My examples are all going to come from OpenBSD, but that doesn't mean I'm only talking to OpenBSD developers. Actually my target audience is everyone who develops software that does, or may, run on OpenBSD. In this case, OpenBSD is the hostile environment. If you're only now discovering that the internet is a hostile environment, you're about 20 years late to the party.
What makes OpenBSD a hostile environment? It doesn't always conform to expectations and it certainly doesn't condone many mistakes. Developers talk a lot about standards, the C standard, the POSIX standard, but there are also real-world de facto standards and assumptions. Let's challenge some of those assumptions and push the boundaries of the standard. A strictly conforming program will continue to run just as it should, but a program that takes shortcuts will quickly find itself in trouble.
Everybody loves secure software, but as we've maintained for some time, secure software is simply a subset of correct software.
Whenever we've added new exploit mitigations to OpenBSD, something always seems to stop working. Always. This makes the development of such features very exciting, of course. Did I break it, or was it always broken?
From a high level, my philosophy is that instability today leads to stability tomorrow. The sooner we can break it, the sooner we can fix it. Everybody will tell you the same thing, that it's better to find bugs in development than in production. True of course, but that doesn't mean you won't have production bugs. Earlier is better there too. There is an unfortunate mindset that once something ships, we play it safe and become very conservative. But this only delays the inevitable, and it makes me very uncomfortable. Debugging a live system running code from two years ago is much more difficult than debugging a live system running code from last month. You didn't avoid the bug; you've only made it harder to deal with.
I'm guessing that malloc.conf is the most popular, familiar feature I'm going to talk about. It's a good place to start, with the user visible interface to malloc, and then we can move along from there to how allocators work.
All BSDs support the malloc.conf feature, which made its appearance in phkmalloc. It was subsequently retained in jemalloc and ottomalloc, although with some divergence in the available options. The internal behavior of malloc can be controlled by creating a symlink named /etc/malloc.conf; the letters in the symlink's target enable or disable various options.
The J for junk option is the one I'd like to focus on. When enabled, this option prefills allocated memory with a non-zero pattern and overwrites freed memory as well. This catches two different classes of bugs.
First, many programs fail to completely initialize heap objects. Many malloc implementations, at least for a while after program startup, will return zero-filled memory for allocations because that's what malloc gets from the kernel. Much like an uninitialized stack variable, you'll probably get lucky until you don't. At some point, malloc will switch to returning previously used memory, which won't be zero, and the bug manifests. Better to catch it on the first allocation.
Second, many use after free bugs rely on the memory remaining unchanged for some time after free. Until the memory is recycled, the program can perhaps continue using it without consequence. By immediately overwriting the memory, we can trigger erroneous behavior.
There's no guarantee that junking memory will flush out a bug. However, we can hope that the junked memory is sufficiently atypical that it causes observable deviations.
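To make the two failure modes concrete, here's a minimal, hypothetical sketch; the struct and field names are invented, but both bugs are the kind that junked memory tends to expose immediately.

	#include <err.h>
	#include <stdio.h>
	#include <stdlib.h>

	struct conn {
		int	 flags;
		char	*name;
	};

	int
	main(void)
	{
		struct conn *c = malloc(sizeof(*c));

		if (c == NULL)
			err(1, NULL);
		c->flags = 1;
		/*
		 * Bug 1: c->name was never initialized. With zero-filled
		 * fresh pages this check quietly passes as NULL; with junked
		 * memory the garbage pointer shows up (or faults) on the
		 * very first allocation.
		 */
		if (c->name != NULL)
			printf("name: %s\n", c->name);

		free(c);
		/*
		 * Bug 2: use after free. If freed memory is left untouched,
		 * this may keep "working"; junking on free changes the value
		 * out from under us right away.
		 */
		printf("flags after free: %d\n", c->flags);
		return 0;
	}

On OpenBSD, enabling J system wide is a one liner, ln -s J /etc/malloc.conf, which is how a lot of the bugs described here get flushed out.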
The man page for malloc.conf unfortunately downplays how significant and helpful it can be. It's not just for testing or debugging, and despite whatever warnings you may find in the man page, running with J in production isn't such a bad idea. If the program behaves, it won't hurt. There may be a performance hit, true, but it may be acceptable. If it does hurt, you probably don't want to be running that program in production, regardless of what malloc options are in use.
A short while ago, I changed the default on OpenBSD to always junk small memory after free. We didn't necessarily want to impose a penalty on all code, particularly considering that some memory may still be unmapped zero on demand pages, so we restricted it to small chunks. And use after free bugs are probably more dangerous and more pervasive than uninitialized memory, so we focused the effort there.
Lots of OpenBSD users do run with malloc options enabled, so many of the bugs it catches have already been caught, but occasionally some slip through. When we started junking memory by default after free, the postgres ruby gem stopped working. You can always find more bugs by conscripting more testers.
There is (or was) a related option, Z for zero. It basically does the opposite, and always zeros newly malloced memory. jemalloc still has this option, but I removed it from OpenBSD because it seems like a crutch for poorly written applications. I don't want to help bad programs run; I want to stop them from running.
Some other options that may be interesting are the F and U options, also designed to help track down use after free bugs. By unmapping the freelist whenever possible, it can trigger segfaults when memory is accessed.
The G option enables guard pages. Unlike the others, I don't think this option is necessarily a good trade off. By default, the kernel will return randomized and well separated pages with unmapped regions between them. The G option enforces this behavior from the userland side, but it adds several system calls to some important code paths and is mostly redundant. If you're looking to conserve performance, skip this one.
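As a rough illustration of what a guard page buys you (and the extra system calls it costs), here's a hypothetical sketch using mmap and mprotect directly; it shows the concept, not how the G option is actually implemented.

	#include <sys/mman.h>
	#include <stddef.h>
	#include <unistd.h>

	void *
	alloc_with_guard(size_t npages)
	{
		long pagesize = sysconf(_SC_PAGESIZE);
		char *p;

		p = mmap(NULL, (npages + 1) * pagesize, PROT_READ | PROT_WRITE,
		    MAP_ANON | MAP_PRIVATE, -1, 0);
		if (p == MAP_FAILED)
			return NULL;
		/* make the last page inaccessible; running off the end faults */
		if (mprotect(p + npages * pagesize, pagesize, PROT_NONE) == -1) {
			munmap(p, (npages + 1) * pagesize);
			return NULL;
		}
		return p;
	}

Every allocation that goes through this path costs an mmap and an mprotect, which is exactly the overhead the paragraph above is warning about.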
Poisoning an object can be as simple as overwriting the memory with a simple pattern, but it can also be considerably more complex. The fixed pattern can be selected (crafted) for maximum invalidity. For example, we might want to overwrite pointers with a value that we know is not mapped. Find a hole in the address space and use that as the fill value. Use a few different patterns. Bugs have a tendency to adapt to whatever fill patterns you use.
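A sketch of that idea, with a made-up 64-bit constant rather than any value a real allocator uses: pick a fill word that looks like a pointer into an unmapped hole, so dereferencing it faults instead of landing somewhere plausible.

	#include <stddef.h>
	#include <stdint.h>

	/*
	 * Hypothetical poison value: a pointer-sized pattern chosen to land
	 * in an unmapped hole in the address space. The constant is just an
	 * example; pick something known to be unmapped on your platform.
	 */
	#define POISON_PTR	((uintptr_t)0xdeafbeefdeadbeefULL)

	static void
	poison_mem(void *p, size_t len)
	{
		uintptr_t *q = p;
		size_t i;

		/* overwrite the freed object one pointer-sized word at a time */
		for (i = 0; i < len / sizeof(*q); i++)
			q[i] = POISON_PTR;
	}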
Theo and I had been discussing the possibility that the poison values in use might be conveniently aligned with harmless flag values. As an experiment, I inverted the bit patterns used. It wasn't long before the smoke started pouring out. In the OpenBSD kernel, we alternate between two values. Unfortunately, the two values happen to be quite similar, and this allowed some bugs to escape. The function to establish interrupts on i386 failed to initialize a flags argument. This was mostly harmless because the default fill value from malloc didn't set any interesting flags. When the bit values were inverted, the MP safe flag was set. Marking an interrupt handler as MP safe when it is not quickly leads to trouble.
Another thing one can do, and the kernel does this even though userland malloc does not, is to check that the poison values are still in place. This can detect some use after free writes. When code frees an object, it's immediately poisoned. If buggy code later changes the object, that will erase some of the poison. When the allocator decides to recycle the object, it checks that all the poison is in place. If not, panic.
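The check on the allocation side might look something like this, assuming a fill routine like the poison_mem() sketch above ran at free time; in the kernel the response would be a panic rather than an abort.

	#include <stdint.h>
	#include <stdlib.h>

	#define POISON_PTR	((uintptr_t)0xdeafbeefdeadbeefULL)	/* same example value as above */

	static void
	poison_check(const void *p, size_t len)
	{
		const uintptr_t *q = p;
		size_t i;

		/* any word that changed means something wrote to freed memory */
		for (i = 0; i < len / sizeof(*q); i++)
			if (q[i] != POISON_PTR)
				abort();
	}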
One common policy is fast recycle. Last in, first out. The most recent object freed is the next object allocated. This is great for performance because the object is probably already in cache. Unfortunately, it's not so great for detecting bugs. When the object is recycled, it will be reinitialized. All the poison is washed off. Any dangling pointers to the old freed object will instead see the new object. But since the new object is a perfectly legitimate object, the buggy code will continue running. Usually until something goes really wrong. From a security standpoint, this is also troublesome because it's predictable. If the attacker can control the contents of either the new or old object, they have some control over the other as well. That's always the case, but fast recycle makes it easy to control the interleavings of malloc and free as well, to guarantee that old and new objects overlap.
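For concreteness, here's a toy last-in, first-out free list; this is the generic pattern, not any particular allocator's code.

	#include <stddef.h>

	struct freeobj {
		struct freeobj *next;
	};

	static struct freeobj *freelist;

	static void
	fast_free(void *p)
	{
		struct freeobj *f = p;

		/* push on top of the list */
		f->next = freelist;
		freelist = f;
	}

	static void *
	fast_alloc(void)
	{
		struct freeobj *f = freelist;

		/* pop the most recently freed object, still warm in cache */
		if (f != NULL)
			freelist = f->next;
		return f;
	}

The object handed back by fast_alloc is the one that was just freed, so a dangling pointer to the old object now points at a perfectly legitimate new one, and the bug sails on unnoticed.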
The opposite policy would be slow recycle. First in, first out. Or last in, last out. Or LRU. The buffer cache uses slow recycle. This is great for detecting bugs. The longer an object remains on the free list, the more opportunities you have to check that the poison is intact.
There's also indeterminate recycling, although that may be a poor name. What I mean is, it's not immediately obvious what policy is in effect. For instance, malloc and pool both operate on a current working page. Freed objects will be returned to whatever page they came from, but allocated objects always come from the current page. So for some objects, not on the current page, this is slow recycling. But for some objects, it's fast recycling. For example, even though userland malloc selects a chunk at random, if you allocate the last object in a page, then free it, then malloc again, you'll get the same object. We addressed this in both pool and malloc by occasionally reselecting a random current page.
Random recycling. First, you can deliberately try to recycle objects randomly. In OpenBSD, we do this by keeping a stash of recently freed objects. Whenever something is freed, a randomly selected index is used. The recently freed something goes into the stash, and the older something that was freed comes out and actually goes onto the free list. This was added as a security feature, but it's also great at mixing things up in everyday programs as well.
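Here's a simplified sketch of the stash idea, assuming OpenBSD's arc4random_uniform() for the random index; the real malloc does considerably more bookkeeping than this.

	#include <stdlib.h>

	#define STASH_SIZE	16	/* arbitrary size, for illustration only */

	static void *stash[STASH_SIZE];

	static void
	delayed_free(void *p)
	{
		/* swap the incoming pointer into a random slot... */
		int i = arc4random_uniform(STASH_SIZE);
		void *old = stash[i];

		stash[i] = p;
		/* ...and actually free whatever was evicted from that slot */
		if (old != NULL)
			free(old);
	}

The interval between calling free and the memory actually becoming reusable is now unpredictable, which is the point.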
Whenever a one byte overflow is found, people tend to have this reaction that it's harmless. That gets revised to mostly harmless. Then possibly harmless. Then maybe not so harmless.
We're not doing anything that Electric Fence or Valgrind can't do, but we do it all the time. No matter how robust your test coverage is, it's never going to cover everything that happens in the real world.
The reverse is not always true, and unfortunately I think this affects OpenBSD's reputation negatively. "Hey, this program crashes when I run it on OpenBSD. OpenBSD sucks." I beg to differ. More likely that it's the program that sucks. Just because a program doesn't always crash doesn't mean it can't be induced to crash.
If you're developing a library, don't fight the operating system. If you're developing an application, be aware of what implicit behaviors the libraries you use may have, and how this may mask bugs. Fast recycling is pretty common in custom allocators. It hides lots of bugs. Another OpenBSD developer told me that they patched the APR (Apache Portable Runtime) library to simply use the system malloc instead of custom pools. subversion stopped working.
Another great bug. I'm going to go light on the details, but the gist of how hibernation and resume work is that the hibernating kernel writes its memory to swap, then the resuming kernel reads that memory back in, overwriting itself. This obviously requires that the two kernels be identical. There was a bug that suddenly appeared where the stack protector was being triggered in resume. Since the introduction of stack protection for the kernel, it used a fixed cookie. Where would it get randomness from? That was recently changed. The bootloader now fills in the random data segment of the kernel. What happened with resume is that the currently running kernel had a different cookie than the saved kernel. As the saved kernel was restored, the cookie value was replaced, but the stack value wasn't updated. The real bug was that the resume code should have switched to a different stack, but continued running with the wrong stack for longer than it should have. Conditions changed, assumptions were challenged, a bug was found.
That's a good idea. In general, we're limited somewhat by the state of programs. We can't break too much at once. Netflix has a program, Chaos Monkey, that goes around killing processes to ensure that their redundancy is working. I think that's pushing it a little too far. Adding options to aid in debugging is great, but it's moving a little farther from the mission of a general purpose OS.