{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Practical Steps with FuzzManager\n", "\n", "In this chapter we will perform basic steps around FuzzManager, including crash submission and triage as well as coverage measurement tasks." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Server Setup\n", "\n", "This Docker image already has a FuzzManager Demo Server running in the background for you. It will be used during the following exercises. If you are prompted to login at any point in time, use the following credentials:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "
**Username: demo**
\n", "
**Password: demo**
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "At the end of the chapter, you will find a link to setup instructions, in case you would like to run your own instance for future fuzzing work." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import fuzzingbook_utils" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import Fuzzer" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Basic Crash Processing\n", "\n", "To get started with basic steps in crash processing, let's take a look at *simply-buggy*, an example repository containing trivial C++ programs for illustration purposes." ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!git clone https://github.com/choller/simply-buggy\n", "!(cd simply-buggy && make)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The make command compiles our target program, including our first target, the *simple-crash* example. Alongside the program, there is also a configuration file generated. Let's take a look at that file and the source code:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!cat simply-buggy/simple-crash.cpp" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!cat simply-buggy/simple-crash.fuzzmanagerconf" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "As you can see, the source code is fairly simple: A forced crash by writing to a (near)-NULL pointer. The configuration file generated for the the binary also contains some straightforward information, like the revision of the program and other metadata that is required or at least useful later on when submitting things." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Running the program shows us a crash trace as expected:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!simply-buggy/simple-crash" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Now, what we would actually like to do is to run this binary from Python instead, detect that it crashed, collect the trace and submit it to the server. Let's start with a simple script that would just run the program we give it and detect the presence of the ASan trace:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "import subprocess" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "cmd = [\"simply-buggy/simple-crash\"]" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "result = subprocess.run(cmd, stderr=subprocess.PIPE)\n", "stderr = result.stderr.decode().splitlines()\n", "crashed = False\n", "\n", "for line in stderr:\n", " if \"ERROR: AddressSanitizer\" in line:\n", " crashed = True\n", " break\n", "\n", "if crashed:\n", " print(\"Yay, we crashed!\")\n", "else:\n", " print(\"Move along, nothing to see...\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Nice, we can now run the binary and detect that it crashed. But how do we send this information to the server now? Let's make a few modifications..." ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "import subprocess" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "from Collector.Collector import Collector" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "from FTB.ProgramConfiguration import ProgramConfiguration" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "from FTB.Signatures.CrashInfo import CrashInfo" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We instantiate the collector instance; this will be our entry point for talking to the server." ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "collector = Collector()\n", "\n", "cmd = [\"simply-buggy/simple-crash\"]\n", "\n", "result = subprocess.run(cmd, stderr=subprocess.PIPE)\n", "stderr = result.stderr.decode().splitlines()\n", "crashed = False\n", "\n", "for line in stderr:\n", " if \"ERROR: AddressSanitizer\" in line:\n", " crashed = True\n", " break\n", "\n", "if crashed:\n", " print(\"Yay, we crashed, processing...\")\n", " \n", " # This reads the simple-crash.fuzzmanagerconf file\n", " configuration = ProgramConfiguration.fromBinary(cmd[0])\n", " \n", " # This reads and parses our ASan trace into a more generic format,\n", " # returning us a generic \"CrashInfo\" object that we can inspect\n", " # and/or submit to the server.\n", " crashInfo = CrashInfo.fromRawCrashData([], stderr, configuration)\n", " \n", " # Submit the crash\n", " collector.submit(crashInfo)\n", " \n", " print(\"Crash submitted!\")\n", "else:\n", " print(\"Move along, nothing to see...\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We now submitted something to our local FuzzManager demo instance. If you go to http://127.0.0.1:8000/crashmanager/crashes/ you should see your crash. Now click on the crash to inspect the submitted data. Then click the orange *Create* button to create a bucket for this crash. A *crash signature* will be proposed to you for matching this and future crashes of the same type, accept it by clicking *Save*. You will be redirected to the newly created bucket, which shows you the size (how many crashes it holds), it's bug report status (buckets can be linked to bugs in an external bug tracker like Bugzilla) and many other useful information. If you click on the *Signatures* entry in the top menu, you should also see your newly created entry." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Buckets and their signatures are a central concept in FuzzManager. If you receive a lot of crash reports from various sources, bucketing allows you to easily group crashes and filter duplicates. The flexible signature system starts out with an initially proposed fine-grained signature, but it can be adjusted as needed to capture variations of the same bug and make tracking easier. In the next example, we will look at a more complex example that reads data from a file and creates multiple crash signatures." ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!cat simply-buggy/out-of-bounds.cpp" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This program looks way more elaborate compared to the last one, but don't worry, it isn't really doing a whole lot. The code in the `main` function simply reads a file provided on the command line and puts its contents into a buffer that is passed to `validateAndPerformAction`. That function pulls out two bytes of the buffer (`action` and `count`) and considers the rest `data`. Depending on the value of `action`, it then calls either `printFirst` or `printLast`, which prints either the first or the last `count` bytes of `data`. Sounds pointless, and yes, it is. The whole idea of this program is that the security check (that `count` is not larger than the length of `data`) is missing in `validateAndPerformAction` but that the illegal access happens later in either of the two print functions. Hence, we would expect this program to generate at least two (slightly) different crash signatures. Let's try it out with very simple fuzzing based on the last Python script:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "import os\n", "import random\n", "import subprocess\n", "import tempfile\n", "import sys" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "from Collector.Collector import Collector" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "from FTB.ProgramConfiguration import ProgramConfiguration" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "from FTB.Signatures.CrashInfo import CrashInfo" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "# Instantiate the collector instance, this will be our entry point\n", "# for talking to the server." ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "collector = Collector()\n", "\n", "cmd = [\"simply-buggy/out-of-bounds\"]\n", "\n", "crash_count = 0\n", "\n", "for itnum in range(0,100):\n", " rand_len = random.randint(1, 1024)\n", " rand_data = bytearray(os.urandom(rand_len))\n", " \n", " (fd, current_file) = tempfile.mkstemp(prefix=\"fuzztest\")\n", " os.write(fd, rand_data)\n", " os.close(fd)\n", " \n", " current_cmd = []\n", " current_cmd.extend(cmd)\n", " current_cmd.append(current_file)\n", " \n", " result = subprocess.run(current_cmd, stderr=subprocess.PIPE)\n", " stderr = result.stderr.decode().splitlines()\n", " crashed = False\n", "\n", " for line in stderr:\n", " if \"ERROR: AddressSanitizer\" in line:\n", " crashed = True\n", " break\n", "\n", " if crashed:\n", " sys.stdout.write(\"C\")\n", "\n", " # This reads the simple-crash.fuzzmanagerconf file\n", " configuration = ProgramConfiguration.fromBinary(cmd[0])\n", "\n", " # This reads and parses our ASan trace into a more generic format,\n", " # returning us a generic \"CrashInfo\" object that we can inspect\n", " # and/or submit to the server.\n", " crashInfo = CrashInfo.fromRawCrashData([], stderr, configuration)\n", "\n", " # Submit the crash\n", " collector.submit(crashInfo, testCase = current_file)\n", " \n", " crash_count += 1\n", " else:\n", " sys.stdout.write(\".\")\n", " \n", " os.remove(current_file)\n", "\n", "print(\"\")\n", "print(\"Done, submitted %s crashes.\" % crash_count)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If you run this script, you will see it's progress and notice that it produces quite a few crashes. And indeed, if you visit [FuzzManager](http://127.0.0.1:8000/crashmanager/crashes/), you will notice a variety of crashes that have accumulated. Pick the first crash and create a bucket for it, like you did the last time. After saving, you will notice that not all of your crashes went into the bucket. The reason is that our program created several different stacks that are somewhat similar but not exactly identical. This is a common problem when fuzzing real world applications. Fortunately, there is an easy way to deal with this. While on the bucket page, hit the *Optimize* button for the bucket. FuzzManager will then automatically propose you to change your signature. Accept the change by hitting *Edit with Changes* and then *Save*. Repeat these steps until all crashes are part of the bucket. After 3 to 4 iterations, your signature will likely look like this:\n", "\n", "```{\n", " \"symptoms\": [\n", " {\n", " \"type\": \"output\",\n", " \"src\": \"stderr\",\n", " \"value\": \"/ERROR: AddressSanitizer: heap-buffer-overflow/\"\n", " },\n", " {\n", " \"type\": \"stackFrames\",\n", " \"functionNames\": [\n", " \"?\",\n", " \"?\",\n", " \"?\",\n", " \"validateAndPerformAction\",\n", " \"main\",\n", " \"__libc_start_main\",\n", " \"_start\"\n", " ]\n", " },\n", " {\n", " \"type\": \"crashAddress\",\n", " \"address\": \"> 0xFF\"\n", " }\n", " ]\n", "}```\n", "\n", "As you can see in the *stackFrames* symptom, the `validateAndPerformAction` stack frame is still present because it is common in all crashes (in fact, this is where the bug lives). But the lower stack parts have been removed because they vary across the set of submitted crashes.\n", "\n", "The *Optimize* function is designed to automate this process as much as possible: It attempts to broaden the signature by fitting it to untriaged crashes and then checks if the modified signature would touch other existing buckets. This works with the assumption that other buckets are indeed other bugs, i.e. if you had created two buckets from your crashes first, optimizing wouldn't work anymore. Also, if the existing bucket data is sparse and you have a lot of untriaged crashes, the algorithm could propose changes that include crashes of different bugs in the same bucket. There is no way to fully automatically detect and prevent this, hence the process is semi-automated and requires you to review all proposed changes." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Code Coverage\n", "\n", "As we have heard before, measuring code coverage can be beneficial to assess how well a fuzzer is performing on the target code. Holes in code coverage can reveal particularly hard-to-reach locations as well as bugs in the fuzzer itself. Because this is an important part of the overall fuzzing operations, FuzzManager supports visualizing per-fuzzing code coverage of repositories. To illustrate this, we are first going to look at a another simple program, the `maze` example:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!cat simply-buggy/maze.cpp" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "As you can see, all this program does is read some numbers from the command line, compare them to some magical constants and arbitrary criteria, and if everything works out, you reach one of the two secrets in the program. Also, one secret is buggy. Before we start to work on this program, we recompile the programs with coverage support. In order to emit code coverage with either Clang or GCC, programs typically need to be built and linked with special `CFLAGS` like `--coverage`. In our case, the Makefile does this for us:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!(cd simply-buggy && make clean && make coverage)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Also, if we want to use FuzzManager to look at our code, we need to do the initial repository setup (essentially giving the server its own working copy of our GIT repository to pull the source from). Normally, the client and server run on different machines, so this involves checking out the repository on the server and telling it where to find it (and what version control system it uses):" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!git clone https://github.com/choller/simply-buggy $HOME/simply-buggy-server\n", "!python3 $HOME/FuzzManager/server/manage.py setup_repository simply-buggy GITSourceCodeProvider $HOME/simply-buggy-server" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We now assume that we know some of the magic constants (like in practice, we sometimes know some things about the target, but might miss a detail) and we fuzz the program with that:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "import random\n", "import subprocess" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "random.seed(0)\n", "cmd = [\"simply-buggy/maze\"]\n", "\n", "constants = [3735928559, 1111638594]; \n", "\n", "for itnum in range(0,1000):\n", " current_cmd = []\n", " current_cmd.extend(cmd)\n", " \n", " for _ in range(0,4):\n", " if random.randint(0, 9) < 3:\n", " current_cmd.append(str(constants[random.randint(0, len(constants) - 1)]))\n", " else:\n", " current_cmd.append(str(random.randint(-2147483647, 2147483647)))\n", " \n", " result = subprocess.run(current_cmd, stderr=subprocess.PIPE)\n", " stderr = result.stderr.decode().splitlines()\n", " crashed = False\n", " \n", " if stderr and \"secret\" in stderr[0]:\n", " print(stderr[0])\n", "\n", " for line in stderr:\n", " if \"ERROR: AddressSanitizer\" in line:\n", " crashed = True\n", " break\n", "\n", " if crashed:\n", " print(\"Found the bug!\")\n", " break\n", "\n", "print(\"Done!\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "As you can see, with 1000 runs we found secret 1 a few times, but secret 2 (and the crash) are still missing. In order to determine how to improve this, we are now going to look at the coverage data:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "!grcov simply-buggy/ -t coveralls+ --commit-sha $(cd simply-buggy && git rev-parse HEAD) --token NONE -p `pwd`/simply-buggy/ > coverage.json\n", "!python3 -mCovReporter --repository simply-buggy --description \"Test1\" --submit coverage.json" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can now go to http://127.0.0.1:8000/covmanager/ to take a look at our source code and its coverage. You should see one collection with ID 1. Click on the ID to browse the coverage data that you just submitted. You will first see the full list of files in the `simply-buggy` repository, with all but the `maze.cpp` file showing 0% coverage (because we didn't do anything with these binaries since we rebuilt them with coverage support). Now click on `maze.cpp` and inspect the coverage line by line. There are two observations to make:\n", "\n", "1. The if-statement in line 34 is still covered, but the lines following after it are red. This is because our fuzzer misses the constant checked in that statement, so it is fairly obvious that we need to add to our constants list.\n", "\n", "2. From line 26 to line 27 there is a sudden drop in coverage. Both lines are covered, but the counters show that we fail that check in more than 95% of the cases. This explains why we find secret 1 so rarely. If this was a real program, we would now try to figure out how much additional code is behind that branch and adjust probabilities such that we hit it more often, if necessary.\n", "\n", "Of course, the `maze` program is so small that one could see these issues with the bare eye. But in reality with complex programs, it is most of the time not obvious where a fuzzing tool gets stuck. Identifying these cases can greatly help to improve fuzzing results. For the sake of completeness, let's rerun the program now with the missing constant added:" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "import random\n", "import subprocess" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "random.seed(0)\n", "cmd = [\"simply-buggy/maze\"]\n", "\n", "constants = [3735928559, 1111638594, 3405695742]; # Added the missing constant here\n", "\n", "for itnum in range(0,1000):\n", " current_cmd = []\n", " current_cmd.extend(cmd)\n", " \n", " for _ in range(0,4):\n", " if random.randint(0, 9) < 3:\n", " current_cmd.append(str(constants[random.randint(0, len(constants) - 1)]))\n", " else:\n", " current_cmd.append(str(random.randint(-2147483647, 2147483647)))\n", " \n", " result = subprocess.run(current_cmd, stderr=subprocess.PIPE)\n", " stderr = result.stderr.decode().splitlines()\n", " crashed = False\n", " \n", " if stderr:\n", " print(stderr[0])\n", "\n", " for line in stderr:\n", " if \"ERROR: AddressSanitizer\" in line:\n", " crashed = True\n", " break\n", "\n", " if crashed:\n", " print(\"Found the bug!\")\n", " break\n", "\n", "print(\"Done!\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "As expected, we now found secret 2 including our crash." ] } ], "metadata": { "ipub": { "bibliography": "fuzzingbook.bib" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "toc-autonumbering": false }, "nbformat": 4, "nbformat_minor": 2 }