{ "cells": [ { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Getting Coverage\n", "\n", "In the [previous chapter](Fuzzer.ipynb), we introduced _basic fuzzing_ – that is, generating random inputs to test programs. How do we measure the effectiveness of these tests? One way would be to check the number (and seriousness) of bugs found; but if bugs are scarce, we need a _proxy for the likelihood of a test to uncover a bug._ In this chapter, we introduce the concept of *code coverage*, measuring which parts of a program are actually executed during a test run. Measuring such coverage is also crucial for test generators that attempt to cover as much code as possible." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "source": [ "**Prerequisites**\n", "\n", "* You need some understanding of how a program is executed.\n", "* You should have learned about basic fuzzing in the [previous chapter](Fuzzer.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## A CGI Decoder\n", "\n", "We start by introducing a simple Python function that decodes a CGI-encoded string. 
CGI encoding is used in URLs (i.e., Web addresses) to encode characters that would be invalid in a URL, such as blanks and certain punctuation:\n", "\n", "* Blanks are replaced by `'+'`\n", "* Other invalid characters are replaced by '`%xx`', where `xx` is the two-digit hexadecimal equivalent.\n", "\n", "In CGI encoding, the string `\"Hello, world!\"` would thus become `\"Hello%2c+world%21\"` where `2c` and `21` are the hexadecimal equivalents of `','` and `'!'`, respectively.\n", "\n", "The function `cgi_decode()` takes such an encoded string and decodes it back to its original form. Our implementation replicates the code from \\cite{Pezze2008}. (It even includes its bugs – but we won't reveal them at this point.)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def cgi_decode(s):\n", "    \"\"\"Decode the CGI-encoded string `s`:\n", "       * replace \"+\" by \" \"\n", "       * replace \"%xx\" by the character with hex number xx.\n", "       Return the decoded string. Raise `ValueError` for invalid inputs.\"\"\"\n", "\n", "    # Mapping of hex digits to their integer values\n", "    hex_values = {\n", "        '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,\n", "        '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,\n", "        'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,\n", "        'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,\n", "    }\n", "\n", "    t = \"\"\n", "    i = 0\n", "    while i < len(s):\n", "        c = s[i]\n", "        if c == '+':\n", "            t += ' '\n", "        elif c == '%':\n", "            digit_high, digit_low = s[i + 1], s[i + 2]\n", "            i += 2\n", "            if digit_high in hex_values and digit_low in hex_values:\n", "                v = hex_values[digit_high] * 16 + hex_values[digit_low]\n", "                t += chr(v)\n", "            else:\n", "                raise ValueError(\"Invalid encoding\")\n", "        else:\n", "            t += c\n", "        i += 1\n", "    return t\n" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Here is an example of how `cgi_decode()` works:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Hello world'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cgi_decode(\"Hello+world\")" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "If we want to systematically test `cgi_decode()`, how would we proceed?" 
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "source": [ "The testing literature distinguishes two ways of deriving tests: _black-box testing_ and _white-box testing_." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Black-Box Testing\n", "\n", "The idea of *black-box testing* is to derive tests from the _specification_. In the above case, we would thus have to test `cgi_decode()` against its specified and documented features, including\n", "\n", "* testing for correct replacement of `'+'`;\n", "* testing for correct replacement of `\"%xx\"`;\n", "* testing for non-replacement of other characters; and\n", "* testing for recognition of illegal inputs.\n", "\n", "Here are four assertions (tests) that cover these four features. We can see that they all pass:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "assert cgi_decode('+') == ' '\n", "assert cgi_decode('%20') == ' '\n", "assert cgi_decode('abc') == 'abc'\n", "\n", "try:\n", "    cgi_decode('%?a')\n", "    assert False\n", "except ValueError:\n", "    pass" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "The advantage of black-box testing is that it finds errors in the _specified_ behavior. It is independent of any given implementation, and thus allows tests to be created even before the implementation exists. The downside is that _implemented_ behavior typically covers more ground than _specified_ behavior, and thus tests based on specification alone typically do not cover all implementation details." 
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## White-Box Testing\n", "\n", "In contrast to black-box testing, *white-box testing* derives tests from the _implementation_, notably the internal structure. White-Box testing is closely tied to the concept of _covering_ structural features of the code. If a statement in the code is not executed during testing, for instance, this means that an error in this statement cannot be triggered either. White-Box testing thus introduces a number of *coverage criteria* that have to be fulfilled before the test can be said to be sufficient. The most frequently used coverage criteria are\n", "\n", "* *Statement coverage* – each statement in the code must be executed by at least one test input.\n", "* *Branch coverage* – each branch in the code must be taken by at least one test input. (This translates to each `if` and `while` decision once being true, and once being false.)\n", "\n", "Besides these, there are far more coverage criteria, including sequences of branches taken, loop iterations taken (zero, one, many), data flows between variable definitions and usages, and many more; \\cite{Pezze2008} has a great overview." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us consider `cgi_decode()`, above, and reason what we have to do such that each statement of the code is executed at least once. We'd have to cover\n", "\n", "* The block following `if c == '+'`\n", "* The two blocks following `if c == '%'` (one for valid input, one for invalid)\n", "* The final `else` case for all other characters.\n", "\n", "This results in the same conditions as with black-box testing, above; again, the assertions above indeed would cover every statement in the code. 
Such a correspondence is actually pretty common, since programmers tend to implement different behaviors in different code locations; and thus, covering these locations will lead to test cases that cover the different (specified) behaviors.\n", "\n", "The advantage of white-box testing is that it finds errors in _implemented_ behavior. It can be conducted even in cases where the specification does not provide sufficient details; actually, it helps in identifying (and thus specifying) corner cases in the specification. The downside is that it may miss _non-implemented_ behavior: if some specified functionality is missing, white-box testing will not find it." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Tracing Executions\n", "\n", "One nice feature of white-box testing is that one can automatically assess whether some program feature was covered. To this end, one _instruments_ the execution of the program such that during execution, dedicated functionality keeps track of which code was executed. After testing, this information can be passed to the programmer, who can then focus on writing tests that cover the code not yet covered." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "In most programming languages, it is rather difficult to set up programs such that one can trace their execution. Not so in Python. The function `sys.settrace(f)` lets you define a *tracing function* `f()` that is called for each and every line executed. Even better, it gets access to the current function and its name, current variable contents, and more. It is thus an ideal tool for *dynamic analysis* – that is, the analysis of what actually happens during an execution." 
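, "\n", "\n", "As a minimal sketch of this mechanism (the names `add()`, `tracer()`, and `events` are hypothetical and not part of this chapter's code), the following tracing function records the kind of each event together with the name of the function it occurred in:\n", "\n", "```python\n", "import sys\n", "\n", "def add(x, y):  # hypothetical function to be traced\n", "    z = x + y\n", "    return z\n", "\n", "events = []\n", "\n", "def tracer(frame, event, arg):\n", "    # Record the event kind and the function it occurred in\n", "    events.append((event, frame.f_code.co_name))\n", "    return tracer  # keep tracing within this frame\n", "\n", "sys.settrace(tracer)  # Turn on\n", "add(1, 2)\n", "sys.settrace(None)  # Turn off\n", "\n", "print(events)\n", "```"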
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "To illustrate how this works, let us again look into a specific execution of `cgi_decode()`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'a b'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cgi_decode(\"a+b\")" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "To track how the execution proceeds through `cgi_decode()`, we make use of `sys.settrace()`. First, we define the _tracing function_ that will be called for each line. It has three parameters: \n", "\n", "* The `frame` parameter gets you the current _frame_, allowing access to the current location and variables:\n", " * `frame.f_code` is the currently executed code with `frame.f_code.co_name` being the function name;\n", " * `frame.f_lineno` holds the current line number; and\n", " * `frame.f_locals` holds the current local variables and arguments.\n", "* The `event` parameter is a string with values including `\"line\"` (a new line has been reached) or `\"call\"` (a function is being called).\n", "* The `arg` parameter is an additional _argument_ for some events; for `\"return\"` events, for instance, `arg` holds the value being returned." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "We use the tracing function for simply reporting the current line executed, which we access through the `frame` argument." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "coverage = []" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def traceit(frame, event, arg):\n", "    if event == \"line\":\n", "        global coverage\n", "        function_name = frame.f_code.co_name\n", "        lineno = frame.f_lineno\n", "        coverage.append(lineno)\n", "    return traceit" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "We can switch tracing on and off with `sys.settrace()`:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import sys" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def cgi_decode_traced(s):\n", "    global coverage\n", "    coverage = []\n", "    sys.settrace(traceit)  # Turn on\n", "    cgi_decode(s)\n", "    sys.settrace(None)  # Turn off" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "When we compute `cgi_decode(\"a+b\")`, we can now see how the execution progresses through `cgi_decode()`. After the initialization of `hex_values`, `t`, and `i`, we see that the `while` loop is taken three times – once for every character in the input." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[9, 10, 11, 12, 15, 16, 17, 18, 19, 21, 30, 31, 17, 18, 19, 20, 31, 17, 18, 19, 21, 30, 31, 17, 32]\n" ] } ], "source": [ "cgi_decode_traced(\"a+b\")\n", "print(coverage)" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Which lines are these, actually? To this end, we encode the source code into an array `cgi_decode_lines`. (An actual coverage tool could also access the source code file directly.)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "cgi_decode_code = \"\"\"\n", "def cgi_decode(s):\n", " \\\"\\\"\\\"Decode the CGI-encoded string `s`:\n", " * replace \"+\" by \" \"\n", " * replace \"%xx\" by the character with hex number xx.\n", " Return the decoded string. 
Raise `ValueError` for invalid inputs.\\\"\\\"\\\"\n", "\n", " # Mapping of hex digits to their integer values\n", " hex_values = {\n", " '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,\n", " '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,\n", " 'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,\n", " 'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,\n", " }\n", "\n", " t = \"\"\n", " i = 0\n", " while i < len(s):\n", " c = s[i]\n", " if c == '+':\n", " t += ' '\n", " elif c == '%':\n", " digit_high, digit_low = s[i + 1], s[i + 2]\n", " i += 2\n", " if digit_high in hex_values and digit_low in hex_values:\n", " v = hex_values[digit_high] * 16 + hex_values[digit_low]\n", " t += chr(v)\n", " else:\n", " raise ValueError(\"Invalid encoding\")\n", " else:\n", " t += c\n", " i += 1\n", " return t\n", "\"\"\"\n", "cgi_decode_lines = cgi_decode_code.splitlines()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "We see that the first line (9) executed is actually the initialization of `hex_values`..." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[\" '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,\",\n", " \" '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,\",\n", " \" 'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,\",\n", " \" 'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,\"]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cgi_decode_lines[9:13]" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "... 
followed by the initialization of `t`:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "' t = \"\"'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cgi_decode_lines[15]" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "To see which lines actually have been covered at least once, we can convert `coverage` into a set:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{32, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 21, 30, 31}\n" ] } ], "source": [ "covered_lines = set(coverage)\n", "print(covered_lines)" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us print out the code, annotating lines not covered with '#':" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# 1 def cgi_decode(s):\n", "# 2 \"\"\"Decode the CGI-encoded string `s`:\n", "# 3 * replace \"+\" by \" \"\n", "# 4 * replace \"%xx\" by the character with hex number xx.\n", "# 5 Return the decoded string. 
Raise `ValueError` for invalid inputs.\"\"\"\n", "#  6 \n", "#  7     # Mapping of hex digits to their integer values\n", "#  8     hex_values = {\n", "   9         '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,\n", "  10         '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,\n", "  11         'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,\n", "  12         'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,\n", "# 13     }\n", "# 14 \n", "  15     t = \"\"\n", "  16     i = 0\n", "  17     while i < len(s):\n", "  18         c = s[i]\n", "  19         if c == '+':\n", "  20             t += ' '\n", "  21         elif c == '%':\n", "# 22             digit_high, digit_low = s[i + 1], s[i + 2]\n", "# 23             i += 2\n", "# 24             if digit_high in hex_values and digit_low in hex_values:\n", "# 25                 v = hex_values[digit_high] * 16 + hex_values[digit_low]\n", "# 26                 t += chr(v)\n", "# 27             else:\n", "# 28                 raise ValueError(\"Invalid encoding\")\n", "# 29         else:\n", "  30             t += c\n", "  31         i += 1\n", "  32     return t\n" ] } ], "source": [ "for lineno in range(1, len(cgi_decode_lines)):\n", "    if lineno not in covered_lines:\n", "        print(\"# \", end=\"\")\n", "    else:\n", "        print(\"  \", end=\"\")\n", "    print(\"%2d\" % lineno, cgi_decode_lines[lineno])" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "We see that a number of lines (notably comments) have not been executed, simply because they are not executable. However, we also see that the lines under `elif c == '%'` have _not_ been executed yet. If `\"a+b\"` were our only test case so far, this missing coverage would now encourage us to create another test case that actually covers these lines." 
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## A Coverage Class\n", "\n", "In this book, we will make use of coverage again and again – to _measure_ the effectiveness of different test generation techniques, but also to _guide_ test generation towards code coverage. Our previous implementation with a global `coverage` variable is a bit cumbersome for that. We therefore implement some functionality that will help us measure coverage easily." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "The key idea of getting coverage is to make use of the Python `with` statement. The general form\n", "\n", "```python\n", "with OBJECT [as VARIABLE]:\n", "    BODY\n", "```\n", "\n", "executes `BODY` with `OBJECT` being defined (and the value returned by `OBJECT.__enter__()` stored in `VARIABLE`). The interesting thing is that at the beginning and end of `BODY`, the special methods `OBJECT.__enter__()` and `OBJECT.__exit__()` are automatically invoked, even if `BODY` raises an exception. This allows us to define a `Coverage` object where `Coverage.__enter__()` automatically turns on tracing and `Coverage.__exit__()` automatically turns off tracing again. After tracing, we can make use of dedicated methods to access the coverage. This is what it looks like in use:\n", "\n", "```python\n", "with Coverage() as cov:\n", "    function_to_be_traced()\n", "c = cov.coverage()\n", "```\n", "\n", "Here, tracing is automatically turned on during `function_to_be_traced()` and turned off again after the `with` block; afterwards, we can access the set of lines executed." 
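, "\n", "\n", "As a minimal sketch of this protocol (the `Announce` class is a hypothetical illustration and not part of this chapter's code), the following context manager records when its special methods are invoked. `__exit__()` runs even though the body raises an exception, so the final `print()` shows `['enter', 'exit']`:\n", "\n", "```python\n", "class Announce:\n", "    \"\"\"Hypothetical helper: records entry to and exit from a `with` block\"\"\"\n", "\n", "    def __init__(self):\n", "        self.events = []\n", "\n", "    def __enter__(self):\n", "        self.events.append(\"enter\")\n", "        return self  # the value bound by `as`\n", "\n", "    def __exit__(self, exc_type, exc_value, tb):\n", "        self.events.append(\"exit\")\n", "        return False  # do not suppress exceptions from the body\n", "\n", "try:\n", "    with Announce() as a:\n", "        raise ValueError(\"raised inside the body\")\n", "except ValueError:\n", "    pass\n", "\n", "print(a.events)\n", "```"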
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's the full implementation with all its bells and whistles. You don't have to get everything; it suffices that you know how to use it:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Coverage(object):\n", "    # Trace function\n", "    def traceit(self, frame, event, arg):\n", "        if self.original_trace_function is not None:\n", "            self.original_trace_function(frame, event, arg)\n", "\n", "        if event == \"line\":\n", "            function_name = frame.f_code.co_name\n", "            lineno = frame.f_lineno\n", "            self._trace.append((function_name, lineno))\n", "\n", "        return self.traceit\n", "\n", "    def __init__(self):\n", "        self._trace = []\n", "\n", "    # Start of `with` block\n", "    def __enter__(self):\n", "        self.original_trace_function = sys.gettrace()\n", "        sys.settrace(self.traceit)\n", "        return self\n", "\n", "    # End of `with` block\n", "    def __exit__(self, exc_type, exc_value, tb):\n", "        sys.settrace(self.original_trace_function)\n", "\n", "    def trace(self):\n", "        \"\"\"The list of executed lines, as (function_name, line_number) pairs\"\"\"\n", "        return self._trace\n", "\n", "    def coverage(self):\n", "        \"\"\"The set of executed lines, as (function_name, line_number) pairs\"\"\"\n", "        return set(self.trace())" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us put this to use:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": 
"stream", "text": [ "{('cgi_decode', 11), ('cgi_decode', 17), ('cgi_decode', 10), ('cgi_decode', 31), ('cgi_decode', 16), ('cgi_decode', 30), ('cgi_decode', 19), ('cgi_decode', 18), ('cgi_decode', 12), ('__exit__', 25), ('cgi_decode', 15), ('cgi_decode', 32), ('cgi_decode', 21), ('cgi_decode', 20), ('cgi_decode', 9)}\n" ] } ], "source": [ "with Coverage() as cov:\n", " cgi_decode(\"a+b\")\n", "\n", "print(cov.coverage())" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "As you can see, the `Coverage()` class not only keeps track of lines executed, but also of function names. This is useful if you have a program that spans multiple files." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Comparing Coverage\n", "\n", "Since we represent coverage as a set of executed lines, we can also apply _set operations_ on these. For instance, we can find out which lines are covered by individual test cases, but not others:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('cgi_decode', 20)}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with Coverage() as cov_plus:\n", " cgi_decode(\"a+b\")\n", "with Coverage() as cov_standard:\n", " cgi_decode(\"abc\")\n", "\n", "cov_plus.coverage() - cov_standard.coverage()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "This is the single line in the code that is executed only in the `'a+b'` input." 
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "We can also compare sets to find out which lines still need to be covered. Let us define `cov_max` as the maximum coverage we can achieve. (Here, we do this by executing the \"good\" test cases we already have. In practice, one would statically analyze code structure, which we introduce in [the chapter on symbolic testing](SymbolicFuzzer.ipynb).)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import fuzzingbook_utils" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with Coverage() as cov_max:\n", "    cgi_decode('+')\n", "    cgi_decode('%20')\n", "    cgi_decode('abc')\n", "    try:\n", "        cgi_decode('%?a')\n", "    except:\n", "        pass" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Then, we can easily see which lines are _not_ yet covered by a test case:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('cgi_decode', 22),\n", " ('cgi_decode', 23),\n", " ('cgi_decode', 24),\n", " ('cgi_decode', 25),\n", " ('cgi_decode', 26),\n", " ('cgi_decode', 28)}" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cov_max.coverage() - cov_plus.coverage()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, 
"run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Again, these would be the lines handling `\"%xx\"`, which we have not yet had in the input." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Coverage of Basic Fuzzing\n", "\n", "We can now use our coverage tracing to assess the _effectiveness_ of testing methods – in particular, of course, test _generation_ methods. Our challenge is to achieve maximum coverage in `cgi_decode()` just with random inputs. In principle, we should _eventually_ get there, as eventually, we will have produced every possible string in the universe – but exactly how long is this? To this end, let us run just one fuzzing iteration on `cgi_decode()`:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Fuzzer import fuzzer" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'!7#%\"*#0=)$;%6*;>638:*>80\"=(/*:-(2<4 !:5*6856&?\"\"11<7+%<%7,4.8,*+&,,$,.\"'" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample = fuzzer()\n", "sample" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's the invocation and the coverage we achieve. We wrap `cgi_decode()` in a `try...except` block such that we can ignore `ValueError` exceptions raised by illegal `%xx` formats." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('__exit__', 25),\n", " ('cgi_decode', 9),\n", " ('cgi_decode', 10),\n", " ('cgi_decode', 11),\n", " ('cgi_decode', 12),\n", " ('cgi_decode', 15),\n", " ('cgi_decode', 16),\n", " ('cgi_decode', 17),\n", " ('cgi_decode', 18),\n", " ('cgi_decode', 19),\n", " ('cgi_decode', 21),\n", " ('cgi_decode', 22),\n", " ('cgi_decode', 23),\n", " ('cgi_decode', 24),\n", " ('cgi_decode', 28),\n", " ('cgi_decode', 30),\n", " ('cgi_decode', 31)}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with Coverage() as cov_fuzz:\n", "    try:\n", "        cgi_decode(sample)\n", "    except:\n", "        pass\n", "cov_fuzz.coverage()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Is this already the maximum coverage? Apparently, there are still lines missing:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('cgi_decode', 20),\n", " ('cgi_decode', 25),\n", " ('cgi_decode', 26),\n", " ('cgi_decode', 32)}" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cov_max.coverage() - cov_fuzz.coverage()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us try again, increasing coverage over 100 random inputs. 
We use an array `cumulative_coverage` to store the coverage achieved over time; `cumulative_coverage[0]` is the total number of lines covered after input 1, \n", "`cumulative_coverage[1]` is the number of lines covered after inputs 1–2, and so on." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "trials = 100" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def population_coverage(population, function):\n", " cumulative_coverage = []\n", " all_coverage = set()\n", "\n", " for s in population:\n", " with Coverage() as cov:\n", " try:\n", " function(s)\n", " except:\n", " pass\n", " all_coverage |= cov.coverage()\n", " cumulative_coverage.append(len(all_coverage))\n", "\n", " return all_coverage, cumulative_coverage" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us create a hundred inputs to determine how coverage increases:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def hundred_inputs():\n", " population = []\n", " for i in range(trials):\n", " population.append(fuzzer())\n", " return population" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's how the coverage increases with each input:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "button": false, "new_sheet": false, 
"run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "all_coverage, cumulative_coverage = population_coverage(\n", " hundred_inputs(), cgi_decode)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "Text(0,0.5,'lines covered')" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(cumulative_coverage)\n", "plt.title('Coverage of cgi_decode() with random inputs')\n", "plt.xlabel('# of inputs')\n", "plt.ylabel('lines covered')" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "This is just _one_ run, of course; so let's repeat this a number of times and plot the averages." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "runs = 100\n", "\n", "# Create an array with TRIALS elements, all zero\n", "sum_coverage = [0] * trials\n", "\n", "for run in range(runs):\n", " all_coverage, coverage = population_coverage(hundred_inputs(), cgi_decode)\n", " assert len(coverage) == trials\n", " for i in range(trials):\n", " sum_coverage[i] += coverage[i]\n", "\n", "average_coverage = []\n", "for i in range(trials):\n", " average_coverage.append(sum_coverage[i] / runs)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "Text(0,0.5,'lines covered')" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(average_coverage)\n", "plt.title('Average coverage of cgi_decode() with random inputs')\n", "plt.xlabel('# of inputs')\n", "plt.ylabel('lines covered')" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "We see that on average, we get full coverage after 40–60 fuzzing inputs." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Getting Coverage from External Programs\n", "\n", "Of course, not all the world is programming in Python. The good news is that the problem of obtaining coverage is ubiquitous, and almost every programming language has some facility to measure coverage. Just as an example, let us therefore demonstrate how to obtain coverage for a C program." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our C program (again) implements `cgi_decode`; this time as a program to be executed from the command line:\n", "\n", "```shell\n", "$ ./cgi_decode 'Hello+World'\n", "Hello World\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here comes the C code, first as a Python string. 
We start with the usual C includes:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "cgi_c_code = \"\"\"\n", "/* CGI decoding as C program */\n", "\n", "#include <stdio.h>\n", "#include <stdlib.h>\n", "#include <string.h>\n", "\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here comes the initialization of `hex_values`:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "cgi_c_code += r\"\"\"\n", "int hex_values[256];\n", "\n", "void init_hex_values() {\n", " for (int i = 0; i < sizeof(hex_values) / sizeof(int); i++) {\n", " hex_values[i] = -1;\n", " }\n", " hex_values['0'] = 0; hex_values['1'] = 1; hex_values['2'] = 2; hex_values['3'] = 3;\n", " hex_values['4'] = 4; hex_values['5'] = 5; hex_values['6'] = 6; hex_values['7'] = 7;\n", " hex_values['8'] = 8; hex_values['9'] = 9;\n", "\n", " hex_values['a'] = 10; hex_values['b'] = 11; hex_values['c'] = 12; hex_values['d'] = 13;\n", " hex_values['e'] = 14; hex_values['f'] = 15;\n", "\n", " hex_values['A'] = 10; hex_values['B'] = 11; hex_values['C'] = 12; hex_values['D'] = 13;\n", " hex_values['E'] = 14; hex_values['F'] = 15;\n", "}\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's the actual implementation of `cgi_decode()`, using pointers for input source (`s`) and output target (`t`):" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "cgi_c_code += r\"\"\"\n", "int cgi_decode(char *s, char *t) {\n", " while (*s != '\\0') {\n", " if (*s == '+')\n", " *t++ = ' ';\n", " else if (*s == '%') {\n", " int digit_high = *++s;\n", " int digit_low = *++s;\n", " if (hex_values[digit_high] >= 0 && hex_values[digit_low] >= 0) {\n", " *t++ = 
hex_values[digit_high] * 16 + hex_values[digit_low];\n", " }\n", " else\n", " return -1;\n", " }\n", " else\n", " *t++ = *s;\n", " s++;\n", " }\n", " *t = '\\0';\n", " return 0;\n", "}\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Finally, here's a driver which takes the first argument and invokes `cgi_decode` with it:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "cgi_c_code += r\"\"\"\n", "int main(int argc, char *argv[]) {\n", " init_hex_values();\n", "\n", " if (argc >= 2) {\n", " char *s = argv[1];\n", " char *t = malloc(strlen(s) + 1); /* output is at most as long as input */\n", " int ret = cgi_decode(s, t);\n", " printf(\"%s\\n\", t);\n", " return ret;\n", " }\n", " else\n", " {\n", " printf(\"cgi_decode: usage: cgi_decode STRING\\n\");\n", " return 1;\n", " }\n", "}\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us create the C source code:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with open(\"cgi_decode.c\", \"w\") as f:\n", " f.write(cgi_c_code)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can now compile the C code into an executable. The `--coverage` option instructs the C compiler to instrument the code such that at runtime, coverage information will be collected. 
(The exact options vary from compiler to compiler.)" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "!cc --coverage -o cgi_decode cgi_decode.c" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "When we now execute the program, coverage information will automatically be collected and stored in auxiliary files:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Send mail to me@fuzzingbook.org\r\n" ] } ], "source": [ "!./cgi_decode 'Send+mail+to+me%40fuzzingbook.org'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The collected coverage information is summarized by the `gcov` program. For every source file given, it produces a new `.gcov` file with coverage information." ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "File 'cgi_decode.c'\r\n", "Lines executed:91.89% of 37\r\n", "cgi_decode.c:creating 'cgi_decode.c.gcov'\r\n", "\r\n" ] } ], "source": [ "!gcov cgi_decode.c" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In the `.gcov` file, each line is prefixed with the number of times it was executed (`-` stands for a non-executable line, `#####` stands for zero) as well as the line number. We can take a look at `cgi_decode()`, for instance, and see that the only code not executed yet is the `return -1` for an illegal input." 
] }, { "cell_type": "code", "execution_count": 42, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " -: 26:int cgi_decode(char *s, char *t) {\n", " 64: 27: while (*s != '\\0') {\n", " 31: 28: if (*s == '+')\n", " 3: 29: *t++ = ' ';\n", " 28: 30: else if (*s == '%') {\n", " 1: 31: int digit_high = *++s;\n", " 1: 32: int digit_low = *++s;\n", " 2: 33: if (hex_values[digit_high] >= 0 && hex_values[digit_low] >= 0) {\n", " 1: 34: *t++ = hex_values[digit_high] * 16 + hex_values[digit_low];\n", " 1: 35: }\n", " -: 36: else\n", " #####: 37: return -1;\n", " 1: 38: }\n", " -: 39: else\n", " 27: 40: *t++ = *s;\n", " 31: 41: s++;\n", " -: 42: }\n", " 1: 43: *t = '\\0';\n", " 1: 44: return 0;\n", " 1: 45:}\n" ] } ], "source": [ "lines = open('cgi_decode.c.gcov').readlines()\n", "for i in range(30, 50):\n", " print(lines[i], end='')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us read in this file to obtain a coverage set:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def read_gcov_coverage(c_file):\n", " gcov_file = c_file + \".gcov\"\n", " coverage = set()\n", " with open(gcov_file) as file:\n", " for line in file.readlines():\n", " elems = line.split(':')\n", " covered = elems[0].strip()\n", " line_number = int(elems[1].strip())\n", " if covered.startswith('-') or covered.startswith('#'):\n", " continue\n", " coverage.add((c_file, line_number))\n", " return coverage" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "coverage = read_gcov_coverage('cgi_decode.c')" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[('cgi_decode.c', 53),\n", " ('cgi_decode.c', 
35),\n", " ('cgi_decode.c', 16),\n", " ('cgi_decode.c', 14),\n", " ('cgi_decode.c', 38)]" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(coverage)[:5]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "With this set, we can now do the same coverage computations as with our Python programs." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Finding Errors with Basic Fuzzing\n", "\n", "Given sufficient time, we can indeed cover each and every line within `cgi_decode()`, whatever the programming language. This does not mean that the code is error-free, though. Since we do not check the result of `cgi_decode()`, the function could return any value without us checking or noticing. To catch such errors, we would have to set up a *results checker* (commonly called an *oracle*) that would verify test results. In our case, we could compare the C and Python implementations of `cgi_decode()` and see whether both produce the same results." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "What fuzzing is great at, though, is finding _internal errors_ that can be detected even without checking the result. 
Actually, if one runs our `fuzzer()` on `cgi_decode()`, one quickly finds such an error, as the following code shows:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "from ExpectError import ExpectError" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"\", line 5, in \n", " cgi_decode(s)\n", " File \"\", line 22, in cgi_decode\n", " digit_high, digit_low = s[i + 1], s[i + 2]\n", "IndexError: string index out of range (expected)\n" ] } ], "source": [ "with ExpectError():\n", " for i in range(trials):\n", " try:\n", " s = fuzzer()\n", " cgi_decode(s)\n", " except ValueError:\n", " pass" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "So, it is possible to cause `cgi_decode()` to crash. Why is that? Let's take a look at its input:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'82 202*&<1&($34\\'\"/\\'.<5/!8\"\\'5:!4))%;'" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "The problem here is at the end of the string. 
After a `'%'` character, our implementation will always attempt to access two more (hexadecimal) characters, but if these are not there, we will get an `IndexError` exception." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "This problem is also present in our C variant, which inherits it from the original implementation \\cite{Pezze2008}:\n", "\n", "```c\n", "int digit_high = *++s;\n", "int digit_low = *++s;\n", "```\n", "\n", "Here, `s` is a pointer to the character to be read; `++` increments it by one character.\n", "In the C implementation, the problem is actually much worse. If the `'%'` character is at the end of the string, the above code will first read a terminating character (`'\\0'` in C strings) and then the following character, which may be any memory content after the string, and which thus may cause the program to fail uncontrollably. The somewhat good news is that `'\\0'` is not a valid hexadecimal character, and thus, the C version will \"only\" read one character beyond the end of the string." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Interestingly enough, none of the manual tests we had designed earlier would trigger this bug. Actually, neither statement nor branch coverage, nor any of the coverage criteria commonly discussed in the literature would find it. However, a simple fuzzing run can identify the error within a few runs – _if_ appropriate run-time checks are in place that find such overflows. This definitely calls for more fuzzing!" 
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Lessons Learned\n", "\n", "* Coverage metrics are a simple and fully automated means to approximate how much functionality of a program is actually executed during a test run.\n", "* A number of coverage metrics exist, the most important ones being statement coverage and branch coverage.\n", "* In Python, it is very easy to access the program state during execution, including the currently executed code." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "At the end of the day, let's clean up:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import os\n", "import glob" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "for file in glob.glob(\"cgi_decode\") + glob.glob(\"cgi_decode.*\"):\n", " os.remove(file)" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "source": [ "## Next Steps\n", "\n", "Coverage is not only a tool to _measure_ test effectiveness, but also a great tool to _guide_ test generation towards specific goals – in particular uncovered code. We use coverage to\n", "\n", "* [guide _mutations_ of existing inputs towards better coverage in the chapter on mutation fuzzing](MutationFuzzer.ipynb)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Background\n", "\n", "Coverage is a central concept in systematic software testing. For discussions, see the books in the [Introduction to Testing](Intro_Testing.ipynb)." 
] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Exercises" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden", "solution2_first": true, "solution_first": true }, "source": [ "### Exercise 1: Fixing cgi_decode\n", "\n", "Create an appropriate test to reproduce the `IndexError` discussed above. Fix `cgi_decode()` to prevent the bug. Show that your test (and additional `fuzzer()` runs) no longer expose the bug. Do the same for the C variant." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden" }, "source": [ "**Solution.** Here's a test case:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"\", line 2, in \n", " assert cgi_decode('%') == '%'\n", " File \"\", line 22, in cgi_decode\n", " digit_high, digit_low = s[i + 1], s[i + 2]\n", "IndexError: string index out of range (expected)\n" ] } ], "source": [ "with ExpectError():\n", " assert cgi_decode('%') == '%'" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"\", line 2, in \n", " assert cgi_decode('%4') == '%4'\n", " File \"\", line 22, in cgi_decode\n", " digit_high, digit_low = s[i + 1], s[i + 2]\n", "IndexError: string index out of range (expected)\n" ] } ], "source": [ "with 
ExpectError():\n", " assert cgi_decode('%4') == '%4'" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "assert cgi_decode('%40') == '@'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "source": [ "Here's a fix:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "def fixed_cgi_decode(s):\n", " \"\"\"Decode the CGI-encoded string `s`:\n", " * replace \"+\" by \" \"\n", " * replace \"%xx\" by the character with hex number xx.\n", " Return the decoded string. Raise `ValueError` for invalid inputs.\"\"\"\n", "\n", " # Mapping of hex digits to their integer values\n", " hex_values = {\n", " '0': 0, '1': 1, '2': 2, '3': 3, '4': 4,\n", " '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,\n", " 'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14, 'f': 15,\n", " 'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15,\n", " }\n", "\n", " t = \"\"\n", " i = 0\n", " while i < len(s):\n", " c = s[i]\n", " if c == '+':\n", " t += ' '\n", " elif c == '%' and i + 2 < len(s): # <--- *** FIX ***\n", " digit_high, digit_low = s[i + 1], s[i + 2]\n", " i += 2\n", " if digit_high in hex_values and digit_low in hex_values:\n", " v = hex_values[digit_high] * 16 + hex_values[digit_low]\n", " t += chr(v)\n", " else:\n", " raise ValueError(\"Invalid encoding\")\n", " else:\n", " t += c\n", " i += 1\n", " return t\n" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [], "source": [ "assert fixed_cgi_decode('%') == '%'" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "assert fixed_cgi_decode('%4') == '%4'" ] }, { "cell_type": 
"code", "execution_count": 57, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "assert fixed_cgi_decode('%40') == '@'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "source": [ "Here's the test:" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "for i in range(trials):\n", " try:\n", " s = fuzzer()\n", " fixed_cgi_decode(s)\n", " except ValueError:\n", " pass" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "source": [ "For the C variant, the following will do:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "cgi_c_code = cgi_c_code.replace(\n", " r\"if (*s == '%')\", # old code\n", " r\"if (*s == '%' && s[1] != '\\0' && s[2] != '\\0')\" # new code\n", ")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "source": [ "Go back to the above compilation commands and recompile `cgi_decode`." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Exercise 2: Branch Coverage\n", "\n", "Besides statement coverage, _branch coverage_ is one of the most frequently used criteria to determine the quality of a test. In a nutshell, branch coverage measures how many different _control decisions_ are made in code. In the statement\n", "\n", "```python\n", "if CONDITION:\n", " do_a()\n", "else:\n", " do_b()\n", "```\n", "\n", "for instance, both the cases where `CONDITION` is true (branching to `do_a()`) and where `CONDITION` is false (branching to `do_b()`) have to be covered. 
This holds for all control statements with a condition (`if`, `while`, etc.).\n", "\n", "How is branch coverage different from statement coverage? In the above example, there is actually no difference. In this one, though, there is:\n", "\n", "```python\n", "if CONDITION:\n", " do_a()\n", "something_else()\n", "```\n", "\n", "Using statement coverage, a single test case where `CONDITION` is true suffices to cover the call to `do_a()`. Using branch coverage, however, we would also have to create a test case where `do_a()` is _not_ invoked." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Using our `Coverage` infrastructure, we can simulate branch coverage by considering _pairs of subsequent lines executed_. The `trace()` method gives us the list of lines executed one after the other:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[('cgi_decode', 9),\n", " ('cgi_decode', 10),\n", " ('cgi_decode', 11),\n", " ('cgi_decode', 12),\n", " ('cgi_decode', 15)]" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with Coverage() as cov:\n", " cgi_decode(\"a+b\")\n", "trace = cov.trace()\n", "trace[:5]" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden", "solution2_first": true }, "source": [ "#### Part 1: Compute branch coverage\n", "\n", "Define a function `branch_coverage()` that takes a trace and returns the set of pairs of subsequent lines in a trace – in the above example, this would be \n", "\n", "```python\n", "{\n", "(('cgi_decode', 9), ('cgi_decode', 10)),\n", "(('cgi_decode', 10), 
('cgi_decode', 11)),\n", "# more_pairs\n", "}\n", "```\n", "\n", "Bonus for advanced Python programmers: Define `BranchCoverage` as a subclass of `Coverage` and make `branch_coverage()` as above a `coverage()` method of `BranchCoverage`." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden" }, "source": [ "**Solution.** Here's a simple definition of `branch_coverage()`:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "def branch_coverage(trace):\n", " coverage = set()\n", " past_line = None\n", " for line in trace:\n", " if past_line is not None:\n", " coverage.add((past_line, line))\n", " past_line = line\n", "\n", " return coverage" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 9), ('cgi_decode', 10)),\n", " (('cgi_decode', 10), ('cgi_decode', 11)),\n", " (('cgi_decode', 11), ('cgi_decode', 12)),\n", " (('cgi_decode', 12), ('cgi_decode', 15)),\n", " (('cgi_decode', 15), ('cgi_decode', 16)),\n", " (('cgi_decode', 16), ('cgi_decode', 17)),\n", " (('cgi_decode', 17), ('cgi_decode', 18)),\n", " (('cgi_decode', 17), ('cgi_decode', 32)),\n", " (('cgi_decode', 18), ('cgi_decode', 19)),\n", " (('cgi_decode', 19), ('cgi_decode', 20)),\n", " (('cgi_decode', 19), ('cgi_decode', 21)),\n", " (('cgi_decode', 20), ('cgi_decode', 31)),\n", " (('cgi_decode', 21), ('cgi_decode', 30)),\n", " (('cgi_decode', 30), ('cgi_decode', 31)),\n", " (('cgi_decode', 31), ('cgi_decode', 17)),\n", " (('cgi_decode', 32), ('__exit__', 25))}" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "branch_coverage(trace)" ] }, { "cell_type": "markdown", 
"metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "source": [ "Here's a definition as a class:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "class BranchCoverage(Coverage):\n", " def coverage(self):\n", " \"\"\"The set of executed line pairs\"\"\"\n", " coverage = set()\n", " past_line = None\n", " for line in self.trace():\n", " if past_line is not None:\n", " coverage.add((past_line, line))\n", " past_line = line\n", "\n", " return coverage" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden", "solution2_first": true }, "source": [ "#### Part 2: Comparing statement coverage and branch coverage\n", "\n", "Use `branch_coverage()` to repeat the experiments in this chapter with branch coverage rather than statement coverage. Do the manually written test cases cover all branches?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "source": [ "**Solution.** Let's repeat the above experiments with `BranchCoverage`:" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{(('cgi_decode', 19), ('cgi_decode', 20)), (('cgi_decode', 20), ('cgi_decode', 31)), (('cgi_decode', 10), ('cgi_decode', 11)), (('cgi_decode', 18), ('cgi_decode', 19)), (('cgi_decode', 19), ('cgi_decode', 21)), (('cgi_decode', 17), ('cgi_decode', 32)), (('cgi_decode', 9), ('cgi_decode', 10)), (('cgi_decode', 17), ('cgi_decode', 18)), (('cgi_decode', 12), ('cgi_decode', 15)), (('cgi_decode', 11), ('cgi_decode', 12)), (('cgi_decode', 30), ('cgi_decode', 31)), (('cgi_decode', 21), ('cgi_decode', 30)), (('cgi_decode', 15), ('cgi_decode', 16)), (('cgi_decode', 31), ('cgi_decode', 17)), (('cgi_decode', 16), ('cgi_decode', 17)), (('cgi_decode', 32), ('__exit__', 25))}\n" ] } ], "source": [ "with BranchCoverage() as cov:\n", " cgi_decode(\"a+b\")\n", "\n", "print(cov.coverage())" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 19), ('cgi_decode', 20)),\n", " (('cgi_decode', 20), ('cgi_decode', 31))}" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with BranchCoverage() as cov_plus:\n", " cgi_decode(\"a+b\")\n", "with BranchCoverage() as cov_standard:\n", " cgi_decode(\"abc\")\n", "\n", "cov_plus.coverage() - cov_standard.coverage()" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, 
"solution2": "hidden" }, "outputs": [], "source": [ "with BranchCoverage() as cov_max:\n", " cgi_decode('+')\n", " cgi_decode('%20')\n", " cgi_decode('abc')\n", " try:\n", " cgi_decode('%?a')\n", " except:\n", " pass" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 21), ('cgi_decode', 22)),\n", " (('cgi_decode', 22), ('cgi_decode', 23)),\n", " (('cgi_decode', 23), ('cgi_decode', 24)),\n", " (('cgi_decode', 24), ('cgi_decode', 25)),\n", " (('cgi_decode', 24), ('cgi_decode', 28)),\n", " (('cgi_decode', 25), ('cgi_decode', 26)),\n", " (('cgi_decode', 26), ('cgi_decode', 31)),\n", " (('cgi_decode', 28), ('__exit__', 25)),\n", " (('cgi_decode', 32), ('cgi_decode', 9))}" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cov_max.coverage() - cov_plus.coverage()" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "'!7#%\"*#0=)$;%6*;>638:*>80\"=(/*:-(2<4 !:5*6856&?\"\"11<7+%<%7,4.8,*+&,,$,.\"'" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 9), ('cgi_decode', 10)),\n", " (('cgi_decode', 10), ('cgi_decode', 11)),\n", " (('cgi_decode', 11), ('cgi_decode', 12)),\n", " (('cgi_decode', 12), ('cgi_decode', 15)),\n", " (('cgi_decode', 15), ('cgi_decode', 16)),\n", " (('cgi_decode', 16), ('cgi_decode', 
17)),\n", " (('cgi_decode', 17), ('cgi_decode', 18)),\n", " (('cgi_decode', 18), ('cgi_decode', 19)),\n", " (('cgi_decode', 19), ('cgi_decode', 20)),\n", " (('cgi_decode', 19), ('cgi_decode', 21)),\n", " (('cgi_decode', 20), ('cgi_decode', 31)),\n", " (('cgi_decode', 21), ('cgi_decode', 22)),\n", " (('cgi_decode', 21), ('cgi_decode', 30)),\n", " (('cgi_decode', 22), ('cgi_decode', 23)),\n", " (('cgi_decode', 23), ('cgi_decode', 24)),\n", " (('cgi_decode', 24), ('cgi_decode', 28)),\n", " (('cgi_decode', 28), ('__exit__', 25)),\n", " (('cgi_decode', 30), ('cgi_decode', 31)),\n", " (('cgi_decode', 31), ('cgi_decode', 17))}" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with BranchCoverage() as cov_fuzz:\n", "    try:\n", "        cgi_decode(sample)\n", "    except Exception:\n", "        pass\n", "cov_fuzz.coverage()" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 17), ('cgi_decode', 32)),\n", " (('cgi_decode', 24), ('cgi_decode', 25)),\n", " (('cgi_decode', 25), ('cgi_decode', 26)),\n", " (('cgi_decode', 26), ('cgi_decode', 31)),\n", " (('cgi_decode', 32), ('cgi_decode', 9))}" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cov_max.coverage() - cov_fuzz.coverage()" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [], "source": [ "def population_branch_coverage(population, function):\n", "    # Returns the set of all covered line pairs, plus the cumulative count after each input\n", "    cumulative_coverage = []\n", "    all_coverage = set()\n", "\n", "    for s in population:\n", "        with BranchCoverage() as cov:\n", "            try:\n", "                function(s)\n", "            except Exception:\n", "                pass\n", "        all_coverage |= cov.coverage()\n", "        cumulative_coverage.append(len(all_coverage))\n", "\n", "    return all_coverage, cumulative_coverage" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "button":
false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [], "source": [ "all_branch_coverage, cumulative_branch_coverage = population_branch_coverage(\n", " hundred_inputs(), cgi_decode)" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "Text(0,0.5,'line pairs covered')" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(cumulative_branch_coverage)\n", "plt.title('Branch coverage of cgi_decode() with random inputs')\n", "plt.xlabel('# of inputs')\n", "plt.ylabel('line pairs covered')" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "24" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(cov_max.coverage())" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 22), ('__exit__', 25)),\n", " (('cgi_decode', 32), ('__exit__', 25))}" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "all_branch_coverage - cov_max.coverage()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "source": [ "The additional coverage comes from the exception raised via an illegal input (say, `%g`)." ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "{(('cgi_decode', 32), ('cgi_decode', 9))}" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cov_max.coverage() - all_branch_coverage" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "source": [ "This is an artefact coming from the subsequent execution of `cgi_decode()` when computing `cov_max`." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden", "solution2_first": true }, "source": [ "#### Part 3: Average coverage\n", "\n", "Again, repeat the above experiments with branch coverage. Does `fuzzer()` cover all branches, and if so, how many tests does it take on average?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "source": [ "**Solution.** We repeat the earlier line coverage experiments, this time measuring branch coverage." ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "outputs": [], "source": [ "runs = 100\n", "\n", "# Create a list with `trials` elements, all zero\n", "sum_coverage = [0] * trials\n", "\n", "for run in range(runs):\n", "    all_branch_coverage, coverage = population_branch_coverage(\n", "        hundred_inputs(), cgi_decode)\n", "    assert len(coverage) == trials\n", "    for i in range(trials):\n", "        sum_coverage[i] += coverage[i]\n", "\n", "average_coverage = []\n", "for i in range(trials):\n", "    average_coverage.append(sum_coverage[i] / runs)\n" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden" }, "outputs": [ { "data": { "text/plain": [ "Text(0,0.5,'line pairs covered')" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(average_coverage)\n", "plt.title('Average branch coverage of cgi_decode() with random inputs')\n", "plt.xlabel('# of inputs')\n", "plt.ylabel('line pairs covered')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "solution2": "hidden" }, "source": [ "We see that achieving branch coverage takes longer than statement coverage; it simply is a more difficult criterion to satisfy with random inputs." ] } ], "metadata": { "ipub": { "bibliography": "fuzzingbook.bib", "toc": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 2 }