{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Greybox Fuzzing with Grammars\n",
"\n",
"In this chapter, we introduce important extensions to our syntactic fuzzing techniques, all leveraging _syntactic_ parts of _existing inputs_.\n",
"\n",
"1. We show how to leverage _dictionaries_ of input fragments during fuzzing. The idea is to integrate such dictionaries into a _mutator_, which would then inject these fragments (typically keywords and other items of importance) into population.\n",
"\n",
"2. We show how to combine [parsing](Parser.ipynb) and [fuzzing](GrammarFuzzer.ipynb) with grammars. This allows _mutating_ existing inputs while preserving syntactical correctness, and to _reuse_ fragments from existing inputs while generating new ones. The combination of language-based parsing and generating, as demonstrated in this chapter, has been highly successful in practice: The _LangFuzz_ fuzzer for JavaScript has found more than 2,600 bugs in JavaScript interpreters this way.\n",
"\n",
"3. In the previous chapters, we have used grammars in a _black-box_ manner – that is, we have used them to generate inputs regardless of the program being tested. In this chapter, we introduce mutational _greybox fuzzing with grammars_: Techniques that make use of _feedback from the program under test_ to guide test generations towards specific goals. As in [lexical greybox fuzzing](GreyboxFuzzer.ipynb), this feedback is mostly _coverage_, allowing us to direct grammar-based testing towards uncovered code parts. This part is inspired by the _AFLSmart_ fuzzer, which combines parsing and mutational fuzzing."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2024-01-18T17:17:10.688611Z",
"iopub.status.busy": "2024-01-18T17:17:10.688131Z",
"iopub.status.idle": "2024-01-18T17:17:10.753528Z",
"shell.execute_reply": "2024-01-18T17:17:10.753220Z"
},
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from bookutils import YouTubeVideo\n",
"YouTubeVideo('hSGzcjUj7Vs')"
]
},
{
"cell_type": "markdown",
"metadata": {
"button": false,
"new_sheet": false,
"run_control": {
"read_only": false
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Prerequisites**\n",
"\n",
"* We build on several concepts from [the chapter on greybox fuzzing (without grammars)](GreyboxFuzzer.ipynb).\n",
"* As the title suggests, you should know how to fuzz with grammars [from the chapter on grammars](Grammars.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"## Synopsis\n",
"\n",
"\n",
"To [use the code provided in this chapter](Importing.ipynb), write\n",
"\n",
"```python\n",
">>> from fuzzingbook.GreyboxGrammarFuzzer import \n",
"```\n",
"\n",
"and then make use of the following features.\n",
"\n",
"\n",
"This chapter introduces advanced methods for language-based grey-box fuzzing inspired by the _LangFuzz_ and _AFLSmart_ fuzzers.\n",
"\n",
"### Fuzzing with Dictionaries\n",
"\n",
"Rather than mutating strings randomly, the `DictMutator` class allows inserting tokens from a dictionary, thus increasing fuzzer performance. The dictionary comes as a list of strings, out of which random elements are picked and inserted - in addition to the given mutations such as deleting or inserting individual bytes.\n",
"\n",
"```python\n",
">>> dict_mutator = DictMutator([\"\", \"\", \"\", \"='a'\"])\n",
">>> seeds = [\"HelloWorld
\"]\n",
">>> for i in range(10):\n",
">>> print(dict_mutator.mutate(seeds[0]))\n",
"He='a'lloWorld
\n",
"HelloWorld
\n",
"Hello>body>World