{ "cells": [ { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Reducing Failure-Inducing Inputs\n", "\n", "A standard problem in debugging is this: Your program fails after processing some large input. Only a _part_ of this input, however, is responsible for the failure. _Reducing_ the input to a failure-inducing minimum not only eases debugging – it also helps in understanding why and when the program fails. In this chapter, we present techniques that _automatically reduce and simplify failure-inducing inputs to a minimum_, notably the popular _Delta Debugging_ technique." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.682025Z", "iopub.status.busy": "2023-11-12T12:40:47.681898Z", "iopub.status.idle": "2023-11-12T12:40:47.723987Z", "shell.execute_reply": "2023-11-12T12:40:47.723586Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from bookutils import YouTubeVideo\n", "YouTubeVideo(\"6fmJ5l257bM\")" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "**Prerequisites**\n", "\n", "* Using the \"delta debugging\" technique for reduction has no specific prerequisites.\n", "* To understand the `DeltaDebugger` implementation, reading [the chapter on tracing](Tracer.ipynb) is recommended." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "This chapter is adapted from [a similar chapter in \"The Fuzzing Book\"](https://www.fuzzingbook.org/html/Reducer.html). The material has been adapted to be independent of the `fuzzingbook` infrastructure, to build on general delta debugging (`dd`), and to provide a simpler invocation interface." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## Synopsis\n", "\n", "\n", "To [use the code provided in this chapter](Importing.ipynb), write\n", "\n", "```python\n", ">>> from debuggingbook.DeltaDebugger import \n", "```\n", "\n", "and then make use of the following features.\n", "\n", "\n", "A _reducer_ takes a failure-inducing input and reduces it to the minimum that still reproduces the failure. This chapter provides a `DeltaDebugger` class that implements such a reducer.\n", "\n", "Here is a simple example: An arithmetic expression causes an error in the Python interpreter:\n", "\n", "```python\n", ">>> def myeval(inp: str) -> Any:\n", ">>> return eval(inp)\n", ">>> with ExpectError(ZeroDivisionError):\n", ">>> myeval('1 + 2 * 3 / 0')\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4002351332.py\", line 2, in \n", " myeval('1 + 2 * 3 / 0')\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2200911420.py\", line 2, in myeval\n", " return eval(inp)\n", " File \"\", line 1, in \n", "ZeroDivisionError: division by zero (expected)\n", "\n", "```\n", "Can we reduce this input to a minimum? _Delta Debugging_ is a simple and robust reduction algorithm. We provide a `DeltaDebugger` class that is used in conjunction with a (failing) function call:\n", "\n", "```python\n", "with DeltaDebugger() as dd:\n", " fun(args...)\n", "dd\n", "```\n", "\n", "The class automatically determines minimal arguments that cause the function to fail with the same exception as the original. Printing out the class object reveals the minimized call.\n", "\n", "```python\n", ">>> with DeltaDebugger() as dd:\n", ">>> myeval('1 + 2 * 3 / 0')\n", ">>> dd\n", "myeval(inp='3/0')\n", "```\n", "The input is reduced to the minimum: We get the essence of the division by zero.\n", "\n", "There also is an interface to access the reduced input(s) programmatically. The method `min_args()` returns a dictionary in which all function arguments are minimized:\n", "\n", "```python\n", ">>> dd.min_args()\n", "{'inp': '3/0'}\n", "```\n", "In contrast, `max_args()` returns a dictionary in which all function arguments are maximized, but still pass:\n", "\n", "```python\n", ">>> dd.max_args()\n", "{'inp': '1 + 2 * 3 '}\n", "```\n", "The method `min_arg_diff()` returns a triple of \n", "* passing input,\n", "* failing input, and\n", "* their minimal failure-inducing difference:\n", "\n", "```python\n", ">>> dd.min_arg_diff()\n", "({'inp': ' 3 '}, {'inp': ' 3 /0'}, {'inp': '/0'})\n", "```\n", "And you can also access the function itself, as well as its original arguments.\n", "\n", "```python\n", ">>> dd.function().__name__, dd.args()\n", "('myeval', {'inp': '1 + 2 * 3 / 0'})\n", "```\n", "`DeltaDebugger` processes (i.e., minimizes or maximizes) all arguments that support a `len()` operation and that can be indexed – notably _strings_ and _lists_. If a function has multiple arguments, all arguments that can be processed will be processed.\n", "\n", "This chapter also provides a number of superclasses to `DeltaDebugger`, notably `CallCollector`, which obtains the first function call for `DeltaDebugger`. `CallReducer` classes allow for implementing alternate call reduction strategies.\n", "\n", "![](PICS/DeltaDebugger-synopsis-1.svg)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Why Reducing?\n", "\n", "A common problem in debugging is that given an input, only a _small part of that input may be responsible for the failure_. A central part of debugging is to _identify_ these parts – and to simplify (or _reduce_) the input to a minimal form that reproduces the failure – but does and contains as little else as possible." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Here's an example of such a situation. We have a `mystery()` method that – given its code – can occasionally fail. But under which circumstances does this actually happen? We have deliberately obscured the exact condition in order to make this non-obvious." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "button": false, "execution": { "iopub.execute_input": "2023-11-12T12:40:47.744884Z", "iopub.status.busy": "2023-11-12T12:40:47.744593Z", "iopub.status.idle": "2023-11-12T12:40:47.747162Z", "shell.execute_reply": "2023-11-12T12:40:47.746889Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import bookutils.setup" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.748708Z", "iopub.status.busy": "2023-11-12T12:40:47.748577Z", "iopub.status.idle": "2023-11-12T12:40:47.847796Z", "shell.execute_reply": "2023-11-12T12:40:47.847459Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import Tracer" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.849755Z", "iopub.status.busy": "2023-11-12T12:40:47.849613Z", "iopub.status.idle": "2023-11-12T12:40:47.851437Z", "shell.execute_reply": "2023-11-12T12:40:47.851143Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import quiz" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.853113Z", "iopub.status.busy": "2023-11-12T12:40:47.853000Z", "iopub.status.idle": "2023-11-12T12:40:47.855035Z", "shell.execute_reply": "2023-11-12T12:40:47.854740Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def mystery(inp: str) -> None:\n", " x = inp.find(chr(0o17 + 0o31))\n", " y = inp.find(chr(0o27 + 0o22))\n", " if x >= 0 and y >= 0 and x < y:\n", " raise ValueError(\"Invalid input\")\n", " else:\n", " pass" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To find an input that causes the function to fail, let us _fuzz_ it – that is, feed it with random inputs – until we find a failing input. There are [entire books about fuzzing](https://fuzzingbook.org); but here, a very simple `fuzz()` function for this purpose will already suffice." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To build a fuzzer, we need random inputs – and thus a source for randomness. The function `random.randrange(a, b)` returns a random number in the range (a, b)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.856564Z", "iopub.status.busy": "2023-11-12T12:40:47.856454Z", "iopub.status.idle": "2023-11-12T12:40:47.858071Z", "shell.execute_reply": "2023-11-12T12:40:47.857807Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import random" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.859569Z", "iopub.status.busy": "2023-11-12T12:40:47.859457Z", "iopub.status.idle": "2023-11-12T12:40:47.861873Z", "shell.execute_reply": "2023-11-12T12:40:47.861616Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "107" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "random.randrange(32, 128)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can use `random.randrange()` to compose random (printable) characters:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.863385Z", "iopub.status.busy": "2023-11-12T12:40:47.863280Z", "iopub.status.idle": "2023-11-12T12:40:47.865218Z", "shell.execute_reply": "2023-11-12T12:40:47.864937Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def fuzz() -> str:\n", " length = random.randrange(10, 70)\n", " fuzz = \"\"\n", " for i in range(length):\n", " fuzz += chr(random.randrange(32, 127))\n", " return fuzz" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here are some random strings produced by our `fuzz()` function:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.866837Z", "iopub.status.busy": "2023-11-12T12:40:47.866728Z", "iopub.status.idle": "2023-11-12T12:40:47.868826Z", "shell.execute_reply": "2023-11-12T12:40:47.868558Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "'N&+slk%hyp5'\n", "\"'@[3(rW*M5W]tMFPU4\\\\P@tz%[X?uo\\\\1?b4T;1bDeYtHx #UJ5\"\n", "'w}pMmPodJM,_%%BC~dYN6*g|Y*Ou9I:\\\\=V\n", " mystery(failing_input)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4141878445.py\", line 5, in mystery\n", " raise ValueError(\"Invalid input\")\n", "ValueError: Invalid input (expected)\n" ] } ], "source": [ "with ExpectError(ValueError):\n", " mystery(failing_input)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Something in this input causes `mystery()` to fail. But what is it?" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Manual Input Reduction\n", "\n", "One important step in the debugging process is _reduction_ – that is, to identify those circumstances of a failure that are relevant for the failure to occur, and to _omit_ (if possible) those parts that are not. As Kernighan and Pike \\cite{Kernighan1999} put it:\n", "\n", "> For every circumstance of the problem, check whether it is relevant for the problem to occur. If it is not, remove it from the problem report or the test case in question." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Specifically for inputs, they suggest a _divide and conquer_ process:\n", "\n", "> Proceed by binary search. Throw away half the input and see if the output is still wrong; if not, go back to the previous state and discard the other half of the input.\n", "\n", "This is something we can easily try out, using our last generated input:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.887355Z", "iopub.status.busy": "2023-11-12T12:40:47.887252Z", "iopub.status.idle": "2023-11-12T12:40:47.889296Z", "shell.execute_reply": "2023-11-12T12:40:47.889020Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'V\"/+!aF-(V4EOz*+s/Q,7)2@0_'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "failing_input" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For instance, we can see whether the error still occurs if we only feed in the first half:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.890933Z", "iopub.status.busy": "2023-11-12T12:40:47.890824Z", "iopub.status.idle": "2023-11-12T12:40:47.893037Z", "shell.execute_reply": "2023-11-12T12:40:47.892790Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'V\"/+!aF-(V4EO'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "half_length = len(failing_input) // 2 # // is integer division\n", "first_half = failing_input[:half_length]\n", "first_half" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.894404Z", "iopub.status.busy": "2023-11-12T12:40:47.894293Z", "iopub.status.idle": "2023-11-12T12:40:47.895827Z", "shell.execute_reply": "2023-11-12T12:40:47.895574Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with ExpectError(ValueError):\n", " mystery(first_half)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Nope – the first half alone does not suffice. Maybe the second half?" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.897114Z", "iopub.status.busy": "2023-11-12T12:40:47.897035Z", "iopub.status.idle": "2023-11-12T12:40:47.899474Z", "shell.execute_reply": "2023-11-12T12:40:47.899003Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'z*+s/Q,7)2@0_'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "second_half = failing_input[half_length:]\n", "assert first_half + second_half == failing_input\n", "second_half" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.901209Z", "iopub.status.busy": "2023-11-12T12:40:47.901090Z", "iopub.status.idle": "2023-11-12T12:40:47.902800Z", "shell.execute_reply": "2023-11-12T12:40:47.902537Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with ExpectError(ValueError):\n", " mystery(second_half)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This did not go so well either. We may still proceed by cutting away _smaller chunks_ – say, one character after another. If our test is deterministic and easily repeated, it is clear that this process eventually will yield a reduced input. But still, it is a rather inefficient process, especially for long inputs. What we need is a _strategy_ that effectively minimizes a failure-inducing input – a strategy that can be automated." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Delta Debugging" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "One strategy to effectively reduce failure-inducing inputs is _delta debugging_ \\cite{Zeller2002}. Delta Debugging implements the \"binary search\" strategy, as listed above, but with a twist: If neither half fails (also as above), it keeps on cutting away smaller and smaller chunks from the input, until it eliminates individual characters. Thus, after cutting away the first half, we cut away\n", "the first quarter, the second quarter, and so on." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us illustrate this on our example, and see what happens if we cut away the first quarter." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.904497Z", "iopub.status.busy": "2023-11-12T12:40:47.904369Z", "iopub.status.idle": "2023-11-12T12:40:47.906430Z", "shell.execute_reply": "2023-11-12T12:40:47.906177Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'F-(V4EOz*+s/Q,7)2@0_'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarter_length = len(failing_input) // 4\n", "input_without_first_quarter = failing_input[quarter_length:]\n", "input_without_first_quarter" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.908027Z", "iopub.status.busy": "2023-11-12T12:40:47.907921Z", "iopub.status.idle": "2023-11-12T12:40:47.909966Z", "shell.execute_reply": "2023-11-12T12:40:47.909700Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2963114098.py\", line 2, in \n", " mystery(input_without_first_quarter)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4141878445.py\", line 5, in mystery\n", " raise ValueError(\"Invalid input\")\n", "ValueError: Invalid input (expected)\n" ] } ], "source": [ "with ExpectError(ValueError):\n", " mystery(input_without_first_quarter)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Ah! This has failed, and reduced our failing input by 25%. Let's remove another quarter." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.911615Z", "iopub.status.busy": "2023-11-12T12:40:47.911508Z", "iopub.status.idle": "2023-11-12T12:40:47.913608Z", "shell.execute_reply": "2023-11-12T12:40:47.913311Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Oz*+s/Q,7)2@0_'" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "input_without_first_and_second_quarter = failing_input[quarter_length * 2:]\n", "input_without_first_and_second_quarter" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.915046Z", "iopub.status.busy": "2023-11-12T12:40:47.914941Z", "iopub.status.idle": "2023-11-12T12:40:47.916655Z", "shell.execute_reply": "2023-11-12T12:40:47.916428Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "with ExpectError(ValueError):\n", " mystery(input_without_first_and_second_quarter)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is not too surprising, as we had that one before:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.918247Z", "iopub.status.busy": "2023-11-12T12:40:47.918127Z", "iopub.status.idle": "2023-11-12T12:40:47.920115Z", "shell.execute_reply": "2023-11-12T12:40:47.919844Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'z*+s/Q,7)2@0_'" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "second_half" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.921587Z", "iopub.status.busy": "2023-11-12T12:40:47.921471Z", "iopub.status.idle": "2023-11-12T12:40:47.923497Z", "shell.execute_reply": "2023-11-12T12:40:47.923234Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Oz*+s/Q,7)2@0_'" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "input_without_first_and_second_quarter" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "How about removing the third quarter, then?" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.925002Z", "iopub.status.busy": "2023-11-12T12:40:47.924894Z", "iopub.status.idle": "2023-11-12T12:40:47.927119Z", "shell.execute_reply": "2023-11-12T12:40:47.926863Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'F-(V4EQ,7)2@0_'" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "input_without_first_and_third_quarter = failing_input[quarter_length:\n", " quarter_length * 2] + failing_input[quarter_length * 3:]\n", "input_without_first_and_third_quarter" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.928669Z", "iopub.status.busy": "2023-11-12T12:40:47.928549Z", "iopub.status.idle": "2023-11-12T12:40:47.930334Z", "shell.execute_reply": "2023-11-12T12:40:47.930062Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4218135276.py\", line 2, in \n", " mystery(input_without_first_and_third_quarter)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4141878445.py\", line 5, in mystery\n", " raise ValueError(\"Invalid input\")\n", "ValueError: Invalid input (expected)\n" ] } ], "source": [ "with ExpectError(ValueError):\n", " mystery(input_without_first_and_third_quarter)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Yes! This has succeeded. Our input is now 50% smaller." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We have now tried to remove pieces that make up $\\frac{1}{2}$ and $\\frac{1}{4}$ of the original failing string. In the next iteration, we would go and remove even smaller pieces – $\\frac{1}{8}$, $\\frac{1}{16}$ and so on. We continue until we are down to $\\frac{1}{26}$ – that is, individual characters." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "However, this is something we happily let a computer do for us – and this is what the _Delta Debugging_ algorithm does. Delta Debugging implements the strategy sketched above: It first removes larger chunks of size $\\frac{1}{2}$; if this does not fail, then we proceed to chunks of size $\\frac{1}{4}$, then $\\frac{1}{8}$ and so on." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our `ddmin()` implementation uses the exact same Python code as Zeller in \\cite{Zeller2002}; the only difference is that it has been adapted to work on Python 3. The variable `n` (initially 2) indicates the granularity – in each step, chunks of size $\\frac{1}{n}$ are cut away. If none of the test fails (`some_complement_is_failing` is False), then `n` is doubled – until it reaches the length of the input." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.931873Z", "iopub.status.busy": "2023-11-12T12:40:47.931768Z", "iopub.status.idle": "2023-11-12T12:40:47.933392Z", "shell.execute_reply": "2023-11-12T12:40:47.933144Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "PASS = 'PASS'\n", "FAIL = 'FAIL'\n", "UNRESOLVED = 'UNRESOLVED'" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.934761Z", "iopub.status.busy": "2023-11-12T12:40:47.934668Z", "iopub.status.idle": "2023-11-12T12:40:47.936442Z", "shell.execute_reply": "2023-11-12T12:40:47.936184Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# ignore\n", "from typing import Sequence, Any, Callable, Optional, Type, Tuple\n", "from typing import Dict, Union, Set, List, FrozenSet, cast" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.937780Z", "iopub.status.busy": "2023-11-12T12:40:47.937696Z", "iopub.status.idle": "2023-11-12T12:40:47.940771Z", "shell.execute_reply": "2023-11-12T12:40:47.940492Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ddmin(test: Callable, inp: Sequence, *test_args: Any) -> Sequence:\n", " \"\"\"Reduce the input inp, using the outcome of test(fun, inp).\"\"\"\n", " assert test(inp, *test_args) != PASS\n", "\n", " n = 2 # Initial granularity\n", " while len(inp) >= 2:\n", " start = 0\n", " subset_length = int(len(inp) / n)\n", " some_complement_is_failing = False\n", "\n", " while start < len(inp):\n", " complement = (inp[:int(start)] + inp[int(start + subset_length):]) # type: ignore\n", "\n", " if test(complement, *test_args) == FAIL:\n", " inp = complement\n", " n = max(n - 1, 2)\n", " some_complement_is_failing = True\n", " break\n", "\n", " start += subset_length\n", "\n", " if not some_complement_is_failing:\n", " if n == len(inp):\n", " break\n", " n = min(n * 2, len(inp))\n", "\n", " return inp" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To see how `ddmin()` works, let us run it on our failing input. We need to define a `test` function that returns PASS or FAIL, depending on the test outcome. This `generic_test()` assumes that the function fails if it raises an exception (such as an `AssertException`), and passes otherwise. The optional argument `expected_exc` specifies the name of exception to be checked for; this ensures we reduce only for the kind of error raised in the original failure." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.942277Z", "iopub.status.busy": "2023-11-12T12:40:47.942179Z", "iopub.status.idle": "2023-11-12T12:40:47.944760Z", "shell.execute_reply": "2023-11-12T12:40:47.944511Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def generic_test(inp: Sequence, fun: Callable,\n", " expected_exc: Optional[Type] = None) -> str:\n", " result = None\n", " detail = \"\"\n", " try:\n", " result = fun(inp)\n", " outcome = PASS\n", " except Exception as exc:\n", " detail = f\" ({type(exc).__name__}: {str(exc)})\"\n", " if expected_exc is None:\n", " outcome = FAIL\n", " elif type(exc) == type(expected_exc) and str(exc) == str(expected_exc):\n", " outcome = FAIL\n", " else:\n", " outcome = UNRESOLVED\n", "\n", " print(f\"{fun.__name__}({repr(inp)}): {outcome}{detail}\")\n", " return outcome" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can now invoke `ddmin()` in our setting. With each step, we see how the remaining input gets smaller and smaller, until only two characters remain:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.946306Z", "iopub.status.busy": "2023-11-12T12:40:47.946197Z", "iopub.status.idle": "2023-11-12T12:40:47.948571Z", "shell.execute_reply": "2023-11-12T12:40:47.948276Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mystery('V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "mystery('z*+s/Q,7)2@0_'): PASS\n", "mystery('V\"/+!aF-(V4EO'): PASS\n", "mystery('F-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "mystery('Oz*+s/Q,7)2@0_'): PASS\n", "mystery('F-(V4EQ,7)2@0_'): FAIL (ValueError: Invalid input)\n", "mystery(',7)2@0_'): PASS\n", "mystery('F-(V4EQ'): PASS\n", "mystery('V4EQ,7)2@0_'): PASS\n", "mystery('F-(Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "mystery('Q,7)2@0_'): PASS\n", "mystery('F-()2@0_'): FAIL (ValueError: Invalid input)\n", "mystery('2@0_'): PASS\n", "mystery('F-()'): FAIL (ValueError: Invalid input)\n", "mystery('()'): FAIL (ValueError: Invalid input)\n", "mystery(')'): PASS\n", "mystery('('): PASS\n" ] }, { "data": { "text/plain": [ "'()'" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ddmin(generic_test, failing_input, mystery, ValueError('Invalid input'))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Now we know why `mystery()` fails – it suffices that the input contains two matching parentheses. Delta Debugging determines this in 25 steps. Its result is _1-minimal_, meaning that every character contained is required to produce the error; removing any (as seen in the last two tests, above) no longer causes the test to fail. This property is guaranteed by the delta debugging algorithm, which in its last stage always tries to delete characters one by one." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "A reduced test case such as the one above has many advantages:\n", "\n", "* A reduced test case __reduces the _cognitive load_ of the programmer__. The test case is shorter and focused, and thus does not burden the programmer with irrelevant details. A reduced input typically leads to shorter executions and smaller program states, both of which reduce the search space as it comes to understanding the bug. In our case, we have eliminated lots of irrelevant input – only the two characters the reduced input contains are relevant.\n", "\n", "* A reduced test case __is easier to communicate__. All one needs here is the summary: `mystery() fails on \"()\"`, which is much better than `mystery() fails on a 4100-character input (attached)`.\n", "\n", "* A reduced test case helps in __identifying duplicates__. If similar bugs have been reported already, and all of them have been reduced to the same cause (namely that the input contains matching parentheses), then it becomes obvious that all these bugs are different symptoms of the same underlying cause – and would all be resolved at once with one code fix." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "How effective is delta debugging? In the best case (when the left half or the right half fails), the number of tests is logarithmic proportional to the length $n$ of an input (i.e., $O(\\log_2 n)$); this is the same complexity as binary search. In the worst case, though, delta debugging can require a number of tests proportional to $n^2$ (i.e., $O(n^2)$) – this happens in the case when we are down to character granularity, and we have to repeatedly tried to delete all characters, only to find that deleting the last character results in a failure \\cite{Zeller2002}. (This is a pretty pathological situation, though.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In general, delta debugging is a robust algorithm that is easy to implement, easy to deploy, and easy to use – provided that the underlying test case is deterministic and runs quickly enough to warrant a number of experiments. In general, any debugging task should start with simplifying the test case as much as possible – and this is where delta debugging can help." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## A Simple DeltaDebugger Interface\n", "\n", "As defined above, using `ddmin()` still requires the developer to set up a special testing function – and writing or using even a generic tester (like `generic_test()`) takes some effort. We want to simplify the setup such that only two lines of Python is required.\n", "\n", "Our aim is to have a `DeltaDebugger` class that we can use in conjunction with a failing (i.e., exception raising) function call:\n", "\n", "```python\n", "with DeltaDebugger() as dd:\n", " mystery(failing_input)\n", "dd\n", "```\n", "Here, at the end of the `with` statement, printing out `dd` shows us the minimal input that causes the failure." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Excursion: Implementing DeltaDebugger" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Our interface consist of six building blocks:\n", "\n", "1. We collect the name and args of the first call in the `with` body, as well as the exception it raises.\n", "2. We set up an infrastructure such that we can repeat calls with different arguments.\n", "3. We make sure that multiple tests with the same arguments can return outcomes from a cache.\n", "4. We create a `DeltaDebugger` class that implements the general Delta Debugging algorithm – an algorithm than can minimize failing inputs as well as maximize passing inputs.\n", "5. We provide an infrastructure that applies Delta Debugging on multiple arguments.\n", "6. Finally, custom methods like `min_args()` allow invoking delta debugging on arguments." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Collecting a Call\n", "\n", "We start by creating an infrastructure that collects a call. The `CallCollector` class saves the first call observed in `_function`, `_args`, and `_exception` attributes, respectively; it then turns tracing off." ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.950670Z", "iopub.status.busy": "2023-11-12T12:40:47.950459Z", "iopub.status.idle": "2023-11-12T12:40:47.952277Z", "shell.execute_reply": "2023-11-12T12:40:47.952003Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import sys" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.953818Z", "iopub.status.busy": "2023-11-12T12:40:47.953703Z", "iopub.status.idle": "2023-11-12T12:40:47.955333Z", "shell.execute_reply": "2023-11-12T12:40:47.955084Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from types import FunctionType, FrameType, TracebackType" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.956679Z", "iopub.status.busy": "2023-11-12T12:40:47.956593Z", "iopub.status.idle": "2023-11-12T12:40:47.958445Z", "shell.execute_reply": "2023-11-12T12:40:47.958172Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from StackInspector import StackInspector" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.960148Z", "iopub.status.busy": "2023-11-12T12:40:47.960030Z", "iopub.status.idle": "2023-11-12T12:40:47.961684Z", "shell.execute_reply": "2023-11-12T12:40:47.961446Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class NoCallError(ValueError):\n", " pass" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.963044Z", "iopub.status.busy": "2023-11-12T12:40:47.962959Z", "iopub.status.idle": "2023-11-12T12:40:47.967674Z", "shell.execute_reply": "2023-11-12T12:40:47.967430Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CallCollector(StackInspector):\n", " \"\"\"\n", " Collect an exception-raising function call f().\n", " Use as `with CallCollector(): f()`\n", " \"\"\"\n", "\n", " def __init__(self) -> None:\n", " \"\"\"Initialize collector\"\"\"\n", " self.init()\n", "\n", " def init(self) -> None:\n", " \"\"\"Reset for new collection.\"\"\"\n", " self._function: Optional[Callable] = None\n", " self._args: Dict[str, Any] = {}\n", " self._exception: Optional[BaseException] = None\n", " self.original_trace_function: Optional[Callable] = None\n", "\n", " def traceit(self, frame: FrameType, event: str, arg: Any) -> None:\n", " \"\"\"Tracing function. Collect first call, then turn tracing off.\"\"\"\n", " if event == 'call':\n", " name = frame.f_code.co_name\n", " if name.startswith('__'):\n", " # Internal function\n", " return\n", " if self._function is not None:\n", " # Already set\n", " return\n", "\n", " func = self.search_func(name, frame)\n", " if func:\n", " self._function = func\n", " else:\n", " # Create new function from given code\n", " self._function = self.create_function(frame)\n", "\n", " self._args = {} # Create a local copy of args\n", " for var in frame.f_locals:\n", " if var in frame.f_code.co_freevars:\n", " continue # Local var, not an argument\n", " self._args[var] = frame.f_locals[var]\n", "\n", " # Turn tracing off\n", " sys.settrace(self.original_trace_function)\n", "\n", " def after_collection(self) -> None:\n", " \"\"\"Called after collection. To be defined in subclasses.\"\"\"\n", " pass\n", "\n", " def args(self) -> Dict[str, Any]:\n", " \"\"\"Return the dictionary of collected arguments.\"\"\"\n", " return self._args\n", "\n", " def function(self) -> Callable:\n", " \"\"\"Return the function called.\"\"\"\n", " if self._function is None:\n", " raise NoCallError(\"No function call collected\")\n", " return self._function\n", "\n", " def exception(self) -> Optional[BaseException]:\n", " \"\"\"Return the exception produced, or `None` if none.\"\"\"\n", " return self._exception\n", "\n", " def format_call(self, args: Optional[Dict[str, Any]] = None) -> str: # type: ignore\n", " ...\n", "\n", " def format_exception(self, exc: Optional[BaseException] = None) -> str: # type: ignore\n", " ...\n", "\n", " def call(self, new_args: Optional[Dict[str, Any]] = None) -> Any: # type: ignore\n", " ..." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A `CallCollector` is used like a `Tracer` from the [chapter on tracing](Tracer.ipynb), using a `with` block to collect a single function call." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.969384Z", "iopub.status.busy": "2023-11-12T12:40:47.969290Z", "iopub.status.idle": "2023-11-12T12:40:47.972012Z", "shell.execute_reply": "2023-11-12T12:40:47.971749Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CallCollector(CallCollector):\n", " def __enter__(self) -> Any:\n", " \"\"\"Called at begin of `with` block. Turn tracing on.\"\"\"\n", " self.init()\n", " self.original_trace_function = sys.gettrace()\n", " sys.settrace(self.traceit)\n", " return self\n", "\n", " def __exit__(self, exc_tp: Type, exc_value: BaseException,\n", " exc_traceback: TracebackType) -> Optional[bool]:\n", " \"\"\"Called at end of `with` block. Turn tracing off.\"\"\"\n", " sys.settrace(self.original_trace_function)\n", "\n", " if not self._function:\n", " if exc_tp:\n", " return False # re-raise exception\n", " else:\n", " raise NoCallError(\"No call collected\")\n", "\n", " if self.is_internal_error(exc_tp, exc_value, exc_traceback):\n", " return False # Re-raise exception\n", "\n", " self._exception = exc_value\n", " self.after_collection()\n", " return True # Ignore exception" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here are the attributes as collected by `CallCollector` for our `mystery()` function. Note that the `mystery()` exception is \"swallowed\" by the `CallCollector`." ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.973440Z", "iopub.status.busy": "2023-11-12T12:40:47.973338Z", "iopub.status.idle": "2023-11-12T12:40:47.974912Z", "shell.execute_reply": "2023-11-12T12:40:47.974666Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with CallCollector() as call_collector:\n", " mystery(failing_input)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.976261Z", "iopub.status.busy": "2023-11-12T12:40:47.976181Z", "iopub.status.idle": "2023-11-12T12:40:47.978353Z", "shell.execute_reply": "2023-11-12T12:40:47.978077Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ " None>" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "call_collector.function()" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.979915Z", "iopub.status.busy": "2023-11-12T12:40:47.979808Z", "iopub.status.idle": "2023-11-12T12:40:47.981830Z", "shell.execute_reply": "2023-11-12T12:40:47.981559Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'inp': 'V\"/+!aF-(V4EOz*+s/Q,7)2@0_'}" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "call_collector.args()" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.983559Z", "iopub.status.busy": "2023-11-12T12:40:47.983337Z", "iopub.status.idle": "2023-11-12T12:40:47.985479Z", "shell.execute_reply": "2023-11-12T12:40:47.985244Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "ValueError('Invalid input')" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "call_collector.exception()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If an error occurs _before_ the first function call takes place, the exception is simply re-raised." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.986930Z", "iopub.status.busy": "2023-11-12T12:40:47.986829Z", "iopub.status.idle": "2023-11-12T12:40:47.988695Z", "shell.execute_reply": "2023-11-12T12:40:47.988433Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/700237621.py\", line 3, in \n", " some_error() # type: ignore\n", "NameError: name 'some_error' is not defined (expected)\n" ] } ], "source": [ "with ExpectError(NameError):\n", " with CallCollector() as c:\n", " some_error() # type: ignore" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Repeating a Call\n", "\n", "Our second step is an infrastructure such that we can call the function collected earlier with alternate arguments. We can call the function directly via the collected `_function` attribute:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.990278Z", "iopub.status.busy": "2023-11-12T12:40:47.990058Z", "iopub.status.idle": "2023-11-12T12:40:47.991957Z", "shell.execute_reply": "2023-11-12T12:40:47.991625Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "call_collector.function()(\"foo\")" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.993478Z", "iopub.status.busy": "2023-11-12T12:40:47.993359Z", "iopub.status.idle": "2023-11-12T12:40:47.995186Z", "shell.execute_reply": "2023-11-12T12:40:47.994939Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/3898025015.py\", line 2, in \n", " call_collector.function()(failing_input)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4141878445.py\", line 5, in mystery\n", " raise ValueError(\"Invalid input\")\n", "ValueError: Invalid input (expected)\n" ] } ], "source": [ "with ExpectError(ValueError):\n", " call_collector.function()(failing_input)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can also provide the arguments collected during the call:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.996599Z", "iopub.status.busy": "2023-11-12T12:40:47.996486Z", "iopub.status.idle": "2023-11-12T12:40:47.998165Z", "shell.execute_reply": "2023-11-12T12:40:47.997934Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/1915817655.py\", line 2, in \n", " call_collector.function()(**call_collector.args())\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4141878445.py\", line 5, in mystery\n", " raise ValueError(\"Invalid input\")\n", "ValueError: Invalid input (expected)\n" ] } ], "source": [ "with ExpectError(ValueError):\n", " call_collector.function()(**call_collector.args())" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our `call()` method calls the collected function using this construct. It also allows changing_ individual arguments by providing a `new_args` dictionary of variable names to new values." ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:47.999703Z", "iopub.status.busy": "2023-11-12T12:40:47.999569Z", "iopub.status.idle": "2023-11-12T12:40:48.001941Z", "shell.execute_reply": "2023-11-12T12:40:48.001693Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CallCollector(CallCollector):\n", " def call(self, new_args: Optional[Dict[str, Any]] = None) -> Any:\n", " \"\"\"\n", " Call collected function. If `new_args` is given,\n", " override arguments from its {var: value} entries.\n", " \"\"\"\n", "\n", " if new_args is None:\n", " new_args = {}\n", "\n", " args = {} # Create local copy\n", " for var in self.args():\n", " args[var] = self.args()[var]\n", " for var in new_args:\n", " args[var] = new_args[var]\n", "\n", " return self.function()(**args)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Using simply `call()` without arguments reproduces the failure:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.003331Z", "iopub.status.busy": "2023-11-12T12:40:48.003224Z", "iopub.status.idle": "2023-11-12T12:40:48.005256Z", "shell.execute_reply": "2023-11-12T12:40:48.005022Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/460230734.py\", line 4, in \n", " call_collector.call()\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/3952656613.py\", line 17, in call\n", " return self.function()(**args)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4141878445.py\", line 5, in mystery\n", " raise ValueError(\"Invalid input\")\n", "ValueError: Invalid input (expected)\n" ] } ], "source": [ "with CallCollector() as call_collector:\n", " mystery(failing_input)\n", "with ExpectError(ValueError):\n", " call_collector.call()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can also supply alternate arguments (and get alternate outcomes):" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.006742Z", "iopub.status.busy": "2023-11-12T12:40:48.006641Z", "iopub.status.idle": "2023-11-12T12:40:48.008225Z", "shell.execute_reply": "2023-11-12T12:40:48.007969Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "call_collector.call({'inp': 'foo'})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We close with two helper functions that come handy for logging and error messages:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.009677Z", "iopub.status.busy": "2023-11-12T12:40:48.009587Z", "iopub.status.idle": "2023-11-12T12:40:48.012189Z", "shell.execute_reply": "2023-11-12T12:40:48.011878Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CallCollector(CallCollector):\n", " def format_call(self, args: Optional[Dict[str, Any]] = None) -> str:\n", " \"\"\"Return a string representing a call of the function with given args.\"\"\"\n", " if args is None:\n", " args = self.args()\n", " return self.function().__name__ + \"(\" + \\\n", " \", \".join(f\"{arg}={repr(args[arg])}\" for arg in args) + \")\"\n", "\n", " def format_exception(self, exc: Optional[BaseException] = None) -> str:\n", " \"\"\"Return a string representing the given exception.\"\"\"\n", " if exc is None:\n", " exc = self.exception()\n", " s = type(exc).__name__\n", " if str(exc):\n", " s += \": \" + str(exc)\n", " return s" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.013664Z", "iopub.status.busy": "2023-11-12T12:40:48.013559Z", "iopub.status.idle": "2023-11-12T12:40:48.015186Z", "shell.execute_reply": "2023-11-12T12:40:48.014921Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "with CallCollector() as call_collector:\n", " mystery(failing_input)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.016698Z", "iopub.status.busy": "2023-11-12T12:40:48.016596Z", "iopub.status.idle": "2023-11-12T12:40:48.018543Z", "shell.execute_reply": "2023-11-12T12:40:48.018274Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'mystery(inp=\\'V\"/+!aF-(V4EOz*+s/Q,7)2@0_\\')'" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "call_collector.format_call()" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.020151Z", "iopub.status.busy": "2023-11-12T12:40:48.020042Z", "iopub.status.idle": "2023-11-12T12:40:48.022048Z", "shell.execute_reply": "2023-11-12T12:40:48.021754Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'ValueError: Invalid input'" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "call_collector.format_exception()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Testing, Logging, and Caching\n", "\n", "Our next to last step is an infrastructure that implements delta debugging for the collected call." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We first introduce a `CallReducer` class as an abstract superclass for all kinds of reducers.\n", "Its `run()` method tests the function and returns PASS, FAIL, or UNRESOLVED. As with `generic_test()`, above, we check for exception type and exact error message." ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.023596Z", "iopub.status.busy": "2023-11-12T12:40:48.023489Z", "iopub.status.idle": "2023-11-12T12:40:48.026424Z", "shell.execute_reply": "2023-11-12T12:40:48.026160Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CallReducer(CallCollector):\n", " def __init__(self, *, log: Union[bool, int] = False) -> None:\n", " \"\"\"Initialize. If `log` is True, enable logging.\"\"\"\n", " super().__init__()\n", " self.log = log\n", " self.reset()\n", "\n", " def reset(self) -> None:\n", " \"\"\"Reset the number of tests.\"\"\"\n", " self.tests = 0\n", "\n", " def run(self, args: Dict[str, Any]) -> str:\n", " \"\"\"\n", " Run collected function with `args`. Return\n", " * PASS if no exception occurred\n", " * FAIL if the collected exception occurred\n", " * UNRESOLVED if some other exception occurred.\n", " Not to be used directly; can be overloaded in subclasses.\n", " \"\"\"\n", " try:\n", " result = self.call(args)\n", " except Exception as exc:\n", " self.last_exception = exc\n", " if (type(exc) == type(self.exception()) and\n", " str(exc) == str(self.exception())):\n", " return FAIL\n", " else:\n", " return UNRESOLVED # Some other failure\n", "\n", " self.last_result = result\n", " return PASS" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `test()` method runs a single test (with logging, if wanted); the `reduce_arg()` method will eventually reduce an input to the minimum." ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.027981Z", "iopub.status.busy": "2023-11-12T12:40:48.027853Z", "iopub.status.idle": "2023-11-12T12:40:48.030669Z", "shell.execute_reply": "2023-11-12T12:40:48.030376Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CallReducer(CallReducer):\n", " def test(self, args: Dict[str, Any]) -> str:\n", " \"\"\"Like run(), but also log detail and keep statistics.\"\"\"\n", " outcome = self.run(args)\n", " if outcome == PASS:\n", " detail = \"\"\n", " else:\n", " detail = f\" ({self.format_exception(self.last_exception)})\"\n", "\n", " self.tests += 1\n", " if self.log:\n", " print(f\"Test #{self.tests} {self.format_call(args)}: {outcome}{detail}\")\n", "\n", " return outcome\n", "\n", " def reduce_arg(self, var_to_be_reduced: str, args: Dict[str, Any]) -> Sequence:\n", " \"\"\"\n", " Determine and return a minimal value for var_to_be_reduced.\n", " To be overloaded in subclasses.\n", " \"\"\"\n", " return args[var_to_be_reduced]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's some logging output from the `test()` function:" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.032141Z", "iopub.status.busy": "2023-11-12T12:40:48.032032Z", "iopub.status.idle": "2023-11-12T12:40:48.034877Z", "shell.execute_reply": "2023-11-12T12:40:48.034564Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test #1 mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #2 mystery(inp='123'): PASS\n", "Test #3 mystery(inp='123'): PASS\n" ] }, { "data": { "text/plain": [ "'PASS'" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with CallReducer(log=True) as reducer:\n", " mystery(failing_input)\n", "\n", "reducer.test({'inp': failing_input})\n", "reducer.test({'inp': '123'})\n", "reducer.test({'inp': '123'})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `CachingCallReducer` variant saves test results, such that we don't have to run the same tests again and again:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.037453Z", "iopub.status.busy": "2023-11-12T12:40:48.037345Z", "iopub.status.idle": "2023-11-12T12:40:48.040180Z", "shell.execute_reply": "2023-11-12T12:40:48.039851Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CachingCallReducer(CallReducer):\n", " \"\"\"Like CallReducer, but cache test outcomes.\"\"\"\n", "\n", " def init(self) -> None:\n", " super().init()\n", " self._cache: Dict[FrozenSet, str] = {}\n", "\n", " def test(self, args: Dict[str, Any]) -> str:\n", " # Create a hashable index\n", " try:\n", " index = frozenset((k, v) for k, v in args.items())\n", " except TypeError:\n", " index = frozenset()\n", "\n", " if not index:\n", " # Non-hashable value – do not use cache\n", " return super().test(args)\n", "\n", " if index in self._cache:\n", " return self._cache[index]\n", "\n", " outcome = super().test(args)\n", " self._cache[index] = outcome\n", "\n", " return outcome" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If we now repeat a test with the same argument, its outcome can be found in the cache:" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.041902Z", "iopub.status.busy": "2023-11-12T12:40:48.041768Z", "iopub.status.idle": "2023-11-12T12:40:48.044625Z", "shell.execute_reply": "2023-11-12T12:40:48.044376Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test #1 mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #2 mystery(inp='123'): PASS\n" ] }, { "data": { "text/plain": [ "'PASS'" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with CachingCallReducer(log=True) as reducer:\n", " mystery(failing_input)\n", "\n", "reducer.test({'inp': failing_input})\n", "reducer.test({'inp': '123'})\n", "reducer.test({'inp': '123'})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### General Delta Debugging\n", "\n", "The `DeltaDebugger` class finally implements Delta Debugging on arguments. Our implementation uses the _general_ `dd` delta debugging algorithm from \\cite{Zeller2002}. In contrast to `ddmin`, it returns a _pair_ of a maximized passing input and a minimized failing input. The algorithm can be customized, however, to leave the passing input fixed and _only_ to minimize the failing input (as with `ddmin`), or vice versa." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Internally, `dd` does not directly work on a list of elements; instead, it works on sets of _indices_ into such a list. The function `to_set()` converts a collection into such a set." ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.046157Z", "iopub.status.busy": "2023-11-12T12:40:48.046048Z", "iopub.status.idle": "2023-11-12T12:40:48.047746Z", "shell.execute_reply": "2023-11-12T12:40:48.047503Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def to_set(inp: Sequence) -> Set:\n", " \"\"\"Convert inp into a set of indices\"\"\"\n", " return set(range(len(inp)))" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.049069Z", "iopub.status.busy": "2023-11-12T12:40:48.048985Z", "iopub.status.idle": "2023-11-12T12:40:48.051134Z", "shell.execute_reply": "2023-11-12T12:40:48.050889Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{0, 1, 2, 3}" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "to_set(\"abcd\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The function `from_set()` converts a set of indices back into the original collection. For this, it uses a function `empty()` that returns an empty collection that has the same type as the given input `inp`." ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.052620Z", "iopub.status.busy": "2023-11-12T12:40:48.052518Z", "iopub.status.idle": "2023-11-12T12:40:48.054314Z", "shell.execute_reply": "2023-11-12T12:40:48.054049Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def empty(inp: Any) -> Any:\n", " \"\"\"Return an \"empty\" element of the same type as inp\"\"\"\n", " return type(inp)()" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.056157Z", "iopub.status.busy": "2023-11-12T12:40:48.055990Z", "iopub.status.idle": "2023-11-12T12:40:48.058676Z", "shell.execute_reply": "2023-11-12T12:40:48.058325Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "('', [], set())" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "empty(\"abc\"), empty([1, 2, 3]), empty({0, -1, -2})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The function `add_to()` tries out various ways to add an element to a given collection." ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.060613Z", "iopub.status.busy": "2023-11-12T12:40:48.060476Z", "iopub.status.idle": "2023-11-12T12:40:48.063538Z", "shell.execute_reply": "2023-11-12T12:40:48.063080Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def add_to(collection: Any, elem: Any) -> Any:\n", " \"\"\"Add element to collection; return new collection.\"\"\"\n", " if isinstance(collection, str):\n", " return collection + elem # Strings\n", "\n", " try: # Lists and other collections\n", " return collection + type(collection)([elem])\n", " except TypeError:\n", " pass\n", "\n", " try: # Sets\n", " return collection | type(collection)([elem])\n", " except TypeError:\n", " pass\n", "\n", " raise ValueError(\"Cannot add element to collection\")" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.065412Z", "iopub.status.busy": "2023-11-12T12:40:48.065299Z", "iopub.status.idle": "2023-11-12T12:40:48.067943Z", "shell.execute_reply": "2023-11-12T12:40:48.067700Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "('abcd', [1, 2, 3, 4], {1, 2, 3, 4})" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add_to(\"abc\", \"d\"), add_to([1, 2, 3], 4), add_to(set([1, 2, 3]), 4)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Using `empty()` and `add_to()`, we can now implement `from_set()`:" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.069663Z", "iopub.status.busy": "2023-11-12T12:40:48.069560Z", "iopub.status.idle": "2023-11-12T12:40:48.071604Z", "shell.execute_reply": "2023-11-12T12:40:48.071352Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def from_set(the_set: Any, inp: Sequence) -> Any:\n", " \"\"\"Convert a set of indices into `inp` back into a collection.\"\"\"\n", " ret = empty(inp)\n", " for i, c in enumerate(inp):\n", " if i in the_set:\n", " ret = add_to(ret, c)\n", "\n", " return ret" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.073060Z", "iopub.status.busy": "2023-11-12T12:40:48.072968Z", "iopub.status.idle": "2023-11-12T12:40:48.075429Z", "shell.execute_reply": "2023-11-12T12:40:48.075101Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'bc'" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from_set({1, 2}, \"abcd\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To split a set of elements into `n` subsets of equal size, we use this helper function, based on [this discussion in StackOverflow](https://stackoverflow.com/questions/2130016/splitting-a-list-into-n-parts-of-approximately-equal-length/37414115#37414115)." ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.077162Z", "iopub.status.busy": "2023-11-12T12:40:48.077043Z", "iopub.status.idle": "2023-11-12T12:40:48.080398Z", "shell.execute_reply": "2023-11-12T12:40:48.079960Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def split(elems: Any, n: int) -> List:\n", " assert 1 <= n <= len(elems)\n", "\n", " k, m = divmod(len(elems), n)\n", " try:\n", " subsets = list(elems[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]\n", " for i in range(n))\n", " except TypeError:\n", " # Convert to list and back\n", " subsets = list(type(elems)(\n", " list(elems)[i * k + min(i, m):(i + 1) * k + min(i + 1, m)])\n", " for i in range(n))\n", "\n", " assert len(subsets) == n\n", " assert sum(len(subset) for subset in subsets) == len(elems)\n", " assert all(len(subset) > 0 for subset in subsets)\n", "\n", " return subsets" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.082019Z", "iopub.status.busy": "2023-11-12T12:40:48.081915Z", "iopub.status.idle": "2023-11-12T12:40:48.084357Z", "shell.execute_reply": "2023-11-12T12:40:48.084074Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1, 2, 3, 4, 5, 6, 7]]\n", "[[1, 2, 3, 4], [5, 6, 7]]\n", "[[1, 2, 3], [4, 5], [6, 7]]\n", "[[1, 2], [3, 4], [5, 6], [7]]\n", "[[1, 2], [3, 4], [5], [6], [7]]\n", "[[1, 2], [3], [4], [5], [6], [7]]\n", "[[1], [2], [3], [4], [5], [6], [7]]\n" ] } ], "source": [ "for n in range(1, 8):\n", " print(split([1, 2, 3, 4, 5, 6, 7], n))" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.086052Z", "iopub.status.busy": "2023-11-12T12:40:48.085939Z", "iopub.status.idle": "2023-11-12T12:40:48.088438Z", "shell.execute_reply": "2023-11-12T12:40:48.088129Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['ab', 'c', 'd']" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "split(\"abcd\", 3)" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.089976Z", "iopub.status.busy": "2023-11-12T12:40:48.089865Z", "iopub.status.idle": "2023-11-12T12:40:48.092342Z", "shell.execute_reply": "2023-11-12T12:40:48.092013Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[{1, 2, 3}, {4, 5}, {6, 7}]" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "split({1, 2, 3, 4, 5, 6, 7}, 3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "With these, we can now implement general delta debugging. Our implementation follows \\cite{Zeller2002} with the following optimizations:\n", "\n", "1. We can control whether only to minimize or to maximize (\"mode\")\n", "2. The operations \"Reduce to subset\" and \"Increase to subset\" are only taken while the number of subsets is still 2.\n", "3. If \"Reduce to subset\" and \"Increase to subset\" are successful, the offset is set to `i` (not `0`) to distribute reduction operations more evenly across the input. (Thanks to Olaf Chitil and Joanna Sharrad to point out this issue!)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.094209Z", "iopub.status.busy": "2023-11-12T12:40:48.094039Z", "iopub.status.idle": "2023-11-12T12:40:48.096032Z", "shell.execute_reply": "2023-11-12T12:40:48.095644Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class NotFailingError(ValueError):\n", " pass" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.097581Z", "iopub.status.busy": "2023-11-12T12:40:48.097472Z", "iopub.status.idle": "2023-11-12T12:40:48.099184Z", "shell.execute_reply": "2023-11-12T12:40:48.098852Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class NotPassingError(ValueError):\n", " pass" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.101198Z", "iopub.status.busy": "2023-11-12T12:40:48.101055Z", "iopub.status.idle": "2023-11-12T12:40:48.108097Z", "shell.execute_reply": "2023-11-12T12:40:48.107830Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DeltaDebugger(CachingCallReducer):\n", " def dd(self, var_to_be_reduced: str, fail_args: Dict[str, Any], \n", " *, mode: str = '-') -> Tuple[Sequence, Sequence, Sequence]:\n", " \"\"\"General Delta Debugging.\n", " `var_to_be_reduced` - the name of the variable to reduce.\n", " `fail_args` - a dict of (failure-inducing) function arguments, \n", " with `fail_args[var_to_be_reduced]` - the input to apply dd on.\n", " `mode`- how the algorithm should operate:\n", " '-' (default): minimize input (`ddmin`),\n", " '+': maximizing input (`ddmax`),\n", " '+-': minimizing pass/fail difference (`dd`)\n", " Returns a triple (`pass`, `fail`, `diff`) with\n", " * maximized passing input (`pass`), \n", " * minimized failing input (`fail`), and\n", " * their difference `diff`\n", " (elems that are in `fail`, but not in `pass`).\n", " \"\"\"\n", " def test(c: Set) -> str:\n", " # Set up args\n", " test_args = {}\n", " for var in fail_args:\n", " test_args[var] = fail_args[var]\n", " test_args[var_to_be_reduced] = from_set(c, fail_inp)\n", " return self.test(test_args)\n", "\n", " def ret(c_pass: Set, c_fail: Set) -> \\\n", " Tuple[Sequence, Sequence, Sequence]:\n", " return (from_set(c_pass, fail_inp),\n", " from_set(c_fail, fail_inp),\n", " from_set(c_fail - c_pass, fail_inp))\n", "\n", " n = 2 # Initial granularity\n", "\n", " fail_inp = fail_args[var_to_be_reduced]\n", "\n", " c_pass = to_set([])\n", " c_fail = to_set(fail_inp)\n", " offset = 0\n", "\n", " minimize_fail = '-' in mode\n", " maximize_pass = '+' in mode\n", "\n", " # Validate inputs\n", " if test(c_pass) == FAIL:\n", " if maximize_pass:\n", " s_pass = repr(from_set(c_pass, fail_inp))\n", " raise NotPassingError(\n", " f\"Input {s_pass} expected to pass, but fails\")\n", " else:\n", " return ret(c_pass, c_pass)\n", "\n", " if test(c_fail) == PASS:\n", " if minimize_fail:\n", " s_fail = repr(from_set(c_fail, fail_inp))\n", " raise NotFailingError(\n", " f\"Input {s_fail} expected to fail, but passes\")\n", " else:\n", " return ret(c_fail, c_fail)\n", "\n", " # Main loop\n", " while True:\n", " if self.log > 1:\n", " print(\"Passing input:\", repr(from_set(c_pass, fail_inp)))\n", " print(\"Failing input:\", repr(from_set(c_fail, fail_inp)))\n", " print(\"Granularity: \", n)\n", "\n", " delta = c_fail - c_pass\n", " if len(delta) < n:\n", " return ret(c_pass, c_fail)\n", "\n", " deltas = split(delta, n)\n", "\n", " reduction_found = False\n", " j = 0\n", "\n", " while j < n:\n", " i = (j + offset) % n\n", " next_c_pass = c_pass | deltas[i]\n", " next_c_fail = c_fail - deltas[i]\n", "\n", " if minimize_fail and n == 2 and test(next_c_pass) == FAIL:\n", " if self.log > 1:\n", " print(\"Reduce to subset\")\n", " c_fail = next_c_pass\n", " offset = i # was offset = 0 in original dd()\n", " reduction_found = True\n", " break\n", "\n", " elif maximize_pass and n == 2 and test(next_c_fail) == PASS:\n", " if self.log > 1:\n", " print(\"Increase to subset\")\n", " c_pass = next_c_fail\n", " offset = i # was offset = 0 in original dd()\n", " reduction_found = True\n", " break\n", "\n", " elif minimize_fail and test(next_c_fail) == FAIL:\n", " if self.log > 1:\n", " print(\"Reduce to complement\")\n", " c_fail = next_c_fail\n", " n = max(n - 1, 2)\n", " offset = i\n", " reduction_found = True\n", " break\n", "\n", " elif maximize_pass and test(next_c_pass) == PASS:\n", " if self.log > 1:\n", " print(\"Increase to complement\")\n", " c_pass = next_c_pass\n", " n = max(n - 1, 2)\n", " offset = i\n", " reduction_found = True\n", " break\n", "\n", " else:\n", " j += 1 # choose next subset\n", "\n", " if not reduction_found:\n", " if self.log > 1:\n", " print(\"No reduction found\")\n", "\n", " if n >= len(delta):\n", " return ret(c_pass, c_fail)\n", "\n", " if self.log > 1:\n", " print(\"Increase granularity\")\n", "\n", " n = min(n * 2, len(delta))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "By default, `dd()` minimizes inputs – just like `ddmin()`." ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.109695Z", "iopub.status.busy": "2023-11-12T12:40:48.109598Z", "iopub.status.idle": "2023-11-12T12:40:48.111469Z", "shell.execute_reply": "2023-11-12T12:40:48.111151Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with DeltaDebugger() as dd:\n", " mystery(failing_input)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Its output is a triple of maximized passing input (if wanted), minimized failing input, and difference. Here is this triple for `mystery()`, just as with `ddmin()`:" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.113139Z", "iopub.status.busy": "2023-11-12T12:40:48.113044Z", "iopub.status.idle": "2023-11-12T12:40:48.115038Z", "shell.execute_reply": "2023-11-12T12:40:48.114790Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "mystery_pass, mystery_fail, mystery_diff = dd.dd('inp', {'inp': failing_input})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The first element (`mystery_pass`) is the maximal passing input:" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.116500Z", "iopub.status.busy": "2023-11-12T12:40:48.116402Z", "iopub.status.idle": "2023-11-12T12:40:48.118446Z", "shell.execute_reply": "2023-11-12T12:40:48.118144Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "''" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mystery_pass" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The second element (`mystery_fail`) is the minimal failing input:" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.139476Z", "iopub.status.busy": "2023-11-12T12:40:48.139327Z", "iopub.status.idle": "2023-11-12T12:40:48.141754Z", "shell.execute_reply": "2023-11-12T12:40:48.141476Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'()'" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mystery_fail" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And the third element (`mystery_diff`) is the difference between the two:" ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.143130Z", "iopub.status.busy": "2023-11-12T12:40:48.143023Z", "iopub.status.idle": "2023-11-12T12:40:48.145002Z", "shell.execute_reply": "2023-11-12T12:40:48.144700Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'()'" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mystery_diff" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "(Note that we will introduce more comfortable APIs later.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can follow the operation of `dd()` by increasing the logging level. We see how with every test, the difference between the passing and the failing input gets smaller and smaller." ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.146730Z", "iopub.status.busy": "2023-11-12T12:40:48.146600Z", "iopub.status.idle": "2023-11-12T12:40:48.148367Z", "shell.execute_reply": "2023-11-12T12:40:48.148112Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with DeltaDebugger(log=2) as dd:\n", " mystery(failing_input)" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.150103Z", "iopub.status.busy": "2023-11-12T12:40:48.149919Z", "iopub.status.idle": "2023-11-12T12:40:48.153947Z", "shell.execute_reply": "2023-11-12T12:40:48.153648Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test #1 mystery(inp=''): PASS\n", "Test #2 mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Passing input: ''\n", "Failing input: 'V\"/+!aF-(V4EOz*+s/Q,7)2@0_'\n", "Granularity: 2\n", "Test #3 mystery(inp='V\"/+!aF-(V4EO'): PASS\n", "Test #4 mystery(inp='z*+s/Q,7)2@0_'): PASS\n", "No reduction found\n", "Increase granularity\n", "Passing input: ''\n", "Failing input: 'V\"/+!aF-(V4EOz*+s/Q,7)2@0_'\n", "Granularity: 4\n", "Test #5 mystery(inp='-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '-(V4EOz*+s/Q,7)2@0_'\n", "Granularity: 3\n", "Test #6 mystery(inp='*+s/Q,7)2@0_'): PASS\n", "Test #7 mystery(inp='-(V4EOz7)2@0_'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '-(V4EOz7)2@0_'\n", "Granularity: 2\n", "Test #8 mystery(inp='7)2@0_'): PASS\n", "Test #9 mystery(inp='-(V4EOz'): PASS\n", "No reduction found\n", "Increase granularity\n", "Passing input: ''\n", "Failing input: '-(V4EOz7)2@0_'\n", "Granularity: 4\n", "Test #10 mystery(inp='-(V47)2@0_'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '-(V47)2@0_'\n", "Granularity: 3\n", "Test #11 mystery(inp='-(V4@0_'): PASS\n", "Test #12 mystery(inp='-(V47)2'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '-(V47)2'\n", "Granularity: 2\n", "Test #13 mystery(inp='-7)2'): PASS\n", "Test #14 mystery(inp='(V4'): PASS\n", "No reduction found\n", "Increase granularity\n", "Passing input: ''\n", "Failing input: '-(V47)2'\n", "Granularity: 4\n", "Test #15 mystery(inp='-47)2'): PASS\n", "Test #16 mystery(inp='-(V7)2'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '-(V7)2'\n", "Granularity: 3\n", "Test #17 mystery(inp='-(V2'): PASS\n", "Test #18 mystery(inp='(V7)'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '(V7)'\n", "Granularity: 2\n", "Test #19 mystery(inp='7)'): PASS\n", "Test #20 mystery(inp='(V'): PASS\n", "No reduction found\n", "Increase granularity\n", "Passing input: ''\n", "Failing input: '(V7)'\n", "Granularity: 4\n", "Test #21 mystery(inp='(7)'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '(7)'\n", "Granularity: 3\n", "Test #22 mystery(inp='()'): FAIL (ValueError: Invalid input)\n", "Reduce to complement\n", "Passing input: ''\n", "Failing input: '()'\n", "Granularity: 2\n", "Test #23 mystery(inp=')'): PASS\n", "Test #24 mystery(inp='('): PASS\n", "No reduction found\n" ] }, { "data": { "text/plain": [ "('', '()', '()')" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.dd('inp', {'inp': failing_input})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Processing Multiple Arguments\n", "\n", "What happens if a function has multiple arguments? First, we check if they are _reducible_ – that is, they provide a `len()` length function and a way to access indexed elements. This holds for all strings and all lists, as well as other ordered collections." ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.155521Z", "iopub.status.busy": "2023-11-12T12:40:48.155427Z", "iopub.status.idle": "2023-11-12T12:40:48.157647Z", "shell.execute_reply": "2023-11-12T12:40:48.157329Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def is_reducible(value: Any) -> bool:\n", " # Return True if `value` supports len() and indexing.\n", " try:\n", " _ = len(value)\n", " except TypeError:\n", " return False\n", "\n", " try:\n", " _ = value[0]\n", " except TypeError:\n", " return False\n", " except IndexError:\n", " return False\n", "\n", " return True" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Our method `process_args()` processes recorded call arguments, one after the one, until all are minimized or maximized. Processing them individually (rather than, say, all at once) allows maintaining a stable _context_ during reduction." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This method also does all the housekeeping, checking arguments and results, and raising errors if need be." ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.159396Z", "iopub.status.busy": "2023-11-12T12:40:48.159296Z", "iopub.status.idle": "2023-11-12T12:40:48.160985Z", "shell.execute_reply": "2023-11-12T12:40:48.160683Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class FailureNotReproducedError(ValueError):\n", " pass" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.162603Z", "iopub.status.busy": "2023-11-12T12:40:48.162486Z", "iopub.status.idle": "2023-11-12T12:40:48.165221Z", "shell.execute_reply": "2023-11-12T12:40:48.164937Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def check_reproducibility(self) -> None:\n", " # Check whether running the function again fails\n", " assert self._function, \\\n", " \"No call collected. Use `with dd: func()` first.\"\n", " assert self._args, \\\n", " \"No arguments collected. Use `with dd: func(args)` first.\"\n", "\n", " self.reset()\n", " outcome = self.test(self.args())\n", " if outcome == UNRESOLVED:\n", " raise FailureNotReproducedError(\n", " \"When called again, \" +\n", " self.format_call(self.args()) + \n", " \" raised \" +\n", " self.format_exception(self.last_exception) +\n", " \" instead of \" +\n", " self.format_exception(self.exception()))\n", "\n", " if outcome == PASS:\n", " raise NotFailingError(\"When called again, \" +\n", " self.format_call(self.args()) + \n", " \" did not fail\")\n", " assert outcome == FAIL" ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.167003Z", "iopub.status.busy": "2023-11-12T12:40:48.166914Z", "iopub.status.idle": "2023-11-12T12:40:48.172447Z", "shell.execute_reply": "2023-11-12T12:40:48.172183Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def process_args(self, strategy: Callable, **strategy_args: Any) -> \\\n", " Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:\n", " \"\"\"\n", " Reduce all reducible arguments, using `strategy`(var, `strategy_args`).\n", " Can be overloaded in subclasses.\n", " \"\"\"\n", "\n", " pass_args = {} # Local copy\n", " fail_args = {} # Local copy\n", " diff_args = {}\n", " for var in self.args():\n", " fail_args[var] = self.args()[var]\n", " diff_args[var] = self.args()[var]\n", " pass_args[var] = self.args()[var]\n", "\n", " if is_reducible(pass_args[var]):\n", " pass_args[var] = empty(pass_args[var])\n", "\n", " vars_to_be_processed = set(fail_args.keys())\n", "\n", " pass_processed = 0\n", " fail_processed = 0\n", "\n", " self.check_reproducibility()\n", "\n", " # We take turns in processing variables until all are processed\n", " while len(vars_to_be_processed) > 0:\n", " for var in vars_to_be_processed:\n", " if not is_reducible(fail_args[var]):\n", " vars_to_be_processed.remove(var)\n", " break\n", "\n", " if self.log:\n", " print(f\"Processing {var}...\")\n", "\n", " maximized_pass_value, minimized_fail_value, diff = \\\n", " strategy(var, fail_args, **strategy_args)\n", "\n", " if (maximized_pass_value is not None and \n", " len(maximized_pass_value) > len(pass_args[var])):\n", " pass_args[var] = maximized_pass_value\n", " # FIXME: diff_args may not be correct for multiple args\n", " diff_args[var] = diff\n", " if self.log:\n", " print(f\"Maximized {var} to\",\n", " repr(maximized_pass_value))\n", " vars_to_be_processed = set(fail_args.keys())\n", " pass_processed += 1\n", "\n", " if (minimized_fail_value is not None and \n", " len(minimized_fail_value) < len(fail_args[var])):\n", " fail_args[var] = minimized_fail_value\n", " diff_args[var] = diff\n", " if self.log:\n", " print(f\"Minimized {var} to\",\n", " repr(minimized_fail_value))\n", " vars_to_be_processed = set(fail_args.keys())\n", " fail_processed += 1\n", "\n", " vars_to_be_processed.remove(var)\n", " break\n", "\n", " assert pass_processed == 0 or self.test(pass_args) == PASS, \\\n", " f\"{self.format_call(pass_args)} does not pass\"\n", " assert fail_processed == 0 or self.test(fail_args) == FAIL, \\\n", " f\"{self.format_call(fail_args)} does not fail\"\n", "\n", " if self.log and pass_processed > 0:\n", " print(\"Maximized passing call to\",\n", " self.format_call(pass_args))\n", " if self.log and fail_processed > 0:\n", " print(\"Minimized failing call to\",\n", " self.format_call(fail_args))\n", "\n", " return pass_args, fail_args, diff_args" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "For more housekeeping, we define the `after_collection()` method that will be invoked at the end of the `with` block. It checks for a number of additional preconditions." ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.174124Z", "iopub.status.busy": "2023-11-12T12:40:48.174033Z", "iopub.status.idle": "2023-11-12T12:40:48.176446Z", "shell.execute_reply": "2023-11-12T12:40:48.176160Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def after_collection(self) -> None:\n", " # Some post-collection checks\n", " if self._function is None:\n", " raise NoCallError(\"No function call observed\")\n", " if self.exception() is None:\n", " raise NotFailingError(\n", " f\"{self.format_call()} did not raise an exception\")\n", "\n", " if self.log:\n", " print(f\"Observed {self.format_call()}\" +\n", " f\" raising {self.format_exception(self.exception())}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Public API\n", "\n", "We finish the implementation with public methods that allow users to run delta debugging and obtain the diagnostics." ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.178169Z", "iopub.status.busy": "2023-11-12T12:40:48.178035Z", "iopub.status.idle": "2023-11-12T12:40:48.180016Z", "shell.execute_reply": "2023-11-12T12:40:48.179783Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def min_args(self) -> Dict[str, Any]:\n", " \"\"\"Return 1-minimal arguments.\"\"\"\n", " pass_args, fail_args, diff = self.process_args(self.dd, mode='-')\n", " return fail_args" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.181881Z", "iopub.status.busy": "2023-11-12T12:40:48.181730Z", "iopub.status.idle": "2023-11-12T12:40:48.184089Z", "shell.execute_reply": "2023-11-12T12:40:48.183849Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def max_args(self) -> Dict[str, Any]:\n", " \"\"\"Return 1-maximal arguments.\"\"\"\n", " pass_args, fail_args, diff = self.process_args(self.dd, mode='+')\n", " return pass_args" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.185851Z", "iopub.status.busy": "2023-11-12T12:40:48.185670Z", "iopub.status.idle": "2023-11-12T12:40:48.187660Z", "shell.execute_reply": "2023-11-12T12:40:48.187432Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def min_arg_diff(self) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:\n", " \"\"\"Return 1-minimal difference between arguments.\"\"\"\n", " return self.process_args(self.dd, mode='+-')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `__repr__()` method returns a string representation of the minimized call." ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.189274Z", "iopub.status.busy": "2023-11-12T12:40:48.189107Z", "iopub.status.idle": "2023-11-12T12:40:48.190801Z", "shell.execute_reply": "2023-11-12T12:40:48.190555Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class DeltaDebugger(DeltaDebugger):\n", " def __repr__(self) -> str:\n", " \"\"\"Return a string representation of the minimized call.\"\"\"\n", " return self.format_call(self.min_args())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### End of Excursion" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To see how the `DeltaDebugger` works, let us run it on our failing input. The expected usage is as introduced earlier – we wrap the failing function in a `with` block, and then print out the debugger to see the reduced arguments. We see that `DeltaDebugger` easily reduces the arguments to the minimal failure-inducing input:" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.192278Z", "iopub.status.busy": "2023-11-12T12:40:48.192152Z", "iopub.status.idle": "2023-11-12T12:40:48.194718Z", "shell.execute_reply": "2023-11-12T12:40:48.194449Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "mystery(inp='()')" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger() as dd:\n", " mystery(failing_input)\n", "dd" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can turn on logging for `DeltaDebugger` to see how it proceeds. With each step, we see how the remaining input gets smaller and smaller, until only two characters remain:" ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.196381Z", "iopub.status.busy": "2023-11-12T12:40:48.196226Z", "iopub.status.idle": "2023-11-12T12:40:48.199392Z", "shell.execute_reply": "2023-11-12T12:40:48.199080Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observed mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_') raising ValueError: Invalid input\n", "Test #1 mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Processing inp...\n", "Test #2 mystery(inp=''): PASS\n", "Test #3 mystery(inp='V\"/+!aF-(V4EO'): PASS\n", "Test #4 mystery(inp='z*+s/Q,7)2@0_'): PASS\n", "Test #5 mystery(inp='-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #6 mystery(inp='*+s/Q,7)2@0_'): PASS\n", "Test #7 mystery(inp='-(V4EOz7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #8 mystery(inp='7)2@0_'): PASS\n", "Test #9 mystery(inp='-(V4EOz'): PASS\n", "Test #10 mystery(inp='-(V47)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #11 mystery(inp='-(V4@0_'): PASS\n", "Test #12 mystery(inp='-(V47)2'): FAIL (ValueError: Invalid input)\n", "Test #13 mystery(inp='-7)2'): PASS\n", "Test #14 mystery(inp='(V4'): PASS\n", "Test #15 mystery(inp='-47)2'): PASS\n", "Test #16 mystery(inp='-(V7)2'): FAIL (ValueError: Invalid input)\n", "Test #17 mystery(inp='-(V2'): PASS\n", "Test #18 mystery(inp='(V7)'): FAIL (ValueError: Invalid input)\n", "Test #19 mystery(inp='7)'): PASS\n", "Test #20 mystery(inp='(V'): PASS\n", "Test #21 mystery(inp='(7)'): FAIL (ValueError: Invalid input)\n", "Test #22 mystery(inp='()'): FAIL (ValueError: Invalid input)\n", "Test #23 mystery(inp=')'): PASS\n", "Test #24 mystery(inp='('): PASS\n", "Minimized inp to '()'\n", "Minimized failing call to mystery(inp='()')\n" ] }, { "data": { "text/plain": [ "mystery(inp='()')" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger(log=True) as dd:\n", " mystery(failing_input)\n", "dd" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "It is also possible to access the debugger programmatically:" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.201111Z", "iopub.status.busy": "2023-11-12T12:40:48.201008Z", "iopub.status.idle": "2023-11-12T12:40:48.202927Z", "shell.execute_reply": "2023-11-12T12:40:48.202604Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with DeltaDebugger() as dd:\n", " mystery(failing_input)" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.204501Z", "iopub.status.busy": "2023-11-12T12:40:48.204419Z", "iopub.status.idle": "2023-11-12T12:40:48.206514Z", "shell.execute_reply": "2023-11-12T12:40:48.206246Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'inp': 'V\"/+!aF-(V4EOz*+s/Q,7)2@0_'}" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.args()" ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.208359Z", "iopub.status.busy": "2023-11-12T12:40:48.208271Z", "iopub.status.idle": "2023-11-12T12:40:48.210366Z", "shell.execute_reply": "2023-11-12T12:40:48.210128Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'inp': '()'}" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.min_args()" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.212193Z", "iopub.status.busy": "2023-11-12T12:40:48.212076Z", "iopub.status.idle": "2023-11-12T12:40:48.216929Z", "shell.execute_reply": "2023-11-12T12:40:48.216642Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

What happens if the function under test does not raise an exception?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz(\"What happens if the function under test does not raise an exception?\",\n", " [\n", " \"Delta debugging searches for the minimal input\"\n", " \" that produces the same result\",\n", " \"Delta debugging starts a fuzzer to find an exception\",\n", " \"Delta debugging raises an exception\",\n", " \"Delta debugging runs forever in a loop\",\n", " ], '0 ** 0 + 1 ** 0 + 0 ** 1 + 1 ** 1')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Indeed, `DeltaDebugger` checks if an exception occurs. If not, you obtain a `NotFailingError`." ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.218434Z", "iopub.status.busy": "2023-11-12T12:40:48.218324Z", "iopub.status.idle": "2023-11-12T12:40:48.220359Z", "shell.execute_reply": "2023-11-12T12:40:48.220113Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/3784387889.py\", line 2, in \n", " with DeltaDebugger() as dd:\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4114738534.py\", line 24, in __exit__\n", " self.after_collection()\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/1934034330.py\", line 7, in after_collection\n", " raise NotFailingError(\n", "NotFailingError: mystery(inp='An input that does not fail') did not raise an exception (expected)\n" ] } ], "source": [ "with ExpectError(NotFailingError):\n", " with DeltaDebugger() as dd:\n", " mystery(\"An input that does not fail\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Delta Debugging also assumes that the function under test is _deterministic_. If it occasionally fails and occasionally passes, you will get random results." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Usage Examples\n", "\n", "Let us apply `DeltaDebugger` on a number of examples. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reducing remove_html_markup()\n", "\n", "For our ongoing `remove_html_markup()` example, we can reduce the failure-inducing input to a minimum, too:" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.221911Z", "iopub.status.busy": "2023-11-12T12:40:48.221794Z", "iopub.status.idle": "2023-11-12T12:40:48.248652Z", "shell.execute_reply": "2023-11-12T12:40:48.248344Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Assertions import remove_html_markup # minor dependency" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.250655Z", "iopub.status.busy": "2023-11-12T12:40:48.250490Z", "iopub.status.idle": "2023-11-12T12:40:48.253046Z", "shell.execute_reply": "2023-11-12T12:40:48.252823Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observed remove_html_markup(s='\"x > y\"') raising AssertionError\n", "Test #1 remove_html_markup(s='\"x > y\"'): FAIL (AssertionError)\n", "Processing s...\n", "Test #2 remove_html_markup(s=''): PASS\n", "Test #3 remove_html_markup(s='\"x >'): FAIL (AssertionError)\n", "Test #4 remove_html_markup(s='\"x'): PASS\n", "Test #5 remove_html_markup(s=' >'): PASS\n", "Test #6 remove_html_markup(s='x >'): PASS\n", "Test #7 remove_html_markup(s='\" >'): FAIL (AssertionError)\n", "Test #8 remove_html_markup(s='\">'): FAIL (AssertionError)\n", "Test #9 remove_html_markup(s='>'): PASS\n", "Test #10 remove_html_markup(s='\"'): PASS\n", "Minimized s to '\">'\n", "Minimized failing call to remove_html_markup(s='\">')\n" ] }, { "data": { "text/plain": [ "{'s': '\">'}" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger(log=True) as dd:\n", " remove_html_markup('\"x > y\"')\n", "dd.min_args()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reducing Multiple Arguments\n", "\n", "If a function has multiple reducible variables, they get reduced in turns. This `string_error()` function fails whenever `s1` is a substring of `s2`:" ] }, { "cell_type": "code", "execution_count": 99, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.254870Z", "iopub.status.busy": "2023-11-12T12:40:48.254711Z", "iopub.status.idle": "2023-11-12T12:40:48.256378Z", "shell.execute_reply": "2023-11-12T12:40:48.256140Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def string_error(s1: str, s2: str) -> None:\n", " assert s1 not in s2, \"no substrings\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Running `DeltaDebugger` on `string_error` shows how first `s1` is reduced, then `s2`, then `s1` again." ] }, { "cell_type": "code", "execution_count": 100, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.257786Z", "iopub.status.busy": "2023-11-12T12:40:48.257707Z", "iopub.status.idle": "2023-11-12T12:40:48.260433Z", "shell.execute_reply": "2023-11-12T12:40:48.260117Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observed string_error(s1='foo', s2='foobar') raising AssertionError: no substrings\n", "Test #1 string_error(s1='foo', s2='foobar'): FAIL (AssertionError: no substrings)\n", "Processing s1...\n", "Test #2 string_error(s1='', s2='foobar'): FAIL (AssertionError: no substrings)\n", "Minimized s1 to ''\n", "Processing s2...\n", "Test #3 string_error(s1='', s2=''): FAIL (AssertionError: no substrings)\n", "Minimized s2 to ''\n", "Minimized failing call to string_error(s1='', s2='')\n" ] }, { "data": { "text/plain": [ "{'s1': '', 's2': ''}" ] }, "execution_count": 100, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger(log=True) as dd:\n", " string_error(\"foo\", \"foobar\")\n", "\n", "string_error_args = dd.min_args()\n", "string_error_args" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We see that the failure also occurs if both strings are empty:" ] }, { "cell_type": "code", "execution_count": 101, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.262611Z", "iopub.status.busy": "2023-11-12T12:40:48.262415Z", "iopub.status.idle": "2023-11-12T12:40:48.264329Z", "shell.execute_reply": "2023-11-12T12:40:48.264102Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/3257360882.py\", line 2, in \n", " string_error(string_error_args['s1'], string_error_args['s2'])\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/773514021.py\", line 2, in string_error\n", " assert s1 not in s2, \"no substrings\"\n", "AssertionError: no substrings (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " string_error(string_error_args['s1'], string_error_args['s2'])" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Invoking an Interactive Debugger\n", "\n", "The results from delta debugging can be immediately used to invoke an [interactive debugger](Debugger.ipynb) on the minimized input. To this end, we need to turn the dictionary returned by `min_args()` into arguments of the (failing) function call." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Python provides a simple way to turn dictionaries into function calls. The construct\n", "\n", "```python\n", "fun(**args)\n", "```\n", "\n", "invokes the function `fun`, with all parameters assigned from the respective values in the dictionary. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "With this, we can immediately invoke a `Debugger` on the failing run with minimized arguments:" ] }, { "cell_type": "code", "execution_count": 102, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.265865Z", "iopub.status.busy": "2023-11-12T12:40:48.265749Z", "iopub.status.idle": "2023-11-12T12:40:48.280586Z", "shell.execute_reply": "2023-11-12T12:40:48.280065Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Debugger import Debugger # minor dependency" ] }, { "cell_type": "code", "execution_count": 103, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.282639Z", "iopub.status.busy": "2023-11-12T12:40:48.282411Z", "iopub.status.idle": "2023-11-12T12:40:48.284346Z", "shell.execute_reply": "2023-11-12T12:40:48.283995Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# ignore\n", "from bookutils import next_inputs" ] }, { "cell_type": "code", "execution_count": 104, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.285784Z", "iopub.status.busy": "2023-11-12T12:40:48.285678Z", "iopub.status.idle": "2023-11-12T12:40:48.287712Z", "shell.execute_reply": "2023-11-12T12:40:48.287480Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "['print', 'quit']" ] }, "execution_count": 104, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ignore\n", "next_inputs(['print', 'quit'])" ] }, { "cell_type": "code", "execution_count": 105, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.289067Z", "iopub.status.busy": "2023-11-12T12:40:48.288968Z", "iopub.status.idle": "2023-11-12T12:40:48.292389Z", "shell.execute_reply": "2023-11-12T12:40:48.292084Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Calling string_error(s1 = '', s2 = '')\n" ] }, { "data": { "text/html": [ "(debugger) print" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "s1 = ''\n", "s2 = ''\n" ] }, { "data": { "text/html": [ "(debugger) quit" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2680174709.py\", line 3, in \n", " string_error(**string_error_args)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/773514021.py\", line 2, in string_error\n", " assert s1 not in s2, \"no substrings\"\n", "AssertionError: no substrings (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " with Debugger():\n", " string_error(**string_error_args)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reducing other Collections\n", "\n", "Our `DeltaDebugger` is not limited to strings. It can reduce any argument `x` for which a `len(x)` operation and an indexing operation `x[i]` is defined – notably lists. Here is how to apply `DeltaDebugger` on a list:" ] }, { "cell_type": "code", "execution_count": 106, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.293902Z", "iopub.status.busy": "2023-11-12T12:40:48.293816Z", "iopub.status.idle": "2023-11-12T12:40:48.295696Z", "shell.execute_reply": "2023-11-12T12:40:48.295448Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def list_error(l1: List, l2: List, maxlen: int) -> None:\n", " assert len(l1) < len(l2) < maxlen, \"invalid string length\"" ] }, { "cell_type": "code", "execution_count": 107, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.297237Z", "iopub.status.busy": "2023-11-12T12:40:48.297126Z", "iopub.status.idle": "2023-11-12T12:40:48.299906Z", "shell.execute_reply": "2023-11-12T12:40:48.299617Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "list_error(l1=[], l2=[], maxlen=5)" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger() as dd:\n", " list_error(l1=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], l2=[1, 2, 3], maxlen=5)\n", "dd" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Debugging Inputs\n", "\n", "Sometimes, it may be useful to not _minimize_ the input, but rather _maximize_ it – that is, to find the _maximum_ input that does _not_ fail. For instance, you may have an input of which you want to _preserve_ as much as possible – to repair it, or to establish a _context_ that is as close as possible to the real input." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is possible by using the `max_arg()` method. It implements the `ddmax` variant of the general Delta Debugging algorithm \\cite{Kirschner2020}. With each step, it tries to add more and more characters to the passing input until it is _1-maximal_ – that is, any additional character that would be added from the failing input also would cause the function to fail." ] }, { "cell_type": "code", "execution_count": 108, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.301585Z", "iopub.status.busy": "2023-11-12T12:40:48.301492Z", "iopub.status.idle": "2023-11-12T12:40:48.304422Z", "shell.execute_reply": "2023-11-12T12:40:48.304152Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observed mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_') raising ValueError: Invalid input\n", "Test #1 mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Processing inp...\n", "Test #2 mystery(inp=''): PASS\n", "Test #3 mystery(inp='z*+s/Q,7)2@0_'): PASS\n", "Test #4 mystery(inp='-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #5 mystery(inp='V\"/+!aFz*+s/Q,7)2@0_'): PASS\n", "Test #6 mystery(inp='V\"/+!aF4EOz*+s/Q,7)2@0_'): PASS\n", "Test #7 mystery(inp='V\"/+!aF-4EOz*+s/Q,7)2@0_'): PASS\n", "Test #8 mystery(inp='V\"/+!aF-V4EOz*+s/Q,7)2@0_'): PASS\n", "Maximized inp to 'V\"/+!aF-V4EOz*+s/Q,7)2@0_'\n", "Maximized passing call to mystery(inp='V\"/+!aF-V4EOz*+s/Q,7)2@0_')\n" ] }, { "data": { "text/plain": [ "'V\"/+!aF-V4EOz*+s/Q,7)2@0_'" ] }, "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger(log=True) as dd:\n", " mystery(failing_input)\n", "max_passing_input = dd.max_args()['inp']\n", "max_passing_input" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Note that this is precisely the failure-inducing input _except_ for the first parenthesis. Adding this single character would cause the input to cause a failure." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Failure-Inducing Differences" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If one wants to look for _differences_ that distinguish passing from failing runs, Delta Debugging also has a direct method for this – by both maximizing the passing input and minimizing the failing input until they meet somewhere in the middle. The remaining difference is what makes the difference between passing and failing." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To compute the failure-inducing differences for `mystery()`, use the `min_arg_diff()` method:" ] }, { "cell_type": "code", "execution_count": 109, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.306096Z", "iopub.status.busy": "2023-11-12T12:40:48.305962Z", "iopub.status.idle": "2023-11-12T12:40:48.308891Z", "shell.execute_reply": "2023-11-12T12:40:48.308560Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observed mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_') raising ValueError: Invalid input\n", "Test #1 mystery(inp='V\"/+!aF-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Processing inp...\n", "Test #2 mystery(inp=''): PASS\n", "Test #3 mystery(inp='V\"/+!aF-(V4EO'): PASS\n", "Test #4 mystery(inp='z*+s/Q,7)2@0_'): PASS\n", "Test #5 mystery(inp='V\"/+!aFz*+s/Q,7)2@0_'): PASS\n", "Test #6 mystery(inp='-(V4EOz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #7 mystery(inp='-(Vz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #8 mystery(inp='(Vz*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Test #9 mystery(inp='(z*+s/Q,7)2@0_'): FAIL (ValueError: Invalid input)\n", "Maximized inp to 'z*+s/Q,7)2@0_'\n", "Minimized inp to '(z*+s/Q,7)2@0_'\n", "Maximized passing call to mystery(inp='z*+s/Q,7)2@0_')\n", "Minimized failing call to mystery(inp='(z*+s/Q,7)2@0_')\n" ] }, { "data": { "text/plain": [ "('z*+s/Q,7)2@0_', '(z*+s/Q,7)2@0_', '(')" ] }, "execution_count": 109, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger(log=True) as dd:\n", " mystery(failing_input)\n", "max_passing_args, min_failing_args, diff = dd.min_arg_diff()\n", "max_passing_args['inp'], min_failing_args['inp'], diff['inp']" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Minimizing failure-inducing differences is especially efficient on large inputs, since the number of differences between a passing and a failing input is much smaller than the inputs themselves. Here is the failure-inducing difference as determined by Delta Debugging:" ] }, { "cell_type": "code", "execution_count": 110, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.310738Z", "iopub.status.busy": "2023-11-12T12:40:48.310563Z", "iopub.status.idle": "2023-11-12T12:40:48.312785Z", "shell.execute_reply": "2023-11-12T12:40:48.312541Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'('" ] }, "execution_count": 110, "metadata": {}, "output_type": "execute_result" } ], "source": [ "diff['inp']" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Reducing Program Code\n", "\n", "One particularly fun application of reducers is on _program code_. Technically speaking, program code is just another input to a computation; and we can actually automatically determine which minimum of program code is required to produce a failure, using Delta Debugging. Such minimization of code is typically used as it comes to debugging programs that accept code as their input, such as _compilers_ and _interpreters_. However, it can also pinpoint failure causes in the (input) code itself." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "As an example, let us apply Delta Debugging on the code from [the chapter on assertions](Assertions.html). You do not need to have read the chapter; the important part is that this chapter provides an implementation of `remove_html_markup()` that we want to use." ] }, { "cell_type": "code", "execution_count": 111, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.314687Z", "iopub.status.busy": "2023-11-12T12:40:48.314519Z", "iopub.status.idle": "2023-11-12T12:40:48.316506Z", "shell.execute_reply": "2023-11-12T12:40:48.316202Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# ignore\n", "try:\n", " del remove_html_markup\n", "except NameError:\n", " pass" ] }, { "cell_type": "code", "execution_count": 112, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.318177Z", "iopub.status.busy": "2023-11-12T12:40:48.318061Z", "iopub.status.idle": "2023-11-12T12:40:48.319640Z", "shell.execute_reply": "2023-11-12T12:40:48.319389Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import Assertions # minor dependency" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is the source code of all the chapter; this is several hundred lines long." ] }, { "cell_type": "code", "execution_count": 113, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.321050Z", "iopub.status.busy": "2023-11-12T12:40:48.320968Z", "iopub.status.idle": "2023-11-12T12:40:48.322644Z", "shell.execute_reply": "2023-11-12T12:40:48.322399Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import inspect" ] }, { "cell_type": "code", "execution_count": 114, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.324042Z", "iopub.status.busy": "2023-11-12T12:40:48.323942Z", "iopub.status.idle": "2023-11-12T12:40:48.326443Z", "shell.execute_reply": "2023-11-12T12:40:48.326157Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "['from bookutils import YouTubeVideo\\n',\n", " 'YouTubeVideo(\"9mI9sbKFkwU\")\\n',\n", " '\\n',\n", " 'import bookutils.setup\\n',\n", " '\\n',\n", " 'from bookutils import quiz\\n',\n", " '\\n',\n", " 'import Tracer\\n',\n", " '\\n',\n", " 'from ExpectError import ExpectError\\n']" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assertions_source_lines, _ = inspect.getsourcelines(Assertions)\n", "# print_content(\"\".join(assertions_source_lines), \".py\")\n", "assertions_source_lines[:10]" ] }, { "cell_type": "code", "execution_count": 115, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.328316Z", "iopub.status.busy": "2023-11-12T12:40:48.328164Z", "iopub.status.idle": "2023-11-12T12:40:48.330461Z", "shell.execute_reply": "2023-11-12T12:40:48.330188Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "552" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(assertions_source_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can take this code and execute it. Nothing particular should happen here, as our imports only import definitions of functions, classes, and global variables." ] }, { "cell_type": "code", "execution_count": 116, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.331964Z", "iopub.status.busy": "2023-11-12T12:40:48.331854Z", "iopub.status.idle": "2023-11-12T12:40:48.334133Z", "shell.execute_reply": "2023-11-12T12:40:48.333782Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def compile_and_run(lines: List[str]) -> None:\n", " # To execute 'Assertions' in place, we need to define __name__ and __package__\n", " exec(\"\".join(lines), {'__name__': '',\n", " '__package__': 'debuggingbook',\n", " 'Any': Any,\n", " 'Type': Type,\n", " 'TracebackType': TracebackType,\n", " 'Optional': Optional},\n", " {})" ] }, { "cell_type": "code", "execution_count": 117, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.335663Z", "iopub.status.busy": "2023-11-12T12:40:48.335552Z", "iopub.status.idle": "2023-11-12T12:40:48.339925Z", "shell.execute_reply": "2023-11-12T12:40:48.339650Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "compile_and_run(assertions_source_lines)" ] }, { "cell_type": "code", "execution_count": 118, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.341513Z", "iopub.status.busy": "2023-11-12T12:40:48.341421Z", "iopub.status.idle": "2023-11-12T12:40:48.343290Z", "shell.execute_reply": "2023-11-12T12:40:48.342960Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Assertions import remove_html_markup # minor dependency" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us add some code to it – a \"My Test\" assertion that tests that `remove_html_markup()`, applied on a string with double quotes, should keep these in place:" ] }, { "cell_type": "code", "execution_count": 119, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.344975Z", "iopub.status.busy": "2023-11-12T12:40:48.344884Z", "iopub.status.idle": "2023-11-12T12:40:48.346729Z", "shell.execute_reply": "2023-11-12T12:40:48.346467Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def compile_and_test_html_markup_simple(lines: List[str]) -> None:\n", " compile_and_run(lines + \n", " [\n", " '''''',\n", " '''assert remove_html_markup('\"foo\"') == '\"foo\"', \"My Test\"\\n'''\n", " ])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This assertion fails. (As always, `remove_html_markup()` is buggy.)" ] }, { "cell_type": "code", "execution_count": 120, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.348241Z", "iopub.status.busy": "2023-11-12T12:40:48.348149Z", "iopub.status.idle": "2023-11-12T12:40:48.352667Z", "shell.execute_reply": "2023-11-12T12:40:48.352413Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2334363057.py\", line 2, in \n", " compile_and_test_html_markup_simple(assertions_source_lines)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/1838190689.py\", line 2, in compile_and_test_html_markup_simple\n", " compile_and_run(lines +\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2075409738.py\", line 3, in compile_and_run\n", " exec(\"\".join(lines), {'__name__': '',\n", " File \"\", line 553, in \n", "AssertionError: My Test (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " compile_and_test_html_markup_simple(assertions_source_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The question we want to address in this section is: Given this assertion, can we automatically determine which part of the `Assertions` code lines in `assertions_source_lines` is relevant for producing the failure?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reducing Code Lines\n", "\n", "Since our `Assertions` source code comes as a list of lines, we can apply our `DeltaDebugger` on it. The result will be the list of source lines that is necessary to make the assertion fail." ] }, { "cell_type": "code", "execution_count": 121, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.354091Z", "iopub.status.busy": "2023-11-12T12:40:48.354006Z", "iopub.status.idle": "2023-11-12T12:40:48.357776Z", "shell.execute_reply": "2023-11-12T12:40:48.357521Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

What will the reduced set of lines contain?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 121, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz(\"What will the reduced set of lines contain?\",\n", " [\n", " \"All of the source code in the assertions chapter.\",\n", " \"Only the source code of `remove_html_markup()`\",\n", " \"Only a subset of `remove_html_markup()`\",\n", " \"No lines at all.\"\n", " ], '[x for x in range((1 + 1) ** (1 + 1)) if x % (1 + 1) == 1][1]')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us see what the `DeltaDebugger` produces." ] }, { "cell_type": "code", "execution_count": 122, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.359354Z", "iopub.status.busy": "2023-11-12T12:40:48.359220Z", "iopub.status.idle": "2023-11-12T12:40:48.363548Z", "shell.execute_reply": "2023-11-12T12:40:48.363239Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "with DeltaDebugger(log=False) as dd:\n", " compile_and_test_html_markup_simple(assertions_source_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We get exactly _two_ lines of code:" ] }, { "cell_type": "code", "execution_count": 123, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.365625Z", "iopub.status.busy": "2023-11-12T12:40:48.365486Z", "iopub.status.idle": "2023-11-12T12:40:48.376846Z", "shell.execute_reply": "2023-11-12T12:40:48.376516Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 123, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reduced_lines = dd.min_args()['lines']\n", "len(reduced_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And these are:" ] }, { "cell_type": "code", "execution_count": 124, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.378468Z", "iopub.status.busy": "2023-11-12T12:40:48.378348Z", "iopub.status.idle": "2023-11-12T12:40:48.380066Z", "shell.execute_reply": "2023-11-12T12:40:48.379741Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import print_content" ] }, { "cell_type": "code", "execution_count": 125, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.381474Z", "iopub.status.busy": "2023-11-12T12:40:48.381367Z", "iopub.status.idle": "2023-11-12T12:40:48.449101Z", "shell.execute_reply": "2023-11-12T12:40:48.448730Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s): \u001b[37m# type: ignore\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(\"\".join(reduced_lines), \".py\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "On these lines, our test actually still fails:" ] }, { "cell_type": "code", "execution_count": 126, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.451034Z", "iopub.status.busy": "2023-11-12T12:40:48.450905Z", "iopub.status.idle": "2023-11-12T12:40:48.452929Z", "shell.execute_reply": "2023-11-12T12:40:48.452652Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2457905094.py\", line 2, in \n", " compile_and_test_html_markup_simple(reduced_lines)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/1838190689.py\", line 2, in compile_and_test_html_markup_simple\n", " compile_and_run(lines +\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2075409738.py\", line 3, in compile_and_run\n", " exec(\"\".join(lines), {'__name__': '',\n", " File \"\", line 3, in \n", "AssertionError: My Test (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " compile_and_test_html_markup_simple(reduced_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This failure may come as a surprise – `remove_html_markup()` is reduced to a function which does not even return a value. However, this is how it causes our \"My Test\" assertion to fail: In Python, a function without an explicit `return` statement returns `None`. This value is definitely not the string the \"My Test\" assertion expects, so it fails." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "At the same time, we also have a function `test_square_root()` which is equally devoid of any meaning – its code line does not even stem from its original implementation. Note, however, how the set of four lines is actually 1-minimal – removing any further line would result in a syntax error." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To ensure we do not remove code that actually would be necessary for normal behavior, let us add another check – one that checks for the _normal_ functionality of `remove_html_markup()`. If this one fails (say, after the code has been tampered with too much), it raises an exception – but a _different_ one from the original failure:" ] }, { "cell_type": "code", "execution_count": 127, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.454495Z", "iopub.status.busy": "2023-11-12T12:40:48.454376Z", "iopub.status.idle": "2023-11-12T12:40:48.456235Z", "shell.execute_reply": "2023-11-12T12:40:48.455969Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def compile_and_test_html_markup(lines: List[str]) -> None:\n", " compile_and_run(lines +\n", " [\n", " '',\n", " '''if remove_html_markup('bar') != 'bar':\\n''',\n", " ''' raise RuntimeError(\"Missing functionality\")\\n''',\n", " '''assert remove_html_markup('\"foo\"') == '\"foo\"', \"My Test\"\\n'''\n", " ])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "On our \"reduced\" code, we now obtain a different exception." ] }, { "cell_type": "code", "execution_count": 128, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.457806Z", "iopub.status.busy": "2023-11-12T12:40:48.457669Z", "iopub.status.idle": "2023-11-12T12:40:48.459985Z", "shell.execute_reply": "2023-11-12T12:40:48.459653Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/810405118.py\", line 2, in \n", " compile_and_test_html_markup(reduced_lines)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4067432722.py\", line 2, in compile_and_test_html_markup\n", " compile_and_run(lines +\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2075409738.py\", line 3, in compile_and_run\n", " exec(\"\".join(lines), {'__name__': '',\n", " File \"\", line 4, in \n", "RuntimeError: Missing functionality (expected)\n" ] } ], "source": [ "with ExpectError():\n", " compile_and_test_html_markup(reduced_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Such an outcome that is different from the original failure causes our `DeltaDebugger` not treating this as a failure, but rather as a `UNRESOLVED` outcome, indicating that the test cannot determine whether it passed or failed. The `ddmin` algorithm treats such unresolved outcomes as if they were passing; hence, the algorithm treats its minimization attempt as unsuccessful." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "How does this change things? When we reduce the `Assertions` source code with the extended assertions, we now get a different result:" ] }, { "cell_type": "code", "execution_count": 129, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.461799Z", "iopub.status.busy": "2023-11-12T12:40:48.461672Z", "iopub.status.idle": "2023-11-12T12:40:48.482862Z", "shell.execute_reply": "2023-11-12T12:40:48.482424Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with DeltaDebugger(log=False) as dd:\n", " compile_and_test_html_markup(assertions_source_lines)\n", "reduced_assertions_source_lines = dd.min_args()['lines']" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Our result actually is the source code of `remove_html_markup()` – and _only_ the source code. This is a success, as Delta Debugging has eliminated all the other parts of the `Assertions` source code; these neither contribute to the correct functioning of `remove_html_markup()`, nor to the failure at hand." ] }, { "cell_type": "code", "execution_count": 130, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.484702Z", "iopub.status.busy": "2023-11-12T12:40:48.484575Z", "iopub.status.idle": "2023-11-12T12:40:48.517981Z", "shell.execute_reply": "2023-11-12T12:40:48.517609Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s): \u001b[37m# type: ignore\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out = \u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " out = out + c\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(''.join(reduced_assertions_source_lines), '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "All in all, we have reduced the number of relevant lines in `Assertions` to about 3% of the original source code." ] }, { "cell_type": "code", "execution_count": 131, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.519663Z", "iopub.status.busy": "2023-11-12T12:40:48.519544Z", "iopub.status.idle": "2023-11-12T12:40:48.521875Z", "shell.execute_reply": "2023-11-12T12:40:48.521596Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.025362318840579712" ] }, "execution_count": 131, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(reduced_assertions_source_lines) / len(assertions_source_lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The astute reader may notice that `remove_html_markup()`, as shown above, is slightly different from the original version in the [chapter on assertions](Assertions.ipynb). Here's the original version for comparison:" ] }, { "cell_type": "code", "execution_count": 132, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.523406Z", "iopub.status.busy": "2023-11-12T12:40:48.523293Z", "iopub.status.idle": "2023-11-12T12:40:48.557455Z", "shell.execute_reply": "2023-11-12T12:40:48.557152Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s): \u001b[37m# type: ignore\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out = \u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", "\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " out = out + c\u001b[37m\u001b[39;49;00m\n", "\u001b[37m\u001b[39;49;00m\n", " \u001b[37m# postcondition\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34massert\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m \u001b[35min\u001b[39;49;00m out \u001b[35mand\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m \u001b[35min\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m\n", "\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "remove_html_markup_source_lines, _ = inspect.getsourcelines(Assertions.remove_html_markup)\n", "print_content(''.join(remove_html_markup_source_lines), '.py')" ] }, { "cell_type": "code", "execution_count": 133, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.559395Z", "iopub.status.busy": "2023-11-12T12:40:48.559257Z", "iopub.status.idle": "2023-11-12T12:40:48.563851Z", "shell.execute_reply": "2023-11-12T12:40:48.563551Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

In the reduced version, what has changed?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 133, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz(\"In the reduced version, what has changed?\",\n", " [\n", " \"Comments are deleted\",\n", " \"Blank lines are deleted\",\n", " \"Initializations are deleted\",\n", " \"The assertion is deleted\",\n", " ], '[(1 ** 0 - -1 ** 0) ** n for n in range(0, 3)]')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Indeed, Delta Debugging has determined all these as being irrelevant for reproducing the failure – and consequently, has deleted them." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reducing Code Characters" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can reduce the code further by removing individual _characters_ rather than lines. To this end, we convert our (already reduced) `remove_html_markup()` code into a list of characters." ] }, { "cell_type": "code", "execution_count": 134, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.565460Z", "iopub.status.busy": "2023-11-12T12:40:48.565363Z", "iopub.status.idle": "2023-11-12T12:40:48.567536Z", "shell.execute_reply": "2023-11-12T12:40:48.567197Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['d', 'e', 'f', ' ', 'r', 'e', 'm', 'o', 'v', 'e', '_', 'h', 't', 'm', 'l', '_', 'm', 'a', 'r', 'k', 'u', 'p', '(', 's', ')', ':', ' ', ' ', '#', ' ']\n" ] } ], "source": [ "reduced_assertions_source_characters = list(\"\".join(reduced_assertions_source_lines))\n", "print(reduced_assertions_source_characters[:30])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our `compile_and_test_html_markup()` works (and fails) as before: It still joins the given strings into one and executes them. (Remember that in Python, \"characters\" are simply strings of length one.)" ] }, { "cell_type": "code", "execution_count": 135, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.569165Z", "iopub.status.busy": "2023-11-12T12:40:48.569031Z", "iopub.status.idle": "2023-11-12T12:40:48.571250Z", "shell.execute_reply": "2023-11-12T12:40:48.570986Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/909985861.py\", line 2, in \n", " compile_and_test_html_markup(reduced_assertions_source_characters)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4067432722.py\", line 2, in compile_and_test_html_markup\n", " compile_and_run(lines +\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2075409738.py\", line 3, in compile_and_run\n", " exec(\"\".join(lines), {'__name__': '',\n", " File \"\", line 17, in \n", "AssertionError: My Test (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " compile_and_test_html_markup(reduced_assertions_source_characters)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let's see what Delta Debugging makes of that – and also, how long it takes. The `Timer` class gives us a simple means to measure time." ] }, { "cell_type": "code", "execution_count": 136, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.572789Z", "iopub.status.busy": "2023-11-12T12:40:48.572680Z", "iopub.status.idle": "2023-11-12T12:40:48.574316Z", "shell.execute_reply": "2023-11-12T12:40:48.574051Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Timer import Timer" ] }, { "cell_type": "code", "execution_count": 137, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.575718Z", "iopub.status.busy": "2023-11-12T12:40:48.575633Z", "iopub.status.idle": "2023-11-12T12:40:48.577670Z", "shell.execute_reply": "2023-11-12T12:40:48.577285Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "with DeltaDebugger(log=False) as dd:\n", " compile_and_test_html_markup(reduced_assertions_source_characters)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here's the reduced result:" ] }, { "cell_type": "code", "execution_count": 138, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.579753Z", "iopub.status.busy": "2023-11-12T12:40:48.579562Z", "iopub.status.idle": "2023-11-12T12:40:48.922166Z", "shell.execute_reply": "2023-11-12T12:40:48.921781Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s):\u001b[37m\u001b[39;49;00m\n", " tag=\u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " quote=\u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out=\u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c==\u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:tag=\u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c==\u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:tag=\u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c==\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[35mor\u001b[39;49;00m c==\u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[35mand\u001b[39;49;00m g:\u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:out=out+c\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "with Timer() as t:\n", " further_reduced_assertions_source_characters = dd.min_args()['lines']\n", "print_content(\"\".join(further_reduced_assertions_source_characters), \".py\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "There's a number of observations we can make about this code.\n", "\n", "* All superfluous blanks and even newlines have been removed.\n", "* As a curiosity, the initialization of `quote` and `out` to `\"\"` is now merged into a single (semantics-preserving) statement.\n", "* The semantics and effect of `<` and `>` characters is preserved, as mandated by our `RuntimeError` check.\n", "* Double quotes still have the effect of not being included in the returned value: the remaining `quote` has no effect.\n", "\n", "Semantics-wise, this reduced variant still yields the \"original\" failure; the biggest semantic differences, though, are in the condition and code associated with double quotes – which actually also is the location of the defect to be fixed. This is how reducing code can also point to not only _necessary_ locations, but also _defective_ locations." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Mind you that reducing code is not cheap, and especially not if you remove by characters. It has taken `DeltaDebugger` several thousand tests to obtain the result above:" ] }, { "cell_type": "code", "execution_count": 139, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.923938Z", "iopub.status.busy": "2023-11-12T12:40:48.923810Z", "iopub.status.idle": "2023-11-12T12:40:48.926211Z", "shell.execute_reply": "2023-11-12T12:40:48.925867Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1278" ] }, "execution_count": 139, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.tests" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And to do so, it even required _several seconds_. This may be little for a human, but from a CPU standpoint, this is an enormous effort." ] }, { "cell_type": "code", "execution_count": 140, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.928006Z", "iopub.status.busy": "2023-11-12T12:40:48.927842Z", "iopub.status.idle": "2023-11-12T12:40:48.930226Z", "shell.execute_reply": "2023-11-12T12:40:48.929931Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.3095454590002191" ] }, "execution_count": 140, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.elapsed_time()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reducing Syntax Trees" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "When reducing code (or generally speaking, recursive structures), using a _syntactic_ approach can be a much better alternative to the _line-by-line_ or _character-by-character_ approaches discussed above. The idea is that one represents the input as a _tree_ (rather than a sequence of strings), in which a reducer would work on entire subtrees, deleting or reducing parts of the tree." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We illustrate this concept on _syntax trees_ representing Python code. Python provides us with simple means to interactively convert code into syntax trees (and back again). So, in order to reduce code, we can\n", "\n", "1. _parse_ the program code into a syntax tree (called *abstract syntax tree* or *AST*);\n", "2. reduce the syntax tree to a minimum, executing it to test reductions; and\n", "3. _unparse_ the tree to obtain textual code again.\n", "\n", "Since transformations on the AST are much less likely to produce syntax errors, reducing ASTs is much more efficient than reducing program code as text." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In the [chapter on slicing](Slicer.ipynb), we already have seen several examples on how to work with ASTs. In our context, an AST also offers additional possibilities for reducing. Notably, instead of just _deleting_ code fragments, we can also _replace_ them with simpler fragments. For instance, we can replace arithmetic expressions with constants, or conditional statements `if cond: body` with the associated body `body`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us illustrate how this works, again choosing `remove_html_markup()` as our ongoing example. One more time, we create a function with associated test." ] }, { "cell_type": "code", "execution_count": 141, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.931907Z", "iopub.status.busy": "2023-11-12T12:40:48.931779Z", "iopub.status.idle": "2023-11-12T12:40:48.934048Z", "shell.execute_reply": "2023-11-12T12:40:48.933763Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "fun_source = inspect.getsource(remove_html_markup)" ] }, { "cell_type": "code", "execution_count": 142, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.935812Z", "iopub.status.busy": "2023-11-12T12:40:48.935690Z", "iopub.status.idle": "2023-11-12T12:40:48.968605Z", "shell.execute_reply": "2023-11-12T12:40:48.968312Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s): \u001b[37m# type: ignore\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out = \u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", "\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " out = out + c\u001b[37m\u001b[39;49;00m\n", "\u001b[37m\u001b[39;49;00m\n", " \u001b[37m# postcondition\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34massert\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m \u001b[35min\u001b[39;49;00m out \u001b[35mand\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m \u001b[35min\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m\n", "\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(fun_source, '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### From Code to Syntax Trees\n", "\n", "Let us parse this piece of code into an AST. This is done by the `ast.parse()` function." ] }, { "cell_type": "code", "execution_count": 143, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.970261Z", "iopub.status.busy": "2023-11-12T12:40:48.970142Z", "iopub.status.idle": "2023-11-12T12:40:48.971720Z", "shell.execute_reply": "2023-11-12T12:40:48.971435Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import ast" ] }, { "cell_type": "code", "execution_count": 144, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.973273Z", "iopub.status.busy": "2023-11-12T12:40:48.973152Z", "iopub.status.idle": "2023-11-12T12:40:48.974976Z", "shell.execute_reply": "2023-11-12T12:40:48.974687Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "fun_tree: ast.Module = ast.parse(fun_source)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The parsed tree contains the function definition:" ] }, { "cell_type": "code", "execution_count": 145, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.976554Z", "iopub.status.busy": "2023-11-12T12:40:48.976430Z", "iopub.status.idle": "2023-11-12T12:40:48.978300Z", "shell.execute_reply": "2023-11-12T12:40:48.977990Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import show_ast" ] }, { "cell_type": "code", "execution_count": 146, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:48.979921Z", "iopub.status.busy": "2023-11-12T12:40:48.979793Z", "iopub.status.idle": "2023-11-12T12:40:49.500416Z", "shell.execute_reply": "2023-11-12T12:40:49.499979Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "0\n", "FunctionDef\n", "\n", "\n", "\n", "1\n", ""remove_html_markup"\n", "\n", "\n", "\n", "0--1\n", "\n", "\n", "\n", "\n", "2\n", "arguments\n", "\n", "\n", "\n", "0--2\n", "\n", "\n", "\n", "\n", "5\n", "Assign\n", "\n", "\n", "\n", "0--5\n", "\n", "\n", "\n", "\n", "11\n", "Assign\n", "\n", "\n", "\n", "0--11\n", "\n", "\n", "\n", "\n", "17\n", "Assign\n", "\n", "\n", "\n", "0--17\n", "\n", "\n", "\n", "\n", "23\n", "For\n", "\n", "\n", "\n", "0--23\n", "\n", "\n", "\n", "\n", "121\n", "Assert\n", "\n", "\n", "\n", "0--121\n", "\n", "\n", "\n", "\n", "138\n", "Return\n", "\n", "\n", "\n", "0--138\n", "\n", "\n", "\n", "\n", "3\n", "arg\n", "\n", "\n", "\n", "2--3\n", "\n", "\n", "\n", "\n", "4\n", ""s"\n", "\n", "\n", "\n", "3--4\n", "\n", "\n", "\n", "\n", "6\n", "Name\n", "\n", "\n", "\n", "5--6\n", "\n", "\n", "\n", "\n", "9\n", "Constant\n", "\n", "\n", "\n", "5--9\n", "\n", "\n", "\n", "\n", "7\n", ""tag"\n", "\n", "\n", "\n", "6--7\n", "\n", "\n", "\n", "\n", "8\n", "Store\n", "\n", "\n", "\n", "6--8\n", "\n", "\n", "\n", "\n", "10\n", "False\n", "\n", "\n", "\n", "9--10\n", "\n", "\n", "\n", "\n", "12\n", "Name\n", "\n", "\n", "\n", "11--12\n", "\n", "\n", "\n", "\n", "15\n", "Constant\n", "\n", "\n", "\n", "11--15\n", "\n", "\n", "\n", "\n", "13\n", ""quote"\n", "\n", "\n", "\n", "12--13\n", "\n", "\n", "\n", "\n", "14\n", "Store\n", "\n", "\n", "\n", "12--14\n", "\n", "\n", "\n", "\n", "16\n", "False\n", "\n", "\n", "\n", "15--16\n", "\n", "\n", "\n", "\n", "18\n", "Name\n", "\n", "\n", "\n", "17--18\n", "\n", "\n", "\n", "\n", "21\n", "Constant\n", "\n", "\n", "\n", "17--21\n", "\n", "\n", "\n", "\n", "19\n", ""out"\n", "\n", "\n", "\n", "18--19\n", "\n", "\n", "\n", "\n", "20\n", "Store\n", "\n", "\n", "\n", "18--20\n", "\n", "\n", "\n", "\n", "22\n", """\n", "\n", "\n", "\n", "21--22\n", "\n", "\n", "\n", "\n", "24\n", "Name\n", "\n", "\n", "\n", "23--24\n", "\n", "\n", "\n", "\n", "27\n", "Name\n", "\n", "\n", "\n", "23--27\n", "\n", "\n", "\n", "\n", "30\n", "If\n", "\n", "\n", "\n", "23--30\n", "\n", "\n", "\n", "\n", "25\n", ""c"\n", "\n", "\n", "\n", "24--25\n", "\n", "\n", "\n", "\n", "26\n", "Store\n", "\n", "\n", "\n", "24--26\n", "\n", "\n", "\n", "\n", "28\n", ""s"\n", "\n", "\n", "\n", "27--28\n", "\n", "\n", "\n", "\n", "29\n", "Load\n", "\n", "\n", "\n", "27--29\n", "\n", "\n", "\n", "\n", "31\n", "BoolOp\n", "\n", "\n", "\n", "30--31\n", "\n", "\n", "\n", "\n", "45\n", "Assign\n", "\n", "\n", "\n", "30--45\n", "\n", "\n", "\n", "\n", "51\n", "If\n", "\n", "\n", "\n", "30--51\n", "\n", "\n", "\n", "\n", "32\n", "And\n", "\n", "\n", "\n", "31--32\n", "\n", "\n", "\n", "\n", "33\n", "Compare\n", "\n", "\n", "\n", "31--33\n", "\n", "\n", "\n", "\n", "40\n", "UnaryOp\n", "\n", "\n", "\n", "31--40\n", "\n", "\n", "\n", "\n", "34\n", "Name\n", "\n", "\n", "\n", "33--34\n", "\n", "\n", "\n", "\n", "37\n", "Eq\n", "\n", "\n", "\n", "33--37\n", "\n", "\n", "\n", "\n", "38\n", "Constant\n", "\n", "\n", "\n", "33--38\n", "\n", "\n", "\n", "\n", "35\n", ""c"\n", "\n", "\n", "\n", "34--35\n", "\n", "\n", "\n", "\n", "36\n", "Load\n", "\n", "\n", "\n", "34--36\n", "\n", "\n", "\n", "\n", "39\n", ""<"\n", "\n", "\n", "\n", "38--39\n", "\n", "\n", "\n", "\n", "41\n", "Not\n", "\n", "\n", "\n", "40--41\n", "\n", "\n", "\n", "\n", "42\n", "Name\n", "\n", "\n", "\n", "40--42\n", "\n", "\n", "\n", "\n", "43\n", ""quote"\n", "\n", "\n", "\n", "42--43\n", "\n", "\n", "\n", "\n", "44\n", "Load\n", "\n", "\n", "\n", "42--44\n", "\n", "\n", "\n", "\n", "46\n", "Name\n", "\n", "\n", "\n", "45--46\n", "\n", "\n", "\n", "\n", "49\n", "Constant\n", "\n", "\n", "\n", "45--49\n", "\n", "\n", "\n", "\n", "47\n", ""tag"\n", "\n", "\n", "\n", "46--47\n", "\n", "\n", "\n", "\n", "48\n", "Store\n", "\n", "\n", "\n", "46--48\n", "\n", "\n", "\n", "\n", "50\n", "True\n", "\n", "\n", "\n", "49--50\n", "\n", "\n", "\n", "\n", "52\n", "BoolOp\n", "\n", "\n", "\n", "51--52\n", "\n", "\n", "\n", "\n", "66\n", "Assign\n", "\n", "\n", "\n", "51--66\n", "\n", "\n", "\n", "\n", "72\n", "If\n", "\n", "\n", "\n", "51--72\n", "\n", "\n", "\n", "\n", "53\n", "And\n", "\n", "\n", "\n", "52--53\n", "\n", "\n", "\n", "\n", "54\n", "Compare\n", "\n", "\n", "\n", "52--54\n", "\n", "\n", "\n", "\n", "61\n", "UnaryOp\n", "\n", "\n", "\n", "52--61\n", "\n", "\n", "\n", "\n", "55\n", "Name\n", "\n", "\n", "\n", "54--55\n", "\n", "\n", "\n", "\n", "58\n", "Eq\n", "\n", "\n", "\n", "54--58\n", "\n", "\n", "\n", "\n", "59\n", "Constant\n", "\n", "\n", "\n", "54--59\n", "\n", "\n", "\n", "\n", "56\n", ""c"\n", "\n", "\n", "\n", "55--56\n", "\n", "\n", "\n", "\n", "57\n", "Load\n", "\n", "\n", "\n", "55--57\n", "\n", "\n", "\n", "\n", "60\n", "">"\n", "\n", "\n", "\n", "59--60\n", "\n", "\n", "\n", "\n", "62\n", "Not\n", "\n", "\n", "\n", "61--62\n", "\n", "\n", "\n", "\n", "63\n", "Name\n", "\n", "\n", "\n", "61--63\n", "\n", "\n", "\n", "\n", "64\n", ""quote"\n", "\n", "\n", "\n", "63--64\n", "\n", "\n", "\n", "\n", "65\n", "Load\n", "\n", "\n", "\n", "63--65\n", "\n", "\n", "\n", "\n", "67\n", "Name\n", "\n", "\n", "\n", "66--67\n", "\n", "\n", "\n", "\n", "70\n", "Constant\n", "\n", "\n", "\n", "66--70\n", "\n", "\n", "\n", "\n", "68\n", ""tag"\n", "\n", "\n", "\n", "67--68\n", "\n", "\n", "\n", "\n", "69\n", "Store\n", "\n", "\n", "\n", "67--69\n", "\n", "\n", "\n", "\n", "71\n", "False\n", "\n", "\n", "\n", "70--71\n", "\n", "\n", "\n", "\n", "73\n", "BoolOp\n", "\n", "\n", "\n", "72--73\n", "\n", "\n", "\n", "\n", "94\n", "Assign\n", "\n", "\n", "\n", "72--94\n", "\n", "\n", "\n", "\n", "103\n", "If\n", "\n", "\n", "\n", "72--103\n", "\n", "\n", "\n", "\n", "74\n", "Or\n", "\n", "\n", "\n", "73--74\n", "\n", "\n", "\n", "\n", "75\n", "Compare\n", "\n", "\n", "\n", "73--75\n", "\n", "\n", "\n", "\n", "82\n", "BoolOp\n", "\n", "\n", "\n", "73--82\n", "\n", "\n", "\n", "\n", "76\n", "Name\n", "\n", "\n", "\n", "75--76\n", "\n", "\n", "\n", "\n", "79\n", "Eq\n", "\n", "\n", "\n", "75--79\n", "\n", "\n", "\n", "\n", "80\n", "Constant\n", "\n", "\n", "\n", "75--80\n", "\n", "\n", "\n", "\n", "77\n", ""c"\n", "\n", "\n", "\n", "76--77\n", "\n", "\n", "\n", "\n", "78\n", "Load\n", "\n", "\n", "\n", "76--78\n", "\n", "\n", "\n", "\n", "81\n", """"\n", "\n", "\n", "\n", "80--81\n", "\n", "\n", "\n", "\n", "83\n", "And\n", "\n", "\n", "\n", "82--83\n", "\n", "\n", "\n", "\n", "84\n", "Compare\n", "\n", "\n", "\n", "82--84\n", "\n", "\n", "\n", "\n", "91\n", "Name\n", "\n", "\n", "\n", "82--91\n", "\n", "\n", "\n", "\n", "85\n", "Name\n", "\n", "\n", "\n", "84--85\n", "\n", "\n", "\n", "\n", "88\n", "Eq\n", "\n", "\n", "\n", "84--88\n", "\n", "\n", "\n", "\n", "89\n", "Constant\n", "\n", "\n", "\n", "84--89\n", "\n", "\n", "\n", "\n", "86\n", ""c"\n", "\n", "\n", "\n", "85--86\n", "\n", "\n", "\n", "\n", "87\n", "Load\n", "\n", "\n", "\n", "85--87\n", "\n", "\n", "\n", "\n", "90\n", ""'"\n", "\n", "\n", "\n", "89--90\n", "\n", "\n", "\n", "\n", "92\n", ""tag"\n", "\n", "\n", "\n", "91--92\n", "\n", "\n", "\n", "\n", "93\n", "Load\n", "\n", "\n", "\n", "91--93\n", "\n", "\n", "\n", "\n", "95\n", "Name\n", "\n", "\n", "\n", "94--95\n", "\n", "\n", "\n", "\n", "98\n", "UnaryOp\n", "\n", "\n", "\n", "94--98\n", "\n", "\n", "\n", "\n", "96\n", ""quote"\n", "\n", "\n", "\n", "95--96\n", "\n", "\n", "\n", "\n", "97\n", "Store\n", "\n", "\n", "\n", "95--97\n", "\n", "\n", "\n", "\n", "99\n", "Not\n", "\n", "\n", "\n", "98--99\n", "\n", "\n", "\n", "\n", "100\n", "Name\n", "\n", "\n", "\n", "98--100\n", "\n", "\n", "\n", "\n", "101\n", ""quote"\n", "\n", "\n", "\n", "100--101\n", "\n", "\n", "\n", "\n", "102\n", "Load\n", "\n", "\n", "\n", "100--102\n", "\n", "\n", "\n", "\n", "104\n", "UnaryOp\n", "\n", "\n", "\n", "103--104\n", "\n", "\n", "\n", "\n", "109\n", "Assign\n", "\n", "\n", "\n", "103--109\n", "\n", "\n", "\n", "\n", "105\n", "Not\n", "\n", "\n", "\n", "104--105\n", "\n", "\n", "\n", "\n", "106\n", "Name\n", "\n", "\n", "\n", "104--106\n", "\n", "\n", "\n", "\n", "107\n", ""tag"\n", "\n", "\n", "\n", "106--107\n", "\n", "\n", "\n", "\n", "108\n", "Load\n", "\n", "\n", "\n", "106--108\n", "\n", "\n", "\n", "\n", "110\n", "Name\n", "\n", "\n", "\n", "109--110\n", "\n", "\n", "\n", "\n", "113\n", "BinOp\n", "\n", "\n", "\n", "109--113\n", "\n", "\n", "\n", "\n", "111\n", ""out"\n", "\n", "\n", "\n", "110--111\n", "\n", "\n", "\n", "\n", "112\n", "Store\n", "\n", "\n", "\n", "110--112\n", "\n", "\n", "\n", "\n", "114\n", "Name\n", "\n", "\n", "\n", "113--114\n", "\n", "\n", "\n", "\n", "117\n", "Add\n", "\n", "\n", "\n", "113--117\n", "\n", "\n", "\n", "\n", "118\n", "Name\n", "\n", "\n", "\n", "113--118\n", "\n", "\n", "\n", "\n", "115\n", ""out"\n", "\n", "\n", "\n", "114--115\n", "\n", "\n", "\n", "\n", "116\n", "Load\n", "\n", "\n", "\n", "114--116\n", "\n", "\n", "\n", "\n", "119\n", ""c"\n", "\n", "\n", "\n", "118--119\n", "\n", "\n", "\n", "\n", "120\n", "Load\n", "\n", "\n", "\n", "118--120\n", "\n", "\n", "\n", "\n", "122\n", "BoolOp\n", "\n", "\n", "\n", "121--122\n", "\n", "\n", "\n", "\n", "123\n", "And\n", "\n", "\n", "\n", "122--123\n", "\n", "\n", "\n", "\n", "124\n", "Compare\n", "\n", "\n", "\n", "122--124\n", "\n", "\n", "\n", "\n", "131\n", "Compare\n", "\n", "\n", "\n", "122--131\n", "\n", "\n", "\n", "\n", "125\n", "Constant\n", "\n", "\n", "\n", "124--125\n", "\n", "\n", "\n", "\n", "127\n", "NotIn\n", "\n", "\n", "\n", "124--127\n", "\n", "\n", "\n", "\n", "128\n", "Name\n", "\n", "\n", "\n", "124--128\n", "\n", "\n", "\n", "\n", "126\n", ""<"\n", "\n", "\n", "\n", "125--126\n", "\n", "\n", "\n", "\n", "129\n", ""out"\n", "\n", "\n", "\n", "128--129\n", "\n", "\n", "\n", "\n", "130\n", "Load\n", "\n", "\n", "\n", "128--130\n", "\n", "\n", "\n", "\n", "132\n", "Constant\n", "\n", "\n", "\n", "131--132\n", "\n", "\n", "\n", "\n", "134\n", "NotIn\n", "\n", "\n", "\n", "131--134\n", "\n", "\n", "\n", "\n", "135\n", "Name\n", "\n", "\n", "\n", "131--135\n", "\n", "\n", "\n", "\n", "133\n", "">"\n", "\n", "\n", "\n", "132--133\n", "\n", "\n", "\n", "\n", "136\n", ""out"\n", "\n", "\n", "\n", "135--136\n", "\n", "\n", "\n", "\n", "137\n", "Load\n", "\n", "\n", "\n", "135--137\n", "\n", "\n", "\n", "\n", "139\n", "Name\n", "\n", "\n", "\n", "138--139\n", "\n", "\n", "\n", "\n", "140\n", ""out"\n", "\n", "\n", "\n", "139--140\n", "\n", "\n", "\n", "\n", "141\n", "Load\n", "\n", "\n", "\n", "139--141\n", "\n", "\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_ast(fun_tree)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us add some tests to this, using the same scheme:" ] }, { "cell_type": "code", "execution_count": 147, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.502334Z", "iopub.status.busy": "2023-11-12T12:40:49.502208Z", "iopub.status.idle": "2023-11-12T12:40:49.504010Z", "shell.execute_reply": "2023-11-12T12:40:49.503770Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "test_source = (\n", " '''if remove_html_markup('bar') != 'bar':\\n''' +\n", " ''' raise RuntimeError(\"Missing functionality\")\\n''' +\n", " '''assert remove_html_markup('\"foo\"') == '\"foo\"', \"My Test\"'''\n", ")" ] }, { "cell_type": "code", "execution_count": 148, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.505757Z", "iopub.status.busy": "2023-11-12T12:40:49.505506Z", "iopub.status.idle": "2023-11-12T12:40:49.507565Z", "shell.execute_reply": "2023-11-12T12:40:49.507217Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "test_tree: ast.Module = ast.parse(test_source)" ] }, { "cell_type": "code", "execution_count": 149, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.509685Z", "iopub.status.busy": "2023-11-12T12:40:49.509467Z", "iopub.status.idle": "2023-11-12T12:40:49.543821Z", "shell.execute_reply": "2023-11-12T12:40:49.543410Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mif\u001b[39;49;00m remove_html_markup(\u001b[33m'\u001b[39;49;00m\u001b[33mbar\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m) != \u001b[33m'\u001b[39;49;00m\u001b[33mbar\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mraise\u001b[39;49;00m \u001b[36mRuntimeError\u001b[39;49;00m(\u001b[33m'\u001b[39;49;00m\u001b[33mMissing functionality\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m)\u001b[37m\u001b[39;49;00m\n", "\u001b[34massert\u001b[39;49;00m remove_html_markup(\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33mfoo\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m) == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33mfoo\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m, \u001b[33m'\u001b[39;49;00m\u001b[33mMy Test\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(ast.unparse(test_tree), '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can merge the function definition tree and the test tree into a single one:" ] }, { "cell_type": "code", "execution_count": 150, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.545614Z", "iopub.status.busy": "2023-11-12T12:40:49.545485Z", "iopub.status.idle": "2023-11-12T12:40:49.547153Z", "shell.execute_reply": "2023-11-12T12:40:49.546878Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import copy" ] }, { "cell_type": "code", "execution_count": 151, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.548558Z", "iopub.status.busy": "2023-11-12T12:40:49.548466Z", "iopub.status.idle": "2023-11-12T12:40:49.551049Z", "shell.execute_reply": "2023-11-12T12:40:49.550601Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "fun_test_tree = copy.deepcopy(fun_tree)\n", "fun_test_tree.body += test_tree.body" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Such a tree can be compiled into a code object, using Python's `compile()` function:" ] }, { "cell_type": "code", "execution_count": 152, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.553458Z", "iopub.status.busy": "2023-11-12T12:40:49.553334Z", "iopub.status.idle": "2023-11-12T12:40:49.555741Z", "shell.execute_reply": "2023-11-12T12:40:49.555211Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "fun_test_code = compile(fun_test_tree, '', 'exec')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "and the resulting code object can be executed directly, using the Python `exec()` function. We see that our test fails as expected." ] }, { "cell_type": "code", "execution_count": 153, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.558127Z", "iopub.status.busy": "2023-11-12T12:40:49.557818Z", "iopub.status.idle": "2023-11-12T12:40:49.560142Z", "shell.execute_reply": "2023-11-12T12:40:49.559844Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/1290587190.py\", line 2, in \n", " exec(fun_test_code, {}, {})\n", " File \"\", line 3, in \n", "AssertionError: My Test (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " exec(fun_test_code, {}, {})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Traversing Syntax Trees\n", "\n", "Our goal is now to reduce this tree (or at least the subtree with the function definition) to a minimum. \n", "To this end, we manipulate the AST through the `ast` Python module. The [official Python `ast` reference](http://docs.python.org/3/library/ast) is complete, but a bit brief; the documentation [\"Green Tree Snakes - the missing Python AST docs\"](https://greentreesnakes.readthedocs.io/en/latest/) provides an excellent introduction." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The two means for exploring and changing ASTs are the classes `NodeVisitor` and `NodeTransformer`, respectively. We start with creating a list of all nodes in the tree, using a `NodeVisitor` subclass.\n", "\n", "Its `visit()` method is called for every node in the tree, which we achieve by having it return `self.generic_visit()` for the current node. It saves all visited nodes in the `_all_nodes` attribute." ] }, { "cell_type": "code", "execution_count": 154, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.561904Z", "iopub.status.busy": "2023-11-12T12:40:49.561767Z", "iopub.status.idle": "2023-11-12T12:40:49.563566Z", "shell.execute_reply": "2023-11-12T12:40:49.563223Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from ast import NodeTransformer, NodeVisitor, AST" ] }, { "cell_type": "code", "execution_count": 155, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.565570Z", "iopub.status.busy": "2023-11-12T12:40:49.565444Z", "iopub.status.idle": "2023-11-12T12:40:49.568265Z", "shell.execute_reply": "2023-11-12T12:40:49.567884Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class NodeCollector(NodeVisitor):\n", " \"\"\"Collect all nodes in an AST.\"\"\"\n", "\n", " def __init__(self) -> None:\n", " super().__init__()\n", " self._all_nodes: List[AST] = []\n", "\n", " def generic_visit(self, node: AST) -> None:\n", " self._all_nodes.append(node)\n", " return super().generic_visit(node)\n", "\n", " def collect(self, tree: AST) -> List[AST]:\n", " \"\"\"Return a list of all nodes in tree.\"\"\"\n", " self._all_nodes = []\n", " self.visit(tree)\n", " return self._all_nodes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This is how our `NodeCollector()` class produces a list of all nodes:" ] }, { "cell_type": "code", "execution_count": 156, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.570197Z", "iopub.status.busy": "2023-11-12T12:40:49.570047Z", "iopub.status.idle": "2023-11-12T12:40:49.572654Z", "shell.execute_reply": "2023-11-12T12:40:49.572322Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "107" ] }, "execution_count": 156, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fun_nodes = NodeCollector().collect(fun_tree)\n", "len(fun_nodes)" ] }, { "cell_type": "code", "execution_count": 157, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.574457Z", "iopub.status.busy": "2023-11-12T12:40:49.574295Z", "iopub.status.idle": "2023-11-12T12:40:49.576963Z", "shell.execute_reply": "2023-11-12T12:40:49.576680Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ]" ] }, "execution_count": 157, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fun_nodes[:30]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Such a list of nodes is what we can feed into Delta Debugging in order to reduce it. The idea is that with every test, we take the tree and for each node in the tree, we check whether it is still in the list – if not, we remove it. Thus, by reducing the list of nodes, we simultaneously reduce the tree as well." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Deleting Nodes\n", "\n", "In our next step, we write some code that, given such a list of nodes, _prunes_ the tree such that _only_ elements in the list are still contained. To this end, we proceed in four steps:\n", "\n", "1. We traverse the original AST, _marking_ all nodes as \"to be deleted\".\n", "2. We traverse the given list of nodes, clearing their markers.\n", "3. We copy the original tree (including the markers) into a new tree – the one to be reduced.\n", "4. We traverse the new tree, now deleting all marked nodes." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Why do we go through such an extra effort? The reason is that our list of nodes contains references into the _original_ tree – a tree that needs to stay unchanged such that we can reuse it for later. The new tree (the copy) has the same nodes, but at different addresses, so our original references cannot be used anymore. Markers, however, just like any other attributes, are safely copied from the original into the new tree." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `NodeMarker()` visitor marks all nodes in a tree:" ] }, { "cell_type": "code", "execution_count": 158, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.578746Z", "iopub.status.busy": "2023-11-12T12:40:49.578629Z", "iopub.status.idle": "2023-11-12T12:40:49.580698Z", "shell.execute_reply": "2023-11-12T12:40:49.580425Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class NodeMarker(NodeVisitor):\n", " def visit(self, node: AST) -> AST:\n", " node.marked = True # type: ignore\n", " return super().generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `NodeReducer()` transformer reduces all marked nodes. If a method `visit_()` is defined, it will be invoked; otherwise, `visit_Node()` is invoked, which _deletes_ the node (and its subtree) by returning `None`. " ] }, { "cell_type": "code", "execution_count": 159, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.582529Z", "iopub.status.busy": "2023-11-12T12:40:49.582223Z", "iopub.status.idle": "2023-11-12T12:40:49.585082Z", "shell.execute_reply": "2023-11-12T12:40:49.584792Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class NodeReducer(NodeTransformer):\n", " def visit(self, node: AST) -> Any:\n", " method = 'visit_' + node.__class__.__name__\n", " visitor = getattr(self, method, self.visit_Node)\n", " return visitor(node)\n", "\n", " def visit_Module(self, node: AST) -> Any:\n", " # Can't remove modules\n", " return super().generic_visit(node)\n", "\n", " def visit_Node(self, node: AST) -> Any:\n", " \"\"\"Default visitor for all nodes\"\"\"\n", " if node.marked: # type: ignore\n", " return None # delete it\n", " return super().generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Our function `copy_and_reduce()` puts these pieces together:" ] }, { "cell_type": "code", "execution_count": 160, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.586557Z", "iopub.status.busy": "2023-11-12T12:40:49.586448Z", "iopub.status.idle": "2023-11-12T12:40:49.588616Z", "shell.execute_reply": "2023-11-12T12:40:49.588345Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def copy_and_reduce(tree: AST, keep_list: List[AST]) -> AST:\n", " \"\"\"Copy tree, reducing all nodes that are not in keep_list.\"\"\"\n", "\n", " # Mark all nodes except those in keep_list\n", " NodeMarker().visit(tree)\n", " for node in keep_list:\n", " # print(\"Clearing\", node)\n", " node.marked = False # type: ignore\n", "\n", " # Copy tree and delete marked nodes\n", " new_tree = copy.deepcopy(tree)\n", " NodeReducer().visit(new_tree)\n", " return new_tree" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us apply this in practice. We take the first assignment in our tree..." ] }, { "cell_type": "code", "execution_count": 161, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.590391Z", "iopub.status.busy": "2023-11-12T12:40:49.590271Z", "iopub.status.idle": "2023-11-12T12:40:49.592929Z", "shell.execute_reply": "2023-11-12T12:40:49.592544Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 161, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fun_nodes[4]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "... whose subtree happens to be the assignment to `tag`:" ] }, { "cell_type": "code", "execution_count": 162, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.594585Z", "iopub.status.busy": "2023-11-12T12:40:49.594472Z", "iopub.status.idle": "2023-11-12T12:40:49.596573Z", "shell.execute_reply": "2023-11-12T12:40:49.596312Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'tag = False'" ] }, "execution_count": 162, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ast.unparse(fun_nodes[4])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We keep all nodes _except_ for this one." ] }, { "cell_type": "code", "execution_count": 163, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.598181Z", "iopub.status.busy": "2023-11-12T12:40:49.598055Z", "iopub.status.idle": "2023-11-12T12:40:49.599730Z", "shell.execute_reply": "2023-11-12T12:40:49.599473Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "keep_list = fun_nodes.copy()\n", "del keep_list[4]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us now create a copy of the tree in which the assignment is missing:" ] }, { "cell_type": "code", "execution_count": 164, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:49.601331Z", "iopub.status.busy": "2023-11-12T12:40:49.601213Z", "iopub.status.idle": "2023-11-12T12:40:50.074728Z", "shell.execute_reply": "2023-11-12T12:40:50.074307Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "0\n", "FunctionDef\n", "\n", "\n", "\n", "1\n", ""remove_html_markup"\n", "\n", "\n", "\n", "0--1\n", "\n", "\n", "\n", "\n", "2\n", "arguments\n", "\n", "\n", "\n", "0--2\n", "\n", "\n", "\n", "\n", "5\n", "Assign\n", "\n", "\n", "\n", "0--5\n", "\n", "\n", "\n", "\n", "11\n", "Assign\n", "\n", "\n", "\n", "0--11\n", "\n", "\n", "\n", "\n", "17\n", "For\n", "\n", "\n", "\n", "0--17\n", "\n", "\n", "\n", "\n", "115\n", "Assert\n", "\n", "\n", "\n", "0--115\n", "\n", "\n", "\n", "\n", "132\n", "Return\n", "\n", "\n", "\n", "0--132\n", "\n", "\n", "\n", "\n", "3\n", "arg\n", "\n", "\n", "\n", "2--3\n", "\n", "\n", "\n", "\n", "4\n", ""s"\n", "\n", "\n", "\n", "3--4\n", "\n", "\n", "\n", "\n", "6\n", "Name\n", "\n", "\n", "\n", "5--6\n", "\n", "\n", "\n", "\n", "9\n", "Constant\n", "\n", "\n", "\n", "5--9\n", "\n", "\n", "\n", "\n", "7\n", ""quote"\n", "\n", "\n", "\n", "6--7\n", "\n", "\n", "\n", "\n", "8\n", "Store\n", "\n", "\n", "\n", "6--8\n", "\n", "\n", "\n", "\n", "10\n", "False\n", "\n", "\n", "\n", "9--10\n", "\n", "\n", "\n", "\n", "12\n", "Name\n", "\n", "\n", "\n", "11--12\n", "\n", "\n", "\n", "\n", "15\n", "Constant\n", "\n", "\n", "\n", "11--15\n", "\n", "\n", "\n", "\n", "13\n", ""out"\n", "\n", "\n", "\n", "12--13\n", "\n", "\n", "\n", "\n", "14\n", "Store\n", "\n", "\n", "\n", "12--14\n", "\n", "\n", "\n", "\n", "16\n", """\n", "\n", "\n", "\n", "15--16\n", "\n", "\n", "\n", "\n", "18\n", "Name\n", "\n", "\n", "\n", "17--18\n", "\n", "\n", "\n", "\n", "21\n", "Name\n", "\n", "\n", "\n", "17--21\n", "\n", "\n", "\n", "\n", "24\n", "If\n", "\n", "\n", "\n", "17--24\n", "\n", "\n", "\n", "\n", "19\n", ""c"\n", "\n", "\n", "\n", "18--19\n", "\n", "\n", "\n", "\n", "20\n", "Store\n", "\n", "\n", "\n", "18--20\n", "\n", "\n", "\n", "\n", "22\n", ""s"\n", "\n", "\n", "\n", "21--22\n", "\n", "\n", "\n", "\n", "23\n", "Load\n", "\n", "\n", "\n", "21--23\n", "\n", "\n", "\n", "\n", "25\n", "BoolOp\n", "\n", "\n", "\n", "24--25\n", "\n", "\n", "\n", "\n", "39\n", "Assign\n", "\n", "\n", "\n", "24--39\n", "\n", "\n", "\n", "\n", "45\n", "If\n", "\n", "\n", "\n", "24--45\n", "\n", "\n", "\n", "\n", "26\n", "And\n", "\n", "\n", "\n", "25--26\n", "\n", "\n", "\n", "\n", "27\n", "Compare\n", "\n", "\n", "\n", "25--27\n", "\n", "\n", "\n", "\n", "34\n", "UnaryOp\n", "\n", "\n", "\n", "25--34\n", "\n", "\n", "\n", "\n", "28\n", "Name\n", "\n", "\n", "\n", "27--28\n", "\n", "\n", "\n", "\n", "31\n", "Eq\n", "\n", "\n", "\n", "27--31\n", "\n", "\n", "\n", "\n", "32\n", "Constant\n", "\n", "\n", "\n", "27--32\n", "\n", "\n", "\n", "\n", "29\n", ""c"\n", "\n", "\n", "\n", "28--29\n", "\n", "\n", "\n", "\n", "30\n", "Load\n", "\n", "\n", "\n", "28--30\n", "\n", "\n", "\n", "\n", "33\n", ""<"\n", "\n", "\n", "\n", "32--33\n", "\n", "\n", "\n", "\n", "35\n", "Not\n", "\n", "\n", "\n", "34--35\n", "\n", "\n", "\n", "\n", "36\n", "Name\n", "\n", "\n", "\n", "34--36\n", "\n", "\n", "\n", "\n", "37\n", ""quote"\n", "\n", "\n", "\n", "36--37\n", "\n", "\n", "\n", "\n", "38\n", "Load\n", "\n", "\n", "\n", "36--38\n", "\n", "\n", "\n", "\n", "40\n", "Name\n", "\n", "\n", "\n", "39--40\n", "\n", "\n", "\n", "\n", "43\n", "Constant\n", "\n", "\n", "\n", "39--43\n", "\n", "\n", "\n", "\n", "41\n", ""tag"\n", "\n", "\n", "\n", "40--41\n", "\n", "\n", "\n", "\n", "42\n", "Store\n", "\n", "\n", "\n", "40--42\n", "\n", "\n", "\n", "\n", "44\n", "True\n", "\n", "\n", "\n", "43--44\n", "\n", "\n", "\n", "\n", "46\n", "BoolOp\n", "\n", "\n", "\n", "45--46\n", "\n", "\n", "\n", "\n", "60\n", "Assign\n", "\n", "\n", "\n", "45--60\n", "\n", "\n", "\n", "\n", "66\n", "If\n", "\n", "\n", "\n", "45--66\n", "\n", "\n", "\n", "\n", "47\n", "And\n", "\n", "\n", "\n", "46--47\n", "\n", "\n", "\n", "\n", "48\n", "Compare\n", "\n", "\n", "\n", "46--48\n", "\n", "\n", "\n", "\n", "55\n", "UnaryOp\n", "\n", "\n", "\n", "46--55\n", "\n", "\n", "\n", "\n", "49\n", "Name\n", "\n", "\n", "\n", "48--49\n", "\n", "\n", "\n", "\n", "52\n", "Eq\n", "\n", "\n", "\n", "48--52\n", "\n", "\n", "\n", "\n", "53\n", "Constant\n", "\n", "\n", "\n", "48--53\n", "\n", "\n", "\n", "\n", "50\n", ""c"\n", "\n", "\n", "\n", "49--50\n", "\n", "\n", "\n", "\n", "51\n", "Load\n", "\n", "\n", "\n", "49--51\n", "\n", "\n", "\n", "\n", "54\n", "">"\n", "\n", "\n", "\n", "53--54\n", "\n", "\n", "\n", "\n", "56\n", "Not\n", "\n", "\n", "\n", "55--56\n", "\n", "\n", "\n", "\n", "57\n", "Name\n", "\n", "\n", "\n", "55--57\n", "\n", "\n", "\n", "\n", "58\n", ""quote"\n", "\n", "\n", "\n", "57--58\n", "\n", "\n", "\n", "\n", "59\n", "Load\n", "\n", "\n", "\n", "57--59\n", "\n", "\n", "\n", "\n", "61\n", "Name\n", "\n", "\n", "\n", "60--61\n", "\n", "\n", "\n", "\n", "64\n", "Constant\n", "\n", "\n", "\n", "60--64\n", "\n", "\n", "\n", "\n", "62\n", ""tag"\n", "\n", "\n", "\n", "61--62\n", "\n", "\n", "\n", "\n", "63\n", "Store\n", "\n", "\n", "\n", "61--63\n", "\n", "\n", "\n", "\n", "65\n", "False\n", "\n", "\n", "\n", "64--65\n", "\n", "\n", "\n", "\n", "67\n", "BoolOp\n", "\n", "\n", "\n", "66--67\n", "\n", "\n", "\n", "\n", "88\n", "Assign\n", "\n", "\n", "\n", "66--88\n", "\n", "\n", "\n", "\n", "97\n", "If\n", "\n", "\n", "\n", "66--97\n", "\n", "\n", "\n", "\n", "68\n", "Or\n", "\n", "\n", "\n", "67--68\n", "\n", "\n", "\n", "\n", "69\n", "Compare\n", "\n", "\n", "\n", "67--69\n", "\n", "\n", "\n", "\n", "76\n", "BoolOp\n", "\n", "\n", "\n", "67--76\n", "\n", "\n", "\n", "\n", "70\n", "Name\n", "\n", "\n", "\n", "69--70\n", "\n", "\n", "\n", "\n", "73\n", "Eq\n", "\n", "\n", "\n", "69--73\n", "\n", "\n", "\n", "\n", "74\n", "Constant\n", "\n", "\n", "\n", "69--74\n", "\n", "\n", "\n", "\n", "71\n", ""c"\n", "\n", "\n", "\n", "70--71\n", "\n", "\n", "\n", "\n", "72\n", "Load\n", "\n", "\n", "\n", "70--72\n", "\n", "\n", "\n", "\n", "75\n", """"\n", "\n", "\n", "\n", "74--75\n", "\n", "\n", "\n", "\n", "77\n", "And\n", "\n", "\n", "\n", "76--77\n", "\n", "\n", "\n", "\n", "78\n", "Compare\n", "\n", "\n", "\n", "76--78\n", "\n", "\n", "\n", "\n", "85\n", "Name\n", "\n", "\n", "\n", "76--85\n", "\n", "\n", "\n", "\n", "79\n", "Name\n", "\n", "\n", "\n", "78--79\n", "\n", "\n", "\n", "\n", "82\n", "Eq\n", "\n", "\n", "\n", "78--82\n", "\n", "\n", "\n", "\n", "83\n", "Constant\n", "\n", "\n", "\n", "78--83\n", "\n", "\n", "\n", "\n", "80\n", ""c"\n", "\n", "\n", "\n", "79--80\n", "\n", "\n", "\n", "\n", "81\n", "Load\n", "\n", "\n", "\n", "79--81\n", "\n", "\n", "\n", "\n", "84\n", ""'"\n", "\n", "\n", "\n", "83--84\n", "\n", "\n", "\n", "\n", "86\n", ""tag"\n", "\n", "\n", "\n", "85--86\n", "\n", "\n", "\n", "\n", "87\n", "Load\n", "\n", "\n", "\n", "85--87\n", "\n", "\n", "\n", "\n", "89\n", "Name\n", "\n", "\n", "\n", "88--89\n", "\n", "\n", "\n", "\n", "92\n", "UnaryOp\n", "\n", "\n", "\n", "88--92\n", "\n", "\n", "\n", "\n", "90\n", ""quote"\n", "\n", "\n", "\n", "89--90\n", "\n", "\n", "\n", "\n", "91\n", "Store\n", "\n", "\n", "\n", "89--91\n", "\n", "\n", "\n", "\n", "93\n", "Not\n", "\n", "\n", "\n", "92--93\n", "\n", "\n", "\n", "\n", "94\n", "Name\n", "\n", "\n", "\n", "92--94\n", "\n", "\n", "\n", "\n", "95\n", ""quote"\n", "\n", "\n", "\n", "94--95\n", "\n", "\n", "\n", "\n", "96\n", "Load\n", "\n", "\n", "\n", "94--96\n", "\n", "\n", "\n", "\n", "98\n", "UnaryOp\n", "\n", "\n", "\n", "97--98\n", "\n", "\n", "\n", "\n", "103\n", "Assign\n", "\n", "\n", "\n", "97--103\n", "\n", "\n", "\n", "\n", "99\n", "Not\n", "\n", "\n", "\n", "98--99\n", "\n", "\n", "\n", "\n", "100\n", "Name\n", "\n", "\n", "\n", "98--100\n", "\n", "\n", "\n", "\n", "101\n", ""tag"\n", "\n", "\n", "\n", "100--101\n", "\n", "\n", "\n", "\n", "102\n", "Load\n", "\n", "\n", "\n", "100--102\n", "\n", "\n", "\n", "\n", "104\n", "Name\n", "\n", "\n", "\n", "103--104\n", "\n", "\n", "\n", "\n", "107\n", "BinOp\n", "\n", "\n", "\n", "103--107\n", "\n", "\n", "\n", "\n", "105\n", ""out"\n", "\n", "\n", "\n", "104--105\n", "\n", "\n", "\n", "\n", "106\n", "Store\n", "\n", "\n", "\n", "104--106\n", "\n", "\n", "\n", "\n", "108\n", "Name\n", "\n", "\n", "\n", "107--108\n", "\n", "\n", "\n", "\n", "111\n", "Add\n", "\n", "\n", "\n", "107--111\n", "\n", "\n", "\n", "\n", "112\n", "Name\n", "\n", "\n", "\n", "107--112\n", "\n", "\n", "\n", "\n", "109\n", ""out"\n", "\n", "\n", "\n", "108--109\n", "\n", "\n", "\n", "\n", "110\n", "Load\n", "\n", "\n", "\n", "108--110\n", "\n", "\n", "\n", "\n", "113\n", ""c"\n", "\n", "\n", "\n", "112--113\n", "\n", "\n", "\n", "\n", "114\n", "Load\n", "\n", "\n", "\n", "112--114\n", "\n", "\n", "\n", "\n", "116\n", "BoolOp\n", "\n", "\n", "\n", "115--116\n", "\n", "\n", "\n", "\n", "117\n", "And\n", "\n", "\n", "\n", "116--117\n", "\n", "\n", "\n", "\n", "118\n", "Compare\n", "\n", "\n", "\n", "116--118\n", "\n", "\n", "\n", "\n", "125\n", "Compare\n", "\n", "\n", "\n", "116--125\n", "\n", "\n", "\n", "\n", "119\n", "Constant\n", "\n", "\n", "\n", "118--119\n", "\n", "\n", "\n", "\n", "121\n", "NotIn\n", "\n", "\n", "\n", "118--121\n", "\n", "\n", "\n", "\n", "122\n", "Name\n", "\n", "\n", "\n", "118--122\n", "\n", "\n", "\n", "\n", "120\n", ""<"\n", "\n", "\n", "\n", "119--120\n", "\n", "\n", "\n", "\n", "123\n", ""out"\n", "\n", "\n", "\n", "122--123\n", "\n", "\n", "\n", "\n", "124\n", "Load\n", "\n", "\n", "\n", "122--124\n", "\n", "\n", "\n", "\n", "126\n", "Constant\n", "\n", "\n", "\n", "125--126\n", "\n", "\n", "\n", "\n", "128\n", "NotIn\n", "\n", "\n", "\n", "125--128\n", "\n", "\n", "\n", "\n", "129\n", "Name\n", "\n", "\n", "\n", "125--129\n", "\n", "\n", "\n", "\n", "127\n", "">"\n", "\n", "\n", "\n", "126--127\n", "\n", "\n", "\n", "\n", "130\n", ""out"\n", "\n", "\n", "\n", "129--130\n", "\n", "\n", "\n", "\n", "131\n", "Load\n", "\n", "\n", "\n", "129--131\n", "\n", "\n", "\n", "\n", "133\n", "Name\n", "\n", "\n", "\n", "132--133\n", "\n", "\n", "\n", "\n", "134\n", ""out"\n", "\n", "\n", "\n", "133--134\n", "\n", "\n", "\n", "\n", "135\n", "Load\n", "\n", "\n", "\n", "133--135\n", "\n", "\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "new_fun_tree = cast(ast.Module, copy_and_reduce(fun_tree, keep_list))\n", "show_ast(new_fun_tree)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The new tree no longer contains the initial assignment to `tag`:" ] }, { "cell_type": "code", "execution_count": 165, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.077378Z", "iopub.status.busy": "2023-11-12T12:40:50.077062Z", "iopub.status.idle": "2023-11-12T12:40:50.113121Z", "shell.execute_reply": "2023-11-12T12:40:50.112807Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s):\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out = \u001b[33m'\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m (\u001b[35mnot\u001b[39;49;00m quote):\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m (\u001b[35mnot\u001b[39;49;00m quote):\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m (c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag):\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " out = out + c\u001b[37m\u001b[39;49;00m\n", " \u001b[34massert\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m \u001b[35min\u001b[39;49;00m out \u001b[35mand\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m \u001b[35min\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(ast.unparse(new_fun_tree), '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If we add our tests and then execute this code, we get an error, as `tag` is now no longer initialized:" ] }, { "cell_type": "code", "execution_count": 166, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.114789Z", "iopub.status.busy": "2023-11-12T12:40:50.114635Z", "iopub.status.idle": "2023-11-12T12:40:50.116750Z", "shell.execute_reply": "2023-11-12T12:40:50.116336Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "new_fun_tree.body += test_tree.body" ] }, { "cell_type": "code", "execution_count": 167, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.118555Z", "iopub.status.busy": "2023-11-12T12:40:50.118403Z", "iopub.status.idle": "2023-11-12T12:40:50.120471Z", "shell.execute_reply": "2023-11-12T12:40:50.120202Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "fun_code = compile(new_fun_tree, \"\", 'exec')" ] }, { "cell_type": "code", "execution_count": 168, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.121921Z", "iopub.status.busy": "2023-11-12T12:40:50.121808Z", "iopub.status.idle": "2023-11-12T12:40:50.123749Z", "shell.execute_reply": "2023-11-12T12:40:50.123517Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2411822553.py\", line 2, in \n", " exec(fun_code, {}, {})\n", " File \"\", line 3, in \n", " File \"\", line 13, in remove_html_markup\n", "UnboundLocalError: local variable 'tag' referenced before assignment (expected)\n" ] } ], "source": [ "with ExpectError(UnboundLocalError):\n", " exec(fun_code, {}, {})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If we have _no_ node in the keep list, the whole tree gets deleted:" ] }, { "cell_type": "code", "execution_count": 169, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.125989Z", "iopub.status.busy": "2023-11-12T12:40:50.125825Z", "iopub.status.idle": "2023-11-12T12:40:50.128446Z", "shell.execute_reply": "2023-11-12T12:40:50.128177Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "empty_tree = copy_and_reduce(fun_tree, [])" ] }, { "cell_type": "code", "execution_count": 170, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.130082Z", "iopub.status.busy": "2023-11-12T12:40:50.129940Z", "iopub.status.idle": "2023-11-12T12:40:50.132553Z", "shell.execute_reply": "2023-11-12T12:40:50.132218Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "''" ] }, "execution_count": 170, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ast.unparse(empty_tree)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Reducing Trees\n", "\n", "We can put all these steps together in a single function. `compile_and_test_ast()` takes a tree and a list of nodes, reduces the tree to those nodes in the list, and then compiles and runs the reduced AST." ] }, { "cell_type": "code", "execution_count": 171, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.134149Z", "iopub.status.busy": "2023-11-12T12:40:50.134029Z", "iopub.status.idle": "2023-11-12T12:40:50.136447Z", "shell.execute_reply": "2023-11-12T12:40:50.136181Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def compile_and_test_ast(tree: ast.Module, keep_list: List[AST], \n", " test_tree: Optional[ast.Module] = None) -> None:\n", " new_tree = cast(ast.Module, copy_and_reduce(tree, keep_list))\n", " # print(ast.unparse(new_tree))\n", "\n", " if test_tree is not None:\n", " new_tree.body += test_tree.body\n", "\n", " try:\n", " code_object = compile(new_tree, '', 'exec')\n", " except Exception:\n", " raise SyntaxError(\"Cannot compile\")\n", "\n", " exec(code_object, {}, {})" ] }, { "cell_type": "code", "execution_count": 172, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.137827Z", "iopub.status.busy": "2023-11-12T12:40:50.137720Z", "iopub.status.idle": "2023-11-12T12:40:50.140423Z", "shell.execute_reply": "2023-11-12T12:40:50.140140Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2588008834.py\", line 2, in \n", " compile_and_test_ast(fun_tree, fun_nodes, test_tree)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/1107678207.py\", line 14, in compile_and_test_ast\n", " exec(code_object, {}, {})\n", " File \"\", line 3, in \n", "AssertionError: My Test (expected)\n" ] } ], "source": [ "with ExpectError(AssertionError):\n", " compile_and_test_ast(fun_tree, fun_nodes, test_tree)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "When we run our delta debugger on the AST, this is the list of remaining nodes we obtain:" ] }, { "cell_type": "code", "execution_count": 173, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.142721Z", "iopub.status.busy": "2023-11-12T12:40:50.142516Z", "iopub.status.idle": "2023-11-12T12:40:50.145559Z", "shell.execute_reply": "2023-11-12T12:40:50.145218Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with DeltaDebugger() as dd:\n", " compile_and_test_ast(fun_tree, fun_nodes, test_tree)" ] }, { "cell_type": "code", "execution_count": 174, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.147126Z", "iopub.status.busy": "2023-11-12T12:40:50.147034Z", "iopub.status.idle": "2023-11-12T12:40:50.355730Z", "shell.execute_reply": "2023-11-12T12:40:50.355428Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "57" ] }, "execution_count": 174, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reduced_nodes = dd.min_args()['keep_list']\n", "len(reduced_nodes)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is the associated tree:" ] }, { "cell_type": "code", "execution_count": 175, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.357518Z", "iopub.status.busy": "2023-11-12T12:40:50.357291Z", "iopub.status.idle": "2023-11-12T12:40:50.897503Z", "shell.execute_reply": "2023-11-12T12:40:50.896916Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "0\n", "FunctionDef\n", "\n", "\n", "\n", "1\n", ""remove_html_markup"\n", "\n", "\n", "\n", "0--1\n", "\n", "\n", "\n", "\n", "2\n", "arguments\n", "\n", "\n", "\n", "0--2\n", "\n", "\n", "\n", "\n", "5\n", "Assign\n", "\n", "\n", "\n", "0--5\n", "\n", "\n", "\n", "\n", "11\n", "Assign\n", "\n", "\n", "\n", "0--11\n", "\n", "\n", "\n", "\n", "17\n", "Assign\n", "\n", "\n", "\n", "0--17\n", "\n", "\n", "\n", "\n", "23\n", "For\n", "\n", "\n", "\n", "0--23\n", "\n", "\n", "\n", "\n", "121\n", "Return\n", "\n", "\n", "\n", "0--121\n", "\n", "\n", "\n", "\n", "3\n", "arg\n", "\n", "\n", "\n", "2--3\n", "\n", "\n", "\n", "\n", "4\n", ""s"\n", "\n", "\n", "\n", "3--4\n", "\n", "\n", "\n", "\n", "6\n", "Name\n", "\n", "\n", "\n", "5--6\n", "\n", "\n", "\n", "\n", "9\n", "Constant\n", "\n", "\n", "\n", "5--9\n", "\n", "\n", "\n", "\n", "7\n", ""tag"\n", "\n", "\n", "\n", "6--7\n", "\n", "\n", "\n", "\n", "8\n", "Store\n", "\n", "\n", "\n", "6--8\n", "\n", "\n", "\n", "\n", "10\n", "False\n", "\n", "\n", "\n", "9--10\n", "\n", "\n", "\n", "\n", "12\n", "Name\n", "\n", "\n", "\n", "11--12\n", "\n", "\n", "\n", "\n", "15\n", "Constant\n", "\n", "\n", "\n", "11--15\n", "\n", "\n", "\n", "\n", "13\n", ""quote"\n", "\n", "\n", "\n", "12--13\n", "\n", "\n", "\n", "\n", "14\n", "Store\n", "\n", "\n", "\n", "12--14\n", "\n", "\n", "\n", "\n", "16\n", "False\n", "\n", "\n", "\n", "15--16\n", "\n", "\n", "\n", "\n", "18\n", "Name\n", "\n", "\n", "\n", "17--18\n", "\n", "\n", "\n", "\n", "21\n", "Constant\n", "\n", "\n", "\n", "17--21\n", "\n", "\n", "\n", "\n", "19\n", ""out"\n", "\n", "\n", "\n", "18--19\n", "\n", "\n", "\n", "\n", "20\n", "Store\n", "\n", "\n", "\n", "18--20\n", "\n", "\n", "\n", "\n", "22\n", """\n", "\n", "\n", "\n", "21--22\n", "\n", "\n", "\n", "\n", "24\n", "Name\n", "\n", "\n", "\n", "23--24\n", "\n", "\n", "\n", "\n", "27\n", "Name\n", "\n", "\n", "\n", "23--27\n", "\n", "\n", "\n", "\n", "30\n", "If\n", "\n", "\n", "\n", "23--30\n", "\n", "\n", "\n", "\n", "25\n", ""c"\n", "\n", "\n", "\n", "24--25\n", "\n", "\n", "\n", "\n", "26\n", "Store\n", "\n", "\n", "\n", "24--26\n", "\n", "\n", "\n", "\n", "28\n", ""s"\n", "\n", "\n", "\n", "27--28\n", "\n", "\n", "\n", "\n", "29\n", "Load\n", "\n", "\n", "\n", "27--29\n", "\n", "\n", "\n", "\n", "31\n", "BoolOp\n", "\n", "\n", "\n", "30--31\n", "\n", "\n", "\n", "\n", "45\n", "Assign\n", "\n", "\n", "\n", "30--45\n", "\n", "\n", "\n", "\n", "51\n", "If\n", "\n", "\n", "\n", "30--51\n", "\n", "\n", "\n", "\n", "32\n", "And\n", "\n", "\n", "\n", "31--32\n", "\n", "\n", "\n", "\n", "33\n", "Compare\n", "\n", "\n", "\n", "31--33\n", "\n", "\n", "\n", "\n", "40\n", "UnaryOp\n", "\n", "\n", "\n", "31--40\n", "\n", "\n", "\n", "\n", "34\n", "Name\n", "\n", "\n", "\n", "33--34\n", "\n", "\n", "\n", "\n", "37\n", "Eq\n", "\n", "\n", "\n", "33--37\n", "\n", "\n", "\n", "\n", "38\n", "Constant\n", "\n", "\n", "\n", "33--38\n", "\n", "\n", "\n", "\n", "35\n", ""c"\n", "\n", "\n", "\n", "34--35\n", "\n", "\n", "\n", "\n", "36\n", "Load\n", "\n", "\n", "\n", "34--36\n", "\n", "\n", "\n", "\n", "39\n", ""<"\n", "\n", "\n", "\n", "38--39\n", "\n", "\n", "\n", "\n", "41\n", "Not\n", "\n", "\n", "\n", "40--41\n", "\n", "\n", "\n", "\n", "42\n", "Name\n", "\n", "\n", "\n", "40--42\n", "\n", "\n", "\n", "\n", "43\n", ""quote"\n", "\n", "\n", "\n", "42--43\n", "\n", "\n", "\n", "\n", "44\n", "Load\n", "\n", "\n", "\n", "42--44\n", "\n", "\n", "\n", "\n", "46\n", "Name\n", "\n", "\n", "\n", "45--46\n", "\n", "\n", "\n", "\n", "49\n", "Constant\n", "\n", "\n", "\n", "45--49\n", "\n", "\n", "\n", "\n", "47\n", ""tag"\n", "\n", "\n", "\n", "46--47\n", "\n", "\n", "\n", "\n", "48\n", "Store\n", "\n", "\n", "\n", "46--48\n", "\n", "\n", "\n", "\n", "50\n", "True\n", "\n", "\n", "\n", "49--50\n", "\n", "\n", "\n", "\n", "52\n", "BoolOp\n", "\n", "\n", "\n", "51--52\n", "\n", "\n", "\n", "\n", "66\n", "Assign\n", "\n", "\n", "\n", "51--66\n", "\n", "\n", "\n", "\n", "72\n", "If\n", "\n", "\n", "\n", "51--72\n", "\n", "\n", "\n", "\n", "53\n", "And\n", "\n", "\n", "\n", "52--53\n", "\n", "\n", "\n", "\n", "54\n", "Compare\n", "\n", "\n", "\n", "52--54\n", "\n", "\n", "\n", "\n", "61\n", "UnaryOp\n", "\n", "\n", "\n", "52--61\n", "\n", "\n", "\n", "\n", "55\n", "Name\n", "\n", "\n", "\n", "54--55\n", "\n", "\n", "\n", "\n", "58\n", "Eq\n", "\n", "\n", "\n", "54--58\n", "\n", "\n", "\n", "\n", "59\n", "Constant\n", "\n", "\n", "\n", "54--59\n", "\n", "\n", "\n", "\n", "56\n", ""c"\n", "\n", "\n", "\n", "55--56\n", "\n", "\n", "\n", "\n", "57\n", "Load\n", "\n", "\n", "\n", "55--57\n", "\n", "\n", "\n", "\n", "60\n", "">"\n", "\n", "\n", "\n", "59--60\n", "\n", "\n", "\n", "\n", "62\n", "Not\n", "\n", "\n", "\n", "61--62\n", "\n", "\n", "\n", "\n", "63\n", "Name\n", "\n", "\n", "\n", "61--63\n", "\n", "\n", "\n", "\n", "64\n", ""quote"\n", "\n", "\n", "\n", "63--64\n", "\n", "\n", "\n", "\n", "65\n", "Load\n", "\n", "\n", "\n", "63--65\n", "\n", "\n", "\n", "\n", "67\n", "Name\n", "\n", "\n", "\n", "66--67\n", "\n", "\n", "\n", "\n", "70\n", "Constant\n", "\n", "\n", "\n", "66--70\n", "\n", "\n", "\n", "\n", "68\n", ""tag"\n", "\n", "\n", "\n", "67--68\n", "\n", "\n", "\n", "\n", "69\n", "Store\n", "\n", "\n", "\n", "67--69\n", "\n", "\n", "\n", "\n", "71\n", "False\n", "\n", "\n", "\n", "70--71\n", "\n", "\n", "\n", "\n", "73\n", "BoolOp\n", "\n", "\n", "\n", "72--73\n", "\n", "\n", "\n", "\n", "94\n", "Assign\n", "\n", "\n", "\n", "72--94\n", "\n", "\n", "\n", "\n", "103\n", "If\n", "\n", "\n", "\n", "72--103\n", "\n", "\n", "\n", "\n", "74\n", "Or\n", "\n", "\n", "\n", "73--74\n", "\n", "\n", "\n", "\n", "75\n", "Compare\n", "\n", "\n", "\n", "73--75\n", "\n", "\n", "\n", "\n", "82\n", "BoolOp\n", "\n", "\n", "\n", "73--82\n", "\n", "\n", "\n", "\n", "76\n", "Name\n", "\n", "\n", "\n", "75--76\n", "\n", "\n", "\n", "\n", "79\n", "Eq\n", "\n", "\n", "\n", "75--79\n", "\n", "\n", "\n", "\n", "80\n", "Constant\n", "\n", "\n", "\n", "75--80\n", "\n", "\n", "\n", "\n", "77\n", ""c"\n", "\n", "\n", "\n", "76--77\n", "\n", "\n", "\n", "\n", "78\n", "Load\n", "\n", "\n", "\n", "76--78\n", "\n", "\n", "\n", "\n", "81\n", """"\n", "\n", "\n", "\n", "80--81\n", "\n", "\n", "\n", "\n", "83\n", "And\n", "\n", "\n", "\n", "82--83\n", "\n", "\n", "\n", "\n", "84\n", "Compare\n", "\n", "\n", "\n", "82--84\n", "\n", "\n", "\n", "\n", "91\n", "Name\n", "\n", "\n", "\n", "82--91\n", "\n", "\n", "\n", "\n", "85\n", "Name\n", "\n", "\n", "\n", "84--85\n", "\n", "\n", "\n", "\n", "88\n", "Eq\n", "\n", "\n", "\n", "84--88\n", "\n", "\n", "\n", "\n", "89\n", "Constant\n", "\n", "\n", "\n", "84--89\n", "\n", "\n", "\n", "\n", "86\n", ""c"\n", "\n", "\n", "\n", "85--86\n", "\n", "\n", "\n", "\n", "87\n", "Load\n", "\n", "\n", "\n", "85--87\n", "\n", "\n", "\n", "\n", "90\n", ""'"\n", "\n", "\n", "\n", "89--90\n", "\n", "\n", "\n", "\n", "92\n", ""tag"\n", "\n", "\n", "\n", "91--92\n", "\n", "\n", "\n", "\n", "93\n", "Load\n", "\n", "\n", "\n", "91--93\n", "\n", "\n", "\n", "\n", "95\n", "Name\n", "\n", "\n", "\n", "94--95\n", "\n", "\n", "\n", "\n", "98\n", "UnaryOp\n", "\n", "\n", "\n", "94--98\n", "\n", "\n", "\n", "\n", "96\n", ""quote"\n", "\n", "\n", "\n", "95--96\n", "\n", "\n", "\n", "\n", "97\n", "Store\n", "\n", "\n", "\n", "95--97\n", "\n", "\n", "\n", "\n", "99\n", "Not\n", "\n", "\n", "\n", "98--99\n", "\n", "\n", "\n", "\n", "100\n", "Name\n", "\n", "\n", "\n", "98--100\n", "\n", "\n", "\n", "\n", "101\n", ""quote"\n", "\n", "\n", "\n", "100--101\n", "\n", "\n", "\n", "\n", "102\n", "Load\n", "\n", "\n", "\n", "100--102\n", "\n", "\n", "\n", "\n", "104\n", "UnaryOp\n", "\n", "\n", "\n", "103--104\n", "\n", "\n", "\n", "\n", "109\n", "Assign\n", "\n", "\n", "\n", "103--109\n", "\n", "\n", "\n", "\n", "105\n", "Not\n", "\n", "\n", "\n", "104--105\n", "\n", "\n", "\n", "\n", "106\n", "Name\n", "\n", "\n", "\n", "104--106\n", "\n", "\n", "\n", "\n", "107\n", ""tag"\n", "\n", "\n", "\n", "106--107\n", "\n", "\n", "\n", "\n", "108\n", "Load\n", "\n", "\n", "\n", "106--108\n", "\n", "\n", "\n", "\n", "110\n", "Name\n", "\n", "\n", "\n", "109--110\n", "\n", "\n", "\n", "\n", "113\n", "BinOp\n", "\n", "\n", "\n", "109--113\n", "\n", "\n", "\n", "\n", "111\n", ""out"\n", "\n", "\n", "\n", "110--111\n", "\n", "\n", "\n", "\n", "112\n", "Store\n", "\n", "\n", "\n", "110--112\n", "\n", "\n", "\n", "\n", "114\n", "Name\n", "\n", "\n", "\n", "113--114\n", "\n", "\n", "\n", "\n", "117\n", "Add\n", "\n", "\n", "\n", "113--117\n", "\n", "\n", "\n", "\n", "118\n", "Name\n", "\n", "\n", "\n", "113--118\n", "\n", "\n", "\n", "\n", "115\n", ""out"\n", "\n", "\n", "\n", "114--115\n", "\n", "\n", "\n", "\n", "116\n", "Load\n", "\n", "\n", "\n", "114--116\n", "\n", "\n", "\n", "\n", "119\n", ""c"\n", "\n", "\n", "\n", "118--119\n", "\n", "\n", "\n", "\n", "120\n", "Load\n", "\n", "\n", "\n", "118--120\n", "\n", "\n", "\n", "\n", "122\n", "Name\n", "\n", "\n", "\n", "121--122\n", "\n", "\n", "\n", "\n", "123\n", ""out"\n", "\n", "\n", "\n", "122--123\n", "\n", "\n", "\n", "\n", "124\n", "Load\n", "\n", "\n", "\n", "122--124\n", "\n", "\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "reduced_fun_tree = copy_and_reduce(fun_tree, reduced_nodes)\n", "show_ast(reduced_fun_tree)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And this is its textual representation:" ] }, { "cell_type": "code", "execution_count": 176, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.899409Z", "iopub.status.busy": "2023-11-12T12:40:50.899280Z", "iopub.status.idle": "2023-11-12T12:40:50.933790Z", "shell.execute_reply": "2023-11-12T12:40:50.933494Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s):\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out = \u001b[33m'\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m (\u001b[35mnot\u001b[39;49;00m quote):\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m (\u001b[35mnot\u001b[39;49;00m quote):\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m (c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag):\u001b[37m\u001b[39;49;00m\n", " quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " out = out + c\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(ast.unparse(reduced_fun_tree), '.py')" ] }, { "cell_type": "code", "execution_count": 177, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.935384Z", "iopub.status.busy": "2023-11-12T12:40:50.935268Z", "iopub.status.idle": "2023-11-12T12:40:50.937423Z", "shell.execute_reply": "2023-11-12T12:40:50.937006Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "310" ] }, "execution_count": 177, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.tests" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We see that some code was deleted – notably the assertion at the end – but otherwise, our deletion strategy was not particularly effective. This is because in Python, one cannot simply delete the single statement in a controlled body – this raises a syntax error. One would have to replace it with `pass` (or some other statement with no effect) to stay syntactically valid. Still, the syntax-based reduction would still single out `remove_html_markup()` from the `Assertions` source code – and do so even faster, as it would apply on one definition (rather than one line) after another." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Transforming Nodes\n", "\n", "To further boost our syntactic reduction strategy, we implement a set of additional reduction operators. First, as already discussed, we do not simply delete an assignment, but we replace it with a `pass` statement. To obtain the tree for `pass`, we simply parse it and access the subtree." ] }, { "cell_type": "code", "execution_count": 178, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.939276Z", "iopub.status.busy": "2023-11-12T12:40:50.939126Z", "iopub.status.idle": "2023-11-12T12:40:50.941429Z", "shell.execute_reply": "2023-11-12T12:40:50.941138Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class NodeReducer(NodeReducer):\n", " PASS_TREE = ast.parse(\"pass\").body[0]\n", "\n", " def visit_Assign(self, node: ast.Assign) -> AST:\n", " if node.marked: # type: ignore\n", " # Replace by pass\n", " return self.PASS_TREE\n", " return super().generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In a similar vein, we can replace comparison operators with `False`:" ] }, { "cell_type": "code", "execution_count": 179, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.943062Z", "iopub.status.busy": "2023-11-12T12:40:50.942938Z", "iopub.status.idle": "2023-11-12T12:40:50.945222Z", "shell.execute_reply": "2023-11-12T12:40:50.944897Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class NodeReducer(NodeReducer):\n", " FALSE_TREE = ast.parse(\"False\").body[0].value # type: ignore\n", "\n", " def visit_Compare(self, node: ast.Compare) -> AST:\n", " if node.marked: # type: ignore\n", " # Replace by False\n", " return self.FALSE_TREE\n", " return super().generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If we have a Boolean operator, we attempt to replace it with its left operand:" ] }, { "cell_type": "code", "execution_count": 180, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.947064Z", "iopub.status.busy": "2023-11-12T12:40:50.946921Z", "iopub.status.idle": "2023-11-12T12:40:50.949038Z", "shell.execute_reply": "2023-11-12T12:40:50.948746Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class NodeReducer(NodeReducer):\n", " def visit_BoolOp(self, node: ast.BoolOp) -> AST:\n", " if node.marked: # type: ignore\n", " # Replace by left operator\n", " return node.values[0]\n", " return super().generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And if we find an `If` clause, we attempt to replace it by its body:" ] }, { "cell_type": "code", "execution_count": 181, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.950478Z", "iopub.status.busy": "2023-11-12T12:40:50.950371Z", "iopub.status.idle": "2023-11-12T12:40:50.952432Z", "shell.execute_reply": "2023-11-12T12:40:50.952171Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class NodeReducer(NodeReducer):\n", " def visit_If(self, node: ast.If) -> Union[AST, List[ast.stmt]]:\n", " if node.marked: # type: ignore\n", " # Replace by body\n", " return node.body\n", " return super().generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us try to reduce our code with these additional reducers enabled:" ] }, { "cell_type": "code", "execution_count": 182, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.954010Z", "iopub.status.busy": "2023-11-12T12:40:50.953854Z", "iopub.status.idle": "2023-11-12T12:40:50.956669Z", "shell.execute_reply": "2023-11-12T12:40:50.956379Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "with DeltaDebugger() as dd:\n", " compile_and_test_ast(fun_tree, fun_nodes, test_tree)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is the reduced code we get. We see that all references to `quote` have gone, as has the handling of single quotes – none of this is relevant for the failure:" ] }, { "cell_type": "code", "execution_count": 183, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:50.958354Z", "iopub.status.busy": "2023-11-12T12:40:50.958227Z", "iopub.status.idle": "2023-11-12T12:40:51.180757Z", "shell.execute_reply": "2023-11-12T12:40:51.180302Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s):\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mpass\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " out = \u001b[33m'\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mpass\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " out = out + c\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "reduced_nodes = dd.min_args()['keep_list']\n", "reduced_fun_tree = copy_and_reduce(fun_tree, reduced_nodes)\n", "print_content(ast.unparse(reduced_fun_tree), '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Again, the best insights come from comparing this reduced version to the original implementation – and we learn that the problem is not related to the `quote` variable, or to the handling of single quotes; the problem is simply that when the input contains double quotes, these are not added to the final string." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "With our reduction code, however, we only touch the surface of what could actually be possible. So far, we implement exactly one reduction per node – but of course, there are many alternatives an expression or statement could be reduced to. We will explore some of these in the [exercises](#Exercises), below; also be sure to check out the [background](#Background) on code reduction." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Synopsis\n", "\n", "A _reducer_ takes a failure-inducing input and reduces it to the minimum that still reproduces the failure. This chapter provides a `DeltaDebugger` class that implements such a reducer." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is a simple example: An arithmetic expression causes an error in the Python interpreter:" ] }, { "cell_type": "code", "execution_count": 184, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.183213Z", "iopub.status.busy": "2023-11-12T12:40:51.183048Z", "iopub.status.idle": "2023-11-12T12:40:51.185020Z", "shell.execute_reply": "2023-11-12T12:40:51.184684Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def myeval(inp: str) -> Any:\n", " return eval(inp)" ] }, { "cell_type": "code", "execution_count": 185, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.187521Z", "iopub.status.busy": "2023-11-12T12:40:51.187366Z", "iopub.status.idle": "2023-11-12T12:40:51.191291Z", "shell.execute_reply": "2023-11-12T12:40:51.190293Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/4002351332.py\", line 2, in \n", " myeval('1 + 2 * 3 / 0')\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_25735/2200911420.py\", line 2, in myeval\n", " return eval(inp)\n", " File \"\", line 1, in \n", "ZeroDivisionError: division by zero (expected)\n" ] } ], "source": [ "with ExpectError(ZeroDivisionError):\n", " myeval('1 + 2 * 3 / 0')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Can we reduce this input to a minimum? _Delta Debugging_ is a simple and robust reduction algorithm. We provide a `DeltaDebugger` class that is used in conjunction with a (failing) function call:\n", "\n", "```python\n", "with DeltaDebugger() as dd:\n", " fun(args...)\n", "dd\n", "```\n", "\n", "The class automatically determines minimal arguments that cause the function to fail with the same exception as the original. Printing out the class object reveals the minimized call." ] }, { "cell_type": "code", "execution_count": 186, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.194601Z", "iopub.status.busy": "2023-11-12T12:40:51.194125Z", "iopub.status.idle": "2023-11-12T12:40:51.197808Z", "shell.execute_reply": "2023-11-12T12:40:51.197473Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "myeval(inp='3/0')" ] }, "execution_count": 186, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with DeltaDebugger() as dd:\n", " myeval('1 + 2 * 3 / 0')\n", "dd" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The input is reduced to the minimum: We get the essence of the division by zero." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "There also is an interface to access the reduced input(s) programmatically. The method `min_args()` returns a dictionary in which all function arguments are minimized:" ] }, { "cell_type": "code", "execution_count": 187, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.199767Z", "iopub.status.busy": "2023-11-12T12:40:51.199656Z", "iopub.status.idle": "2023-11-12T12:40:51.202462Z", "shell.execute_reply": "2023-11-12T12:40:51.202163Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'inp': '3/0'}" ] }, "execution_count": 187, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.min_args()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In contrast, `max_args()` returns a dictionary in which all function arguments are maximized, but still pass:" ] }, { "cell_type": "code", "execution_count": 188, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.204207Z", "iopub.status.busy": "2023-11-12T12:40:51.204089Z", "iopub.status.idle": "2023-11-12T12:40:51.206656Z", "shell.execute_reply": "2023-11-12T12:40:51.206356Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'inp': '1 + 2 * 3 '}" ] }, "execution_count": 188, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.max_args()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The method `min_arg_diff()` returns a triple of \n", "* passing input,\n", "* failing input, and\n", "* their minimal failure-inducing difference:" ] }, { "cell_type": "code", "execution_count": 189, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.208490Z", "iopub.status.busy": "2023-11-12T12:40:51.208363Z", "iopub.status.idle": "2023-11-12T12:40:51.211080Z", "shell.execute_reply": "2023-11-12T12:40:51.210792Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "({'inp': ' 3 '}, {'inp': ' 3 /0'}, {'inp': '/0'})" ] }, "execution_count": 189, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.min_arg_diff()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And you can also access the function itself, as well as its original arguments." ] }, { "cell_type": "code", "execution_count": 190, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.212835Z", "iopub.status.busy": "2023-11-12T12:40:51.212707Z", "iopub.status.idle": "2023-11-12T12:40:51.215261Z", "shell.execute_reply": "2023-11-12T12:40:51.214899Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "('myeval', {'inp': '1 + 2 * 3 / 0'})" ] }, "execution_count": 190, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dd.function().__name__, dd.args()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`DeltaDebugger` processes (i.e., minimizes or maximizes) all arguments that support a `len()` operation and that can be indexed – notably _strings_ and _lists_. If a function has multiple arguments, all arguments that can be processed will be processed." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This chapter also provides a number of superclasses to `DeltaDebugger`, notably `CallCollector`, which obtains the first function call for `DeltaDebugger`. `CallReducer` classes allow for implementing alternate call reduction strategies." ] }, { "cell_type": "code", "execution_count": 191, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.217067Z", "iopub.status.busy": "2023-11-12T12:40:51.216908Z", "iopub.status.idle": "2023-11-12T12:40:51.218841Z", "shell.execute_reply": "2023-11-12T12:40:51.218460Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# ignore\n", "from ClassDiagram import display_class_hierarchy" ] }, { "cell_type": "code", "execution_count": 192, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:40:51.220537Z", "iopub.status.busy": "2023-11-12T12:40:51.220354Z", "iopub.status.idle": "2023-11-12T12:40:51.927286Z", "shell.execute_reply": "2023-11-12T12:40:51.925868Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "DeltaDebugger\n", "\n", "\n", "DeltaDebugger\n", "\n", "\n", "\n", "__repr__()\n", "\n", "\n", "\n", "dd()\n", "\n", "\n", "\n", "max_args()\n", "\n", "\n", "\n", "min_arg_diff()\n", "\n", "\n", "\n", "min_args()\n", "\n", "\n", "\n", "after_collection()\n", "\n", "\n", "\n", "check_reproducibility()\n", "\n", "\n", "\n", "process_args()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CachingCallReducer\n", "\n", "\n", "CachingCallReducer\n", "\n", "\n", "\n", "init()\n", "\n", "\n", "\n", "test()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "DeltaDebugger->CachingCallReducer\n", "\n", "\n", "\n", "\n", "\n", "CallReducer\n", "\n", "\n", "CallReducer\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "reduce_arg()\n", "\n", "\n", "\n", "reset()\n", "\n", "\n", "\n", "run()\n", "\n", "\n", "\n", "test()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CachingCallReducer->CallReducer\n", "\n", "\n", "\n", "\n", "\n", "CallCollector\n", "\n", "\n", "CallCollector\n", "\n", "\n", "\n", "__enter__()\n", "\n", "\n", "\n", "__exit__()\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "args()\n", "\n", "\n", "\n", "call()\n", "\n", "\n", "\n", "exception()\n", "\n", "\n", "\n", "function()\n", "\n", "\n", "\n", "after_collection()\n", "\n", "\n", "\n", "format_call()\n", "\n", "\n", "\n", "format_exception()\n", "\n", "\n", "\n", "init()\n", "\n", "\n", "\n", "traceit()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CallReducer->CallCollector\n", "\n", "\n", "\n", "\n", "\n", "StackInspector\n", "\n", "\n", "StackInspector\n", "\n", "\n", "\n", "_generated_function_cache\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "caller_frame()\n", "\n", "\n", "\n", "caller_function()\n", "\n", "\n", "\n", "caller_globals()\n", "\n", "\n", "\n", "caller_locals()\n", "\n", "\n", "\n", "caller_location()\n", "\n", "\n", "\n", "is_internal_error()\n", "\n", "\n", "\n", "our_frame()\n", "\n", "\n", "\n", "search_frame()\n", "\n", "\n", "\n", "search_func()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CallCollector->StackInspector\n", "\n", "\n", "\n", "\n", "\n", "Legend\n", "Legend\n", "• \n", "public_method()\n", "• \n", "private_method()\n", "• \n", "overloaded_method()\n", "Hover over names to see doc\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 192, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ignore\n", "display_class_hierarchy([DeltaDebugger],\n", " public_methods=[\n", " StackInspector.caller_frame,\n", " StackInspector.caller_function,\n", " StackInspector.caller_globals,\n", " StackInspector.caller_locals,\n", " StackInspector.caller_location,\n", " StackInspector.search_frame,\n", " StackInspector.search_func,\n", " StackInspector.is_internal_error,\n", " StackInspector.our_frame,\n", " CallCollector.__init__,\n", " CallCollector.__enter__,\n", " CallCollector.__exit__,\n", " CallCollector.function,\n", " CallCollector.args,\n", " CallCollector.exception,\n", " CallCollector.call,\n", " CallReducer.__init__,\n", " CallReducer.reduce_arg,\n", " DeltaDebugger.dd,\n", " DeltaDebugger.min_args,\n", " DeltaDebugger.max_args,\n", " DeltaDebugger.min_arg_diff,\n", " DeltaDebugger.__repr__\n", " ],\n", " project='debuggingbook')" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Lessons Learned\n", "\n", "* Reducing failure-inducing inputs to a minimum is helpful for testing and debugging.\n", "* _Delta debugging_ is a simple and robust algorithm to easily reduce inputs of test cases, as well as their code.\n", "* Precisely specifying failure conditions helps to avoid false diagnoses." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Next Steps\n", "\n", "Our next chapter focuses on [finding _failure-inducing code changes_](ChangeDebugger.ipynb), using delta debugging and version control systems." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Background\n", "\n", "The \"lexical\" delta debugging algorithm discussed here – both in its simplifying `ddmin` and in its general `dd` form – stem from \\cite{Zeller2002}; actually, `ddmin` is the exact Python implementation as used by Zeller in 2002. The `ddmax` variant was first evaluated in \\cite{Kirschner2020}. This chapter is the first to show how both `ddmin` and `ddmax` can be implemented as small variations of `dd`." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The idea of systematically reducing inputs has been discovered a number of times, although not as automatic and generic as delta debugging. \\cite{Slutz1998}, for instance, discusses systematic reduction of SQL statements for SQL databases; the general process as manual work is well described by \\cite{Kernighan1999}." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The deficits of delta debugging as it comes to syntactically complex inputs were first discussed in *compiler testing*, and _reducing tree inputs_ rather than string inputs was quickly discovered as an alternative. *Hierarchical Delta Debugging* (*HDD*) \\cite{Misherghi2006} applies delta debugging on subtrees of a parse tree, systematically reducing a parse tree to a minimum. _Generalized Tree Reduction_ \\cite{Herfert2017} generalizes this idea to apply arbitrary _patterns_ such as replacing a term by a compatible term in a subtree. Using _grammars_ to reduce inputs was first implemented in the _Perses_ tool \\cite{Sun2018}. [A Python implementation of grammar-based input reduction](https://www.fuzzingbook.org/html/Reducer.html#Grammar-Based-Input-Reduction) is part of \"The Fuzzing Book\"." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "While applying delta debugging to code lines does a decent job, _syntactic_ and especially _language-specific_ approaches can do a much better job for the programming language at hand:\n", "\n", "* *C-Reduce* \\cite{Regehr2012} is a reducer specifically targeting the reduction of programming languages. Besides reductions in the style of delta debugging or tree transformations, C-Reduce comes with more than 30 source-to-source transformations that replace aggregates by scalars, remove function parameters at a definition and all call sites, change functions to return `void` and deleting all `return` statements, and many more. While specifically instantiated for the C language (and used for testing C compilers), these principles extend to arbitrary programming languages following an ALGOL-like syntax.\n", "\n", "* Kalhauge and Palsberg \\cite{Kalhauge2019} introduce *binary reduction of dependency graphs*, a general solution for reducing arbitrary inputs with dependencies. Their *J-Reduce* tool specifically targets Java programs, and again is much faster than delta debugging and achieves a higher reduction rate." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Reducing inputs also works well in the context of _property-based testing_; that is, generating data structures for individual functions, which can then be reduced (\"shrunk\") upon failure. The [Hypothesis](https://hypothesis.works) fuzzer has a number of type-specific shrinking strategies; this [blog article](https://hypothesis.works/articles/integrated-shrinking/) discusses some of its features." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This [blog post](https://www.drmaciver.com/2019/01/notes-on-test-case-reduction/) by David McIver contains lots of insights on how to apply reduction in practice, in particular multiple runs with different abstraction levels." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Exercises\n", "\n", "How to best reduce inputs is still an underdeveloped field of research, with lots of opportunities." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden", "solution2_first": true, "solution_first": true }, "source": [ "### Exercise 1: Advanced Syntactic Code Reduction\n", "\n", "Extend the code in [\"Transforming Nodes\"](#Transforming-Nodes) such that _multiple_ reduction possibilities for a node are being considered. For instance:\n", "\n", "* Replace a `BoolOp` node by `True`.\n", "* Replace a `BoolOp` node by `False`.\n", "* Replace a `BoolOp` node by its left operand.\n", "* Replace a `BoolOp` node by its right operand.\n", "\n", "or:\n", "\n", "* Replace an `If` node by its \"then\" body.\n", "* Replace an `If` node by its \"else\" body.\n", "\n", "or:\n", "\n", "* Replace all instances of a variable by a constant.\n", "\n", "or:\n", "\n", "* Replace expressions by a constant.\n", "\n", "Have a look at the [official Python `ast` reference](http://docs.python.org/3/library/ast) for a list of nodes (and some ideas on what to replace them by). The documentation [\"Green Tree Snakes - the missing Python AST docs\"](https://greentreesnakes.readthedocs.io/en/latest/) provides an excellent introduction on visitors and transformers. Make copious use of AST visualization and tests to ensure your syntax trees are still correct.\n", "\n", "Strategy-wise, you should first create a list of _possible_ reductions; and then pass to Delta Debugging a \"keep list\" of reductions that should _not_ be applied. When Delta Debugging minimizes this list, it will apply as many reductions as possible." ] } ], "metadata": { "ipub": { "bibliography": "fuzzingbook.bib", "toc": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.2" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "toc-autonumbering": false, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }