{ "cells": [ { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Statistical Debugging\n", "\n", "In this chapter, we introduce _statistical debugging_ – the idea that specific events during execution could be _statistically correlated_ with failures. We start with coverage of individual lines and then proceed towards further execution features." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.348346Z", "iopub.status.busy": "2023-11-12T12:41:20.348144Z", "iopub.status.idle": "2023-11-12T12:41:20.387249Z", "shell.execute_reply": "2023-11-12T12:41:20.386949Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from bookutils import YouTubeVideo\n", "YouTubeVideo(\"UNuso00zYiI\")" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "**Prerequisites**\n", "\n", "* You should have read the [chapter on tracing executions](Tracer.ipynb)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "button": false, "execution": { "iopub.execute_input": "2023-11-12T12:41:20.409388Z", "iopub.status.busy": "2023-11-12T12:41:20.409199Z", "iopub.status.idle": "2023-11-12T12:41:20.411701Z", "shell.execute_reply": "2023-11-12T12:41:20.411331Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import bookutils.setup" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## Synopsis\n", "\n", "\n", "To [use the code provided in this chapter](Importing.ipynb), write\n", "\n", "```python\n", ">>> from debuggingbook.StatisticalDebugger import \n", "```\n", "\n", "and then make use of the following features.\n", "\n", "\n", "This chapter introduces classes and techniques for _statistical debugging_ – that is, correlating specific events, such as lines covered, with passing and failing outcomes.\n", "\n", "To make use of the code in this chapter, use one of the provided `StatisticalDebugger` subclasses such as `TarantulaDebugger` or `OchiaiDebugger`. \n", "\n", "Both are instantiated with a `Collector` denoting the type of events you want to correlate outcomes with. The default `CoverageCollector`, collecting line coverage.\n", "\n", "### Collecting Events from Calls\n", "\n", "To collect events from calls that are labeled manually, use\n", "\n", "```python\n", ">>> debugger = TarantulaDebugger()\n", ">>> with debugger.collect_pass():\n", ">>> remove_html_markup(\"abc\")\n", ">>> with debugger.collect_pass():\n", ">>> remove_html_markup('abc')\n", ">>> with debugger.collect_fail():\n", ">>> remove_html_markup('\"abc\"')\n", "```\n", "Within each `with` block, the _first function call_ is collected and tracked for coverage. (Note that _only_ the first call is tracked.)\n", "\n", "### Collecting Events from Tests\n", "\n", "To collect events from _tests_ that use exceptions to indicate failure, use the simpler `with` form:\n", "\n", "```python\n", ">>> debugger = TarantulaDebugger()\n", ">>> with debugger:\n", ">>> remove_html_markup(\"abc\")\n", ">>> with debugger:\n", ">>> remove_html_markup('abc')\n", ">>> with debugger:\n", ">>> remove_html_markup('\"abc\"')\n", ">>> assert False # raise an exception\n", "```\n", "`with` blocks that raise an exception will be classified as failing, blocks that do not will be classified as passing. Note that exceptions raised are \"swallowed\" by the debugger.\n", "\n", "### Visualizing Events as a Table\n", "\n", "After collecting events, you can print out the observed events – in this case, line numbers – in a table, showing in which runs they occurred (`X`), and with colors highlighting the suspiciousness of the event. A \"red\" event means that the event predominantly occurs in failing runs.\n", "\n", "```python\n", ">>> debugger.event_table(args=True, color=True)\n", "```\n", "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n", "\n", "\n", "### Visualizing Suspicious Code\n", "\n", "If you collected coverage with `CoverageCollector`, you can also visualize the code with similar colors, highlighting suspicious lines:\n", "\n", "```python\n", ">>> debugger\n", "```\n", "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n", "\n", "\n", "### Ranking Events\n", "\n", "The method `rank()` returns a ranked list of events, starting with the most suspicious. This is useful for automated techniques that need potential defect locations.\n", "\n", "```python\n", ">>> debugger.rank()\n", "[('remove_html_markup', 12),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 1),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 16),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 8)]\n", "```\n", "### Classes and Methods\n", "\n", "Here are all classes defined in this chapter:\n", "\n", "![](PICS/StatisticalDebugger-synopsis-1.svg)\n", "\n", "![](PICS/StatisticalDebugger-synopsis-2.svg)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Introduction\n", "\n", "The idea behind _statistical debugging_ is fairly simple. We have a program that sometimes passes and sometimes fails. This outcome can be _correlated_ with events that precede it – properties of the input, properties of the execution, properties of the program state. If we, for instance, can find that \"the program always fails when Line 123 is executed, and it always passes when Line 123 is _not_ executed\", then we have a strong correlation between Line 123 being executed and failure.\n", "\n", "Such _correlation_ does not necessarily mean _causation_. For this, we would have to prove that executing Line 123 _always_ leads to failure, and that _not_ executing it does not lead to (this) failure. Also, a correlation (or even a causation) does not mean that Line 123 contains the defect – for this, we would have to show that it actually is an error. Still, correlations make excellent hints as it comes to search for failure causes – in all generality, if you let your search be guided by _events that correlate with failures_, you are more likely to find _important hints on how the failure comes to be_." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Collecting Events\n", "\n", "How can we determine events that correlate with failure? We start with a general mechanism to actually _collect_ events during execution. The abstract `Collector` class provides\n", "\n", "* a `collect()` method made for collecting events, called from the `traceit()` tracer; and\n", "* an `events()` method made for retrieving these events.\n", "\n", "Both of these are _abstract_ and will be defined further in subclasses." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.413781Z", "iopub.status.busy": "2023-11-12T12:41:20.413650Z", "iopub.status.idle": "2023-11-12T12:41:20.512433Z", "shell.execute_reply": "2023-11-12T12:41:20.512036Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Tracer import Tracer" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.514475Z", "iopub.status.busy": "2023-11-12T12:41:20.514325Z", "iopub.status.idle": "2023-11-12T12:41:20.516281Z", "shell.execute_reply": "2023-11-12T12:41:20.516009Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# ignore\n", "from typing import Any, Callable, Optional, Type, Tuple\n", "from typing import Dict, Set, List, TypeVar, Union" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.517861Z", "iopub.status.busy": "2023-11-12T12:41:20.517741Z", "iopub.status.idle": "2023-11-12T12:41:20.519446Z", "shell.execute_reply": "2023-11-12T12:41:20.519141Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from types import FrameType, TracebackType" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.521301Z", "iopub.status.busy": "2023-11-12T12:41:20.521177Z", "iopub.status.idle": "2023-11-12T12:41:20.523659Z", "shell.execute_reply": "2023-11-12T12:41:20.523390Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Collector(Tracer):\n", " \"\"\"A class to record events during execution.\"\"\"\n", "\n", " def collect(self, frame: FrameType, event: str, arg: Any) -> None:\n", " \"\"\"Collecting function. To be overridden in subclasses.\"\"\"\n", " pass\n", "\n", " def events(self) -> Set:\n", " \"\"\"Return a collection of events. To be overridden in subclasses.\"\"\"\n", " return set()\n", "\n", " def traceit(self, frame: FrameType, event: str, arg: Any) -> None:\n", " self.collect(frame, event, arg)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A `Collector` class is used like `Tracer`, using a `with` statement. Let us apply it on the buggy variant of `remove_html_markup()` from the [Introduction to Debugging](Intro_Debugging.ipynb):" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.525239Z", "iopub.status.busy": "2023-11-12T12:41:20.525124Z", "iopub.status.idle": "2023-11-12T12:41:20.527415Z", "shell.execute_reply": "2023-11-12T12:41:20.527061Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def remove_html_markup(s): # type: ignore\n", " tag = False\n", " quote = False\n", " out = \"\"\n", "\n", " for c in s:\n", " if c == '<' and not quote:\n", " tag = True\n", " elif c == '>' and not quote:\n", " tag = False\n", " elif c == '\"' or c == \"'\" and tag:\n", " quote = not quote\n", " elif not tag:\n", " out = out + c\n", "\n", " return out" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.529639Z", "iopub.status.busy": "2023-11-12T12:41:20.529472Z", "iopub.status.idle": "2023-11-12T12:41:20.532120Z", "shell.execute_reply": "2023-11-12T12:41:20.531765Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'abc'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with Collector() as c:\n", " out = remove_html_markup('\"abc\"')\n", "out" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "There's not much we can do with our collector, as the `collect()` and `events()` methods are yet empty. However, we can introduce an `id()` method which returns a string identifying the collector. This string is defined from the _first function call_ encountered." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.533770Z", "iopub.status.busy": "2023-11-12T12:41:20.533657Z", "iopub.status.idle": "2023-11-12T12:41:20.535373Z", "shell.execute_reply": "2023-11-12T12:41:20.535107Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "Coverage = Set[Tuple[Callable, int]]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.537492Z", "iopub.status.busy": "2023-11-12T12:41:20.537342Z", "iopub.status.idle": "2023-11-12T12:41:20.543360Z", "shell.execute_reply": "2023-11-12T12:41:20.543002Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Collector(Collector):\n", " def __init__(self) -> None:\n", " \"\"\"Constructor.\"\"\"\n", " self._function: Optional[Callable] = None\n", " self._args: Optional[Dict[str, Any]] = None\n", " self._argstring: Optional[str] = None\n", " self._exception: Optional[Type] = None\n", " self.items_to_ignore: List[Union[Type, Callable]] = [self.__class__]\n", "\n", " def traceit(self, frame: FrameType, event: str, arg: Any) -> None:\n", " \"\"\"\n", " Tracing function.\n", " Saves the first function and calls collect().\n", " \"\"\"\n", " for item in self.items_to_ignore:\n", " if (isinstance(item, type) and 'self' in frame.f_locals and\n", " isinstance(frame.f_locals['self'], item)):\n", " # Ignore this class\n", " return\n", " if item.__name__ == frame.f_code.co_name:\n", " # Ignore this function\n", " return\n", "\n", " if self._function is None and event == 'call':\n", " # Save function\n", " self._function = self.create_function(frame)\n", " self._args = frame.f_locals.copy()\n", " self._argstring = \", \".join([f\"{var}={repr(self._args[var])}\" \n", " for var in self._args])\n", "\n", " self.collect(frame, event, arg)\n", "\n", " def collect(self, frame: FrameType, event: str, arg: Any) -> None:\n", " \"\"\"Collector function. To be overloaded in subclasses.\"\"\"\n", " pass\n", "\n", " def id(self) -> str:\n", " \"\"\"Return an identifier for the collector, \n", " created from the first call\"\"\"\n", " return f\"{self.function().__name__}({self.argstring()})\"\n", "\n", " def function(self) -> Callable:\n", " \"\"\"Return the function from the first call, as a function object\"\"\"\n", " if not self._function:\n", " raise ValueError(\"No call collected\")\n", " return self._function\n", "\n", " def argstring(self) -> str:\n", " \"\"\"\n", " Return the list of arguments from the first call,\n", " as a printable string\n", " \"\"\"\n", " if not self._argstring:\n", " raise ValueError(\"No call collected\")\n", " return self._argstring\n", "\n", " def args(self) -> Dict[str, Any]:\n", " \"\"\"Return a dict of argument names and values from the first call\"\"\"\n", " if not self._args:\n", " raise ValueError(\"No call collected\")\n", " return self._args\n", "\n", " def exception(self) -> Optional[Type]:\n", " \"\"\"Return the exception class from the first call,\n", " or None if no exception was raised.\"\"\"\n", " return self._exception\n", "\n", " def __repr__(self) -> str:\n", " \"\"\"Return a string representation of the collector\"\"\"\n", " # We use the ID as default representation when printed\n", " return self.id()\n", "\n", " def covered_functions(self) -> Set[Callable]:\n", " \"\"\"Set of covered functions. To be overloaded in subclasses.\"\"\"\n", " return set()\n", "\n", " def coverage(self) -> Coverage:\n", " \"\"\"\n", " Return a set (function, lineno) with locations covered.\n", " To be overloaded in subclasses.\n", " \"\"\"\n", " return set()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's how the collector works. We use a `with` clause to collect details on a function call:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.545309Z", "iopub.status.busy": "2023-11-12T12:41:20.545143Z", "iopub.status.idle": "2023-11-12T12:41:20.547151Z", "shell.execute_reply": "2023-11-12T12:41:20.546874Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with Collector() as c:\n", " remove_html_markup('abc')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can now retrieve details such as the function called..." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.548773Z", "iopub.status.busy": "2023-11-12T12:41:20.548653Z", "iopub.status.idle": "2023-11-12T12:41:20.550818Z", "shell.execute_reply": "2023-11-12T12:41:20.550559Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.function()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "... or its arguments, as a name/value dictionary." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.552205Z", "iopub.status.busy": "2023-11-12T12:41:20.552104Z", "iopub.status.idle": "2023-11-12T12:41:20.554136Z", "shell.execute_reply": "2023-11-12T12:41:20.553893Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'s': 'abc'}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.args()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `id()` method returns a printable representation of the call:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.555656Z", "iopub.status.busy": "2023-11-12T12:41:20.555548Z", "iopub.status.idle": "2023-11-12T12:41:20.557544Z", "shell.execute_reply": "2023-11-12T12:41:20.557183Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "\"remove_html_markup(s='abc')\"" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.id()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `argstring()` method does the same for the argument string only." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.559076Z", "iopub.status.busy": "2023-11-12T12:41:20.558921Z", "iopub.status.idle": "2023-11-12T12:41:20.561163Z", "shell.execute_reply": "2023-11-12T12:41:20.560881Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "\"s='abc'\"" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.argstring()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "With this, we can collect the basic information to identify calls – such that we can later correlate their events with success or failure." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Error Prevention" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "While collecting, we'd like to avoid collecting events in the collection infrastructure. The `items_to_ignore` attribute takes care of this." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.562950Z", "iopub.status.busy": "2023-11-12T12:41:20.562826Z", "iopub.status.idle": "2023-11-12T12:41:20.564855Z", "shell.execute_reply": "2023-11-12T12:41:20.564558Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class Collector(Collector):\n", " def add_items_to_ignore(self,\n", " items_to_ignore: List[Union[Type, Callable]]) \\\n", " -> None:\n", " \"\"\"\n", " Define additional classes and functions to ignore during collection\n", " (typically `Debugger` classes using these collectors).\n", " \"\"\"\n", " self.items_to_ignore += items_to_ignore" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If we exit a block without having collected anything, that's likely an error." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.566416Z", "iopub.status.busy": "2023-11-12T12:41:20.566297Z", "iopub.status.idle": "2023-11-12T12:41:20.568528Z", "shell.execute_reply": "2023-11-12T12:41:20.568159Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Collector(Collector):\n", " def __exit__(self, exc_tp: Type, exc_value: BaseException,\n", " exc_traceback: TracebackType) -> Optional[bool]:\n", " \"\"\"Exit the `with` block.\"\"\"\n", " ret = super().__exit__(exc_tp, exc_value, exc_traceback)\n", "\n", " if not self._function:\n", " if exc_tp:\n", " return False # re-raise exception\n", " else:\n", " raise ValueError(\"No call collected\")\n", "\n", " return ret" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Collecting Coverage\n", "\n", "So far, our `Collector` class does not collect any events. Let us extend it such that it collects _coverage_ information – that is, the set of locations executed. To this end, we introduce a `CoverageCollector` subclass which saves the coverage in a set containing functions and line numbers." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.570214Z", "iopub.status.busy": "2023-11-12T12:41:20.570097Z", "iopub.status.idle": "2023-11-12T12:41:20.571710Z", "shell.execute_reply": "2023-11-12T12:41:20.571422Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from types import FrameType" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.573072Z", "iopub.status.busy": "2023-11-12T12:41:20.572988Z", "iopub.status.idle": "2023-11-12T12:41:20.574696Z", "shell.execute_reply": "2023-11-12T12:41:20.574428Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from StackInspector import StackInspector" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.576059Z", "iopub.status.busy": "2023-11-12T12:41:20.575974Z", "iopub.status.idle": "2023-11-12T12:41:20.578518Z", "shell.execute_reply": "2023-11-12T12:41:20.578245Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CoverageCollector(Collector, StackInspector):\n", " \"\"\"A class to record covered locations during execution.\"\"\"\n", "\n", " def __init__(self) -> None:\n", " \"\"\"Constructor.\"\"\"\n", " super().__init__()\n", " self._coverage: Coverage = set()\n", "\n", " def collect(self, frame: FrameType, event: str, arg: Any) -> None:\n", " \"\"\"\n", " Save coverage for an observed event.\n", " \"\"\"\n", " name = frame.f_code.co_name\n", " function = self.search_func(name, frame)\n", "\n", " if function is None:\n", " function = self.create_function(frame)\n", "\n", " location = (function, frame.f_lineno)\n", " self._coverage.add(location)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We also override `events()` such that it returns the set of covered locations." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.580131Z", "iopub.status.busy": "2023-11-12T12:41:20.580035Z", "iopub.status.idle": "2023-11-12T12:41:20.582172Z", "shell.execute_reply": "2023-11-12T12:41:20.581927Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class CoverageCollector(CoverageCollector):\n", " def events(self) -> Set[Tuple[str, int]]:\n", " \"\"\"\n", " Return the set of locations covered.\n", " Each location comes as a pair (`function_name`, `lineno`).\n", " \"\"\"\n", " return {(func.__name__, lineno) for func, lineno in self._coverage}" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The methods `coverage()` and `covered_functions()` allow precise access to the coverage obtained." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.583546Z", "iopub.status.busy": "2023-11-12T12:41:20.583464Z", "iopub.status.idle": "2023-11-12T12:41:20.585531Z", "shell.execute_reply": "2023-11-12T12:41:20.585263Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class CoverageCollector(CoverageCollector):\n", " def covered_functions(self) -> Set[Callable]:\n", " \"\"\"Return a set with all functions covered.\"\"\"\n", " return {func for func, lineno in self._coverage}\n", "\n", " def coverage(self) -> Coverage:\n", " \"\"\"Return a set (function, lineno) with all locations covered.\"\"\"\n", " return self._coverage" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is how we can use `CoverageCollector` to determine the lines executed during a run of `remove_html_markup()`:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.586994Z", "iopub.status.busy": "2023-11-12T12:41:20.586908Z", "iopub.status.idle": "2023-11-12T12:41:20.589358Z", "shell.execute_reply": "2023-11-12T12:41:20.589127Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with CoverageCollector() as c:\n", " remove_html_markup('abc')\n", "c.events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Sets of line numbers alone are not too revealing. They provide more insights if we actually list the code, highlighting these numbers:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.590802Z", "iopub.status.busy": "2023-11-12T12:41:20.590717Z", "iopub.status.idle": "2023-11-12T12:41:20.592337Z", "shell.execute_reply": "2023-11-12T12:41:20.592089Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import inspect" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.593659Z", "iopub.status.busy": "2023-11-12T12:41:20.593574Z", "iopub.status.idle": "2023-11-12T12:41:20.595316Z", "shell.execute_reply": "2023-11-12T12:41:20.595052Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import getsourcelines # like inspect.getsourcelines(), but in color" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.596774Z", "iopub.status.busy": "2023-11-12T12:41:20.596677Z", "iopub.status.idle": "2023-11-12T12:41:20.599075Z", "shell.execute_reply": "2023-11-12T12:41:20.598814Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def code_with_coverage(function: Callable, coverage: Coverage) -> None:\n", " source_lines, starting_line_number = \\\n", " getsourcelines(function)\n", "\n", " line_number = starting_line_number\n", " for line in source_lines:\n", " marker = '*' if (function, line_number) in coverage else ' '\n", " print(f\"{line_number:4} {marker} {line}\", end='')\n", " line_number += 1" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.600796Z", "iopub.status.busy": "2023-11-12T12:41:20.600674Z", "iopub.status.idle": "2023-11-12T12:41:20.672169Z", "shell.execute_reply": "2023-11-12T12:41:20.671844Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1 * \u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s): \u001b[37m# type: ignore\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 2 * tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 3 * quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 4 * out = \u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 5 \u001b[37m\u001b[39;49;00m\n", " 6 * \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " 7 * \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " 8 tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 9 * \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " 10 tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 11 * \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " 12 quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " 13 * \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " 14 * out = out + c\u001b[37m\u001b[39;49;00m\n", " 15 \u001b[37m\u001b[39;49;00m\n", " 16 * \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m\n" ] } ], "source": [ "code_with_coverage(remove_html_markup, c.coverage())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Remember that the input `s` was `\"abc\"`? In this listing, we can see which lines were covered and which lines were not. From the listing already, we can see that `s` has neither tags nor quotes." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Such coverage computation plays a big role in _testing_, as one wants tests to cover as many different aspects of program execution (and notably code) as possible. But also during debugging, code coverage is essential: If some code was not even executed in the failing run, then any change to it will have no effect." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.674138Z", "iopub.status.busy": "2023-11-12T12:41:20.673970Z", "iopub.status.idle": "2023-11-12T12:41:20.675735Z", "shell.execute_reply": "2023-11-12T12:41:20.675503Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import quiz" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.677122Z", "iopub.status.busy": "2023-11-12T12:41:20.677021Z", "iopub.status.idle": "2023-11-12T12:41:20.681831Z", "shell.execute_reply": "2023-11-12T12:41:20.681470Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

Let the input be \"<b>Don't do this!</b>\". Which of these lines are executed? Use the code to find out!
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz('Let the input be `\"Don\\'t do this!\"`. '\n", " \"Which of these lines are executed? Use the code to find out!\",\n", " [\n", " \"`tag = True`\",\n", " \"`tag = False`\",\n", " \"`quote = not quote`\",\n", " \"`out = out + c`\"\n", " ], \"[ord(c) - ord('a') - 1 for c in 'cdf']\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To find the solution, try this out yourself:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.684060Z", "iopub.status.busy": "2023-11-12T12:41:20.683911Z", "iopub.status.idle": "2023-11-12T12:41:20.686294Z", "shell.execute_reply": "2023-11-12T12:41:20.685790Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with CoverageCollector() as c:\n", " remove_html_markup(\"Don't do this!\")\n", "# code_with_coverage(remove_html_markup, c.coverage)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Computing Differences\n", "\n", "Let us get back to the idea that we want to _correlate_ events with passing and failing outcomes. For this, we need to examine events in both _passing_ and _failing_ runs, and determine their _differences_ – since it is these differences we want to associate with their respective outcome." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### A Base Class for Statistical Debugging\n", "\n", "The `StatisticalDebugger` base class takes a collector class (such as `CoverageCollector`). Its `collect()` method creates a new collector of that very class, which will be maintained by the debugger. As argument, `collect()` takes a string characterizing the outcome (such as `'PASS'` or `'FAIL'`). This is how one would use it:\n", "\n", "```python\n", "debugger = StatisticalDebugger()\n", "with debugger.collect('PASS'):\n", " some_passing_run()\n", "with debugger.collect('PASS'):\n", " another_passing_run()\n", "with debugger.collect('FAIL'):\n", " some_failing_run()\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us implement `StatisticalDebugger`. The base class gets a collector class as argument:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.688332Z", "iopub.status.busy": "2023-11-12T12:41:20.688212Z", "iopub.status.idle": "2023-11-12T12:41:20.690300Z", "shell.execute_reply": "2023-11-12T12:41:20.690038Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class StatisticalDebugger:\n", " \"\"\"A class to collect events for multiple outcomes.\"\"\"\n", "\n", " def __init__(self, collector_class: Type = CoverageCollector, log: bool = False):\n", " \"\"\"Constructor. Use instances of `collector_class` to collect events.\"\"\"\n", " self.collector_class = collector_class\n", " self.collectors: Dict[str, List[Collector]] = {}\n", " self.log = log" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `collect()` method creates (and stores) a collector for the given outcome, using the given outcome to characterize the run. Any additional arguments are passed to the collector." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.691854Z", "iopub.status.busy": "2023-11-12T12:41:20.691736Z", "iopub.status.idle": "2023-11-12T12:41:20.694193Z", "shell.execute_reply": "2023-11-12T12:41:20.693929Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class StatisticalDebugger(StatisticalDebugger):\n", " def collect(self, outcome: str, *args: Any, **kwargs: Any) -> Collector:\n", " \"\"\"Return a collector for the given outcome. \n", " Additional args are passed to the collector.\"\"\"\n", " collector = self.collector_class(*args, **kwargs)\n", " collector.add_items_to_ignore([self.__class__])\n", " return self.add_collector(outcome, collector)\n", "\n", " def add_collector(self, outcome: str, collector: Collector) -> Collector:\n", " if outcome not in self.collectors:\n", " self.collectors[outcome] = []\n", " self.collectors[outcome].append(collector)\n", " return collector" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `all_events()` method produces a union of all events observed. If an outcome is given, it produces a union of all events with that outcome:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.695782Z", "iopub.status.busy": "2023-11-12T12:41:20.695675Z", "iopub.status.idle": "2023-11-12T12:41:20.698427Z", "shell.execute_reply": "2023-11-12T12:41:20.698093Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class StatisticalDebugger(StatisticalDebugger):\n", " def all_events(self, outcome: Optional[str] = None) -> Set[Any]:\n", " \"\"\"Return a set of all events observed.\"\"\"\n", " all_events = set()\n", "\n", " if outcome:\n", " if outcome in self.collectors:\n", " for collector in self.collectors[outcome]:\n", " all_events.update(collector.events())\n", " else:\n", " for outcome in self.collectors:\n", " for collector in self.collectors[outcome]:\n", " all_events.update(collector.events())\n", "\n", " return all_events" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's a simple example of `StatisticalDebugger` in action:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.700133Z", "iopub.status.busy": "2023-11-12T12:41:20.700008Z", "iopub.status.idle": "2023-11-12T12:41:20.702257Z", "shell.execute_reply": "2023-11-12T12:41:20.701950Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "s = StatisticalDebugger()\n", "with s.collect('PASS'):\n", " remove_html_markup(\"abc\")\n", "with s.collect('PASS'):\n", " remove_html_markup('abc')\n", "with s.collect('FAIL'):\n", " remove_html_markup('\"abc\"')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The method `all_events()` returns all events collected:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.703974Z", "iopub.status.busy": "2023-11-12T12:41:20.703849Z", "iopub.status.idle": "2023-11-12T12:41:20.706487Z", "shell.execute_reply": "2023-11-12T12:41:20.706126Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 8),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 12),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.all_events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If given an outcome as argument, we obtain all events with the given outcome." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.708031Z", "iopub.status.busy": "2023-11-12T12:41:20.707917Z", "iopub.status.idle": "2023-11-12T12:41:20.710375Z", "shell.execute_reply": "2023-11-12T12:41:20.710103Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 12),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.all_events('FAIL')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The attribute `collectors` maps outcomes to lists of collectors:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.712121Z", "iopub.status.busy": "2023-11-12T12:41:20.711973Z", "iopub.status.idle": "2023-11-12T12:41:20.714585Z", "shell.execute_reply": "2023-11-12T12:41:20.714084Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{'PASS': [remove_html_markup(s='abc'), remove_html_markup(s='abc')],\n", " 'FAIL': [remove_html_markup(s='\"abc\"')]}" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.collectors" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here's the collector of the one (and first) passing run:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.716795Z", "iopub.status.busy": "2023-11-12T12:41:20.716625Z", "iopub.status.idle": "2023-11-12T12:41:20.719382Z", "shell.execute_reply": "2023-11-12T12:41:20.719084Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "\"remove_html_markup(s='abc')\"" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.collectors['PASS'][0].id()" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.721181Z", "iopub.status.busy": "2023-11-12T12:41:20.721047Z", "iopub.status.idle": "2023-11-12T12:41:20.723535Z", "shell.execute_reply": "2023-11-12T12:41:20.723235Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.collectors['PASS'][0].events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To better highlight the differences between the collected events, we introduce a method `event_table()` that prints out whether an event took place in a run." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Excursion: Printing an Event Table" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.724978Z", "iopub.status.busy": "2023-11-12T12:41:20.724874Z", "iopub.status.idle": "2023-11-12T12:41:20.726370Z", "shell.execute_reply": "2023-11-12T12:41:20.726129Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from IPython.display import Markdown" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.727734Z", "iopub.status.busy": "2023-11-12T12:41:20.727633Z", "iopub.status.idle": "2023-11-12T12:41:20.730354Z", "shell.execute_reply": "2023-11-12T12:41:20.729878Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import html" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.733319Z", "iopub.status.busy": "2023-11-12T12:41:20.732937Z", "iopub.status.idle": "2023-11-12T12:41:20.741235Z", "shell.execute_reply": "2023-11-12T12:41:20.740942Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class StatisticalDebugger(StatisticalDebugger):\n", " def function(self) -> Optional[Callable]:\n", " \"\"\"\n", " Return the entry function from the events observed,\n", " or None if ambiguous.\n", " \"\"\"\n", " names_seen = set()\n", " functions = []\n", " for outcome in self.collectors:\n", " for collector in self.collectors[outcome]:\n", " # We may have multiple copies of the function,\n", " # but sharing the same name\n", " func = collector.function()\n", " if func.__name__ not in names_seen:\n", " functions.append(func)\n", " names_seen.add(func.__name__)\n", "\n", " if len(functions) != 1:\n", " return None # ambiguous\n", " return functions[0]\n", "\n", " def covered_functions(self) -> Set[Callable]:\n", " \"\"\"Return a set of all functions observed.\"\"\"\n", " functions = set()\n", " for outcome in self.collectors:\n", " for collector in self.collectors[outcome]:\n", " functions |= collector.covered_functions()\n", " return functions\n", "\n", " def coverage(self) -> Coverage:\n", " \"\"\"Return a set of all (functions, line_numbers) observed\"\"\"\n", " coverage = set()\n", " for outcome in self.collectors:\n", " for collector in self.collectors[outcome]:\n", " coverage |= collector.coverage()\n", " return coverage\n", "\n", " def color(self, event: Any) -> Optional[str]:\n", " \"\"\"\n", " Return a color for the given event, or None.\n", " To be overloaded in subclasses.\n", " \"\"\"\n", " return None\n", "\n", " def tooltip(self, event: Any) -> Optional[str]:\n", " \"\"\"\n", " Return a tooltip string for the given event, or None.\n", " To be overloaded in subclasses.\n", " \"\"\"\n", " return None\n", "\n", " def event_str(self, event: Any) -> str:\n", " \"\"\"Format the given event. To be overloaded in subclasses.\"\"\"\n", " if isinstance(event, str):\n", " return event\n", " if isinstance(event, tuple):\n", " return \":\".join(self.event_str(elem) for elem in event)\n", " return str(event)\n", "\n", " def event_table_text(self, *, args: bool = False, color: bool = False) -> str:\n", " \"\"\"\n", " Print out a table of events observed.\n", " If `args` is True, use arguments as headers.\n", " If `color` is True, use colors.\n", " \"\"\"\n", " sep = ' | '\n", " all_events = self.all_events()\n", " longest_event = max(len(f\"{self.event_str(event)}\") \n", " for event in all_events)\n", " out = \"\"\n", "\n", " # Header\n", " if args:\n", " out += '| '\n", " func = self.function()\n", " if func:\n", " out += '`' + func.__name__ + '`'\n", " out += sep\n", " for name in self.collectors:\n", " for collector in self.collectors[name]:\n", " out += '`' + collector.argstring() + '`' + sep\n", " out += '\\n'\n", " else:\n", " out += '| ' + ' ' * longest_event + sep\n", " for name in self.collectors:\n", " for i in range(len(self.collectors[name])):\n", " out += name + sep\n", " out += '\\n'\n", "\n", " out += '| ' + '-' * longest_event + sep\n", " for name in self.collectors:\n", " for i in range(len(self.collectors[name])):\n", " out += '-' * len(name) + sep\n", " out += '\\n'\n", "\n", " # Data\n", " for event in sorted(all_events):\n", " event_name = self.event_str(event).rjust(longest_event)\n", "\n", " tooltip = self.tooltip(event)\n", " if tooltip:\n", " title = f' title=\"{tooltip}\"'\n", " else:\n", " title = ''\n", "\n", " if color:\n", " color_name = self.color(event)\n", " if color_name:\n", " event_name = \\\n", " f'' \\\n", " f'{html.escape(event_name)}' \\\n", " f''\n", "\n", " out += f\"| {event_name}\" + sep\n", " for name in self.collectors:\n", " for collector in self.collectors[name]:\n", " out += ' ' * (len(name) - 1)\n", " if event in collector.events():\n", " out += \"X\"\n", " else:\n", " out += \"-\"\n", " out += sep\n", " out += '\\n'\n", "\n", " return out\n", "\n", " def event_table(self, **_args: Any) -> Any:\n", " \"\"\"Print out event table in Markdown format.\"\"\"\n", " return Markdown(self.event_table_text(**_args))\n", "\n", " def __repr__(self) -> str:\n", " return self.event_table_text()\n", "\n", " def _repr_markdown_(self) -> str:\n", " return self.event_table_text(args=True, color=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### End of Excursion" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.742856Z", "iopub.status.busy": "2023-11-12T12:41:20.742756Z", "iopub.status.idle": "2023-11-12T12:41:20.745157Z", "shell.execute_reply": "2023-11-12T12:41:20.744696Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "s = StatisticalDebugger()\n", "with s.collect('PASS'):\n", " remove_html_markup(\"abc\")\n", "with s.collect('PASS'):\n", " remove_html_markup('abc')\n", "with s.collect('FAIL'):\n", " remove_html_markup('\"abc\"')" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.747068Z", "iopub.status.busy": "2023-11-12T12:41:20.746940Z", "iopub.status.idle": "2023-11-12T12:41:20.749334Z", "shell.execute_reply": "2023-11-12T12:41:20.749075Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ "" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.event_table(args=True)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.750728Z", "iopub.status.busy": "2023-11-12T12:41:20.750619Z", "iopub.status.idle": "2023-11-12T12:41:20.754181Z", "shell.execute_reply": "2023-11-12T12:41:20.753767Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

How many lines are executed in the failing run only?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz(\"How many lines are executed in the failing run only?\",\n", " [\n", " \"One\",\n", " \"Two\",\n", " \"Three\"\n", " ], 'len([12])')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Indeed, Line 12 executed in the failing run only would be a correlation to look for." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Collecting Passing and Failing Runs\n", "\n", "While our `StatisticalDebugger` class allows arbitrary outcomes, we are typically only interested in two outcomes, namely _passing_ vs. _failing_ runs. We therefore introduce a specialized `DifferenceDebugger` class that provides customized methods to collect and access passing and failing runs." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.755907Z", "iopub.status.busy": "2023-11-12T12:41:20.755817Z", "iopub.status.idle": "2023-11-12T12:41:20.759082Z", "shell.execute_reply": "2023-11-12T12:41:20.758843Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DifferenceDebugger(StatisticalDebugger):\n", " \"\"\"A class to collect events for passing and failing outcomes.\"\"\"\n", "\n", " PASS = 'PASS'\n", " FAIL = 'FAIL'\n", "\n", " def collect_pass(self, *args: Any, **kwargs: Any) -> Collector:\n", " \"\"\"Return a collector for passing runs.\"\"\"\n", " return self.collect(self.PASS, *args, **kwargs)\n", "\n", " def collect_fail(self, *args: Any, **kwargs: Any) -> Collector:\n", " \"\"\"Return a collector for failing runs.\"\"\"\n", " return self.collect(self.FAIL, *args, **kwargs)\n", "\n", " def pass_collectors(self) -> List[Collector]:\n", " return self.collectors[self.PASS]\n", "\n", " def fail_collectors(self) -> List[Collector]:\n", " return self.collectors[self.FAIL]\n", "\n", " def all_fail_events(self) -> Set[Any]:\n", " \"\"\"Return all events observed in failing runs.\"\"\"\n", " return self.all_events(self.FAIL)\n", "\n", " def all_pass_events(self) -> Set[Any]:\n", " \"\"\"Return all events observed in passing runs.\"\"\"\n", " return self.all_events(self.PASS)\n", "\n", " def only_fail_events(self) -> Set[Any]:\n", " \"\"\"Return all events observed only in failing runs.\"\"\"\n", " return self.all_fail_events() - self.all_pass_events()\n", "\n", " def only_pass_events(self) -> Set[Any]:\n", " \"\"\"Return all events observed only in passing runs.\"\"\"\n", " return self.all_pass_events() - self.all_fail_events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can use `DifferenceDebugger` just as a `StatisticalDebugger`:" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.760538Z", "iopub.status.busy": "2023-11-12T12:41:20.760456Z", "iopub.status.idle": "2023-11-12T12:41:20.762316Z", "shell.execute_reply": "2023-11-12T12:41:20.762000Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# ignore\n", "T1 = TypeVar('T1', bound='DifferenceDebugger')" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.763972Z", "iopub.status.busy": "2023-11-12T12:41:20.763852Z", "iopub.status.idle": "2023-11-12T12:41:20.766011Z", "shell.execute_reply": "2023-11-12T12:41:20.765696Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def test_debugger_html_simple(debugger: T1) -> T1:\n", " with debugger.collect_pass():\n", " remove_html_markup('abc')\n", " with debugger.collect_pass():\n", " remove_html_markup('abc')\n", " with debugger.collect_fail():\n", " remove_html_markup('\"abc\"')\n", " return debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "However, since the outcome of tests may not always be predetermined, we provide a simpler interface for tests that can fail (= raise an exception) or pass (not raise an exception)." ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.767544Z", "iopub.status.busy": "2023-11-12T12:41:20.767460Z", "iopub.status.idle": "2023-11-12T12:41:20.770171Z", "shell.execute_reply": "2023-11-12T12:41:20.769907Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DifferenceDebugger(DifferenceDebugger):\n", " def __enter__(self) -> Any:\n", " \"\"\"Enter a `with` block. Collect coverage and outcome;\n", " classify as FAIL if the block raises an exception,\n", " and PASS if it does not.\n", " \"\"\"\n", " self.collector = self.collector_class()\n", " self.collector.add_items_to_ignore([self.__class__])\n", " self.collector.__enter__()\n", " return self\n", "\n", " def __exit__(self, exc_tp: Type, exc_value: BaseException,\n", " exc_traceback: TracebackType) -> Optional[bool]:\n", " \"\"\"Exit the `with` block.\"\"\"\n", " status = self.collector.__exit__(exc_tp, exc_value, exc_traceback)\n", "\n", " if status is None:\n", " pass\n", " else:\n", " return False # Internal error; re-raise exception\n", "\n", " if exc_tp is None:\n", " outcome = self.PASS\n", " else:\n", " outcome = self.FAIL\n", "\n", " self.add_collector(outcome, self.collector)\n", " return True # Ignore exception, if any" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Using this interface, we can rewrite `test_debugger_html()`:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.771821Z", "iopub.status.busy": "2023-11-12T12:41:20.771726Z", "iopub.status.idle": "2023-11-12T12:41:20.773427Z", "shell.execute_reply": "2023-11-12T12:41:20.773177Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# ignore\n", "T2 = TypeVar('T2', bound='DifferenceDebugger')" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.774712Z", "iopub.status.busy": "2023-11-12T12:41:20.774631Z", "iopub.status.idle": "2023-11-12T12:41:20.776568Z", "shell.execute_reply": "2023-11-12T12:41:20.776337Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def test_debugger_html(debugger: T2) -> T2:\n", " with debugger:\n", " remove_html_markup('abc')\n", " with debugger:\n", " remove_html_markup('abc')\n", " with debugger:\n", " remove_html_markup('\"abc\"')\n", " assert False # Mark test as failing\n", "\n", " return debugger" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.777906Z", "iopub.status.busy": "2023-11-12T12:41:20.777822Z", "iopub.status.idle": "2023-11-12T12:41:20.780538Z", "shell.execute_reply": "2023-11-12T12:41:20.780215Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ "| | PASS | PASS | FAIL | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | " ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_debugger_html(DifferenceDebugger())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Analyzing Events\n", "\n", "Let us now focus on _analyzing_ events collected. Since events come back as _sets_, we can compute _unions_ and _differences_ between these sets. For instance, we can compute which lines were executed in _any_ of the passing runs of `test_debugger_html()`, above:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.782244Z", "iopub.status.busy": "2023-11-12T12:41:20.782108Z", "iopub.status.idle": "2023-11-12T12:41:20.784090Z", "shell.execute_reply": "2023-11-12T12:41:20.783821Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = test_debugger_html(DifferenceDebugger())" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.785467Z", "iopub.status.busy": "2023-11-12T12:41:20.785361Z", "iopub.status.idle": "2023-11-12T12:41:20.786898Z", "shell.execute_reply": "2023-11-12T12:41:20.786652Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "pass_1_events = debugger.pass_collectors()[0].events()" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.788330Z", "iopub.status.busy": "2023-11-12T12:41:20.788232Z", "iopub.status.idle": "2023-11-12T12:41:20.789921Z", "shell.execute_reply": "2023-11-12T12:41:20.789667Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "pass_2_events = debugger.pass_collectors()[1].events()" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.791308Z", "iopub.status.busy": "2023-11-12T12:41:20.791226Z", "iopub.status.idle": "2023-11-12T12:41:20.793661Z", "shell.execute_reply": "2023-11-12T12:41:20.793383Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 8),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "in_any_pass = pass_1_events | pass_2_events\n", "in_any_pass" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Likewise, we can determine which lines were _only_ executed in the failing run:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.795213Z", "iopub.status.busy": "2023-11-12T12:41:20.795085Z", "iopub.status.idle": "2023-11-12T12:41:20.796886Z", "shell.execute_reply": "2023-11-12T12:41:20.796610Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "fail_events = debugger.fail_collectors()[0].events()" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.798479Z", "iopub.status.busy": "2023-11-12T12:41:20.798368Z", "iopub.status.idle": "2023-11-12T12:41:20.800497Z", "shell.execute_reply": "2023-11-12T12:41:20.800219Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 12)}" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "only_in_fail = fail_events - in_any_pass\n", "only_in_fail" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "And we see that the \"failing\" run is characterized by processing quotes:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.801993Z", "iopub.status.busy": "2023-11-12T12:41:20.801882Z", "iopub.status.idle": "2023-11-12T12:41:20.836595Z", "shell.execute_reply": "2023-11-12T12:41:20.836155Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1 \u001b[34mdef\u001b[39;49;00m \u001b[32mremove_html_markup\u001b[39;49;00m(s): \u001b[37m# type: ignore\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 2 tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 3 quote = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 4 out = \u001b[33m\"\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 5 \u001b[37m\u001b[39;49;00m\n", " 6 \u001b[34mfor\u001b[39;49;00m c \u001b[35min\u001b[39;49;00m s:\u001b[37m\u001b[39;49;00m\n", " 7 \u001b[34mif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m<\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " 8 tag = \u001b[34mTrue\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 9 \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m>\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m quote:\u001b[37m\u001b[39;49;00m\n", " 10 tag = \u001b[34mFalse\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " 11 \u001b[34melif\u001b[39;49;00m c == \u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m \u001b[35mor\u001b[39;49;00m c == \u001b[33m\"\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m \u001b[35mand\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " 12 quote = \u001b[35mnot\u001b[39;49;00m quote\u001b[37m\u001b[39;49;00m\n", " 13 \u001b[34melif\u001b[39;49;00m \u001b[35mnot\u001b[39;49;00m tag:\u001b[37m\u001b[39;49;00m\n", " 14 out = out + c\u001b[37m\u001b[39;49;00m\n", " 15 \u001b[37m\u001b[39;49;00m\n", " 16 \u001b[34mreturn\u001b[39;49;00m out\u001b[37m\u001b[39;49;00m\n" ] } ], "source": [ "code_with_coverage(remove_html_markup, only_in_fail)" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.838585Z", "iopub.status.busy": "2023-11-12T12:41:20.838425Z", "iopub.status.idle": "2023-11-12T12:41:20.840317Z", "shell.execute_reply": "2023-11-12T12:41:20.840081Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "debugger = test_debugger_html(DifferenceDebugger())" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.841976Z", "iopub.status.busy": "2023-11-12T12:41:20.841840Z", "iopub.status.idle": "2023-11-12T12:41:20.844299Z", "shell.execute_reply": "2023-11-12T12:41:20.844034Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 8),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 12),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.all_events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "These are the lines executed only in the failing run:" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.845952Z", "iopub.status.busy": "2023-11-12T12:41:20.845829Z", "iopub.status.idle": "2023-11-12T12:41:20.847885Z", "shell.execute_reply": "2023-11-12T12:41:20.847640Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 12)}" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.only_fail_events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "These are the lines executed only in the passing runs:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.849298Z", "iopub.status.busy": "2023-11-12T12:41:20.849189Z", "iopub.status.idle": "2023-11-12T12:41:20.851176Z", "shell.execute_reply": "2023-11-12T12:41:20.850925Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{('remove_html_markup', 8), ('remove_html_markup', 10)}" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.only_pass_events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Again, having these lines individually is neat, but things become much more interesting if we can see the associated code lines just as well. That's what we will do in the next section." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Visualizing Differences\n", "\n", "To show correlations of line coverage in context, we introduce a number of _visualization_ techniques that _highlight_ code with different colors." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Discrete Spectrum\n", "\n", "The first idea is to use a _discrete_ spectrum of three colors:\n", "\n", "* _red_ for code executed in failing runs only\n", "* _green_ for code executed in passing runs only\n", "* _yellow_ for code executed in both passing and failing runs.\n", "\n", "Code that is not executed stays unhighlighted." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We first introduce an abstract class `SpectrumDebugger` that provides the essential functions. `suspiciousness()` returns a value between 0 and 1 indicating the suspiciousness of the given event - or `None` if unknown." ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.852620Z", "iopub.status.busy": "2023-11-12T12:41:20.852504Z", "iopub.status.idle": "2023-11-12T12:41:20.854420Z", "shell.execute_reply": "2023-11-12T12:41:20.854154Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class SpectrumDebugger(DifferenceDebugger):\n", " def suspiciousness(self, event: Any) -> Optional[float]:\n", " \"\"\"\n", " Return a suspiciousness value in the range [0, 1.0]\n", " for the given event, or `None` if unknown.\n", " To be overloaded in subclasses.\n", " \"\"\"\n", " return None" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `tooltip()` and `percentage()` methods convert the suspiciousness into a human-readable form." ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.855912Z", "iopub.status.busy": "2023-11-12T12:41:20.855800Z", "iopub.status.idle": "2023-11-12T12:41:20.858144Z", "shell.execute_reply": "2023-11-12T12:41:20.857844Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class SpectrumDebugger(SpectrumDebugger):\n", " def tooltip(self, event: Any) -> str:\n", " \"\"\"\n", " Return a tooltip for the given event (default: percentage).\n", " To be overloaded in subclasses.\n", " \"\"\"\n", " return self.percentage(event)\n", "\n", " def percentage(self, event: Any) -> str:\n", " \"\"\"\n", " Return the suspiciousness for the given event as percentage string.\n", " \"\"\"\n", " suspiciousness = self.suspiciousness(event)\n", " if suspiciousness is not None:\n", " return str(int(suspiciousness * 100)).rjust(3) + '%'\n", " else:\n", " return ' ' * len('100%')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `code()` method takes a function and shows each of its source code lines using the given spectrum, using HTML markup:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.859625Z", "iopub.status.busy": "2023-11-12T12:41:20.859522Z", "iopub.status.idle": "2023-11-12T12:41:20.863617Z", "shell.execute_reply": "2023-11-12T12:41:20.863343Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class SpectrumDebugger(SpectrumDebugger):\n", " def code(self, functions: Optional[Set[Callable]] = None, *, \n", " color: bool = False, suspiciousness: bool = False,\n", " line_numbers: bool = True) -> str:\n", " \"\"\"\n", " Return a listing of `functions` (default: covered functions).\n", " If `color` is True, render as HTML, using suspiciousness colors.\n", " If `suspiciousness` is True, include suspiciousness values.\n", " If `line_numbers` is True (default), include line numbers.\n", " \"\"\"\n", "\n", " if not functions:\n", " functions = self.covered_functions()\n", "\n", " out = \"\"\n", " seen = set()\n", " for function in functions:\n", " source_lines, starting_line_number = \\\n", " inspect.getsourcelines(function)\n", "\n", " if (function.__name__, starting_line_number) in seen:\n", " continue\n", " seen.add((function.__name__, starting_line_number))\n", "\n", " if out:\n", " out += '\\n'\n", " if color:\n", " out += '

'\n", "\n", " line_number = starting_line_number\n", " for line in source_lines:\n", " if color:\n", " line = html.escape(line)\n", " if line.strip() == '':\n", " line = ' '\n", "\n", " location = (function.__name__, line_number)\n", " location_suspiciousness = self.suspiciousness(location)\n", " if location_suspiciousness is not None:\n", " tooltip = f\"Line {line_number}: {self.tooltip(location)}\"\n", " else:\n", " tooltip = f\"Line {line_number}: not executed\"\n", "\n", " if suspiciousness:\n", " line = self.percentage(location) + ' ' + line\n", "\n", " if line_numbers:\n", " line = str(line_number).rjust(4) + ' ' + line\n", "\n", " line_color = self.color(location)\n", "\n", " if color and line_color:\n", " line = f'''

{line.rstrip()}
'''\n", " elif color:\n", " line = f'
{line}
'\n", " else:\n", " line = line.rstrip()\n", "\n", " out += line + '\\n'\n", " line_number += 1\n", "\n", " return out" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We introduce a few helper methods to visualize the code with colors in various forms." ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.865013Z", "iopub.status.busy": "2023-11-12T12:41:20.864926Z", "iopub.status.idle": "2023-11-12T12:41:20.867145Z", "shell.execute_reply": "2023-11-12T12:41:20.866856Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class SpectrumDebugger(SpectrumDebugger):\n", " def _repr_html_(self) -> str:\n", " \"\"\"When output in Jupyter, visualize as HTML\"\"\"\n", " return self.code(color=True)\n", "\n", " def __str__(self) -> str:\n", " \"\"\"Show code as string\"\"\"\n", " return self.code(color=False, suspiciousness=True)\n", "\n", " def __repr__(self) -> str:\n", " \"\"\"Show code as string\"\"\"\n", " return self.code(color=False, suspiciousness=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "So far, however, central methods like `suspiciousness()` or `color()` were abstract – that is, to be defined in subclasses. Our `DiscreteSpectrumDebugger` subclass provides concrete implementations for these, with `color()` returning one of the three colors depending on the line number:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.868567Z", "iopub.status.busy": "2023-11-12T12:41:20.868486Z", "iopub.status.idle": "2023-11-12T12:41:20.871890Z", "shell.execute_reply": "2023-11-12T12:41:20.871600Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class DiscreteSpectrumDebugger(SpectrumDebugger):\n", " \"\"\"Visualize differences between executions using three discrete colors\"\"\"\n", "\n", " def suspiciousness(self, event: Any) -> Optional[float]:\n", " \"\"\"\n", " Return a suspiciousness value [0, 1.0]\n", " for the given event, or `None` if unknown.\n", " \"\"\"\n", " passing = self.all_pass_events()\n", " failing = self.all_fail_events()\n", "\n", " if event in passing and event in failing:\n", " return 0.5\n", " elif event in failing:\n", " return 1.0\n", " elif event in passing:\n", " return 0.0\n", " else:\n", " return None\n", "\n", " def color(self, event: Any) -> Optional[str]:\n", " \"\"\"\n", " Return a HTML color for the given event.\n", " \"\"\"\n", " suspiciousness = self.suspiciousness(event)\n", " if suspiciousness is None:\n", " return None\n", "\n", " if suspiciousness > 0.8:\n", " return 'mistyrose'\n", " if suspiciousness >= 0.5:\n", " return 'lightyellow'\n", "\n", " return 'honeydew'\n", "\n", " def tooltip(self, event: Any) -> str:\n", " \"\"\"Return a tooltip for the given event.\"\"\"\n", " passing = self.all_pass_events()\n", " failing = self.all_fail_events()\n", "\n", " if event in passing and event in failing:\n", " return \"in passing and failing runs\"\n", " elif event in failing:\n", " return \"only in failing runs\"\n", " elif event in passing:\n", " return \"only in passing runs\"\n", " else:\n", " return \"never\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This is how the `only_pass_events()` and `only_fail_events()` sets look like when visualized with code. The \"culprit\" line is well highlighted:" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.873317Z", "iopub.status.busy": "2023-11-12T12:41:20.873225Z", "iopub.status.idle": "2023-11-12T12:41:20.875169Z", "shell.execute_reply": "2023-11-12T12:41:20.874927Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = test_debugger_html(DiscreteSpectrumDebugger())" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.876637Z", "iopub.status.busy": "2023-11-12T12:41:20.876542Z", "iopub.status.idle": "2023-11-12T12:41:20.880477Z", "shell.execute_reply": "2023-11-12T12:41:20.880187Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ " 1 50% def remove_html_markup(s): # type: ignore\n", " 2 50% tag = False\n", " 3 50% quote = False\n", " 4 50% out = \"\"\n", " 5\n", " 6 50% for c in s:\n", " 7 50% if c == '<' and not quote:\n", " 8 0% tag = True\n", " 9 50% elif c == '>' and not quote:\n", " 10 0% tag = False\n", " 11 50% elif c == '\"' or c == \"'\" and tag:\n", " 12 100% quote = not quote\n", " 13 50% elif not tag:\n", " 14 50% out = out + c\n", " 15\n", " 16 50% return out" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can clearly see that the failure is correlated with the presence of quotes in the input string (which is an important hint!). But does this also show us _immediately_ where the defect to be fixed is?" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.882024Z", "iopub.status.busy": "2023-11-12T12:41:20.881898Z", "iopub.status.idle": "2023-11-12T12:41:20.886005Z", "shell.execute_reply": "2023-11-12T12:41:20.885744Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

Does the line quote = not quote actually contain the defect?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz(\"Does the line `quote = not quote` actually contain the defect?\",\n", " [\n", " \"Yes, it should be fixed\",\n", " \"No, the defect is elsewhere\"\n", " ], '164 * 2 % 326')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Indeed, it is the _governing condition_ that is wrong – that is, the condition that caused Line 12 to be executed in the first place. In order to fix a program, we have to find a location that\n", "\n", "1. _causes_ the failure (i.e., it can be changed to make the failure go away); and\n", "2. is a _defect_ (i.e., contains an error).\n", "\n", "In our example above, the highlighted code line is a _symptom_ for the error. To some extent, it is also a _cause_, since, say, commenting it out would also resolve the given failure, at the cost of causing other failures. However, the preceding condition also is a cause, as is the presence of quotes in the input.\n", "\n", "Only one of these also is a _defect_, though, and that is the preceding condition. Hence, while correlations can provide important hints, they do not necessarily locate defects." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "For those of us who may not have color HTML output ready, simply printing the debugger lists suspiciousness values as percentages." ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.887546Z", "iopub.status.busy": "2023-11-12T12:41:20.887452Z", "iopub.status.idle": "2023-11-12T12:41:20.889904Z", "shell.execute_reply": "2023-11-12T12:41:20.889635Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1 50% def remove_html_markup(s): # type: ignore\n", " 2 50% tag = False\n", " 3 50% quote = False\n", " 4 50% out = \"\"\n", " 5\n", " 6 50% for c in s:\n", " 7 50% if c == '<' and not quote:\n", " 8 0% tag = True\n", " 9 50% elif c == '>' and not quote:\n", " 10 0% tag = False\n", " 11 50% elif c == '\"' or c == \"'\" and tag:\n", " 12 100% quote = not quote\n", " 13 50% elif not tag:\n", " 14 50% out = out + c\n", " 15\n", " 16 50% return out\n", "\n" ] } ], "source": [ "print(debugger)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Continuous Spectrum\n", "\n", "The criterion that an event should _only_ occur in failing runs (and not in passing runs) can be too aggressive. In particular, if we have another run that executes the \"culprit\" lines, but does _not_ fail, our \"only in fail\" criterion will no longer be helpful." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here is an example. The input\n", "\n", "```html\n", "text\n", "```\n", "\n", "will trigger the \"culprit\" line\n", "\n", "```python\n", "quote = not quote\n", "```\n", "\n", "but actually produce an output where the tags are properly stripped:" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.891628Z", "iopub.status.busy": "2023-11-12T12:41:20.891506Z", "iopub.status.idle": "2023-11-12T12:41:20.893673Z", "shell.execute_reply": "2023-11-12T12:41:20.893387Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'text'" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "remove_html_markup('text')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "As a consequence, we no longer have lines that are being executed only in failing runs:" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.895197Z", "iopub.status.busy": "2023-11-12T12:41:20.895089Z", "iopub.status.idle": "2023-11-12T12:41:20.897181Z", "shell.execute_reply": "2023-11-12T12:41:20.896923Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = test_debugger_html(DiscreteSpectrumDebugger())\n", "with debugger.collect_pass():\n", " remove_html_markup('')" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.898580Z", "iopub.status.busy": "2023-11-12T12:41:20.898481Z", "iopub.status.idle": "2023-11-12T12:41:20.900524Z", "shell.execute_reply": "2023-11-12T12:41:20.900272Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "set()" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.only_fail_events()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In our spectrum output, the effect now is that the \"culprit\" line is as yellow as all others." ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.901973Z", "iopub.status.busy": "2023-11-12T12:41:20.901866Z", "iopub.status.idle": "2023-11-12T12:41:20.905382Z", "shell.execute_reply": "2023-11-12T12:41:20.905093Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s=''` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | X | \n", "| remove_html_markup:2 | X | X | X | X | \n", "| remove_html_markup:3 | X | X | X | X | \n", "| remove_html_markup:4 | X | X | X | X | \n", "| remove_html_markup:6 | X | X | X | X | \n", "| remove_html_markup:7 | X | X | X | X | \n", "| remove_html_markup:8 | - | X | X | - | \n", "| remove_html_markup:9 | X | X | X | X | \n", "| remove_html_markup:10 | - | X | X | - | \n", "| remove_html_markup:11 | X | X | X | X | \n", "| remove_html_markup:12 | - | - | X | X | \n", "| remove_html_markup:13 | X | X | X | X | \n", "| remove_html_markup:14 | X | X | - | X | \n", "| remove_html_markup:16 | X | X | X | X | \n" ], "text/plain": [ " 1 50% def remove_html_markup(s): # type: ignore\n", " 2 50% tag = False\n", " 3 50% quote = False\n", " 4 50% out = \"\"\n", " 5\n", " 6 50% for c in s:\n", " 7 50% if c == '<' and not quote:\n", " 8 0% tag = True\n", " 9 50% elif c == '>' and not quote:\n", " 10 0% tag = False\n", " 11 50% elif c == '\"' or c == \"'\" and tag:\n", " 12 50% quote = not quote\n", " 13 50% elif not tag:\n", " 14 50% out = out + c\n", " 15\n", " 16 50% return out" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We therefore introduce a different method for highlighting lines, based on their _relative_ occurrence with respect to all runs: If a line has been _mostly_ executed in failing runs, its color should shift towards red; if a line has been _mostly_ executed in passing runs, its color should shift towards green." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This _continuous spectrum_ has been introduced by the seminal _Tarantula_ tool \\cite{Jones2002}. In Tarantula, the color _hue_ for each line is defined as follows:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$\n", "\\textit{color hue}(\\textit{line}) = \\textit{low color(red)} + \\frac{\\%\\textit{passed}(\\textit{line})}{\\%\\textit{passed}(\\textit{line}) + \\%\\textit{failed}(\\textit{line})} \\times \\textit{color range}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here, `%passed` and `%failed` denote the percentage at which a line has been executed in passing and failing runs, respectively. A hue of 0.0 stands for red, a hue of 1.0 stands for green, and a hue of 0.5 stands for equal fractions of red and green, yielding yellow." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can implement these measures right away as methods in a new `ContinuousSpectrumDebugger` class:" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.906908Z", "iopub.status.busy": "2023-11-12T12:41:20.906828Z", "iopub.status.idle": "2023-11-12T12:41:20.910609Z", "shell.execute_reply": "2023-11-12T12:41:20.910356Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class ContinuousSpectrumDebugger(DiscreteSpectrumDebugger):\n", " \"\"\"Visualize differences between executions using a color spectrum\"\"\"\n", "\n", " def collectors_with_event(self, event: Any, category: str) -> Set[Collector]:\n", " \"\"\"\n", " Return all collectors in a category\n", " that observed the given event.\n", " \"\"\"\n", " all_runs = self.collectors[category]\n", " collectors_with_event = set(collector for collector in all_runs \n", " if event in collector.events())\n", " return collectors_with_event\n", "\n", " def collectors_without_event(self, event: Any, category: str) -> Set[Collector]:\n", " \"\"\"\n", " Return all collectors in a category\n", " that did not observe the given event.\n", " \"\"\"\n", " all_runs = self.collectors[category]\n", " collectors_without_event = set(collector for collector in all_runs \n", " if event not in collector.events())\n", " return collectors_without_event\n", "\n", " def event_fraction(self, event: Any, category: str) -> float:\n", " if category not in self.collectors:\n", " return 0.0\n", "\n", " all_collectors = self.collectors[category]\n", " collectors_with_event = self.collectors_with_event(event, category)\n", " fraction = len(collectors_with_event) / len(all_collectors)\n", " # print(f\"%{category}({event}) = {fraction}\")\n", " return fraction\n", "\n", " def passed_fraction(self, event: Any) -> float:\n", " return self.event_fraction(event, self.PASS)\n", "\n", " def failed_fraction(self, event: Any) -> float:\n", " return self.event_fraction(event, self.FAIL)\n", "\n", " def hue(self, event: Any) -> Optional[float]:\n", " \"\"\"Return a color hue from 0.0 (red) to 1.0 (green).\"\"\"\n", " passed = self.passed_fraction(event)\n", " failed = self.failed_fraction(event)\n", " if passed + failed > 0:\n", " return passed / (passed + failed)\n", " else:\n", " return None" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Having a continuous hue also implies a continuous suspiciousness and associated tooltips:" ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.912118Z", "iopub.status.busy": "2023-11-12T12:41:20.912028Z", "iopub.status.idle": "2023-11-12T12:41:20.914334Z", "shell.execute_reply": "2023-11-12T12:41:20.914070Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ContinuousSpectrumDebugger(ContinuousSpectrumDebugger):\n", " def suspiciousness(self, event: Any) -> Optional[float]:\n", " hue = self.hue(event)\n", " if hue is None:\n", " return None\n", " return 1 - hue\n", "\n", " def tooltip(self, event: Any) -> str:\n", " return self.percentage(event)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The hue for lines executed only in failing runs is (deep) red, as expected:" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.915960Z", "iopub.status.busy": "2023-11-12T12:41:20.915854Z", "iopub.status.idle": "2023-11-12T12:41:20.917736Z", "shell.execute_reply": "2023-11-12T12:41:20.917482Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = test_debugger_html(ContinuousSpectrumDebugger())" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.919153Z", "iopub.status.busy": "2023-11-12T12:41:20.919046Z", "iopub.status.idle": "2023-11-12T12:41:20.920789Z", "shell.execute_reply": "2023-11-12T12:41:20.920535Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('remove_html_markup', 12) 0.0\n" ] } ], "source": [ "for location in debugger.only_fail_events():\n", " print(location, debugger.hue(location))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Likewise, the hue for lines executed in passing runs is (deep) green:" ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.922216Z", "iopub.status.busy": "2023-11-12T12:41:20.922114Z", "iopub.status.idle": "2023-11-12T12:41:20.923879Z", "shell.execute_reply": "2023-11-12T12:41:20.923645Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('remove_html_markup', 8) 1.0\n", "('remove_html_markup', 10) 1.0\n" ] } ], "source": [ "for location in debugger.only_pass_events():\n", " print(location, debugger.hue(location))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The Tarantula tool not only sets the hue for a line, but also uses _brightness_ as measure for support – that is, how often was the line executed at all. The brighter a line, the stronger the correlation with a passing or failing outcome." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The brightness is defined as follows:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "$$\\textit{brightness}(line) = \\max(\\%\\textit{passed}(\\textit{line}), \\%\\textit{failed}(\\textit{line}))$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "and it is easily implemented, too:" ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.925581Z", "iopub.status.busy": "2023-11-12T12:41:20.925469Z", "iopub.status.idle": "2023-11-12T12:41:20.927482Z", "shell.execute_reply": "2023-11-12T12:41:20.927065Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ContinuousSpectrumDebugger(ContinuousSpectrumDebugger):\n", " def brightness(self, event: Any) -> float:\n", " return max(self.passed_fraction(event), self.failed_fraction(event))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our single \"only in fail\" line has a brightness of 1.0 (the maximum)." ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.929570Z", "iopub.status.busy": "2023-11-12T12:41:20.929442Z", "iopub.status.idle": "2023-11-12T12:41:20.932136Z", "shell.execute_reply": "2023-11-12T12:41:20.931819Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('remove_html_markup', 12) 1.0\n" ] } ], "source": [ "debugger = test_debugger_html(ContinuousSpectrumDebugger())\n", "for location in debugger.only_fail_events():\n", " print(location, debugger.brightness(location))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "With this, we can now define a color for each line. To this end, we override the (previously discrete) `color()` method such that it returns a color specification giving hue and brightness. We use the HTML format `hsl(hue, saturation, lightness)` where the hue is given as a value between 0 and 360 (0 is red, 120 is green) and saturation and lightness are provided as percentages." ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.933932Z", "iopub.status.busy": "2023-11-12T12:41:20.933793Z", "iopub.status.idle": "2023-11-12T12:41:20.936214Z", "shell.execute_reply": "2023-11-12T12:41:20.935897Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class ContinuousSpectrumDebugger(ContinuousSpectrumDebugger):\n", " def color(self, event: Any) -> Optional[str]:\n", " hue = self.hue(event)\n", " if hue is None:\n", " return None\n", " saturation = self.brightness(event)\n", "\n", " # HSL color values are specified with: \n", " # hsl(hue, saturation, lightness).\n", " return f\"hsl({hue * 120}, {saturation * 100}%, 80%)\"" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.937703Z", "iopub.status.busy": "2023-11-12T12:41:20.937600Z", "iopub.status.idle": "2023-11-12T12:41:20.939533Z", "shell.execute_reply": "2023-11-12T12:41:20.939274Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = test_debugger_html(ContinuousSpectrumDebugger())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Lines executed only in failing runs are still shown in red:" ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.941071Z", "iopub.status.busy": "2023-11-12T12:41:20.940961Z", "iopub.status.idle": "2023-11-12T12:41:20.942708Z", "shell.execute_reply": "2023-11-12T12:41:20.942443Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('remove_html_markup', 12) hsl(0.0, 100.0%, 80%)\n" ] } ], "source": [ "for location in debugger.only_fail_events():\n", " print(location, debugger.color(location))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "... whereas lines executed only in passing runs are still shown in green:" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.944221Z", "iopub.status.busy": "2023-11-12T12:41:20.944107Z", "iopub.status.idle": "2023-11-12T12:41:20.945916Z", "shell.execute_reply": "2023-11-12T12:41:20.945649Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('remove_html_markup', 8) hsl(120.0, 50.0%, 80%)\n", "('remove_html_markup', 10) hsl(120.0, 50.0%, 80%)\n" ] } ], "source": [ "for location in debugger.only_pass_events():\n", " print(location, debugger.color(location))" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.947542Z", "iopub.status.busy": "2023-11-12T12:41:20.947435Z", "iopub.status.idle": "2023-11-12T12:41:20.951068Z", "shell.execute_reply": "2023-11-12T12:41:20.950797Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ " 1 50% def remove_html_markup(s): # type: ignore\n", " 2 50% tag = False\n", " 3 50% quote = False\n", " 4 50% out = \"\"\n", " 5\n", " 6 50% for c in s:\n", " 7 50% if c == '<' and not quote:\n", " 8 0% tag = True\n", " 9 50% elif c == '>' and not quote:\n", " 10 0% tag = False\n", " 11 50% elif c == '\"' or c == \"'\" and tag:\n", " 12 100% quote = not quote\n", " 13 50% elif not tag:\n", " 14 50% out = out + c\n", " 15\n", " 16 50% return out" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "What happens with our `quote = not quote` \"culprit\" line if it is executed in passing runs, too?" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.952587Z", "iopub.status.busy": "2023-11-12T12:41:20.952493Z", "iopub.status.idle": "2023-11-12T12:41:20.954372Z", "shell.execute_reply": "2023-11-12T12:41:20.954097Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with debugger.collect_pass():\n", " out = remove_html_markup('')" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.955910Z", "iopub.status.busy": "2023-11-12T12:41:20.955816Z", "iopub.status.idle": "2023-11-12T12:41:20.959811Z", "shell.execute_reply": "2023-11-12T12:41:20.959524Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

In which color will the quote = not quote \"culprit\" line be shown after executing the above code?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz('In which color will the `quote = not quote` \"culprit\" line '\n", " 'be shown after executing the above code?',\n", " [\n", " 'Green',\n", " 'Yellow',\n", " 'Orange',\n", " 'Red'\n", " ], '999 // 333')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We see that it still is shown with an orange-red tint." ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.961449Z", "iopub.status.busy": "2023-11-12T12:41:20.961315Z", "iopub.status.idle": "2023-11-12T12:41:20.965514Z", "shell.execute_reply": "2023-11-12T12:41:20.965240Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s=''` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | X | \n", "| remove_html_markup:2 | X | X | X | X | \n", "| remove_html_markup:3 | X | X | X | X | \n", "| remove_html_markup:4 | X | X | X | X | \n", "| remove_html_markup:6 | X | X | X | X | \n", "| remove_html_markup:7 | X | X | X | X | \n", "| remove_html_markup:8 | - | X | X | - | \n", "| remove_html_markup:9 | X | X | X | X | \n", "| remove_html_markup:10 | - | X | X | - | \n", "| remove_html_markup:11 | X | X | X | X | \n", "| remove_html_markup:12 | - | - | X | X | \n", "| remove_html_markup:13 | X | X | X | X | \n", "| remove_html_markup:14 | X | X | - | X | \n", "| remove_html_markup:16 | X | X | X | X | \n" ], "text/plain": [ " 1 50% def remove_html_markup(s): # type: ignore\n", " 2 50% tag = False\n", " 3 50% quote = False\n", " 4 50% out = \"\"\n", " 5\n", " 6 50% for c in s:\n", " 7 50% if c == '<' and not quote:\n", " 8 0% tag = True\n", " 9 50% elif c == '>' and not quote:\n", " 10 0% tag = False\n", " 11 50% elif c == '\"' or c == \"'\" and tag:\n", " 12 75% quote = not quote\n", " 13 50% elif not tag:\n", " 14 60% out = out + c\n", " 15\n", " 16 50% return out" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here's another example, coming right from the Tarantula paper. The `middle()` function takes three numbers `x`, `y`, and `z`, and returns the one that is neither the minimum nor the maximum of the three:" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.967101Z", "iopub.status.busy": "2023-11-12T12:41:20.967005Z", "iopub.status.idle": "2023-11-12T12:41:20.969033Z", "shell.execute_reply": "2023-11-12T12:41:20.968772Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def middle(x, y, z): # type: ignore\n", " if y < z:\n", " if x < y:\n", " return y\n", " elif x < z:\n", " return y\n", " else:\n", " if x > y:\n", " return y\n", " elif x > z:\n", " return x\n", " return z" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.970527Z", "iopub.status.busy": "2023-11-12T12:41:20.970431Z", "iopub.status.idle": "2023-11-12T12:41:20.972679Z", "shell.execute_reply": "2023-11-12T12:41:20.972409Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "middle(1, 2, 3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Unfortunately, `middle()` can fail:" ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.974022Z", "iopub.status.busy": "2023-11-12T12:41:20.973938Z", "iopub.status.idle": "2023-11-12T12:41:20.976003Z", "shell.execute_reply": "2023-11-12T12:41:20.975710Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "middle(2, 1, 3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let is see whether we can find the bug with a few additional test cases:" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.977496Z", "iopub.status.busy": "2023-11-12T12:41:20.977388Z", "iopub.status.idle": "2023-11-12T12:41:20.979185Z", "shell.execute_reply": "2023-11-12T12:41:20.978916Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# ignore\n", "T3 = TypeVar('T3', bound='DifferenceDebugger')" ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.980665Z", "iopub.status.busy": "2023-11-12T12:41:20.980561Z", "iopub.status.idle": "2023-11-12T12:41:20.982814Z", "shell.execute_reply": "2023-11-12T12:41:20.982569Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def test_debugger_middle(debugger: T3) -> T3:\n", " with debugger.collect_pass():\n", " middle(3, 3, 5)\n", " with debugger.collect_pass():\n", " middle(1, 2, 3)\n", " with debugger.collect_pass():\n", " middle(3, 2, 1)\n", " with debugger.collect_pass():\n", " middle(5, 5, 5)\n", " with debugger.collect_pass():\n", " middle(5, 3, 4)\n", " with debugger.collect_fail():\n", " middle(2, 1, 3)\n", " return debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Note that in order to collect data from multiple function invocations, you need to have a separate `with` clause for every invocation. The following will _not_ work correctly:\n", "\n", "```python\n", " with debugger.collect_pass():\n", " middle(3, 3, 5)\n", " middle(1, 2, 3)\n", " ...\n", "```" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.984220Z", "iopub.status.busy": "2023-11-12T12:41:20.984112Z", "iopub.status.idle": "2023-11-12T12:41:20.985792Z", "shell.execute_reply": "2023-11-12T12:41:20.985545Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = test_debugger_middle(ContinuousSpectrumDebugger())" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.987434Z", "iopub.status.busy": "2023-11-12T12:41:20.987311Z", "iopub.status.idle": "2023-11-12T12:41:20.989727Z", "shell.execute_reply": "2023-11-12T12:41:20.989475Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/markdown": [ "| `middle` | `x=3, y=3, z=5` | `x=1, y=2, z=3` | `x=3, y=2, z=1` | `x=5, y=5, z=5` | `x=5, y=3, z=4` | `x=2, y=1, z=3` | \n", "| --------- | ---- | ---- | ---- | ---- | ---- | ---- | \n", "| middle:1 | X | X | X | X | X | X | \n", "| middle:2 | X | X | X | X | X | X | \n", "| middle:3 | X | X | - | - | X | X | \n", "| middle:4 | - | X | - | - | - | - | \n", "| middle:5 | X | - | - | - | X | X | \n", "| middle:6 | X | - | - | - | - | X | \n", "| middle:8 | - | - | X | X | - | - | \n", "| middle:9 | - | - | X | - | - | - | \n", "| middle:10 | - | - | - | X | - | - | \n", "| middle:12 | - | - | - | X | X | - | \n" ], "text/plain": [ "" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.event_table(args=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Here comes the visualization. We see that the `return y` line is the culprit here – and actually also the one to be fixed." ] }, { "cell_type": "code", "execution_count": 99, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.991128Z", "iopub.status.busy": "2023-11-12T12:41:20.991025Z", "iopub.status.idle": "2023-11-12T12:41:20.994315Z", "shell.execute_reply": "2023-11-12T12:41:20.994030Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
   1 def middle(x, y, z):  # type: ignore
\n", "
   2     if y < z:
\n", "
   3         if x < y:
\n", "
   4             return y
\n", "
   5         elif x < z:
\n", "
   6             return y
\n", "
   7     else:\n",
       "
\n", "
   8         if x > y:
\n", "
   9             return y
\n", "
  10         elif x > z:
\n", "
  11             return x\n",
       "
\n", "
  12     return z
\n" ], "text/markdown": [ "| `middle` | `x=3, y=3, z=5` | `x=1, y=2, z=3` | `x=3, y=2, z=1` | `x=5, y=5, z=5` | `x=5, y=3, z=4` | `x=2, y=1, z=3` | \n", "| --------- | ---- | ---- | ---- | ---- | ---- | ---- | \n", "| middle:1 | X | X | X | X | X | X | \n", "| middle:2 | X | X | X | X | X | X | \n", "| middle:3 | X | X | - | - | X | X | \n", "| middle:4 | - | X | - | - | - | - | \n", "| middle:5 | X | - | - | - | X | X | \n", "| middle:6 | X | - | - | - | - | X | \n", "| middle:8 | - | - | X | X | - | - | \n", "| middle:9 | - | - | X | - | - | - | \n", "| middle:10 | - | - | - | X | - | - | \n", "| middle:12 | - | - | - | X | X | - | \n" ], "text/plain": [ " 1 50% def middle(x, y, z): # type: ignore\n", " 2 50% if y < z:\n", " 3 62% if x < y:\n", " 4 0% return y\n", " 5 71% elif x < z:\n", " 6 83% return y\n", " 7 else:\n", " 8 0% if x > y:\n", " 9 0% return y\n", " 10 0% elif x > z:\n", " 11 return x\n", " 12 0% return z" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger" ] }, { "cell_type": "code", "execution_count": 100, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:20.995997Z", "iopub.status.busy": "2023-11-12T12:41:20.995896Z", "iopub.status.idle": "2023-11-12T12:41:21.000054Z", "shell.execute_reply": "2023-11-12T12:41:20.999774Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", "
\n", "

Quiz

\n", "

\n", "

Which of the above lines should be fixed?
\n", "

\n", "

\n", "

\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", " \n", "
\n", " \n", "
\n", "

\n", " \n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 100, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quiz(\"Which of the above lines should be fixed?\",\n", " [\n", " 'Line 3: `if x < y`',\n", " 'Line 5: `elif x < z`',\n", " 'Line 6: `return y`',\n", " 'Line 9: `return y`',\n", " ], r'len(\" middle \".strip()[:3])')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Indeed, in the `middle()` example, the \"reddest\" line is also the one to be fixed. Here is the fixed version:" ] }, { "cell_type": "code", "execution_count": 101, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.001503Z", "iopub.status.busy": "2023-11-12T12:41:21.001395Z", "iopub.status.idle": "2023-11-12T12:41:21.003378Z", "shell.execute_reply": "2023-11-12T12:41:21.003073Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def middle_fixed(x, y, z): # type: ignore\n", " if y < z:\n", " if x < y:\n", " return y\n", " elif x < z:\n", " return x\n", " else:\n", " if x > y:\n", " return y\n", " elif x > z:\n", " return x\n", " return z" ] }, { "cell_type": "code", "execution_count": 102, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.004910Z", "iopub.status.busy": "2023-11-12T12:41:21.004800Z", "iopub.status.idle": "2023-11-12T12:41:21.006933Z", "shell.execute_reply": "2023-11-12T12:41:21.006667Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 102, "metadata": {}, "output_type": "execute_result" } ], "source": [ "middle_fixed(2, 1, 3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Ranking Lines by Suspiciousness\n", "\n", "In a large program, there can be several locations (and events) that could be flagged as suspicious. It suffices that some large code block of say, 1,000 lines, is mostly executed in failing runs, and then all of this code block will be visualized in some shade of red. \n", "\n", "To further highlight the \"most suspicious\" events, one idea is to use a _ranking_ – that is, coming up with a list of events where those events most correlated with failures would be shown at the top. The programmer would then examine these events one by one and proceed down the list. We will show how this works for two \"correlation\" metrics – first the _Tarantula_ metric, as introduced above, and then the _Ochiai_ metric, which has shown to be one of the best \"ranking\" metrics." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We introduce a base class `RankingDebugger` with an abstract method `suspiciousness()` to be overloaded in subclasses. The method `rank()` returns a list of all events observed, sorted by suspiciousness, highest first." ] }, { "cell_type": "code", "execution_count": 103, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.008527Z", "iopub.status.busy": "2023-11-12T12:41:21.008407Z", "iopub.status.idle": "2023-11-12T12:41:21.011018Z", "shell.execute_reply": "2023-11-12T12:41:21.010729Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class RankingDebugger(DiscreteSpectrumDebugger):\n", " \"\"\"Rank events by their suspiciousness\"\"\"\n", "\n", " def rank(self) -> List[Any]:\n", " \"\"\"Return a list of events, sorted by suspiciousness, highest first.\"\"\"\n", "\n", " def susp(event: Any) -> float:\n", " suspiciousness = self.suspiciousness(event)\n", " assert suspiciousness is not None\n", " return suspiciousness\n", "\n", " events = list(self.all_events())\n", " events.sort(key=susp, reverse=True)\n", " return events\n", "\n", " def __repr__(self) -> str:\n", " return repr(self.rank())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### The Tarantula Metric\n", "\n", "We can use the Tarantula metric to sort lines according to their suspiciousness. The \"redder\" a line (a hue of 0.0), the more suspicious it is. We can simply define" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$\n", "\\textit{suspiciousness}_\\textit{tarantula}(\\textit{event}) = 1 - \\textit{color hue}(\\textit{event})\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "where $\\textit{color hue}$ is as defined above." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is exactly the `suspiciousness()` function as already implemented in our `ContinuousSpectrumDebugger`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We introduce the `TarantulaDebugger` class, inheriting visualization capabilities from the `ContinuousSpectrumDebugger` class as well as the suspiciousness features from the `RankingDebugger` class." ] }, { "cell_type": "code", "execution_count": 104, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.012746Z", "iopub.status.busy": "2023-11-12T12:41:21.012613Z", "iopub.status.idle": "2023-11-12T12:41:21.014369Z", "shell.execute_reply": "2023-11-12T12:41:21.014122Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class TarantulaDebugger(ContinuousSpectrumDebugger, RankingDebugger):\n", " \"\"\"Spectrum-based Debugger using the Tarantula metric for suspiciousness\"\"\"\n", " pass" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us list `remove_html_markup()` with highlighted lines again:" ] }, { "cell_type": "code", "execution_count": 105, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.015902Z", "iopub.status.busy": "2023-11-12T12:41:21.015794Z", "iopub.status.idle": "2023-11-12T12:41:21.017656Z", "shell.execute_reply": "2023-11-12T12:41:21.017361Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "tarantula_html = test_debugger_html(TarantulaDebugger())" ] }, { "cell_type": "code", "execution_count": 106, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.019153Z", "iopub.status.busy": "2023-11-12T12:41:21.019047Z", "iopub.status.idle": "2023-11-12T12:41:21.022276Z", "shell.execute_reply": "2023-11-12T12:41:21.021997Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ "[('remove_html_markup', 12), ('remove_html_markup', 11), ('remove_html_markup', 14), ('remove_html_markup', 3), ('remove_html_markup', 6), ('remove_html_markup', 9), ('remove_html_markup', 1), ('remove_html_markup', 7), ('remove_html_markup', 4), ('remove_html_markup', 13), ('remove_html_markup', 16), ('remove_html_markup', 2), ('remove_html_markup', 10), ('remove_html_markup', 8)]" ] }, "execution_count": 106, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tarantula_html" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here's our ranking of lines, from most suspicious to least suspicious:" ] }, { "cell_type": "code", "execution_count": 107, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.023800Z", "iopub.status.busy": "2023-11-12T12:41:21.023717Z", "iopub.status.idle": "2023-11-12T12:41:21.026063Z", "shell.execute_reply": "2023-11-12T12:41:21.025782Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[('remove_html_markup', 12),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 1),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 16),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 8)]" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tarantula_html.rank()" ] }, { "cell_type": "code", "execution_count": 108, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.027606Z", "iopub.status.busy": "2023-11-12T12:41:21.027449Z", "iopub.status.idle": "2023-11-12T12:41:21.029792Z", "shell.execute_reply": "2023-11-12T12:41:21.029504Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "1.0" ] }, "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tarantula_html.suspiciousness(tarantula_html.rank()[0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We see that the first line in the list is indeed the most suspicious; the two \"green\" lines come at the very end." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For the `middle()` function, we also obtain a ranking from \"reddest\" to \"greenest\"." ] }, { "cell_type": "code", "execution_count": 109, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.031256Z", "iopub.status.busy": "2023-11-12T12:41:21.031174Z", "iopub.status.idle": "2023-11-12T12:41:21.032882Z", "shell.execute_reply": "2023-11-12T12:41:21.032631Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "tarantula_middle = test_debugger_middle(TarantulaDebugger())" ] }, { "cell_type": "code", "execution_count": 110, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.034162Z", "iopub.status.busy": "2023-11-12T12:41:21.034081Z", "iopub.status.idle": "2023-11-12T12:41:21.037223Z", "shell.execute_reply": "2023-11-12T12:41:21.036962Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
   1 def middle(x, y, z):  # type: ignore
\n", "
   2     if y < z:
\n", "
   3         if x < y:
\n", "
   4             return y
\n", "
   5         elif x < z:
\n", "
   6             return y
\n", "
   7     else:\n",
       "
\n", "
   8         if x > y:
\n", "
   9             return y
\n", "
  10         elif x > z:
\n", "
  11             return x\n",
       "
\n", "
  12     return z
\n" ], "text/markdown": [ "| `middle` | `x=3, y=3, z=5` | `x=1, y=2, z=3` | `x=3, y=2, z=1` | `x=5, y=5, z=5` | `x=5, y=3, z=4` | `x=2, y=1, z=3` | \n", "| --------- | ---- | ---- | ---- | ---- | ---- | ---- | \n", "| middle:1 | X | X | X | X | X | X | \n", "| middle:2 | X | X | X | X | X | X | \n", "| middle:3 | X | X | - | - | X | X | \n", "| middle:4 | - | X | - | - | - | - | \n", "| middle:5 | X | - | - | - | X | X | \n", "| middle:6 | X | - | - | - | - | X | \n", "| middle:8 | - | - | X | X | - | - | \n", "| middle:9 | - | - | X | - | - | - | \n", "| middle:10 | - | - | - | X | - | - | \n", "| middle:12 | - | - | - | X | X | - | \n" ], "text/plain": [ "[('middle', 6), ('middle', 5), ('middle', 3), ('middle', 2), ('middle', 1), ('middle', 12), ('middle', 9), ('middle', 8), ('middle', 4), ('middle', 10)]" ] }, "execution_count": 110, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tarantula_middle" ] }, { "cell_type": "code", "execution_count": 111, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.038786Z", "iopub.status.busy": "2023-11-12T12:41:21.038690Z", "iopub.status.idle": "2023-11-12T12:41:21.041069Z", "shell.execute_reply": "2023-11-12T12:41:21.040793Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[('middle', 6),\n", " ('middle', 5),\n", " ('middle', 3),\n", " ('middle', 2),\n", " ('middle', 1),\n", " ('middle', 12),\n", " ('middle', 9),\n", " ('middle', 8),\n", " ('middle', 4),\n", " ('middle', 10)]" ] }, "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tarantula_middle.rank()" ] }, { "cell_type": "code", "execution_count": 112, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.042704Z", "iopub.status.busy": "2023-11-12T12:41:21.042570Z", "iopub.status.idle": "2023-11-12T12:41:21.044717Z", "shell.execute_reply": "2023-11-12T12:41:21.044453Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.8333333333333333" ] }, "execution_count": 112, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tarantula_middle.suspiciousness(tarantula_middle.rank()[0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### The Ochiai Metric\n", "\n", "The _Ochiai_ Metric \\cite{Ochiai1957} first introduced in the biology domain \\cite{daSilvaMeyer2004} and later applied for fault localization by Abreu et al. \\cite{Abreu2009}, is defined as follows:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "$$\n", "\\textit{suspiciousness}_\\textit{ochiai} = \\frac\n", "{\\textit{failed}(\\textit{event})}\n", "{\\sqrt{\n", "\\bigl(\\textit{failed}(\\textit{event}) + \\textit{not-in-failed}(\\textit{event})\\bigr)\n", "\\times\n", "\\bigl(\\textit{failed}(\\textit{event}) + \\textit{passed}(\\textit{event})\\bigr)\n", "}}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "where\n", "\n", "* $\\textit{failed}(\\textit{event})$ is the number of times the event occurred in _failing_ runs\n", "* $\\textit{not-in-failed}(\\textit{event})$ is the number of times the event did _not_ occur in failing runs\n", "* $\\textit{passed}(\\textit{event})$ is the number of times the event occurred in _passing_ runs.\n", "\n", "We can easily implement this formula:" ] }, { "cell_type": "code", "execution_count": 113, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.046395Z", "iopub.status.busy": "2023-11-12T12:41:21.046164Z", "iopub.status.idle": "2023-11-12T12:41:21.047813Z", "shell.execute_reply": "2023-11-12T12:41:21.047582Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import math" ] }, { "cell_type": "code", "execution_count": 114, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.049278Z", "iopub.status.busy": "2023-11-12T12:41:21.049169Z", "iopub.status.idle": "2023-11-12T12:41:21.052155Z", "shell.execute_reply": "2023-11-12T12:41:21.051890Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class OchiaiDebugger(ContinuousSpectrumDebugger, RankingDebugger):\n", " \"\"\"Spectrum-based Debugger using the Ochiai metric for suspiciousness\"\"\"\n", "\n", " def suspiciousness(self, event: Any) -> Optional[float]:\n", " failed = len(self.collectors_with_event(event, self.FAIL))\n", " not_in_failed = len(self.collectors_without_event(event, self.FAIL))\n", " passed = len(self.collectors_with_event(event, self.PASS))\n", "\n", " try:\n", " return failed / math.sqrt((failed + not_in_failed) * (failed + passed))\n", " except ZeroDivisionError:\n", " return None\n", "\n", " def hue(self, event: Any) -> Optional[float]:\n", " suspiciousness = self.suspiciousness(event)\n", " if suspiciousness is None:\n", " return None\n", " return 1 - suspiciousness" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Applied on the `remove_html_markup()` function, the individual suspiciousness scores differ from Tarantula. However, we obtain a very similar visualization, and the same ranking." ] }, { "cell_type": "code", "execution_count": 115, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.053503Z", "iopub.status.busy": "2023-11-12T12:41:21.053404Z", "iopub.status.idle": "2023-11-12T12:41:21.055133Z", "shell.execute_reply": "2023-11-12T12:41:21.054866Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "ochiai_html = test_debugger_html(OchiaiDebugger())" ] }, { "cell_type": "code", "execution_count": 116, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.056571Z", "iopub.status.busy": "2023-11-12T12:41:21.056459Z", "iopub.status.idle": "2023-11-12T12:41:21.059802Z", "shell.execute_reply": "2023-11-12T12:41:21.059517Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ "[('remove_html_markup', 12), ('remove_html_markup', 11), ('remove_html_markup', 14), ('remove_html_markup', 3), ('remove_html_markup', 6), ('remove_html_markup', 9), ('remove_html_markup', 1), ('remove_html_markup', 7), ('remove_html_markup', 4), ('remove_html_markup', 13), ('remove_html_markup', 16), ('remove_html_markup', 2), ('remove_html_markup', 10), ('remove_html_markup', 8)]" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_html" ] }, { "cell_type": "code", "execution_count": 117, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.061367Z", "iopub.status.busy": "2023-11-12T12:41:21.061265Z", "iopub.status.idle": "2023-11-12T12:41:21.063773Z", "shell.execute_reply": "2023-11-12T12:41:21.063487Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[('remove_html_markup', 12),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 1),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 16),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 8)]" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_html.rank()" ] }, { "cell_type": "code", "execution_count": 118, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.065162Z", "iopub.status.busy": "2023-11-12T12:41:21.065070Z", "iopub.status.idle": "2023-11-12T12:41:21.067263Z", "shell.execute_reply": "2023-11-12T12:41:21.067032Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "1.0" ] }, "execution_count": 118, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_html.suspiciousness(ochiai_html.rank()[0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The same observations also apply for the `middle()` function." ] }, { "cell_type": "code", "execution_count": 119, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.087133Z", "iopub.status.busy": "2023-11-12T12:41:21.086988Z", "iopub.status.idle": "2023-11-12T12:41:21.088906Z", "shell.execute_reply": "2023-11-12T12:41:21.088592Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "ochiai_middle = test_debugger_middle(OchiaiDebugger())" ] }, { "cell_type": "code", "execution_count": 120, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.090482Z", "iopub.status.busy": "2023-11-12T12:41:21.090361Z", "iopub.status.idle": "2023-11-12T12:41:21.093340Z", "shell.execute_reply": "2023-11-12T12:41:21.093068Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
   1 def middle(x, y, z):  # type: ignore
\n", "
   2     if y < z:
\n", "
   3         if x < y:
\n", "
   4             return y
\n", "
   5         elif x < z:
\n", "
   6             return y
\n", "
   7     else:\n",
       "
\n", "
   8         if x > y:
\n", "
   9             return y
\n", "
  10         elif x > z:
\n", "
  11             return x\n",
       "
\n", "
  12     return z
\n" ], "text/markdown": [ "| `middle` | `x=3, y=3, z=5` | `x=1, y=2, z=3` | `x=3, y=2, z=1` | `x=5, y=5, z=5` | `x=5, y=3, z=4` | `x=2, y=1, z=3` | \n", "| --------- | ---- | ---- | ---- | ---- | ---- | ---- | \n", "| middle:1 | X | X | X | X | X | X | \n", "| middle:2 | X | X | X | X | X | X | \n", "| middle:3 | X | X | - | - | X | X | \n", "| middle:4 | - | X | - | - | - | - | \n", "| middle:5 | X | - | - | - | X | X | \n", "| middle:6 | X | - | - | - | - | X | \n", "| middle:8 | - | - | X | X | - | - | \n", "| middle:9 | - | - | X | - | - | - | \n", "| middle:10 | - | - | - | X | - | - | \n", "| middle:12 | - | - | - | X | X | - | \n" ], "text/plain": [ "[('middle', 6), ('middle', 5), ('middle', 3), ('middle', 2), ('middle', 1), ('middle', 12), ('middle', 9), ('middle', 8), ('middle', 4), ('middle', 10)]" ] }, "execution_count": 120, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_middle" ] }, { "cell_type": "code", "execution_count": 121, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.094739Z", "iopub.status.busy": "2023-11-12T12:41:21.094638Z", "iopub.status.idle": "2023-11-12T12:41:21.097034Z", "shell.execute_reply": "2023-11-12T12:41:21.096632Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[('middle', 6),\n", " ('middle', 5),\n", " ('middle', 3),\n", " ('middle', 2),\n", " ('middle', 1),\n", " ('middle', 12),\n", " ('middle', 9),\n", " ('middle', 8),\n", " ('middle', 4),\n", " ('middle', 10)]" ] }, "execution_count": 121, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_middle.rank()" ] }, { "cell_type": "code", "execution_count": 122, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.099136Z", "iopub.status.busy": "2023-11-12T12:41:21.098939Z", "iopub.status.idle": "2023-11-12T12:41:21.101772Z", "shell.execute_reply": "2023-11-12T12:41:21.101346Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.7071067811865475" ] }, "execution_count": 122, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_middle.suspiciousness(ochiai_middle.rank()[0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### How Useful is Ranking?\n", "\n", "So, which metric is better? The standard method to evaluate such rankings is to determine a _ground truth_ – that is, the set of locations that eventually are fixed – and to check at which point in the ranking any such location occurs – the earlier, the better. In our `remove_html_markup()` and `middle()` examples, both the Tarantula and the Ochiai metric perform flawlessly, as the \"culprit\" line is always ranked at the top. However, this need not always be the case; the exact performance depends on the nature of the code and the observed runs. (Also, the question of whether there always is exactly one possible location where the program can be fixed is open for discussion.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "You will be surprised that over time, _several dozen_ metrics have been proposed \\cite{Wong2016}, each performing somewhat better or somewhat worse depending on which benchmark they were applied on. The two metrics discussed above each have their merits – the Tarantula metric was among the first such metrics, and the Ochiai metric is generally shown to be among the most effective ones \\cite{Abreu2009}." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "While rankings can be easily _evaluated_, it is not necessarily clear whether and how much they serve programmers. As stated above, the assumption of rankings is that developers examine one potentially defective statement after another until they find the actually defective one. However, in a series of human studies with developers, Parnin and Orso \\cite{Parnin2011} found that this assumption may not hold:\n", "\n", "> It is unclear whether developers can actually determine the faulty nature of a statement by simply looking at it, without any additional information (e.g., the state of the program when the statement was executed or the statements that were executed before or after that one).\n", "\n", "In their study, they found that rankings could help completing a task faster, but this effect was limited to experienced developers and simpler code. Artificially changing the rank of faulty statements had little to no effect, implying that developers would not strictly follow the ranked list of statements, but rather search through the code to understand it. At this point, a _visualization_ as in the Tarantula tool can be helpful to programmers as it _guides_ the search, but a _ranking_ that _defines_ where to search may be less useful." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Having said that, ranking has its merits – notably as it comes to informing _automated_ debugging techniques. In the [chapter on program repair](Repairer.ipynb), we will see how ranked lists of potentially faulty statements tell automated repair techniques where to try to repair the program first. And once such a repair is successful, we have a very strong indication on where and how the program could be fixed!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Using Large Test Suites" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In fault localization, the larger and the more thorough the test suite, the higher the precision. Let us try out what happens if we extend the `middle()` test suite with additional test cases." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The function `middle_testcase()` returns a random input for `middle()`:" ] }, { "cell_type": "code", "execution_count": 123, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.103681Z", "iopub.status.busy": "2023-11-12T12:41:21.103564Z", "iopub.status.idle": "2023-11-12T12:41:21.105097Z", "shell.execute_reply": "2023-11-12T12:41:21.104842Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import random" ] }, { "cell_type": "code", "execution_count": 124, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.106436Z", "iopub.status.busy": "2023-11-12T12:41:21.106352Z", "iopub.status.idle": "2023-11-12T12:41:21.108316Z", "shell.execute_reply": "2023-11-12T12:41:21.108053Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def middle_testcase() -> Tuple[int, int, int]:\n", " x = random.randrange(10)\n", " y = random.randrange(10)\n", " z = random.randrange(10)\n", " return x, y, z" ] }, { "cell_type": "code", "execution_count": 125, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.109638Z", "iopub.status.busy": "2023-11-12T12:41:21.109545Z", "iopub.status.idle": "2023-11-12T12:41:21.112092Z", "shell.execute_reply": "2023-11-12T12:41:21.111825Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[(9, 0, 5), (0, 1, 9), (9, 0, 9), (2, 9, 0), (4, 7, 2)]" ] }, "execution_count": 125, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[middle_testcase() for i in range(5)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The function `middle_test()` simply checks if `middle()` operates correctly – by placing `x`, `y`, and `z` in a list, sorting it, and checking the middle argument. If `middle()` fails, `middle_test()` raises an exception." ] }, { "cell_type": "code", "execution_count": 126, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.113681Z", "iopub.status.busy": "2023-11-12T12:41:21.113571Z", "iopub.status.idle": "2023-11-12T12:41:21.115461Z", "shell.execute_reply": "2023-11-12T12:41:21.115201Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def middle_test(x: int, y: int, z: int) -> None:\n", " m = middle(x, y, z)\n", " assert m == sorted([x, y, z])[1]" ] }, { "cell_type": "code", "execution_count": 127, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.116864Z", "iopub.status.busy": "2023-11-12T12:41:21.116767Z", "iopub.status.idle": "2023-11-12T12:41:21.118318Z", "shell.execute_reply": "2023-11-12T12:41:21.118085Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "middle_test(4, 5, 6)" ] }, { "cell_type": "code", "execution_count": 128, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.119708Z", "iopub.status.busy": "2023-11-12T12:41:21.119620Z", "iopub.status.idle": "2023-11-12T12:41:21.121371Z", "shell.execute_reply": "2023-11-12T12:41:21.121079Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from ExpectError import ExpectError" ] }, { "cell_type": "code", "execution_count": 129, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.122803Z", "iopub.status.busy": "2023-11-12T12:41:21.122712Z", "iopub.status.idle": "2023-11-12T12:41:21.124619Z", "shell.execute_reply": "2023-11-12T12:41:21.124366Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_26712/3661663124.py\", line 2, in \n", " middle_test(2, 1, 3)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_26712/40742806.py\", line 3, in middle_test\n", " assert m == sorted([x, y, z])[1]\n", "AssertionError (expected)\n" ] } ], "source": [ "with ExpectError():\n", " middle_test(2, 1, 3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The function `middle_passing_testcase()` searches and returns a triple `x`, `y`, `z` that causes `middle_test()` to pass." ] }, { "cell_type": "code", "execution_count": 130, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.126138Z", "iopub.status.busy": "2023-11-12T12:41:21.126047Z", "iopub.status.idle": "2023-11-12T12:41:21.128170Z", "shell.execute_reply": "2023-11-12T12:41:21.127927Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def middle_passing_testcase() -> Tuple[int, int, int]:\n", " while True:\n", " try:\n", " x, y, z = middle_testcase()\n", " middle_test(x, y, z)\n", " return x, y, z\n", " except AssertionError:\n", " pass" ] }, { "cell_type": "code", "execution_count": 131, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.129611Z", "iopub.status.busy": "2023-11-12T12:41:21.129478Z", "iopub.status.idle": "2023-11-12T12:41:21.131690Z", "shell.execute_reply": "2023-11-12T12:41:21.131394Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "middle(1, 6, 1) = 1\n" ] } ], "source": [ "(x, y, z) = middle_passing_testcase()\n", "m = middle(x, y, z)\n", "print(f\"middle({x}, {y}, {z}) = {m}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The function `middle_failing_testcase()` does the same; but its triple `x`, `y`, `z` causes `middle_test()` to fail." ] }, { "cell_type": "code", "execution_count": 132, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.133933Z", "iopub.status.busy": "2023-11-12T12:41:21.133646Z", "iopub.status.idle": "2023-11-12T12:41:21.136395Z", "shell.execute_reply": "2023-11-12T12:41:21.135871Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def middle_failing_testcase() -> Tuple[int, int, int]:\n", " while True:\n", " try:\n", " x, y, z = middle_testcase()\n", " middle_test(x, y, z)\n", " except AssertionError:\n", " return x, y, z" ] }, { "cell_type": "code", "execution_count": 133, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.137976Z", "iopub.status.busy": "2023-11-12T12:41:21.137861Z", "iopub.status.idle": "2023-11-12T12:41:21.139868Z", "shell.execute_reply": "2023-11-12T12:41:21.139611Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "middle(5, 2, 6) = 2\n" ] } ], "source": [ "(x, y, z) = middle_failing_testcase()\n", "m = middle(x, y, z)\n", "print(f\"middle({x}, {y}, {z}) = {m}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "With these, we can define two sets of test cases, each with 100 inputs." ] }, { "cell_type": "code", "execution_count": 134, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.141342Z", "iopub.status.busy": "2023-11-12T12:41:21.141224Z", "iopub.status.idle": "2023-11-12T12:41:21.142796Z", "shell.execute_reply": "2023-11-12T12:41:21.142531Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "MIDDLE_TESTS = 100" ] }, { "cell_type": "code", "execution_count": 135, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.144321Z", "iopub.status.busy": "2023-11-12T12:41:21.144201Z", "iopub.status.idle": "2023-11-12T12:41:21.146442Z", "shell.execute_reply": "2023-11-12T12:41:21.146106Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "MIDDLE_PASSING_TESTCASES = [middle_passing_testcase()\n", " for i in range(MIDDLE_TESTS)]" ] }, { "cell_type": "code", "execution_count": 136, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.148064Z", "iopub.status.busy": "2023-11-12T12:41:21.147948Z", "iopub.status.idle": "2023-11-12T12:41:21.150690Z", "shell.execute_reply": "2023-11-12T12:41:21.150420Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "MIDDLE_FAILING_TESTCASES = [middle_failing_testcase()\n", " for i in range(MIDDLE_TESTS)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us run the `OchiaiDebugger` with these two test sets." ] }, { "cell_type": "code", "execution_count": 137, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.152249Z", "iopub.status.busy": "2023-11-12T12:41:21.152123Z", "iopub.status.idle": "2023-11-12T12:41:21.157013Z", "shell.execute_reply": "2023-11-12T12:41:21.156735Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "ochiai_middle = OchiaiDebugger()\n", "\n", "for x, y, z in MIDDLE_PASSING_TESTCASES:\n", " with ochiai_middle.collect_pass():\n", " middle(x, y, z)\n", "\n", "for x, y, z in MIDDLE_FAILING_TESTCASES:\n", " with ochiai_middle.collect_fail():\n", " middle(x, y, z)" ] }, { "cell_type": "code", "execution_count": 138, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.158507Z", "iopub.status.busy": "2023-11-12T12:41:21.158421Z", "iopub.status.idle": "2023-11-12T12:41:21.176241Z", "shell.execute_reply": "2023-11-12T12:41:21.175953Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
   1 def middle(x, y, z):  # type: ignore
\n", "
   2     if y < z:
\n", "
   3         if x < y:
\n", "
   4             return y
\n", "
   5         elif x < z:
\n", "
   6             return y
\n", "
   7     else:\n",
       "
\n", "
   8         if x > y:
\n", "
   9             return y
\n", "
  10         elif x > z:
\n", "
  11             return x
\n", "
  12     return z
\n" ], "text/markdown": [ "| `middle` | `x=7, y=5, z=4` | `x=6, y=6, z=2` | `x=7, y=6, z=4` | `x=0, y=7, z=7` | `x=3, y=9, z=7` | `x=2, y=3, z=8` | `x=2, y=6, z=3` | `x=2, y=8, z=4` | `x=8, y=7, z=5` | `x=0, y=0, z=6` | `x=5, y=2, z=5` | `x=9, y=6, z=9` | `x=8, y=5, z=5` | `x=1, y=7, z=0` | `x=0, y=4, z=4` | `x=8, y=7, z=5` | `x=7, y=1, z=5` | `x=3, y=5, z=3` | `x=6, y=3, z=2` | `x=3, y=9, z=8` | `x=1, y=3, z=8` | `x=5, y=9, z=1` | `x=5, y=5, z=7` | `x=9, y=0, z=3` | `x=5, y=6, z=3` | `x=7, y=9, z=8` | `x=2, y=2, z=3` | `x=1, y=8, z=1` | `x=3, y=3, z=3` | `x=7, y=3, z=6` | `x=3, y=7, z=6` | `x=2, y=3, z=1` | `x=8, y=0, z=6` | `x=5, y=4, z=0` | `x=5, y=9, z=2` | `x=6, y=7, z=1` | `x=1, y=1, z=7` | `x=1, y=6, z=8` | `x=0, y=9, z=1` | `x=5, y=9, z=7` | `x=1, y=5, z=6` | `x=1, y=6, z=1` | `x=7, y=8, z=4` | `x=2, y=8, z=4` | `x=3, y=4, z=3` | `x=8, y=4, z=5` | `x=0, y=5, z=8` | `x=9, y=7, z=0` | `x=1, y=5, z=4` | `x=9, y=8, z=9` | `x=9, y=2, z=4` | `x=3, y=2, z=0` | `x=8, y=5, z=2` | `x=4, y=9, z=7` | `x=4, y=8, z=0` | `x=5, y=9, z=1` | `x=7, y=6, z=4` | `x=8, y=9, z=2` | `x=9, y=1, z=4` | `x=9, y=7, z=2` | `x=1, y=9, z=5` | `x=1, y=9, z=1` | `x=7, y=8, z=0` | `x=4, y=8, z=6` | `x=9, y=2, z=1` | `x=9, y=4, z=5` | `x=0, y=4, z=2` | `x=6, y=8, z=4` | `x=0, y=0, z=1` | `x=6, y=4, z=3` | `x=8, y=0, z=3` | `x=9, y=5, z=1` | `x=7, y=6, z=6` | `x=4, y=6, z=0` | `x=1, y=1, z=0` | `x=8, y=4, z=1` | `x=1, y=6, z=2` | `x=4, y=5, z=1` | `x=1, y=1, z=6` | `x=1, y=2, z=1` | `x=2, y=4, z=2` | `x=7, y=3, z=3` | `x=2, y=7, z=4` | `x=0, y=8, z=0` | `x=4, y=6, z=4` | `x=1, y=5, z=2` | `x=0, y=8, z=2` | `x=3, y=5, z=6` | `x=7, y=2, z=4` | `x=5, y=4, z=5` | `x=2, y=3, z=4` | `x=2, y=2, z=4` | `x=3, y=6, z=2` | `x=9, y=4, z=0` | `x=7, y=7, z=0` | `x=4, y=9, z=8` | `x=3, y=8, z=4` | `x=3, y=3, z=5` | `x=3, y=2, z=2` | `x=8, y=8, z=1` | `x=2, y=1, z=5` | `x=5, y=2, z=9` | `x=2, y=1, z=3` | `x=3, y=0, z=7` | `x=4, y=1, z=6` | `x=2, y=0, z=9` | `x=4, y=3, z=7` | `x=5, y=3, z=8` | `x=6, y=0, z=9` | `x=2, y=0, z=4` | `x=4, y=0, z=5` | `x=6, y=5, z=9` | `x=5, y=3, z=9` | `x=7, y=0, z=9` | `x=5, y=3, z=9` | `x=2, y=1, z=6` | `x=3, y=2, z=4` | `x=2, y=1, z=5` | `x=3, y=2, z=6` | `x=6, y=4, z=7` | `x=5, y=2, z=6` | `x=4, y=2, z=5` | `x=6, y=2, z=9` | `x=4, y=2, z=8` | `x=2, y=0, z=3` | `x=2, y=1, z=5` | `x=2, y=1, z=9` | `x=7, y=6, z=9` | `x=6, y=0, z=9` | `x=7, y=6, z=9` | `x=8, y=6, z=9` | `x=7, y=6, z=9` | `x=1, y=0, z=2` | `x=1, y=0, z=3` | `x=3, y=2, z=7` | `x=6, y=5, z=9` | `x=1, y=0, z=3` | `x=6, y=4, z=7` | `x=8, y=7, z=9` | `x=2, y=0, z=4` | `x=5, y=1, z=7` | `x=6, y=4, z=9` | `x=8, y=0, z=9` | `x=2, y=1, z=4` | `x=4, y=1, z=8` | `x=3, y=0, z=4` | `x=3, y=1, z=5` | `x=7, y=5, z=8` | `x=6, y=2, z=9` | `x=2, y=0, z=4` | `x=7, y=6, z=9` | `x=2, y=1, z=6` | `x=3, y=0, z=8` | `x=5, y=3, z=8` | `x=5, y=0, z=9` | `x=4, y=0, z=9` | `x=7, y=5, z=8` | `x=4, y=2, z=7` | `x=3, y=1, z=6` | `x=2, y=0, z=4` | `x=4, y=0, z=8` | `x=2, y=1, z=8` | `x=5, y=3, z=9` | `x=4, y=3, z=5` | `x=6, y=4, z=9` | `x=7, y=6, z=8` | `x=3, y=1, z=9` | `x=4, y=2, z=7` | `x=1, y=0, z=6` | `x=1, y=0, z=5` | `x=7, y=0, z=9` | `x=4, y=0, z=7` | `x=3, y=2, z=9` | `x=4, y=1, z=6` | `x=5, y=2, z=9` | `x=8, y=0, z=9` | `x=2, y=1, z=9` | `x=4, y=3, z=6` | `x=6, y=0, z=9` | `x=3, y=1, z=4` | `x=7, y=6, z=8` | `x=6, y=1, z=7` | `x=6, y=2, z=7` | `x=3, y=0, z=9` | `x=5, y=4, z=9` | `x=8, y=6, z=9` | `x=5, y=3, z=6` | `x=2, y=1, z=7` | `x=3, y=0, z=5` | `x=6, y=1, z=8` | `x=2, y=0, z=5` | `x=2, y=0, z=9` | `x=6, y=5, z=8` | `x=2, y=0, z=4` | `x=5, y=0, z=9` | `x=1, y=0, z=9` | `x=3, y=1, z=6` | `x=3, y=1, z=5` | `x=5, y=1, z=9` | `x=6, y=2, z=7` | \n", "| --------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | \n", "| middle:1 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | \n", "| middle:2 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | \n", "| middle:3 | - | - | - | - | - | X | - | - | - | X | X | X | - | - | - | - | X | - | - | - | X | - | X | X | - | - | X | - | - | X | - | - | X | - | - | - | X | X | - | - | X | - | - | - | - | X | X | - | - | X | X | - | - | - | - | - | - | - | X | - | - | - | - | - | - | X | - | - | X | - | X | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | X | X | X | X | X | - | - | - | - | - | X | - | - | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | \n", "| middle:4 | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | X | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | \n", "| middle:5 | - | - | - | - | - | - | - | - | - | X | X | X | - | - | - | - | X | - | - | - | - | - | X | X | - | - | X | - | - | X | - | - | X | - | - | - | X | - | - | - | - | - | - | - | - | X | - | - | - | X | X | - | - | - | - | - | - | - | X | - | - | - | - | - | - | X | - | - | X | - | X | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | X | X | - | X | - | - | - | - | - | X | - | - | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | \n", "| middle:6 | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | X | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | X | - | - | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | \n", "| middle:8 | X | X | X | X | X | - | X | X | X | - | - | - | X | X | X | X | - | X | X | X | - | X | - | - | X | X | - | X | X | - | X | X | - | X | X | X | - | - | X | X | - | X | X | X | X | - | - | X | X | - | - | X | X | X | X | X | X | X | - | X | X | X | X | X | X | - | X | X | - | X | - | X | X | X | X | X | X | X | - | X | X | X | X | X | X | X | X | - | - | - | - | - | X | X | X | X | X | - | X | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | \n", "| middle:9 | X | - | X | - | - | - | - | - | X | - | - | - | X | - | - | X | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | X | X | - | - | - | X | - | - | X | - | - | - | - | X | - | - | - | - | X | - | X | X | - | - | X | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | \n", "| middle:10 | - | X | - | X | X | - | X | X | - | - | - | - | - | X | X | - | - | X | - | X | - | X | - | - | X | X | - | X | X | - | X | X | - | - | X | X | - | - | X | X | - | X | X | X | X | - | - | - | X | - | - | - | - | X | X | X | - | X | - | - | X | X | X | X | - | - | X | X | - | - | - | - | - | X | X | - | X | X | - | X | X | - | X | X | X | X | X | - | - | - | - | - | X | - | X | X | X | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | \n", "| middle:11 | - | X | - | - | - | - | - | - | - | - | - | - | - | X | - | - | - | - | - | - | - | X | - | - | X | - | - | - | - | - | - | X | - | - | X | X | - | - | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | X | X | - | X | - | - | - | - | X | - | - | - | - | X | - | - | - | - | - | X | X | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | X | - | X | - | - | - | - | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | \n", "| middle:12 | - | - | - | X | X | - | X | X | - | - | X | X | - | - | X | - | X | X | - | X | - | - | - | X | - | X | - | X | X | X | X | - | X | - | - | - | - | - | X | X | - | X | - | X | X | X | - | - | X | X | X | - | - | X | - | - | - | - | X | - | X | X | - | X | - | X | X | - | - | - | X | - | - | - | - | - | X | - | - | X | X | - | X | X | X | X | X | - | X | X | - | - | - | - | - | X | X | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | \n" ], "text/plain": [ "[('middle', 6), ('middle', 5), ('middle', 3), ('middle', 2), ('middle', 1), ('middle', 12), ('middle', 9), ('middle', 8), ('middle', 11), ('middle', 4), ('middle', 10)]" ] }, "execution_count": 138, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ochiai_middle" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We see that the \"culprit\" line is still the most likely to be fixed, but the two conditions leading to the error (`x < y` and `x < z`) are also listed as potentially faulty. That is because the error might also be fixed be changing these conditions – although this would result in a more complex fix." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Other Events besides Coverage\n", "\n", "We close this chapter with two directions for further thought. If you wondered why in the above code, we were mostly talking about `events` rather than lines covered, that is because our framework allows for tracking arbitrary events, not just coverage. In fact, any data item a collector can extract from the execution can be used for correlation analysis. (It may not be so easily visualized, though.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here's an example. We define a `ValueCollector` class that collects pairs of (local) variables and their values during execution. Its `events()` method then returns the set of all these pairs." ] }, { "cell_type": "code", "execution_count": 139, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.177843Z", "iopub.status.busy": "2023-11-12T12:41:21.177716Z", "iopub.status.idle": "2023-11-12T12:41:21.180243Z", "shell.execute_reply": "2023-11-12T12:41:21.179972Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class ValueCollector(Collector):\n", " \"\"\"\"A class to collect local variables and their values.\"\"\"\n", "\n", " def __init__(self) -> None:\n", " \"\"\"Constructor.\"\"\"\n", " super().__init__()\n", " self.vars: Set[str] = set()\n", "\n", " def collect(self, frame: FrameType, event: str, arg: Any) -> None:\n", " local_vars = frame.f_locals\n", " for var in local_vars:\n", " value = local_vars[var]\n", " self.vars.add(f\"{var} = {repr(value)}\")\n", "\n", " def events(self) -> Set[str]:\n", " \"\"\"A set of (variable, value) pairs observed\"\"\"\n", " return self.vars" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If we apply this collector on our set of HTML test cases, these are all the events that we obtain – essentially all variables and all values ever seen:" ] }, { "cell_type": "code", "execution_count": 140, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.181980Z", "iopub.status.busy": "2023-11-12T12:41:21.181822Z", "iopub.status.idle": "2023-11-12T12:41:21.184632Z", "shell.execute_reply": "2023-11-12T12:41:21.184211Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "out = 'ab'\n", "s = 'abc'\n", "quote = True\n", "out = 'abc'\n", "s = '\"abc\"'\n", "out = 'a'\n", "tag = False\n", "out = ''\n", "c = 'b'\n", "c = '>'\n", "c = 'a'\n", "tag = True\n", "c = '\"'\n", "s = 'abc'\n", "c = 'c'\n", "c = '/'\n", "quote = False\n", "c = '<'\n" ] } ], "source": [ "debugger = test_debugger_html(ContinuousSpectrumDebugger(ValueCollector))\n", "for event in debugger.all_events():\n", " print(event)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "However, some of these events only occur in the failing run:" ] }, { "cell_type": "code", "execution_count": 141, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.186930Z", "iopub.status.busy": "2023-11-12T12:41:21.186767Z", "iopub.status.idle": "2023-11-12T12:41:21.188700Z", "shell.execute_reply": "2023-11-12T12:41:21.188418Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "s = '\"abc\"'\n", "c = '\"'\n", "quote = True\n" ] } ], "source": [ "for event in debugger.only_fail_events():\n", " print(event)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Some of these differences are spurious – the string `\"abc\"` (with quotes) only occurs in the failing run – but others, such as `quote` being True and `c` containing a single quote are actually relevant for explaining when the failure comes to be." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can even visualize the suspiciousness of the individual events, setting the (so far undiscussed) `color` flag for producing an event table:" ] }, { "cell_type": "code", "execution_count": 142, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.190269Z", "iopub.status.busy": "2023-11-12T12:41:21.190113Z", "iopub.status.idle": "2023-11-12T12:41:21.192523Z", "shell.execute_reply": "2023-11-12T12:41:21.192289Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| ---------------- | ---- | ---- | ---- | \n", "| c = '"' | - | - | X | \n", "| c = '/' | - | X | - | \n", "| c = '<' | - | X | - | \n", "| c = '>' | - | X | - | \n", "| c = 'a' | X | X | X | \n", "| c = 'b' | X | X | X | \n", "| c = 'c' | X | X | X | \n", "| out = '' | X | X | X | \n", "| out = 'a' | X | X | X | \n", "| out = 'ab' | X | X | X | \n", "| out = 'abc' | X | X | X | \n", "| quote = False | X | X | X | \n", "| quote = True | - | - | X | \n", "| s = '"abc"' | - | - | X | \n", "| s = '<b>abc</b>' | - | X | - | \n", "| s = 'abc' | X | - | - | \n", "| tag = False | X | X | X | \n", "| tag = True | - | X | - | \n" ], "text/plain": [ "" ] }, "execution_count": 142, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.event_table(color=True, args=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "There are many ways one can continue from here.\n", "\n", "* Rather than checking for concrete values, one could check for more _abstract properties_, for instance – what is the sign of the value? What is the length of the string? \n", "* One could check for specifics of the _control flow_ – is the loop taken? How many times?\n", "* One could check for specifics of the _information flow_ – which values flow from one variable to another?\n", "\n", "There are lots of properties that all could be related to failures – and if we happen to check for the right one, we may obtain a much crisper definition of what causes the failure. We will come up with more ideas on properties to check as it comes to [mining specifications](SpecificationMining,ipynb)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Training Classifiers\n", "\n", "The metrics we have discussed so far are pretty _generic_ – that is, they are fixed no matter how the actual event space is structured. The field of _machine learning_ has come up with techniques that learn _classifiers_ from a given set of data – classifiers that are trained from labeled data and then can predict labels for new data sets. In our case, the labels are test outcomes (PASS and FAIL), whereas the data would be features of the events observed." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "A classifier by itself is not immediately useful for debugging (although it could predict whether future inputs will fail or not). Some classifiers, however, have great _diagnostic_ quality; that is, they can _explain_ how their classification comes to be. [Decision trees](https://scikit-learn.org/stable/modules/tree.html) fall into this very category." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A decision tree contains a number of _nodes_, each one associated with a predicate. Depending on whether the predicate is true or false, we follow the given \"true\" or \"false\" branch to end up in the next node, which again contains a predicate. Eventually, we end up in the outcome predicted by the tree. The neat thing is that the node predicates actually give important hints on the circumstances that are _most relevant_ for deciding the outcome." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Let us illustrate this with an example. We build a class `ClassifyingDebugger` that trains a decision tree from the events collected. To this end, we need to set up our input data such that it can be fed into a classifier." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We start with identifying our _samples_ (runs) and the respective _labels_ (outcomes). All values have to be encoded into numerical values." ] }, { "cell_type": "code", "execution_count": 143, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.194317Z", "iopub.status.busy": "2023-11-12T12:41:21.194178Z", "iopub.status.idle": "2023-11-12T12:41:21.196630Z", "shell.execute_reply": "2023-11-12T12:41:21.196336Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class ClassifyingDebugger(DifferenceDebugger):\n", " \"\"\"A debugger implementing a decision tree for events\"\"\"\n", "\n", " PASS_VALUE = +1.0\n", " FAIL_VALUE = -1.0\n", "\n", " def samples(self) -> Dict[str, float]:\n", " samples = {}\n", " for collector in self.pass_collectors():\n", " samples[collector.id()] = self.PASS_VALUE\n", " for collector in debugger.fail_collectors():\n", " samples[collector.id()] = self.FAIL_VALUE\n", " return samples" ] }, { "cell_type": "code", "execution_count": 144, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.198176Z", "iopub.status.busy": "2023-11-12T12:41:21.198060Z", "iopub.status.idle": "2023-11-12T12:41:21.200488Z", "shell.execute_reply": "2023-11-12T12:41:21.200239Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{\"remove_html_markup(s='abc')\": 1.0,\n", " \"remove_html_markup(s='abc')\": 1.0,\n", " 'remove_html_markup(s=\\'\"abc\"\\')': -1.0}" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "debugger.samples()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Next, we identify the _features_, which in our case is the set of lines executed in each sample:" ] }, { "cell_type": "code", "execution_count": 145, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.202042Z", "iopub.status.busy": "2023-11-12T12:41:21.201917Z", "iopub.status.idle": "2023-11-12T12:41:21.204374Z", "shell.execute_reply": "2023-11-12T12:41:21.204080Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def features(self) -> Dict[str, Any]:\n", " features = {}\n", " for collector in debugger.pass_collectors():\n", " features[collector.id()] = collector.events()\n", " for collector in debugger.fail_collectors():\n", " features[collector.id()] = collector.events()\n", " return features" ] }, { "cell_type": "code", "execution_count": 146, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.205804Z", "iopub.status.busy": "2023-11-12T12:41:21.205700Z", "iopub.status.idle": "2023-11-12T12:41:21.208907Z", "shell.execute_reply": "2023-11-12T12:41:21.208567Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "{\"remove_html_markup(s='abc')\": {('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)},\n", " \"remove_html_markup(s='abc')\": {('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 8),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)},\n", " 'remove_html_markup(s=\\'\"abc\"\\')': {('remove_html_markup', 1),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 12),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 16)}}" ] }, "execution_count": 146, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "debugger.features()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "All our features have names, which must be strings." ] }, { "cell_type": "code", "execution_count": 147, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.210338Z", "iopub.status.busy": "2023-11-12T12:41:21.210232Z", "iopub.status.idle": "2023-11-12T12:41:21.212334Z", "shell.execute_reply": "2023-11-12T12:41:21.211956Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def feature_names(self) -> List[str]:\n", " return [repr(feature) for feature in self.all_events()]" ] }, { "cell_type": "code", "execution_count": 148, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.214056Z", "iopub.status.busy": "2023-11-12T12:41:21.213958Z", "iopub.status.idle": "2023-11-12T12:41:21.216648Z", "shell.execute_reply": "2023-11-12T12:41:21.216374Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[\"('remove_html_markup', 11)\",\n", " \"('remove_html_markup', 14)\",\n", " \"('remove_html_markup', 3)\",\n", " \"('remove_html_markup', 6)\",\n", " \"('remove_html_markup', 12)\",\n", " \"('remove_html_markup', 9)\",\n", " \"('remove_html_markup', 1)\",\n", " \"('remove_html_markup', 7)\",\n", " \"('remove_html_markup', 4)\",\n", " \"('remove_html_markup', 10)\",\n", " \"('remove_html_markup', 13)\",\n", " \"('remove_html_markup', 16)\",\n", " \"('remove_html_markup', 2)\",\n", " \"('remove_html_markup', 8)\"]" ] }, "execution_count": 148, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "debugger.feature_names()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Next, we define the _shape_ for an individual sample, which is a value of +1 or -1 for each feature seen (i.e., +1 if the line was covered, -1 if not)." ] }, { "cell_type": "code", "execution_count": 149, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.218227Z", "iopub.status.busy": "2023-11-12T12:41:21.218115Z", "iopub.status.idle": "2023-11-12T12:41:21.220319Z", "shell.execute_reply": "2023-11-12T12:41:21.220026Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def shape(self, sample: str) -> List[float]:\n", " x = []\n", " features = self.features()\n", " for f in self.all_events():\n", " if f in features[sample]:\n", " x += [+1.0]\n", " else:\n", " x += [-1.0]\n", " return x" ] }, { "cell_type": "code", "execution_count": 150, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.221924Z", "iopub.status.busy": "2023-11-12T12:41:21.221802Z", "iopub.status.idle": "2023-11-12T12:41:21.224510Z", "shell.execute_reply": "2023-11-12T12:41:21.224158Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, -1.0]" ] }, "execution_count": 150, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "debugger.shape(\"remove_html_markup(s='abc')\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our input X for the classifier now is a list of such shapes, one for each sample." ] }, { "cell_type": "code", "execution_count": 151, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.226106Z", "iopub.status.busy": "2023-11-12T12:41:21.225986Z", "iopub.status.idle": "2023-11-12T12:41:21.228042Z", "shell.execute_reply": "2023-11-12T12:41:21.227777Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def X(self) -> List[List[float]]:\n", " X = []\n", " samples = self.samples()\n", " for key in samples:\n", " X += [self.shape(key)]\n", " return X" ] }, { "cell_type": "code", "execution_count": 152, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.229432Z", "iopub.status.busy": "2023-11-12T12:41:21.229330Z", "iopub.status.idle": "2023-11-12T12:41:21.232040Z", "shell.execute_reply": "2023-11-12T12:41:21.231665Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[[1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, -1.0],\n", " [1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],\n", " [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, -1.0]]" ] }, "execution_count": 152, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "debugger.X()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Our input Y for the classifier, in contrast, is the list of labels, again indexed by sample." ] }, { "cell_type": "code", "execution_count": 153, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.233702Z", "iopub.status.busy": "2023-11-12T12:41:21.233588Z", "iopub.status.idle": "2023-11-12T12:41:21.235517Z", "shell.execute_reply": "2023-11-12T12:41:21.235260Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def Y(self) -> List[float]:\n", " Y = []\n", " samples = self.samples()\n", " for key in samples:\n", " Y += [samples[key]]\n", " return Y" ] }, { "cell_type": "code", "execution_count": 154, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.236908Z", "iopub.status.busy": "2023-11-12T12:41:21.236799Z", "iopub.status.idle": "2023-11-12T12:41:21.239262Z", "shell.execute_reply": "2023-11-12T12:41:21.238994Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[1.0, 1.0, -1.0]" ] }, "execution_count": 154, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "debugger.Y()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We now have all our data ready to be fit into a tree classifier. The method `classifier()` creates and returns the (tree) classifier for the observed runs." ] }, { "cell_type": "code", "execution_count": 155, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:21.240714Z", "iopub.status.busy": "2023-11-12T12:41:21.240606Z", "iopub.status.idle": "2023-11-12T12:41:24.181116Z", "shell.execute_reply": "2023-11-12T12:41:24.180802Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from sklearn.tree import DecisionTreeClassifier, export_text, export_graphviz" ] }, { "cell_type": "code", "execution_count": 156, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.183177Z", "iopub.status.busy": "2023-11-12T12:41:24.182880Z", "iopub.status.idle": "2023-11-12T12:41:24.185118Z", "shell.execute_reply": "2023-11-12T12:41:24.184871Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def classifier(self) -> DecisionTreeClassifier:\n", " classifier = DecisionTreeClassifier()\n", " classifier = classifier.fit(self.X(), self.Y())\n", " return classifier" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We define a special method to show classifiers:" ] }, { "cell_type": "code", "execution_count": 157, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.186649Z", "iopub.status.busy": "2023-11-12T12:41:24.186532Z", "iopub.status.idle": "2023-11-12T12:41:24.188148Z", "shell.execute_reply": "2023-11-12T12:41:24.187881Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import graphviz" ] }, { "cell_type": "code", "execution_count": 158, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.189631Z", "iopub.status.busy": "2023-11-12T12:41:24.189519Z", "iopub.status.idle": "2023-11-12T12:41:24.191858Z", "shell.execute_reply": "2023-11-12T12:41:24.191605Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class ClassifyingDebugger(ClassifyingDebugger):\n", " def show_classifier(self, classifier: DecisionTreeClassifier) -> Any:\n", " dot_data = export_graphviz(classifier, out_file=None, \n", " filled=False, rounded=True,\n", " feature_names=self.feature_names(),\n", " class_names=[\"FAIL\", \"PASS\"],\n", " label='none',\n", " node_ids=False,\n", " impurity=False,\n", " proportion=True,\n", " special_characters=True)\n", "\n", " return graphviz.Source(dot_data)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This is the tree we get for our `remove_html_markup()` tests. The top predicate is whether the \"culprit\" line was executed (-1 means no, +1 means yes). If not (-1), the outcome is PASS. Otherwise, the outcome is TRUE." ] }, { "cell_type": "code", "execution_count": 159, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.193384Z", "iopub.status.busy": "2023-11-12T12:41:24.193267Z", "iopub.status.idle": "2023-11-12T12:41:24.602029Z", "shell.execute_reply": "2023-11-12T12:41:24.601614Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "Tree\n", "\n", "\n", "\n", "0\n", "\n", "('remove_html_markup', 12) ≤ 0.0\n", "100.0%\n", "[0.333, 0.667]\n", "PASS\n", "\n", "\n", "\n", "1\n", "\n", "66.7%\n", "[0.0, 1.0]\n", "PASS\n", "\n", "\n", "\n", "0->1\n", "\n", "\n", "True\n", "\n", "\n", "\n", "2\n", "\n", "33.3%\n", "[1.0, 0.0]\n", "FAIL\n", "\n", "\n", "\n", "0->2\n", "\n", "\n", "False\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 159, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger = test_debugger_html(ClassifyingDebugger())\n", "classifier = debugger.classifier()\n", "debugger.show_classifier(classifier)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can even use our classifier to predict the outcome of additional runs. If, for instance, we execute all lines except for, say, Line 7, 9, and 11, our tree classifier would predict failure – because the \"culprit\" line 12 is executed." ] }, { "cell_type": "code", "execution_count": 160, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.604424Z", "iopub.status.busy": "2023-11-12T12:41:24.604278Z", "iopub.status.idle": "2023-11-12T12:41:24.607785Z", "shell.execute_reply": "2023-11-12T12:41:24.607408Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([-1.])" ] }, "execution_count": 160, "metadata": {}, "output_type": "execute_result" } ], "source": [ "classifier.predict([[1, 1, 1, 1, 1, 1, -1, 1, -1, 1, -1, 1, 1, 1]])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Again, there are many ways to continue from here. Which events should we train the classifier from? How do classifiers compare in their performance and diagnostic quality? There are lots of possibilities left to explore, and we only begin to realize the potential for automated debugging." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Synopsis\n", "\n", "This chapter introduces classes and techniques for _statistical debugging_ – that is, correlating specific events, such as lines covered, with passing and failing outcomes.\n", "\n", "To make use of the code in this chapter, use one of the provided `StatisticalDebugger` subclasses such as `TarantulaDebugger` or `OchiaiDebugger`. \n", "\n", "Both are instantiated with a `Collector` denoting the type of events you want to correlate outcomes with. The default `CoverageCollector`, collecting line coverage." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Collecting Events from Calls\n", "\n", "To collect events from calls that are labeled manually, use" ] }, { "cell_type": "code", "execution_count": 161, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.609564Z", "iopub.status.busy": "2023-11-12T12:41:24.609401Z", "iopub.status.idle": "2023-11-12T12:41:24.612325Z", "shell.execute_reply": "2023-11-12T12:41:24.611909Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = TarantulaDebugger()\n", "with debugger.collect_pass():\n", " remove_html_markup(\"abc\")\n", "with debugger.collect_pass():\n", " remove_html_markup('abc')\n", "with debugger.collect_fail():\n", " remove_html_markup('\"abc\"')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Within each `with` block, the _first function call_ is collected and tracked for coverage. (Note that _only_ the first call is tracked.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Collecting Events from Tests\n", "\n", "To collect events from _tests_ that use exceptions to indicate failure, use the simpler `with` form:" ] }, { "cell_type": "code", "execution_count": 162, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.614212Z", "iopub.status.busy": "2023-11-12T12:41:24.614080Z", "iopub.status.idle": "2023-11-12T12:41:24.616373Z", "shell.execute_reply": "2023-11-12T12:41:24.616093Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "debugger = TarantulaDebugger()\n", "with debugger:\n", " remove_html_markup(\"abc\")\n", "with debugger:\n", " remove_html_markup('abc')\n", "with debugger:\n", " remove_html_markup('\"abc\"')\n", " assert False # raise an exception" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`with` blocks that raise an exception will be classified as failing, blocks that do not will be classified as passing. Note that exceptions raised are \"swallowed\" by the debugger." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Visualizing Events as a Table\n", "\n", "After collecting events, you can print out the observed events – in this case, line numbers – in a table, showing in which runs they occurred (`X`), and with colors highlighting the suspiciousness of the event. A \"red\" event means that the event predominantly occurs in failing runs." ] }, { "cell_type": "code", "execution_count": 163, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.617916Z", "iopub.status.busy": "2023-11-12T12:41:24.617807Z", "iopub.status.idle": "2023-11-12T12:41:24.620623Z", "shell.execute_reply": "2023-11-12T12:41:24.620249Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ "" ] }, "execution_count": 163, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.event_table(args=True, color=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Visualizing Suspicious Code\n", "\n", "If you collected coverage with `CoverageCollector`, you can also visualize the code with similar colors, highlighting suspicious lines:" ] }, { "cell_type": "code", "execution_count": 164, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.622235Z", "iopub.status.busy": "2023-11-12T12:41:24.622102Z", "iopub.status.idle": "2023-11-12T12:41:24.625844Z", "shell.execute_reply": "2023-11-12T12:41:24.625386Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
   1 def remove_html_markup(s):  # type: ignore
\n", "
   2     tag = False
\n", "
   3     quote = False
\n", "
   4     out = ""
\n", "
   5  
\n", "
   6     for c in s:
\n", "
   7         if c == '<' and not quote:
\n", "
   8             tag = True
\n", "
   9         elif c == '>' and not quote:
\n", "
  10             tag = False
\n", "
  11         elif c == '"' or c == "'" and tag:
\n", "
  12             quote = not quote
\n", "
  13         elif not tag:
\n", "
  14             out = out + c
\n", "
  15  
\n", "
  16     return out
\n" ], "text/markdown": [ "| `remove_html_markup` | `s='abc'` | `s='abc'` | `s='\"abc\"'` | \n", "| --------------------- | ---- | ---- | ---- | \n", "| remove_html_markup:1 | X | X | X | \n", "| remove_html_markup:2 | X | X | X | \n", "| remove_html_markup:3 | X | X | X | \n", "| remove_html_markup:4 | X | X | X | \n", "| remove_html_markup:6 | X | X | X | \n", "| remove_html_markup:7 | X | X | X | \n", "| remove_html_markup:8 | - | X | - | \n", "| remove_html_markup:9 | X | X | X | \n", "| remove_html_markup:10 | - | X | - | \n", "| remove_html_markup:11 | X | X | X | \n", "| remove_html_markup:12 | - | - | X | \n", "| remove_html_markup:13 | X | X | X | \n", "| remove_html_markup:14 | X | X | X | \n", "| remove_html_markup:16 | X | X | X | \n" ], "text/plain": [ "[('remove_html_markup', 12), ('remove_html_markup', 11), ('remove_html_markup', 14), ('remove_html_markup', 3), ('remove_html_markup', 6), ('remove_html_markup', 9), ('remove_html_markup', 1), ('remove_html_markup', 7), ('remove_html_markup', 4), ('remove_html_markup', 13), ('remove_html_markup', 16), ('remove_html_markup', 2), ('remove_html_markup', 10), ('remove_html_markup', 8)]" ] }, "execution_count": 164, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Ranking Events\n", "\n", "The method `rank()` returns a ranked list of events, starting with the most suspicious. This is useful for automated techniques that need potential defect locations." ] }, { "cell_type": "code", "execution_count": 165, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.627617Z", "iopub.status.busy": "2023-11-12T12:41:24.627498Z", "iopub.status.idle": "2023-11-12T12:41:24.630288Z", "shell.execute_reply": "2023-11-12T12:41:24.629993Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[('remove_html_markup', 12),\n", " ('remove_html_markup', 11),\n", " ('remove_html_markup', 14),\n", " ('remove_html_markup', 3),\n", " ('remove_html_markup', 6),\n", " ('remove_html_markup', 9),\n", " ('remove_html_markup', 1),\n", " ('remove_html_markup', 7),\n", " ('remove_html_markup', 4),\n", " ('remove_html_markup', 13),\n", " ('remove_html_markup', 16),\n", " ('remove_html_markup', 2),\n", " ('remove_html_markup', 10),\n", " ('remove_html_markup', 8)]" ] }, "execution_count": 165, "metadata": {}, "output_type": "execute_result" } ], "source": [ "debugger.rank()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Classes and Methods\n", "\n", "Here are all classes defined in this chapter:" ] }, { "cell_type": "code", "execution_count": 166, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.631810Z", "iopub.status.busy": "2023-11-12T12:41:24.631691Z", "iopub.status.idle": "2023-11-12T12:41:24.633377Z", "shell.execute_reply": "2023-11-12T12:41:24.633126Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# ignore\n", "from ClassDiagram import display_class_hierarchy" ] }, { "cell_type": "code", "execution_count": 167, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:24.634700Z", "iopub.status.busy": "2023-11-12T12:41:24.634612Z", "iopub.status.idle": "2023-11-12T12:41:25.088950Z", "shell.execute_reply": "2023-11-12T12:41:25.088505Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "TarantulaDebugger\n", "\n", "\n", "TarantulaDebugger\n", "\n", "\n", "\n", "\n", "\n", "ContinuousSpectrumDebugger\n", "\n", "\n", "ContinuousSpectrumDebugger\n", "\n", "\n", "\n", "brightness()\n", "\n", "\n", "\n", "collectors_with_event()\n", "\n", "\n", "\n", "collectors_without_event()\n", "\n", "\n", "\n", "color()\n", "\n", "\n", "\n", "event_fraction()\n", "\n", "\n", "\n", "failed_fraction()\n", "\n", "\n", "\n", "hue()\n", "\n", "\n", "\n", "passed_fraction()\n", "\n", "\n", "\n", "suspiciousness()\n", "\n", "\n", "\n", "tooltip()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "TarantulaDebugger->ContinuousSpectrumDebugger\n", "\n", "\n", "\n", "\n", "\n", "RankingDebugger\n", "\n", "\n", "RankingDebugger\n", "\n", "\n", "\n", "rank()\n", "\n", "\n", "\n", "__repr__()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "TarantulaDebugger->RankingDebugger\n", "\n", "\n", "\n", "\n", "\n", "DiscreteSpectrumDebugger\n", "\n", "\n", "DiscreteSpectrumDebugger\n", "\n", "\n", "\n", "color()\n", "\n", "\n", "\n", "suspiciousness()\n", "\n", "\n", "\n", "tooltip()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "ContinuousSpectrumDebugger->DiscreteSpectrumDebugger\n", "\n", "\n", "\n", "\n", "\n", "SpectrumDebugger\n", "\n", "\n", "SpectrumDebugger\n", "\n", "\n", "\n", "__repr__()\n", "\n", "\n", "\n", "__str__()\n", "\n", "\n", "\n", "_repr_html_()\n", "\n", "\n", "\n", "code()\n", "\n", "\n", "\n", "percentage()\n", "\n", "\n", "\n", "suspiciousness()\n", "\n", "\n", "\n", "tooltip()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "DiscreteSpectrumDebugger->SpectrumDebugger\n", "\n", "\n", "\n", "\n", "\n", "DifferenceDebugger\n", "\n", "\n", "DifferenceDebugger\n", "\n", "\n", "\n", "FAIL\n", "\n", "\n", "\n", "PASS\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "__enter__()\n", "\n", "\n", "\n", "__exit__()\n", "\n", "\n", "\n", "all_fail_events()\n", "\n", "\n", "\n", "all_pass_events()\n", "\n", "\n", "\n", "collect_fail()\n", "\n", "\n", "\n", "collect_pass()\n", "\n", "\n", "\n", "only_fail_events()\n", "\n", "\n", "\n", "only_pass_events()\n", "\n", "\n", "\n", "fail_collectors()\n", "\n", "\n", "\n", "pass_collectors()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "SpectrumDebugger->DifferenceDebugger\n", "\n", "\n", "\n", "\n", "\n", "StatisticalDebugger\n", "\n", "\n", "StatisticalDebugger\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "all_events()\n", "\n", "\n", "\n", "coverage()\n", "\n", "\n", "\n", "covered_functions()\n", "\n", "\n", "\n", "event_table()\n", "\n", "\n", "\n", "function()\n", "\n", "\n", "\n", "__repr__()\n", "\n", "\n", "\n", "_repr_markdown_()\n", "\n", "\n", "\n", "add_collector()\n", "\n", "\n", "\n", "collect()\n", "\n", "\n", "\n", "color()\n", "\n", "\n", "\n", "event_str()\n", "\n", "\n", "\n", "event_table_text()\n", "\n", "\n", "\n", "tooltip()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "DifferenceDebugger->StatisticalDebugger\n", "\n", "\n", "\n", "\n", "\n", "RankingDebugger->DiscreteSpectrumDebugger\n", "\n", "\n", "\n", "\n", "\n", "OchiaiDebugger\n", "\n", "\n", "OchiaiDebugger\n", "\n", "\n", "\n", "hue()\n", "\n", "\n", "\n", "suspiciousness()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "OchiaiDebugger->ContinuousSpectrumDebugger\n", "\n", "\n", "\n", "\n", "\n", "OchiaiDebugger->RankingDebugger\n", "\n", "\n", "\n", "\n", "\n", "Legend\n", "Legend\n", "• \n", "public_method()\n", "• \n", "private_method()\n", "• \n", "overloaded_method()\n", "Hover over names to see doc\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 167, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ignore\n", "display_class_hierarchy([TarantulaDebugger, OchiaiDebugger],\n", " abstract_classes=[\n", " StatisticalDebugger,\n", " DifferenceDebugger,\n", " RankingDebugger\n", " ],\n", " public_methods=[\n", " StatisticalDebugger.__init__,\n", " StatisticalDebugger.all_events,\n", " StatisticalDebugger.event_table,\n", " StatisticalDebugger.function,\n", " StatisticalDebugger.coverage,\n", " StatisticalDebugger.covered_functions,\n", " DifferenceDebugger.__enter__,\n", " DifferenceDebugger.__exit__,\n", " DifferenceDebugger.all_pass_events,\n", " DifferenceDebugger.all_fail_events,\n", " DifferenceDebugger.collect_pass,\n", " DifferenceDebugger.collect_fail,\n", " DifferenceDebugger.only_pass_events,\n", " DifferenceDebugger.only_fail_events,\n", " SpectrumDebugger.code,\n", " SpectrumDebugger.__repr__,\n", " SpectrumDebugger.__str__,\n", " SpectrumDebugger._repr_html_,\n", " ContinuousSpectrumDebugger.code,\n", " ContinuousSpectrumDebugger.__repr__,\n", " RankingDebugger.rank\n", " ],\n", " project='debuggingbook')" ] }, { "cell_type": "code", "execution_count": 168, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:25.090826Z", "iopub.status.busy": "2023-11-12T12:41:25.090690Z", "iopub.status.idle": "2023-11-12T12:41:25.620790Z", "shell.execute_reply": "2023-11-12T12:41:25.620396Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CoverageCollector\n", "\n", "\n", "CoverageCollector\n", "\n", "\n", "\n", "coverage()\n", "\n", "\n", "\n", "covered_functions()\n", "\n", "\n", "\n", "events()\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "collect()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Collector\n", "\n", "\n", "Collector\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "__repr__()\n", "\n", "\n", "\n", "args()\n", "\n", "\n", "\n", "argstring()\n", "\n", "\n", "\n", "collect()\n", "\n", "\n", "\n", "exception()\n", "\n", "\n", "\n", "function()\n", "\n", "\n", "\n", "id()\n", "\n", "\n", "\n", "__exit__()\n", "\n", "\n", "\n", "add_items_to_ignore()\n", "\n", "\n", "\n", "coverage()\n", "\n", "\n", "\n", "covered_functions()\n", "\n", "\n", "\n", "events()\n", "\n", "\n", "\n", "traceit()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CoverageCollector->Collector\n", "\n", "\n", "\n", "\n", "\n", "StackInspector\n", "\n", "\n", "StackInspector\n", "\n", "\n", "\n", "_generated_function_cache\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CoverageCollector->StackInspector\n", "\n", "\n", "\n", "\n", "\n", "Tracer\n", "\n", "\n", "Tracer\n", "\n", "\n", "\n", "__enter__()\n", "\n", "\n", "\n", "__exit__()\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "changed_vars()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Collector->Tracer\n", "\n", "\n", "\n", "\n", "\n", "Tracer->StackInspector\n", "\n", "\n", "\n", "\n", "\n", "ValueCollector\n", "\n", "\n", "ValueCollector\n", "\n", "\n", "\n", "__init__()\n", "\n", "\n", "\n", "events()\n", "\n", "\n", "\n", "collect()\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "ValueCollector->Collector\n", "\n", "\n", "\n", "\n", "\n", "Legend\n", "Legend\n", "• \n", "public_method()\n", "• \n", "private_method()\n", "• \n", "overloaded_method()\n", "Hover over names to see doc\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 168, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ignore\n", "display_class_hierarchy([CoverageCollector, ValueCollector],\n", " public_methods=[\n", " Tracer.__init__,\n", " Tracer.__enter__,\n", " Tracer.__exit__,\n", " Tracer.changed_vars, # type: ignore\n", " Collector.__init__,\n", " Collector.__repr__,\n", " Collector.function,\n", " Collector.args,\n", " Collector.argstring,\n", " Collector.exception,\n", " Collector.id,\n", " Collector.collect,\n", " CoverageCollector.coverage,\n", " CoverageCollector.covered_functions,\n", " CoverageCollector.events,\n", " ValueCollector.__init__,\n", " ValueCollector.events\n", " ],\n", " project='debuggingbook')" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Lessons Learned\n", "\n", "* _Correlations_ between execution events and outcomes (pass/fail) can make important hints for debugging\n", "* Events occurring only (or mostly) during failing runs can be _highlighted_ and _ranked_ to guide the search\n", "* Important hints include whether the _execution of specific code locations_ correlates with failure" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Next Steps\n", "\n", "Chapters that build on this one include\n", "\n", "* [how to determine invariants that correlate with failures](DynamicInvariants.ipynb)\n", "* [how to automatically repair programs](Repairer.ipynb)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Background\n", "\n", "The seminal works on statistical debugging are two papers:\n", "\n", "* \"Visualization of Test Information to Assist Fault Localization\" \\cite{Jones2002} by James Jones, Mary Jean Harrold, and John Stasko introducing Tarantula and its visualization. The paper won an ACM SIGSOFT 10-year impact award.\n", "* \"Bug Isolation via Remote Program Sampling\" \\cite{Liblit2003} by Ben Liblit, Alex Aiken, Alice X. Zheng, and Michael I. Jordan, introducing the term \"Statistical debugging\". Liblit won the ACM Doctoral Dissertation Award for this work.\n", "\n", "The Ochiai metric for fault localization was introduced by \\cite{Abreu2009}. The overview by Wong et al. \\cite{Wong2016} gives a comprehensive overview on the field of statistical fault localization.\n", "\n", "The study by Parnin and Orso \\cite{Parnin2011} is a must to understand the limitations of the technique." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Exercises" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden", "solution2_first": true }, "source": [ "### Exercise 1: A Postcondition for Middle\n", "\n", "What would be a postcondition for `middle()`? How can you check it?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "**Solution.** A simple postcondition for `middle()` would be\n", "\n", "```python\n", "assert m == sorted([x, y, z])[1]\n", "```\n", "\n", "where `m` is the value returned by `middle()`. `sorted()` sorts the given list, and the index `[1]` returns, well, the middle element. (This might also be a much shorter, but possibly slightly more expensive implementation for `middle()`)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "Since `middle()` has several `return` statements, the easiest way to check the result is to create a wrapper around `middle()`:" ] }, { "cell_type": "code", "execution_count": 169, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:25.622904Z", "iopub.status.busy": "2023-11-12T12:41:25.622750Z", "iopub.status.idle": "2023-11-12T12:41:25.625073Z", "shell.execute_reply": "2023-11-12T12:41:25.624816Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "def middle_checked(x, y, z): # type: ignore\n", " m = middle(x, y, z)\n", " assert m == sorted([x, y, z])[1]\n", " return m" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "`middle_checked()` catches the error:" ] }, { "cell_type": "code", "execution_count": 170, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:25.626621Z", "iopub.status.busy": "2023-11-12T12:41:25.626506Z", "iopub.status.idle": "2023-11-12T12:41:25.628505Z", "shell.execute_reply": "2023-11-12T12:41:25.628125Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "from ExpectError import ExpectError" ] }, { "cell_type": "code", "execution_count": 171, "metadata": { "execution": { "iopub.execute_input": "2023-11-12T12:41:25.630102Z", "iopub.status.busy": "2023-11-12T12:41:25.629943Z", "iopub.status.idle": "2023-11-12T12:41:25.632352Z", "shell.execute_reply": "2023-11-12T12:41:25.632013Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_26712/3016629944.py\", line 2, in \n", " m = middle_checked(2, 1, 3)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_26712/1374660292.py\", line 3, in middle_checked\n", " assert m == sorted([x, y, z])[1]\n", "AssertionError (expected)\n" ] } ], "source": [ "with ExpectError():\n", " m = middle_checked(2, 1, 3)" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Exercise 2: Statistical Dependencies\n", "\n", "Using the dependencies from [the chapter on slicing](Slicer.ipynb), can you determine which specific data or control dependencies correlate with failure?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Exercise 3: Correlating with Conditions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In HTML, it is permissible that tag attribute values can also have single quotes, as in\n", "\n", "```html\n", "abc\n", "```\n", "\n", "Such attributes are actually treated correctly by our `remove_html_markup()` code." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "solution2": "hidden", "solution2_first": true }, "source": [ "#### Part 1: Experiment\n", "\n", "What happens if test inputs with single quote attributes become part of our test set? How does statistical fault localization change?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "**Solution.** The effect is that the line `quote = not quote` will be executed both in passing and failing runs, spoiling the correlation." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Part 2: Collecting Conditions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Instead of aiming for _lines_ that correlate with failures, we can look at individual _branch conditions_ such as `c == \"'\"`, `c == '\"'`, or `tag` that correlate with failures. In the above case, the condition `c == \"'\"` would correlate, whereas `c == '\"'` would not. \n", "\n", "Reusing the code instrumentation from [the chapter on slicing](Slicer.ipynb), collect the individual values of Boolean conditions in tests during execution in a `ConditionCollector` class. Show the event table." ] } ], "metadata": { "ipub": { "bibliography": "fuzzingbook.bib", "toc": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.2" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "toc-autonumbering": false }, "nbformat": 4, "nbformat_minor": 4 }