{ "cells": [ { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Mutation Analysis\n", "\n", "In the [chapter on coverage](Coverage.ipynb), we showed how one can identify which parts of the program are executed by a program, and hence get a sense of the effectiveness of a set of test cases in covering the program structure. However, coverage alone may not be the best measure for the effectiveness of a test, as one can have great coverage without ever checking a result for correctness. In this chapter, we introduce another means for assessing the effectiveness of a test suite: After injecting *mutations* – _artificial faults_ – into the code, we check whether a test suite can detect these artificial faults. The idea is that if it fails to detect such mutations, it will also miss real bugs." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "**Prerequisites**\n", "\n", "* You need some understanding of how a program is executed.\n", "* You should have read [the chapter on coverage](Coverage.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## Synopsis\n", "\n", "\n", "To [use the code provided in this chapter](Importing.ipynb), write\n", "\n", "```python\n", ">>> from fuzzingbook.MutationAnalysis import \n", "```\n", "\n", "and then make use of the following features.\n", "\n", "\n", "This chapter introduces two methods of running *mutation analysis* on subject programs. The first class `MuFunctionAnalyzer` targets individual functions. Given a function `gcd` and two test cases evaluate, one can run mutation analysis on the test cases as follows:\n", "\n", "```python\n", ">>> for mutant in MuFunctionAnalyzer(gcd, log=True):\n", ">>> with mutant:\n", ">>> assert gcd(1, 0) == 1, \"Minimal\"\n", ">>> assert gcd(0, 1) == 1, \"Mirror\"\n", ">>> mutant.pm.score()\n", "->\tgcd_1\n", "<-\tgcd_1\n", "Detected gcd_1 local variable 'c' referenced before assignment\n", "\n", "->\tgcd_2\n", "<-\tgcd_2\n", "Detected gcd_2 Mirror\n", "\n", "->\tgcd_3\n", "<-\tgcd_3\n", "\n", "->\tgcd_4\n", "<-\tgcd_4\n", "\n", "->\tgcd_5\n", "<-\tgcd_5\n", "\n", "->\tgcd_6\n", "<-\tgcd_6\n", "\n", "->\tgcd_7\n", "<-\tgcd_7\n", "Detected gcd_7 Minimal\n", "\n", "\n", "0.42857142857142855\n", "```\n", "The second class `MuProgramAnalyzer` targets standalone programs with test suites. Given a program `gcd` whose source code is provided in `gcd_src` and the test suite is provided by `TestGCD`, one can evaluate the mutation score of `TestGCD` as follows:\n", "\n", "```python\n", ">>> class TestGCD(unittest.TestCase):\n", ">>> def test_simple(self):\n", ">>> assert cfg.gcd(1, 0) == 1\n", ">>> \n", ">>> def test_mirror(self):\n", ">>> assert cfg.gcd(0, 1) == 1\n", ">>> for mutant in MuProgramAnalyzer('gcd', gcd_src):\n", ">>> mutant[test_module].runTest('TestGCD')\n", ">>> mutant.pm.score()\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "\n", "1.0\n", "```\n", "The mutation score thus obtained is a better indicator of the quality of a given test suite than pure coverage.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Why Structural Coverage is Not Enough\n", "\n", "One of the problems with [structural coverage](Coverage.ipynb) measures is that it fails to check whether the program executions generated by the test suite were actually _correct_. That is, an execution that produces a wrong output that is unnoticed by the test suite is counted exactly the same as an execution that produces the right output for coverage. Indeed, if one deletes the assertions in a typical test case, the coverage would not change for the new test suite, but the new test suite is much less useful than the original one. As an example, consider this \"test\":" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.550032Z", "iopub.status.busy": "2024-01-18T17:15:43.549694Z", "iopub.status.idle": "2024-01-18T17:15:43.562971Z", "shell.execute_reply": "2024-01-18T17:15:43.561780Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def ineffective_test_1():\n", " execute_the_program_as_a_whole()\n", " assert True" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The final assertion here will always pass, no matter what `execute_the_program_as_a_whole()` will do. Okay, if `execute_the_program_as_a_whole()` raises an exception, the test will fail, but we can also get around that:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.567287Z", "iopub.status.busy": "2024-01-18T17:15:43.567012Z", "iopub.status.idle": "2024-01-18T17:15:43.570143Z", "shell.execute_reply": "2024-01-18T17:15:43.569568Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def ineffective_test_2():\n", " try:\n", " execute_the_program_as_a_whole()\n", " except:\n", " pass\n", " assert True" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The problem with these \"tests\", however, is that `execute_the_program_as_a_whole()` may achieve 100% code coverage (or 100% of any other structural coverage metric). Yet, this number of 100% does not reflect the ability of the test to discover bugs, which actually is 0%." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "This is indeed, not an optimal state of affairs. How can we verify that our tests are actually useful? One alternative (hinted in the chapter on coverage) is to inject bugs into the program, and evaluate the effectiveness of test suites in catching these injected bugs. However, that introduces another problem. How do we produce these bugs in the first place? Any manual effort is likely to be biased by the preconceptions of the developer as to where the bugs are likely to occur, and what effect it would have. Further, writing good bugs is likely to take a significant amount of time, for a very indirect benefit. Hence, such a solution is not sufficient." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Seeding Artificial Faults with Mutation Analysis\n", "\n", "Mutation Analysis offers an alternative solution to assess the effectiveness of a test suite. The idea of mutation analysis is to seed _artificial faults_, known as *mutations*, into the program code, and to check whether the test suite finds them. Such a mutation could, for instance, replace a `+` by a `-` somewhere within `execute_the_program_as_a_whole()`. Of course, the above ineffective tests would not detect this, as they do not check any of the results. An effective test would, however; and the assumption is that the more effective a test is in finding _artificial_ faults, the more effective it would be in finding _real_ faults." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "The insight from Mutation Analysis is to consider the probability of insertion of a bug from the perspective of a programmer. If one assumes that the attention received by each program element in the program is sufficiently similar, one can further assume that each token in the program has a similar probability of being incorrectly transcribed. Of course, the programmer will correct any mistakes that gets detected by the compilers (or other static analysis tools). So the set of valid tokens different from the original that make it past the compilation stage is considered to be its possible set of _mutations_ that represent the _probable faults_ in the program. A test suite is then judged by its capability to detect (and hence prevent) such mutations. The proportion of such mutants detected over all _valid_ mutants produced is taken as the mutation score. In this chapter, we see how one can implement Mutation Analysis in Python programs. The mutation score obtained represents the ability of any program analysis tools to prevent faults, and can be used to judge static test suites, test generators such as fuzzers, and also static and symbolic execution frameworks." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "It might be intuitive to consider a slightly different perspective. A test suite is a program that can be considered to accept as its input, the program to be tested. What is the best way to evaluate such a program (the test suite)? We can essentially *fuzz* the test suite by applying small mutations to the input program, and verifying that the test suite in question does not produce unexpected behaviors. The test suite is supposed to only allow the original through; and hence any mutant that is not detected as faulty represents a bug in the test suite." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Structural Coverage Adequacy by Example\n", "\n", "Let us introduce a more detailed example to illustrate both the problems with coverage as well as how mutation analysis works. The `triangle()` program below classifies a triangle with edge lengths $a$, $b$, and $c$ into the proper triangle category. We want to verify that the program works correctly." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.573731Z", "iopub.status.busy": "2024-01-18T17:15:43.573517Z", "iopub.status.idle": "2024-01-18T17:15:43.577187Z", "shell.execute_reply": "2024-01-18T17:15:43.576558Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def triangle(a, b, c):\n", " if a == b:\n", " if b == c:\n", " return 'Equilateral'\n", " else:\n", " return 'Isosceles'\n", " else:\n", " if b == c:\n", " return \"Isosceles\"\n", " else:\n", " if a == c:\n", " return \"Isosceles\"\n", " else:\n", " return \"Scalene\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here are a few test cases to ensure that the program works." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.580257Z", "iopub.status.busy": "2024-01-18T17:15:43.580030Z", "iopub.status.idle": "2024-01-18T17:15:43.583129Z", "shell.execute_reply": "2024-01-18T17:15:43.582708Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def strong_oracle(fn):\n", " assert fn(1, 1, 1) == 'Equilateral'\n", "\n", " assert fn(1, 2, 1) == 'Isosceles'\n", " assert fn(2, 2, 1) == 'Isosceles'\n", " assert fn(1, 2, 2) == 'Isosceles'\n", "\n", " assert fn(1, 2, 3) == 'Scalene'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Running them actually causes all tests to pass." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.587146Z", "iopub.status.busy": "2024-01-18T17:15:43.586913Z", "iopub.status.idle": "2024-01-18T17:15:43.589530Z", "shell.execute_reply": "2024-01-18T17:15:43.589021Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "strong_oracle(triangle)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "However, the statement that \"all tests pass\" has value only if we know that our tests are effective.\n", "What is the effectiveness of our test suite? As we saw in the [chapter on coverage](Coverage.ipynb), one can use structural coverage techniques such as statement coverage to obtain a measure of effectiveness of the test case." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "button": false, "execution": { "iopub.execute_input": "2024-01-18T17:15:43.592723Z", "iopub.status.busy": "2024-01-18T17:15:43.592479Z", "iopub.status.idle": "2024-01-18T17:15:43.633499Z", "shell.execute_reply": "2024-01-18T17:15:43.633174Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import bookutils.setup" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.635392Z", "iopub.status.busy": "2024-01-18T17:15:43.635233Z", "iopub.status.idle": "2024-01-18T17:15:43.951285Z", "shell.execute_reply": "2024-01-18T17:15:43.950434Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from Coverage import Coverage" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.958635Z", "iopub.status.busy": "2024-01-18T17:15:43.958052Z", "iopub.status.idle": "2024-01-18T17:15:43.983130Z", "shell.execute_reply": "2024-01-18T17:15:43.961815Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import inspect" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We add a function `show_coverage()` to visualize the coverage obtained." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:43.993836Z", "iopub.status.busy": "2024-01-18T17:15:43.993381Z", "iopub.status.idle": "2024-01-18T17:15:44.000973Z", "shell.execute_reply": "2024-01-18T17:15:44.000648Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class VisualCoverage(Coverage):\n", " def show_coverage(self, fn):\n", " src = inspect.getsource(fn)\n", " name = fn.__name__\n", " covered = set([lineno for method,\n", " lineno in self._trace if method == name])\n", " for i, s in enumerate(src.split('\\n')):\n", " print('%s %2d: %s' % ('#' if i + 1 in covered else ' ', i + 1, s))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.002803Z", "iopub.status.busy": "2024-01-18T17:15:44.002713Z", "iopub.status.idle": "2024-01-18T17:15:44.004463Z", "shell.execute_reply": "2024-01-18T17:15:44.004149Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with VisualCoverage() as cov:\n", " strong_oracle(triangle)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.005998Z", "iopub.status.busy": "2024-01-18T17:15:44.005903Z", "iopub.status.idle": "2024-01-18T17:15:44.008646Z", "shell.execute_reply": "2024-01-18T17:15:44.008275Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1: def triangle(a, b, c):\n", "# 2: if a == b:\n", "# 3: if b == c:\n", "# 4: return 'Equilateral'\n", " 5: else:\n", "# 6: return 'Isosceles'\n", " 7: else:\n", "# 8: if b == c:\n", "# 9: return \"Isosceles\"\n", " 10: else:\n", "# 11: if a == c:\n", "# 12: return \"Isosceles\"\n", " 13: else:\n", "# 14: return \"Scalene\"\n", " 15: \n" ] } ], "source": [ "cov.show_coverage(triangle)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Our `strong_oracle()` seems to have adequately covered all possible conditions.\n", "That is, our set of test cases is reasonably good according to structural coverage. However, does the coverage obtained tell the whole story? Consider this test suite instead:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.029079Z", "iopub.status.busy": "2024-01-18T17:15:44.028930Z", "iopub.status.idle": "2024-01-18T17:15:44.031230Z", "shell.execute_reply": "2024-01-18T17:15:44.030955Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def weak_oracle(fn):\n", " assert fn(1, 1, 1) == 'Equilateral'\n", "\n", " assert fn(1, 2, 1) != 'Equilateral'\n", " assert fn(2, 2, 1) != 'Equilateral'\n", " assert fn(1, 2, 2) != 'Equilateral'\n", "\n", " assert fn(1, 2, 3) != 'Equilateral'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "All that we are checking here is that a triangle with unequal sides is not equilateral. What is the coverage obtained?" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.032994Z", "iopub.status.busy": "2024-01-18T17:15:44.032886Z", "iopub.status.idle": "2024-01-18T17:15:44.034613Z", "shell.execute_reply": "2024-01-18T17:15:44.034343Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "with VisualCoverage() as cov:\n", " weak_oracle(triangle)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.036151Z", "iopub.status.busy": "2024-01-18T17:15:44.036037Z", "iopub.status.idle": "2024-01-18T17:15:44.038068Z", "shell.execute_reply": "2024-01-18T17:15:44.037827Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1: def triangle(a, b, c):\n", "# 2: if a == b:\n", "# 3: if b == c:\n", "# 4: return 'Equilateral'\n", " 5: else:\n", "# 6: return 'Isosceles'\n", " 7: else:\n", "# 8: if b == c:\n", "# 9: return \"Isosceles\"\n", " 10: else:\n", "# 11: if a == c:\n", "# 12: return \"Isosceles\"\n", " 13: else:\n", "# 14: return \"Scalene\"\n", " 15: \n" ] } ], "source": [ "cov.show_coverage(triangle)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Indeed, there does not seem to be _any_ difference in coverage.\n", "The `weak_oracle()` obtains exactly the same coverage as that of `strong_oracle()`. However, a moment's reflection should convince one that the `weak_oracle()` is not as effective as `strong_oracle()`. However, _coverage_ is unable to distinguish between the two test suites. What are we missing in coverage?\n", "The problem here is that coverage is unable to evaluate the _quality_ of our assertions. Indeed, coverage does not care about assertions at all. However, as we saw above, assertions are an extremely important part of test suite effectiveness. Hence, what we need is a way to evaluate the quality of assertions." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Injecting Artificial Faults\n", "\n", "Notice that in the [chapter on coverage](Coverage.ipynb), coverage was presented as a _proxy_ for the likelihood of a test suite to uncover bugs. What if we actually try to evaluate the likelihood of a test suite to uncover bugs? All we need is to inject bugs into the program, one at a time, and count the number of such bugs that our test suite detects. The frequency of detection will provide us with the actual likelihood of the test suite to uncover bugs. This technique is called _fault injection_. Here is an example for _fault injection_." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.039928Z", "iopub.status.busy": "2024-01-18T17:15:44.039734Z", "iopub.status.idle": "2024-01-18T17:15:44.041984Z", "shell.execute_reply": "2024-01-18T17:15:44.041712Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def triangle_m1(a, b, c):\n", " if a == b:\n", " if b == c:\n", " return 'Equilateral'\n", " else:\n", " # return 'Isosceles'\n", " return None # <-- injected fault\n", " else:\n", " if b == c:\n", " return \"Isosceles\"\n", " else:\n", " if a == c:\n", " return \"Isosceles\"\n", " else:\n", " return \"Scalene\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us see if our test suites are good enough to catch this fault. We first check whether `weak_oracle()` can detect this change." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.043607Z", "iopub.status.busy": "2024-01-18T17:15:44.043491Z", "iopub.status.idle": "2024-01-18T17:15:44.045093Z", "shell.execute_reply": "2024-01-18T17:15:44.044852Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from ExpectError import ExpectError" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.046522Z", "iopub.status.busy": "2024-01-18T17:15:44.046437Z", "iopub.status.idle": "2024-01-18T17:15:44.047973Z", "shell.execute_reply": "2024-01-18T17:15:44.047736Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with ExpectError():\n", " weak_oracle(triangle_m1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `weak_oracle()` is unable to detect any changes. What about our `strong_oracle()`?" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.049455Z", "iopub.status.busy": "2024-01-18T17:15:44.049375Z", "iopub.status.idle": "2024-01-18T17:15:44.051209Z", "shell.execute_reply": "2024-01-18T17:15:44.050957Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/3191755624.py\", line 2, in \n", " strong_oracle(triangle_m1)\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/566939880.py\", line 5, in strong_oracle\n", " assert fn(2, 2, 1) == 'Isosceles'\n", "AssertionError (expected)\n" ] } ], "source": [ "with ExpectError():\n", " strong_oracle(triangle_m1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Our `strong_oracle()` is able to detect this fault, which is evidence that `strong_oracle()` is probably a better test suite." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "_Fault injection_ can provide a good measure of effectiveness of a test suite, provided we have a list of possible faults. The problem is that collecting such a set of _unbiased_ faults is rather expensive. It is difficult to create good faults that are reasonably hard to detect, and it is a manual process. Given that it is a manual process, the generated faults will be biased by the preconceptions of the developer who creates it. Even when such curated faults are available, they are unlikely to be exhaustive, and likely to miss important classes of bugs, and parts of the program. Hence, _fault injection_ is an insufficient replacement for coverage. Can we do better?" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Mutation Analysis provides an alternative to a curated set of faults. The key insight is that, if one assumes that the programmer understands the program in question, the majority of errors made are very likely small transcription errors (a few tokens). A compiler will likely catch most of these errors. Hence, the majority of residual faults in a program is likely to be due to small (single token) variations at certain points in the structure of the program from the correct program (This particular assumption is called the *Competent Programmer Hypothesis* or the *Finite Neighborhood Hypothesis*).\n", "\n", "What about the larger faults composed of multiple smaller faults? The key insight here is that, for a majority of such complex faults, test cases that detect a single smaller fault in isolation is very likely to detect the larger complex fault that contains it. (This assumption is called the *Coupling Effect*.)\n", "\n", "How can we use these assumptions in practice? The idea is to simply generate *all* possible *valid* variants of the program that differs from the original by a small change (such as a single token change) (Such variants are called *mutants*). Next, the given test suite is applied to each variant thus generated. Any mutant detected by the test suite is said to have been *killed* by the test suite. The effectiveness of a test suite is given by the proportion of mutants killed to the valid mutants generated." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We next implement a simple mutation analysis framework and use it to evaluate our test suites." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Mutating Python Code\n", "\n", "To manipulate a Python program, we work on the _abstract syntax tree_ (AST) representation – which is the internal representation compilers and interpreters work on after reading in the program text.\n", "\n", "Briefly speaking, we convert the program into a tree, and then _change parts of this tree_ – for instance, by changing `+` operators into `-` or vice versa, or actual statements into `pass` statements that do nothing. The resulting mutated tree can then be processed further; it can be passed on to the Python interpreter for execution, or we can _unparse_ it back into a textual form." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "fragment" } }, "source": [ "We begin by importing the AST manipulation modules." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.052973Z", "iopub.status.busy": "2024-01-18T17:15:44.052868Z", "iopub.status.idle": "2024-01-18T17:15:44.054523Z", "shell.execute_reply": "2024-01-18T17:15:44.054262Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import ast\n", "import inspect" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can get the source of a Python function using `inspect.getsource()`. (Note that this does not work for functions defined in other notebooks.)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.056072Z", "iopub.status.busy": "2024-01-18T17:15:44.055963Z", "iopub.status.idle": "2024-01-18T17:15:44.059653Z", "shell.execute_reply": "2024-01-18T17:15:44.059368Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'def triangle(a, b, c):\\n if a == b:\\n if b == c:\\n return \\'Equilateral\\'\\n else:\\n return \\'Isosceles\\'\\n else:\\n if b == c:\\n return \"Isosceles\"\\n else:\\n if a == c:\\n return \"Isosceles\"\\n else:\\n return \"Scalene\"\\n'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "triangle_source = inspect.getsource(triangle)\n", "triangle_source" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To view these in a visually pleasing form, our function `print_content(s, suffix)` formats and highlights the string `s` as if it were a file with ending `suffix`. We can thus view (and highlight) the source as if it were a Python file:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.061241Z", "iopub.status.busy": "2024-01-18T17:15:44.061157Z", "iopub.status.idle": "2024-01-18T17:15:44.062740Z", "shell.execute_reply": "2024-01-18T17:15:44.062477Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import print_content" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.064234Z", "iopub.status.busy": "2024-01-18T17:15:44.064152Z", "iopub.status.idle": "2024-01-18T17:15:44.145817Z", "shell.execute_reply": "2024-01-18T17:15:44.145525Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mtriangle\u001b[39;49;00m(a, b, c):\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m a == b:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m b == c:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mEquilateral\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melse\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mIsosceles\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melse\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m b == c:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m\"\u001b[39;49;00m\u001b[33mIsosceles\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melse\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m a == c:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m\"\u001b[39;49;00m\u001b[33mIsosceles\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melse\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m\"\u001b[39;49;00m\u001b[33mScalene\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(triangle_source, '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Parsing this gives us an abstract syntax tree (AST) – a representation of the program in tree form." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.147495Z", "iopub.status.busy": "2024-01-18T17:15:44.147385Z", "iopub.status.idle": "2024-01-18T17:15:44.149126Z", "shell.execute_reply": "2024-01-18T17:15:44.148868Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "triangle_ast = ast.parse(triangle_source)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "What does this AST look like? The helper functions `ast.dump()` (textual output) and `showast.show_ast()` (graphical output with [showast](https://github.com/hchasestevens/show_ast)) allow us to inspect the structure of the tree. We see that the function starts as a `FunctionDef` with name and arguments, followed by a body, which is a list of statements; in this case, the body contains only an `If`, which itself contains other nodes of type `If`, `Compare`, `Name`, `Str`, and `Return`." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.150648Z", "iopub.status.busy": "2024-01-18T17:15:44.150546Z", "iopub.status.idle": "2024-01-18T17:15:44.152301Z", "shell.execute_reply": "2024-01-18T17:15:44.152059Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Module(\n", " body=[\n", " FunctionDef(\n", " name='triangle',\n", " args=arguments(\n", " posonlyargs=[],\n", " args=[\n", " arg(arg='a'),\n", " arg(arg='b'),\n", " arg(arg='c')],\n", " kwonlyargs=[],\n", " kw_defaults=[],\n", " defaults=[]),\n", " body=[\n", " If(\n", " test=Compare(\n", " left=Name(id='a', ctx=Load()),\n", " ops=[\n", " Eq()],\n", " comparators=[\n", " Name(id='b', ctx=Load())]),\n", " body=[\n", " If(\n", " test=Compare(\n", " left=Name(id='b', ctx=Load()),\n", " ops=[\n", " Eq()],\n", " comparators=[\n", " Name(id='c', ctx=Load())]),\n", " body=[\n", " Return(\n", " value=Constant(value='Equilateral'))],\n", " orelse=[\n", " Return(\n", " value=Constant(value='Isosceles'))])],\n", " orelse=[\n", " If(\n", " test=Compare(\n", " left=Name(id='b', ctx=Load()),\n", " ops=[\n", " Eq()],\n", " comparators=[\n", " Name(id='c', ctx=Load())]),\n", " body=[\n", " Return(\n", " value=Constant(value='Isosceles'))],\n", " orelse=[\n", " If(\n", " test=Compare(\n", " left=Name(id='a', ctx=Load()),\n", " ops=[\n", " Eq()],\n", " comparators=[\n", " Name(id='c', ctx=Load())]),\n", " body=[\n", " Return(\n", " value=Constant(value='Isosceles'))],\n", " orelse=[\n", " Return(\n", " value=Constant(value='Scalene'))])])])],\n", " decorator_list=[])],\n", " type_ignores=[])\n" ] } ], "source": [ "print(ast.dump(triangle_ast, indent=4))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Too much text for you? This graphical representation may make things simpler." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.153763Z", "iopub.status.busy": "2024-01-18T17:15:44.153653Z", "iopub.status.idle": "2024-01-18T17:15:44.155475Z", "shell.execute_reply": "2024-01-18T17:15:44.155121Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from bookutils import rich_output" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.157219Z", "iopub.status.busy": "2024-01-18T17:15:44.157089Z", "iopub.status.idle": "2024-01-18T17:15:44.571623Z", "shell.execute_reply": "2024-01-18T17:15:44.571227Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "0\n", "FunctionDef\n", "\n", "\n", "\n", "1\n", ""triangle"\n", "\n", "\n", "\n", "0--1\n", "\n", "\n", "\n", "\n", "2\n", "arguments\n", "\n", "\n", "\n", "0--2\n", "\n", "\n", "\n", "\n", "9\n", "If\n", "\n", "\n", "\n", "0--9\n", "\n", "\n", "\n", "\n", "3\n", "arg\n", "\n", "\n", "\n", "2--3\n", "\n", "\n", "\n", "\n", "5\n", "arg\n", "\n", "\n", "\n", "2--5\n", "\n", "\n", "\n", "\n", "7\n", "arg\n", "\n", "\n", "\n", "2--7\n", "\n", "\n", "\n", "\n", "4\n", ""a"\n", "\n", "\n", "\n", "3--4\n", "\n", "\n", "\n", "\n", "6\n", ""b"\n", "\n", "\n", "\n", "5--6\n", "\n", "\n", "\n", "\n", "8\n", ""c"\n", "\n", "\n", "\n", "7--8\n", "\n", "\n", "\n", "\n", "10\n", "Compare\n", "\n", "\n", "\n", "9--10\n", "\n", "\n", "\n", "\n", "18\n", "If\n", "\n", "\n", "\n", "9--18\n", "\n", "\n", "\n", "\n", "33\n", "If\n", "\n", "\n", "\n", "9--33\n", "\n", "\n", "\n", "\n", "11\n", "Name\n", "\n", "\n", "\n", "10--11\n", "\n", "\n", "\n", "\n", "14\n", "Eq\n", "\n", "\n", "\n", "10--14\n", "\n", "\n", "\n", "\n", "15\n", "Name\n", "\n", "\n", "\n", "10--15\n", "\n", "\n", "\n", "\n", "12\n", ""a"\n", "\n", "\n", "\n", "11--12\n", "\n", "\n", "\n", "\n", "13\n", "Load\n", "\n", "\n", "\n", "11--13\n", "\n", "\n", "\n", "\n", "16\n", ""b"\n", "\n", "\n", "\n", "15--16\n", "\n", "\n", "\n", "\n", "17\n", "Load\n", "\n", "\n", "\n", "15--17\n", "\n", "\n", "\n", "\n", "19\n", "Compare\n", "\n", "\n", "\n", "18--19\n", "\n", "\n", "\n", "\n", "27\n", "Return\n", "\n", "\n", "\n", "18--27\n", "\n", "\n", "\n", "\n", "30\n", "Return\n", "\n", "\n", "\n", "18--30\n", "\n", "\n", "\n", "\n", "20\n", "Name\n", "\n", "\n", "\n", "19--20\n", "\n", "\n", "\n", "\n", "23\n", "Eq\n", "\n", "\n", "\n", "19--23\n", "\n", "\n", "\n", "\n", "24\n", "Name\n", "\n", "\n", "\n", "19--24\n", "\n", "\n", "\n", "\n", "21\n", ""b"\n", "\n", "\n", "\n", "20--21\n", "\n", "\n", "\n", "\n", "22\n", "Load\n", "\n", "\n", "\n", "20--22\n", "\n", "\n", "\n", "\n", "25\n", ""c"\n", "\n", "\n", "\n", "24--25\n", "\n", "\n", "\n", "\n", "26\n", "Load\n", "\n", "\n", "\n", "24--26\n", "\n", "\n", "\n", "\n", "28\n", "Constant\n", "\n", "\n", "\n", "27--28\n", "\n", "\n", "\n", "\n", "29\n", ""Equilateral"\n", "\n", "\n", "\n", "28--29\n", "\n", "\n", "\n", "\n", "31\n", "Constant\n", "\n", "\n", "\n", "30--31\n", "\n", "\n", "\n", "\n", "32\n", ""Isosceles"\n", "\n", "\n", "\n", "31--32\n", "\n", "\n", "\n", "\n", "34\n", "Compare\n", "\n", "\n", "\n", "33--34\n", "\n", "\n", "\n", "\n", "42\n", "Return\n", "\n", "\n", "\n", "33--42\n", "\n", "\n", "\n", "\n", "45\n", "If\n", "\n", "\n", "\n", "33--45\n", "\n", "\n", "\n", "\n", "35\n", "Name\n", "\n", "\n", "\n", "34--35\n", "\n", "\n", "\n", "\n", "38\n", "Eq\n", "\n", "\n", "\n", "34--38\n", "\n", "\n", "\n", "\n", "39\n", "Name\n", "\n", "\n", "\n", "34--39\n", "\n", "\n", "\n", "\n", "36\n", ""b"\n", "\n", "\n", "\n", "35--36\n", "\n", "\n", "\n", "\n", "37\n", "Load\n", "\n", "\n", "\n", "35--37\n", "\n", "\n", "\n", "\n", "40\n", ""c"\n", "\n", "\n", "\n", "39--40\n", "\n", "\n", "\n", "\n", "41\n", "Load\n", "\n", "\n", "\n", "39--41\n", "\n", "\n", "\n", "\n", "43\n", "Constant\n", "\n", "\n", "\n", "42--43\n", "\n", "\n", "\n", "\n", "44\n", ""Isosceles"\n", "\n", "\n", "\n", "43--44\n", "\n", "\n", "\n", "\n", "46\n", "Compare\n", "\n", "\n", "\n", "45--46\n", "\n", "\n", "\n", "\n", "54\n", "Return\n", "\n", "\n", "\n", "45--54\n", "\n", "\n", "\n", "\n", "57\n", "Return\n", "\n", "\n", "\n", "45--57\n", "\n", "\n", "\n", "\n", "47\n", "Name\n", "\n", "\n", "\n", "46--47\n", "\n", "\n", "\n", "\n", "50\n", "Eq\n", "\n", "\n", "\n", "46--50\n", "\n", "\n", "\n", "\n", "51\n", "Name\n", "\n", "\n", "\n", "46--51\n", "\n", "\n", "\n", "\n", "48\n", ""a"\n", "\n", "\n", "\n", "47--48\n", "\n", "\n", "\n", "\n", "49\n", "Load\n", "\n", "\n", "\n", "47--49\n", "\n", "\n", "\n", "\n", "52\n", ""c"\n", "\n", "\n", "\n", "51--52\n", "\n", "\n", "\n", "\n", "53\n", "Load\n", "\n", "\n", "\n", "51--53\n", "\n", "\n", "\n", "\n", "55\n", "Constant\n", "\n", "\n", "\n", "54--55\n", "\n", "\n", "\n", "\n", "56\n", ""Isosceles"\n", "\n", "\n", "\n", "55--56\n", "\n", "\n", "\n", "\n", "58\n", "Constant\n", "\n", "\n", "\n", "57--58\n", "\n", "\n", "\n", "\n", "59\n", ""Scalene"\n", "\n", "\n", "\n", "58--59\n", "\n", "\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "if rich_output():\n", " import showast\n", " showast.show_ast(triangle_ast)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The function `ast.unparse()` converts such a tree back into the more familiar textual Python code representation." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.573654Z", "iopub.status.busy": "2024-01-18T17:15:44.573477Z", "iopub.status.idle": "2024-01-18T17:15:44.607097Z", "shell.execute_reply": "2024-01-18T17:15:44.606775Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mdef\u001b[39;49;00m \u001b[32mtriangle\u001b[39;49;00m(a, b, c):\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m a == b:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mif\u001b[39;49;00m b == c:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mEquilateral\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melse\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mIsosceles\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m b == c:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mIsosceles\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melif\u001b[39;49;00m a == c:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mIsosceles\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m\n", " \u001b[34melse\u001b[39;49;00m:\u001b[37m\u001b[39;49;00m\n", " \u001b[34mreturn\u001b[39;49;00m \u001b[33m'\u001b[39;49;00m\u001b[33mScalene\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m\u001b[37m\u001b[39;49;00m" ] } ], "source": [ "print_content(ast.unparse(triangle_ast), '.py')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## A Simple Mutator for Functions\n", "\n", "Let us now go and mutate the `triangle()` program. A simple way to produce valid mutated version of this program is to replace some of its statements by `pass`.\n", "\n", "The `MuFunctionAnalyzer` is the main class responsible for mutation analysis of the test suite. It accepts the function to be tested. It normalizes the source code given by parsing and unparsing it once, using the functions discussed above. This is required to ensure that later `diff`s between the original and mutant are not derailed by differences in whitespace, comments, etc." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.609131Z", "iopub.status.busy": "2024-01-18T17:15:44.609013Z", "iopub.status.idle": "2024-01-18T17:15:44.611705Z", "shell.execute_reply": "2024-01-18T17:15:44.611467Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class MuFunctionAnalyzer:\n", " def __init__(self, fn, log=False):\n", " self.fn = fn\n", " self.name = fn.__name__\n", " src = inspect.getsource(fn)\n", " self.ast = ast.parse(src)\n", " self.src = ast.unparse(self.ast) # normalize\n", " self.mutator = self.mutator_object()\n", " self.nmutations = self.get_mutation_count()\n", " self.un_detected = set()\n", " self.mutants = []\n", " self.log = log\n", "\n", " def mutator_object(self, locations=None):\n", " return StmtDeletionMutator(locations)\n", "\n", " def register(self, m):\n", " self.mutants.append(m)\n", "\n", " def finish(self):\n", " pass" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `get_mutation_count()` fetches the number of possible mutations available. We will see later how this can be implemented." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.613333Z", "iopub.status.busy": "2024-01-18T17:15:44.613224Z", "iopub.status.idle": "2024-01-18T17:15:44.615038Z", "shell.execute_reply": "2024-01-18T17:15:44.614775Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class MuFunctionAnalyzer(MuFunctionAnalyzer):\n", " def get_mutation_count(self):\n", " self.mutator.visit(self.ast)\n", " return self.mutator.count" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `Mutator` provides the base class for implementing individual mutations. It accepts a list of locations to mutate. It assumes that the method `mutable_visit()` is invoked on all nodes of interest as determined by the subclass. When the `Mutator` is invoked without a list of locations to mutate, it simply loops through all possible mutation points and retains a count in `self.count`. If it is invoked with a specific list of locations to mutate, the `mutable_visit()` method calls the `mutation_visit()` which performs the mutation on the node. Note that a single location can produce multiple mutations. (Hence the hashmap)." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.616565Z", "iopub.status.busy": "2024-01-18T17:15:44.616463Z", "iopub.status.idle": "2024-01-18T17:15:44.618628Z", "shell.execute_reply": "2024-01-18T17:15:44.618314Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Mutator(ast.NodeTransformer):\n", " def __init__(self, mutate_location=-1):\n", " self.count = 0\n", " self.mutate_location = mutate_location\n", "\n", " def mutable_visit(self, node):\n", " self.count += 1 # statements start at line no 1\n", " if self.count == self.mutate_location:\n", " return self.mutation_visit(node)\n", " return self.generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `StmtDeletionMutator` simply hooks into all the statement processing visitors. It performs mutation by replacing the given statement with `pass`. As you can see, it visits all kinds of statements." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.620101Z", "iopub.status.busy": "2024-01-18T17:15:44.619991Z", "iopub.status.idle": "2024-01-18T17:15:44.622960Z", "shell.execute_reply": "2024-01-18T17:15:44.622701Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class StmtDeletionMutator(Mutator):\n", " def visit_Return(self, node): return self.mutable_visit(node)\n", " def visit_Delete(self, node): return self.mutable_visit(node)\n", "\n", " def visit_Assign(self, node): return self.mutable_visit(node)\n", " def visit_AnnAssign(self, node): return self.mutable_visit(node)\n", " def visit_AugAssign(self, node): return self.mutable_visit(node)\n", "\n", " def visit_Raise(self, node): return self.mutable_visit(node)\n", " def visit_Assert(self, node): return self.mutable_visit(node)\n", "\n", " def visit_Global(self, node): return self.mutable_visit(node)\n", " def visit_Nonlocal(self, node): return self.mutable_visit(node)\n", "\n", " def visit_Expr(self, node): return self.mutable_visit(node)\n", "\n", " def visit_Pass(self, node): return self.mutable_visit(node)\n", " def visit_Break(self, node): return self.mutable_visit(node)\n", " def visit_Continue(self, node): return self.mutable_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The actual mutation consists of replacing the node with a `pass` statement:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.624348Z", "iopub.status.busy": "2024-01-18T17:15:44.624240Z", "iopub.status.idle": "2024-01-18T17:15:44.625885Z", "shell.execute_reply": "2024-01-18T17:15:44.625594Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class StmtDeletionMutator(StmtDeletionMutator):\n", " def mutation_visit(self, node): return ast.Pass() " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For `triangle()`, this visitor produces five mutations – namely, replacing the five `return` statements with `pass`:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.627489Z", "iopub.status.busy": "2024-01-18T17:15:44.627355Z", "iopub.status.idle": "2024-01-18T17:15:44.630125Z", "shell.execute_reply": "2024-01-18T17:15:44.629871Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "MuFunctionAnalyzer(triangle).nmutations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We need a way to obtain the individual mutants. For this, we convert our `MuFunctionAnalyzer` to an *iterable*." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.631566Z", "iopub.status.busy": "2024-01-18T17:15:44.631463Z", "iopub.status.idle": "2024-01-18T17:15:44.633100Z", "shell.execute_reply": "2024-01-18T17:15:44.632849Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class MuFunctionAnalyzer(MuFunctionAnalyzer):\n", " def __iter__(self):\n", " return PMIterator(self)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `PMIterator`, which is the *iterator* class for `MuFunctionAnalyzer` is defined as follows." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.634389Z", "iopub.status.busy": "2024-01-18T17:15:44.634311Z", "iopub.status.idle": "2024-01-18T17:15:44.635994Z", "shell.execute_reply": "2024-01-18T17:15:44.635757Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class PMIterator:\n", " def __init__(self, pm):\n", " self.pm = pm\n", " self.idx = 0" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `next()` method returns the corresponding `Mutant`:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.637348Z", "iopub.status.busy": "2024-01-18T17:15:44.637261Z", "iopub.status.idle": "2024-01-18T17:15:44.639497Z", "shell.execute_reply": "2024-01-18T17:15:44.639232Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class PMIterator(PMIterator):\n", " def __next__(self):\n", " i = self.idx\n", " if i >= self.pm.nmutations:\n", " self.pm.finish()\n", " raise StopIteration()\n", " self.idx += 1\n", " mutant = Mutant(self.pm, self.idx, log=self.pm.log)\n", " self.pm.register(mutant)\n", " return mutant" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `Mutant` class contains logic for generating mutants when given the locations to mutate." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.641263Z", "iopub.status.busy": "2024-01-18T17:15:44.641127Z", "iopub.status.idle": "2024-01-18T17:15:44.643199Z", "shell.execute_reply": "2024-01-18T17:15:44.642951Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class Mutant:\n", " def __init__(self, pm, location, log=False):\n", " self.pm = pm\n", " self.i = location\n", " self.name = \"%s_%s\" % (self.pm.name, self.i)\n", " self._src = None\n", " self.tests = []\n", " self.detected = False\n", " self.log = log" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is how it can be used:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.644663Z", "iopub.status.busy": "2024-01-18T17:15:44.644558Z", "iopub.status.idle": "2024-01-18T17:15:44.646630Z", "shell.execute_reply": "2024-01-18T17:15:44.646381Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "triangle_1\n", "triangle_2\n", "triangle_3\n", "triangle_4\n", "triangle_5\n" ] } ], "source": [ "for m in MuFunctionAnalyzer(triangle):\n", " print(m.name)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "These names are a bit generic yet. Let's see whether we can get more insights into the mutations produced." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `generate_mutant()` simply calls the `mutator()` method, and passes the mutator a copy of the AST. " ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.648057Z", "iopub.status.busy": "2024-01-18T17:15:44.647941Z", "iopub.status.idle": "2024-01-18T17:15:44.649833Z", "shell.execute_reply": "2024-01-18T17:15:44.649553Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Mutant(Mutant):\n", " def generate_mutant(self, location):\n", " mutant_ast = self.pm.mutator_object(\n", " location).visit(ast.parse(self.pm.src)) # copy\n", " return ast.unparse(mutant_ast)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `src()` method returns the mutated source." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.651365Z", "iopub.status.busy": "2024-01-18T17:15:44.651247Z", "iopub.status.idle": "2024-01-18T17:15:44.653041Z", "shell.execute_reply": "2024-01-18T17:15:44.652806Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class Mutant(Mutant):\n", " def src(self):\n", " if self._src is None:\n", " self._src = self.generate_mutant(self.i)\n", " return self._src" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is how one can obtain the mutants, and visualize the difference from the original:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.654385Z", "iopub.status.busy": "2024-01-18T17:15:44.654287Z", "iopub.status.idle": "2024-01-18T17:15:44.655871Z", "shell.execute_reply": "2024-01-18T17:15:44.655621Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import difflib" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.657696Z", "iopub.status.busy": "2024-01-18T17:15:44.657543Z", "iopub.status.idle": "2024-01-18T17:15:44.661894Z", "shell.execute_reply": "2024-01-18T17:15:44.661619Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- triangle\n", "\n", "+++ triangle_1\n", "\n", "@@ -1,7 +1,7 @@\n", "\n", " def triangle(a, b, c):\n", " if a == b:\n", " if b == c:\n", "- return 'Equilateral'\n", "+ pass\n", " else:\n", " return 'Isosceles'\n", " elif b == c:\n", "--- triangle\n", "\n", "+++ triangle_2\n", "\n", "@@ -3,7 +3,7 @@\n", "\n", " if b == c:\n", " return 'Equilateral'\n", " else:\n", "- return 'Isosceles'\n", "+ pass\n", " elif b == c:\n", " return 'Isosceles'\n", " elif a == c:\n", "--- triangle\n", "\n", "+++ triangle_3\n", "\n", "@@ -5,7 +5,7 @@\n", "\n", " else:\n", " return 'Isosceles'\n", " elif b == c:\n", "- return 'Isosceles'\n", "+ pass\n", " elif a == c:\n", " return 'Isosceles'\n", " else:\n", "--- triangle\n", "\n", "+++ triangle_4\n", "\n", "@@ -7,6 +7,6 @@\n", "\n", " elif b == c:\n", " return 'Isosceles'\n", " elif a == c:\n", "- return 'Isosceles'\n", "+ pass\n", " else:\n", " return 'Scalene'\n", "--- triangle\n", "\n", "+++ triangle_5\n", "\n", "@@ -9,4 +9,4 @@\n", "\n", " elif a == c:\n", " return 'Isosceles'\n", " else:\n", "- return 'Scalene'\n", "+ pass\n" ] } ], "source": [ "for mutant in MuFunctionAnalyzer(triangle):\n", " shape_src = mutant.pm.src\n", " for line in difflib.unified_diff(mutant.pm.src.split('\\n'),\n", " mutant.src().split('\\n'),\n", " fromfile=mutant.pm.name,\n", " tofile=mutant.name, n=3):\n", " print(line)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In this `diff` output, lines prefixed with `+` are added, whereas lines prefixed with `-` are deleted. We see that each of the five mutants indeed replaces a return statement with a `pass` statement." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We add the `diff()` method to `Mutant` so that it can be called directly." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.663624Z", "iopub.status.busy": "2024-01-18T17:15:44.663541Z", "iopub.status.idle": "2024-01-18T17:15:44.665575Z", "shell.execute_reply": "2024-01-18T17:15:44.665317Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class Mutant(Mutant):\n", " def diff(self):\n", " return '\\n'.join(difflib.unified_diff(self.pm.src.split('\\n'),\n", " self.src().split('\\n'),\n", " fromfile='original',\n", " tofile='mutant',\n", " n=3))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Evaluating Mutations\n", "\n", "We are now ready to implement the actual evaluation. We define our mutant as a _context manager_ that verifies whether all assertions given succeed. The idea is that we can write code such as\n", "\n", "```python\n", "for mutant in MuFunctionAnalyzer(function):\n", " with mutant:\n", " assert function(x) == y\n", "```\n", "\n", "and while `mutant` is active (i.e., the code block under `with:`), the original function is replaced by the mutated function." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `__enter__()` function is called when the `with` block is entered. It creates the mutant as a Python function and places it in the global namespace, such that the `assert` statement executes the mutated function rather than the original." ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.667301Z", "iopub.status.busy": "2024-01-18T17:15:44.667218Z", "iopub.status.idle": "2024-01-18T17:15:44.669151Z", "shell.execute_reply": "2024-01-18T17:15:44.668902Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class Mutant(Mutant):\n", " def __enter__(self):\n", " if self.log:\n", " print('->\\t%s' % self.name)\n", " c = compile(self.src(), '', 'exec')\n", " eval(c, globals())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `__exit__()` function checks whether an exception has occurred (i.e., the assertion failed, or some other error was raised); if so, it marks the mutation as `detected`. Finally, it restores the original function definition." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.670761Z", "iopub.status.busy": "2024-01-18T17:15:44.670612Z", "iopub.status.idle": "2024-01-18T17:15:44.673051Z", "shell.execute_reply": "2024-01-18T17:15:44.672789Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Mutant(Mutant):\n", " def __exit__(self, exc_type, exc_value, traceback):\n", " if self.log:\n", " print('<-\\t%s' % self.name)\n", " if exc_type is not None:\n", " self.detected = True\n", " if self.log:\n", " print(\"Detected %s\" % self.name, exc_type, exc_value)\n", " globals()[self.pm.name] = self.pm.fn\n", " if self.log:\n", " print()\n", " return True" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `finish()` method simply invokes the method on the mutant, checks if the mutant was discovered, and returns the result." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.675089Z", "iopub.status.busy": "2024-01-18T17:15:44.674841Z", "iopub.status.idle": "2024-01-18T17:15:44.676549Z", "shell.execute_reply": "2024-01-18T17:15:44.676301Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from ExpectError import ExpectTimeout" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.678094Z", "iopub.status.busy": "2024-01-18T17:15:44.677981Z", "iopub.status.idle": "2024-01-18T17:15:44.679794Z", "shell.execute_reply": "2024-01-18T17:15:44.679531Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class MuFunctionAnalyzer(MuFunctionAnalyzer):\n", " def finish(self):\n", " self.un_detected = {\n", " mutant for mutant in self.mutants if not mutant.detected}" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The mutation score – the ratio of mutants detected by the test suite - is computed by `score()`. A score of 1.0 means that all mutants were discovered; a score of 0.1 means that only 10% of mutants were detected." ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.681595Z", "iopub.status.busy": "2024-01-18T17:15:44.681467Z", "iopub.status.idle": "2024-01-18T17:15:44.683391Z", "shell.execute_reply": "2024-01-18T17:15:44.683120Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class MuFunctionAnalyzer(MuFunctionAnalyzer):\n", " def score(self):\n", " return (self.nmutations - len(self.un_detected)) / self.nmutations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is how we use our framework." ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.684932Z", "iopub.status.busy": "2024-01-18T17:15:44.684823Z", "iopub.status.idle": "2024-01-18T17:15:44.686324Z", "shell.execute_reply": "2024-01-18T17:15:44.686092Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import sys" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.687731Z", "iopub.status.busy": "2024-01-18T17:15:44.687644Z", "iopub.status.idle": "2024-01-18T17:15:44.692241Z", "shell.execute_reply": "2024-01-18T17:15:44.691936Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "->\ttriangle_1\n", "<-\ttriangle_1\n", "Detected triangle_1 Equal Check1\n", "\n", "->\ttriangle_2\n", "<-\ttriangle_2\n", "\n", "->\ttriangle_3\n", "<-\ttriangle_3\n", "\n", "->\ttriangle_4\n", "<-\ttriangle_4\n", "\n", "->\ttriangle_5\n", "<-\ttriangle_5\n", "\n" ] }, { "data": { "text/plain": [ "0.2" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuFunctionAnalyzer(triangle, log=True):\n", " with mutant:\n", " assert triangle(1, 1, 1) == 'Equilateral', \"Equal Check1\"\n", " assert triangle(1, 0, 1) != 'Equilateral', \"Equal Check2\"\n", " assert triangle(1, 0, 2) != 'Equilateral', \"Equal Check3\"\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Only one out of five mutations resulted in a failing assertion. Hence, the `weak_oracle()` test suite gets a mutation score of 20%." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.693744Z", "iopub.status.busy": "2024-01-18T17:15:44.693661Z", "iopub.status.idle": "2024-01-18T17:15:44.697376Z", "shell.execute_reply": "2024-01-18T17:15:44.697117Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.2" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuFunctionAnalyzer(triangle):\n", " with mutant:\n", " weak_oracle(triangle)\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Since we are modifying the global namespace, we do not have to refer to the function directly within the for loop of mutant." ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.698872Z", "iopub.status.busy": "2024-01-18T17:15:44.698790Z", "iopub.status.idle": "2024-01-18T17:15:44.700405Z", "shell.execute_reply": "2024-01-18T17:15:44.700165Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def oracle():\n", " strong_oracle(triangle)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.701874Z", "iopub.status.busy": "2024-01-18T17:15:44.701779Z", "iopub.status.idle": "2024-01-18T17:15:44.705983Z", "shell.execute_reply": "2024-01-18T17:15:44.705549Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "->\ttriangle_1\n", "<-\ttriangle_1\n", "Detected triangle_1 \n", "\n", "->\ttriangle_2\n", "<-\ttriangle_2\n", "Detected triangle_2 \n", "\n", "->\ttriangle_3\n", "<-\ttriangle_3\n", "Detected triangle_3 \n", "\n", "->\ttriangle_4\n", "<-\ttriangle_4\n", "Detected triangle_4 \n", "\n", "->\ttriangle_5\n", "<-\ttriangle_5\n", "Detected triangle_5 \n", "\n" ] }, { "data": { "text/plain": [ "1.0" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuFunctionAnalyzer(triangle, log=True):\n", " with mutant:\n", " oracle()\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "That is, we were able to achieve `100%` mutation score with the `strong_oracle()` test suite.\n", "\n", "Here is another example. `gcd()` computes the greatest common divisor of two numbers." ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.707800Z", "iopub.status.busy": "2024-01-18T17:15:44.707692Z", "iopub.status.idle": "2024-01-18T17:15:44.709652Z", "shell.execute_reply": "2024-01-18T17:15:44.709350Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def gcd(a, b):\n", " if a < b:\n", " c = a\n", " a = b\n", " b = c\n", "\n", " while b != 0:\n", " c = a\n", " a = b\n", " b = c % b\n", "\n", " return a" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here's a test for it. How effective is it?" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.711410Z", "iopub.status.busy": "2024-01-18T17:15:44.711319Z", "iopub.status.idle": "2024-01-18T17:15:44.715279Z", "shell.execute_reply": "2024-01-18T17:15:44.715044Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "->\tgcd_1\n", "<-\tgcd_1\n", "Detected gcd_1 local variable 'c' referenced before assignment\n", "\n", "->\tgcd_2\n", "<-\tgcd_2\n", "Detected gcd_2 Mirror\n", "\n", "->\tgcd_3\n", "<-\tgcd_3\n", "\n", "->\tgcd_4\n", "<-\tgcd_4\n", "\n", "->\tgcd_5\n", "<-\tgcd_5\n", "\n", "->\tgcd_6\n", "<-\tgcd_6\n", "\n", "->\tgcd_7\n", "<-\tgcd_7\n", "Detected gcd_7 Minimal\n", "\n" ] } ], "source": [ "for mutant in MuFunctionAnalyzer(gcd, log=True):\n", " with mutant:\n", " assert gcd(1, 0) == 1, \"Minimal\"\n", " assert gcd(0, 1) == 1, \"Mirror\"" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.716905Z", "iopub.status.busy": "2024-01-18T17:15:44.716722Z", "iopub.status.idle": "2024-01-18T17:15:44.718910Z", "shell.execute_reply": "2024-01-18T17:15:44.718696Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.42857142857142855" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We see that our `TestGCD` test suite is able to obtain a mutation score of 42%." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Mutator for Modules and Test Suites\n", "\n", "Consider the `triangle()` program we discussed previously. As we discussed, a simple way to produce valid mutated version of this program is to replace some of its statements by `pass`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For demonstration purposes, we would like to proceed as though the program was in a different file. We can do that by producing a `Module` object in Python, and attaching the function to it." ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.720624Z", "iopub.status.busy": "2024-01-18T17:15:44.720445Z", "iopub.status.idle": "2024-01-18T17:15:44.722156Z", "shell.execute_reply": "2024-01-18T17:15:44.721916Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import types" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.723778Z", "iopub.status.busy": "2024-01-18T17:15:44.723632Z", "iopub.status.idle": "2024-01-18T17:15:44.725648Z", "shell.execute_reply": "2024-01-18T17:15:44.725326Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def import_code(code, name):\n", " module = types.ModuleType(name)\n", " exec(code, module.__dict__)\n", " return module" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We attach the `triangle()` function to the `shape` module." ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.727325Z", "iopub.status.busy": "2024-01-18T17:15:44.727124Z", "iopub.status.idle": "2024-01-18T17:15:44.729176Z", "shell.execute_reply": "2024-01-18T17:15:44.728920Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "shape = import_code(shape_src, 'shape')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can now invoke triangle through the module `shape`." ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.730705Z", "iopub.status.busy": "2024-01-18T17:15:44.730596Z", "iopub.status.idle": "2024-01-18T17:15:44.732817Z", "shell.execute_reply": "2024-01-18T17:15:44.732486Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Equilateral'" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shape.triangle(1, 1, 1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We want to test the `triangle()` function. For that, we define a `StrongShapeTest` class as below." ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "button": false, "execution": { "iopub.execute_input": "2024-01-18T17:15:44.734323Z", "iopub.status.busy": "2024-01-18T17:15:44.734194Z", "iopub.status.idle": "2024-01-18T17:15:44.740618Z", "shell.execute_reply": "2024-01-18T17:15:44.740280Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import unittest" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "button": false, "execution": { "iopub.execute_input": "2024-01-18T17:15:44.742363Z", "iopub.status.busy": "2024-01-18T17:15:44.742240Z", "iopub.status.idle": "2024-01-18T17:15:44.744858Z", "shell.execute_reply": "2024-01-18T17:15:44.744556Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "class StrongShapeTest(unittest.TestCase):\n", "\n", " def test_equilateral(self):\n", " assert shape.triangle(1, 1, 1) == 'Equilateral'\n", "\n", " def test_isosceles(self):\n", " assert shape.triangle(1, 2, 1) == 'Isosceles'\n", " assert shape.triangle(2, 2, 1) == 'Isosceles'\n", " assert shape.triangle(1, 2, 2) == 'Isosceles'\n", "\n", " def test_scalene(self):\n", " assert shape.triangle(1, 2, 3) == 'Scalene'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We define a helper function `suite()` that looks through a given class and identifies the test functions." ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "button": false, "execution": { "iopub.execute_input": "2024-01-18T17:15:44.746395Z", "iopub.status.busy": "2024-01-18T17:15:44.746287Z", "iopub.status.idle": "2024-01-18T17:15:44.748229Z", "shell.execute_reply": "2024-01-18T17:15:44.747933Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "def suite(test_class):\n", " suite = unittest.TestSuite()\n", " for f in test_class.__dict__:\n", " if f.startswith('test_'):\n", " suite.addTest(test_class(f))\n", " return suite" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The tests in `TestTriangle` class can be invoked with different test runners. The simplest is to directly invoke the `run()` method of the `TestCase`." ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.749658Z", "iopub.status.busy": "2024-01-18T17:15:44.749551Z", "iopub.status.idle": "2024-01-18T17:15:44.752134Z", "shell.execute_reply": "2024-01-18T17:15:44.751756Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "suite(StrongShapeTest).run(unittest.TestResult())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `TextTestRunner` class provides ability to control the verbosity of execution. It also allows one to return on the *first* failure." ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "button": false, "execution": { "iopub.execute_input": "2024-01-18T17:15:44.754126Z", "iopub.status.busy": "2024-01-18T17:15:44.754004Z", "iopub.status.idle": "2024-01-18T17:15:44.756673Z", "shell.execute_reply": "2024-01-18T17:15:44.756408Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "----------------------------------------------------------------------\n", "Ran 3 tests in 0.000s\n", "\n", "OK\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "runner = unittest.TextTestRunner(verbosity=0, failfast=True)\n", "runner.run(suite(StrongShapeTest))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Running the program under coverage is accomplished as follows:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.758339Z", "iopub.status.busy": "2024-01-18T17:15:44.758210Z", "iopub.status.idle": "2024-01-18T17:15:44.760802Z", "shell.execute_reply": "2024-01-18T17:15:44.760408Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with VisualCoverage() as cov:\n", " suite(StrongShapeTest).run(unittest.TestResult())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The coverage obtained is given by:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.762679Z", "iopub.status.busy": "2024-01-18T17:15:44.762545Z", "iopub.status.idle": "2024-01-18T17:15:44.764664Z", "shell.execute_reply": "2024-01-18T17:15:44.764431Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1: def triangle(a, b, c):\n", "# 2: if a == b:\n", "# 3: if b == c:\n", "# 4: return 'Equilateral'\n", " 5: else:\n", "# 6: return 'Isosceles'\n", "# 7: else:\n", "# 8: if b == c:\n", "# 9: return \"Isosceles\"\n", "# 10: else:\n", " 11: if a == c:\n", "# 12: return \"Isosceles\"\n", " 13: else:\n", " 14: return \"Scalene\"\n", " 15: \n" ] } ], "source": [ "cov.show_coverage(triangle)" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.766197Z", "iopub.status.busy": "2024-01-18T17:15:44.766075Z", "iopub.status.idle": "2024-01-18T17:15:44.768853Z", "shell.execute_reply": "2024-01-18T17:15:44.768593Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class WeakShapeTest(unittest.TestCase):\n", " def test_equilateral(self):\n", " assert shape.triangle(1, 1, 1) == 'Equilateral'\n", "\n", " def test_isosceles(self):\n", " assert shape.triangle(1, 2, 1) != 'Equilateral'\n", " assert shape.triangle(2, 2, 1) != 'Equilateral'\n", " assert shape.triangle(1, 2, 2) != 'Equilateral'\n", "\n", " def test_scalene(self):\n", " assert shape.triangle(1, 2, 3) != 'Equilateral'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "How much coverage does it obtain?" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.770404Z", "iopub.status.busy": "2024-01-18T17:15:44.770270Z", "iopub.status.idle": "2024-01-18T17:15:44.772687Z", "shell.execute_reply": "2024-01-18T17:15:44.772315Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "with VisualCoverage() as cov:\n", " suite(WeakShapeTest).run(unittest.TestResult())" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.774427Z", "iopub.status.busy": "2024-01-18T17:15:44.774249Z", "iopub.status.idle": "2024-01-18T17:15:44.776780Z", "shell.execute_reply": "2024-01-18T17:15:44.776392Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1: def triangle(a, b, c):\n", "# 2: if a == b:\n", "# 3: if b == c:\n", "# 4: return 'Equilateral'\n", " 5: else:\n", "# 6: return 'Isosceles'\n", "# 7: else:\n", "# 8: if b == c:\n", "# 9: return \"Isosceles\"\n", "# 10: else:\n", " 11: if a == c:\n", "# 12: return \"Isosceles\"\n", " 13: else:\n", " 14: return \"Scalene\"\n", " 15: \n" ] } ], "source": [ "cov.show_coverage(triangle)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `MuProgramAnalyzer` is the main class responsible for mutation analysis of the test suite. It accepts the name of the module to be tested, and its source code. It normalizes the source code given by parsing and unparsing it once. This is required to ensure that later `diff`s between the original and mutant are not derailed by differences in whitespace, comments, etc." ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.778484Z", "iopub.status.busy": "2024-01-18T17:15:44.778333Z", "iopub.status.idle": "2024-01-18T17:15:44.781021Z", "shell.execute_reply": "2024-01-18T17:15:44.780742Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class MuProgramAnalyzer(MuFunctionAnalyzer):\n", " def __init__(self, name, src):\n", " self.name = name\n", " self.ast = ast.parse(src)\n", " self.src = ast.unparse(self.ast)\n", " self.changes = []\n", " self.mutator = self.mutator_object()\n", " self.nmutations = self.get_mutation_count()\n", " self.un_detected = set()\n", "\n", " def mutator_object(self, locations=None):\n", " return AdvStmtDeletionMutator(self, locations)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We now extend the `Mutator` class." ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.782604Z", "iopub.status.busy": "2024-01-18T17:15:44.782482Z", "iopub.status.idle": "2024-01-18T17:15:44.784705Z", "shell.execute_reply": "2024-01-18T17:15:44.784435Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class AdvMutator(Mutator):\n", " def __init__(self, analyzer, mutate_locations=None):\n", " self.count = 0\n", " self.mutate_locations = [] if mutate_locations is None else mutate_locations\n", " self.pm = analyzer\n", "\n", " def mutable_visit(self, node):\n", " self.count += 1 # statements start at line no 1\n", " return self.mutation_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `AdvStmtDeletionMutator` simply hooks into all the statement processing visitors. It performs mutation by replacing the given statement with `pass`." ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.786284Z", "iopub.status.busy": "2024-01-18T17:15:44.786171Z", "iopub.status.idle": "2024-01-18T17:15:44.788963Z", "shell.execute_reply": "2024-01-18T17:15:44.788650Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class AdvStmtDeletionMutator(AdvMutator, StmtDeletionMutator):\n", " def __init__(self, analyzer, mutate_locations=None):\n", " AdvMutator.__init__(self, analyzer, mutate_locations)\n", "\n", " def mutation_visit(self, node):\n", " index = 0 # there is only one way to delete a statement -- replace it by pass\n", " if not self.mutate_locations: # counting pass\n", " self.pm.changes.append((self.count, index))\n", " return self.generic_visit(node)\n", " else:\n", " # get matching changes for this pass\n", " mutating_lines = set((count, idx)\n", " for (count, idx) in self.mutate_locations)\n", " if (self.count, index) in mutating_lines:\n", " return ast.Pass()\n", " else:\n", " return self.generic_visit(node)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Aagin, we can obtain the number of mutations produced for `triangle()` as follows." ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.790867Z", "iopub.status.busy": "2024-01-18T17:15:44.790746Z", "iopub.status.idle": "2024-01-18T17:15:44.793203Z", "shell.execute_reply": "2024-01-18T17:15:44.792923Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "MuProgramAnalyzer('shape', shape_src).nmutations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We need a way to obtain the individual mutants. For this, we convert our `MuProgramAnalyzer` to an *iterable*." ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.794683Z", "iopub.status.busy": "2024-01-18T17:15:44.794578Z", "iopub.status.idle": "2024-01-18T17:15:44.796349Z", "shell.execute_reply": "2024-01-18T17:15:44.796095Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class MuProgramAnalyzer(MuProgramAnalyzer):\n", " def __iter__(self):\n", " return AdvPMIterator(self)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `AdvPMIterator`, which is the *iterator* class for `MuProgramAnalyzer` is defined as follows." ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.797945Z", "iopub.status.busy": "2024-01-18T17:15:44.797834Z", "iopub.status.idle": "2024-01-18T17:15:44.799578Z", "shell.execute_reply": "2024-01-18T17:15:44.799335Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class AdvPMIterator:\n", " def __init__(self, pm):\n", " self.pm = pm\n", " self.idx = 0" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `next()` method returns the corresponding `Mutant`" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.801073Z", "iopub.status.busy": "2024-01-18T17:15:44.800943Z", "iopub.status.idle": "2024-01-18T17:15:44.803174Z", "shell.execute_reply": "2024-01-18T17:15:44.802897Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class AdvPMIterator(AdvPMIterator):\n", " def __next__(self):\n", " i = self.idx\n", " if i >= len(self.pm.changes):\n", " raise StopIteration()\n", " self.idx += 1\n", " # there could be multiple changes in one mutant\n", " return AdvMutant(self.pm, [self.pm.changes[i]])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `Mutant` class contains logic for generating mutants when given the locations to mutate." ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.804619Z", "iopub.status.busy": "2024-01-18T17:15:44.804516Z", "iopub.status.idle": "2024-01-18T17:15:44.806722Z", "shell.execute_reply": "2024-01-18T17:15:44.806315Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class AdvMutant(Mutant):\n", " def __init__(self, pm, locations):\n", " self.pm = pm\n", " self.i = locations\n", " self.name = \"%s_%s\" % (self.pm.name,\n", " '_'.join([str(i) for i in self.i]))\n", " self._src = None" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is how it can be used:" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.808540Z", "iopub.status.busy": "2024-01-18T17:15:44.808423Z", "iopub.status.idle": "2024-01-18T17:15:44.810306Z", "shell.execute_reply": "2024-01-18T17:15:44.810057Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "shape_src = inspect.getsource(triangle)" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.811827Z", "iopub.status.busy": "2024-01-18T17:15:44.811725Z", "iopub.status.idle": "2024-01-18T17:15:44.813773Z", "shell.execute_reply": "2024-01-18T17:15:44.813350Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "shape_(1, 0)\n", "shape_(2, 0)\n", "shape_(3, 0)\n", "shape_(4, 0)\n", "shape_(5, 0)\n" ] } ], "source": [ "for m in MuProgramAnalyzer('shape', shape_src):\n", " print(m.name)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `generate_mutant()` simply calls the `mutator()` method, and passes the mutator a copy of the AST. " ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.815468Z", "iopub.status.busy": "2024-01-18T17:15:44.815357Z", "iopub.status.idle": "2024-01-18T17:15:44.817371Z", "shell.execute_reply": "2024-01-18T17:15:44.817069Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class AdvMutant(AdvMutant):\n", " def generate_mutant(self, locations):\n", " mutant_ast = self.pm.mutator_object(\n", " locations).visit(ast.parse(self.pm.src)) # copy\n", " return ast.unparse(mutant_ast)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `src()` method returns the mutated source." ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.818992Z", "iopub.status.busy": "2024-01-18T17:15:44.818870Z", "iopub.status.idle": "2024-01-18T17:15:44.821112Z", "shell.execute_reply": "2024-01-18T17:15:44.820818Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class AdvMutant(AdvMutant):\n", " def src(self):\n", " if self._src is None:\n", " self._src = self.generate_mutant(self.i)\n", " return self._src" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Again, we visualize mutants as difference from the original:" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.822989Z", "iopub.status.busy": "2024-01-18T17:15:44.822852Z", "iopub.status.idle": "2024-01-18T17:15:44.824713Z", "shell.execute_reply": "2024-01-18T17:15:44.824347Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import difflib" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We add the `diff()` method to `Mutant` so that it can be called directly." ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.826838Z", "iopub.status.busy": "2024-01-18T17:15:44.826690Z", "iopub.status.idle": "2024-01-18T17:15:44.828859Z", "shell.execute_reply": "2024-01-18T17:15:44.828557Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class AdvMutant(AdvMutant):\n", " def diff(self):\n", " return '\\n'.join(difflib.unified_diff(self.pm.src.split('\\n'),\n", " self.src().split('\\n'),\n", " fromfile='original',\n", " tofile='mutant',\n", " n=3))" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.830472Z", "iopub.status.busy": "2024-01-18T17:15:44.830345Z", "iopub.status.idle": "2024-01-18T17:15:44.832586Z", "shell.execute_reply": "2024-01-18T17:15:44.832298Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "shape_(1, 0)\n", "--- original\n", "\n", "+++ mutant\n", "\n", "@@ -1,7 +1,7 @@\n", "\n", " def triangle(a, b, c):\n", " if a == b:\n", " if b == c:\n", "- return 'Equilateral'\n", "+ pass\n", " else:\n", " return 'Isosceles'\n", " elif b == c:\n" ] } ], "source": [ "for mutant in MuProgramAnalyzer('shape', shape_src):\n", " print(mutant.name)\n", " print(mutant.diff())\n", " break" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We are now ready to implement the actual evaluation. For doing that, we require the ability to accept the module where the test suite is defined, and invoke the test method on it. The method `getitem` accepts the test module, fixes the import entries on the test module to correctly point to the mutant module, and passes it to the test runner `MutantTestRunner`." ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.834048Z", "iopub.status.busy": "2024-01-18T17:15:44.833940Z", "iopub.status.idle": "2024-01-18T17:15:44.835970Z", "shell.execute_reply": "2024-01-18T17:15:44.835679Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class AdvMutant(AdvMutant):\n", " def __getitem__(self, test_module):\n", " test_module.__dict__[\n", " self.pm.name] = import_code(\n", " self.src(), self.pm.name)\n", " return MutantTestRunner(self, test_module)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The `MutantTestRunner` simply calls all `test_` methods on the test module, checks if the mutant was discovered, and returns the result." ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.837444Z", "iopub.status.busy": "2024-01-18T17:15:44.837321Z", "iopub.status.idle": "2024-01-18T17:15:44.838999Z", "shell.execute_reply": "2024-01-18T17:15:44.838706Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from ExpectError import ExpectTimeout" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.840855Z", "iopub.status.busy": "2024-01-18T17:15:44.840697Z", "iopub.status.idle": "2024-01-18T17:15:44.843905Z", "shell.execute_reply": "2024-01-18T17:15:44.843629Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class MutantTestRunner:\n", " def __init__(self, mutant, test_module):\n", " self.mutant = mutant\n", " self.tm = test_module\n", "\n", " def runTest(self, tc):\n", " suite = unittest.TestSuite()\n", " test_class = self.tm.__dict__[tc]\n", " for f in test_class.__dict__:\n", " if f.startswith('test_'):\n", " suite.addTest(test_class(f))\n", " runner = unittest.TextTestRunner(verbosity=0, failfast=True)\n", " try:\n", " with ExpectTimeout(1):\n", " res = runner.run(suite)\n", " if res.wasSuccessful():\n", " self.mutant.pm.un_detected.add(self)\n", " return res\n", " except SyntaxError:\n", " print('Syntax Error (%s)' % self.mutant.name)\n", " return None\n", " raise Exception('Unhandled exception during test execution')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The mutation score is computed by `score()`." ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.845542Z", "iopub.status.busy": "2024-01-18T17:15:44.845454Z", "iopub.status.idle": "2024-01-18T17:15:44.847342Z", "shell.execute_reply": "2024-01-18T17:15:44.847067Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class MuProgramAnalyzer(MuProgramAnalyzer):\n", " def score(self):\n", " return (self.nmutations - len(self.un_detected)) / self.nmutations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Here is how we use our framework." ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.848847Z", "iopub.status.busy": "2024-01-18T17:15:44.848760Z", "iopub.status.idle": "2024-01-18T17:15:44.850356Z", "shell.execute_reply": "2024-01-18T17:15:44.850081Z" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import sys" ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.851880Z", "iopub.status.busy": "2024-01-18T17:15:44.851799Z", "iopub.status.idle": "2024-01-18T17:15:44.857674Z", "shell.execute_reply": "2024-01-18T17:15:44.857332Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "======================================================================\n", "FAIL: test_equilateral (__main__.WeakShapeTest)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/511514204.py\", line 3, in test_equilateral\n", " assert shape.triangle(1, 1, 1) == 'Equilateral'\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "----------------------------------------------------------------------\n", "Ran 3 tests in 0.000s\n", "\n", "OK\n", "----------------------------------------------------------------------\n", "Ran 3 tests in 0.000s\n", "\n", "OK\n", "----------------------------------------------------------------------\n", "Ran 3 tests in 0.000s\n", "\n", "OK\n", "----------------------------------------------------------------------\n", "Ran 3 tests in 0.000s\n", "\n", "OK\n" ] }, { "data": { "text/plain": [ "0.2" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_module = sys.modules[__name__]\n", "for mutant in MuProgramAnalyzer('shape', shape_src):\n", " mutant[test_module].runTest('WeakShapeTest')\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `WeakShape` test suite resulted in only `20%` mutation score." ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.859360Z", "iopub.status.busy": "2024-01-18T17:15:44.859265Z", "iopub.status.idle": "2024-01-18T17:15:44.866224Z", "shell.execute_reply": "2024-01-18T17:15:44.865877Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "======================================================================\n", "FAIL: test_equilateral (__main__.StrongShapeTest)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2670057129.py\", line 4, in test_equilateral\n", " assert shape.triangle(1, 1, 1) == 'Equilateral'\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_isosceles (__main__.StrongShapeTest)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2670057129.py\", line 8, in test_isosceles\n", " assert shape.triangle(2, 2, 1) == 'Isosceles'\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_isosceles (__main__.StrongShapeTest)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2670057129.py\", line 9, in test_isosceles\n", " assert shape.triangle(1, 2, 2) == 'Isosceles'\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_isosceles (__main__.StrongShapeTest)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2670057129.py\", line 7, in test_isosceles\n", " assert shape.triangle(1, 2, 1) == 'Isosceles'\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_scalene (__main__.StrongShapeTest)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2670057129.py\", line 12, in test_scalene\n", " assert shape.triangle(1, 2, 3) == 'Scalene'\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 3 tests in 0.000s\n", "\n", "FAILED (failures=1)\n" ] }, { "data": { "text/plain": [ "1.0" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuProgramAnalyzer('shape', shape_src):\n", " mutant[test_module].runTest('StrongShapeTest')\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "On the other hand, we were able to achieve `100%` mutation score with `StrongShapeTest` test suite.\n", "\n", "Here is another example, `gcd()`." ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.868040Z", "iopub.status.busy": "2024-01-18T17:15:44.867891Z", "iopub.status.idle": "2024-01-18T17:15:44.869929Z", "shell.execute_reply": "2024-01-18T17:15:44.869641Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "gcd_src = inspect.getsource(gcd)" ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.871900Z", "iopub.status.busy": "2024-01-18T17:15:44.871764Z", "iopub.status.idle": "2024-01-18T17:15:44.873796Z", "shell.execute_reply": "2024-01-18T17:15:44.873538Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class TestGCD(unittest.TestCase):\n", " def test_simple(self):\n", " assert cfg.gcd(1, 0) == 1\n", "\n", " def test_mirror(self):\n", " assert cfg.gcd(0, 1) == 1" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.875491Z", "iopub.status.busy": "2024-01-18T17:15:44.875363Z", "iopub.status.idle": "2024-01-18T17:15:44.882822Z", "shell.execute_reply": "2024-01-18T17:15:44.882564Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "======================================================================\n", "ERROR: test_mirror (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 6, in test_mirror\n", " assert cfg.gcd(0, 1) == 1\n", " File \"\", line 5, in gcd\n", "UnboundLocalError: local variable 'c' referenced before assignment\n", "\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "FAILED (errors=1)\n", "======================================================================\n", "FAIL: test_mirror (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 6, in test_mirror\n", " assert cfg.gcd(0, 1) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "FAILED (failures=1)\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "OK\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "OK\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "OK\n", "----------------------------------------------------------------------\n", "Ran 2 tests in 0.000s\n", "\n", "OK\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n" ] }, { "data": { "text/plain": [ "0.42857142857142855" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuProgramAnalyzer('cfg', gcd_src):\n", " mutant[test_module].runTest('TestGCD')\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We see that our `TestGCD` test suite is able to obtain `42%` mutation score." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## The Problem of Equivalent Mutants\n", "\n", "One of the problems with mutation analysis is that not all mutants generated need to be faulty. For example, consider the `new_gcd()` program below." ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.884917Z", "iopub.status.busy": "2024-01-18T17:15:44.884701Z", "iopub.status.idle": "2024-01-18T17:15:44.886700Z", "shell.execute_reply": "2024-01-18T17:15:44.886442Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def new_gcd(a, b):\n", " if a < b:\n", " a, b = b, a\n", " else:\n", " a, b = a, b\n", "\n", " while b != 0:\n", " a, b = b, a % b\n", " return a" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This program can be mutated to produce the following mutant." ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.888425Z", "iopub.status.busy": "2024-01-18T17:15:44.888275Z", "iopub.status.idle": "2024-01-18T17:15:44.890316Z", "shell.execute_reply": "2024-01-18T17:15:44.890081Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def mutated_gcd(a, b):\n", " if a < b:\n", " a, b = b, a\n", " else:\n", " pass\n", "\n", " while b != 0:\n", " a, b = b, a % b\n", " return a" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.892356Z", "iopub.status.busy": "2024-01-18T17:15:44.892103Z", "iopub.status.idle": "2024-01-18T17:15:44.895088Z", "shell.execute_reply": "2024-01-18T17:15:44.894844Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 def new_gcd(a, b):\n", " if a < b:\n", " pass\n", " else:\n", " (a, b) = (a, b)\n", " while b != 0:\n", " (a, b) = (b, a % b)\n", " return a\n", "1 def new_gcd(a, b):\n", " if a < b:\n", " (a, b) = (b, a)\n", " else:\n", " pass\n", " while b != 0:\n", " (a, b) = (b, a % b)\n", " return a\n", "2 def new_gcd(a, b):\n", " if a < b:\n", " (a, b) = (b, a)\n", " else:\n", " (a, b) = (a, b)\n", " while b != 0:\n", " pass\n", " return a\n", "3 def new_gcd(a, b):\n", " if a < b:\n", " (a, b) = (b, a)\n", " else:\n", " (a, b) = (a, b)\n", " while b != 0:\n", " (a, b) = (b, a % b)\n", " pass\n" ] } ], "source": [ "for i, mutant in enumerate(MuFunctionAnalyzer(new_gcd)):\n", " print(i,mutant.src())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "While other mutants are faulty compared to the original, `mutant 1` is indistinguishable from the original in terms of its semantics because it removes an inconsequential assignment. This means that `mutant 1` does not represent a fault. These kinds of mutants that do not represent a fault are called *Equivalent mutants*. The problem with equivalent mutants is that it becomes very difficult to judge the mutation score in the presence of equivalent mutants. For example, with a mutation score of 70%, anywhere from 0 to 30% of the mutants may be equivalent. Hence, without knowing the actual number of equivalent mutants, it is impossible to judge how much the tests can be improved. We discuss two methods to deal with equivalent mutants." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Statistical Estimation of Number of Equivalent Mutants\n", "\n", "If the number of mutants that are alive is small enough, one may rely on simply inspecting them manually. However, if the number of mutants are sufficiently large (say > 1000), one may choose a smaller number of mutants from the alive mutants randomly and manually evaluate them to see whether they represent faults. The sample size determination is governed by the following formula for a binomial distribution (approximated by a normal distribution):\n", "\n", "$$\n", "n \\ge \\hat{p}(1-\\hat{p})\\bigg(\\frac{Z_{\\frac{\\alpha}{2}}}{\\Delta}\\bigg)^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "where $n$ is the number of samples, $p$ is the parameter for the probability distribution, $\\alpha$ is the accuracy desired, $\\Delta$ the precision. For an accuracy of $95\\%$, $Z_{0.95}=1.96$. we have the following values (the maximum value of $\\hat{p}(1-\\hat{p}) = 0.25$) and $Z$ is the critical value for normal distribution:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$\n", "n \\ge 0.25\\bigg(\\frac{1.96}{\\Delta}\\bigg)^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For $\\Delta = 0.01$, (that is for a maximum error of 1%), we need to evaluate $9604$ mutants for equivalence. If one relaxes the constraint to $\\Delta = 0.1$ (that is an error of $10\\%$), then one needs to evaluate only $96$ mutants for equivalence." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Statistical Estimation of the Number of Immortals by Chao's Estimator\n", "\n", "While the idea of sampling only a limited number of mutants is appealing, it is still limited in that manual analysis is necessary. If computing power is cheap, another way to estimate the number of true mutants (and hence the number of equivalent mutants) is by means of Chao's estimator. As we will see in the chapter on [when to stop fuzzing](WhenToStopFuzzing.ipynb), the formula is given by:\n", "\n", "$$\n", "\\hat S_\\text{Chao1} = \\begin{cases}\n", "S(n) + \\frac{f_1^2}{2f_2} & \\text{if $f_2>0$}\\\\\n", "S(n) + \\frac{f_1(f_1-1)}{2} & \\text{otherwise}\n", "\\end{cases}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The basic idea is to compute the result of the complete test matrix $T \\times M$ of each test against each mutant. The variable $f_1$ represents the number of mutants that were killed exactly once, and the variable $f_2$ represents the number of variables that were killed exactly twice. $S(n)$ is the total number of mutants killed. Here, $\\hat{S}_{Chao1}$ provides the estimate of the true number of mutants. If $M$ is the total mutants generated, then $M - \\hat{S}_{Chao1}$ represents the number of **immortal** mutants. Note that these **immortal** mutants are somewhat different from the traditional equivalent mutants in that the **mortality** depends on the oracle used to distinguish variant behavior. That is, if one uses a fuzzer that relies on errors thrown to detect killing, it will not detect mutants that produce different output but does not throw an error. Hence, the *Chao1* estimate will essentially be the asymptote value of mutants the fuzzer can detect if it is given an infinite amount of time. The **immortal** mutant estimate will approach true **equivalent** mutant estimate when the oracle used is sufficiently strong.\n", "For more details see the chapter on [when to stop fuzzing](WhenToStopFuzzing.ipynb). A comprehensive guide to species discovery in testing is the paper by Boehme \\cite{boehme2018species}." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Synopsis" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This chapter introduces two methods of running *mutation analysis* on subject programs. The first class `MuFunctionAnalyzer` targets individual functions. Given a function `gcd` and two test cases evaluate, one can run mutation analysis on the test cases as follows:" ] }, { "cell_type": "code", "execution_count": 99, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.897030Z", "iopub.status.busy": "2024-01-18T17:15:44.896905Z", "iopub.status.idle": "2024-01-18T17:15:44.901488Z", "shell.execute_reply": "2024-01-18T17:15:44.901241Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "->\tgcd_1\n", "<-\tgcd_1\n", "Detected gcd_1 local variable 'c' referenced before assignment\n", "\n", "->\tgcd_2\n", "<-\tgcd_2\n", "Detected gcd_2 Mirror\n", "\n", "->\tgcd_3\n", "<-\tgcd_3\n", "\n", "->\tgcd_4\n", "<-\tgcd_4\n", "\n", "->\tgcd_5\n", "<-\tgcd_5\n", "\n", "->\tgcd_6\n", "<-\tgcd_6\n", "\n", "->\tgcd_7\n", "<-\tgcd_7\n", "Detected gcd_7 Minimal\n", "\n" ] }, { "data": { "text/plain": [ "0.42857142857142855" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuFunctionAnalyzer(gcd, log=True):\n", " with mutant:\n", " assert gcd(1, 0) == 1, \"Minimal\"\n", " assert gcd(0, 1) == 1, \"Mirror\"\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The second class `MuProgramAnalyzer` targets standalone programs with test suites. Given a program `gcd` whose source code is provided in `gcd_src` and the test suite is provided by `TestGCD`, one can evaluate the mutation score of `TestGCD` as follows:" ] }, { "cell_type": "code", "execution_count": 100, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.903307Z", "iopub.status.busy": "2024-01-18T17:15:44.903147Z", "iopub.status.idle": "2024-01-18T17:15:44.905077Z", "shell.execute_reply": "2024-01-18T17:15:44.904837Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "class TestGCD(unittest.TestCase):\n", " def test_simple(self):\n", " assert cfg.gcd(1, 0) == 1\n", "\n", " def test_mirror(self):\n", " assert cfg.gcd(0, 1) == 1" ] }, { "cell_type": "code", "execution_count": 101, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.906708Z", "iopub.status.busy": "2024-01-18T17:15:44.906558Z", "iopub.status.idle": "2024-01-18T17:15:44.916098Z", "shell.execute_reply": "2024-01-18T17:15:44.915759Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n", "======================================================================\n", "FAIL: test_simple (__main__.TestGCD)\n", "----------------------------------------------------------------------\n", "Traceback (most recent call last):\n", " File \"/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_75331/2565918356.py\", line 3, in test_simple\n", " assert cfg.gcd(1, 0) == 1\n", "AssertionError\n", "\n", "----------------------------------------------------------------------\n", "Ran 1 test in 0.000s\n", "\n", "FAILED (failures=1)\n" ] }, { "data": { "text/plain": [ "1.0" ] }, "execution_count": 101, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for mutant in MuProgramAnalyzer('gcd', gcd_src):\n", " mutant[test_module].runTest('TestGCD')\n", "mutant.pm.score()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The mutation score thus obtained is a better indicator of the quality of a given test suite than pure coverage." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Lessons Learned\n", "\n", "* We have learned why structural coverage is insufficient to evaluate the quality of test suites.\n", "* We have learned how to use Mutation Analysis for evaluating test suite quality.\n", "* We have learned the limitations of Mutation Analysis -- Equivalent and Redundant mutants, and how to estimate them." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Next Steps\n", "\n", "* While naive fuzzing generates poor quality oracles, techniques such as [symbolic](SymbolicFuzzer.ipynb) and [concolic](ConcolicFuzzer.ipynb) can enhance the quality oracles used in fuzzing.\n", "* [Dynamic invariants](DynamicInvariants.ipynb) can also be of great help in improving the quality of oracles.\n", "* The chapter on [when to stop fuzzing](WhenToStopFuzzing.ipynb) provides a detailed overview of the Chao estimator." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Background\n", "\n", "The idea of Mutation Analysis was first introduced by Lipton et al. \\cite{lipton1971fault}. An excellent survey of mutation analysis research was published by Jia et al. \\cite{jia2011analysis}. The chapter on Mutation Analysis by Papadakis et al \\cite{papadakis2019mutation} is another excellent overview of the current trends in mutation analysis." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": true, "run_control": { "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Exercises" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Exercise 1: Arithmetic Expression Mutators\n", "\n", "Our simple statement deletion mutation is only one of the ways in which a program could be mutated. Another category of mutants is _expression mutation_ where arithmetic operators such as `{+,-,*,/}` etc. are replaced for one another. For example, given an expression such as\n", "```\n", "x = x + 1\n", "```\n", "One can mutate it to\n", "```\n", "x = x - 1\n", "```\n", "and\n", "```\n", "x = x * 1\n", "```\n", "and\n", "```\n", "x = x / 1\n", "```" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "First, we need to find out which node types we want to mutate. We get these via the ast functions and find that the node type is named BinOp" ] }, { "cell_type": "code", "execution_count": 102, "metadata": { "button": false, "execution": { "iopub.execute_input": "2024-01-18T17:15:44.918224Z", "iopub.status.busy": "2024-01-18T17:15:44.918115Z", "iopub.status.idle": "2024-01-18T17:15:44.920150Z", "shell.execute_reply": "2024-01-18T17:15:44.919903Z" }, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Module(\n", " body=[\n", " Expr(\n", " value=BinOp(\n", " left=BinOp(\n", " left=Constant(value=1),\n", " op=Add(),\n", " right=Constant(value=2)),\n", " op=Sub(),\n", " right=BinOp(\n", " left=BinOp(\n", " left=Constant(value=3),\n", " op=Mult(),\n", " right=Constant(value=4)),\n", " op=Div(),\n", " right=Constant(value=5))))],\n", " type_ignores=[])\n" ] } ], "source": [ "print(ast.dump(ast.parse(\"1 + 2 - 3 * 4 / 5\"), indent=4))" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution2": "hidden", "solution2_first": true }, "source": [ "To mutate the tree, you thus need to change the `op` attribute (which has one of the values `Add`, `Sub`, `Mult`, and `Div`)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "**Solution.** First, we need to find out which node types we want to mutate. We get these via the `ast` functions and find that the node type is named `BinOp`" ] }, { "cell_type": "code", "execution_count": 103, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.921926Z", "iopub.status.busy": "2024-01-18T17:15:44.921824Z", "iopub.status.idle": "2024-01-18T17:15:44.923814Z", "shell.execute_reply": "2024-01-18T17:15:44.923568Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Module(\n", " body=[\n", " Expr(\n", " value=BinOp(\n", " left=BinOp(\n", " left=Constant(value=1),\n", " op=Add(),\n", " right=Constant(value=2)),\n", " op=Sub(),\n", " right=BinOp(\n", " left=BinOp(\n", " left=Constant(value=3),\n", " op=Mult(),\n", " right=Constant(value=4)),\n", " op=Div(),\n", " right=Constant(value=5))))],\n", " type_ignores=[])\n" ] } ], "source": [ "print(ast.dump(ast.parse(\"1 + 2 - 3 * 4 / 5\"), indent=4))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden", "solution2_first": true }, "source": [ "To mutate the tree, we need to change the `op` attribute (which has one of the values `Add`, `Sub`, `Mult`, and `Div`). Write a class `BinOpMutator` that does the necessary mutations, and then create a class `MuBinOpAnalyzer` as subclass of `MuFunctionAnalyzer` which makes use of `BinOpMutator`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "**Solution.** As with `StmtDeletionMutator`, we have to visit the AST – in this case, all `BinOp` nodes – and make appropriate changes:" ] }, { "cell_type": "code", "execution_count": 104, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.925622Z", "iopub.status.busy": "2024-01-18T17:15:44.925389Z", "iopub.status.idle": "2024-01-18T17:15:44.927354Z", "shell.execute_reply": "2024-01-18T17:15:44.927085Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "class BinOpMutator(Mutator):\n", " def visit_BinOp(self, node): return self.mutable_visit(node)" ] }, { "cell_type": "code", "execution_count": 105, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.929009Z", "iopub.status.busy": "2024-01-18T17:15:44.928863Z", "iopub.status.idle": "2024-01-18T17:15:44.931109Z", "shell.execute_reply": "2024-01-18T17:15:44.930890Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "class BinOpMutator(BinOpMutator):\n", " def mutation_visit(self, node):\n", " replacement = {\n", " type(ast.Add()): ast.Sub(),\n", " type(ast.Sub()): ast.Add(),\n", " type(ast.Mult()): ast.Div(),\n", " type(ast.Div()): ast.Mult()\n", " }\n", "\n", " try:\n", " node.op = replacement[type(node.op)]\n", " except KeyError:\n", " pass # All other binary operators (and, mod, etc.)\n", "\n", " return node" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "We hook this into our own analyzer:" ] }, { "cell_type": "code", "execution_count": 106, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.932721Z", "iopub.status.busy": "2024-01-18T17:15:44.932520Z", "iopub.status.idle": "2024-01-18T17:15:44.934189Z", "shell.execute_reply": "2024-01-18T17:15:44.933985Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "class MuBinOpAnalyzer(MuFunctionAnalyzer):\n", " def mutator_object(self, locations=None):\n", " return BinOpMutator(locations)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "source": [ "Here's how it mutates a function:" ] }, { "cell_type": "code", "execution_count": 107, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.935821Z", "iopub.status.busy": "2024-01-18T17:15:44.935638Z", "iopub.status.idle": "2024-01-18T17:15:44.937370Z", "shell.execute_reply": "2024-01-18T17:15:44.937125Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [], "source": [ "def arith_expr():\n", " return 1 + 2 - 3 * 4 / 5" ] }, { "cell_type": "code", "execution_count": 108, "metadata": { "execution": { "iopub.execute_input": "2024-01-18T17:15:44.939152Z", "iopub.status.busy": "2024-01-18T17:15:44.938781Z", "iopub.status.idle": "2024-01-18T17:15:44.941539Z", "shell.execute_reply": "2024-01-18T17:15:44.941292Z" }, "slideshow": { "slide_type": "skip" }, "solution2": "hidden" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- original\n", "\n", "+++ mutant\n", "\n", "@@ -1,2 +1,2 @@\n", "\n", " def arith_expr():\n", "- return 1 + 2 - 3 * 4 / 5\n", "+ return 1 + 2 + 3 * 4 / 5\n", "--- original\n", "\n", "+++ mutant\n", "\n", "@@ -1,2 +1,2 @@\n", "\n", " def arith_expr():\n", "- return 1 + 2 - 3 * 4 / 5\n", "+ return 1 - 2 - 3 * 4 / 5\n", "--- original\n", "\n", "+++ mutant\n", "\n", "@@ -1,2 +1,2 @@\n", "\n", " def arith_expr():\n", "- return 1 + 2 - 3 * 4 / 5\n", "+ return 1 + 2 - 3 * 4 * 5\n", "--- original\n", "\n", "+++ mutant\n", "\n", "@@ -1,2 +1,2 @@\n", "\n", " def arith_expr():\n", "- return 1 + 2 - 3 * 4 / 5\n", "+ return 1 + 2 - 3 / 4 / 5\n" ] } ], "source": [ "for mutant in MuBinOpAnalyzer(arith_expr, log=True):\n", " print(mutant.diff())" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden", "solution2_first": true, "solution_first": true }, "source": [ "### Exercise 2: Optimizing Mutation Analysis\n", "\n", "Our technique for mutation analysis is somewhat inefficient in that we run the tests even on mutants that have mutations in code not covered by the test case. Test cases have no possibility of detecting errors on portions of code they do not cover. Hence, one of the simplest optimizations is to first recover the coverage information from the given test case, and only run the test case on mutants where the mutations lie in the code being covered by the test case. Can you modify the `MuFunctionAnalyzer` to incorporate recovering coverage as the first step?" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" }, "solution": "hidden", "solution2": "hidden" }, "source": [ "**Solution.** Left to the astute reader." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden", "solution2_first": true, "solution_first": true }, "source": [ "### Exercise 3: Byte Code Mutator\n", "\n", "We have seen how to mutate the AST given the source. One of the deficiencies with this approach is that the Python bytecode is targeted by other languages too. In such cases, the source may not be readily converted to a Python AST, and it is desirable to mutate the bytecode instead. Can you implement a bytecode mutator for Python function that mutates the bytecode instead of fetching the source and then mutating it?" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" }, "solution": "hidden", "solution2": "hidden" }, "source": [ "**Solution.** Left to the astute reader." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "subslide" }, "solution": "hidden", "solution2": "hidden", "solution2_first": true, "solution_first": true }, "source": [ "### Exercise 4: Estimating Residual Defect Density\n", "\n", "The defect density of a program is the number of defects in a program that that were detected before release divided by the program size. The residual defect density is the percentage of defects that escaped detection. While estimation of the real residual defect density is difficult, mutation analysis can provide an upper bound. The number of mutants that remain undetected is a plausible upper bound on the number of defects that remain within the program. However, this upper bound may be too wide. The reason is that some remaining faults can interact with each other, and if present together, can be detected by the available test suite. Hence, a tighter bound is the number of mutants that can exist *together* in a given program without being detected by the given test suite. This can be accomplished by starting with the complete set of mutations possible, and applying delta-debugging from [the chapter on reducing](Reducer.ipynb) to determine the minimum number of mutations that need to be removed to make the mutant pass undetected by the test suite. Can you produce a new `RDDEstimator` by extending the `MuFunctionAnalyzer` that estimates the residual defect density upper bound using this technique?" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "slideshow": { "slide_type": "skip" }, "solution": "hidden", "solution2": "hidden" }, "source": [ "**Solution.** Left to the astute reader." ] } ], "metadata": { "ipub": { "bibliography": "fuzzingbook.bib", "toc": true }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.2" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "toc-autonumbering": false, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }