{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Testing\n", "\n", "Code is assumed guilty until proven innocent. This applies to software written by\n", "other people, but even more so to software written by yourself. The mechanism that\n", "builds trust that software is performing correctly is called testing.\n", "Testing is the process by which the expected results of code are compared against the\n", "observed results of actually having run that code.\n", "\n", "## What and How to Test?\n", "\n", "Let’s see how this mindset applies to an actual physics problem. Given two previous\n", "observations in the sky and the time between them, Kepler’s Laws provide a closed-form\n", "equation for the future location of a celestial body. This can be implemented via\n", "a function named `kepler_loc()`. The following is a stub interface representing this\n", "function, lacking the actual function body:\n", "\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def kepler_loc(p1, p2, dt, t):\n", "    ...\n", "    return p3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a basic test of this function, we can take three points on the planet Jupiter’s actual\n", "measured path and use the latest of these as the expected result. 
We will then compare\n", "this to the result that we observe as the output of the `kepler_loc()` function.\n", "\n", "The following example is pseudocode for testing that the\n", "measured positions of Jupiter, given by the function `jupiter()`, can be predicted with\n", "the `kepler_loc()` function:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Raising ValueError signals the failure here; assertions, shown next,\n", "# are the preferred mechanism for tests.\n", "def test_kepler_loc():\n", "    p1 = jupiter(two_days_ago)\n", "    p2 = jupiter(yesterday)\n", "    exp = jupiter(today)\n", "    obs = kepler_loc(p1, p2, 1, 1)\n", "    if exp != obs:\n", "        raise ValueError(\"Jupiter is not where it should be!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following pseudocode represents the basic pattern behind this,\n", "and indeed all, tests. This version uses an assertion." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def test_func():\n", "    exp = get_expected()\n", "    obs = func(*args, **kwargs)\n", "    assert exp == obs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below, we rewrite the Kepler test using an assertion." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def test_kepler_loc():\n", "    p1 = jupiter(two_days_ago)\n", "    p2 = jupiter(yesterday)\n", "    exp = jupiter(today)\n", "    obs = kepler_loc(p1, p2, 1, 1)\n", "    assert exp == obs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Exercise: Add a Test to Your Project\n", "\n", "1) Create a file called `test_filename.py` for a file (`filename.py`) in your project source code.\n", "\n", "2) For the most important function in the file, create a test function using an assertion.\n", "\n", "3) Save and run the test file. Does the test pass? How can you tell?"
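] }, { "cell_type": "markdown", "metadata": {}, "source": [
"For example, suppose your project contains a file `stats.py` with a `mean()` function (both names are hypothetical placeholders for your own code). A minimal `test_stats.py` might look like the sketch below; `mean()` is defined inline so the example is self-contained, but in practice you would write `from stats import mean`:\n",
"\n",
"```python\n",
"# test_stats.py -- hypothetical test file; in a real project, mean()\n",
"# would live in stats.py and be imported rather than defined here\n",
"def mean(vals):\n",
"    return sum(vals) / len(vals)\n",
"\n",
"def test_mean():\n",
"    exp = 2.0\n",
"    obs = mean([1.0, 2.0, 3.0])\n",
"    assert exp == obs\n",
"\n",
"test_mean()\n",
"```\n",
"\n",
"Running the file produces no output when the assertion holds; a failing test raises `AssertionError`."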
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Nose\n", "`nose` has a variety of helpful and specific assertion functions that display extra debugging\n", "information when they fail. These are all accessible through the `nose.tools`\n", "module. The simplest one is named `assert_equal()`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from nose.tools import assert_equal\n", "\n", "def test_kepler_loc():\n", "    p1 = jupiter(two_days_ago)\n", "    p2 = jupiter(yesterday)\n", "    exp = jupiter(today)\n", "    obs = kepler_loc(p1, p2, 1, 1)\n", "    assert_equal(exp, obs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Running Tests\n", "\n", "The major boon a testing framework provides is a utility to find and run the tests\n", "automatically. With nose, this is a command-line tool called `nosetests`. If the\n", "following Fibonacci function is being tested, we can run all of the tests with\n", "the `nosetests` command on the command line."
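] }, { "cell_type": "markdown", "metadata": {}, "source": [
"As an aside, the extra debugging information mentioned above can be previewed without nose installed: nose generates its assert functions from the methods of `unittest.TestCase` in the standard library, so `assert_equal()` fails with the same message as `assertEqual()`. A small sketch:\n",
"\n",
"```python\n",
"# Preview of the message assert_equal() produces on failure;\n",
"# nose builds assert_equal() from unittest.TestCase.assertEqual()\n",
"import unittest\n",
"\n",
"case = unittest.TestCase()\n",
"try:\n",
"    case.assertEqual(1.0, 2.0)\n",
"except AssertionError as err:\n",
"    print(err)\n",
"```\n",
"\n",
"The printed message names both values (`1.0 != 2.0`), rather than being a bare `AssertionError`."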
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def fib(n):\n", "    if n == 0 or n == 1:\n", "        return 1\n", "    else:\n", "        return fib(n - 1) + fib(n - 2)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from nose.tools import assert_equal\n", "\n", "def test_fib0():\n", "    # test edge 0\n", "    obs = fib(0)\n", "    assert_equal(1, obs)\n", "\n", "def test_fib1():\n", "    # test edge 1\n", "    obs = fib(1)\n", "    assert_equal(1, obs)\n", "\n", "def test_fib6():\n", "    # test regular point\n", "    obs = fib(6)\n", "    assert_equal(13, obs)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Running the test functions manually should produce no output if the tests pass\n", "test_fib0()\n", "test_fib1()\n", "test_fib6()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Exercise: Use Nose For Your Project\n", "\n", "1) Rewrite your new project test to use nose instead.\n", "\n", "2) To run the nose test, type `nosetests` in the directory where the test file is stored.\n", "\n", "3) Attempt this and debug your function and test until the test passes.\n", "\n", "4) If you have extra time, try writing another test or two."
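] }, { "cell_type": "markdown", "metadata": {}, "source": [
"The next example exercises edge and corner cases. The `sinc2d()` function below computes the product sin(x)/x * sin(y)/y, substituting the limiting value sin(t)/t -> 1 along each axis: the lines x = 0 and y = 0 are edge cases, and the origin, where the two edges meet, is a corner case. As a quick hand-check of the expected value used in the interior-point test (a sketch using the standard-library `math` module rather than `numpy`):\n",
"\n",
"```python\n",
"# Hand-check of the interior-point expectation:\n",
"# sin(pi/2)/(pi/2) = 2/pi and sin(3*pi/2)/(3*pi/2) = -2/(3*pi)\n",
"import math\n",
"\n",
"x = math.pi / 2.0\n",
"y = 3.0 * math.pi / 2.0\n",
"exp = (2.0 / math.pi) * (-2.0 / (3.0 * math.pi))\n",
"obs = (math.sin(x) / x) * (math.sin(y) / y)\n",
"assert abs(exp - obs) < 1e-12\n",
"```\n",
"\n",
"These are exactly the `exp` values used in `test_internal()`, `test_edge_x()`, and `test_edge_y()` below."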
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "\n", "def sinc2d(x, y):\n", "    if x == 0.0 and y == 0.0:\n", "        return 1.0\n", "    elif x == 0.0:\n", "        return np.sin(y) / y\n", "    elif y == 0.0:\n", "        return np.sin(x) / x\n", "    else:\n", "        return (np.sin(x) / x) * (np.sin(y) / y)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "from nose.tools import assert_equal\n", "\n", "def test_internal():\n", "    exp = (2.0 / np.pi) * (-2.0 / (3.0 * np.pi))\n", "    obs = sinc2d(np.pi / 2.0, 3.0 * np.pi / 2.0)\n", "    assert_equal(exp, obs)\n", "\n", "def test_edge_x():\n", "    exp = (-2.0 / (3.0 * np.pi))\n", "    obs = sinc2d(0.0, 3.0 * np.pi / 2.0)\n", "    assert_equal(exp, obs)\n", "\n", "def test_edge_y():\n", "    exp = (2.0 / np.pi)\n", "    obs = sinc2d(np.pi / 2.0, 0.0)\n", "    assert_equal(exp, obs)\n", "\n", "def test_corner():\n", "    exp = 1.0\n", "    obs = sinc2d(0.0, 0.0)\n", "    assert_equal(exp, obs)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# run the tests for sinc2d() manually; no output means they all pass\n", "test_internal()\n", "test_edge_x()\n", "test_edge_y()\n", "test_corner()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When small functions call one another, a test of the outermost function exercises every\n", "unit in the chain at once; such tests are called integration tests. The following cells\n", "compose three simple functions and test the composition:"
] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def a(x):\n", "    return x + 1\n", "\n", "def b(x):\n", "    return 2 * x\n", "\n", "def c(x):\n", "    return b(a(x))" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from nose.tools import assert_equal\n", "\n", "def test_c():\n", "    # c(2) = b(a(2)) = b(3) = 6\n", "    exp = 6\n", "    obs = c(2)\n", "    assert_equal(exp, obs)\n", "\n", "test_c()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test-Driven Development\n", "\n", "Test-driven development (TDD) takes this philosophy to its limit: the tests are written\n", "before the code they test. To start, we write a test for computing the standard deviation\n", "from a list of numbers as follows:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from nose.tools import assert_equal\n", "\n", "def test_std1():\n", "    obs = std([0.0, 2.0])\n", "    exp = 1.0\n", "    assert_equal(obs, exp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we write the minimal version of `std()` that will cause `test_std1()` to pass:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def std(vals):\n", "    # surely this is cheating...\n", "    return 1.0\n", "\n", "# run the test\n", "test_std1()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we only ever want to take the standard deviation of the\n", "numbers 0.0 and 2.0, or 1.0 and 3.0, and so on, then this implementation will work\n", "perfectly. 
If we want to branch out, then we probably need to write more robust code.\n", "However, before we can write more code, we first need to add another test or two:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "ename": "AssertionError", "evalue": "1.0 != 0.0", "output_type": "error", "traceback": [ "---------------------------------------------------------------------------\n", "AssertionError                            Traceback (most recent call last)\n", " in ()\n     16 # run the tests\n     17 test_std1()\n---> 18 test_std2()\n     19 test_std3()\n", " in test_std2()\n      7     obs = std([])\n      8     exp = 0.0\n----> 9     assert_equal(obs, exp)\n     10 \n     11 def test_std3():\n", "/Users/khuff/anaconda3/lib/python3.4/unittest/case.py in assertEqual(self, first, second, msg)\n    795         \"\"\"\n    796         assertion_func = self._getAssertEqualityFunc(first, second)\n--> 797         assertion_func(first, second, msg=msg)\n    798 \n    799     def assertNotEqual(self, first, second, msg=None):\n", "/Users/khuff/anaconda3/lib/python3.4/unittest/case.py in _baseAssertEqual(self, first, second, msg)\n    788             standardMsg = '%s != %s' % _common_shorten_repr(first, second)\n    789             msg = self._formatMessage(msg, standardMsg)\n--> 790             raise self.failureException(msg)\n    791 \n    792     def assertEqual(self, first, second, msg=None):\n", "AssertionError: 1.0 != 0.0" ] } ], "source": [ "def test_std1():\n", "    obs = std([0.0, 2.0])\n", "    exp = 1.0\n", "    assert_equal(obs, exp)\n", "\n", "def test_std2():\n", "    obs = std([])\n", "    exp = 0.0\n", "    assert_equal(obs, exp)\n", "\n", "def test_std3():\n", "    obs = std([0.0, 4.0])\n", "    exp = 2.0\n", "    assert_equal(obs, exp)\n", "\n", "# run the tests\n", "test_std1()\n", "test_std2()\n", "test_std3()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The new `test_std2()` fails because `std()` does not yet handle an empty list. We'll need to improve the function to make these tests pass." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def std(vals):\n", "    # a little better\n", "    if len(vals) == 0:\n", "        return 0.0\n", "    return vals[-1] / 2.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even though the tests all pass, this is clearly still not a generic standard deviation\n", "function. 
To create a better implementation, TDD states that we again need to expand\n", "the test suite:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def test_std1():\n", "    obs = std([0.0, 2.0])\n", "    exp = 1.0\n", "    assert_equal(obs, exp)\n", "\n", "def test_std2():\n", "    obs = std([])\n", "    exp = 0.0\n", "    assert_equal(obs, exp)\n", "\n", "def test_std3():\n", "    obs = std([0.0, 4.0])\n", "    exp = 2.0\n", "    assert_equal(obs, exp)\n", "\n", "def test_std4():\n", "    obs = std([1.0, 3.0])\n", "    exp = 1.0\n", "    assert_equal(obs, exp)\n", "\n", "def test_std5():\n", "    obs = std([1.0, 1.0, 1.0])\n", "    exp = 0.0\n", "    assert_equal(obs, exp)\n", "\n", "# run the tests\n", "test_std1()\n", "test_std2()\n", "test_std3()\n", "test_std4()\n", "test_std5()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At this point, we may as well try to implement a generic standard deviation function.\n", "We would spend more time trying to come up with clever approximations to the\n", "standard deviation than we would spend actually coding it. 
Just biting the bullet, we\n", "might write the following implementation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def std(vals):\n", "    # finally, some math\n", "    n = len(vals)\n", "    if n == 0:\n", "        return 0.0\n", "    mu = sum(vals) / n\n", "    var = 0.0\n", "    for val in vals:\n", "        var = var + (val - mu)**2\n", "    return (var / n)**0.5" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# run the tests\n", "test_std1()\n", "test_std2()\n", "test_std3()\n", "test_std4()\n", "test_std5()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Testing Wrap-Up\n", "\n", "At this point, you should know that:\n", "\n", "- Tests compare the results observed from running code against the results that were\n", "  expected ahead of time.\n", "- Tests should be written at the same time as the code they are testing is written.\n", "- The person best suited to write a test is the author of the original code.\n", "- Tests are grouped together in a test suite.\n", "- Test frameworks, like nose, discover and execute tests for you automatically.\n", "- An edge case is an input that is at the limit of its range.\n", "- A corner case is where two or more edge cases meet.\n", "- Unit tests try to test the smallest pieces of code possible, usually functions and\n", "  methods.\n", "- Integration tests make sure that code units work together properly.\n", "- Regression tests ensure that everything works the same today as it did yesterday.\n", "- Test generators can be used to efficiently check many cases.\n", "- Test coverage is the percentage of the code base that is executed by the test suite.\n", "- Test-driven development says to write your tests before you write the code that is\n", "  being tested."
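] }, { "cell_type": "markdown", "metadata": {}, "source": [
"Of the terms above, test generators may be the least familiar. With nose, a test function that yields `(check_function, argument, ...)` tuples causes each yielded tuple to be run as its own test case. A sketch, reusing the Fibonacci example (the function is redefined here so the sketch is self-contained):\n",
"\n",
"```python\n",
"# nose-style test generator: nosetests runs each yielded\n",
"# (check_function, args...) tuple as a separate test case\n",
"def fib(n):\n",
"    if n == 0 or n == 1:\n",
"        return 1\n",
"    return fib(n - 1) + fib(n - 2)\n",
"\n",
"def check_fib(n, expected):\n",
"    assert fib(n) == expected\n",
"\n",
"def test_fib_cases():\n",
"    for n, expected in [(0, 1), (1, 1), (6, 13)]:\n",
"        yield check_fib, n, expected\n",
"```\n",
"\n",
"Under `nosetests`, this generator expands into three separate test cases, one per yielded tuple."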
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3" } }, "nbformat": 4, "nbformat_minor": 0 }