{ "metadata": { "language_info": { "name": "python", "mimetype": "text/x-python", "file_extension": ".py", "codemirror_mode": { "name": "ipython", "version": 3 }, "version": "3.5.2", "nbconvert_exporter": "python", "pygments_lexer": "ipython3" }, "kernelspec": { "name": "python3", "language": "python", "display_name": "Python 3" } }, "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Python 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Defensive Programming" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "

Learning Objectives

\n", "
\n", "
\n", "\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our previous lessons have covered the basic tools of programming: variables and lists, file I/O, loops, conditionals, and functions. What they haven\u2019t done is show us how to tell whether a program is getting the right answer, and how to tell if it\u2019s still getting the right answer as we make changes to it.\n", "\n", "To achieve that, we need to:\n", "\n", "- Write programs that check their own operation.\n", "- Write and run tests for widely-used functions.\n", "- Make sure we know what \u201ccorrect\u201d actually means.\n", "\n", "The good news is, doing these things will speed up our programming, not slow it down. As in real carpentry \u2014 the kind done with lumber \u2014 the time saved by measuring carefully before cutting a piece of wood is much greater than the time that measuring takes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Assertions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first step toward getting the right answers from our programs is to assume that mistakes _will_ happen and to guard against them. This is called [defensive programming](http://swcarpentry.github.io/python-novice-inflammation/reference.html#defensive-programming), and the most common way to do it is to add [assertions](http://swcarpentry.github.io/python-novice-inflammation/reference.html#assertion) to our code so that it checks itself as it runs. An assertion is simply a statement that something must be true at a certain point in a program. When Python sees one, it evaluates the assertion\u2019s condition. If it\u2019s true, Python does nothing, but if it\u2019s false, Python halts the program immediately and prints the error message if one is provided. For example, this piece of code halts as soon as the loop encounters a value that isn\u2019t positive:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Programs like the Firefox browser are full of assertions: 10-20% of the code they contain are there to check that the other 80-90% are working correctly. Broadly speaking, assertions fall into three categories:\n", "\n", "- A [precondition](http://swcarpentry.github.io/python-novice-inflammation/reference.html#precondition) is something that must be true at the start of a function in order for it to work correctly.\n", "- A [postcondition](http://swcarpentry.github.io/python-novice-inflammation/reference.html#postcondition) is something that the function guarantees is true when it finishes.\n", "- An [invariant](http://swcarpentry.github.io/python-novice-inflammation/reference.html#invariant) is something that is always true at a particular point inside a piece of code.\n", "\n", "For example, suppose we are representing rectangles using a [tuple](http://swcarpentry.github.io/python-novice-inflammation/reference.html#tuple) of four coordinates `(x0, y0, x1, y1)`, representing the lower left and upper right corners of the rectangle. In order to do some calculations, we need to normalize the rectangle so that the lower left corner is at the origin and the longest side is 1.0 units long. 
This function does that, but checks that its input is correctly formatted and that its result makes sense:" ] }, { "cell_type": "code", "metadata": { "collapsed": true }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The preconditions on lines 2, 4, and 5 catch invalid inputs:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The post-conditions help us catch bugs by telling us when our calculations cannot have been correct. For example, if we normalize a rectangle that is taller than it is wide everything seems OK:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "but if we normalize one that\u2019s wider than it is tall, the assertion is triggered:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Re-reading our function, we realize that line 10 should divide `dy` by `dx` rather than `dx` by `dy`. (You can display line numbers by typing Ctrl-M, then L.) If we had left out the assertion at the end of the function, we would have created and returned something that had the right shape as a valid answer, but wasn\u2019t. Detecting and debugging that would almost certainly have taken more time in the long run than writing the assertion.\n", "\n", "But assertions aren\u2019t just about catching errors: they also help people understand programs. Each assertion gives the person reading the program a chance to check (consciously or otherwise) that their understanding matches what the code is doing.\n", "\n", "Most good programmers follow two rules when adding assertions to their code. The first is, _fail early, fail often_. The greater the distance between when and where an error occurs and when it\u2019s noticed, the harder the error will be to debug, so good code catches mistakes as early as possible.\n", "\n", "The second rule is, _turn bugs into assertions or tests_. Whenever you fix a bug, write an assertion that catches the mistake should you make it again. If you made a mistake in a piece of code, the odds are good that you have made other mistakes nearby, or will make the same mistake (or a related one) the next time you change it. Writing assertions to check that you haven\u2019t [regressed](http://swcarpentry.github.io/python-novice-inflammation/reference.html#regression) (i.e., haven\u2019t re-introduced an old problem) can save a lot of time in the long run, and helps to warn people who are reading the code (including your future self) that this bit is tricky." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Test-Driven Development" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An assertion checks that something is true at a particular point in the program. The next step is to check the overall behavior of a piece of code, i.e., to make sure that it produces the right output when it\u2019s given a particular input. For example, suppose we need to find where two or more time series overlap. 
The range of each time series is represented as a pair of numbers, which are the time the interval started and ended. The output is the largest range that they all include:\n", "\n", "![Overlapping Ranges](fig/python-overlapping-ranges.svg)\n", "\n", "Most novice programmers would solve this problem like this:\n", "\n", "1. Write a function `range_overlap`.\n", "2. Call it interactively on two or three different inputs.\n", "3. If it produces the wrong answer, fix the function and re-run that test.\n", "\n", "This clearly works \u2014 after all, thousands of scientists are doing it right now \u2014 but there\u2019s a better way:\n", "\n", "1. Write a short function for each test.\n", "2. Write a `range_overlap` function that should pass those tests.\n", "3. If `range_overlap` produces any wrong answers, fix it and re-run the test functions.\n", "\n", "Writing the tests before writing the function they exercise is called [test-driven development](http://swcarpentry.github.io/python-novice-inflammation/reference.html#test-driven-development) (TDD). Its advocates believe it produces better code faster because:\n", "\n", "1. If people write tests after writing the thing to be tested, they are subject to confirmation bias, i.e., they subconsciously write tests to show that their code is correct, rather than to find errors.\n", "2. Writing tests helps programmers figure out what the function is actually supposed to do.\n", "\n", "Here are three test functions for `range_overlap`:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
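The code cell that should hold these tests is empty in this copy of the notebook; a sketch of what they might look like, assuming, as the text says, a list of `(low, high)` pairs as input and a single pair as output:

```python
assert range_overlap([ (0.0, 1.0) ]) == (0.0, 1.0)
assert range_overlap([ (2.0, 3.0), (2.0, 4.0) ]) == (2.0, 3.0)
assert range_overlap([ (0.0, 1.0), (0.0, 2.0), (-1.0, 1.0) ]) == (0.0, 1.0)
```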
\n", "
\n", "

Inline with Jupyter Notebooks

\n", "
\n", "
\n", "If you're interested in learning more about TDD (and you should be), you can find more detail [in this lecture](https://github.com/SolarDrew/test-driven-dev). It's also worth looking at the documentation for the [`unittest`](https://docs.python.org/3/library/unittest.html) and [`pytest`](http://doc.pytest.org/en/latest/) libraries.\n", "
" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The error is actually reassuring: we haven\u2019t written `range_overlap` yet, so if the tests passed, it would be a sign that someone else had and that we were accidentally using their function.\n", "\n", "And as a bonus of writing these tests, we\u2019ve implicitly defined what our input and output look like: we expect a list of pairs as input, and produce a single pair as output.\n", "\n", "Something important is missing, though. We don\u2019t have any tests for the case where the ranges don\u2019t overlap at all:" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "assert range_overlap([ (0.0, 1.0), (5.0, 6.0) ]) == ???" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What should `range_overlap` do in this case: fail with an error message, produce a special value like `(0.0, 0.0)` to signal that there\u2019s no overlap, or something else? Any actual implementation of the function will do one of these things; writing the tests first helps us figure out which is best _before_ we\u2019re emotionally invested in whatever we happened to write before we realized there was an issue.\n", "\n", "And what about this case?" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "assert range_overlap([ (0.0, 1.0), (1.0, 2.0) ]) == ???" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Do two segments that touch at their endpoints overlap or not? Mathematicians usually say \"yes\", but engineers usually say \"no\". The best answer is \"whatever is most useful in the rest of our program\", but again, any actual implementation of `range_overlap` is going to do _something_, and whatever it is ought to be consistent with what it does when there\u2019s no overlap at all.\n", "\n", "Since we\u2019re planning to use the range this function returns as the X axis in a time series chart, we decide that:\n", "\n", "1. every overlap has to have non-zero width, and\n", "2. we will return the special value None when there\u2019s no overlap.\n", "\n", "`None` is built into Python, and means \u201cnothing here\u201d. (Other languages often call the equivalent value `null` or `nil`). With that decision made, we can finish writing our last two tests:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, we get an error because we haven\u2019t written our function, but we\u2019re now ready to do so:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Take a moment to think about why we use `max` to raise `lowest` and `min` to lower `highest`). We\u2019d now like to re-run our tests, but they\u2019re scattered across three different cells. 
To make running them easier, let\u2019s put them all in a function:" ] }, { "cell_type": "code", "metadata": { "collapsed": true }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now test `range_overlap` with a single function call:" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first test that was supposed to produce `None` fails, so we know something is wrong with our function. We _don\u2019t_ know whether the other tests passed or failed because Python halted the program as soon as it spotted the first error. Still, some information is better than none, and if we trace the behavior of the function with that input, we realize that we\u2019re initializing `lowest` and `highest` to 0.0 and 1.0 respectively, regardless of the input values. This violates another important rule of programming: _always initialize from data_." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "

Pre- and post-conditions

\n", "
\n", "
\n", "

Suppose you are writing a function called average that calculates the average of the numbers in a list. What pre-conditions and post-conditions would you write for it? Compare your answer to your neighbor\u2019s: can you think of a function that will pass your tests but not hers or vice versa?

\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "

Testing assertions

\n", "
\n", "
\n", "

Given a sequence of values, the function running returns a list containing the running totals at each index.

\n", "
running([1, 2, 3, 4])
\n", "
[1, 3, 6, 10]
\n", "
running('abc')
\n", "
['a', 'ab', 'abc']
\n", "

Explain in words what the assertions in this function check, and for each one, give an example of input that will make that assertion fail.

\n", "
def running(values):\n",
    "    assert len(values) > 0\n",
    "    result = [values[0]]\n",
    "    for v in values[1:]:\n",
    "        assert result[-1] >= 0\n",
    "        result.append(result[-1] + v)\n",
    "        assert result[-1] >= result[0]\n",
    "    return result
\n", "
\n", "
" ] }, { "cell_type": "code", "metadata": { "collapsed": false }, "execution_count": null, "source": [ "# assert len(values) > 0 tests that you haven't passed an empty sequence in\n", "# assert result[-1] >= 0 checks that the last item in the results list is greater than or equal to 0\n", "# assert result[-1] >= result[0] checks that the last item in the results list is greater than or equal to the first item\n" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "

Fixing and testing

\n", "
\n", "
\n", "

Fix range_overlap. Re-run test_range_overlap after each change you make.

\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Debugging" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once testing has uncovered problems, the next step is to fix them. Many novices do this by making more-or-less random changes to their code until it seems to produce the right answer, but that\u2019s very inefficient (and the result is usually only correct for the one case they\u2019re testing). The more experienced a programmer is, the more systematically they debug, and most follow some variation on the rules explained below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Know What It\u2019s Supposed to Do\n", "\n", "The first step in debugging something is to _know what it\u2019s supposed to do_. \u201cMy program doesn\u2019t work\u201d isn\u2019t good enough: in order to diagnose and fix problems, we need to be able to tell correct output from incorrect. If we can write a test case for the failing case \u2014 i.e., if we can assert that with _these_ inputs, the function should produce _that_ result \u2014 then we\u2019re ready to start debugging. If we can\u2019t, then we need to figure out how we\u2019re going to know when we\u2019ve fixed things.\n", "\n", "But writing test cases for scientific software is frequently harder than writing test cases for commercial applications, because if we knew what the output of the scientific code was supposed to be, we wouldn\u2019t be running the software: we\u2019d be writing up our results and moving on to the next program. In practice, scientists tend to do the following:\n", "\n", "1. _Test with simplified data_. Before doing statistics on a real data set, we should try calculating statistics for a single record, for two identical records, for two records whose values are one step apart, or for some other case where we can calculate the right answer by hand.\n", "2. _Test a simplified case_. If our program is supposed to simulate magnetic eddies in rapidly-rotating blobs of supercooled helium, our first test should be a blob of helium that isn\u2019t rotating, and isn\u2019t being subjected to any external electromagnetic fields. Similarly, if we\u2019re looking at the effects of climate change on speciation, our first test should hold temperature, precipitation, and other factors constant.\n", "3. _Compare to an oracle_. A [test oracle](http://swcarpentry.github.io/python-novice-inflammation/reference.html#test-oracle) is something whose results are trusted, such as experimental data, an older program, or a human expert. We use test oracles to determine if our new program produces the correct results. If we have a test oracle, we should store its output for particular cases so that we can compare it with our new results as often as we like without re-running that program.\n", "4. _Check conservation laws_. Mass, energy, and other quantitites are conserved in physical systems, so they should be in programs as well. Similarly, if we are analyzing patient data, the number of records should either stay the same or decrease as we move from one analysis to the next (since we might throw away outliers or records with missing values). If \u201cnew\u201d patients start appearing out of nowhere as we move through our pipeline, it\u2019s probably a sign that something is wrong.\n", "5. _Visualize_. Data analysts frequently use simple visualizations to check both the science they\u2019re doing and the correctness of their code (just as we did in the opening lesson of this tutorial). 
This should only be used for debugging as a last resort, though, since it\u2019s very hard to compare two visualizations automatically." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make it Fail Every Time\n", "\n", "We can only debug something when it fails, so the second step is always to find a test case that _makes it fail every time_. The \u201cevery time\u201d part is important because few things are more frustrating than debugging an intermittent problem: if we have to call a function a dozen times to get a single failure, the odds are good that we\u2019ll scroll past the failure when it actually occurs.\n", "\n", "As part of this, it\u2019s always important to check that our code is \u201cplugged in\u201d, i.e., that we\u2019re actually exercising the problem that we think we are. Every programmer has spent hours chasing a bug, only to realize that they were actually calling their code on the wrong data set or with the wrong configuration parameters, or are using the wrong version of the software entirely. Mistakes like these are particularly likely to happen when we\u2019re tired, frustrated, and up against a deadline, which is one of the reasons late-night (or overnight) coding sessions are almost never worthwhile." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make it Fail Fast\n", "\n", "If it takes 20 minutes for the bug to surface, we can only do three experiments an hour. That doesn\u2019t must mean we\u2019ll get less data in more time: we\u2019re also more likely to be distracted by other things as we wait for our program to fail, which means the time we are spending on the problem is less focused. It\u2019s therefore critical to _make it fail fast_.\n", "\n", "As well as making the program fail fast in time, we want to make it fail fast in space, i.e., we want to localize the failure to the smallest possible region of code:\n", "\n", "1. The smaller the gap between cause and effect, the easier the connection is to find. Many programmers therefore use a divide and conquer strategy to find bugs, i.e., if the output of a function is wrong, they check whether things are OK in the middle, then concentrate on either the first or second half, and so on.\n", "2. N things can interact in $N^2$ different ways, so every line of code that isn\u2019t run as part of a test means more than one thing we don\u2019t need to worry about." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Change One Thing at a Time, For a Reason\n", "\n", "Replacing random chunks of code is unlikely to do much good. (After all, if you got it wrong the first time, you\u2019ll probably get it wrong the second and third as well.) Good programmers therefore _change one thing at a time, for a reason_ They are either trying to gather more information (\u201cis the bug still there if we change the order of the loops?\u201d) or test a fix (\u201ccan we make the bug go away by sorting our data before processing it?\u201d).\n", "\n", "Every time we make a change, however small, we should re-run our tests immediately, because the more things we change at once, the harder it is to know what\u2019s responsible for what (those $N^2$ interactions again). And we should re-run _all_ of our tests: more than half of fixes made to code introduce (or re-introduce) bugs, so re-running all of our tests tells us whether we have regressed." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Keep Track of What You've Done\n", "\n", "Good scientists keep track of what they\u2019ve done so that they can reproduce their work, and so that they don\u2019t waste time repeating the same experiments or running ones whose results won\u2019t be interesting. Similarly, debugging works best when we _keep track of what we\u2019ve done_ and how well it worked. If we find ourselves asking, \u201cDid left followed by right with an odd number of lines cause the crash? Or was it right followed by left? Or was I using an even number of lines?\u201d then it\u2019s time to step away from the computer, take a deep breath, and start working more systematically.\n", "\n", "Records are particularly useful when the time comes to ask for help. People are more likely to listen to us when we can explain clearly what we did, and we\u2019re better able to give them the information they need to be useful." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Be Humble\n", "\n", "And speaking of help: if we can\u2019t find a bug in 10 minutes, we should `be humble` and ask for help. Just explaining the problem aloud is often useful, since hearing what we\u2019re thinking helps us spot inconsistencies and hidden assumptions.\n", "\n", "Asking for help also helps alleviate confirmation bias. If we have just spent an hour writing a complicated program, we want it to work, so we\u2019re likely to keep telling ourselves why it should, rather than searching for the reason it doesn\u2019t. People who aren\u2019t emotionally invested in the code can be more objective, which is why they\u2019re often able to spot the simple mistakes we have overlooked.\n", "\n", "Part of being humble is learning from our mistakes. Programmers tend to get the same things wrong over and over: either they don\u2019t understand the language and libraries they\u2019re working with, or their model of how things work is wrong. In either case, taking note of why the error occurred and checking for it next time quickly turns into not making the mistake at all.\n", "\n", "And that is what makes us most productive in the long run. As the saying goes, _A week of hard work can sometimes save you an hour of thought_. If we train ourselves to avoid making some kinds of mistakes, to break our code into modular, testable chunks, and to turn every assumption (or mistake) into an assertion, it will actually take us less time to produce working programs, not more." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "

Debug with a neighbour

\n", "
\n", "
\n", "

Take a function that you have written today, and introduce a tricky bug. Your function should still run, but will give the wrong output. Switch seats with your neighbor and attempt to debug the bug that they introduced into their function. Which of the principles discussed above did you find helpful?

\n", "
\n", "
" ] } ], "nbformat": 4, "nbformat_minor": 0 }