{ "metadata": { "name": "", "signature": "sha256:bbf03ac24dffb0117ac7101f2c2d2bbf42f91a6788588fa00950973fc834cc0c" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Sources and Diagnosis of Numerical Error" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "When Mathematics Attacks! (Next on FOX)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You've written a code to carry out some precise numerical algorithm you worked out meticulously on paper. Now you're either unit-testing it (kudos!) or you have an internal comparison of a floating-point calculation to a result.\n", "\n", "At some point, a comparison like the following arises." ] }, { "cell_type": "code", "collapsed": false, "input": [ "(1.1-0.8) == 0.3" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "False" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uh oh. What went wrong?\n", "\n", "It even gets worse when associativity and commutativity fail." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# associativity failure\n", "(1.0e-8+1.0)-1.0 == 1.0e-8+(1.0-1.0)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "False" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "# commutativity failure\n", "(0.1+0.2+0.3) == (0.3+0.2+0.1)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 25, "text": [ "False" ] } ], "prompt_number": 25 }, { "cell_type": "markdown", "metadata": {}, "source": [ "So what on earth is going wrong and what can you do about it?" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Numerical Error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most of the time when we are talking about errors in a code, we mean _bugs_: systematic unintended errors often arising from our misapprehension of what we wrote (or what we think someone else wrote). However, even in well-written code, errors can arise as a result of how numbers are stored in a computer and how algorithms are carried out. This latter is _numerical error_.\n", "\n", "Numerical error can be extraordinarily difficult to diagnose\u2014particularly if the equations you are solving are ill-characterized (like nonlinear equations)\u2014but there are a few known techniques. First, let's take a digression into how numbers are represented on the computer." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Machine Representation of Numbers" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Positional Notation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the great innovations of ancient mathematics was the adoption of positional notation and the zero. We use the arabic numerals to write our base ten, or decimal, numbers. We use the convention that the _position_ of a digit is what determines its relative value. Thus, you are used to writing (unambiguously), $1$, $10$, $100$ to mean three different number, with the zero indicating only that no value is meant for that place.\n", "\n", "Although it is less conventional to spell things out explicitly, what if we write out _exactly_ what we mean by each place? 
This makes binary notation (the language of ones and zeros in a computer) much easier to understand. So for a regular decimal number, say 1742, we could write the following:\n", "$$1742_d = 1\\times 1000 + 7\\times 100 + 4\\times 10 + 2\\times 1 = 1\\times 10^3 + 7\\times 10^2 + 4\\times 10^1 + 2\\times 10^0 \\textrm{.}$$\n", "Writing it this last way highlights the fundamental point of positional notation: the number of spaces to the left (or the right) of the decimal point determines the relative power that the base (in this case ten) assumes.\n", "\n", "The lesson we wish you to draw from this is that for certain applications it may be (counterintuitively) more convenient to utilize a base other than ten. For instance, the degree\u2013minute\u2013second notation we use for latitude and longitude is base-60: $44^{\\circ} 23' 57.6'' W$. You may also have heard of hexadecimal notation, base-16, which is used as a shorthand for binary, base-2. This occurs in HTML color tags, for instance: #4D3200." ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Binary Representation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In digital computation, the one ring to rule them all is binary, or base-2, notation. This means of writing numbers using the powers of two allows us to use only a discrete change in electrical characteristics to represent all possible numbers. Thus we can write:\n", "$$1101_b = 1\\times 2^3 + 1\\times 2^2 + 0\\times 2^1 + 1\\times 2^0 = 13_d$$\n", "or\n", "$$1110101_b = 1\\times 2^6 + 1\\times 2^5 + 1\\times 2^4 + 0\\times 2^3 + 1\\times 2^2 + 0\\times 2^1 + 1\\times 2^0 = 117_d$$\n", "or, going the other way,\n", "$$1742_d = 1\\times 2^{10} + 1\\times 2^9 + 0\\times 2^8 + 1\\times 2^7 + 1\\times 2^6 + 0\\times 2^5 + 0\\times 2^4 + 1\\times 2^3 + 1\\times 2^2 + 1\\times 2^1 + 0\\times 2^0 = 11011001110_b \\textrm{.}$$\n", "\n", "The binary representation of a number in Python may be obtained through the function `bin`. The corresponding decimal representation of a binary string may be obtained through `int('0b####', 2)`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "int('0b1101',2)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "13" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "int('0b1110101',2)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "117" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "bin(1742)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "'0b11011001110'" ] } ], "prompt_number": 4 }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Fractional Numbers and Fixed-Point Notation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method of binary numbers as discussed above works well for integers, but founders a bit on the question of how to represent fractional parts. 
This is handled by defining a _binary point_ \"$_\\times$\" analogous to the decimal point \"$.$\": _i.e._,\n", "$$1101_{\\times} 1001_b = 1\\times 2^3 + 1\\times 2^2 + 0\\times 2^1 + 1\\times 2^0 + 1\\times 2^{-1} + 0\\times 2^{-2} + 0\\times 2^{-3} + 1\\times 2^{-4} = 13.5625_d \\textrm{.}$$\n", "\n", "In this case, you need to arbitrarily designate a certain number of digits after the binary point and stick to that convention. (That is, you effectively write numbers with a certain scaling factor, such as 1/1000: thus 83.741 would be written 83741 and 724.2 would be 724200.) This notation is referred to as _fixed-point_ notation. It is functional, but soon exhausts its utility when one needs to represent numbers far away from the binary point." ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Floating-Point Notation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just as with decimal numbers, however, we can adopt a scientific notation which requires us to specify only a few parameters. For instance, instead of 45,230,000,000,000 we write $4.523 \\times 10^{13}$\u2014requiring us to specify two numbers (4.523 and 13) to gain an economy of expression (or, to be strict, three, should we designate positive and negative as multiplication by $-1$).\n", "\n", "Floating-point arithmetic allows the representation both of numbers far too large to store as integers and of numbers with fractional parts. Modern floating-point arithmetic typically represents numbers with 32 bits (`float`) or 64 bits (`double`).\n", "\n", "This is achieved by splitting the bits up into a sign bit, a binary exponent, and a binary mantissa or significand. (Consult the figure below.)\n", "\n", "* The sign bit tells you if the number is positive (`0`) or negative (`1`).\n", "* The exponent is stored with an offset (a _bias_) of half its range, so that negative exponents, and thus small numbers, can be represented as well.\n", "* The mantissa includes an implied leading bit, because in normalized notation you write $1.1\\times 10^{-3}$ rather than $0.11\\times 10^{-2}$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_(Figure: the bit fields of a floating-point number: sign bit, exponent, and mantissa. The original image, borrowed from Finley, 2000, is missing here.)_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(As a technical aside, Intel x86 and x86-64 processors support an [80-bit `long double` format](https://en.wikipedia.org/wiki/Extended_precision) (64-bit mantissa, 15-bit exponent) which some software routines take advantage of as well. There is also a 128-bit `quad` data type available with software support, for instance through GCC's [libquadmath](https://gcc.gnu.org/onlinedocs/libquadmath/).)" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "The Limits of Precision" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So now we're prepared to return to our earlier example comparing $1.1-0.8$ and $0.3$ and see what went wrong under the hood." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Calculate the binary representation of $1.1$, $0.8$, and $0.3$ using [this calculator](http://babbage.cs.qc.edu/IEEE-754/) (or see the table below). Compare $0.3$ to the result of $1.1-0.8$ (compare the `double` floating-point type)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "0.3" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "1.1-0.8" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There's a difference, but why? 
Let's look at the binary floating-point representation of 1.1, 0.8, and 0.3 to see.\n", "\n", "| Decimal | Size | Sign Bit | Exponent | Mantissa |\n", "| --- | ------ | -- | -------- | ----------------------- |\n", "| $1.1$ | single | 0 | 01111111 | 00011001100110011001101 |\n", "| | | $1 \\times$ | $2^{(127-127)} \\times$ | $1.10000002384185791015625$ |\n", "| | double | 0 | 01111111111 | 0001100110011001100110011001100110011001100110011010 |\n", "| | | $1 \\times$ | $2^{(1023-1023)} \\times$ | $1.1000000000000000888178419700125232338905$ |\n", "| $0.8$ | single | 0 | 01111110 | 10011001100110011001101 |\n", "| | | $1 \\times$ | $2^{(126-127)} \\times$ | $1.60000002384185791015625$ |\n", "| | double | 0 | 01111111110 | 1001100110011001100110011001100110011001100110011010 |\n", "| | | $1 \\times$ | $2^{(1022-1023)} \\times$ | $1.6000000000000000888178419700125232338905$ |\n", "| $1.1-0.8$ | single | 0 | 01111101 | 00110011001100110011010 |\n", "| $\\,\\, = 0.3$ | | $1 \\times$ | $2^{(125-127)} \\times$ | $1.2000000476837158203125$ |\n", "| | double | 0 | 01111111101 | 0011001100110011001100110011001100110011001100110100 |\n", "| $\\,\\, = 0.30000000000000004$ | | $1 \\times$ | $2^{(1021-1023)} \\times$ | $1.2000000000000001776356839400250464677810$ |\n", "| $0.3$ | single | 0 | 01111101 | 00110011001100110011010 |\n", "| | | $1 \\times$ | $2^{(125-127)} \\times$ | $1.2000000476837158203125$ |\n", "| | double | 0 | 01111111101 | 0011001100110011001100110011001100110011001100110011 |\n", "| | | $1 \\times$ | $2^{(1021-1023)} \\times$ | $1.1999999999999999555910790149937383830547$ |\n", "\n", "The `double` representations of $1.1$ and $0.8$ are each slightly inexact, and their difference lands one bit away from the `double` representation of $0.3$. Thus you should never compare floating-point values exactly\u2014check to see that they are within a narrow range of each other (with a relative or absolute tolerance, as necessary)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_Floating-point operations are not true real-valued mathematics_: floating-point numbers do not behave exactly like either the real numbers $\\mathbb{R}$ or the rational numbers $\\mathbb{Q}$ (although integers do behave like a subset of the mathematical integers $\\mathbb{Z}$).\n", "\n", "We saw as well that results can depend on the order of operations. Indeed, addition is not in general associative for floating-point numbers:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print (1.0e-8+1.0)-1.0\n", "print 1.0e-8+(1.0-1.0)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "9.99999993923e-09\n", "1e-08\n" ] } ], "prompt_number": 26 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also, truncation in conversion back to integers can be troublesome:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = 5.0/9.0\n", "y = 100*x\n", "print y, int(y), round(y)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "55.5555555556 55 56.0\n" ] } ], "prompt_number": 29 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you know something about _why_ your calculations fail in these cases: the finite representation of decimal quantities in binary terms leads to small precision errors that can materially impact comparisons."
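] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is a minimal sketch of such a tolerance-based comparison; the helper name `near` and its default tolerances are illustrative choices of ours, not a standard library API." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# A hedged sketch of tolerance-based comparison: two values are \"equal\"\n", "# if they agree to within a relative tolerance (scaled by magnitude)\n", "# or an absolute tolerance, whichever is looser.\n", "def near(a, b, rel_tol=1e-9, abs_tol=0.0):\n", "    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)\n", "\n", "print(near(1.1 - 0.8, 0.3))  # True, where == gave False" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Choose the relative tolerance to reflect how many significant digits you trust, and use an absolute tolerance when comparing values near zero (where a relative test degenerates)."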
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another error arising from the limits of precision is the effect of relative precision on calculations with very large or small intermediate values.\n", "\n", "There is a maximum representable number in double-precision format, $2^{1024}(1-\\epsilon) \\sim 1.79\\cdot10^{308}$ (more on this in *Overflow and Underflow* below). Consider the effect of adding any value to this number. We may expect the following to overflow." ] }, { "cell_type": "code", "collapsed": false, "input": [ "1.7976931348623157e+308 + 1" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, it doesn't behave as we expect. Similarly," ] }, { "cell_type": "code", "collapsed": false, "input": [ "1.7976931348623157e+308 + 1e250" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What's going on here? It turns out that the relative precision of the added numbers is smaller than the representable precision in the larger number, so there is effectively no addition operation taking place. You have to operate on numbers within each other's field of view, so to speak:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "1.7976931348623157e+308 + 1e300" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "which correctly overflows to `inf`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are curious about what the limit of relative precision is, you can use the following Python snippet." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Find the point at which addition of the unit value can no longer be captured accurately by the addition operation.\n", "x = 1.0\n", "p = 0\n", "while x != x + 1:\n", " x *= 2\n", " p += 1\n", "print \"x =\", x\n", "print \"p =\", p" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "x = 9.00719925474e+15\n", "p = 53\n" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The moral is that you should _avoid multi-step calculations (like certain series summations) with extremely large or small intermediate values_." ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Overflow and Underflow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are clearly limits to how precisely any decimal number can be represented in binary floating-point format (just as the decimal representation often requires repeated fractions). 
Sometimes we have a number too large (_overflow_) or too small (_underflow_) to fit; sometimes the number just requires too many digits of accuracy to fit into a clean machine representation.\n", "\n", "| Machine Parameter | Single Precision | Double Precision | Quadruple Precision |\n", "| ----------------- | ---------------- | ---------------- | ------------------- |\n", "| Size | (4 bytes; 32 bits) | (8 bytes; 64 bits) | (16 bytes; 128 bits) |\n", "| Relative machine precision, $\\epsilon$ | $2^{-24} \\sim 5.96 \\cdot 10^{-8}$ | $2^{-53} \\sim 1.11 \\cdot 10^{-16}$ | $2^{-113} \\sim 9.63 \\cdot 10^{-35}$ |\n", "| Underflow threshold | $2^{-126} \\sim 1.18 \\cdot 10^{-38}$ | $2^{-1022} \\sim 2.23 \\cdot 10^{-308}$ | $2^{-16382} \\sim 3.36 \\cdot 10^{-4932}$ |\n", "| Overflow threshold | $2^{128}(1-\\epsilon) \\sim 3.40 \\cdot 10^{38}$ | $2^{1024}(1-\\epsilon) \\sim 1.79 \\cdot 10^{308}$ | $2^{16384}(1-\\epsilon) \\sim 1.19 \\cdot 10^{4932}$ |\n", "\n", "Let's take a look at another example related to underflow and overflow. From the table above, we can see that the maximum representable number in 64-bit precision is about $1.79 \\cdot 10^{308}$ (in Python, `sys.float_info.max`). Let's double this value and see what happens\u2014it correctly overflows to $+\\infty$, whereas (as we saw above) adding a comparatively small number to it leaves it unchanged." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import sys\n", "print sys.float_info\n", "# doubling the maximum representable value overflows to inf\n", "print 2*sys.float_info.max" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some special values exist as well: `inf` ($+\\infty$), `-inf` ($-\\infty$), `NaN` (Not A Number). This last occurs as the result of operations with undefined results (such as $0/0$ or $\\infty - \\infty$) and allows you to systematically handle errors and bugs as they arise. Python defines no special `inf` keyword; you construct these values by passing a string to `float`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "float('inf')/float('inf')" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more details, consult [Finley, 2000](http://tfinley.net/notes/cps104/floating.html) (from whom the example figure was borrowed) and [IEEE-754 Analysis](http://babbage.cs.qc.edu/IEEE-754/), a calculator which shows you the pieces of the representation for `float` and `double` types. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Accrued Error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A tragic failure stemming from something as simple as lost precision led, in February 1991 during the Gulf War, to the deaths of 28 American soldiers in a barracks in Dhahran, Saudi Arabia.\n", "\n", "A Patriot missile interceptor system used an internal clock which counted the time in tenths of a second. This count was multiplied by 1/10, in the machine's fixed-point hardware, to give the time in seconds. When an Iraqi Scud missile came in, the Patriot system had been on line for about 100 hours. What did this mean in practical terms?\n", "\n", "1/10 in binary is $0.0001100110011001100110011001100..._b$. Since a machine can't represent the infinitely repeating bits, the finite-precision representation $0.00011001100110011001100_b$ introduces an error of $0.0000000000000000000000011001100..._b \\approx 0.000000095_d$ relative to the real world. 
Over one hundred hours, this becomes an accumulated error of $100\\times 60\\times 60\\times 10\\times 0.000000095 \\approx 0.34 \\,\\textrm{s}$.\n", "\n", "If a Scud missile travels at $1.676 \\,\\textrm{km}\\cdot\\textrm{s}^{-1}$, then in $0.34 \\,\\textrm{s}$ it has traveled more than half a kilometer, far enough that the interceptor could no longer track and engage it." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# truncated binary expansion of 1/10 as used by the Patriot hardware\n", "dt = 2**-4+2**-5 + 2**-8+2**-9 + 2**-12+2**-13 + 2**-16+2**-17 + 2**-20+2**-21" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 47 }, { "cell_type": "code", "collapsed": false, "input": [ "# \"real life\" tenth of a second (or in this case, double precision)\n", "dT = 0.1" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 48 }, { "cell_type": "code", "collapsed": false, "input": [ "uptime = 100*60*60*10 # uptime in tenths of a second\n", "\n", "assumed_diff = (uptime*dt)\n", "real_diff = (uptime*dT)\n", "\n", "real_diff - assumed_diff" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 50, "text": [ "0.34332275390625" ] } ], "prompt_number": 50 }, { "cell_type": "code", "collapsed": false, "input": [ "# difference in Scud missile position as a result\n", "v = 1676 # m/s\n", "v*(real_diff-assumed_diff) # m" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 52, "text": [ "575.408935546875" ] } ], "prompt_number": 52 }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Modified Equations and Truncation Error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can consider your numerical system either as an approximate solution of an exact equation (the normal way of thinking) or as the exact solution of an approximate equation. What I mean by the latter is this: the approximations introduced by your numerical method exactly solve a different equation which is similar to the equation you would like to solve. One diagnostic strategy, then, is to determine what additional physics the numerical scheme has introduced into your problem and to look for the signature of those physics in your solution.\n", "\n", "In other words, given a finite-difference approximation of a partial differential equation, find another PDE which is better approximated by the finite-difference scheme. Then use the FD method to learn about the behavior of the new PDE\u2014the modified equation\u2014and thus the behavior of the error terms." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_First-Order Example._ Consider the hyperbolic PDE $u_t + a u_x = 0$ with initial condition $u(x, 0) = u_0 (x)$ and boundary conditions $u(L, t) = u_L (t)$; $u(R, t) = u_R (t)$. The Lax\u2013Friedrichs scheme (forward in time, centered in space, with artificial viscosity) can be written as\n", "\n", "$$\\frac{U_{j}^{n+1} - U_{j}^{n}}{\\Delta t} + a \\frac{U_{j+1}^{n} - U_{j-1}^{n}}{2 \\Delta x} - \\frac{U_{j+1}^{n} - 2 U_{j}^{n} + U_{j-1}^{n}}{2 \\Delta t} = 0.$$\n", "\n", "A Taylor series expansion of this scheme about $(x_j, t^n)$ yields\n", "$$u_{t} + a u_{x} - \\frac{1}{2 \\Delta t} u_{xx} \\Delta x^2 + \\frac{1}{2} u_{tt} \\Delta t + \\frac{1}{6} a u_{xxx} \\Delta x^2 - \\frac{1}{24 \\Delta t} u_{xxxx} \\Delta x^4 + \\,\\, ... 
$$\n", "$$\\,\\,\\,\\,\\,\\,\\, = \\left(u_{t} + a u_{x} \\right) + \\frac{1}{2} \\left({ u_tt \\Delta t}- u_{xx}\\frac{\\Delta x^2}{\\Delta t} \\right) + \\,\\, ... $$\n", "$$\\,\\,\\,\\,\\,\\,\\, = \\left(u_{t} + a u_{x} \\right) + \\frac{1}{2} \\left({ a^2 \\Delta t}- \\frac{\\Delta x^2}{\\Delta t} \\right) u_{xx} + \\,\\, ... $$\n", "\n", "From this last result, we arrive at the modified equation\n", "$$u_{t} + a u_{x} = \\frac{\\Delta x^2}{2 \\Delta t} \\left(1 - \\left(\\frac{a \\Delta t}{\\Delta x}\\right)^2 \\right) u_{xx} $$\n", "which is an advection\u2013diffusion equation, having added a diffusion term and an antidiffusive correction (by the central differencing). Thus if we see unexpected diffusive behavior, we can suspect that this is due to the numerical method used rather than another source of error. Depending on the size of $a$, we could even find that a smaller time step size relative to $\\Delta x$ _increases_ the diffusive error. (This occurs with some other techniques, such as the DuFort\u2013Frankel method.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_Second-Order Example._ Consider the parabolic PDE $\\phi_t - \\alpha \\phi_{xx} = 0$ with initial condition $\\phi(x, 0) = \\phi_0 (x)$ and boundary conditions $\\phi(L, t) = \\phi_L (t)$; $\\phi(R, t) = \\phi_R (t)$. The explicit Euler in time, second-order centered in space scheme can be written as\n", "\n", "$$\\frac{\\Phi_{j}^{n+1} - \\Phi_{j}^{n}}{\\Delta t} - \\alpha \\frac{\\Phi_{j+1}^{n} - 2\\Phi_{j}^{n} + \\Phi_{j-1}^{n}}{\\Delta x^2} = 0.$$\n", "\n", "We will write the Taylor series expansion differently this time, breaking it out explicitly by terms:\n", "$$\\Phi_j^{n+1} = \\Phi_j^n + \\Delta t \\frac{\\partial \\Phi_j^n}{\\partial t} + \\frac{\\Delta t^2}{2} \\frac{\\partial^2 \\Phi_j^n}{\\partial t^2} + \\,\\, ... $$\n", "$$\\Phi_j^{n-1} = \\Phi_j^n - \\Delta t \\frac{\\partial \\Phi_j^n}{\\partial t} + \\frac{\\Delta t^2}{2} \\frac{\\partial^2 \\Phi_j^n}{\\partial t^2} + \\,\\, ... $$\n", "\n", "$$\\Phi_j^{n+1} = \\Phi_j^n + \\Delta x \\frac{\\partial \\Phi_j^n}{\\partial x} + \\frac{\\Delta x^2}{2} \\frac{\\partial^2 \\Phi_j^n}{\\partial x^2} + \\,\\, ... $$\n", "$$\\Phi_j^{n-1} = \\Phi_j^n - \\Delta x \\frac{\\partial \\Phi_j^n}{\\partial x} + \\frac{\\Delta x^2}{2} \\frac{\\partial^2 \\Phi_j^n}{\\partial x^2} + \\,\\, ... $$\n", "\n", "From these expressions, we can reconstruct both the numerical approximation and the lowest-order portion of the error term:\n", "$$\\phi_t - \\alpha \\phi_{xx} = \\alpha \\frac{\\Delta x^2}{12} \\phi_{xxxx} - \\frac{\\Delta t}{2} \\phi_{tt} + \\,\\, ... $$\n", "This is the corresponding modified equation. Thus the equation is first-order accurate in time and second-order accurate in space.\n", "\n", "The error will thus behave as a fourth-order biharmonic term in space, and a wavelike term in time. In addition, note that reduction of either the time step or the spatial discretization size is subject to diminishing returns without a corresponding reduction in the other step size." 
] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Finding & Mitigating Numerical Error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Beyond general debugging and best practices in software development, there are a few techniques specific to scientific and numerical computing that you should bear in mind.\n", "\n", "In the first place, it is good to use characteristic scales such that _most_ of your expected values lie within the range $\\left(10^3-10^{-3}\\right)$. This allows you to easily identify underflow and overflow errors, as well as gross oscillations and instabilities.\n", "\n", "Another way you can detect numerical error is to swap out the 64-bit `double`s in your code for 32-bit `float`s to see what difference, if any, it makes. (Note particularly that in our example the 32-bit `float` was _not_ susceptible to the error introduced by the vastly more precise `double`.)" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Enhanced-Precision Summation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's go back to our second example above (the accrued imprecision from repeated summations). If we were to track the lost bits of precision separately (and perhaps make some assumptions about rounding consistent with IEEE-754), we could perhaps improve the result of the long-term calculation.\n", "\n", "Thus, the Python `math` module provides an enhanced-precision summation function `fsum` which acts on an iterable (such as a list) and tracks intermediate partial sums to improve the final accuracy." ] }, { "cell_type": "code", "collapsed": false, "input": [ "sum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ "0.9999999999999999" ] } ], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "# fsum will solve issues related to drift in repeated operations.\n", "from math import fsum\n", "fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "1.0" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "# fsum will not solve fundamental precision errors---\n", "print 'fsum([1.1, -0.8]) == 0.3 is', fsum([1.1, -0.8]) == 0.3\n", "#---but will mitigate accumulated error.\n", "print 'fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]) == 1.0 is', fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]) == 1.0" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "fsum([1.1, -0.8]) == 0.3 is False\n", "fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1]) == 1.0 is True\n" ] } ], "prompt_number": 24 }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Avoiding Extreme Intermediate Values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Maintaining precision is important in calculations with very large or small intermediate values. In these cases, it may be appropriate to utilize either arbitrary-precision arithmetic (slow) or [Kahan enhanced summation](http://en.wikipedia.org/wiki/Kahan_summation_algorithm). Kahan summation tracks the lost precision from a result separately and adds it back in at the end, allowing you to maintain higher precision than would otherwise be possible." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "References & Further Reading" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Arnold, D. N. [Two disasters caused by computer arithmetic errors](http://www.ima.umn.edu/~arnold/455.f96/disasters.html). University of Minnesota.\n", "- Bush, B. [The perils of floating point](http://www.lahey.com/float.htm). Lahey Computer Systems.\n", "- Finley, T. [Floating point](http://tfinley.net/notes/cps104/floating.html)\n", "- Goldberg, D. [What every computer scientist should know about floating-point arithmetic](http://www.validlab.com/goldberg/paper.pdf).\n", "- [IEEE-754 Analysis Calculator](http://babbage.cs.qc.cuny.edu/IEEE-754/).\n", "- Netlib.org ScaLAPACK User's Guide, [\u201cSources of Error in Numerical Calculations\u201d](http://www.netlib.org/scalapack/slug/node133.html).\n", "- [Floating-point arithmetic: Issues and limitations](https://docs.python.org/3/tutorial/floatingpoint.html), Python Foundation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Neal Davis developed these materials for [Computational Science and Engineering](http://cse.illinois.edu/) at the University of Illinois at Urbana\u2013Champaign. This content is available under a Creative Commons Attribution 3.0 Unported License." ] } ], "metadata": {} } ] }