{ "cells": [ { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": false, "grade_id": "title", "locked": true, "solution": false } }, "source": [ "# Importing data and plotting\n", "\n", "For this exercise we will be looking at data for Prussian cavalry killed by horse kicks, this is found in the aptly named 'HorseKicks.dat' file. The first column holds the year and the subsequent columns hold the number of horse kick deaths for 14 different cavalry corps. This data was made famous by von Bortkiewicz as it follows a Poisson model reasonably well.\n", "\n", "Data files such as .csv and .dat files can be imported into numpy arrays via the numpy function loadtxt. The usage of this is as follows:\n", " \n", " import numpy as np\n", " \n", " data = np.loadtxt('HorseKicks.dat',dtype = 'int')\n", " \n", "This loads the data into a 2-dimensional array, the columns and rows can be accessed by indices and slices. E.g. the first column is given by data[:,0]. We have used dtype here to specify we would like the data to imported as integers rather than floats. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true, "nbgrader": { "grade": false, "grade_id": "init", "locked": true, "solution": false } }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": false, "grade_id": "its_your_boiiii", "locked": true, "solution": false } }, "source": [ "### Poisson Distribution\n", "\n", "The Poisson distribution is defined by its mean, $\\lambda$. This can be found from our imported data using a standard arithmetic mean:\n", "\n", "$$ \\bar{x} = \\frac{1}{N} \\sum_{i} x_i $$\n", "\n", "Then the probability of $k$ events within the interval is given by:\n", "\n", "$$ P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} $$\n", "\n", "##### Write a function named factorial to calculate $k!$ including a special case for $0!$:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "nbgrader": { "grade": false, "grade_id": "1", "locked": false, "solution": true } }, "outputs": [], "source": [ "def factorial(x):\n", " if int(x) == 0:\n", " return 1\n", " elif int(x) < 0:\n", " print('incorrect input')\n", " else:\n", " result = int(x)\n", " for i in range(1,int(x)-1):\n", " result *= int(x)-i\n", " return result " ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "nbgrader": { "grade": true, "grade_id": "fact_test", "locked": true, "points": 2, "solution": false } }, "outputs": [], "source": [ "assert factorial(0) == 1\n", "from random import randint\n", "from scipy.misc import factorial as fact\n", "rand = randint(1,9)\n", "assert factorial(rand) == fact(rand)" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": false, "grade_id": "qwertyuiop", "locked": true, "solution": false } }, "source": [ "### Main Exercise\n", "Below write some code to import the data and calculate the mean and store the value in a variable named mean" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "nbgrader": { "grade": false, "grade_id": "import_data", "locked": false, "solution": true } }, "outputs": [], "source": [ "data = np.loadtxt('HorseKicks.dat',dtype = 'int')\n", "year = data[:,0]\n", "\n", "kick_data = data[:,1:]\n", "mean = np.sum(kick_data)/kick_data.ravel().shape[0]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true, "nbgrader": { "grade": true, "grade_id": "check_mean", "locked": true, "solution": false } }, "outputs": [], "source": [ "assert mean == 0.7" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": false, "grade_id": "plot_give", "locked": true, "solution": false } }, "source": [ "### Plotting\n", "Now write code below to plot the probability of $n$ officers being kicked to death against $n$ from the data and from the Poisson distribution " ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "nbgrader": { "grade": true, "grade_id": "plot_ans", "locked": false, "solution": true } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAEACAYAAABWLgY0AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEnlJREFUeJzt3X9sXWd9x/H31y1FI0xIpVLB7mKMy2AgRoZEKGqn3q7K\nCNtE0IZGSgoL00RAazaBNrVDWHZkpFGtY4MixqJl8xjdWgRTG2CMWrRXCEHbDFpgkDSpsbzEbrtV\nQKcY2EL93R/3xru9tXOv7Wvf6yfvl2Tl/Hh8zjdP2s89fs55jiMzkSSVqa/bBUiS1o8hL0kFM+Ql\nqWCGvCQVzJCXpIIZ8pJUsLZCPiJ2RsSxiDgeETcu06YSEQ9GxL9HxL2dLVOStBrR6jn5iOgDjgPX\nAnPAEWB3Zh5raPM84KvAr2bmbERckplPrF/ZkqR2tHMlvx04kZkzmXkGuB3Y1dTmrcBnMnMWwICX\npN7QTsgPACcb1k/VtzX6eeDiiLg3Io5ExNs6VaAkafUu7OBxXg38CrAF+FpEfC0zH+nQ8SVJq9BO\nyM8CWxvWL6tva3QKeCIzfwL8JCK+DLwKeFrIR4QvypGkVcjMWM33tTNccwS4PCIGI+IiYDdwuKnN\nXcBVEXFBRDwHeC1wdJlCe/5rdHS06zVYp3Vu1hqts/Nfa9HySj4zn4qIG4C7qX0oHMrMoxGxr7Y7\nD2bmsYj4IvAt4CngYGZ+d02VSZLWrK0x+cz8V+ClTdv+umn9FuCWzpUmSVorZ7wuoVKpdLuEtlhn\nZ22GOjdDjWCdvaTlZKiOniwiN/J8klSCiCDX8carJGmTMuQlqWCGvCQVzJCXpIIZ8pJUMENekgpm\nyEtSwQx5SSqYIS9JBTPkJalghrwkFcyQl6SCGfKSVDBDXpIKZshLUsEMeUkqmCEvSQUz5CWpYIa8\nJBXMkJekghnyklQwQ16SCrbhIX/99QeYnp7Z6NNK0nkpMnPjThaRcJrh4VEmJ/czNDS4YeeWpM0q\nIsjMWM33dmG4ZgtTUwcYGZnY+FNL0nmmS2PyW5ibW+jOqSXpPNJWyEfEzog4FhHHI+LGJfZfHRE/\njIhv1L/ef+4jztPf7z1fSVpvF7ZqEBF9wEeBa4E54EhE3JWZx5qafjkz39j6lPMMD48yPr5/FeVK\nklaincvp7cCJzJzJzDPA7cCuJdq1dVNgz55bvOkqSRuknZAfAE42rJ+qb2v2uoh4KCI+HxEvX+5g\nL+EEfTgeL0kboVMD418HtmbmNmpDO3cu1/CPbruNW3fsYGZ6ukOnliQtp+WYPDALbG1Yv6y+bVFm\nnm5Y/kJEfCwiLs7M7zcf7M+Ai6am+J1duxj7yEeoVCqrq1ySClWtVqlWqx05VsvJUBFxAfAwtRuv\njwIPANdl5tGGNpdm5uP15e3ApzLzRUsca/Fso9dcw4F77unIX0KSSraWyVAtr+Qz86mIuAG4m9rw\nzqHMPBoR+2q78yDw5oh4N3AG+DHwlnMdcx7o6+9fTb2SpBXY8NcanAZGh4fZPznJ4NDQhp1bkjar\nTfVag1v27DHgJWmDbPiV/EaeT5JKsKmu5CVJG8eQl6SCGfKSVDBDXpIKZshLUsEMeUkqmCEvSQUz\n5CWpYIa8JBXMkJekghnyklQwQ16SCmbIS1LBDHlJKpghL0kFM+QlqWCGvCQVzJCXpIIZ8pJUMENe\nkgpmyEtSwQx5SSqYIS9JBTPkJalghrwkFcyQl6SCGfKSVLC2Qj4idkbEsYg4HhE3nqPdayLiTET8\nZudKlCStVsuQj4g+4KPA64FXANdFxMuWafdB4IudLlKStDrtXMlvB05k5kxmngFuB3Yt0W4/8Gng\nPztYnyRpDdoJ+QHgZMP6qfq2RRHRD7wpM/8KiM6VJ0laiws7dJy/BBrH6pcN+rGxscXlSqVCpVLp\nUAmSVIZqtUq1Wu3IsSIzz90g4gpgLDN31tdvAjIzb25o872zi8AlwDzwzsw83HSsbHU+SdLTRQSZ\nuapRknZC/gLgYeBa4FHgAeC6zDy6TPu/Az6bmf+8xD5DXpJWaC0h33K4JjOfiogbgLupjeEfysyj\nEbGvtjsPNn/LagqRJHVeyyv5jp7MK3lJWrG1XMk741WSCmbIS1LBDHlJKpghL0kFM+QlqWCGvCQV\nzJCXpIIZ8pJUMENekgpmyEtSwQx5SSpYp94nrw00Mz3NxMgIC7Oz9A0MsHd8nMGhoW6XJakH+YKy\nTWZmeppbd+zgwNQUW6i9uH90eJj9k5MGvVQoX1B2HpkYGVkMeIAtwIGpKSZGRrpZlqQeZchvMguz\ns4sBf9YWYGFurhvlSOpxhvwm0zcwwHzTtnmgr7+/G+VI6nGG/Cazd3yc0eHhxaA/Oya/d3y8m2VJ\n6lHeeN2EFp+umZujr7/fp2ukwq3rL/LuJENeklbOp2skSUtyMtQmND09w8jIBLOzCwwM9DE+vpeh\nocFulyWpBzlcs8lMT8+wY8etTE0dgPp0qOHhUSYn9xv0UqEcrjmPjIxMNAQ8wBampg4wMjLRxaok\n9SpDfpOZnV2AJaZDzc0tdKMcST3OkN9kBgb6YInpUP39/lNKeiaTYZMZH9/L8PAoNEyHGh4eZXx8\nb9dqktS7vPG6CZ19umZuboH+fp+ukUrnZChJKti6P10TETsj4lhEHI+IG5fY/8aI+GZEPBgRD0TE\nlaspRpLUWS2v5COiDzgOXAvMAUeA3Zl5rKHNczLzR/XlVwKfysxfWOJYXslL0gqt95X8duBEZs5k\n5hngdmBXY4OzAV/3XMDn+SSpB7QT8gPAyYb1U/VtTxMRb4qIo8Bngd/tTHmSpLXo2LtrMvNO4M6I\nuAr4ALBjqXZjY2OLy5VKhUql0qkSJKkI1WqVarXakWO1MyZ/BTCWmTvr6zcBmZk3n+N7poDXZOb3\nm7Y7Ji9JK7TeY/JHgMsjYjAiLgJ2A4ebChhuWH41cFFzwEuSNl7L4ZrMfCoibgDupvahcCgzj0bE\nvtruPAj8VkS8Hfhf4MfAb69n0ZKk9jgZSpJ6nK8aliQtyZCXpIIZ8pJUMENekgpmyEtSwQx5SSqY\nIS9JBTPkJalghrwkFcyQl6SCGfKSVDBDXpIKZshLUsEMeUkqmCEvSQUz5CWpYIa8JBXMkJekghny\nklQwQ16SCmbIS1LBDHlJKpghL0kFM+QlqWCGvCQVzJCXpIIZ8pJUMENekgrWVshHxM6IOBYRxyPi\nxiX2vzUivln/+kpEvLLzpUqSVioy89wNIvqA48C1wBxwBNidmcca2lwBHM3MJyNiJzCWmVcscaxs\ndT5J0tNFBJkZq/nedq7ktwMnMnMmM88AtwO7Ghtk5n2Z+WR99T5gYDXFSJI6q52QHwBONqyf4twh\n/nvAF9ZSlCSpMy7s5MEi4hrgHcBVy7UZGxtbXK5UKlQqlU6WIEmbXrVapVqtduRY7YzJX0FtjH1n\nff0mIDPz5qZ2vwh8BtiZmVPLHMsxeUlaofUekz8CXB4RgxFxEbAbONxUwFZqAf+25QJekrTxWg7X\nZOZTEXEDcDe1D4VDmXk0IvbVdudBYAS4GPhYRARwJjO3r2fhkqTWWg7XdPRkDtdI0oqt93CNJGmT\nMuQlqWCGvCQVzJCXpIIZ8pJUMENekgpmyEtSwQx5SSqYIS9JBTPkJalghrwkFcyQl6SCGfKSVDBD\nXpIKZshLUsEMeUkqmCEvSQUz5CWpYIa8JBXMkJekghnyklQwQ16SCnZhtwuQum1mepqJkREWZmfp\nGxhg7/g4g0ND3S5L6ojIzI07WURu5PmkVmamp7l1xw4OTE2xBZgHRoeH2T85adCrZ0QEmRmr+V6H\na3RemxgZWQx4gC3AgakpJkZGulmW1DGGvM5r8w0Bf9YWYH7qe90oR+o4Q17ntYcem2e+ads88NBj\np7tRjtRxhrzOa6cvrbCH4cWgnwf2MMzpF1S6WJXUOW2FfETsjIhjEXE8Im5cYv9LI+KrEfGTiHhv\n58uU1seLL38+d3EX29jDL3MN29jDXdzFi4ef3+3SpI5o+XRNRPQBx4FrgTngCLA7M481tLkEGATe\nBPwgMz+0zLF8ukY9ZXp6hh07bmVq6gD10XiGh0eZnNzP0NBgt8uTgLU9XdPOc/LbgROZOVM/2e3A\nLmAx5DPzCeCJiPiN1RQhdcvQ0CCTk/sZGbmFubkF+vv7GB834FWOdkJ+ADjZsH6KWvBLRRgaGuST\nnxztdhnSuvDGqyQVrJ0r+Vlga8P6ZfVtqzI2Nra4XKlUqFQqqz2UJBWpWq1SrVY7cqx2brxeADxM\n7cbro8ADwHWZeXSJtqPA6cz882WO5Y1XSVqhtdx4bevdNRGxE/gwteGdQ5n5wYjYB2RmHoyIS4F/\nA34WWABOAy/PzNNNxzHkJWmF1j3kO8WQl6SV8wVlkqQlGfKSVDBDXpIKZshLUsEMeUkqmCEvSQUz\n5CWpYIa8JBXMkJekghnyklQwQ16SCmbIS1LBDHlJKpghL0kFM+QlqWDt/Po/SWrL9PQMIyMTzM4u\nMDDQx/j4XoaGBrtd1nnNXxoiqSOmp2fYseNWpqYOAFuAeYaHR5mc3G/Qr5G/NERS142MTDQEPMAW\npqYOMDIy0cWqZMhL6ojZ2QX+P+DP2sLc3EI3ylGdY/KSOmJgoA/4Dpfzp7yAWR5jgEf4E/r7vZbs\nJkNeUke8653X8j93XMnET5+sj8jD3gs/x7ve+blul3Ze8yNWUkd86eDHFwMeagM3Ez99ki8d/Hg3\nyzrvGfKSOmJhdnaJEXlYmJvrRjmqM+QldUTfwADzTdvmgb7+/m6UozpDXlJH7B0fZ3R4eDHo54HR\n4WH2jo93s6zznpOhJHXMzPQ0EyMjLMzN0dffz97xcQaHhrpd1qa3lslQhrwk9ThnvEpSgaanZ7j+\n+gNrOkZbIR8ROyPiWEQcj4gbl2nzkYg4EREPRcS2NVUlSee56ekZrrn6A9x/29E1HadlyEdEH/BR\n4PXAK4DrIuJlTW3eAAxn5kuAfcCmfjC2Wq12u4S2WGdnbYY6N0ONYJ2d8N73fIhtJ7/EQ9yxpuO0\ncyW/HTiRmTOZeQa4HdjV1GYX8AmAzLwfeF5EXLqmyrqol//hG1lnZ22GOjdDjWCdnfBfX7uH25h+\nxtyDlWon5AeAkw3rp+rbztVmdok2kqQ2vZDTaw548MarJPWkF13xqmdMLluNlo9QRsQVwFhm7qyv\n3wRkZt7c0ObjwL2ZeUd9/RhwdWY+3nQsn5+UpFVY7SOU7byF8ghweUQMAo8Cu4HrmtocBn4fuKP+\nofDD5oBfS5GSpNVpGfKZ+VRE3ADcTW1451BmHo2IfbXdeTAz/yUifi0iHqE2m/kd61u2JKkdGzrj\nVZK0sdblxutmmTzVqs6IuDoifhgR36h/vb8LNR6KiMcj4lvnaNMLfXnOOnukLy+LiHsi4jsR8e2I\n+INl2nW1P9ups0f689kRcX9EPFivc3SZdt3uz5Z19kJ/1uvoq5//8DL7V96XmdnRL2ofHI8Ag8Cz\ngIeAlzW1eQPw+frya4H7Ol1Hh+q8Gji80bU11XAVsA341jL7u96XbdbZC335AmBbffm5wMM9+t9m\nO3V2vT/rdTyn/ucFwH3A9l7rzzbr7JX+fA/wyaVqWW1frseV/GaZPNVOnQBdvVmcmV8BfnCOJr3Q\nl+3UCd3vy8cy86H68mngKM+cz9H1/myzTuhyfwJk5o/qi8+mdo+vefy36/1ZP3erOqHL/RkRlwG/\nBvzNMk1W1ZfrEfKbZfJUO3UCvK7+o9HnI+LlG1PaivRCX7arZ/oyIl5E7SeP+5t29VR/nqNO6IH+\nrA8vPAg8Bkxm5pGmJj3Rn23UCd3vz78A/pilP4BglX3pZKhz+zqwNTO3UXt/z51drmcz65m+jIjn\nAp8G/rB+pdyTWtTZE/2ZmQuZ+UvAZcBru/3hvZw26uxqf0bErwOP13+CCzr4U8V6hPwssLVh/bL6\ntuY2P9eizXprWWdmnj77Y15mfgF4VkRcvHEltqUX+rKlXunLiLiQWnD+Q2betUSTnujPVnX2Sn82\n1PPfwL3AzqZdPdGfZy1XZw/055XAGyPie8A/AddExCea2qyqL9cj5BcnT0XERdQmTzXfKT4MvB0W\nZ9QuOXlqnbWss3G8KyK2U3vk9PsbW2bt9Cz/yd4LfXnWsnX2UF/+LfDdzPzwMvt7pT/PWWcv9GdE\nXBIRz6sv/wywAzjW1Kzr/dlOnd3uz8x8X2ZuzcwXU8uiezLz7U3NVtWX7cx4XWmxm2LyVDt1Am+O\niHcDZ4AfA2/Z6Doj4h+BCvD8iPgPYBS4iB7qy3bqpDf68kpgD/Dt+vhsAu+j9oRVz/RnO3XSA/0J\nvBD4+6i9jrwPuKPefz31/3o7ddIb/fkMnehLJ0NJUsG88SpJBTPkJalghrwkFcyQl6SCGfKSVDBD\nXpIKZshLUsEMeUkq2P8BU8eAX/e6tOUAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x = np.array((0.,1.,2.,3.,4.))\n", "count_data = np.bincount(kick_data.ravel())/kick_data.ravel().shape[0]\n", "\n", "plt.plot(x,count_data,'bo')\n", "\n", "P = [mean**y*np.exp(-mean)/factorial(y) for y in x]\n", "\n", "plt.plot(x,P,'ro')\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Create Assignment", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }