{ "metadata": { "name": "", "signature": "sha256:b47885f40315541b63c36ee22d27e66a640dfeae2713c09b685a214c76c7ca86" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Practical 5a\n", "============\n", "\n", "1. Examine the file \"data/qzpercentages.csv\" (use %load iPython command \u2013 %load data/qzpercentages.csv)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Tip! By starting iPython notebook from the pipeline you will know what\n", "# folder you are in and find it easier to find the file to load. If you find\n", "# relative paths too much for you then just copy the data into the working\n", "# folder for now.\n", "\n", "%load ../data/qzpercentages.csv" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "Quartz Percentages in Samples\n", "53\n", "49\n", "56\n", "61\n", "41\n", "52\n", "24\n", "51\n", "32\n", "34\n", "51\n", "49\n", "41\n", "45\n", "48\n", "57\n", "47\n", "42\n", "36\n", "55\n", "47\n", "50\n", "58\n", "53\n", "45\n", "37\n", "45\n", "41\n", "51\n", "46\n", "42\n", "61\n", "47\n", "40\n", "55\n", "37\n", "35\n", "43\n", "32\n", "43\n", "53\n", "29\n", "56\n", "56\n", "46\n", "36\n", "40\n", "37\n", "50\n", "39\n", "45\n", "43\n", "38\n", "37\n", "53\n", "51\n", "55\n", "51\n", "48\n", "50\n", "55\n", "55\n", "48\n", "46\n", "50\n", "53\n", "51\n", "42\n", "52\n", "54\n", "48\n", "52\n", "60\n", "43\n", "46\n", "42\n", "40\n", "34\n", "44\n", "43\n", "46\n", "48\n", "61\n", "54\n", "46\n", "44\n", "57\n", "56\n", "41\n", "54\n", "60\n", "55\n", "32\n", "38\n", "45\n", "63\n", "44\n", "51\n", "65\n", "45\n", "34\n", "47\n", "42\n", "49\n", "51\n", "41\n", "55\n", "56\n", "48\n", "44\n", "28\n", "50\n", "66\n", "50\n", "42\n", "36\n", "47\n", "51\n", "42\n", "56\n", "33\n", "44\n", "35\n", "44\n", "43\n", "49\n", "38\n", "48\n", "49\n", "34\n", "46\n", "53\n", "41\n", "51\n", 
"46\n", "45\n", "36\n", "54\n", "45\n", "65\n", "48\n", "45\n", "50\n", "48\n", "52\n", "34\n", "41\n", "44\n", "48\n", "40\n", "40\n", "52\n", "52\n", "45\n", "55\n", "38\n", "48\n", "42\n", "46\n", "46\n", "42\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Write a program to read in the data and then calculate the mean, median, range, interquartile range, standard deviation, variance, and mode." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# In the previous lecture course you learned how to read in ascii file and parse them.\n", "\n", "# We could do that here as .csv files are easy to read but...\n", "\n", "# ...it is much better/faster/easier to always spend a few minutes online to see if \n", "# there is a \"right\" way of doing this.\n", "\n", "# In this case I would write lots of loops, carefully skip the header (if there were\n", "# multiple columns then I would have to handle that as well) etc. 
But a quick check\n", "# online reveals I only have to do this...\n", "\n", "import numpy as np\n", "\n", "# Read in the records.\n", "record = np.recfromcsv(\"../data/qzpercentages.csv\")\n", "\n", "# Convert this to a NumPy array - note that while the data in the record was of\n", "# type integer (how do I know this?), I only have to specify dtype to convert all\n", "# the data to floats.\n", "array = np.array(record, dtype=float)\n", "\n", "# Finally you bask in the glory of your cleverness having RTFM'ed...\n", "# http://docs.scipy.org/doc/numpy/reference/routines.statistics.html\n", "print(\"Mean %g\" % np.mean(array))\n", "print(\"Median %g\" % np.median(array))\n", "print(\"Min, max (%g, %g)\" % (np.amin(array), np.amax(array)))\n", "print(\"Range %g\" % np.ptp(array))\n", "print(\"Interquartile range %g\" % (np.percentile(array, 75) - np.percentile(array, 25)))\n", "# Note: np.std and np.var default to ddof=0, i.e. the population statistics.\n", "print(\"Standard deviation %g\" % np.std(array))\n", "print(\"Variance %g\" % np.var(array))\n", "\n", "# NumPy doesn't have a mode function; however, a quick online search turns up\n", "# scipy.stats, which has that and a whole lot more :-)\n", "# http://docs.scipy.org/doc/scipy/reference/stats.html\n", "from scipy import stats\n", "print(\"Mode %g\" % stats.mode(array)[0])\n", "\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Mean 46.5714\n", "Median 46\n", "Min, max (24, 66)\n", "Range 42\n", "Interquartile range 10\n", "Standard deviation 7.76896\n", "Variance 60.3567\n", "Mode 48\n" ] } ], "prompt_number": 48 } ], "metadata": {} } ] }