{ "metadata": { "name": "", "signature": "sha256:70494b3455eef6b39078bf8db0b82b3600a42c341a90da855b1bd1111eea2b54" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This material assumes that you have programmed before. This first lecture provides a quick introduction to programming in Python for those who either haven't used Python before or need a quick refresher.\n", "\n", "Let's start with a hypothetical problem we want to solve. We are interested in understanding the relationship between the weather and the number of mosquitos occurring in a particular year so that we can plan mosquito control measures accordingly. Since we want to apply these mosquito control measures at a number of different sites, we need to understand both the relationship at a particular site and whether or not it is consistent across sites. The data we have to address this problem come from the local government and are stored in tables in comma-separated values (CSV) files. Each file holds the data for a single location, each row holds the information for a single year at that location, and the columns hold the data on both mosquito numbers and the average temperature and rainfall from the beginning of mosquito breeding season. 
The first few rows of our first file look like:\n", "\n", "~~~\n", "year,temperature,rainfall,mosquitos\n", "2001,87,222,198\n", "2002,72,103,105\n", "2003,77,176,166\n", "~~~" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Objectives" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Use variable assignment, loops, and conditionals in Python\n", "* Use an external Python library\n", "* Read tabular data from a file\n", "* Subset and perform analysis on data\n", "* Display simple graphs" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Loading Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To load the data, we need to import a library called Pandas that knows\n", "how to operate on tables of data." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now use Pandas to read our data file." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pandas.read_csv('A1_mosquito_data.csv')" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>year</th><th>temperature</th><th>rainfall</th><th>mosquitos</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>2001</td><td>80</td><td>157</td><td>150</td></tr>\n",
"<tr><th>1</th><td>2002</td><td>85</td><td>252</td><td>217</td></tr>\n",
"<tr><th>2</th><td>2003</td><td>86</td><td>154</td><td>153</td></tr>\n",
"<tr><th>3</th><td>2004</td><td>87</td><td>159</td><td>158</td></tr>\n",
"<tr><th>4</th><td>2005</td><td>74</td><td>292</td><td>243</td></tr>\n",
"<tr><th>5</th><td>2006</td><td>75</td><td>283</td><td>237</td></tr>\n",
"<tr><th>6</th><td>2007</td><td>80</td><td>214</td><td>190</td></tr>\n",
"<tr><th>7</th><td>2008</td><td>85</td><td>197</td><td>181</td></tr>\n",
"<tr><th>8</th><td>2009</td><td>74</td><td>231</td><td>200</td></tr>\n",
"<tr><th>9</th><td>2010</td><td>74</td><td>207</td><td>184</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>10 rows \u00d7 4 columns</p>
\n", "