{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CR\u00dcCIAL P\u0178THON Week 2: Iteration" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.core.display import Image \n", "Image(url='http://labrosa.ee.columbia.edu/crucialpython/logo.png', width=600) " ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "" ], "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterators\n", "Python makes it easy to iterate over objects which are collections of things (**iterables**). This makes ``for`` loops look really clean, among other things." ] }, { "cell_type": "code", "collapsed": false, "input": [ "a_list = ['hey', [1, 2, 3], 2, 10, {}]\n", "for item in a_list:\n", "    print item" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "hey\n", "[1, 2, 3]\n", "2\n", "10\n", "{}\n" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "# Another example of an iterator is a file\n", "with open('some_file.txt', 'r') as f:\n", "    for line in f:\n", "        print line" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "there are some lines\n", "\n", "in this text file\n", "\n", "here's another one\n", "\n", "and another\n", "\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## enumerate\n", "Often, you need the index of each item as you iterate over an object. A naive way to do it would be to loop over the list indices and then access your list using each index. ``enumerate()`` provides a cleaner way..." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "a_dict = {}\n", "# This is kind of ugly and only works for objects with a len\n", "for n in xrange(len(a_list)):\n", "    if isinstance(a_list[n], int):\n", "        a_dict[n] = n*a_list[n]\n", "# You know about pprint (pretty print), right?\n", "import pprint\n", "pprint.pprint(a_dict)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{2: 4, 3: 30}\n" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "# File objects don't have a len() because they are not loaded into memory.\n", "# But, you can still use enumerate, which will give you an index.\n", "# Plus, it's cleaner.\n", "for n, item in enumerate(open('some_file.txt', 'r')):\n", "    a_dict[n + 5] = item\n", "pprint.pprint(a_dict)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{2: 4,\n", " 3: 30,\n", " 5: 'there are some lines\\n',\n", " 6: 'in this text file\\n',\n", " 7: \"here's another one\\n\",\n", " 8: 'and another\\n'}\n" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## zip\n", "One situation where you might be tempted to iterate over list indices is when you need to access multiple lists at a time. ``zip()`` is your friend here." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "b_list = ['cool', 'great', 'awesome', 'nice', 'sweet']\n", "c_list = range(10, 15)\n", "# Naw\n", "for n in xrange(len(a_list)):\n", "    print n, a_list[n], b_list[n], c_list[n]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0 hey cool 10\n", "1 [1, 2, 3] great 11\n", "2 2 awesome 12\n", "3 10 nice 13\n", "4 {} sweet 14\n" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "# Yeah\n", "for a_item, b_item, c_item in zip(a_list, b_list, c_list):\n", "    print a_item, b_item, c_item" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "hey cool 10\n", "[1, 2, 3] great 11\n", "2 awesome 12\n", "10 nice 13\n", "{} sweet 14\n" ] } ], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "# When you need indices and items across lists, enumerate and zip!\n", "for n, (a_item, b_item, c_item) in enumerate(zip(a_list, b_list, c_list)):\n", "    print n, a_item, b_item, c_item" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0 hey cool 10\n", "1 [1, 2, 3] great 11\n", "2 2 awesome 12\n", "3 10 nice 13\n", "4 {} sweet 14\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## itertools\n", "The ``itertools`` module provides a lot of fancy and useful functions for combining iterators. 
In general, if you have some kind of funky iteration you want to try, check that it isn't already done more cleanly here: http://docs.python.org/2/library/itertools.html" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import itertools" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "# We can add lists to concatenate them, but we can't add list + file.\n", "# Plus, we don't know how many items the file has in it.\n", "# Chain lets us iterate over file first, then the list.\n", "# We could do this in two for loops, but then we'd have to duplicate code.\n", "for item in itertools.chain(open('some_file.txt', 'r'), a_list):\n", "    print item" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "there are some lines\n", "\n", "in this text file\n", "\n", "here's another one\n", "\n", "and another\n", "\n", "hey\n", "[1, 2, 3]\n", "2\n", "10\n", "{}\n" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "# izip is a lazy zip; with inputs of different length it stops at the shortest\n", "for a, b in itertools.izip(a_list, open('some_file.txt', 'r')):\n", "    print a, b" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "hey there are some lines\n", "\n", "[1, 2, 3] in this text file\n", "\n", "2 here's another one\n", "\n", "10 and another\n", "\n" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "# izip_longest runs until the longest input is exhausted, filling gaps with None\n", "for a, b in itertools.izip_longest(a_list, open('some_file.txt', 'r')):\n", "    print a, b" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "hey there are some lines\n", "\n", "[1, 2, 3] in this text file\n", "\n", "2 here's another one\n", "\n", "10 and another\n", "\n", "{} None\n" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "short_list = 
['dang', 'cool', 'nice']\n", "# Get all length-n permutations of items from the list (order matters)\n", "for item_a, item_b in itertools.permutations(short_list, 2):\n", "    print item_a, item_b" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "dang cool\n", "dang nice\n", "cool dang\n", "cool nice\n", "nice dang\n", "nice cool\n" ] } ], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "# Get all length-n combinations of items from the list (order doesn't matter)\n", "for item_a, item_b in itertools.combinations(short_list, 2):\n", "    print item_a, item_b" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "dang cool\n", "dang nice\n", "cool nice\n" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List comprehension\n", "A truly crucial tool: Create lists using an inline ``for`` loop" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Basic example: Read in all lines\n", "print [line for line in open('some_file.txt', 'r')]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['there are some lines\\n', 'in this text file\\n', \"here's another one\\n\", 'and another\\n']\n" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "# We can apply functions to each item too\n", "print [line.strip() for line in open('some_file.txt', 'r')]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['there are some lines', 'in this text file', \"here's another one\", 'and another']\n" ] } ], "prompt_number": 19 }, { "cell_type": "code", "collapsed": false, "input": [ "# Also can use conditional statements\n", "print [line.strip() for line in open('some_file.txt', 'r') if 'another' not in line]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": 
"stream", "stream": "stdout", "text": [ "['there are some lines', 'in this text file']\n" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "# Also can do nested list comprehension! \n", "print [line.strip() for f in ['some_file.txt', 'some_other_file.txt'] for line in open(f, 'r')]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['there are some lines', 'in this text file', \"here's another one\", 'and another', 'these are lines', 'from a different text file', 'a second one, specifically']\n" ] } ], "prompt_number": 22 } ], "metadata": {} } ] }