{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Lecture 5: Loops\n", "\n", "CSCI 1360: Foundations for Informatics and Analytics" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Overview and Objectives\n", "\n", "In this lecture, we'll go over the basics of looping in Python. By the end of this lecture, you should be able to" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - Perform basic arithmetic operations using arbitrary-length collections\n", " - Use \"unpacking\" as a shortcut for iterating through dictionaries\n", " - Describe the differences between the separate kinds of loops" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Part 1: `for` Loops" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Looping, like lists, is a critical component in programming and data science. When we're training models on data, we'll need to loop over each data point, examining it in turn and adjusting our model accordingly *regardless of how many data points there are*. This kind of repetitive task is ideal for looping." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's define for ourselves the following list:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "ages = [21, 22, 19, 19, 22, 21, 22, 31]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is a list containing the ages of some group of students, and we want to compute the average. How do we compute averages?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "We know an average is some *total quantity* divided by *number of elements*. Well, the latter is easy enough to compute:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8\n" ] } ], "source": [ "number_of_elements = len(ages)\n", "print(number_of_elements)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The total quantity is a bit trickier. You could certainly sum them all manually--" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "age_sum = ages[0] + ages[1] + ages[2] # + ... and so on" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "...but that seems really, really tedious. Plus, how do you even know how many elements your list has?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Loop structure" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The structure itself is pretty simple:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - some collection of \"things\" to iterate over" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - a placeholder for the current \"thing\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - a chunk of code describing what to do with the current \"thing\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's start simple: looping through a list, printing out each item one at a time." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "5\n", "7\n", "9\n" ] } ], "source": [ "for N in [2, 5, 7, 9]: # Header\n", " print(N) # Body" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "There are two main parts to the loop: the **header** and the **body**." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - The **header** contains 1) the collection we're iterating over (in this example, the list), and 2) the \"placeholder\" we're using to hold the current value (in this example, `N`)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - The **body** is the chunk of code under the header (indented!) that executes on each iteration." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Back, then, to computing an average:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Average age: 22.12\n" ] } ], "source": [ "age_sum = 0\n", "ages = [21, 22, 19, 19, 22, 21, 22, 31]\n", "\n", "for age in ages:\n", "    age_sum += age\n", "\n", "avg = age_sum / number_of_elements # Compute the average using the formula we know and love!\n", "print(\"Average age: {:.2f}\".format(avg))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "You can loop through sets and tuples the same way." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "2\n", "3\n", "5\n" ] } ], "source": [ "s = set([1, 1, 2, 3, 5])\n", "for item in s:\n", "    print(item)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "1\n", "2\n", "3\n", "5\n" ] } ], "source": [ "t = tuple([1, 1, 2, 3, 5])\n", "for item in t:\n", "    print(item)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Iterators" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The unifying theme with all these collections you can loop through is that they're all *iterables*: objects that can hand over their elements one at a time. (Strictly speaking, an *iterator* is the helper object Python creates behind the scenes to walk through an iterable.)" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Easily the most common thing you'll loop over (aside from lists, sets, and tuples) is the sequence of numbers produced by the `range` function:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 5 6 7 8 9 " ] } ], "source": [ "for i in range(10):\n", "    print(i, end = \" \") # Prints everything on 1 line." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Note, again, that the range of numbers goes from 0 (inclusive) to the specified end (exclusive)! The critical point is that, when you pass a single argument to `range`, that argument is also the *length* of the resulting sequence." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "A few more examples of `range` before we get back to loops:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 " ] } ], "source": [ "for i in range(5): # One argument: specifies the \"end\"\n", "    print(i, end = \" \")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5 6 7 8 9 " ] } ], "source": [ "for i in range(5, 10): # Two arguments: first is \"start\" (inclusive), second is \"end\" (exclusive)\n", "    print(i, end = \" \")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 2 4 6 8 " ] } ], "source": [ "for i in range(0, 10, 2): # Three arguments: start, end, and increment\n", "    print(i, end = \" 
\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**IMPORTANT: INDENTATION MATTERS**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "You'll notice in these loops that the *loop body* is distinctly indented relative to the *loop header*. This is intentional and is indeed how it works! If you fail to indent the body of the loop, Python will complain:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "IndentationError", "evalue": "expected an indented block (, line 3)", "output_type": "error", "traceback": [ "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m3\u001b[0m\n\u001b[0;31m print(item)\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mIndentationError\u001b[0m\u001b[0;31m:\u001b[0m expected an indented block\n" ] } ], "source": [ "some_list = [3.14159, \"random stuff\", 4200]\n", "for item in some_list:\n", "print(item)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "With loops, whitespace in Python *really* starts to matter. If you want many things to happen inside of a loop, you'll need to indent every line!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's say in some future homework assignment, I ask you to write a loop computing the squares of the numbers 1-10. How would you do it?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Well, you could manually write it out, I suppose..." 
] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "squares = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "...but that's awfully boring." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Instead, let's use the `range` function we were just discussing:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]\n" ] } ], "source": [ "squares = [] # Empty list for all our squares\n", "\n", "for num in range(1, 11): # Start at 1 (inclusive), stop at 11 (exclusive): the numbers 1-10\n", "    squared_number = num ** 2 # Exponent operation!\n", "    squares.append(squared_number) # Add to our list.\n", "\n", "print(squares)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Looping through dictionaries" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This gets its own subsection because it pulls together pretty much all the concepts we've discussed so far: lists, tuples, dictionaries, and looping." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's start by defining a dictionary. In this case, we'll set up a dictionary that maps people to their favorite programming language." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "favorite_languages = {\n", "    'jen': 'python',\n", "    'sarah': 'c',\n", "    'edward': 'ruby',\n", "    'shannon': 'python'\n", "}\n", "# Notice the indentation, if you decide to define a dictionary this way!" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Remember the super-useful methods for iterating through dictionaries? `keys` gives you all the keys, `values` all the values, and `items` all the key-value pairs as *tuples*. Here's the loop:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sarah prefers c.\n", "shannon prefers python.\n", "jen prefers python.\n", "edward prefers ruby.\n" ] } ], "source": [ "for key, value in favorite_languages.items(): # 1\n", "    print(\"{} prefers {}.\".format(key, value))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "1: Notice how `key, value` are just out there floating! This is called *unpacking* and is a very useful technique in Python. If I have a list of a few items, and (critically) *I know how many items there are*, I can do this:" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "some_list = ['a', 'b']\n", "a, b = some_list" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "instead of this:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "some_list = ['a', 'b']\n", "a = some_list[0]\n", "b = some_list[1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "In the same vein, I could have just as easily written the loop like this:" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sarah 
prefers c.\n", "shannon prefers python.\n", "jen prefers python.\n", "edward prefers ruby.\n" ] } ], "source": [ "for keyvalue in favorite_languages.items(): # 1\n", "    key = keyvalue[0]\n", "    value = keyvalue[1]\n", "    print(\"{} prefers {}.\".format(key, value)) # 2" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "and indeed, if that is easier for you to understand, by all means do it! This is to illustrate all the concepts at play at once:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - the loop header iterates through the key-value pairs provided by `favorite_languages.items()`\n", " - on each iteration, `items()` provides a tuple: a key-value pair from the dictionary\n", " - we can \"unpack\" these variables using shorthand, but it's also perfectly valid to do it the \"regular\" way" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "That's pretty much `for` loops!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "What about the case where you don't know ahead of time how many iterations your loop will take?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Part 2: `while` Loops" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "\"While\" loops go back yet again to the concept of boolean logic we introduced in an earlier lecture: loop until some condition is reached." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The structure here is a little different from `for` loops. Instead of explicitly looping over a collection, you'll set some condition that evaluates to either `True` or `False`; as long as the condition is `True`, Python executes another iteration of the loop." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10 11 12 13 14 " ] } ], "source": [ "x = 10\n", "\n", "while x < 15:\n", " print(x, end = \" \")\n", " x += 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`x < 15` is a boolean statement: it is either `True` or `False`, depending on the value of `x`. Initially, this number is 10, which is certainly `< 15`, so the loop executes. 10 is printed, `x` is incremented, and the condition is checked again." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "A potential downside of `while` loops: **forgetting to update the condition inside the loop.**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "It's easy to take for granted; `for` loops implicitly handle this for us!" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10 11 12 13 14 " ] } ], "source": [ "for i in range(10, 15):\n", " print(i, end = \" \")\n", " # No update needed!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Use `for` loops frequently enough, and when you occasionally use a `while` loop, you'll forget you need to update the loop condition." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Review Questions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Some questions to discuss and consider:\n", "\n", "1: Using the awful matrix construct of a \"list of lists,\" show how you could write loops that double the value of each element of the matrix.\n", "\n", "2: `for` and `while` loops may have different syntax and different use cases, but you can often translate the same task between the two types of loops. Show how you could use a `while` loop to iterate through a list of numbers from `range()`.\n", "\n", "3: Let's say you have two lists, `K` and `V`, that are both the same length. Show how, using only 1 loop, you can loop through both of them simultaneously.\n", "\n", "4: Now let's say your lists `K` and `V` are *not* the same length. Using 1 loop, iterate through them, stopping when you reach the end of the shorter list." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Course Administrivia" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**A1 is out!**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Even though A0 isn't \"graded\", please submit by 11:59 tonight.**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Next week: generators and comprehensions and `enumerate`, oh my!**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Additional Resources\n", "\n", " 1. Matthes, Eric. *Python Crash Course*. 2016. ISBN-13: 978-1593276034\n", " 2. Grus, Joel. *Data Science from Scratch*. 2015. 
ISBN-13: 978-1491901427" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }