{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Python Programming" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Beginning Programming" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's have a look at the following code snippet, which introduces a `for` loop:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shopping = ['bread', 'potatoes', 'eggs', 'flour', 'rubber duck', 'pizza', 'milk']\n", "for item in shopping:\n", "    print(item)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "This is a very short program, which creates a variable (`shopping`) that refers to a list and then prints out each of the items in turn. There are a couple of things to comment on here. Firstly, the `for` statement creates the variable `item` (the variable name can be anything that you want), then sets its value to each of the elements in the list in turn. The indented line is then executed once for each value assigned to the `item` variable, printing out this value. If you have not done so yet, run the code cell containing the `for` loop in your Jupyter Notebook." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Whenever we want to execute a bit of Python code several times, a **for loop** is one of the ways that we can do it. Python recognises the lines we want to form part of the loop by their level of indentation, and it is vital that you maintain consistent indentation throughout your programs. For example, you can choose to indent lines of code with spaces or with tabs but, whichever one you choose, you should only use one or the other for your whole program. Also, make sure that you keep the amount of indentation consistent across all the levels in your code. You will find that this approach makes your programs easier to read and understand, because you can see the structure of the program at a glance by the indentation." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### _Exercise 2.1_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Change the program above by adding a second list (with a different variable name) to the program, which contains cheese, spaghetti and sausages. Change the loop so that it steps through this second list and, instead of printing each element, appends it to the original `shopping` list. Then, at the end, print out the extended list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# type your command(s) here..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Counting Loops\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looping over elements of a list is great, but there are other circumstances where you just want to do something a set number of times. Fortunately, Python has a function which generates a sequence of numbers for us to use in a `for` loop. Run the code cell below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(10):\n", "    print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This should have printed out the numbers 0 to 9. The `range()` function gives a `range` object, which can be used to generate a sequence of integers. When given a single argument, `range(N)` creates an object that produces the first `N` integers (0 up to `N-1`) one at a time, as they are needed. Run the following code cell to verify that the type of object returned is a `range` object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_range = range(50)\n", "print(my_range)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Note__ At first glance, it might seem inconvenient that you don't get a list as output from the `range()` function, and instead get a `range` object. 
The reason behind this is that the `range` object is a much faster and more memory-efficient way of generating values that will be looped through one at a time, and this is the aim of the vast majority of calls to the `range()` function. If you do actually want to create a full list of the integer values in a range, you can pass the `range` object to `list()`, or unpack it inside square brackets, as below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "range_list = list(range(10))  # or\n", "range_list = [*range(10)]  # note: [range(10)] would be a list holding one range object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### _Exercise 2.2_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Explore what you can do with the `range` function. It can take just one number as we did above, or two as starting and ending values, or even three: the start, the end and a step value. Try all three versions of the `range` command, and then work out how to produce the list: `[4, 11, 18, 25]`. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# type your command(s) here...\n", "help(range)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Direct and Indirect Loops" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, `range` can get us a sequence that we can use to count to any number that we want, but why does it stop short of the upper limit we give it? Why does `range(N)` mean 0..N-1 instead of 0..N or 1..N? 
Well, try out the following two pieces of code:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for item in shopping:\n", "    print(item)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(len(shopping)):\n", "    print(shopping[i])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "They should produce exactly the same output: `range` behaves as it does so that you can use it to generate the indexes of sequence data types (like lists), which run from 0 to one less than the length of the sequence. In the blocks of code above, the first is an example of a **direct loop**, where you pull out the items one by one directly from the list. The second is an **indirect loop**, where you step through the indices and use them to access the required elements from the list. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which one is better? Generally, the direct method is slightly clearer and a bit more _Pythonesque_. However, there are circumstances where an indirect loop is the simplest option. If you have two lists of the same size, you might need to print out the corresponding elements of the two lists (although there might be better ways to do this, as well). In this case, you can use `range` with the size of one of the lists, and then use the index to get the corresponding elements from both." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### _Exercise 2.3_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start with your shopping list (or a new, shorter one to save some typing) and create a new list with the amounts you need to buy of each item. 
So for example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shopping = ['bicycle pump', 'sofa', 'yellow paint']\n", "amounts = ['1', '7', '9']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then write a loop to step through and print the item and the amount on the same line. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# type your command(s) here..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Looking Up Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Keeping data in parallel lists like this is fine if you are very careful and you don’t need to change the lists that much. Otherwise, it is prone to errors, because any change to one list has to be matched exactly in the other. One way of getting around this (and our last new data type) is to use a _dictionary_. Dictionaries are sort of like lists, but instead of each entry holding just a single value, each entry holds a key-value pair. So, when you want to look up a value in the dictionary, you specify the key and the dictionary returns the value, rather than just using an index. An example might help: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studentNumbers = {'Bioscience Technology': 16,\n", "                  'Computational Biology': 12,\n", "                  'Post-Genomic Biology': 20,\n", "                  'Ecology and Environmental Management': 3,\n", "                  'Maths in the Living Environment': 0\n", "                  }\n", "studentNumbers['Bioscience Technology']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data is enclosed in curly brackets and is a comma-separated list of key-value pairs. The key and value are separated by `:`. The key can be any immutable type (so, mainly strings, numbers or tuples). Notice I have split the assignment statement to create the dictionary over several lines, to make it easier to read. 
Normally, Python expects a command to be on a single line, but sometimes it recognises that a command isn’t finished and lets you continue on the next line. This mainly happens when you haven’t closed a set of brackets, which in the above example was deliberate, but in my case is usually because I have forgotten. Python will continue to prompt for input until you close the bracket properly before trying to execute the command. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dictionaries themselves are a mutable datatype, so the values associated with a key can be changed:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studentNumbers['Bioscience Technology'] += 1 # x += 1 does the same as x = x + 1\n", "studentNumbers['Bioscience Technology']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you try to assign a value to a key that doesn’t exist, Python creates the entry for you automatically:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studentNumbers['Gardening'] = 10\n", "studentNumbers['Gardening']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can get rid of entries in the dictionary as well, using the `del` statement: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "del studentNumbers['Maths in the Living Environment']\n", "studentNumbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we know the keys in the dictionary we can look up the values. If we want to loop over the values in the dictionary, we could create a list of the keys and loop over that, but that’s no better than keeping the keys and values in separate lists. 
Instead, Python can create a collection of the keys for you when you need it: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studentNumbers.keys()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now put this into a `for` loop, with or without sorting it first. If we are not bothered about the order, then we can use `for` and `in` to loop directly over the keys in the dictionary: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for key in studentNumbers:\n", "    print(key, studentNumbers[key])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That should work as expected. Older versions of Python (before 3.7) don’t make any promises about the order the keys will be supplied in; see the note below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Note__ The way that dictionaries are implemented in Python fundamentally changed in v3.6, making them more compact in memory than they used to be. A side effect of this is that dictionary objects in Python 3.6 remember the order that entries were created in, and you can access their entries in this order. In Python 3.6 this was only an implementation detail, but from Python 3.7 onwards insertion order is a guaranteed feature of the language. Regardless, in the examples and exercises in this course, we assume that this order cannot be relied upon, so that the code behaves the same way in older versions of Python. When writing your own code, if you want to access dictionary entries in a particular order, you should make sure to do so by providing keys in a specific order, for example by looping over `sorted(studentNumbers)`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As well as getting the keys, you could also get the values using `.values()`. 
Slightly more efficient is to get the key-value pairs in one step using `.items()`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studentNumbers.values()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studentNumbers.items()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Have a careful look at this output. The square brackets show that this is a list-like collection (technically a `dict_items` object) of things. But each item in that collection is in fact two pieces of data in parentheses (round brackets). Parentheses denote a type of sequence called a **tuple**, which is very similar to a list, with one major distinction: tuples are *immutable* and, once defined, cannot be changed. There are two ways we can use the output of `.items()` in a `for` loop. Firstly, we can use a variable which will contain the tuple and unpack it in the body of the loop: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for data in studentNumbers.items():\n", "    print(data[0], data[1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or (this is usually my preference) you can unpack the data directly and more explicitly in the `for` statement:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for course, students in studentNumbers.items():\n", "    print(course, students)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output of `.items()` is our first example of a compound data structure (in this case a collection of tuples). The ability to easily construct arbitrarily complex data structures like this is one of the most powerful features of Python."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### _Exercise 2.4_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Go back to your shopping list code from exercise 2.3 and change the program so that the amounts and shopping items are stored in a dictionary, then print out the items and their respective amounts by looping over the dictionary. Do it twice: once looping over the dictionary to get the keys (and using the keys to look up the values), and once by getting the key-value pairs directly from the dictionary." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* `for` loops can be used to repeat a block of code for each item in a list.\n", "* `range()` can be used to create a sequence of numbers, and to repeat the loop for each of those numbers, to execute the loop a given number of times.\n", "* Dictionaries are another data type, which store key-value pairs.\n", "* The `.keys()`, `.values()` and `.items()` methods are used to get collections of the contents of a dictionary." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 1 }