{ "cells": [ { "cell_type": "code", "execution_count": 27, "metadata": { "scrolled": false, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "from IPython.display import Image\n", "from IPython.display import clear_output\n", "from IPython.display import FileLink, FileLinks\n", "import matplotlib.pylab as plt\n", "import pandas as pd\n", "import os\n", "import time" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Introduction to\n", "\n", "![title](img/python-logo-master-flat.png)\n", "\n", "### with Application to Bioinformatics\n", "\n", "#### - Day 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Practical issues\n", "\n", "- Course website:\n", "https://uppsala.instructure.com/courses/99844 \n", "- Course lectures streamed from Uppsala and Umeå\n", "- TAs on each site\n", "- Short lectures with many breaks\n", "- Schedule times are approximate" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Schedule\n", "\n", "\"Drawing\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### To start with\n", "\n", "- Has everyone managed to log in to Canvas?\n", "- Has everyone managed to install Python?\n", "- Have you managed to run the test script?\n", "- Have you installed notebooks? (optional)\n", "- Canvas tour\n", "- PyQuizzes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### What is programming?\n", "\n", "Wikipedia: \n", "\n", "\"Computer programming is the process of building and designing an executable computer program for accomplishing a specific computing task\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "### What can we use it for?\n", "\n", "Endless possibilities! 
\n", "- reverse complement DNA\n", "- custom filtering of VCF files\n", "- plotting of results\n", "- all Excel stuff!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Why Python?\n", "\n", "### Typical workflow\n", "\n", "1. Get data\n", "2. Clean, transform data in spreadsheet\n", "3. Copy-paste, copy-paste, copy-paste\n", "4. Run analysis & export results\n", "5. Realise the columns were not sorted correctly\n", "6. Go back to step 2 and repeat\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "\"Drawing\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Python versions\n", "\n", "|Old versions|Python 3|\n", "|--- |--- |\n", "|Python 1.0 - January 1994|Python 3.0 - December 3, 2008|\n", "|Python 1.1 - October 11, 1994|Python 3.1 - June 27, 2009|\n", "|Python 1.2 - April 10, 1995|Python 3.2 - February 20, 2011|\n", "|Python 1.3 - October 12, 1995|Python 3.3 - September 29, 2012|\n", "|Python 1.4 - October 25, 1996|Python 3.4 - March 16, 2014|\n", "|Python 1.5 - December 31, 1997|Python 3.5 - September 13, 2015|\n", "|Python 1.6 - September 5, 2000|Python 3.6 - December 23, 2016|\n", "|Python 2.0 - October 16, 2000|Python 3.7 - June 27, 2018|\n", "|Python 2.1 - April 17, 2001|Python 3.8 - October 14, 2019|\n", "|Python 2.2 - December 21, 2001|Python 3.9 - October 5, 2020|\n", "|Python 2.3 - July 29, 2003|Python 3.10 - October 4, 2021|\n", "|Python 2.4 - November 30, 2004|Python 3.11 - October 24, 2022|\n", "|Python 2.5 - September 19, 2006|Python 3.12 - October 2, 2023|\n", "|Python 2.6 - October 1, 2008|Python 3.13 - October 7, 2024|\n", "|Python 2.7 - July 3, 2010||" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "

Course content

\n", "\n", "- Core concepts about Python syntax: Data types, blocks and indentation, variable scoping, iteration, functions, methods and arguments \n", "- Different ways to control program flow using loops and conditional tests \n", "- Regular expressions and pattern matching \n", "- Writing functions and best-practice ways of making them usable \n", "- Reading from and writing to files \n", "- Code packaging and Python libraries \n", "- How to work with biological data using external libraries. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "

Learning outcomes

\n", "\n", "At the end of the course, you should be able to:\n", "\n", "- Use variables and explain how operators work\n", "- Process data using loops\n", "- Separate data using if/else statements\n", "- Use functions to read and write to files\n", "- Describe your own approach to a coding task\n", "- Understand the difference between functions and methods\n", "- Read the documentation for built-in functions/methods\n", "- Give examples of use cases for dictionaries\n", "- Write data to a simple dictionary\n", "- Understand the concept and syntax of a function" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "

Learning outcomes, cont.

\n", "\n", "At the end of the course, you should be able to:\n", "\n", "- Write basic functions for processing data\n", "- Describe pandas dataframes\n", "- Give examples of how to use pandas for processing data\n", "- Explain how regex can be used\n", "- Define the python syntax for regex\n", "- Combine basic concepts to create functional stand-alone programs to process data\n", "- Write file processing Python programs that produce output to the terminal and/or external files\n", "- Explain how to debug and further develop your skills in Python after the course" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Some good advice\n", "\n", "- 5 days to learn Python is not much\n", "- Amount of information will decrease over days\n", "- Complexity of tasks will increase over days\n", "- Read the error messages!\n", "- Save all your code\n", "\n", "How to seek help: \n", "- Google\n", "- Ask your neighbour\n", "- Ask an assistant" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## You will look like this:\n", "\n", "
\n", "\"Drawing\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Day 1\n", "\n", "- Types and variables\n", "- Operations\n", "- Loops\n", "- if/else statements" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Example of a simple Python script" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "u is 2\n", "u is 3\n", "u is 4\n", "u is 5\n", "u is 6\n", "u is 7\n", "u is 8\n", "u is 9\n", "u is 10\n", "u is 11\n" ] } ], "source": [ "# A simple loop that adds 2 to a number\n", "i = 0\n", "while i < 10:\n", "    u = i + 2\n", "    print('u is ' + str(u))\n", "    i += 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Example of a simple Python script\n", "\n", "\"Drawing\"\n", "\n", "### Comment\n", "\n", "All lines starting with # are interpreted by Python as comments and are not executed. 
Comments are important for documenting code and are considered good practice in all types of programming" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Example of a simple Python script\n", "\n", "\"Drawing\"\n", "\n", "\n", "### Literals\n", "\n", "All literals have a type:\n", "\n", "- Strings (str): `'Hello'` `\"Hi\"`\n", "- Integers (int): `5`\n", "- Floats (float): `3.14`\n", "- Boolean (bool): `True` or `False`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Literals define values" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "bool" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'this is a string'\n", "\"this is also a string\"\n", "3 # here we can put a comment so we know that this is an integer\n", "3.14 # this is a float\n", "True # this is a boolean\n", "\n", "type(True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "### Collections" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[3, 5, 7, 4, 99] # this is a list of integers\n", "\n", "('a', 'b', 'c', 'd') # this is a tuple of strings\n", "{'a', 'b', 'c'} # this is a set of strings\n", "{'a':3, 'b':5, 'c':7} # this is a dictionary with strings as keys and integers as values\n", "\n", "type([3, 5, 7, 4, 99])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### What operations can we do with different values?\n", "\n", "That depends on their type:" ] }, { "cell_type": "code", "execution_count": 32, 
"metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "ename": "TypeError", "evalue": "can't multiply sequence by non-int of type 'float'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Input \u001b[0;32mIn [32]\u001b[0m, in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[38;5;241m2\u001b[39m \u001b[38;5;241m+\u001b[39m \u001b[38;5;241m3.4\u001b[39m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124ma string \u001b[39m\u001b[38;5;124m'\u001b[39m \u001b[38;5;241m*\u001b[39m \u001b[38;5;241m3\u001b[39m\n\u001b[0;32m----> 4\u001b[0m \u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43ma string \u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m3.4\u001b[39;49m\n", "\u001b[0;31mTypeError\u001b[0m: can't multiply sequence by non-int of type 'float'" ] } ], "source": [ "'a string'+' another string'\n", "2 + 3.4\n", "'a string ' * 3\n", "'a string ' * 3.4" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Type         Operations \n", "\n", "int           + - * / ** % // ... \n", "float           + - * / ** % // ... \n", "string           + *" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Example of a simple Python script\n", "\n", "\n", "\"Drawing\" \n", "\n", "### Identifiers\n", "\n", "Identifiers are used to identify a program element in the code. 
\n", "\n", "For example: \n", "- Variables\n", "- Functions\n", "- Modules\n", "- Classes " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Variables\n", "\n", "Used to store values and to assign them a name.\n", "\n", "Examples: \n", "- `i = 0`\n", "- `counter = 5`\n", "- `snpname = 'rs2315487'`\n", "- `snplist = ['rs21354', 'rs214569']` " ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "840" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "width = 42\n", "height = 20\n", "\n", "snpname = 'rs56483 '\n", "snplist = ['rs12345','rs458782']\n", "\n", "snpname * 3\n", "width * height" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### How to correctly name a variable\n", "\n", "\n", "\n", "\"Drawing\" \n", "\n", "__Allowed:                       Not allowed:__ \n", "Var\\_name                       2save \n", "\\_total                           \\*important \n", "aReallyLongName                 Special% \n", "with\\_digit\\_2                       With   spaces \n", "dkfsjdsklut   _(well, allowed, but NOT recommended)_\n", "\n", "__NO special characters:__ \n", "\\+ - * $ % ; : , ? ! 
{ } ( ) < > “ ‘ | \\ / @" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Reserved keywords\n", "\n", "\"Drawing\" \n", "\n", "\n", "These words can not be used as variable names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Summary\n", "\n", "- Comment your code!\n", "- Literals define values and can have different types (strings, integers, floats, boolean)\n", "- Values can be collected in lists, tuples, sets, and dictionaries\n", "- The operation that can be performed on a certain value depends on the type\n", "- Variables are identified by a name and are used to store a value or collections of values\n", "- Name your variables using descriptive words without special characters and reserved keywords\n", "\n", "__→ Notebook Day_1_Exercise_1 (~30 minutes)__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## NOTE!\n", "\n", "### How to get help?\n", "\n", "- [Google](https://www.google.com/) and [Stack overflow](https://stackoverflow.com/) are your best friends!\n", "- Official [python documentation](https://docs.python.org/3/)\n", "- Ask your neighbour\n", "- Ask us" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Python standard library\n", "\n", "\"Drawing\" \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Example `print()` and `str()`\n", "\n", "\"Drawing\" \n", "\n", "__Note!__ \n", "Here we format everything to a string before printing it" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Python standard library\n", "\n", "\"Drawing\" \n", "\n" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "3" ] }, 
"execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "width = 5\n", "height = 3.6\n", "snps = ['rs123', 'rs5487']\n", "snp = 'rs2546'\n", "active = True\n", "nums = [2,4,6,8,4,5,2]\n", "\n", "int(height)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## More on operations\n", "\n", "\"Drawing\" \n" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "64" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 4\n", "y = 3\n", "z = [2, 3, 6, 3, 9, 23]\n", "pow(x, y)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Comparison operators\n", "\n", "\"Drawing\" \n", "\n", "Can be used on int, float, str, and bool. Outputs a boolean." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 5\n", "y = 3\n", "\n", "y != x" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Logical operators\n", "\n", "\"Drawing\" \n", "\n", "\n", "\n", "## Membership operators\n", "\n", "\"Drawing\" \n" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 2\n", "y = 3\n", "x == 2 and y == 5\n", "x = [2,4,7,3,5,9]\n", "y = ['a','b','c']\n", "\n", "2 in x\n", "4 in x and 'd' in y" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "scrolled": false, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": 
"stdout", "output_type": "stream", "text": [ "num is 2. Is this number even? True\n", "num is 3. Is this number even? False\n", "num is 4. Is this number even? True\n", "num is 5. Is this number even? False\n", "num is 6. Is this number even? True\n", "num is 7. Is this number even? False\n", "num is 8. Is this number even? True\n", "num is 9. Is this number even? False\n", "num is 10. Is this number even? True\n", "num is 11. Is this number even? False\n" ] } ], "source": [ "# A simple loop that adds 2 to a number and checks if the number is even\n", "i = 0\n", "even = [2,4,6,8,10]\n", "while i < 10:\n", " num = i + 2\n", " print('num is '+str(num)+'. Is this number even? '+str(num in even))\n", " i += 1" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "scrolled": false, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "num is 2. Is this number even and below 5? True\n", "num is 3. Is this number even and below 5? False\n", "num is 4. Is this number even and below 5? True\n", "num is 5. Is this number even and below 5? False\n", "num is 6. Is this number even and below 5? False\n", "num is 7. Is this number even and below 5? False\n", "num is 8. Is this number even and below 5? False\n", "num is 9. Is this number even and below 5? False\n", "num is 10. Is this number even and below 5? False\n", "num is 11. Is this number even and below 5? False\n" ] } ], "source": [ "# A simple loop that adds 2 to a number, check if number is even and below 5\n", "i = 0\n", "even = [2,4,6,8,10]\n", "while i < 10:\n", " num = i + 2\n", " print('num is '+str(num)+'. Is this number even and below 5? 
'+\\\n", " str(num in even and num < 5))\n", " i += 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Order of precedence\n", "\n", "There is an order of precedence for all operators:\n", "\n", "\"Drawing\" \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Word of caution when using operators" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 5\n", "y = 7\n", "z = 2\n", "x == 5 and y < 7 or z > 1\n", "\n", "# and binds stronger than or\n", "x > 4 or y == 6 and z > 3\n", "x > 4 or (y == 6 and z > 3)\n", "(x > 4 or y == 6) and z > 3" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# BEWARE!\n", "x = 5\n", "y = 8\n", "\n", "#xx == 6 or xxx == 6 or x > 2\n", "x > 42 and (xx > 1000 or y < 7)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "__Python does short-circuit evaluation of operators__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## More on sequences (For example strings and lists)\n", "\n", "Lists (and strings) are an ORDERED collection of elements where every element can be accessed through an index.\n", "\n", "\"Drawing\" \n" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "ename": "TypeError", "evalue": "'str' object does not support item assignment", "output_type": "error", "traceback": [ 
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Input \u001b[0;32mIn [42]\u001b[0m, in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 8\u001b[0m s[\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1\u001b[39m]\n\u001b[1;32m 9\u001b[0m l[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m42\u001b[39m\n\u001b[0;32m---> 10\u001b[0m s[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mS\u001b[39m\u001b[38;5;124m'\u001b[39m\n", "\u001b[0;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } ], "source": [ "l = [2,3,4,5,3,7,5,9]\n", "s = 'some longrandomstring'\n", "\n", "#'o' in s\n", "l[0]\n", "s[4:7]\n", "s[0:8:2]\n", "s[-1]\n", "l[0] = 42\n", "s[0] = 'S'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Mutable vs Immutable objects\n", "\n", "

\n", "\n", "Mutable objects can be altered after creation, while immutable objects can't.\n", "\n", "\n", "|Immutable objects|Mutable objects|\n", "|--- |--- |\n", "|`int`|`list`|\n", "|`float`|`set`|\n", "|`bool`|`dict`|\n", "|`str`||\n", "|`tuple`||\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Operations on mutable sequences\n", "\n", "\n", "\n", "\"Drawing\" " ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "[9, 8, 7, 6, 5, 10, 4, 3, 2, 1, 0, 10]" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = [0,1,2,3,4,5,6,7,8,9]\n", "s.insert(5,10)\n", "s.reverse()\n", "s.append(10)\n", "s" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Summary\n", "\n", "- The Python standard library has many regularly used built-in functions\n", "- Operators are used to carry out computations on different values\n", "- Three types of operators: comparison, logical, and membership\n", "- Order of precedence is crucial!\n", "- Mutable objects can be changed after creation while immutable objects cannot\n", "\n", "
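The points above about precedence and mutability can be checked with a minimal sketch. The variable names are made up for illustration and are not part of the course files, and the try/except construct is only used here to keep the sketch running after the expected error:

```python
# 'and' binds more tightly than 'or', so parentheses can change the result
x = 5
y = 7
z = 2
no_parens = x > 4 or y == 6 and z > 3      # read as: x > 4 or (y == 6 and z > 3)
with_parens = (x > 4 or y == 6) and z > 3
print(no_parens, with_parens)

# Lists are mutable: an element can be replaced in place
numbers = [2, 3, 4]
numbers[0] = 42
print(numbers)

# Strings are immutable: item assignment raises a TypeError
name = 'some string'
try:
    name[0] = 'S'
except TypeError as error:
    print('TypeError:', error)

# Build a new string instead of changing the old one
new_name = 'S' + name[1:]
print(new_name)
```

The original string is left untouched; only the new variable holds the changed text.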

\n", "

\n", "\n", "__→ Notebook Day_1_Exercise_2 (~30 minutes)__ \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Loops in Python" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "pear\n", "banana\n", "orange\n", "grapes\n", "pears\n" ] } ], "source": [ "fruits = ['apple','pear','banana','orange', 'grapes', 'pears']\n", "\n", "print(fruits[0])\n", "print(fruits[1])\n", "print(fruits[2])\n", "print(fruits[3])\n", "print(fruits[4])\n", "print(fruits[5])" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "pear\n", "banana\n", "orange\n", "grapes\n", "DONE!\n" ] } ], "source": [ "fruits = ['apple','pear','banana','orange', 'grapes']\n", "\n", "for fruit in fruits:\n", " print(fruit)\n", "print('DONE!')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "__Always remember to INDENT your loops!__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Different types of loops" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "### `For` loop" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "pear\n", "banana\n", "orange\n" ] } ], "source": [ "fruits = ['apple','pear','banana','orange']\n", "mystring = 'mylongstring'\n", "\n", "for fruit in fruits:\n", " print(fruit)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "### `While` loop" ] }, { "cell_type": "code", 
"execution_count": 47, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "pear\n", "banana\n", "orange\n", "4\n" ] } ], "source": [ "fruits = ['apple','pear','banana','orange']\n", "\n", "i = 0\n", "while i < len(fruits):\n", " print(fruits[i])\n", " i = i + 1\n", "\n", "print(i)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Different types of loops\n", "\n", "__`For` loop__\n", "\n", "Is a control flow statement that performs a fixed operation over a known amount of steps.\n", "\n", "__`While` loop__\n", "\n", "Is a control flow statement that allows code to be executed repeatedly based on a given Boolean condition.\n", "\n", "

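The two definitions above can be sketched side by side. The list of bases and the halving example are made up for illustration:

```python
# for loop: the number of steps is known in advance (one per element)
bases = ['A', 'C', 'G', 'T']
for base in bases:
    print(base)

# while loop: repeat for as long as a condition is True;
# here a number is halved until it drops below 1
value = 20
steps = 0
while value >= 1:
    value = value / 2
    steps = steps + 1
print('Halved', steps, 'times')
```

The for loop runs exactly once per element, while the while loop stops only when its condition becomes False.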
\n", "\n", "__Which one to use?__\n", "\n", "`For` loops are better for simple iterations over lists and other iterable objects\n", "\n", "`While` loops are more flexible and can iterate an unspecified number of times\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Example of a simple Python script\n", "\n", "

\n", "\n", "\"Drawing\" " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "__→ Notebook Day_1_Exercise_3 (~20 minutes)__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Conditional `if/else` statements\n", "\n", "\n", "\"Drawing\" " ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "scrolled": false, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Go shopping!\n" ] } ], "source": [ "shopping_list = ['bread', 'egg', 'butter', 'milk']\n", "\n", "if len(shopping_list) > 3:\n", "    print('Go shopping!')" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Better get it over with today anyway\n" ] } ], "source": [ "shopping_list = ['bread', 'egg', 'butter', 'milk']\n", "tired = False\n", "\n", "if len(shopping_list) > 5:\n", "    if not tired:\n", "        print('Go shopping!')\n", "    else:\n", "        print('Too tired, I\\'ll do it later')\n", "else:\n", "    if not tired:\n", "        print('Better get it over with today anyway')\n", "    else:\n", "        print('Nah! 
I\\'ll do it tomorrow!')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "### This is an example of a nested conditional" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Putting everything into a Python script\n", "\n", "Any longer pieces of code that have been used and will be re-used SHOULD be saved\n", "\n", "Two options:\n", "- Save it as a text file and make it executable\n", "- Save it as a notebook file" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Things to remember when working with scripts\n", "\n", "- Put _#!/usr/bin/env python_ on the first line of the file\n", "- Make the file executable to run it with `./script.py`\n", "- Otherwise run the script with `python script.py`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Working on files" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "pear\n", "banana\n", "orange\n" ] } ], "source": [ "fruits = ['apple','pear','banana','orange']\n", "\n", "for fruit in fruits:\n", "    print(fruit)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\"Drawing\" " ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "\n", "pear\n", "\n", "banana\n", "\n", "orange\n", "\n" ] } ], "source": [ "fh = open('../files/fruits.txt', 'r', encoding = 'utf-8')\n", "\n", "for line in fh:\n", "    print(line)\n", "\n", "fh.close()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Additional useful methods:\n", "

\n", "\n", "`'string'.strip()`       Removes whitespace \n", "`'string'.split()`       Splits on whitespace into list " ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "['an', 'example', 'string', 'to', 'split', 'with', 'whitespace', 'in', 'end']" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = ' an example string to split with whitespace in end '\n", "sw = s.strip()\n", "sw\n", "swl = sw.split()\n", "swl = s.strip().split()\n", "swl" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\"Drawing\" " ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "scrolled": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apple\n", "pear\n", "banana\n", "orange\n" ] } ], "source": [ "fh = open('../files/fruits.txt', 'r', encoding = 'utf-8')\n", "\n", "for line in fh:\n", " print(line.strip())\n", "\n", "fh.close()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Another example" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "\"Drawing\" \n", "How much money is spent on ICA?" 
] }, { "cell_type": "code", "execution_count": 54, "metadata": { "scrolled": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total amount spent on ICA is: 1186.71\n" ] } ], "source": [ "fh = open(\"../files/bank_statement.txt\", \"r\", encoding = \"utf-8\")\n", "\n", "total = 0\n", "\n", "for line in fh:\n", "    expenses = line.strip().split() # split line into list\n", "    store = expenses[0] # save what store\n", "    price = float(expenses[1]) # save the price\n", "    if store == 'ICA': # only count the price if store is ICA\n", "        total = total + price\n", "fh.close()\n", "\n", "print('Total amount spent on ICA is: '+str(total))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Slightly more complex..." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "\"Drawing\" \n", "\n", "How much money is spent on ICA in September?" 
] }, { "cell_type": "code", "execution_count": 55, "metadata": { "scrolled": false, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "fh = open(\"../files/bank_statement_extended.txt\", \"r\", encoding = \"utf-8\")\n", "\n", "total = 0\n", "\n", "for line in fh:\n", "    if not line.startswith('store'):\n", "        expenses = line.strip().split()\n", "        store = expenses[0]\n", "        year = expenses[1]\n", "        month = expenses[2]\n", "        day = expenses[3]\n", "        price = float(expenses[4])\n", "        if store == 'ICA' and month == '09': # store has to be ICA and month September\n", "            total = total + price\n", "fh.close()\n", "\n", "out = open(\"../files/bank_statement_results.txt\", \"w\", encoding = \"utf-8\") # open a file for writing the results to\n", "out.write('Total amount spent on ICA in September is: '+str(total))\n", "out.close()" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tue Oct 11 18:39:02 2022 \t 250.imdb\n", "Thu May 20 17:46:00 2021 \t bank_statement.txt\n", "Thu May 20 17:46:00 2021 \t bank_statement_extended.txt\n", "Fri Oct 6 12:35:06 2023 \t bank_statement_results.txt\n", "Thu May 20 17:46:00 2021 \t blocket_listings_selected.txt\n", "Thu May 20 17:46:01 2021 \t cheat_sheet.pdf\n", "Thu May 20 17:46:01 2021 \t fruits.txt\n", "Thu May 20 17:46:01 2021 \t fruits_extended.txt\n", "Wed Oct 12 08:43:09 2022 \t imdb_reformatted.txt\n", "Fri Sep 30 15:40:44 2022 \t schedule.csv\n", "Thu May 20 17:46:01 2021 \t somerandomfile.txt\n" ] } ], "source": [ "for file in os.scandir(\"../files/\"):\n", "    print(time.ctime(os.stat(file).st_mtime), '\\t', file.name)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "\"Drawing\" " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Summary\n", "\n", "- Python has two types of loops, 
`For` loops and `While` loops\n", "- Loops can iterate over any iterable object\n", "- `If/Else` statements are used to decide between actions depending on a condition that evaluates to a boolean\n", "- Several `If/Else` statements can be nested\n", "- Save code as a notebook or a text file to be run using Python\n", "- The function `open()` can be used to read in text files\n", "- A text file is iterable, meaning it is possible to loop over the lines\n", "\n", "__→ Notebook Day_1_Exercise_4__" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.4" }, "rise": { "height": 1024, "width": 1024 } }, "nbformat": 4, "nbformat_minor": 2 }