{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Python programming\n", "\n", "This tutorial primarily consists of a modified version of the lecture by J.R. Johansson (robert@riken.jp) http://dml.riken.jp/~rob/.\n", "\n", "The latest version of the original [IPython notebook](http://ipython.org/notebook.html) lecture is available at [http://github.com/jrjohansson/scientific-python-lectures](http://github.com/jrjohansson/scientific-python-lectures).\n", "\n", "**Prerequisites:** *No prerequisites.*\n", "\n", "**NOTE:** This and the original material are copyrighted. For any additional details, please refer to the [LICENSE](https://github.com/mar-one/ACM-Python-Tutorials-KAUST-2014/blob/master/LICENSE.md) and to [the original repository](http://github.com/jrjohansson/scientific-python-lectures)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is Python?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Python](http://www.python.org/) is a modern, general-purpose, object-oriented, high-level programming language.\n", "\n", "General characteristics of Python:\n", "\n", "* **clean and simple language:** Easy-to-read and intuitive code, easy-to-learn minimalistic syntax, maintainability scales well with size of projects.\n", "* **expressive language:** Fewer lines of code, fewer bugs, easier to maintain.\n", "\n", "Technical details:\n", "\n", "* **dynamically typed:** No need to define the type of variables, function arguments or return types.\n", "* **automatic memory management:** No need to explicitly allocate and deallocate memory for variables and data arrays. No memory leak bugs. \n", "* **interpreted:** No need to compile the code. 
The Python interpreter reads and executes the Python code directly.\n", "\n", "Advantages:\n", "\n", "* The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.\n", "* Well-designed language that encourages many good programming practices:\n", " * Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.\n", " * Documentation tightly integrated with the code.\n", "* A large standard library, and a large collection of add-on packages.\n", "\n", "Disadvantages:\n", "\n", "* Since Python is an interpreted and dynamically typed programming language, the execution of Python code can be slow compared to compiled statically typed programming languages, such as C and Fortran. \n", "* Somewhat decentralized, with different environments, packages and documentation spread out at different places. This can make it harder to get started." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What makes Python suitable for scientific computing?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "* Python has a strong position in scientific computing: \n", " * Large community of users, easy to find help and documentation.\n", "\n", "* Extensive ecosystem of scientific libraries and environments\n", " * numpy: http://numpy.scipy.org - Numerical Python\n", " * scipy: http://www.scipy.org - Scientific Python\n", " * matplotlib: http://www.matplotlib.org - graphics library\n", "\n", "* Great performance due to close integration with time-tested and highly optimized codes written in C and Fortran:\n", " * BLAS, ATLAS, LAPACK, ARPACK, Intel MKL, ...\n", "\n", "* Good support for \n", " * Parallel processing with processes and threads\n", " * Interprocess communication (MPI)\n", " * GPU computing (OpenCL and CUDA)\n", "\n", "* Readily available and suitable for use on high-performance computing clusters. \n", "\n", "* No license costs, no unnecessary use of research budget.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Note for keyboard shortcut users\n", "The IPython Notebook has several useful shortcuts that can save you a lot of time. To see the list of available shortcuts press ctrl-m + h." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python program files\n", "\n", "* Python code is usually stored in text files with the file ending \"`.py`\":\n", "\n", " myprogram.py\n", "\n", "* Every line in a Python program file is assumed to be a Python statement, or part thereof. \n", "\n", " * The only exception is comment lines, which start with the character `#` (optionally preceded by an arbitrary number of white-space characters, i.e., tabs or spaces). 
Comment lines are usually ignored by the Python interpreter.\n", "\n", "\n", "* To run our Python program from the command line we use:\n", "\n", " $ python myprogram.py\n", "\n", "* On UNIX systems it is common to define the path to the interpreter on the first line of the program (note that this is a comment line as far as the Python interpreter is concerned):\n", "\n", " #!/usr/bin/env python\n", "\n", " If we do, and if we additionally set the script file to be executable, we can run the program like this:\n", "\n", " $ myprogram.py\n", "\n", "#### Example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ls scripts/hello-world*.py" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "scripts/hello-world-in-arabic.py scripts/hello-world.py\r\n" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "cat scripts/hello-world.py" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "#!/usr/bin/env python\r\n", "\r\n", "print(\"Hello world!\")\r\n" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "!python scripts/hello-world.py" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Hello world!\r\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Character encoding\n", "\n", "The standard character encoding is ASCII, but we can use any other encoding, for example UTF-8. To specify that UTF-8 is used we include the special line\n", "\n", " # -*- coding: UTF-8 -*-\n", "\n", "at the top of the file." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "cat scripts/hello-world-in-arabic.py" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "#!/usr/bin/env python\r\n", "#-*- coding: utf-8 -*-\r\n", "\r\n", "print(\"\u0645\u0631\u062d\u0628\u0627\")\r\n" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "!python scripts/hello-world-in-arabic.py" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\u0645\u0631\u062d\u0628\u0627\r\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other than these two *optional* lines in the beginning of a Python code file, no additional code is required for initializing a program. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## IPython notebooks\n", "\n", "This file - an IPython notebook - does not follow the standard pattern with Python code in a text file. Instead, an IPython notebook is stored as a file in the [JSON](http://en.wikipedia.org/wiki/JSON) format. The advantage is that we can mix formatted text, Python code and code output. It requires the IPython notebook server to run it though, and therefore isn't a stand-alone Python program as described above. Other than that, there is no difference between the Python code that goes into a program file or an IPython notebook." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modules\n", "\n", "Most of the functionality in Python is provided by *modules*. 
The Python Standard Library is a large collection of modules that provides *cross-platform* implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more.\n", "\n", "### References\n", " \n", " * The Python Language Reference: http://docs.python.org/2/reference/index.html\n", " * The Python Standard Library: http://docs.python.org/2/library/\n", "\n", "To use a module in a Python program it first has to be imported. A module can be imported using the `import` statement. For example, to import the module `math`, which contains many standard mathematical functions, we can do:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This imports the whole module and makes it available for use later in the program. For example, we can do:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "\n", "x = math.cos(2 * math.pi)\n", "\n", "print(x)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1.0\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can choose to import all symbols (functions and variables) in a module to the current namespace (so that we don't need to use the prefix \"`math.`\" every time we use something from the `math` module):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from math import *\n", "\n", "x = cos(2 * pi)\n", "\n", "print(x)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1.0\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This pattern can be very convenient, but in large programs that include many modules it is often a good idea to keep the symbols from 
each module in their own namespaces, by using the `import math` pattern. This eliminates potentially confusing problems with namespace collisions.\n", "\n", "As a third alternative, we can choose to import only a few selected symbols from a module by explicitly listing which ones we want to import instead of using the wildcard character `*`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from math import cos, pi\n", "\n", "x = cos(2 * pi)\n", "\n", "print(x)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1.0\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to rename the imported function or module:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from math import cos as cos1\n", "import math as ma\n", "\n", "x = cos1(2 * ma.pi)\n", "\n", "print(x)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1.0\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Looking at what a module contains, and its documentation\n", "\n", "Once a module is imported, we can list the symbols it provides using the `dir` function:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "\n", "print(dir(math))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'hypot', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']\n" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And 
using the function `help` we can get a description of each function (almost: not all functions have docstrings, as these descriptions are technically called, but the vast majority of functions are documented this way). " ] }, { "cell_type": "code", "collapsed": false, "input": [ "help(math.log)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Help on built-in function log in module math:\n", "\n", "log(...)\n", " log(x[, base])\n", " \n", " Return the logarithm of x to the given base.\n", " If the base not specified, returns the natural logarithm (base e) of x.\n", "\n" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "log(10)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "2.302585092994046" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "log(10, 2)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "3.3219280948873626" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use the `help` function directly on modules: Try\n", "\n", " help(math) \n", "\n", "Some very useful modules from the Python standard library are `os`, `sys`, `math`, `shutil`, `re`, `subprocess`, `multiprocessing`, `threading`. \n", "\n", "Complete lists of standard modules for Python 2 and Python 3 are available at http://docs.python.org/2/library/ and http://docs.python.org/3/library/, respectively." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "The `os` module contains a function named `getcwd`. 
Find out what this function does and print its output." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/getting_help.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 114 }, { "cell_type": "code", "collapsed": false, "input": [ "from os import getcwd\n", "\n", "help(getcwd)\n", "\n", "print(getcwd())\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variables and types\n", "\n", "### Symbol names \n", "\n", "Variable names in Python can contain alphanumerical characters `a-z`, `A-Z`, `0-9` and some special characters such as `_`. Normal variable names must start with a letter. \n", "\n", "By convention, variable names start with a lower-case letter, and class names start with a capital letter. \n", "\n", "In addition, there are a number of Python keywords that cannot be used as variable names. These keywords are:\n", "\n", " and, as, assert, break, class, continue, def, del, elif, else, except, \n", " exec, finally, for, from, global, if, import, in, is, lambda, not, or,\n", " pass, print, raise, return, try, while, with, yield\n", "\n", "Note: Be aware of the keyword `lambda`, which could easily be a natural variable name in a scientific program. But being a keyword, it cannot be used as a variable name.\n", "\n", "### Assignment\n", "\n", "The assignment operator in Python is `=`. 
Python is a dynamically typed language, so we do not need to specify the type of a variable when we create one.\n", "\n", "Assigning a value to a new variable creates the variable:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# variable assignments\n", "x = 1.0\n", "my_variable = 12.2" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although not explicitly specified, a variable does have a type associated with it. The type is derived from the value it was assigned." ] }, { "cell_type": "code", "collapsed": false, "input": [ "type(x)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "float" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we assign a new value to a variable, its type can change." ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = 1" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "type(x)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "int" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we try to use a variable that has not yet been defined we get a `NameError`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(y)" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "NameError", "evalue": "name 'y' is not defined", "output_type": "pyerr", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m 
\u001b[0;32mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mNameError\u001b[0m: name 'y' is not defined" ] } ], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fundamental types" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# integers\n", "x = 1\n", "type(x)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ "int" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "# float\n", "x = 1.0\n", "type(x)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "float" ] } ], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "# boolean\n", "b1 = True\n", "b2 = False\n", "\n", "type(b1)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "bool" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "# complex numbers: note the use of `j` to specify the imaginary part\n", "x = 1.0 - 1.0j\n", "type(x)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 24, "text": [ "complex" ] } ], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "print(x)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1-1j)\n" ] } ], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "print(x.real, x.imag)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1.0, -1.0)\n" ] } ], "prompt_number": 26 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Type utility functions\n", "\n", "The module `types` contains a 
number of type name definitions that can be used to test if variables are of certain types:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import types\n", "\n", "# print all types defined in the `types` module\n", "print(dir(types))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['BooleanType', 'BufferType', 'BuiltinFunctionType', 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', 'GetSetDescriptorType', 'InstanceType', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MemberDescriptorType', 'MethodType', 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRangeType', '__builtins__', '__doc__', '__file__', '__name__', '__package__']\n" ] } ], "prompt_number": 27 }, { "cell_type": "code", "collapsed": false, "input": [ "x = 1.0\n", "\n", "# check if the variable x is a float\n", "type(x) is float" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ "True" ] } ], "prompt_number": 28 }, { "cell_type": "code", "collapsed": false, "input": [ "# check if the variable x is an int\n", "type(x) is int" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "False" ] } ], "prompt_number": 29 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use the `isinstance` method for testing types of variables:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "isinstance(x, float)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 30, "text": [ "True" ] } ], "prompt_number": 30 }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### Type casting" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = 1.5\n", "\n", "print(x, type(x))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1.5, <type 'float'>)\n" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "x = int(x)\n", "\n", "print(x, type(x))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1, <type 'int'>)\n" ] } ], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "z = complex(x)\n", "\n", "print(z, type(z))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "((1+0j), <type 'complex'>)\n" ] } ], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "x = float(z)" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "can't convert complex to float", "output_type": "pyerr", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mz\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: can't convert complex to float" ] } ], "prompt_number": 34 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Complex variables cannot be cast to floats or integers. 
We need to use `z.real` or `z.imag` to extract the part of the complex number we want:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "y = bool(z.real)\n", "\n", "print(z.real, \" -> \", y, type(y))\n", "\n", "y = bool(z.imag)\n", "\n", "print(z.imag, \" -> \", y, type(y))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1.0, ' -> ', True, <type 'bool'>)\n", "(0.0, ' -> ', False, <type 'bool'>)\n" ] } ], "prompt_number": 35 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Operators and comparisons\n", "\n", "Most operators and comparisons in Python work as one would expect:\n", "\n", "* Arithmetic operators `+`, `-`, `*`, `/`, `//` (integer division), `**` (power)\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "1 + 2, 1 - 2, 1 * 2, 1 / 2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 36, "text": [ "(3, -1, 2, 0)" ] } ], "prompt_number": 36 }, { "cell_type": "code", "collapsed": false, "input": [ "1.0 + 2.0, 1.0 - 2.0, 1.0 * 2.0, 1.0 / 2.0" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 37, "text": [ "(3.0, -1.0, 2.0, 0.5)" ] } ], "prompt_number": 37 }, { "cell_type": "code", "collapsed": false, "input": [ "# Integer division of float numbers\n", "3.0 // 2.0" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 38, "text": [ "1.0" ] } ], "prompt_number": 38 }, { "cell_type": "code", "collapsed": false, "input": [ "# Note! The power operator in Python isn't ^, but **\n", "2 ** 2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 39, "text": [ "4" ] } ], "prompt_number": 39 }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The boolean operators are spelled out as words `and`, `not`, `or`. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "True and False" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 40, "text": [ "False" ] } ], "prompt_number": 40 }, { "cell_type": "code", "collapsed": false, "input": [ "not False" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 41, "text": [ "True" ] } ], "prompt_number": 41 }, { "cell_type": "code", "collapsed": false, "input": [ "True or False" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 42, "text": [ "True" ] } ], "prompt_number": 42 }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Comparison operators `>`, `<`, `>=` (greater or equal), `<=` (less or equal), `==` equality, `!=` inequality, `is` identical." ] }, { "cell_type": "code", "collapsed": false, "input": [ "2 > 1, 2 < 1" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 43, "text": [ "(True, False)" ] } ], "prompt_number": 43 }, { "cell_type": "code", "collapsed": false, "input": [ "2 > 2, 2 < 2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 44, "text": [ "(False, False)" ] } ], "prompt_number": 44 }, { "cell_type": "code", "collapsed": false, "input": [ "2 >= 2, 2 <= 2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 45, "text": [ "(True, True)" ] } ], "prompt_number": 45 }, { "cell_type": "code", "collapsed": false, "input": [ "# equality\n", "[1,2] == [1,2]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 46, "text": [ "True" ] } ], "prompt_number": 46 }, { "cell_type": "code", "collapsed": false, "input": [ "# inequality\n", "1 != 1" ], "language": "python", "metadata": {}, 
"outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 47, "text": [ "False" ] } ], "prompt_number": 47 }, { "cell_type": "code", "collapsed": false, "input": [ "# objects identical?\n", "l1 =[1,2]\n", "l2 = l1\n", "l1 is l2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 48, "text": [ "True" ] } ], "prompt_number": 48 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "Use Python to evaluate the following expression, for the given variables' values, to get the correct results and print the output\n", "\n", "`(5 * (a>b) + 2 * (ab)) + float(2*(a<(b*4))))/float(a) * c\n", "\n", "print(val)\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compound types: Strings, lists and dictionaries\n", "\n", "### Strings\n", "\n", "Strings are the variable type that is used for storing text messages. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "s = \"Hello world\"\n", "type(s)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 51, "text": [ "str" ] } ], "prompt_number": 51 }, { "cell_type": "code", "collapsed": false, "input": [ "s1 = \"KAUST's hello world\"\n", "print(s1)\n", "s2 = 'He said \"Hello\"'\n", "print(s2)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "KAUST's hello world\n", "He said \"Hello\"\n" ] } ], "prompt_number": 52 }, { "cell_type": "code", "collapsed": false, "input": [ "# length of the string: the number of characters\n", "len(s)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 53, "text": [ "11" ] } ], "prompt_number": 53 }, { "cell_type": "code", "collapsed": false, "input": [ "# replace a substring in a string with something else\n", "s2 = s.replace(\"world\", \"test\")\n", 
"print(s2)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Hello test\n" ] } ], "prompt_number": 54 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can index a character in a string using `[]`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "s[0]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 55, "text": [ "'H'" ] } ], "prompt_number": 55 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Heads up MATLAB users:** Indexing starts at 0!\n", "\n", "We can extract a part of a string using the syntax `[start:stop]`, which extracts the characters from index `start` up to, but not including, index `stop`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "s[0:5]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 56, "text": [ "'Hello'" ] } ], "prompt_number": 56 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we omit either (or both) of `start` or `stop` from `[start:stop]`, the default is the beginning and the end of the string, respectively:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "s[:5]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 57, "text": [ "'Hello'" ] } ], "prompt_number": 57 }, { "cell_type": "code", "collapsed": false, "input": [ "s[6:]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 58, "text": [ "'world'" ] } ], "prompt_number": 58 }, { "cell_type": "code", "collapsed": false, "input": [ "s[:]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 59, "text": [ "'Hello world'" ] } ], "prompt_number": 59 }, { "cell_type": "code", "collapsed": false, "input": [ "s[:-2]" ], "language": "python", "metadata": {}, "outputs": [ { 
"metadata": {}, "output_type": "pyout", "prompt_number": 60, "text": [ "'Hello wor'" ] } ], "prompt_number": 60 }, { "cell_type": "code", "collapsed": false, "input": [ "s[-1], s[-2]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 61, "text": [ "('d', 'l')" ] } ], "prompt_number": 61 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also define the step size using the syntax `[start:stop:step]` (the default value for `step` is 1, as we saw above):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "s[::1]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 62, "text": [ "'Hello world'" ] } ], "prompt_number": 62 }, { "cell_type": "code", "collapsed": false, "input": [ "s[::2]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 63, "text": [ "'Hlowrd'" ] } ], "prompt_number": 63 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This technique is called *slicing*. Read more about the syntax here: http://docs.python.org/release/2.7.3/library/functions.html?highlight=slice#slice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### String formatting\n", "\n", "Some examples in this subsection are taken from [http://docs.python.org/2/tutorial/inputoutput.html](http://docs.python.org/2/tutorial/inputoutput.html)\n", "\n", "Many operations can be performed on strings in Python. The following command lists the operations provided by `str` objects." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "print(dir(str))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']\n" ] } ], "prompt_number": 64 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we will look at `rjust`, `format`, `zfill`, and the `%` operator for C-style formatting.\n", "\n", "The following code shows an example of C-style formatting. More format specifiers can be found in any C language reference. A good source: [http://www.cplusplus.com/reference/cstdio/printf/](http://www.cplusplus.com/reference/cstdio/printf/)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "print 'The value of %5s is approximately %5.3f.' 
% ('PI', math.pi)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "The value of    PI is approximately 3.142.\n" ] } ], "prompt_number": 65 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `zfill` method pads a numeric string with zeros on the left until it reaches the desired width. A leading sign stays in front of the zeros, and a string that is already at least that wide is returned unchanged:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print('1234'.zfill(7))\n", "print('-3.14'.zfill(7))\n", "print('1234'.zfill(2))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0001234\n", "-003.14\n", "1234\n" ] } ], "prompt_number": 66 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `rjust` method pads a string on the left, with spaces by default, to reach the desired width. The following example shows how it can be used to print neatly aligned tables." ] }, { "cell_type": "code", "collapsed": false, "input": [ "print str(1).rjust(2), str(1**2).rjust(3),str(1**3).rjust(4)\n", "print str(5).rjust(2), str(5**2).rjust(3),str(5**3).rjust(4)\n", "print str(10).rjust(2), str(10**2).rjust(3),str(10**3).rjust(4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " 1   1    1\n", " 5  25  125\n", "10 100 1000\n" ] } ], "prompt_number": 67 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following examples use the `format` method to control how values are inserted into a string:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print('We are the {} who say \"{}!\"'.format('knights', 'Ni'))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "We are the knights who say \"Ni!\"\n" ] } ], "prompt_number": 68 }, { "cell_type": "code", "collapsed": false, "input": [ "print('{0} and {1}'.format('spam', 'eggs'))\n", "print('{1} and 
{0}'.format('spam', 'eggs'))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "spam and eggs\n", "eggs and spam\n" ] } ], "prompt_number": 69 }, { "cell_type": "code", "collapsed": false, "input": [ "print( 'This {food} is {adjective}.'.format(food='spam', adjective='absolutely horrible') )" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "This spam is absolutely horrible.\n" ] } ], "prompt_number": 70 }, { "cell_type": "code", "collapsed": false, "input": [ "print( 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred', other='Georg') )" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "The story of Bill, Manfred, and Georg.\n" ] } ], "prompt_number": 71 }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "print( 'The value of PI is approximately {0:.3f}.'.format(math.pi) )" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "The value of PI is approximately 3.142.\n" ] } ], "prompt_number": 72 }, { "cell_type": "code", "collapsed": false, "input": [ "name1 = 'Sjoerd'\n", "phone1 = 4127\n", "name2 = 'Jack'\n", "phone2 = 4098\n", "name3 = 'Dcab'\n", "phone3 = 7678\n", "\n", "print( '{0:10} ==> {1:10d}'.format(name1, phone1) )\n", "print( '{0:10} ==> {1:10d}'.format(name2, phone2) )\n", "print( '{0:10} ==> {1:10d}'.format(name3, phone3) )" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Sjoerd     ==>       4127\n", "Jack       ==>       4098\n", "Dcab       ==>       7678\n" ] } ], "prompt_number": 73 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python has a very rich set of functions for text processing. See for example http://docs.python.org/2/library/string.html for more information." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Use the built-in string methods and slicing to perform the following on the text `Hello world`:\n", "\n", "* Replace the letter `o` with `a`. (Hint: use the `replace` string method)\n", "* Convert the first word to upper case letters. (Hint: use the `upper` string method)\n", "* Convert the second word to lower case letters. (Hint: use the `lower` string method)\n", "* Print the characters from the 4th character up to and including the second-to-last character. (Hint: use slicing)\n" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 73 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/strings.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 116 }, { "cell_type": "code", "collapsed": false, "input": [ "s = 'Hello world'\n", "\n", "s = s.replace('o', 'a')\n", "\n", "# first word (and the space) to upper case, second word to lower case\n", "s = s[:6].upper() + s[6:].lower()\n", "\n", "print(s[3:-1])\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List\n", "\n", "Lists are very similar to strings, except that each element can be of any type.\n", "\n", "The syntax for creating lists in Python is `[...]`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "l = [1,2,3,4]\n", "\n", "print(type(l))\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "<type 'list'>\n", "[1, 2, 3, 4]\n" ] } ], "prompt_number": 75 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the same slicing techniques to manipulate lists as we could use on strings:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(l)\n", "\n", "print(l[1:3])\n", "\n", 
"print(l[::2])" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[1, 2, 3, 4]\n", "[2, 3]\n", "[1, 3]\n" ] } ], "prompt_number": 76 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Heads up MATLAB users:** Indexing starts at 0!" ] }, { "cell_type": "code", "collapsed": false, "input": [ "l[0]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 77, "text": [ "1" ] } ], "prompt_number": 77 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Elements in a list do not all have to be of the same type:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "l = [1, 'a', 1.0, 1-1j]\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[1, 'a', 1.0, (1-1j)]\n" ] } ], "prompt_number": 78 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python lists can be inhomogeneous and arbitrarily nested:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "nested_list = [1, [2, [3, [4, [5]]]]]\n", "\n", "print(nested_list)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[1, [2, [3, [4, [5]]]]]\n" ] } ], "prompt_number": 79 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Accessing elements in nested lists" ] }, { "cell_type": "code", "collapsed": false, "input": [ "nl = [1, [2, 3, 4], [5, [6, 7, 8]]]\n", "\n", "print(nl)\n", "print(nl[0])\n", "print(nl[1][1])\n", "print(nl[2][1][2])\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[1, [2, 3, 4], [5, [6, 7, 8]]]\n", "1\n", "3\n", "8\n" ] } ], "prompt_number": 80 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lists play a very important role in Python, and are for example used in loops and other flow control structures (discussed below). 
There are a number of convenient functions for generating lists of various types, for example the `range` function:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "start = 10\n", "stop = 30\n", "step = 2\n", "\n", "range(start, stop, step)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 81, "text": [ "[10, 12, 14, 16, 18, 20, 22, 24, 26, 28]" ] } ], "prompt_number": 81 }, { "cell_type": "code", "collapsed": false, "input": [ "# In Python 3, range returns a lazy range object rather than a list; convert it using 'list(...)'. In Python 2, range already returns a list, so the conversion changes nothing\n", "list(range(start, stop, step))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 82, "text": [ "[10, 12, 14, 16, 18, 20, 22, 24, 26, 28]" ] } ], "prompt_number": 82 }, { "cell_type": "code", "collapsed": false, "input": [ "list(range(-10, 10))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 83, "text": [ "[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] } ], "prompt_number": 83 }, { "cell_type": "code", "collapsed": false, "input": [ "s" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 84, "text": [ "'Hello world'" ] } ], "prompt_number": 84 }, { "cell_type": "code", "collapsed": false, "input": [ "# convert a string to a list of its characters:\n", "\n", "s2 = list(s)\n", "\n", "s2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 85, "text": [ "['H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']" ] } ], "prompt_number": 85 }, { "cell_type": "code", "collapsed": false, "input": [ "# sort the list in place\n", "s2.sort()\n", "\n", "print(s2)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[' ', 'H', 'd', 
'e', 'l', 'l', 'l', 'o', 'o', 'r', 'w']\n" ] } ], "prompt_number": 86 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Adding, inserting, modifying, and removing elements from lists" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# create a new empty list\n", "l = []\n", "\n", "# add elements using `append`\n", "l.append(\"A\")\n", "l.append(\"d\")\n", "l.append(\"d\")\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['A', 'd', 'd']\n" ] } ], "prompt_number": 87 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can modify lists by assigning new values to elements in the list. In technical jargon, lists are *mutable*." ] }, { "cell_type": "code", "collapsed": false, "input": [ "l[1] = \"p\"\n", "l[2] = \"p\"\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['A', 'p', 'p']\n" ] } ], "prompt_number": 88 }, { "cell_type": "code", "collapsed": false, "input": [ "l[1:3] = [\"d\", \"d\"]\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['A', 'd', 'd']\n" ] } ], "prompt_number": 89 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Insert an element at a specific index using `insert`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "l.insert(0, \"i\")\n", "l.insert(1, \"n\")\n", "l.insert(2, \"s\")\n", "l.insert(3, \"e\")\n", "l.insert(4, \"r\")\n", "l.insert(5, \"t\")\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['i', 'n', 's', 'e', 'r', 't', 'A', 'd', 'd']\n" ] } ], "prompt_number": 90 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remove the first element with a specific value using `remove`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ 
"l.remove(\"A\")\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['i', 'n', 's', 'e', 'r', 't', 'd', 'd']\n" ] } ], "prompt_number": 91 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remove an element at a specific location using `del`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "del l[7]\n", "del l[6]\n", "\n", "print(l)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['i', 'n', 's', 'e', 'r', 't']\n" ] } ], "prompt_number": 92 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using operators with lists: `+` concatenates two lists and `*` repeats a list." ] }, { "cell_type": "code", "collapsed": false, "input": [ "l1 = [1, 2, 3] + [4, 5, 6]\n", "print(l1)\n", "\n", "l2 = [1, 2, 3] * 2\n", "print(l2)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[1, 2, 3, 4, 5, 6]\n", "[1, 2, 3, 1, 2, 3]\n" ] } ], "prompt_number": 93 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Copy list by reference or by value\n", "Assigning a list to a new variable does not copy it: both names refer to the same list object, as we can see in the following example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a = [1, 2, 3]\n", "b = a\n", "print(\"a is b? \", a is b)\n", "b[0] = -1\n", "print(\"a = \", a)\n", "print(\"b = \", b)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "('a is b? ', True)\n", "('a = ', [-1, 2, 3])\n", "('b = ', [-1, 2, 3])\n" ] } ], "prompt_number": 94 }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make an independent copy of the list we can do the following:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a = [1, 2, 3]\n", "b = a[:] # copy using a full slice\n", "print(\"a is b? 
\", a is b)\n", "c = list(a) # copy using the list constructor\n", "b[0] = -1\n", "c[1] = -1\n", "print(\"a = \", a)\n", "print(\"b = \", b)\n", "print(\"c = \", c)\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "('a is b? ', False)\n", "('a = ', [1, 2, 3])\n", "('b = ', [-1, 2, 3])\n", "('c = ', [1, -1, 3])\n" ] } ], "prompt_number": 95 }, { "cell_type": "markdown", "metadata": {}, "source": [ "These are *shallow* copies: only the outer list is duplicated, and any nested lists are still shared between the original and the copy. This approach is therefore not enough when the list contains other lists:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a = [ [1, 2, 3], 4, 5]\n", "b = a[:]\n", "c = list(a)\n", "a[0].append(-1)\n", "print(\"a = \", a)\n", "print(\"b = \", b)\n", "print(\"c = \", c)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "('a = ', [[1, 2, 3, -1], 4, 5])\n", "('b = ', [[1, 2, 3, -1], 4, 5])\n", "('c = ', [[1, 2, 3, -1], 4, 5])\n" ] } ], "prompt_number": 96 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The solution here is to use `deepcopy` from the `copy` module, which also copies the nested lists:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from copy import deepcopy\n", "a = [ [1, 2, 3], 4, 5]\n", "b = deepcopy(a)\n", "a[0].append(-1)\n", "\n", "print(\"a = \", a)\n", "print(\"b = \", b)\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "('a = ', [[1, 2, 3, -1], 4, 5])\n", "('b = ', [[1, 2, 3], 4, 5])\n" ] } ], "prompt_number": 97 }, { "cell_type": "markdown", "metadata": {}, "source": [ "See `help(list)` for more details, or read the online documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise \n", "\n", "Perform the following list operations and print the final output\n", "\n", "* Create a list of the odd numbers between 4 and 16\n", "* Replace the last element of the list with a list of the even numbers between 3 and 9\n", "* In the list stored in the last element, change the second-to-last value to `-1`\n", 
"* Remove the elements from the 2nd to the 3rd, inclusive\n", "* Insert the string `Hello` after the first element\n" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 97 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/lists.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 117 }, { "cell_type": "code", "collapsed": false, "input": [ "# list(...) makes this work in both Python 2 and Python 3\n", "l = list(range(5,16,2))\n", "print(l)\n", "\n", "l[-1] = list(range(4,9,2))\n", "print(l)\n", "\n", "l[-1][-2] = -1\n", "print(l)\n", "\n", "# indices 1 and 2 are the 2nd and 3rd elements\n", "del l[1:3]\n", "print(l)\n", "\n", "l.insert(1, 'Hello')\n", "print(l)\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tuples\n", "\n", "Tuples are like lists, except that they cannot be modified once created, that is they are *immutable*. \n", "\n", "In Python, tuples are created using the syntax `(..., ..., ...)`, or even `..., ...`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "point = (10, 20)\n", "\n", "print(point, type(point))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "((10, 20), <type 'tuple'>)\n" ] } ], "prompt_number": 99 }, { "cell_type": "code", "collapsed": false, "input": [ "point = 10, 20\n", "\n", "print(point, type(point))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "((10, 20), <type 'tuple'>)\n" ] } ], "prompt_number": 100 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can unpack a tuple by assigning it to a comma-separated list of variables:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x, y = point\n", "\n", "print(\"x =\", x)\n", "print(\"y =\", y)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ 
"('x =', 10)\n", "('y =', 20)\n" ] } ], "prompt_number": 101 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we try to assign a new value to an element in a tuple we get an error:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "point[0] = 20" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "'tuple' object does not support item assignment", "output_type": "pyerr", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpoint\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m20\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment" ] } ], "prompt_number": 102 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dictionaries\n", "\n", "Dictionaries are also like lists, except that each element is a key-value pair. 
The syntax for dictionaries is `{key1 : value1, ...}`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "params = {\"parameter1\" : 1.0,\n", "          \"parameter2\" : 2.0,\n", "          \"parameter3\" : 3.0,\n", "          1: 4.0, \n", "          (5, 'ho'): 'hi'}\n", "\n", "print(type(params))\n", "print(params)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "<type 'dict'>\n", "{1: 4.0, 'parameter1': 1.0, (5, 'ho'): 'hi', 'parameter3': 3.0, 'parameter2': 2.0}\n" ] } ], "prompt_number": 103 }, { "cell_type": "code", "collapsed": false, "input": [ "print(\"parameter1 = \" + str(params[\"parameter1\"]))\n", "print(\"parameter2 = \" + str(params[\"parameter2\"]))\n", "print(\"parameter3 = \" + str(params[\"parameter3\"]))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "parameter1 = 1.0\n", "parameter2 = 2.0\n", "parameter3 = 3.0\n" ] } ], "prompt_number": 104 }, { "cell_type": "code", "collapsed": false, "input": [ "params[\"parameter1\"] = \"A\"\n", "params[\"parameter2\"] = \"B\"\n", "\n", "# add a new entry\n", "params[\"parameter4\"] = \"D\"\n", "\n", "print(\"parameter1 = \" + str(params[\"parameter1\"]))\n", "print(\"parameter2 = \" + str(params[\"parameter2\"]))\n", "print(\"parameter3 = \" + str(params[\"parameter3\"]))\n", "print(\"parameter4 = \" + str(params[\"parameter4\"]))\n", "print(\"'key 1' = \" + str(params[1]))\n", "print(\"'key (5, 'ho')' = \" + str(params[(5, 'ho')]))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "parameter1 = A\n", "parameter2 = B\n", "parameter3 = 3.0\n", "parameter4 = D\n", "'key 1' = 4.0\n", "'key (5, 'ho')' = hi\n" ] } ], "prompt_number": 105 }, { "cell_type": "code", "collapsed": false, "input": [ "del params[\"parameter2\"]\n", "print(params)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ 
"{'parameter4': 'D', 'parameter1': 'A', 'parameter3': 3.0, 1: 4.0, (5, 'ho'): 'hi'}\n" ] } ], "prompt_number": 106 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Create a dictionary that uses a tuple of a person's first and last name as the key and their age as the value, for the following list of people:\n", "\n", "* John Smith 30\n", "* Ahmad Said 22\n", "* Sara John 2\n", "\n", "Perform the following updates to the dictionary\n", "\n", "* Increase the age of John by one year\n", "* Add a new person named Ahmad Ahmad with age 19\n", "* Remove Sara from the dictionary\n" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 106 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/dicts.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 118 }, { "cell_type": "code", "collapsed": false, "input": [ "d = {('John', 'Smith'):30,\n", "     ('Ahmad', 'Said'):22,\n", "     ('Sara', 'John'):2\n", "    }\n", "print(d)\n", "\n", "john = ('John', 'Smith')\n", "d[john] = d[john] + 1\n", "print(d)\n", "\n", "d[('Ahmad', 'Ahmad')] = 19\n", "print(d)\n", "\n", "del d[('Sara','John')]\n", "print(d)\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Control Flow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conditional statements: if, elif, else\n", "\n", "The Python syntax for conditional execution of code uses the keywords `if`, `elif` (else if), and `else`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "statement1 = False\n", "statement2 = False\n", "\n", "if statement1:\n", "    print(\"statement1 is True\")\n", "    \n", "elif statement2:\n", "    print(\"statement2 is True\")\n", "    \n", "else:\n", "    print(\"statement1 and statement2 are False\")" ], 
"language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "statement1 and statement2 are False\n" ] } ], "prompt_number": 108 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we encounter, for the first time, a peculiar and unusual aspect of the Python programming language: program blocks are defined by their indentation level. \n", "\n", "Compare to the equivalent C code:\n", "\n", "    if (statement1)\n", "    {\n", "        printf(\"statement1 is True\\n\");\n", "    }\n", "    else if (statement2)\n", "    {\n", "        printf(\"statement2 is True\\n\");\n", "    }\n", "    else\n", "    {\n", "        printf(\"statement1 and statement2 are False\\n\");\n", "    }\n", "\n", "In C, blocks are defined by the enclosing curly brackets `{` and `}`, and the level of indentation (white space before the code statements) does not matter: it is completely optional. \n", "\n", "But in Python, the extent of a code block is defined by the indentation level (usually four spaces or a tab). This means that we have to be careful to indent our code correctly, or else we will get syntax errors. 
\n", "\n", "**Examples:**" ] }, { "cell_type": "code", "collapsed": false, "input": [ "statement1 = statement2 = True\n", "\n", "if statement1:\n", "    if statement2:\n", "        print(\"both statement1 and statement2 are True\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "both statement1 and statement2 are True\n" ] } ], "prompt_number": 109 }, { "cell_type": "code", "collapsed": false, "input": [ "# Bad indentation!\n", "if statement1:\n", "    if statement2:\n", "    print(\"both statement1 and statement2 are True\") # this line is not properly indented" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "IndentationError", "evalue": "expected an indented block (, line 4)", "output_type": "pyerr", "traceback": [ "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m4\u001b[0m\n\u001b[0;31m print(\"both statement1 and statement2 are True\") # this line is not properly indented\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mIndentationError\u001b[0m\u001b[0;31m:\u001b[0m expected an indented block\n" ] } ], "prompt_number": 110 }, { "cell_type": "code", "collapsed": false, "input": [ "statement1 = False \n", "\n", "if statement1:\n", "    print(\"printed if statement1 is True\")\n", "    \n", "    print(\"still inside the if block\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "x = 10\n", "y = 5\n", "\n", "if x > y:\n", "    print(float(x)/y)\n", "elif x == y:\n", "    print(1)\n", "else:\n", "    print(float(y)/x)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "if statement1:\n", "    print(\"printed if statement1 is True\")\n", "    \n", "print(\"now outside the if block\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# a compact way of writing an if/else: the conditional expression\n", "a = 2 if statement1 else 4\n", 
"print(\"a = \", a)\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "name = 'john'\n", "if name in ['jed', 'john']:\n", "    print(\"We have Jed or John\")\n", "    \n", "num = 1\n", "if num in [1, 2]:\n", "    print(\"We have 1 or 2\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loops\n", "\n", "In Python, loops can be programmed in a number of different ways. The most common is the `for` loop, which is used together with iterable objects, such as lists. The basic syntax is:\n", "\n", "\n", "**`for` loops**:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for x in [1,2,3]:\n", "    print(x)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `for` loop iterates over the elements of the supplied list, and executes the containing block once for each element. Any iterable object, not only a list, can be used in the `for` loop. For example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for x in range(4): # by default range starts at 0\n", "    print(x)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: `range(4)` does not include 4!" 
] }, { "cell_type": "code", "collapsed": false, "input": [ "for x in range(-3,3):\n", " print(x)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "for word in [\"scientific\", \"computing\", \"with\", \"python\"]:\n", " print(word)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over key-value pairs of a dictionary:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for key, value in params.items():\n", " print(key + \" = \" + str(value))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes it is useful to have access to the indices of the values when iterating over a list. We can use the `enumerate` function for this:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for idx, x in enumerate(range(-3,3)):\n", " print(idx, x)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**List comprehensions: Creating lists using `for` loops**:\n", "\n", "A convenient and compact way to initialize lists:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "l1 = [x**2 for x in range(0,5)]\n", "\n", "print(l1)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Nested list comprehensions\n", "l2 = [(x, y) for x in range(0,5) for y in range(5,10)]\n", "\n", "print(l2)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "#List comprehensions with conditional statement\n", "l1 = [x**2 for x in range(0,5) if x != 2]\n", "\n", "print(l1)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**`while` loops**:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "i = 0\n", "\n", "while i < 
5:\n", "    print(i)\n", "    \n", "    i = i + 1\n", "    \n", "print(\"done\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the `print(\"done\")` statement is not part of the `while` loop body because of the difference in indentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "Loop through the following list of words and create a dictionary according to the following rules:\n", "\n", "- Add a word to the dictionary only if its length is greater than 2.\n", "- Word lengths should be the keys, and lists of words with the corresponding length should be the values. In other words, the value for every key in the dictionary is a list of the words whose length equals that key." ] }, { "cell_type": "code", "collapsed": false, "input": [ "words = [\"Aerial\", \"Affect\", \"Agile\", \"Agriculture\", \"Animal\", \"Attract\", \"Audubon\",\n", "         \"Backyard\", \"Barrier\", \"Beak\", \"Bill\", \"Birdbath\", \"Branch\", \"Breed\", \"Buzzard\",\n", "         \"The\", \"On\", \"Upper\", \"Not\", \"What\", \"Linked\", \"Up\", \"In\", \"A\", \"lol\"]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/control_flow.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 119 }, { "cell_type": "code", "collapsed": false, "input": [ "d = {}\n", "for w in words:\n", "    if len(w) > 2:\n", "        if len(w) in d:\n", "            d[len(w)].append(w)\n", "        else:\n", "            d[len(w)] = [w]\n", "print(d)\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions\n", "\n", "A function in Python is defined using the keyword `def`, followed by a function name, a signature within parentheses `()`, and a colon `:`. 
The following code, with one additional level of indentation, is the function body." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def func0():\n", "    print(\"test\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "func0()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Optionally, but highly recommended, we can define a so-called \"docstring\", which is a description of the function's purpose and behavior. The docstring should follow directly after the function definition, before the code in the function body." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def func1(s):\n", "    \"\"\"\n", "    Print a string 's' and tell how many characters it has\n", "    \"\"\"\n", "\n", "    print(s + \" has \" + str(len(s)) + \" characters\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "help(func1)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "func1(\"test\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Functions that return a value use the `return` keyword:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def square(x):\n", "    \"\"\"\n", "    Return the square of x.\n", "    \"\"\"\n", "    return x ** 2" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "square(4)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can return multiple values from a function using tuples (see above):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def powers(x):\n", "    \"\"\"\n", "    Return a few powers of x.\n", "    \"\"\"\n", "    return x ** 2, x ** 3, x ** 4" ], "language": "python", "metadata": {}, 
"outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "powers(3)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "x2, x3, x4 = powers(3)\n", "\n", "print(x3)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Default arguments and keyword arguments\n", "\n", "In a function definition, we can give default values to the arguments the function takes:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def myfunc(x, p=2, debug=False):\n", "    if debug:\n", "        print(\"evaluating myfunc for x = \" + str(x) + \" using exponent p = \" + str(p))\n", "    return x**p" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we don't provide a value for the `debug` argument when calling the function `myfunc`, it defaults to the value provided in the function definition:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "myfunc(5)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "myfunc(5, debug=True)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we explicitly list the names of the arguments in the function call, they do not need to come in the same order as in the function definition. These are called *keyword* arguments, and they are often very useful in functions that take a lot of optional arguments." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "myfunc(p=3, debug=True, x=7)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unnamed functions (lambda functions)\n", "\n", "In Python we can also create unnamed functions, using the `lambda` keyword:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "f1 = lambda x: x**2\n", "\n", "# is equivalent to\n", "\n", "def f2(x):\n", "    return x**2" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "f1(2), f2(2)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This technique is useful, for example, when we want to pass a simple function as an argument to another function, like this:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# map is a built-in python function\n", "map(lambda x: x**2, range(-3,4))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# in python 3, `map` returns an iterator; we can use `list(...)` to convert it to an explicit list\n", "list(map(lambda x: x**2, range(-3,4)))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "Write a function called `apply` that does the same thing as `map`: it should take a list and a function as arguments and return a list of the values returned by the provided function when applied to each list element. Test it on the provided list, using a lambda function that takes an element and returns its absolute value." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "elements = [-100, 21, 115, 0.34, 45, -80, 12, 120, 73, -1]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/functions.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 120 }, { "cell_type": "code", "collapsed": false, "input": [ "def apply(l, fun):\n", "    new_l = []\n", "    for el in l:\n", "        new_l.append(fun(el))\n", "    return new_l\n", "\n", "apply(elements, lambda x: abs(x))\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Regular expressions\n", "[Regular expressions](http://en.wikipedia.org/wiki/Regular_expression) are special sequences of symbols that are used in various pattern matching tasks. They are extremely useful and can save a lot of time when we are trying to parse text files. Below, we will learn the most essential things about regular expressions in Python through a few examples. For more details, see [the documentation](http://docs.python.org/2/howto/regex.html)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import re # Regular expression functionality of Python is placed in a separate module 're'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 121 }, { "cell_type": "code", "collapsed": false, "input": [ "string = 'Hello, class! 
This is a testing string that contains a word CAT'\n", "pattern = r'CAT'\n", "match = re.search(pattern, string)\n", "if match:\n", "    print('found', match.group())\n", "else:\n", "    print('not found')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "found CAT\n" ] } ], "prompt_number": 122 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most often, we don't have one precise pattern to match. Instead, we have fuzzy definitions, different word forms, and so on, and this is exactly where regular expressions save us time." ] }, { "cell_type": "code", "collapsed": false, "input": [ "string = 'Hello, class! This is a testing string that contains a word caT'\n", "# a character class such as [cC] matches any one of the listed characters\n", "pat = r'[cC][aA][tT]'\n", "print('found', re.search(pat, string).group())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "found caT\n" ] } ], "prompt_number": 123 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The basic syntax of regular expressions:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Special characters: . ^ $ * + ? { } [ ] \\ | ( )\n", "string = 'Hello, class! This is a testing string that contains a word CAR'\n", "# '.' matches any single character except a newline\n", "pat = 'CA.'\n", "print('found', re.search(pat, string).group())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "found CAR\n" ] } ], "prompt_number": 124 }, { "cell_type": "code", "collapsed": false, "input": [ "string = 'Hello, class! This is a testing string that contains a word CAAAAAAAAAAAT'\n", "# 'A*' matches zero or more 'A' characters\n", "pat = 'CA*T'\n", "print('found', re.search(pat, string).group())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "found CAAAAAAAAAAAT\n" ] } ], "prompt_number": 125 }, { "cell_type": "code", "collapsed": false, "input": [ "string = 'Hello, class! 
This is a testing string that contains a word CAAT'\n", "# 'A+' matches one or more 'A' characters\n", "pat = 'CA+T'\n", "print('found', re.search(pat, string).group())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "found CAAT\n" ] } ], "prompt_number": 126 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, if we need to search for a pattern multiple times, it is generally more efficient to 'compile' it first and then reuse the compiled pattern." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pat = re.compile(r'CA+T')\n", "print('found', pat.search(string).group())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "found CAAT\n" ] } ], "prompt_number": 127 }, { "cell_type": "code", "collapsed": false, "input": [ "# Another example: parentheses define groups that can be retrieved separately\n", "line = \"Cats are smarter than dogs\"\n", "\n", "matchObj = re.match(r'(.*) are (.*?) .*', line)\n", "\n", "if matchObj:\n", "    print(\"matchObj.group() :\", matchObj.group())\n", "    print(\"matchObj.group(1) :\", matchObj.group(1))\n", "    print(\"matchObj.group(2) :\", matchObj.group(2))\n", "else:\n", "    print(\"No match!!\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "matchObj.group() : Cats are smarter than dogs\n", "matchObj.group(1) : Cats\n", "matchObj.group(2) : smarter\n" ] } ], "prompt_number": 128 }, { "cell_type": "code", "collapsed": false, "input": [ "dir(pat)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 129, "text": [ "['__class__',\n", " '__copy__',\n", " '__deepcopy__',\n", " '__delattr__',\n", " '__doc__',\n", " '__format__',\n", " '__getattribute__',\n", " '__hash__',\n", " '__init__',\n", " '__new__',\n", " '__reduce__',\n", " '__reduce_ex__',\n", " '__repr__',\n", " '__setattr__',\n", " '__sizeof__',\n", " '__str__',\n", " '__subclasshook__',\n", " 'findall',\n", " 'finditer',\n", " 'flags',\n", " 
'groupindex',\n", " 'groups',\n", " 'match',\n", " 'pattern',\n", " 'scanner',\n", " 'search',\n", " 'split',\n", " 'sub',\n", " 'subn']" ] } ], "prompt_number": 129 }, { "cell_type": "code", "collapsed": false, "input": [ "pat.findall?" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 130 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are now ready to apply regular expressions to a more realistic task and show their power." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# A piece of text from the \"KAUST Highlights of 2013\" article\n", "text = \"\"\"\n", "This past year saw members of KAUST faculty and leadership being awarded major international awards.\n", "At the beginning of the year, Prof. Jean Frechet, KAUST Vice President for Research, was awarded the 2013 Japan\n", "Prize in recognition of his original and outstanding scientific achievements serving to promote peace and prosperity\n", "for all mankind. Dr. Frechet and C. Grant Willson of the University of Texas at Austin were recognized for their achievement\n", "in the development of chemically amplified resistant polymer materials for innovative semiconductor manufacturing processes.\n", "Also appointed in the past year, Prof. Yves Gnanou, KAUST Dean of Physical Science and Engineering (PSE), was inducted into\n", "the elite ranks of the French Ordre national de la L\u00e9gion d'honneur (National Order of the Legion of Honor). He was\n", "presented with the coveted Chevalier medal at a ceremony held in Paris on June 27, 2013. Established over two centuries\n", "ago by Napoleon Bonaparte, the award recognizes its recipients' extraordinary contributions to France and French culture.\n", "Prof. Gnanou previously served as Vice President of Academic Affairs and Research at \u00c9cole Polytechnique in Paris. 
Spanning\n", "a third continent, the Desert Research Institute (DRI), in the US, presented Nina Fedoroff, Distinguished Professor of Bioscience\n", "and Director of the KAUST Center for Desert Agriculture, with the 2013 Nevada Medal. The award acknowledges outstanding achievements\n", "in science and engineering. The DRI President, Dr. Stephen Wells, said: \"In only a few decades Prof. Fedoroff's research\n", "has helped stimulate a revolution in biology.\"\n", "\"\"\"\n", "# 's?' makes the trailing 's' optional\n", "pat = re.compile(r'awards?')\n", "pat.findall(text)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 131, "text": [ "['award', 'awards', 'award', 'award', 'award']" ] } ], "prompt_number": 131 }, { "cell_type": "code", "collapsed": false, "input": [ "# 'award' followed by any run of characters that are neither whitespace nor punctuation\n", "pat = re.compile(r'award[^\\s,.!?:;]*')\n", "pat.findall(text)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 132, "text": [ "['awarded', 'awards', 'awarded', 'award', 'award']" ] } ], "prompt_number": 132 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `findall` method lets us split each match into groups. To do this, we only need to wrap the relevant parts of the pattern in parentheses." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pat = re.compile(r'(award)([^\\s,.!?:;]*)')\n", "pat.findall(text)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 133, "text": [ "[('award', 'ed'),\n", " ('award', 's'),\n", " ('award', 'ed'),\n", " ('award', ''),\n", " ('award', '')]" ] } ], "prompt_number": 133 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "Design a pattern that matches only correct email addresses. Apply this pattern to match all the emails in the provided string. Count the number of emails you got. 
Also make sure that your pattern's `findall` returns groups of the form ('name', 'domain.com'), excluding the '@' symbol." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Sample addresses taken from the web\n", "emails = \"\"\"\n", "litke@talktalk.net\n", "owarne000s@talk21.com\n", "seo@webindustry.co.uk\n", "w0lfspirit32@hotmail.co.uk\n", "robert.douglas3@virgin.net\n", "dan@webindustry.co.uk\n", "sasha_aitken@mail.ru\n", "brian@tweddle1966.freeserve.co.uk\n", "asmileiscatching@hotmail.com\n", "sashalws@aol.com\n", "ebbygraham372@hotmail.com\n", "chnelxx@hotmail.com\n", "choudhury.esaa@hotmail.co.uk\n", "girishvishwasjoshi@gmail.com\n", "mark.clynch@ntlworld.com\n", "ssladmin@eskdalesaddlery.co.uk\n", "owarnes.t21@btinternet.com\n", "mgrahamhaulage@aol.com\n", "beccababes@dsl.pipex.com\n", "victor-2k7@hotmail.co.uk\n", "kellybomaholly@hotmail.com\n", "sugar@caton9108.freeserve.co.uk\n", "paulapowley@btinternet.com\n", "owarnes@talk21.com\n", "lesleygodwin1@hotmail.com\n", "nicolawinter4@hotmail.com\n", "diwasbhattarai@yahoo.com\n", "james.hutchinson@aggregate.com\n", "margaret.howroyd@hotmail.com\n", "tina.wane@tiscali.co.uk\n", "lizzy26259495@yahoo.com\n", "kennyspence53@hotmail.com\n", "pedrovieira1@gmail.com\n", "versivul@mail.ru\n", "booom2012@mail.ru sin_0.8@mail.ru\n", "kaplya71@mail.ru smirnova-s-s@mail.ru\n", "325jm@mail.ru rasmyrik@mail.ru\n", "tanya_tyurnina@mail.ru fedorova2006@inbox.ru\n", "veniaminm77@mail.ru dimon_gushin@mail.ru\n", "anna_shevchenko@list.ru belexovagalina@mail.ru\n", "engelgardt_ledi_elena_01.11.77@mail.ru\n", "kolzhanov92@mail.ru digital1q@mail.ru\n", "\"\"\"" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 134 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/regular.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 135 }, { "cell_type": "code", "collapsed": false, "input": [ 
"# \\w already covers letters, digits and '_'; '-' is placed last in the class so it is taken literally\n", "pat = re.compile(r\"([\\w.-]+)@([\\w.-]+\\.[a-z]+)\")\n", "email_list = pat.findall(emails)\n", "print(email_list[:3])\n", "print(\"# of emails in the list is\", len(email_list))\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exceptions\n", "\n", "In Python, errors are managed with a special language construct called \"exceptions\". When an error occurs, an exception can be raised, which interrupts the normal program flow and transfers control to the closest enclosing try-except statement.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To generate an exception we can use the `raise` statement, which takes an argument that must be an instance of the class `BaseException` or of a class derived from it." ] }, { "cell_type": "code", "collapsed": false, "input": [ "raise Exception(\"description of the error\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A typical use of exceptions is to abort functions when some error condition occurs, for example:\n", "\n", "    def my_function(arguments):\n", "\n", "        if not verify(arguments):\n", "            raise Exception(\"Invalid arguments\")\n", "\n", "        # rest of the code goes here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To gracefully catch errors that are generated by functions and class methods, or by the Python interpreter itself, use the `try` and `except` statements:\n", "\n", "    try:\n", "        # normal code goes here\n", "    except:\n", "        # code for error handling goes here\n", "        # this code is not executed unless the code\n", "        # above generated an error\n", "\n", "For example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "try:\n", "    print(\"test\")\n", "    # generate an error: the variable test is not defined\n", "    print(test)\n", "except:\n", "    print(\"Caught an exception\")" ], "language": "python", "metadata": {}, "outputs": [ { 
"output_type": "stream", "stream": "stdout", "text": [ "test\n", "Caught an exception\n" ] } ], "prompt_number": 111 }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get information about the error, we can access the `Exception` instance that describes it, for example with:\n", "\n", "    except Exception as e:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "try:\n", "    print(\"test\")\n", "    # generate an error: the variable test is not defined\n", "    print(test)\n", "except Exception as e:\n", "    print(\"Caught an exception:\" + str(e))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "test\n", "Caught an exception:name 'test' is not defined\n" ] } ], "prompt_number": 112 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "The `os` module contains the `makedirs` function, which creates a directory but raises an error if the directory already exists. The goal of this exercise is to write a function that creates a directory given its name and ignores the error if the directory already exists.\n", "(Hint: the `errno` module defines an error code named `EEXIST`, which is the `errno` value of the `OSError` raised when the directory already exists.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "\n", "os.makedirs('mydir')" ], "language": "python", "metadata": {}, "outputs": [], 
"prompt_number": 113 }, { "cell_type": "code", "collapsed": false, "input": [ "os.makedirs('mydir') # This will raise an error because the directory already exists" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load solutions/exeptions.py" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 136 }, { "cell_type": "code", "collapsed": false, "input": [ "import errno\n", "import os\n", "\n", "def ensure_dir(d):\n", "    try:\n", "        os.makedirs(d)\n", "    except OSError as exc:\n", "        # ignore only the \"already exists\" error and re-raise anything else\n", "        if exc.errno != errno.EEXIST:\n", "            raise\n", "\n", "ensure_dir('mydir')\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further reading\n", "\n", "* http://www.python.org - The official web page of the Python programming language.\n", "* http://www.python.org/dev/peps/pep-0008 - Style guide for Python programming. Highly recommended.\n", "* http://www.greenteapress.com/thinkpython/ - A free book on Python programming.\n", "* [Python Essential Reference](http://www.amazon.com/Python-Essential-Reference-4th-Edition/dp/0672329786) - A good reference book on Python programming." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Copyright 2014, Tareq Malas and Maruan Al-Shedivat, ACM Student Members.*" ] } ], "metadata": {} } ] }