" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A command about *my self*" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\tWho am i\n", "\n", "twitter: @paolo.donorio\n", "github: @pdonorio\n", "email: p.donoriodemeo@cineca.it\n" ] }, { "data": { "text/plain": [ "42" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# This is a new command added from the extension 'whoami'\n", "%helloworld" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### I will double check\n", "* note to self: click on the cell below\n", "* then press shift+enter" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Today is 15/06/2015\n" ] } ], "source": [ "import time\n", "print (\"Today is \" + time.strftime(\"%d/%m/%Y\"))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Introduction" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Quick note**\n", "\n", "*These presentations are based on the awesome work of J.R. Johansson* \n", "\n", "source: http://dml.riken.jp/~rob\n", "\n", "*I also took inspiration from one of that project forks* \n", "\n", "source: http://nbviewer.ipython.org/gist/rpmuller/5920182\n", "\n", "(opensource rulez)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "** Testing the audience **\n", "\n", "- Do you already know python?\n", "- Which version of python are you using?\n", "- What is the main reason you decided to use python? 
\n", "- Have you ever used `numpy`?\n", "- Do you know what `ipython` is?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## A notebook and a calculator" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Many of the things I used to use a calculator for, I now use Python for:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2+2" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(50-5*6)/4" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "There are some gotchas compared to using a normal calculator." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "7/3" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "* Python integer division, like C or Fortran integer division, discards the remainder and returns an integer (Python rounds down toward negative infinity, while C and Fortran truncate toward zero; the two differ only for negative results). \n", "    * At least it does in version 2. \n", "    * In version 3, `/` returns a floating point number; use `//` for integer division. 
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2.3333333333333335" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Preview of py3k feature in Python 2 by importing the module from the *future* \n", "from __future__ import division\n", "7/3" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Alternatively, you can convert one of the integers to a floating point number, in which case the division function returns another floating point number." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2.3333333333333335" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# One way\n", "7/3." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2.3333333333333335" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Second way\n", "7/float(3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "##What did we see so far?\n", "\n", "* integers\n", "* floating point numbers\n", "* import (of a python library)\n", "* libraries are called **modules**" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "9.0" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# An example of using a module\n", "from math import sqrt\n", "sqrt(81)" ] }, { "cell_type": "code", "execution_count": 12, 
"metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "9.0" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Or you can simply import the math library itself\n", "import math\n", "math.sqrt(81)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "You can define variables using the equals (=) sign:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "600" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "width = 20\n", "length = 30\n", "area = length*width\n", "area" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If you try to access a variable that you haven't yet defined, you get an error:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "ename": "NameError", "evalue": "name 'volume' is not defined", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mvolume\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mNameError\u001b[0m: name 'volume' is not defined" ] } ], "source": [ "volume" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "and you need to define it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "depth = 10\n", "volume = area*depth\n", "volume" ] }, { "cell_type": 
"markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* You can name a variable *almost* anything you want\n", "* It needs to start with an alphabetical character or \"\\_\" \n", "* It can contain alphanumeric charcters plus underscores (\"\\_\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Certain words, however, are **reserved** for the *language*:\n", "\n", " and, as, assert, break, class, continue, def, del, elif, else, except, \n", " exec, finally, for, from, global, if, import, in, i\n", " s, lambda, not, or,\n", " pass, print, raise, return, try, while, with, yield" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Trying to define a variable using one of these will result in a syntax error:\n", "return = 0" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "a little step back:\n", "\n", "## The role of computing in science" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Science has traditionally been divided into \n", "\n", "* **experimental** and \n", "* **theoretical** disciplines." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "During the last several decades ***computing*** has emerged\n", "\n", "* Related to both experiments and theory.\n", "* Often viewed as a new third branch of science. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "*Nowadays a vast majority of both experimental and theoretical papers involve some **numerical calculations**, simulations or computer modeling*." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In experimental and theoretical sciences\n", "* The methods used and the results are published\n", "* All experimental data should be available upon request\n", "* It is considered unscientific to withhold crucial details in a theoretical proof\n", "\n", "In computational sciences \n", "* There are not yet any well-established guidelines for how **source code** and **generated data** should be handled. \n", "\n", "*A number of editorials in high-profile journals have started \n", "to demand that authors provide the source code for simulation software*\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Requirements on scientific computing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "With respect to numerical work:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Replication**\n", " - An author of a scientific paper that involves numerical calculations should be able to rerun the simulations and replicate the results upon request. \n", " - Other scientists should also be able to perform the same calculations and obtain the same results, given the information about the methods used in a publication." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Reproducibility** \n", " - The results obtained should be reproducible with an independent implementation of the method, or using a different method altogether. 
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### To achieve these goals\n", "\n", "* Keep source code and version that was used to produce data and figures in published papers\n", "* Record information of which version of external software that was used\n", " - Keep access to the environment that was used\n", "* Be ready to give additional information about the methods used\n", "* Ideally codes should be published online" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Tools for managing source code\n", "*this is extremely important for your future work*" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Ensuring replicability and reprodicibility of scientific simulations is a *complicated problem*" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "####Revision Control System (RCS) software\n", "Good choices include\n", "* git - http://git-scm.com\n", "* mercurial - http://mercurial.selenic.com. Also known as `hg`\n", "* subversion - http://subversion.apache.org. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "sources:\n", "- http://blog.codeeval.com/codeevalblog/2014#.VXGkF5rtlBc=\n", "- http://blog.codeeval.com/codeevalblog/2015#.VXGkE5rtlBc=" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### What makes python suitable for scientific computing?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Python has a strong position in *scientific computing* \n", "\n", "- Large community of users\n", "- Easy to find help and documentation\n", "\n", "Extensive ecosystem of *scientific libraries* and environments\n", "- **numpy** http://numpy.scipy.org - Numerical Python\n", "- **scipy** http://www.scipy.org - Scientific Python\n", "- **matplotlib** http://www.matplotlib.org - graphics library" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "* Great performance due to close integration with time-tested and highly optimized codes written in C and Fortran:\n", "    * blas, atlas blas, lapack, arpack, Intel MKL, ...\n", "\n", "* Good support for \n", "    * Parallel processing with processes and threads\n", "    * Interprocess communication (MPI)\n", "    * GPU computing (OpenCL and CUDA)\n", "\n", "* Readily available and suitable for use on high-performance computing clusters. 
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##No license costs!\n", "\n", "No unnecessary use of research budget" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### The scientific python software stack\n", "Lots of '*goodies*'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Python interpreter" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The standard way to use the Python programming language is to use the Python interpreter to run python code" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "* The python interpreter is a program that read and execute the python code in files passed to it as arguments\n", "* At the command prompt, the command ``python`` is used to invoke the Python interpreter\n", "\n", "For example, to run a file ``my-program.py`` that contains python code from the command prompt, use:\n", "\n", " $ python my-program.py" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can also start the interpreter by simply typing ``python`` at the command line, and interactively type python code into the interpreter. \n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### IPython" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "IPython is an interactive shell that addresses the limitation of the standard python interpreter\n", "\n", "...it is a work-horse for scientific use of python! 
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "It provides an interactive prompt to the python interpreter with a greatly improved user-friendliness.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Some of the many useful features of IPython includes:\n", "\n", "* Command history, which can be browsed with the up and down arrows on the keyboard.\n", "* Tab auto-completion.\n", "* In-line editing of code.\n", "* Object introspection, and automatic extract of documentation strings from python objects like classes and functions.\n", "* Good interaction with operating system shell.\n", "* Support for multiple parallel back-end processes, that can run on computing clusters or cloud services like Amazon EE2.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# IPython notebook" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "[IPython notebook](http://ipython.org/notebook.html) is an HTML-based notebook environment for Python\n", "\n", "* Based on the IPython shell\n", "* Provides a cell-based environment with great interactivity\n", "* calculations can be organized documented in a structured way\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Although using the a web browser as graphical interface, \n", "\n", "IPython notebooks are usually run **locally**\n", "\n", "from the same computer that run the browser. 
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "To start a new IPython notebook session, run the following command:\n", "\n", " $ ipython notebook\n", "\n", "from a directory where you want the notebooks to be stored. \n", "\n", "(This will open a new browser window with a running explorer of the current path)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Welcome, *you* \n", "##a notebooker scientist" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`let's demo together`\n", "\n", "Link: http://j.mp/ourpylab\n", "\n", " ... and yes, you will be able to do what i can do at the end of this day " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##The notebook magic" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- explorer, create new, remove, rename" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- move inside, run cell code, help" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- the kernel, cell types " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- markdown and notes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- download ipynb, python, html" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- install a library and use it" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Versions of Python" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "There are currently two versions of python: \n", 
"\n", "**Python 2** and **Python 3**. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "* Python 3 will eventually supercede Python 2\n", " + but **it is not backward-compatible** with Python 2\n", "* A lot of existing python code and packages has been written for Python 2\n", " + and it is still the most wide-spread version\n", " \n", "We will stick with Python 2 for this time.\n", "\n", "*Note*:\n", "\n", "> Several versions of Python can be installed in parallel, as shown above.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "before getting serious:\n", "\n", "## Python and module versions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "For the reproducibility of an IPython notebook:\n", "\n", "- We record the versions of all these different software packages\n", "- If this is done properly it will be easy to reproduce the environment \n", "\n", "\n", "To encourage the practice of recording versions in notebooks:\n", "\n", "> a simple IPython extension that produces a table with versions numbers of selected software components" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Installed version_information.py. 