{ "cells": [ { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "1ac9ecee-f28a-462f-a772-e7cea9a82b07" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Introduction to Python\n", "## Getting Started" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "ba2291e8-4da1-4384-aed9-89d1352ed506" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Schedule\n", "\n", "| Date | Session | Title | Description |\n", "|--------------|--------------|----------------------|----------------------------------------------|\n", "| Day 1 | Lecture 1 | Intro and Setup | *Why Python, Setup, etc* |\n", "| Day 1 | Lecture 2 | Basic Python | *Syntax, data structures, control flow, etc* |\n", "| Day 2 | Lecture 3 | Numpy + Pandas I | *Numpy basics, series/dataframe basics*|\n", "| Day 2 | Lecture 4 | Pandas II | *Joining, advanced indexing, reshaping, etc* |\n", "| Day 3 | Lecture 5 | Pandas III | *Grouping, apply, transform, etc* |\n", "| Day 3 | Lecture 6 | Plotting | *Intro to plotting in Python* |\n", "| Day 3 | Lecture 7 | Intro to Modeling | *Intro to stats/ML models in Python* |" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "0f862571-7c07-4522-8b38-dee55d550f96" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Course materials\n", "\n", "### [github.com/ihmeuw/ihme-python-course](https://github.com/ihmeuw/ihme-python-course)\n", "\n", "First, you're going to want to get a copy of this repository onto your\n", "machine. Simply fire up ``git`` and clone it:\n", "\n", "1. Open up a shell (e.g. ``git.exe``, ``cmd.exe``, or ``terminal.app``)\n", "\n", "2. Navigate to where you'd like to save this. \n", " - We recommend ``~/repos/`` (e.g. ``C:/Users//repos/`` on Windows, ``/Users//repos/`` on Mac, or ``/home//repos/`` on Unix).\n", "\n", "3. 
Clone this repo:\n", " git clone https://github.com/ihmeuw/ihme-python-course.git\n", " \n", "If you need help with setting up `git`, see [this page](https://help.github.com/articles/set-up-git/#setting-up-git) or simply download the repo as a [zip file](https://github.com/IHME/ihme-python-course/archive/master.zip) for now..." ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "d8735df1-cb35-45cb-97ce-0edb86f5c0f6" }, "slideshow": { "slide_type": "slide" } }, "source": [ "![I wrote 20 short programs in Python yesterday. It was wonderful. Perl, I'm leaving you.](http://imgs.xkcd.com/comics/python.png)\n", "\n", "via [xkcd](https://xkcd.com/353/) (see also [xkcd in python](https://pypi.python.org/pypi/xkcd/) and [xkcd for matplotlib](http://matplotlib.org/xkcd/examples/showcase/xkcd.html))" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "cbb05c14-518f-433d-9e18-f6aca30a9ad5" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# What is Python? 🐍\n", "\n", "[Python](http://www.python.org/) is a widely used high-level, general-purpose, interpreted, dynamic programming language.\n", "\n", "## Broadly:\n", "\n", "Python is an interpreted scripting language (meaning that it is not compiled ahead of time; the source is translated to bytecode when it is run). The reference implementation, CPython, is itself written in C (though there are other non-C implementations). It offers much of the power and flexibility of lower-level (*i.e.* compiled) languages, without the steep learning curve and associated programming overhead. The language is very clean and readable, and it is available for almost every modern computing platform." 
] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "36363234-6fbb-42bd-b55d-6012593a7403" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Advantages:\n", "\n", "* Ease of programming, minimizing the time required to develop, debug and maintain code\n", "* Well-designed language that encourages good programming practices:\n", " * Modular and object-oriented programming, good system for packaging and re-use of code\n", " * Documentation tightly integrated with the code\n", "* A large standard library with many extensions\n", "\n", "## Disadvantages:\n", "\n", "* Since Python is an interpreted and dynamically typed programming language, the execution of Python code can be slow compared to compiled, statically typed programming languages, such as C and Fortran\n", "* Somewhat decentralized, with different environments, packages and documentation spread out across different places" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "8560f6c0-055f-4b75-9607-f1b330fa95ba" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Scientific Computing in Python\n", "\n", "\n", "## Why do we use Python at IHME?\n", "\n", "- Powerful and easy to use\n", "- Interactive\n", "- Extensible\n", "- Large third-party library ecosystem\n", "- Free and open" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "9b8de004-3973-4f9f-b34e-867d0a260445" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Powerful and easy to use\n", "\n", "- Python is simultaneously powerful, flexible and easy to learn and use (in general, these qualities are traded off for a given programming language)\n", "- Anything that can be coded in C, Fortran, or Java can be done in Python, almost always in fewer (and more readable) lines of code, and with fewer debugging headaches\n", "- Its standard library is extremely rich, including modules for string manipulation, regular expressions, file compression, mathematics, profiling and debugging 
(*etc*)\n", "- Python is object-oriented, an important programming paradigm that is particularly well suited to scientific programming because it allows data structures to be abstracted in a natural way" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "0d1ba0ad-c734-4c74-b356-d95bad89cdb9" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Interactive \n", "\n", "- Python may be run interactively on the command line, in much the same way as R\n", "- Notebooks offer convenient prototyping, mixing code in with outputs such as graphs and direct viewing of data structures\n", "- Rather than compiling and running a particular program, commands may be entered serially, which is often useful for mathematical programming and debugging" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "63ba0d58-d09f-417b-8305-955ee07045f6" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Extensible\n", "\n", "- Often referred to as a “glue” language, meaning that it is useful in a mixed-language environment \n", " - (such as at IHME, where we often have to combine R, Stata, C++, etc)\n", "- Python was designed to interact with other programming languages, both through interfaces like [rpy2](http://rpy2.bitbucket.org/) and by compiling directly into Python extensions using e.g. [Cython](http://cython.org/)\n", "- New interfaces are coming out all the time, such as for GPU computing" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "29b82dc1-b5b9-48c5-9c60-bdc26452e190" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Large third-party library ecosystem\n", "\n", "There are modules available for just about anything you could want to do in Python, with nearly 100,000 available on [PyPI](https://pypi.python.org/pypi) alone. Some notable packages:\n", "\n", "- [**NumPy**](http://www.numpy.org/): Numerical Python (NumPy) is a set of extensions that provides the ability to specify and manipulate array data structures. 
It provides array manipulation and computational capabilities similar to those found in MATLAB or Octave. \n", "- [**SciPy**](http://www.scipy.org/): An open-source library of scientific tools for Python, SciPy supplements the NumPy module. SciPy gathers a variety of high-level science and engineering modules together as a single package. SciPy includes modules for graphics and plotting, optimization, integration, special functions, signal and image processing, genetic algorithms, ODE solvers, and others.\n", "- [**Matplotlib**](http://matplotlib.org/): Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Its syntax is very similar to MATLAB's. \n", "- [**Pandas**](http://pandas.pydata.org/): A module that provides high-performance, easy-to-use data structures and data analysis tools. In particular, the `DataFrame` class is useful for spreadsheet-like representation and manipulation of data. Also includes high-level plotting functionality.\n", "- [**IPython**](https://ipython.org/): An enhanced Python shell, designed to increase the efficiency and usability of coding, testing and debugging Python. It includes both a Qt-based console and an interactive HTML notebook interface, both of which feature multiline editing, interactive plotting and syntax highlighting." 
] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "57d3a95e-3eb6-4d24-b64b-5ee69c453fce" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Free and open\n", "\n", "- Python is released on all platforms under an open license (Python Software Foundation License), meaning that the language and its source is freely distributable\n", "- Keeps costs down for scientists and universities operating under a limited budget\n", "- Frees programmers from licensing concerns for any software they may develop" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "5f237cbd-7cd5-40fe-9dc9-5881a440ca56" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Setup\n", "\n", "## Python 2 vs 3\n", "A [debate](https://wiki.python.org/moin/Python2orPython3) as old as time. Don't worry about it, just use Python 3 for now.\n", "\n", "## The Anaconda Distribution\n", "[Anaconda](https://www.continuum.io/anaconda-overview) is a suite of tools for Python (and R...) that will install everything you need to get up and running with Python" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "36b34ce8-261e-4423-9dd7-8f7ddb66c79b" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Python interpreter\n", "\n", "- Simply typing `python` at the command line will open up the standard Python interpreter:\n", "\n", "- Seldom used interactively - but we'll use it a lot for running completed programs, e.g.\n", " \n", " `python my-program.py`\n" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "834c0a44-4d7e-4fb3-9fa9-d0f4262a8e61" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## IPython\n", "\n", "IPython is an interactive interpreter for Python that adds many user-friendly features on top of the standard `python`:\n", "\n", "- Tab auto-completion\n", "- Command history (using up and down arrows)\n", "- In-line highlighting and editing of code\n", "- Object introspection\n", "- Automatic extraction of 
docstrings from Python objects like classes and functions\n", "\n", "Start IPython by simply calling `ipython` from the command line:\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "33186ec3-970a-433d-a2c9-0bcb5da35ca9" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Jupyter\n", "\n", "- [Jupyter notebooks](http://jupyter.org/) are HTML-based environments for IPython, R, and more, which allow you to interactively code, explore your data, and integrate documentation\n", "- *This* is a Jupyter notebook\n", "- Starting a Jupyter notebook server will launch a local webserver that you can view in your browser to create, view, and run notebooks:\n", " \n", " jupyter notebook\n", " \n", "" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "a521a6ed-dbc0-4822-8df3-1f3c0caba522" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Spyder\n", "\n", "- [Spyder](http://code.google.com/p/spyderlib/) is a MATLAB-like IDE for scientific computing with Python\n", "- Everything from code editing to execution and debugging is carried out in a single environment, and work on different calculations can be organized as projects in the IDE\n", "- Calling Spyder will open up a new project\n", "\n", " spyder\n", " \n", "\n", "\n", "Another interesting new IDE for Python is [Rodeo](https://www.yhat.com/products/rodeo)" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "18b1b9f6-dac6-4bc2-9ff8-c776049d56ab" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Installation\n", "\n", "### The easy way\n", "Go to the [Anaconda download page](https://www.continuum.io/downloads), download the installer for Python 3.5 (64-bit), and simply click through to follow the instructions\n", "\n", "### The fancy way\n", "If you'd like to set up a [Docker container with Anaconda](https://www.continuum.io/blog/developer-blog/anaconda-and-docker-better-together-reproducible-data-science), check out the 
[Docker setup instructions](../Docker-Instructions.rst). But be warned that it doesn't play terribly nicely with Windows 7 or 8..." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Slideshows\n", "If you'd like to be able to view these notebooks in slideshow mode, install [RISE](https://github.com/damianavila/RISE)\n", " \n", " conda install -c damianavila82 rise" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/markdown": [ "# Exercise 1\n", "\n", "## Installing Anaconda\n", "\n", "See [the README](../README.rst) for instructions on how to install Anaconda\n", "on your system.\n", "\n", "\n", "## Clone this repo\n", "See [the README](../README.rst) again for instructions on cloning this repo\n", "(into its recommended location of `~/repos/`).\n", "\n", "\n", "## Opening the `python` interpreter\n", "\n", "- Open up your terminal and type `python` at the command line\n", "- Try typing some things in to see how they work... some suggestions:\n", " * `1 + 1`\n", " * `1.0 + 1`\n", " * `\"hello world\"`\n", " * `print(\"hello world\")`\n", "- Type `exit()` to exit\n", "\n", "\n", "## Running `ipython`\n", "\n", "- Open up the interactive IPython interpreter\n", " * `ipython`\n", "- Try some of the same inputs as above\n", "- Type `print?` and hit `return`\n", "- Type `print(` and hit `tab`\n", "- Type `exit()` to exit\n", "\n", "\n", "## Exploring `jupyter notebook`\n", "\n", "- Open up a terminal and navigate to the root directory of this repo (e.g. 
\n", " `~/repos/ihme-python-course`)\n", "- Type `jupyter notebook` to startup a notebook server\n", "- Your web browser will most likely automatically open \n", " * If not, you can navigate to \n", " [http://localhost:8888/](http://localhost:8888/)\n", " * _Note_: if you're already running something on port `8888` it might start\n", " the server on a different port. Look for a message in your terminal \n", " like\n", " The Jupyter Notebook is running at: http://localhost:8889/\n", " to find it.\n", "- Use the file tree to navigate to the `Lecture 1/Exercise 1` directory and \n", " and then click on `test-notebook.ipynb` to open it\n", "- Follow the instructions in the notebook\n", "\n", "\n", "## Run a script from the command line\n", "\n", "- Examine [Exercise 1/test-script.py](Exercise 1/test-script.py) with your\n", " favorite editor or Spyder\n", "- Run it from the command line\n", " * _Hint_: remember `python my-script.py`\n", "- Change the message from \"hello world\" to something else and execute it\n", "\n", "\n", "## _Optional_: Install `RISE`\n", "\n", "- [`RISE`](https://github.com/damianavila/RISE) will add a new button to your\n", " Jupyter toolbar that'll allow you to view these notebooks in slideshow mode\n", "- Click the new bargraph-esque icon in the toolbar to start a slideshow\n", "- Use `spacebar` and `shift+spacebar` to navigate through the slides\n", "- To edit how the slideshow is structured, take a look at \n", " `View > Cell Toolbar > Slideshow`\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import Markdown, display\n", "display(Markdown(open('./Exercise 1/README.md', 'r').read()))" ] }, { "cell_type": "markdown", "metadata": { "nbpresent": { "id": "8ea66e52-ab66-4ea4-841c-5f4be654c984" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# References\n", "\n", "- Slide materials inspired by and adapted from [Chris 
Fonnesbeck](https://github.com/fonnesbeck/HealthPolicyPython) and [J Robert Johansson](https://github.com/jrjohansson/scientific-python-lectures)" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [conda root]", "language": "python", "name": "conda-root-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" }, "livereveal": { "height": 768, "scroll": true, "slideNumber": true, "start_slideshow_at": "selected", "theme": "league", "width": 1024 } }, "nbformat": 4, "nbformat_minor": 0 }