# Introduction to Jupyter notebooks

*This tutorial was modified from materials of the Caltech course Data Analysis in the Biological Sciences taught by Justin Bois* http://bebi103.caltech.edu/2015/tutorials.html

The first thing we'll do, [discussed later](#Best-practices-for-code-cells), is import all the modules we'll need.  You should in general do this at the very beginning of each notebook, and in fact each `.py` file you write.

In [None]:
# Import numerical tools
import numpy as np
import scipy.integrate

# Import pyplot for plotting
import matplotlib.pyplot as plt

# Magic function to make matplotlib inline; other style specs must come AFTER
%matplotlib inline

%config InlineBackend.figure_formats = {'svg',}
#%config InlineBackend.figure_formats = {'png', 'retina'}

In this tutorial, you will learn the basics of using Jupyter notebooks.  Your problem sets will be submitted as Jupyter notebooks, so this is something you will need to master.

Reading [the official Jupyter documentation](http://jupyter-notebook.readthedocs.org/) can also be helpful.

## Contents
* [What is Jupyter?](#What-is-Jupyter?)
* [Launching a Jupyter notebook](#Launching-a-Jupyter-notebook)
* [Cells](#Cells)
* [Code cells](#Code-cells)
    - [Display of graphics](#Display-of-graphics)
    - [Proper formatting of cells](#Proper-formatting-of-cells)
    - [Best practices for code cells](#Best-practices-for-code-cells)

## What is Jupyter?
[Jupyter](http://jupyter.org) is a way to combine text and code (which runs and can display graphic output!) in an easy-to-read document that renders in a web browser.  The notebook itself is stored as a text file in [JSON](http://json.org) format.  This text file is what you will submit to bCourses for your problem sets.
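
To make the JSON point concrete, here is a minimal sketch of what a notebook file looks like on disk.  The field names follow my understanding of version 4 of the notebook format; the two cells are made up for illustration, and a notebook saved by Jupyter will carry more metadata than this.

```python
import json

# A hand-written, minimal notebook in the version 4 layout (illustrative only)
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello, world.')"],
        },
    ],
}

# Serialize to the text that would be stored in the .ipynb file, then read it back
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

# List the type of each cell in the order it appears in the file
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Because the file is plain JSON, it can be inspected with any text editor, although you will normally edit it only through the browser interface.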

Many different types of programming languages can be run within a Jupyter notebook. We will be using the language [Python](http://python.org/) which provides flexible and powerful tools for data analysis and plotting.

## Launching a Jupyter notebook

To launch a Jupyter notebook, you can do the following.
* **Mac**: Use the Anaconda launcher and select Jupyter notebook.
* **Windows**: Under "Search programs and files" from the Start menu, type `jupyter notebook` and select "Jupyter notebook."

A Jupyter notebook will then launch in your default web browser.

You can also launch Jupyter from the command line.  To do this, simply enter

    jupyter notebook

on the command line and hit enter.  This also allows for greater flexibility, as you can launch Jupyter with command line flags.  For example, I launch Jupyter using

    jupyter notebook --browser=safari

This fires up Jupyter with Safari as the browser.  If you launch Jupyter from the command line, your shell will be occupied with Jupyter and will occasionally print information to the screen. 

When you launch Jupyter, you will be presented with a menu of the files in your current working directory.  You can navigate to a file you wish to edit by clicking through the directories in this listing; the "Upload" button in the upper right corner is for adding a file from elsewhere on your computer to the current directory.  You can also click "New" in the upper right corner to create a new Jupyter notebook.  Once you select a file, it will open in a new window or tab in your browser, beautifully formatted and ready to edit.

## Cells
A Jupyter notebook consists of **cells**.  The two main types of cells you will use are **code cells** and **markdown cells**, and we will go into their properties in depth momentarily.  First, an overview.

A code cell contains actual code that you want to run.  You can specify a cell as a code cell using the pulldown menu in the toolbar in your Jupyter notebook.  Alternatively, you can hit `esc` and then `y` (denoted "`esc, y`") while a cell is selected to specify that it is a code cell.  Note that you will have to hit enter after doing this to start editing it.

If you want to execute the code in a code cell, hit "`shift + enter`."  Note that code cells are executed in the order you execute them.  That is to say, the ordering of the cells for which you hit "`shift + enter`" is the order in which the code is executed.  If you did not explicitly execute a cell early in the document, its results are not known to the Python interpreter.
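
The effect of execution order can be mimicked outside a notebook.  The sketch below is a toy model, not a description of how Jupyter is implemented: each "cell" is a string of code, and all cells share one namespace (the names `cell_1`, `cell_2`, and `namespace` are made up for this illustration).

```python
# Toy model of a notebook session: one shared namespace, cells run on demand
namespace = {}

cell_1 = "a = 5 + 6"
cell_2 = "b = a * 2"

# Running the second cell first fails, because `a` has not been defined yet
try:
    exec(cell_2, namespace)
    ran_out_of_order = True
except NameError:
    ran_out_of_order = False

# Running the cells in dependency order works, wherever they sit on the page
exec(cell_1, namespace)
exec(cell_2, namespace)
result = namespace["b"]

print(ran_out_of_order, result)
```

If a notebook misbehaves after you have executed cells in a scrambled order, restarting the kernel and running the cells from top to bottom is usually the simplest fix.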

Markdown cells contain text.  The text is written in **markdown**, a lightweight markup language.  You can read about its syntax [here](http://daringfireball.net/projects/markdown/syntax).  Note that you can also insert HTML or $\LaTeX$ expressions into markdown cells, and they will be rendered properly.

As you are typing the contents of these cells, the results appear as text.  Hitting "`shift + enter`" renders the text in the formatting you specify. You can specify a cell as being a markdown cell in the Jupyter toolbar, or by hitting "`esc, m`" in the cell.  Again, you have to hit enter after using the quick keys to bring the cell into edit mode (or you can just double-click on the cell).

In general, when you want to add a new cell, you can use the "Insert" pulldown menu from the Jupyter toolbar.  The shortcut to insert a cell below is "`esc, b`" and to insert a cell above is "`esc, a`."  Alternatively, you can execute a cell and automatically add a new one below it by hitting "`alt + enter`."

## Code cells
Below is an example of a code cell printing `hello, world.`  Notice that the output of the `print` function appears in the same cell, though separate from the code block. Notice also that the first line of the code cell has a hash sign (`#`) in front of it. This symbol designates a line as a "comment." Comments are ignored by the Python interpreter (i.e., not treated as code).

In [None]:
# Say hello to the world.
print('hello, world.')

If you evaluate a Python expression that returns a value, that value is displayed as output of the code cell.  This only happens, however, for the last line of the code cell.

In [None]:
# Would show 9 if this were the last line, but it is not, so shows nothing
4 + 5

# I hope we see 11.
5 + 6

Note, however, that if the last line does not return a value, as is the case for a variable assignment, there is no visible output from the code cell.

In [None]:
# Variable assignment, so no visible output.
a = 5 + 6

In [None]:
# However, now if we ask for a, its value will be displayed
a
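
Since only the last expression in a cell is displayed automatically, one simple way to show several values from a single cell is to call `print` on each of them.  The variable names below are made up for illustration.

```python
# Store intermediate results instead of leaving them as bare expressions
first = 4 + 5
second = 5 + 6

# Each print call produces its own line of output, regardless of position in the cell
print(first)
print(second)
```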

### Display of graphics
When displaying graphics, you should have them **inline**, meaning that they are displayed directly in the Jupyter notebook and not in a separate window.  You can specify that, as I did at the top of this document, using the `%matplotlib inline` magic function.  Below is an example of graphics displayed inline.

Generally, I prefer presenting graphics as scalable vector graphics (SVG).  Vector graphics are infinitely zoom-able; i.e., the graphics are represented as points, lines, curves, etc., in space, not as a set of pixel values as is the case with raster graphics (such as PNG).  By default, graphics are displayed as PNGs, but you can specify SVG as I have at the top of this document in the first code cell. 

    %config InlineBackend.figure_formats = {'svg',}

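If SVG graphics aren't working in your browser, you can use high-resolution PNG graphics instead by putting

    %config InlineBackend.figure_formats = {'png', 'retina'}

at the top of your file in place of the SVG line.  In the first code cell of this document, this alternative is included but commented out.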

In [None]:
# Generate data to plot
x = np.linspace(0, 2 * np.pi, 200)
y = np.exp(np.sin(x))

# Make plot
plt.plot(x, y)
plt.xlim((0, 2 * np.pi))
plt.xlabel(r'$x$')
plt.ylabel(r'$\mathrm{e}^{\sin{x}}$')
plt.show()

### Proper formatting of cells
Generally, it is a good idea to keep cells simple.  You can define one function, or maybe two or three closely related functions, in a single cell, and that's about it.  When you define a function, you should make sure it is properly commented with descriptive doc strings.  Below is an example of how I might generate a plot of the Lorenz attractor (a classic system of differential equations that exhibits chaotic behavior and is very interesting-looking) with code cells and markdown cells with discussion of what I am doing. Don't worry about the details of the math here; it isn't too relevant to what we'll be doing in this course. Instead, what I want you to pay attention to is how functions are used and defined.

The first function defines the system of equations that constitutes the Lorenz attractor.  It will ultimately be passed to another function that solves the system of equations numerically.

In [None]:
def lorenz_attractor(r, t, p):
    """
    Compute the right hand side of system of the equations for Lorenz attractor.
    
    Parameters
    ----------
    r : array_like, shape (3,)
        (x, y, z) position of trajectory.
    t : dummy_argument
        Dummy argument, necessary to pass function into 
        scipy.integrate.odeint
    p : array_like, shape (3,)
        Parameters (s, k, b) for the attractor.
        
    Returns
    -------
    output : ndarray, shape (3,)
        Time derivatives of Lorenz attractor.
        
    Notes
    -----
    .. Returns the right hand side of the system of differential equations describing
       the Lorenz attractor.
        x' = s * (y - x)
        y' = x * (k - z) - y
        z' = x * y - b * z
    """
    # Unpack variables and parameters.
    # r is a three-element array whose entries are assigned to x, y, and z.
    x, y, z = r
    # p is likewise a three-element array; its entries are assigned to s, k, and b.
    s, k, b = p

    return np.array([s * (y - x),
                     x * (k - z) - y,
                     x * y - b * z])

With this function in hand, we just have to pick our initial conditions and time points, run the numerical integration, and then plot the result.
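
Before integrating, it can be reassuring to check the right-hand side at a point where the answer is known.  For $k > 1$, the equations in the docstring above have a stationary point at $x = y = \sqrt{b(k-1)}$, $z = k - 1$, where all three time derivatives vanish.  The sketch below is not part of the original tutorial: `lorenz_rhs` is a hypothetical pure-Python stand-in for `lorenz_attractor`, written without NumPy so that it runs on its own.

```python
import math


def lorenz_rhs(r, p):
    """Pure-Python right hand side of the Lorenz system (illustrative helper)."""
    x, y, z = r
    s, k, b = p
    return (s * (y - x), x * (k - z) - y, x * y - b * z)


# Same parameter values as used in this tutorial
s, k, b = 10.0, 28.0, 8.0 / 3.0

# Stationary point: x = y = sqrt(b * (k - 1)), z = k - 1
c = math.sqrt(b * (k - 1.0))
fixed_point = (c, c, k - 1.0)

# All three derivatives should be zero up to floating-point rounding
derivs = lorenz_rhs(fixed_point, (s, k, b))
is_stationary = all(abs(d) < 1e-9 for d in derivs)
print(is_stationary)
```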

In [None]:
# Parameters to use
p = np.array([10.0, 28.0, 8.0 / 3.0]) #This is the command that we use to define an array.

# Initial condition
r0 = np.array([0.1, 0.0, 0.0])

# Time points to sample
t = np.linspace(0.0, 80.0, 10000) #linspace produces an array of 10000 linearly spaced points between 0 and 80 (both endpoints included)

# Use scipy.integrate.odeint to integrate Lorenz attractor
r = scipy.integrate.odeint(lorenz_attractor, r0, t, args=(p,)) #This SciPy function numerically solves the system of equations, returning x, y, and z at each time point in t.

# Unpack results into x, y, z.
x, y, z = r.transpose() #This assigns the output of the solution to three variables: x, y, and z (easier to refer to in plotting!)

# Plot the result
plt.plot(x, z, '-', linewidth=0.5) #plt means that we are now calling on functions from the plotting library. This shows trajectories of the solution in x-z space. Ooh, pretty, huh?
plt.xlabel(r'$x(t)$', fontsize=18)
plt.ylabel(r'$z(t)$', fontsize=18)
plt.title(r'$x$-$z$ proj. of Lorenz attractor traj.')
plt.show()

### Best practices for code cells
Here is a summary of some general rules for composing and formatting your code cells.
1. Do not exceed the width of the code cell.
2. Keep your code cells short.  If you find yourself having one massive code cell, break it up.
3. Always properly comment your code.  Provide complete doc strings for any functions you define.
4. Do all of your imports in the first code cell at the top of the notebook.  Import one module per line.
5. For submitting problem sets, **always** display your graphics inline.  You can render the graphics as PNGs if your browser starts experiencing performance issues, but SVG is preferred.
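
As a small illustration of rules 3 and 4, here is how a short, self-contained cell might look.  The function `circle_area` is a made-up example, not something used elsewhere in this tutorial.

```python
# Imports come first, one module per line
import math


def circle_area(radius):
    """
    Compute the area of a circle.

    Parameters
    ----------
    radius : float
        Radius of the circle.

    Returns
    -------
    output : float
        Area of the circle.
    """
    return math.pi * radius**2


# Use the function and display the result
area = circle_area(2.0)
print(area)
```

In a real notebook the import would go in the first code cell rather than next to the function; it appears here only so that the example runs on its own.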