{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Analysis with Jupyter Notebooks\n",
"\n",
"# Tutorial 1 — Introducing Jupyter Notebooks\n",
"\n",
"Benjamin J. Morgan, University of Bath."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Contents\n",
"\n",
"- [Data Analysis with Jupyter Notebooks](#intro)\n",
" - [Why analyse data with computer code?](#why)\n",
"- [Getting Started with Jupyter Notebooks](#getting_started)\n",
"- [Running Code](#running_code)\n",
"- [Saving Notebooks](#saving_notebooks)\n",
"- [Comments and Markdown cells](#comments)\n",
"- [Using assert Statements for Interactive Feedback](#assert_statements)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Analysis with Jupyter Notebooks\n",
"\n",
"This is a Jupyter notebook. In some ways each document is like a physical notebook: it can be used for describing an experiment, recording data, and commenting on it. Unlike a physical notebook, a Jupyter notebook also allows you to run and easily share computer code. This combination makes Jupyter notebooks a very useful tool for analysing data collected from experiments. Unlike spreadsheets or combinations of separate data analysis codes, a Jupyter notebook allows you to collect in one place the descriptions and notes for individual experiments, links to the raw data collected, the computer code that performs any necessary data analysis, and the final figures generated with these data, ready for use in a report or published paper.\n",
"\n",
"## Why analyse data using code?\n",
"Using computer code allows you to analyse experimental data *programmatically*. All the steps for working with your data are carried out according to a *program*: a predefined series of instructions, like a recipe for a particular meal.\n",
"\n",
"Once a particular program has been written, it will always produce the same results with the same starting data. This makes it possible to “show your working”. Scientists presenting new results can share their original data, alongside the code that they used for all their analysis. This has a number of benefits. Other scientists can review the code, run it against the original data set, and check that any analysis has been done correctly. \n",
"\n",
"Finished code can also be used as a starting point for looking at a similar set of data. The original scientist might repeat their experiment to confirm their results, or another group might collect data under slightly different conditions, and want to compare the two cases. Often the steps described by the code are the same for small data sets and for large data sets. Once an analysis program exists, processing enormous data sets simply becomes a question of access to sufficiently powerful computers. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Getting Started with Jupyter Notebooks\n",
"\n",
"A Jupyter notebook consists of a series of **cells** that contain text. These cells are arranged vertically, top-to-bottom in the document. Any cell can be edited by clicking on it. A cell in **edit mode** is indicated by a green border. \n",
"\n",
"A cell with a blue border is in **command mode**. \n",
"\n",
"In command mode you are not able to type into a cell, but you can still edit the notebook (reordering cells, executing code, etc.). Commands for editing notebooks can be accessed from the menu at the top of the screen, and commonly used commands have keyboard shortcuts, which will be highlighted in the examples below. The full list of keyboard shortcuts can be found through **Help > Keyboard Shortcuts** in the menu, or by pressing **H**.\n",
"\n",
"To switch a cell from command mode to edit mode, press **Enter** or double-click the cell."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Running Code \n",
"\n",
"The Jupyter notebook is primarily useful for writing and running code. A large number of different computer languages can be used in Jupyter notebooks. In these examples, we will be using Python (specifically Python 3). Python is increasingly used across a large number of scientific disciplines for data management and analysis. The large scientific community means that very good resources already exist for data processing, such as the Jupyter project, and specific prewritten tools for manipulating and plotting data.\n",
"\n",
"This course will not go into detail about how to write your own Python code. Instead, as much as possible we are going to focus on learning how to use readily available packages for data analysis. There are plenty of resources for learning Python for more traditional programming tasks, such as the DataCamp [Learn Python for Data Science](https://www.datacamp.com/courses/intro-to-python-for-data-science) tutorial.\n",
"\n",
"The default cell type in a Jupyter notebook is a **code** cell. If you open a new notebook, it will contain a single empty code cell. You can always create more cells from the menu with **Insert > Insert Cell Above** (or by pressing **A** in command mode) or **Insert > Insert Cell Below** (or by pressing **B** in command mode).\n",
"\n",
"Any code typed into a code cell can be run (or “**executed**”) by pressing `Shift-Enter` or by clicking the run button in the notebook toolbar.\n",
"\n",
"This practical consists of an interactive tutorial (this notebook), followed by a series of exercises. Some code cells in the tutorial will already have code in them, which you can **run** by selecting the cell and pressing `Shift-Enter` or clicking the toolbar button:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"2+3 # run this cell…"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You should now see `Out[ ]:` with the result of running this code printed next to it, and the focus will have moved automatically to the next cell. You can always re-select a cell to run it again.\n",
"\n",
"Most of the code examples will be presented like this:\n",
"\n",
">```python\n",
"print(\"hello\")\n",
"```\n",
"\n",
"with an empty code cell underneath. These examples are for you to type into the empty code cell and then run. Do not copy and paste these. You will learn the concepts faster and become comfortable with writing your own code if you type each piece of code out.\n",
"\n",
"Start with this example:\n",
"\n",
">```python\n",
"print(\"hello\")\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print( \"hello\" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Your output should be\n",
"\n",
"```\n",
"hello\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There will also be small exercises that ask you to write your own piece of code from scratch, or to modify an example that is not finished yet: it might contain an error (often called a **bug**), or just not do exactly what we would like. These will be indicated with green boxes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make a copy of a notebook (for example, to save an old version while you work on a new idea, or to duplicate a notebook to a different directory), you can either select **File > Make a Copy**, which duplicates the current notebook in the same directory, or select **File > Download as > Notebook (.ipynb)**, which downloads a copy to your **Downloads** folder."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explain Your Code: Comments and Markdown Cells\n",
"One of the advantages of using *code* for numerical calculations and data analysis is that you end up with a record of exactly what you have done. You, or anyone else, can read the code, to understand how you reached your answer. You can think of this as “showing your working”, and it can be very helpful if you want to solve a *similar* problem in the future.\n",
"\n",
"Often, reading only the code can get quite cryptic (it is called “code”, after all), which makes it difficult to understand what is happening. To explain what a particular piece of code does, or to explain *why* a piece of code is being used, you can include **comments**.\n",
"\n",
" \n",
"```python\n",
"# this is a comment\n",
"```\n",
"\n",
"Any text that appears after a `#` symbol on the same line is part of the comment, and is ignored when the code is run.\n",
"\n",
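"As a minimal sketch of how comments might look in practice (the variable names and values below are made up for illustration and are not taken from a real experiment):\n",
"\n",
"```python\n",
"# hypothetical example values: a mass in grams and a volume in cm^3\n",
"mass = 12.5\n",
"volume = 5.0\n",
"\n",
"density = mass / volume  # density in grams per cm^3\n",
"print(density)\n",
"```\n",
"\n",
"Running this code gives the same result with or without the comments. They are there only to help a human reader.\n",
"\n",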
"Jupyter notebooks offer a second way to describe what you are doing: **Markdown cells**. A code cell can be converted to a Markdown cell by selecting **Cell > Cell Type > Markdown** from the menu.\n",
"\n",
"A Markdown cell can be used to type plain text, which is displayed when the cell is run. Markdown cells are useful for documenting a notebook, particularly when you want to write something more detailed than a short comment. Markdown cells can also contain basic text formatting, links, images, and equations (more information is [here](http://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/Working%20With%20Markdown%20Cells.html))."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"