{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial: Jupyter notebooks" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "__author__ = \"Lucy Li\"\n", "__version__ = \"CS224u, Stanford, Spring 2020\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "\n", "1. [Starting up](#Starting-up)\n", "1. [Cells](#Cells)\n", " 1. [Code](#Code)\n", " 1. [Markdown](#Markdown)\n", " 1. [Headers](#Headers)\n", " 1. [Displaying code](#Displaying-code)\n", " 1. [LaTeX](#LaTeX)\n", " 1. [Quotations](#Quotations)\n", " 1. [Lists](#Lists)\n", " 1. [Images](#Images)\n", " 1. [Dividers](#Dividers)\n", "1. [Kernels](#Kernels)\n", "1. [Shortcuts](#Shortcuts)\n", "1. [Shutdown](#Shutdown)\n", "1. [Extras](#Extras)\n", " 1. [Checkpoints](#Checkpoints)\n", " 1. [NbViewer](#NbViewer)\n", "1. [More resources](#More-resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Starting up\n", "\n", "This tutorial assumes that you have followed the [course setup](https://nbviewer.jupyter.org/github/cgpotts/cs224u/blob/master/setup.ipynb) instructions. This means Jupyter is installed using Conda. \n", "\n", "1. Open up Terminal (Mac/Linux) or Command Prompt (Windows). \n", "2. Enter a directory that you'd like to have as your `Home`, e.g., where your cloned `cs224u` Github repo resides. \n", "3. Type `jupyter notebook` and enter. After a few moments, a new browser window should open, listing the contents of your `Home` directory. \n", " - Note that on your screen, you'll see something like `[I 17:23:47.479 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/`. This tells you where your notebook is located. So if you were to accidentally close the window, you can open it again while your server is running. For this example, navigating to `http://localhost:8888/` on your favorite web browser should open it up again. \n", " - You may also specify a port number, e.g. 
`jupyter notebook --port 5656`. In this case, `http://localhost:5656/` is where your directory listing is served. \n", "4. Click on a notebook with the `.ipynb` extension to open it. If you want to create a new notebook, in the top right corner, click on `New` and under `Notebooks`, click on `Python`. If you have multiple environments, you should choose the one you want, e.g. `Python [nlu]`. \n", " - You can rename your notebook by clicking on its name (originally \"Untitled\") at the top of the notebook and modifying it. \n", " - Files with the `.ipynb` extension are stored as JSON, so they are much harder to read and edit if you open them in vim, emacs, or a code editor. \n", "\n", "Jupyter Notebooks allow for **interactive computing**. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cells\n", "\n", "Cells help you organize your work into manageable chunks. \n", "\n", "The top of your notebook contains a row of buttons. If you hover over them, the tooltips explain what each one is for: saving, inserting a new cell, cut/copy/paste cells, moving cells up/down, running/stopping a cell, choosing cell types, etc. Under Edit, Insert, and Cell in the menu bar, there are more cell-related options. \n", "\n", "Notice how the bar on the left of the cell changes color depending on whether you're in edit mode or command mode. This is useful for knowing when certain keyboard shortcuts apply (discussed later). \n", "\n", "There are three main types of cells: **code**, **markdown**, and raw. \n", "\n", "Raw cells are less common than the other two, and you don't need to understand them to get going for this course. If you put anything in this type of cell, you can't run it. They are used for situations where you might want to convert your notebook to HTML or LaTeX using the `nbconvert` tool or File -> Download as a format that isn't `.ipynb`. Read more about raw cells [here](https://nbsphinx.readthedocs.io/en/0.4.2/raw-cells.html) if you're curious. 
\n", "\n", "### Code\n", "\n", "Use the following code cells to explore various operations. \n", "\n", "Typically it's good practice to put import statements in the first cell or at least in their own cell. \n", "\n", "The square brackets next to the cell indicate the order in which you run cells. If there is an asterisk, it means the cell is currently running. \n", "\n", "The output of a cell is usually any print statements in the cell and the value of the last line in the cell. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"cats\")\n", "# run this cell and notice how both strings appear as outputs\n", "\"cheese\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# cut/copy and paste this cell\n", "# move this cell up and down\n", "# run this cell\n", "# toggle the output\n", "# toggle scrolling to make long output smaller\n", "# clear the output\n", "for i in range(50): \n", " print(\"cats\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# run this cell and stop before it finishes\n", "# stop acts like a KeyboardInterrupt\n", "for i in range(50): \n", " time.sleep(1) # make loop run slowly\n", " print(\"cats\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# running this cell leads to no output\n", "def function1(): \n", " print(\"dogs\")\n", "\n", "# put cursor in front of this comment and split and merge this cell.\n", "def function2(): \n", " print(\"cheese\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "function1()\n", "function2()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One difference 
between coding a Python script and a notebook is that a notebook lets you run code \"out of order\". This means you should be careful about variable reuse. It is good practice to arrange cells in the order in which you expect someone to run them, and to organize code in ways that prevent problems from happening. \n", "\n", "Clearing the output doesn't remove the old variable value. In the example below, we need to rerun cell A to start with a new `a`. If we don't keep track of how many times we've run cell B or cell C, we might encounter unexpected bugs. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Cell A\n", "a = []" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Cell B\n", "# try running this cell multiple times to add more pineapple\n", "a.append('pineapple')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Cell C\n", "# try running this cell multiple times to add more cake\n", "a.append('cake')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# depending on the number of times you ran \n", "# cells B and C, the output of this cell will \n", "# be different.\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even deleting cell D's code after running it doesn't remove list `b` from this notebook. This means if you are modifying code, whatever variables your old code defined may still remain in the background of your notebook. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Cell D\n", "# run this cell, delete/erase it, and run the empty cell\n", "b = ['apple pie']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# b still exists after cell D is gone\n", "b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Restart the kernel (Kernel -> Restart & Clear Output) to start anew. To check that things run okay in the intended order, restart and run everything (Kernel -> Restart & Run All). This is especially good to do before sharing your notebook with someone else. \n", "\n", "Jupyter notebooks are handy for telling stories using your code. You can view Pandas DataFrames and plots directly under each code cell. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# dataframe example\n", "d = {'ingredient': ['flour', 'sugar'], '# of cups': [3, 4], 'purchase date': ['April 1', 'April 4']}\n", "df = pd.DataFrame(data=d)\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# plot example\n", "plt.title(\"pineapple locations\")\n", "plt.ylabel('latitude')\n", "plt.xlabel('longitude')\n", "_ = plt.scatter(np.random.randn(5), np.random.randn(5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Markdown\n", "\n", "The other type of cell is Markdown, which allows you to write blocks of text in your notebook. Double-click on any Markdown cell to view/edit it. Don't worry if you don't remember all of these things right away. You'll write more code than Markdown essays for this course, but the following are handy things to be aware of. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Headers\n", "\n", "You may notice that this cell's header is prefixed with `####`. The fewer hashtags, the larger the header. 
You can go up to six hashtags for the smallest level header. \n", "\n", "Here is a table. You can emphasize text using underscores or asterisks. You can also include links. \n", "\n", "| Markdown | Outcome |\n", "| ----------------------------- | ---------------------------- |\n", "| `_italics_ or *italics*` | _italics_ or *italics* |\n", "| `__bold__ or **bold**` | __bold__ or **bold** |\n", "| `[link](http://web.stanford.edu/class/cs224u/)` | [link](http://web.stanford.edu/class/cs224u/) |\n", "| `[jump to Cells section](#Cells)` | [jump to Cells section](#Cells) |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Displaying code\n", "\n", "Try removing/adding the `python` in the code formatting below to toggle code coloring. \n", "\n", "```python\n", "if text == code: \n", " print(\"You can write code between a pair of triple backquotes, e.g. ```long text``` or `short text`\")\n", "```\n", "\n", "#### LaTeX\n", "\n", "LaTeX also works: \n", "$y = \\int_0^1 2x dx$\n", "$$y = x^2 + x^3$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Quotations\n", "\n", "> You can also format quotes by putting a \">\" in front of each line. \n", ">\n", "> You can space your lines apart with \">\" followed by no text." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are three different ways to write a bullet list (asterisk, dash, plus): \n", "* sugar\n", "* tea\n", " * earl gray\n", " * english breakfast\n", "- cats\n", " - persian\n", "- dogs\n", "+ pineapple\n", "+ apple\n", " + granny smith\n", "\n", "Example of a numbered list: \n", "1. tokens\n", "2. vectors\n", "3. 
relations\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Images\n", "\n", "You can also insert images: \n", "\n", "`![alt-text](./fig/nli-rnn-chained.png \"Title\")`\n", "\n", "(Try removing the backquotes and look at what happens.)\n", "\n", "#### Dividers\n", "\n", "A line of dashes, e.g. `----------------`, becomes a divider. \n", "\n", "------------------" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Kernels\n", "\n", "A kernel executes code in a notebook. \n", "\n", "You may have multiple conda environments on your computer. You can change which environment your notebook is using by going to Kernel -> Change kernel. \n", "\n", "When you open a notebook, you may get a message that looks something like \"Kernel not found. I couldn't find a kernel matching ____. Please select a kernel.\" This just means you need to choose the version of Python or environment that you want to have for your notebook. \n", "\n", "If you have difficulty getting your conda environment to show up as a kernel, [this](https://stackoverflow.com/questions/39604271/conda-environments-not-showing-up-in-jupyter-notebook) may help.\n", "\n", "In our class we will be using IPython notebooks, which means the code cells run Python. \n", "\n", "Fun fact: there are also kernels for other languages, e.g., Julia. This means you can create notebooks in these other languages as well, if you have them on your computer. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Shortcuts\n", "\n", "Go to Help -> Keyboard Shortcuts to view the shortcuts you may use in Jupyter Notebook. 
\n", "\n", "Here are a few that I find useful on a regular basis (on Windows/Linux, use control in place of command): \n", "- **run** a cell, select below: shift + enter\n", "- **save** and checkpoint: command + S (just like other file types)\n", "- enter **edit** mode from command mode: press enter\n", "- enter **command** mode from edit mode: esc\n", "- **delete** a cell (command mode): select a cell and press D twice\n", "- **dedent** while editing: command + [\n", "- **indent** while editing: command + ]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# play around with this cell with shortcuts\n", "# delete this cell \n", "# Edit -> Undo Delete Cells\n", "for i in range(10): \n", " print(\"jelly beans\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Shutdown\n", "\n", "Notice that when you are done working and exit out of this notebook's window, the notebook icon in the home directory listing next to this notebook is green. This means your kernel is still running. If you want to shut it down, check the box next to your notebook in the directory and click \"Shutdown.\" \n", "\n", "To shut down the Jupyter Notebook app as a whole, use Control-C in Terminal to stop the server and shut down all kernels." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Extras" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "These are some extra things that aren't top priority to know but may be interesting. \n", "\n", "### Checkpoints\n", "\n", "When you create a notebook, a checkpoint file is also saved in a hidden directory called `.ipynb_checkpoints`. Every time you manually save the notebook, the checkpoint file updates. Jupyter autosaves your work on occasion, which updates only the `.ipynb` file and not the checkpoint. You can revert to the latest checkpoint using File -> Revert to Checkpoint. 
\n", "\n", "### NbViewer\n", "\n", "We use this in our class for viewing jupyter notebooks from our course website. It allows you to render notebooks on the Internet. Check it out [here](https://nbviewer.jupyter.org/). \n", "\n", "View -> **Cell toolbar**\n", "- **Edit Metadata**: Modify the metadata of a cell by editing its json representation. Example of metadata: whether cell output should be collapsed, whether it should be scrolled, deletability of cell, name, and tags. \n", "- **Slideshow**: For turning your notebook into a presentation. This means different cells fall under slide types, e.g. Notes, Skip, Slide. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More resources\n", "\n", "If you click on \"Help\" in the toolbar, there is a list of references for common Python tools, e.g. numpy, pandas. \n", "\n", "[IPython website](https://ipython.org/)\n", "\n", "[Markdown basics](https://daringfireball.net/projects/markdown/)\n", "\n", "[Jupyter Notebook Documentation](https://jupyter-notebook.readthedocs.io/en/stable/index.html)\n", "\n", "[Real Python Jupyter Tutorial](https://realpython.com/jupyter-notebook-introduction/)\n", "\n", "[Dataquest Jupyter Notebook Tutorial](https://www.dataquest.io/blog/jupyter-notebook-tutorial/)\n", "\n", "[Stack Overflow](https://stackoverflow.com/)" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }