{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to Python\n",
    "# Class 1: Functions and variables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Objectives\n",
    "\n",
    "Welcome to Introduction to Python from fredhutch.io! \n",
    "This course introduces you Python by working through common tasks in data science: \n",
    "importing, manipulating, and visualizing data.\n",
    "\n",
    "Python is a computer programming language widely used for a variety of applications. \n",
    "For more information about Python and ways to use it at Fred Hutch, please see the \n",
    "[Python](https://sciwiki.fredhutch.org/scicomputing/software_python/) \n",
    "entry for the Fred Hutch Biomedical Data Science Wiki.\n",
    "\n",
    "Before proceeding with these training materials, \n",
    "please ensure you have installed Python and Jupyter notebooks via Anaconda as described \n",
    "[here](http://www.fredhutch.io/software/#python-jupyter-notebooks). \n",
    "Please note you'll also need to install `plotnine` separately for the last module.\n",
    "\n",
    "By the end of this first module, you should be able to:\n",
    "\n",
    "- work in a Jupyter notebook to run and record Python code\n",
    "- understand basic Python syntax to use functions and assign variables\n",
    "- create basic data structures (sequences and dictionaries)\n",
    "- define functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A brief orientation to Python and Jupyter notebooks\n",
    "\n",
    "Python is a commonly used programming language among researchers,\n",
    "and has a large community and set of tools available to support its use.\n",
    "As a result, there are many different ways to interact with Python,\n",
    "the choice of which depends on your specific need for coding.\n",
    "In this class, we'll be using [Jupyter notebooks](https://jupyter.readthedocs.io/en/latest/) \n",
    "to write, run, and maintain a record of our work.\n",
    "\n",
    "A Jupyter notebook is an interface operated in a web browser \n",
    "that allows inclusion of code, output (including graphics) and explanatory text\n",
    "all in the same document. \n",
    "In fact, these lesson materials are written in a Jupyter notebook.\n",
    "Jupyter notebooks can also be used as a method of communicating research methods,\n",
    "such as [this notebook](https://github.com/rasilab/machkovech_2018/blob/master/scripts/NA43_competition.ipynb) \n",
    "associated with a published manuscript from Rasi Subramaniam's lab at Fred Hutch."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating a new Jupyter notebook\n",
    "\n",
    "You can access Jupyter notebooks through Anaconda,\n",
    "which is the software you used to install Python as well.\n",
    "Anaconda is a version of conda, a package manager that helps you install and update software.\n",
    "\n",
    "Open the Anaconda Navigator software on your computer, \n",
    "then click on Jupyter Notebook \n",
    "(note that this is different than Jupyter Lab!).\n",
    "\n",
    "You'll see your default web browser open a new tab.\n",
    "On a Mac, you may also see a Terminal window open;\n",
    "this window needs to stay open for Python to run,\n",
    "but we recommend you minimize it so it stays out of the way.\n",
    "\n",
    "In the Jupyter notebook window in your web browser,\n",
    "note the URL at the top: \n",
    "it should start with something like `http://localhost:8888/tree`.\n",
    "In the browser window, you should see folders like \"Documents\" and \"Desktop.\"\n",
    "This window represents a different way to interact with the files on your computer.\n",
    "Although you're viewing these files in a web browser,\n",
    "you're not necessarily working with files online.\n",
    "This means that you can securely use Jupyter notebooks to work with sensitive data, \n",
    "as long as those data are stored in a secure location.\n",
    "\n",
    "We're going to create a project directory for the purposes of this course. \n",
    "You can think of a project as a discrete unit of work, \n",
    "such as a chapter of a thesis/dissertation, analysis for a manuscript, or a monthly report. \n",
    "We recommend organizing your code, data, and other associated files as projects, \n",
    "which allows you to keep all parts of an analysis together for easier access.\n",
    "\n",
    "Create a new project for this class using the Jupyter notebook file browser:\n",
    "\n",
    "- Navigate to the location in your computer where you'd like to save files for this class (we recommend Desktop or Documents).\n",
    "- Click \"New\" in the upper right hand corner of the screen, then \"Folder\". This will create a new folder named \"Untitled Folder\".\n",
    "- Click the box next to \"Untitled Folder\", then select \"Rename\" near the upper left corner of the screen. Name the new directory \"intro_python\"; we'll now refer to this as your project directory.\n",
    "- Click on the new folder to view its contents (it should be empty).\n",
    "- Click \"New\" in upper right hand corner of the window, then select \"Python3\". This creates a new ipython notebook file and opens it in a new tab. Click on the title of the notebook to rename the file \"class1\". If you click on the browser tab for the file browser, you can also rename as for the folder earlier. You'll note this filename has a suffix of `ipynb`.\n",
    "\n",
    "> Jupyter notebooks have a handy \"auto-save\" feature so you don't have to manually save constantly.\n",
    "You may see messages appearing at the top of your notebook referencing \"checkpoints,\" \n",
    "which means the auto-save feature is functioning."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Executing code in a Jupyter notebook\n",
    "\n",
    "Now that we have a new project and an empty notebook set up, we can begin orienting ourselves to how notebooks work to hold our text, code, and output.\n",
    "\n",
    "The pale gray box you see at the top of your screen with `In [ ]` to the left is a cell.\n",
    "By default, each cell is created as a code cell.\n",
    "Because our notebook is Python 3, \n",
    "our code cells are able to execute Python code.\n",
    "We can test this out by entering `3 + 4` into the cell,\n",
    "then holding down the Shift key and pressing Enter/Return.\n",
    "This executes (runs) the code in the cell and prints the output below,\n",
    "prefaced by `Out[ ]`.\n",
    "Executing the code this way also creates a new cell below the one you executed.\n",
    "If a new cell doesn't appear,\n",
    "you can add one using the `+` button in the toolbar at the top of the screen. \n",
    "\n",
    "Cells can also be used to enter text using [Markdown formatting](https://www.markdownguide.org/basic-syntax/).\n",
    "Change the type of your new cell by going to the dropdown box in the tool bar at the top of the window and changing \"Code\" to \"Markdown.\" \n",
    "Add a subtitle in this cell by entering `## Operators, functions, and data types`,\n",
    "then using Shift + Enter to execute the cell,\n",
    "which formats the text as large and bold.\n",
    "The link above includes more information about Markdown formatting,\n",
    "but we'll generally use only plain text and subtitles for this course.\n",
    "\n",
    "Jupyter notebooks include many other features, \n",
    "which you can explore in the toolbar and dropdown menus at the top of the screen. \n",
    "Additional keyboard shortcuts are also available under \"Help -> Keyboard Shortcuts\"."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operators, functions, and data types"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have a notebook created, \n",
    "as well as a basic understanding of how to write and execute code, \n",
    "we can begin learning more about Python syntax,\n",
    "which are rules that dictate how combinations of words and symbols are interpreted in a language."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# mathematical operator\n",
    "4 + 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first line in the example above is a code comment. \n",
    "It is not interpreted by Python, but is a human-readable explanation of the code that follows. \n",
    "In Python, anything to the right of one or more `#` symbols represents a comment.\n",
    "\n",
 Syntax differs">
    "> Syntax differs among languages.\n",
    "> So far in this lesson, \n",
    "> we've learned that Markdown interprets `#` as a way of formatting titles and subtitles, \n",
    "> while in Python the same symbol represents a code comment.\n",
    "\n",
    "As we proceed through these lessons, \n",
    "we recommend trying to type the example code so it appears as similar as possible to what is presented here.\n",
    "From the example above,\n",
    "you may now be wondering if the spaces on either side of the `+` are required.\n",
    "We can test this for ourselves:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "4+5 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The code above indicates that the spaces are not required, \n",
    "but are convention. \n",
    "Code convention and style doesn’t make or break the ability of your code to run, \n",
    "but it does affect whether other people can easily understand your code. \n",
    "We'll try to model appropriate code convention for this course,\n",
    "and you can read more about Python formatting recommendations [here](https://www.python.org/dev/peps/pep-0008/#id26).\n",
    "\n",
    "We can also use logical operators to evaluate whether a given statement is true or false:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "3 > 4 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In addition to logical data,\n",
    "python possesses a few other built-in data types:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# data types in python\n",
    "number = 42 \n",
    "pi_value = 3.1415 \n",
    "text = \"Fred Hutch\" "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the code above, \n",
    "we have assigned three new variables.\n",
    "Like in math, \n",
    "a variable is a word used to represent data,\n",
    "which can be a single value or more complex collections. \n",
    "\n",
    "We can use the variables we just created to explore other built-in data types using functions.\n",
    "Functions are pre-defined sets of code that allow you to repeat particular actions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "int"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# use function to identify data type\n",
    "type(number)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the code above, `type` is the function and `number` is the variable we assigned earlier.\n",
    "This code is asking what type of data `number` represents, \n",
    "and the output, `int`, stands for integer (whole number data)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "float"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(pi_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`float` data represents numbers with decimal points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "str"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`str` represents character data, also referred to as strings.\n",
    "These data include anything that can be included inside quotation marks,\n",
    "including letters, numbers, punctuation, and even emoji.\n",
    "\n",
    "We can also use functions to convert data among these types:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# convert float to integer\n",
    "int(pi_value) # decimals removed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When we assigned (created) this variable, \n",
    "the two decimal places instructed Python to interpret it as a float value.\n",
    "By using the function `int`, \n",
    "we can convert the value to integer.\n",
    "\n",
    "If we again inspect the type of `pi_value`, though:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "float"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(pi_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see the data type is still float. \n",
    "This is because we haven't altered the data type of the original variable, \n",
    "only the data type of the output printed to the screen.\n",
    "\n",
    "We can change the data type of our original variable by reassigning back to the same name:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "int"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# reassign variable\n",
    "pi_value = int(pi_value)\n",
    "type(pi_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we see the type of the variable has changed to integer.\n",
    "\n",
    "Similarly, we can convert integers to float:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "42.0"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# convert integer to float\n",
    "float(number)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Although the numerical value hasn't changed, the presence of the decimal in the output indicates it is a float.\n",
    "\n",
    "Notebooks allow you a handy shortcut to view the contents of a variable by executing only the variable name:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Fred Hutch'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This approach will work well enough for us in this class,\n",
    "since we'll be using notebooks the whole time.\n",
    "If you are using code written by other people, \n",
    "or begin writing code in scripts (outside notebooks),\n",
    "you'll often see the `print` function used:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fred Hutch\n"
     ]
    }
   ],
   "source": [
    "# print output to screen\n",
    "print(text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data output are the same for each of the two previous code cells,\n",
    "though they look slightly different.\n",
    "It's useful to note that the notebook will only print the result of the last command executed in a code cell,\n",
    "so if there are is other output in the cell you'd like to see, \n",
    "you may need to use the `print` function then as well.\n",
    "\n",
    "If you would like to find help on a function, \n",
    "there's a function for that:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on built-in function print in module builtins:\n",
      "\n",
      "print(...)\n",
      "    print(value, ..., sep=' ', end='\\n', file=sys.stdout, flush=False)\n",
      "    \n",
      "    Prints the values to a stream, or to sys.stdout by default.\n",
      "    Optional keyword arguments:\n",
      "    file:  a file-like object (stream); defaults to the current sys.stdout.\n",
      "    sep:   string inserted between values, default a space.\n",
      "    end:   string appended after the last value, default a newline.\n",
      "    flush: whether to forcibly flush the stream.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# find help on a function\n",
    "help(print)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The help documentation may seem difficult to decipher right now, \n",
    "but includes following relevant information:\n",
    "\n",
    "- `Help on built-in function print in module builtins:` is a title for the information below\n",
    "-  The next two lines tell us basic information about how the function works\n",
    "- `Prints the values to a stream, or to sys.stdout by default.` is the explanation for the function\n",
    "- the remaining lines offer different options (arguments) for running this function\n",
    "\n",
    "`print` and `type` are built in functions that come with your installation of Python.\n",
    "There are many other functions available that allow you to perform common tasks while programming.\n",
    "You can also write your own functions \n",
    "(which we'll cover at the end of these materials),\n",
    "as well as load additional functions contained in packages written by other people \n",
    "(which we'll cover in our next class)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sequences"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So far we've been working with variables containing a single value.\n",
    "It's often the case that we would like to use a variable to reference collections of values.\n",
    "Sequences are a data structure which hold collections of elements.\n",
    "Lists are one type of sequence,\n",
    "and are defined in Python using square brackets:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# assign a list to a variable\n",
    "numbers = [1, 2, 3]\n",
    "numbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we've created a list,\n",
    "we can access different portions of it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# access first element in list\n",
    "numbers[0] "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number in the square brackets above indicates the position, or index, of the element we are accessing.\n",
    "Python begins indexing (counting) at 0,\n",
    "so the index positions in `numbers` are `0`, `1`, and `2`.\n",
    "\n",
    "If you need to find information about your variable,\n",
    "you can run `?numbers` in a code cell and a help window will pop up containing information about things like the variable's type and length.\n",
    "\n",
    "Similar (but more extensive) information appears in an output cell if you run `help(numbers)`\n",
    "this additional detail may be useful to you as your programming skills develop. \n",
    "\n",
    "We can modify lists after they are created:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "# add element (number) to end of list\n",
    "numbers.append(4) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that nothing is printed as output unless we specifically ask for it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 2, 3, 4]\n"
     ]
    }
   ],
   "source": [
    "print(numbers)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`append()` is a method, or function associated with a particular variable. \n",
    "In this case, it is a method associated with lists that allows us to directly modify it.\n",
    "You can learn more about this method by typing `?numbers.append` in a new code cell,\n",
    "which presents a help window with the following information:\n",
    "\n",
    "```\n",
    "Docstring: L.append(object) -> None -- append object to end\n",
    "Type:      builtin_function_or_method\n",
    "```\n",
    "\n",
    "You can view other methods available for lists by typing `?numbers.` \n",
    "in a new code cell and hitting the `tab` key.\n",
    "This provides a drop-down list that shows all methods available for the variable.\n",
    "\n",
    "Although we've worked so far with numerical data (integers and floats), \n",
    "we can also create lists using string data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['lung', 'breast', 'prostate']"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# lists of string data\n",
    "organs = [\"lung\", \"breast\", \"prostate\"]\n",
    "organs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> #### Challenge-numbers\n",
    "What happens when you execute `numbers[1] = 5`?\n",
    "\n",
    "> #### Challenge-add\n",
    "What online search term could you use to determine a method for adding multiple values to a list?\n",
    "\n",
    "> #### Challenge-remove\n",
    "How do you remove items from a list?\n",
    "\n",
    "Now that we have a basic understanding of lists,\n",
    "we can take a look at another type of sequence: tuples.\n",
    "A tuple is a list with an ordered sequence of elements,\n",
    "and they are created using parentheses:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "# assign a tuple variable\n",
    "a_tuple = (1, 2, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> #### Challenge-tuple \n",
    "What happens when you execute `a_tuple[2] = 5`?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output you see when running the code above is called Traceback,\n",
    "or a multi-line block of information about an error. \n",
    "It includes information about what went wrong, \n",
    "and where in the code it happened (this is useful when dealing with multi-line code chunks!).\n",
    "\n",
    "> If you have code in your notebook that will cause an error to occur,\n",
    "we recommend commenting out the code if you would like to retain the information,\n",
    "but not continue executing it with the rest of your functional code.\n",
    "\n",
    "Lists and tuples differ in their mutability, \n",
    "or ability to be changed once created:\n",
    "lists are sequences that can be modified,\n",
    "tuples are sequences that cannot be modified.\n",
    "Python recognizes the difference between these data structures based on the symbols used to create them.\n",
    "\n",
    "We've worked with sequences so far that contain a single data type,\n",
    "but sequences can contain more than one data type:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('lung', 200, 'chromosome 1')"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create tuple containing multiple data types\n",
    "mix_tuple = (\"lung\", 200, \"chromosome 1\") \n",
    "mix_tuple"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also create lists of mixed data types, \n",
    "though it's more common they represent a single data type.\n",
    "\n",
    "We've been printing the contents of lists so far to the screen,\n",
    "but we often would like to access each element in a structure once at a time.\n",
    "We can accomplish this using a programming structure called a for loop.\n",
    "For loops exist in many programming languages, \n",
    "and can be used to repeat actions across a set of things.\n",
    "Here, we'll access elements in `mixed_tuple` one at a time:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "lung\n",
      "200\n",
      "chromosome 1\n"
     ]
    }
   ],
   "source": [
    "# for loop to access elements in tuple one at a time\n",
    "for num in mix_tuple:\n",
    "    print(num)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the code above, `num` represents a variable used inside the for loop.\n",
    "There is a predictable format for the syntax of a for loop,\n",
    "Loops require specific syntax, including `for`, `in`, and `:` in the first line;\n",
    "we'll work through some more examples later, \n",
    "and you can read about for loop structures in Python [here](https://wiki.python.org/moin/ForLoop)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dictionaries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have a basic understanding of lists and tuples,\n",
    "we can explore another data structure: dictionaries.\n",
    "A dictionary holds elements that are paired (key and value).\n",
    "We can create an example containing two such pairs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create a dictionary\n",
    "translation = {\"one\": 1, \"two\": 2}\n",
    "translation[\"one\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the code above, the strings (e.g., \"one\") represent the keys, \n",
    "and the numbers (e.g., 2) are the values,\n",
    "so a single pair would be \"one\" and 1.\n",
    "This may seem like an odd way to store data, \n",
    "but it can be useful if you need to reference particular matched values repeatedly\n",
    "(for example, when [reverse-complementing nucleotide sequences](https://stackoverflow.com/questions/25188968/reverse-complement-of-dna-strand-using-python)).\n",
    "\n",
    "It's useful to note that the values can include lists:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'yes': [1, 2, 3]}"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create dictionary with a list as the value\n",
    "list_value = {\"yes\": [1, 2, 3]}\n",
    "list_value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, keys cannot be a list. \n",
    "You can try this by attempting to execute `list_key = {[1, 2, 3]: \"nope\"}`.\n",
    "You'll see an error indicating that a list is \"unhashable.\"\n",
    "\n",
    "While our `translation` variable represents keys that are strings and values that are integers,\n",
    "we can create a dictionary with those data types reversed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1: 'one', 2: 'two'}"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create dictionary with integer as key and string as value\n",
    "rev = {1: \"one\", 2: \"two\"} \n",
    "rev"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use this variable to demonstrate an approach to add a new pair to the dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1: 'one', 2: 'two', 3: 'three'}"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# add items to dictionaries by assigning new value to key\n",
    "rev[3] = \"three\"\n",
    "rev"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With can now combine this understanding of dictionaries with our earlier exploration of for loops,\n",
    "and examine two different approaches for printing the key/value pairs in a dictionary.\n",
    "\n",
    "The first way accesses each element (pair) using the method `dict.keys`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 -> one\n",
      "2 -> two\n",
      "3 -> three\n"
     ]
    }
   ],
   "source": [
    "for key in rev.keys():\n",
    "    print(key, \"->\", rev[key])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, `rev.keys()` is used to list all keys in the dictionary,\n",
    "with the value printed from accessing each of the respective keys.\n",
    "\n",
    "The second way accesses each pair using the method `dict.items`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 -> one\n",
      "2 -> two\n",
      "3 -> three\n"
     ]
    }
   ],
   "source": [
    "# access each element using dict.items \n",
    "for key, value in rev.items():\n",
    "    print(key, \"->\", value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because `rev.items()` accesses both the key and value of the pair \n",
    "(you can confirm this by printing `rev.items()`),\n",
    "you can print each directly from the respective variable internal to the for loop.\n",
    "\n",
    "> #### Challenge-applesauce\n",
 - print only">
    "> - Print only the values of the `rev` dictionary to the screen\n",
    "> - Reassign the second value (in the key/value pair) so that it no longer reads “two” but instead “apple-sauce”\n",
    "> - Print the values of `rev` to the screen again to see if the value has changed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this last section, we'll briefly overview how to write our own custom functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define a chunk of code as function\n",
    "def add_function(a, b):\n",
    "    result = a + b\n",
    "    return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first line of code defines the function with the name `add_function()` \n",
    "that accepts two items as input (`a` and `b`).\n",
    "The second line performs the action,\n",
    "and the last line determines what is output.\n",
    "\n",
    "We can test the function by evaluating its use on data with an easily predictable outcome:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "42\n"
     ]
    }
   ],
   "source": [
    "z = add_function(20, 22)\n",
    "print(z)"
   ]
  },
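  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a further sketch (the inputs and the names `total` and `joined` are arbitrary, chosen only for illustration),\n",
    "the value a function returns can be used like any other value,\n",
    "and the same function works on any inputs that support `+`, including strings:\n",
    "\n",
    "```python\n",
    "# redefine the function from above so this example runs on its own\n",
    "def add_function(a, b):\n",
    "    result = a + b\n",
    "    return result\n",
    "\n",
    "# the returned value can be used directly in another expression\n",
    "total = add_function(1, 2) + add_function(3, 4)\n",
    "print(total)\n",
    "\n",
    "# with strings, + joins the two inputs together\n",
    "joined = add_function(\"apple\", \"sauce\")\n",
    "print(joined)\n",
    "```\n",
    "\n",
    "Note that `result` only exists inside the function;\n",
    "the output is available outside only through the variable you assign it to."
   ]
  },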
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
 #### Challenge-function">
    "> #### Challenge-function \n",
    "> Define a new function called `subtract_function` that subtracts `d` from `c`, and test it on numbers of your choice."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrapping up\n",
    "\n",
    "This first section introduced you to Python syntax and Jupyter notebooks.\n",
    "We've covered general data types, a few data structures, and two basic programming structures \n",
    "(for loops and defining functions). \n",
    "We won't be relying heavily on these data and programming structures for the rest of the course,\n",
    "but you should now have a good idea of some basic functionality of Python.\n",
    "\n",
    "In the next session,\n",
    "we'll begin working with a large clinical cancer dataset,\n",
    "similar to other spreadsheet-style data you're likely to encounter in your own work.\n",
    "\n",
    "**When you are done working with Python in Jupyter notebooks,**\n",
    "you should ensure the auto-save feature has captured your work\n",
    "(either by checking the time stamp on in your Jupyter file browser, \n",
    "or by using the manual \"Save\" option in your notebook).\n",
    "If you look in the Jupyter file browser window,\n",
    "an active (running) notebook will have a green book icon with \"Running\" also listed.\n",
    "Closing the browser windows for Jupyter notebook and the file browser \n",
    "(and even shutting down Anaconda Navigator) \n",
    "will not shut down the Python processes running in the background\n",
    "(these are sometimes referred to as the Python *kernel*).\n",
    "To shut everything down,\n",
    "go to \"File -> Close and Halt\" for the notebook. \n",
    "When you revisit the Jupyter file browser window, \n",
    "the book icon next to the notebook should be gray.\n",
    "Now, closing the browser windows and Anaconda Navigator will complete the shut down process.\n",
    "For more information on how to shut down Jupyter Notebooks,\n",
    "please see the [official documentation](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/execute.html#close-a-notebook-kernel-shut-down).\n",
    "\n",
 If you are running">
    "> If you are running Jupyter notebooks on a Mac,\n",
    "> you can stop the kernel by closing the Terminal window.\n",
    "> Similarly, you can launch Jupyter notebooks *without* Anaconda Navigator by opening a Terminal window, \n",
    "> typing `jupyter notebook`,\n",
    "> and hitting enter. \n",
    "> Your Jupyter windows will launch in your browser just as they do through Anaconda Navigator.\n",
    "\n",
    "**If you need to reopen your project after closing Jupyter notebooks,**\n",
    "you'll need to reopen Anaconda Navigator and re-launch Jupyter notebooks.\n",
    "If you completely closed down your notebooks in your last session\n",
    "(or restarted your computer),\n",
    "your file browser will show each notebook with a gray book icon.\n",
    "Re-launch the notebook by clicking on the file name to work with the code in that notebook again.\n",
    "Although both your code and output will appear in the browser window,\n",
    "Python won't be able to \"remember\" any of this work.\n",
    "You'll need to re-execute all cells starting from the top of the notebook to be able to continue working in the same document.\n",
    "Also remember to only have one window open for each Jupyter notebook at any point in time;\n",
    "multiple open windows for the same notebook (e.g., \"class1.ipynb\")\n",
    "will also result in errors.\n",
    "\n",
    "## Extra exercises\n",
    "\n",
    "Answers to all challenge exercises are available [here](https://fredhutchio.github.io/python_intro/solutions/). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}