{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Chapter 10 - Dictionaries\n",
    "*This notebook uses code snippets and explanation from [this course](https://github.com/kadarakos/python-course/blob/master/Chapter%205%20-%20Lists.ipynb)*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The last type of container we will introduce in this topic is **dictionaries**. Programming is mostly about solving real-world problems as efficiently as possible, but it is also important to write and organize code in a human-readable fashion. A dictionary offers a kind of abstraction that comes in handy often: it is a type of \"associative memory\" or key:value storage. It allows you to describe two pieces of data and their relationship. \n",
    "\n",
    "**At the end of this chapter, you will:**\n",
    "* understand the relevance of dictionaries\n",
    "* know how to create a dictionary\n",
    "* know how to add items to a dictionary\n",
    "* know how to inspect/extract items from a dictionary\n",
    "* know how to count with a dictionary\n",
    "* know how to create nested dictionaries\n",
    "\n",
    "**If you want to learn more about these topics, you might find the following links useful:**\n",
    "* [Python documentation](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you have **questions** about this chapter, please contact us **(cltl.python.course@gmail.com)**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Dictionaries\n",
    "Imagine that you are a teacher, and you've graded exams (everyone got high grades, of course). You would like to store this information so that you can *ask* the program for the grade of a particular student. After some thought, you first try to accomplish this using a list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = ['Frank', 8, 'Susan', 7, 'Guido', 10]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student = 'Frank'\n",
    "index_of_student = student_grades.index(student) # we use the index method (list.index)\n",
    "print('grade of', student, 'is', student_grades[index_of_student + 1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, you're not happy about the solution. Every time you request a grade, we need to \n",
    "first determine the position of the student in the list and then use that index + 1 to obtain the grade.  That's pretty inefficient. The take-home message here is that **lists are not really good if we want two pieces of information together**. Dictionaries for the rescue!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = {'Frank': 8, 'Susan': 7, 'Guido': 10}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades['Frank']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. How to create a dictionary\n",
    "Let's take another look at the **student_grades** dictionary. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = {'Frank': 8, 'Susan': 7, 'Guido': 10}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* a dictionary is surrounded by curly brackets, and the key/value pairs are separated by commas.\n",
    "* A dictionary consists of one or more **key:value pairs**. The key is the \"identifier\" or \"name\" that is used to describe the value.\n",
    "* the **keys** in a dictionary are unique\n",
    "* the syntax for a key/value pair is: KEY : VALUE\n",
    "* the keys (e.g. 'Frank') in a dictionary have to be **immutable**\n",
    "* the values (e.g., 8) in a dictionary can by **any python object**\n",
    "* a dictionary can be empty"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please note that **keys** in a dictionary have to **immutable**. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This works (strings as keys):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = {'Frank': 8, 'Susan': 7, 'Guido': 10}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This does not (list as keys):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_dict = {['a', 'list']: 8}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please note that the values in a dictionary can by **any python object**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This works (integers as values):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_dict = {'Frank': 8, 'Susan': 7}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But this as well (lists as values):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "another_dict = {'Frank' : [8], 'Susan' : [7]}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please note that a dictionary can be empty (use **dict()**):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "an_empty_dict = dict()\n",
    "another_empty_dict = {} # This works too, but it is less readable and confusing (looks similar to sets)\n",
    "print(type(another_empty_dict), type(an_empty_dict))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. How to add items to a dictionary\n",
    "There is one very simple way in order to add a **key:value** pair to a dictionary. Please look at the following code snippet:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_dict = dict()\n",
    "print(a_dict)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_dict['Frank'] = 8\n",
    "print(a_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please note that dictionary keys should be **unique** identifiers for the values in the dictionary.  **Key:value** pairs get overwritten if you assign a different value to an existing key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_dict = dict()\n",
    "a_dict['Frank'] = 8\n",
    "print(a_dict)\n",
    "a_dict['Frank'] = 7\n",
    "print(a_dict)\n",
    "a_dict['Frank'] = 9\n",
    "print(a_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. How to access data in a dictionary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The most basic operation on a dictionary is a **look-up**. Simply enter the key and the dictionary returns the value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = {'Frank': 8, 'Susan': 7, 'Guido': 10}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(student_grades['Frank'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the key is not in the dictionary, it will return a KeyError."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades['Piet']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to avoid getting an error, you can use an **if-statement**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "key = 'Piet'\n",
    "if key in student_grades:\n",
    "    print(student_grades[key])\n",
    "else:\n",
    "    print(key, 'not in dictionary')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "key = 'Frank'\n",
    "if key in student_grades:\n",
    "    print(student_grades[key])\n",
    "else:\n",
    "    print(key, 'not in dictionary')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "the **keys** method returns the keys in a dictionary "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = {'Frank': 8, 'Susan': 7, 'Guido': 10}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "the_keys = student_grades.keys()\n",
    "print(the_keys)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "the **values** method returns the values in a dictionary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "the_values = student_grades.values()\n",
    "print(the_values)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use the built-in functions to inspect the keys and values. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "the_values = student_grades.values()\n",
    "print(len(the_values)) # number of values in a dict\n",
    "print(max(the_values)) # highest value of values in a dict\n",
    "print(min(the_values)) # lowest value of values in a dict\n",
    "print(sum(the_values)) # sum of all values of values in a dict"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, what if we want to know which students got a 8 or higher? The **items** method is very useful for this scenario. Please carefully look at the following code snippet."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "student_grades = {'Frank': 8, 'Susan': 7, 'Guido': 10}\n",
    "print(student_grades.items())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **items** method returns a list of tuples. We can combine what we have learnt about looping and tuples to access the keys (the students' names) and values (their grades):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for key, value in student_grades.items(): # please note the tuple unpacking\n",
    "    print(key, value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This also makes it possible to detect which students obtained a grade of 8 or higher."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for student, grade in student_grades.items():\n",
    "    if grade > 7:\n",
    "        print(student, grade)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Counting with a dictionary\n",
    "Dictionaries are very useful to derive statistics. For example, we can easily determine the frequency of each letter in a word."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "letter2freq = dict()\n",
    "word = 'hippo'\n",
    "\n",
    "for letter in word: \n",
    "    \n",
    "    if letter in letter2freq: # add 1 to the dictionary if the keys exists\n",
    "        letter2freq[letter] += 1 # note: x +=1 does the same as x = x + 1\n",
    "    else:\n",
    "        letter2freq[letter] = 1 # set default value to 1 if key does not exists \n",
    "\n",
    "    print(letter, letter2freq)\n",
    "    \n",
    "print()\n",
    "print(letter2freq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can do this as well with lists"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_sentence = ['Obama', 'was', 'the', 'president', 'of', 'the', 'USA']\n",
    "word2freq = dict()\n",
    "\n",
    "for word in a_sentence: \n",
    "    \n",
    "    if word in word2freq: # add 1 to the dictionary if the keys exists\n",
    "        word2freq[word] += 1 \n",
    "    else:\n",
    "        word2freq[word] = 1 # set default value to 1 if key does not exists \n",
    "\n",
    "    print(word, word2freq)\n",
    "\n",
    "print()\n",
    "print(word2freq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python actually has a module, which is very useful for counting. It's called [collections](https://docs.python.org/3/library/collections.html#collections.Counter)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "word_freq = Counter(['Obama', 'was', 'the', 'president', 'of', 'the', 'USA'])\n",
    "print(word_freq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Feel free to start using this module **after** the assignment of this block."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Nested dictionaries\n",
    "Since dictionaries consists of **key:value** pairs, we can actually make another dictionary the **value** of a dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_nested_dictionary = {'a_key': \n",
    "                       {'nested_key1': 1,\n",
    "                        'nested_key2': 2,\n",
    "                        'nested_key3': 3}\n",
    "                      }\n",
    "print(a_nested_dictionary)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please note that the value is in fact a dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(a_nested_dictionary['a_key'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to access the nested value, we must do a look up for each key on each nested level"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "the_nested_value = a_nested_dictionary['a_key']['nested_key1']\n",
    "print(the_nested_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Practice questions:\n",
    "    \n",
    "    What do sets and dictionaries have in common?\n",
    "    What do lists and tuples have in common?\n",
    "    Can you add things to a list?\n",
    "    Can you add things to a tuples?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An overview:\n",
    "    \n",
    "| property                       | set               | list            | tuple       | dict keys | dict values | \n",
    "|------------------------------- |-------------------|-----------------|-------------|-----------|-------------|\n",
    "| **mutable** (can you add add/remove?) | yes        | yes             | no          | yes       | yes         |      \n",
    "| **can** contain duplicates     | no                | yes             | yes         | no        | yes            |\n",
    "| **ordered**                    | no                | yes             | yes         | yes, but do not rely on it          | depends on type of value         |\n",
    "| **finding** element(s)         | quick             | slow            | slow        | quick             | depends on type of value         |\n",
    "| **can** contain                | immutables | all     | all | immutables  |  all           |\n",
    "\n",
    "\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 1:\n",
    "\n",
    "You are tying to keep track of your groceries using a python dictionary. Please add 'tomatoes', 'bread', 'chocolate bars' and 'pineapples' to your shopping dictionary and assign values according to how many items of each you would like to buy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 2:\n",
    "    \n",
    "Print the number of *pineapples* you would like to buy by using only one line of code and without printing the entire dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 3:\n",
    "\n",
    "Use a loop and unpacking to print the items and numbers on your shopping list in the following format:\n",
    "\n",
    "Item: [Item], number: [number]\n",
    "\n",
    "e.g. Item: tomatoes, number: 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# you code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 4:\n",
    "   \n",
    " Which container would you use to count the frequency of each word in a text?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}