{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 9 Dictionaries\n", "\"Open\n", "\n", "http://openbookproject.net/thinkcs/python/english3e/dictionaries.html\n", "\n", "## Topics\n", "- dictionary data types\n", "- create and use dictionary\n", "- dictionary methods and operations\n", "- dictionary applications and problems\n", "\n", "\n", "## 9.1 Dictionary\n", "- another compound type/container like lists and tuples\n", "- very useful data structure/container that can store data as lookup table \n", "- Python's mapping type similar to `map` container, or associative arrays in C++ and other languages\n", "- dictionaries maps keys (immutable type) to values of any type (heterogeneous)\n", "- Python uses complex hash algorithm to index key for fast access (Worst case \"Big-Oh\": $O(1)$)\n", " - as a result the ordering is not maintained\n", "- starting from verion 3.7, python dict remembers the oderes of the elements inserted" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.2 Creating dictionary objects" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "eng2sp = {} # or\n", "eng2sp1 = dict()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{} {}\n" ] } ], "source": [ "print(eng2sp, eng2sp1)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "eng2sp[\"One\"] = \"uno\"\n", "eng2sp[\"two\"] = \"dos\"\n", "eng2sp[\"three\"] = \"tres\"\n", "eng2sp[4] = \"quatro\"\n", "eng2sp[\"five\"] = \"sinco\"" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'One': 'uno', 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco'}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eng2sp" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "key = 'Five'\n", "eng2sp[key] = 'Sinco'" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 'uno', 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco', 'Five': 'Sinco'}\n" ] } ], "source": [ "print(eng2sp)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "uno\n" ] } ], "source": [ "print(eng2sp['One'])" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "symbolNames = {'*':'asterick', '+':\"plus\", '-': 'minus'}" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 'uno', 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco', 'Five': 'Sinco'} {'*': 'asterick', '+': 'plus', '-': 'minus'}\n" ] } ], "source": [ "print(eng2sp, symbolNames)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "dict1 ={'one': 'uno', 'two': 'dos', 'three': 'tres', '4': 'quatro', 'five': 'sinco'}" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'one': 'uno', 'two': 'dos', 'three': 'tres', '4': 'quatro', 'five': 'sinco'}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dict1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.3 Accessing values\n", "- use index operator ['key'] \n", "- dict is mutable" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "one = 'One'" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'uno'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eng2sp[one]" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'One': 'uno',\n", " 'two': 'dos',\n", " 'three': 'tres',\n", " 4: 'quatro',\n", " 'five': 'sinco',\n", " 'Five': 'Sinco'}" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eng2sp" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "diez\n" ] } ], "source": [ "eng2sp['ten'] = 'diez'\n", "print(eng2sp['ten'])" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'] = 'Uno'" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'One': 'Uno',\n", " 'two': 'dos',\n", " 'three': 'tres',\n", " 4: 'quatro',\n", " 'five': 'sinco',\n", " 'Five': 'Sinco',\n", " 'ten': 'diez'}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eng2sp" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'] = ['uno']" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'].append('Uno')" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'].insert(0, 'UNO')" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': ['UNO', 'uno', 0, 'Uno'], 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco', 'Five': 'Sinco', 'ten': 'diez'}\n" ] } ], "source": [ "print(eng2sp)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "adict = {1: ['uno', 'one'], 2:('two', 'dos'), 3:{'three':'tres'}}" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "dos\n" ] } ], "source": [ "print(adict[2][1])" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "s\n" ] } ], "source": [ "# How do you access tres in adict?\n", "print(adict[3]['three'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.4 Dictionary methods" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on class dict in module builtins:\n", "\n", "class dict(object)\n", " | dict() -> new empty dictionary\n", " | dict(mapping) -> new dictionary initialized from a mapping object's\n", " | (key, value) pairs\n", " | dict(iterable) -> new dictionary initialized as if via:\n", " | d = {}\n", " | for k, v in iterable:\n", " | d[k] = v\n", " | dict(**kwargs) -> new dictionary initialized with the name=value pairs\n", " | in the keyword argument list. For example: dict(one=1, two=2)\n", " | \n", " | Methods defined here:\n", " | \n", " | __contains__(self, key, /)\n", " | True if the dictionary has the specified key, else False.\n", " | \n", " | __delitem__(self, key, /)\n", " | Delete self[key].\n", " | \n", " | __eq__(self, value, /)\n", " | Return self==value.\n", " | \n", " | __ge__(self, value, /)\n", " | Return self>=value.\n", " | \n", " | __getattribute__(self, name, /)\n", " | Return getattr(self, name).\n", " | \n", " | __getitem__(...)\n", " | x.__getitem__(y) <==> x[y]\n", " | \n", " | __gt__(self, value, /)\n", " | Return self>value.\n", " | \n", " | __init__(self, /, *args, **kwargs)\n", " | Initialize self. See help(type(self)) for accurate signature.\n", " | \n", " | __iter__(self, /)\n", " | Implement iter(self).\n", " | \n", " | __le__(self, value, /)\n", " | Return self<=value.\n", " | \n", " | __len__(self, /)\n", " | Return len(self).\n", " | \n", " | __lt__(self, value, /)\n", " | Return self size of D in memory, in bytes\n", " | \n", " | clear(...)\n", " | D.clear() -> None. Remove all items from D.\n", " | \n", " | copy(...)\n", " | D.copy() -> a shallow copy of D\n", " | \n", " | get(self, key, default=None, /)\n", " | Return the value for key if key is in the dictionary, else default.\n", " | \n", " | items(...)\n", " | D.items() -> a set-like object providing a view on D's items\n", " | \n", " | keys(...)\n", " | D.keys() -> a set-like object providing a view on D's keys\n", " | \n", " | pop(...)\n", " | D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n", " | If key is not found, d is returned if given, otherwise KeyError is raised\n", " | \n", " | popitem(...)\n", " | D.popitem() -> (k, v), remove and return some (key, value) pair as a\n", " | 2-tuple; but raise KeyError if D is empty.\n", " | \n", " | setdefault(self, key, default=None, /)\n", " | Insert key with a value of default if key is not in the dictionary.\n", " | \n", " | Return the value for key if key is in the dictionary, else default.\n", " | \n", " | update(...)\n", " | D.update([E, ]**F) -> None. Update D from dict/iterable E and F.\n", " | If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]\n", " | If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v\n", " | In either case, this is followed by: for k in F: D[k] = F[k]\n", " | \n", " | values(...)\n", " | D.values() -> an object providing a view on D's values\n", " | \n", " | ----------------------------------------------------------------------\n", " | Class methods defined here:\n", " | \n", " | fromkeys(iterable, value=None, /) from builtins.type\n", " | Create a new dictionary with keys from iterable and values set to value.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Static methods defined here:\n", " | \n", " | __new__(*args, **kwargs) from builtins.type\n", " | Create and return a new object. See help(type) for accurate signature.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data and other attributes defined here:\n", " | \n", " | __hash__ = None\n", "\n" ] } ], "source": [ "help(dict)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "key One maps to value ['UNO', 'uno', 0, 'Uno']\n", "key two maps to value dos\n", "key three maps to value tres\n", "key 4 maps to value quatro\n", "key five maps to value sinco\n", "key Five maps to value Sinco\n", "key ten maps to value diez\n" ] } ], "source": [ "for k in eng2sp.keys(): # the order of the keys is not defined\n", " print(\"key {} maps to value {}\".format(k, eng2sp[k]))" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['One', 'two', 'three', 4, 'five', 'Five', 'ten']\n" ] } ], "source": [ "print(list(eng2sp.keys()))" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[['UNO', 'uno', 0, 'Uno'], 'dos', 'tres', 'quatro', 'sinco', 'Sinco', 'diez']\n" ] } ], "source": [ "print(list(eng2sp.values()))" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "key = One value = ['UNO', 'uno', 0, 'Uno']\n", "key = two value = dos\n", "key = three value = tres\n", "key = 4 value = quatro\n", "key = five value = sinco\n", "key = Five value = Sinco\n", "key = ten value = diez\n" ] } ], "source": [ "# iterate over keys\n", "for k in eng2sp:\n", " print('key = {} value = {}'.format(k, eng2sp.get(k)))" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "print(eng2sp.get('asdfsf'))" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "print(eng2sp.get(\"Ondfe\"))" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "value = ['UNO', 'uno', 0, 'Uno']\n", "value = dos\n", "value = tres\n", "value = quatro\n", "value = sinco\n", "value = Sinco\n", "value = diez\n" ] } ], "source": [ "# iterate over values\n", "for val in eng2sp.values():\n", " print(\"value = \", val)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [], "source": [ "values = list(eng2sp.values())" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['UNO', 'uno', 0, 'Uno'], 'dos', 'tres', 'quatro', 'sinco', 'Sinco', 'diez']" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "values" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "items = list(eng2sp.items())" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('One', ['UNO', 'uno', 0, 'Uno']), ('two', 'dos'), ('three', 'tres'), (4, 'quatro'), ('five', 'sinco'), ('Five', 'Sinco'), ('ten', 'diez')]\n" ] } ], "source": [ "print(items)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dict2 = dict(items)\n", "print(dict2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for k, v in eng2sp.items():\n", " print('{} -> {}'.format(k, v))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(eng2sp.popitem())\n", "print(eng2sp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.5 Checking keys\n", "- in and not in operators can be used to check if some keys exist in a given dictionary\n", "- knowing if key exists helps automatically create dictionaries and access corresponding values" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"One\" in eng2sp" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"Ten\" in eng2sp" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"twenty\" not in eng2sp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.6 Copying dictionary objects\n", "- shallow copy vs deep copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import copy\n", "digits = {1: 'one', 2: 'two', 3: ['three', 'Three', 'THREE']}\n", "digits1 = digits # creates an alias\n", "digits2 = digits.copy() # shallow copy\n", "digits3 = copy.deepcopy(digits) # deep copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### [visualize in pythontutor.com](https://goo.gl/XmvYkB)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import IFrame\n", "src = '''\n", "http://pythontutor.com/iframe-embed.html#code=import%20copy%0Adigits%20%3D%20%7B1%3A%20'one',%202%3A%20'two',%203%3A%20%5B'three',%20'Three',%20'THREE'%5D%7D%0Adigits1%20%3D%20digits%20%23%20creates%20an%20alias%0Adigits2%20%3D%20digits.copy%28%29%20%23%20shallow%20copy%0Adigits3%20%3D%20copy.deepcopy%28digits%29%20%23%20deep%20copy&codeDivHeight=400&codeDivWidth=350&cumulative=false&curInstr=0&heapPrimitives=false&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false\n", "'''\n", "IFrame(src, width=900, height=600)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.7 Passing dictionaries to functions\n", "- dict is a mutable type\n", "- therefore, dict objects are passed by reference" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# find the histogram (frequency of each unique character) in a word\n", "def histogram(word, hist):\n", " for c in word:\n", " c = c.lower()\n", " if c in hist:\n", " hist[c] += 1\n", " else:\n", " hist[c] = 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "h = {}\n", "histogram('Mississippim', h)\n", "for k, v in h.items():\n", " print(k, v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.8 Returning dict from functions\n", "- dict objects can be returned from functions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def getHist(word):# = \"Mississippi\"\n", " h = {}\n", " for c in word:\n", " if c in h:\n", " h[c] += 1\n", " else:\n", " h[c] = 1\n", " return h" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hist = getHist('Mississippi')\n", "print(hist)\n", "if 'M' in hist:\n", " print('M is in histogram')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.9 Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Count and print letter frequency in a given word. Hint: use get method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Write a program that reads some text data and prints a frequency table of the letters in alphabetical order. Case should be ignored. A sample output of the program when the user enters the data \"ThiS is String with Upper and lower case Letters\", would look this:\n", "
\n",
    "a  2\n",
    "c  1\n",
    "d  1\n",
    "e  5\n",
    "g  1\n",
    "h  2\n",
    "i  4\n",
    "l  2\n",
    "n  2\n",
    "o  1\n",
    "p  2\n",
    "r  4\n",
    "s  5\n",
    "t  5\n",
    "u  1\n",
    "w  2\n",
    "
" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Kattis problems that can be solved using dict\n", "1. I've Been Everywhere, Man - https://open.kattis.com/problems/everywhere\n", "- Seven Wonders - https://open.kattis.com/problems/sevenwonders\n", "- ACM Contest Scoring - https://open.kattis.com/problems/acm\n", "- Stacking Cups - https://open.kattis.com/problems/cups\n", "- A New Alphabet - https://open.kattis.com/problems/anewalphabet\n", "- Words for Numbers - https://open.kattis.com/problems/wordsfornumbers\n", "- Babelfish - https://open.kattis.com/problems/babelfish\n", "- Popular Vote - https://open.kattis.com/problems/vote\n", "- Adding Words - https://open.kattis.com/problems/addingwords\n", "- Grandpa Bernie - https://open.kattis.com/problems/grandpabernie\n", "- Judging Troubles - https://open.kattis.com/problems/judging\n", "- Not Amused - https://open.kattis.com/problems/notamused\n", "- Engineering English - https://open.kattis.com/problems/engineeringenglish\n", "- Hardwood Species - https://open.kattis.com/problems/hardwoodspecies\n", "- Conformity - https://open.kattis.com/problems/conformity\n", "- Galactic Collegiate Programming Contest - https://open.kattis.com/problems/gcpc\n", "- Simplicity - https://open.kattis.com/problems/simplicity" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }