{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dictionaries\n", "- another compound type/container like lists and tuples\n", "- very useful data structure/container that can store data as lookup table \n", "- Python's mapping type similar to `map` container, or associative arrays in C++ and other languages\n", "- dict maps keys (immutable type) to values of any type (heterogenous)\n", "- Python uses complex hash algorithm to index key for fast access (Worst case \"Big-Oh\": $O(1)$), as a result the ordering is not maintained" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## creating dictionary objects" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "eng2sp = {} # or\n", "eng2sp1 = dict()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{} {}\n" ] } ], "source": [ "print(eng2sp, eng2sp1)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "eng2sp[\"One\"] = \"uno\"\n", "eng2sp[\"two\"] = \"dos\"\n", "eng2sp[\"three\"] = \"tres\"\n", "eng2sp[4] = \"quatro\"\n", "eng2sp[\"five\"] = \"sinco\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "key = 'Five'\n", "eng2sp[key] = 'Sinco'" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 'uno', 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco'}\n" ] } ], "source": [ "print(eng2sp)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "uno\n" ] } ], "source": [ "print(eng2sp['One'])" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "symbolNames = {'*':'asterick', '+':\"plus\", '-': 'minus'}" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 'uno', 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco'} {'*': 'asterick', '+': 'plus', '-': 'minus'}\n" ] } ], "source": [ "print(eng2sp, symbolNames)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "dict1 ={'one': 'uno', 'two': 'dos', 'three': 'tres', '4': 'quatro', 'five': 'sinco'}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## accessing values\n", "- use index operator ['key'] \n", "- dict is mutable" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "one = 'One'" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'uno'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eng2sp[one]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "diez\n" ] } ], "source": [ "eng2sp['ten'] = 'diez'\n", "print(eng2sp['ten'])" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'] = 'Uno'" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'] = ['uno']" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "eng2sp['One'].append('Uno')" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': ['uno', 'Uno'], 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco', 'ten': 'diez', 'one': 'Uno'}\n" ] } ], "source": [ "print(eng2sp)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "adict = {1: ['uno', 'one'], 2:('two', 'dos'), 3:{'three':'tres'}}" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "one\n", "tres\n" ] } ], "source": [ "# How do you access tres in adict?\n", "print(adict[1][1])\n", "print(adict[3]['three'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## dictionary methods" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on class dict in module builtins:\n", "\n", "class dict(object)\n", " | dict() -> new empty dictionary\n", " | dict(mapping) -> new dictionary initialized from a mapping object's\n", " | (key, value) pairs\n", " | dict(iterable) -> new dictionary initialized as if via:\n", " | d = {}\n", " | for k, v in iterable:\n", " | d[k] = v\n", " | dict(**kwargs) -> new dictionary initialized with the name=value pairs\n", " | in the keyword argument list. For example: dict(one=1, two=2)\n", " | \n", " | Methods defined here:\n", " | \n", " | __contains__(self, key, /)\n", " | True if the dictionary has the specified key, else False.\n", " | \n", " | __delitem__(self, key, /)\n", " | Delete self[key].\n", " | \n", " | __eq__(self, value, /)\n", " | Return self==value.\n", " | \n", " | __ge__(self, value, /)\n", " | Return self>=value.\n", " | \n", " | __getattribute__(self, name, /)\n", " | Return getattr(self, name).\n", " | \n", " | __getitem__(...)\n", " | x.__getitem__(y) <==> x[y]\n", " | \n", " | __gt__(self, value, /)\n", " | Return self>value.\n", " | \n", " | __init__(self, /, *args, **kwargs)\n", " | Initialize self. See help(type(self)) for accurate signature.\n", " | \n", " | __iter__(self, /)\n", " | Implement iter(self).\n", " | \n", " | __le__(self, value, /)\n", " | Return self<=value.\n", " | \n", " | __len__(self, /)\n", " | Return len(self).\n", " | \n", " | __lt__(self, value, /)\n", " | Return self size of D in memory, in bytes\n", " | \n", " | clear(...)\n", " | D.clear() -> None. Remove all items from D.\n", " | \n", " | copy(...)\n", " | D.copy() -> a shallow copy of D\n", " | \n", " | get(self, key, default=None, /)\n", " | Return the value for key if key is in the dictionary, else default.\n", " | \n", " | items(...)\n", " | D.items() -> a set-like object providing a view on D's items\n", " | \n", " | keys(...)\n", " | D.keys() -> a set-like object providing a view on D's keys\n", " | \n", " | pop(...)\n", " | D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n", " | If key is not found, d is returned if given, otherwise KeyError is raised\n", " | \n", " | popitem(...)\n", " | D.popitem() -> (k, v), remove and return some (key, value) pair as a\n", " | 2-tuple; but raise KeyError if D is empty.\n", " | \n", " | setdefault(self, key, default=None, /)\n", " | Insert key with a value of default if key is not in the dictionary.\n", " | \n", " | Return the value for key if key is in the dictionary, else default.\n", " | \n", " | update(...)\n", " | D.update([E, ]**F) -> None. Update D from dict/iterable E and F.\n", " | If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]\n", " | If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v\n", " | In either case, this is followed by: for k in F: D[k] = F[k]\n", " | \n", " | values(...)\n", " | D.values() -> an object providing a view on D's values\n", " | \n", " | ----------------------------------------------------------------------\n", " | Class methods defined here:\n", " | \n", " | fromkeys(iterable, value=None, /) from builtins.type\n", " | Create a new dictionary with keys from iterable and values set to value.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Static methods defined here:\n", " | \n", " | __new__(*args, **kwargs) from builtins.type\n", " | Create and return a new object. See help(type) for accurate signature.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data and other attributes defined here:\n", " | \n", " | __hash__ = None\n", "\n" ] } ], "source": [ "help(dict)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "key One maps to value uno\n", "key two maps to value dos\n", "key three maps to value tres\n", "key 4 maps to value quatro\n", "key five maps to value sinco\n" ] } ], "source": [ "for k in eng2sp.keys(): # the order of the keys is not defined\n", " print(\"key {} maps to value {}\".format(k, eng2sp[k]))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "key = One value = uno\n", "key = two value = dos\n", "key = three value = tres\n", "key = 4 value = quatro\n", "key = five value = sinco\n" ] } ], "source": [ "# iterate over keys\n", "for k in eng2sp:\n", " print('key = {} value = {}'.format(k, eng2sp[k]))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "value = uno\n", "value = dos\n", "value = tres\n", "value = quatro\n", "value = sinco\n" ] } ], "source": [ "# iterate over values\n", "for val in eng2sp.values():\n", " print(\"value = \", val)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "values = list(eng2sp.values())" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['uno', 'dos', 'tres', 'quatro', 'sinco']" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "values" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "items = list(eng2sp.items())" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('One', 'uno'), ('two', 'dos'), ('three', 'tres'), (4, 'quatro'), ('five', 'sinco')]\n" ] } ], "source": [ "print(items)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'One': 'uno', 'two': 'dos', 'three': 'tres', 4: 'quatro', 'five': 'sinco'}\n" ] } ], "source": [ "dict2 = dict(items)\n", "print(dict2)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "One -> uno\n", "two -> dos\n", "three -> tres\n", "4 -> quatro\n", "five -> sinco\n" ] } ], "source": [ "for k, v in eng2sp.items():\n", " print('{} -> {}'.format(k, v))" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "ename": "KeyError", "evalue": "'popitem(): dictionary is empty'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0meng2sp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpopitem\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0meng2sp\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mKeyError\u001b[0m: 'popitem(): dictionary is empty'" ] } ], "source": [ "print(eng2sp.popitem())\n", "print(eng2sp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## checking if key exists\n", "- in and not in operators can be used to check for keys" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"One\" in eng2sp" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"Ten\" in eng2sp" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"twenty\" not in eng2sp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# copying dictionary objects\n", "- shallow copy vs deep copy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import copy\n", "digits = {1: 'one', 2: 'two', 3: ['three', 'Three', 'THREE']}\n", "digits1 = digits # creates an alias\n", "digits2 = digits.copy() # shallow copy\n", "digits3 = copy.deepcopy(digits) # deep copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### visualize in pythontutor.com: \n", "- https://goo.gl/XmvYkB" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import IFrame\n", "src = '''\n", "http://pythontutor.com/iframe-embed.html#code=import%20copy%0Adigits%20%3D%20%7B1%3A%20'one',%202%3A%20'two',%203%3A%20%5B'three',%20'Three',%20'THREE'%5D%7D%0Adigits1%20%3D%20digits%20%23%20creates%20an%20alias%0Adigits2%20%3D%20digits.copy%28%29%20%23%20shallow%20copy%0Adigits3%20%3D%20copy.deepcopy%28digits%29%20%23%20deep%20copy&codeDivHeight=400&codeDivWidth=350&cumulative=false&curInstr=0&heapPrimitives=false&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false\n", "'''\n", "IFrame(src, width=900, height=600)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## passing dict to function\n", "- dict is a mutable type\n", "- dict objects are passed by reference" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# find the histogram (frequency of each unique character) in a word\n", "def histogram(word, hist):\n", " for c in word:\n", " c = c.lower()\n", " if c in hist:\n", " hist[c] += 1\n", " else:\n", " hist[c] = 1" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "m 2\n", "i 4\n", "s 4\n", "p 2\n" ] } ], "source": [ "h = {}\n", "histogram('Mississippim', h)\n", "for k, v in h.items():\n", " print(k, v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## returning dict from functions\n", "- dict objects can be returned from functions" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def getHist(word):# = \"Mississippi\"\n", " h = {}\n", " for c in word:\n", " if c in h:\n", " h[c] += 1\n", " else:\n", " h[c] = 1\n", " return h" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'M': 1, 'i': 4, 's': 4, 'p': 2}\n", "M is in histogram\n" ] } ], "source": [ "hist = getHist('Mississippi')\n", "print(hist)\n", "if 'M' in hist:\n", " print('M is in histogram')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### exercise 1\n", "Count and print letter frequency in some words. Hint: use get method\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### exercise 2\n", "Write a program that reads a string and returns a table of the letters of the alphabet in alphabetical order which occur in the string together with the number of times each letter occurs. Case should be ignored. A sample output of the program when the user enters the data “ThiS is String with Upper and lower case Letters”, would look this:\n", "
\n",
    "a  2\n",
    "c  1\n",
    "d  1\n",
    "e  5\n",
    "g  1\n",
    "h  2\n",
    "i  4\n",
    "l  2\n",
    "n  2\n",
    "o  1\n",
    "p  2\n",
    "r  4\n",
    "s  5\n",
    "t  5\n",
    "u  1\n",
    "w  2\n",
    "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 2 }