{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dictionaries #\n", "\n", "## Motivation ##\n", "\n", "Suppose we need to represent years and the total North American fossil fuel CO2 emissions for those years.\n", "\n", "Question: How should we do this?\n", "\n", "One option:\n", "\n", " years = [1799, 1800, 1801, 1802, 1902, 2002] \n", " emissions = [1, 70, 74, 79, 82, 215630, 1733297] # metric tons of carbon, thousands\n", "\n", "We call these *parallel lists*: The `years` list at position `i` corresponds to the `emissions` list at \n", " position `i`.\n", "\n", "Question: How would operations on the data work? For example:\n", "\n", "(a) to add an entry, eg, `(1950, 734914)`?\n", " \n", "We need to modify both lists. We could append or keep both lists sorted (then must find the right spot\n", "and insert there).Either way, both lists must be kept in sync.\n", "\n", "(b) to edit the emissions value for a particular year?\n", "Need to find the year in the years lists and modify the corresponding item in the emissions list.\n", " \n", "\n", "In general, not terribly convenient.\n", "\n", "Notice that the lists don't explicitly represent the associations like the expression `(1799, 1)` does.\n", "\n", "Option two: *a list of lists*. Better, but still somewhat of a pain to look up a year. Must search the list to find it.\n", "\n", "There is a better way: a new type of object called \"dictionary\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionary basics ##\n", "\n", "A dictionary keeps track of associations for you.\n", " * you give it some info, and a value you want to store the info under.\n", " * it stashes this pair for you.\n", " * it makes lookup (and some other useful things) easy." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "74\n", "734914\n" ] } ], "source": [ "# Braces indicate that you are defining a dictionary.\n", "emissions_by_year = {1799: 1, 1800: 70, 1801: 74, 1802: 82, 1902: 215630, 2002: 1733297} \n", "\n", "# Look up the emissions for the given year\n", "print(emissions_by_year[1801])\n", "\n", "# Add another year to the dictionary\n", "emissions_by_year[1950] = 734914\n", "print(emissions_by_year[1950]) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dictionary entries have two parts: a *key* and a *value*. \n", "In our example, the key is the year and the value is the CO2 emissions. \n", "\n", "Why is it called a key?\n", "Like a physical (or metaphorical) key, it provides a means of gaining access \n", "to something.\n", "\n", "Keys don't have to be numbers. But they do have to be immutable objects (so, not lists, or dictionaries)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "d = {1:5, 3:45, 4:10}\n", "d[\"abc\"] = \"Hello!\"\n", "# d[ [1, 2, 3] ] = 77 # error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And the associated values can be anything: any type, and mutable or not." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'weird': ['my', 'you', 'walrus'], 5: ('Diane', '978-6024', 'BA', 4236), 'nested': {'diane': 4236, 'paul': 4238}}\n" ] } ], "source": [ "d = {}\n", "d[5] = (\"Diane\", \"978-6024\", \"BA\", 4236)\n", "d[\"weird\"] = [\"my\", \"you\", \"walrus\"]\n", "d[\"nested\"] = {\"diane\": 4236, \"paul\": 4238}\n", "print(d)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Dictionaries themselves are mutable -- you can modify their contents." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionary operations ##" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{2000: 10, 2002: 1733297, 1950: 734914, 1799: 1, 1800: 70, 1801: 74, 1802: 82, 2009: 1000000, 1902: 215630}\n" ] } ], "source": [ "print(emissions_by_year)\n", " \n", "# extend (add a new key and its value)\n", "emissions_by_year[2009] = 1000000 # Wishful thinking" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{2000: 10, 2002: 1733297, 1950: 734914, 1799: 1, 1800: 70, 1801: 74, 1802: 82, 2009: 1000000, 1902: 215630}\n" ] } ], "source": [ "# update (change the value associated with a key)\n", "emissions_by_year[2000] = 10 # Old value is tossed out\n", "print(emissions_by_year) # Reports most recent values\n", " " ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check for membership\n", "1950 in emissions_by_year # A dict operator (not a function\n", " # or method). This one is binary." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# remove a key-value pair\n", "del emissions_by_year[1950] # A unary dict operator.\n", "1950 in emissions_by_year # This is now false" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# determine length (number of key-value pairs)\n", "len(emissions_by_year)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2000\n", "2002\n", "1799\n", "1800\n", "1801\n", "1802\n", "2009\n", "1902\n" ] } ], "source": [ "# Iterating over the dictionary\n", "for key in emissions_by_year:\n", " print(key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Why did the keys come out in an unexpected order??\n", " \n", " Dictionaries are unordered. \n", " The order that the keys are traversed (when you loop through) is arbitrary: \n", " there is no guarantee that it will be in the order that they were added.\n", "\n", " Silly analogy: A `dict` is like a filing assistant who is very efficient\n", " but keeps everything in a secret room. You have no idea how they organize\n", " things, and you don't care -- as long as he can pull the file you need\n", " when you give him the key." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionary methods ##" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_keys([2000, 2002, 1799, 1800, 1801, 1802, 2009, 1902])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emissions_by_year.keys()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_values([10, 1733297, 1, 70, 74, 82, 1000000, 215630])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emissions_by_year.values()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " items: the (key, value) pairs" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_items([(2000, 10), (2002, 1733297), (1799, 1), (1800, 70), (1801, 74), (1802, 82), (2009, 1000000), (1902, 215630)])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emissions_by_year.items()\n", "# this is a list of tuples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterating through a dictionary ##" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "phone = {'555-7632': 'Paul', '555-9832': 'Andrew', '555-6677': 'Dan', '555-9823': 'Michael', '555-6342' : 'Cathy', '555-7343' : 'Diane'}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(a) Going through the keys" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "555-9832\n", "555-9823\n", "555-7632\n", "555-6342\n", "555-7343\n", "555-6677\n", "555-9832\n", "555-9823\n", "555-7632\n", "555-6342\n", "555-7343\n", "555-6677\n" ] } ], "source": [ "#The proper way:\n", "for key in phone:\n", " print(key)\n", "\n", "# The is equivalent, but not considered good style:\n", "for key in phone.keys():\n", " print(key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(b) Going through the key-value pairs:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('555-9832', 'Andrew')\n", "('555-9823', 'Michael')\n", "('555-7632', 'Paul')\n", "('555-6342', 'Cathy')\n", "('555-7343', 'Diane')\n", "('555-6677', 'Dan')\n", "Name: Andrew ; Phone Number: 555-9832\n", "Name: Michael ; Phone Number: 555-9823\n", "Name: Paul ; Phone Number: 555-7632\n", "Name: Cathy ; Phone Number: 555-6342\n", "Name: Diane ; Phone Number: 555-7343\n", "Name: Dan ; Phone Number: 555-6677\n" ] } ], "source": [ "# This gives you a series of tuples.\n", "for item in phone.items():\n", " print(item)\n", "\n", "# You can pull the pieces of the tuple out as you go:\n", "for (number, name) in phone.items():\n", " print(\"Name:\", name, \"; Phone Number:\", number)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inverting a dictionary \n", " Here's a dictionary mapping phone numbers to names. \n", " Some people have more than one phone number, of course." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": true }, "outputs": [], "source": [ "phone = {'555-7632': 'Paul', '555-9832': 'Andrew', '555-6677': 'Dan', '555-9823': 'Michael', '555-6342' : 'Cathy', '555-2222': 'Michael', '555-7343' : 'Diane'}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Suppose we want to create a list of all of Michael's phone numbers:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['555-9823', '555-2222']\n" ] } ], "source": [ "# Method 1\n", "michael = []\n", "for key in phone:\n", " if phone[key] == 'Michael':\n", " michael.append(key)\n", "print(michael)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " But what if I want to be able to do this for all people? \n", " Question: is there some object you could create to make this easy?\n", " Answer: A dictionary!\n", " - The old one takes us from numbers to names.\n", " - The new one will take us in the reverse direction, from names to #s" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{'Andrew': ['555-9832'],\n", " 'Cathy': ['555-6342'],\n", " 'Dan': ['555-6677'],\n", " 'Diane': ['555-7343'],\n", " 'Michael': ['555-9823', '555-2222'],\n", " 'Paul': ['555-7632']}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new_phone = {}\n", "for (number, name) in phone.items():\n", " if name in new_phone:\n", " new_phone[name].append(number)\n", " else:\n", " new_phone[name] = [number]\n", "new_phone" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We call this an *inverted* dictionary." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }