{ "metadata": { "name": "", "signature": "sha256:d5ca69bf6b5b91aa32f9e96a965519d45e06b686e5ae9b82e2b00b237b0c2887" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dictionaries\n", "## Manipulating dictionaries\n", "\n", "A dictionary is a more general version of a [list](list.html) because the index does not have to be an integer. A dictionary is a set of **unordered** items (pairs of **keys** and **values**), with a mapping between each key and its corresponding value. String keys are *case-sensitive*. To define a new dictionary, use the Python built-in function `dict()` or empty curly braces: " ] }, { "cell_type": "code", "collapsed": false, "input": [ "names = dict()\n", "names = {}\n", "print names" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{}\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `print` statement displays an empty pair of curly brackets, which represents an empty dictionary. \n", "\n", "The function `len()` can be used to calculate the number of **items** (i.e. key-value pairs) in a dictionary." ] }, { "cell_type": "code", "collapsed": false, "input": [ "len(names)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "0" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note about dictionary keys**: Though the keys (or indices for a dictionary) can be values other than integers, the keys must be immutable. Keys can be strings, integers or [tuples](tuples.html), which are discussed in the next section. Items such as lists (which are mutable) cannot be used as keys in a dictionary, but lists can be values in a dictionary. 
The reason is that a dictionary is implemented using a *hashtable* and the keys must be *hashable*. \n", "\n", "**Note about dictionary values**: Dictionary values can be any type (strings, integers, other objects, or even other dictionaries). \n", "\n", "## Dictionary operators \n", "The operators `+` and `*` **do not** work on dictionaries.\n", "\n", "#### Bracket / curly bracket operator\n", "To add items (keys and values) to a dictionary, you can do one of the following: \n", "\n", "1. use the bracket operator `[]`\n", "2. use the curly bracket operator `{}`" ] }, { "cell_type": "code", "collapsed": false, "input": [ "names['JK'] = 'Rowling'\n", "names = {'JK':'Rowling', 'Mark':'Twain', 'William':'Shakespeare', \\\n", " 'George':'Orwell', 'JRR':'Tolkien', 'Emily':'Dickinson', 'Lewis':'Carroll'}" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first line adds a single item with the bracket operator. The second line uses the curly bracket operator to define multiple items at once (replacing the previous contents of `names`), with a colon separating each key from its value and commas separating the items. In the first item, for example, the **key** is the first name 'JK' and the **value** is the last name 'Rowling'. If you want to **lookup** a value given a dictionary and a key, you can use the bracket operator on a **key** to return the **value**. Here we are asking to print the last name of the author named `Emily`. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print names['Emily']" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Dickinson\n" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "names" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "{'Emily': 'Dickinson',\n", " 'George': 'Orwell',\n", " 'JK': 'Rowling',\n", " 'JRR': 'Tolkien',\n", " 'Lewis': 'Carroll',\n", " 'Mark': 'Twain',\n", " 'William': 'Shakespeare'}" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**: You cannot have duplicate keys in a dictionary. Assigning a value to an existing key will over-write the old value. \n", "\n", "**Note**: You cannot **reverse lookup** a key (k) given a dictionary (d) and a value (v). There may be more than one key that maps to a given value, so you may have a list of keys that map to a given value. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "def reverse_lookup(d, v):\n", " for k in d:\n", " if d[k] == v:\n", " return k\n", " raise ValueError, 'value does not appear in the dictionary'\n", "\n", "print reverse_lookup(names, 'Dickinson')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Emily\n" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last line uses the statement `raise` which can take in a detailed error message. Here, if the value does not exist in the dictionary, then we `raise` an error message. 
\n", "\n", "If you want to delete a specific item in a dictionary, you can use the `del` statement:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "del names['Mark']\n", "print names" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{'Lewis': 'Carroll', 'William': 'Shakespeare', 'JRR': 'Tolkien', 'JK': 'Rowling', 'George': 'Orwell', 'Emily': 'Dickinson'}\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### In operator\n", "Similar to strings and lists, the `in` (and `not in`) operator works with dictionaries. The operator checks whether a **key** exists in a dictionary and returns `True` or `False`. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "'Lewis' in names" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "True" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "To check if a value exists in a dictionary, you can use the dictionary method `values()`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "printvals = names.values()\n", "'Lewis' in printvals" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "False" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionary methods\n", "Similar to strings, there is a set of dictionary methods in Python that are useful for manipulating dictionaries. The syntax is the name of the dictionary followed by a dot (or period) followed by the name of the dictionary method. 
\n", "\n", "### List of dictionary methods\n", "* `keys()` = returns all the keys as a list from a dictionary\n", "* `values()` = returns all the values as a list from a dictionary\n", "* `items()` = returns the items in the dictionary as a list of (key, value) tuples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* `get()` = takes in a key and a default value as arguments and returns the value in the dictionary if the key exists, or the second argument if the key does not exist. For example, the key `Jane` does not exist in the `names` dictionary, but `JRR` does. \n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print names.get('Jane', True)\n", "print names.get('JRR', True)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "True\n", "Tolkien\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "* `update()` = takes in another dictionary or a list of key-value tuples and adds them as key-value pairs to an existing dictionary, overwriting the values of keys that already exist\n", "* `clear()` = deletes all items from a dictionary" ] }, { "cell_type": "code", "collapsed": false, "input": [ "names.clear()\n", "print names" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{}\n" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loop through dictionaries\n", "To traverse the keys in a dictionary, you can use a `for` loop (similar to strings and lists): " ] }, { "cell_type": "code", "collapsed": false, "input": [ "names = {'JK':'Rowling', 'Mark':'Twain', 'William':'Shakespeare', \\\n", " 'George':'Orwell', 'JRR':'Tolkien', 'Emily':'Dickinson', 'Lewis':'Carroll'}\n", "\n", "for keys in names:\n", "    print names[keys]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Carroll\n", "Twain\n", "Shakespeare\n", "Tolkien\n", 
"Rowling\n", "Orwell\n", "Dickinson\n" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "A `for` loop over an empty dictionary does not execute anything. \n", "\n", "Another way to loop through the items in a dictionary is to use the dictionary method `items()` which lists all the key-value pairs in the dictionary. Here we assign a name to the keys and a names to the values and then print each parameter. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "for key, val in names.items():\n", " print key, val" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Lewis Carroll\n", "Mark Twain\n", "William Shakespeare\n", "JRR Tolkien\n", "JK Rowling\n", "George Orwell\n", "Emily Dickinson\n" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "print \"keys: \", names.keys()\n", "print \"values: \", names.values()\n", "names.items()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "keys: ['Lewis', 'Mark', 'William', 'JRR', 'JK', 'George', 'Emily']\n", "values: ['Carroll', 'Twain', 'Shakespeare', 'Tolkien', 'Rowling', 'Orwell', 'Dickinson']\n" ] }, { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "[('Lewis', 'Carroll'),\n", " ('Mark', 'Twain'),\n", " ('William', 'Shakespeare'),\n", " ('JRR', 'Tolkien'),\n", " ('JK', 'Rowling'),\n", " ('George', 'Orwell'),\n", " ('Emily', 'Dickinson')]" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 } ], "metadata": {} } ] }