{ "metadata": { "name": "", "signature": "sha256:12ddcb2f87be32e08c8867c57134f3197e5f2cb030a6e356f8cb79f025cbd7f5" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#Homework assignment 2\n", "\n", "To turn in this assignment, use the same methodology as we used last week. (Download a copy of this notebook, fill in the blanks, and e-mail to Dan.)\n", "\n", "##Problem set 1: Working with dictionaries\n", "\n", "In the following code cell, I've made a dictionary mapping the names of several states to their capitals, called `state_capitals`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "state_capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix'}" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the blank below, write an expression that evaluates to `Juneau`, using square brackets to get a value from the dictionary." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, write an expression that evaluates to the number of keys in the dictionary." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following code cell, I've made a list of strings and assigned it to a variable called `cheeses`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "cheeses = [\"cheddar\", \"emmental\", \"gouda\", \"brie\", \"camembert\"]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the blank below, I've provided the skeleton of a `for` loop. 
Replace the `???` in the `for` loop with a statement that will cause the `for` loop to fill in the blank dictionary `cheese_name_lengths`, such that the dictionary has a key for every string in the `cheeses` list, and each key maps to a value that is the length of that string. The final line of the code compares `cheese_name_lengths` to the known correct value for the dictionary; when you run the code cell, it should print out `True`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "cheese_name_lengths = {}\n", "for cheese in cheeses:\n", " ???\n", "print cheese_name_lengths == {'emmental': 8, 'gouda': 5, 'cheddar': 7, 'brie': 4, 'camembert': 9}" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##Problem set 2: the New York Times API\n", "\n", "This one is tough, but I have faith in you. You're smart, and capable, and the outfit you're wearing for doing homework in is *great*.\n", "\n", "Get a key for the [Campaign Finance API](http://developer.nytimes.com/docs/campaign_finance_api). Write a Python program in the cell below that calculates and prints out the *total dollar amount* of presidential campaign contributions from contributors in New York state, to any candidate, in the 2012 election cycle. (Hint: Use the [Presidential State/Zip URI structure](http://developer.nytimes.com/docs/campaign_finance_api#h3-pres-state-zip). Make use of the [API tool](http://prototype.nytimes.com/gst/apitool/index.html) as appropriate.) I've already filled in the appropriate `import` statements for you." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib\n", "import json\n", "\n", "# your code here!" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##Problem set 3: Working with strings\n", "\n", "In the cell below, I've created a list of strings and assigned it to a variable `capitalize_me`." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "capitalize_me = ['an abacus', 'bitter beefsteak', 'comfy culottes']" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following blank code cell, write a short program (or a single expression!) that evaluates to another list, containing copies of these strings with their first letter capitalized. In other words, your filled-in code cell should display this when you run it:\n", "\n", " ['An abacus', 'Bitter beefsteak', 'Comfy culottes']\n", "\n", "Use string slices and the `.upper()` method in your solution." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##Problem set 4: Regular expressions\n", "\n", "We're going to work with the Enron e-mail subject lines in this problem set. Make sure you have a copy of the corpus downloaded to your machine by running the following code cell:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib\n", "urllib.urlretrieve(\"https://raw.githubusercontent.com/ledeprogram/courses/master/databases/data/enronsubjects.txt\", \"enronsubjects.txt\")\n", "subjects = [x.strip() for x in open(\"enronsubjects.txt\").readlines()]\n", "all_subjects = open(\"enronsubjects.txt\").read()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The variable `subjects` now contains a list, with each item in the list being a string that has a single subject line in it. The `all_subjects` variable contains a big string with all of the subject lines in it.\n", "\n", "In the following cell, write a list comprehension that evaluates to a list of all subject lines that contain a US phone number (i.e., in the format 555-555-1212). 
Use the `re.search()` function to accomplish this task. (Hint: there should be 28 of them.) I've included the appropriate `import` statement for you." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import re\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now use the `re.findall()` function to create an expression that evaluates to a list of *just* the phone numbers." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }