{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from __future__ import print_function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 1\n", "\n", "When talking about floating point, we discussed _machine epsilon_, $\\epsilon$: the smallest number that, when added to 1, gives a result that is still different from 1.\n", "\n", "We'll compute $\\epsilon$ here:\n", "\n", " * Pick an initial guess for $\\epsilon$ of `eps = 1`. \n", "\n", " * Create a loop that checks whether `1 + eps` is different from `1`\n", " \n", " * Each loop iteration, cut the value of `eps` in half\n", " \n", "What value of $\\epsilon$ do you find?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 2\n", "\n", "To iterate over tuples where the _i_-th tuple contains the _i_-th element of each of several sequences, we can use the `zip(*sequences)` function.\n", "\n", "We will iterate over two lists, `names` and `age`, and print out the resulting tuples.\n", "\n", " * Start by initializing lists `names = [\"Mary\", \"John\", \"Sarah\"]` and `age = [21, 56, 98]`.\n", " \n", " * Iterate over the tuples containing a name and an age; the `zip(list1, list2)` function might be useful here.\n", " \n", " * Print out formatted strings of the type \"*NAME is AGE years old*\"." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 3\n", "\n", "The function `enumerate(sequence)` returns tuples containing the indices of the objects in the sequence and the objects themselves. \n", "\n", "The `random` module provides tools for working with random numbers. 
In particular, `random.randint(start, end)` generates a random integer not smaller than `start` and not bigger than `end`.\n", "\n", " * Generate a list of 10 random numbers from 0 to 9.\n", " \n", " * Using the `enumerate(random_list)` function, iterate over the tuples of random numbers and their indices, and print out *\"Match: NUMBER and INDEX\"* if the random number and its index in the list match." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import random\n", "\n", "random_number = random.randint(0, 9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 4\n", "\n", "The Fibonacci sequence is a numerical sequence where each number is the sum of the 2 preceding numbers, e.g., 1, 1, 2, 3, 5, 8, 13, ...\n", "\n", "Create a list where the elements are the terms in the Fibonacci sequence:\n", "\n", " * Start with the list `fib = [1, 1]`\n", " \n", " * Loop 25 times, compute the next term as the sum of the previous 2 terms and append to the list\n", " \n", " * After the loop is complete, print out the terms \n", " \n", "You may find it useful to use `fib[-1]` and `fib[-2]` to access the last two items in the list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 5\n", "\n", "We can use the `input()` function to ask for input from the prompt (note: in Python 2 the function was called `raw_input()`).\n", "\n", "Create an empty list and use a while loop to ask the user for input and append their input to the list. Keep looping until 10 items are added to the list." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 6\n", "\n", "Here is a list of book titles (from http://thegreatestbooks.org). 
Loop through the list and capitalize each word in each title. You might find the string method `.capitalize()` useful." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "titles = [\"don quixote\", \n", " \"in search of lost time\", \n", " \"ulysses\", \n", " \"the odyssey\", \n", " \"war and peace\", \n", " \"moby dick\", \n", " \"the divine comedy\", \n", " \"hamlet\", \n", " \"the adventures of huckleberry finn\", \n", " \"the great gatsby\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q 7\n", "\n", "Here's some text (the Gettysburg Address). Our goal is to count how many times each word appears. We'll do a brute force method first, and then we'll look at ways to do it more efficiently (and compactly)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "gettysburg_address = \"\"\"\n", "Four score and seven years ago our fathers brought forth on this continent, \n", "a new nation, conceived in Liberty, and dedicated to the proposition that \n", "all men are created equal.\n", "\n", "Now we are engaged in a great civil war, testing whether that nation, or \n", "any nation so conceived and so dedicated, can long endure. We are met on\n", "a great battle-field of that war. We have come to dedicate a portion of\n", "that field, as a final resting place for those who here gave their lives\n", "that that nation might live. It is altogether fitting and proper that we\n", "should do this.\n", "\n", "But, in a larger sense, we can not dedicate -- we can not consecrate -- we\n", "can not hallow -- this ground. The brave men, living and dead, who struggled\n", "here, have consecrated it, far above our poor power to add or detract. 
The\n", "world will little note, nor long remember what we say here, but it can never\n", "forget what they did here. It is for us the living, rather, to be dedicated\n", "here to the unfinished work which they who fought here have thus far so nobly\n", "advanced. It is rather for us to be here dedicated to the great task remaining\n", "before us -- that from these honored dead we take increased devotion to that\n", "cause for which they gave the last full measure of devotion -- that we here\n", "highly resolve that these dead shall not have died in vain -- that this\n", "nation, under God, shall have a new birth of freedom -- and that government\n", "of the people, by the people, for the people, shall not perish from the earth.\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We've already seen that the `.split()` method will, by default, split on whitespace (spaces and newlines), so it will split this into words, producing a list:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "ga = gettysburg_address.split()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ga" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, the next problem is that some of these still have punctuation. In particular, we see \"`.`\", \"`,`\", and \"`--`\".\n", "\n", "When considering a word, we can get rid of these by using the `replace()` method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = \"end.,\"\n", "b = a.replace(\".\", \"\").replace(\",\", \"\")\n", "b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another problem is case: we want to count \"but\" and \"But\" as the same. 
Strings have a `lower()` method that can be used to convert a string to lowercase:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = \"But\"\n", "b = \"but\"\n", "a == b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a.lower() == b.lower()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall that strings are immutable, so `replace()` and `lower()` return a new string rather than modifying the original." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Your task\n", "\n", "Create a dictionary that uses the unique words as keys and has as a value the number of times that word appears. \n", "\n", "Write a loop over the words in the string (using our split version) and do the following:\n", " * remove any punctuation\n", " * convert to lowercase\n", " * test if the word is already a key in the dictionary (using the `in` operator)\n", " - if the key exists, increment the word count for that key\n", " - otherwise, add it to the dictionary with a count of `1`.\n", "\n", "At the end, print out the words and a count of how many times they appear." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# your code here\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More compact way\n", "\n", "We can actually do this a lot more compactly by using a list comprehension and another Python datatype called a set. A set is a collection of items in which each item is unique (i.e., no repetitions)."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's a list comprehension that removes the periods and commas and converts to lowercase:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "words = [q.lower().replace(\".\", \"\").replace(\",\", \"\") for q in ga]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By using the `set()` function, we turn the list into a set, removing any duplicates:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "unique_words = set(words)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can loop over the unique words and use the `count()` method of a list to find how many times each one appears:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "count = {}\n", "for uw in unique_words:\n", "    count[uw] = words.count(uw)\n", "\n", "count" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even shorter: we can use a dictionary comprehension, which works like a list comprehension:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "c = {uw: words.count(uw) for uw in unique_words}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.2" } }, "nbformat": 4, "nbformat_minor": 1 }