{ "metadata": { "name": "", "signature": "sha256:6ba9e9f0c380011dd2bf526ce07725052f2508ccc285c46fdb0a2e73adbe62a6" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#0. Reading In Data Files (from Gutenberg URLs)\n", "\n", "### Lynn Cherny, 2/15, arnicas@gmail \n", "Full repo here: https://github.com/arnicas/NLP-in-Python\n", "\n", "*Some code here is inspired or borrowed from http://nbviewer.ipython.org/github/sgsinclair/alta/blob/master/ipynb/GettingTexts.ipynb*\n", "\n", "Start with some example URLs from [Gutenberg](https://www.gutenberg.org/), then go find your own links to text files you want to add!" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib\n", "\n", "poeUrl = \"http://www.gutenberg.org/cache/epub/2147/pg2147.txt\"\n", "grimmsUrl = \"https://www.gutenberg.org/cache/epub/11027/pg11027.txt\"\n", "andersonsUrl = \"https://www.gutenberg.org/cache/epub/1597/pg1597.txt\"\n", "irishFairyUrl = \"https://www.gutenberg.org/cache/epub/32202/pg32202.txt\"\n", "eliotPoemsUrl = \"https://www.gutenberg.org/cache/epub/1567/pg1567.txt\"\n", "rosettiPoemsUrl = \"https://www.gutenberg.org/cache/epub/19188/pg19188.txt\"\n", "lovecraftUrl = \"https://www.gutenberg.org/cache/epub/31469/pg31469.txt\"\n", "mrjamesUrl = \"https://www.gutenberg.org/cache/epub/8486/pg8486.txt\"\n", "## add your own here - go to https://www.gutenberg.org/ and navigate to a .txt file page!" 
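], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The `import urllib` above exposes `urlopen` directly only on Python 2; on Python 3 it lives in `urllib.request`. A minimal sketch of an import that should work on either version follows. The helper name `fetch_text` is made up for illustration and is not part of the repo, and calling it needs network access." ] },
{ "cell_type": "code", "collapsed": false, "input": [
"try:\n",
"    from urllib.request import urlopen  # Python 3\n",
"except ImportError:\n",
"    from urllib import urlopen  # Python 2\n",
"\n",
"def fetch_text(url):\n",
"    \"\"\" Hypothetical helper: return the decoded text found at url \"\"\"\n",
"    response = urlopen(url)\n",
"    try:\n",
"        # decode as utf-8 and drop any bytes that cannot be decoded\n",
"        return response.read().decode('utf-8', 'ignore')\n",
"    finally:\n",
"        response.close()\n",
"\n",
"# Example (not run here): text = fetch_text(poeUrl)"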
], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "I already included several books I wanted to use in the data/books directory, but you can add your own:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ls -al data/books" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "total 3760\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "drwxr-xr-x 8 lynn staff 272 Feb 3 18:28 \u001b[34m.\u001b[m\u001b[m/\r\n", "drwxr-xr-x 7 lynn staff 238 Feb 12 13:22 \u001b[34m..\u001b[m\u001b[m/\r\n", "-rw-r--r-- 1 lynn staff 305130 Feb 2 18:13 anderson.txt\r\n", "-rw-r--r-- 1 lynn staff 270499 Feb 3 15:42 grimms.txt\r\n", "-rw-r--r-- 1 lynn staff 483731 Feb 2 18:07 irishfairy.txt\r\n", "-rw-r--r-- 1 lynn staff 65283 Feb 3 14:28 lovecraft.txt\r\n", "-rw-r--r-- 1 lynn staff 255457 Feb 12 11:24 mrjames.txt\r\n", "-rw-r--r-- 1 lynn staff 530313 Feb 2 18:07 poe.txt\r\n" ] } ], "prompt_number": 18 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "A function to download, and then a utility to clean headers/footers!" 
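] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Project Gutenberg text files usually wrap the book between marker lines that begin with `***` and contain `START OF` and `END OF`. As a rough Python sketch of the cleaning step (an assumption about the file layout, not a port of the repo's Perl script), the hypothetical helper below keeps only the lines between those markers and returns the text unchanged when it cannot find both." ] },
{ "cell_type": "code", "collapsed": false, "language": "python", "metadata": {}, "outputs": [], "input": [
"def strip_gutenberg(text):\n",
"    \"\"\" Hypothetical helper: keep only the lines between the START and END markers \"\"\"\n",
"    lines = text.splitlines()\n",
"    start, end = None, None\n",
"    for i, line in enumerate(lines):\n",
"        if line.startswith('***') and 'START OF' in line:\n",
"            start = i + 1  # the book begins on the line after the marker\n",
"        elif line.startswith('***') and 'END OF' in line:\n",
"            end = i  # the book ends on the line before the marker\n",
"            break\n",
"    if start is None or end is None:\n",
"        return text  # markers not found: leave the text untouched\n",
"    return '\\n'.join(lines[start:end]).strip()\n",
"\n",
"# Toy input, made up for illustration:\n",
"sample = '\\n'.join([\n",
"    'License header',\n",
"    '*** START OF THIS PROJECT GUTENBERG EBOOK SAMPLE ***',\n",
"    'Once upon a time.',\n",
"    '*** END OF THIS PROJECT GUTENBERG EBOOK SAMPLE ***',\n",
"    'License footer',\n",
"])\n",
"print(strip_gutenberg(sample))"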
] }, { "cell_type": "code", "collapsed": false, "input": [ "# Download a book with a single function call:\n", "def downloadGut(urlstring, filename):\n", "    \"\"\" Download the txt file at urlstring (a Gutenberg URL) and save it as filename \"\"\"\n", "    import urllib\n", "\n", "    req = urllib.urlopen(urlstring)\n", "    fileString = req.read().decode('utf-8', 'ignore')\n", "    with file(filename, 'w') as handle:\n", "        handle.write(fileString.encode('ascii','ignore'))\n", "    print 'Made file ', filename, ' -- now strip the boilerplate by hand or with utils/stripgutenberg.pl'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 19 }, { "cell_type": "code", "collapsed": true, "input": [ "filename = 'mrjames.txt'\n", "newfile = 'data/books/' + filename\n", "\n", "# download the book\n", "downloadGut(mrjamesUrl, filename)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Made file mrjames.txt -- now strip the boilerplate by hand or with utils/stripgutenberg.pl\n" ] } ], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Now we strip headers and footers. 
We can refer to Python variables in the shell command with a $ prefix:**" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!perl utils/stripgutenberg.pl < $filename > $newfile" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since I put the initial full download in the current dir, delete it now:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!rm $filename" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 22 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Downloading Corpora for NLTK" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import nltk" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "nltk.download() #- find the popup window, go to the corpora tab" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "showing info http://nltk.github.com/nltk_data/\n" ] }, { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "True" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "You need a bunch of things from NLTK for the tutorial files. I'd suggest getting \"book\" from the Collections tab.\n", "" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Appendix: Break Gutenberg Files into Chapters / Stories" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "In the data/stories directory, I already gave you the fairy tales as separate files. 
Here's how I did it:\n", "Just copy from the TOC in Andersen first and assign to a var:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "storiesString = \"\"\"\n", " The Emperor's New Clothes\n", " The Swineherd\n", " The Real Princess\n", " The Shoes of Fortune\n", " The Fir Tree\n", " The Snow Queen\n", " The Leap-Frog\n", " The Elderbush\n", " The Bell\n", " The Old House\n", " The Happy Family\n", " The Story of a Mother\n", " The False Collar\n", " The Shadow\n", " The Little Match Girl\n", " The Dream of Little Tuk\n", " The Naughty Boy\n", " The Red Shoes\n", " \"\"\"" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "stories = storiesString.split('\\n')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "stories" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 25, "text": [ "['',\n", " \" The Emperor's New Clothes\",\n", " ' The Swineherd',\n", " ' The Real Princess',\n", " ' The Shoes of Fortune',\n", " ' The Fir Tree',\n", " ' The Snow Queen',\n", " ' The Leap-Frog',\n", " ' The Elderbush',\n", " ' The Bell',\n", " ' The Old House',\n", " ' The Happy Family',\n", " ' The Story of a Mother',\n", " ' The False Collar',\n", " ' The Shadow',\n", " ' The Little Match Girl',\n", " ' The Dream of Little Tuk',\n", " ' The Naughty Boy',\n", " ' The Red Shoes',\n", " ' ']" ] } ], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "# want to match the strings in the body of the file, to search for story boundaries... 
\n", "# Beware, in Grimm's the TOC didn't match the in-file strings exactly :(\n", "\n", "stories = [story.strip().upper() for story in stories if story.strip()]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "stories" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 27, "text": [ "[\"THE EMPEROR'S NEW CLOTHES\",\n", " 'THE SWINEHERD',\n", " 'THE REAL PRINCESS',\n", " 'THE SHOES OF FORTUNE',\n", " 'THE FIR TREE',\n", " 'THE SNOW QUEEN',\n", " 'THE LEAP-FROG',\n", " 'THE ELDERBUSH',\n", " 'THE BELL',\n", " 'THE OLD HOUSE',\n", " 'THE HAPPY FAMILY',\n", " 'THE STORY OF A MOTHER',\n", " 'THE FALSE COLLAR',\n", " 'THE SHADOW',\n", " 'THE LITTLE MATCH GIRL',\n", " 'THE DREAM OF LITTLE TUK',\n", " 'THE NAUGHTY BOY',\n", " 'THE RED SHOES']" ] } ], "prompt_number": 27 }, { "cell_type": "code", "collapsed": false, "input": [ "def pairwise(iterable):\n", " \"\"\" Borrowed from Itertools's excellent page of examples:\n", " A utility to make pairs from the story list - (story1, story2), (story2, story3)... 
\n", " \"\"\"\n", " import itertools\n", " # s -> (s0,s1), (s1,s2), (s2, s3), ...\n", " a, b = itertools.tee(iterable)\n", " next(b, None)\n", " return itertools.izip(a, b)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 28 }, { "cell_type": "code", "collapsed": false, "input": [ "storypairs = list(pairwise(stories)) # it returns an iterator, so you have to make it a list" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "storypairs.append(('THE RED SHOES', '')) # add a special last story pair" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 30 }, { "cell_type": "code", "collapsed": false, "input": [ "storypairs" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 31, "text": [ "[(\"THE EMPEROR'S NEW CLOTHES\", 'THE SWINEHERD'),\n", " ('THE SWINEHERD', 'THE REAL PRINCESS'),\n", " ('THE REAL PRINCESS', 'THE SHOES OF FORTUNE'),\n", " ('THE SHOES OF FORTUNE', 'THE FIR TREE'),\n", " ('THE FIR TREE', 'THE SNOW QUEEN'),\n", " ('THE SNOW QUEEN', 'THE LEAP-FROG'),\n", " ('THE LEAP-FROG', 'THE ELDERBUSH'),\n", " ('THE ELDERBUSH', 'THE BELL'),\n", " ('THE BELL', 'THE OLD HOUSE'),\n", " ('THE OLD HOUSE', 'THE HAPPY FAMILY'),\n", " ('THE HAPPY FAMILY', 'THE STORY OF A MOTHER'),\n", " ('THE STORY OF A MOTHER', 'THE FALSE COLLAR'),\n", " ('THE FALSE COLLAR', 'THE SHADOW'),\n", " ('THE SHADOW', 'THE LITTLE MATCH GIRL'),\n", " ('THE LITTLE MATCH GIRL', 'THE DREAM OF LITTLE TUK'),\n", " ('THE DREAM OF LITTLE TUK', 'THE NAUGHTY BOY'),\n", " ('THE NAUGHTY BOY', 'THE RED SHOES'),\n", " ('THE RED SHOES', '')]" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "def get_story(storypair, text):\n", "    print storypair[0], \",\", storypair[1]\n", "    start = text.find(storypair[0])\n", "    if storypair[1] != '': # every pair except the last has a following title\n", "        end = text.find(storypair[1])\n", "        print start, end\n", "        storyString = text[start:end]\n", "    else:\n", "        storyString = text[start:] # special case for last file\n", "    return storyString" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "# read in the text file for the whole collection:\n", "\n", "with file('data/books/anderson.txt') as handle:\n", "    anderson = handle.read()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "len(anderson)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 34, "text": [ "305130" ] } ], "prompt_number": 34 }, { "cell_type": "code", "collapsed": false, "input": [ "# Illustration of it working:\n", "get_story(storypairs[0], anderson)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "THE EMPEROR'S NEW CLOTHES , THE SWINEHERD\n", "505 11181\n" ] }, { "metadata": {}, "output_type": "pyout", "prompt_number": 35, "text": [ "'THE EMPEROR\\'S NEW CLOTHES\\r\\n\\r\\nMany years ago, there was an Emperor, who was so excessively fond of\\r\\nnew clothes, that he spent all his money in dress. He did not trouble\\r\\nhimself in the least about his soldiers; nor did he care to go either to\\r\\nthe theatre or the chase, except for the opportunities then afforded him\\r\\nfor displaying his new clothes. He had a different suit for each hour of\\r\\nthe day; and as of any other king or emperor, one is accustomed to say,\\r\\n\"he is sitting in council,\" it was always said of him, \"The Emperor is\\r\\nsitting in his wardrobe.\"\\r\\n\\r\\nTime passed merrily in the large town which was his capital; strangers\\r\\narrived every day at the court. One day, two rogues, calling themselves\\r\\nweavers, made their appearance. 
They gave out that they knew how to\\r\\nweave stuffs of the most beautiful colors and elaborate patterns, the\\r\\nclothes manufactured from which should have the wonderful property of\\r\\nremaining invisible to everyone who was unfit for the office he held, or\\r\\nwho was extraordinarily simple in character.\\r\\n\\r\\n\"These must, indeed, be splendid clothes!\" thought the Emperor. \"Had I\\r\\nsuch a suit, I might at once find out what men in my realms are unfit\\r\\nfor their office, and also be able to distinguish the wise from the\\r\\nfoolish! This stuff must be woven for me immediately.\" And he caused\\r\\nlarge sums of money to be given to both the weavers in order that they\\r\\nmight begin their work directly.\\r\\n\\r\\nSo the two pretended weavers set up two looms, and affected to work very\\r\\nbusily, though in reality they did nothing at all. They asked for the\\r\\nmost delicate silk and the purest gold thread; put both into their own\\r\\nknapsacks; and then continued their pretended work at the empty looms\\r\\nuntil late at night.\\r\\n\\r\\n\"I should like to know how the weavers are getting on with my cloth,\"\\r\\nsaid the Emperor to himself, after some little time had elapsed; he was,\\r\\nhowever, rather embarrassed, when he remembered that a simpleton, or\\r\\none unfit for his office, would be unable to see the manufacture. To be\\r\\nsure, he thought he had nothing to risk in his own person; but yet, he\\r\\nwould prefer sending somebody else, to bring him intelligence about the\\r\\nweavers, and their work, before he troubled himself in the affair. 
All\\r\\nthe people throughout the city had heard of the wonderful property the\\r\\ncloth was to possess; and all were anxious to learn how wise, or how\\r\\nignorant, their neighbors might prove to be.\\r\\n\\r\\n\"I will send my faithful old minister to the weavers,\" said the Emperor\\r\\nat last, after some deliberation, \"he will be best able to see how the\\r\\ncloth looks; for he is a man of sense, and no one can be more suitable\\r\\nfor his office than he is.\"\\r\\n\\r\\nSo the faithful old minister went into the hall, where the knaves were\\r\\nworking with all their might, at their empty looms. \"What can be the\\r\\nmeaning of this?\" thought the old man, opening his eyes very wide. \"I\\r\\ncannot discover the least bit of thread on the looms.\" However, he did\\r\\nnot express his thoughts aloud.\\r\\n\\r\\nThe impostors requested him very courteously to be so good as to come\\r\\nnearer their looms; and then asked him whether the design pleased\\r\\nhim, and whether the colors were not very beautiful; at the same time\\r\\npointing to the empty frames. The poor old minister looked and looked,\\r\\nhe could not discover anything on the looms, for a very good reason,\\r\\nviz: there was nothing there. \"What!\" thought he again. \"Is it possible\\r\\nthat I am a simpleton? I have never thought so myself; and no one must\\r\\nknow it now if I am so. Can it be, that I am unfit for my office? No,\\r\\nthat must not be said either. I will never confess that I could not see\\r\\nthe stuff.\"\\r\\n\\r\\n\"Well, Sir Minister!\" said one of the knaves, still pretending to work.\\r\\n\"You do not say whether the stuff pleases you.\"\\r\\n\\r\\n\"Oh, it is excellent!\" replied the old minister, looking at the loom\\r\\nthrough his spectacles. 
\"This pattern, and the colors, yes, I will tell\\r\\nthe Emperor without delay, how very beautiful I think them.\"\\r\\n\\r\\n\"We shall be much obliged to you,\" said the impostors, and then they\\r\\nnamed the different colors and described the pattern of the pretended\\r\\nstuff. The old minister listened attentively to their words, in order\\r\\nthat he might repeat them to the Emperor; and then the knaves asked for\\r\\nmore silk and gold, saying that it was necessary to complete what\\r\\nthey had begun. However, they put all that was given them into their\\r\\nknapsacks; and continued to work with as much apparent diligence as\\r\\nbefore at their empty looms.\\r\\n\\r\\nThe Emperor now sent another officer of his court to see how the men\\r\\nwere getting on, and to ascertain whether the cloth would soon be\\r\\nready. It was just the same with this gentleman as with the minister;\\r\\nhe surveyed the looms on all sides, but could see nothing at all but the\\r\\nempty frames.\\r\\n\\r\\n\"Does not the stuff appear as beautiful to you, as it did to my lord the\\r\\nminister?\" asked the impostors of the Emperor\\'s second ambassador; at\\r\\nthe same time making the same gestures as before, and talking of the\\r\\ndesign and colors which were not there.\\r\\n\\r\\n\"I certainly am not stupid!\" thought the messenger. \"It must be, that I\\r\\nam not fit for my good, profitable office! That is very odd; however, no\\r\\none shall know anything about it.\" And accordingly he praised the stuff\\r\\nhe could not see, and declared that he was delighted with both colors\\r\\nand patterns. 
\"Indeed, please your Imperial Majesty,\" said he to his\\r\\nsovereign when he returned, \"the cloth which the weavers are preparing\\r\\nis extraordinarily magnificent.\"\\r\\n\\r\\nThe whole city was talking of the splendid cloth which the Emperor had\\r\\nordered to be woven at his own expense.\\r\\n\\r\\nAnd now the Emperor himself wished to see the costly manufacture, while\\r\\nit was still in the loom. Accompanied by a select number of officers of\\r\\nthe court, among whom were the two honest men who had already admired\\r\\nthe cloth, he went to the crafty impostors, who, as soon as they were\\r\\naware of the Emperor\\'s approach, went on working more diligently than\\r\\never; although they still did not pass a single thread through the\\r\\nlooms.\\r\\n\\r\\n\"Is not the work absolutely magnificent?\" said the two officers of the\\r\\ncrown, already mentioned. \"If your Majesty will only be pleased to look\\r\\nat it! What a splendid design! What glorious colors!\" and at the same\\r\\ntime they pointed to the empty frames; for they imagined that everyone\\r\\nelse could see this exquisite piece of workmanship.\\r\\n\\r\\n\"How is this?\" said the Emperor to himself. \"I can see nothing! This\\r\\nis indeed a terrible affair! Am I a simpleton, or am I unfit to be an\\r\\nEmperor? That would be the worst thing that could happen--Oh! the cloth\\r\\nis charming,\" said he, aloud. \"It has my complete approbation.\" And he\\r\\nsmiled most graciously, and looked closely at the empty looms; for on no\\r\\naccount would he say that he could not see what two of the officers of\\r\\nhis court had praised so much. All his retinue now strained their eyes,\\r\\nhoping to discover something on the looms, but they could see no more\\r\\nthan the others; nevertheless, they all exclaimed, \"Oh, how beautiful!\"\\r\\nand advised his majesty to have some new clothes made from this splendid\\r\\nmaterial, for the approaching procession. \"Magnificent! 
Charming!\\r\\nExcellent!\" resounded on all sides; and everyone was uncommonly gay. The\\r\\nEmperor shared in the general satisfaction; and presented the impostors\\r\\nwith the riband of an order of knighthood, to be worn in their\\r\\nbutton-holes, and the title of \"Gentlemen Weavers.\"\\r\\n\\r\\nThe rogues sat up the whole of the night before the day on which the\\r\\nprocession was to take place, and had sixteen lights burning, so that\\r\\neveryone might see how anxious they were to finish the Emperor\\'s new\\r\\nsuit. They pretended to roll the cloth off the looms; cut the air with\\r\\ntheir scissors; and sewed with needles without any thread in them.\\r\\n\"See!\" cried they, at last. \"The Emperor\\'s new clothes are ready!\"\\r\\n\\r\\nAnd now the Emperor, with all the grandees of his court, came to the\\r\\nweavers; and the rogues raised their arms, as if in the act of holding\\r\\nsomething up, saying, \"Here are your Majesty\\'s trousers! Here is the\\r\\nscarf! Here is the mantle! The whole suit is as light as a cobweb;\\r\\none might fancy one has nothing at all on, when dressed in it; that,\\r\\nhowever, is the great virtue of this delicate cloth.\"\\r\\n\\r\\n\"Yes indeed!\" said all the courtiers, although not one of them could see\\r\\nanything of this exquisite manufacture.\\r\\n\\r\\n\"If your Imperial Majesty will be graciously pleased to take off your\\r\\nclothes, we will fit on the new suit, in front of the looking glass.\"\\r\\n\\r\\nThe Emperor was accordingly undressed, and the rogues pretended to\\r\\narray him in his new suit; the Emperor turning round, from side to side,\\r\\nbefore the looking glass.\\r\\n\\r\\n\"How splendid his Majesty looks in his new clothes, and how well they\\r\\nfit!\" everyone cried out. \"What a design! What colors! 
These are indeed\\r\\nroyal robes!\"\\r\\n\\r\\n\"The canopy which is to be borne over your Majesty, in the procession,\\r\\nis waiting,\" announced the chief master of the ceremonies.\\r\\n\\r\\n\"I am quite ready,\" answered the Emperor. \"Do my new clothes fit well?\"\\r\\nasked he, turning himself round again before the looking glass, in order\\r\\nthat he might appear to be examining his handsome suit.\\r\\n\\r\\nThe lords of the bedchamber, who were to carry his Majesty\\'s train felt\\r\\nabout on the ground, as if they were lifting up the ends of the mantle;\\r\\nand pretended to be carrying something; for they would by no means\\r\\nbetray anything like simplicity, or unfitness for their office.\\r\\n\\r\\nSo now the Emperor walked under his high canopy in the midst of the\\r\\nprocession, through the streets of his capital; and all the people\\r\\nstanding by, and those at the windows, cried out, \"Oh! How beautiful\\r\\nare our Emperor\\'s new clothes! What a magnificent train there is to\\r\\nthe mantle; and how gracefully the scarf hangs!\" in short, no one would\\r\\nallow that he could not see these much-admired clothes; because, in\\r\\ndoing so, he would have declared himself either a simpleton or unfit\\r\\nfor his office. Certainly, none of the Emperor\\'s various suits, had ever\\r\\nmade so great an impression, as these invisible ones.\\r\\n\\r\\n\"But the Emperor has nothing at all on!\" said a little child.\\r\\n\\r\\n\"Listen to the voice of innocence!\" exclaimed his father; and what the\\r\\nchild had said was whispered from one to another.\\r\\n\\r\\n\"But he has nothing at all on!\" at last cried out all the people.\\r\\nThe Emperor was vexed, for he knew that the people were right; but he\\r\\nthought the procession must go on now! 
And the lords of the bedchamber\\r\\ntook greater pains than ever, to appear holding up a train, although, in\\r\\nreality, there was no train to hold.\\r\\n\\r\\n\\r\\n\\r\\n\\r\\n'" ] } ], "prompt_number": 35 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Make sure you have a data/stories directory before running the loop below.**" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for storypair in storypairs:\n", "    string = get_story(storypair, anderson)\n", "    with file('data/stories/A_' + storypair[0] + '.txt', 'w') as handle:\n", "        handle.write(string)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "THE EMPEROR'S NEW CLOTHES , THE SWINEHERD\n", "505 11181\n", "THE SWINEHERD" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " , THE REAL PRINCESS\n", "11181 19569\n", "THE REAL PRINCESS" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " , THE SHOES OF FORTUNE\n", "19569 21820\n", "THE SHOES OF FORTUNE" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " , THE FIR TREE\n", "21820 93861\n", "THE FIR TREE , THE SNOW QUEEN\n", "93861 111425\n", "THE SNOW QUEEN" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " , THE LEAP-FROG\n", "111425 176347\n", "THE LEAP-FROG , THE ELDERBUSH\n", "176347 180064\n", "THE ELDERBUSH , THE BELL\n", "180064 196453\n", "THE BELL , THE OLD HOUSE\n", "196453 207381\n", "THE OLD HOUSE , THE HAPPY FAMILY\n", "207381 223621\n", "THE HAPPY FAMILY , THE STORY OF A MOTHER\n", "223621 230679\n", "THE STORY OF A MOTHER , THE FALSE COLLAR\n", "230679 241097\n", "THE FALSE COLLAR , THE SHADOW\n", "241097 245834\n", "THE SHADOW , THE LITTLE MATCH GIRL\n", "245834 272667\n", "THE LITTLE MATCH GIRL , THE DREAM OF LITTLE TUK\n", "272667 278273\n", "THE DREAM OF LITTLE TUK , THE NAUGHTY BOY\n", "278273 288643\n", "THE NAUGHTY BOY , THE RED SHOES\n", "288643 293064\n", "THE RED SHOES , \n" ] } ], "prompt_number": 36 }, { "cell_type": "code", 
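"collapsed": false, "input": [
"# TOC titles do not always match the strings in the body (see the Grimm's note earlier),\n",
"# and text.find() returns -1 for a missing title, which silently gives a wrong slice.\n",
"# It also returns only the first match, so a title that occurs twice can cut a story short.\n",
"# A sketch of a sanity check for the title list; check_titles is a made-up helper name\n",
"# and the toy data below is only for illustration.\n",
"def check_titles(titles, text):\n",
"    \"\"\" Return (missing, repeated): titles found zero times / more than once in text \"\"\"\n",
"    missing = [t for t in titles if text.count(t) == 0]\n",
"    repeated = [t for t in titles if text.count(t) > 1]\n",
"    return missing, repeated\n",
"\n",
"toy_text = 'THE BELL ... THE SHADOW ... THE BELL again'\n",
"missing, repeated = check_titles(['THE BELL', 'THE SHADOW', 'THE RED SHOES'], toy_text)\n",
"print(missing)   # titles that never appear in the text\n",
"print(repeated)  # titles that appear more than once\n",
"# With the notebook's variables this would be: check_titles(stories, anderson)"
], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", 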
"collapsed": false, "input": [ "ls -al data/stories" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "total 1296\r\n", "drwxr-xr-x 46 lynn staff 1564 Feb 3 15:44 \u001b[34m.\u001b[m\u001b[m/\r\n", "drwxr-xr-x 7 lynn staff 238 Feb 12 13:22 \u001b[34m..\u001b[m\u001b[m/\r\n", "-rw-r--r--@ 1 lynn staff 6148 Feb 3 15:44 .DS_Store\r\n", "-rw-r--r-- 1 lynn staff 10928 Feb 13 10:02 A_THE BELL.txt\r\n", "-rw-r--r-- 1 lynn staff 10370 Feb 13 10:02 A_THE DREAM OF LITTLE TUK.txt\r\n", "-rw-r--r-- 1 lynn staff 16389 Feb 13 10:02 A_THE ELDERBUSH.txt\r\n", "-rw-r--r-- 1 lynn staff 10676 Feb 13 10:02 A_THE EMPEROR'S NEW CLOTHES.txt\r\n", "-rw-r--r-- 1 lynn staff 4737 Feb 13 10:02 A_THE FALSE COLLAR.txt\r\n", "-rw-r--r-- 1 lynn staff 17564 Feb 13 10:02 A_THE FIR TREE.txt\r\n", "-rw-r--r-- 1 lynn staff 7058 Feb 13 10:02 A_THE HAPPY FAMILY.txt\r\n", "-rw-r--r-- 1 lynn staff 3717 Feb 13 10:02 A_THE LEAP-FROG.txt\r\n", "-rw-r--r-- 1 lynn staff 5606 Feb 13 10:02 A_THE LITTLE MATCH GIRL.txt\r\n", "-rw-r--r-- 1 lynn staff 4421 Feb 13 10:02 A_THE NAUGHTY BOY.txt\r\n", "-rw-r--r-- 1 lynn staff 16240 Feb 13 10:02 A_THE OLD HOUSE.txt\r\n", "-rw-r--r-- 1 lynn staff 2251 Feb 13 10:02 A_THE REAL PRINCESS.txt\r\n", "-rw-r--r-- 1 lynn staff 12066 Feb 13 10:02 A_THE RED SHOES.txt\r\n", "-rw-r--r-- 1 lynn staff 26833 Feb 13 10:02 A_THE SHADOW.txt\r\n", "-rw-r--r-- 1 lynn staff 72041 Feb 13 10:02 A_THE SHOES OF FORTUNE.txt\r\n", "-rw-r--r-- 1 lynn staff 64922 Feb 13 10:02 A_THE SNOW QUEEN.txt\r\n", "-rw-r--r-- 1 lynn staff 10418 Feb 13 10:02 A_THE STORY OF A MOTHER.txt\r\n", "-rw-r--r-- 1 lynn staff 8388 Feb 13 10:02 A_THE SWINEHERD.txt\r\n", "-rw-r--r-- 1 lynn staff 10562 Feb 3 15:44 G_BEARSKIN.txt\r\n", "-rw-r--r-- 1 lynn staff 6256 Feb 3 15:44 G_BRIAR ROSE.txt\r\n", "-rw-r--r-- 1 lynn staff 13518 Feb 3 15:44 G_CATHERINE AND FREDERICK.txt\r\n", "-rw-r--r-- 1 lynn staff 10311 Feb 3 15:44 G_CINDERELLA.txt\r\n", "-rw-r--r-- 1 lynn staff 
7033 Feb 3 15:44 G_DUMMLING AND THE THREE FEATHERS.txt\r\n", "-rw-r--r-- 1 lynn staff 16374 Feb 3 15:44 G_FAITHFUL JOHN.txt\r\n", "-rw-r--r-- 1 lynn staff 14884 Feb 3 15:44 G_HANSEL AND GRETHEL.txt\r\n", "-rw-r--r-- 1 lynn staff 13352 Feb 3 15:44 G_LITTLE ONE-EYE, TWO-EYES AND THREE-EYES.txt\r\n", "-rw-r--r-- 1 lynn staff 5979 Feb 3 15:44 G_LITTLE RED-CAP.txt\r\n", "-rw-r--r-- 1 lynn staff 12046 Feb 3 15:44 G_LITTLE SNOW-WHITE.txt\r\n", "-rw-r--r-- 1 lynn staff 6096 Feb 3 15:44 G_MOTHER HOLLE.txt\r\n", "-rw-r--r-- 1 lynn staff 19068 Feb 3 15:44 G_OH, IF I COULD BUT SHIVER!.txt\r\n", "-rw-r--r-- 1 lynn staff 7525 Feb 3 15:44 G_RAPUNZEL.txt\r\n", "-rw-r--r-- 1 lynn staff 5832 Feb 3 15:44 G_RUMPELSTILTSKIN.txt\r\n", "-rw-r--r-- 1 lynn staff 14512 Feb 3 15:44 G_SNOW-WHITE AND ROSE-RED.txt\r\n", "-rw-r--r-- 1 lynn staff 7368 Feb 3 15:44 G_THE FROG PRINCE.txt\r\n", "-rw-r--r-- 1 lynn staff 5718 Feb 3 15:44 G_THE GOLDEN GOOSE.txt\r\n", "-rw-r--r-- 1 lynn staff 11087 Feb 3 15:44 G_THE GOOSE-GIRL.txt\r\n", "-rw-r--r-- 1 lynn staff 11428 Feb 3 15:44 G_THE LITTLE BROTHER AND SISTER.txt\r\n", "-rw-r--r-- 1 lynn staff 10057 Feb 3 15:44 G_THE SIX SWANS.txt\r\n", "-rw-r--r-- 1 lynn staff 10601 Feb 3 15:44 G_THE THREE LITTLE MEN IN THE WOOD.txt\r\n", "-rw-r--r-- 1 lynn staff 9541 Feb 3 15:44 G_THE TRAVELS OF TOM THUMB.txt\r\n", "-rw-r--r-- 1 lynn staff 17354 Feb 3 15:44 G_THE VALIANT LITTLE TAILOR.txt\r\n", "-rw-r--r-- 1 lynn staff 12331 Feb 3 15:44 G_THE WATER OF LIFE.txt\r\n", "-rw-r--r-- 1 lynn staff 11533 Feb 3 15:44 G_THUMBLING.txt\r\n" ] } ], "prompt_number": 37 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##Copy in the TOC from the start of Grimm's now too:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "stories2 = \"\"\"\n", "THE GOOSE-GIRL\n", "\n", "THE LITTLE BROTHER AND SISTER\n", "\n", "HANSEL AND GRETHEL\n", "\n", "OH, IF I COULD BUT SHIVER!\n", "\n", "DUMMLING AND THE THREE FEATHERS\n", "\n", "LITTLE SNOW-WHITE\n", "\n", "CATHERINE AND 
FREDERICK\n", "\n", "THE VALIANT LITTLE TAILOR\n", "\n", "LITTLE RED-CAP\n", "\n", "THE GOLDEN GOOSE\n", "\n", "BEARSKIN\n", "\n", "CINDERELLA\n", "\n", "FAITHFUL JOHN\n", "\n", "THE WATER OF LIFE\n", "\n", "THUMBLING\n", "\n", "BRIAR ROSE\n", "\n", "THE SIX SWANS\n", "\n", "RAPUNZEL\n", "\n", "MOTHER HOLLE\n", "\n", "THE FROG PRINCE\n", "\n", "THE TRAVELS OF TOM THUMB\n", "\n", "SNOW-WHITE AND ROSE-RED\n", "\n", "THE THREE LITTLE MEN IN THE WOOD\n", "\n", "RUMPELSTILTSKIN\n", "\n", "LITTLE ONE-EYE, TWO-EYES AND THREE-EYES\"\"\"" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 38 }, { "cell_type": "code", "collapsed": false, "input": [ "stories2 = stories2.split('\\n')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 39 }, { "cell_type": "code", "collapsed": false, "input": [ "stories2 = [story.strip() for story in stories2 if story]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 40 }, { "cell_type": "code", "collapsed": false, "input": [ "stories2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 41, "text": [ "['THE GOOSE-GIRL',\n", " 'THE LITTLE BROTHER AND SISTER',\n", " 'HANSEL AND GRETHEL',\n", " 'OH, IF I COULD BUT SHIVER!',\n", " 'DUMMLING AND THE THREE FEATHERS',\n", " 'LITTLE SNOW-WHITE',\n", " 'CATHERINE AND FREDERICK',\n", " 'THE VALIANT LITTLE TAILOR',\n", " 'LITTLE RED-CAP',\n", " 'THE GOLDEN GOOSE',\n", " 'BEARSKIN',\n", " 'CINDERELLA',\n", " 'FAITHFUL JOHN',\n", " 'THE WATER OF LIFE',\n", " 'THUMBLING',\n", " 'BRIAR ROSE',\n", " 'THE SIX SWANS',\n", " 'RAPUNZEL',\n", " 'MOTHER HOLLE',\n", " 'THE FROG PRINCE',\n", " 'THE TRAVELS OF TOM THUMB',\n", " 'SNOW-WHITE AND ROSE-RED',\n", " 'THE THREE LITTLE MEN IN THE WOOD',\n", " 'RUMPELSTILTSKIN',\n", " 'LITTLE ONE-EYE, TWO-EYES AND THREE-EYES']" ] } ], "prompt_number": 41 }, { "cell_type": "code", "collapsed": false, "input": [ "storypairs2 = 
list(pairwise(stories2))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 42 }, { "cell_type": "code", "collapsed": false, "input": [ "storypairs2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 43, "text": [ "[('THE GOOSE-GIRL', 'THE LITTLE BROTHER AND SISTER'),\n", " ('THE LITTLE BROTHER AND SISTER', 'HANSEL AND GRETHEL'),\n", " ('HANSEL AND GRETHEL', 'OH, IF I COULD BUT SHIVER!'),\n", " ('OH, IF I COULD BUT SHIVER!', 'DUMMLING AND THE THREE FEATHERS'),\n", " ('DUMMLING AND THE THREE FEATHERS', 'LITTLE SNOW-WHITE'),\n", " ('LITTLE SNOW-WHITE', 'CATHERINE AND FREDERICK'),\n", " ('CATHERINE AND FREDERICK', 'THE VALIANT LITTLE TAILOR'),\n", " ('THE VALIANT LITTLE TAILOR', 'LITTLE RED-CAP'),\n", " ('LITTLE RED-CAP', 'THE GOLDEN GOOSE'),\n", " ('THE GOLDEN GOOSE', 'BEARSKIN'),\n", " ('BEARSKIN', 'CINDERELLA'),\n", " ('CINDERELLA', 'FAITHFUL JOHN'),\n", " ('FAITHFUL JOHN', 'THE WATER OF LIFE'),\n", " ('THE WATER OF LIFE', 'THUMBLING'),\n", " ('THUMBLING', 'BRIAR ROSE'),\n", " ('BRIAR ROSE', 'THE SIX SWANS'),\n", " ('THE SIX SWANS', 'RAPUNZEL'),\n", " ('RAPUNZEL', 'MOTHER HOLLE'),\n", " ('MOTHER HOLLE', 'THE FROG PRINCE'),\n", " ('THE FROG PRINCE', 'THE TRAVELS OF TOM THUMB'),\n", " ('THE TRAVELS OF TOM THUMB', 'SNOW-WHITE AND ROSE-RED'),\n", " ('SNOW-WHITE AND ROSE-RED', 'THE THREE LITTLE MEN IN THE WOOD'),\n", " ('THE THREE LITTLE MEN IN THE WOOD', 'RUMPELSTILTSKIN'),\n", " ('RUMPELSTILTSKIN', 'LITTLE ONE-EYE, TWO-EYES AND THREE-EYES')]" ] } ], "prompt_number": 43 }, { "cell_type": "code", "collapsed": false, "input": [ "storypairs2.append(('LITTLE ONE-EYE, TWO-EYES AND THREE-EYES', ''))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 44 }, { "cell_type": "code", "collapsed": false, "input": [ "# read in the file\n", "with file('data/books/grimms.txt') as handle:\n", "    grimm = handle.read()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 45 }, { "cell_type": "code", "collapsed": false, "input": [ "len(grimm)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 46, "text": [ "270499" ] } ], "prompt_number": 46 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Warning: Grimm's needed hand-editing. I removed the TOC from the file after copying it for the story list, and made sure the hyphens from the TOC were also in the titles in the body (they didn't match before). Then I re-read the text file." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Write each story out with a G_ prefix so we know the source:\n", "\n", "for storypair in storypairs2:\n", "    string = get_story(storypair, grimm)\n", "    with file('data/stories/G_' + storypair[0] + '.txt', 'w') as handle:\n", "        handle.write(string)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "THE GOOSE-GIRL , THE LITTLE BROTHER AND SISTER\n", "133 11220\n", "THE LITTLE BROTHER AND SISTER , HANSEL AND GRETHEL\n", "11220 22648\n", "HANSEL AND GRETHEL , OH, IF I COULD BUT SHIVER!\n", "22648 37532\n", "OH, IF I COULD BUT SHIVER!" 
] }, { "output_type": "stream", "stream": "stdout", "text": [ " , DUMMLING AND THE THREE FEATHERS\n", "37532 56600\n", "DUMMLING AND THE THREE FEATHERS , LITTLE SNOW-WHITE\n", "56600 63633\n", "LITTLE SNOW-WHITE , CATHERINE AND FREDERICK\n", "63633 75679\n", "CATHERINE AND FREDERICK , THE VALIANT LITTLE TAILOR\n", "75679 89197\n", "THE VALIANT LITTLE TAILOR , LITTLE RED-CAP\n", "89197 106551\n", "LITTLE RED-CAP , THE GOLDEN GOOSE\n", "106551 112530\n", "THE GOLDEN GOOSE , BEARSKIN\n", "112530 118248\n", "BEARSKIN , CINDERELLA\n", "118248 128810\n", "CINDERELLA , FAITHFUL JOHN\n", "128810 139121\n", "FAITHFUL JOHN , THE WATER OF LIFE\n", "139121 155495\n", "THE WATER OF LIFE , THUMBLING\n", "155495 167826\n", "THUMBLING , BRIAR ROSE\n", "167826 179359\n", "BRIAR ROSE , THE SIX SWANS\n", "179359 185615\n", "THE SIX SWANS , RAPUNZEL\n", "185615 195672\n", "RAPUNZEL , MOTHER HOLLE\n", "195672 203197\n", "MOTHER HOLLE , THE FROG PRINCE\n", "203197 209293\n", "THE FROG PRINCE" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " , THE TRAVELS OF TOM THUMB\n", "209293 216661\n", "THE TRAVELS OF TOM THUMB , SNOW-WHITE AND ROSE-RED\n", "216661 226202\n", "SNOW-WHITE AND ROSE-RED , THE THREE LITTLE MEN IN THE WOOD\n", "226202 240714\n", "THE THREE LITTLE MEN IN THE WOOD , RUMPELSTILTSKIN\n", "240714 251315\n", "RUMPELSTILTSKIN , LITTLE ONE-EYE, TWO-EYES AND THREE-EYES\n", "251315 257147\n", "LITTLE ONE-EYE, TWO-EYES AND THREE-EYES , \n" ] } ], "prompt_number": 47 }, { "cell_type": "code", "collapsed": false, "input": [ "ls -al data/stories" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "total 1296\r\n", "drwxr-xr-x 46 lynn staff 1564 Feb 3 15:44 \u001b[34m.\u001b[m\u001b[m/\r\n", "drwxr-xr-x 7 lynn staff 238 Feb 12 13:22 \u001b[34m..\u001b[m\u001b[m/\r\n", "-rw-r--r--@ 1 lynn staff 6148 Feb 3 15:44 .DS_Store\r\n", "-rw-r--r-- 1 lynn staff 10928 Feb 13 10:02 A_THE BELL.txt\r\n", "-rw-r--r-- 1 lynn 
staff 10370 Feb 13 10:02 A_THE DREAM OF LITTLE TUK.txt\r\n", "-rw-r--r-- 1 lynn staff 16389 Feb 13 10:02 A_THE ELDERBUSH.txt\r\n", "-rw-r--r-- 1 lynn staff 10676 Feb 13 10:02 A_THE EMPEROR'S NEW CLOTHES.txt\r\n", "-rw-r--r-- 1 lynn staff 4737 Feb 13 10:02 A_THE FALSE COLLAR.txt\r\n", "-rw-r--r-- 1 lynn staff 17564 Feb 13 10:02 A_THE FIR TREE.txt\r\n", "-rw-r--r-- 1 lynn staff 7058 Feb 13 10:02 A_THE HAPPY FAMILY.txt\r\n", "-rw-r--r-- 1 lynn staff 3717 Feb 13 10:02 A_THE LEAP-FROG.txt\r\n", "-rw-r--r-- 1 lynn staff 5606 Feb 13 10:02 A_THE LITTLE MATCH GIRL.txt\r\n", "-rw-r--r-- 1 lynn staff 4421 Feb 13 10:02 A_THE NAUGHTY BOY.txt\r\n", "-rw-r--r-- 1 lynn staff 16240 Feb 13 10:02 A_THE OLD HOUSE.txt\r\n", "-rw-r--r-- 1 lynn staff 2251 Feb 13 10:02 A_THE REAL PRINCESS.txt\r\n", "-rw-r--r-- 1 lynn staff 12066 Feb 13 10:02 A_THE RED SHOES.txt\r\n", "-rw-r--r-- 1 lynn staff 26833 Feb 13 10:02 A_THE SHADOW.txt\r\n", "-rw-r--r-- 1 lynn staff 72041 Feb 13 10:02 A_THE SHOES OF FORTUNE.txt\r\n", "-rw-r--r-- 1 lynn staff 64922 Feb 13 10:02 A_THE SNOW QUEEN.txt\r\n", "-rw-r--r-- 1 lynn staff 10418 Feb 13 10:02 A_THE STORY OF A MOTHER.txt\r\n", "-rw-r--r-- 1 lynn staff 8388 Feb 13 10:02 A_THE SWINEHERD.txt\r\n", "-rw-r--r-- 1 lynn staff 10562 Feb 13 10:02 G_BEARSKIN.txt\r\n", "-rw-r--r-- 1 lynn staff 6256 Feb 13 10:02 G_BRIAR ROSE.txt\r\n", "-rw-r--r-- 1 lynn staff 13518 Feb 13 10:02 G_CATHERINE AND FREDERICK.txt\r\n", "-rw-r--r-- 1 lynn staff 10311 Feb 13 10:02 G_CINDERELLA.txt\r\n", "-rw-r--r-- 1 lynn staff 7033 Feb 13 10:02 G_DUMMLING AND THE THREE FEATHERS.txt\r\n", "-rw-r--r-- 1 lynn staff 16374 Feb 13 10:02 G_FAITHFUL JOHN.txt\r\n", "-rw-r--r-- 1 lynn staff 14884 Feb 13 10:02 G_HANSEL AND GRETHEL.txt\r\n", "-rw-r--r-- 1 lynn staff 13352 Feb 13 10:02 G_LITTLE ONE-EYE, TWO-EYES AND THREE-EYES.txt\r\n", "-rw-r--r-- 1 lynn staff 5979 Feb 13 10:02 G_LITTLE RED-CAP.txt\r\n", "-rw-r--r-- 1 lynn staff 12046 Feb 13 10:02 G_LITTLE SNOW-WHITE.txt\r\n", "-rw-r--r-- 1 lynn staff 6096 
Feb 13 10:02 G_MOTHER HOLLE.txt\r\n", "-rw-r--r-- 1 lynn staff 19068 Feb 13 10:02 G_OH, IF I COULD BUT SHIVER!.txt\r\n", "-rw-r--r-- 1 lynn staff 7525 Feb 13 10:02 G_RAPUNZEL.txt\r\n", "-rw-r--r-- 1 lynn staff 5832 Feb 13 10:02 G_RUMPELSTILTSKIN.txt\r\n", "-rw-r--r-- 1 lynn staff 14512 Feb 13 10:02 G_SNOW-WHITE AND ROSE-RED.txt\r\n", "-rw-r--r-- 1 lynn staff 7368 Feb 13 10:02 G_THE FROG PRINCE.txt\r\n", "-rw-r--r-- 1 lynn staff 5718 Feb 13 10:02 G_THE GOLDEN GOOSE.txt\r\n", "-rw-r--r-- 1 lynn staff 11087 Feb 13 10:02 G_THE GOOSE-GIRL.txt\r\n", "-rw-r--r-- 1 lynn staff 11428 Feb 13 10:02 G_THE LITTLE BROTHER AND SISTER.txt\r\n", "-rw-r--r-- 1 lynn staff 10057 Feb 13 10:02 G_THE SIX SWANS.txt\r\n", "-rw-r--r-- 1 lynn staff 10601 Feb 13 10:02 G_THE THREE LITTLE MEN IN THE WOOD.txt\r\n", "-rw-r--r-- 1 lynn staff 9541 Feb 13 10:02 G_THE TRAVELS OF TOM THUMB.txt\r\n", "-rw-r--r-- 1 lynn staff 17354 Feb 13 10:02 G_THE VALIANT LITTLE TAILOR.txt\r\n", "-rw-r--r-- 1 lynn staff 12331 Feb 13 10:02 G_THE WATER OF LIFE.txt\r\n", "-rw-r--r-- 1 lynn staff 11533 Feb 13 10:02 G_THUMBLING.txt\r\n" ] } ], "prompt_number": 49 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }