{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import nltk" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get the text we want to process" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with open('book.txt', 'r') as file:\n", " text = file.readlines()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a smaller chunk from the text:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# using a generator expression to simplify iterating over the text structure\n", "snippet = \" \".join(block.strip() for block in text[175:200])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'\"Do none suggest themselves? You know my methods. Apply them!\" \"I can only think of the obvious conclusion that the man has practised in town before going to the country.\" \"I think that we might venture a little farther than this. Look at it in this light. On what occasion would it be most probable that such a presentation would be made? When would his friends unite to give him a pledge of their good will? Obviously at the moment when Dr. Mortimer withdrew from the service of the hospital in order to start in practice for himself. We know there has been a presentation. We believe there has been a change from a town hospital to a country practice. Is it, then, stretching our inference too far to say that the presentation was on the occasion of the change?\" \"It certainly seems probable.\" \"Now, you will observe that he could not have been on the staff of the hospital, since only a man well-established in a London practice could hold such a position, and such a one would not drift into the country. What was he, then? 
If he was in the hospital and yet not on the staff he could only have been a house-surgeon or a house-physician--little more than a senior student. And he left five years ago--the date is on the stick. So'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "snippet" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# alternative with for-loop\n", "other_snippet = []\n", "for block in text[175:200]:\n", " other_snippet.append(block.strip())\n", "other_snippet = \" \".join(other_snippet)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'\"Do none suggest themselves? You know my methods. Apply them!\" \"I can only think of the obvious conclusion that the man has practised in town before going to the country.\" \"I think that we might venture a little farther than this. Look at it in this light. On what occasion would it be most probable that such a presentation would be made? When would his friends unite to give him a pledge of their good will? Obviously at the moment when Dr. Mortimer withdrew from the service of the hospital in order to start in practice for himself. We know there has been a presentation. We believe there has been a change from a town hospital to a country practice. Is it, then, stretching our inference too far to say that the presentation was on the occasion of the change?\" \"It certainly seems probable.\" \"Now, you will observe that he could not have been on the staff of the hospital, since only a man well-established in a London practice could hold such a position, and such a one would not drift into the country. What was he, then? If he was in the hospital and yet not on the staff he could only have been a house-surgeon or a house-physician--little more than a senior student. And he left five years ago--the date is on the stick. 
So'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "other_snippet" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "whole_text = \" \".join(block.strip() for block in text)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "' towards the truth. Not that you are entirely wrong in this instance. The man is certainly a country practitioner. And he walks a good deal.\" \"Then I was right.\" \"To that extent.\" \"But that was all.\" \"No, no, my dear Watson, not all--by no means all. I would suggest, for example, that a presentation to a doctor is more likely to come from a hospital than from a hunt, and that when the initials \\'C.C.\\' are placed before that hospital the words \\'Charing Cross\\' very naturally suggest themselves.\" \"You may be right.\" \"The probability lies in that direction. And if we take this as a working hypothesis we have a fresh basis from which to start our construction of this unknown visitor.\" \"Well, then, supposing that \\'C.C.H.\\' does stand for \\'Charing Cross Hospital,\\' what further inferences may we draw?\" \"Do none suggest themselves? You know my methods. Apply them!\" \"I can only think of the obvious conclusion that the man has practised in town before going to the country.\" \"I think that we might venture a little farther than this. Look at it in this light. On what occasion would it be most probable that such a presentation would be made? When would his friends unite to give him a pledge of their good will? Obviously at the moment when Dr. Mortimer withdrew from the service of the hospital in order to start in practice for himself. We know there has been a presentation. We believe there has been a change from a town hospital to a country practice. 
Is it, then, stretching our inference too far to say that the presentation was on the occasion of the change?\" \"It certainly seems probable.\" \"Now, you will observe that he could not have been on the staff of the hospital, since only a man well-established in a London practice could hold such a position, and such a one would not drift into the country. What was he, then? If he was in the hospital and yet not on the staff he could only have been a house-surgeon or a house-physician--little more than a senior student. And he left five years ago--the date is on the stick. So your grave, middle-aged family practitioner vanishes into thin air, my dear Watson, and there emerges a young fellow under thirty, amiable, unambitious, absent-minded, and the possessor of a favourite dog, which I should describe roughly as being larger than a terrier and smaller than a mastiff.\" I laughed incredulously as Sherlock Holmes leaned back in his settee and blew little wavering rings of smoke up to the ceiling. \"As to the latter part'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "whole_text[5000:7500]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Tokenizing text\n", "- Tokens are meaningful chunks of text:\n", "    - sentences\n", "    - words\n", "    - punctuation\n", "    - numbers\n", "    - ?\n", "\n", "We'll look at some tools in NLTK that help break the raw text into sentence and word tokens." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from nltk.tokenize import sent_tokenize, word_tokenize" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": true }, "outputs": [], "source": [ "str.split?" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['\"Do none suggest themselves? 
You know my methods',\n", " ' Apply them!\" \"I can only think of the obvious conclusion that the man has practised in town before going to the country',\n", " '\" \"I think that we might venture a little farther than this',\n", " ' Look at it in this light',\n", " ' On what occasion would it be most probable that such a presentation would be made? When would his friends unite to give him a pledge of their good will? Obviously at the moment when Dr',\n", " ' Mortimer withdrew from the service of the hospital in order to start in practice for himself',\n", " ' We know there has been a presentation',\n", " ' We believe there has been a change from a town hospital to a country practice',\n", " ' Is it, then, stretching our inference too far to say that the presentation was on the occasion of the change?\" \"It certainly seems probable',\n", " '\" \"Now, you will observe that he could not have been on the staff of the hospital, since only a man well-established in a London practice could hold such a position, and such a one would not drift into the country',\n", " ' What was he, then? 
If he was in the hospital and yet not on the staff he could only have been a house-surgeon or a house-physician--little more than a senior student',\n", " ' And he left five years ago--the date is on the stick',\n", " ' So']" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# you can try to separate sentences by splitting on punctuation\n", "snippet.split('.')" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['\"Do none suggest themselves?',\n", " 'You know my methods.',\n", " 'Apply them!\"',\n", " '\"I can only think of the obvious conclusion that the man has practised in town before going to the country.\"',\n", " '\"I think that we might venture a little farther than this.',\n", " 'Look at it in this light.',\n", " 'On what occasion would it be most probable that such a presentation would be made?',\n", " 'When would his friends unite to give him a pledge of their good will?',\n", " 'Obviously at the moment when Dr. 
Mortimer withdrew from the service of the hospital in order to start in practice for himself.',\n", " 'We know there has been a presentation.',\n", " 'We believe there has been a change from a town hospital to a country practice.',\n", " 'Is it, then, stretching our inference too far to say that the presentation was on the occasion of the change?\"',\n", " '\"It certainly seems probable.\"',\n", " '\"Now, you will observe that he could not have been on the staff of the hospital, since only a man well-established in a London practice could hold such a position, and such a one would not drift into the country.',\n", " 'What was he, then?',\n", " 'If he was in the hospital and yet not on the staff he could only have been a house-surgeon or a house-physician--little more than a senior student.',\n", " 'And he left five years ago--the date is on the stick.',\n", " 'So']" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The sentence tokenizer has some clever tricks to do a better job\n", "sent_tokenize(snippet)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['\"Do',\n", " 'none',\n", " 'suggest',\n", " 'themselves?',\n", " 'You',\n", " 'know',\n", " 'my',\n", " 'methods.',\n", " 'Apply',\n", " 'them!\"',\n", " '\"I',\n", " 'can',\n", " 'only',\n", " 'think',\n", " 'of',\n", " 'the',\n", " 'obvious',\n", " 'conclusion',\n", " 'that',\n", " 'the',\n", " 'man',\n", " 'has',\n", " 'practised',\n", " 'in',\n", " 'town',\n", " 'before',\n", " 'going',\n", " 'to',\n", " 'the',\n", " 'country.\"',\n", " '\"I',\n", " 'think',\n", " 'that',\n", " 'we',\n", " 'might',\n", " 'venture',\n", " 'a',\n", " 'little',\n", " 'farther',\n", " 'than',\n", " 'this.',\n", " 'Look',\n", " 'at',\n", " 'it',\n", " 'in',\n", " 'this',\n", " 'light.',\n", " 'On',\n", " 'what',\n", " 'occasion',\n", " 'would',\n", " 'it',\n", " 'be',\n", " 'most',\n", " 'probable',\n", " 'that',\n", " 
'such',\n", " 'a',\n", " 'presentation',\n", " 'would',\n", " 'be',\n", " 'made?',\n", " 'When',\n", " 'would',\n", " 'his',\n", " 'friends',\n", " 'unite',\n", " 'to',\n", " 'give',\n", " 'him',\n", " 'a',\n", " 'pledge',\n", " 'of',\n", " 'their',\n", " 'good',\n", " 'will?',\n", " 'Obviously',\n", " 'at',\n", " 'the',\n", " 'moment',\n", " 'when',\n", " 'Dr.',\n", " 'Mortimer',\n", " 'withdrew',\n", " 'from',\n", " 'the',\n", " 'service',\n", " 'of',\n", " 'the',\n", " 'hospital',\n", " 'in',\n", " 'order',\n", " 'to',\n", " 'start',\n", " 'in',\n", " 'practice',\n", " 'for',\n", " 'himself.',\n", " 'We',\n", " 'know',\n", " 'there',\n", " 'has',\n", " 'been',\n", " 'a',\n", " 'presentation.',\n", " 'We',\n", " 'believe',\n", " 'there',\n", " 'has',\n", " 'been',\n", " 'a',\n", " 'change',\n", " 'from',\n", " 'a',\n", " 'town',\n", " 'hospital',\n", " 'to',\n", " 'a',\n", " 'country',\n", " 'practice.',\n", " 'Is',\n", " 'it,',\n", " 'then,',\n", " 'stretching',\n", " 'our',\n", " 'inference',\n", " 'too',\n", " 'far',\n", " 'to',\n", " 'say',\n", " 'that',\n", " 'the',\n", " 'presentation',\n", " 'was',\n", " 'on',\n", " 'the',\n", " 'occasion',\n", " 'of',\n", " 'the',\n", " 'change?\"',\n", " '\"It',\n", " 'certainly',\n", " 'seems',\n", " 'probable.\"',\n", " '\"Now,',\n", " 'you',\n", " 'will',\n", " 'observe',\n", " 'that',\n", " 'he',\n", " 'could',\n", " 'not',\n", " 'have',\n", " 'been',\n", " 'on',\n", " 'the',\n", " 'staff',\n", " 'of',\n", " 'the',\n", " 'hospital,',\n", " 'since',\n", " 'only',\n", " 'a',\n", " 'man',\n", " 'well-established',\n", " 'in',\n", " 'a',\n", " 'London',\n", " 'practice',\n", " 'could',\n", " 'hold',\n", " 'such',\n", " 'a',\n", " 'position,',\n", " 'and',\n", " 'such',\n", " 'a',\n", " 'one',\n", " 'would',\n", " 'not',\n", " 'drift',\n", " 'into',\n", " 'the',\n", " 'country.',\n", " 'What',\n", " 'was',\n", " 'he,',\n", " 'then?',\n", " 'If',\n", " 'he',\n", " 'was',\n", " 'in',\n", " 'the',\n", " 'hospital',\n", " 
'and',\n", " 'yet',\n", " 'not',\n", " 'on',\n", " 'the',\n", " 'staff',\n", " 'he',\n", " 'could',\n", " 'only',\n", " 'have',\n", " 'been',\n", " 'a',\n", " 'house-surgeon',\n", " 'or',\n", " 'a',\n", " 'house-physician--little',\n", " 'more',\n", " 'than',\n", " 'a',\n", " 'senior',\n", " 'student.',\n", " 'And',\n", " 'he',\n", " 'left',\n", " 'five',\n", " 'years',\n", " 'ago--the',\n", " 'date',\n", " 'is',\n", " 'on',\n", " 'the',\n", " 'stick.',\n", " 'So']" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# splitting a text into tokens based on white space\n", "snippet.split()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "words = word_tokenize(snippet)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "['``',\n", " 'Do',\n", " 'none',\n", " 'suggest',\n", " 'themselves',\n", " '?',\n", " 'You',\n", " 'know',\n", " 'my',\n", " 'methods',\n", " '.',\n", " 'Apply',\n", " 'them',\n", " '!',\n", " \"''\",\n", " '``',\n", " 'I',\n", " 'can',\n", " 'only',\n", " 'think',\n", " 'of',\n", " 'the',\n", " 'obvious',\n", " 'conclusion',\n", " 'that',\n", " 'the',\n", " 'man',\n", " 'has',\n", " 'practised',\n", " 'in',\n", " 'town',\n", " 'before',\n", " 'going',\n", " 'to',\n", " 'the',\n", " 'country',\n", " '.',\n", " \"''\",\n", " '``',\n", " 'I',\n", " 'think',\n", " 'that',\n", " 'we',\n", " 'might',\n", " 'venture',\n", " 'a',\n", " 'little',\n", " 'farther',\n", " 'than',\n", " 'this',\n", " '.',\n", " 'Look',\n", " 'at',\n", " 'it',\n", " 'in',\n", " 'this',\n", " 'light',\n", " '.',\n", " 'On',\n", " 'what',\n", " 'occasion',\n", " 'would',\n", " 'it',\n", " 'be',\n", " 'most',\n", " 'probable',\n", " 'that',\n", " 'such',\n", " 'a',\n", " 'presentation',\n", " 'would',\n", " 'be',\n", " 'made',\n", " '?',\n", " 'When',\n", " 'would',\n", " 'his',\n", " 'friends',\n", " 
'unite',\n", " 'to',\n", " 'give',\n", " 'him',\n", " 'a',\n", " 'pledge',\n", " 'of',\n", " 'their',\n", " 'good',\n", " 'will',\n", " '?',\n", " 'Obviously',\n", " 'at',\n", " 'the',\n", " 'moment',\n", " 'when',\n", " 'Dr.',\n", " 'Mortimer',\n", " 'withdrew',\n", " 'from',\n", " 'the',\n", " 'service',\n", " 'of',\n", " 'the',\n", " 'hospital',\n", " 'in',\n", " 'order',\n", " 'to',\n", " 'start',\n", " 'in',\n", " 'practice',\n", " 'for',\n", " 'himself',\n", " '.',\n", " 'We',\n", " 'know',\n", " 'there',\n", " 'has',\n", " 'been',\n", " 'a',\n", " 'presentation',\n", " '.',\n", " 'We',\n", " 'believe',\n", " 'there',\n", " 'has',\n", " 'been',\n", " 'a',\n", " 'change',\n", " 'from',\n", " 'a',\n", " 'town',\n", " 'hospital',\n", " 'to',\n", " 'a',\n", " 'country',\n", " 'practice',\n", " '.',\n", " 'Is',\n", " 'it',\n", " ',',\n", " 'then',\n", " ',',\n", " 'stretching',\n", " 'our',\n", " 'inference',\n", " 'too',\n", " 'far',\n", " 'to',\n", " 'say',\n", " 'that',\n", " 'the',\n", " 'presentation',\n", " 'was',\n", " 'on',\n", " 'the',\n", " 'occasion',\n", " 'of',\n", " 'the',\n", " 'change',\n", " '?',\n", " \"''\",\n", " '``',\n", " 'It',\n", " 'certainly',\n", " 'seems',\n", " 'probable',\n", " '.',\n", " \"''\",\n", " '``',\n", " 'Now',\n", " ',',\n", " 'you',\n", " 'will',\n", " 'observe',\n", " 'that',\n", " 'he',\n", " 'could',\n", " 'not',\n", " 'have',\n", " 'been',\n", " 'on',\n", " 'the',\n", " 'staff',\n", " 'of',\n", " 'the',\n", " 'hospital',\n", " ',',\n", " 'since',\n", " 'only',\n", " 'a',\n", " 'man',\n", " 'well-established',\n", " 'in',\n", " 'a',\n", " 'London',\n", " 'practice',\n", " 'could',\n", " 'hold',\n", " 'such',\n", " 'a',\n", " 'position',\n", " ',',\n", " 'and',\n", " 'such',\n", " 'a',\n", " 'one',\n", " 'would',\n", " 'not',\n", " 'drift',\n", " 'into',\n", " 'the',\n", " 'country',\n", " '.',\n", " 'What',\n", " 'was',\n", " 'he',\n", " ',',\n", " 'then',\n", " '?',\n", " 'If',\n", " 'he',\n", " 'was',\n", " 'in',\n", 
" 'the',\n", " 'hospital',\n", " 'and',\n", " 'yet',\n", " 'not',\n", " 'on',\n", " 'the',\n", " 'staff',\n", " 'he',\n", " 'could',\n", " 'only',\n", " 'have',\n", " 'been',\n", " 'a',\n", " 'house-surgeon',\n", " 'or',\n", " 'a',\n", " 'house-physician',\n", " '--',\n", " 'little',\n", " 'more',\n", " 'than',\n", " 'a',\n", " 'senior',\n", " 'student',\n", " '.',\n", " 'And',\n", " 'he',\n", " 'left',\n", " 'five',\n", " 'years',\n", " 'ago',\n", " '--',\n", " 'the',\n", " 'date',\n", " 'is',\n", " 'on',\n", " 'the',\n", " 'stick',\n", " '.',\n", " 'So']" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# word tokenize treats punctuation as a token\n", "words" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# let's plot the frequency of occurrence of different words\n", "nltk.FreqDist?" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true }, "outputs": [], "source": [ "fdist = nltk.FreqDist(words)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYIAAAE/CAYAAABLrsQiAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XucVXW9//HXBwYYhjuCMoqCeEERQZjxkpqVaXkpMzWP\nZKc0O3ROpXaxzC6ne2qZ/UrPschSs6OnvJRnyLS8goWXQe6CqYCAooDcGRiYmc/vj+/astnsmVl7\nz96zb+/n47EfM3vt7/qu757Ze33W97rM3RERkcrVo9AFEBGRwlIgEBGpcAoEIiIVToFARKTCKRCI\niFQ4BQIRkQqnQCAiUuEUCEREKpwCgYhIhasqdAHiGDZsmI8ePTqrfbdv307fvn1zmlZ5Kk/lqTyL\nLc90Zs+evc7dh3ea0N2L/lFXV+fZamxszHla5ak8lafyLLY80wEaPcY5Vk1DIiIVToFARKTCKRCI\niFQ4BQIRkQqnQCAiUuHyFgjM7DdmtsbMFqZsv9zMXjSzRWb2o3wdX0RE4slnjeB24IzkDWb2HuBD\nwAR3Pwq4IY/Hp7mllUVrd+bzECIiJS9vE8rcfYaZjU7Z/B/Ade7eHKVZk6/j72xp453XP87aLc2c\n/o7t1A7KbkKGiEi5M8/jPYujQDDd3cdHz+cCDxBqCjuAq9z9uXb2nQpMBaitra1raGjI+Pg3zNrA\nrFXNXHx0f847on+n6ZuamqipqclZOuWpPJWn8uyuPNOpr6+f7e71nSaMM+ss2wcwGliY9Hwh8HPA\ngOOAZUTBqKNHtjOLH3nhDR919XR/zw2Pe1tbW6fpy22mofJUnsqzcvJMhyKdWbwKuD8q47NAGzAs\nXwc75fDhDOrTg6VrtzF35cZ8HUZEpKR1dyD4E3AqgJkdDvQG1uXrYL169uCUUdUA3Pf8qnwdRkSk\npOVz+OjdwCxgrJmtMrPLgN8AY6Ihpf8LfCKqvuTNu0eFTuKGeatpbmnN56FEREpSPkcNTWnnpY/l\n65jpjB7ci3G1A3lh9WYeXbyGs46u7c7Di4gUvYqYWXx+3UgA7put5iERkVQVEQg+dMz+VPUwnvjn\nWtZuaS50cUREikpFBIJh/fvw7rHDaW1zHpj7WqGLIyJSVCoiEACcPzk0D92r5iERkT1UTCA49ch9\nGVzTiyVvbGHR65sKXRwRkaJRMYGgT1VPzpm4PwD3zVbzkIhIQsUEAtjdPPTA3NfY1dpW4NKIiBSH\nigoEE0YO4tB9+/PWtp08+eLaQhdHRKQoVFQgMLO3awVackJEJKioQADw4UkH0MPgkcVvsmGbbloj\nIlJxgWDEoGpOPmw4u1qdhvmvF7o4IiIFV3GBAOD8yQcAWnJCRAQqNBC8/6gRDOhTxbxVm3h5zZZC\nF0dEpKAqMhBU9+rJ2RPCKqT3ak6BiFS4igwEsHtF0j/OWUVrW15viSAiUtQqNhDUjxrCqH1qeHNz\nM0+9nLebpImIFL2KDQRmxnmTdJ8CEZGKDQQA50Wjhx5e9Aabd+wqcGlERAqjogPBgUNrOGHMUJpb\n2nhw/upCF0dEpCDyefP635jZmuhG9amvXWVmbmbD8nX8uLTkhIhUunzWCG4HzkjdaGYHAqcDK/J4\n7NjOPLqWvr168tzyDbyxtaXQxRER6XZ5CwTuPgNYn+alnwJfAYpizGb/PlWcOX4EAE+8ur3ApRER\n6X7mnr/zsZmNBqa7+/jo+TnAe939SjNbDtS7e9qxm2Y2FZgKUFtbW9fQ0JBVGZqamqipqekwzZw3\nmvn+zA2MGdSDH79v35zkmWla5ak8lafy7GraVPX19bPdvb7ThO6etwcwGlgY/V4DPAMMip4vB4bF\nyaeurs6z1djY2GmarTt2+SHX/NkP/up037x9Z07yzDSt8lSeylN5djVtKqDRY5xju3PU0CHAwcC8\nqDYwEnjezEZ0YxnS6teniqNHDqLNofHVDYUujohIt+q2QODuC
9x9X3cf7e6jgVXAZHd/o7vK0JET\nxuwDwDNL03VriIiUr3wOH70bmAWMNbNVZnZZvo6VC8cfPBSAp5e+VeCSiIh0r6p8ZezuUzp5fXS+\njp2N+tFD6WGw4LVNbGtuoV+fvP1pRESKSkXPLE7Wv08VhwzpRWubq59ARCqKAkGSccN7A/CMmodE\npIIoECQ5KhEIlqnDWEQqhwJBkiOH9aKHwbyVG2naqeUmRKQyKBAkqenVg/EHDKKlzXn+1Y2FLo6I\nSLdQIEihYaQiUmkUCFIcf3A0sWyZAoGIVAYFghTHHjwUM5i7ciPbd7YWujgiInmnQJBiUN9ejKsd\nyK5WZ84KzScQkfKnQJBGYt2hpzWMVEQqgAJBGuowFpFKokCQxnFJ/QQ7dqmfQETKmwJBGoNrenPE\niIHsbGljzgrNJxCR8qZA0I5E85CGkYpIuVMgaMfbHcbqJxCRMqdA0I7johrBnBUbaW5RP4GIlC8F\ngnYM7debI0YMoLmljXkrNxW6OCIieaNA0AENIxWRSqBA0IHjx2jdIREpf/m8ef1vzGyNmS1M2vZj\nM1tiZvPN7I9mNjhfx8+FRD/B7Fc3sLOlrcClERHJj3zWCG4HzkjZ9jdgvLtPAP4JXJPH43fZsP59\nOGzf/uzY1cb8VZpPICLlKW+BwN1nAOtTtv3V3RO3/noaGJmv4+eKhpGKSLkzd89f5majgenuPj7N\naw3A7939d+3sOxWYClBbW1vX0NCQVRmampqoqanJOu3fV27nxqc3MXG/3vznKUNzkmc+yqk8lafy\nrOw806mvr5/t7vWdJnT3vD2A0cDCNNu/DvyRKBB19qirq/NsNTY2dintms07fNTV0/2Ib/zFd7a0\n5iTPrqRTnspTeSrPuIBGj3GO7fZRQ2b2CeADwMVRQYva8AF9OGR4P7bvamX+Ks0nEJHy062BwMzO\nAK4GznH3pu48dldoGKmIlLN8Dh+9G5gFjDWzVWZ2GXAzMAD4m5nNNbNf5Ov4ubS7w1g3qhGR8lOV\nr4zdfUqazb/O1/Hy6YTEfILl69nVqvkEIlJeNLM4hn0HVjNmWD+27Wxl4WvqJxCR8qJAENPxYxL3\nJ1DzkIiUFwWCmDSxTETKlQJBTMcfHAJB4/INtLYV/ahXEZHYFAhiGjGomlH71LC1uYVlG1s630FE\npEQoEGTghKhWsGjtzgKXREQkdxQIMpDoMFYgEJFyokCQgcQM48Vrd6qfQETKhgJBBg4Y3JeRQ/rS\n1OK8vGZroYsjIpITCgQZOny/AQAsW6dAICLlQYEgQwcP6wfA0nXbClwSEZHcUCDI0JjhUSBYq0Ag\nIuVBgSBDiRrBMtUIRKRMKBBkaMyw/oACgYiUDwWCDO03sA99ehrrt+1kY5PmE4hI6VMgyJCZsf+A\nnoBqBSJSHhQIslDbP9zPR4FARMqBAkEWVCMQkXKiQJCF2gGhRqC5BCJSDvJ58/rfmNkaM1uYtG2o\nmf3NzF6Kfg7J1/Hzaf/+UY1AcwlEpAzks0ZwO3BGyravAo+6+2HAo9HzkpOoESxbtw13LT4nIqUt\nb4HA3WcAqTf4/RBwR/T7HcC5+Tp+Pg3o3YOh/XqzfVcrb25uLnRxRES6JONAYGZDzGxClsfbz91X\nA0Q/980yn4LbveaQFp8TkdJmcZo2zOwJ4BygCpgLrAWedPcvdrLfaGC6u4+Pnm9098FJr29w97T9\nBGY2FZgKUFtbW9fQ0BDj7eytqamJmpqanKZtamriN4t28fjy7Xx68kDed0j7+2SSZz7KqTyVp/Ks\njDzTqa+vn+3u9Z0mdPdOH8Cc6OengO9Ev8+Psd9oYGHS8xeB2uj3WuDFOMevq6vzbDU2NuY8bWNj\no9/82Es+6urp/r2GRTnLMy7lqTyVp/KMA2j0GOfYuE1DVWZWC1wITI8fj/byf8Anot8/ATzQhbwK\naowWnxORMhE3EHwHeBh42
d2fM7MxwEsd7WBmdwOzgLFmtsrMLgOuA043s5eA06PnJeng4QoEIlIe\nqmKmW+3ub3cQu/tSM7uxox3cfUo7L703buGK2eh9+mEGK9Y3sau1jV49NTdPREpT3LPXTTG3VYzq\nXj3Zf1BfWtqcVRu2F7o4IiJZ67BGYGbvAE4EhptZ8gihgUDPfBasFIwZ3o/XNm5n6dqtbw8nFREp\nNZ3VCHoD/QkBY0DSYzNwQX6LVvx0tzIRKQcd1gjc/UngSTO73d1f7aYylQzdyF5EykHczuI+ZjaN\nMC/g7X3c/dR8FKpUvF0j0OJzIlLC4gaCe4BfALcCrfkrTmnR/YtFpBzEDQQt7n5LXktSgg4Y0pde\nPY03Nu9gW3ML/frE/XOKiBSPuMNHG8zsM2ZWG91TYKiZDc1ryUpAzx7GqH1C89Dyt1QrEJHSFPcS\nNrEsxJeTtjkwJrfFKT0HD+vHy2u2smzdNo7af1ChiyMikrFYgcDdD853QUrVGHUYi0iJixUIzOzj\n6ba7+29zW5zSo7kEIlLq4jYNHZv0ezVhvaDnAQUCzSUQkRIXt2no8uTnZjYIuDMvJSoxiVVIl67d\nirtjZgUukYhIZrJdMrMJOCyXBSlVw/v3oX+fKjbvaGFD065CF0dEJGNx+wgaCKOEICw2dyTwh3wV\nqpSYGQcP68eC1zaxbN1Whvar+FG1IlJi4vYR3JD0ewvwqruvykN5SlIiECxdu426UQoEIlJaYjUN\nRYvPLSGsPDoE2JnPQpUajRwSkVIWKxCY2YXAs8BHCPctfsbMKn4Z6oQxum2liJSwuE1DXweOdfc1\nAGY2HHgEuDdfBSslWnxOREpZ3FFDPRJBIPJWBvuWvdHDaoAQCNravJPUIiLFJe7J/CEze9jMLjGz\nS4A/Aw9me1Az+4KZLTKzhWZ2t5lVZ5tXMRhQ3YvhA/rQ3NLG65t0/2IRKS0dBgIzO9TMTnL3LwO/\nBCYAE4FZwLRsDmhmBwBXAPXuPp4wHPWibPIqJuowFpFS1VmN4P8BWwDc/X53/6K7f4FQG/h/XThu\nFdDXzKqAGuD1LuRVFMYoEIhIiTL39tu0zWxhdNWe7rUF7n50Vgc1uxL4AbAd+Ku7X5wmzVRgKkBt\nbW1dQ0NDNoeiqamJmpqanKZNl+5PL27jzvlbOOvQGi6bNDAneeajnMpTeSrP8swznfr6+tnuXt9p\nQndv9wG8nM1rneQ5BHgMGA70Av4EfKyjferq6jxbjY2NOU+bLt3DC1f7qKun+8d//UzO8uxqWuWp\nPJVn5eSZDtDoMc7LnTUNPWdm/5a60cwuA2ZnEJiSnQYsc/e17r4LuB84Mcu8iobmEohIqepsHsHn\ngT+a2cXsPvHXA72BD2d5zBXACWZWQ2gaei/QmGVeRePAoTX0MFi1oYnmllb6VPUsdJFERGLpMBC4\n+5vAiWb2HiDRV/Bnd38s2wO6+zNmdi/hfgYtwByyHIFUTPpU9WTkkBpWrG9i5fomDt13QKGLJCIS\nS9z7ETwOPJ6rg7r7t4Bv5Sq/YnHwsH6sWN/E0rXbFAhEpGRodnAOaS6BiJQiBYIcUoexiJQiBYIc\n0v2LRaQUKRDkkJqGRKQUKRDk0P6D+tK7qgdrtzSzZYfuXywipUGBIId69DAO3ifUCpavaypwaURE\n4lEgyLFEh/HSdVsLXBIRkXgUCHJM/QQiUmoUCHJMgUBESo0CQY693TS0VoFAREqDAkGOHZx0I3vv\n4F4PIiLFQoEgx4bU9GJQ315sbW5h7dbmQhdHRKRTCgQ5Zma7+wnUPCQiJUCBIA90/2IRKSUKBHmg\nkUMiUkoUCPLg4OFafE5ESocCQR6oRiAipUSBIA9GR+sNvfrWNlo1hFREipwCQR7061PFiIHV7Gp1\n1m5rLXRxREQ6VJBAYGaDzexeM1tiZovN7B2FKEc+JZqHVm9VIBCR4laoGsHPgIfc/QhgIrC
4QOXI\nm0SH8etbWgpcEhGRjlV19wHNbCBwCnAJgLvvBHZ2dznyLTGX4Lfzt3DPkoc7TX9gf+POsc3s079P\nvosmIrIH6+71cMzsGGAa8AKhNjAbuNLdt6WkmwpMBaitra1raGjI6nhNTU3U1NTkNG2cdMs37uJr\nj62nuTX+33fSiN587eQh9DDrtnIqT+WpPMsjz3Tq6+tnu3t9pwndvVsfQD3QAhwfPf8Z8L2O9qmr\nq/NsNTY25jxt3HQ7drX4k/941jc27ezw8cqaLT7+m3/2UVdP91ueeLnby6k8lafyLP080wEaPcZ5\nuRB9BKuAVe7+TPT8XmByAcqRd32qetKvdw8G9e3V4WPM8P5cftwgAH788IvMfnV9gUsuIpWk2wOB\nu78BrDSzsdGm9xKaiSpaXW01U08ZQ2ubc/ldc9jYVHbdJiJSpAo1auhy4H/MbD5wDPDDApWjqHz5\n/WM55sDBvL5pB1fdM1/3MxCRblGQQODuc9293t0nuPu57r6hEOUoNr169uCmKZMYWF3FI4vf5La/\nLy90kUSkAmhmcZE5cGgNP7pgIgDX/mUx81ZuLHCJRKTcKRAUoTPGj+CSE0ezq9X53N3Ps3nHrkIX\nSUTKmAJBkbrmrCMYf8BAVq7fzlfvU3+BiOSPAkGR6lPVk5unTKZ/nyoeXPAGv3tmRaGLJCJlSoGg\niI0e1o8fnnc0AN+b/gKLXt9U4BKJSDlSIChy50zcnynHHcTOljYuv2sOW5u1iJ2I5JYCQQn41gfH\nccSIASxdt41v/HGB+gtEJKe6ffVRyVx1r57c/NHJfPCmp/jT3NdpXFpF/6dmdLrf9u3b6Tuz83SZ\npM1HnrXVLdwyoZXqXj1j5SsiuaVAUCIO3bc/PzxvPFfdM59Vm1tg85Z4O26KmS6TtDnOcwnwnYZF\nXHvehPj5ikjOKBCUkA9PGsmJhwzjqefmcuSR4zpNv3jxC7HSZZI213m+uWUHU+94jrufXckJY/bh\nQ8ccECtvEckdBYISs9/AakYP7sW4/Qd2mnb76njpMkmb6zzHMZBLjxnItOc387X7FzBh5OC3b/Mp\nIt1DncVScO8b05ezj65l285WPnfX8+zYpfs8i3QnBQIpODPj2vOP5qChNSx6fTPXPlh2t7AWKWoK\nBFIUBlb34uaPTqJXT+OOWa/ylwWrC10kkYqhQCBFY8LIwVxz5pEAfOW++axc31TgEolUBgUCKSqX\nnjSa08ftx5YdLXzu7jnsbGkrdJFEyp4CgRQVM+PHF0zggMF9mbdyIz96aEmhiyRS9hQIpOgMrunN\nz6dMomcP49anlvHIC28WukgiZU2BQIpS3aghfPn9YwG46t55vL5xe4FLJFK+ChYIzKynmc0xs+mF\nKoMUt6nvHMO7xw5nY9Murrh7Drta1V8gkg+FrBFcCWjAuLSrRw/jJx+ZyH4D+9D46gZ++rd/FrpI\nImWpIEtMmNlI4GzgB8AXC1EGKQ379O/Dzy+axJRfPc1/P/EKs0b0ZuiC5zrdb+OmTQyOkS6TtHHT\nVffuybv33UVdrKOLFJ4VYm17M7sXuBYYAFzl7h9Ik2YqMBWgtra2rqGhIatjNTU1UVNTk9O0yrP7\n87x38VbuXrg11nGLwaA+xo3vG8bg6o6X1i6n/5HyLGye6dTX18929/rO0nV7jcDMPgCscffZZvbu\n9tK5+zRgGkB9fb3X1WV3fTV79mzi7hs3rfLs/jzr6uCilRuZNXcRhx5yaKd5vvzKy7HSZZI2brpp\nM5fy7LL13L7EuePSyfToYe2mLaf/kfIsbJ5dUYimoZOAc8zsLKAaGGhmv3P3jxWgLFJCJh44mJY1\n1dSN26/TtEO2r4qVLpO0cdMdPXIQp9/wGDNfWsctT77CZ98TLyCJFEq3dxa7+zXuPtLdRwMXAY8p\nCEg52W9gNVccPxiAG//2T55bvr7AJRLpmOYRiOTBpBF
9+Pd3HUJrm3PF3XPYsG1noYsk0q6CBgJ3\nfyJdR7FIOfjS+w6nbtQQVm/awZfumUchBmaIxKEagUie9OrZg59PmcSgvr14bMkafv3UskIXSSQt\nBQKRPDpgcF9u+MhEAK77yxLmrNhQ4BKJ7E2BQCTPTh+3H5886WBa2pzP3TWHTU27Cl0kkT0oEIh0\ng6+eeQQTRg7itY3bufq++eovkKKiQCDSDXpX9eDmKZMZ0KeKhxa9wW9nvVroIom8TYFApJsctE8N\n150/AYAf/HkxC1/bVOASiQQKBCLd6OwJtXzshIPY2drG5+56nqZdWlpbCq8gq4+KVLJvnD2O2a9u\nZPHqzfxg5i6Oe3NhrP3WrNnMA6s6Txs3XSXnWdO7iskDWmMduxIoEIh0s+pePfmvj07iAzc9xZK3\ndrEkk/6CV2KmjZuugvM8aFAVpxzfSnWvjleIrQQKBCIFMGZ4fx684p3c+ehsDjrwoFj7rFi5Ilba\nuOkqOc/b/7GcZeu28d3pL/DDDx8dqwzlTIFApEBGD+vHWYf2o65udKz0s2e/FStt3HSVnGf96CGc\ne/NT3PXMCt4xZh8+OHH/WOUoV+osFpGKc9T+g7jkmIEAXHP/Apav21bgEhWWAoGIVKT3j+nLWUeP\nYGtzC5+7+3maWyq381iBQEQqkplx3fkTOHBoXxa+tplrH1xS6CIVjAKBiFSsgdW9uHnKZHr1NG7/\nx3IeWvhGoYtUEAoEIlLRJh44mK+eeSQAX7l3HivXNxW4RN1PgUBEKt4nTxrNaUfux+YdLVx+9xx2\ntVbWjG8FAhGpeGbGDR+ZwP6Dqpm7ciM3PPxioYvUrRQIRESAwTW9uemjk+jZw/jljKU8tuTNQhep\n23R7IDCzA83scTNbbGaLzOzK7i6DiEg6daOGctX7xgLwpT/MY/Wm7QUuUfcoRI2gBfiSux8JnAB8\n1szGFaAcIiJ7+fQpY3jX4cPZ0LSLK++eS2tb+d9EqNuXmHD31cDq6PctZrYYOAB4obvLIiKSqkcP\n48YLJ3LWz2fy7PL13NRWzclNr8Tad9Vr22jc2nnauOkA+m7fSV2slNkr6FpDZjYamAQ8U8hyiIgk\n26d/H3520SQ++qunmbliBzNXZDDZbH7MtDHTnTu2Hx+Pf/SsWKHunWpm/YEngR+4+/1pXp8KTAWo\nra2ta2hoyOo4TU1N1NTU5DSt8lSeyrMy8lywpplnV26jqireNXNLS0ustHHTARw2yDlx9KBYaVPV\n19fPdvf6ThO6e7c/gF7Aw8AX46Svq6vzbDU2NuY8rfJUnspTeRZbnukAjR7jHFuIUUMG/BpY7O43\ndvfxRURkT4UYNXQS8K/AqWY2N3qcVYByiIgIhRk19BRg3X1cERFJTzOLRUQqnAKBiEiFUyAQEalw\nCgQiIhWuYBPKMmFma4FXs9x9GLAux2mVp/JUnsqz2PJMZ5S7D+80VZzJBqX8IOaEikzSKk/lqTyV\nZ7Hl2ZWHmoZERCqcAoGISIWrhEAwLQ9plafyVJ7Ks9jyzFpJdBaLiEj+VEKNQEREOqBAICJS4RQI\nREQqXEFvVVkMzGwIcBhQndjm7jMKVJZaYL27N2ex71Dg34EdwK3uvjlHZeqTWp5020SKXbF9lovp\ne1TRNQIz+xQwg3C3tO9EP7+dwf4jOnjtRDP7qJl9PPGIkeWdwBIzuyFuGZLcB/QHRgKzzGxMO+W6\nM/p5Zcx8Z8Xclsh/PzP7QPTYN+YxKoqZ3WdmZ5tZ7O+fmQ0xswm5yDPd/z6Dz0PRMLPJHT3S7BL7\ns2xmM83sB2Z2hpkNyEFZf5PyvD/wYAfpD4jOIackHl0tQ4flK7dRQ2a2H/BDYH93P9PMxgHvcPdf\np0m7ADgWeNrdjzGzI4DvuPu/xDzWn9397DTb7wQOAeYCrdFmd/crYuRpwDh3X2RmPd29tbN9ov3m\nu/uE6Pf3A7cCG4E
vAZ9y9wuj114AzgT+D3g3KfeGcPf1UboRwAHA74CPJqUbCPzC3Y9IU4YLgR8D\nT0Tp3wl82d3vTZP2cOAWYD93Hx+d5M5x9++npHsccEJN6YJ23vuCKE1aSX+Xhk7SnZMm7/OA64F9\no/dkIakPTEnXBzgfGE1STdvdv5smz9OAS4ETgHuA2919rzuZm9kTwDlRfnOBtcCT7v7FbPOM0j7v\n7pNTts1x90kp24YD/5bmPX2yC+891vfTzH4EfB/YDjwETAQ+7+6/S0rzePRrNVAPzCP8fyYAz7j7\nyVG6bD7LY4CTCZ/hE4BmYKa7fyEpzRbSf572+oyY2feAYe7+H1ErxJ+BX7n7bWmOfT3wL8AL7Hn+\n2OvzmSvlGAj+AtwGfN3dJ5pZFTDH3Y9Ok/Y5dz/WzOYCx7t7s5nNdfdjuliGxYSTedZ/3OgL8gN3\n/3DM9H8HLnb35dFzA/YHNgCD3H11tP0K4D+AMcBryVkQPmxjonSfAC4hfMEak9JtIZxk7k9ThnnA\n6e6+Jno+HHjE3SemSfsk8GXgl4kTkJktdPfxKelGRb+2uvuqdt57Is1no593Rj8vBpoSJyQze1e6\n/RPc/ck0eb8MfNDdF3e0r5k9BGwCZrP7y4u7/6SDfQYBU4CvAyuBXwG/c/dd0etz3H1SVHM90N2/\nlRzwM83TzKYQToQnAzOTdhtA+PuelpLXP6J0qe/pvmzfe9zvZ+J7aGYfBs4FvgA83s5n6X8J35UF\n0fPxwFXufkn0POPPcrRfLfAuQjB4D7DC3c9IlzaO6AQ/CKgDrkv9OyalexGY0K3NRvlew6K7H8Bz\n0c85SdvmtpP2j8BgQnPQDOAB4MEclOEeoLaLeTwMDM8g/Vjg8AzS30K4yro8ekxsJ935GeS5IOV5\nj9Rt2fyfMjj+3+Ns62qe7aRbmGG++wBXEk5M/0e4ArwJeCL57wnUAn8Fjo22zc82T2AUoRY4i3CC\nSzwmA1Vp8ov1/8jkvcf9vwOLop+/As6Ifp/XTp7p9k+3LZPP8ivAM9HfczLQI8vPz3lJj/MJNbtp\niW3t7PMXoH9XPreZPsqxs3ibme1DVGUzsxMIVyt78d1X29+OqpmDCNXQrhoGvGBmzxKqlInjZVK1\nO8tjNgtFeb+YQd4ASwhV5fsJtYE7zexX7n5TSr73mdnZwFHs2aG+V7Uf+IuZPQzcHT3/F9pvB11n\nZoew+/90AbA6NZGZLYvSrHX34zt5T/3M7GQPt0PFzE4E+qXJ8zDgWmBcyntK16/SaGa/B/7Env/L\n1KvIf5jhHcuEAAAUDUlEQVTZ0R5dlXbEzO4HjiDUXD7oUW0N+L2ZJV+xfpdwQfCUuz8XNVe8lG2e\n7v4qYRXfd3RWxsh0MzvL3dtty47Efu/E/342mNkSQtPQZ6La5Y528lxsZrcSPs8OfAxIV4N71Mxu\nBBLt7U8C33X3dMf/OaHmNAWYBDxpZjPc/ZUY7zHZB1OezwF6Rdud8P1L1QTMNbNH2fMz12nTcrbK\nsWloMuEqaDywEBgOXODu87uxDGmbIDxN00Mn+cTuI8iUmc0ntM1ui573A2Z5SrODmf0CqCFUjW8F\nLgCedffL0uR5PeEq6mRCcJkBnODuV6dJO4ZwZXQioflqGaFpK9vlxjGzOuA3hIAOoY/kk+7+fEq6\np4BvAT8lfCEvJXwXvpUmz73acAlNaKnt5C8QRp8tJXx5E01tezXjmNmp7v5Yhm+vXRY6iL/RTnBO\nTveUu5+cpm17jzbtpNeNEEibgV2p6ZLyfQE4lPA/7Oy9x/5+Rm3pm9291cxqgIHu/kaadNWE5s7E\nCX4GcIu770hJd190zDuiTf9KqAmfl+7vFe3Tn/D5uAoY6e4920ubK1FT1l7c/Y5023NyzHILBABR\nu+NYwgfyRY/aXEtJpn0EWeS/gNDcsCN6Xk2otqe21c539wlJP/sD97v7+9Lkma4TM
m2bdtTBeAGh\ng3EosJlw8ujwZBbzvQ0kfLbT1gTNbLa715nZgsT7NbOZ7v7OLhxzFDCE0J4M4WS0MTmwWeh4bldq\nLSP6n1zG3rWxT6bsipnNcve4V/o5ldRHs4f2gnrc72fU1p9aa/ttF8q5V/9fe32CZvYTwgVNf+Bp\nwv9zprsvzfLYsTreC6Ucm4YAjmP3H3yymXXpAxRX3CuumH5KqOLmy23AM2b2x+j5ucBeI6sIVXOA\nJjPbH3gLODg5gZn9B/AZYExU00gYAPy9neM/QLhifx54Pat3sPv4e42iibYD4O43pry0I7qKfsnM\nPkfoNE871NVijm4i/P0+RVJTG6F9O7mpLbWZIFm6ZoI7CU147yc0E11M+iYPgL+a2fmEIJ2Tq7uo\no/axREA1s8HAu939T3sU3P1VM5vI7iA4093ntZPnZ4H/cfdF0fMhZjbF3f87Jd23CP0Z4wjNi2cC\nTwG/TUoTa7RYku0pTYcnsfvznepp4Efu/mZ7+WfoAULH+yMkdaink2HTZU6UXY3AujB0s5jks1ko\n6RiTSWrGcfc5adJ8k3Ayey/wX4Qv3q3u/s2kNIMIV8PXAl9N2n2LR8NR0+S71wihLryPvZp0krn7\nd1LSH0s4oQ4GvkcYRvgjd38mTd5xRzfFamrL8H0lRg0lamO9gIfd/dQ0abcQmnFaCG3p2Vx8pOaZ\n7go63TDTKwlXu4lA9mFgWmp/U4Z5LiAMZpjjYXTRfoTP3QeT0qStiSSk1kjM7BhCs1Ci6XAD8In2\nmo3N7ByS+hPcvaGj43WkvZpHO2ljN13mSjnWCOrp4tDNYpDvIBAd43nCFXlHab4X/XqfmU0HqlOb\nXKLnmwgda3Fl0sHYodQTfQyj3f05YCvhS4aZfYTQv5Gqxt2fTdQuIi1p0hl7Xum1Rtv2Thh/3H2i\nyWRj1EzyRrTPXty9y5Oe0kg3OS3dOeMywvDrRBC8njAyaa9AAPQwM0t8P82sJ9A7Tbod7t5mZi1R\nU98awpDnt2XRn7QY+BHhQnEw4TN7LpCuf+JaQsvC/0SbrjCzE939mgyPmRC34x2gr7s/Gv2dXiUM\nZplJCA55UY6BYCEwgjQjUCQ7FkbfjCb6vHSlqS2pOl8FXGpmnXauZpD3baRpKkjTDnsNYYhvZ9sg\n5ugm4je1QWgmSIy772is+LSow/QbhOGg/YFvpktoZo+6+3s725ahRgujbBI1wcujMu91eGIGQcIo\nqD9YGITghGVR0o3Uey5qivpVdMytwLPpMkxpiu1NGJWzLU1tKLk58jU6djZwjLu3Rce4gzDiJ9tA\ncCXwNTPrsOM9ErvpMlfKJhDY7lmjA+j60E2JtNfURlJbbYY+kItytWN60u/VhCaKt/sfzOxM4Czg\nADP7eVLagaS/yocwSW0acISZvUYYGbNX342732hhJnCiqe3SdE1tkZEeb2LSneyuOSRGjOyXnCDq\nUK4BhkVBI3nW7P4xjtGRywmB5/dRvn9l96S9ZJkEwauBTxNG+STyvDVNugHARwiz1B8ijBhK24ST\nWhsys3MJV/Op4v7dEwYDiabNQR0l7EyGNbbPE/6nVxCaLt8DxFmiJmtl00dgYcimEZYD+EryS8D1\n3vkYdEnDcjBLulCiq6pHEm3qUYfmMYSO1/9MSrqFMGt1Qwd59SNMKtqSg3JNA27qrFnMYszYjdrn\nP0846b/G7kCwmbCEwc1dLW8ccfqbMszvVHYv8TCGcCEyw91/FnP/p939hJRtsf7uUdopwHXA44T3\ndApwjbv/b4bv4wh3X2Lp1z5KNM+m7lNPmBk+ilC7iZJmX1vutJwl+P3ukGUwhFE6Z2b3AFf47glK\nJcPMxgJ/dvdDU7ZXuXt7NYDUPGKvXZVBuWLNOcikQ93MLk/XOdsVtnudpz0kBdaB7r7Zwsq3e0ke\nKGBmf3D3C9sb6ZPu+xn1HxxLuCL+d2C7p18XK
HlYbg9CP+G7PGU4rWUw3yFKXxsd3whrF+01h6Ez\nZjbN3afa7nWRknk7Hf8vEgYoLADakhJnPcem03KWSyCwpCGMhOnhCQMIywTkcyhm2UlpajuG0D5b\n1E1ttudEKCd0rl7j0ZouWZ6MYq9dlUE5O51zEKWLfQUbpd+jLwe6PO6+LulpNaGZqsXdvxK9Pt3d\nP2C7Z3+/vStJ61ZFaWvdfXV7I33SvPdHCaOgZhGGXT7l0RpWacqZPOmvBVhOqA2tSUmX6XyHAwhX\n5cl/z7wvUW/RMPR8H2ePY5ZRIMh4CKO0rxyb2jI9GUX7JBYmfHuIYyZDAdspx5XsOefgXMKJ66bo\n9eQO9bizlbtl2LSZPenuHS7e18n+13vKTPN2tv2UsDhbM2EuygzCcNz2xv3nlO1eAXQRu6/KvSsX\nQHEDtZm9lzACL3WJibSL4+VC2QQCyY9Sa2qzPcd+P+Hu09tJN4LQoeiEGdVpq/1RB/D5wN/cfbKF\ntXGu7+LJsMM5B+0FqoR2AlbO+3JSmnwSTS4/c/exKelij1jK9PNkey7xMMLd+6RJM5IwVPUkwv/z\nKeBKb2e12jgsxyuAZhKozex3hHWjUoNQ3mYhl82oIckty262cEGZ2XWENt3E2O8rzewkTxn7bWFZ\n5/8EHiNcZd9kZt919z1uHhL5ImHo5hgLS30PJyyN0aWi0sFwyyzbgvMxbHo2u5vadhGaXN5eYyqT\nEUtJn6dD0nye/pF64GjY5DsJtYJXCWtIzUxNF7kNuIswygjCqK7bgNPjvc20lhI6anO1FHQm85sm\ndqXpMRsKBNKeuwjL4ZZSU9tZxBv7/WVgkru/FaXbh3AyShcIXiAsV95EGF30J+CfXSxnJsMt48rF\nireprgYeijqEv0lYjrkp6fVPs3vE0mz2HLH0Xyl5Zfp56gvcCMyO0bE/3Pe8wcvtZvb5TvbpTK5X\nAM0kUD9tZuPc/YUsj5UxNQ1J2YiuNN/tu++yNpTQPJQ6GudR4Ex33xk97024D8VpafL8A+HElqhl\nTAGGuPtHUtNmWNZcD7fMyYq3KXkmlrY4mTBy6ifA11L7hzIZsRQ1rS3yaBiuhdtAjvM0y3tkUM5H\ngNvZvfz5FMI8jqwn05nZVYQ7wiUbmOnIrGwGXUTNfIcQc3RTLigQSNkws4sIY7+foIOx32b2W+Bo\nwkxTBz5E+IL+E/ZcpM7M5nnKXbHSbStHtnuto2sJNxi6y9KsCxSljbVSqJnNASYnmkgszPVoTO03\nyLCcBwE3E+6z4ITa3RXuvqILeT5PWIcocdezKYRbZWY0SCKbQReZjm7KBTUNSTk5m9C8swFYAVzd\nTifwK+w5xPiB6Ge62Z9zzOwEd38awMyOpwj7SCz+MguZeM3MfgmcBlxvYY2kvdYfshgrhSYnT24n\n97CeUFfPQ98jnLQ3ROUZCtwAdKVz9QLgXjO7mFBz+ziw19LrnUnUyMysV2rtzMz6trNP3k747VGN\nQMqGZTEbNboi7e/um1O2J4Zw9iKsnb8iej4KeMFztHJqvli0zIK7f60LedQAZxBqAy9FE6yOdve/\npqTrdKXQpLT3E2pst0SbPgO8x93P7UI5061emrbmkmG+hxP6hFYC52YzdNVKZH6TAoGUFYsxG9XM\n7opeayV0cg4CbnT3HyelyXgIZ7GxNMss5Ok4z7r7cWY2m/B330K4j/FRadLuS7gN5KmEwPooockl\n7WSxmMefR+gbSq4RPJnNyBvbe7LhvoRlPpoh/aTDTvIriflNahqSspFmNuqx7ZxgxkUjYS4mNGVc\nTQgIbweCUjjRJ7P0yyx011Veo8VcKTT6f1yU4+P/hLCs+b2E93wh8IMs88rpooie3RLt3U6BQMrJ\nfMK48/GEL99GC7dwTK3S97Jwk5dzgZvdfZeZlXrVOLkZJrHMwoe648Du/pno119YWCiv3ZVCLYPb\nb2Zw/N+aW
SOhlmHAedkOvSy1C4BcUSCQsuHuX4A9ZqPeRhi7nTob9ZeEE+U8YEbUDLSZEubulxbq\n2MmziN19eeq2FJncfjO26MTfbePuy436CKRspJmNmrjh+GMx9o29ImkxyscyCzGOmZhZ/Dhh1FDy\nzOK/uPuRafaJfftN6T6qEUg5iTUbNerA+xZJ96MlXJ1uam+fEpCPZRY6k8nM4oTYt9+U7qMagVQc\nM7uPMOU/cdevfyWs73Je+3sVN0t/U/gurZKawbEzmVn8KeA+woS+24luv+nuv8xfCaUzqhFIJTrE\n3c9Pev4dM5tbsNLkxjoz+xh7LrPwVncc2N1vshhLLEdzNjZHwzxnkHIzeimcvWYJilSA7dH6OQCY\n2UlAt6xzn0efJAybfIOwsNkFhA7zvIuWWL6BMJnv2OhRn5ouWgzwc91RJsmMmoak4pjZMYRmocQN\nyTcQlihIO+SxFEQrrX4+dZmFrgzLzODYse+FEK1iuh34PbAtsb2YJldVIgUCqTjRmjkXEFZ4HEzo\nJHZ3/25BC9YF+VpmIeaxY9/X2va+rSUAnnRbS+l+6iOQSvQAsBF4HnitwGXJlR5mNiSlRtBd3+9M\n7oUwjrD2zsmEgDAT+EV3FFLap0AglWiku59R6ELkWC6XWcjUtzNIewdheOnPo+dTom0X5rhMkgE1\nDUnFMbNpwE2JtebLhZmNY/cyC49mu8xClsceBRzm7o9Eq5b2TNx8JiVdxd7foZipRiAVI2llySrg\nUjNbSjfdAao7FGqZBTP7N2AqMJTQ73IAobkn3RITJXF/h0qjGoFUjHJYWroYRXMwjgOeSXROm9mC\ndMtARyOMEvd3ADiIsNZQG2UQjEuVagRSMXSiz5tmd99pFlaYiO441t4VZrn1zZQFBQIR6aonzexr\nQF8zO50wKqghXUIF4+KkpiER6ZJo6YjLCPf0NeBhwq0qdXIpEQoEIpIz0fyFkaU8S7sSaa0hEekS\nM3vCzAZGQWAucJuZ3Vjockl8CgQi0lWD3H0zcB5wm7vXAacVuEySAQUCEemqKjOrJcwOnl7owkjm\nFAhEpKu+S+ggfsXdnzOzMcBLBS6TZECdxSIiFU41AhHpEjM73MweNbOF0fMJZvaNQpdL4lMgEJGu\n+hVwDdGN6aOhoxcVtESSEQUCEemqGnd/NmVbS0FKIllRIBCRrlpnZocQrS9kZhcQ7pssJUKdxSLS\nJdEooWnAiYT7Py8DLta6QqVDi86JSNaidYbq3f00M+sH9Eh3QxopbqoRiEiXmNkMdz+l0OWQ7CkQ\niEiXmNk3ge3A74Ftie3uvr5ghZKMKBCISJeY2TLS3IjG3ccUoDiSBQUCEekSM+tLuBnNyYSAMBP4\nhbtvL2jBJDYFAhHpEjP7A7AZ+J9o0xRgsLtfWLhSSSYUCESkS8xsnrtP7GybFC9NKBORrppjZick\nnpjZ8cDfC1geyZBqBCLSJWa2GBgLrIg2HQQsBtoAd/cJhSqbxKNAICJdYmajOnpdM4yLnwKBiEiF\nUx+BiEiFUyAQEalwCgRScczs62a2yMzmm9ncaJRLvo71hJnV5yt/kVzQ6qNSUczsHcAHgMnu3mxm\nw4DeBS6WSEGpRiCVphZY5+7NAO6+zt1fN7P/NLPnzGyhmU0zM4O3r+h/amYzzGyxmR1rZveb2Utm\n9v0ozWgzW2Jmd0S1jHvNrCb1wGb2PjObZWbPm9k9ZtY/2n6dmb0Q7XtDN/4tRAAFAqk8fwUONLN/\nmtl/m9m7ou03u/ux7j4e6EuoNSTsjJZZ/gXwAPBZYDxwiZntE6UZC0yLxsxvJqy987ao5vEN4DR3\nnww0Al80s6HAh4Gjon2/n4f3LNIhBQKpKO6+FagDpgJrgd+b2SXAe8zsGTNbAJwKHJW02/9FPxcA\ni9x9dVSjWAocGL220t0Ts2l/R1iALdkJwDjg72Y2F/gEMIoQNHYAt5rZeUB
Tzt6sSEzqI5CK4+6t\nwBPAE9GJ/9PABMKdtlaa2beB6qRdmqOfbUm/J54nvkOpE3JSnxvwN3efkloeMzsOeC9wEfA5QiAS\n6TaqEUhFMbOxZnZY0qZjgBej39dF7fYXZJH1QVFHNITVN59Kef1p4CQzOzQqR42ZHR4db5C7Pwh8\nPiqPSLdSjUAqTX/gJjMbDLQALxOaiTYSmn6WA89lke9i4BNm9kvgJeCW5BfdfW3UBHW3mfWJNn8D\n2AI8YGbVhFrDF7I4tkiXaIkJkS4ys9HA9KijWaTkqGlIRKTCqUYgIlLhVCMQEalwCgQiIhVOgUBE\npMIpEIiIVDgFAhGRCvf/AcKS4dO7RGixAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fdist.plot(30)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# Stop words\n", "Stop words are words that you want to filter out from your text for downstream analysis. They are typically very common words which don't contain much useful information for the task at hand. There is no universal set of stop words and some domain knowledge is helpful for deciding what you want to include when processing your text." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from nltk.corpus import stopwords" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [ "stops = stopwords.words('english')" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['i',\n", " 'me',\n", " 'my',\n", " 'myself',\n", " 'we',\n", " 'our',\n", " 'ours',\n", " 'ourselves',\n", " 'you',\n", " 'your',\n", " 'yours',\n", " 'yourself',\n", " 'yourselves',\n", " 'he',\n", " 'him',\n", " 'his',\n", " 'himself',\n", " 'she',\n", " 'her',\n", " 'hers',\n", " 'herself',\n", " 'it',\n", " 'its',\n", " 'itself',\n", " 'they',\n", " 'them',\n", " 'their',\n", " 'theirs',\n", " 'themselves',\n", " 'what',\n", " 'which',\n", " 'who',\n", " 'whom',\n", " 'this',\n", " 'that',\n", " 'these',\n", " 'those',\n", " 'am',\n", " 'is',\n", " 'are',\n", " 'was',\n", " 'were',\n", " 'be',\n", " 'been',\n", " 'being',\n", " 'have',\n", " 'has',\n", " 
'had',\n", " 'having',\n", " 'do',\n", " 'does',\n", " 'did',\n", " 'doing',\n", " 'a',\n", " 'an',\n", " 'the',\n", " 'and',\n", " 'but',\n", " 'if',\n", " 'or',\n", " 'because',\n", " 'as',\n", " 'until',\n", " 'while',\n", " 'of',\n", " 'at',\n", " 'by',\n", " 'for',\n", " 'with',\n", " 'about',\n", " 'against',\n", " 'between',\n", " 'into',\n", " 'through',\n", " 'during',\n", " 'before',\n", " 'after',\n", " 'above',\n", " 'below',\n", " 'to',\n", " 'from',\n", " 'up',\n", " 'down',\n", " 'in',\n", " 'out',\n", " 'on',\n", " 'off',\n", " 'over',\n", " 'under',\n", " 'again',\n", " 'further',\n", " 'then',\n", " 'once',\n", " 'here',\n", " 'there',\n", " 'when',\n", " 'where',\n", " 'why',\n", " 'how',\n", " 'all',\n", " 'any',\n", " 'both',\n", " 'each',\n", " 'few',\n", " 'more',\n", " 'most',\n", " 'other',\n", " 'some',\n", " 'such',\n", " 'no',\n", " 'nor',\n", " 'not',\n", " 'only',\n", " 'own',\n", " 'same',\n", " 'so',\n", " 'than',\n", " 'too',\n", " 'very',\n", " 's',\n", " 't',\n", " 'can',\n", " 'will',\n", " 'just',\n", " 'don',\n", " 'should',\n", " 'now',\n", " 'd',\n", " 'll',\n", " 'm',\n", " 'o',\n", " 're',\n", " 've',\n", " 'y',\n", " 'ain',\n", " 'aren',\n", " 'couldn',\n", " 'didn',\n", " 'doesn',\n", " 'hadn',\n", " 'hasn',\n", " 'haven',\n", " 'isn',\n", " 'ma',\n", " 'mightn',\n", " 'mustn',\n", " 'needn',\n", " 'shan',\n", " 'shouldn',\n", " 'wasn',\n", " 'weren',\n", " 'won',\n", " 'wouldn']" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stops" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How would you create a list of tokens that doesn't include the stopwords?" 
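, "\n", "\n", "A minimal sketch of one way to answer this, assuming two small hand-written stand-ins (`tokens` and `stop_set` are hypothetical names, not the notebook's `words` and `stops`): lowercase each token and keep it only when it is not in the stop-word collection. Using a set rather than a list is a design choice here, since it keeps each membership test cheap.\n", "\n", "```python\n", "# Hypothetical stand-ins for the notebook's `words` and `stops`\n", "tokens = ['You', 'know', 'my', 'methods', '.', 'Apply', 'them', '!']\n", "stop_set = {'you', 'my', 'them'}  # set membership tests are O(1) on average\n", "\n", "# Lowercase each token and keep it only if it is not a stop word\n", "kept = [t.lower() for t in tokens if t.lower() not in stop_set]\n", "print(kept)  # ['know', 'methods', '.', 'apply', '!']\n", "```\n", "\n", "Punctuation tokens such as `.` and `!` survive in this sketch because they are not in the stop-word collection."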
] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "filtered_words = [word.lower() for word in words if word.lower() not in stops]" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['``',\n", " 'none',\n", " 'suggest',\n", " '?',\n", " 'know',\n", " 'methods',\n", " '.',\n", " 'apply',\n", " '!',\n", " \"''\",\n", " '``',\n", " 'think',\n", " 'obvious',\n", " 'conclusion',\n", " 'man',\n", " 'practised',\n", " 'town',\n", " 'going',\n", " 'country',\n", " '.',\n", " \"''\",\n", " '``',\n", " 'think',\n", " 'might',\n", " 'venture',\n", " 'little',\n", " 'farther',\n", " '.',\n", " 'look',\n", " 'light',\n", " '.',\n", " 'occasion',\n", " 'would',\n", " 'probable',\n", " 'presentation',\n", " 'would',\n", " 'made',\n", " '?',\n", " 'would',\n", " 'friends',\n", " 'unite',\n", " 'give',\n", " 'pledge',\n", " 'good',\n", " '?',\n", " 'obviously',\n", " 'moment',\n", " 'dr.',\n", " 'mortimer',\n", " 'withdrew',\n", " 'service',\n", " 'hospital',\n", " 'order',\n", " 'start',\n", " 'practice',\n", " '.',\n", " 'know',\n", " 'presentation',\n", " '.',\n", " 'believe',\n", " 'change',\n", " 'town',\n", " 'hospital',\n", " 'country',\n", " 'practice',\n", " '.',\n", " ',',\n", " ',',\n", " 'stretching',\n", " 'inference',\n", " 'far',\n", " 'say',\n", " 'presentation',\n", " 'occasion',\n", " 'change',\n", " '?',\n", " \"''\",\n", " '``',\n", " 'certainly',\n", " 'seems',\n", " 'probable',\n", " '.',\n", " \"''\",\n", " '``',\n", " ',',\n", " 'observe',\n", " 'could',\n", " 'staff',\n", " 'hospital',\n", " ',',\n", " 'since',\n", " 'man',\n", " 'well-established',\n", " 'london',\n", " 'practice',\n", " 'could',\n", " 'hold',\n", " 'position',\n", " ',',\n", " 'one',\n", " 'would',\n", " 'drift',\n", " 'country',\n", " '.',\n", " ',',\n", " '?',\n", " 'hospital',\n", " 'yet',\n", " 'staff',\n", " 'could',\n", " 'house-surgeon',\n", " 
'house-physician',\n", " '--',\n", " 'little',\n", " 'senior',\n", " 'student',\n", " '.',\n", " 'left',\n", " 'five',\n", " 'years',\n", " 'ago',\n", " '--',\n", " 'date',\n", " 'stick',\n", " '.']" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filtered_words" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "filtered_fdist = nltk.FreqDist(filtered_words)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Figure: frequency plot of the 30 most common tokens in filtered_fdist]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "filtered_fdist.plot(30)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import string" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'!\"#$%&\\'()*+,-./:;<=>?@[\\\\]^_`{|}~'" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string.punctuation" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": true }, "outputs": [], "source": [ "stops = stopwords.words('english') + list(string.punctuation)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['i',\n", " 'me',\n", " 'my',\n", " 'myself',\n",
" 'we',\n", " 'our',\n", " 'ours',\n", " 'ourselves',\n", " 'you',\n", " 'your',\n", " 'yours',\n", " 'yourself',\n", " 'yourselves',\n", " 'he',\n", " 'him',\n", " 'his',\n", " 'himself',\n", " 'she',\n", " 'her',\n", " 'hers',\n", " 'herself',\n", " 'it',\n", " 'its',\n", " 'itself',\n", " 'they',\n", " 'them',\n", " 'their',\n", " 'theirs',\n", " 'themselves',\n", " 'what',\n", " 'which',\n", " 'who',\n", " 'whom',\n", " 'this',\n", " 'that',\n", " 'these',\n", " 'those',\n", " 'am',\n", " 'is',\n", " 'are',\n", " 'was',\n", " 'were',\n", " 'be',\n", " 'been',\n", " 'being',\n", " 'have',\n", " 'has',\n", " 'had',\n", " 'having',\n", " 'do',\n", " 'does',\n", " 'did',\n", " 'doing',\n", " 'a',\n", " 'an',\n", " 'the',\n", " 'and',\n", " 'but',\n", " 'if',\n", " 'or',\n", " 'because',\n", " 'as',\n", " 'until',\n", " 'while',\n", " 'of',\n", " 'at',\n", " 'by',\n", " 'for',\n", " 'with',\n", " 'about',\n", " 'against',\n", " 'between',\n", " 'into',\n", " 'through',\n", " 'during',\n", " 'before',\n", " 'after',\n", " 'above',\n", " 'below',\n", " 'to',\n", " 'from',\n", " 'up',\n", " 'down',\n", " 'in',\n", " 'out',\n", " 'on',\n", " 'off',\n", " 'over',\n", " 'under',\n", " 'again',\n", " 'further',\n", " 'then',\n", " 'once',\n", " 'here',\n", " 'there',\n", " 'when',\n", " 'where',\n", " 'why',\n", " 'how',\n", " 'all',\n", " 'any',\n", " 'both',\n", " 'each',\n", " 'few',\n", " 'more',\n", " 'most',\n", " 'other',\n", " 'some',\n", " 'such',\n", " 'no',\n", " 'nor',\n", " 'not',\n", " 'only',\n", " 'own',\n", " 'same',\n", " 'so',\n", " 'than',\n", " 'too',\n", " 'very',\n", " 's',\n", " 't',\n", " 'can',\n", " 'will',\n", " 'just',\n", " 'don',\n", " 'should',\n", " 'now',\n", " 'd',\n", " 'll',\n", " 'm',\n", " 'o',\n", " 're',\n", " 've',\n", " 'y',\n", " 'ain',\n", " 'aren',\n", " 'couldn',\n", " 'didn',\n", " 'doesn',\n", " 'hadn',\n", " 'hasn',\n", " 'haven',\n", " 'isn',\n", " 'ma',\n", " 'mightn',\n", " 'mustn',\n", " 'needn',\n", " 'shan',\n", " 
'shouldn',\n", " 'wasn',\n", " 'weren',\n", " 'won',\n", " 'wouldn',\n", " '!',\n", " '\"',\n", " '#',\n", " '$',\n", " '%',\n", " '&',\n", " \"'\",\n", " '(',\n", " ')',\n", " '*',\n", " '+',\n", " ',',\n", " '-',\n", " '.',\n", " '/',\n", " ':',\n", " ';',\n", " '<',\n", " '=',\n", " '>',\n", " '?',\n", " '@',\n", " '[',\n", " '\\\\',\n", " ']',\n", " '^',\n", " '_',\n", " '`',\n", " '{',\n", " '|',\n", " '}',\n", " '~']" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stops" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": true }, "outputs": [], "source": [ "filtered_words = [word.lower() for word in words if word.lower() not in stops]" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": true }, "outputs": [], "source": [ "filtered_fdist2 = nltk.FreqDist(filtered_words)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAE/CAYAAACpct9bAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJztnXmYHWWV/z/f7s5CZw9rQ7ZhVbYEukUUVERxRUQEhMER\n3BhABXVUfqgD6uioMy4zwAyIC6IoCggjiQsie1TADmSDsEQEEhKWhKx09pzfH+97u6ur6/at27m3\nb3ff83meeu6tqlOn3lu3qs67nHNemRmO4ziOA9BQ6wI4juM4Awc3Co7jOE4nbhQcx3GcTtwoOI7j\nOJ24UXAcx3E6caPgOI7jdOJGwXEcx+nEjYLjOI7TiRsFx3Ecp5OmWhegXHbZZRebNm1an47dsGED\nO+20U0VlXafrdJ2uc6DpzGLOnDkrzGzXkoJmNqiW1tZW6yvt7e0Vl3WdrtN1us6BpjMLoN1yvGO9\n+8hxHMfpxI2C4ziO04kbBcdxHKcTNwqO4zhOJ24UHMdxnE6qahQkPSVpgaS5ktoz9kvSpZIWS5ov\n6fBqlsdxHMfpnf6IU3ijma0osu/twH5xeTVwRfysONu2G8+s2cK+HVsY1zysGqdwHMcZ9NS6++jd\nwE+iG+19wHhJLdU40fnXPcSn/rCSOx57vhrqHcdxhgSyKs7RLOnvwCrAgO+Z2VWp/bOAb5jZ7Lh+\nO3ChmbWn5M4GzgZoaWlpnTlzZtlluWnRen62cD3v2LeZDx82tqR8R0cHzc3NFZNzna7TdbrO/tKZ\nRVtb2xwzayspmCfCra8LsGf83A2YB7w+tf83wNGJ9duB1t509jWiefYTL9rUC2fZCZfPziU/1CIc\nXafrdJ31ozMLBkJEs5kti58vADcDR6RElgKTE+uTgGXVKMshk8YhYNGytWzauq0ap3Acxxn0VM0o\nSBolaUzhO/AWYGFK7BbgA9EL6UhgjZktr0Z5xo4cxl5jGtm8bTuLlq+rxikcx3EGPdVsKewO
zJY0\nD3gA+I2Z/V7SOZLOiTK/BZ4EFgPfB86rYnnYd2LwOpq3ZHU1T+M4jjNoqZpLqpk9CUzP2H5l4rsB\nH6tWGdLsN3E4dz290Y2C4zhOEWrtktqv7LdzaCnMdaPgOI6TSV0ZhSnjmhje1MCTK15mTceWWhfH\ncRxnwFFXRmFYgzhozxCjMP9Zby04juOkqSujADBj8ngA5j7jRsFxHCdN3RqFeUvdKDiO46SpW6Mw\nd8nqQhS14ziOE6k7ozBlYjPjm4exYv1mnl29odbFcRzHGVDUnVGQxPRJsQtpyZoal8ZxHGdgUXdG\nAZJdSKtqXBLHcZyBRV0bBW8pOI7jdKcujcKhk8YBsODZNWzdtr3GpXEcxxk41KVR2Hn0CKZMbGbD\nlm08/vz6WhfHcRxnwFCXRgFguscrOI7j9KBujYJHNjuO4/Skjo1CGFfwloLjOE4XVTcKkholPSRp\nVsa+syS9KGluXD5S7fIUOGjPcTQ1iMefX8fLm7b212kdx3EGNP3RUrgAWNTL/l+a2Yy4/KAfygPA\nyGGNvKJlDNsteCE5juM4VTYKkiYB7wT67WVfDl2Rzd6F5DiOA9VvKfwX8Dmgt2CA90qaL+lGSZOr\nXJ5uuAeS4zhOd1StTKGSjgfeYWbnSToG+IyZHZ+S2RlYb2abJJ0DnGpmx2boOhs4G6ClpaV15syZ\nfSpTR0cHzc3NnetL1m7lk7euYJedGvje8bv1KptXZyVkXafrdJ2uc0dl07S1tc0xs7aSgmZWlQX4\nOrAUeAp4DugAru1FvhFYU0pva2ur9ZX29vZu61u3bbeDLv69Tb1wlj2/ZkOvsnl1VkLWdbpO1+k6\nd1Q2DdBuOd7dVes+MrOLzGySmU0DTgPuMLP3J2UktSRWT6D3AemK09ggDtkruKbO9XEFx3Gc/o9T\nkPQVSSfE1fMlPSxpHnA+cFZ/l2fGFB9XcBzHKdDUHycxs7uAu+L3ixPbLwIu6o8yFMPnVnAcx+mi\nbiOaCxw2pcstdft2n57TcZz6pu6Nwu5jR7LH2JGs27SVJ1e8XOviOI7j1JS6NwoA0wt5kHyw2XGc\nOseNAjBj8gTAPZAcx3HcKJBoKbgHkuM4dY4bBeCQvcYhwaLla9m4ZVuti+M4jlMz3CgAY0YOY7/d\nRrNlm/HI8rW1Lo7jOE7NcKMQ8YypjuM4bhQ6KWRM9cFmx3HqGTcKkcKczd5ScBynnnGjEDlgjzGM\naGrgqZUdrO7YXOviOI7j1AQ3CpFhjQ2eMdVxnLrHjUKCzpnYPDme4zh1ihuFBF2DzatqXBLHcZza\n4EYhwWGdczavKcwG5ziOU1e4UUgwacJOTBw1nJde3swLHR7Z7DhO/VF1oyCpUdJDkmZl7Bsh6ZeS\nFku6X9K0apenNyQxfVIYbH5i5ZZaFsVxHKcm9EdL4QKKz738YWCVme0LfBf4Zj+Up1cKGVOfeMmN\nguM49UdVp+OUNAl4J/A14NMZIu8GvhS/3whcLklWww79QsbUe57ZyJk/eqCk/Nq1axg7r7QcwIFj\nNtHaukPFcxzHqSqq5vtX0o3A14ExwGfM7PjU/oXA28xsaVz/G/BqM1uRkjsbOBugpaWldebMmX0q\nT0dHB83Nzb3KrN+8nX+e9SIbt1X+ugxvhGtP3J3GBvUql6ec5cq6TtfpOutHZxZtbW1zzKytpKCZ\nVWUBjgf+N34/BpiVIfMwMCmx/jdg5970tra2Wl9pb2/PJffUivV21cw/2R2PPl9yySv3mn//o029\ncJY9/OyaipWzHFnX6TpdZ/3ozAJotxzv7mp2Hx0FnCDpHcBIYKyka83s/QmZpcBkYKmkJmAc8FIV\ny5SLqTuP4vCWEbQesFtJ2bHrl+SSa502kWXzljFv6WoO3HNsJYrpOI5Tcao20GxmF5nZJDObBpwG\n3JEyCAC3AGfG7ydHmSEZIFBIuDf3GU+h4TjOwKWqA81Z
SPoKoRlzC/BD4KeSFhNaCKf1d3n6ixk+\n5afjOIOAfjEKZnYXcFf8fnFi+0bglP4oQ605aM9xNAoef34dL2/ayqgR/W6PHcdxSuIRzf3EyGGN\nTB3XxHaDBc96wj3HcQYmbhT6kX0nDgN8Ih/HcQYubhT6kf12DkbB52twHGeg4kahH9lv4nDAWwqO\n4wxc3Cj0I3uOaWT0iCaWrdnIC2s31ro4juM4PXCj0I80Shw6yaf8dBxn4OJGoZ/pnPLT4xUcxxmA\nuFHoZ6ZPKkz56UbBcZyBhxuFfuawKcEozF+yhu3bh2RGD8dxBjFuFPqZ3ceOZI+xI1m3aStPrni5\n1sVxHMfphhuFGtCZHM+7kBzHGWC4UagBnYPNbhQcxxlguFGoAdM9Y6rjOAMUNwo14NBJ45Fg0fK1\nbNyyrdbFcRzH6cSNQg0YPaKJ/XYbzZZtxiPL19a6OI7jOJ24UagRnfEKPhOb4zgDiKoZBUkjJT0g\naZ6khyV9OUPmLEkvSpobl49UqzwDjRlTPLLZcZyBRzWn/9oEHGtm6yUNA2ZL+p2Z3ZeS+6WZfbyK\n5RiQFFoK7oHkOM5AomotBQusj6vD4uIhvJED9hjDyGENPLWyg1Uvb651cRzHcQCQWfXe05IagTnA\nvsD/mNmFqf1nAV8HXgQeBz5lZksy9JwNnA3Q0tLSOnPmzD6Vp6Ojg+bm5orK7ojOL9yxkkdXbuGL\nr5vAYXuMGLDldJ2u03UOLp1ZtLW1zTGztpKCZlb1BRgP3AkcnNq+MzAifj8HuKOUrtbWVusr7e3t\nFZfdEZ1fmfmwTb1wlv3XbY9XTOeOyrlO1+k6B7/OLIB2y/G+7hfvIzNbDdwFvC21faWZbYqr3wda\n+6M8A4WudBeralwSx3GcQDW9j3aVND5+3wl4M/BoSqYlsXoCsKha5RmIzOicW2FNoeXkOI5TU6rp\nfdQCXBPHFRqA681slqSvEJoxtwDnSzoB2Aq8BJxVxfIMOCZN2ImJo4bz0subWfLSBqbs3Le+Qsdx\nnEpRNaNgZvOBwzK2X5z4fhFwUbXKMNCRxIzJ47nj0ReYu3S1GwXHcWqORzTXGI9XcBxnIOFGocYU\nMqb63AqO4wwE3CjUmMJg88Jn17Bl2/Yal8ZxnHrHjUKNGd88nGk7N7Np63Yee25drYvjOE6dU7ZR\nkDRB0qHVKEy94tNzOo4zUMhlFCTdJWmspInAPOBqSd+pbtHqB5+e03GcgULelsI4M1sLnARcbWat\nhGA0pwJM95aC4zgDhLxGoSlGH58KzKpieeqSA1vGMqxRLH5xPes2bql1cRzHqWPyGoUvA7cCi83s\nr5L2Bp6oXrHqi5HDGnlly1jMYMGza2pdHMdx6pi8RmG5mR1qZucBmNmTgI8pVJDO6Tm9C8lxnBqS\n1yhclnOb00dm+GCz4zgDgF5zH0l6DfBaYFdJn07sGgs0VrNg9UaXB9IaOHB8jUvjOE69Uioh3nBg\ndJQbk9i+Fji5WoWqR/beZRRjRjbx3NqNrNywrdbFcRynTunVKJjZ3cDdkn5sZk/3U5nqkoYGMX3S\neGYvXsHil7bwlloXyHGcuiRv6uwRkq4CpiWPMbNjq1GoemX65HHMXryCx1e6W6rjOLUhr1G4AbgS\n+AGQq29D0kjgHmBEPM+NZnZJSmYE8BPCNJwrgfeZ2VM5yzTkmDF5AgCLX3Kj4DhObchrFLaa2RVl\n6t4EHGtm6yUNA2ZL+p2Z3ZeQ+TCwysz2lXQa8E3gfWWeZ8gwfVJIo/23VVvYtt1obFCNS+Q4Tr2R\n1yjMlHQecDPhZQ+Amb1U7AALkw6vj6vD4pKeiPjdwJfi9xuByyXJ6nTC4t3GjmTPcSNZtmYj17cv\nYbcxI0oes3jZRlY3P18xOQA2+kC349QreY3CmfHzs4ltBuzd20FxfuY5wL7A/5jZ/SmRvYAlAGa2\nVdIaYGdgRc5yDTlm
TBnPsgXPcdFNC/If9Kf2ispNHdfEm47Kf3rHcYYO6o9KuaTxhFbGJ8xsYWL7\nw8BbzWxpXP8bcISZrUwdfzZwNkBLS0vrzJkz+1SOjo4OmpvzzYOcV7bSOp9ctYVfLlyDKV8YyLZt\n22hsLC2bV27+85vYsh2uPmE3xo7oPbZxMFxP1+k661FnFm1tbXPMrK2koJmVXIAPZC15jk3ouAT4\nTGrbrcBr4vcmQgtBvelpbW21vtLe3l5x2aGm85Qr/mxTL5xldyx6vmI6y5F1na7TdVZGNg3Qbjne\n1XnTXLwqsbyOMA5wQm8HSNo1thCQtBMh1fajKbFb6OqaOhm4IxbeqRE+Z7Tj1De5xhTM7BPJdUnj\ngJ+WOKwFuCaOKzQA15vZLElfIVisW4AfAj+VtBh4CTit3B/gVJbOdBtL3Sg4Tj2Sd6A5TQewX28C\nZjYfOCxj+8WJ7xuBU/pYBqcKJBPzmRmSu8U6Tj2RyyhImkmXO2kj8Erg+moVyqkde43fiXEjGljV\nsYVnXupg6s6jal0kx3H6kbwthW8lvm8FnrboMeQMLSSx78RhzFm+iblLVrtRcJw6I9dAs4XEeI8S\nMqVOADZXs1BObdl/4jDAB5sdpx7JZRQknQo8QOj/PxW4X5Knzh6i7BuNgk/44zj1R97uoy8ArzKz\nFyC4mwJ/JKSmcIYYBaOwcNlaNm/dzvCmvJ7LjuMMdvI+7Q0FgxBZWcaxziBj9PAG9t5lFJu3buex\n59bVujiO4/QjeV/sv5d0q6SzJJ0F/Ab4bfWK5dSaQrzCXI9XcJy6olejIGlfSUeZ2WeB7wGHAtOB\nvwBX9UP5nBpRiFeY+4wbBcepJ0q1FP4LWAdgZjeZ2afN7FOEVsJ/VbtwTu3wyGbHqU9KGYVpMTK5\nG2bWTpia0xmivLJlDMMbG/jbi+tZu9FngnOceqGUURjZy76dKlkQZ2AxoqmRV+45FjNYsHRNrYvj\nOE4/Ucoo/FXSR9MbJX2YMHmOM4SZMckzpjpOvVEqTuGTwM2SzqDLCLQBw4H3VLNgTu2ZMWU81/zl\naTcKjlNH9GoUzOx54LWS3ggcHDf/xszuqHrJnJozfVL0QPKMqY5TN+SdT+FO4M4ql8UZYEzbeRRj\nRzbx4rpNPLd2Iy3jfBjJcYY6HpXsFKWhQV1BbB6v4Dh1QdWMgqTJku6UtEjSw5IuyJA5RtIaSXPj\ncnGWLqd2zPDIZsepK/o681oetgL/YmYPShoDzJF0m5k9kpK718yOr2I5nB2gMK7gGVMdpz6oWkvB\nzJab2YPx+zpgEbBXtc7nVIdC99GCpWvYtt1KSDuOM9iRWfUfdEnTgHuAg81sbWL7McCvgKXAMuAz\nZvZwxvFnA2cDtLS0tM6cObNP5ejo6KC5ubmisvWg85zfvMCLHdv57lt2Zsq4YQO2nK7TdbrO4rS1\ntc0xs7aSgmZW1QUYTYhxOClj31hgdPz+DuCJUvpaW1utr7S3t1dcth50nvezOTb1wln2iweerpjO\nHZVzna7TdZYH0G453tlV9T6SNIzQEviZmd2UYZDWmtn6+P23wDBJu1SzTE75zOiMV/B0F44z1Kmm\n95GAHwKLzOw7RWT2iHJIOiKWZ2W1yuT0jc6MqT7Y7DhDnmp6Hx0F/BOwQNLcuO3zwBQAM7sSOBk4\nV9JWYANwWmzmOAOIg/caS2ODeOz5dWzYvI2dhjfWukiO41SJqhkFM5sN9JoXwcwuBy6vVhmcytA8\nvIn9dx/DouVrWbhsDa+aNrHWRXIcp0p4RLOTixmTQ8ZU70JynKGNGwUnF4XI5ofcKDjOkMaNgpML\nH2x2nPrAjYKTi/12G0Pz8EaWrtrAivWbal0cx3GqhBsFJxeNDeKQvXxcwXGGOm4UnNzM8C4kxxny\nuFFwcuODzY4z9HGj4OQmOdjsMYaOMzRxo+DkpmXcSHYdM4K1G7fy1MqOWhfHcZwq4E
bByY2krpnY\nlqyqcWkcx6kGbhScsugabPaMqY4zFHGj4JTF9M402j7Y7DhDETcKTlkcGnMgPbJsLVu2+WCz4ww1\n3Cg4ZTF25DD22XUUm7dt5+k1W2tdHMdxKowbBadsZkyeAMDjL22ucUkcx6k0bhScsimk0V780pYa\nl8RxnEpTzek4J0u6U9IiSQ9LuiBDRpIulbRY0nxJh1erPE7lKASxPeFGwXGGHNVsKWwF/sXMXgkc\nCXxM0oEpmbcD+8XlbOCKKpbHqRCv2GMsw5saWLZuG2s63DA4zlCimtNxLgeWx+/rJC0C9gIeSYi9\nG/hJnJf5PknjJbXEY50ByvCmBg7acywPPbOa4757N8MaS9ctNm/exPDb7qiYXD3rlOD1ezXS2prr\n9I5TFuqPHDaSpgH3AAeb2drE9lnAN+J8zki6HbjQzNpTx59NaEnQ0tLSOnPmzD6Vo6Ojg+bm5orK\n1qvOmx9dz7UL1ufS51Se5ia45sTdaVCv06APinvJdVZeZxZtbW1zzKytlFzVWgoFJI0GfgV8MmkQ\nCrszDulhpczsKuAqgLa2NmvtYxVpzpw55D02r2y96mxthTfMfoBXHHhwLp0LFy7g4IMPqZhcPet8\n7xV/5oV1m5g49RXss+voXmUHw73kOiuvc0eoqlGQNIxgEH5mZjdliCwFJifWJwHLqlkmp3JM3KmR\nyRPz1VpeGNWUSzavXD3rnDF5PH945HnmLVld0ig4TrlU0/tIwA+BRWb2nSJitwAfiF5IRwJrfDzB\ncXpn+mRPNeJUj2q2FI4C/glYIGlu3PZ5YAqAmV0J/BZ4B7AY6AA+WMXyOM6Q4DCfAc+pItX0PppN\n9phBUsaAj1WrDI4zFDlk0jgEPLJ8LZu2bmNEU2Oti+QMITyi2XEGGWNGDmOvsU1s2WY8siztu+E4\nO4YbBccZhOw3cRjgXUhO5XGj4DiDkH2jUfDBZqfSuFFwnEHI/oWWwlKfAc+pLG4UHGcQMmVcE8Ob\nGvj7ipdZ3eEpzJ3K4UbBcQYhTQ3i4D3HAt5acCqLGwXHGaQUJjvywWankrhRcJxByvQ42ZEbBaeS\nuFFwnEHKYbGlMHfJavoj27FTH7hRcJxByuSJOzGheRgrX97M0lUbal0cZ4jgRsFxBimSOpPjzVvq\nXUhOZXCj4DiDmBmFjKnPuFFwKoMbBccZxHhLwak0bhQcZxAzfVIwCgueXcOWbdtrXBpnKOBGwXEG\nMRNHDWfqzs1s3LKdx59fV+viOEMANwqOM8gptBbmLfHIZmfHqeZ0nD+S9IKkhUX2HyNpjaS5cbm4\nWmVxnKFM52DzklU1LokzFKjmdJw/Bi4HftKLzL1mdnwVy+A4Q57OwWZvKTgVoGotBTO7B3ipWvod\nxwkctOdYmhrE4y+sY/2mrbUujjPIUTXD4yVNA2aZ2cEZ+44BfgUsBZYBnzGzh4voORs4G6ClpaV1\n5syZfSpPR0cHzc3NFZV1na5zIOj83B9X8LdVW/nyGyZw8G4jBmw5XWf/6Myira1tjpm1lRQ0s6ot\nwDRgYZF9Y4HR8fs7gCfy6GxtbbW+0t7eXnFZ1+k6B4LOL9w836ZeOMuuuGtxxXTuqJzrrJ3OLIB2\ny/GOrZn3kZmtNbP18ftvgWGSdqlVeRxnMFPwQPLIZmdHqZlRkLSHJMXvR8SyrKxVeRxnMHPYFI9s\ndipD1byPJF0HHAPsImkpcAkwDMDMrgROBs6VtBXYAJwWmziO45TJ3ruMZsyIJpav2cjzazey+9iR\ntS6SM0ipmlEws9NL7L+c4LLqOM4O0tAgDp08jj8tXsncJat560F71LpIziDFI5odZ4jQFdnsXUhO\n33Gj4DhDBM+Y6lQCNwqOM0Q4LBqF+UvWsH27D885fcONguMMEXYbO5KWcSNZt2krT65YX+viOIMU\nNwqOM4QojCs85PEKTh9xo+A4Q4gZHq/g7CBuFB
xnCOFzKzg7ihsFxxlCHDppHA2CRcvXsnHLtloX\nxxmEuFFwnCHEqBFN7LfbGLZuNx5etrbWxXEGIW4UHGeIMX3yOMCD2Jy+4UbBcYYYMyZPAGCuGwWn\nD7hRcJwhRmdLwT2QnD7gRsFxhhgH7D6GkcMaeHplB+s2ba91cZxBhhsFxxliNDU2cMheobXwxEtb\nalwaZ7DhRsFxhiCFeIXFbhScMqmaUZD0I0kvSFpYZL8kXSppsaT5kg6vVlkcp94oRDY//tLmGpfE\nGWxUs6XwY+Btvex/O7BfXM4GrqhiWRynrki2FHxCQ6ccqjnz2j2SpvUi8m7gJ3EKzvskjZfUYmbL\nq1Umx6kXJk3YiZ1HDWfly5v57h+fYNTwxpLHLH32ZdrX/61icq6z8jpfXrmB1tZcon2makYhB3sB\nSxLrS+M2NwqOs4NI4vCpE7jtkee59PYn8h84/9HKyrnOisrtN3EYn85/9j6hajYtY0thlpkdnLHv\nN8DXzWx2XL8d+JyZzcmQPZvQxURLS0vrzJkz+1Sejo4OmpubKyrrOl3nQNW5bN1W/vDEWtSYr+63\ndetWmppKy+aVc52V1zmuaTsnHjQ+l2yatra2OWbWVlLQzKq2ANOAhUX2fQ84PbH+GNBSSmdra6v1\nlfb29orLuk7X6Tpd50DTmQXQbjne27V0Sb0F+ED0QjoSWGM+nuA4jlNTqjamIOk64BhgF0lLgUuA\nYQBmdiXwW+AdwGKgA/hgtcriOI7j5KOa3kenl9hvwMeqdX7HcRynfDyi2XEcx+nEjYLjOI7TiRsF\nx3EcpxM3Co7jOE4nVQ1eqwaSXgSe7uPhuwArKizrOl2n63SdA01nFlPNbNeSUnmCGYbKQs7gjXJk\nXafrdJ2uc6Dp3JHFu48cx3GcTtwoOI7jOJ3Um1G4qgqyrtN1uk7XOdB09plBN9DsOI7jVI96ayk4\njuM4veBGwXEcx+nEjYLjOI7TiRsFp9+QNKLWZXAcp3fcKAwQJP1K0jsl9dt/Iunw3pYix9wr6WuS\n3iZpTC+6f5RaH02YQ6OY/F6SXivp9YWlzz+sikg6WtIH4/ddJf1DEbmpkt4cv++UvlaSvhk/T+lD\nGUp6oJQ6f7Fz5ymPpAZJY4vsayx1fLk64/6K3h+SJmYsw4rI3p5nW9ye57o3SDq1jLJOy9j2qrzH\nl8uQ9T6S1Ghm28qQvxMw4CUzO7mIzIIok4mZHZqQnVlC9oSU7jcTJho6ErgB+LGZZc7mHWvc7yVM\nd9o5J4aZfSUltzvw78CeZvZ2SQcCrzGzH8b9d0bRkUAbMA8QcChwv5kdnXHuvYGjgdfFsm4C7jWz\nT6Xk/g3YxczOlTQB+A3wfTO7OkPnN4H3AY8Ahf/MMq5Rrt8dZfcHrgB2N7ODJR0KnGBmX82Q+yww\nNaXz2AydlxCu0wFmtr+kPYEbzOyolNxHCXOKTzSzfSTtB1xpZm9KyCwADidc50wDXAxJD5nZYb3s\nL3n+KPdg+txZ2+L2nwPnEP6fOcA44Dtm9p8pub8DNwJXm9kjJX5HXp0l7w9Jl9H783Z+SudTwGRg\nFeGeHw8sB14APmpmcySNBJqBOwkThikePhb4nZm9MqUz13WPsveYWS7DJulB4F1m9mxcfwNwuZkd\nkuf4cqnaJDu1JL78vga8p4zDzoqfvRmS4+NnYXKgn8bPMwizxyX5Vhnnxsz+CPxR0jjgdOA2SUuA\n7wPXmtmWhPivgTWEB2lTL2p/DFwNfCGuPw78EvhhPOcbAST9AjjbzBbE9YOBzxQp55OSNgCb4/JG\n4JUZcv8q6ZuSrgRagW+Y2a+KlPNEwou2t98C+X83hOv2WcJc4JjZ/PgS+mpK7gbgyihfqhLxHuAw\n4MGoc1mR1tLHgCOA+6PcE5J2S8n8npDDZpSktYntCodY0Voz4cXVG72eX9LbCbMe7iXp0sRxY4Gt\nRXQeaGZrJZ
1BaPFdSPgf/jMldyhwGvCD2Or9EfALM1tLT/LqzHN/tMfPo4ADCfc5wClRZ5rfAzeb\n2a0Akt4CvA24Hvhf4NXAPwOfBPaMOgpGYS3wPxk68/zvBW6T9JlYzpcLG83spQzZfwb+T9K7CBWJ\nfyf8f9Wh2nk0arEAtwK7VlH/n/Js64PenYELCDf4LYTa0WXAXSm5hTn1/TV+PpTYNjdDLte2uP1v\nhJv+gniDNqT2n5RY3gvMJQTcnAScVETn74DROX5Prt9d5m+fU4bOB+Lng/FzFDA/Q+7+5LkJla/5\nKZkR8fPnMgJTAAAgAElEQVTXfbhPHiyxv9fzA9OBMwmJJc9MLCcBE4rofJgwne4NwBvitnklyvF6\n4FnCS+8aYN++6Mx7f0TZO4FhifVhwJ0Zcj1yCBW2pe8T4BM5z13yf0/I/j1jebIX3a8B5gMPUMV3\nm5kNzZYC8A4ro+sIOpu9BrxoZq8uIT5K0tFmNjse+1rCCyJL737A1wm1l5GF7Wa2d0ruJuAVhNbH\nu8xsedz1S0ntdOfPkg6xWLPvhZcl7Rx/F5KOJNS00yyS9APg2ij7fmBREZ2XErqPTifUmu+OTeG/\nxf3vSsk/RHgw3xV135ShswOYG/tpO2uDlmryk/93A6yQtA9dv/1kQvdAmpmSzgNuTp07q8Z2vaTv\nAeNjV8GHCC2MNHdL+jywk6TjgPOAmSmZvxCMalYNuhQqsb/X85vZPGCepJ9bbIHGLr7JZraqiM7v\nAU8RuhjvkTQ1q+xxTOGdhK7QacC3gZ8Ruht/C+xfrk7y3x8QavVjgML/NzpuS/OSpAuBX8T19wGr\nYvm3JwXN7LL4jE+jexfjT1I68/zvhWMzx6KSZHRBNxOe3x9KwlLdq5XCxxT6pruV0CweFzetBj5k\nZg9myM4GLgG+S3gxfpBw3S9JyDQAX7SMvvEi538E2JdQu9hEV5fDoSm5wwktjYOBhcCuwMlmNj8l\nNxI4l1CzA7gHuMLMNvZShtHxt3wGmGRmfRpgjLrOzNpuZtek5HL97ii7N6GF8lpCv/Hfgfeb2VMp\nub9nn7q70U7IHwe8JZ77VjO7LUOmAfhwUg74gSUeNkkLCd0kFxO6udIFyDKehWPPM7P/7WV/yfNH\nubuAEwgvurnAi8DdZvbpYrpTxzeZ2dbUticJtfUfmtmfU/suLfIiL6Uz1/0RZT8IfCmWAeANwJcy\n7qVdCM/l0YRrNBv4MuGlO8XMFidkfwrsQ7hGyTGN9DhFruseZZuBT8dznR0rjweY2ayEzBuyfnfi\n99/d2/6+MiSNQmFMwczKGVPoy3nGEq5hVu27IDPHzFolLbA4MCTpXjN7XUruL2b2mpznnZq13cx6\nzDMhqQk4gHCTPmbdxybKRtK3CQ/SaOA+ggG518yeTMntCnyUnrWrD+3AuXP/7sQxowhdXOv6et5q\nIOlowljUqYSuwiS2I9epjDI8ZGaHSfoIoZVwiaT51t1holcDYWbfSekcbWbrc57/4iI6sxwHhtPV\nyuj1Ppa0B2FMAEKXznN5ytOLvkWE8Y+iL8vYwrjGzN6fU+cvCeMUH7DgCLET8Bczm1FEfneg4HH0\ngJmVGlfqM0O1++i7hC6QilLsAZFCaz79gEQ2xhrEE5I+TuhjzRp8+oOk9wI39XbzxfM8LWk6oUkO\n4aU8L6NcHwN+ZmYPx/UJkk4v1DJVhjdVgvuA/zCz53srI2FQ+F7gj5QYwM3bxUaohd0L/NnMXiaD\ncv8jBTfEZCvpLuB7yZeOpHVkX6fMQeEi13UNYazoq2a2MnY9zpbUbtEbrFJIOopQWy54VBXKmb6e\nTZJaCIbpC2RTGEg/gPBSKhiwdxEqBIVzdnr/FK51kiIthOR/OJLgyNGj21LSMYQxiafib5ks6Uwz\nuydDVsCbgb3N7CuSpkg6wsweSMntT2jlTqOE1xmhlb0H2d2PheO2KbgoDzez
zcXkEuxjZu+TdHo8\nfoOyLlwo66mEVuVdhN9/maTPmtmNOc5TNkO1pVCVriMFl8SimNmXM455FeFGHw/8G8HD4z/M7P6U\n3DrCuMRWYCO9eKFIuoBQCy90MbwHuMrMLkvJzU3XPJRwZyxW8078nswauKQT6HqJ3m1mPfpNs85d\njDxdbFHuQ4RWymuAdQQDcY+Z/Toh09t/ZOlaqMJYyjDCSwfgn4BtZvaRPGUv8nv+g2AIfx43nRY/\n1wJHm9m7UvIH09Mgpvuryzn/o8CnCDXRzufAzFam5E4B/pXgJHFu7HL7TzN7b4bOPwDvLbS4FLyu\nbjCzt8X1zC6exLl7dPVknGMEcIuZvTW1fQ7wj2b2WFzfH7jOzFozdFxBGBM41sxeqTBW8gcze1VK\nbh7B6yx9jeYkZAp9+mOAGYRB3uSYRtpl+nuEcaJb6O5R1KOyKOnPwJsI1/5whfGv68zsiAzZecBx\nhdZBbIX/0cymp2UrglVxFNsXAzglz7Yydc4HRiXWi3nBzCca/rjeCDy8g+f+OnA7YZD1Q8BtwNcz\n5L5KGPDPo3NO/FyQ2HZvL/J7AOcDzwDrisgclXNblrdLUa8awkN/PvAJ4LAiMkW905K/Ma5fQuj/\nfp7gPvwccOMO/kf378jxRXQ+SvSYiusjgEd7kR8LjCnzHBOAJzK2Z97bRXQUPMOSXmdZ/3FJrzPC\neETRJUP+kqyliO7jgLsJ4zg/I7SCjikim75nGtLbKrkM1e6jqiLpajK6Eyy7H/gigstdr9sk3W49\ng4t6bCvsonuXzDayPVJuJXjMXBnLew7BP7u7su7dI8MJNeeXLdtX/p3ADDPbHo+9huBhdFFK7gLg\n85I2AVvopeVDzi62WKs/kPACvRc4mRgzkMFlhBd4qW3bJO1j0Xsq1pYzW5mxD/wUulpoP5Z0g6UC\n4oDRkl5tsTUo6QjCGAz0jAM4meAi+pCZfTD2Hf+gyG/Ky52S/jOWM1mz7XatlDPAL/JT4AFJN8f1\nE+lqXSV1thGM25iwqoITRo9YgVQ3WyPBESLL2aJd0g/pHheUFXsAsCX27xe6snYl5U0UKel1ZmUO\n5FpGT0EvsrcpBKUdSXg2LjCzYnMv/17SrcB1cf199JIdYEdxo9A3ZiW+jyR03yxLCihngJC6oiZ3\niU3dZNRklisdhIfu/tQDmtUvfSEh8OXcqPcPZLxwzCydguFEQhBOMcbT5fI3LksgrbMEnyRcg/MJ\nXWxvBD6QIbcz4eWxOp5/hfX0VHkNweNo19T4wth4bJrPEl6iTxKu0VRC91UWpxNaBxvjub5BMErp\nl+hHgB8peGiJ0G30kTjo/fWU7AYz2y5pq4LjwgtApudTGRQGWdsS2wxI95fnDfDDzL4m6XeEcSwD\nPmhmD2Wc+0fAeWZ2L3QOqF9NCGpLc3zi+1bg+fT/GTmXEBh2PuF63kMIMMviUsKLfjdJXyMY3S9m\nyBW6u5KeX0bGtS8yplQYI/oXi04W6sqK0A3LHqeA8O5YRXgPH6jgZtpjnMTMPivpJLo8pa4ys5vT\ncpXCjUIfsFRkrqTrCAOqSZYRbpoT6F6rWUfo7y1QbtQkZvYdBXfCwk2S+YDG2vwVccmNmf2fpP9X\nZPfXgYfiAyDC2EJnK0HSK8zsURXJnZSurRY2E2qBUwmtFAgvrG4vEoveZJJeCbyV8DJvNLNJCbHh\nhFp5E12DpBCuZ4/0JWZ2exzoLnhoPWrFI2efIjzIBVfdEYRgvrTOvwKHKESny8xWJ3ZfnxJvlzQ+\n/t45wHpC33WfsRipnoNmM3sgNb5ZLKIZQgtqO+H/yqp9Q+jOuzdRltnxpZpVzqfV3WHiHkKXZ1pu\nE/CduPSKmf0sjkG8ifB/nmhmPQavLUecQILvEJ7nn0edpxG6MB8jGMFjolwyC8BIQvBm5vVUV+qO\nh+m6lkZi8D7FnwgtbmMH749SDMmB5v5G
0gHAb8xs34x9Pfyui+j4hKUGijNkxlpICzAxa3+h6Svp\nejM7VUW8i6xnPMNJidUGQg3zDVbERVbBY+VVhAekm8ufpKss+F3fmXGoZdWaJD1GqLEtIPGysdRA\nt6TjCS+Q1xP6n/9CGHvolnwvyn7OzP4jte0UM7shfj/WzO5I/fZkQW9KHFfwrJkSf/dtcf04YLaZ\nnZY8VmXkaEodNw0Ya6k4knKJxugSEs4AwFcs5Toda/4fJwwYH64Q4PdhM3t7hs6Cc8OvCP97MeeG\n7xJafdcRrtH7CLXhX0H3SkEph4ly7+N4zD7AUjPbpOC1dCjwk4JhLud/T+i831IBrZLuM7MjJc2z\nXgZ8Jd1tZj3iDeI9f2gvFZCkbNr76HWAex8NJBLNScXP54CLki2IPt7QvUZNSpplZserK/q6cxcJ\nl0NJLWa2XDn9+uMYSYGthBrx962IL7SkveiZQK5YDackkmZbRvK9DLn/oSsuYlkJ2V6TvUn6sgW/\n/KszDrfk+JDK9KyR9Hu6cjQlPVu+nVHOcsaSciHpVwQ3yqRH1XQzOykllyvAL8rOJyRTfDmujyL4\n1acrGFmVgQLdKgWldJZ7H8dj5hIqNdMI42czCUFh74j7c//vCZ1/IXjGFV7CJwOfjkah08suVVlr\nIOT8utTMDsjQ+TuCw0nJmA71s/eRG4Uq0YcXc66oyTLL8E0zu7DUtnJ1ktHstYyQ+1JGLiH3JkJ/\nfTqNQVatrdcgnsRYzql0JUWDMKZwoGW4/FUaSQvN7OASMmVl4Czz/FmuyEVdhJUjwC9Wbl5lXeMp\nIwn5pfqcqTOvzli+wtjL/oR0ML+zjAC2guGX9Ll4zGUqkVU2Rzn3Bv6b4ApthFidTxEcIlqtK91N\nobImQuXq74QW2uyErkKrcy+Cg0HJ1B1KBL7G9QaCR5VnSR1IqLuv/l2WCE8HsJi7KPab7kEYuDXC\nTZ8VYdlGiajJxLnz1i6PIww2J3l7epukSQTPnKNiGWcTvCGWZpw+V0bTYkYOyPK//yDhQR9G9/7V\nbkZBwa/+W/QexJN3LKeg8wLCQOg6Qr/+4cD/M7M/ZMjmDbLLk6MpayzJYjku7+W4PGxQ99xcRwEb\n0kJKBfrFsYU1BHfNuSnxXM4NCrm2LiGMdxXupa9YKkaiHJ2E1uHrFBwxbif8v+8jeCGl2aIQEPYB\nuvJw9ZgnIW8XG4CFgeR0Tq8CsxPfX2mp1DDqObFUIY/ZHDIi2Yuco1+9j6ri5zrUF+Ab5PDVj7If\nIfjT/5iuqMwPZcjdALSUOO9IYCIhgdiE+H0ioTa+KCF3LqF/voMwcFdY/k6IcE7rvY3wYm6Ky1nA\nbUXKkDej6SLoipEoIZvL5zr+7t0S67tSJKYAaMqrM36+lfCQTqdIFlLCC+BN8VpOJUQNfzlD7hFC\nWvHHouwCivvVX0wYR4AQSHYzcPgO3p8z4rV6Ki4PEfqv03I/J6RT/3ZcHiUM+P8V+FyGfCvBA+gC\nisdo3BZ/xz/E5YuEro5iZT08h85C7MEnCuUiEYeQkj2Q4IF0elz/B4KRT8v9ipDraO+4XELIJpCU\nKZzrsqiz21KsnKW2xe0X5NmW2PdewoD3d4H37Mj9UfL+qabyobrEB70hsd7Yy0P/GLBzYn1nQu6W\ntNydhH7dW+PL6RZCdGe3m4auZHBP0pVydx7w8YTcOIKhuI7w8iosE4uUsZzU2b8CFhPcGHt7QEoa\nuYTs9wmtpFJyJYN4gOsLsnQ3iPOz/qPCNkL3wHvi92IvnFxBdqlr3rkUu5fi59GEGvG72cHgM4JX\n1BkEg/Od+MK7OEPuVhIGnuC19XtgJ+CRDPlGQstmSmEpdo1S29pT6wUjODFryTj+IULXzX3AQVn3\nQh+uUcl7npCtGIL76gfSS0JuD4LBXETIHHx4XI6hSIAf2QYk877r78W7j/pOSV/9yFJCl0CBdcCS\nDLkv
lTqhmf038N+lPJUsNIHXSPpvwkxynakJlAiqSrBC0vvpap6eDmQ19yF4/KSbvZ0BaeqeGuAR\nSb2mBogcDZwZ+2R7y376uxzN6Avi5/HkY45CCod/AC5SSN9QzN0yV5CdxfEihQlWRqb3pyh0rb2T\nMEvXryV9KWfZi/FrQizHg7GMxZhCaNEU2EIwXhsUgg47kfQJgnF5nq5gSaNn/MGdkk6jy/X2ZMKs\ne0l+Tvh/5pDhMEHPWIFPEtyebzazh2Mff7cB7T44dpTsYrOu9C2PAJ+n+/hYsiv0rYTW9SS6u82u\ni8cly3k68I/A3pKSz9EYUs+cysy5VSl8oLkPxJv+G3T1bb+e4H30iwzZnwCHEB5UI9QEHyA027Hs\nJHp5ylAyX46khwhdERbXGwi1trRXzhRCP3ZhIO3PwPlm9kzGeR8EzrSuWdpOBz5p0WVPId2vgG8C\nn0seCnzTMuaqKGMw/puECX4K8Rn3AEfajg2cNxC6W540s9WxT3wvy3ALVc88VuMIeazuS8mdQOiO\n2ZMQjDaV0L13UIbOWYQX95sJtc0NhAH0PnuW5BnojnL/SnADLeSOehfB4H+b4Bp6RkJ2MfBqyx4b\nSOos5PAqGLtGuvIAVe9FVr5jx3TCS71QoVtFuK+z/ve8LtPvteKzCxZkphIqIF8HkrFA6witxpLu\n69XGjUIfiIOoTxBupGfoJT2vcibRUxmpJqLOYwhG4beEwePZlppbuogXSrfUyHHbNYQX+6q4PhH4\nlmW75+1NcM07g/By/gBwvPX0gc9yCe1x7nLIo7Mvtas4gLkf3Q1sURdbhchjsyLeOtGF8FhCX/ph\nkt5I6OM+O0O2mTAN5AIL0ze2AIdYxkB3XiRdBVxmOSYjUpgbpHNOATNLT+hUkLuT4BaZJ+ZmIj2v\nZ4+UEZJ+TZjk5tdmlp7ONn3urNp/VsxLLo+7xCB7If3IeooMsiu/y3TeudMbCXNxvLmUzlrgRqEP\nSDqWrsnr9yZ42NwTu3d6O66B0IdbcrYtxVQTZvb5jH0L6MqXM10xX471zL55E6E1U4hoPg94o5md\nmJLr4bLXmxtfdAv8P0I32IlmtiGx79x4nr3pHu07hpAUruyU5tXQmdD9EUKX0yTC/3gkwVc+64WT\nzOsD4SXSI6+PQjrstmgcDrPgSvmA9YM7bDx/3kmYsoIg11n3tOGFl+dBhKjv39C9OzCdijzrev7Z\nsievfwOhC/CdhNbzL4FZ1tODJ5kNtTNS2MySLdGCbK7KiEI6jzZCy0ixDH8leMHdYInAR+V0mVZ5\n8Sm3AP+UrkwNBHxMoQ9YiIi8m+Av/0ZCormDCIOV3Yg33zmEm2QOME7Sd8wsPTl5+hy9pZrImy/n\nHMJA8BcJNa3bgR61VaBB0oRUS6HbvZHRVzuR0DVwv0LOlsJD93OCh1KP5rFlT3GZh2roLHAB4X+8\nz8zeKOkVBK+ULPLm9VmtkPfoHuBnkl6g9/QRlaZHRHIRHgQmE1q8InSLLY/l/Wg0dgUD+ExchscF\nsltkua9nbD3cHWvOxxKim39EYowqyqWT3/0pPn+dJCoO+ygExRUYQ+gOTbMzoWt1fTz+EkIL+PWE\n5zQZDZ/LZZowA+Hbsn5rBhuBBZJuo3ua7T7HJVUKNwp9QGGu2FHENAuEAJxiMyEdaCE1xRmErp4L\nCTddN6Og7FQTxZpxufLlxDKdlt6ewbcJvvU3xnOeCnwtJZNr4DbWfNYQalYVoRo6E2w0s42SkDTC\nQt6mHhGokbx5fd5NeOg/RehmG0d29s+qkO7r7oXfEwZvbwWQ9BZCV9b1hIRzr050b3amCCmgEDeS\nppzricKMY+8itBgOJzvzalak8B4psXIrDrkH2QnR4HkCxcqZQ/w39ByAHxC4Uegb8wk35sGEl9Vq\nhek0ewQIAcMUZvc6EbjczLZIynrZJ7t+Cqkm3p11cjM7L369MjZZM/
PlKESIfpjQikn273YbKzCz\nn0hqJ9TWBJxkZo+kZPK+aAYbS6OB/T/gNkmr6JnxttAd8YDCRCrJvD53pRVa91nhSk4uU0PazOyc\nwoqZ/UHSv5vZp9Uz6CpXCnhyXM8CClNSvppgnP6HEASa5flV8FJKRgp/OClg5Xvc/Ry4L45rQHj+\nrlOInn4kJXufpAPTz0QGRwNnqbQXHZZj0qFa4WMKO4C6T16/h5mlHyQknU9oHcwj9FtOAa611BzN\nZZ43V0SzpBsIAUn/SKipnkHwgrkApwexj3sc8HtLTKmoMvL5RPmTCN5XuxFeDFV1IewrCq64txMG\neyEYueMIrYW/WkgX0ee0IcWuZ2L/2whBkhWbJVE5Pe7ivryD7IsI0fmlxmhyeT5F2bzR8f2OG4U+\noOCj/jpCa+FpupK03ZHz+B6ZU5Uj1YTKzJejronZ55vZobHFcmvWIGq9IulIwmx0yWkmD8yoWZaj\nczEh8KlHyuaBhKRd6EpJIcI992VC63eKmS1WcN2cQahUXJw4fB1wZ2Ecqo/nH0kYB0imxLjCunIh\nZWYyLZAe6I3H5PK4K7Oc5bzsjwb2M7OrFRLXjTazv2fI5ZqCtha4UegDkj5LMARz0i/3DNm8aYxv\nIzRpC7NLvR84w8yOS8hcQFe+nGfpMgprCVlNu+XMKXi8SLqH8PA9R/CBr3ltZKBQZs0y73/5JzM7\nqrol719ihaKJYCweq5DO6wnG5dq46XRggpmdEvdf3cvhlu4Gjcfk8rirBnGwuo2QG2x/SXsSPJl6\n3AuS5phZqxLJ7iTduyM9CJXCjUKVUf40xrkzWyrH3AtR7iOEtBSHEHIvjQb+1cy+15ffMhQpp2ZZ\n6r9M1GzfQBgI/T9KZH2tJSrP//9dhGSEw83sHyTNIBjErAj1vOfvMRdB1rYyde5G8Lg7li6Pu0/2\n4ghSMRTSdh9GSGFxWNxW7F76E6G34UbgDkIl7xuWkWa7v/GB5uqzj5m9N7H+5XjzpMmdasJCOuBS\ncy80AGtj8/4ednyKx6HKk3HcJ1mzfLKIbKn/suAsYIRkhG9J7MtyYaw1uWcKI6RhOYI4sG5mcxUm\nBdoRHpJ0pMWIcEmvJsww1g1J/06IHC9MlDOBMA1mj2k2Lb/HXTXYbGZWcCSJg9bdkPRTM/snQhR5\ncgraY+maIrSmuFGoPrnSGBOyrV5O6GMspJrInCtYOdJSW4hj+Dg9p390upM3lgNK/Jdm9sG4/RrC\neFDyJdYjgKnWWA7//wRbzWyNuk/d2SfUFfMyDPiApGfi+lR6ev4AvN0SQZxmtkrSO8iYezmvx12V\nuD56p42X9FHCM/39lExrHKM4I+7rAP6lH8qWGzcK1edc4JrYHw0xx0qG3L8Rcq90SzVBuLHS5J17\n4TZJnyF4jSQDZHY04GvIUGbN8hzgJzn+y0MtMS9zfIn1eZKXaqGe/v9t9PT/L7BQ0j8CjdFz5nyy\ng8LykIx5mUD3OZpX9xSnUSHmYVMs906ETLBZ/JTgcfdWEh53fSxnuWwnxC2tBfYnZKa9LSVzJcEF\nd2+6z6NRLBlgv+NGofosIkRH7kOIGF1DiFlIxxUcmvTkMLOXenmRLCQ8vMtLnPtDhBvtvNT2mt94\nA4Vitfp0zTJ2xx1gIa3IWAArnq6kZIT4ACHp/7+FEBvz4SKynwC+QBgj+Tkh7fa/9eWk1pVF9gLC\nfCM3xTL8lFB7To+XXQvcHgeejXBfF/Pz39fMTpH0bjO7RiGjwK19KWcfGEO4fi8R3Hx7xA6Z2aXA\npZKuMLNz+6lcZTEQb9ShRt40xuW8SHYhX1rqA+nu8ncvoabidJGrVp/sjuvFGBTIEyE+ELiQEEOw\nViFj6uGE7owsDoxLYSKmdxNmt+uzqyfhBXqkdc3R/E1CloBuRsHM/iN2Ob2JYDz+zWIUdgaFvE2r\nFTIJP0cYe6s6FqK/vyzpUELMx9
2SllpG4ruBahDAjUJ/kDcfSjkvki/lPPc1hKbspXH99Ljt1JzH\n1wPlGONc3XGWI0J8gPBFM7s++tYfR7gHryBEGaf5GWFgeiHF55soF5FIHEfXPA09MLPfEdJYlOKq\n2Nr7IiHZ3WjCTHD9yQsEY7SSjPk2BjpuFKpPrnwo5bxIzOzuOFi1n5n9USH9cmOG6AEp9747FTJ3\nOl18G/iLQvQ3wCkUN8a5u+PifzcQDUGScib4edG6Jp2pFFeTb97nXBHitfa4U0jK9z7CNLE3EpIK\nDvR7oAcep1AlEh4WTYTc8k9SIh9KGbo/SvCQmWhm+8SBvyutZ5qLH8ftSZe/M60rd5IDRPfeNkIN\neI6Z/aWI3E5kdMdZds6rAY/KmOBHOdNH96EMh5OYNMnMHsqQyR0hLukeM3t9KblqIOkbwC8sNR/D\nYMONQpVQkdD4ArYDCeaib/wRhMl9CkEynZGRCblFhBz4hRnUphAGvrezg4ZpqJAx2HkiITq8R3Cg\nQgTuWkJXCoSX5HgzG5TdcSpjgh9J1xLSRz9MIn10f7h6qowI8Tg2sgH3uOszbhQGIZLuN7NXqyu3\nURMhijJXgq4CO2KYhgoKufdfkxjsHEWYZCcrCrXiEbiDhaxKRz+e+7/JGSGukKE0K0rbPe5y4mMK\ng5O7JX0e2EnScYQujR79vf7Sz0XuwU5yRuAOUfKmj64GY8kfIe4edzuItxQGIXFA7cOEh0QEP+wf\n5Ahmc1IoTDd5JpAc7Pyxmf1XhmzddscpZ/roWjPUuvhqgRuFQU50oZxkGZPsOPnIM9gZ5eq2O67Y\nb++P36wcaeUTsnXbxVcpvPtoECLpLkLgUBMh/9GLku42s0/3eqCTiZk9SAguLCU3ZF/6pajxb7+a\nEEVdmP7z/XHbcRmy9dzFVxG8pTAISQwwfwSYbGaXaAcnEnGcgYrKSytft118lcJbCoOTpug+eCoh\nH43jDGVyp5UnuNg6O4AbhcHJVwiDy38ys79K2ht4osZlcpxqkTutfD138VUK7z5yHGdAEzPZfjKV\nn+pb/RE4V4801LoATvlI2l/S7ZIWxvVDJfWYcMRxhgg90soTpr10qoAbhcHJ94GLiGmCoztqraYg\ndJxq0xAznwIDen6KIYFf2MFJs5k9oO5TIxabW9dxBjuDZX6KIYEbhcHJCkn7EHO8SDqZ0rOwOc6g\nZBDNTzEk8IHmQUj0NroKeC1hnuC/A2e454XjODuKtxQGGTHvUZuZvTlm9Gwws3W1LpfjOEMDbykM\nQmo5kYjjOEMbNwqDEJ9IxHGcauFGYRDiE4k4jlMt3CgMQobaXMGO4wwc3CgMQnwiEcdxqoUbhUGI\nTyTiOE618DQXg5OHJB1ZWPGJRBzHqRTeUhiE+EQijuNUCzcKg5B6nivYcZzq4kbBcRzH6cTHFBzH\ncZxO3Cg4juM4nbhRcOoaSV+Q9LCk+ZLmRk+uap3rLklt1dLvOJXAs6Q6dYuk1wDHA4eb2SZJuwDD\na4jtB7YAAAIRSURBVFwsx6kp3lJw6pkWYIWZbQIwsxVmtkzSxZL+KmmhpKsUp7iLNf3vSrpH0iJJ\nr5J0k6QnJH01ykyT9Kika2Lr40ZJzekTS3qLpL9IelDSDZJGx+3fkPRIPPZb/XgtHAdwo+DUN38A\nJkt6XNL/SnpD3H65mb3KzA4GdiK0JgpsjmnLrwR+DXwMOBg4S9LOUeYA4KoYL7KWkKeqk9gi+SLw\nZjM7HGgHPh3nHn4PcFA89qtV+M2O0ytuFJy6xczWA63A2cCLwC8lnQW8UdL9khYQpoA8KHHYLfFz\nAfCwmS2PLY0ngclx3xIzK0SYX0tIXJjkSOBA4E+S5gJnAlMJBmQj8ANJJwEdFfuxjpMTH1Nw6hoz\n2wbcBdwVjcA/A4cSZrdbIulLwMjEIZvi5/bE98J64XlKB/+k1wXcZmanp8sj6QjgTcBpwMcJRslx
\n+g1vKTh1i6QDJO2X2DQDeCx+XxH7+U/ug+opcRAbQgbb2an99wFHSdo3lqNZ0v7xfOPM7LfAJ2N5\nHKdf8ZaCU8+MBi6TNB7YCiwmdCWtJnQPPQX8tQ96FwFnSvoe8ARwRXKnmb0Yu6mukzQibv4isA74\ntaSRhNbEp/pwbsfZITzNheNUEEnTgFlxkNpxBh3efeQ4juN04i0Fx3EcpxNvKTiO4ziduFFwHMdx\nOnGj4DiO43TiRsFxHMfpxI2C4ziO08n/B5pe5ZVzPOtCAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "filtered_fdist2.plot(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a function that can combine some of our pre-processing tasks to clean up the raw text:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def process_text(text):\n", " # break text into word tokens\n", " tokens = word_tokenize(text)\n", " # remove stopwords\n", " filtered_words = [token.lower() for token in tokens if not token.lower() in stops]\n", " # filter for short punctuation\n", " filtered_words = [w for w in filtered_words if (len(w) > 2)]\n", " return filtered_words" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"Project Gutenberg's The Hound of the Baskervilles, by Arthur Conan Doyle This eBook is for the use of anyone \"" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "whole_text[:110]" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "337863" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(whole_text)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.17 s, sys: 0 ns, total: 1.17 s\n", "Wall time: 1.17 s\n" ] } ], "source": [ "%%time\n", "clean_text = process_text(whole_text)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "fdist_whole_text = 
nltk.FreqDist(clean_text)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAE2CAYAAAB7gwUjAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl4VOX1wPHvyQIhEHaQsAsiLoBKUEGpu63Wui/Val1b\nrNXWqvWH2k1b21rr0sW641Zr3a2C1A0BoYKQIKug7LIjS0wgbCHn98d7B4Zh5s69k0wmTM7neeZJ\ncue+c89MZubc+66iqhhjjDGxcjIdgDHGmIbJEoQxxpi4LEEYY4yJyxKEMcaYuCxBGGOMicsShDHG\nmLgsQRhjjInLEoQxxpi4LEEYY4yJKy/TAdRG+/bttWfPnimV3bJlC82aNWtwZRpqXKmUsbgsLour\nYZWJKCsrW6eqHZLuqKr77K2kpERTVVpa2iDLNNS4UiljcVlc6SxjcYUvEwGUaoDvWKtiMsYYE5cl\nCGOMMXFZgjDGGBOXJQhjjDFxWYIwxhgTlyUIY4wxcaUtQYhIgYhMEZEZIjJHRO7ytj8jIotFZLp3\nO9zbLiLyNxFZICIzRWRgumIDWFlZjdpqesYYk1A6B8ptA05S1U0ikg9MFJH/evfdqqqvxux/OtDH\nux0NPOL9rHO//M8snp+8jhadvuLEgzqm4xDGGLPPS9sVhDceY5P3Z7538ztlPxt4zis3GWgtIsXp\niK1H2+YA/O3D+XYVYYwxCUg6vyBFJBcoAw4A/qGqw0XkGWAI7gpjDHCbqm4TkVHAPao60Ss7Bhiu\nqqUxjzkMGAZQXFxcMnLkyNBxbamu4UejvmLTDuXO49vQv2PTQOWqqqooLCwMdaywZerjGPVVxuKy\nuCyuhlUmYtCgQWWqOijpjkGGW9f2BrQGxgL9gGJAgKbAs8CvvX3eBoZGlRkDlPg9bm2m2rj9n+O0\nx/BR+t3HPg5cprEP08+W52JxWVzpLNNQ44pGQ5pqQ1XLgXHAaaq6yotxG/A0cJS323KgW1SxrsDK\ndMV02gGFFBXkMXnRBqYu2ZCuwxhjzD4rnb2YOohIa+/3ZsApwLxIu4KICHAOMNsr8hZwudebaTDw\ntaquSld8zfNzuOqYngA89OGCdB3GGGP2Wem8gigGxorITGAq8L6qjgL+JSKzgFlAe+Bub//RwCJg\nAfAE8OM0xgbAVcfuT/MmuYz/4itmLCtP9+GMMWafkrZurqo6EzgizvaTEuyvwPXpiieeNs2bcNmQ\nHjw2fhEPjV3AE5cnb7MxxpjGotGPpP7B0F4U5Ofw/mdrmLuqItPhGGNMg9HoE0SHoqZcclR3AB4a\na20RxhgT0egTBMCw43rRJDeH0bNWsWDtpuQFjDGmEbAEARS3asYFg7qiCg/bVYQxxgCWIHa57vje\n5OYIb85YydL1mzMdjjHGZJwlCE+3toWce0QXdtYoj45fmOlwjDEm4yxBRPnxCb3JEXi1bDkryrdk\nOhxjjMkoSxBRenVowXcGdGbHTuVxu4owxjRyliBiXH/iAQD8e+oy1lZszXA0xhiTOZYgYvTtVMRp\nh3Zie3UNT0xYlOlwjDEmYyxBxHHDSe4q4vnJX7Jh8/YMR2OMMZlhCSKOfl1acWLfDmzZsZMRE+0q\nwhjTOFmCSOCGk/oA8OzHS/m6akeGozHGmPpnCSKBkh5tOPaAdmzaVs2zk5ZkOhxjjKl3liB83HCi\nu4p46n+L2bStOsPRGGNM/bIE4WNwr7Yc2bMN5VU7eH7y0kyHY4wx9coShA8R2dUW8eSERWyr1gxH\nZIwx9ccSRBLH9WnPgK6tWLdpO+8vrsp0OMYYU28sQSQhIvzEu4p4c95mduysyXBExhhTPyxBB
HDK\nwR3Zv31zNmytYcay8kyHY4wx9cISRAAiwtAD2gMwedH6DEdjjDH1wxJEQIN7tQNg8qINGY7EGGPq\nhyWIgI7u1RaA0qUb2F5t7RDGmOxnCSKg9i2a0rVlHlt31DBzubVDGGOynyWIEPp1aAJYO4QxpnGw\nBBHCobsShLVDGGOynyWIEA7tkA9YO4QxpnGwBBFCq4Jc+nRsYe0QxphGIW0JQkQKRGSKiMwQkTki\ncpe3fX8R+URE5ovISyLSxNve1Pt7gXd/z3TFVhu7u7taO4QxJrul8wpiG3CSqh4GHA6cJiKDgT8B\nD6pqH2AjcI23/zXARlU9AHjQ26/BsfEQxpjGIm0JQp1N3p/53k2Bk4BXve3PAud4v5/t/Y13/8ki\nIumKL1U2HsIY01iktQ1CRHJFZDqwFngfWAiUq2pk9Z3lQBfv9y7AMgDv/q+BdumMLxXtWzS1dghj\nTKMgqulf40BEWgNvAL8GnvaqkRCRbsBoVe0vInOAb6nqcu++hcBRqro+5rGGAcMAiouLS0aOHJlS\nTFVVVRQWFqZU5olpFbyzsIpL+rXggoNb1OlxahNXQytjcVlcFlfDKhMxaNCgMlUdlHRHVa2XG/Ab\n4FZgHZDnbRsCvOv9/i4wxPs9z9tP/B6zpKREU1VaWppymVEzVmqP4aP00icm1/lxahNXQytjcVlc\n6SxjcYUvEwGUaoDv7XT2YurgXTkgIs2AU4C5wFjgAm+3K4A3vd/f8v7Gu/9D74k0ONYOYYxpDNLZ\nBlEMjBWRmcBU4H1VHQUMB24WkQW4NoYR3v4jgHbe9puB29IYW61YO4QxpjHIS9cDq+pM4Ig42xcB\nR8XZvhW4MF3x1LXBvdoxf+0mJi9az6CebTMdjjHG1DkbSZ0iGw9hjMl2liBSZO0QxphsZwkiRdYO\nYYzJdpYgasHmZTLGZDNLELVg7RDGmGxmCaIWrB3CGJPNLEHUgrVDGGOymSWIWrJ2CGNMtrIEUUvW\nDmGMyVaWIGrJ2iGMMdnKEkQtWTuEMSZbWYKoA9YOYYzJRpYg6oC1QxhjspEliDpg7RDGmGxkCaIO\nWDuEMSYbWYKoI9YOYYzJNpYg6oi1Qxhjso0liDpi7RDGmGxjCaKOWDuEMSbbWIKoQ9YOYYzJJpYg\n6pC1QxhjsokliDpk7RDGmGxiCaIOWTuEMSabWIKoY9YOYYzJFpYg6pi1QxhjsoUliDpm7RDGmGxh\nCaKOWTuEMSZbWIJIA2uHMMZkA0sQaWDtEMaYbJC2BCEi3URkrIjMFZE5InKjt/1OEVkhItO927ej\nytwuIgtE5HMR+Va6Yku36HaIHTWa4WiMMSY1eWl87GrgFlWdJiJFQJmIvO/d96Cq3he9s4gcAlwM\nHAp0Bj4QkQNVdWcaY0yLSDvE/LWbWLhhB4MzHZAxxqQgbVcQqrpKVad5v1cCc4EuPkXOBl5U1W2q\nuhhYAByVrvjSLVLNNPur7RmOxBhjUiOq6a8CEZGewEdAP+Bm4EqgAijFXWVsFJGHgMmq+rxXZgTw\nX1V9NeaxhgHDAIqLi0tGjhyZUkxVVVUUFhamrczHy7Zy/+RyipoI3+tfxEk9m5GXIxmPqz7LWFwW\nl8XVsMpEDBo0qExVByXdUVXTegNaAGXAed7f+wG5uKuX3wNPedv/AVwWVW4EcL7fY5eUlGiqSktL\n01qmalu1nv/w/7TH8FHaY/goPf7eD/Wt6St0586ajMZVn2UsLosrnWUsrvBlIoBSDfD9ndZeTCKS\nD7wG/EtVX/cS0hpV3amqNcAT7K5GWg50iyreFViZzvjSqVmTXF6+dgg3D27F/u2bs2R9FT/596ec\n+dBExn2+NpIEjTGmwUpnLybBXQXMVdUHorYXR+12LjDb+/0t4GIRaSoi+wN9gCnpiq8+5OQIx3Zr\nxns3Hccfz+vPfi2bMmdlBVc+PZXvPj6ZsqXWDdYY03Cls
xfTscD3gVkiMt3bdgdwiYgcDiiwBLgW\nQFXniMjLwGe4HlDX6z7Ygyme/NwcLjmqO+ce0YXnJi3h4XELmbJ4A+c/MolTDt6PW7/Vl76dijId\npjHG7CFtCUJVJwLxWmVH+5T5Pa5dIisV5Ocy7LjeXHxUd574aBEjJi7mg7lrGDNvDece3oWbTj0w\n0yEaY8wuNpI6A1oW5HPLN/sy/tYTufKYnuTlCK9/uoKT7h/Hk59WsHlbdaZDNMYYSxCZ1KGoKXee\ndSgf3nIC5w3sQnWN8t8FVfzqP7OTFzbGmDQLnSBEpI2IDEhHMI1Vt7aFPHDR4bx1/VCa5MLrn65g\n7Ly1mQ7LGNPIBUoQIjJORFqKSFtgBvC0iDyQrJwJp3/XVlxyqGusvuONWVRu3ZHhiIwxjVnQK4hW\nqloBnAc8raolwCnpC6vxOuPAQg7r1ppVX2/lj/+dl+lwjDGNWNAEkeeNX7gIGJXGeBq9XBH+fMEA\n8nOFFz75ko8Xrst0SMaYRipogrgLeBdYoKpTRaQXMD99YTVuB+5XxE9O6gPAba/Nomq79WoyxtS/\noAlilaoOUNUfA6jqIsDaINLouhN6c1CnIr7cUMX9732R6XCMMY1Q0ATx94DbTB3Jz83hzxccRm6O\n8NT/FlO2dGOmQzLGNDK+CUJEhojILUAHEbk56nYnbkZWk0b9u7Zi2HG9UIXhr81k646smHnEGLOP\nSHYF0QQ3XXceUBR1qwAuSG9oBuDGk/vQq0NzFqzdxN8/tGYfY0z98Z2LSVXHA+NF5BlVXVpPMZko\nBfm53Hv+AC58bBKPjl/E6f2K6delVabDMsY0AkHbIJqKyOMi8p6IfBi5pTUys8ugnm25YkhPdtYo\n//fqTHbsrMl0SMaYRiDobK6vAI8CTwJWEZ4B/3daX8bMW8Nnqyp4bPxCbvC6wRpjTLoEvYKoVtVH\nVHWKqpZFbmmNzOyhsEke95znpsD625gFzF9TmeGIjDHZLmiCGCkiPxaRYhFpG7mlNTKzl2MPaM/F\nR3Zj+84abn11JjtrbNlSY0z6BE0QVwC3Ah8DZd6tNF1BmcTuOONgOrUsYPqycp7+3+JMh2OMyWKB\nEoSq7h/n1ivdwZm9tSzI5/fn9gPgvvc+Z8m6zRmOyBiTrQI1UovI5fG2q+pzdRuOCeLkg/fj7MM7\n8+b0ldz2+kxuGZif6ZCMMVkoaBXTkVG3bwB3AmelKSYTwG/OPJR2zZswedEG3l+0JdPhGGOyUNAq\npp9E3X4IHIEbZW0ypG3zJtx19qEAPDujkhETF9v4CGNMnUp1TeoqwDriZ9gZ/Yu5aFBXtu1Ufjfq\nM8742wQ+XmDrRxhj6kbQNoiRQKRPZS5wMPByuoIywYgIfzp/AL2abuKFudv5Ys0mvvfkJ3y7fyfu\n+PbBdG1TmOkQjTH7sKAjqe+L+r0aWKqqy9MQjwlJRDiycwFXnjaYERMX89CHCxg9azUfzlvLdccf\nwLXH96Ig3ybeNcaEF7QNYjwwDzeTaxtgezqDMuEV5Ody/YkHMOaW4/nOgGK27qjhwQ++4JQHxvPO\n7NWo2qA6Y0w4gRKEiFwETAEuxK1L/YmI2HTfDVDn1s146HsDeXHYYA7qVMTyjVv40fNlXP7UFBas\ntek5jDHBBW2k/gVwpKpeoaqXA0cBv0pfWKa2Bvdqx6ifDOW3Zx9Kq2b5TJi/jtP+MoG7R31G5dYd\nmQ7PGLMPCJogclR1bdTf60OUNRmSl5vD5UN6MvbnJ/C9o7uzU5UnJy7mxPvG8+GSKmpsLidjjI+g\nX/LviMi7InKliFwJvA2M9isgIt1EZKyIzBWROSJyo7e9rYi8LyLzvZ9tvO0iIn8TkQUiMlNEBtbm\niZnd2jZvwh/O7c/IG4ZS0qMN6zZt4x9TKzjvkY+Zsaw80+EZYxqoZGtSHyAix6rqrcBjwADgMGAS\n8HiSx64GblHVg4HBw
PUicghwGzBGVfsAY7y/AU7Hja3oAwwDHkntKZlE+nVpxas/GsKD3z2MNgU5\nTF9WzjkP/4/hr85k3aZtmQ7PGNPAJLuC+AtQCaCqr6vqzap6E+7q4S9+BVV1lapO836vBOYCXYCz\ngWe93Z4FzvF+Pxt4Tp3JQGsRKU7hORkfIsK5R3Tl76e159rjepGXI7xUuowT7xvHUzYa2xgTRfy6\nP4rIbFXtl+C+WaraP9BBRHoCHwH9gC9VtXXUfRtVtY2IjALuUdWJ3vYxwHBVLY15rGG4KwyKi4tL\nRo4cGSSEvVRVVVFYGG4gWX2Uqe+4VlRW8/T0Cj5d7Xoud2uZxzVHFNG/Y9NaH8deY4vL4mpYZSIG\nDRpUpqqDku6oqglvwIJU7ovZrwVu/YjzvL/LY+7f6P18GxgatX0MUOL32CUlJZqq0tLSBlkmE3HV\n1NTo+3NW6zf+9KH2GD5Kewwfpdc9X6rLNmyu1XHsNba40lnG4gpfJgIo1QDf38mqmKaKyA9jN4rI\nNd6Xvi8RyQdeA/6lqq97m9dEqo68n5HeUcuBblHFuwIrkx3D1J6IcMoh+/HeTcdx67f60iw/l9Gz\nVnPKA+P56wfz2brDliE3pjFKNtXGz4A3RORSdieEQbiZXM/1KygiAowA5qrqA1F3vYVboe4e7+eb\nUdtvEJEXgaOBr1V1VYjnYmopMhr73CO68IfRcxk1cxUPfvAFr5Qt4/bTD6amspo2X20K/Hibtlt7\nhjH7Mt8EoaprgGNE5ERc+wHA26r6YYDHPhb4PjBLRKZ72+7AJYaXvauQL3Gjs8E1fH8bWICbLfaq\nME/E1J3IaOzLBq/nzrfmMG91Jde/MM3d+c74wI/TNFd478DN9GjXPE2RGmPSKdBkfao6Fhgb5oHV\nNTZLgrtPjrO/AteHOYZJr8ho7BemfMkLn3zJ15uqKCgoCFS2YssO1m/ezhMTFnH3OYH6MhhjGpig\ns7maRioyGvvyIT0pKyujpKQkULkFays55YGPeKV0OT875UDat9i7V5QxpmGz6TJMWhzQsYhBxU3Z\nVl3Dc5OWZjocY0wKLEGYtDnnINf28NykJVRtr85sMMaY0CxBmLQ5qF0+A7u3prxqBy9PXZbpcIwx\nIVmCMGkjIgw7rjcAT05cTLVN42HMPsUShEmrUw/Zj17tm7N84xZGz16d6XCMMSFYgjBplZsj/OAb\nvQB4bPxCW/rUmH2IJQiTducN7EL7Fk2Ys7KCjxeuz3Q4xpiALEGYtCvIz+WqY/cH4NHxCzMcjTEm\nKEsQpl5cdnQPCpvkMmH+Ouas/DrT4RhjArAEYepFq8J8Lj6yOwBPfLQow9EYY4KwBGHqzTXf2J/c\nHGHkzFUs31iV6XCMMUlYgjD1pkvrZpw5oJidNcqIiYszHY4xJglLEKZeRQbOvTR1GeVV2zMcjTHG\njyUIU68O6dySb/RpT9X2nTw/2SbxM6YhswRh6t2PjndXEc98vMSWMzWmAbMEYerdMb3b0a9LS9Zt\n2s7r01ZkOhxjTAKWIEy9i57E74kJi9hZY9NvGNMQWYIwGfHtfp3o2qYZi9dt5v3P1mQ6HGNMHJYg\nTEbk5ebwQ28Sv0dtEj9jGiRLECZjLhzUldaF+UxfVs7UJRszHY4xJoYlCJMxhU3yuHxITwAe/8gm\n8TOmobEEYTLqiiE9aJqXwwdz1zJ/TWWmwzHGRLEEYTKqXYumXDioKwCP2yR+xjQoliBMxv1gaC9y\nBP4zfQUbttjAOWMaCksQJuN6tm/Oaf06sWOn8vi0Cl4rW86khev5cn0V26trMh2eMY1WXqYDMAbg\n2uN6M3rWaqau3MbUV2bs2i4CHYua0rl1Mzq3bkaX1s3o3Kpg19/d2hRmMGpjspslCNMgHNatNY9/\nv4TRU+ZRU9CKleVbWFm+hdUVW1lTsY01Fdv49MvyvcqJwLUDW1JSkoGgjclyaUsQIvI
U8B1grar2\n87bdCfwQ+Mrb7Q5VHe3ddztwDbAT+Kmqvpuu2EzD9M1DO9Fu6wpKSo7Yta16Zw1rKrexYqNLGCu8\nxLGyfAvLNm5hwdpNvDxnEzedu5OmebkZjN6Y7JPOK4hngIeA52K2P6iq90VvEJFDgIuBQ4HOwAci\ncqCqWotlI5eXm0MXr2oplqpy+l8nMG91Ja9PW8ElR3XPQITGZK+0NVKr6kfAhoC7nw28qKrbVHUx\nsAA4Kl2xmewgIlx3gpv077HxC23SP2PqWCZ6Md0gIjNF5CkRaeNt6wIsi9pnubfNGF9n9C9mv+a5\nLFlfxehZqzIdjjFZRdI5SZqI9ARGRbVB7AesAxT4HVCsqleLyD+ASar6vLffCGC0qr4W5zGHAcMA\niouLS0aOHJlSbFVVVRQWhusBUx9lGmpcqZSpr7hGzS3n6dlb6dkqj/tObYeINIi4GurrZXE1zrii\nDRo0qExVByXdUVXTdgN6ArOT3QfcDtwedd+7wJBkj19SUqKpKi0tbZBlGmpcqZSpr7g+/mSqDrr7\nfe0xfJSOnbemwcTVUF8viyu9ZRpqXNGAUg3wHV6vVUwiUhz157nAbO/3t4CLRaSpiOwP9AGm1Gds\nZt/VJFf4wdD9AXh4nE36Z0xdSVuCEJF/A5OAviKyXESuAe4VkVkiMhM4EbgJQFXnAC8DnwHvANer\n9WAyIXzv6O60LMhjyuINlC0N2jfCGOMnnb2YLlHVYlXNV9WuqjpCVb+vqv1VdYCqnqWqq6L2/72q\n9lbVvqr633TFZbJTUUH+rqnDHx5rVxHG1AWbi8lkjauO7UlBfg5j5q1l3uqKTIdjzD7PEoTJGu1a\nNOXiI91guUesLcKYWrMEYbLKD4/rRV6OMHLGSr5cX5XpcIzZp1mCMFmlS+tmnH14F2oUHrNlTI2p\nFUsQJutcd0IvROCVsuWsrdya6XCM2WdZgjBZ54CORXzzkP3YXl3DiImLMx2OMfssSxAmK/34hAMA\n+NfkL/l6y44MR2PMvskShMlKh3VrzbEHtGPTtmqen7w00+EYs0+yBGGyVuQq4qmJi9my3QbmGxOW\nJQiTtY7p3Y4BXVuxfvN2Xi5dlryAMWYPliBM1hIRfuwtKPT4R4vYsbMmwxEZs2+xBGGy2jcP6UTv\nDs1ZUb6Ft6avzHQ4xuxTLEGYrJaTI/zoeHcV8cj4hdTYsqTGBGYJwmS9sw/vQudWBSxYu4n3567J\ndDjG7DMsQZis1yQvhx8e1wtwCwppGpfZNSabWIIwjcLFR3anbfMmzFhWzuyvtmc6HGP2CZYgTKPQ\nrEkuVx3TE4DX5m7m66oddiVhTBJ5mQ7AmPpy+ZCePDp+IbPWbuew375Hi6Z5dG5dQOfWzejcuhld\nWjejc+sCurQupHPrAvZrWUB+rp1DmcbLEoRpNFoV5vObMw/lb+9/xoatyqZt1XyxZhNfrNkUd/8c\ngf1augSSu6OK7otmUFSQR1FBPi0L8nb9vufPPFoW5NM0zxKL2fdZgjCNykVHdqN3zloGDhxIxZZq\nVpRvYUX5FlZ6t92/b2VN5VZWfe1uAFNWLg98nPxcQVSRN4Ivr641NaH2B+hWlMsvWqzlhAM7ICKh\nyhqTjCUI0yiJCK0K82lVmM8hnVvG3Wd7dQ1rKrayonwLpbPm0bFzdyq27qBya7V3c7/v3rb7vu2R\nUds1IUdvh9x/wcYarnp6Kkft35bhp/WlpEfbcMczxoclCGMSaJKXQ7e2hXRrW0j+xgJKSroFLrut\neidl0z5l4BFHBC4z7dNw+2/fWcP9r0/izQVbmbJ4A+c/MolTDu7Iz7/Vl4M6xU96xoRhCcKYNGia\nl0vTXKEgPzd4mZD7F+Tnclbf5txy3hCe/GgRT05czAdz1zJm3lrOObwLN51yIN3bFaYSvjGAdXM1\nZp/XsiCfm7/Zl/G3nsiVx/QkL0d449MVnPzAOH7
95mxbdtWkzBKEMVmiQ1FT7jzrUD685QTOG9iF\n6hrluUlLOf7ecdz7zjxbWc+EZgnCmCzTrW0hD1x0OO/ceBynHrIfW3bs5OFxCznu3rE8Mm4hazfv\ntEGCJhBrgzAmS/XtVMQTlw9i2pcbufedeUxetIE/vTMPgKIx73FgpyIO8m59O7Wkb6ciWjXLz3DU\npiGxBGFMlhvYvQ3//uFgJsxfx5MTFzN9yToqtlVTtnQjZUs37rFvcasC+nYq4qBOLb3EUcQOmyK9\n0bIEYUwjICIcd2AHjjuwA6WlpXTv24/PV1fy+epK5nk/v1hTuWtg4LjPv9pVtkkunLN0Bt8f3JP+\nXVtl8FmY+pa2BCEiTwHfAdaqaj9vW1vgJaAnsAS4SFU3ihsC+lfg20AVcKWqTktXbMY0ZiJCx6IC\nOhYV8I0+HXZt31mjLFm/OSppVPD56kqWrK/i5dLlvFy6nMO6tuLSwT04c0BnmjUJ3iXX7JvSeQXx\nDPAQ8FzUttuAMap6j4jc5v09HDgd6OPdjgYe8X4aY+pJbo7Qu0MLendowbf7F+/aPnLcJ8zYVMQr\nZcuZsfxrZrw6k9+/PZcLSrpy6dHd6dWhRQajNumUtl5MqvoRsCFm89nAs97vzwLnRG1/Tp3JQGsR\nKcYYk3Gdi/L45XcO4ZM7TubPFwzgsG6t+XrLDkZMXMxJ94/nsic/4Z3Zq6jeGXJaEdPgSTq7u4lI\nT2BUVBVTuaq2jrp/o6q2EZFRwD2qOtHbPgYYrqqlcR5zGDAMoLi4uGTkyJEpxVZVVUVhYbhRpvVR\npqHGlUoZiyt741q4cQfvLqxiwpdb2L7TbWtbkMOpvQo5pVczCnSbvV4NLK5ogwYNKlPVQUl3VNW0\n3XBtDbOj/i6PuX+j9/NtYGjU9jFASbLHLykp0VSVlpY2yDINNa5Uylhc2R9X+ebtOmLCIj3xvrHa\nY/go7TF8lPa6/W39zv3v6d2j5ugrpct01vJy3bK9ul7jymSZhhpXNKBUA3yH13cvpjUiUqyqq7wq\npLXe9uVA9ExoXYGV9RybMSakVoX5XD10f646tieTFq7n+U+W8t6cNcxau51Zaxfv2i9HoGf75t64\ni5ZeV9oiurUpJCfHpilvqOo7QbwFXAHc4/18M2r7DSLyIq5x+mtVXVXPsRljUiQiHHNAe445oD1f\nVW7j1bGlVDfvyLw1lcxbVcHidZtZ9JW7jZ61ele5wia59NmviIP2K6J6cyVlmxfutQhTy6jfm+Xn\n2roX9Sid3Vz/DZwAtBeR5cBvcInhZRG5BvgSuNDbfTSui+sCXDfXq9IVlzEmvToUNeWoLgWUlPTZ\ntW3rjp0sWLvJjb1YU8ncVa4L7drKbcxYVs6MZeVux7nzfB87N0d2rdzXVKsZsHD6rquSgzoV0aGo\nqSWQOpRcHew5AAAgAElEQVS2BKGqlyS46+Q4+ypwfbpiMcZkVkF+Lv26tKJflz0H2m3cvJ15qyuZ\nv7aSOQuWUtSmg1t0adsOKrZEL8rkft9WXUN51Q7Kq9zEgws2rtjj8doU5u81EvzA/Ypo3tTGBKfC\nXjVjTMa0ad6EIb3bMaR3O8qarKek5BDf/bdX1+xKGhPLZkKrzrtGhM9dXcHGqh1MXrSByYv27GHf\nvW0hfTsVkbe9kpEr5oSKsXx9JZMrFuxR1RW9BnlRQT5FTfOysi3FEoQxZp/RJC+Hdi2a0q5FU9a3\nb0JJSY9d96kqq77eusdI8HmrK1n41Sa+3FDFlxuq3I4LloQ/8OefJ92lRVOXMPK0mo5TPo6bUBIl\nmc3ba6ip0QaXZCxBGGOygojQuXUzOrduxokHddy1fcfOGhZ9tZl5qyuYMW8h3boFXzpWFRYs+ZKi\ndh33Wos8eg3yTdt23wCWVWxM8shxvDl6V5KJJBC/JPP12m2UhD9KKJYgjDFZLT83h75ee0TXnasp\nKdk/VPmyZhs
oKTnYd5+dNcqmbS5pfDJtJt16HbhHIqnwEknFXsnF/SzfvI0t1boryaz6Onlcvdvk\nccXpoZ5KaJYgjDGmlnJzhFbN8mnVLJ81rfMp2b9tqPJlZWUcfsTAXUkm3tVKbJLJ3xYgi9SSJQhj\njGkAopNMEGVlZWmOyJYcNcYYk4AlCGOMMXFZgjDGGBOXJQhjjDFxWYIwxhgTlyUIY4wxcVmCMMYY\nE5clCGOMMXGldU3qdBORr4ClKRZvD6xrgGUaalyplLG4LK50lrG4wpeJ6KGqHZLuFWRd0my8EXBN\n1vou01DjyqbnYnFZXI0xrlRuVsVkjDEmLksQxhhj4mrMCeLxBlqmocaVShmLq+EdI5UyFlfDO0aq\nZULZpxupjTHGpE9jvoIwxhjjwxKEMcaYuBpNghCRHBE5JtNxGGP2PSLSNNMxZEKjSRCqWgPcn0pZ\nEckVkc4i0j1yq+PwUiIiXUTkGBE5LnKLs89Av1uSx79QRIq8338pIq8nK5Pi87hBRNoE3DdXRJ5P\n4RgTROT3InJa5DkFKNMv5DEKReRXIvKE93cfEflO2FgzSUTa+t2SlB0TZFt98k4MLwpZ5qmYv1sA\no5OUCf2/T+X1EpEbg2yrK41tydH3ROR84HUN2DovIj8BfgOsAWq8zQoMSLD/gcAjwH6q2k9EBgBn\nqerdcfYd6T1WXKp6lk9cfwK+C3wG7IyK66OYXSNJsQAYBMwAxIv/E2BoomMAv1LVV0RkKPAt4D7v\nuR0dJ55ZSZ5L3NfL0wmYKiLTgKeAdxP9f1R1p4h0EJEmqrrd5zFjXYF7rucDfxaRbcAEVb3Jp8yj\nItIEeAZ4QVXLkxzjaaAMGOL9vRx4BRiVqICIHAvcCfTAfR4FUFXt5VMm8HvM5zHuVNU749xVhvs/\nSpz7FNgrLhEpAAqB9l6ij5RtCXT2ieFG3GtWCTwJHAHcpqrv+ZQJ9Xqpao2I3AC8nOgx41ghIo+o\n6nXe83kbeCJJmcD/+1RfL88VwF9jtl0ZZ1vdSPdIvIZ0w70Ra4DtQIX3d0WSMguAdiGOMR44Cvg0\natvsBPse793+CrwEnOndXgD+kOQ4nwNNQ8T1ItA/6u9+wDNJynzq/fwj8L3obXH27eHd7vVu/b3b\nPcCvA8QnuCT0ovea/wHonWDfx4CpwK+AmyO3AMcoBi4G/oFLrO8EKNPHe/4LvP/LqT77lsa+RsCM\nJI8/Dzgd6Ai0i9zq6j3m8xhnhtk/yWPdCCwGtgGLvN8X405GbvApN8P7+S3gLeAwYFoaXq9fAT8H\nugFtI7ckZf4EPOq9z84P8BoE/t+n8noBlwAjgY3eaxW5jQU+qKv/ZeytUV1BqGqgqoUYy4CvQ+xf\nqKpTRPY4AatOEM94ABH5napGVw+NFJHYK4FYi4B83JssiINUdVbUsWeLyOFJyqwQkceAU4A/efWw\ncaslVXUpuDM8VT026q7bROR/wG/9DqSqKiKrgdW416sN8KqIvK+q/xez+0rvlgMErS5aiJu35gVg\nBPATddWOvlR1voj8EigF/gYcIe6fe4eqvh6z+3YRaYZ3JSUivUn+//laVf8b5DlECfweE5Fc4Keq\n+mD0dlUdmWB/3ypEVZ0WZ9tfgb+KyE9U9e/Jgo8+nPfz28DTqjpDYp5UHKm8Xld7P6+P2rbX1ZCI\nnBf15xRcYpkCqIicF+f/HS3w/z7F1+tjYBVu/qXoqvJKYGbAxwitUSQIETlIVeclevPHe9NHWQSM\nE5G3ifqHq+oDCfZf5705Im+UC3D/WD8dRKSXqi7yyuwPJJtIqwqY7tVZRsf10wT7zxWRJ4Hnvdgu\nA+YmOcZFwGnAfapaLiLFwK1JyjQXkaGqOtF7LscAzf0KiMhPcZfO63BVDbeq6g4RyQHmA3skCFW9\nyytX5P7UTUliAvflPhR3JnYEMF5EPlLVhT5xDQCuAs4A3seddU8Tkc7AJCD2C
+M3wDtANxH5F3As\n7vLfz1gR+bP3WNH/R7/3ZOD3mLoqubOBB+PdH4dfO50CJ/ncv1pEilS10kuqA4G7fZ5LmYi8B+wP\n3O79P5Ml7dCvl6run+QxI86M+ftT3EnYmbjn7pcgQv/vVfXv3uejJ1Hfxar6XJx9lwJLReRSYKWq\nbgXwklJXYInfsVLVKAbKicjjqjpMRMZGbd71xFU14ZteRH4Tb3vkSyrO/r1wIxyPwV0OLgYuU9Ul\nPsc4zSuzyNvUE7hWVd/1KXNFgrieTbB/AXAdELlS+Qh4JPJG8znOUKCPqj4tIh2AFqq62Gf/Elw7\nQitvUzlwtd8HWETuAp6KXIXE3Hewqs6N2dYP+CeuqgBcYrlcVef4PRevbAvcl/7Pga6qmuuz70e4\nuudXVXVLzH3fV9V/xinTDhiMOzuerKq+s23GvCcjNMl7MtR7TER+j/t/vARsjjqIXxIKTURmquoA\n7z3zR1yb1R2quleblbd/DnA4sMg7AWkHdFHVhGfEKb5ehbhqyO7e90AfoK+qJmwbSkUK//t/Ar2B\n6US1I/qc5CEipcAx6rW/eW1k/1PVI+vgKex9vMaQICLE9WZ4R1UrRORXuDOc3wX5oIQ8W0VEmgM5\nqloZcP+mwEHen/NUNWjVUdp4yXEQ7sN0oHfm/EpMFVKisi1x7y/f6jnvS2KmqgbuMSQiHwO/UNWx\n3t8n4NpsEnZjFpH7cVcQLYDJuAQ5IXLVVhupVMvUhaDvsagv1ciHPdKwW6dfqiLyqaoeISJ/BGap\n6guRbQn2F+BSoJeq/lZc78BOqjrF7/mEJSIv4RqQL1fXqN8MmKSqcatYvROhH7L3mf3V8faPKjcg\nTpmEVx0iMhc4REN8CYvI9Ni4RWSGqh4W9DHCaBRVTFF+qaove2c4p+Iup+P2yomIPVsVEd+zVe+L\n/ny8N0qkSlVV96qDF5GTVPXDmLpPgN4ikuzNFWk8PQTXQylynNh61dr0LjoXVx0zzdt3pSToIioi\nNyfYHjlO3Co5db1MZohId1X90ieWaM0jycF7jHHel6WfycC9qrom4DECv8bUolpGRFrhqiciV3bj\ngd/6JVYR2Q/XiN9ZVU8XkUOAIao6Ima/yP9kFHv3TEr2pRTplRNJukl7ZBGizcrzMK5K6SRcG1Ul\n8BqQ8Gw4ldcL19nhuyJyCYCqbknS1vEmMAH4gN1n9r7EdY0dAMxhz96OftVSs3E9+JJVQUf7SkTO\nUtW3vOOeTeprQiTV2BJE5J99BvCoqr4pIncmKfM4rodM9NnqE+z+4MR6E9eoXUbyBsrjgQ/Zu+4T\nkr+5nsZ9UB4ETsRVm8R709emH/52VVURidR1+30Jp9IBIKIYmCMiU9izCiRRN99F3hVgpIrnMlw1\nS0LquuueJbvHiozXBA21UQK9xqp6YpLH8fMU7osi0lf/+95xY08aoj3j7fML7+8vcNVHI2L2i/xP\n+uK+dN/ExX8me3eHjhX2SxXCt1kdraoDReRT7xgbvSoTP6m8XmE7DxSq6vAkccQarKqHBNlRdndv\nLwI+89730e0pCbu3Az8C/iUi//AeYzlwechYA2tsVUyjgBW4M5wSYAswxe/yLN7lm98lnYjMDlNd\nkioRKVPVEhGZpar9vW0TVPUbdXiMn+O6eZ6KO5O+GjceIExPlSDHOT7edvV6ecXZvw1wF67KSHBf\ndneq6kafY/wR1zX0X96mS3BdE2/3KRPoNY5zBRj7PPyuBONVGey1Leb+qap6ZHT1jV8ZcQ3B50eq\noryrwFdU9TSfY3wMnIyr3x7ofan+W1WP8ikTdwBpoitDEfkEd6I11TtGB+C9RFVSXplUXq9TgV/i\nrgTfw2tAVtVxCfa/G/hYVX0Hx8WUGQHcr6qfBdg37vs9ItH7PuYxWuC+vwNVYaeqsV1BpNIrJ+zZ\n6sci0l+jupQGISJnAIeyZ1WGX9fQrV79/
XxxA4FW4PqGJ3r8SnZXKzTB9c7YrKotE5VR1fu8D1cF\n7iz016r6fpLn8TRxqi/86m+DfCBi9t8I/NRr56gJ2C50BnC4el1bReRZXC+VhAmC4K9xvCvAXeHi\nfyW4Rfbs9XUs7sTFz2ZxDaKRM+LB+HfF7o4b+xOxHVcF6ieVHllvs7sqqwDXO+lz3Ps6nr8BbwAd\nxTWkX4D7IvcT+vVS1ffFDcKMNCDfmKQB+UbgDnGDKXewu80m4WcFeBaYJK6r9raoMntV4YZ9v0cL\nWr1YVxrVFUQqwp6tishnuLPuRSR5o0SVeRQ3svJEXDfPC3BXNtf4lDkS1021NfA73CjMe1X1k4DP\n6xzgKFW9I8C+Ldmz4W2Dz77nR/1ZgGvHWKn+PTPOww1M6oh7vXw/kCLSH3iOPXsxXaGqs32OMRM4\nIRK7uGkjxiX5v9TqNQ5C3FiUZ3G9jATYgDu7neFTZiDwd9xgx9m4LtEXaILePyLyC9zJ0Ru4L/Bz\ngZdU9Y8+x/gnMAv35bsI+CTJl2qiOK9V1Wt99jkId6UiwBiN6bEWZ//DcP/7SC+5jbj/ve9YAAnZ\ngByWiCzANerPIqqrrsbpmRdVJvqkLeJr3JibWzROBwoR+S9e9aKqHiYiebjBef1r/yzixGgJIhiv\ncawm2SWdiPTADfKKVEN8BJQneaNEugdGfrbATQfyTZ8yg3B10D1wVwOQJBHFeYzJqjrY5/5rcY2H\nW3Bv+qTTQMR5jBzcSE+/htoFuDEGycZlRPZPpRfTJbhR3WO953EccLuqvuhTJtBrLCKXqerzkqCh\nXhOPmYl+jJbevhUB9r0QeBc3Mvh8XCeLX6l/V+KBRL0nVfXTJMc4CXdS9A3cgLLpXrlQUzqIyDRV\nHRizzXdOpyQnIPur6uLo1yuyzadM3Abk2KtaqcV4KRH50O89nqDMXbgBny/g3pMX4xqtPweuU9UT\n4pQJVb1YW42tiik07yzyKbwGPxH5GtevvyxBkXOAH+CqFQRXNfUE7owvkchYhCpxXUk34C7P/fwL\nVz22xxmLz/OIrifPwXVfTXZ28HPg0LBnjjH64Ko4/KwJmhw8oXsxqeq/RWQcrrFWgOGqujrJcYK+\nxpFjh26olz3nI3rC+3LynY+I3XNktcG1pyXtjed9uQXubquud9143Ot1Iq5x9FB85vyJSZA5uHa+\nr+LsGjvf0x7db4kz31OU14CBMYn0Ve9YiQRtQL4ZGEb8XmnJBgnOE5EXcNNhRDc4+12lnKZ7jhF5\n3Dtp+62IJLqyD1u9WCuWIJIbAfxYVScAiOsi+zQJJusDrsG9ITd7+/8JN+rWL0GMFJHWwJ9xH2Il\n+eRgX6nX1S2g6HryatzIy7OTlFmIG7EdWNRlc+TDvhpI1iOkVFxf9f8Q7MMVuheTJwdXHZUHHCgi\nB6qqX2+eQK+xqj7m/fqwqsb7QvRztar+VUS+hatiuwr3/vJLEKn0xgtF3Aj95rj37gTgSFVdm6RY\nEbu/7KtxX5avxe6kqvuLiADdEjVgx4nnIFyCahVzstOSqHa7BCaJyCHJGpBVdZj3M5Veac1w793o\nq/5k7U814sZmver9fUFM2Xhuxs3B1FvcFDYdYsrVKUsQyVVGkgOAqk70vgQTEfbsO72T+N1Po80D\ndqrqa16j00Dcl6Wf34ibOiN2qo24b0hVvSrJ48VzO67R/ROCTeeR6nxXLXGJKOiH62pcu9Br7G4X\nutLvALJ79tvYfup+CSLUa4x7rRbjupy+nqidKjY072eY+YjCjjdIxUzcWXk/3BlquYhM0pgR5TFG\nA3ewZ13/bcQ5mVJVFZE38D/zj9YX12W7NXue7FTiBrX5CdyAHCEBp8CIui+Vz9eluCuyh3HvxcnA\nZeK65N6Q4DjTxPWC6ot7Hp+r6o4Ujh2ItUEkISIP4hqQ/437J34X1zD2GuxdL+ldZl+BaxAEV+X0\njKr+x
ecY0VMU/AF3iZtwigKvzPO4kde+9apR+3fFXcUc6z2PibjeHMt9jjHF2y+24S3udB5R5c5i\n90CmcVr3UxpE2gZ6svsDnOwD/zkwQEOMUA/7GntljsLVJZ+DmzH2RVVNuH6FuF5fXXBViocBubjX\nLOEXp7hRzqfhRivPF9cbr3+SaqmUyJ5Tk3RS1YQL53iv8c9xDedJG2rF9eV/RlWnhohniKpOCrq/\nVyZUA7KkNgVG6M9XyOeQaFAt3vE2ABNVNdDAvsDHtQThTxJMU8Dus5C96iW9euRdvZ4CNAiGmqLA\nK7Orb37A5/E+rjEsulrmUlU91afMx34NvwnK3IOrtw4z3iDU+gZhv4i8Mv8FLtSAU6V4ZUK9xjFl\n2wMP4F5jv/meQs9HVB/Edev9Bu4Mfym7pyb50KfMRFX1W18kdv/PgAO9x99MsDP70GthhG1AltSm\nwAj8+RKR/1PVe0Xk78TvEr5XIhKRu1T1N94JRTztgGZ+n+dUWIJIQvaerE8h6RiFsMdIZQDfE8CD\nyepVo/ZPZYDR73Ef3tiGN79eJjPZc7xBLq4bnt+HfjyuMfixqJ4ZCQcchv0i8sq8hjtDDzr7bSqv\ncUtcF9KLcWegbwAv+3RoQETGqOrJybbVNxG5FZcUylQ17lTiccqcjDshCFQlJ67H316SJPpQ7xXv\n/odxVVOBGpBF5BXcFOmBp8AI8/kSkTNVdaS4CTfjJYiEVVlJYhihPl3jU2FtEMlFn3EW4OpBw/S4\nCSKVAXxDgSu8Ou8g9arrROQyXFUZuA/y+iTH+J73M/rsP1kvE3AfxkgSaeW3oyfw+gaesG0D4Bpb\nYxuc/QY+QfjXeAau7ei3yapBpHariqWdqv45hWJX4ark8gkwH5HuXkOkI8kbmiPCvlcgYAOy1G4K\njMCfL909xctn7N1mo7hxHnGJz1xUdZ0cwBJEUqq6R5c3EbmPvb9oanuMKqLerN6ZS7Kzl4TTJCRw\nNfAQbl4hxS1A4tuwpsHn0Y/2B2CauC6lu8YbJCkTdg2NUF9Enu/hBlTN8o5xCa4awK93WeDX2LtS\nekNV446FiONa4Ge4ZFDG7gRRgVvxbl90WMhqz7Nw7W2dgbW48SZzSTzyGlJYbyVEA/J9uP/Dn3Bt\nSLtC9bb5Cf35wq3NEriruieVuahSZlVMIXlne1NUtU+mYwlD3NQSP1OvZ424wUr3xWtwTdAQtovf\nmbrXwDcf15D/JW4Eru94A4m/vsGlPo2IodsGvGO8ius5MhQ3wdl3NMl05CGPEbpqSMKvwtZgpVAl\nNwM3tuADrw3uROAS9bqbJigTeC2MVOr6vXLxBvfNTFJNGvjzFVUmlarS0FXFtWFXEEnIntNl5+L6\nHddZ+0M9GqBR3S5VdYOIJGoEr83cQk/jvoDPwhuBK27lNr8RuCu8cmNx02dU4HqCJXqdJ0uAfu17\nBK26SEQuxlUBLQO+qf5dNlMxXUTewk2LHT0rbcLXS92qYv3Ye0rxlOqhMyxsldwOVV0vIjkikqOq\nY8V1R05I3fQTp0iwtTAiVcGlQYIXkeuAHwO9vLa0iCLgf0mKh/l8RaRSVZrK3F0pswSRXPR02dW4\nUb+BGu0amBwRaRNzhhP3/x/ikjxe2dAjcHHTUJfjBgmuDHCYwF9Esvd6GG1xif4TcWtuBJ6aJIC2\nuHrn6B4zvgnV6wRxAi5BjAZOx3WR3BcTRNhqz3Jx3Wg/wk1hvZbEa2uHXm8kqq6/SlVfiSl3YZyH\newH4L27m4tuitlf6dczwBP58RUmlqvRHwHNeWwR4c1ElOU7KrIqpkRCRy3FtAa/i3oQXAb/XOMtm\nRpVJZTGb2BG4EzXJCNxkvVDi7B+490uiff3K1CcvgR2G6+l1mLjZOp9UVb+ruKzgXQVsxSX4S3Ed\nGv6lqns17sbpTbgHTbAEsFc2XpXRXttqI8XPV9iu6jm4SRlflhBzd9W
GXUE0Eqr6nLj1bE/CfSDP\nC1BFk0qDWCojcENNkR7mS70+E4Ck0Ecf2KpuVb1q70O/luS9xLKCetPReHwHX/olgERE5HTcCPUu\nIvK3qLtakrznUygpfr5CVZV675MbcF2n05oYIuwKwiRUmwYxCTACN6r6J4+QU6Q3RLXoo38HbuzE\nLbhu1dNrU823r5CQ07x7ZXrhqisH4947k4CbNP7U2IfhBiH+Fvh11F2VwFgNNhVK2ogbkNcb19Ae\ndAqQX+HaHF5iz3auZFVgKbErCOMndIOY7D0C9ylcVVM8tVkOtSFKpY9+EXAhMA63QE9LzfAo6np0\nLyGmefe8gOsGfK7398W4sQd7TUujbl6r2bgOCb5XKBkSts0GXHdaxTWmR0vLVaclCOPnOuDZkA1i\nzXBTTCQdgZvp+v80CN1Hn929vv5O8F5f2SLsNO/gaj2i6/Wf905K4lLVnSLSTkSaqOr2RPtlQorv\n/0NwyWEo7n02AXi0LuOKZlVMJiFxs4RegLsMbo1rU1Ctw2lGskmYPvox5XLZs9fXFlU9KL3RZk7U\nOJvjcQvkBJ3mPTLXVznwIrsnz2yKN7gwXlWLuJlvB+IGuEZXyyRdyKmhEZGXcd3Ao+c6a62qFyUu\nVYvjWYIwiYjIO+zufrprlkiNGV1uHBFpqqrbovvoi0hbv/rhVHp97etk94Rz0YsGRWiSwWV+636o\nxlntMFEPqFQavjNNRGZozBxt8bbVFatiMn66qmoq9aSN1esicrbuXiyqE/A2/msepNLra58WaYD3\nRh/fqKrl3t9tiL+aW3TZ0NO/RBKBiBS5P4PP6NsAfSoig1V1MoCIHE3yQXwpswRh/ITqfmr4D/Cq\niJyPWy/6LVwvroRU9SbYo9fX07hql4TrLmSRAZHkAKCqGwOMPibsyHNv/3/iBjIiIuuAy1V1Ti1i\nz5SjgctFJLISX3dgbqRHYF33/LMEYfYS0/30KhHZp7uf1hdVfUJEmuASRU/gWlX92K9MyF5f2Sb0\n6OMUR54/Dtys3jrmInICbknfUGudNBD1ekVvCcLEk23dT9MqZhoIwV09TAcGe9UBfo2hgXt9ZaH7\ncVepe4w+TlLmAnaPPL8qMvI8SZnmkeQAoKrjvHaifU599/yzBGH2koXdT9Mtdh3uNxJs34umtu5C\nVkhx9PGWFEaeL/IGmEWv9ubX2G08liCMqaV4vWG8eXNa1NeUCPsqLyEEnpUXKBWR1rgqojLcyPMp\nScpcDdyFW0decJMDXhk62EbIurkaU0dE5AXcOIaduC+vVsADjfkqIZ1EpCcBRp6LyCDgF8Ss3GZt\naclZgjCmjkTmqRKRS3GNzsNxbQv2RVRHJIU1vEXkc1xvstlErdxmVanJWRWTMXUnX0TycctVPqSq\nO0TEzsDqgNRuDe+vdPfaECYESxDG1J3HgCXADOAjby0Ka4OoG/HW8FbczKwPJSmbysptBqtiMiat\nRCSvEXZfTRsR+TXwF1Wt8HomDQR+p6rTfMo8j1u5bQ5RK7f5TelhHEsQxtQhETkDt8Rq9Chfm9yw\njojITFUdICJDgT/gxlLcoap7TfcdVSbUym1mt5xMB2BMthCRR3Gzi/4EVwVyIeC75KkJLTJp5BnA\no6r6JtAkSZnJInJIesPKTnYFYUwdiTq7jfxsAbyuqt/MdGzZQkRGASuAU3A9xbYAU/xmM01l5Tbj\nWCO1MXUnMgNrlYh0BtYDoWcfNb4uws1HdJ+qlotIMW6ZVz82I3GKLEEYU3dGeaN878X1tIHk8wSZ\nEFS1Cng96u9VJFm1z8Y7pM6qmIypIyLSDLdM6zfYvRzkI6q6NaOBGZMiSxDG1BFvOchK4HlvU1qX\ngzQm3SxBGFNH6ns5SGPSzbq5GlN3PhWRwZE/0r0cpDHpZlcQxtRS1Ap8+UBf4Evv7x7AZ6raL4Ph\nGZMySxDG1JI351JC1ovG7KssQRh
jjInL2iCMMcbEZQnCGGNMXJYgjPGIyC9EZI6IzBSR6V4vpHQd\na5y3FKYxDZZNtWEMICJDgO8AA1V1m4i0J/ksocZkNbuCMMYpBtap6jYAVV2nqitF5NciMlVEZovI\n4yIisOsK4EER+UhE5orIkSLyuojMF5G7vX16isg8EXnWuyp5VUQKYw8sIt8UkUkiMk1EXvFmgUVE\n7hGRz7yy99Xja2EMYAnCmIj3gG4i8oWIPCwix3vbH1LVI72xDM1wVxkR21X1OOBR4E3geqAfcKWI\ntPP26Qs87k0tXQH8OPqg3pXKL4FTVHUgUArcLCJtgXOBQ72yd6fhORvjyxKEMYCqbsKtLzAM+Ap4\nSUSuBE4UkU+8wXAn4VaLi3jL+zkLmKOqq7wrkEVAN+++ZaoaGU39PDA05tCDgUOA/4nIdOAK3AC7\nCmAr8KSInAdU1dmTNSYga4MwxqOqO4FxwDgvIVwLDAAGqeoyEbmTqKVEcYvPgFvneFvU9hp2f7Zi\nBxrF/i3A+6p6SWw8InIUcDJwMXADLkEZU2/sCsIYQET6ikifqE2HA597v6/z2gUuSOGhu3sN4OBm\nd50Yc/9k4FgROcCLo1BEDvSO10pVRwM/8+Ixpl7ZFYQxTgvg796CP9XAAlx1UzmuCmkJMDWFx50L\nXGDNRa4AAABxSURBVCEijwHzgUei71TVr7yqrH+LSFNv8y9x04a/KSIFuKuMm1I4tjG1YlNtGJMm\nItITGGWT9Zl9lVUxGWOMicuuIIwxxsRlVxDGGGPisgRhjDEmLksQxhhj4rIEYYwxJi5LEMYYY+Ky\nBGGMMSau/wd3rWvzukbz1wAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fdist_whole_text.plot(25)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### If some of those words don't seem important, we can add them to 'stops' and clean text again" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": true }, "outputs": [], "source": [ "boring_words = ['sir', 'upon', 'said', 'one']\n", "stops += boring_words" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.32 s, sys: 4 ms, total: 1.32 s\n", "Wall time: 1.32 s\n" ] } ], "source": [ "%%time\n", "cleaned_text = process_text(whole_text)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "194" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fdist_whole_text['holmes']" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "116" ] }, "execution_count": 42, 
"metadata": {}, "output_type": "execute_result" } ], "source": [ "fdist_whole_text['watson']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Stemming" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from nltk.stem import PorterStemmer" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on package nltk.stem in nltk:\n", "\n", "NAME\n", " nltk.stem - NLTK Stemmers\n", "\n", "DESCRIPTION\n", " Interfaces used to remove morphological affixes from words, leaving\n", " only the word stem. Stemming algorithms aim to remove those affixes\n", " required for eg. grammatical role, tense, derivational morphology\n", " leaving only the stem of the word. This is a difficult problem due to\n", " irregular words (eg. common verbs in English), complicated\n", " morphological rules, and part-of-speech and sense ambiguities\n", " (eg. ``ceil-`` is not the stem of ``ceiling``).\n", " \n", " StemmerI defines a standard interface for stemmers.\n", "\n", "PACKAGE CONTENTS\n", " api\n", " isri\n", " lancaster\n", " porter\n", " regexp\n", " rslp\n", " snowball\n", " util\n", " wordnet\n", "\n", "FILE\n", " /home/derek/anaconda3/envs/nlp36/lib/python3.6/site-packages/nltk/stem/__init__.py\n", "\n", "\n" ] } ], "source": [ "help(nltk.stem)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true }, "outputs": [], "source": [ "ps = PorterStemmer()" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "happi\n", "happi\n", "had\n", "fish\n", "fish\n", "fisher\n", "fish\n", "fish\n" ] } ], "source": [ "print(ps.stem('Happy'))\n", "print(ps.stem('Happiness'))\n", "print(ps.stem('Had'))\n", "\n", "print(ps.stem('Fishing'))\n", "print(ps.stem('Fish'))\n", "print(ps.stem('Fisher'))\n", "print(ps.stem('Fishes'))\n", 
"print(ps.stem('Fished'))" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": true }, "outputs": [], "source": [ "words = process_text(snippet)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": true, "scrolled": true }, "outputs": [], "source": [ "stemmed = [ps.stem(word) for word in words]" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "none ---> none\n", "suggest ---> suggest\n", "know ---> know\n", "methods ---> method\n", "apply ---> appli\n", "think ---> think\n", "obvious ---> obviou\n", "conclusion ---> conclus\n", "man ---> man\n", "practised ---> practis\n", "town ---> town\n", "going ---> go\n", "country ---> countri\n", "think ---> think\n", "might ---> might\n", "venture ---> ventur\n", "little ---> littl\n", "farther ---> farther\n", "look ---> look\n", "light ---> light\n", "occasion ---> occas\n", "would ---> would\n", "probable ---> probabl\n", "presentation ---> present\n", "would ---> would\n", "made ---> made\n", "would ---> would\n", "friends ---> friend\n", "unite ---> unit\n", "give ---> give\n", "pledge ---> pledg\n", "good ---> good\n", "obviously ---> obvious\n", "moment ---> moment\n", "dr. 
---> dr.\n", "mortimer ---> mortim\n", "withdrew ---> withdrew\n", "service ---> servic\n", "hospital ---> hospit\n", "order ---> order\n", "start ---> start\n", "practice ---> practic\n", "know ---> know\n", "presentation ---> present\n", "believe ---> believ\n", "change ---> chang\n", "town ---> town\n", "hospital ---> hospit\n", "country ---> countri\n", "practice ---> practic\n", "stretching ---> stretch\n", "inference ---> infer\n", "far ---> far\n", "say ---> say\n", "presentation ---> present\n", "occasion ---> occas\n", "change ---> chang\n", "certainly ---> certainli\n", "seems ---> seem\n", "probable ---> probabl\n", "observe ---> observ\n", "could ---> could\n", "staff ---> staff\n", "hospital ---> hospit\n", "since ---> sinc\n", "man ---> man\n", "well-established ---> well-establish\n", "london ---> london\n", "practice ---> practic\n", "could ---> could\n", "hold ---> hold\n", "position ---> posit\n", "would ---> would\n", "drift ---> drift\n", "country ---> countri\n", "hospital ---> hospit\n", "yet ---> yet\n", "staff ---> staff\n", "could ---> could\n", "house-surgeon ---> house-surgeon\n", "house-physician ---> house-physician\n", "little ---> littl\n", "senior ---> senior\n", "student ---> student\n", "left ---> left\n", "five ---> five\n", "years ---> year\n", "ago ---> ago\n", "date ---> date\n", "stick ---> stick\n" ] } ], "source": [ "for w, stem in zip(words, stemmed):\n", " print('{} ---> {}'.format(w, stem))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's make another function that stems the word tokens during the processing stage" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def stem_process(text):\n", " # split the raw text into word tokens\n", " tokens = word_tokenize(text)\n", " # lowercase each token and drop stop words\n", " filtered_words = [token.lower() for token in tokens if token.lower() not in stops]\n", " # drop tokens of two characters or fewer\n", " filtered_words = [w for w in filtered_words if len(w) > 2]\n", " # reduce each remaining token to its Porter stem\n", " 
stemmed_words = [ps.stem(w) for w in filtered_words]\n", " return stemmed_words" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 2.72 s, sys: 16 ms, total: 2.73 s\n", "Wall time: 2.74 s\n" ] } ], "source": [ "%%time\n", "stemmed = stem_process(whole_text)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['project',\n", " 'gutenberg',\n", " 'hound',\n", " 'baskervil',\n", " 'arthur',\n", " 'conan',\n", " 'doyl',\n", " 'ebook',\n", " 'use',\n", " 'anyon',\n", " 'anywher',\n", " 'cost',\n", " 'almost',\n", " 'restrict',\n", " 'whatsoev',\n", " 'may',\n", " 'copi',\n", " 'give',\n", " 'away',\n", " 're-us',\n", " 'term',\n", " 'project',\n", " 'gutenberg',\n", " 'licens',\n", " 'includ',\n", " 'ebook',\n", " 'onlin',\n", " 'www.gutenberg.org',\n", " 'titl',\n", " 'hound',\n", " 'baskervil',\n", " 'author',\n", " 'arthur',\n", " 'conan',\n", " 'doyl',\n", " 'post',\n", " 'date',\n", " 'octob',\n", " '2010',\n", " 'releas',\n", " 'date',\n", " 'februari',\n", " '2002',\n", " 'etext',\n", " '3070',\n", " 'languag',\n", " 'english',\n", " '***',\n", " 'start',\n", " 'project',\n", " 'gutenberg',\n", " 'ebook',\n", " 'hound',\n", " 'baskervil',\n", " '***',\n", " 'produc',\n", " 'etext',\n", " 'produc',\n", " 'k.pehtla',\n", " 'ppehtla',\n", " 'nfld.com',\n", " 'hound',\n", " 'baskervil',\n", " 'arthur',\n", " 'conan',\n", " 'doyl',\n", " 'content',\n", " 'chapter',\n", " 'mr.',\n", " 'sherlock',\n", " 'holm',\n", " 'chapter',\n", " 'curs',\n", " 'baskervil',\n", " 'chapter',\n", " 'problem',\n", " 'chapter',\n", " 'henri',\n", " 'baskervil',\n", " 'chapter',\n", " 'three',\n", " 'broken',\n", " 'thread',\n", " 'chapter',\n", " 'baskervil',\n", " 'hall',\n", " 'chapter',\n", " 'stapleton',\n", " 'merripit',\n", " 'hous',\n", " 'chapter',\n", " 'first',\n", " 'report',\n", " 'dr.',\n", " 'watson',\n", " 
'chapter',\n", " 'light',\n", " 'moor',\n", " 'chapter',\n", " 'extract',\n", " 'diari',\n", " 'dr.',\n", " 'watson',\n", " 'chapter',\n", " 'man',\n", " 'tor',\n", " 'chapter',\n", " 'death',\n", " 'moor',\n", " 'chapter',\n", " 'fix',\n", " 'net',\n", " 'chapter',\n", " 'hound',\n", " 'baskervil',\n", " 'chapter',\n", " 'retrospect',\n", " 'chapter',\n", " 'mr.',\n", " 'sherlock',\n", " 'holm',\n", " 'mr.',\n", " 'sherlock',\n", " 'holm',\n", " 'usual',\n", " 'late',\n", " 'morn',\n", " 'save',\n", " 'infrequ',\n", " 'occas',\n", " 'night',\n", " 'seat',\n", " 'breakfast',\n", " 'tabl',\n", " 'stood',\n", " 'hearth-rug',\n", " 'pick',\n", " 'stick',\n", " 'visitor',\n", " 'left',\n", " 'behind',\n", " 'night',\n", " 'fine',\n", " 'thick',\n", " 'piec',\n", " 'wood',\n", " 'bulbous-head',\n", " 'sort',\n", " 'known',\n", " 'penang',\n", " 'lawyer',\n", " 'head',\n", " 'broad',\n", " 'silver',\n", " 'band',\n", " 'nearli',\n", " 'inch',\n", " 'across',\n", " 'jame',\n", " 'mortim',\n", " 'm.r.c.s.',\n", " 'friend',\n", " 'c.c.h.',\n", " 'engrav',\n", " 'date',\n", " '1884',\n", " 'stick',\n", " 'old-fashion',\n", " 'famili',\n", " 'practition',\n", " 'use',\n", " 'carri',\n", " 'dignifi',\n", " 'solid',\n", " 'reassur',\n", " 'well',\n", " 'watson',\n", " 'make',\n", " 'holm',\n", " 'sit',\n", " 'back',\n", " 'given',\n", " 'sign',\n", " 'occup',\n", " 'know',\n", " 'believ',\n", " 'eye',\n", " 'back',\n", " 'head',\n", " 'least',\n", " 'well-polish',\n", " 'silver-pl',\n", " 'coffee-pot',\n", " 'front',\n", " 'tell',\n", " 'watson',\n", " 'make',\n", " 'visitor',\n", " 'stick',\n", " 'sinc',\n", " 'unfortun',\n", " 'miss',\n", " 'notion',\n", " 'errand',\n", " 'accident',\n", " 'souvenir',\n", " 'becom',\n", " 'import',\n", " 'let',\n", " 'hear',\n", " 'reconstruct',\n", " 'man',\n", " 'examin',\n", " 'think',\n", " 'follow',\n", " 'far',\n", " 'could',\n", " 'method',\n", " 'companion',\n", " 'dr.',\n", " 'mortim',\n", " 'success',\n", " 'elderli',\n", " 
'medic',\n", " 'man',\n", " 'well-esteem',\n", " 'sinc',\n", " 'know',\n", " 'give',\n", " 'mark',\n", " 'appreci',\n", " 'good',\n", " 'holm',\n", " 'excel',\n", " 'think',\n", " 'also',\n", " 'probabl',\n", " 'favour',\n", " 'countri',\n", " 'practition',\n", " 'great',\n", " 'deal',\n", " 'visit',\n", " 'foot',\n", " 'stick',\n", " 'though',\n", " 'origin',\n", " 'handsom',\n", " 'knock',\n", " 'hardli',\n", " 'imagin',\n", " 'town',\n", " 'practition',\n", " 'carri',\n", " 'thick-iron',\n", " 'ferrul',\n", " 'worn',\n", " 'evid',\n", " 'done',\n", " 'great',\n", " 'amount',\n", " 'walk',\n", " 'perfectli',\n", " 'sound',\n", " 'holm',\n", " \"'friend\",\n", " 'c.c.h',\n", " 'guess',\n", " 'someth',\n", " 'hunt',\n", " 'local',\n", " 'hunt',\n", " 'whose',\n", " 'member',\n", " 'possibl',\n", " 'given',\n", " 'surgic',\n", " 'assist',\n", " 'made',\n", " 'small',\n", " 'present',\n", " 'return',\n", " 'realli',\n", " 'watson',\n", " 'excel',\n", " 'holm',\n", " 'push',\n", " 'back',\n", " 'chair',\n", " 'light',\n", " 'cigarett',\n", " 'bound',\n", " 'say',\n", " 'account',\n", " 'good',\n", " 'give',\n", " 'small',\n", " 'achiev',\n", " 'habitu',\n", " 'underr',\n", " 'abil',\n", " 'may',\n", " 'lumin',\n", " 'conductor',\n", " 'light',\n", " 'peopl',\n", " 'without',\n", " 'possess',\n", " 'geniu',\n", " 'remark',\n", " 'power',\n", " 'stimul',\n", " 'confess',\n", " 'dear',\n", " 'fellow',\n", " 'much',\n", " 'debt',\n", " 'never',\n", " 'much',\n", " 'must',\n", " 'admit',\n", " 'word',\n", " 'gave',\n", " 'keen',\n", " 'pleasur',\n", " 'often',\n", " 'piqu',\n", " 'indiffer',\n", " 'admir',\n", " 'attempt',\n", " 'made',\n", " 'give',\n", " 'public',\n", " 'method',\n", " 'proud',\n", " 'think',\n", " 'far',\n", " 'master',\n", " 'system',\n", " 'appli',\n", " 'way',\n", " 'earn',\n", " 'approv',\n", " 'took',\n", " 'stick',\n", " 'hand',\n", " 'examin',\n", " 'minut',\n", " 'nake',\n", " 'eye',\n", " 'express',\n", " 'interest',\n", " 'laid',\n", " 
'cigarett',\n", " 'carri',\n", " 'cane',\n", " 'window',\n", " 'look',\n", " 'convex',\n", " 'len',\n", " 'interest',\n", " 'though',\n", " 'elementari',\n", " 'return',\n", " 'favourit',\n", " 'corner',\n", " 'sette',\n", " 'certainli',\n", " 'two',\n", " 'indic',\n", " 'stick',\n", " 'give',\n", " 'basi',\n", " 'sever',\n", " 'deduct',\n", " 'anyth',\n", " 'escap',\n", " 'ask',\n", " 'self-import',\n", " 'trust',\n", " 'noth',\n", " 'consequ',\n", " 'overlook',\n", " 'afraid',\n", " 'dear',\n", " 'watson',\n", " 'conclus',\n", " 'erron',\n", " 'stimul',\n", " 'meant',\n", " 'frank',\n", " 'note',\n", " 'fallaci',\n", " 'occasion',\n", " 'guid',\n", " 'toward',\n", " 'truth',\n", " 'entir',\n", " 'wrong',\n", " 'instanc',\n", " 'man',\n", " 'certainli',\n", " 'countri',\n", " 'practition',\n", " 'walk',\n", " 'good',\n", " 'deal',\n", " 'right',\n", " 'extent',\n", " 'dear',\n", " 'watson',\n", " 'mean',\n", " 'would',\n", " 'suggest',\n", " 'exampl',\n", " 'present',\n", " 'doctor',\n", " 'like',\n", " 'come',\n", " 'hospit',\n", " 'hunt',\n", " 'initi',\n", " \"'c.c\",\n", " 'place',\n", " 'hospit',\n", " 'word',\n", " \"'chare\",\n", " 'cross',\n", " 'natur',\n", " 'suggest',\n", " 'may',\n", " 'right',\n", " 'probabl',\n", " 'lie',\n", " 'direct',\n", " 'take',\n", " 'work',\n", " 'hypothesi',\n", " 'fresh',\n", " 'basi',\n", " 'start',\n", " 'construct',\n", " 'unknown',\n", " 'visitor',\n", " 'well',\n", " 'suppos',\n", " \"'c.c.h\",\n", " 'stand',\n", " \"'chare\",\n", " 'cross',\n", " 'hospit',\n", " 'infer',\n", " 'may',\n", " 'draw',\n", " 'none',\n", " 'suggest',\n", " 'know',\n", " 'method',\n", " 'appli',\n", " 'think',\n", " 'obviou',\n", " 'conclus',\n", " 'man',\n", " 'practis',\n", " 'town',\n", " 'go',\n", " 'countri',\n", " 'think',\n", " 'might',\n", " 'ventur',\n", " 'littl',\n", " 'farther',\n", " 'look',\n", " 'light',\n", " 'occas',\n", " 'would',\n", " 'probabl',\n", " 'present',\n", " 'would',\n", " 'made',\n", " 'would',\n", " 
'friend',\n", " 'unit',\n", " 'give',\n", " 'pledg',\n", " 'good',\n", " 'obvious',\n", " 'moment',\n", " 'dr.',\n", " 'mortim',\n", " 'withdrew',\n", " 'servic',\n", " 'hospit',\n", " 'order',\n", " 'start',\n", " 'practic',\n", " 'know',\n", " 'present',\n", " 'believ',\n", " 'chang',\n", " 'town',\n", " 'hospit',\n", " 'countri',\n", " 'practic',\n", " 'stretch',\n", " 'infer',\n", " 'far',\n", " 'say',\n", " 'present',\n", " 'occas',\n", " 'chang',\n", " 'certainli',\n", " 'seem',\n", " 'probabl',\n", " 'observ',\n", " 'could',\n", " 'staff',\n", " 'hospit',\n", " 'sinc',\n", " 'man',\n", " 'well-establish',\n", " 'london',\n", " 'practic',\n", " 'could',\n", " 'hold',\n", " 'posit',\n", " 'would',\n", " 'drift',\n", " 'countri',\n", " 'hospit',\n", " 'yet',\n", " 'staff',\n", " 'could',\n", " 'house-surgeon',\n", " 'house-physician',\n", " 'littl',\n", " 'senior',\n", " 'student',\n", " 'left',\n", " 'five',\n", " 'year',\n", " 'ago',\n", " 'date',\n", " 'stick',\n", " 'grave',\n", " 'middle-ag',\n", " 'famili',\n", " 'practition',\n", " 'vanish',\n", " 'thin',\n", " 'air',\n", " 'dear',\n", " 'watson',\n", " 'emerg',\n", " 'young',\n", " 'fellow',\n", " 'thirti',\n", " 'amiabl',\n", " 'unambiti',\n", " 'absent-mind',\n", " 'possessor',\n", " 'favourit',\n", " 'dog',\n", " 'describ',\n", " 'roughli',\n", " 'larger',\n", " 'terrier',\n", " 'smaller',\n", " 'mastiff',\n", " 'laugh',\n", " 'incredul',\n", " 'sherlock',\n", " 'holm',\n", " 'lean',\n", " 'back',\n", " 'sette',\n", " 'blew',\n", " 'littl',\n", " 'waver',\n", " 'ring',\n", " 'smoke',\n", " 'ceil',\n", " 'latter',\n", " 'part',\n", " 'mean',\n", " 'check',\n", " 'least',\n", " 'difficult',\n", " 'find',\n", " 'particular',\n", " 'man',\n", " 'age',\n", " 'profession',\n", " 'career',\n", " 'small',\n", " 'medic',\n", " 'shelf',\n", " 'took',\n", " 'medic',\n", " 'directori',\n", " 'turn',\n", " 'name',\n", " 'sever',\n", " 'mortim',\n", " 'could',\n", " 'visitor',\n", " 'read',\n", " 'record',\n", " 
'aloud',\n", " 'mortim',\n", " 'jame',\n", " 'm.r.c.s.',\n", " '1882',\n", " 'grimpen',\n", " 'dartmoor',\n", " 'devon',\n", " 'house-surgeon',\n", " '1882',\n", " '1884',\n", " 'chare',\n", " 'cross',\n", " 'hospit',\n", " 'winner',\n", " 'jackson',\n", " 'prize',\n", " 'compar',\n", " 'patholog',\n", " 'essay',\n", " 'entitl',\n", " \"'i\",\n", " 'diseas',\n", " 'revers',\n", " 'correspond',\n", " 'member',\n", " 'swedish',\n", " 'patholog',\n", " 'societi',\n", " 'author',\n", " \"'some\",\n", " 'freak',\n", " 'atav',\n", " 'lancet',\n", " '1882',\n", " \"'do\",\n", " 'progress',\n", " 'journal',\n", " 'psycholog',\n", " 'march',\n", " '1883',\n", " 'medic',\n", " 'offic',\n", " 'parish',\n", " 'grimpen',\n", " 'thorsley',\n", " 'high',\n", " 'barrow',\n", " 'mention',\n", " 'local',\n", " 'hunt',\n", " 'watson',\n", " 'holm',\n", " 'mischiev',\n", " 'smile',\n", " 'countri',\n", " 'doctor',\n", " 'astut',\n", " 'observ',\n", " 'think',\n", " 'fairli',\n", " 'justifi',\n", " 'infer',\n", " 'adject',\n", " 'rememb',\n", " 'right',\n", " 'amiabl',\n", " 'unambiti',\n", " 'absent-mind',\n", " 'experi',\n", " 'amiabl',\n", " 'man',\n", " 'world',\n", " 'receiv',\n", " 'testimoni',\n", " 'unambiti',\n", " 'abandon',\n", " 'london',\n", " 'career',\n", " 'countri',\n", " 'absent-mind',\n", " 'leav',\n", " 'stick',\n", " 'visiting-card',\n", " 'wait',\n", " 'hour',\n", " 'room',\n", " 'dog',\n", " 'habit',\n", " 'carri',\n", " 'stick',\n", " 'behind',\n", " 'master',\n", " 'heavi',\n", " 'stick',\n", " 'dog',\n", " 'held',\n", " 'tightli',\n", " 'middl',\n", " 'mark',\n", " 'teeth',\n", " 'plainli',\n", " 'visibl',\n", " 'dog',\n", " 'jaw',\n", " 'shown',\n", " 'space',\n", " 'mark',\n", " 'broad',\n", " 'opinion',\n", " 'terrier',\n", " 'broad',\n", " 'enough',\n", " 'mastiff',\n", " 'may',\n", " 'ye',\n", " 'jove',\n", " 'curly-hair',\n", " 'spaniel',\n", " 'risen',\n", " 'pace',\n", " 'room',\n", " 'spoke',\n", " 'halt',\n", " 'recess',\n", " 'window',\n", " 
'ring',\n", " 'convict',\n", " 'voic',\n", " 'glanc',\n", " 'surpris',\n", " 'dear',\n", " 'fellow',\n", " 'possibl',\n", " 'sure',\n", " 'simpl',\n", " 'reason',\n", " 'see',\n", " 'dog',\n", " 'door-step',\n", " 'ring',\n", " 'owner',\n", " \"n't\",\n", " 'move',\n", " 'beg',\n", " 'watson',\n", " 'profession',\n", " 'brother',\n", " 'presenc',\n", " 'may',\n", " 'assist',\n", " 'dramat',\n", " 'moment',\n", " 'fate',\n", " 'watson',\n", " 'hear',\n", " 'step',\n", " 'stair',\n", " 'walk',\n", " 'life',\n", " 'know',\n", " 'whether',\n", " 'good',\n", " 'ill.',\n", " 'dr.',\n", " 'jame',\n", " 'mortim',\n", " 'man',\n", " 'scienc',\n", " 'ask',\n", " 'sherlock',\n", " 'holm',\n", " 'specialist',\n", " 'crime',\n", " 'come',\n", " 'appear',\n", " 'visitor',\n", " 'surpris',\n", " 'sinc',\n", " 'expect',\n", " 'typic',\n", " 'countri',\n", " 'practition',\n", " 'tall',\n", " 'thin',\n", " 'man',\n", " 'long',\n", " 'nose',\n", " 'like',\n", " 'beak',\n", " 'jut',\n", " 'two',\n", " 'keen',\n", " 'gray',\n", " 'eye',\n", " 'set',\n", " 'close',\n", " 'togeth',\n", " 'sparkl',\n", " 'brightli',\n", " 'behind',\n", " 'pair',\n", " 'gold-rim',\n", " 'glass',\n", " 'clad',\n", " 'profession',\n", " 'rather',\n", " 'slovenli',\n", " 'fashion',\n", " 'frock-coat',\n", " 'dingi',\n", " 'trouser',\n", " 'fray',\n", " 'though',\n", " 'young',\n", " 'long',\n", " 'back',\n", " 'alreadi',\n", " 'bow',\n", " 'walk',\n", " 'forward',\n", " 'thrust',\n", " 'head',\n", " 'gener',\n", " 'air',\n", " 'peer',\n", " 'benevol',\n", " 'enter',\n", " 'eye',\n", " 'fell',\n", " 'stick',\n", " 'holm',\n", " 'hand',\n", " 'ran',\n", " 'toward',\n", " 'exclam',\n", " 'joy',\n", " 'glad',\n", " 'sure',\n", " 'whether',\n", " 'left',\n", " 'ship',\n", " 'offic',\n", " 'would',\n", " 'lose',\n", " 'stick',\n", " 'world',\n", " 'present',\n", " 'see',\n", " 'holm',\n", " 'ye',\n", " 'chare',\n", " 'cross',\n", " 'hospit',\n", " 'two',\n", " 'friend',\n", " 'occas',\n", " 'marriag',\n", " 
'dear',\n", " 'dear',\n", " 'bad',\n", " 'holm',\n", " 'shake',\n", " 'head',\n", " 'dr.',\n", " 'mortim',\n", " 'blink',\n", " 'glass',\n", " 'mild',\n", " 'astonish',\n", " 'bad',\n", " 'disarrang',\n", " 'littl',\n", " 'deduct',\n", " 'marriag',\n", " 'say',\n", " 'ye',\n", " 'marri',\n", " 'left',\n", " 'hospit',\n", " 'hope',\n", " 'consult',\n", " 'practic',\n", " 'necessari',\n", " 'make',\n", " 'home',\n", " 'come',\n", " 'come',\n", " 'far',\n", " 'wrong',\n", " 'holm',\n", " 'dr.',\n", " 'jame',\n", " 'mortim',\n", " 'mister',\n", " 'mister',\n", " 'humbl',\n", " 'm.r.c.',\n", " 'man',\n", " 'precis',\n", " 'mind',\n", " 'evid',\n", " 'dabbler',\n", " 'scienc',\n", " 'mr.',\n", " 'holm',\n", " 'picker',\n", " 'shell',\n", " 'shore',\n", " 'great',\n", " 'unknown',\n", " 'ocean',\n", " 'presum',\n", " 'mr.',\n", " 'sherlock',\n", " 'holm',\n", " 'address',\n", " 'friend',\n", " 'watson',\n", " 'glad',\n", " 'meet',\n", " 'heard',\n", " 'name',\n", " 'mention',\n", " 'connect',\n", " 'friend',\n", " 'interest',\n", " 'much',\n", " 'mr.',\n", " 'holm',\n", " 'hardli',\n", " 'expect',\n", " 'dolichocephal',\n", " 'skull',\n", " 'well-mark',\n", " 'supra-orbit',\n", " 'develop',\n", " 'would',\n", " 'object',\n", " 'run',\n", " 'finger',\n", " 'along',\n", " 'pariet',\n", " 'fissur',\n", " 'cast',\n", " 'skull',\n", " 'origin',\n", " 'avail',\n", " 'would',\n", " 'ornament',\n", " 'anthropolog',\n", " 'museum',\n", " 'intent',\n", " 'fulsom',\n", " 'confess',\n", " 'covet',\n", " 'skull',\n", " 'sherlock',\n", " 'holm',\n", " 'wave',\n", " 'strang',\n", " 'visitor',\n", " 'chair',\n", " 'enthusiast',\n", " 'line',\n", " 'thought',\n", " 'perceiv',\n", " 'mine',\n", " 'observ',\n", " 'forefing',\n", " 'make',\n", " 'cigarett',\n", " 'hesit',\n", " 'light',\n", " 'man',\n", " 'drew',\n", " 'paper',\n", " 'tobacco',\n", " 'twirl',\n", " 'surpris',\n", " 'dexter',\n", " 'long',\n", " 'quiver',\n", " 'finger',\n", " 'agil',\n", " 'restless',\n", " 'antenna',\n", " 
'insect',\n", " 'holm',\n", " 'silent',\n", " 'littl',\n", " 'dart',\n", " ...]" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stemmed" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": true }, "outputs": [], "source": [ "fdist_stems = nltk.FreqDist(stemmed)" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Figure: frequency plot of the 30 most common stems]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fdist_stems.plot(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Lemmatizing\n", "[In English, for example, run, runs, ran and
running are forms of the same lexeme, with run as the lemma. Lexeme, in this context, refers to the set of all the forms that have the same meaning, and lemma refers to the particular form that is chosen by convention to represent the lexeme.](https://en.wikipedia.org/wiki/Lemma_(morphology))" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from nltk.stem import WordNetLemmatizer" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": true }, "outputs": [], "source": [ "lemmatizer = WordNetLemmatizer()" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "collapsed": true }, "outputs": [], "source": [ "lemmatizer.lemmatize?" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "having\n", "have\n", "had\n", "\n", "fishing\n", "fish\n", "fisher\n", "fish\n", "fished\n", "\n", "am\n", "is\n", "wa\n" ] } ], "source": [ "# without a pos argument, lemmatize() treats every word as a noun (pos='n'),\n", "# so verb forms are left unchanged and 'was' is wrongly reduced to 'wa'\n", "print(lemmatizer.lemmatize('having'))\n", "print(lemmatizer.lemmatize('have'))\n", "print(lemmatizer.lemmatize('had'))\n", "print()\n", "print(lemmatizer.lemmatize('fishing'))\n", "print(lemmatizer.lemmatize('fish'))\n", "print(lemmatizer.lemmatize('fisher'))\n", "print(lemmatizer.lemmatize('fishes'))\n", "print(lemmatizer.lemmatize('fished'))\n", "print()\n", "print(lemmatizer.lemmatize('am'))\n", "print(lemmatizer.lemmatize('is'))\n", "print(lemmatizer.lemmatize('was'))" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "have\n", "have\n", "have\n", "\n", "fish\n", "fish\n", "fisher\n", "fish\n", "fish\n", "\n", "be\n", "be\n", "be\n" ] } ], "source": [ "# passing the part of speech (pos='v' for verbs, pos='n' for nouns) lets the lemmatizer pick the correct lemma\n", "print(lemmatizer.lemmatize('having', pos='v'))\n", "print(lemmatizer.lemmatize('have', pos='v'))\n", "print(lemmatizer.lemmatize('had', pos='v'))\n", "print()\n",
"print(lemmatizer.lemmatize('fishing', pos='v'))\n", "print(lemmatizer.lemmatize('fish', pos='v'))\n", "print(lemmatizer.lemmatize('fisher', pos='n'))\n", "print(lemmatizer.lemmatize('fishes', pos='v'))\n", "print(lemmatizer.lemmatize('fished', pos='v'))\n", "print()\n", "print(lemmatizer.lemmatize('am', pos='v'))\n", "print(lemmatizer.lemmatize('is', pos='v'))\n", "print(lemmatizer.lemmatize('was', pos='v'))" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": true }, "outputs": [], "source": [ "lemmatized = [lemmatizer.lemmatize(word) for word in words]" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "none ---> none\n", "suggest ---> suggest\n", "know ---> know\n", "methods ---> method\n", "apply ---> apply\n", "think ---> think\n", "obvious ---> obvious\n", "conclusion ---> conclusion\n", "man ---> man\n", "practised ---> practised\n", "town ---> town\n", "going ---> going\n", "country ---> country\n", "think ---> think\n", "might ---> might\n", "venture ---> venture\n", "little ---> little\n", "farther ---> farther\n", "look ---> look\n", "light ---> light\n", "occasion ---> occasion\n", "would ---> would\n", "probable ---> probable\n", "presentation ---> presentation\n", "would ---> would\n", "made ---> made\n", "would ---> would\n", "friends ---> friend\n", "unite ---> unite\n", "give ---> give\n", "pledge ---> pledge\n", "good ---> good\n", "obviously ---> obviously\n", "moment ---> moment\n", "dr. 
---> dr.\n", "mortimer ---> mortimer\n", "withdrew ---> withdrew\n", "service ---> service\n", "hospital ---> hospital\n", "order ---> order\n", "start ---> start\n", "practice ---> practice\n", "know ---> know\n", "presentation ---> presentation\n", "believe ---> believe\n", "change ---> change\n", "town ---> town\n", "hospital ---> hospital\n", "country ---> country\n", "practice ---> practice\n", "stretching ---> stretching\n", "inference ---> inference\n", "far ---> far\n", "say ---> say\n", "presentation ---> presentation\n", "occasion ---> occasion\n", "change ---> change\n", "certainly ---> certainly\n", "seems ---> seems\n", "probable ---> probable\n", "observe ---> observe\n", "could ---> could\n", "staff ---> staff\n", "hospital ---> hospital\n", "since ---> since\n", "man ---> man\n", "well-established ---> well-established\n", "london ---> london\n", "practice ---> practice\n", "could ---> could\n", "hold ---> hold\n", "position ---> position\n", "would ---> would\n", "drift ---> drift\n", "country ---> country\n", "hospital ---> hospital\n", "yet ---> yet\n", "staff ---> staff\n", "could ---> could\n", "house-surgeon ---> house-surgeon\n", "house-physician ---> house-physician\n", "little ---> little\n", "senior ---> senior\n", "student ---> student\n", "left ---> left\n", "five ---> five\n", "years ---> year\n", "ago ---> ago\n", "date ---> date\n", "stick ---> stick\n" ] } ], "source": [ "for w, lemma in zip(words, lemmatized):\n", " print('{} ---> {}'.format(w, lemma))" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'run'" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lemmatizer.lemmatize('running', pos='v')" ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def lemma_process(text):\n", " # tokenize\n", " tokens = word_tokenize(text)\n", " # remove stops\n", " filtered_words = 
[token.lower() for token in tokens if not token.lower() in stops]\n", " filtered_words = [w for w in filtered_words if (len(w) > 2)]\n", " # lemmatize\n", " lemmatized_words = [lemmatizer.lemmatize(w) for w in filtered_words]\n", " return lemmatized_words" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.88 s, sys: 4 ms, total: 1.88 s\n", "Wall time: 1.89 s\n" ] } ], "source": [ "%%time\n", "lemma_text = lemma_process(whole_text)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAE2CAYAAAB7gwUjAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXd4HNXVuN8jybItF8kNLPeCjbGNbSxBDKHzJRgSSmjB\nCT2/kEL7wpcECEkg+cIHCSQQCKF3ElqAGBsIxRhMsQHJvdu49yo3uck6vz/uXWu1mt0dlfWurPM+\nzzzauXPmzJnRzJy55557r6gqhmEYhhFLVroNMAzDMDITcxCGYRhGIOYgDMMwjEDMQRiGYRiBmIMw\nDMMwAjEHYRiGYQRiDsIwDMMIJGUOQkS6i8h4EZkjIrNE5AZffreIzBWR6SLyuogURO1zi4gsFJF5\nInJ6qmwzDMMwkiOp6ignIoVAoapOFpE2QClwLtAN+EBVK0TkjwCqepOIDAReAI4BugDvA/1VdV9K\nDDQMwzASkpMqxaq6Gljtf28TkTlAV1V9N0psEnCB/30O8KKq7gYWi8hCnLOYGO8YHTt21F69etXJ\nvp07d9KyZcsGlTWdptN0ms5M0xlEaWnpBlXtlEwuZQ4iGhHpBRwFfB6z6SrgJf+7K85hRFjhy2J1\nXQ1cDVBYWMgjjzxSJ5vKy8vJy8trUFnTaTpNp+nMNJ1BFBcXLw0lqKopXYDWuPDSeTHltwKvUxXm\nehC4JGr7E8D5iXQXFRVpXSkpKWlwWdNpOk2n6cw0nUEAJRri/Z3SGoSINANeBf6hqq9FlV8OfBs4\nzRsLrsbQPWr3bsCqVNpnGIZhxCeVWUyCqwXMUdW/RJWPBG4CzlbV8qhd3gAuFpHmItIb6Ad8kSr7\nDMMwjMSksgbxdeBSYIaITPVlvwLuB5oD7zkfwiRV/bGqzhKRl4HZQAVwjVoGk2EYRtpIZRbTJ4AE\nbHorwT53AHekyibDMAwjPNaT2jAMwwikyTqIZVv2ptsEwzCMjKZJOoj/HTubn727kfdnr023KYZh\nGBlLk3QQh7ZtDsAv/jWNtVt3pdkawzCMzKRJOoj/d3wfhh6ay+byvdz48lQqK1MzHpVhGEZjpkk6\niKws4bpj8unQKpdPF27k0Y8XpdskwzCMjKNJOgiAdi2yuefCoQDc8848pi0vS7NFhmEYmUWTdRAA\npww4hCuO60VFpXL9i1PYvrsi3SYZhmFkDE3aQQDcfMYAjihsy9KN5fx29Mx0m2MYhpExNHkH0aJZ\nNg+MGkaLZlm8Nnklo6euTLdJhmEYGUGTdxAAhx3Sht9+exAAt74+k2Uby5PsYRiGcfBjDsIz6pju\njBzUme27K7jhpSns3VeZbpMMwzDS
ijkIj4hw1/lHUpjfginLyvjr+wvSbZJhGEZaMQcRRUFeLvd+\ndxgi8OCHC5n41cZ0m2QYhpE2zEHEMKJPB6495TBU4WcvTWXzjj3pNskwDCMtmIMI4IbT+jG8RwFr\ntu7i5temUzUrqmEYRtPBHEQAOdlZ/PXio2jTPId3Zq3lvUU7022SYRjGAcccRBy6t8/jjvOOBOCp\naVtZsHZbmi0yDMM4sJiDSMDZQ7twQVE39uyD616Ywq69NkW2YRhNB3MQSbj97EF0bp3N3DXbuOvt\nuek2xzAM44CRMgchIt1FZLyIzBGRWSJygy9vLyLvicgC/7edLxcRuV9EForIdBEZnirbakPr5jn8\n7GsFNMsWnv5sCePm2Cx0hmE0DVJZg6gA/kdVjwBGANeIyEDgZmCcqvYDxvl1gDOAfn65GngohbbV\nisPaN+Pn3zwcgF/8azrrbBY6wzCaAClzEKq6WlUn+9/bgDlAV+Ac4Bkv9gxwrv99DvCsOiYBBSJS\nmCr7assPT+jDCf06smnHHm58eZrNQmcYxkGPHIgcfxHpBUwABgPLVLUgattmVW0nImOBu1T1E18+\nDrhJVUtidF2Nq2FQWFhYNGbMmDrZVF5eTl5eXq1kN+/cx43vbmDrHuXSI1tz7oDW9daZCjtNp+k0\nnaYzEcXFxaWqWpxUUFVTugCtgVLgPL9eFrN9s//7JnB8VPk4oCiR7qKiIq0rJSUldZIdN2eN9rxp\nrPa95U2dumxzg+hsCDnTaTpNp+kMC1CiId7fKc1iEpFmwKvAP1T1NV+8NhI68n/X+fIVQPeo3bsB\nq1JpX104dcCh+2ehu8FmoTMM4yAmlVlMAjwBzFHVv0RtegO43P++HBgdVX6Zz2YaAWxR1dWpsq8+\n3HzGAAZ0bsOSjeXcNnpWus0xDMNICamsQXwduBQ4VUSm+uVM4C7gGyKyAPiGXwd4C1gELAQeA36a\nQtvqhZuF7ihaNMvi1ckrbBY6wzAOSnJSpVhdY7PE2XxagLwC16TKnoam36Ft+M23B3Lr6zP59esz\nGd6jXbpNMgzDaFCsJ3U9+N4xPRg5qDPbdldw/YtTqLDUV8MwDiLMQdSD2FnoXp69Pd0mGYZhNBgp\nCzE1FSKz0I16bBKvztnBu7e/E2q/r3fNpagoxcYZhmHUA6tBNAAj+nTg5pEDyBbYtqsi1PLeonIb\nHdYwjIzGahANxI9O6svgFpsYPGRYUtnz/v4pX63fwfy12xjSrSCpvGEYRjowB9GAtMjJIr9ls6Ry\nR3bN56v1O5i5cqs5CMMwMhYLMaWBwV3zAZi5akuaLTEMw4iPOYg0MKiLcxCzVpqDMAwjczEHkQYG\ndmkLwJw129i7rzLN1hiGYQRjDiIN5LdsRudW2eypqGThOus7YRhGZmIOIk30bucas2damMkwjAzF\nHESa6NPOJZDNWrU1zZYYhmEEYw4iTfQpsBqEYRiZjTmINBEJMc1evdXmtzYMIyMxB5Em8ptn0SW/\nBeV79rF44450m2MYhlEDcxBpZFCkw5yFmQzDyEDMQaSRwZEOc9ZQbRhGBmIOIo0M7uo6zFkNwjCM\nTMQcRBoZHBVicjOuGoZhZA7mINLIIW2a07F1c7buqmDF5p3pNscwDKMaKXMQIvKkiKwTkZlRZcNE\nZJKITBWREhE5xpeLiNwvIgtFZLqIDE+VXZmEiFiYyTCMjCWVNYingZExZX8Cfqeqw4Df+nWAM4B+\nfrkaeCiFdmUUg/zAfTb0t2EYmUbKHISqTgA2xRYDbf3vfGCV/30O8Kw6JgEFIlKYKtsyiUgm08yV\nlslkGEZmIalsHBWRXsBYVR3s148A3gEE55yOU9WlIjIWuEtVP/Fy44CbVLUkQOfVuFoGhYWFRWPG\njKmTbeXl5eTl5TWobF10rt1RwU/f2kDb5lk8eVYnRCQj7TSdptN0Nk6dQRQXF5eqanFSQVVN2QL0\n
AmZGrd8PnO9/XwS873+/CRwfJTcOKEqmv6ioSOtKSUlJg8vWRWdlZaUeedt/tOdNY3V12c4G0dlQ\ncqbTdJrOxq8zCKBEQ7zDD3QW0+XAa/73K8Ax/vcKoHuUXDeqwk8HNa6h2npUG4aReRxoB7EKOMn/\nPhVY4H+/AVzms5lGAFtUdfUBti1t2BzVhmFkIjmpUiwiLwAnAx1FZAVwG/BD4K8ikgPswrclAG8B\nZwILgXLgylTZlYnsz2SyhmrDMDKIlDkIVR0VZ1NRgKwC16TKlkwnUoOYZTUIwzAyCOtJnQH07tCK\nVrnZrN6yiw3bd6fbHMMwDMAcREaQlSUM9GEmG9nVMIxMwRxEhjCoi2UyGYaRWZiDyBCsHcIwjEzD\nHESGEBm0z0JMhmFkCuYgMoTDOrWmeU4WSzeWs2Xn3nSbYxiGYQ4iU8jJzmJAoatFzLZahGEYGYA5\niAxi8P5MJmuHMAwj/ZiDyCBsTCbDMDIJcxAZxP65ISzEZBhGBmAOIoPo37k1OVnCV+u3U76nIt3m\nGIbRxDEHkUE0z8mm/6FtUIU5q60WYRhGejEHkWHYyK6GYWQK5iAyDGuoNgwjUzAHkWFEelRbQ7Vh\nGOnGHESGcURhW0Rgwdpt7Nq7L93mGIbRhDEHkWHk5ebQt1NrKiqV+Wu3pdscwzCaMOYgMpDB1lBt\nGEYGYA4iA9nfUG1DbhiGkUbMQWQgkcmDZlkmk2EYaSRlDkJEnhSRdSIyM6b8OhGZJyKzRORPUeW3\niMhCv+30VNnVGIhMPzpnzTYqKjXN1hiG0VRJZQ3iaWBkdIGInAKcAwxR1UHAPb58IHAxMMjv83cR\nyU6hbRlNfstm9OyQx56KSlZstSE3DMNIDylzEKo6AdgUU/wT4C5V3e1l1vnyc4AXVXW3qi4GFgLH\npMq2xkBk4L5FZTZ5kGEY6UFUUxfCEJFewFhVHezXpwKjcbWEXcDPVfVLEfkbMElVn/dyTwBvq+q/\nAnReDVwNUFhYWDRmzJg62VZeXk5eXl6DyjakztfmbucfM7bzjV65/Pjo9hlrp+k0naYzs3UGUVxc\nXKqqxUkFVTVlC9ALmBm1PhO4HxBcDWGx//0gcEmU3BPA+cn0FxUVaV0pKSlpcNmG1PnRvHXa86ax\nevqf3mkwnbWVM52m03Q2fp1BACUa4h1+oLOYVgCveRu/ACqBjr68e5RcN2DVAbYto4gM2rekrIJK\na6g2DCMNHGgH8W/gVAAR6Q/kAhuAN4CLRaS5iPQG+gFfHGDbMooOrZvTJb8Fu/YpizfuSLc5hmE0\nQXJSpVhEXgBOBjqKyArgNuBJ4Emf+roHuNxXd2aJyMvAbKACuEZVm/xARIO65rNqyy6+99gkTuzX\niRP7d+L4wzrSrlVuuk0zDKMJkDIHoaqj4my6JI78HcAdqbKnMXLJiJ58uWg9a7fu5pXSFbxSugIR\nGNKtgJP6deTE/p0Y1r2AnGzr72gYRsNTawchIu2A7qo6PQX2GFGc1L8Tj3+7E3ld+jNhwXomzF9P\nyZLNTFtexrTlZdz/wULaNM/huMM6cGL/TnTZV5lukw3DOIgI5SBE5EPgbC8/FVgvIh+p6o0ptM0A\nRISBXdoysEtbfnxSX8r3VDBp0UYmzN/AhPnrWbRhB+/MWss7s9bSIz+HU45Lt8WGYRwshK1B5Kvq\nVhH5f8BTqnqbiFgNIg3k5eZw6oBDOXXAoQAs31TOxws28L9jZ7NsSwUbtu+mY+vmabbSMIyDgbDB\n6xwRKQQuAsam0B6jlnRvn8f3vtaDod1dz+upy8rSbJFhGAcLYR3E74B3gIXqej73ARakziyjthzV\nox0AU5ZvTrMlhmEcLIQNMa1W1SGRFVVdJCJ/SZFNRh04qnsBAFOsBmEYRgMRtgbxQMgyI00M6+Ec\nxLTlZeyznteGYTQACWsQInIscBzQSUSiM5baAk12OO5M5JA2LT
gkL5t15ftYsG4bAzq3TbdJhmE0\ncpLVIHKB1jhH0iZq2QpckFrTjNrSr0MzwMJMhmE0DAlrEKr6EfCRiDytqksPkE1GHenfoRmfLt/F\nlGWbGXVMj3SbYxhGIydsI3VzEXkUN3z3/n1U9dRUGGXUjf7trQZhGEbDEdZBvAI8DDwONPlB9DKV\n3gXNyM3OYsG67WzZuZf8ls3SbZJhGI2YsA6iQlUfSqklRr1pli0M6tqWKcvKmL6ijBP6dUq3SYZh\nNGLCprmOEZGfikihiLSPLCm1zKgTR3X3HeYszGQYRj0JW4O43P/9RVSZAn0a1hyjvhzVowA+hSnL\nrEe1YRj1I5SDUNXeqTbEaBiO8h3mpiwvQ1URkTRbZBhGYyXscN+XBZWr6rMNa45RX7oWtKRTm+as\n37abJRvL6d2xVbpNMgyjkRK2DeLoqOUE4Hbc/BBGhiEiUeMyWZjJMIy6E8pBqOp1UcsPgaNwvayN\nDGT/yK7WUG0YRj2o62TG5UC/hjTEaDiq2iGsBmEYRt0J5SBEZIyIvOGXN4F5wOgk+zwpIutEZGbA\ntp+LiIpIR78uInK/iCwUkekiMrwuJ2M4hnTLJ0tgzupt7Nxj/RoNw6gbYdNc74n6XQEsVdUVSfZ5\nGvgbUK0hW0S6A98AlkUVn4GrkfQDvgY85P8adSAvN4cBndsye/VWZqzcwjG9rcuKYRi1J2wbxEfA\nXNxIru2APSH2mQBsCth0L/BLXD+KCOcAz6pjElDgpzg16sj+MJM1VBuGUUfChpguAr4ALsTNS/25\niNR6uG8RORtYqarTYjZ1BZZHra/wZUYdsYZqwzDqi6gmn31MRKYB31DVdX69E/C+qg5Nsl8vYKyq\nDhaRPGA88E1V3SIiS4BiVd3g2zXuVNVP/H7jgF+qammAzquBqwEKCwuLxowZE/pkoykvLycvL69B\nZTNJ58ptFVz/nw20a5HFY9/uVK3DXCbZaTpNp+lMrc4giouLS1W1OKmgqiZdgBkx61mxZXH26wXM\n9L+PBNYBS/xSgWuH6Aw8AoyK2m8eUJhMf1FRkdaVkpKSBpfNJJ379lXqkNvf0Z43jdWVm8sbRGdD\nyJlO02k6D6zOIIASDfHuD5vm+h8ReUdErhCRK4A3gbdC7htxRDNU9RBV7aWqvXBhpOGqugZ4A7jM\nZzONALao6ura6Deqk5UlDNvfYc7CTIZh1J6EDkJEDhORr6vqL3Bf+UOAocBE4NEk+77g5Q4XkRUi\n8oME4m8Bi4CFwGPAT8OfghEPa6g2DKM+JEtzvQ/4FYCqvga8BiAixX7bWfF2VNVRiRT7WkTktwLX\nhLLYCM3+hurlVoMwDKP2JAsx9VLV6bGFqlqCa18wMphh3VwNYsbKLeypqEyzNYZhNDaSOYgWCba1\nbEhDjIYnP68ZfTu1Yk9FJXNWb023OYZhNDKSOYgvReSHsYW+PaFGCqqReVT1h7B2CMMwakeyNoj/\nBl4Xke9T5RCKcSO5fieVhhkNw1E9CvhX6QqmLC/jinQbYxhGoyKhg1DVtcBxInIKMNgXv6mqH6Tc\nMqNBsDmqDcOoK2GnHB2P6wVtNDL6H9qavNxslm0qZ8P23XRs3TzdJhmG0Uio63wQRiMhJzuLId3y\nAZhqtQjDMGqBOYgmQFV/CGuoNgwjPOYgmgBH2ZAbhmHUAXMQTYBhfsiNacvL2FeZfPRewzAMMAfR\nJDikTQu6tWvJjj37WLBuW7rNMQyjkWAOoolgEwgZhlFbzEE0EaraIayh2jCMcJiDaCJUDf1tNQjD\nMMJhDqKJMKhLPrk5WSxYt50de2xkV8MwkmMOoomQm5PF4C5tAVi4eW+arTEMozFgDqIJEWmonr/R\nHIRhGMkxB9GEiLRDmIMwDCMM5iCaEPtrEJv24GZ5NQzDiI85iCZEl/wWHNKmOdv3KEs2lqfbHMMw\nMpyUOQgReVJE1onIzKiyu0
VkrohMF5HXRaQgatstIrJQROaJyOmpsqspIyL7w0y3vj6D1Vt2ptki\nwzAymVTWIJ4GRsaUvQcMVtUhwHzgFgARGQhcDAzy+/xdRLJTaFuT5acnH0bbXOGzrzZy+r0TGD11\nZbpNMgwjQ0mZg1DVCcCmmLJ3VbXCr04Cuvnf5wAvqupuVV0MLASOSZVtTZmh3Qv4y+kdOW3AIWzd\nVcENL07l2n9Opqx8T7pNMwwjw0hnG8RVwNv+d1dgedS2Fb7MSAHtWmTz+OXF3HXekeTlZjN2+mpO\nv28CH81fn27TDMPIICSV2Swi0gsYq6qDY8pvBYqB81RVReRBYKKqPu+3PwG8paqvBui8GrgaoLCw\nsGjMmDF1sq28vJy8vLwGlW2MOtdsr+D+L7Ywz6e+juybx2VD2tA8RzLKTtNpOk1n3WVjKS4uLlXV\n4qSCqpqyBegFzIwpuxyYCORFld0C3BK1/g5wbDL9RUVFWldKSkoaXLax6qzYV6kPjl+gh/3qTe15\n01g95e7xOmXZ5oyz03SaTtNZN9lYgBIN8Q4/oCEmERkJ3AScrarReZZvABeLSHMR6Q30A744kLY1\nZbKzhJ+efBj/vubr9D+0NYs27OD8hz7jL+/Np8ImGDKMJktOqhSLyAvAyUBHEVkB3IarKTQH3hMR\ngEmq+mNVnSUiLwOzgQrgGlXdlyrbjGAGdcnnjWuP58/vzuPxTxZz/7gFPJYj9P7sY3p1zKNH+1b0\n6pBHjw559OzQisK2LcjKknSbbRhGikiZg1DVUQHFTySQvwO4I1X2GOFo0SybW781kFMHHMotr01n\nycZyZq/eyuzVW2vI5uZk0b1dS3p1aEWzvdsZs3JWqGOsW7c1lOzerds5fHAFrZun7DY1DCMB9uQZ\ngRzbtwPjf34yH00sIb/bYSzbVM6SDeUs3bSDpRvLWbqxnA3bd/PV+h18tX6H2+mrJeEPsDCc7JqK\nKTx2WbHVVAwjDZiDMOIiIrRpnsVRPdrtH8cpmu27K1i2sZylG3fw5awFdOvWPZTe5SuW0z2JbKUq\n9707l3Fz13Hf+/O58ZuH1+kcDMOoO+YgjDrTunkOA7u0ZWCXthyyZxVFRb1D7VdauimUbNa2Nfzh\n483c/8FCBnZpy8jBhfU12TCMWmCD9RkZy9BDm3PzGQMA+J+Xp7Fg7bY0W2QYTQtzEEZG88MT+nD2\n0C7s2LOPHz5bwpadNpeFYRwozEEYGY2I8Mfzh3BEYVuWbCznhhensM/6ZhjGAcEchJHxtMzN5tFL\ni2iX14wP563nL+/NS7dJhtEkMAdhNAq6t8/jb98bTpbAg+O/4q0Zq9NtkmEc9JiDMBoNXz+sI786\n8wgAfv7KNOatsUZrw0gl5iCMRsUPju/NucO6UO4brW0eC8NIHeYgjEaFiHDneUMY1KUtyzaVc90L\n1mhtGKnCHITR6GiZm80jlxbRvlUuHy/YwN3vWKO1YaQCcxBGo6Rbuzwe/N5wsrOEhz/6ine+Krea\nhGE0MOYgjEbLsX078OtvuUbrRydv5cQ/jefB8QvZsH13mi0zjIMDG4vJaNRccVwvmmVn8bf357Cy\nbCd3vzOP+96fz5lHFnLZsT0Z3qMdfu4RwzBqiTkIo1EjIlwyoieH56xnZ9uePDtxKR/MXcvoqasY\nPXUVRxS25dIRPTn3qC7k5drtbhi1wZ4Y46AgS4QT+3fixP6dWLG5nH9+voyXvlzOnNVb+dXrM7jz\nrTmcX9SNo9pUUJRuYw2jkWAOwjjo6NYuj1+OHMAN/9WP/8xcw7MTl1K6dDNPf7aEZ4DpO2bzi9MP\np0Wz7HSbahgZjTVSGwctzXOyOWdYV179yXG8ef3xjDqmOyLwxCeLOftvnzBz5ZZ0m2gYGY05CKNJ\nMKhLPneeN4Q7T+1An06tmL92O9/5+6c8OH4hFfsq022eYWQkKXMQIvKkiKwTkZlRZe1F5D0R
WeD/\ntvPlIiL3i8hCEZkuIsNTZZfRtDmsfTPevO4ELj+2J3v3KXe/M4+LHpnIkg070m2aYWQcqaxBPA2M\njCm7GRinqv2AcX4d4Aygn1+uBh5KoV1GE6dlbja/O2cwz/3gGA5t25zJy8o48/6P+cfnS1G1znaG\nESFlDkJVJwCbYorPAZ7xv58Bzo0qf1Ydk4ACEbEJiI2UckK/Trzz3ydy1lA3+N+tr8/kqqe/ZN3W\nXek2zTAyggPdBnGoqq4G8H8P8eVdgeVRcit8mWGklIK8XB4YdRT3jzqKti1yGD9vPaffN4G3bb4J\nw0BSWaUWkV7AWFUd7NfLVLUgavtmVW0nIm8Cd6rqJ758HPBLVS0N0Hk1LgxFYWFh0ZgxY+pkW3l5\nOXl5eQ0qazobt86NO/fx4JdbmLbWDSHeOz+Lbvm5dG6dTedWOe5v62zym2dV6519MJy76Tx4dQZR\nXFxcqqrFSQVVNWUL0AuYGbU+Dyj0vwuBef73I8CoILlES1FRkdaVkpKSBpc1nY1fZ2VlpT7z2WI9\n/Ndvac+bxgYuA3/zto68b4L++LkS/b+3ZuudL0/QTxes1xWby7ViX+UBsdN0ms7ayMYClGiId/iB\n7ij3BnA5cJf/Ozqq/FoReRH4GrBFfSjKMA4kIsJlx/binKFdeWNCCc07dmPZxnKWbNzBsk3lLNmw\ng627KpizeitzVm/dv9/DpZ8DkJudRbf2LenVoRU92ufRq0MePTu0omeHPLq1q9vXnmGki5Q5CBF5\nATgZ6CgiK4DbcI7hZRH5AbAMuNCLvwWcCSwEyoErU2WXYYQhP68ZAzvlUlTUvca2svI9LNlYztKN\nO1i2sZzS+cvYIS1ZsrGc9dt2s2j9Dhatr5k2myVweIdm/Dh7JSMHd6Z5jvXkNjKblDkIVR0VZ9Np\nAbIKXJMqWwyjISnIy2VYXi7DurvmtNKCrRQVuRGeduyuYNmmcpZ6B7J0k/+7sZxVZTuZs2EvN7w4\nlY6tc/nu0d353td60rWgZTpPxzDiYmMxGUYD0qp5DkcUtuWIwrY1tm3fXcEDoyfy0Spl7pptPDj+\nKx768CtOO+JQLh3Rk+MP60hWlg1NbmQO5iAM4wDRunkO3+ybx80XDqdk6WaenbiU/8xczXuz1/Le\n7LX07tiK73+tBxcGhLUMIx2YgzCMA4yIcHSv9hzdqz3rth3BS18s559fLGPxhh384c053PPuPIo6\n59Jr2YxQ+nZv3cZiVtCrQx49OuTRqXVzmyTJaBDMQRhGGjmkTQuuO60fPzm5L+PmruP5SUv5eMEG\nPl2+i0+XLwut519zpu3/nZebTY/2efTskOeyqfzfrTsq2FepZFsYywiJOQjDyABysrM4fVBnTh/U\nma/Wb+flD6fQrXuP5DuqMm3+Evbk5u9vFC8r38vcNduYu2ZbDfFm77xN93bOeUTSb3t2yKNH+1Z0\nb9/SMquMapiDMIwMo2+n1nyzTx5FRT1DyQ/M3UhR0VH717eU72Xpph0s2VjOso2Rv+UsWFPG5l2V\nLNqwg0UbdgDrq+kRgS75LenZIY+8ynKKt39Fz/ZVjqRVc3tdNDXsP24YBxn5ec0YklfAkG4F1cpL\nS0sZeOQw1+HP9+HY3wFw4w5Wbt7JyjK3ALy/eG61/Tu2znXOIspp7N28l+Gq1uZxkGIOwjCaEC1z\nszm8cxsO79ymxrY9FZWsLNvJ0o07+GTqXCrzOrIsUhPZVM6G7XvYsH0PpUs3V9vv8Rkfc8mxPfnO\nUV1pbbWMgwr7bxqGAUBuTha9O7aid8dWtNm+nKKigfu3VVYqa7buYunG8iqnsbGcT+evZd7abfzm\n3zP549tzOW94Vy4Z0ZP+h9Z0QEbjwxyEYRhJycoSuhS0pEtBS47t22F/+aQvS1if24XnJi7liyWb\neHbiUp4JuoTgAAAgAElEQVSduJQRfdpz6YhefHPQoTTL
tpmNGyvmIAzDqDPNsoSzhnbhrKFdmLtm\nK89PWsrrk1cyadEmJi3axCFtmjPqmB6MOiZERpaRcZiDMAyjQRjQuS1/OPdIbho5gNenrOS5iUtZ\nsG47fx23gL+NX0ifghyOmD/Fd+hrZR37GgHmIAzDaFDatGjGZcf24tIRPZm0aBPPT1rKO7PWsGDT\nXhZsWlVDPrpjX88OrWi+aydDhlVaaCoDMAdhGEZKEBGO7duBY/t2YPOOPbz5SSktOnbfP7ptoo59\nH678jHu/O4zDDmmdxjMwzEEYhpFy2rXK5YiOuRQVdauxLbpj39INO3jm04XMWLmFb93/MTefMYDL\nj+1lo9ymCXMQhmGkldiOfUNblTF6WS6vTl7B78bMZtycddx94RAK823ejAONBfkMw8goWjXL4s8X\nDeXhS4po3yqXTxZu4PR7JzB66sp0m9bkMAdhGEZGMnJwZ/7z3ydw2oBD2LqrghtenMq1/5xMWfme\ndJvWZDAHYRhGxnJImxY8fnkxd513JHm52YydvprT75vAR/PXJ9/ZqDfmIAzDyGhEhIuP6cHbN5xA\nUc92rN26m8uf/ILfjp7JzorKdJt3UJMWByEiPxORWSIyU0ReEJEWItJbRD4XkQUi8pKI5KbDNsMw\nMpOeHVrx8o+O5ZcjD6dZtvDsxKVcNXodlz35BY9/vIj5a7ehquk286DigGcxiUhX4HpgoKruFJGX\ngYuBM4F7VfVFEXkY+AHw0IG2zzCMzCU7S/jpyYdxUv9O/Hb0LEqXbmbC/PVMmL8e3pxDYX4LTujX\nkRP7d+L4wzpSkGffmfUhXWmuOUBLEdkL5AGrgVOB7/ntzwC3Yw7CMIwABnXJ59WfHMe4T79ga14X\nJszfwMcL1rN6yy5eLlnByyUryBIY0q2AE/t3ouO+PfTYtpuOrXNtWI9acMAdhKquFJF7gGXATuBd\noBQoU9UKL7YC6HqgbTMMo3FR0CKb047qxneO6kZlpTJnzVYmzN/AhPnrKVm6ianLy5i6vAyA3374\nPq1ys6uNA9XLT4DUo0Mehfktbb7uGORAx+xEpB3wKvBdoAx4xa/fpqqHeZnuwFuqemTA/lcDVwMU\nFhYWjRkzpk52lJeXk5eX16CyptN0ms7M0bmzopJZ6/Ywde0e5q7fxdpypXxv/PddThYc0iqbQ1oK\nXdrmUtg6m0Nb51DYOptOrbJpFuM8Mvnck1FcXFyqqsXJ5NIRYvovYLGqrgcQkdeA44ACEcnxtYhu\nQM1RvQBVfRR4FKC4uFiLiorqZERpaSlh9w0razpNp+nMLJ3HR8kNHz6csvK9LN1Uvn88qKqpV8vZ\nsH03q7btY9U2mLquopqeLIEuBS33DyjYq0MeOzfu5LC+XULZuWjjIvrkJ5cNKwewbttirjyhbu+/\nsKTDQSwDRohIHi7EdBpQAowHLgBeBC4HRqfBNsMwDlJEhHatcmnXKpdh3QtqbN+xu4Jlm8r54IsZ\n5BR09lOtOkeyqmwnKza75dOFG6t2+nxyeAMmhZQNKdevfTOuPDP84etCOtogPheRfwGTgQpgCq5G\n8Cbwooj8wZc9caBtMwyj6dKqeQ5HFLalvFsLior6Vtu2p6KSFZvLq41CO3fpGtq1q+logti8eTPt\n2rVrMDmAFnu3JReqJ2nJYlLV24DbYooXAcekwRzDMIyE5OZk0adTa/p0qhp+vLR0V9rDa6nGelIb\nhmEYgZiDMAzDMAIxB2EYhmEEYg7CMAzDCMQchGEYhhGIOQjDMAwjEHMQhmEYRiDmIAzDMIxADvhg\nfQ2JiKwHltZx947AhgaWNZ2m03SazkzTGURPVe2UVEpVm+QClDS0rOk0nabTdGaazvosFmIyDMMw\nAjEHYRiGYQTSlB3EoymQNZ2m03SazkzTWWcadSO1YRiGkTqacg3CMAzDSIA5CMMwDCMQcxCGYQQi\nIs3TbYORXpqcgxCR
riJynIicGFmitg1PtMTRd6GItPG/fy0ir8WTrYWN14pI0nkHRSRbRJ4PqfNj\nEblDREZG7E0gOzikzjwR+Y2IPObX+4nIt8PseyAQkfaJlnrq/kFA2V1xZMeFLEvb9RSRJ2PWWwNv\nBchlichFtdAb9txvCFNWW0RkqH+erhWRofXVV4vjiohcIiK/9es9RKReM2aKSHbDWFcLUt3RIpMW\n4I/AEtyNP8Yvb0RtH++XicBeoAQo9b8/iaNzuv97PPAxcA7wedT2GcD0eEscnX8AFgIvAyPxyQRx\nZN8BckOcex/gMuAxb1MJcG8c2U+AL4CfAgUJdL4E/BKY6ddbAlMD5L4OvAfMx00tuxhYFEdnf2Bc\nlM4hwK9D/n9vj1lfHHW82KXG8WtzbOBt4PtR638HnoiRaQG0B6YB7fzv9kAvYE49rucNQFtAcHO3\nTwa+GcfOUNce+F/gIf+7HfAZcGUcnRNC/C9qe+6TA8qmBJQd6s/5bb8+EPhBHBtuAGYCv/fLDOC6\nOLIJ9QIPAPfHWwL0PQQ8GDlXfw2+jHPs84AFwBZgK7AN2Bogtxi4GxgY5nloiOWAHCRTFmAe0DyE\n3IvAkVHrg4Gn48hO8X/vBL4XXeZ/9/TLn/xypF/uAn6bwAYBTve2LAT+D+gbIPcI8CXwG+DGyBJH\nZyFwsb9xZwP/SXD8fv6cFgL/BL4RIFMScL7TAuTmAmcAhwAdIkuc436Em5s8WufMkP/fs+p5f4Q+\nNu7l/R4wCngWuC9A5gb/UO+muqOaBlxbj+s5zf89HXgDGErAC7YO1/6PwMP+fjo/wXX6DfBzoDtV\nL/72dTl3f/3GAJv9uUSW8cD7Acd+G7go6hrkADPi2DkdaBW13or4H2UJ9QKX++VR3AfUdX6ZQMCH\nVuT/kex/6csXAkeEuD/bAD/EOe9JwNVA2/rc80mPmUrlmbb4m6B1CLmgr7YaZb58LO4l/RVQADSP\n81B/GqYsZvtQ4D7/kD8ETAH+FCNzW9ASoOsr4HP/4A4HskJch2zgfGAlMMfbcV7U9s9wL8rIw9AX\n+CJAz+fJjhUl+6X/G/1gTQ2w62chdA1PtNTx2O2jlp7+f/I3Al6SUfsEfrUGyIW9npFa61+B78Ta\nXJtrj/t6jSznA1NxL8Hzov/XMfuEqpGFOXd/DU/G1dpPilqGAzl1+R9Flc8AWkSttyC+MwmlF+e4\nmkWtNwPGB113f59G/pedEvyPEr4H4uxzon8udwDPAIfVVkeYJYemRTkw1cdAd0cKVfX6GLk5IvI4\n8DygwCW4F2QQF+HCQPeoapmIFAK/CJBrJSLHq+onACJyHO6LpgYicj3ua2UD8DjwC1XdKyJZuKro\nL6Ns/53fp41b1e1x7LwfFwYbBRwFfCQiE1T1q4DjDwGuBL6F+0o+S1Uni0gX3IP8mhe9DfgP0F1E\n/oELZ1wRcOzxInK33y/6uk8OkN0gIn1x1x0RuQBYHS2gqvtE5Bzg3jjnGuHPCbYpcGptj40LOWrU\nuuCu07d8eZ+AY60RkTaquk1Efo17+f0h4PzDXs9SEXkX6A3c4v/3lXHOM9m1PytGfgrupXeWP5/X\nYrajqr3jHKsGqvqAv9d7QdX7RlWf9X+XAktF5PvAKlXdBSAiLYFuuJBwNDtEpANV/6MRuNBMEE8B\nn4vI6379XFwYKYiwervgvuQ3+fXWviyW+4HXgUNE5A7gAuDX0QIicp7/WSIiLwH/pvr/6LUY+Wzc\nfXYl7nr+GfgHcAIubN4/zrnVmSbVUU5ELg8qV9VnYuRaAD/BeWlw1ciHIjdvgN7jgX6q+pSIdMLV\nUhbHyBQBTwL5vqgMuCroJSkivwOe9A9P7LYjVHVO1Ppg4DncFyw4p3KZqs6KY2tr3A32c6CbqtZo\n+BKRCbi2in+p6s6YbZeq6nNR6x2AEbgX5SRVrTG6pIiMDzBFVTX2BY2I9MF9wR6HCz
ssBi5R1SUx\ncnfgruVLuK+oiNIgpxOKWhw7CzhWVT8NqXe6qg7x98mdwD3Ar1T1awGyYa5nFjAM99Ve5vfpqqrT\nA2RDX/uwiEgeLpTZQ1WvFpF+wOGqOjZA9jlcTWgqsC/q+NfHyJUAx6nqHr+ei/uyPjpGbjiuPWAw\nrn2hE3BB0LlHyR+Pu54TVHVKArmkekXkSuB2XE0CXG3n9th3iJcdAJzmjz0u+rn1258KssWjqnpV\njPwif9wnVPWzmG33B3zo1psm5SBSgYjcBhTjHpD+/iv7FVX9ehz5trjrHvjV4x/+6aoaNpPoM+BW\nVR3v108G/k9Vj4uR+zPuQWmNi19OAD5W1UVhjhOjK2GWVn1e0lHHaIULg22Lsz3ygEZuYCG+0wn9\nQgtzbC8zUVWPDXkuU1T1KBG5Exfi+GekLEB2CDW/tmO/JAX4PtBHVX8vIj2Azqr6RRh74tjYCRff\njj32VQGyL+FqUpep6mD/tT9RVYcFyM7BNaomfNGIyNTY/UVkmqrWyDwSkRzgcNz/fJ6q7o2j8/e4\nxJHPVHVHkIyXy8I55S9C6u0MRJz756q6JkDmHuCpeB9qdUFEWieIEKSEJhVi8i+GO3EZCi0i5ara\nx2+fQfXwQTVUdUhA8XdwIZvJXmaVRKWRisiNcWyJ6PxLzDEqRWSaiPRQ1WUhTqtVxDn4/T/0L7hY\nJuHaL9YmU5jsOlHL0I2I5OPCJ5Ea2UfA74OcpIgcimuQ76KqZ4jIQNzX+hN+e+R6jvXHkphjB/EU\n7oUWcZorgFe8juhjF+AyvXoBOVH/o6Avs3dF5HzgtWQvP2CliDwC/BfwR3H9C2qkmItLNR0CzKIq\nZBQU5vm7334qLjtnG/AqcHSMXG2u/Wjcy/R9qr7049FXVb8rIqMAVHWnRC5WTWYCnakZqotlvYic\nrapveLvPIWCuAxG5BvhH5MUrIu1EZJSq/j1A5xJcSPV+Ednmz2+Cqo6OFvLP3J+9ww/zQs8G1uPe\nn/1FpL+qToiRmQs86p3ZU8ALCT4KnwFuUNWyyDkBfw5wzhX+/AdR/bms4cQbiiblIHD/qNtwsetT\ncKGW6Bu7Ljnne1RVRSQSu4x9OSfscxCHQmCWiHxB9fDJ2QGyi0TkN7gwE7j2ksWxQqr6ioicLVX9\nPj5S1TFxjp/wOqnqKbU8nydxL4pI/vyl/hjnBcg+7bfd6tfn48JIkdhx5Hoejnshjva2nYWrFQUR\n9oX2Fs6RziB+TD/Cjbg2pH0ispOqGkzbANmw7VQjVHVgkuMCfE1Vh4vIFH8+m31IJoiw1z5PVW8K\ncWyAPb7WELnn+xIVO/dlY/z2NsBsfy9Hx9dj7+UfA/8QkQf9fitwzjqWH6rqg1F6NovID3FOsxqq\n+iTwpP/ivwgXVr2a4GcylMMXkT8C36WmE69276nq48DjInI47vmZLiKfAo9Ff9B5hkScQ9Q51ahd\n4p7xubjstd/japHx2kYbBk1By3emLkCp/xudvvZxPXX+HJfFtAhXRZ9IyKyVBDpPClriyLbDNYhN\nxjUw/hVoFyB3Jy7H/yq/vAfcWZfrRPXMlxpLgL7aZIWFzSZ5F2gTtd6GOGm7hM8OCkwVbYD7rkfQ\nEiD3BCFy3Kldhkyoa4/re3NmyPP5Bq4msh7XSLoEODnMPZzoXvb7tY7+vwZsn05UvyB/HWbFkX3c\n/+9fxzn0YwjIjPKy23Av/L0k7osQKlU+yrZzcI3PpcBNuJTeF2PkpkU/s7j2xBrZVlSl1Eey2JoB\nH6Tino0sTa0GscvHGxeIyLW4NLFDYoV8dTTyFZGL+0fs0ICvQ1W9R0S+gbupDsf1bXgvQOdTBIRA\nNKB6qKofhT0hVd0MXO/bNio1fozyW8AwVa309jyDcyi3BMgmu06xmS/VTKJmSGSnVM/g+jqws8ae\njrDZJD2APVHre3ChoSDCZgc9579Gx1L9a3dTgC
wicjZVoZsPNU6bBvAmVeGwFrjso3m4UEE0zwAT\nRWSNP36kVhIb2kyaIRNF2Gt/A/ArEdmNe0nGrRGp6nsiMpmqxvQbNKYxvTb3sLcrYWgxineAl0Xk\nYdw1/THufxtEB9xLugyXdbRBVSuCBFU1bE1/Ee59sDuRkIj8BfecfIBrE4y0D/1RRObFiP8Z+ExE\n/uXXLwTuCFAbaRMpE5ecsob493yD0KQaqUXkaFyVrADXc7QtLi7/eZL9zgWOUdVfJZBpS/XGvU0x\n28+PWm2Ba7tYpQHxbXHpb3/EvZSFBA+riByJ66gVncV0uarOjJGbjvvK2+TX2+NeajXaVep6neIh\nIsNwL798fy6bgCtUdVqAbNhskltxYYPXcS+K7wAvqeqdATqfw4WNduIe8M9jX2he7hrcg1lGlTNX\nrWp7iZa9Cxfi+ocvGoWred2c8GJUneOPVPVHMeULcV+61UJcGpzNljBDJkpuKO7+iGTPbcbdH4FZ\nP2EJ05ju5aI/tiJswfXk/x/1SRIi8jY+tKiqQ33sfoqqHhmjLwv4EVXn/i7wuKrGbTcRkSNwYZmf\nAdmq2i2OXFKHLyKv4vonJUyVF5GrgAqgl8YkEohIvsa0R3iHeCpV/8/ZAcf+f7i2piNxodjWwG9U\n9ZF4515fmpqDKMbFtnvivgIg+AstaN9JqjoioPxHuHjgTtxDHXmZB+XDR++XhespGpR1sxDX9yBp\nfFHCZzGNwvXeHu9tPBG4RVVfDNCZ8DqJyCWq+ny8BniNaXiP0tvWb9+a4HwuxH0ldsd13Poa7iEI\nSgcejssBh8QpjKfiMrhOwPVTmOrl/xoj9xUuvp90InjvcKNrZNm4F1rSe8nLT1bV4TFlHwTdD1Hb\nE44fFVTTEZHeqro4+tpHyvz2Aao6V+JkpsW57oGN6UG1YXEp26twvfEF15O/M64G9RNVPdnLfamq\nR0tUdpcEZDbVBnHjWJ2Au9fb4cK/H6trm4iVDeXwJXyq/EP4RAJVPcI3PL+rMWm7XrZHHJ3L/Pag\n50yqxIKft4agqYWY/oFrHEzYCClVHVjAZZsUEz9D5ufAoDAvlRj64cIkQawN4xw8obKYVPUFEfkQ\n9xAIcJMGpOd5kl2niP5Q1XJxg649hYvrPuZfRjer6rsB4r9R16DeDpf182dcL/IafQb8yytpSq2q\nfiAiH+HO/RRcWGIQrr0mmlm4zpRhKaCqw1R+PKGYBzwLKMLF72OZKyL/xMWpgzpMRTrp7X85RA5B\n/E56r+J6jUc75X95G8DVWK4mODOtRkaaJ2xjOsBIrd7f41H/sfV7EYmukYcKLUryDLtozsN9bPxV\nVVf5/f8Yx84zCQ7BVnMQsY4gAbVJJIiEIMG1lcWGIGMTM97w64kSMxqEpuYg1qtPo0tCdIy9AtcI\nd04c2a8I8VKJqmpHHuY1uEarIEL1rPSEymLyZOFCUInS8yDJdYqq0v5dVYNedLFcpap/FZHTcWGz\nK3EOI8hBREIF3wIeVtXRInJ7iGPERVzP+Vb4L0jgaFVdF+fYU8X1sUjU0x5cvHyyd7r7a2RxTGhD\n1QugAucAXg2Qa+mP+82osv1tOqraW0QE6K5JUqB9CGoQkB/zwdOW6i/Wq/3f2mSmTRSRgUFhkAAq\nxY3+GomvXxC1Lfqj60bci6+vuGyfTjGyEZJlIkYzLKBWcwbxn7ukDl9EFhPwsRjgoPb6WmXE4XUi\nzkdpQBhtOC6MFtkeGS3hXZyz3+bXb8ela6eMpuYgbhM3hEZs/LDai1dVr6yFzltwDUyfk+ClUotG\nMHAPcTlxXhQxXAX8DvfCEdwXxRWxQhIyPc8T6jrhznsxLg31NXUN5kFEHuAzcZ2HpvkXXRCh+gzU\nkum4L+bBuK/SMnEd3WIba//tlzB8C5dCuhlYRuIa2VvAr6ges78ZF6bZT5j7TlVV3NARRUlED8el\nbRdQ/YNnGy
7brgaSYEiMGMI2poNLxfwrLg1VcWnEl4hLk7026jiTReQkkndUa6mq40REfNvM7SLy\nMc5pRM7jJ7iRiPv4UGCENkC83u93AlP8x0Eih18c9bsFrkE5KPRXm0SCavhrUSMURe0SMxqEptYG\n8TwwgCSxUxHphmso/Trupv4El6mxIkDnF357bMNiUNf7sFkvtTmnSHtBL6oe7BoPq7jMiSGqmjD7\nwsuGuk5e9hhcXPlc3AixL6rq8zEyTwFdcVXnobjMkg9VtcZLTlyv55G4NL8F4voMHBknHFUrpPow\nI51Vtc4T4oRt1/Cy8/wxZ5Kg8TnsfSeur8DTqvplCDuPVdWJIeRCDYnhZUM3poc47qk+BBjUJ0Zx\nX/SfqG+E9rWLE3A1kg9wGXZ3qerhUTrzcW0Od1I9RLQtqJ0mar9CqkKwgT2k4+z3iaoeH1AeNpEg\nKATZXlVPj5ELnZjRUDQ1BzEjtjoXR+49XKNadNjm+6r6jQDZzzSmQTiOzqBGsBJVrfGVIiL9cXH3\nQ9UNZTAEOFtV/xAgG/bl8zZwoYboqh/2OsXs0xH4C+46ZcdsCz12UCoQl6p7Au7BW0rVMCMf+O0v\nq+pFEtyTXjVguAe/XzbV2zV2quqAALnAF0iAXKj7TkRm4wZmW4rrSBn3Cz7svSQhh8Twsgkb073M\nL1X1TyLyAMEhmeu93O9U9TaJPy5RB1zobY2qXioiv8TVRiIZdvm4DLtJyexOhoQb5iS6MT/SPvmT\nePdIyOPeRvUQ5BLg1aCPOQmZmNFQNDUH8Rhu7PaEsVMJHhcmMKPCVx+XUrNhMTbNNXTWi29Q/QXw\niFZldMzUgPGZavHyCZWe52XDXqe2uK+Yi3Ffn68DL6tqaYzcOFU9LVlZqhCRX+CcQqkG5MGLSKGq\nrhaRl6new1lwL58aM6gFtGt8EqddAxE5DfdBkDBkF/a+E5GeQccJ+oIPey+JyCvA9aqabEgMRCTy\ngo7XmI6InKWqY8Rl/QQ5iKDQVbzjPQEci2s/eAM3PHi1EGWimkHIY4TKzJLqgx9GXub3qGps34ba\nHPtoaoYg44XsDihNrQ3ieOByHzdPFDvdICKXAC/49VHAxjg6v+f/RtcE4mWUhMp6wQ178EVMmD6w\ngw/h2wsmUpX9ECFoWAgIf52m4WL2vw8KY4gbFTcP6CguKylyQm0JHiI5Jajq3Um2R16KhwXUvGrU\nCDxh2zXAhbUG4FKGE42xFOq+i9goIocQ1eAch4T3ktR+SAxI0pju94sM4zKbgJcfrm/GfiTBmFGq\n+gNxQ+D/B/dclVKV7JEog6s2hMrM0toPMxOG5wmIAmQEmsJu2pm2UDW7W7UlQK4H7mW6HliHewnW\nGBqhlse+GPe18TSukW8xcHEc2bdxX+SRoRQuwE+FGCD7PK7T0TO4DI+ncEOFx8pNpvoseaOIM5lM\nmOuEa0f4S5JzrtWMamm8L36Ci6fvoPqUsIuB55Ps2xo3s9hSYHccmcBJaup63wFn4+YF2eFtrCT+\ncBMJ7yXc0Bcn44bvOClqOTne/VHLazvP29s7yTP3Ki7Zoo9fbsMlPsTKPZSieyDsMCf5uFBqiV/+\nDOTX89iB0xlnwtKkQkxhEZcD/d/qs3LEdVC6R6Oqm3Ea1fajNcMHz+Ee6kjWS9xGMAmel+D7GhxC\nCNuu0gfXsPd9XA3hMuDbGmeEyTCEDROJyHWq+kBdj5Nq6tKomaxdI0Y2bMgu6X3ny6fh+ie8r24Y\n8VOAUepTVmNkw85xEdRxb7pG1RrDtivE6AgbAg0d1k0F4gaxHINLP49ba/ah2pm4DzJwgx8OVdWE\n74Mkxw4VgkwHTS3EFJYhGpWyqaqbpOboirUdj+gp3Iv5bHzWi7gZ3WpkveAyM57C9Xpujxvn6XJc\nj+1YJkmInHRVXSQiF+O+SpfjJrmPNx5SWKaKyBu4XOzoUWdj04YfEDd2TGzn
ptBx6FTineQW3EMa\nlpa4L8nAdo0Ywobswtx3AHtVdaOIZIlIlqqOlzgdwNQNZfFfEmeOC6ldSmgkC6ck8elWI2wItDbj\ndaWCJ3Ev+2Qj+fZV1ehhc34nIlPreeywIcgDjjmIYLJEpF3Ml1y1a6W16yuBhu/NC24I6zJcWGhV\nEtUJXz4BmTntceGhz0WEgJdUbWiPi5FHZ7TUuLF9lsbJOAfxFq6x8RNi4tCNCU3SrhHDyJBySe87\nT5m4lN0JuCGy1xHTRiXh5yH5Jy4MlbT2pFXtCuWqWq2DlrghUoII+/L7MfCsr82BHzMqjs5UsEzD\ndaJNhSMbGiYKkA7MQQQTPbqi4nKPg0ZXTNi4FiMXtjcvuKlAw75UksnVZY6LUNTCSV6Ay6CaoqpX\nihu58/FU2ZVpBIUG4xD2vjsH2IUbfO77uLh4bO0yVMfMOtaebqFmD96gMgjx8hOXBn24ukH6ko7X\nlSKSDXMS4SfAMw3syEJFAdKBtUHEQUKMrujlQsUkReReXLx6N67qPgE3TWONrw8ReRR4QFVnNNDp\npAQJn2MfGYitFFd72gbMVNXY4a6bPGHvu3QgImfgesNfhOs9H6EtroH3mIB9wra/TFDVExPJpBIJ\n7oehAe0/zXEfPH1xWYlbvFxQ+Dfssed4fclCkAcccxD1pLaNa5KgN29UOCgHN5jfIjLshommFjn2\nf8elOl4M/A+wHTdpTa3CdIZDajccfB9cGHME7t6aCPxM6zYX+VBch8ffA7+N2rQNGK8BQ62EffmJ\nG09sJ87xRLdn1at/Q1hEpH2YY4nIf6gK/+4fYlxVE03Dm0xn6H4tBxoLMdWfUDHJgKyXJ3GhpmhS\nFg5KEWH7a7TBjVnzIS6Xva0eoF7UByl/IuRw8Lg2hgdxHRrBOekXCBgdNxnqxtCaiUtwCDuqadhQ\n6VU4B/bTmPL69m8Iy+e+sfkpXBpwvC/n2oR/Q5EJjiAe5iDqT9iYZNKsl0y+UeKwQdx8xJERKy8g\neHL6SAbXAyTP4DKSU5vh4EVVn4taf95/rNQJVd0nIh1EJFdV94SQD3tPD8Q5h+Nx99PHwMN1tbMO\n9McNEHkV8IC40ZSfVtX5MXKficiRmR7+bSgsxFRPUhGTbCyEzbH3sqHGLTLiE9X35iTcpDtJh4MX\nN9mRSy0AAAdBSURBVAZYGfAi7sX7XaA5rlZRpxCOuNF2h+M69UWHg+o8cY24YU62Un2ssgINGOYk\n1fh+Jc/jkkqm4bK7HqURhX8bCnMQ9SQVMcnGgog0V9Xd0Tn2QbHcgAyuuOMWGfGJakiNDDERTY0G\nVb/P4gQqVZPMfBjHjtuCytXPW1AXRGSaxgx4F1SWKsQNIHkJLslkLa5n9Ru4NpdXcGnacWmEtf9Q\nWIip/jR4TLIR8ZqInKOqOwBEpDNudqzYYbxrM26REYdIo764Htc3qGqZX29H8IxwqGrvFNgRmcCm\njVtNPkJwCKaIyAj1o7KKyNeIP3dDKpiIG0X3XK0+vHqJiDx8sDqAZFgNop40lpTUVCAiP8RNnHM+\nbg7pN4Cfa5y5GxJlcBnhkah5mxOVRW1r0F7sXt9zVE2UswG4TFVn1UPnHNxkQZGZ8nrgem5XkuIQ\njg9/3q2qgZ0LmzJWg6gjMSmpV4pIk4hJRqOqj4mbZ/ffuNE6f6Sqn8XKhczgMsITtsd1qnqxPwrc\nqH4udBE5GXgM1xZVV9JWC/cN7wcklNXYMAdRdxpbSmqDETOMg+BqD1OBET5MENtYWZtxi4zkhO7p\nT2p6sbeKOAcAVf3Qt0PVmQwI4YQaV6ypYQ6ijmTADZ1OYodxeD1OOVDrcYuMJKjqsyJSQlWP6/MS\n9FTeqaqVIlLhh7FYR/37FizyHduiZ75L1BjeGAg1rlhTwxyEUWuCslX8eDqt9cCPodMk8Q4hzDAc\nJSJSgAsBleJ6sX9Rz8NfhZu74VWcg5oA
XFFPnWnDt0FMV9V7021LpmGN1Ead8YOb/RiX3luKn0zF\nagyZiYj0ogF6sYtIMXArGThFZl0RkfGamtniGjXmIIw6ExlzSkS+j2uAvgnXztBoXxQHG5KC+cBF\nZB4BU2Q25rCruLnl86k5FtTktBmVAViIyagPzUSkGXAu8DdV3Ssi9sWRAUhq5wNfr1VzQxwsRDKw\nokdAUKq3STQ5zEEY9eER3Dzb04AJflRKa4PIDH4E/DfOGZTi069xI6/+rZ66w84S12iw8FIwFmIy\nGhQRybFU1sxBRH4L3KeqW33m0XDgf+sTOhGR53GzxM0iapa4oKE+GhMi8i3cLI/RHQoP+jHVEmE1\nCKNeBD1UBM+dbaSHC1T19yJyPPANXB+Kh6jDcN9RZOwUmXVFRB7GheROwfUTuYD6Z3s1erLSbYDR\nePEP1XeB63AhjAuBwMlPjLQRGUDyW8DDqjoayK2nzkl+5ruDieNU9TJgs0/jPhbXAbRJYw7CqA/2\nUGU+K/3w3BcBb/nh6ev73B+P63k8T0Smi8gMEWnsE0BFBo4sF5EuwF6gwQc6bGxYiMmoD7EP1Ubs\noco0LsKNc3SPqpaJSCFumtj6cDCOXjzWdyj8E65RH+o/JEmjxxqpjTrjGz0fwKUCPuiLH1fV36TP\nKsOoPSLSEjc75AlUzWj3kKruSqthacYchFFn7KEyDhb8jHbbcDPJQRpntMskzEEYdcYeKuNgId0z\n2mUq1gZh1IfDYx6g8SIyLW3WGEbdSfeMdhmJOQijPthDZTRqoib+agZcJiLL/HpPwo2We1BjISaj\n1sQ8VJFpIvc/VKo6OI3mGUZo/PAwcWnMAxA2BOYgjFpjD5VhNA3MQRiGYRiBWE9qwzAMIxBzEIZh\nGEYg5iAMwyMit4rILD++0FSflZWqY33op+40jIzF0lwNAxCRY4FvA8NVdbeIdKT+o54aRqPGahCG\n4SgENqjqbgBV3aCqq0TktyLypYjMFJFHRURgfw3gXhGZICJzRORoEXlNRBaIyB+8TC8RmSsiz/ha\nyb9EJC/2wCLyTRGZKCKTReQVEWnty+8Skdl+33sO4LUwDMAchGFEeBfoLiLzReTvInKSL/+bqh7t\n+3a0xNUyIuxR1ROBh4HRwDXAYOAKEengZQ4HHlXVIbjpWH8afVBfU/k18F+qOhwoAW4UkfbAd4BB\nft8/pOCcDSMh5iAMA1DV7UARcDWwHnhJRK4AThGRz33nwFNxs+dFeMP/nQHMUtXVvgayiKp5MZar\naqR3+fO4uRSiGQEMBD4VkanA5bgOh1uBXcDjInIeUN5gJ2sYIbE2CMPwqOo+4EPgQ+8QfgQMAYpV\ndbmI3E71qVV3+7+VUb8j65FnK7ajUey6AO+p6qhYe0TkGOA04GLgWpyDMowDhtUgDAMQkcNFpF9U\n0TBgnv+94f+3d8cmCgZBAIXfZCLCdSAGYhO2cfm1oLUYWIKxIFZgIFjABWdgbiKYCmuwExyy0S8K\nwvvShd3NhpmBmewLfHe4epgNcKjTbncP53tgGhHj/Ec/Iib53lcpZQvM8j/SW5lBSNUAWORWsRtw\npJabLtQS0gk4dLj3F/jJtZ9/wPL/YSnlnKWsVa4DhdqTuALriOhRs4x5h7elpzhqQ3qRiBgBG4cX\n6lNZYpIkNZlBSJKazCAkSU0GCElSkwFCktRkgJAkNRkgJElNBghJUtMd8WwVR37D9gAAAAAASUVO\nRK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "lemma_fdist = nltk.FreqDist(lemma_text)\n", "\n", "lemma_fdist.plot(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 
Part-of-speech (POS) tagging\n", "POS tagging is the process of labelling each word in a text with its part of speech, based on both the word's definition and its context.\n", "\n", "POS tagging is tricky because the same word can have a different POS depending on the context. In the sentence below, *buffalo* appears as a proper noun (the city), a common noun (the animal), and a verb (to bully).\n", "\n", "\"Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.\"" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "$: dollar\n", " $ -$ --$ A$ C$ HK$ M$ NZ$ S$ U.S.$ US$\n", "'': closing quotation mark\n", " ' ''\n", "(: opening parenthesis\n", " ( [ {\n", "): closing parenthesis\n", " ) ] }\n", ",: comma\n", " ,\n", "--: dash\n", " --\n", ".: sentence terminator\n", " . ! ?\n", ":: colon or ellipsis\n", " : ; ...\n", "CC: conjunction, coordinating\n", " & 'n and both but either et for less minus neither nor or plus so\n", " therefore times v. versus vs. whether yet\n", "CD: numeral, cardinal\n", " mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one forty-\n", " seven 1987 twenty '79 zero two 78-degrees eighty-four IX '60s .025\n", " fifteen 271,124 dozen quintillion DM2,000 ...\n", "DT: determiner\n", " all an another any both del each either every half la many much nary\n", " neither no some such that the them these this those\n", "EX: existential there\n", " there\n", "FW: foreign word\n", " gemeinschaft hund ich jeux habeas Haementeria Herr K'ang-si vous\n", " lutihaw alai je jour objets salutaris fille quibusdam pas trop Monte\n", " terram fiche oui corporis ...\n", "IN: preposition or conjunction, subordinating\n", " astride among uppon whether out inside pro despite on by throughout\n", " below within for towards near behind atop around if like until below\n", " next into if beside ...\n", "JJ: adjective or numeral, ordinal\n", " third ill-mannered pre-war regrettable oiled calamitous first separable\n", " ectoplasmic battery-powered participatory fourth 
still-to-be-named\n", " multilingual multi-disciplinary ...\n", "JJR: adjective, comparative\n", " bleaker braver breezier briefer brighter brisker broader bumper busier\n", " calmer cheaper choosier cleaner clearer closer colder commoner costlier\n", " cozier creamier crunchier cuter ...\n", "JJS: adjective, superlative\n", " calmest cheapest choicest classiest cleanest clearest closest commonest\n", " corniest costliest crassest creepiest crudest cutest darkest deadliest\n", " dearest deepest densest dinkiest ...\n", "LS: list item marker\n", " A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002 SP-44005\n", " SP-44007 Second Third Three Two * a b c d first five four one six three\n", " two\n", "MD: modal auxiliary\n", " can cannot could couldn't dare may might must need ought shall should\n", " shouldn't will would\n", "NN: noun, common, singular or mass\n", " common-carrier cabbage knuckle-duster Casino afghan shed thermostat\n", " investment slide humour falloff slick wind hyena override subhumanity\n", " machinist ...\n", "NNP: noun, proper, singular\n", " Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos\n", " Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA\n", " Shannon A.K.C. 
Meltex Liverpool ...\n", "NNPS: noun, proper, plural\n", " Americans Americas Amharas Amityvilles Amusements Anarcho-Syndicalists\n", " Andalusians Andes Andruses Angels Animals Anthony Antilles Antiques\n", " Apache Apaches Apocrypha ...\n", "NNS: noun, common, plural\n", " undergraduates scotches bric-a-brac products bodyguards facets coasts\n", " divestitures storehouses designs clubs fragrances averages\n", " subjectivists apprehensions muses factory-jobs ...\n", "PDT: pre-determiner\n", " all both half many quite such sure this\n", "POS: genitive marker\n", " ' 's\n", "PRP: pronoun, personal\n", " hers herself him himself hisself it itself me myself one oneself ours\n", " ourselves ownself self she thee theirs them themselves they thou thy us\n", "PRP$: pronoun, possessive\n", " her his mine my our ours their thy your\n", "RB: adverb\n", " occasionally unabatingly maddeningly adventurously professedly\n", " stirringly prominently technologically magisterially predominately\n", " swiftly fiscally pitilessly ...\n", "RBR: adverb, comparative\n", " further gloomier grander graver greater grimmer harder harsher\n", " healthier heavier higher however larger later leaner lengthier less-\n", " perfectly lesser lonelier longer louder lower more ...\n", "RBS: adverb, superlative\n", " best biggest bluntest earliest farthest first furthest hardest\n", " heartiest highest largest least less most nearest second tightest worst\n", "RP: particle\n", " aboard about across along apart around aside at away back before behind\n", " by crop down ever fast for forth from go high i.e. in into just later\n", " low more off on open out over per pie raising start teeth that through\n", " under unto up up-pp upon whole with you\n", "SYM: symbol\n", " % & ' '' ''. ) ). * + ,. 
< = > @ A[fj] U.S U.S.S.R * ** ***\n", "TO: \"to\" as preposition or infinitive marker\n", " to\n", "UH: interjection\n", " Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey Kee-reist Oops amen\n", " huh howdy uh dammit whammo shucks heck anyways whodunnit honey golly\n", " man baby diddle hush sonuvabitch ...\n", "VB: verb, base form\n", " ask assemble assess assign assume atone attention avoid bake balkanize\n", " bank begin behold believe bend benefit bevel beware bless boil bomb\n", " boost brace break bring broil brush build ...\n", "VBD: verb, past tense\n", " dipped pleaded swiped regummed soaked tidied convened halted registered\n", " cushioned exacted snubbed strode aimed adopted belied figgered\n", " speculated wore appreciated contemplated ...\n", "VBG: verb, present participle or gerund\n", " telegraphing stirring focusing angering judging stalling lactating\n", " hankerin' alleging veering capping approaching traveling besieging\n", " encrypting interrupting erasing wincing ...\n", "VBN: verb, past participle\n", " multihulled dilapidated aerosolized chaired languished panelized used\n", " experimented flourished imitated reunifed factored condensed sheared\n", " unsettled primed dubbed desired ...\n", "VBP: verb, present tense, not 3rd person singular\n", " predominate wrap resort sue twist spill cure lengthen brush terminate\n", " appear tend stray glisten obtain comprise detest tease attract\n", " emphasize mold postpone sever return wag ...\n", "VBZ: verb, present tense, 3rd person singular\n", " bases reconstructs marks mixes displeases seals carps weaves snatches\n", " slumps stretches authorizes smolders pictures emerges stockpiles\n", " seduces fizzes uses bolsters slaps speaks pleads ...\n", "WDT: WH-determiner\n", " that what whatever which whichever\n", "WP: WH-pronoun\n", " that what whatever whatsoever which who whom whosoever\n", "WP$: WH-pronoun, possessive\n", " whose\n", "WRB: Wh-adverb\n", " how however whence whenever where whereby 
whereever wherein whereof why\n", "``: opening quotation mark\n", " ` ``\n" ] } ], "source": [ "nltk.help.upenn_tagset()" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'\"Do none suggest themselves? You know my methods. Apply them!\" \"I can only think of the obvious conclusion that the man has practised in town before going to the country.\" \"I think that we might venture a little farther than this. Look at it in this light. On what occasion would it be most probable that such a presentation would be made? When would his friends unite to give him a pledge of their good will? Obviously at the moment when Dr. Mortimer withdrew from the service of the hospital in order to start in practice for himself. We know there has been a presentation. We believe there has been a change from a town hospital to a country practice. Is it, then, stretching our inference too far to say that the presentation was on the occasion of the change?\" \"It certainly seems probable.\" \"Now, you will observe that he could not have been on the staff of the hospital, since only a man well-established in a London practice could hold such a position, and such a one would not drift into the country. What was he, then? If he was in the hospital and yet not on the staff he could only have been a house-surgeon or a house-physician--little more than a senior student. And he left five years ago--the date is on the stick. 
So'" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "snippet" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('You', 'PRP'),\n", " ('know', 'VBP'),\n", " ('my', 'PRP$'),\n", " ('methods', 'NNS'),\n", " ('.', '.')]" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nltk.pos_tag(word_tokenize(sent_tokenize(snippet)[1]))" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def process_POS(text):\n", "    # Tag each sentence separately; returns one list of (word, tag) pairs per sentence.\n", "    sentences = sent_tokenize(text)\n", "    tagged_words = []\n", "    for sentence in sentences:\n", "        words = word_tokenize(sentence)\n", "        tagged = nltk.pos_tag(words)\n", "        tagged_words.append(tagged)\n", "    return tagged_words" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "collapsed": true }, "outputs": [], "source": [ "tagged_sentences = process_POS(snippet)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[('``', '``'),\n", " ('Do', 'VBP'),\n", " ('none', 'RB'),\n", " ('suggest', 'VB'),\n", " ('themselves', 'PRP'),\n", " ('?', '.')],\n", " [('You', 'PRP'),\n", " ('know', 'VBP'),\n", " ('my', 'PRP$'),\n", " ('methods', 'NNS'),\n", " ('.', '.')],\n", " [('Apply', 'VB'), ('them', 'PRP'), ('!', '.'), (\"''\", \"''\")],\n", " [('``', '``'),\n", " ('I', 'PRP'),\n", " ('can', 'MD'),\n", " ('only', 'RB'),\n", " ('think', 'VB'),\n", " ('of', 'IN'),\n", " ('the', 'DT'),\n", " ('obvious', 'JJ'),\n", " ('conclusion', 'NN'),\n", " ('that', 'IN'),\n", " ('the', 'DT'),\n", " ('man', 'NN'),\n", " ('has', 'VBZ'),\n", " ('practised', 'VBN'),\n", " ('in', 'IN'),\n", " ('town', 'NN'),\n", " ('before', 'IN'),\n", " ('going', 'VBG'),\n", " ('to', 'TO'),\n", " ('the', 'DT'),\n", " ('country', 'NN'),\n", " ('.', '.'),\n", " (\"''\", \"''\")],\n", " [('``', '``'),\n", " ('I', 'PRP'),\n", " ('think', 
'VBP'),\n", " ('that', 'IN'),\n", " ('we', 'PRP'),\n", " ('might', 'MD'),\n", " ('venture', 'NN'),\n", " ('a', 'DT'),\n", " ('little', 'RB'),\n", " ('farther', 'JJR'),\n", " ('than', 'IN'),\n", " ('this', 'DT'),\n", " ('.', '.')],\n", " [('Look', 'NN'),\n", " ('at', 'IN'),\n", " ('it', 'PRP'),\n", " ('in', 'IN'),\n", " ('this', 'DT'),\n", " ('light', 'NN'),\n", " ('.', '.')],\n", " [('On', 'IN'),\n", " ('what', 'WP'),\n", " ('occasion', 'NN'),\n", " ('would', 'MD'),\n", " ('it', 'PRP'),\n", " ('be', 'VB'),\n", " ('most', 'RBS'),\n", " ('probable', 'JJ'),\n", " ('that', 'IN'),\n", " ('such', 'PDT'),\n", " ('a', 'DT'),\n", " ('presentation', 'NN'),\n", " ('would', 'MD'),\n", " ('be', 'VB'),\n", " ('made', 'VBN'),\n", " ('?', '.')],\n", " [('When', 'WRB'),\n", " ('would', 'MD'),\n", " ('his', 'PRP$'),\n", " ('friends', 'NNS'),\n", " ('unite', 'JJ'),\n", " ('to', 'TO'),\n", " ('give', 'VB'),\n", " ('him', 'PRP'),\n", " ('a', 'DT'),\n", " ('pledge', 'NN'),\n", " ('of', 'IN'),\n", " ('their', 'PRP$'),\n", " ('good', 'NN'),\n", " ('will', 'MD'),\n", " ('?', '.')],\n", " [('Obviously', 'RB'),\n", " ('at', 'IN'),\n", " ('the', 'DT'),\n", " ('moment', 'NN'),\n", " ('when', 'WRB'),\n", " ('Dr.', 'NNP'),\n", " ('Mortimer', 'NNP'),\n", " ('withdrew', 'VBD'),\n", " ('from', 'IN'),\n", " ('the', 'DT'),\n", " ('service', 'NN'),\n", " ('of', 'IN'),\n", " ('the', 'DT'),\n", " ('hospital', 'NN'),\n", " ('in', 'IN'),\n", " ('order', 'NN'),\n", " ('to', 'TO'),\n", " ('start', 'VB'),\n", " ('in', 'IN'),\n", " ('practice', 'NN'),\n", " ('for', 'IN'),\n", " ('himself', 'PRP'),\n", " ('.', '.')],\n", " [('We', 'PRP'),\n", " ('know', 'VBP'),\n", " ('there', 'EX'),\n", " ('has', 'VBZ'),\n", " ('been', 'VBN'),\n", " ('a', 'DT'),\n", " ('presentation', 'NN'),\n", " ('.', '.')],\n", " [('We', 'PRP'),\n", " ('believe', 'VBP'),\n", " ('there', 'EX'),\n", " ('has', 'VBZ'),\n", " ('been', 'VBN'),\n", " ('a', 'DT'),\n", " ('change', 'NN'),\n", " ('from', 'IN'),\n", " ('a', 'DT'),\n", " ('town', 
'NN'),\n", " ('hospital', 'NN'),\n", " ('to', 'TO'),\n", " ('a', 'DT'),\n", " ('country', 'NN'),\n", " ('practice', 'NN'),\n", " ('.', '.')],\n", " [('Is', 'VBZ'),\n", " ('it', 'PRP'),\n", " (',', ','),\n", " ('then', 'RB'),\n", " (',', ','),\n", " ('stretching', 'VBG'),\n", " ('our', 'PRP$'),\n", " ('inference', 'NN'),\n", " ('too', 'RB'),\n", " ('far', 'RB'),\n", " ('to', 'TO'),\n", " ('say', 'VB'),\n", " ('that', 'IN'),\n", " ('the', 'DT'),\n", " ('presentation', 'NN'),\n", " ('was', 'VBD'),\n", " ('on', 'IN'),\n", " ('the', 'DT'),\n", " ('occasion', 'NN'),\n", " ('of', 'IN'),\n", " ('the', 'DT'),\n", " ('change', 'NN'),\n", " ('?', '.'),\n", " (\"''\", \"''\")],\n", " [('``', '``'),\n", " ('It', 'PRP'),\n", " ('certainly', 'RB'),\n", " ('seems', 'VBZ'),\n", " ('probable', 'JJ'),\n", " ('.', '.'),\n", " (\"''\", \"''\")],\n", " [('``', '``'),\n", " ('Now', 'RB'),\n", " (',', ','),\n", " ('you', 'PRP'),\n", " ('will', 'MD'),\n", " ('observe', 'VB'),\n", " ('that', 'IN'),\n", " ('he', 'PRP'),\n", " ('could', 'MD'),\n", " ('not', 'RB'),\n", " ('have', 'VB'),\n", " ('been', 'VBN'),\n", " ('on', 'IN'),\n", " ('the', 'DT'),\n", " ('staff', 'NN'),\n", " ('of', 'IN'),\n", " ('the', 'DT'),\n", " ('hospital', 'NN'),\n", " (',', ','),\n", " ('since', 'IN'),\n", " ('only', 'RB'),\n", " ('a', 'DT'),\n", " ('man', 'NN'),\n", " ('well-established', 'JJ'),\n", " ('in', 'IN'),\n", " ('a', 'DT'),\n", " ('London', 'NNP'),\n", " ('practice', 'NN'),\n", " ('could', 'MD'),\n", " ('hold', 'VB'),\n", " ('such', 'PDT'),\n", " ('a', 'DT'),\n", " ('position', 'NN'),\n", " (',', ','),\n", " ('and', 'CC'),\n", " ('such', 'PDT'),\n", " ('a', 'DT'),\n", " ('one', 'NN'),\n", " ('would', 'MD'),\n", " ('not', 'RB'),\n", " ('drift', 'VB'),\n", " ('into', 'IN'),\n", " ('the', 'DT'),\n", " ('country', 'NN'),\n", " ('.', '.')],\n", " [('What', 'WP'),\n", " ('was', 'VBD'),\n", " ('he', 'PRP'),\n", " (',', ','),\n", " ('then', 'RB'),\n", " ('?', '.')],\n", " [('If', 'IN'),\n", " ('he', 'PRP'),\n", " 
('was', 'VBD'),\n", " ('in', 'IN'),\n", " ('the', 'DT'),\n", " ('hospital', 'NN'),\n", " ('and', 'CC'),\n", " ('yet', 'RB'),\n", " ('not', 'RB'),\n", " ('on', 'IN'),\n", " ('the', 'DT'),\n", " ('staff', 'NN'),\n", " ('he', 'PRP'),\n", " ('could', 'MD'),\n", " ('only', 'RB'),\n", " ('have', 'VB'),\n", " ('been', 'VBN'),\n", " ('a', 'DT'),\n", " ('house-surgeon', 'NN'),\n", " ('or', 'CC'),\n", " ('a', 'DT'),\n", " ('house-physician', 'JJ'),\n", " ('--', ':'),\n", " ('little', 'RB'),\n", " ('more', 'JJR'),\n", " ('than', 'IN'),\n", " ('a', 'DT'),\n", " ('senior', 'JJ'),\n", " ('student', 'NN'),\n", " ('.', '.')],\n", " [('And', 'CC'),\n", " ('he', 'PRP'),\n", " ('left', 'VBD'),\n", " ('five', 'CD'),\n", " ('years', 'NNS'),\n", " ('ago', 'RB'),\n", " ('--', ':'),\n", " ('the', 'DT'),\n", " ('date', 'NN'),\n", " ('is', 'VBZ'),\n", " ('on', 'IN'),\n", " ('the', 'DT'),\n", " ('stick', 'NN'),\n", " ('.', '.')],\n", " [('So', 'RB')]]" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tagged_sentences" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('``', '``'), ('Do', 'VBP'), ('none', 'RB'), ('suggest', 'VB'), ('themselves', 'PRP'), ('?', '.')]\n", "[('You', 'PRP'), ('know', 'VBP'), ('my', 'PRP$'), ('methods', 'NNS'), ('.', '.')]\n", "[('Apply', 'VB'), ('them', 'PRP'), ('!', '.'), (\"''\", \"''\")]\n", "[('``', '``'), ('I', 'PRP'), ('can', 'MD'), ('only', 'RB'), ('think', 'VB'), ('of', 'IN'), ('the', 'DT'), ('obvious', 'JJ'), ('conclusion', 'NN'), ('that', 'IN'), ('the', 'DT'), ('man', 'NN'), ('has', 'VBZ'), ('practised', 'VBN'), ('in', 'IN'), ('town', 'NN'), ('before', 'IN'), ('going', 'VBG'), ('to', 'TO'), ('the', 'DT'), ('country', 'NN'), ('.', '.'), (\"''\", \"''\")]\n", "[('``', '``'), ('I', 'PRP'), ('think', 'VBP'), ('that', 'IN'), ('we', 'PRP'), ('might', 'MD'), ('venture', 'NN'), ('a', 'DT'), ('little', 'RB'), ('farther', 'JJR'), 
('than', 'IN'), ('this', 'DT'), ('.', '.')]\n" ] } ], "source": [ "sentences = []\n", "for sentence in tagged_sentences[:5]:\n", "    print(sentence)\n", "    lemmas = []\n", "    for word, pos in sentence:\n", "        # Map Penn Treebank tags to WordNet POS so the lemmatizer picks the right form.\n", "        if pos.startswith('VB'):\n", "            lemmas.append(lemmatizer.lemmatize(word, 'v'))\n", "        elif pos in ['NN', 'NNS']:\n", "            lemmas.append(lemmatizer.lemmatize(word, 'n'))\n", "        else:\n", "            lemmas.append(lemmatizer.lemmatize(word))\n", "    sentences.append(lemmas)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# ngrams" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from nltk import ngrams\n", "from collections import Counter" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bigrams = Counter(ngrams(word_tokenize(whole_text), 2))" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(\"''\", '``')\t859\n", "(',', 'and')\t678\n", "('.', \"''\")\t671\n", "('.', '``')\t475\n", "('of', 'the')\t461\n", "('?', \"''\")\t399\n", "('.', 'I')\t253\n", "('in', 'the')\t251\n", "(',', 'but')\t211\n", "(',', \"''\")\t186\n", "('``', 'I')\t183\n", "(\"''\", 'said')\t175\n", "('.', 'The')\t157\n", "('Sir', 'Henry')\t153\n", "('I', 'have')\t147\n", "('to', 'the')\t144\n", "('.', 'It')\t137\n", "('the', 'moor')\t135\n", "('upon', 'the')\t132\n", "('that', 'he')\t130\n", "(',', 'I')\t129\n", "('that', 'I')\t127\n", "('.', 'He')\t127\n", "('and', 'the')\t120\n", "('at', 'the')\t109\n", "('it', '.')\t109\n", "(',', 'the')\t105\n", "('it', 'was')\t105\n", "('and', 'I')\t104\n", "('I', 'had')\t102\n" ] } ], "source": [ "for phrase, freq in bigrams.most_common(30):\n", "    print(\"{}\\t{}\".format(phrase, freq))" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "collapsed": true }, "outputs": [], "source": [ "trigrams = Counter(ngrams(word_tokenize(whole_text), 3))" ] 
}, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('.', \"''\", '``')\t508\n", "('?', \"''\", '``')\t316\n", "(',', \"''\", 'said')\t150\n", "(\"''\", '``', 'I')\t107\n", "(',', 'and', 'I')\t77\n", "('.', '``', 'I')\t66\n", "(',', 'and', 'the')\t57\n", "(',', 'sir', ',')\t57\n", "(\"''\", 'said', 'he')\t55\n", "('the', 'moor', '.')\t52\n", "('.', 'It', 'was')\t51\n", "(\"''\", '``', 'No')\t50\n", "(\"''\", '``', 'Yes')\t49\n", "('``', 'No', ',')\t46\n", "(\"''\", '``', 'And')\t44\n", "('upon', 'the', 'moor')\t44\n", "('``', 'Well', ',')\t43\n", "(',', 'and', 'that')\t43\n", "('``', 'Yes', ',')\t43\n", "(',', 'and', 'he')\t41\n", "(\"''\", 'said', 'Holmes')\t39\n", "('it', '.', \"''\")\t38\n", "(',', 'but', 'I')\t36\n", "(\"''\", '``', 'Well')\t34\n", "('.', 'It', 'is')\t34\n", "('.', '``', 'It')\t33\n", "('said', 'he', '.')\t32\n", "(\"''\", '``', 'But')\t32\n", "(',', 'Watson', ',')\t31\n", "('he', '.', '``')\t31\n" ] } ], "source": [ "for phrase, freq in trigrams.most_common(30):\n", " print(\"{}\\t{}\".format(phrase, freq))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Stopwords and punctuation will have an effect on ngrams!" 
] }, { "cell_type": "code", "execution_count": 78, "metadata": { "collapsed": true }, "outputs": [], "source": [ "stemmed = stem_process(whole_text)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "collapsed": true }, "outputs": [], "source": [ "stemmed_bigrams = Counter(ngrams(stemmed, 2))" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "[(('dr.', 'mortim'), 73),\n", " (('project', 'gutenberg-tm'), 57),\n", " (('sherlock', 'holm'), 34),\n", " (('baskervil', 'hall'), 31),\n", " (('project', 'gutenberg'), 30),\n", " (('dr.', 'watson'), 28),\n", " (('electron', 'work'), 27),\n", " (('henri', 'baskervil'), 25),\n", " (('mr.', 'holm'), 24),\n", " (('coomb', 'tracey'), 18),\n", " (('gutenberg-tm', 'electron'), 18),\n", " (('merripit', 'hous'), 15),\n", " (('charl', 'baskervil'), 15),\n", " (('mr.', 'sherlock'), 14),\n", " (('baker', 'street'), 14),\n", " (('grimpen', 'mire'), 14),\n", " (('gutenberg', 'literari'), 13),\n", " (('literari', 'archiv'), 13),\n", " (('archiv', 'foundat'), 13),\n", " (('hound', 'baskervil'), 12)]" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stemmed_bigrams.most_common(20)" ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "collapsed": true }, "outputs": [], "source": [ "stemmed_trigrams = Counter(ngrams(stemmed, 3))" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(('project', 'gutenberg-tm', 'electron'), 18),\n", " (('gutenberg-tm', 'electron', 'work'), 18),\n", " (('mr.', 'sherlock', 'holm'), 14),\n", " (('project', 'gutenberg', 'literari'), 13),\n", " (('gutenberg', 'literari', 'archiv'), 13),\n", " (('literari', 'archiv', 'foundat'), 13),\n", " (('mrs.', 'laura', 'lyon'), 10),\n", " (('project', 'gutenberg-tm', 'work'), 10),\n", " (('distribut', 'project', 'gutenberg-tm'), 9),\n", " (('great', 'grimpen', 'mire'), 
8),\n", " (('project', 'gutenberg-tm', 'licens'), 8),\n", " (('full', 'project', 'gutenberg-tm'), 6),\n", " (('copi', 'project', 'gutenberg-tm'), 5),\n", " (('arthur', 'conan', 'doyl'), 4),\n", " (('dr.', 'jame', 'mortim'), 4),\n", " (('death', 'charl', 'baskervil'), 4),\n", " (('frankland', 'lafter', 'hall'), 4),\n", " (('dr.', 'mortim', 'look'), 4),\n", " (('phrase', 'project', 'gutenberg'), 4),\n", " (('set', 'forth', 'paragraph'), 4)]" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stemmed_trigrams.most_common(20)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bag of words (BOW) text representation for machine learning" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0 0 0 1 2 1 2 1 1 1]\n", " [1 1 1 1 1 0 0 1 0 1]]\n", "['also', 'football', 'games', 'john', 'likes', 'mary', 'movies', 'to', 'too', 'watch']\n" ] } ], "source": [ "from sklearn.feature_extraction.text import CountVectorizer\n", "vectorizer = CountVectorizer()\n", "data_corpus = [\"John likes to watch movies. Mary likes movies too.\", \n", "\"John also likes to watch football games.\"]\n", "X = vectorizer.fit_transform(data_corpus) \n", "print(X.toarray())\n", "print(vectorizer.get_feature_names())" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/derek/anaconda3/envs/nlp36/lib/python3.6/site-packages/nltk/twitter/__init__.py:20: UserWarning: The twython library has not been installed. Some functionality from the twitter package will not be available.\n", " warnings.warn(\"The twython library has not been installed. 
\"\n" ] } ], "source": [ "from nltk.sentiment.vader import SentimentIntensityAnalyzer" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "collapsed": true }, "outputs": [], "source": [ "vader = SentimentIntensityAnalyzer()" ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "collapsed": true }, "outputs": [], "source": [ "text = \"I dont hate movies!\"" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'compound': 0.509, 'neg': 0.0, 'neu': 0.378, 'pos': 0.622}" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vader.polarity_scores(text)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }