{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting Graphical" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook introduces graphical output for text processing. It's part of [The Art of Literary Text Analysis](ArtOfLiteraryTextAnalysis.ipynb) and assumes that you've already worked through previous notebooks ([Getting Setup](GettingSetup.ipynb), [Getting Started](GettingStarted.ipynb), [Getting Texts](GettingTexts.ipynb) and [Getting NLTK](GettingNltk.ipynb)). In this notebook we'll look in particular at:\n", "\n", "* [Plotting high frequency terms](#Plotting-Word-Frequency)\n", "* [Plotting a characteristic curve of word lengths](#The-Characteristic-Curve-of-Word-Lengths)\n", "* [Plotting a distribution graph of terms](#Graphing-Distribution)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graphing in Jupyter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Anaconda bundle already comes with the Matplotlib library, so there's nothing further to install.\n", "\n", "However, there's a very important step needed when graphing with IPython: you need to instruct the kernel to produce graphs inline the first time you generate a graph in a notebook. That's accomplished with this code:\n", "\n", "> %matplotlib inline\n", "\n", "If you ever forget to do that, your notebook might become unresponsive and you'll need to shut down the kernel and start again. That's not a big deal, but it's best to avoid it.\n", "\n", "We can test simple graphing in a new notebook (let's call it `GettingGraphical`) to make sure that everything is working. The syntax below also shows how we can create a shorthand name for a library so that instead of always writing ```matplotlib.pyplot``` we can simply write ```plt```." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEACAYAAACpoOGTAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEYNJREFUeJzt3H+MHHd5x/H3BzsRIARuaGUnsVEocVSHCkhaXEstzQKN\ndBhqIyERWarCD6lEbQOopeCESOX6FwQqSKMIiCAg04JclCJkSiBxEdv+U0ICwRRiE5tiGgfFQaAg\nEakikZ/+cYPZLN+z7272vJfz+yWtPDPfZ2aer8a+z83srlNVSJI07mnTbkCStDIZEJKkJgNCktRk\nQEiSmgwISVKTASFJauodEElmkhxKcjjJ7nlqbu7GDyS5bGT7uiS3JzmY5P4k2/r2I0majF4BkWQN\ncAswA1wK7EqyZaxmO3BxVW0G3gJ8ZGT4H4E7qmoL8CLgYJ9+JEmT0/cOYitwpKqOVtXjwF5g51jN\nDmAPQFXdDaxLsj7Jc4CXVdUnurEnqupnPfuRJE1I34C4EHhwZP1Yt+10NRuB5wM/TvLJJN9M8rEk\nz+zZjyRpQvoGxEL/n4409lsLXA58uKouBx4DruvZjyRpQtb23P8hYNPI+ibm7hBOVbOx2xbgWFXd\n022/nUZAJPE/i5KkJaiq8V/OF6XvHcS9wOYkFyU5F7gK2DdWsw+4GqD7lNKjVXW8qh4GHkxySVf3\nJ8B3WyepqlX7es973jP1Hpyf8zvb5nY2zG8Set1BVNUTSa4F7gTWALdV1cEk13Tjt1bVHUm2JznC\n3GOkN40c4q3Ap7tw+f7YmCRpivo+YqKqvgR8aWzbrWPr186z7wHgpX17kCRNnt+knrLBYDDtFpaV\n83vqWs1zg9U/v0nIpJ5VLZcktdJ7lKSVJgk15TepJUmrlAEhSWoyICRJTQaEJKnJgJAkNRkQkqQm\nA0KS1GRASJKaDAhJUpMBIUlqMiAkSU0GhCSpyYCQJDUZEJKkJgNCktRkQEiSmgwISVKTASFJajIg\nJElNBoQkqcmAkCQ1GRCSpCYDQpLUZEBIkpoMCElSkwEhSWrqHRBJZpIcSnI4ye55am7uxg8kuWxs\nbE2S+5J8oW8vkqTJ6RUQSdYAtwAzwKXAriRbxmq2AxdX1WbgLcBHxg7zduB+oPr0IkmarL53EFuB\nI1V1tKoeB/YCO8dqdgB7AKrqbmBdkvUASTYC24GPA+nZiyRpgvoGxIXAgyPrx7ptC635EPBO4ETP\nPiRJE9Y3IBb6WGj87iBJXgM8UlX3NcYlSVO2tuf+DwGbRtY3MXeHcKqajd221wE7uvcong48O8mn\nqurq8ZPMzs6eXB4MBgwGg55tS9LqMhwOGQ6HEz1mqpb+3nCStcD3gFcCPwK+DuyqqoMjNduBa6tq\ne5JtwE1VtW3sOFcAf1tVf9o4R/XpUZLORkmoql5PZ3rdQVTVE0muBe4E1gC3VdXBJNd047dW1R1J\ntic5AjwGvGm+w/XpRZI0Wb3uIM4E7yAkafEmcQfhN6klSU0GhCSpyYCQJDUZEJKkJgNCktRkQEiS\nmgwISVKTASFJajIgJElNBoQkqcmAkCQ1GRCSpCYDQpLUZEBIkpoMCElSkwEhSWoyICRJTQaEJKnJ\ngJAkNRkQkqQmA0KS1GRASJKaDAhJUpMBIUlqMiAkSU0GhCSpyYCQJDUZEJKkJgNCktTUOyCSzCQ5\nlORwkt3z1NzcjR9Iclm3bVOSryb5bpLvJHlb314kSZPTKyCSrAFuAWaAS4FdSbaM1WwH
Lq6qzcBb\ngI90Q48Df11VLwS2AX81vq8kaXr63kFsBY5U1dGqehzYC+wcq9kB7AGoqruBdUnWV9XDVfWtbvvP\ngYPABT37kSRNSN+AuBB4cGT9WLftdDUbRwuSXARcBtzdsx9J0oSs7bl/LbAu8+2X5FnA7cDbuzuJ\nXzM7O3tyeTAYMBgMFtWkJK12w+GQ4XA40WOmaqE/4xs7J9uA2aqa6davB05U1Y0jNR8FhlW1t1s/\nBFxRVceTnAP8G/ClqrppnnNUnx4l6WyUhKoa/+V8Ufo+YroX2JzkoiTnAlcB+8Zq9gFXw8lAebQL\nhwC3AffPFw6SpOnp9Yipqp5Ici1wJ7AGuK2qDia5phu/taruSLI9yRHgMeBN3e5/CPwZ8O0k93Xb\nrq+qL/fpSZI0Gb0eMZ0JPmKSpMVbCY+YJEmrlAEhSWoyICRJTQaEJKnJgJAkNRkQkqQmA0KS1GRA\nSJKaDAhJUpMBIUlqMiAkSU0GhCSpyYCQJDUZEJKkJgNCktRkQEiSmgwISVKTASFJajIgJElNBoQk\nqcmAkCQ1GRCSpCYDQpLUZEBIkpoMCElSkwEhSWoyICRJTQaEJKmpd0AkmUlyKMnhJLvnqbm5Gz+Q\n5LLF7CtJmo5eAZFkDXALMANcCuxKsmWsZjtwcVVtBt4CfGSh+0qSpqfvHcRW4EhVHa2qx4G9wM6x\nmh3AHoCquhtYl2TDAveVJE1J34C4EHhwZP1Yt20hNRcsYF9J0pSs7bl/LbAufU4yOzt7cnkwGDAY\nDPocTpJWneFwyHA4nOgxU7XQn/GNnZNtwGxVzXTr1wMnqurGkZqPAsOq2tutHwKuAJ5/un277dWn\nR0k6GyWhqnr9ct73EdO9wOYkFyU5F7gK2DdWsw+4Gk4GyqNVdXyB+0qSpqTXI6aqeiLJtcCdwBrg\ntqo6mOSabvzWqrojyfYkR4DHgDedat8+/UiSJqfXI6YzwUdMkrR4K+ERkyRplTIgJElNBoQkqcmA\nkCQ1GRCSpCYDQpLUZEBIkpoMCElSkwEhSWoyICRJTQaEJKnJgJAkNRkQkqQmA0KS1GRASJKaDAhJ\nUpMBIUlqMiAkSU0GhCSpyYCQJDUZEJKkJgNCktRkQEiSmgwISVKTASFJajIgJElNBoQkqcmAkCQ1\n9QqIJOcl2Z/kgSR3JVk3T91MkkNJDifZPbL9A0kOJjmQ5HNJntOnH0nS5PS9g7gO2F9VlwBf6daf\nJMka4BZgBrgU2JVkSzd8F/DCqnox8ABwfc9+JEkT0jcgdgB7uuU9wGsbNVuBI1V1tKoeB/YCOwGq\nan9Vnejq7gY29uxHkjQhfQNifVUd75aPA+sbNRcCD46sH+u2jXszcEfPfiRJE7L2dAVJ9gMbGkM3\njK5UVSWpRl1r2/g5bgB+UVWfaY3Pzs6eXB4MBgwGg9MdUpLOKsPhkOFwONFjpuq0P7/n3zk5BAyq\n6uEk5wNfrarfGavZBsxW1Uy3fj1woqpu7NbfCPw58Mqq+r/GOapPj5J0NkpCVaXPMfo+YtoHvKFb\nfgPw+UbNvcDmJBclORe4qtuPJDPAO4GdrXCQJE1P3zuI84DPAs8DjgKvr6pHk1wAfKyqXt3VvQq4\nCVgD3FZV7+22HwbOBX7aHfK/quovx87hHYQkLdIk7iB6BcSZYEBI0uKthEdMkqRVyoCQJDUZEJKk\nJgNCktRkQEiSmgwISVKTASFJajIgJElNBoQkqcmAkCQ1GRCSpCYDQpLUZEBIkpoMCElSkwEhSWoy\nICRJTQaEJKnJgJAkNRkQkqQmA0KS1GRASJKaDAhJUpMBIUlqMiAkSU0GhCSpyYCQJDUZEJKkJgNC\nktS05IBIcl6S/UkeSHJXknXz1M0kOZTkcJLdjfF3JDmR5Lyl9iJJmrw+dxDXAfur6hLgK936kyRZ\nA9wCzACXAruSbBkZ3wRcCfywRx+SpGXQJyB2AHu6
5T3Aaxs1W4EjVXW0qh4H9gI7R8Y/CLyrRw+S\npGXSJyDWV9Xxbvk4sL5RcyHw4Mj6sW4bSXYCx6rq2z16kCQtk7WnGkyyH9jQGLphdKWqKkk16lrb\nSPIM4N3MPV46ufnUrUqSzqRTBkRVXTnfWJLjSTZU1cNJzgceaZQ9BGwaWd/E3F3EC4CLgANJADYC\n30iytap+7Tizs7MnlweDAYPB4FRtS9JZZzgcMhwOJ3rMVDV/yT/9jsn7gZ9U1Y1JrgPWVdV1YzVr\nge8BrwR+BHwd2FVVB8fqfgD8XlX9tHGeWmqPknS2SkJV9Xoy0+c9iPcBVyZ5AHhFt06SC5J8EaCq\nngCuBe4E7gf+ZTwcOiaAJK0wS76DOFO8g5CkxZv2HYQkaRUzICRJTQaEJKnJgJAkNRkQkqQmA0KS\n1GRASJKaDAhJUpMBIUlqMiAkSU0GhCSpyYCQJDUZEJKkJgNCktRkQEiSmgwISVKTASFJajIgJElN\nBoQkqcmAkCQ1GRCSpCYDQpLUZEBIkpoMCElSkwEhSWoyICRJTQaEJKnJgJAkNS05IJKcl2R/kgeS\n3JVk3Tx1M0kOJTmcZPfY2FuTHEzynSQ3LrUXSdLk9bmDuA7YX1WXAF/p1p8kyRrgFmAGuBTYlWRL\nN/ZyYAfwoqr6XeAfevTylDUcDqfdwrJyfk9dq3lusPrnNwl9AmIHsKdb3gO8tlGzFThSVUer6nFg\nL7CzG/sL4L3ddqrqxz16ecpa7X9Jnd9T12qeG6z++U1Cn4BYX1XHu+XjwPpGzYXAgyPrx7ptAJuB\nP07ytSTDJL/foxdJ0oStPdVgkv3AhsbQDaMrVVVJqlHX2jZ67t+oqm1JXgp8Fvjt0/QrSTpTqmpJ\nL+AQsKFbPh841KjZBnx5ZP16YHe3/CXgipGxI8BzG8coX758+fK1+NdSf77/8nXKO4jT2Ae8Abix\n+/PzjZp7gc1JLgJ+BFwF7OrGPg+8AviPJJcA51bVT8YPUFXp0aMkaYnS/Za++B2T85h7LPQ84Cjw\n+qp6NMkFwMeq6tVd3auAm4A1wG1V9d5u+znAJ4CXAL8A3lFVw16zkSRNzJIDQpK0uq2Ib1Kv9i/d\nTWJ+3fg7kpzo7t5WjL7zS/KB7todSPK5JM85c923ne5adDU3d+MHkly2mH2nbanzS7IpyVeTfLf7\nt/a2M9v5wvS5ft3YmiT3JfnCmel44Xr+3VyX5Pbu39v9Sbad8mR938SYxAt4P/Cubnk38L5GzRrm\n3si+CDgH+BawpRt7ObAfOKdb/61pz2mS8+vGNwFfBn4AnDftOU34+l0JPK1bfl9r/zM8n1Nei65m\nO3BHt/wHwNcWuu+0Xz3ntwF4Sbf8LOB7q2l+I+N/A3wa2Dft+Uxybsx9Z+3N3fJa4DmnOt+KuINg\n9X/pru/8AD4IvGtZu1y6XvOrqv1VdaKruxvYuMz9ns7prgWMzLmq7gbWJdmwwH2nbanzW19VD1fV\nt7rtPwcOAhecudYXZMnzA0iykbkfsh8HVtqHZJY8t+7O/GVV9Ylu7Imq+tmpTrZSAmK1f+mu1/yS\n7ASOVdW3l7XLpet7/Ua9Gbhjsu0t2kJ6na/mggXsO21Lnd+Tgrv7dOJlzIX6StLn+gF8CHgncIKV\np8+1ez7w4ySfTPLNJB9L8sxTnazPx1wXZbV/6W655pfkGcC7mXsMc3LzUvtcqmW+fr88xw3AL6rq\nM0vrcmIW+smNlfbb5UItdX4n90vyLOB24O3dncRKstT5JclrgEeq6r4kg8m2NRF9rt1a4HLg2qq6\nJ8lNzP0fen8330HOWEBU1ZXzjSU5nmRDVT2c5HzgkUbZQ8w9h/+lTcwlI92fn+vOc0/3Ru5zq/G9\niuWyjPN7AXPPGw8kgbnfBL6RZGtVtY6zLJb5+pHkjczd1r9yMh33cspe56nZ2NWcs4B9p22p83sI\nTn5E/V+Bf66q
1vefpq3P/F4H7EiyHXg68Owkn6qqq5ex38XoM7cw9yTinm777TT+k9UnmfabLt2b\nJe/nV9+wvo72m5xrge8z98PyXJ78Juc1wN93y5cA/zvtOU1yfmN1K/VN6j7Xbwb4LvCb057LQq8F\nT34jcBu/ehN3QdfxKTy/AJ8CPjTteSzH/MZqrgC+MO35THJuwH8Cl3TLs8CNpzzftCfcNXoe8O/A\nA8BdwLpu+wXAF0fqXsXcpyaOANePbD8H+Cfgv4FvAINpz2mS8xs71v+w8gKi7/U7DPwQuK97fXgF\nzOnXemXuF5FrRmpu6cYPAJcv5jpO+7XU+QF/xNyz+W+NXK+Zac9nktdvZPwKVtinmCbwd/PFwD3d\n9s9xmk8x+UU5SVLTSvkUkyRphTEgJElNBoQkqcmAkCQ1GRCSpCYDQpLUZEBIkpoMCElS0/8DnxLK\nEH3JXNEAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "# make sure that graphs are embedded into our notebook output\n", "%matplotlib inline\n", "\n", "plt.plot() # create an empty graph" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wow, who knew such insights were possible with ipython, eh? :)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting Word Frequency" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The previous notebook on [Getting NLTK](GettingNltk.ipynb) explained the basics of tokenization, filtering, listing frequencies and concordances. 
If you need to recapitulate the essentials of the previous notebook, try running this:\n", "\n", "```python\n", "import urllib.request\n", "# retrieve Poe plain text value\n", "poeUrl = \"http://www.gutenberg.org/files/2147/2147-0.txt\"\n", "poeString = urllib.request.urlopen(poeUrl).read().decode()\n", "```\n", "\n", "And then this, in a separate cell so that we don't read repeatedly from Gutenberg:\n", "\n", "```python\n", "import os\n", "# isolate The Gold Bug\n", "start = poeString.find(\"THE GOLD-BUG\")\n", "end = poeString.find(\"FOUR BEASTS IN ONE\")\n", "goldBugString = poeString[start:end]\n", "# save the file locally\n", "directory = \"data\"\n", "if not os.path.exists(directory):\n", "    os.makedirs(directory)\n", "with open(\"data/goldBug.txt\", \"w\") as f:\n", "    f.write(goldBugString)\n", "```\n", "\n", "Let's pick up where we left off by (re)reading our _Gold Bug_ text, tokenizing, filtering to keep only words, calculating frequencies and showing a table of the top frequency words. We'd previously created a filtered list that removed stop-words (very common syntactic words that don't carry much meaning), but for now we'll keep all words." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " the of and i to a in it you was that with as for had at he this but we \n", " 877 465 359 336 329 327 238 213 162 137 130 114 113 113 110 108 103 99 99 98 \n" ] } ], "source": [ "import nltk\n", "\n", "# read Gold Bug plain text into string\n", "with open(\"data/goldBug.txt\", \"r\") as f:\n", "    goldBugString = f.read()\n", "\n", "# simple lowercase tokenize\n", "goldBugTokensLowercase = nltk.word_tokenize(goldBugString.lower())\n", "\n", "# filter out tokens that aren't words\n", "goldBugWordTokensLowercase = [word for word in goldBugTokensLowercase if word[0].isalpha()]\n", "\n", "# determine frequencies\n", "goldBugWordTokensLowercaseFreqs = nltk.FreqDist(goldBugWordTokensLowercase)\n", "\n", "# preview the top 20 frequencies\n", "goldBugWordTokensLowercaseFreqs.tabulate(20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This table is useful for ranking the top frequency terms (from left to right), though it's difficult to get a sense from the numbers of how the frequencies compare. Do the numbers drop gradually, precipitously or irregularly? This is a perfect scenario for experimenting with visualization by producing a simple graph.\n", "\n", "In addition to the ```tabulate()``` function, the frequencies (`FreqDist`) object that we created has a ```plot()``` function that conveniently plots a graph of the top frequency terms. Again, in order to embed a graph in the output of an IPython notebook we need to give the following special instruction: ```%matplotlib inline```. It's OK to repeat this several times in a notebook, but like an ```import``` statement, we really just need to do this once, in the first cell of the notebook where it's relevant." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAErCAYAAADT6YSvAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJztnXmYHFXVh99fEsJkZZKACYQlQAgQAQNEZAkIiIiyuSCI\nioCKArIogkwQIegnCn4Kih+IKBKRVRAF1CFhCfsi+xLClgSIkABJBhJCIMv5/rjVmZpO7z1dXd1z\n3uepp+tW1alzurq6Tt1z7iIzw3Ecx+m59Kq3AY7jOE59cUfgOI7Tw3FH4DiO08NxR+A4jtPDcUfg\nOI7Tw3FH4DiO08NxR+A43YSkyyT9pBvPN03SN7rrfBXof1rSbnXSPUnS5QX2z5b0iSRtambcEdQB\nSYslLYqWlZKWxMqHdpOOaZLei513kaSPdce5GwVJF0u6MFZeQ9K7ebbt0A0qLVqy7Tgt9hu8J2l5\nrPxUuedLCjPbyszuqlRe0nGSnoiu7+uS7pB0SKnqS9if85jIIb8fXd93JD1cL4fWKLgjqANmNtDM\nBpnZIOBlYL9M2cyu6i41wHdi5x1kZg/GD5DUp5t0pZU7gfgDYDzheu+atc2AR8o5saR8/x1lbzCz\ns2O/99HAfbHfZOty9DYKki4ATgROAoYC6wGnA/uUeooq1BtwTnR9BwMXAX+TVM05mxp3BClC0pqS\nzpf032g5T1LfaN/ukuZImijpTUmzJH25Ah2zJf1A0pPAIkm9JO0o6T5JCyU9LunjseM3lnRn9GY1\nRdJvM1X2yKZXc5z/E9G6JLVJelHSW5KukTQk2jcqqg19TdLL0Xc6LXaeXtGb9Iuxt7r1Jf2fpP/N\n0nmjpO/m+Lp3A1tKGhqVJwBXAwMkDYu27Up4MK+QtGVUk1oYhUX2j+m4TNJFkv4laTGwu6RtJT0a\n2Xc10FLKT0DsISdpZ0n/kdQh6SFJO+UUktaV9KSk70flQr/ZNEk/lnRPZNstme8rqUXSX6LfY2Gk\n80N5dM6WtGe0PknStZImR+d8WtL2eeTGAMcAh5jZbWb2vgXuNbMjY8etF/128yW9IOmbeS+adFh0\nn7wVv09K5CqCMxoe+y6rwk6xe7FXVN5Y0l3R95wa3XN5w1TNgDuCdPFDYAfgI9GyA+EtKsNwYBjh\n7epw4PfRny4f+d6AvgR8GmgF1gVuBn5sZkOAk4HrYw/KK4H/RHp/AnyNwtX2eJX9BOAAwlv5usBC\n4P+yjt8FGAN8AjhD0ubR9u9n7Ize6o4ElgCXAYdK4e1O0tqR7BWrGWL2Kl1rALsRnMN9WdvuUqgd\n3QS0A+sAxwNXZF3fQ4GfmNlA4GHg78BkYAjwV+ALRa5NFyIH9U/gfMKD6lfAPzPOMnbcxsA04Ddm\n9ktJIyn8m2VsPQL4ENA3OgbCfTMYWD/S+W3gvTwmZn+X/QkP1bWAG4Hf5pHbE3jFzB7N990jrgZe\nIdwbBwFnS9oj+yBJY4ELga8Q7v1hkf2FyNwfvQn37ExgXrSv2G90JfAA4fpMAr5agkxjY2a+1HEB\nZgF7RusvAvvE9u0NzIrWdweWAf1i+68BTs9z3mnAu4SH70Lg4Zi+I2LHnQr8OUu2nfDn2TCHzisy\nx0c2vVrg+0zPrEfldYEPCC8go4CVwHqx/Q8CB0frzwH75/lu04G9ovXjgJsLXN8/ER6wIjwIWggP\nv8y2BQSnsCvwepbslcCZ0fplwGWxfbsB/806/l7Cw7nQ730EcHe0fhjwQNb++4DDo/U7gF9G1/SQ\nUn6zmNxpsX3HAP+O1o+M7Ny6zHtzEjAltm8ssCSP3OnA/Vnb5kT34XvABtGyHBgQO+Zs4E8xfZdH\n62cAV8aO6w+8H7+3snRdFul
ZSHh5WAJ8ObZ/1bmjcuZe7EXnPd8S2395/PhmXLxGkC7WI7zBZngl\n2pZhoZnF395eztofx4DjzWxItIyP7YuHczYCvhiFCRZKWkh4Sx8RnTuXzlJjraOAG2LnnU748w+P\nHTM3tr4EGBitrw+8lOe8kwlvaUSfhartdxEe2lsDM81sKeFBmNnWj+CA1qPrdYGu19cID7MM6wH/\nzXF8OXHo9Qi/cT6dIrwFzwGujx1T6DfLEL+u79F5XS8HbgGuVgg/nqPSc0XzYutLgBblzpXMJzj9\nVZjZ+sDawJrR91oPWGBm78YOewUYmeN86xG79ma2JNKRDwN+Ed33/YGPAr+QVEp+ImPX0ti27Pui\n6XBHkC5eIzw8M2wYbcswRFL/WHkjVn8YlUK8mvsK4W1nSGwZZGbnAq/n0ZmRf5fwdgasqoavk3Xu\nfbLO3d/MXi/BxleB0Xn2XQEcKOkjwBaEEE0+7iaE2faN1gGeIbyR7gs8ZGYfEK7zBpmQU0T29Y1f\nt9dZ/aG1EeWFEP4byWSfI6PTgDMJD70rYw/dQr9ZQcxsuZn92Mw+DOwM7Eeo/XUntwPr58ghxK/t\na8BQSQNj2zakq7ONH7vBqpOE+3FYjuNyYmbPEJz/vtGmLvctXR3o65Fd/bLsamrcEaSLq4DTJa0d\nxb7PYPW33bMUmjzuSrix/1rgfKW8nf4F2F/S3pJ6R8nE3SWNNLOXCbHwjM4JhAdHhucJb4WfkbQG\nISSwZmz/7whx3w0BJK0j6YASbAL4A/ATSaMV2CaKqWNmcwh5iz8D15nZ+/lOYmYvAm8QWrDcFW0z\nQi1g1baovAT4QfRdd4++69XR/uxreR+wXNIJ0fGfJ7x5lsO/gTGSDpXUR6Fp5RaE+H+GZcAXgQHA\nnyNHlfc3i8nl/O0l7SFp68hpL4rOv6JMuwtiZs8BFxNqHXtJ6hfp2zl2zKuEa/gzhUYS2wBfj75b\nNtcD+0naRaHxxI8p/OzKTshvQWgo8HS06TFgN0kbSFoLmBizK3PPT4p+150I90FT5wjcEaSL/yHc\nhE9Gy8PRtgxzCXHP1wgO4ttm9nyB8xW9eaOH6oHAaYQH5iuERG3m3vgy8DFCLP0MwsNXkezbwLGE\nh/YcYDFdq9G/JiQVp0h6B7ifkAAvxb5fAdcCU4C3gUvo2ipnMiG0U0prjjsJYYl7Y9vuJtReMs7h\nA0Iy9NPAm4RE6GGx69ul3bqZLQM+T4j5zwcOpmv4Jh+rzmNm8wkPme8DbxESuvuZ2YIuAp26hgN/\nJNQYcv1m8Ye/Za1nysMJLw9vE0J10yjtGuZqt5/39zOz7wC/IfyO8wn3xY8J1ylzjxxKqAG/BvwN\nOMPMbs/WF73Rf4eQs3mNcC8WCtcYwaEvUmjhdQtwKfD76Hy3EvJrTxJeKG7K+i5fAXaK7P5JdOwH\nBfQ1PIqSIbU5uXQi8E3CDXqJmf06equ7hlAFnk1IDnZEx08kvBWsAE4wsyk1M67BiN5QLzezDYod\nW2M7zgRGm9lhdbZjN8L1yA6tOE63IukaYLqZnVVvW2pFzWoEkrYiOIGPEmK0+0naFGgDpprZGOC2\nqJxpInYIoTXCPsCFeRJRTn2pe6ecKAx1IqGW4DjdiqTxkjZV6MvyaUIT6EJ5qIanlg/aLYAHzWyp\nma0gVM+/QLiok6NjJgOfjdYPBK4ys2VmNpvQlLI7uv03E2mIU9Z12ANJWxLCY8MJ7e8dp7sZQWiC\nuwg4DzjazJ6or0m1pWahoShB8w9CrG0pcCsh5n2YhU4wRImvBWY2RKFL+gNmdkW07w+Ets+lxF0d\nx3GcCqnZWDNmNkPSOYRk37vA42S1TjAzk1Ssl6rjOI5TQ2o66JiZXUrI1iPpp4SWJfMkjTCzuZLW\nJbR6gNASIp4IXZ8cbeRHjx5tixcvZt680Ldl0003ZdCgQTz++OMAjBs3DsDLXvayl3t8efjw0
Hcz\n87w0s9w5vlp2WwY+FH1uCDxLGKPkXODUaHsb8HPr7LL+OGFclI0JvUqV45xWCWeeeWZqZZLUlXb7\nktSVdvuS1JV2+5LUlXb7KpWLnp05n9W1Hob4umggrGXAsWb2tqSfA9cqTLgxm9CuGDObLulaOoch\nODYyvgsZD1cuS5cuLX5QnWSS1JV2+5LUlXb7ktSVdvuS1JV2+6qRy0etQ0OrTQZhobPMXnmOP5sw\n8JTjOI6TEL0nTZpUbxvK4pRTTpn0gx9MYo01ypPr06cPo0aNSqVMkrrSbl+SutJuX5K60m5fkrrS\nbl+lcmeddRaTJk3K2Smupj2La4Eke+YZY+zYelviOI7TOEjKmyxuuJ6748aNY+bM8uU6OjpSK5Ok\nrrTbl6SutNuXpK6025ekrrTbV41cPhrOEQDMmlVvCxzHcZqHhgwNffe7xnnn1dsSx3GcxqGpQkNA\nRaEhx3EcJzcN5wjGjRtXUWjI44uVyzSrrrTbl6SutNuXpK6021eNXD4azhFAqBE0WETLcRwntTRk\njgCMN96AddYpfrzjOI7ThDkC8DyB4zhOd9FwjiAzul65eQKPL1Yu06y60m5fkrrSbl+SutJuXzVy\n+Wg4R5DBawSO4zjdQ8PmCL75TbjEZ6x1HMcpCc8ROI7jOHlpOEeQyRGU6wg8vli5TLPqSrt9SepK\nu31J6kq7fdXI5aPhHAGABK++CsuW1dsSx3GcxqemOQJJ3wO+QZiE/ingSGAAcA2wEdEMZWbWER0/\nEfg6YZL7E8xsSo5z2vrrG3PmwEsvwSab1Mx8x3GcpqEuOQJJI4Hjge3NbGugN/AlwjzFU81sDHBb\nVEbSWOAQwtzF+wAXSspp38Ybh08fhdRxHKd6ah0a6gP0l9QH6A+8BhwATI72TwY+G60fCFxlZsvM\nbDbwIrBD9gnHjRu3qhZQTp7A44uVyzSrrrTbl6SutNuXpK6021eNXD5q5gjM7L/AL4FXCA6gw8ym\nAsPNbF502DwgMxv9esCc2CnmACNzndtrBI7jON1HzSavlzSE8PY/Cngb+Kukr8aPMTML/QLystq+\nRYsW8fTTbUALN94IO+88ngkTJtDa2gp0esruKLe2tpYtn9lWC3t6gn3Zbzo92b64Drcv2fs97faV\nUp42bRrt7e0AtLS0UIiaJYslfRH4lJl9MyofBuwI7AnsYWZzJa0L3GFmW0hqAzCzn0fHtwNnmtmD\nWee1u+82dt0VdtgBHuyy13Ecx8lFvTqUvQzsKKmfJAF7AdOBm4DDo2MOB/4erd8IfElSX0kbA5sB\nD2Wf1HMEycs0q66025ekrrTbl6SutNtXjVw+ahYaMrOHJF0HPAosjz5/DwwCrpX0DaLmo9Hx0yVd\nS3AWy4FjLU91ZcQIaGmBt96CRYtg0KBafQvHcZzmpyHHGjIzttwSZsyAJ56Abbapt1WO4zjppinH\nGqokPOQ4juOsTsM5gsxYQ+U2IfX4YuUyzaor7fYlqSvt9iWpK+32VSOXj4ZzBBm8RuA4jtM9NGyO\n4IYb4POfh333hZtvrrdVjuM46cZzBI7jOE5eGs4R5MoRlFKp8fhi5TLNqivt9iWpK+32Jakr7fZV\nI5ePhnMEGQYPhmHDYOlSmDu33tY4juM0Lg2bI4AwxMR//gP33AO77FJnwxzHcVJMU+YIwEchdRzH\n6Q4azhFkcgRQXsLY44uVyzSrrrTbl6SutNuXpK6021eNXD4azhHE8RqB4zhO9TR0juDWW+GTn4Td\ndoM776yzYY7jOCnGcwSO4zhOXhrOEcRzBBtuCL16wZw58P77heU8vli5TLPqSrt9SepKu31J6kq7\nfdXI5aPhHEGcNdaADTYIHcpeeaXe1jiO4zQmDZ0jANhzT7jjDmhvh099qo6GOY7jpJi65QgkbS7p\nsdjytqQTJA2VNFXS85KmSGqNyUyU9IKkGZL2LqbDxxxyH
Mepjpo6AjN7zsy2NbNtge2BJcANQBsw\n1czGALdFZSSNBQ4BxgL7ABdK6mJjPEcApSeMPb5YuUyz6kq7fUnqSrt9SepKu33VyOUjyRzBXsCL\nZvYqcAAwOdo+GfhstH4gcJWZLTOz2cCLwA6FTuo1AsdxnOpILEcg6VLgYTO7UNJCMxsSbRewwMyG\nSLoAeMDMroj2/QH4t5ldHztPlxzBAw/ATjvBdtvBI48k8lUcx3EajkI5gj4JGdAX2B84NXufmZmk\nQt6oy75NN92UtrY2WlpaANh88/GMGzeBmTNDmiFTZWpt9bKXvezlnlueNm0a7e3tAKuel3kxs5ov\nhJBPe6w8AxgRra8LzIjW24C22HHtwMfi5xo3bpzFWbnSrH9/MzBbsMDysnDhwvw76yyTpK6025ek\nrrTbl6SutNuXpK6021epXHjc535GJ5UjOBS4Kla+ETg8Wj8c+Hts+5ck9ZW0MbAZ8FChE0udeQLv\nYew4jlM+Nc8RSBoAvAxsbGaLom1DgWuBDYHZwMFm1hHtOw34OrAcONHMbsk6n2XbfMABcNNNcN11\n8IUv1PTrOI7jNCR1zRGY2bvA2lnbFhBaEeU6/mzg7HJ0eMshx3Gcymm4ISay+xFAaY7A2yBXLtOs\nutJuX5K60m5fkrrSbl81cvloOEeQCx+F1HEcp3IafqwhgGeega22gs02g+efr5NhjuM4KaZQjqAp\nHMG778LAgdC3LyxZAr1718k4x3GclNJUE9PkyhEMGADDh8MHH8Brr+WW8/hi5TLNqivt9iWpK+32\nJakr7fZVI5ePhnME+fA8geM4TmU0RWgI4CtfgSuvhD/9CY44Inm7HMdx0kxThYby4TUCx3Gcymg4\nR5ArRwDF+xJ4fLFymWbVlXb7ktSVdvuS1JV2+6qRy0fDOYJ8+HhDjuM4ldE0OYKXX4ZRo2DddfO3\nHHIcx+mpNH0/AoAVK6ClBZYvD30J+vWrg3GO4zgppamSxflyBL17w0YbhfXZs1ff7/HFymWaVVfa\n7UtSV9rtS1JX2u2rRi4fDecICuGjkDqO45RP04SGAL79bfj97+GCC+C44xI2zHEcJ8U0VWioEF4j\ncBzHKZ+aOwJJrZKuk/SspOmSPiZpqKSpkp6XNEVSa+z4iZJekDRD0t7Z58uXI4DCnco8vli5TLPq\nSrt9SepKu31J6kq7fdXI5SOJGsGvgX+Z2ZbANoSJ69uAqWY2BrgtKiNpLHAIMBbYB7hQUsk2eo3A\ncRynfGqaI5C0FvCYmW2StX0G8HEzmydpBDDNzLaQNBFYaWbnRMe1A5PM7IGYbN4cwYIFMGwYDBoE\nb78dJrZ3HMdx6psj2Bh4U9KfJD0q6ZJoMvvhZjYvOmYeMDxaXw+YE5OfA4wsVdmQITB4MCxaBPPn\nd4f5juM4zU+tJ6/vA2wHHGdm/5F0PlEYKIOZmaRC1ZIu+3bbbTfa2tpoaWkBYPz48UyYMIHW1lYk\n2HvvDl58EWbObGXttbvG0lpbW1eVW1tDWqJQOVu22PEAc+bMYeDAgSUf7/atXl68eDHrr79+j7cP\nKvu9mtG+JO/3tNsX11Ho+GnTptHe3g6w6nmZFzOr2QKMAGbFyhOAfwLPAiOibesCM6L1NqAtdnw7\n8LH4OceNG2eF+NznzMDs6qu7bl+4cGFBuVwkJZOkrrTbl6SutNuXpK6025ekrrTbV6lceNznflbX\nvB+BpLuAb5rZ85ImAf2jXfPN7BxJbUCrmbVFyeIrgR0IIaFbgdEWM7JQjgDg5JPhl7+Es8+GiRNr\n9KUcx3EajEI5glqHhgCOB66Q1Bd4CTgS6A1cK+kbwGzgYAAzmy7pWmA6sBw4tuBTPwc+L4HjOE55\n1Lz5qJk9YWYfNbOPmNnnzextM1tgZnuZ2Rgz29vMOmLHn21mo81sCzO7Jft8hfoRQP4mpN4GuXKZ\nZtWVdvuS1JV2+5LUl
Xb7qpHLR1P1LAavETiO45RLU401BLB0aRiCunfvsN4nieCX4zhOyukxYw1B\nmJNg5MgwP8GcOcWPdxzH6ek0nCMoliOAzvBQPE/g8cXKZZpVV9rtS1JX2u1LUlfa7atGLh8N5whK\nwcccchzHKZ2myxEATJoEZ50Fp50GP/1pMnY5juOkmR6VIwCvETiO45RDwzmCcnIE8SakHl+sXKZZ\ndaXdviR1pd2+JHWl3b5q5PLRcI6gFLxG4DiOUzpNmSNYuRL694f33w9DUg8cmJBxjuM4KaXH5Qh6\n9YJRo8K69zB2HMcpTMM5glJyBNAZHso4Ao8vVi7TrLrSbl+SutJuX5K60m5fNXL5aDhHUCq5OpU5\njuM4q9OUOQIIcxKcfDIcfzz85jcJGOY4jpNielyOAHwUUsdxnFKpuSOQNFvSk5Iek/RQtG2opKmS\nnpc0RVJr7PiJkl6QNEPS3tnnKzdHkAkNeXyxcplm1ZV2+5LUlXb7ktSVdvuqkctHEjUCA3Y3s23N\nbIdoWxsw1czGALdFZaKpKg8BxgL7ABdKqsjGeI2gwaJfjuM4iVJWjkDSUGB9M3uyDJlZwHgzmx/b\nNgP4uJnNkzQCmGZmW0iaCKw0s3Oi49qBSWb2QEy25Nkrhw2DBQvg9ddhxIhSLXYcx2k+qsoRSLpT\n0uDICTwC/EHSeWXoN+BWSQ9LOiraNtzM5kXr84Dh0fp6QHwWgTmESewrwvMEjuM4xSkl7LKWmb0D\nfB74cxTe2asMHbuY2bbAp4HvSNo1vjN6vS/0it9lX6k5AuiaJ/D4YuUyzaor7fYlqSvt9iWpK+32\nVSOXj1ImcuwtaV3gYOD0aFvJ8SQzez36fFPSDcAOwDxJI8xsbnTuN6LD/wtsEBNfP9q2isGDB9PW\n1kZLSwsA48ePZ8KECbS2hnxz5gK1trayySYwblwHCxZ0ysf316K8ePHiiuTdvs7y4sWL3b4qfq9m\ntC/J+z3t9sUpdPy0adNob28HWPW8zEfRHIGkLwI/Au41s2MkbQqca2ZfKCgYZPsDvc1skaQBwBTg\nLEKNYr6ZnSOpDWg1s7YoWXwlwVmMBG4FRseTAuXkCC6+GI4+Go48Ei69tCQRx3GcpqRQjqCUGsHr\nZrZNpmBmL5WRIxgO3CApo+sKM5si6WHgWknfAGYTahuY2XRJ1wLTgeXAsSU/9XPgo5A6juMUp5Qc\nwQU5tpXUV9fMZpnZuGjZysx+Fm1fYGZ7mdkYM9vbzDpiMmeb2Wgz28LMbsk+Zzk5gniy2OOLlcs0\nq66025ekrrTbl6SutNtXjVw+8tYIJO0E7AysI+kkIFOlGAT07lYrasSGG4aRSF99FZYtq7c1juM4\n6SRvjkDSx4E9gG8Dv4vtWgTcZGYv1N68nHaVFS3aaCN45RV44QUYPbqGhjmO46SYinIEZnYncKek\ny8xsdq2MqzWbbBIcwcyZ7ggcx3FyUUqOYE1Jl0RjA90RLbfX3LI8lJMjgM48wbx5Hl+sVKZZdaXd\nviR1pd2+JHWl3b5q5PJRSquhvwIXAX8AVkTbGmb0ns02C5+PPw6HHVZfWxzHcdJIKf0IHjGz7ROy\npyjl5ghmzw7OYMUKeOYZ2HLL2tnmOI6TVqqdj+AmSd+RtG40fPTQaNyhhmDUKPjGN8IIpGedVW9r\nHMdx0kcpjuAI4GTgPsKgc5mlLpSbIwD44Q9h/PgOrrkGnnqqdLm0x/ya0b4kdaXdviR1pd2+JHWl\n3b5q5PJR1BGY2Sgz2zh76VYraswGG8D++4f1SZPqaorjOE7qKCVHcDg5ksNm9udaGVWIcnMEGV57\nDTbdFJYuhUcfhW23rYFxjuM4KaXaHMFHY8tuwCTggG6zLiHWWw+OOSasn3lmfW1xHMdJE6WEho4z\ns+Oj5ZvAdoRhJupCJTkCCDG1U0+F/v3hppvgP/8pTaYSPZXg8c/kdaXdviR1pd2+JHW
l3b5q5PJR\nyXzAS4CGyhFkGD4cjjsurJ9xRn1tcRzHSQul5AhuihV7ESaWv9bMTq2lYQXsqWZkat56K/Q2XrwY\n7r0Xdt65G41zHMdJKYVyBKU4gt2jVSPMEfCKmb3arRaWQbWOAOD00+GnP4W99oKpU7vJMMdxnBRT\nVbLYzKYBM4DBwBDg/W61rkyqyRFkOOkkGDwYbr0V7rqrNJlK9NRarhntS1JX2u1LUlfa7UtSV9rt\nq0YuH0UdgaSDgQeBLxJmEnsomr6yJCT1lvRYJsQU9UyeKul5SVMktcaOnSjpBUkzJO1d/tcpjaFD\ngzMA+NGPQq9jx3GcnkopoaEngb3M7I2ovA5wW3z6yiLyJwHbA4PM7ABJ5wJvmdm5kk4FhmTNV/xR\nOucrHmNmK7POV3VoCODtt0OuYOHCUDP4xCeqPqXjOE5qqbYfgYA3Y+X5dM5WVkzx+sBnCCOXZmQO\nACZH65OBz0brBwJXmdmyaP6DFwmT2NeEtdaCk08O62ec4bUCx3F6LqU4gnbgFklHSDoS+Bfw7xLP\nfx5wChB/qx9uZvOi9XmECe4B1gPmxI6bQ6gZdKE7cgQZjj8ehg2D++6DKVNKk6lET63kmtG+JHWl\n3b4kdaXdviR1pd2+auTykdcRSNpM0gQzOwW4GNgG2Jow+Nzvi51Y0n7AG2b2GHlqEFGMp9C7eE3f\n0wcNglOjRrCeK3Acp6dSaGKa84GJAGZ2PXA9gKRtCG/6+xc5987AAZI+A7QAgyVdDsyTNMLM5kpa\nF3gjOv6/wAYx+fWjbV1YtGgRbW1ttLS0ADB+/HgmTJhAa2vIOWc8Zanlr361g3/9C6ZNa+Wf/4QJ\nEzr3t7a2ln2+zLZK7Smn3Iz2Zb/p9GT74jrcvmTv97TbV0p52rRptLe3A6x6Xuaj0OT1D5vZ+Dz7\nnjazrQqeuevxHwdONrP9o2TxfDM7R1Ib0JqVLN6BzmTx6OzMcHcli+Ocd15oRbTttvDII6CSMiCO\n4ziNQ6XJ4tYC+wq7l9xknt4/Bz4p6Xlgz6iMmU0HrgWmE3IQx+Z64ndnjiDD0UfDuuvCY4/B3/9e\nmkwlerpbrhntS1JX2u1LUlfa7UtSV9rtq0YuH4UcwcOSvpW9UdJRlDkxjZndaWYHROsLzGwvMxtj\nZnubWUfsuLPNbLSZbWFmt5Sjoxr69YPTTgvrZ54JK1cWPt5xHKeZKBQaGgHcAHxA54N/e2BN4HNm\n9noiFq5uV7eHhgDefx9Gj4Y5c+Caa+Dgg7tdheM4Tt2oeKwhSQL2ALYihHaeMbPba2JlidTKEQBc\nfHEIE225ZZjSsnfvmqhxHMdJnIo7lFngdjP7jZldUG8nALXJEWQ48sgw2f2zz8LVV6c/5teM9iWp\nK+32Jak/tjMtAAAgAElEQVQr7fYlqSvt9lUjl49K5iNoWvr2Df0JAM46C1asqK89juM4SVB0rKG0\nUcvQEMCyZSE09NJLcNllcPjhNVPlOI6TGNWONdSjWGONzjmNjzkGvvWtkC9wHMdpVhrOEdQyR5Dh\ny1+GL30JNt+8g0sugW22gd13h+uvh+XLu09PtXJpj0mmXVfa7UtSV9rtS1JX2u2rRi4fDecIkqB3\nb7jqKpg8OcxxPHAg3HknHHQQbLIJ/OxnYcpLx3GcZsBzBCXwzjvBKfz2t/D882HbmmvCoYeGEUy3\n2y5RcxzHccqmqjmL00Y9HEGGlSvDHMcXXAD/+lfnaKU77xwcwhe+EHIMjuM4aaOpksVJ5AjyyfTq\nBZ/6FNx8c6gZfO97YYKb++4LtYONNoL//d8OrrkGHnoI3nyz9KGtPf6ZvK6025ekrrTbl6SutNtX\njVw+Cg1D7RRg9Gj41a/gxz+Gv/wl1BKmT4crroDHH+88buDAMCVmfNlkk871AQPq9x0cx3HAQ0Pd\nhhnccQf8858wcybMmhWWd94pLLfOOsEhDBkSBr/
r3z/3Z65tm28eekI7juMUw3MEdcIMFi7sdApx\nBzFrFsyeDR98UPn5+/ULtY8xY7rNZMdxmpSmcgTbbrutPfbYY2XLxWcBSovMypXw+uvBISxe3MG7\n77ayZAm89x5FP194AdZcs4MPfaiV9vbSJ9Op5DtVKpd2XWm3L0ldabcvSV1pt69SuUKOwHMEdaRX\nLxg5MiwdHVDO7/rmm/D5z8OUKaGj20EH1c5Ox3Gam5rVCCS1AHcS5i/oA1xnZpMkDQWuATYCZgMH\nZyankTQR+DqwAjjBzKbkOG/DhIZqze9+F4bBGDkyjJg6aFC9LXIcJ63UpfmomS0F9jCzccA4YB9J\nHwPagKlmNga4LSoTzVl8CDAW2Ae4UFLDNW9NkqOOgvHj4b//Da2XHMdxKqGmD1ozWxKt9gXWIExu\ncwAwOdo+GfhstH4gcJWZLTOz2cCLhInsu1DPfgS1kqlUbtGiDi66KOQHzj8fnn66NnoqlUu7rrTb\nl6SutNuXpK6021eNXD5q6ggk9ZL0ODAPmGJmDwHDzWxedMg8YHi0vh4wJyY+BxhZS/uagfHjw6xq\ny5fDd75Tegc2x3GcDDVNFpvZSmCcpLWAGyRtlbXfJBV6dK22b9GiRbS1tdHS0gLA+PHjmTBhwqoM\nesZTdke5tbW1bPnMtlrYk8++iRPhuutauesu+OtfO9h773TZV4l83Naeal9ch9uX7P8x7faVUp42\nbRrt7e0Aq56X+Uis+aikHwFLgKOA3c1srqR1gTvMbAtJbQBm9vPo+HbgTDN7MOs8nizOwWWXhak2\nP/QheO658logOY7T/NQlWSxpbUmt0Xo/4JPAs8CNQGber8OBv0frNwJfktRX0sbAZsBD2ef1HEFu\nma99DSZMgDfegNNP7149lcqlXVfa7UtSV9rtS1JX2u2rRi4ftcwRrAvcLukJwgN9ipn9C/g58ElJ\nzwN7RmXMbDpwLTAd+DdwrL/6l06vXnDhhWEuhYsugkcfrbdFjuM0Cg3Xs9hDQ4U56SQ47zzYYQe4\n//7gIBzHcZpqiAl3BIV55x3Yckt47TW4+OIw57LjOI7PR0DPiS8OHhyGxwaYOHH1KTXrbV+adKXd\nviR1pd2+JHWl3b5q5PLRcI7AKc7BB8Nee8GCBdDWVm9rHMdJOx4aalKeew623hqWLYN77w3TaTqO\n03NpqtCQUxqbbw6nnBLWjz029Dx2HMfJRcM5As8RlC7zwx+GeZSfeCI0La1UT6VyadeVdvuS1JV2\n+5LUlXb7qpHLR8M5Aqd0+veH3/wmrJ9+epgEx3EcJxvPEfQA9t8fbr4ZvvxluOKKelvjOE498H4E\nPZxZs2DsWFi6FG6/HfbYo94WOY6TNE2VLPYcQfkyG28Mp50W1s8/v4MPPihbVdNci2plmlVX2u1L\nUlfa7atGLh8N5wicyjjlFNhsM3jllTD8xJVXeksix3ECHhrqQdx/f5jwfu7cUB41KoxN9PWvw4AB\ndTXNcZwa4zkCZxVLl8Lll8MvfgEvvBC2DRsGxx0XlrXXrq99juPUBs8R4PHFDEuXdnDUUfDss3D9\n9SFMNH8+nHUWbLghnHACzJ5dP/uS1JV2+5LUlXb7ktSVdvuqkctHwzkCp3vo3TuEiR54AKZNg898\nBt57Dy64AEaPhq98JXREcxyn+fHQkLOKp54KIaOrrupMJH/qU/CDH4Qmp8pZqXQcpxGoW2hI0gaS\n7pD0jKSnJZ0QbR8qaaqk5yVNyUxpGe2bKOkFSTMk7V1L+5yubL01/PnP8NJLcOKJoWfyLbfAJz4R\nagyLF9fbQsdxakGtQ0PLgO+Z2YeBHYHvSNoSaAOmmtkY4LaojKSxwCHAWGAf4EJJXWz0HEHtZTbc\nEM4/PzQ1/fGPYehQmDu3g09+EhYurK19lco142+VpK6025ekrrTbV41cPmrqCMxsrpk9Hq0vJkxe\nPxI4AJgcHTY
Z+Gy0fiBwlZktM7PZwIvADrW00cnPsGHwox/Bgw/C8OEhn7DHHvDGG/W2zHGc7iSx\nHIGkUcCdwFbAK2Y2JNouYIGZDZF0AfCAmV0R7fsD8G8zuz52Hs8R1IFXXw2T3Tz/PIwZA7feChts\nUG+rHMcplUI5gj4JGTAQuB440cwWKZZ1NDOTVOjJ3mXfpptuSltbGy0tLQCMHz+eCRMm0Noa0gyZ\nKpOXu7e8wQat3H03HHtsBy+9BBMmtHLbbbD22umwz8te9nLX8rRp02hvbwdY9bzMi5nVdAHWAG4B\nvhvbNgMYEa2vC8yI1tuAtthx7cDH4ucbN26cVcLChQtTK5OkrmrtW7DAbKedzMBsxAizp56qna40\nyjSrrrTbl6SutNtXqVx43Od+Tte61ZCAPwLTzez82K4bgcOj9cOBv8e2f0lSX0kbA5sBD9XSRqc8\nhgyBKVNCS6K5c+HjH4eH/BdynIampjkCSROAu4An6QzxTCQ83K8FNgRmAwebWUckcxrwdWA5IZR0\nS9Y5rZY2O6WxdCkccgjceCMMHAg33QS7715vqxzHyYePNeTUhGXL4IgjwkimLS1hyIrPfKbeVjmO\nkwsfawhvg1yNTD65NdYIHdC+9a1QQzjwQLj22troSotMs+pKu31J6kq7fdXI5aPhHIGTLnr3ht/9\nDk4+OQxLceihcOml9bbKcZxy8NCQ0y2YwU9/GjqgAZx3Hnz3u/W1yXGcTpoqNOSkEwlOPz0MTQHw\nve+Foa1XrKivXY7jFKfhHIHnCJKXKUfuxBPhj3+EXr3g73/vYIst4OKLQw6hljam8Vo0kq6025ek\nrrTbV41cPhrOETjp5+tfh7/9DUaMgBdfhKOPDtNinn12eYPWOY6TDJ4jcGrG8uWhSek558Bjj4Vt\nAweGVkbf+x6sv3597XOcnoT3I3DqihncdltwCLfeGrb16RNmQTvlFPjwh+trn+P0BJoqWew5guRl\nqtUlhZFLp06FRx4JPZJXroTJk2GrrWD//eGee4LDqFZXrWWaVVfa7UtSV9rtq0YuHw3nCJzGZrvt\n4Oqr4YUX4NhjQ4/km2+GXXeFXXaBf/yjvMSy4zjV46Ehp668+Sb89rdhWbCgc/uIEbDxxqsvm2wS\ncgt9EhlA3XGaB88ROKln8eLQI/nii8PkN8uX5z+2d+8wnWbcQWyzDey5JwwYkJzNjtNINJUj2Hbb\nbe2xTBOUMujo6Fg1eUPaZJLUlXb7AObP7+Ddd1uZNQtmzoRZs7our722usy4cR08+2wre+wB++0H\n++4bmqzWwr5mvO5pty9JXWm3r1K5us9Q5jjlkHnj33DDMN9BNu+9By+/3OkYZs4MzuGJJ6C9PSzH\nHRcS0RmnsOOOHk5ynHw0XI3AQ0NOPubNg3//G/75T7jlFli0qHPf0KHw6U8Hx/CpT4UJdhynJ9FU\noSF3BE4pfPAB3H13aJF0003w0kud+3r3hgkTYO+9g0NYY43OpU+f4uXevcu3p3dvGDw4LAMHhiE4\nHCdJ6uYIJF0K7Au8YWZbR9uGAtcAG7H67GQTCbOTrQBOMLMp2ef0HEHyMo2uyywkoG++OSx33x0G\nwxs3roPHHy/fvkrksmUGDYK11up0DvElvn3ttTuAVvr1g/79We0zvr7mmqHPRiP/VmnQlXb7KpWr\nZ47gT8AFwJ9j29qAqWZ2rqRTo3KbpLHAIcBYYCRwq6QxZrayxjY6TY4Em28elu9/Hzo6wrzLzz8P\nH/tYmGlt2bLQUimznqucWTbeuHCrplxssAG88Qa8805oIbVoUdfQVT7GjYPHHy/9e/brF/IhCxfC\n8OGhGe6IEbnXW1uDjOPUPDQkaRRwU6xGMAP4uJnNkzQCmGZmW0S1gZVmdk50XDswycweyDqfh4ac\nhmbFiuAE3nmn6/L226uXlywJyfHMZ3w9+/P998uzo2/fTqcwfDgMG1Z8aWmpz
TVxak/aWg0NN7N5\n0fo8YHi0vh4Qf+jPIdQMHKep6N07vI1XEBEoyIoVwSEsXBgS53PnhiW+Hi8vWgSvvhqWUunfv6tj\nWHPN8u2USsvFxMt9+8Laa3et2XzoQ94SrLuo62U0M5NU6PV+tX277bYbbW1ttESvJuPHj2fChAmr\n4mWZMTiyy5lt+fbnKmfLFjseYM6cOQwcOLDk492+1cuLFy9m/Who0p5sH5T3e/XuDcuXdyAtZvz4\n4vYtWQIvv9zBggUZ59HKBx908M478MILrcyfD4MGhfKDD4bymDFBPpPvOOigObz44sBV5XHjuu7P\nVR49ejHXXbd+ycdnypn1TFmC3XbrYOhQWLSolREjYOzYUB4woJXhw2HAgDn07z+QtdZqpX//cH1a\nWmDYsO79fdP4f5w2bRrt7e0Aq56X+ahXaGh3M5sraV3gjig01AZgZj+PjmsHzjSzB+Pn82Rx8jLN\nqivt9iWpqxQZs5DfmD+/c1m5soMVK8rTJXXw/vutRfMx8W3vvw+9e3fwxBOtq2o2b77ZOVBhPvIl\n9vv0WT35nlkfM6aDV18t/7f6yEc66NevtUuP95EjC7cyS0uyuB6O4FxgvpmdEz38W80skyy+EtiB\nKFkMjM5OCHiOwHEcCI7irbdWD3/F13PlWZYsCaPfJsEaa4SOkZtsknvsrLXXTi5hX8/mo1cBHwfW\nJuQDzgD+AVwLbMjqzUdPIzQfXQ6caGa35DinOwLHcSrGLNQy8iXeKxn9duVKeP31rkOhzJwZnFEh\nevWqzBGcdBKce255MnVLFpvZoXl27ZXn+LOBswuds5r5CNJSxa6nrrTbl6SutNuXpK6029eduqSQ\nfO7bN/TZqKV9770Hs2eTc9ysmTNhk00q68syZEgHUL5cPjzn7jiOUyP69YMttwxLNmZh6PVczqgY\nb79dvW1xfIgJx3GcHkBTTVXpOI7jdC8N5wh8zuLkZZpVV9rtS1JX2u1LUlfa7atGLh8N5wgcx3Gc\n7sVzBI7jOD0AzxE4juM4eWk4R+A5guRlmlVX2u1LUlfa7UtSV9rtq0YuHw3nCBzHcZzuxXMEjuM4\nPQDPETiO4zh5aThH4DmC5GWaVVfa7UtSV9rtS1JX2u2rRi4fDecIHMdxnO7FcwSO4zg9AM8ROI7j\nOHlJnSOQtI+kGZJekHRq9n7PESQv06y60m5fkrrSbl+SutJuXzVy+UiVI5DUG/gtsA8wFjhUUpeR\nvBctWlTRue+5557UyiSpK+32Jakr7fYlqSvt9iWpK+32VSOXj1Q5AsJ8xS+a2WwzWwZcDRwYP+Cl\nl16q6MQPP/xwamWS1JV2+5LUlXb7ktSVdvuS1JV2+6qRy0faHMFI4NVYeU60zXEcx6kRaXMERZsD\nDR8+vKITL61gRuqkZJLUlXb7ktSVdvuS1JV2+5LUlXb7qpHLR6qaj0raEZhkZvtE5YnASjM7J3ZM\negx2HMdpIPI1H02bI+gDPAd8AngNeAg41MyerathjuM4TUyfehsQx8yWSzoOuAXoDfzRnYDjOE5t\nSVWNwHEcx0meVNUInOJIGgpsBqyZ2WZmd9XPIsdxGh13BBUi6V4z20XSYlZv7WRmNrgGOo8CTgDW\nBx4HdgTuB/YsIrcLMIrO39vM7M9FZE40s18X25a1//CsTRYpK6irEiS1mNnSYttyyJV9LZIgz32U\nodvvJ0mbAxcCI8zsw5K2AQ4ws//pTj2Rrglmdk/Wtl3M7N5a6AIeN7PFkg4DtgV+bWYvF5AZZmbz\nK9D1N+CPwL/NbGUZcqm7B9PWfHQ1JI2Q9EdJ7VF5rKRvFDj+8ujzu2XoWCxpUZ7lnVwyZrZL9DnQ\nzAZlLSX9aaPvtr+k/SR9qASREwmd7l42sz0IN/nbRXT8BfgFsAswPlo+WoKuI3JsO7KIzEdjOiYA\nk4ADiimS9AtJgyWtIek2SW9Ff+JC3Ffit
rieiq6FpO9KWkuBP0p6TNKnChzfR9IVxc4bJ3MfAb8G\nTiX0nxkJ/CDaVsi+y0vZlsUlwGnAB1H5KeDQYnaWey0iLsix7bcl6Bog6UeSLonKm0nar4jYRcC7\nkj4CnAS8BBR7yD4g6a+SPiMpZ6uaArq+Arwo6eeRcy1IFfdgJf+R0jGzVC9AO3AI8GRUXgN4usDx\n04H1gCeBocCw6HMoMLSIrv8BjgUGR8sxwE9q9L0OBl4m3KR/BmYDXywi83D0+TjQkvm+RWSeJcoF\nlWjXocBNQEf0mVmmAbeV+R1bgVtKOO6J6PNzhDestTK/d45j1wW2B2YA20Xr2wG7AzO681rE5DL3\n3qeAG4CtgMeKyNwDrFmprmLbsvY/llXuU8J98XC2LOFNutuuBbAT8H1Cx9CTovXvE14QnihB17UE\np/hMVB5QTC5jC3Am8M1o/dEiMr2AvQkjGbwE/AwYU+Z9fnT0Pe8jvDCt0c33YMn/kUqWRggNrW1m\n10hqAzCzZZKWFzj+d8BtwCbAIzn2b1xA9gAz2yZWvkjSk8CPyjW6BE4HPmpmbwBIWodg918LyLwq\naQjwd2CqpIUEB1KIpwkPz9dKtOs+4HVgHeB/gcwb0iLgiRLPkWEJha93hsx9uB9wnZm9XaC/yN6E\n2spI4Jex7YsIb7iFKPdaZMhcg32By83s6RJeHGcB90i6kXAdIIQAflVE7l1JXwWuispfAhbnNEo6\nDZgI9JMUH4RrGfD7InrelDQ6dq6DCL97Mcq5Fn2BQYQWgINi298BDipB16ZmdrCkLwGY2bslXPdF\n0XX5KrBrNH7ZGoUELIR1pgBTJO0J/AU4VtLjwEQzy1vTlLR2pOurwKPAlYTa8OGEl5NsKr0Hy/mP\nlE0jOILFkoZlClGns7zhEDP7DfAbSRcBFwO7EWKvd5vZ40V0lfwn7AYEvBkrz6fzT5YTM/tctDpJ\n0jRCraW9iJ51gOmSHgLe7zyV5QzZWIilvkzIP5SFpJtixV6EgQOvLUH0JkkzgKXAMVGYLGes38wm\nA5MlHWRm15Vp10DKuBYxHpE0hfBy0SZpMFAsJvxStPSK9IoSes4DXyaEgs6PyvdG21bDzM4Gzpb0\nM0K4YTOgpQQdAMcR/h+bS3oNmEl4mBUjfi0mFroWZnYncKekP1mBGH0B3pfUP1OQtCmdv1s+DiFc\nr6+b2VxJGxJeaPISPcy/AnwNmEe4NjcBHwGuI8Tzc8ndAGxBqNHvb2YZR3q1pEeyjq32Hiz5P1IJ\nqW8+Kml7Qozxw8AzhAfbQWZW8O1U0onAUcDfok2fAy6JHEU+mY0Jf8Kdo033Aiea2exqvkMeXb8g\n3GhXEh4SmfDXD7pZz+65tpvZtDzHV5wEj+kyYDnwipm9mu/4LNlhQIeZrZA0ABhkZnOLyOxHcDar\nHn5m9uMidmU7W4seWIX09ALGEd5w+xLuwZGF7qWY7KBISWXD5paA8jQiMLO8jQgktQBfIDzkhhLe\n0i3X9cuS60XITb1kZh3R7zbSzJ4sIHNHjs1WyL5I7pOEmvNYYCohrn6EmeU6X8VIeh64HPiTmc3J\n2tdmZj/PI/dpwnNpF6KXTeAiy9Fgodp7MDpH2f+RUkm9IwCQtAaQScQ8Z2Fk0mIyTwE7mtm7UXkA\n8ICZbV07S0tH0rnAg4RqpBFiyjt2tyNIGkkjCMkvAx7KhL5KkNsa2BLoRwmtjSRdHB27JyHx+UXg\nQTMr1JDg3OzrK+kcM1tt3ousYyp50G5NeFPM1GbfBA43s6eL6OoHfIPVHdzXC8g8Tbjm95vZOIWh\n28+O1SBzydxCyAM9AqyI6fllnuO3NLNnJW2XvSuI2aMFdI2PFTMOaLmZnZJPJpL7CyHX9x4h1PaA\nmb2V59hqXmB2IITYRtG1Jc82+WQiub8SHOhfCNfhy8BaZvbFAjKbAK+b2XtRuR+h5dasPMd/wsxu\nk/SF2
PfKOBIzs7/lkiub7ko21HIheNyvEOJuXwO+VoLMU0C/WLkf8FQRmQ8BPyQ8WP4ULZfW6Dut\nlmArZl+Z5783+lxMiJ/Hl3dq9J3KToBHcpOAO4A3oms+lxAHLfj7Rp+Z5OVA4J5aXHNCXLcfUTKV\nEA64oYjM/cAesfLuwH0l6LoO+AkhVHM44U34N0VkKmlEkLfBRZ7jL4k+p0W/VZelgnvlPyUcsych\n6TuV4AiuB75bg/v2eULrtk0IzmAUMKoEudWucQnX/RGgb6y8ZqFrAZwVfV4WeyatWrrrGqQ+RxC9\nFWxCuMlXxHYVaxL2J+BBhba+Aj4LXFpE5h/AXYQbLxP37NYqk6RjCC2TNo1qLRkGEUJR3YLFmrd2\n1zlLoJIEOITE4UcIrTuOlDQcKNb88r3oc4mkkYQcy4hcB3bDNV9qZu9JyvRVmFFCU8H+FgthmNm0\nqFZajNFmdpCkA81ssqQrCbXFQlTSiOA+SdtYgZBOHDM7KvrcvZTj4yh0gszQi9BksmgTazO7XdJd\n0fF7ElrmbEVn/qS7eNPMbqxA7lFJO5nZ/bAqf5mrgUqc3maWabKLmb0vqW++g83szGj1aDpDed3+\n3E69IyA0DxxrkVssFTP7laQ76Qy9HGFmjxUR62dFwgTdwJXAv4GfE5rGrWqVYxV0akkZZSfAI96z\nEPdcLmktQs1ggyIyN0UPv1/Q+ee7JM+x1V7zSh60syT9iBB7FqFGO7MEXZmHxNtReGkuISeRFyuj\nEUHMEfYGjpQ0i65Jy4LhkOgcO5P1QLLCHaIepfOFajnh2uUN4cX03EZoMno/wRmOtxJDjWVypqQ/\nEF5aMtffLE/YJXYN+wD3SnqV8P02JAyaWYi3Iif/j+hcBwI5w11Z/IPOUF73jkFNA+QIojjciWZW\nbnOrSnT9DyHO+s9a62pGKk2AS7qQEJI7hNDO/F1CGKdYB7aMfAshJJJzIldJg83snSjZttoNb2YL\nStETnWt3ogdt/M0utv9yMztM0kmEprO7RLvuJgyxvrDI+Y8ihEC2JtRqBwJnmNnvSrWxyPlHFdpv\nRRpG5Kuhm9nxBWT6E2pkEwg17XuAC614L/DzCLWBpYRmzXcS/p/vFZIrl+g7bUFojLKqBVS++6/I\nNTQr3It5NKG2u160aQ5wmJm9WMTGp81sq0LHVENqHUFWc6ttCUNSl9PcqhKdi4H+hLeCTELarAbD\nRTQjkk4g3NgTok13m9kNJcj9hfAnv4cQ8hlcSshCnV31e2e25XozlfRPM9s3evtdDTMrpa9DSUia\nDuxFeCPfna41IivmdLJa88TfuM/qLhurQdKzlFlDrySpmiU/iNB35GRCYnXNwhLlIek5YItyow5V\n6hwIYGYlNU+X9Hvgt6WG8solzaGhTOuFcwnzFsf/UOfWQqGZDVTnoG6ltsd2OhlOaF3zGCEfU6yP\nQ4ZLgV2B3wCjCbHXu80sbyy4nNyRme0brWbeKu+22g1vXqhDo0XbC1HTEEA3UEmHqA+b2dhY+fbI\nYRZE0vGE+2J7QrL4UkLNqru5j9BK65kanBsASYeZ2eWSvk+sViop0+oqZ0fD7gjllWRfWmsEGSQ9\nZmbbZm17ymrQDLSSZoJOVxTamWd6/44ndCj7o5m9VESuD12Tgu+ZWd6EbIVvpnsSaiu7ApsSHFZB\nh1Mpkn5nZkdXIFfTEEClVFNDj5z2/2UlVb9jZgXHypF0CqHxxqNWQpPxSlHoqLUpwdl0+0M20vFt\nM7tY0iS6hiczjiBnja/aUF7J9qXVEcRbehB6aGYYRGga+ZUa6Cy7PbazOpLGEcZb2Qe4neBQb7U8\n7cZzJAXvLpYUrDR3VK7DSZpahwAqRZ0dos4FTiGrhm5mO+SQiSdVNwe6JFXNbMuaGVwG+R623fWQ\nbQTSHBqqR+ua7GaCz5bQTNCJUOjN/TVCa6E/ACdbGBuqF/AC4QGSiyc
JD+etCLHkhZJyJgVVRVf9\nBFuhlE1SIYBKsagnuqQ1LKsXrEKnqFzsX+iU3WRa1ST5wFcYGuIoVu+8lrfDYBKk1hGY2duEMYW+\nlKDaSpoJOp0MBT6f3WrCzFZKyvtQMLPvQZek4J8IfQJyJQWryR2V7HDqQKGHZt1RBX0xetIbdRnU\nvK9SJaQ2NFRvijUTdLqPHEnBuwnhodsLyFScO6p1K5RmJOrfMYTm7P+SGJIeN7Nx9bYjm9TWCOqN\n5RmUzakJLYQ3/aJJwUreTGOySbVCaTrqVENvRm6WtG/a+ip5jcBpKKp5M02qFYrjZKOug+ENIGV9\nldwROI7jJITCFKa17stSNu4IHMdxEiLJvixl2eWOwHEcJznS2JfFk8WO4zgJkda+LL3qbYDjOE4P\n4klCkngrYBtgqwId8hLDQ0OO4zgJk7a+LB4achzHSYi09mVxR+A4jpMcJXeeTBIPDTmO4/RwPFns\nOI7Tw3FH4DiO08NxR+A4jtPDcUfg9Ggk/VDS05KekPSYpNVm2upGXdMkbV+r8ztOpXirIafHImkn\nYF9g22gmtaHkngynuzBSMAmJ42TjNQKnJzMCeCvTjM/MFpjZ65J+JOkhSU9JujhzcPRG/ytJ/5E0\nXT7F12AAAAIBSURBVNJ4SX+T9Lykn0THjJI0Q9JfomP+mqvnqKS9Jd0n6RFJ10oaEG3/uaRnohrK\nLxK6Dk4Pxx2B05OZAmwg6TlJ/ydpt2j7b81sh2i2s36S9ou2G/C+mX0U+B1h2sFjCMMFHBFNcwow\nBvg/MxtLmBLz2LhSSWsDPwQ+YWbbA48AJ0U1ks+a2YfN7CPAT2r1xR0njjsCp8diZu8Senh+C3gT\nuEbS4cCekh6Q9CRhhMixMbEbo8+ngWfMbF40lelMYINo36tmdn+0/hfCsMMZBOwYnfM+SY8BXwM2\nJMwAtlTSHyV9DkjDXMpOD8BzBE6PxsxWEiYKuTOa+vJoYGtgezP7r6QzCb1BM7wffa6MrWfKmf9T\nPA8gcucFpprZl7M3RsnqTwAHAcdF645TU7xG4PRYJI2RtFls07bADMKDe76kgcAXKzj1hpJ2jNa/\nTNexZAx4ANhF0qaRHQMkbRblCVrN7N/AScBHKtDtOGXjNQKnJzMQuEBSK7AceAH4NtBBCP3MBR7M\nI1uoBdBzwHckXQo8A1zURdDsLUlHAFdJyrRS+iGwCPiHpBZCTeJ7FX4vxykLH2vIcboRSaOAm6JE\ns+M0BB4acpzux9+unIbCawSO4zg9HK8ROI7j9HDcETiO4/Rw3BE4juP0cNwROI7j9HDcETiO4/Rw\n3BE4juP0cP4fblnXMOgVgQ4AAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# make sure that graphs are embedded into our notebook output\n", "%matplotlib inline\n", "\n", "# plot the top frequency words in a graph\n", "goldBugWordTokensLowercaseFreqs.plot(25, title=\"Top Frequency Word Tokens in Gold Bug\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This graph shows not only the rank of the words (along the bottom x axis), but is also much more effective than the table at showing the steep decline in frequency as we move away from the first words. 
This is actually a well-known phenomenon with natural language and is described by Zipf's law, which the [Wikipedia article](http://en.wikipedia.org/wiki/Zipf's_law) nicely summarizes:\n", "\n", "> Zipf's law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, etc. \n", "\n", "As we continue to explore word frequencies, it's useful to keep in mind the distinction between frequency rank and the actual number of words (tokens) that each word form (type) is contributing." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Characteristic Curve of Word Lengths" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the first examples we have of quantitative stylistics (text analysis) is an 1887 study by T.C. Mendenhall, who manually counted the length of words and used that to suggest that authors had a distinctive stylistic signature, based on the distribution of word lengths in their writings. In some ways this is similar to the type/token ratio we saw in the previous notebook, as it tries to measure stylistic features of texts without considering (yet) what the words may mean. It also uses all words, even the function words that authors may use less deliberately. Unlike the type/token ratio, Mendenhall's Characteristic Curve is less sensitive to changes in the total text length. If an author uses relatively longer words, chances are that this habit will persist throughout a text (which is different from comparing type/token ratios for a text of 1,000 words or 100,000 words).\n", "\n", "To calculate the frequencies of word lengths, we can start by replacing each word in our tokens list with the length of that word. 
So, instead of this:\n", "\n", "```python\n", "[word for word in tokens]\n", "```\n", "\n", "we have this:\n", "\n", "```python\n", "[len(word) for word in tokens]\n", "```" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "first five words: ['the', 'gold-bug', 'what', 'ho', 'what']\n", "first five word lengths: [3, 8, 4, 2, 4]\n" ] } ], "source": [ "goldBugLowerCaseWordTokenLengths = [len(w) for w in goldBugWordTokensLowercase]\n", "print(\"first five words: \", goldBugWordTokensLowercase[:5])\n", "print(\"first five word lengths: \", goldBugLowerCaseWordTokenLengths[:5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That looks right: \"the\" is 3 letters, \"gold-bug\" is 8, and so on.\n", "\n", "Now, just as we counted the frequencies of repeating words, we can count the frequencies of repeating word lengths." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAY0AAAEQCAYAAABMXyhMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmcFNW5//HPwwgMIDig7IgQgwm4AII7Ro3G4BJFRSPG\nLYlGJRqv8eY6JlG8muCSq4m4RkEj8ScRrxsqiysqGhlBUAS5gBEFhHFjDOCG8Pz+ONXSDDNMz0zV\n9PZ9v179murT3d8+00qfqfNUnTJ3R0REJBPNst0BERHJHxo0REQkYxo0REQkYxo0REQkYxo0REQk\nYxo0REQkY4kNGmZWamYzzWyumb1pZldE7VeY2XIzmxPdjkh7zaVmttjMFprZ4Wntg8xsXvTYjUn1\nWUREts6SPE/DzFq7+2dmtg0wA7gQGAqscfcbqj23H3AfsBfQHXga6OPubmYVwPnuXmFmk4Ex7j41\nsY6LiEiNEp2ecvfPos0WQHMgNUJZDU8/Fpjg7uvdfSmwBNjHzLoCbd29InreeGBYcr0WEZHaJDpo\nmFkzM5sLVAJPpn3xX2Bmr5vZODMri9q6AcvTXr6csMdRvX1F1C4iIk0s6T2Nje4+AOhB2GvYFbgN\n6A0MAFYC1yfZBxERic82TfEm7v6pmT0HDHX3bwYJMxsLPBbdXQHsmPayHoQ9jBXRdnr7iurv8e1v\nf9vXrl1LZWUlADvvvDNt27Zl7ty5AAwYMABA93Vf93W/6O937twZ4JvvS3evqWRQM3dP5AbsAJRF\n262AF4AjgS5pz7kIuC/a7gfMJdQ/egNvs6lQPxPYh1ALmUwYfKq/nydp1KhRyld+Uebnc9+VX7fo\nuzPj7/Yk9zS6AveYWQlhGux+d59sZuPNbAChKP4OcE70jb/AzCYCC4CvgZHRLwQwEvgbYfCZ7DUc\nOZUaOZPyxRdfKF/5RZmfz31XfvwSGzTcfR6wZw3tp2/lNaOB0TW0zwZ2j7WDIiJSbwVzRnhqbi4p\nQ4cOVb7yizI/n/uu/PglenJfUzIzL5TfRUSkqZhZvQrhBbOnkTo6IClVVVXKV35R5udz35Ufv4IZ\nNEREJHmanhIRKWJFOz0lIiLJK5hBQzUN5Ss//7KVn/38+iqYQUNERJKnmoaISBFTTUNERBJTMIOG\nahrKV37+ZSs/+/n1VTCDhoiIJE81DRGRIqaahoiIJKZgBg3VNJSv/PzLVn728+urYAYNERFJnmoa\nIiJFTDUNERFJTMEMGqppKF/5+Zet/Ozn11fBDBoiIpI81TRERIpYUdc07rwz2z0QESlsBTNoDBgw\ngF/+El5+OZn8fJ+3VL7yczFb+dnPr6/EBg0zKzWzmWY218zeNLMrovYOZvaUmS0ysyfNrCztNZea\n2WIzW2hmh6e1DzKzedFjN9b2nuvXwwknwIoVSf1WIiLFLdGahpm1dvfPzGwbYAZwIXAC8JG7X2dm\nlwDt3b3czPoB9wF7Ad2Bp4E+7u5mVgGc7+4VZjYZGOPuU6u9lx9yiPPcc7D33vD881BamtivJiJS\nEHKqpuHun0WbLYDmgAPHAPdE7fcAw6LtY4EJ7r7e3ZcCS4B9zKwr0NbdK6LnjU97zWbuvx922gkq\nKuC880B1cRGReCU6aJhZMzObC1QCT0Zf/J3dvTJ6SiXQOdruBixPe/lywh5H9fYVUftmBgwYQMeO\n8PDD0KoV/O1vcPPN8f0u+T5vqXzl52K28rOfX19J72lsdPcBQA/CXsNu1R53wt5HbAYOhHHjwvZF\nF8H06XGmi4gUt22a4k3c/VMzew74IVBpZl3cfVU09fRB9LQVwI5pL+tB2MNYEW2nt29R6l6zZg3l\n5eWURoWMs88ezKuvDuHEE8uYNQu22y6M1mVloe6eGr0zvZ9qa+jrla/8fM0vKyuLvb/Kz17+9OnT\nmTo1lIRLG1D4TawQbmY7AF+7e5WZtQKmAdcABwMfu/u1ZlYOl
FUrhO/NpkL4t6NC+EzgV0AF8AS1\nFMLTf5cNG+Coo2DaNBgwAF56CVq3TuRXFRHJW7lUCO8KPGtmrxO+7J9098mEgeMHZrYI+H50H3df\nAEwEFgBTgJFpo8BIYCywGFhSfcCALdeeKimBCRNg551h7lw466zGFcbzfd5S+crPxWzlZz+/vhKb\nnnL3ecCeNbR/AhxWy2tGA6NraJ8N7F7fPrRvD48+CvvuGwaQgQPhN7+pb4qIiKQUxdpTDz8Mxx8P\nzZrB5Mnwwx82cedERHJULk1P5YzjjoPLL4eNG+Hkk2HJkmz3SEQkPxXMoFHX9TRGjYJjjoGqKhg2\nDNasqV9+vs9bKl/5uZit/Ozn11fBDBp1adYM/v536NsX5s+HM84Iex4iIpK5oqhppFu0KKxN9emn\ncOWVcNllTdA5EZEcVd+aRtENGhCK4UcfHQ7BffTRMG0lIlKMirYQXp9rhB95JIyODuw99VR46626\nX5Pv85bKV34uZis/+/n1VTCDRn1dcgmcdFIoiB97bCiQi4jI1hXl9FTKunWw//7wxhth72PSpHAm\nuYhIsSja6amGaNMGHnkEtt8+1DkuvzzbPRIRyW0FM2jUp6aRrnfvcPGmkpJQ53jggZqfl+/zlspX\nfi5mKz/7+fVVMINGYxx6KPzP/4TtM88M01UiIrKloq5ppHMPJ/z9/e9h7+PVV8O0lYhIIdN5Go3w\n+efwve/BrFlh72PqVNimSS5TJSKSHUVbCG9oTSNdq1bw0EPQqRM880w4LDcl3+ctla/8XMxWfvbz\n66tgBo247LgjPPhg2MO44YYwXSUiIoGmp2px++1w3nnQsiXMmAGDB8cWLSKSM1TTiNE558Add0CP\nHqHO0blzrPEiIlmnmkaMbropnDG+fDlcdFFVRmtUNVS+z4sqv3Dz87nvyo9fwQwaSWjRItQ3uncP\nixruumu48t/8+dnumYhIdmh6KgMrVsAf/wjjxsFXX4EZDB8ersWx++6JvKWISJNQTSNBy5bBtdfC\nnXeGwQPghBPCmlV77JHoW4uIJEI1jYRUVVWx445w883wr3/BBReEI6sefBD694fjj4e5cxuXnyTl\nKz8Xs5Wf/fz6SmzQMLMdzew5M5tvZm+a2a+i9ivMbLmZzYluR6S95lIzW2xmC83s8LT2QWY2L3rs\nxqT6nKnu3WHMmDB4XHghlJbCww/DwIEwbBi89lq2eygikozEpqfMrAvQxd3nmtm2wGxgGHASsMbd\nb6j2/H7AfcBeQHfgaaCPu7uZVQDnu3uFmU0Gxrj71GqvT3x6qjYrV8Kf/hTO7fj889D2ox/BqFEw\naFBWuiQikpGcmZ5y91XuPjfaXgu8RRgMAGrq4LHABHdf7+5LgSXAPmbWFWjr7hXR88YTBp+c0bVr\nOHv8X/+Ciy8Oy5E89lg4IfDoo8PihyIihaBJahpm1gsYCLwSNV1gZq+b2TgzK4vaugHL0162nDDI\nVG9fwabB5xtNUdOoS5cuYYn1pUvhN7+B1q3hiSdg773DlQFnzmxcfmMoX/m5mK387OfXV+JruEZT\nU/8LXOjua83sNuDK6OGrgOuBnzf2fdq1a0d5eTmlpaUADB48mCFDhlBWFsak1Aff0Ptr167N+Pmd\nOsFvf1vFyJFw++1l3HwzrFxZxbnnQufOZYwaBX37Njw/6f4rX/m6X7j3p0+fztSpYXY/9X1ZH4ke\ncmtmzYHHgSnu/pcaHu8FPObuu5tZOYC7XxM9NhUYBbwLPOfufaP2EcBB7n5utays1TTq8tFHYfrq\nppsg+rfND34Qah4HHJDdvolIccuZmoaZGTAOWJA+YEQ1ipTjgHnR9iTgZDNrYWa9gT5AhbuvAv5t\nZvtEmacBjyTV7yTssEO4lOzSpfC730HbtvDUUzBkCBx2GLz4YrZ7KCKSmSRrGgcApwKHVDu89loz\ne8PMXgcOAi4CcPcFwERgA
TAFGJm26zASGAssBpZUP3IKcqOmUZftt4c//CEMHpddBu3ahet2fO97\nYW2rJHeU8n3eVfnZy8/nvis/fonVNNx9BjUPSlO28prRwOga2mcDBbNgR4cOcOWVcNFFcOONoXg+\nfTq88grst1+2eyciUjstI5ID/vM/4frrw1nmY8ZkuzciUky09lQemjUL9torXK9j+XJdl1xEmk7O\nFMKbWj7UNGozaBAcdVQVlZVhmioJ+T7vqvzs5edz35Ufv4IZNPKZGXz/+2F7woTs9kVEZGs0PZUj\n3noL+vWDsjJYtSqsoCsikrSinZ7Kd337hiXWq6pg2rRs90ZEpGYFM2jkc00jlT9iRNhOYooq3+dd\nlZ+9/Hzuu/LjVzCDRiH48Y/Dz0mTYN267PZFRKQmqmnkmP33h3/+E+67j2/2PEREkqKaRp5LDRT/\n+Ed2+yEiUpOCGTQKoaYBcNJJ0KwZTJkCq1fHn58U5Rdufj73XfnxK5hBo1B07hzO2Vi/Hh56KNu9\nERHZnGoaOWjcODjrLDj0UHj66Wz3RkQKmdaeKgCrV4c9jg0bYMWKcBlZEZEkFG0hvFBqGgDt28MR\nR8DGjfDAA/HnJ0H5hZufz31XfvwKZtAoNEme6Cci0lCanspR69ZBp07w2WfwzjvQq1e2eyQihaho\np6cKTZs2cMwxYVvnbIhIriiYQaOQahopcU5R5fu8q/Kzl5/PfVd+/Apm0ChEP/xhWCr9jTdgwYJs\n90ZERDWNnHfWWeG8jcsugyuvzHZvRKTQqKZRYNKnqApwTBSRPFMwg0Yh1jQADj44nOi3ZAnMnh1/\nflyUX7j5+dx35ccvsUHDzHY0s+fMbL6ZvWlmv4raO5jZU2a2yMyeNLOytNdcamaLzWyhmR2e1j7I\nzOZFj92YVJ9zUUlJWMQQdM6GiGRfYjUNM+sCdHH3uWa2LTAbGAb8FPjI3a8zs0uA9u5ebmb9gPuA\nvYDuwNNAH3d3M6sAznf3CjObDIxx96nV3q8gaxoQrq+x//7QvTu8915YBVdEJA45U9Nw91XuPjfa\nXgu8RRgMjgHuiZ52D2EgATgWmODu6919KbAE2MfMugJt3b0iet74tNcUhX33DSf3rVgBM2Zkuzci\nUsya5G9WM+sFDARmAp3dvTJ6qBLoHG13A5anvWw5YZCp3r4iat9ModY0AMzg5JPDdkOnqPJ93lX5\n2cvP574rP37bJP0G0dTUg8CF7r7GbNNeUDT1FMucUrt27SgvL6e0tBSAwYMHM2TIEMrKQskk9cE3\n9P7atWsb9frG5g8fXsXUqfDAA2WMGQPr1uVX/5Wf3/m6Xzj3p0+fztSpYXY/9X1ZH/WqaZhZB6CH\nu7+R4fObA48DU9z9L1HbQuBgd18VTT095+7fNbNyAHe/JnreVGAU8G70nL5R+wjgIHc/t9p7FWxN\nA8LhtrvuCm+9BZMnh1VwRUQaK/aahpk9b2btogFjNjDWzP6cwesMGAcsSA0YkUnAGdH2GcAjae0n\nm1kLM+sN9AEq3H0V8G8z2yfKPC3tNUXDTCvfikj2ZVLT2M7d/w0cD4x3972BwzJ43QHAqcAhZjYn\nug0FrgF+YGaLgO9H93H3BcBEYAEwBRiZtuswEhgLLAaWVD9yCgq7ppGSGjQeeQQ+/zz+/MZQfuHm\n53PflR+/TGoaJdE00knA76O2OueB3H0GtQ9KNQ467j4aGF1D+2xg9wz6WtC+/W0YPBhmzQpTVCec\nkO0eiUixqbOmYWYnApcBL7n7eWa2M3Cdu+fUV1ah1zRSbrgBLr44DBj/+7/Z7o2I5LvYrxFuZkOi\nvYattmVbsQway5dDz57QogV88AG0a5ftHolIPkvi5L6bamgbk3mXmkYx1DQAevSAAw+EL78MtY24\n8xtK+YWbn899V378aq1pmNl+wP5ARzP7NZAaidoCJU3QN6nFiBHwwgvhin6nn57t3ohIMal
1esrM\nDgIOAc4Bbk97aA3wmLsvTr57mSuW6SmAjz6Crl3D9sqVsMMO2e2PiOSvJGoavaK1oHJaMQ0aAEce\nCVOmwG23wbnn1v18EZGaJFHTaGlmd0bLmT8X3Z5tRB8TUSw1jZT6rkWVa/1Xfv7k53PflR+/TM7T\neAC4jXBy3YaorXj+pM9Rw4ZBaSm8+GI4oqpHj2z3SESKQSbTU7PdfVAT9afBim16CmD4cHjwQbj+\nevj1r7PdGxHJR0lMTz1mZr80s67RVfc6ROtQSZZpLSoRaWqZDBpnAv8JvExYsDB1yynFVtOAUAxv\n2zYsK7K4jmPZcrH/ys+P/Hzuu/LjV+eg4e693L139VtTdE62rlWrUNuAcM6GiEjSMqlpnEENhW93\nH59UpxqiGGsaEA67PfJI6NsX5s8PS6iLiGQqifM0bmbToNGKsJz5a+4+vMG9TECxDhrr14cT/T7+\nGObOhf79s90jEcknsRfC3f18d78gup0F7ElYSiSnFGNNA6B5czjxxLC9tSmqXO2/8nM/P5/7rvz4\nZVIIr+4zQDWNHJI6iuof/wiXhRURSUom01OPpd1tBvQDJrr7JUl2rL6KdXoKYOPGsFz6ihXw8suw\n337Z7pGI5Iv6Tk9lckb49dFPB74G3nP3ZQ3pnCSjWTP48Y/DBZomTNCgISLJyaSmMR1YCLQD2gNf\nJtynBinWmkZKaopq4kT4+uv48+ui/MLNz+e+Kz9+dQ4aZnYSMBM4kXCd8IroErCSQwYNCtcQr6yE\n55/Pdm9EpFBlUtN4AzjM3T+I7ncEnnH3PZqgfxkr5ppGyuWXw1VXwc9/DmPHZrs3IpIPklh7yoAP\n0+5/zKar+EkOSS2X/uCD4XKwIiJxy2TQmApMM7MzzeynwGRgSrLdqr9ir2kA9OsHe+wBVVUwbVr8\n+Vuj/MLNz+e+Kz9+tQ4aZtbHzIa4+2+AvwJ7ALsTFi68I5NwM7vLzCrNbF5a2xVmttzM5kS3I9Ie\nu9TMFpvZQjM7PK19kJnNix67sQG/Z9HQyrcikqStXSP8CeBSd3+jWvsewB/d/Ud1hpsdCKwFxrv7\n7lHbKGCNu99Q7bn9gPuAvYDuwNNAH3d3M6sAznf3CjObDIxx96nVXl/0NQ2ApUuhd29o3Ro++ADa\ntMl2j0Qkl8VZ0+hcfcAAiNoyOiPc3V8EVtfwUE0dPBaY4O7ro2uSLwH2MbOuQFt3r4ieNx4Ylsn7\nF6NevcJ5Gp99Bo89VufTRUTqZWuDRtlWHitt5PteYGavm9k4M0u9TzdgedpzlhP2OKq3r4jaN6Oa\nxiY1TVHlU/+Vn1v5+dx35cdva2eEzzKzX7j7ZvULMzubxl2E6Tbgymj7KsIZ5z9vRB4A7dq1o7y8\nnNLSMJ4NHjyYIUOGUFYWxqTUB9/Q+2vXrm3U65sy/8QT4e67q1ixAlavLqN9+/zqv/JzL1/3C+f+\n9OnTmTo1zO6nvi/rY2s1jS7Aw8BXbBokBgEtgePcfWVGb2DWC3gsVdOo7TEzKwdw92uix6YCo4B3\ngefcvW/UPgI4yN3PrZalmkaaww6DZ54J52v8vNFDsogUqthqGu6+Ctgf+G9gKfAO8N/uvm+mA0Yt\nHeyadvc4IHVk1STgZDNrYWa9gT5ARdSPf5vZPmZmwGnAIw19/2KRvvKtiEhctnqehgfPuvsYd7/J\n3Z+tT7iZTSAcovsdM1tmZj8DrjWzN8zsdeAg4KLovRYAE4EFhPNARqbtOowExgKLgSXVj5wC1TSq\nO/74cK2NZ5+FVavyr//Kz538fO678uOXySq3DebuI2povmsrzx8NjK6hfTbhHBHJUPv2cMQRMGkS\nPPAAnHZatnskIoWgzrWn8oVqGluaMAFOOSUcgvvyy9nujYjkotivEZ4vNGhsad066NQpnLPxzjvh\nHA4RkXRJLFiYF1TT2FKbNnDMMWF78uT867/ycyM/n/u
u/PgVzKAhNUsdRTVtmla+FZHG0/RUgfvy\nS+jTB5YtCwPIvfeGy8OKiEART09JzVq2hEcfhW23DYXx3/8+2z0SkXxWMIOGahq1GzgQHnywipIS\nuPpquCOjhe3rJ58/H+VnL1v52c+vr4IZNGTr9t4bbr89bI8cCZMnZ7c/IpKfVNMoMpddBn/4Qziy\n6oUXYM89s90jEckmnachW+UOp58eCuJdusArr8BOO2W7VyKSLUVbCFdNI7N8Mxg3Dg45JKxJdeSR\nEMdbF8rno/ymzVZ+9vPrq2AGDclcixbw0EPQrx8sWADHHadzOEQkM5qeKmLvvQf77gsrV8Kpp8L4\n8WFPRESKR9FOT0n99ewJTzwRiuL33guXX57tHolIriuYQUM1jYblDxwYlk4vKQlHVY0dG29+XJSf\nvfx87rvy41cwg4Y03BFHwK23hu1zz4WpW1ziSkQkUE1DvvG738Ho0WHJkRdfhIR33kQkB+g8DWkw\n91AQv+8+6No1nMPRs2e2eyUiSSraQrhqGo3PN4O77oKDDw5HVNXnHI5c6L/y8y9b+dnPr6+CGTQk\nHi1bhnM4+vaF+fPhhBPgq6+y3SsRyRWanpIavftuOIdj1So47TS45x6dwyFSiIp2ekritdNO8Pjj\n4RyOv/8drrgi2z0SkVxQMIOGahrx5w8aBBMnhiv9XXllqHfEmV8fys9efj73XfnxS3TQMLO7zKzS\nzOaltXUws6fMbJGZPWlmZWmPXWpmi81soZkdntY+yMzmRY/dmGSfZXNHHrnpHI5f/CJca1xEilei\nNQ0zOxBYC4x3992jtuuAj9z9OjO7BGjv7uVm1g+4D9gL6A48DfRxdzezCuB8d68ws8nAGHefWu29\nVNNI0KWXwjXXhHM4ZsyA/v2z3SMRiUNO1TTc/UVgdbXmY4B7ou17gGHR9rHABHdf7+5LgSXAPmbW\nFWjr7hXR88anvUaayB//CCNGwNq1cNRRsHx5tnskItmQjZpGZ3evjLYrgc7Rdjcg/atoOWGPo3r7\niqh9M6ppJJvfrBncfTccdBCsWBGmrT79NL78uig/e/n53Hflx2+bbL55NPUUy5xSu3btKC8vp7S0\nFIDBgwczZMgQyspCyST1wTf0/tq1axv1+kLJf/jhMg44AEpKqrj4Yrj11jJatMif/itf94v9/vTp\n05kaLTCX+r6sj8TP0zCzXsBjaTWNhcDB7r4qmnp6zt2/a2blAO5+TfS8qcAo4N3oOX2j9hHAQe5+\nbrX3UU2jiSxdGs7hqKyEM84IeyA6h0MkP+VUTaMWk4Azou0zgEfS2k82sxZm1hvoA1S4+yrg32a2\nj5kZcFraayQLevUK53C0bh1O+rvyymz3SESaStKH3E4AXga+Y2bLzOynwDXAD8xsEfD96D7uvgCY\nCCwApgAj03YdRgJjgcXAkupHToFqGk2dP3gw3H9/qHVccQVceWUVt9wSrs3xwguwcCGsXh0WQYxD\nvn0+hZSfz31XfvwSrWm4+4haHjqsluePBkbX0D4b2D3GrkkMjj4abrkFzjsPHn4Y5s7d8jnNm0On\nTuHWufPWf3bsGJ4vIrlLa09Jo02ZEpZR/+CDUOdI/ayshDVr6pfVoUPNg8oRR4Qz1EUkXrqehuSU\nzz8Pg0j1AaWmnx99BBs31p41fHg4X2SXXZqu/yKFrmgHjYEDB/qcOXMSy6+qqvrm8DXlJ5O/YQN8\n/PGWg8miRTB7dhUVFWWUlMDPfw6jRkG3bjF1nvz4fLKVn899V37d6jtoZPU8DZF0JSWb6h/VLV4M\nf/pTWDTxjjvCyrv/8R/wX/8FCf57EpFqCmZPQ9NTxWHhwnAt84ceCvfbt4ff/hbOPx8acJ6SSNEr\n2ukpDRrFZeZMKC+H6dPD/R49wvkip58e9lhEJDP5cHJfInSeRnHl77MPPPtsOHKrf/+wgOLPfgZ7\n7AGPPlr/80MK7fP
Jl2zlZz+/vgpm0JDiYwZDh8Jrr8G990Lv3rBgAQwbBkOGwIsvZruHIoVH01NS\nML76Cv76V7jqKvjww9B21FFw9dWwu04NFamRahpS9NasgeuvD7e1a8MeyWmnhZrHTjtlu3ciuUU1\njYTk+7xlMeW3bRvWw3r7bbjgAthmGxg/PpwUeNFF4STCxuQ3RD7n53PflR+/ghk0RKrr1AnGjAmH\n6Z5ySpi++stf4FvfClNY0SUoRKQeND0lRWPu3HCt8+j6M3TuDJdfDmefrYUSpXippiFSh+nT4ZJL\noCK66vzOO4czy0eMCFNbIsVENY2E5Pu8pfI3OfjgsCrvgw+GOsfbb8Ntt1XRtWtY1+qVV+K7DkhK\nPn0+TZmt/Ozn11fBDBoi9WEGxx8P8+eHczz22APWrQtrW+23XzhE9y9/CQsoisgmmp4SiSxaBOPG\nwd/+FlbXBWjRIgwuZ50FhxwSrlQoUkhU0xBppK++CtdAHzs2FM1T/1t961th+urMM+Ndll0km1TT\nSEi+z1sqP/P81N7F5MmwdGk452PHHeFf/wor7PbsCcceC489Bl9/Xf/8JKimofymUjCDhkgSevYM\nF3x6552wOOIJJ4R6yKRJcMwx4Qzz3/8+DCgixUDTUyL1VFkZzjAfOzbUQVIOPTSc8zFsGLRsmb3+\nidSHahoiTcQ9rKQ7diw88AB88UVo3377sNbVWWfBrrtmt48idVFNIyH5Pm+p/PjzzeB73wt7HStX\nws03h2t7fPxxOFx3t91g//3DYbxLl+Ze/3MhW/nZz6+vrA0aZrbUzN4wszlmVhG1dTCzp8xskZk9\naWZlac+/1MwWm9lCMzs8W/0WqUlZGfzylzBnDsyaBeecE84u/+c/wxFXxx0H/fqF9nvvhffey3aP\nRRoma9NTZvYOMMjdP0lruw74yN2vM7NLgPbuXm5m/YD7gL2A7sDTwC7uvjHttZqekpyybl2Ytrr3\nXnj5Zfj8880f79kz7KkceGD4+Z3vhL0XkaaUNzWNaNAY7O4fp7UtBA5y90oz6wJMd/fvmtmlwEZ3\nvzZ63lTgCnd/Je21GjQkZ331FcyeHWogL7wAM2bAp59u/pyOHcMVB1MDSf/+YVl3kSTlU03DgafN\nbJaZnR21dXb3ymi7EugcbXcDlqe9djlhj+MbqmkoP5fzP/usiv32CwsjPv54qHvMnQs33QQnnghd\nuoSrDT78cLjmx+DB0KFDuJzt6NFhsEkV2pu6//n+2Ss/Xtn8O+YAd19pZh2Bp6K9jG+4u5vZ1nYd\nNnusXbuNwHHIAAAOiElEQVR2lJeXU1paCsDgwYMZMmQIZWWhLJL64Bt6f2108YW48pRf3Plr1lSx\n007Qv38Z558Pq1dX8f77MHNmGS++CJWVVaxcCdOmlTFtGgwYUEXz5lBaWsaBB8LBB1ex667QrVvT\n9F/3C+f+9OnTmRpdHyD1fVkfOXHIrZmNAtYCZwMHu/sqM+sKPBdNT5UDuPs10fOnAqPcfWZahqan\npKCsWBH2MFK3efM2f7xZMxg4MExn/eQnMGhQdvop+S0vahpm1hoocfc1ZtYGeBL4b+Aw4GN3vzYa\nKMqqFcL3ZlMh/Nvpo4QGDSl0n3wCL70UaiIvvhhqJOnLmOy3H5x/PgwfHpZCEclEvgwavYGHo7vb\nAP/P3a82sw7ARKAnsBQ4yd2rotf8FvgZ8DVwobtPS88cOHCgz5kzJ7E+V1VVfbOrp3zl50L+unXh\n2h+PPw6vvlrFSy+F/M6dw6G955wTz8KK+fjZKD9z9R00slLTcPd3gAE1tH9C2Nuo6TWjgdEJd00k\nb7RpE5YuOfTQcHLhpEmhsD5/Plx5ZSigDx8e9j7231+H80o8cqKmEQdNT4mEpU2efz4MHo88Ahuj\nM5kGDgyDx4gR0KpVdvsouSUvpqeSoEFDZHPvvQe33w533gkffRTatt8+rIl13nlhh
V6RfDpPI1Y6\nT0P5yt9cz55himrZsnA1wkGDwvkh114bLih13HHwzDN1Xw+9ED8b5TdcwQwaIlKz0lI44wx49dWw\nFtYpp0BJSZi+OuywsBLvrbdCdKqHyFZpekqkCK1aBXfcEaavVq4Mbe3awU9/CiNHwi67ZLd/0nRU\n0xCRjK1fDw89FJZ1nzFjU/vQoaFwfsQR4SRCKVyqaSQk3+ctla/8mjRvDj/+cThZ8LXXwjLupaUw\ndSocfXTY47jjjipWr465w2ly9bMplvz6KphBQ0QaZ+DAcBXC5cvhuuugVy94+2247Tbo3j0cdZXg\n+bOSJzQ9JSI12rABnngiTF099dSm9n33DRecGj487JVIflNNQ0Rit2hR2OO4++5N1wHZYYew93HO\nOWGvRPKTahoJyfd5S+UrvzHZu+wCf/5zWHn3zjthwIBwwuA114RzPn70o1AH2bix7rya8pOk/HgV\nzKAhIslr0ybsXbz2WriE7amnhmL644+HI6122QWuvz6syCuFSdNTItIoH3wAd90Vpq/eey+0lZaG\nda5GjgxXIZTcpZqGiGRFqnB+660wLe3CBXvvHQrnJ52kwnkuUk0jIfk+b6l85SedXVICxxwTahuL\nFsGvfw1lZVBREZYx6dEDLrkE3nmnYfkNpfx4FcygISK5o0+fUNtYsQLGjYM99wyLJV53Hey8Mxx1\nFEye3LDCuWSXpqdEJHHuYY/jllvg/vvhq69Ce+/ecPbZ4dyP3XaDjh2z289ipJqGiOS0Dz8MhfPb\nb4elSzd/rFOnsOrubruF2667hluCVzstekU7aOga4cpXfjL5SWVv2ABTpsArr1TxzDNlzJ8Pa9bU\n/NwePTYNIqkBpW/fcAhwXfL5s2+K/Ly4RriISElJWBRxyBD4wx/CFNayZfDmm+E2f374uWBBWA9r\n+fJQZE8xC9Nb6Xslu+0G3/kOtGyZvd+r0BXMnoamp0QK04YN4Yir6oPJwoXw9ddbPr+kJBTiU4PJ\nd78b9lS6dYOuXXXYb3VFOz2lQUOkuKxfD4sXbxpMUgPKkiVbPyqrQ4dNA0i3bjXfunSBFi2a7nfJ\npoIdNMxsKPAXoAQY6+7Xpj+umobylZ9fNY2k8j//POyFpPZI1q2rYtasMt5/P1ylcP36zHI6dqx5\nQEkfbDp3hrVr8+vzqa4gaxpmVgLcDBwGrABeNbNJ7v5W6jlraqugxWTGjBkcffTRyld+0eXnW99b\ntQrXBhk4MNx//PEZ3HRTyN+4MZwv8v77W7+tWhWO8vrwQ3j99drfywz695/Bxo1H06lTGES29rMh\ntZakP//6yotBA9gbWOLuSwHM7B/AscA3g8bbb7+daAdmzZqV6H845Ss/V/Pzue/V85s1C3sQHTtC\n//61v2bDhjBg1DW4fPABzJ07C8is/9ttV/fAkvrZrl0YlJL+fOorXwaN7sCytPvLgX2y1BcRKXAl\nJaGu0aVLOJu9NuvXh6VRTjsNKivDIJL+s3rbp5+G26JFdfehZcsweHz5ZVhFeJttworCqVv1+zW1\nZfKc+sqXQaPOwkvnzp0T7cAXX3yhfOUXZX4+9z3p/ObNoUWLL76ZCtuajRth9eotB5bafq5bFw5B\n7tz5C2bPTuxXqLe8KISb2b7AFe4+NLp/KbAxvRhuZrn/i4iI5KCCO3rKzLYB/g84FHgfqABGpBfC\nRUQkeXkxPeXuX5vZ+cA0wiG34zRgiIg0vbzY0xARkdyQF3sa1ZnZPoSaxqtmtiswFHjL3SfHlN8X\n6AbMdPe1ae1D3X1q7a/MPWZ2IOGQ5Xnu/mQMefsSPutPzaw1UA7sCcwHRrv7p43M/xXwsLsvq/PJ\nOcjMWgInAyvc/Wkz+wmwP7AAuMPdMzy1bKvvsTNwPNAD2EiYur3P3f/d2GyRuuTdRZjM7ArgRuB2\nM7sauAloDZSb2e9jyP8V8AhwATDfzIalPXx1Y
/Nrec/xMWZVpG2fTfh8tgVGRQcQNNZdwLpo+0ag\nHXAN8Dlwdwz5VwEVZjbDzEaaWZNdYcHMfhpDzN3AkcCFZvZ3YDjwCmHgHtvYcDO7ELgdaBlltgR6\nAjPN7JDG5ktuMbNO2e7DFtw9r27Am4Q9pNbAGmC7qL0V4a/pOPK3jbZ7AbOA/4juz4kh/zFgUvQz\ndVuXao8hf07a9iygY7TdBngzhvy30rZfq/bY63H0n/DHzOGEAepDYCpwBtA24f+3lsWQMS/6uQ3w\nAbBNdN9i/P+zJNpuDTwfbfcE5jYyu4zwB8BCYDXwSbR9DVCW8Gc/JYaM7aK+3gucUu2xW2PI7wrc\nBtwCbA9cAcwDJgJdY8jvUO22PbA0dT+G/KHV/luPi/p/H9A505x8nJ762t2/Br42s7c9mg5x98/N\nLI6LR5pHU1LuvtTMDgYeNLOdCP/wG6sHYapiLGFqwYDBwP/EkA1QYmYdotwSd/8QwN3XmVkNa4LW\n23wz+5m73wW8bmZ7eZgm3AX4KoZ83H0j8CTwpJm1AI4ARgDXAzs0JtvM5m3l4Tj+qrNoiqo14Q+Z\n7YCPgVLimQ52oDmwIcpsA+Du75lZA07V2sxE4BngYKDS3d3MuhIG7ImEgbzBzKy20+QMyOBMhzrd\nDSwCHgR+ZmYnAD9x9y+A/WLI/xvwOGHPfTrw/4CjCKtT3B79bIyPgHertXUHZhP+u3+rkflXE/4A\ng/BvaSXwI+A44K/AsFpet7kk/3pI4gbMBFpH282qjZyvxZD/HDCgWltzYDyhjtLY/BLg18DTwMCo\n7Z0YP5+lwDvR7V9EfwEBbWnkX6Jpn/M9UfZMYH30Xi8A/WPIr3VvDmgTQ34l4QuqVw2392PIvyj6\nbN4DLiR8CY8l7CFcEUP+hYS/DscSahk/i9o7AS80MntRQx6rR/6G6N9XTbfPY8h/vdr93wEvEf7Q\niGOWIH0v/r2tvXcD8y8mfKnvkdb2TmNza+n/60QHQtW3/7F0pilvQGkt7TsAu8eQvyPQpYZ2A4bE\n+Hv0AB4g7Oo2elokg/drDfSOMW87YABhL2mLz6sRud9J+HO4CziwlscmxPQe3YHu0XZ74ERg7xh/\nh90ItZLvxvzZPAX8F2lTFUAX4BLg6Rjy5wO71PJYHFODb5H2h2TUdmb0vu/GkP962vYfqz3W6KnH\nKGfH6Hvhz4R64Tsx/vddTviD9WLCH5fpg8YbmebokNssM7Ojgf3d/bfZ7osUt2hasxw4Bkity1NJ\nqMFd4+6fNDL/RMKX68IaHhvm7o80Mv9PwJPu/lS19qHATe7ep5H5VwHXufuaau19gKvdfXhj8qtl\nHgv8Fujl7rGskRQdRJT+hX+bu38QTUFe6+6nZ5SjQUNE6mJmP3X3OI6Oqy0/VSdT/qbM1sDO7j4v\nlz5/DRoiUiczW+buOypf+fl49JSIJKCOI8saPUWi/EblN/rIvrj6r0FDRFI6EVZXWF3DYy8rX/mg\nQUNENnmCcGLrnOoPmNnzylc+qKYhIiL1kHdrT4mISPZo0BARkYxp0BARkYxp0BCpgZn9zszeNLPX\nzWyOme2d4HtNN7NBSeWLxElHT4lUY2b7EVYvHeju66PlNVom+JbO5ss7iOQs7WmIbKkL8JFHV9lz\n90/cfaWZXWZmFWY2z8z+mnpytKdwg5m9amYLzGywmT1kZoui9Yows15mttDM7o2e84CZtar+xmZ2\nuJm9bGazzWyimbWJ2q8xs/nRns+fmuhzENmCBg2RLT0J7Ghm/2dmt5jZ96L2m919b3ffHWgVLTYJ\nYS/hS3ffi3BdhUeB8wir0Z5pZu2j5+0C3OLu/YB/AyPT39TMdiAs532ouw8iXEfh19GezjB339Xd\n+xOubiiSFRo0RKpx93XAIOAXhCsH3m9mZwDfN7NXzOwN4PtAv7SXTYp+vgnMd/dKd/+KcG2N1Jo+\ny9z9n9H2v
cCQtNcbsG+U+bKZzQFOJ1yR71PgCzMbZ2bHES6tK5IVqmmI1MDD1QOfB56P1uw5F9gd\nGOTuK8xsFOHKeSlfRj83pm2n7qf+naXXLYya6xhPufsp1RujQvyhhOtonB9tizQ57WmIVGNmu0TX\nSEgZSLhWtgMfm9m2hAsr1VdPM9s32j4FeDHtMQdeAQ4ws52jfrQxsz5RXaPM3acQLqLTvwHvLRIL\n7WmIbGlb4CYzKwO+BhYD5wBVhOmnVYRL3dZka0dC/R/wSzO7i3A1uds2e6H7R2Z2JjAhus44hBrH\nGuBRMysl7KFc1MDfS6TRtPaUSBMws17AY1ERXSRvaXpKpOnoLzTJe9rTEBGRjGlPQ0REMqZBQ0RE\nMqZBQ0REMqZBQ0REMqZBQ0REMqZBQ0REMvb/ASs+aPy3Ip5KAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "nltk.FreqDist(goldBugLowerCaseWordTokenLengths).plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That was easy, but not really what we want, since the word lengths on the bottom axis are ordered by frequency (3 is the most common word length, followed by 2, and then 4). The default behaviour of ordering by frequency was useful for words, but not as useful here if we want to order by word length.\n", "\n", "To accomplish what we want, we'll extract items from the frequency list, which provides a sorting by key (by word length), and then create a list from that.\n", "\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(1, 697),\n", " (2, 2627),\n", " (3, 3092),\n", " (4, 2441),\n", " (5, 1360),\n", " (6, 931),\n", " (7, 898),\n", " (8, 550),\n", " (9, 460),\n", " (10, 301),\n", " (11, 145),\n", " (12, 81),\n", " (13, 43),\n", " (14, 10),\n", " (15, 3)]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "goldBugLowerCaseWordTokenLengthFreqs = list(sorted(nltk.FreqDist(goldBugLowerCaseWordTokenLengths).items()))\n", "goldBugLowerCaseWordTokenLengthFreqs # sorted by word length (not frequency)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Formally, this is a list of tuples where each line represents an item in the list and within each line item there's a fixed-order tuple of two numbers, the first for the 
word length and the second for the frequency. Since lists don't have a built-in ```plot()``` function – unlike FreqDist that we used previously to plot high frequency words – we need to call the graphing library directly and plot the x (word lengths) and y (frequencies)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY0AAAEPCAYAAAC+35gCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xm8lHXd//HXm80NFXBhR1xwzRWVyu2UhNiCWmmamZUt\nZilpWWreiWmm3reWWuldaeGvtMzKqBsXLE6WRpSCrAqYqICgGQqoKMvn98f3OjIczjnMgZm5Zua8\nn4/HPM4137muaz4cPfOZ766IwMzMrBid8g7AzMxqh5OGmZkVzUnDzMyK5qRhZmZFc9IwM7OiOWmY\nmVnRypY0JG0p6e+SpkqaIWlMVj5G0gJJU7LH8QXXXCxprqQnJI0oKB8qaXr22g3litnMzNqmcs7T\nkLR1RLwmqQvwV2A0MBJYHhHXNzt3X+AO4DCgP/AgMCQiQtJk4IsRMVnSeODGiLivbIGbmVmLyto8\nFRGvZYfdgK5AU4ZSC6efANwZEasiYj4wDxgmqS+wbURMzs67HTixfFGbmVlrypo0JHWSNBVYAjxQ\n8MF/rqTHJd0qqUdW1g9YUHD5AlKNo3n5wqzczMwqrNw1jbURcRAwgFRr2A+4GdgVOAh4HriunDGY\nmVnpdKnEm0TEK5ImAiMj4q0kIenHwO+zpwuBgQWXDSDVMBZmx4XlC5u/hyQvomVmtgkioqUugxaV\nc/TUjk1NT5K2At4DzJbUp+C0k4Dp2fE44FRJ3STtCgwBJkfEYmCZpGGSBJwB3NPSe0ZE1T8uu+yy\n3GNwnI6zVmN0nKV/tFc5axp9gbGSOpOS0y8jYryk2yUdROoUfxr4HEBEzJJ0FzALWA2cE+v+RecA\nPwW2AsaHR06ZmeWibEkjIqYDh7RQ/vE2rrkKuKqF8keB/UsaoJmZtZtnhFdYQ0ND3iEUxXGWVi3E\nWQsxguPMW1kn91WSpKiXf4uZWaVIIqqhI9zMzOqPk4aZmRXNScPMzIrmpGFmZkVz0jAzs6I5aZiZ\nWdGcNMzMrGhOGmZmVjQnDTMzK5qThpmZFc1Jw8zMiuakYWZmRXPSMDOzojlpmJlZ0Zw0zMysaE4a\nZmZWNCcNMzMrmpOGmZkVzUnDzMyK5qRR4771Lfje9/KOwsw6CkVE3jGUhKSol39LsSZOhNNOg9Wr\nYd486NEj74jMrNZIIiJU7Pllq2lI2lLS3yVNlTRD0pisvJekCZLmSHpAUo+Cay6WNFfSE5JGFJQP\nlTQ9e+2GcsVcS5YuhTPPhLFj4f3vh+9+N++IzKwjKGtNQ9LWEfGapC7AX4HRwIeAf0fEtZK+BvSM\niIsk7QvcARwG9AceBIZEREiaDHwxIiZLGg/cGBH3NXuvDlPTiEg1jN694YYb4KmnYNgwmDsXevbM\nOzozqyVVU9MAiIjXssNuQFcggFHA2Kx8LHBidnwCcGdErIqI+cA8YJikvsC2ETE5O+/2gms6pJ/9\nDGbMgKuvTs933x1GjYLvfCffuMys/pU
1aUjqJGkqsAR4IPvg7x0RS7JTlgC9s+N+wIKCyxeQahzN\nyxdm5R3S/PlwwQXw85/DVlutK7/0Uvj+9+E//8ktNDPrALqU8+YRsRY4SNL2wG8lva3Z6yGpZG1K\nY8aMeeu4oaGBhoaGUt26KqxZA2ecARddBAceuP5ru+0GJ50E118PV16ZT3xmVv0aGxtpbGzc5Osr\nNnpK0n8BrwGfARoiYnHW9DQxIvaWdBFARFydnX8fcBnwTHbOPln5acAxEXF2s/vXfZ/Gt74Ff/oT\nTJgAnVqoI86fD0OHwpw5sMMOFQ/PzGpQ1fRpSNqxaWSUpK2A9wCzgXHAmdlpZwL3ZMfjgFMldZO0\nKzAEmBwRi4FlkoZJEnBGwTUdxj/+ATfemEZLtZQwAAYPhg99CK67rqKhmVkHUraahqT9SR3dnUnJ\n6ZcRcaWkXsBdwCBgPnBKRLycXXMJ8ClgNTA6Iu7PyocCPwW2AsZHxHktvF/d1jRefRUOOSQ1O518\nctvnPvNMOvfJJ2HHHSsTn5nVrvbWNDy5rwZ87nOwcmWqZRTj7LPTRL+m0VVmZq1x0qgz48bBl74E\nU6fCdtsVd82zz8LBB8MTT8BOO5U3PjOrbU4adWTx4vThf/fdcMQR7bv2nHOge3e49tryxGZm9cFJ\no05EwHvfC4ceCldc0f7rFyyAAw5ItY2ddy59fGZWH6pm9JRtnh/8AF56Cb7xjU27fsAA+OhHXdMw\ns9JyTaMKzZoFxxwDDz8Me+656fdZuBD23z/dr0+f0sVnZvXDNY0a98YbcPrpcNVVm5cwAPr3h499\nzLUNMysd1zSqzNe+luZY/Pa3oKJzf+sWLYK3vQ1mzoS+fTf/fmZWX9wRXsMaG1MtY+rU0g6V/dKX\n0k/vuWFmzTlp1KilS9MihD/8IYwcWdp7P/887LdfWk69X7/S3tvMapuTRg1q2lRp553T+lLl8OUv\nw6pV5bu/mdUmJ40a9LOfwbe/Df/85/p7ZJTSkiWw774wbVrqIDczAyeNvMNot/nz4bDD0nLnBx1U\n3ve68EJ4/XX43vfK+z5mVjucNGrImjXQ0JC2ar3wwvK/3wsvwD77pI72gQPL/35mVv08T6OGXHMN\ndO2a+hsqYeed4dOfTk1hZmabwjWNnPzzn2ltqUcfrey3/n//G/baC6ZMgUGDKve+ZladXNOoAa++\nmuZj3HRT5ZuJdtwRPvvZtHWsmVl7uaaRg7PPhtdeg9tvz+f9X3opLVHy6KNpi1gz67hc06hy48bB\n/ffnO4Jphx1S4nJtw8zayzWNCmraVOlXv4Ijj8w3lv/8B4YMSX0ru+6abyxmlh/XNKrYbbfBCSfk\nnzAAevVKu/tdeWXekZhZLXHSqKDp09u/bWs5XXAB/O538NRTeUdiZrXCSaOCZs5My5RXi5494Qtf\ncG3DzIrnPo0KWbUKttsu9SWUa32pTfHyy7DHHjBpUvppZh1L1fRpSBooaaKkmZJmSDovKx8jaYGk\nKdnj+IJrLpY0V9ITkkYUlA+VND177YZyxVxO8+alORnVlDAAevSAc8+FK67IOxIzqwVdynjvVcD5\nETFVUnfgUUkTgACuj4jrC0+WtC/wEWBfoD/woKQhWfXhZuCsiJgsabykkRFxXxljL7kZM9KeFtXo\nS19KtYw5czZ/i1kzq29lq2lExOKImJodrwBmk5IBQEtVoROAOyNiVUTMB+YBwyT1BbaNiMnZebcD\nJ5Yr7nKZMaO6+jMKbb89nHeeaxtmtnEV6QiXNBg4GJiUFZ0r6XFJt0rqkZX1AxYUXLaAlGSaly9k\nXfKpGTNnVm9NA2D0aLjvvrQ/uZlZa8rZPAVA1jR1NzA6IlZIuhn4ZvbyFcB1wFmleK8xY8a8ddzQ\n0EBDQ0MpblsSM2ZAQXhVZ7vtUjPVN78JP/953tGYWbk0NjbS2Ni4ydeXdfSUpK7AH4B7I+K7Lbw+\nGPh
9ROwv6SKAiLg6e+0+4DLgGWBiROyTlZ8GHBMRZze7V9WOnlq5Mg1vfeUV6NYt72hat2xZ6tv4\n85/TvhtmVv+qafSUgFuBWYUJI+ujaHISMD07HgecKqmbpF2BIcDkiFgMLJM0LLvnGcA95Yq7HJ58\nEnbbrboTBqTaxvnnp9qGmVlLytk8dQTwMWCapClZ2SXAaZIOIo2iehr4HEBEzJJ0FzALWA2cU1B1\nOAf4KbAVML7WRk5Ve39GoS9+MdU2ailmM6scT+6rgEsugS23hG98I+9IivNf/5X2Ev+f/8k7EjMr\nt6ppnrJ1au1b+0knpTWpqjQHm1mOnDQqoJrnaLTk4INTTcPDb82sOSeNMnv1VXj+edh997wjKZ4E\nH/hA2jDKzKyQk0aZzZ6dluboUvYZMaU1ahT8/vd5R2Fm1cZJo8xqrT+jybveBdOmwYsv5h2JmVUT\nJ40yq7X+jCZbbgnDh8P48XlHYmbVxEmjzGq1pgGpX8NNVGZWyPM0ymzQIGhsTDPCa82LL6aJfkuW\npJqHmdUfz9OoIsuWpZ36Bg/OO5JNs9NOsP/+KemZmYGTRlnNnJkW/utUw79lN1GZWaEa/jirfjNn\n1mYneKFRo9J8jSps+TOzHDhplFE1b/FarL33Tv0ZU6fmHYmZVQMnjTKq1eG2haR1tQ0zMyeNMqrl\n4baF3K9hZk2cNMrkpZfgtddgwIC8I9l8RxwBTz8NCxZs/Fwzq28bTRqSNhih31KZra+plqGiRz9X\nr65d4fjj4Q9/yDsSM8tbMTWNR4osswL10J9RyE1UZgZtbPea7eXdD9ha0iGASFu0bgdsXZnwale9\n9Gc0GTkSPvMZWLECunfPOxozy0tbC3aPAD4B9AeuKyhfTtrr29owYwZ88IN5R1E6228Pw4bBhAlp\nZz8z65g2uvaUpA9HxN0VimeTVdPaUxFpCY4ZM6BPn7yjKZ2bboIpU+C22/KOxMxKpb1rTxWTNLYE\nPgQMBjqTNVNFxDc3I86Sq6aksXhx6s948cX66AhvMn8+HH542omwc+e8ozGzUijHgoW/A0YBq4BX\ngRXZT2tFPY2cKjR4cKo5/f3veUdiZnkpZhPS/hFxXNkjqSP1NnKqUNPs8He+M+9IzCwPRQ25lXRA\ne28saaCkiZJmSpoh6bysvJekCZLmSHpAUo+Cay6WNFfSE5JGFJQPlTQ9e+2G9sZSafU2cqqQh96a\ndWzFJI2jgEezD/np2WNaEdetAs6PiP2AtwNfkLQPcBEwISL2BP6YPUfSvsBHgH2BkcAPpLcaeG4G\nzoqIIcAQSSPb8W+suHquaRx2WJrtPm9e3pGYWR6KSRrHA0NIQ3A/kD1GbeyiiFgcEVOz4xXAbNLw\n3VHA2Oy0scCJ2fEJwJ0RsSoi5gPzgGHZfJFtI2Jydt7tBddUnYj6rml06uTahllHVkzSWNvKo2iS\nBgMHA38HekfEkuylJUDv7LgfULi60QJSkmlevjArr0oLFsDWW8MOO+QdSfmMGuWkYdZRFdMRPp40\nExxgS2BX4EmgqO/SkroDvwZGR8RyFQwpioiQVLJxsmPGjHnruKGhgYaGhlLdumj1XMtocuyxcPrp\nsHQp9OyZdzRm1h6NjY00bsYezhtNGhGxXut8tqTIF4q5uaSupITx/yLinqx4iaQ+EbE4a3p6IStf\nCAwsuHwAqYaxMDsuLF/Y0vsVJo281HN/RpOtt4aGBrj3XvjoR/OOxszao/kX6ssvv7xd17d7afSI\neAwYtrHzsk7sW4FZEfHdgpfGAWdmx2cC9xSUnyqpm6RdSf0okyNiMbBM0rDsnmcUXFN16mG3vmJ4\nYyazjqmYGeFfLnjaCTgE6LWxuRuSjgQeAqaxrnnrYmAycBcwCJgPnBIRL2fXXAJ8ClhNas66Pysf\nCvwU2AoYHxHntfB+VTEj/LDD4MYb4R3vyDuS8nr+edh3X1iyBLp1y
zsaM9tU5VhGZAzrPvRXkz7o\nfx0RKzcxxrKohqSxdi1suy0sWpQW+Kt3w4bBVVelPg4zq03tTRrF9GmMyW68bfZ8+SZHV+fmz0+j\npjpCwoB1TVROGmYdRzE79+0vaQowE5gp6VFJdd7Vu2k6Qid4oaaht1XQKmhmFVJMR/gPgQsiYlBE\nDAK+nJVZMx1huG2ht70tNcnNnJl3JGZWKcUkja0jYmLTk4hoBLYpW0Q1rKPVNCSPojLraIpJGk9L\n+i9JgyXtKulS4F/lDqwWdbSaBnh2uFlHU8zoqV7A5cARWdFfgDERsbTMsbVL3qOnVq+G7bZLGy9t\n04HqYW++Cb17w+zZ9bVLoVlHUbLRU5K2Ii0U+AJwbkH5zkBVDbetBk89BX37dqyEAWmOxogR8H//\nB2edlXc0ZlZubTVP3UhaFr25I4DryxNO7epo/RmF3K9h1nG0lTSGRsSvmxdGxG+BY8oXUm3qiP0Z\nTY4/Hhob4fXX847EzMqtraSx9SZe1yF15JpGr15w8MHwxz/mHYmZlVtbH/4vSNpgYUJJh7NuZVrL\ndOSaBriJyqyjaHX0VJYc7iItFPgoIGAoaWXaUyNiUoViLEqeo6fefDMtHbJ0KWy5ZS4h5G7uXDjm\nmLQJVSfXQ81qRntHT7X6551trzosO+cTpGQh4PBqSxh5mzMHdtml4yYMgCFDUuJ89NG8IzGzcmpz\nwcJsW9ZvVCiWmtWR+zMKNTVRHXZY3pGYWbm4IaEEOsrGSxvj2eFm9c9JowRmznRNA+Dtb097iTzz\nTN6RmFm5OGmUgGsaSefO8N73urZhVs/aGj1V+KcfpE7wt55HxKhyBtZeeY2eev31NE9h2TLo2rXi\nb191fvMbuOUWeOCBvCMxs2KUbPQUcF32+BfwOmkPjR8BK/Aqt2+ZPRv22MMJo8mIETBpUkqiZlZ/\nWh09le2bgaTrImJowUvjJHlgZcb9Gevr3h2OOALuvx9OPjnvaMys1IrahEnS7k1PJO1G20uMdCju\nz9iQZ4eb1a9iksb5wERJf5b0Z2Ai8KXyhlU7XNPY0Ac+APfem/YYMbP60ubkPkmdgO2BPYG9s+In\nIsL7aWRc09jQgAFphvwjj8DRR+cdjZmVUps1jYhYC3w1IlZGxNTsUXTCkHSbpCWSpheUjZG0QNKU\n7HF8wWsXS5or6QlJIwrKh0qanr12Qzv/jWWzfDm88ALstlvekVSfD3zATVRm9aiY5qkJkr4iaaCk\nXk2PIu//E2Bks7IAro+Ig7PHvQCS9gU+AuybXfMDSU3DwG4GzoqIIcAQSc3vmYtZs2DvvdP8BFuf\n+zXM6lObzVOZU0kf9F8oKAtgo9+vI+Ivkga38FJLY4JPAO6MiFXAfEnzgGGSniFtOzs5O+924ETg\nviJiLyv3Z7Tu4IPhtdfgySdhr73yjsbMSmWjNY2IGBwRuzZ7bG6DzLmSHpd0q6QeWVk/YEHBOQuA\n/i2UL8zKc+eFClsnuYnKrB5ttKYhqRvweeBoUg3jz8AtWY1gU9wMfDM7voI0gfCsTbzXesaMGfPW\ncUNDAw0NDaW4batmzoRjjy3rW9S0UaPgqqvgwgvzjsTMmjQ2NtLY2LjJ17e6jMhbJ0i3kpLLWFKz\n0hnA6oj4dFFvkJqnfh8R+7f1mqSLACLi6uy1+4DLgGeAiRGxT1Z+GnBMRJzd7F4VX0akf/80QmiX\nXSr6tjVj5Uro3Rueegp23DHvaMysJaVcRqTJYRFxZkT8KSL+GBGfAA7fjAD7Fjw9CWgaWTUOOFVS\nN0m7AkOAyRGxGFgmaVjWMX4GcM+mvn+pLF2alsoYNCjvSKrXllvC8OEwfnzekZhZqRSTNFZL2qPp\nSTY7vKhpW5LuBB4B9pL0nKRPAddImibpceAY0uRBImIWaXvZWcC9wDkFVYdzgB8Dc4F5EVEVneD7\n7Zfa7q117tcwqy/FNE8dSxo6+
3RWNBj4ZET8qbyhtU+lm6duuQX+8Q+49daKvWVNevHFtBXskiWw\nxRZ5R2NmzbW3earVjnBJ5wMPkzq+hwBNAyfneEa4h9sWa6ed0u+psRGOOy7vaMxsc7XVPDUA+C7w\nIvAAab7GILxYIeDlQ9rDTVRm9aOY5qktgEOBdwDvzH6+3DSaqVpUunlq551h6lTo169ib1mzZs9O\ntYxnnnEfkFm1Kcfoqa2A7UgLF24PLAImbVp49eGFF2DVKujbd+PnWlpqZYstYMqUvCMxs83VVp/G\nj0jrQC0HJpNGQV0fEUsrFFvVaurP8Lfm4kjwhS+kx5//DN265R2RmW2qtmoag4AtgMWkpTsWAi9X\nIqhq5/6M9jvvvNSk99Wv5h2JmW2OtrZ7PS7bT2M/Uj/GBcD+kl4CJkXENyoUY9XxyKn269QJfvpT\nGDoUjjwSPvzhvCMys02x0f00ImI6abLdvaQhuHsAoysQW9VyTWPT9OwJv/oVfP7zMGdO3tGY2aZo\ndfSUpNGsGy21mtSn8XD2c0ZErKlUkMWo1OipCOjVK33o7bRT2d+uLv3v/8L3vw+TJsHWHsBtlqv2\njp5qK2l8B/gr8LeIWFSi+MqmUklj4cK0V8QLL5T9repWBHz842nzqp/8xAMKzPJUsiG3EXF+RPy6\nFhJGJbk/Y/NJ65Zhue22vKMxs/YoZuc+K+D+jNLYZhu4+244+ujUOX7QQXlHZGbFKGZynxVwTaN0\n9tkHbropjaR65ZW8ozGzYjhptJNrGqV16qkwciR88pOpr8PMqttG156qFZXoCF+7FrbfHp59Ng0f\ntdJ44w046qiUQC64IO9ozDqWki2Nbht69lnYbjsnjFLbYos0f+Pww9PjyCPzjsjMWuPmqXZwf0b5\n7LJLGkl12mkezmxWzZw02sH9GeX1vvel+Runnw5rqmrqqJk1cdJohxkzXNMot8svTwnjm9/MOxIz\na4mTRju4ear8unSBO+6AH/8Y7rsv72jMrDmPnirSmjWw7bapvb1797K9jWUeeghOOQUmT4ZBg/KO\nxqx+lWPnPgP+9S/o3dsJo1KOPjoNvz3lFHjzzbyjMbMmThpFcid45X3lKylRe+Mms+pR1qQh6TZJ\nSyRNLyjrJWmCpDmSHpDUo+C1iyXNlfSEpBEF5UMlTc9eu6GcMbfG/RmV17Rx07hxaR6HmeWv3DWN\nnwAjm5VdBEyIiD2BP2bPkbQv8BHSvuQjgR9Iby2afTNwVkQMAYZIan7PsnNNIx9NGzedcw48+WTe\n0ZhZWZNGRPwFWNqseBQwNjseC5yYHZ8A3BkRqyJiPjAPGCapL7BtREzOzru94JqKcU0jP0OHwpVX\npoUNX3st72jMOrY8+jR6R8SS7HgJ0Ds77gcsKDhvAdC/hfKFWXnFrFoF8+bB3ntX8l2t0Gc/m5ZP\nP+ccL2xolqdc156KiJBUso+AMWPGvHXc0NBAQ0NDSe47dy4MHAhbbVWS29kmaNq46fDD03IjZ52V\nd0RmtamxsZHGxsZNvj6PpLFEUp+IWJw1PTWtNLQQGFhw3gBSDWNhdlxYvrClGxcmjVJyf0Z18MZN\nZpuv+Rfqyy+/vF3X59E8NQ44Mzs+E7inoPxUSd0k7QoMASZHxGJgmaRhWcf4GQXXVIT7M6pH4cZN\nL7+cdzRmHU+5h9zeCTwC7CXpOUmfBK4G3iNpDvDu7DkRMQu4C5gF3AucUzDF+xzgx8BcYF5EVHSB\nCdc0qkvhxk2rV+cdjVnH4mVEirD33qlZxLWN6vHGG/De96blRnbcEfr1S4++fdcdFz7feWfo3Dnv\nqM2qT3uXEXHS2IiVK6FHD1i2DLp1K/ntbTOtXg1LlsDzz8OiResezZ8vXQo77dR6Umk63mknJxfr\nWLxzX4k9+STsvrsTRrXq0gX690+PtqxalZJL86QyadL6z5cvTyvsnn56ZeI3qzVOGhvh/oz60LU
r\nDBiQHm2ZOROOPTYtTHnCCZWJzayWOGlshDde6lj22w/+8IfUX9K9e0ogZraOV7ndiJkzXdPoaA49\nNA18OO00+Nvf8o7GrLo4aWyEaxod09FHw9ixcOKJ8PjjeUdjVj08eqoNr76ahnMuX546XK3j+dWv\nYPRomDgR9tor72jMSs+jp0po1qz0QeGE0XGdfHL60jBiRJoTsssueUdkli9/HLbB/RkG8KlPpcQx\nfDj85S/Qp0/eEZnlx0mjDe7PsCajR6cJniNGQGMj9OqVd0Rm+XBHeBtc07BCl16aksbxx6eah1lH\n5KTRBtc0rJAE//3fcOCBaeLfypV5R2RWeR491YpXXklLUyxbBp2cWq3AmjXwsY/BihXwm9+k2eZm\ntaq9o6f8cdiKmTPT3g1OGNZc585w++3p+OMfT0nErKPwR2IrXnoJjjgi7yisWnXtCnfdlRZB/Pzn\nvW+5dRxunjLbDE1DcY86KvV3qOhKvll1cPOUWQVtuy3cey888ABceWXe0ZiVn+dpmG2mXr1S0jjq\nKNhuuzSnw6xeOWmYlUCfPvDgg2mhw223TbPIzeqRk4ZZieyyS6pxvOtdKXGcfHLeEZmVnpOGWQnt\ntReMHw/HHZc2cTr++LwjMistd4SbldhBB8E996Q5HA89lHc0ZqXlpGFWBu94B9x5J3z4w/DPf+Yd\njVnp5JY0JM2XNE3SFEmTs7JekiZImiPpAUk9Cs6/WNJcSU9IGpFX3GbFGj4cfvQjeN/74Mtfhvvv\nh9dfzzsqs82T2+Q+SU8DQyPiPwVl1wL/johrJX0N6BkRF0naF7gDOAzoDzwI7BkRawuu9eQ+q0rT\npsG4camTfMqUVAsZMSI99t/fEwItX+2d3Jd30jg0Il4qKHsCOCYilkjqAzRGxN6SLgbWRsQ12Xn3\nAWMiYlLBtU4aVvWWLUtbxz7wQHqsWLEugQwfDr175x2hdTS1lDT+BbwCrAH+NyJ+JGlpRPTMXhfw\nn4joKekmYFJE/Dx77cfAvRHx64L7OWlYzfnXv9YlkIkTYfDglECOOy6tfbbFFnlHaPWulvYIPyIi\nnpe0EzAhq2W8JSJCUltZYIPXxowZ89ZxQ0MDDQ0NJQrVrDx22w3OPjs9Vq+GyZNTAvn619NKy0ce\nua4mss8+bsqyzdfY2EhjY+MmX18VCxZKugxYAXwGaIiIxZL6AhOz5qmLACLi6uz8+4DLIuLvBfdw\nTcPqytKl8Kc/pSRy//1pCfbCpqwddsg7QqsHNdE8JWlroHNELJe0DfAAcDkwHHgpIq7JEkWPZh3h\nh7OuI3yPwizhpGH1LALmzl3XlPXQQ3DiiXDxxWlCodmmqpWksSvw2+xpF+DnEfFtSb2Au4BBwHzg\nlIh4ObvmEuBTwGpgdETc3+yeThrWYbz8Mnzve3DjjfDud8Mll8ABB+QdldWimkga5eCkYR3RihVw\nyy1w3XUwbFjqCznssLyjslri/TTMOpDu3eErX0mjsIYPhw9+EEaOhL/+Ne/IrF65pmFWR958M+1f\n/u1vw8CBcOmlcOyxHnVlrXPzlJmxejX84hfwrW/B9tun5PG+9zl52IacNMzsLWvWwG9/u24r2ksv\nTU1YndwwbRknDTPbQAT84Q8peSxfnkZbnXoqdPGOOh2ek4aZtSoC/vhHuOIKWLAgzfP4+MehW7e8\nI7O8OGmYWVEeeij1ecyeDV/9Kpx1Fmy1Vd5RWaV5yK2ZFeXoo9PyJHffDRMmpHWwrr02rcRr1hon\nDbMO7vDaqEZcAAAKIUlEQVTD4Xe/Swlk6tSUPL7xDfj3v/OOzKqRk4aZAWkZkjvugEmTYPFi2HNP\nOP/81Pdh1sRJw8zWs8ce8MMfwvTpaWjuAQfApz+dFkw0c9Iwsxb175/WtJo7FwYMgHe+Mw3Tffzx\nvCOzPDlpmFmbdtgBxoxJ61sdeigcfzy8//3w8MN5R2Z58JB
bM2uXlSth7Fi45pq0vtUll6SNobxE\nSW3yPA0zq4jVq+Guu+Cqq9LkwEsugZNOgs6d847M2sNJw8wqau3atETJVVelzaEuughOPx26ds07\nMiuGk4aZ5SICGhtT8njySbjwwjTLfOut847M2uKkYWa5mzw57ekxcSIccsj6jyFD3IRVTZw0zKxq\nvPgiPPbY+o8XXoADD1w/keyzj5uz8uKkYWZVbenStFzJY4/BlCnp5zPPwH77wcEHr0sk++8PW26Z\nd7T1z0nDzGrOihUwbdr6NZI5c1JTVmGN5MAD077oVjpOGmZWF1auhBkz1k8kM2akmeq77AKDBqXH\nwIHrH7vjvX3qNmlIGgl8F+gM/Dgirmn2upOGWZ1btQqeegqefTY9nntu/ePnnoNttmk5mTQd9+3r\njvhCdZk0JHUGngSGAwuBfwCnRcTsgnNqImk0NjbS0NCQdxgb5ThLqxbirIUYoe04I1Lne/NkUnj8\n4ospcTRPKn37Qr9+6dGnz+bvZlgrv8/2Jo1a2SH4cGBeRMwHkPQL4ARgdlsXVaNa+R/JcZZWLcRZ\nCzFC23FKsPPO6TF0aMvXv/kmLFq0fiKZORMefDCVL1oES5bA9tuvSyL9+q2fVJqe9+nT+qivWvl9\ntletJI3+wHMFzxcAw3KKxcxqWLduMHhwerRm7dq0CVVTElm0CJ5/Pi0Xf//96XjRojR8uGfPDZNK\n374pEd1zT3q/rl03/rN5WbU2odVK0qj+diczqxudOq2rsRx0UOvnrVmTmruakkjT4/HHU9L46U9T\nzWbVqvb9fPPNdP/mieWTn0yTJvNUK30abwfGRMTI7PnFwNrCznBJ1f8PMTOrQvXYEd6F1BF+LLAI\nmEyzjnAzMyu/mmieiojVkr4I3E8acnurE4aZWeXVRE3DzMyqQ11s9ypppKQnJM2V9LW842mJpIGS\nJkqaKWmGpPPyjqk1kjpLmiLp93nH0hpJPSTdLWm2pFlZv1fVkXR+9t97uqQ7JG2Rd0wAkm6TtETS\n9IKyXpImSJoj6QFJPfKMMYuppTj/O/vv/rik30jaPs8Ys5g2iLPgtS9LWiupVx6xNYulxTglnZv9\nTmdIuqa166EOkkY28e97wEhgX+A0SfvkG1WLVgHnR8R+wNuBL1RpnACjgVlU96i1G4DxEbEPcABV\nOGdHUn/gXGBoROxPalo9Nd+o3vIT0t9MoYuACRGxJ/DH7HneWorzAWC/iDgQmANcXPGoNtRSnEga\nCLwHeKbiEbVsgzglvQsYBRwQEW8D/qetG9R80qBg4l9ErAKaJv5VlYhYHBFTs+MVpA+5fvlGtSFJ\nA4D3Aj8GqnLX5+yb5VERcRukPq+IeCXnsFrTBdg6G8yxNWlFg9xFxF+Apc2KRwFjs+OxwIkVDaoF\nLcUZERMiYm329O/AgIoH1kwrv0+A64GvVjicVrUS5+eBb2efn0TEi23dox6SRksT//rnFEtRJA0G\nDib9D19tvgNcCKzd2Ik52hV4UdJPJD0m6UeSqm6ZuohYCFwHPEsa9fdyRDyYb1Rt6h0RS7LjJUDv\nPIMp0qeA8XkH0RJJJwALImJa3rFsxBDgaEmTJDVKOrStk+shaVRzE8oGJHUH7gZGZzWOqiHp/cAL\nETGFKq1lZLoAhwA/iIhDgFepjqaU9UjqSfr2PphUq+wu6fRcgypStpBbVf9tSfo68GZE3JF3LM1l\nX2IuAS4rLM4pnI3pAvSMiLeTvjDe1dbJ9ZA0FgIDC54PJNU2qo6krsCvgZ9FxD15x9OCdwKjJD0N\n3Am8W9LtOcfUkgWkb3D/yJ7fTUoi1WY48HREvBQRq4HfkH7H1WqJpD4AkvoCL+QcT6skfYLUjFqt\nSXh30peFx7O/pwHAo5J2zjWqli0g/b9J9je1VtIOrZ1cD0njn8AQSYMldQM+AozLOaYNSBJwKzAr\nIr6bdzwtiYhLImJgROx
K6rD9U0R8PO+4mouIxcBzkvbMioYDM3MMqTXPAG+XtFX23384aYBBtRoH\nnJkdnwlU4xebpm0SLgROiIiVecfTkoiYHhG9I2LX7O9pAXBIRFRjIr4HeDdA9jfVLSJeau3kmk8a\n2Te4pol/s4BfVunEvyOAjwHvyoazTsn+569m1dw8cS7wc0mPk0ZPXZVzPBuIiMmkWtBjQFO79g/z\ni2gdSXcCjwB7SXpO0ieBq4H3SJpD+hC5Os8YocU4PwXcBHQHJmR/Rz/INUjWi3PPgt9noar4W2ol\nztuA3bJhuHcCbX5R9OQ+MzMrWs3XNMzMrHKcNMzMrGhOGmZmVjQnDTMzK5qThpmZFc1Jw8zMiuak\nYR2CpO9IGl3w/H5JPyp4fp2k8zfx3g0tLSPfWnmpSNpe0ucr9X5m4KRhHcdfyZbwkNQJ2IG0lH6T\ndwAPF3Oj7Ppq0BM4J+8grGOplv/5zcrtb6TEALAfMANYnm3mtAWwD/CYpGOzlXOnSbo1W5oGSfMl\nXS3pUeBkpY2/ZmfPT2pPIJJGSHpE0qOS7pK0TcF7jMnKp0naKyvfSWlzpBnZir7zs7WBrgZ2z2ZF\nX0uaddxd0q+y2H62+b82s/U5aViHEBGLgNXZpjjvICWRydnxoaRlPjqTNqk5JSIOIK3+2dT8E8C/\nI2Io8DvSciDvz573ochlIiTtCHwdODa79lHggoL3eDErvxn4SlZ+GfBgtkHO3cCg7NyvAU9FxMER\n8VXSKqoHkzbR2pe0NMQR7fpFmW2Ek4Z1JI+QmqjeSUoaf8uOm5qm9iKtSjsvO38scHTB9b/Mfu6d\nnfdU9vxnFL/s9dtJH+iPSJpCWudnUMHrv8l+PkZaJRXSumW/AIiI+1m3iU5L7zk5IhZlS5tPLbiH\nWUl0yTsAswp6mPQBvD8wnbR511eAV0iLtjUn1q9BvNrKfdu7T8KEiPhoK6+9kf1cw/p/n8W+xxsF\nx83vYbbZXNOwjuQR4P3AS5EsBXqQahqPkPabHixp9+z8M4A/t3CfJ7Lzdsuen9aOGP4OHNH0HpK2\nkTRkI9c8DJySnT+C1AEOsBzYth3vbbbZnDSsI5lBGjU1qaBsGmkb1v9kezN8EviVpGnAauCW7Ly3\nahzZeZ8F/i/rCF9Cy30aARybLUH9nKTngN2ATwB3Zsu6P0JqFmvp2qZ7Xg6MyJau/jCwGFie7Xnw\nsKTpkq5pdk3hfcxKxkujm1W5bATXmohYI+kdwPezbW7NKs7tnWbVbxBwVzY/5E3gMznHYx2Yaxpm\nZlY092mYmVnRnDTMzKxoThpmZlY0Jw0zMyuak4aZmRXNScPMzIr2/wEpJm/icD8gcwAAAABJRU5E\nrkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "goldBugLowerCaseWordTokenWordLengths = [f[0] for f in goldBugLowerCaseWordTokenLengthFreqs]\n", "goldBugLowerCaseWordTokenWordLengthValues = [f[1] for f in goldBugLowerCaseWordTokenLengthFreqs]\n", "plt.plot(goldBugLowerCaseWordTokenWordLengths, goldBugLowerCaseWordTokenWordLengthValues)\n", "plt.xlabel('Word Length')\n", "plt.ylabel('Word Count')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's pretty darn close to what some of Mendenhall's 
graphs looked like, such as this one for the first thousand words of _Oliver Twist_:\n", "\n", "![Characteristic Curve](images/characteristic-curve-mendenhall.png)\n", "\n", "Thank goodness we didn't need to count tens of thousands of tokens by hand (an error-prone process) like Mendenhall did!\n", "\n", "On its own, one characteristic curve isn't terribly useful, since the point is to compare one author's curve with another's, but at least we now know we can fairly easily generate the output for one text. For now, let's shift back to working with words." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graphing Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we saw in the previous notebooks, sometimes it's useful to work with all word tokens (like when measuring Zipf's Law or aggregate word length), but typically we need to strip out function words to start studying the meaning of texts. Let's recapitulate the filtering steps." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('upon', 81),\n", " ('de', 73),\n", " (\"'s\", 56),\n", " ('jupiter', 53),\n", " ('legrand', 47),\n", " ('one', 38),\n", " ('said', 35),\n", " ('well', 35),\n", " ('massa', 34),\n", " ('could', 33),\n", " ('bug', 32),\n", " ('skull', 29),\n", " ('parchment', 27),\n", " ('made', 25),\n", " ('tree', 25),\n", " ('time', 24),\n", " ('first', 24),\n", " ('much', 23),\n", " ('us', 23),\n", " ('two', 23)]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stopwords = nltk.corpus.stopwords.words(\"english\")\n", "goldBugContentWordTokensLowercase = [word for word in goldBugWordTokensLowercase if word not in stopwords]\n", "goldBugContentWordTokensLowercaseFreqs = nltk.FreqDist(goldBugContentWordTokensLowercase)\n", "goldBugContentWordTokensLowercaseFreqs.most_common(20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, now that we've done 
some plotting, we could graph the top frequency content terms, though it may be harder to read the words." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAFECAYAAADGEp5zAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJztnXecVNXZx78/QMQCAtEg0djFTlGwJGjsmkRNMZYYI7bY\nEjUajURjNJpieRP7a6K+KsYSTbUTbKiosYKAjcQeFWJbZVVQ5Hn/OGfY2WFnd+7OnZ0zu8/389nP\nzr0z53efuTPz3HOf85znyMxwHMdxuje96m2A4ziOU3vc2TuO4/QA3Nk7juP0ANzZO47j9ADc2TuO\n4/QA3Nk7juP0ANzZO46zGJJmStqqTsc+VdIf2nn+JUnbdaVN3QF39lUiqVnS3Pi3UNKHRdvfzukY\nkyV9VKQ7V9JmeWg3EgocJWlGPO+vSrpB0oY5aE+WdFAedka9hZLWKPPciUWf40eSFhRtz8jLhmow\nsw3N7L7Otpf0A0lPSvpA0huS7pG0V6WHr+D5Nl8j6UpJ8+O5fF/SY/W6aKWGO/sqMbNlzay/mfUH\nXgZ2KWyb2XV5HQb4fpFufzN7uPgFkvrkdKyUOQ84CjgSGAQMA/4OfDUH7VrMLlSbBzL7VdF35jDg\nwaLPdaOKxaUkf7+SLgCOBo4FBgOfA34K7FypRBWHN+DMeC4HABcDf5VUjWa3IMkvS3dA0pKSzpX0\nWvw7R1Lf+NzWkv4j6SeS3pT0oqR9OnGMlyT9WNJ0YK6kXpI2l/SgpHclTZP0paLXry7p3tjjmSTp\nwsLtcrTp1Tb0t4uPJWm8pH9LekvS9ZIGxedWiz3Z/SS9HN/TiUU6vWJv9t9Fva2VJV0k6X9KjnmT\npB+28V7XBo4A9jazyWb2iZl9ZGbXmtmZ8TXLSbpK0n+j7ScVfuSS9pc0RdLZkt6R9IKkneNzvwS2\nBC6MPcLz4/51Jd0h6W1Jz0rao8ieK6P9t8T39M9CT15SoUf8ZNRb1K6tj5Ei51bBMS+WdJukZmCb\n+D6Pi73oZkmXSRoi6fZo1x2SBsb2/SRdHT+/dyU9IumzbRoVdLeNj09VuIOaEDVnStqkTLthwOHA\nXmZ2l5nNt8ADZnZA0es+Fz/rtyX9S9LBZU+Q9N34vXqr+HtVIdcRLjhDit7LohBR0Xe3V9xeXdJ9\nRefuIrUTUmoozMz/cvoDXgS2jY9PAx4Elo9/DwCnxee2Bj4B/gdYAtgKaAaGldG9Bziojf0vAU8A\nKwFLxv9vATvH57eP25+J2w8VHXNL4H3gqiKbXm3n/Rwd38/nYvvfAdfG51YDFgK/j3YMB+YB68Tn\njwemA2vH7Y0IP8AxwGuA4v7lgQ+AFdp4r4cBL3Zw/q8C/gYsA6wKPAccGJ/bH/gYOIjgXA8DXis5\nxwcWbS8DvAqMI3SKRgJvAuvF56+M53Y00Bu4GriuqP1CYI0KvjP7A/dnOGYTsEXcXjJ+Rg8CK8TP\nZg7wODAiPn8X8LP4+kOBm4B+8RyMAvpX8F0+FfiI0DMX8CvgoTLtDgNeqOB93wdcCPSNtv4X2Kbo\neH+Ij9cH5gJj42t/Q/jtbFtG9wrg9Pi4d7Tn30XfsVMK2iXf3V5Fv5GzgD7AF4H3iL+RRv/znn3t\n2Ifg3N8ys7eAnwPfLXnNyRZ6qPcBtwJ7ltEScH7sjb0r6bG434Dzzew1M5sP7AvcZmYTAczsTuAx\n4KuSViE4psIx7wduzvB+DgV+amavm9kn8f18S61DCT+30JObDjxJ+BEDHAycZGb/inbNMLN3
zOxR\nwo+pMNi2N3CPmb3ZxvE/A8wuZ5yk3sBewE/M7AMze5ngGIrP+ctm9n8WftVXAUNLerbFt/q7EC4u\nE8xsoZlNA/4KFPfS/2pmj5nZp8A1BOdcDZUc8+9m9hBA/MwBLjCzN83sdeB+4J9m9mR8/m8Epw7h\nYvcZwkXXzGyqmc2t0Lb7zWxiPHdX0/LZlrI84YKzCIW72HcVxic+L+nzwBeAE8zsYzN7ErgM2K8N\nvW8BN5vZFDP7GDiZ4JzLIeA4Se8SLhK/JVzsrOj5thu2/EZ+ZmYLzOwBwsWxW4SA3NnXjs8RYvgF\nXon7CrxrZh8Vbb9c8nwxBhxpZoPi3+ii54pDL6sCexRdFN4l9E5WjNptHbPSL/JqwN+KdJ8GFhBv\njyPFzvhDYNn4eGXg+TK6EwgXKeL/crfMbwND27FvecIdR+k5X6kt+8zsw/hw2aLni+P2qwKblZzL\nfWh5v0Zrp/ZRiVZnqOSYr7bRrtSO4u15RXb9AfgH8EeF0OKZqnysp1jzQ6Cf2h4zWOxzMrOVCZ/P\nkoTv2+eAd8zsg6KXlX5WBT4H/KdI68N4jHIYcHb8nSxNuHs8uxCy64CCXfOK9rV1vhsSd/a143WC\ngyywStxXYJCkpYu2VyWENLJS7KBeIdyiDir6629mZwFvlDlmof0HwKLnYk95hRLtnUu0lzazNyqw\n8VVgrTLPXQN8TdIIYF3CgGtb3AWsXC5WTAipfMLi5/w/bb56cUoHaF8B7m3jXH6/Qr3OkNcxyw0M\nLzCz08xsA0LPehfa7k1Xw920/TkV2/Q6MFhS8cWx3Gf1OvD5RSLh+/uZSo0xs6cIIdTCIH6r7zmh\nI1TgjWjXUiV2dQvc2deO64CfSlpe0vLAz1i81/pzSUtI2pLwZfxTO3qV9MCvBnaVtKOk3nFAbmtJ\nK8WwxmNFxxxL+LEXmEXorX1F0hKE7Ikli57/HfCreKuLpBUk7VaBTRBu0U+XtJYCwyUNBjCz/wCP\nEsIqfy4KTbQihoD+F7hO0pck9Y3vb29JJ8RQyg3ALyUtK2lV4Jh4TiphDrBm0fYtwDBJ+8bztYSk\nMZLWjc939HmU6lXCrVUes10kbSNpo3ghn0u4OH5ajWYpZvYcYezmj5K2l7RUPN4Xil7zKmGc4dcK\niQzDgQNp+7P6C7CLpC8qJDicRvt+a7EBb0K8f2bcNRXYKoaTlgN+UmRX4Tdyajz3WxB+I92iDrw7\n+9rxC8IXZ3r8eyzuKzAbeJfQc/kDcKiZzWpHr8MvXHScXwNOJAx4vQL8iJbPeR9gM+AdwsXnKuIP\nw8zeI2S7XEboYTXT+hb2PEL8cpKk9wkDWZtWaN9vCY54EiFGfylhkLDABMKgbbtZD2Z2FGFQ7yLC\nuft3fL83xZccSei5vUCIXV9DGLAr2FdqY/H2eYQxiHcknWtmzcCOhHGE1wi9vl8TBgkr0TsVmBDD\nMd9q720V2sX4edZjltNcTJ8QDvoT4TN4GphMB+e8neOWtSPeiZxP+NzfJnyPTiOMSRW+U98m3IW9\nThiX+JmZ3V16vNgz/z5wbXztO7QfWjHgxwpZUM2EsNXlwCVR707gesJv8lHCuFXxe/kOsEW0+/T4\n2o/bOV7DUBihro24dDRhcE7ApWZ2XuzRXU8IIbwE7GlmTTUzIkEkbU0It3y+o9fW2I5TgLXMrHTg\nuKvt2IpwPlatpx2OU4qk64Gnzezn9balWmrWs1eY1XgwYYBkBOFWbE1gPHCHmQ0jxGHH18oGp0Pq\nnmUQQ0ZHE3r7jlNXJI2WtKbC3JAvA7tRfhypoahlGGdd4GEzmxfjqfcCuxNO3oT4mgnA12toQ8qk\nEAesNCxQEyStRwjHDAHOrZcdjlPEioQ5F3OBc4DDYmpow1OzME4cGLmREP+aBxRyvr9rZoWZlyKk\nOg2qiRGO4zgOEGaJ1QQze1bSmYRBuQ+AaZSM/JuZSUqh
h+s4jtOtqWnxLDO7nDASXqg/8h9gjqQV\nzWy2pKGErJHFWGuttay5uZk5c8JcjjXXXJP+/fszbdo0AEaODJMVfdu3fdu3e/r2kCFh3l3BX5rZ\n4uNxVsNaDMBn4/9VgGeA5Qh1J06I+8cDZ5Rpa9Vyyimn1LV9Khop2JCKRgo2pKKRgg2paKRgQ14a\n0Xcu5lNrXRb3z5I+Q5i8cYSZvSfpDOAGhdrhL1GmHkzhSlUN8+bN6/hFNWyfikYKNqSikYINqWik\nYEMqGinYkJdGOWodxlls0QAze4dQjdFxHMfpIpKdQVuIPVXDzjtXulZCbdqnopGCDalopGBDKhop\n2JCKRgo25KVRjprOoK0GSZaqbY7jOKkiqc0B2mR79oXR5mpoamqqa/tUNFKwIRWNFGxIRSMFG1LR\nSMGGvDTKkayzdxzHcfLDwziO4zjdiIYL4ziO4zj5kayz95h9fhop2JCKRgo2pKKRgg2paKRgQ14a\n5UjW2TuO4zj54TF7x3GcboTH7B3HcXowyTp7j9nnp5GCDalopGBDKhop2JCKRgo25KVRjmSdveM4\njpMfHrN3HMfpRnjM3nEcpweTrLP3mH1+GinYkIpGCjakopGCDalopGBDXhrlSNbZO47jOPnhMXvH\ncZxuhMfsHcdxejDJOnuP2eenkYINqWikYEMqGinYkIpGCjbkpVGOZJ294ziOkx8es3ccx+lGeMze\ncRynB5Oss/eYfX4aKdiQikYKNqSikYINqWikYENeGuWoqbOXdIykmZJmSLpW0pKSBku6Q9IsSZMk\nDaylDY7jOE4NY/aSVgLuB9Yzs/mSrgduAzYA3jKzsySdAAwys/FttLdnnjHWXbcm5jmO43RL6hWz\n7wMsLakPsDTwOrAbMCE+PwH4ernGBx4In35aYwsdx3F6ADVz9mb2GvAb4BWCk28yszuAIWY2J75s\nDjCkrfYjR47koYfgwgs7b4PH8dKxIRWNFGxIRSMFG1LRSMGGvDTK0adWwpIGEXrxqwHvAX+StG/x\na8zMJLUZRxowYABDhoznuOP68fzzsOOOoxk7diwDB4YQf+GktLfd3Nyc6fV5ty+ms+1T2W5ubq5a\nL4XzWW377nQ+U/g8UjmfKXwenT2fkydPZuLEiQD069ePctQyZr8HsJOZHRy3vwtsDmwLbGNmsyUN\nBe4xs8Ui85Jsn32Ma6+FbbaBO++EXsnmDjmO46RBPWL2LwObS1pKkoDtgaeBm4Fx8TXjgL+XEzjv\nPFhhBbjnHrj00hpa6jiO082pZcz+EeDPwBPA9Lj7EuAMYAdJswi9/DPaaj9y5EiWXx4uuihsH388\nvPJKNhs8jpeODalopGBDKhop2JCKRgo25KVRjpoGRszsVDNbz8w2MrNxZvaJmb1jZtub2TAz29HM\n2n133/oWfPObMHcuHHooeAUFx3Gc7DREbZzZs2H99eHdd+HKK2HcuPbbOo7j9FQaujbOiiuG+D3A\nD38Ib7xRX3scx3EajWSdfWltnH33ha98BZqa4PDDKwvneBwvHRtS0UjBhlQ0UrAhFY0UbMhLoxzJ\nOvtSJPj972HAALjxRrjhhnpb5DiO0zg0RMy+mEsvhUMOgeWXh6efDqmZjuM4TqChY/bFHHwwbLcd\nvPUWHHVUva1xHMdpDJJ19uXq2Uuhd7/00vDHP8Lfy07J8jheSjakopGCDalopGBDKhop2JCXRjmS\ndfbtsfrqcEacinX44SEl03EcxylPw8XsCyxcCFttBQ88APvvD1dc0XW2OY7jpEq5mH3DOnuAWbNg\nxAiYNw9uvx123rmLjHMcx0mUhhugrWQN2mHD4LTTwuNDDoH332/9vMfx0rEhFY0UbEhFIwUbUtFI\nwYa8NMqRrLOvlGOOgTFj4NVX4YQT6m2N4zhOmjR0GKfAzJmw8cbwySdw992h/r3jOE5PpOHCOFnY\ncEM4+eTw+OCD4YMP
6muP4zhOaiTr7CuJ2RczfnwYrH3hBfjpT8M+j+OlY0MqGinYkIpGCjakopGC\nDXlplCNZZ5+VJZaAyy+H3r1DhcwHH6y3RY7jOOnQLWL2xZx0EvzqV7DOOjBtGrSz/q7jOE63o1vH\n7Is5+WRYbz147rmWJQ0dx3F6Osk6+6wx+wL9+sEvfhEeP/GEx/FSsSEVjRRsSEUjBRtS0UjBhrw0\nypGss6+GnXYKTv/pp8OSho7jOD2dbhezL7DrrnDLLXDJJfC97+VomOM4TsL0mJh9ga99Lfy/8cb6\n2uE4jpMCyTr7zsbsC+y6K4wc2cSdd0Jzc+c0UonBeTwyP40UbEhFIwUbUtFIwYa8NMpRU2cvaR1J\nU4v+3pN0lKTBku6QNEvSJEkD8z72kCGw/vowfz5MmpS3uuM4TmPRZTF7Sb2A14BNgSOBt8zsLEkn\nAIPMbHzJ66uK2QOceWaYWbvffjBhQlVSjuM4DUHd69lL2hE42cy2lPQs8CUzmyNpRWCyma1b8vqq\nnf2zz4ac+8GDYc4c6NOnKjnHcZzkSWGAdm/guvh4iJnNiY/nAENKX1xtzB5gxRWbGDYM3nknrGiV\nlVRicB6PzE8jBRtS0UjBhlQ0UrAhL41ydElfV1JfYFdgsYrzZmaSFuvCDxgwgPHjx9Mv1jsYPXo0\nY8eOZeDAEN4vnJT2tpubm/na1wZy9tkwZUoTI0a0//q22md5fVvbBTrbPpXt5jjKXY1eCuez2vbd\n6Xym8Hmkcj5T+Dw6ez4nT57MxIkTARb5y7bokjCOpK8Bh5vZznH7WWBrM5staShwTy3COBB69GPH\nwhprwL//DVrs5sZxHKf7UO8wzrdpCeEA3ASMi4/HAX+v1YE33xxWWCGUPp45s1ZHcRzHSZuaO3tJ\nywDbA38t2n0GsIOkWcC2cbsVecTsm5qa6N075NxD9glWqcTgPB6Zn0YKNqSikYINqWikYENeGuWo\nubM3sw/MbHkzm1u07x0z297MhpnZjmZWu3eIz6Z1HMfptrVxivnwQ1h+efjoI/jPf2CllXKRdRzH\nSY56x+zrytJLw447hsc33VRfWxzHcepBss4+r5h9gc6EclKJwXk8Mj+NFGxIRSMFG1LRSMGGvDTK\nkayzz5tddoFeveDuu+H99+ttjeM4TtfSI2L2BbbaCu6/H66/HvbcM1dpx3GcJOjRMfsCnpXjOE5P\nJVlnn3fMHlqc/W23wSefZG+fhw310EjBhlQ0UrAhFY0UbEhFIwUb8tIoR7LOvhastVaocd/UBPfd\nV29rHMdxuo4eFbMHOPFE+PWv4cgj4fzzc5d3HMepKx6zjxTH7RO9zjmO4+ROss6+FjF7gDFjYOhQ\neOUVePLJ7O3zsKGrNVKwIRWNFGxIRSMFG1LRSMGGvDTKkayzrxW9enW+MJrjOE6j0uNi9hCycb76\nVRg5EqZOrckhHMdx6kLd16DNSi2d/bx5oTDaBx/ASy/BqqvW5DCO4zhdTsMN0NYqZg/Qrx/svHN4\n3F5htFRicB6PzE8jBRtS0UjBhlQ0UrAhL41yJOvsa43PpnUcpyfRI8M4AO+8A5/9bFiT9s03Ia7j\n6ziO09A0XBin1gweDFtuCQsWhAFbx3Gc7kyyzr6WMfsCHYVyUonBeTwyP40UbEhFIwUbUtFIwYa8\nNMqRrLPvCgrO/vbbYf78+triOI5TS3pszL7A8OEwYwZMnAg77VTzwzmO49QUj9mXwbNyHMfpCSTr\n7LsiZg8tzv6mmxYvjJZKDM7jkflppGBDKhop2JCKRgo25KVRjpo7e0kDJf1Z0jOSnpa0maTBku6Q\nNEvSJEl1S3zcZBNYaSV47TV4/PF6WeE4jlNbah6zlzQBuNfMLpfUB1gGOAl4y8zOknQCMMjMxpe0\n65KYPcARR8DFF8NPfwqnn94lh3Qcx6kJdamNI2k5YKqZrVGy/1ngS2Y2R9KKwGQzW7
fkNV3m7P/x\nj1A+YaONYPr0Ljmk4zhOTajXAO3qwJuSrpD0hKRLJS0DDDGzOfE1c4AhpQ27KmYPsM02MGBAyMp5\n4YXs7fOwoZYaKdiQikYKNqSikYINqWikYENeGuXoUzPlFv2NgR+Y2aOSzgVahWvMzCQt1oUfMGAA\n48ePp1+/fgCMHj2asWPHMjDWNSiclPa2m5ubK3p9375w0EFN3HMP3HjjQI45Jlv79rYLdLZ9KtvN\nzc1V66VwPqtt353OZwqfRyrnM4XPo7Pnc/LkyUycOBFgkb9si0xhHEmDgZXNrKJgRwzRPGRmq8ft\nscBPgDWAbcxstqShwD31DOMAXHcd7LMPfOlLMHlylx3WcRwnVzodxpF0r6QB0dE/Dlwm6ZxKDmpm\ns4FXJQ2Lu7YHngJuBsbFfeOAv1eiV0u+/GXo0wfuvx/efrve1jiO4+RLJTH75czsfeCbwFVmtinB\naVfKkcA1kp4EhgO/BM4AdpA0C9g2breiK2P2EKpebr01LFwIt96avX0eNtRKIwUbUtFIwYZUNFKw\nIRWNFGzIS6MclTj73jHUsicQ3SAVx1fM7EkzG2NmI8zsm2b2npm9Y2bbm9kwM9vRzGr3DjPgs2kd\nx+mudBizl7QHcDLwgJkdLmlN4Cwz272mhnVxzB7glVfCEoXLLANvvRVWtHIcx2kkqkm9fMPMhpvZ\n4QBm9jxQUcy+0VhlFRg1KqxNe9dd9bbGcRwnPypx9he0se/8vA0ppatj9gWKQzmpxOA8HpmfRgo2\npKKRgg2paKRgQ14a5Sjr7CVtIelHwAqSjpX0o/h3KtC7ZhbVmYKzv/nmMFjrOI7THSgbs5f0JWAb\n4FDgd0VPzQVuNrN/1dSwOsTsIVS+XH11ePlleOgh2HzzLjfBcRyn05SL2ZedQWtm9wL3SrrSzF6q\npXEpIcFuu8EFF4RQjjt7x3G6A5XE7JeMNW3ukHRP/Lu71obVK2YPLaGcp59OIwbn8cj8NFKwIRWN\nFGxIRSMFG/LSKEcltXH+BFwMXAZ8GveluZZhTmy1VZhk9cor8Mtfwo9+5GmYjuM0NpXk2T9uZpt0\nkT3Fx61LzL7AaafBKaeEx6utBmefDbvvHsI8juM4qdLpevYx++ZN4K/A/MJ+M3snZxtLj1tXZw9w\n551wzDEwc2bY3morOOcc2HjjuprlOI5TlmomVe0PHAc8SCiEVvirKfWM2RcYPbqJqVPDKlbLLw/3\n3QejR8NBB8Hs2V1jQx4aKdiQikYKNqSikYINqWikYENeGuXo0Nmb2WpmtnrpX80sSow+feCww+Bf\n/4Jjj4XeveHyy2HtteHXv4Z58+ptoeM4TsdUEsYZRxsDsmZ2Va2MisetexinLWbNguOPh5tuCtse\nz3ccJyWqidlfSIuzX4pQkvgJM/tW7la2Pm6Szr7AHXeEnr7H8x3HSYlOx+zN7AdmdmT8O5iwzGD/\nWhhZTAox+/ba77ADFcXzU4jjpWBDKhop2JCKRgo2pKKRgg15aZSjMwuOf0hYSLzH4/F8x3EahUrC\nODcXbfYC1gduMLMTampY4mGctmgrnn/BBbDLLnU1y3GcHkQ1Mfut40MDFgCvmNmruVu4+HEbztkX\nKI7n9+oVauy4w3ccpyuoJmY/GXgWGAAMomhiVS1JPWbfHoV4/gknwPDhTey1Fzz2WNfbkVf77qSR\ngg2paKRgQyoaKdiQl0Y5OnT2kvYEHgb2IKxD+0hcqtBphz59Qtx+p53gww/hq1+FF1+st1WO4/RU\nKgnjTAe2N7P/xu0VgLvMbHhNDWvgME4xH38cHP2dd8K668IDD8DgwfW2ynGc7ko15RJEqI1T4O24\nz6mAvn3hz3+GjTaCZ5+Fb3wD5ndJIMxxHKeFSpz9ROAfkvaXdABwG3B7bc1q7Jh9qcZyy8Ftt8FK\nK4V8/P33z7bkYSrvoztopGBDKhop2JCKRgo25K
VRjvbWoF1b0lgzOx74PTAc2IhQEO2SSg8g6SVJ\n0yVNlfRI3Dc4LoYyS9IkSQOrfB/Js/LKcOut0L8//PGPcOKJ9bbIcZyeRHtr0N4K/MTMppfsHw78\n0sx2regA0ovAJsUlkSWdBbxlZmdJOgEYZGbjS9p1i5h9KZMmhRj+ggXwv/8Lhx9eb4scx+lOdCZm\nP6TU0QPEfVln0JYeeDdgQnw8Afh6Rr2GZccd4ZJ4X/SDH8Att9TXHsdxegbtOfv2QitZFukz4E5J\nj0n6Xtw3xMzmxMdzgCGljbpTzL6UAw4Iq2AtXAh77QWPPlpbO1KJJaagkYINqWikYEMqGinYkJdG\nOdpbg/YxSYeYWav4fHTYWRYv+aKZvRFTNu+Q9Gzxk2ZmkhaL1wwYMIDx48fTLy7+Onr0aMaOHcvA\ngeEaVDgp7W03Nzdnen3e7Yspff7oo5uYNw/OPHMgu+wCkyc3MXRodv2u2m5ubq5ar5bns6vad6fz\nmcLnkcr5TOHz6Oz5nDx5MhMnTgRY5C/bor2Y/YrA34CPaXHumwBLAt8wszfKqpY7mHQK0Ax8D9ja\nzGZLGgrcY2brlry2W8bsiynOwV9nHXjwQc/BdxynOjpVG0eSgG2ADQnhmKfM7O4MB10a6G1mcyUt\nA0wCfg5sD7xtZmdKGg8M7CkDtKW89x5suSXMmAFjx4a6Ou1cnB3HcdqlU5OqLHC3mZ1vZhdkcfSR\nIcD9kqYRSi7cYmaTgDOAHSTNIiyGckZpw+4csy+mOAd/ypS2c/Ab4X00ikYKNqSikYINqWikYENe\nGuVoL2ZfNWb2IrCY145pmNvX8tiNRCEHf8st4frrYdVV4cwz622V4zjdiQ5r49SLnhLGKaY4B/+i\ni+CII+ptkeM4jUY1tXGcLqI4B//II+Hmm9t/veM4TqUk6+x7Ssy+lOIc/L33Djn4jfg+UtVIwYZU\nNFKwIRWNFGzIS6McyTr7nswpp8C4caEO/i67wBuZk1wdx3Fa4zH7RCnOwd9ss5CD38svzY7jdIDH\n7BuMQh38FVeEhx+Gq6+ut0WO4zQyyTr7nhqzL2a55UIK5siRTZxwAsyd2/U2dDeNFGxIRSMFG1LR\nSMGGvDTKkayzdwL77gvrrQezZ8MvflFvaxzHaVQ8Zt8APPoobLopLLEEPPUUrL12vS1yHCdVPGbf\nwIwZE1IyP/kEjjmm3tY4jtOIJOvsPWbfWuNXvwpLGt56K9yecQXglN5HvTVSsCEVjRRsSEUjBRvy\n0ihHss7eac2KK4b8e4Af/jCkZjqO41SKx+wbiI8/huHD4bnn4Oyz4bjj6m2R4zip0al69vXEnX3b\n3H47fOUrIaQza1bo8TuO4xRouAFaj9m3rfHlL4cSCnPnwokn1seGRtZIwYZUNFKwIRWNFGzIS6Mc\nyTp7pzy//W1Iw7ziCnjkkXpb4zhOI+BhnAblhBPgrLO8bo7jOK3xmH03Y+5cGDYszKydMAH226/e\nFjmOkwKXW1QoAAAgAElEQVQes69D+1pq9O/fsnRhR3VzUn4fXa2Rgg2paKRgQyoaKdiQl0Y5knX2\nTsfsu28I43jdHMdxOsLDOA2O181xHKeYhgvjOJXhdXMcx6mEZJ29x+wr1+iobk6jvI+u0EjBhlQ0\nUrAhFY0UbMhLoxw1d/aSekuaKunmuD1Y0h2SZkmaJGlgrW3o7njdHMdxOqLmMXtJxwKbAP3NbDdJ\nZwFvmdlZkk4ABpnZ+Dbaecw+A143x3EcqFPMXtLKwFeAy4DCwXcDJsTHE4Cv19KGnkLfvnDOOeHx\naaeFDB3HcZwCtQ7jnAMcDyws2jfEzObEx3OAIW019Jh9do1ydXMa7X3UUiMFG1LRSMGGVDRSsCEv\njXL0qZWwpF2A/5rZVElbt/UaMzNJbcZqBgwYwPjx4+nXrx8Ao0ePZuzYsQwcGEL8hZPS3nZzc3Om\n1+fdvpjOts
+6/dvfDuQf/4CpU5t46CHYYot89Jubm6u2L4XzWW37vLZTOJ8pfB6pnM8UPo/Ons/J\nkyczceJEgEX+si1qFrOX9Cvgu8ACoB8wAPgrMAbY2sxmSxoK3GNm67bR3mP2ncTr5jhOz6WutXEk\nfQk4zsx2jQO0b5vZmZLGAwN9gDZfvG6O4/RcUphUVfDcZwA7SJoFbBu3F8Nj9p3XKK2b8/rrjfk+\naqGRgg2paKRgQyoaKdiQl0Y5usTZm9m9ZrZbfPyOmW1vZsPMbEczq92768EU1835wx/qbY3jOPXG\na+N0Y4rr5kyeDF/4Qr0tchyn1qQQxnG6mDFj4MADQ92cL34Rxo2D116rt1WO49SDZJ29x+zz0bjw\nQjjzzCb69oWrrgoDt6efDh991HU2pKSRgg2paKRgQyoaKdiQl0Y5knX2Tj4stRQccgg8/TR885vw\n4Yfws5/BOuvAH/8IHilznJ6Bx+x7GJMnh2JpTz4Ztr/wBTj33BDycRyn8fGYvQPA1lvD44/DpZfC\nZz8bJl1tuqnH8x2nu5Oss/eYfX4ape1794aDD4Z//Svk4VcSz0/hfeShkYINqWikYEMqGinYkJdG\nOZJ19k7tGTAAzjjD4/mO0xPwmL2zCI/nO07jU9faOJ3BnX19+PRTuOIKOOkk+O9/w7799oOTT4ZB\ng6rTHjAgTPByHKd2NJyzHzVqlE2dOrUqjaampkUlQevRPhWNzrR///2wtu0554RVsEaObGLatOre\nx+jRTSxcOJBRo2DUKBg5EkaMgGWXrVzDP9P8NFKwIRWNFGzIS6Ocs69ZPXunsSnE8w85BMaPhzlz\nYPDgzuuZwYIFMG0aPPFEy34J1l47OP7CRWDUqJAp5DhOfiTbs/cwTvfjvfdg+nSYOrXl76mnwkWg\nlKFDWzv/kSNhjTXCxcFxnPI0XBjHnX3PYP78kA00dWro9Rf+x4WDWjFgQMsdQOH/+uv7OIDjFNNw\nzt5j9vlppGBDFo2FC+H551ucf+FvzpzFxw769oUNN6TicYBGOxe11EjBhlQ0UrAhLw2P2TsNQ69e\nIY6/9tqwxx4t+994I4SBHn+85Q7g3/8OYwBtjQMUXwB8HMDp6STbs/cwjlMJ778f5gUUev/TpoVx\ngE8+Wfy1n/vc4heA1Vf3cQCne9FwYRx39k5nKR4HKFwAyo0DLLdccPzF2UDrrefjAE7j0nDO3mP2\n+WmkYEO9NQrjADNnNvHIIwMXjQfMmbP4azsaB2j0c5GSDalopGBDXhoes3d6NIVxgBVWgG98o2X/\nG2+0zgSaOjVcFNobB9h5Z9h7b+jXr+vfh+N0lmR79h7GcepFJfMBVlsNzj4bdt/dY/5OWjRcGMed\nvZMShXGAxx+H886DmTPD/q22CiUlNt64vvY5ToGGW7zE69nnp5GCDalodLb9kkuGEM7BB8PkyU1c\nfDEsvzzcdx+MHg0HHQSzZ9fejjw1UrAhFY0UbMhLoxw1c/aS+kl6WNI0STMlnRr3D5Z0h6RZkiZJ\nqm40wnG6mN694bDDwuIvxx4bti+/PMT0f/1rmDev3hY6zuLUNIwjaWkz+1BSH2AKcDSwO/CWmZ0l\n6QRgkJmNb6Oth3GchmDWLDj+eLjpprDt8XynntQljGNmH8aHfYElAAN2AybE/ROAr9fSBsepNcOG\nwY03wqRJIWXzpZfCzN+tt26d0eM49aSmzl5SL0nTgDnAJDN7BBhiZoXs5jnAkLbaesw+P40UbEhF\no5Y27LBDyNypJJ7f3c9Fo2mkYENeGuWoaZ69mS0ERkpaDvibpA1LnjdJbcZqBgwYwPjx4+kXk5lH\njx7N2LFjF004KJyU9rabm5szvT7v9sV0tn0q281x+mmjn89q21eyfdhhsMsuTVx1FZxyykAuvxye\neaaJ73wHDjpoIP36pXE+U/g8Uvl+pvB5dPZ8Tp48mYkTJwIs8pdt0WWpl5JO
Bj4EvgdsbWazJQ0F\n7jGzddt4vcfsnYbH4/lOV9PlMXtJyxcybSQtBewAPAPcBIyLLxsH/L1WNjhOvSkXzx8zJoR7anjX\n7jitqGXMfihwt6QngUcIMfvbgDOAHSTNAraN24vhMfv8NFKwIRWNetlQGs//9NMmjjgirMj1ne/A\nnXeG+j21tiPP9t1JIwUb8tIoR82cvZnNMLONzWyEmW1kZr+I+98xs+3NbJiZ7Whm3rdxegR9+oT8\n/JdfhpNOgu22Czn5114bLgZrrAGnnAIvvlhvS53uiJdLcJw68tJLMGECXHFFuAgU2GYbOPBA+OY3\nYeml62ae04B4bRzHSZiFC2Hy5DAT9y9/aZmFO2BAqLB5wAGw2WY+qOt0jNfGqUP7VDRSsCEVjRRs\naEujVy/Ydlu4+upQdvl3v4NNNw0rcV1yCWyxBWywQcjkKeTsd9dzUQ+NFGzIS6McyTp7x+mpDBwI\nhx4KDz8cqmsed1xYP/eZZ+DHP4aVV4avfQ2mTGl7+UXHaQsP4zhOA/DJJ3D77SHMc+utLbX1P/tZ\n2HffEObZcMP2NZyegcfsHaebMGdOCPdcfnmosV9gzJgwqLv33uHuwOmZeMy+Du1T0UjBhlQ0UrCh\nWo0hQ+BHP4IpU5p4+OGQzjlgADz6KBx+eOW5+/V+HylppGBDXhrlSNbZO47TPlIYxL344jBoe801\nnrvvlMfDOI7TzWgvd/+AA0JdHs/d7754zN5xehjt5e7vtRd8+9vVx/aXWy4Ud+vlMYJkaDhnP2rU\nKJs6dWpVGk1NTYtKgtajfSoaKdiQikYKNtRDo6kJrr8+OP5HHgn7Ro5sYtq06mwYObKJF14YyIgR\nYY3eUaNg5EhYf33o27cyjRTOZwo25KVRztnXtJ694zhpUMjdP/RQeOopuPJKeOWV6nUHD4Zp0+D+\n+8Nfgb59wySw4gvAiBHQv3/1x3Q6R7I9ew/jOE5jMGdOqOg5bVr4P3VqWIy9FAnWWqvF+RcuBEPa\nXKvO6SwNF8ZxZ+84jcvcufDkk60vADNn1mbGr19EWtNwzt5j9vlppGBDKhop2JCKRlfb8PHHoeRD\nwfkX7gbWXDOfsYNSjaFDF78ArL5624PJKXweeWl4zN5xnLrSt2+I248YAfvv37K/qam6rKCPP4bp\n02HGjJYLyLRpoaDcG2/Abbe1vHbAAFoNJo8aBeut1/ljNxLJ9uw9jOM4TmdZuBBeeGHxsYRCxdBi\n+vYNYaAlluh6O0tZay3485+r02i4MI47e8dx8mb27BbnX/jf1mByvdhww3CHUg0N5+w9Zp+fRgo2\npKKRgg2paKRgQwoac+fCc8810adPdTYsWFC9Rq9eTQwf7jF7x3Gc3OnfP4RPqp1NXO3YQ0GjViTb\ns/cwjuM4TnYarsSx4ziOkx81dfaSPi/pHklPSZop6ai4f7CkOyTNkjRJ0mI3P17PPj+NFGxIRSMF\nG1LRSMGGVDRSsCEvjXLUumf/CXCMmW0AbA58X9J6wHjgDjMbBtwVt1sxd+7cqg8+ZcqUurZPRSMF\nG1LRSMGGVDRSsCEVjRRsyEujHDV19mY228ymxcfNwDPASsBuwIT4sgnA10vbPv/881Uf/7HHHqtr\n+1Q0UrAhFY0UbEhFIwUbUtFIwYa8NMrRZTF7SasBo4CHgSFmNic+NQfooVUsHMdxuoYucfaSlgX+\nAhxtZq3iMzHlZrG0myE5VDGaV1itoU7tU9FIwYZUNFKwIRWNFGxIRSMFG/LSKEfNUy8lLQHcAtxu\nZufGfc8CW5vZbElDgXvMbN2Sdp536TiO0wm6fFKVJAH/BzxdcPSRm4BxwJnx/99L27ZlrOM4jtM5\natqzlzQWuA+YTkuo5ifAI8ANwCrAS8CeZlbDuWOO4zg9m2Rn0DqO4zj54TNoHcdxegBJFUKT1JuQ\nhrnILjPLYVnkio8/FphmZs2SvktIFT3P
zF6usH1v4CgzO6eTx/9R0aYBKnqMmf22M7pdjaTB7T1v\nZu9k1NsF2ADoR8u5OK2DNpvQcg4Xu301sycyHL+t9zPXzGqwyF67dtxlZtt1tK+d9ssAxwKrmNn3\nJK0NrGNmt9TA3B5F/O0vY2bv19uWciTj7CUdCZwC/Bf4tOipjTJorAP8L7CimW0gaTiwm5n9okKJ\ni4HhkkYQfhSXAVcBX6qksZl9KmkfoFPOHuhPcEzrAGMIA9kCdiGMc1RENedB0s1Fm8UXHAiZsrtV\nYMITtOFgi1i9Ao2CPb8HlgK2BS4F9iTM1eiI30QblgI2IYwbAQwHHgO2qNQGwvtZBXg3bg8CZkua\nDXzPzB5vx/5myp8LM7MBHR1c0lLA0sAKJReeAYRJipVyBfA48IW4/TrwZ0K2XMVI2hJYy8yukLQC\nsKyZvZihfbW/UyQdbWbndbSvnfa7A2cQOpeLOlWVfB5FGtcBhxL81aPAcpLOM7OzKtWIOiOBLQnf\nk/vN7Mks7SvGzJL4A54HPlOlxn3AZsDUuC3gqQztC+1OAQ6Oj5/IaMM5wIXxw9u48JdR436gf9F2\n//glqPl5ALaOf+cB1wO7EmY8XwecW4fvxYz4f3r8vywwJUP7vwIbFW1vCPwlow2XAjsVbe8IXEK4\nYDzSBefgh8CLwPz4v/A3HfhBBp3H4/+pRfuezGjLqcDNwKy4vRLwQEaNqn6npe+haN+0DO2fB9ar\n8nN5Mv7/DqFzsUTh+5pB42hgJnAacDowgxAdyP17lEzPHngFqPYWaGkzezhkfIbLtKQst9pzJZ0I\n7AtsGW/Nsi5WNopwhS4NM2yTQeOzhLpCBT6J+yql0+fBzCYDSPqNmW1S9NRNksr2YIuRtHEHx6g4\nhAJ8FP9/KGkl4G1gxQzt1zWzRWv/mNnMWJ8pC1uY2feKNCbF83OIpL7tNcwjpGUhbflcSUeZ2fkV\nW7048+NdQsG2NQkXkCx8g/Adfzza9pqk/hk1Ov39lPRtYB9g9ZK70P6E70alzDazZzK8vi36xHlE\nXwcuMrNPOjE/6GBgMzP7AEDSGcA/gWo+5zZJydm/CNwj6Vbg47jPLFuc+k1JaxU2JH0LeCND+72A\nbwMHWpjwtQpwdob2mNnWWV5fhquARyT9ldDr+TottYQqodrzALC0pDXN7PmosQYhlFAJv6X9ME6W\nC98tkgYRPofCxebSDO2nS7oMuJpwLvcBst4mvyHpBOCPUWNPYE7sDCzsoG1uIS0zO1/SF4DVaD2u\ndVWFEqcCE4GVJV0LfBHYv9LjR+ab2cKCo47jAFmp5vv5YHztCsD/0BKCeZ+WUF1ZYvgG4DFJ1xPm\n+BT7m79WaAfA7wmp49OBeyWtCryXoX2BhWUe50oyqZeSTo0PCwaJcPJ/nkFjTcIH8AWgiXAB+Y6Z\nvZSfpR3asCLwS2AlM9tZ0vqEnuH/ZdTZhJY43n1mVvEajfE8XEI4D+/SifMgaeeoUYjFrgYcYmb/\nqFQjbyQtCfQzs4p/ULEnezjhXEIIIVxsZhXPS49x6VMIzhHgAeDnhB/2Kmb270q1qkHS1cAawDSK\nxrXM7MgMGssTKtAC/NPM3spow/HAWoRQ1q+BA4Frs9xx5PT9XBb4yMI42TqEca7brYNBc0lXUuJj\nip83swMy2HBKya5eQG8z+2kGjWMJF9zijt2V1skkj3aPlYqzL1C4JbSSGjodtPlRya5+hBP/IRXc\nHUgqOLT/mtlmGcxtS2siYSDsJDMbHm/zpprZhhl1ehPCFX1oyUDJlJkUe129spzLkvb9gHXj8Z81\ns0y3/Hlkf0jaA/iHmb0v6WRCCOEXGUNBdUfSVm3tN7P7Mmg8A6xvGX+0RZlJrXbT8r3KdC4l7Uhw\n9hA+mzuytC/S6fT3M4YUtyQMlj9AGCD92My+0xlbOoOk42g5r0sREimeNrMDM+psAoylZYC2usW3\ny5BM
GEfSRoTwxWfi9pvAODObWUHztrJYAL5LBVksZlbxrXQFLG9m10saH7U/kbQgi0C1mUnRSe9O\n6I33VrjnNusgXTG23c7M7oq3u8XZOGsqLHeW5TY3j+yPn5nZnxTSYrcj3Lr/Dti0g/cxo52nzcyG\nV2qApHvKaGxbqQbwY1ocQz+C/Y8TsowqZSYwlHAes5BnZhLALML7v0PS0pL6Z+ycDQL2I4ajYkjI\nzOyoDDb0MrMPJR0E/K+ZnSWp4vCcpAmEwoxNRTb9JoujNrP/KdE8G5hUafvY5hfAvcBlhbh9rUjG\n2RNu6441s3sAJG1Ny61eu5jZqbHN/YTMl7lx+1TgttqYW5ZmSZ8pbEjanOxxvB8SesBZBpyKuZEQ\nxnocyFpGbyvCgjK70nasOYuzX9PM9pS0N4CZfVCI9WagcLHbBbjUzG6RdHoF7XbNeqB2OL7oceFC\nmukCbma7FG9L+jwh4ykLKwBPS3qEloFVsw7SYQvjSHEM6HuFAWtJGxLCURUj6RDge8BgYE1gZULK\nckW5/pHbgIcIF52FlJkLUYEtWxAyYQ6Ku7JMEh1hRSVazOzdjhILKmAZsqXCArxAGEc6X9JcQibe\n/Wa2WL2waknJ2S9dcPQQskI6MfhTbRZLHvyIkJq2hqQHCT/Qb2XUqDYzaSUz26kzDc3slPh//yqO\nX2C+pEWDup3M/nhN0iXADsAZ8a6lwx91nuM0Zla6osQUSY9WKfsfIGtW0KkFkyiZcFcheWQmfZ9w\nV/LPqDFLUtbf2JJmdmzGNqX8kFBn629m9lT8brV1B1YOSRpcyIaKWVO9sxhQcvfYi+BrOrx7LsbM\nLgcuj2N9ewHHEXL3l82iUwkpOfsXY0z2D4Qv8ncIV70sVJvFkgdPESZhrRNteI7sZSmqzUx6UNJw\nM+swO6E9FGaurk/ozRaMyPJlPgW4neqyP/YEdgbONrMmhZLYx3fQZhFqPampLyGVttmyTZ4pTp/s\nBYwmTGiqGEkXlGiMpCW7qCJiB2g1woSmO+OFNMtvOI/MpPlmNr8oG2fRmFIGro53CDdTdPGvJA01\nHrM3YRLWojuamDWWJQz0G+AhSTcQzsUehMSKLBTfPS4A5nQ0QFyKpP8jXPTnAFMId43dO2ZPGNX/\nOS1hgvvjvooxs1/GAdJCFsv+tRrsaIcHzWxjQnwVAElPECZXVcor8a9v/Mt6m7slcEAceC6+3c8S\npy6duboHlc1cLWYccCshTv8iYbJIpuwPwiD1rWY2T9I2hDhzxRdwM1vUQ5LUizBBbPPyLdqkOH1y\nASHd7qCyr26b4ruDBYQMlgeyCOQQQjmAkJl0dNy+L7bPwr2STiKk5u4AHEFw2lmYD5wFnERLqqER\nMo06JGbgfFGSsg5WF2lcFQd5t43H/oaZPZ1R46XOHLuEwQQ/3AS8A7yV9YJRKSlm4yxHcEzJ1pho\ni9jj/BxwDaHHVHDQA4DfWcniLDW2ZVVClkIh3fB+4F2rsMZP1JhhZhtJmh6zipYFJprZ2Awa20Yb\nxhLS9Z4gxCPPbbdha40nCYOKqxFivTcCG5jZVyrVaENzmpmNzPD6pQlObSzBOU0hDApmSd882Mwu\nK9l3hpmNz6DxJDGEYmaj4r4ZZlZxSZFqiRfMgynKxiEMLlbsSGInZEwnLvzFGr8j/N7+RMi6gwx5\n8gpzaIDF6k91WS2uEnvWI9zB/pCQvrly3sdIpmcvaQxwOfH2WFITcFAb8dJU2YnQk12JcItYYC5w\nYhahGAP9MSGEUpjxmCX74+uEHmDhi/8HQu88y6y8ameuYmZ3S7qPEPbYFjiMUK6gYmcPLDSzBZK+\nCVxgZhdIyjLnYPeizV6EC8dHZV5ejgmEMZTzaAl//IFwt1Mpu0uaZ2ZXR7suouWzrZSqQiiShgG/\nYvHvVUU96ni8mbHjckkWw0v4F9k/g1L6EXrCpb+JShMIbqN1dtTqhJ
DrBlXalQlJuxI6RFsCA4G7\nCZ2z3EnG2RMc/RFmdj9ATLW7nHDbnjxmdiVwpaTdzewvVcpdQ6hLswthsGZ/4M0M7UunYJ9J9inY\nN6v1zFUjFIarGEl3ETIUHiL0hkeb2X+zaAAfKxSX24+WGGmWEhbFWUWFEMzXMtqwgZmtX7R9t6RM\nt/zANwklJz4Fvky408oUpqT6EMoVhHGU3xJmMe9PhkHJeNF9TtKqWe4S2+BDYFpMaS0OM1Ycc682\ngcBK5r3ETJzvV6PZSb5BuDs6z8xei7ZkKqRWKSk5+wUFRw9gZlOUMT+9nkj6rpn9AVhNYVbcoqfI\nXvbhM2Z2mUItlHsJP/KsdzjVTsF+DvjUzP4iaQPCZKa/ZdSYTujVb0joGb8r6SEzy9KrO5BwR/BL\nM3tRoWzD1ZU2zimr6AlJW5jZQ7AonbbSOkHFg7sHE8JQU4DTirNBKmQ8YaxgBqETcBvZLsBLxYFd\nxXjzqXE86eQMGoOBp2L6ZyEvvMP0zxL+zuJLkVZ0hyLpxxZy6i9o4+lMF4yShk9IqmpCZScZ1cZF\n/8uEO/tcScnZ3xsHBa+L23vFfRtD9ll+daCQYliY4FWgMznEhQyc2TEj5nVCDL5SrgAeLslKujyj\nDSeb2Q3xDmtbQg//YkK1woows2Ng0azo/aNdKwJLZtB4CjiyaPsFQmnailDIZz+fEG+HMCh5tJn9\np4K2hdS6PsADkl4lfJarEC6GlVBaG0fAV+NfxYOSEAYmCeGTzoZQ5sVMln9L+gHhe5U1vXlJgu3F\nEyYy9UTjXXBnGR+P9zyh1EKrEtyViqj1rPtehASK16qwKxOSDifcma1ZksLZnzAjOP9jpjJAq7Zn\nKS7CzLIUz2poooOfAnweuIAwjnGqmd3UbsPWGlVNwS4MYipU4ZthZtdImloYGKxQ40hCLHITQjZO\nYcLI3Rk0qo0z30kIixXuBr5DqMOyQwVtV2vnacs44L0nYYD7fUk/o6XsQ8XplzG+exqtC6FZpWmk\ncVzsGUJs+HTC9+psM/tnBhsW+w5UOkgs6U9mtofant1cUbZYDJ9tTyjotjUlzj5D+uapLB7e+0uW\nQfdqiIkogwgdlxNoeR9zrfOTKds/ZkLOvrSoEACWoRBaCihM7jiXMAXdCFX6jok90kra9yb0POu6\nKpVCjv9rhMlMowgzcR82sxEZNI4n9KSf6Gw6maQHaIkz70pIH+xtZhWFHiQ9WWpzW/tqTVF201jg\nF4Q7pZ9ZhlpMkp4nxHhnmlnm0Fx09ifScrEQYQC8Eie7qCdK6FUX6E+oZ99hTRpJQ83sDYXc9uMp\nuTswsz0r0DiKkD66BouXjcjSCSg9F4X2DTFG2BlScvbFRYX6EQYnn+nEIFZdkfQwYfGSP8ZdewFH\nZvxRP2pmY2phXwYbliGkgk03s3/F1NKNzCxT7Y8c7HjCzDYu7j0W9lXY/m5C+OhagnPZGzjAKlzK\nLy9yulOaDGwXwzmdsWEWYYbmTIrGcayCfPE8e6LV3B0Uvf53ZnZYluOWtO/0uWhUknH2pSiUs51k\nZl+qty1ZUMxLL9mXqScp6RxCxsn1hEGwwiBv6uMWuaNQcmJLwsSsuwi9uV+b2ToVtl+VcPEtTKR6\nkHDx7dJ86pzulDYlhF8m04mZ1ZKmWIZ5EnmTx91BjrY8YGZf7PiV3YeUnf1gwpJva3X44oSIaY5N\ntB5oHkQcxKokphh7cIt9MD1p3KJAtXFmheqGPzSzd+P2YOB/uvqOMY87JUl3EOZtzKB1b7SiUKek\n7Ql3NnfR+QU7Ok094tTt2FLXc1EPknH2KlNUyMzaSrFKFkkvUT4roOKYohOoJs4c2y82W7atfY2A\npJmWcV2EkvZXE9YneIrWF4uKF+zoLvTEc5FS6mXVRYVSwMxWq1YjpoWVXjDeIywYPa1a/QbjGtqI\nrWZAqrK6YULcJmkn6/xqYWMIlS
/T6OHVlx53LpJx9t1lYETSONoOwVS6TiiEVMXRhNmRhbzsGcBh\nkv5sZmfmYWuD8N8sKadtkEd1w1Q4AjhO0se0lPKuOPWSMF6xPqE329PpcecimTBOd0HShbReqmxb\nQuphxTXtFRZh+bKZNcftZQmzJXcm9O6z1iBvWPKIrSrMAC5UN7zbMlY37C5IepYwONrpaqjdhZ54\nLpLp2XcXzOwHxduSBhKyarKwAi2ODUIvboiFZdi6ZNJHQuxPiK0uQeswTsXO3sIs3G7Rg1MoSrcq\nRb9dq3wd251rYlRj0uPOhTv72vMhoaJeFq4hlDv4OyH0sCtwbczo6Gm90h4XWy1HzPTai/AdKM61\nr8jZd5dQaR70xHPhYZyckVRchbAXIS54g5mdkFFnDC3r7z5gjVPqOVckXUFIlewWPfNqiBOBNjKz\nrEs7Oo737GtA8YrzC4CXrYKiW23Qj5B/fLmkFSStbmYv5mNiQ7EFoRxuj4mttsPzhJXL3Nk7mfGe\nfU4UZuSp9ZqnBYyw0MLZZnZRBVqnEjJy1jGzYTFOe0NPm/EH5YuR9aTbcLWU8/0cYe3au+hkHXin\n5+LOvouQ9BnC+rQdTvNXWH5uFCHzprD83GJlGJyegaT9aelAqPSxmVW8Jq/Tc/EwThdhZm8rLJhd\nCfPNbKFalp/LWnPc6UZYrP8eU3A/KhRCixVS+9XRNKeB6FVvA3oSZlZakrUcf1JYyGWgpEMIt+2Z\nloVSr4wAAAQKSURBVAR0uiV30nrd2qWBO+pki9NgeBgnUSTtCOwYN/9hZv6j7uF0pzo/TtfjYZxE\nidUQu7R2vJM8H0raxOLqVpJGA1nW83V6MO7sE6JMJk+BLDVQnO7J0cANkt6I20MJk6wcp0Pc2SeE\nmS1bbxucNImDsWOB9YBCRtdzZvZx+VaO04LH7B2nQUhhuUqncXFn7zgNgi9X6VSDO3vHaRB8uUqn\nGtzZO47j9AB8gNZxGghJuxAqqS6aOWtmp9XPIqdR8Bm0jtMgxFnVewJHEeL1exIWMnGcDvEwjuM0\nCJJmmNlGhaJ4sVbORDMbW2/bnPTxnr3jNA6F2bIfxbLXC4AV62iP00B4zN5xGodbJA0CzgIej/su\nraM9TgPhYRzHaRAkLQUcQZhJa8AU4GIz8/o4Toe4s3ecBkHSn4D3gasJA7T7AMuZ2R51NcxpCNzZ\nO06DIOlpM1u/o32O0xY+QOs4jcMTkrYobEjanJbYveO0i/fsHadBkPQsMAx4lRCzXwV4jpCVY75G\nsdMeno3jOI3DzvU2wGlcvGfvOI7TA/CYveM4Tg/Anb3jOE4PwJ294zhOD8CdvdPtkXSSpJmSnpQ0\nVdKmNTzWZEmb1ErfcTqLZ+M43ZqYl/5VYJSZfSJpMLBkDQ9ptLGalOPUG+/ZO92dFYG3zOwTADN7\nx8zekHSypEckzYh14oFFPfPfSnpU0tOSRkv6q6RZkk6Pr1lN0rOSro6v+VOsW9MKSTtKelDS45Ju\nkLRM3H+GpKfincbZXXQenB6OO3unuzMJ+Lyk5yRdJGmruP9CM9vUzDYCloorQEHolc83szHA74Ab\ngcOBDYH9Y9VJCJObLoqlCt4nFChbhKTlgZOA7cxsE8JM12PjncXXzWwDMxsBnF6rN+44xbizd7o1\nZvYBsAlwCPAmcL2kccC2kv4paTqwLWGpvwI3xf8zgafMbI6ZfQy8AHw+PveqmT0UH19NqERZQMDm\nUfNBSVOB/QgzXt8D5kn6P0nfoKVGvePUFI/ZO90eM1sI3AvcK2kGcBiwEbCJmb0m6RSK1nQF5sf/\nC4seF7YLv5niuLxoO05/h5ntU7ozDhBvB3wL+EF87Dg1xXv2TrdG0jBJaxftGgU8S3DOb8el/TpT\nIniVWIgMQqnh+4ueM+CfwBclrRntWEbS2jFuP9DMbgeOBUZ04tiOkxnv2TvdnWWBCyQNJBQM+xdw
\nKNBECNPMBh4u07a9zJrngO9Luhx4Cri4VUOztyTtD1wnqZD9cxIwF7hRUj/CHcExnXxfjpMJr43j\nOBmRtBpwcxzcdZyGwMM4jtM5vJfkNBTes3ccx+kBeM/ecRynB+DO3nEcpwfgzt5xHKcH4M7ecRyn\nB+DO3nEcpwfgzt5xHKcH8P8+ZFHPvwH2pwAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "goldBugContentWordTokensLowercaseFreqs.plot(20, title=\"Top Frequency Content Terms in Gold Bug\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we'd noticed in the last notebook, the words \"jupiter\" and \"legrand\" are suspiciously high frequency (for relatively uncommon words), which may suggest that they're being used as character names in the story. We can regenerate our NLTK text object and ask for concordances of each to confirm this hypothesis. To help differentiate between upper and lowercase words, we'll re-tokenize the text and not perform any case alternation or filtering." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Displaying 5 of 53 matches:\n", "ccompanied by an old negro , called Jupiter , who had been manumitted before th\n", "rived to instil this obstinacy into Jupiter , with a view to the supervision an\n", "nd gave me a most cordial welcome . Jupiter , grinning from ear to ear , bustle\n", " had hunted down and secured , with Jupiter 's assistance , a scarabæus which h\n", "tellin on you , '' here interrupted Jupiter ; `` de bug is a goole bug , solid \n", "Displaying 5 of 47 matches:\n", "cted an intimacy with a Mr. William Legrand . He was of an ancient Huguenot fam\n", " or more remote end of the island , Legrand had built himself a small hut , whi\n", "ot improbable that the relatives of Legrand , conceiving him to be somewhat uns\n", "repare some marsh-hens for supper . Legrand was in one of his fits -- how else \n", " only known you were here ! 
'' said Legrand , `` but it 's so long since I saw \n" ] } ], "source": [ "goldBugText = nltk.Text(nltk.word_tokenize(goldBugString))\n", "goldBugText.concordance(\"jupiter\", lines=5)\n", "goldBugText.concordance(\"legrand\", lines=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Are these two character names present throughout the story? One way to get a quick sense is to create a dispersion plot which is essentially a distribution graph of occurrences. Note that [dispersion_plot()](http://www.nltk.org/api/nltk.html?highlight=dispersion_plot#nltk.text.Text.dispersion_plot) takes a list of words as an argument, but that the words are case-sensitive (unlike the ```concordance()``` function). Since case matters, for other purposes it might have been preferable to use the lowercase tokens instead." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZcAAAEZCAYAAABb3GilAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGBxJREFUeJzt3XmUZGWd5vHvgwUqIgjKgAsI7ags7YKgIraaLm3bHtfj\nBgqtOGPrjOPWHkHRUehuHdDx4C7qUaRZFLfGdXCvbpVCBIECAW0UkEVREAFpEZHf/HHfhCCNrMrM\nejMjqfp+zomTN9773vf+4kZUPHHvjbqRqkKSpJ42mnQBkqT1j+EiSerOcJEkdWe4SJK6M1wkSd0Z\nLpKk7gwX3WYkeVSS8zqMc2GSx6/D8i9I8tV1raOXXttlAeu9KclfLPV6ddtguGjRrOub+ExV9Z2q\n2qnHUO32Z5J8PMkfklzTbmcleVuSzUfqOLaq/qZDHV103C63kmSHFiDXttsFSQ5cwDgvSvKd3vVp\neTNctJhmfRNfxgo4rKo2B+4G7A/sCXwvyaaTKirJJP+tblFVdwb2Ad6c5IkTrEW3EYaLllwGr09y\nfpIrkhyfZMs274NJPjPS97Ak32jTU0kuHpm3XZLPJflVG+e9rf0+Sb7V2n6d5JgkW8ynRICquqGq\nTgWeBtyVIWhu9Um8PZbDk1ye5Ookq5Ps0uZ9PMkRSb7W9oJWJtl+pP6dknw9yZVJzkvynJF5H2/b\n4itJfgdMJXlyknPaWJckee0s22Xntq6rkpyd5Kkzxn1/ki+1cU6e66GtqjoZ+BHwl3+2wZItkvxL\ney4uTPLGtm12Bj4IPKLt/fxmrk+CbtsMF03CKxnesB8N3B24Cnh/m/cPwAOSvDDJo4AXA383c4Ak\ntwO+BFwA3Bu4J/DJkS5vbWPvDGwHHLzQYqvqd8DXgUeNmf3E1n7fqtoCeA4w+gb6fOAfGfaCzgCO\nbfXfqY15DLA1sDfwgfZmPG0f4J+qajPgJOCjwEvaXtWuwLdmFpNkY+CLwIlt3FcAxya530i35zFs\njy2B8xm21Zq0nMgj23pPH9PnvcCdgR2BxzA8Z/tX1bnAy4BVVXXnqtpq
LevSesJw0SS8FHhTVV1W\nVX8EDgGenWSjqvo9sB9wOHA08L+q6rIxYzyMITxeV1W/r6o/VNX3AKrqp1X1zar6Y1Vd0cZ6zDrW\n/Atg3BvjHxneVHdu9f+4qn45Mv9LVfXdqroBeCPDJ/h7AU8BLqiqo6rqpqo6A/gcQzhNO6GqVrXH\ndD1wA7Brks2r6uqqGvcmvydwp6o6tKpurKpvM4TwPiN9PldVp1bVnxjC7sFreexXAFcCHwEObGPe\nrAX984A3VNV1VXUR8E6G5xHanqA2LIaLJmEH4F/bYZurgHOAG4FtAKrqFOBnre+nZxljO+Ciqrpp\n5owk2yT5ZDt0dDVDSN11HWu+J8Mb7K1U1beA9zHseV2e5ENJ7jw9G7hkpO91DHs192DY23r49DZo\n2+H5tG3Qlr35UFfzLODJwIXtsNeeY+q8x5jlLmrt0+NePjLv98Bmsz7qwV2raquq2qWq3jdm/t2A\njdt6pv2cYZtpA2W4aBJ+DjypqrYcuW1aVb8ASPJyYBPgMuCAWca4GNi+fWqe6W3An4C/bIeq9mN+\nr/VbfQkhyWbAE4Cx33iqqvdW1R7ALsD9gNdNL8oQgqPjbAVcyrAN/m3GNrhzVb181qKGvY1nMBzu\nOgH41JhulwHbJRndW7h3W+diuYJhD26HkbbtuSVYb2tf6lAHhosW2yZJ7jByWwEcAbxt+uR2kq2T\nPK1N3w/4J+AFDMftD0jyoDHjnsJwqOrQJJu2sfdq8zYDrgOuSXJPbnmzn4u0G0lun2R3hjfyK4Ej\n/6xzskeSh7dzHf8JXM8QbNOenOSRSTZpj2tVVV0KfBm4X5J9k2zcbg9NMv2V4sxYz8YZ/n/NFu1w\n1rUz1jPt+62OA9oyUwyH4KbPR3U/RNXq+RTw1iSbJbk38BqG80kw7Cndq20jbSAMFy22rzC82U3f\n3gy8G/gC8LUk1wCrgIe1vZCjgUOr6qyqOh84CDh65I2p4OY3tKcC/5VhL+Bi4LmtzyHAQ4CrGU5u\nf5a5f3ouhjfmaxg+kR8F/ADYq50Pmu4zPd7mwIcZDndd2JZ5x0i/44C3MITTbsC+rf5rGb4MsDfD\nXsUvgP/DsMc2cx3T9gUuaIf6/p4hgEfrpp3beSrwt8CvGQ7Z7VdVP1nDuGvaNnOd9wqGQP8Zwx7e\nsdwSxt9k+JbZL5P8ag3jaT0SfyxMWhxJjgQuqar/PelapKXmnou0ePyWlDZYhou0eG6LVyiQuvCw\nmCSpO/dcJEndrZh0Ab0kcRdMkhagqrqfH1yv9lyqalnd3vKWt0y8httKXdZkTRtCXcuxpsWyXoWL\nJGl5MFwkSd0ZLotoampq0iWMtRzrsqa5saa5W451LceaFst681XkJLW+PBZJWipJKE/oS5JuCwwX\nSVJ3hoskqTvDRZLUneEiSerOcJEkdWe4SJK6M1wkSd0ZLpKk7gwXSVJ3hoskqTvDRZLUneEiSerO\ncJEkdWe4SJK6M1wkSd0ZLpKk7gwXSVJ3hoskqTvDRZLUneEiSerOcJEkdWe4SJK6M1wkSd0ZLpKk\n7gwXSVJ3hoskqTvDRZLUneEiSerOcJEkdWe4SJK6M1wkSd0ZLpKk7gwXSVJ3hoskqTvDRZLUneEi\nSerOcJEkdWe4SJK6M1wkSd0ZLpKk7gwXSVJ3hoskqTvDRZLUneEiSerOcJEkdWe4SJK6M1wkSd0Z\nLpKk7gwXSVJ3hoskqTvDRZLUneEiSerOcJEkdWe4SJK6M1wkSd0ZLpKk7gwXSVJ3hoskqTvDRZLU\nneEiSerOcJEkdWe4SJK6M1wkSd2tU7gk/G6By700Yb82/aKEu69LHZKk5WVd91xqQQsVH6ri6Hb3\nhcA95rN8wu0Wst6VK4fbzLbRv2tqn9lnIeufWce48Wdbz1z6zlbvmsYe1/aud42vYdxy86l73Ly5\nbuu1jT9z+87nMY/rv7bHNZ/a5tK3
t8V4Dc933RpvPq+t26p1PiyW8JiEL47cf1/CC9v0hQmHJaxO\n+H7CfVr7wQmvTXgWsAdwbMIPE+6QsHvCyoRTE05M2LYtszLh8IQfAK9cSK2Gy9zf7E44YXwN45Yz\nXAyX2dat8QyXhSlu2aMp4LdVPBB4H/Cu0T5VfBY4FXh+FQ8B/gS8F3hWFXsARwJvHVlm4yoeWsXh\ni1C3JKmTFUuwjk+0v5+EWUMh7e/9gV2Bb2RouR1w2Ui/49e0ooMPPvjm6ampKaampuZbqySt11au\nXMnKJdhl6hEuN3LrPaA7rqHvbOdoptsD/KiKvWbpd92aChkNF0nSn5v5wfuQQw5ZlPX0OCx2EbBL\nwiYJdwEeN2P+80b+ntSmwy17K9cCm7fpHwNbJ+wJkLBxwi4dapQkLaEF77kkrAD+UMUlCZ8CzgYu\nAH44o+uWCWcC1wP7tLbR8zIfB45I+E9gL+DZwHsStmj1HQ6cs9A6R407SjbdNnPeuPZ1Pcq2tjFn\nq2U+fRcy9ri2ZzxjfA3jlptP3Wuat7Ztvbbx57JNZmtb07aay/M+n8e+VEdrF+M1PN91a7z5vLZu\nq1K1oG8Tk/Ag4ENVw17GLH0uAHav4jcLrG8e9aQW+lgkaUOVhKrK2nvOz4IOiyW8DDgOeNNauvpu\nL0kboAXvuSw37rlI0vwtqz0XSZLWxHCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3h\nIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNcJEnd\nGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS\n1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNc\nJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7\nw0WS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKm7NYZLwu+WqpD5WK51SZIGa9tz\nqZ4rS1jRaaixda1cufa2NfVZW9/R+ytXzm3s+cyf77Kzjbe2vuNqH9dn3FjzqWNcv3Hjrmlbzlbr\n2mqY63MxWx3jtsVcx15o36UcazH0en1P+nHO59+VZjfvw2IJ90n4fwmnJvx7wv1H2k9OWJ3wzwnX\ntvaphO8kfB44u7Wd0JY/O+ElI2P/ri17RsKqhP/S2nds91cn/PNstRkuc+truBgui8Fw0aiFnHP5\nMPCKKvYAXgd8oLW/Gzi8igcCF89YZjfglVXs1O7v35Z/KPDKhC1b+6bAqioeDPw73Bw87wbe38a+\nbAE1S5KW0LwOUyVsBjwC+HRyc/Mm7e+ewNPa9CeA/zuy6ClVXDRy/1UJz2jT2wH3BU4Bbqjiy639\nNOCv2/RewDPb9DHAYePqW7nyYA4+eJiemppiampqzo9NkjYEK1euZOUS7IbN9xzIRsBvq9htnstd\nNz2RMAU8HtiziusTvg3coc3+48gyN823vqmpW8JFkvTnZn7wPuSQQxZlPfM6LFbFNcAFCc8GSEjC\nA9vsk2FoB/ZewzCbA1e1YNmJYY9nbb43MuYL5lOzJGnprW3PYNPkVudP3snw5v7BhDcBGzMcAlsN\nvBo4JuEg4KvA1SPLjX6760TgZQnnAD8GVs3Sr0buvwo4LuFA4PPM8m2xcUfBZratqc/a+o7en8u6\n5jt/vsvONt7a+s5lublukzWNN9ca5vOczGXsuSw3bvy51jef57HnkdnlfpS31+t70o9zPv+uNLtU\n9fm2ccIdq/h9m94beF7VzedJFl2S6vVYJGlDkYSqytp7zk+v/3cCsHvC+4AAVwEv7ji2JOk2pNue\n
y6S55yJJ87dYey5eW0yS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3h\nIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNcJEnd\nGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS\n1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNc\nJEndGS6SpO4MF0lSd4aLJKk7w0WS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7\nw0WS1J3hIknqznCRJHVnuEiSujNcJEndGS6SpO4MF0lSd4aLJKk7w2URrVy5ctIljLUc67KmubGm\nuVuOdS3HmhaL4bKIlusLaTnWZU1zY01ztxzrWo41LRbDRZLUneEiSeouVTXpGrpIsn48EElaYlWV\n3mOuN+EiSVo+PCwmSerOcJEkdbdehEuSJyU5L8l/JDlwEdezXZJvJ/lRkrOTvLK1b5Xk60l+kuRr\nSe4ysswbWl3nJXniSPvuSc5q897dobbbJTk9yReXUU13SfKZJOcmOSfJwyddV5LXtOfurCTHJbn9\nUteU5GNJLk9y1khbtxraYzq+tZ+c5N4LrOkd7bk7M8nnkmyxlDXNVtfIvNcmuSnJVpPeVq39FW17\nnZ3ksEnXlOTBbfnTk/wgyUOXsiaq6jZ9A24HnA/sAGwMnAHsvEjr2hZ4cJveDPgxsDPwduCA1n4g\ncGib3qXVs3Gr73xuOc91CvCwNv0V4EnrWNs/AMcCX2j3l0NNRwEvbtMrgC0mWRdwT+BnwO3b/eOB\nFy51TcCjgN2As0bautUA/E/gA236ecAnF1jTXwMbtelDl7qm2epq7dsBJwIXAFstg231WODrwMbt\n/tbLoKavAX/Tpv8W+PaS1rSQf6TL6QY8Ajhx5P7rgdcv0bpPAJ4AnAds09q2Bc5r028ADhzpfyKw\nJ3B34NyR9r2BI9ahjnsB32gv8C+2tknXtAXwszHtE6uLIVx+DmzJEHZfZHgDXfKa2j/q0TeCbjW0\nPg9v0yuAXy+kphnzngkcs9Q1zVYX8Gnggdw6XCa2rYBPAY8b02+SNZ0IPLdN77PUz9/6cFjsnsDF\nI/cvaW2LKskODJ8Uvs/wpnB5m3U5sE2bvkerZ2ZtM9svZd1qPhx4HXDTSNuka9oR+HWSI5P8MMlH\nktxpknVV1aXAOxkC5jLgt1X19UnWNKJnDTf/m6iqG4GrRw8dLdCLGT7JTrymJE8HLqmq1TNmTbKu\n+wKPboeMVibZYxnU9GrgHUl+DryDIVSWrKb1IVxqqVeYZDPgs8CrquraWxUzRPuS1ZTkKcCvqup0\nYOx31Ze6pmYF8BCGXemHANcx7FVOrK4kWwJPY/iEdw9gsyT7TrKmcZZDDaOSvBG4oaqOWwa1bAoc\nBLxltHlC5YxaAWxZVXsyfND71ITrgeFQ1quranvgNcDHlnLl60O4XMpw/HXadtw6fbtKsjFDsBxd\nVSe05suTbNvm3x341Sy13avVdmmbHm2/dIEl7QU8LckFwCeAxyU5esI10ca8pKp+0O5/hiFsfjnB\nup4AXFBVV7ZPX59jOKw6yZqm9Xi+LhlZZvs21gpgi6r6zUKKSvIi4MnAC0aaJ1nTfRg+HJzZXvP3\nAk5Lss2E67qE4fVEe83flORuE67p76rqX9v0Z4CHjYy/6DWtD+FyKnDfJDsk2YThZNMXFmNFSQJ8\nFDinqt41MusLDCeGaX9PGGnfO8kmSXZk2HU+pap+CVyT4dtTAfYbWWZequqgqtquqnZkOEb6rara\nb5I1tbp+CVyc5H6t6QnAjxjOc0yqrouAPZPcsY31BOCcCdc0rc
fz9fkxYz0b+OZCCkryJIZP4U+v\nqutn1DqRmqrqrKrapqp2bK/5S4CHtEOKE6uL4fl6HEB7zW9SVVdMuKbLkjymTT8O+MnI+Itf01xO\nFC33G8M3IX7M8K2HNyziev6K4bzGGcDp7fYkYCuGE+o/YfiGxl1Gljmo1XUe7ZsbrX134Kw27z2d\n6nsMt3xbbOI1AQ8CfgCcyfCpbotJ1wUcDJzbxjuK4RszS1oTwx7mZcANDMex9+9ZA3B7hsMy/wGc\nDOywgJpe3Ja/aOS1/oGlrGlGXX+Y3lYz5v+MdkJ/Atvq5pra6+joto7TgKkJP3/7A49k+PB9BrAK\n2G0pa/LyL5Kk7taHw2KSpGXGcJEkdWe4SJK6M1wkSd0ZLpKk7gwXSVJ3hos2KEkOT/KqkftfTfKR\nkfvvTPKaBY49lfaTB2Pm/VWS72e4JPu5SV4yMm/rNu+01u85GX6iYN7/eS7JQQupXerNcNGG5rsM\nl8whyUbAXRkuQT7tEcD35jJQW34u/bZl+DmEl1bVzgz/GfelSZ7cujweWF1Vu1fVd4H/Bvz3qnr8\nXMaf4Q1r7yItPsNFG5pVDAECsCtwNnBthh82uz3D7/P8MMnj29WcVyf5aLu0EEkuTHJoktOA52T4\nobpz2/1nzrLOlwNHVtUZAFV1JXAA8PokDwIOA56e4Ued3szwP6s/luTtSXZNckqbd2aS+7Q69m17\nO6cnOSLJRkkOBe7Y2o5ehG0nzdmKSRcgLaWquizJjUm2YwiZVQyXE38EcA2wmuEH6I5k+H2O85Mc\nBfwP4N0MVyu+oqp2T3IHhsu1PLaqfprkeMZfzXgX4OMz2k4Ddq2qM1ug7F5V079s+ljgtVX1wyTv\nAd5VVce1CwauSLIz8Fxgr6r6U5IPAC+oqtcneXlV7dZre0kL5Z6LNkQnMRwa24shXFa16elDYvdn\nuHry+a3/UcCjR5Y/vv3dqfX7abt/DLNf/n1Nl4XPGuavAg5KcgDD9ZyuZziMtjtwapLTGS5KuOMa\nxpeWnOGiDdH3GA49PYDhIn0nc0vYnDSmf7j1Hsl1s4w7W0CcwxAGo3ZnOCS3RlX1CeCpwO+Br7S9\nGoCjqmq3dtupqv5xbWNJS8lw0YboJOApwJU1uAq4C8Oey0kMh7p2mD6/wXDp8X8bM855rd9ftPv7\nzLK+9wMvaudXSHJXht+kf/vaCk2yY1VdUFXvZbj8+QMYLnf+7CRbtz5bJdm+LfLHdvhMmijDRRui\nsxm+JXbySNtqhp89/k079LQ/8Okkq4EbgSNav5v3YFq/vwe+3E7oX86Ycy41/E7GvsBHkpzLsOf0\n0ar68siYs12e/LlJzm6Hv3YF/qWqzgXeBHwtyZkMl+jftvX/MLDaE/qaNC+5L0nqzj0XSVJ3hosk\nqTvDRZLUneEiSerOcJEkdWe4SJK6M1wkSd0ZLpKk7v4/sf/94C1QRlgAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "goldBugText.dispersion_plot([\"Jupiter\", \"Legrand\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This graph suggests that there are many more occurrences of the character names in the first half of the text. 
This doesn't necessarily mean that the characters are less present in the second half, only that their names appear less often (perhaps because of a shift in dialogue structure or narrative focus)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Next Steps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are some tasks to try:\n", "\n", "* Generate a simple list of the 20 most frequent lowercase content terms (just the terms, without counts)\n", "* Create a dispersion plot of these terms. Do any others stand out as irregularly distributed?\n", "* Try the command [goldBugText.collocations()](http://www.nltk.org/api/nltk.html?highlight=text#nltk.text.Text.collocations) – what does this do? How might it be useful?\n", "\n", "In the next notebook we're going to look at more powerful ways of matching terms and [Searching Meaning](SearchingMeaning.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "[CC BY-SA](https://creativecommons.org/licenses/by-sa/4.0/) From [The Art of Literary Text Analysis](ArtOfLiteraryTextAnalysis.ipynb) by [Stéfan Sinclair](http://stefansinclair.name) & [Geoffrey Rockwell](http://geoffreyrockwell.com). Edited and revised by [Melissa Mony](http://melissamony.com).
Created January 27, 2015 and last modified January 14, 2018 (Jupyter 5.0.0)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 1 }