{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Similarity Queries using Annoy Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial is about using the [Annoy (Approximate Nearest Neighbors Oh Yeah)](https://github.com/spotify/annoy \"Link to annoy repo\") library for similarity queries in gensim." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why use Annoy?\n", "The current implementation for finding the k nearest neighbors in a vector space in gensim is a brute-force search, so its complexity is linear in the number of indexed documents, although with extremely low constant factors. The retrieved results are exact, which is overkill in many applications: approximate results retrieved in sub-linear time may be enough. Annoy can find approximate nearest neighbors much faster." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the following examples, we'll use the Lee Corpus (which you already have if you've installed gensim).\n", "\n", "See the [Word2Vec tutorial](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/word2vec.ipynb) for how to initialize and save this model."
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Word2Vec(vocab=6981, size=100, alpha=0.025)\n" ] } ], "source": [ "# Load the model\n", "import gensim, os\n", "from gensim.models.word2vec import Word2Vec\n", "\n", "# Set the path to the training data\n", "test_data_dir = '{}'.format(os.sep).join([gensim.__path__[0], 'test', 'test_data']) + os.sep\n", "lee_train_file = test_data_dir + 'lee_background.cor'\n", "\n", "class MyText(object):\n", "    def __iter__(self):\n", "        for line in open(lee_train_file):\n", "            # assume there's one document per line, tokens separated by whitespace\n", "            yield gensim.utils.simple_preprocess(line)\n", "\n", "sentences = MyText()\n", "model = Word2Vec(sentences, min_count=1)\n", "print(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "#### Comparing the traditional implementation and Annoy\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These benchmarks were run on a 2.4GHz 4-core i7 processor." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Set up the model and vector that we are using in the comparison\n", "try:\n", "    from gensim.similarities.index import AnnoyIndexer\n", "except ImportError:\n", "    raise ValueError(\"SKIP: Please install the annoy indexer\")\n", "\n", "model.init_sims()\n", "annoy_index = AnnoyIndexer(model, 300)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[(u'the', 1.0),\n", " (u'on', 0.999976396560669),\n", " (u'in', 0.9999759197235107),\n", " (u'two', 0.9999756217002869),\n", " (u'after', 0.9999749660491943)]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Dry run to make sure both indices are fully in RAM\n", "vector = model.wv.syn0norm[0]\n", 
"model.most_similar([vector], topn=5, indexer=annoy_index)\n", "model.most_similar([vector], topn=5)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Gensim: 0.002638526\n", "Annoy: 0.001149898\n", "\n", "Annoy is 2.29 times faster on average over 1000 random queries on this particular run\n" ] } ], "source": [ "import time, numpy\n", "\n", "def avg_query_time(annoy_index=None):\n", "    \"\"\"\n", "    Average query time of a most_similar method over 1000 random queries,\n", "    uses annoy if given an indexer\n", "    \"\"\"\n", "    total_time = 0\n", "    for _ in range(1000):\n", "        rand_vec = model.wv.syn0norm[numpy.random.randint(0, len(model.vocab))]\n", "        start_time = time.clock()\n", "        model.most_similar([rand_vec], topn=5, indexer=annoy_index)\n", "        total_time += time.clock() - start_time\n", "    return total_time / 1000\n", "\n", "gensim_time = avg_query_time()\n", "annoy_time = avg_query_time(annoy_index)\n", "print(\"Gensim: {}\".format(gensim_time))\n", "print(\"Annoy: {}\".format(annoy_time))\n", "print(\"\\nAnnoy is {} times faster on average over 1000 random queries on \\\n", "this particular run\".format(numpy.round(gensim_time / annoy_time, 2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "**This speedup factor is by no means constant**: it will vary greatly from run to run and is particular to this data set, BLAS setup, Annoy parameters (the speedup factor decreases as tree size increases), and machine specifications, among other factors.\n", "\n", ">**Note**: Initialization time for the annoy indexer was not included in the times. The optimal knn algorithm for you to use will depend on how many queries you need to make and the size of the corpus. If you are making very few similarity queries, the time taken to initialize the annoy indexer will be longer than the time it would take the brute force method to retrieve results.
If you are making many queries, however, the time it takes to initialize the annoy indexer will be made up for by the much faster retrieval times once the indexer has been initialized.\n", "\n", ">**Note**: Gensim's `most_similar` method uses numpy operations in the form of dot products, whereas Annoy's method doesn't. If `numpy` on your machine is linked against an optimized BLAS library such as ATLAS or OpenBLAS, it will run on multiple cores (only if your machine has multicore support). Check the [SciPy Cookbook](http://scipy-cookbook.readthedocs.io/items/ParallelProgramming.html) for more details." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is Annoy?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Annoy is an open source library to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data. For our purposes, it is used to find similar words or documents in a vector space. [See the tutorial on similarity queries for more information on them](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Similarity_Queries.ipynb)."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting Started" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first thing to do is install annoy by running the following on the command line:\n", "\n", "`sudo pip install annoy`\n", "\n", "Then set up the logger:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# import modules & set up logging\n", "import logging\n", "logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making a Similarity Query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating an indexer\n", "An instance of `AnnoyIndexer` needs to be created in order to use Annoy in gensim. The `AnnoyIndexer` class is located in `gensim.similarities.index`.\n", "\n", "`AnnoyIndexer()` takes two parameters:\n", "\n", "**`model`**: A `Word2Vec` or `Doc2Vec` model\n", "\n", "**`num_trees`**: A positive integer. `num_trees` affects the build time and the index size. **A larger value will give more accurate results, but larger indexes**. More information on what trees in Annoy do can be found [here](https://github.com/spotify/annoy#how-does-it-work). The relationship between `num_trees`, build time, and accuracy will be investigated later in the tutorial.\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from gensim.similarities.index import AnnoyIndexer\n", "# 100 trees are being used in this example\n", "annoy_index = AnnoyIndexer(model, 100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we are ready to make a query, let's find the top 5 most similar words to \"science\" in the Lee corpus. To make a similarity query we call `Word2Vec.most_similar` like we would traditionally, but with an added parameter, `indexer`.
The only supported indexer in gensim as of now is Annoy." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(u'science', 0.9998273665260058)\n", "(u'rates', 0.9086664393544197)\n", "(u'insurance', 0.9080813005566597)\n", "(u'north', 0.9077721834182739)\n", "(u'there', 0.9076579436659813)\n" ] } ], "source": [ "# Derive the vector for the word \"science\" in our model\n", "vector = model[\"science\"]\n", "# The instance of AnnoyIndexer we just created is passed as the indexer\n", "approximate_neighbors = model.most_similar([vector], topn=5, indexer=annoy_index)\n", "# Neatly print the approximate_neighbors and their corresponding cosine similarity values\n", "for neighbor in approximate_neighbors:\n", "    print(neighbor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Analyzing the results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The closer the cosine similarity of a vector is to 1, the more similar that word is to our query, which was the vector for \"science\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Persisting Indexes\n", "You can save your indexes to disk and load them back later, so that you don't have to construct them each time. Saving creates two files on disk, _fname_ and _fname.d_. Both files are needed to correctly restore all attributes. Before loading an index, you will have to create an empty AnnoyIndexer object."
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "fname = 'index'\n", "\n", "# Persist index to disk\n", "annoy_index.save(fname)\n", "\n", "# Load index back\n", "if os.path.exists(fname):\n", "    annoy_index2 = AnnoyIndexer()\n", "    annoy_index2.load(fname)\n", "    annoy_index2.model = model" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(u'science', 0.9998273665260058)\n", "(u'rates', 0.9086666032671928)\n", "(u'insurance', 0.9080811440944672)\n", "(u'north', 0.9077721834182739)\n", "(u'there', 0.9076577797532082)\n" ] } ], "source": [ "# Results should match the query above, up to small floating-point differences\n", "vector = model[\"science\"]\n", "approximate_neighbors = model.most_similar([vector], topn=5, indexer=annoy_index2)\n", "for neighbor in approximate_neighbors:\n", "    print(neighbor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Be sure to load the index with the same model that was used to build it; otherwise you will get unexpected behavior." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save memory by memory-mapping indices saved to disk" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Annoy library has a useful feature: indices can be memory-mapped from disk. This saves memory when the same index is used by several processes.\n", "\n", "Below are two snippets of code. The first one has a separate index for each process. The second snippet shares the index between two processes via memory-mapping. The second example uses less total RAM because the index is shared."
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Process Id: 6216\n", "(u'klusener', 0.9090957194566727)\n", "(u'started', 0.908975400030613)\n", "(u'gutnick', 0.908865213394165)\n", "(u'ground', 0.9084076434373856)\n", "(u'interest', 0.9074432477355003)\n", "Memory used by process 6216= pmem(rss=126914560, vms=1385103360, shared=9273344, text=3051520, lib=0, data=1073524736, dirty=0)\n", "Process Id: 6231\n", "(u'klusener', 0.9090957194566727)\n", "(u'started', 0.908975400030613)\n", "(u'gutnick', 0.908865213394165)\n", "(u'ground', 0.9084076434373856)\n", "(u'interest', 0.9074432477355003)\n", "Memory used by process 6231= pmem(rss=126496768, vms=1385103360, shared=8835072, text=3051520, lib=0, data=1073524736, dirty=0)\n", "CPU times: user 64 ms, sys: 12 ms, total: 76 ms\n", "Wall time: 2.86 s\n" ] } ], "source": [ "%%time\n", "\n", "# Bad example. Two processes load the Word2vec model from disk and create their own Annoy indices from that model.
\n", "\n", "from gensim import models\n", "from gensim.similarities.index import AnnoyIndexer\n", "from multiprocessing import Process\n", "import os\n", "import psutil\n", "\n", "model.save('/tmp/mymodel')\n", "\n", "def f(process_id):\n", "    print('Process Id: {}'.format(os.getpid()))\n", "    process = psutil.Process(os.getpid())\n", "    new_model = models.Word2Vec.load('/tmp/mymodel')\n", "    vector = new_model[\"science\"]\n", "    annoy_index = AnnoyIndexer(new_model, 100)\n", "    approximate_neighbors = new_model.most_similar([vector], topn=5, indexer=annoy_index)\n", "    for neighbor in approximate_neighbors:\n", "        print(neighbor)\n", "    print('Memory used by process {}= {}'.format(os.getpid(), process.memory_info()))\n", "\n", "# Create and run two processes, one after the other; each builds its own index.\n", "p1 = Process(target=f, args=('1',))\n", "p1.start()\n", "p1.join()\n", "p2 = Process(target=f, args=('2',))\n", "p2.start()\n", "p2.join()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Process Id: 6246\n", "(u'science', 0.9998273665260058)\n", "(u'rates', 0.9086664393544197)\n", "(u'insurance', 0.9080813005566597)\n", "(u'north', 0.9077721834182739)\n", "(u'there', 0.9076579436659813)\n", "Memory used by process 6246 pmem(rss=125091840, vms=1382862848, shared=22179840, text=3051520, lib=0, data=1058062336, dirty=0)\n", "Process Id: 6261\n", "(u'science', 0.9998273665260058)\n", "(u'rates', 0.9086664393544197)\n", "(u'insurance', 0.9080813005566597)\n", "(u'north', 0.9077721834182739)\n", "(u'there', 0.9076579436659813)\n", "Memory used by process 6261 pmem(rss=125034496, vms=1382862848, shared=22122496, text=3051520, lib=0, data=1058062336, dirty=0)\n", "CPU times: user 44 ms, sys: 16 ms, total: 60 ms\n", "Wall time: 202 ms\n" ] } ], "source": [ "%%time\n", "\n", "# Good example.
Two processes load both the Word2vec model and the index from disk and memory-map the index\n", "\n", "from gensim import models\n", "from gensim.similarities.index import AnnoyIndexer\n", "from multiprocessing import Process\n", "import os\n", "import psutil\n", "\n", "model.save('/tmp/mymodel')\n", "\n", "def f(process_id):\n", "    print('Process Id: {}'.format(os.getpid()))\n", "    process = psutil.Process(os.getpid())\n", "    new_model = models.Word2Vec.load('/tmp/mymodel')\n", "    vector = new_model[\"science\"]\n", "    annoy_index = AnnoyIndexer()\n", "    annoy_index.load('index')\n", "    annoy_index.model = new_model\n", "    approximate_neighbors = new_model.most_similar([vector], topn=5, indexer=annoy_index)\n", "    for neighbor in approximate_neighbors:\n", "        print(neighbor)\n", "    print('Memory used by process {} {}'.format(os.getpid(), process.memory_info()))\n", "\n", "# Create and run two processes, one after the other; both memory-map the same index file.\n", "p1 = Process(target=f, args=('1',))\n", "p1.start()\n", "p1.join()\n", "p2 = Process(target=f, args=('2',))\n", "p2.start()\n", "p2.join()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Relationship between num_trees and initialization time" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYkAAAEaCAYAAADkL6tQAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XeYU2Xax/HvDYLYsRdQUOwoNkR9bWNHBVHXAlbQ1bW7\n9rYr2Nu6NuwiAgrYsKDAsquMIruAigiIgAWQJgiKoggCc79/PGckDslMZibJSSa/z3XNRXLOyTn3\nhDO583Rzd0RERJKpF3cAIiKSv5QkREQkJSUJERFJSUlCRERSUpIQEZGUlCRERCQlJQkREUlJSUIk\nDWY2yMzOzPSxSV7bzMzKzKxebc9VxXUmmNlBmT5vimstMrPmubiWZJ5pMJ1kg5lNBc5193fjjiVO\nZnY28Gd3PzDN45sBXwMN3L0sQzH0BGa4+82ZOF8V1xoG9HH3Z7N9LckNlSQkFmZWP+4YcsQAfROT\nwuXu+qljP8BU4CrgU+AHoB/QMNp3NjC8wvFlwDbR457Ao8AgYBEwHNgUeAD4HpgI7FbF9XsDK4Bf\ngJ+Aq4Fm0XXOAaYDpdGx+wIjojg/AQ5OOM+6wDPAbGAGcBsrS78tgFJgITAP6JcilkHARRW2jQWO\njx4/AMwFfozer51TnGcYcE7iewjcF70nXwFtKx4L7Aj8CiyL3svvo/3HAGOia04Huia8tln03tVL\nct2x0fv5U3S+MuCgaN9LwJzofSwFdoq2nwf8BiyJXvdGwj1yaPS4IfAgMAuYGb0nDaJ9B0fv/ZXR\n+zQL6JziPbodWA4sjq71cG3vL2Bz4JXo//gr4NK4/76K7Sf2APSThf/U8AEwMvrjaxz94Z0f7Tsb\neL/C8Ssq/BHPA3aPPjzeIVR/nE74Vnwb8G6aMRyS8Lw8STwHrAGsDmwBzAeOio45LHq+YfT8NeAx\noBGwUfQ7nRft6wvcED1uCPxfijjOBD5IeL5z9GHUADgS+BBYJ9q3A7BpivNUTBJLCYnAgAuAWZUc\nW/H9PghoGT3ehfDhflzC+5Q0SVQ4x3nR/+va0fPOwJrR7/VP4JOEY3sCtyb5/ylPErcC/wU2jH5G\nALdE+w4mJLmuQH3gaELyX6+q96m291f0/CPgpujazYEvgSPi/hsrph9VN9VdD7n7XHdfCAwk/FGm\nYhWev+buY939N8IH9a/u/oKHv9wXqzhXZed1wrfmX919KXAG8La7/wvA3d8hfCgcY2abED6QrnD3\nJe4+n/Btt2N0rmVAMzNr4u6/uft/U8TwGrCbmW0ZPT8NGODuy6JzrAPsbGbm7pPdfW6av9t0d382\nek96AZtHMVfJ3d9398+ixxOA/oQP47SY2QGED9P27v5zdJ7n3H1x9HvdSvid10nzlKcRksICd18A\n3EJIruV+A25z9xXuPhj4mZBQ0w65wvN07682wEbufkd07WmEkmVHJGeUJOquxA+7xcDaNXztr0me\nV+dcFc1MeNwMOMXMvo9+fgD2J1QxNCN8K56TsO8JYOPotdcQ7t/RZjbezLoku1j0ITqIlR8snYAX\non3DgO6E6o+5ZvaEmaX7u32bcI1fo4dpvdbM2pjZu2Y2z8wWAn8hlJTSee2WhA/Ss9z9q2hbPTO7\n28y+jM43lZCQ0zonoUT3TcLz6dG2cgv8j43o1b2fKkr3/toKaFLh/rgBSCsZS2YoSRSfXwjVEgCY\n2WZZuk6qxtrE7TOA3u6+QfSzvruv4+73RvuWEKqeyvc1dvdWAO4+z93Pd/cmhOqex8xsmxTX7Aec\nZmb7AqtHyYHoPN3dvTWhGmoHQvLJpGTvQ1/gdaCJuzcGnmTVb9urMLNGhG/e/3T3oQm7TgPaE6qP\nGhOqZSzhnFU1nM8mJOVyzaJtNZHJRvoZwNcV7o/13L19Bq8hVVCSKD6fAi3NrJWZrU6oa67uH3aV\nH2iEb9oVP7Qrvu55oL2ZHRl9G25kZgeb2Rbu/i0wFHjAzNaxY
Jvyvv1mdpKZNYnOs5DQ3pGqy+gg\nwgffrYRv4UTnaB19q1+N8A12SSXnqKm5QFMza5CwbW3gB3dfZmZtCB/yiVK9vz2Bz939/grb1yG0\nkfxgZmsBd/HH/9O5rPp/kagf8Dcz28jMNgL+DvSp7JeqRFXXSkf57z8aWGRm10b3Rn0za2lmrWt5\nfqmGrCYJM+thZnPNbFwVx+1tZsvM7MRsxlNEUn7ou/sXhA/Ld4AphN4lGTt/gruBv0fVBFcme527\nzwQ6ADcC3xGqOa5m5X15FqFxcyKhsflloLzkszcwysx+Inwrvyyqs1412FD3PYDQMN43Yde6wNPR\nuacSGs3vS/H7VPU7e4rH7wKfAd+a2bxo28XAbWb2I/A3EhJXFec6FTghGpy2yMx+MrP9Cb3JviH0\nPJpAaIRO1IPwxeB7MxuQ5Ly3E9qCxhG+RHwE3JHm71rRQ8DJZrbAzB5M4/iU54+quNoR2iimEhq8\nnyb8v0mOZHUwXdTA9jOhSqFVimPqAf8mfJN71t0HJDtORERyL6slCXf/gNBvuzKXsrIftIiI5JFY\n2yTMbAvCoKbHSa+eW/KEmW2ZUOVR/lP+vGnc8YlIZqwW8/UfBK5LeK5EUSDcfQahwVRE6rC4k0Rr\noL+ZGaFP99Fmtszd36x4oJlp/hsRkRpw9xp/Ac9FdVNif+0/cPdtop+tCe0SFyVLEAnH68edrl27\nxh5DvvzovdB7ofei8p/aympJwsz6AiXAhmb2DaFPfkPA3f2pCoerpCAikmeymiTcveIgocqOPSeb\nsYiISPVpxHUBKikpiTuEvKH3YiW9FyvpvcicglmZLkzSWRixiojkCzPD87zhWkRECpSShIiIpKQk\nISIiKSlJiIhISkoSIiKSkpKEiIikpCQhIiIpKUmIiEhKShIiIpKSkoSIiKSkJCEiIikpSYiISEpK\nEiIiBebXX2H58txcS0lCRKTA/PWv0LYtLFuW/WspSYiIFBB3eOst+OUXuPDC8DyblCRERArI+PHQ\nqBH8+9/w8cdw773ZvV5Wly8VEZHMGjIkVDWtvTYMHAj77QctWsBJJ2XneipJiIgUkMGD4eijw+Om\nTeGNN0K108KF2bmeli8VESkQixbBFlvAt9/CWmut3H7uubD55nD77au+RsuXiogUiXfegX33/WOC\nALj5Znj8cZg3L/PXVJIQESkQ5e0RFTVrBqefDnfdlflrZrW6ycx6AO2Aue7eKsn+04DroqeLgAvd\nfXyKc6m6SUTqtNmzQylhvfVW3ecOW28NgwbBzjuvuv/bb6FlSxg7FrbccuX2fK9u6gkcVcn+r4GD\n3H034Hbg6SzHIyKSt84/Hzp1Sj72YdIkKCuDnXZK/trNNoPzzoPbbstsTFlNEu7+AfBDJftHuvuP\n0dORQJNsxiMikq+WLIH334dp06Bv31X3l1c1WSVlgmuvhQEDYOrUzMWVT20SfwYGxx2EiEgchg+H\nXXaB3r3hyivhu+9W7isfZV3e9TWVDTaAESOgefPMxZUXScLMDgG6sLJ9QkSkqJSPf2jdGs46Cy6/\nPGyfORPat4f58+GII6o+zw47VF7aqK7YR1ybWSvgKaCtu6esmgLo1q3b749LSkooKSnJamwiIrky\nZAj06hUe33ILtGoFl14K/fuHfwcMgIYNqz5PaWkppaWlGYsr64PpzKw5MNDdd02ybyvgHeBMdx9Z\nxXnUu0lE6qTp02HvvUMPpXpR/c5774XBcQ88EKqhaqq2vZuy3QW2L1ACbAjMBboCDQF396fM7Gng\nRGA6YMAyd2+T4lxKEiJSJz35ZGiTeP75zJ87r5NEJilJiEhddfzxcPLJYUBcpilJiIgUsN9+g403\nhi+/DP9mWr4PphMRKWru0LUrpGpLHjEi9EjKRoLIBCUJEZEs6toVevSAyy4LI6YrGjw4+XxM+UJJ\nQkQkSx5+OHRhHTMmLBLUr
98f95eVwdtvVz1ILk5KEiIiWdC3L9x3HwwdCptsEmZo/fvfQxtEuVtu\ngcaNoU3SPp35QUlCRCTD5s2DSy4JVUnlU2QcfDBsvz0880x4/uqr0LNnGCRXv35soVYp9hHXIiJ1\nTc+ecMIJqw6Cu+OOMMXGHnvABReEUdabbhpPjOlSkhARyaCyMnj6aXjhhVX37bUX7L9/KFX06hWe\n5zslCRGRDBo2LCwclKqd4R//gGOOCetGFAINphMRyaBTT4WDDoKLL447kkAjrkVE8sS8eWFg3NSp\noddSPtCIaxGRPNGrV2iwzpcEkQlqkxARyQB3eOop6NMn7kgyS0lCRKSWJk4MPZrWWAP22SfuaDJL\n1U0iIjWwdCk8+mgY83DEEWFA3IABmV06NB+oJCEiUg1lZfDii3DTTbDTTvDPf4beTPk8aro2lCRE\nRNI0bhycc05YYvTZZ6GkJO6Isk/VTSIiVSgrC4PgDjssjH8YNao4EgSoJCEiUqlp06BLF1i+HEaP\nhq23jjui3FJJQkQkiWXLwlTfrVuHRYFKS4svQYBKEiIiqxgxAi66KMzQOnIkbLtt3BHFR0lCRIQw\nGG7IELjnnjCtxl13hUn46lqX1upSkhCRojZtGrzyShgpXVYG110XJulr0CDuyPJDVif4M7MeQDtg\nrru3SnHMw8DRwC9AZ3cfm+I4TfAnIjW2fDncfz98+WVIBmVlMGFCSBInnACnnBJ6L9W1kkNezwJr\nZgcAPwO9kyUJMzsauMTdjzWzfYCH3H3fFOdSkhCRGpk/P5QO6teHk08O4xzq1YNmzcJAuNXqcJ1K\nbZNEtd4aM1sLWOLuK9I53t0/MLNmlRzSAegdHTvKzNYzs03dfW514hIRSeXTT0NJ4eST4c476+7I\n6GyptAusmdUzs9PM7G0zmwdMAuaY2UQzu8/Matvm3wSYkfB8VrRNRKTWBg6Eww8PyeGee5QgaqKq\ncRLDgBbADcBm7r6lu28CHACMBO4xszOyHKOISLU99RScfz4MGgQdO8YdTeGqqrrpcHdfVnGju38P\nvAq8ama16QMwC9gy4XnTaFtS3bp1+/1xSUkJJcUyLl5E0uYOXbtC374wfHjxjXEoLS2ltLQ0Y+dL\nq+HazFoAM919qZmVAK0IjdEL03htc2Cgu++aZN8xwMVRw/W+wINquBaRmho+HG69FRYuhLffhk02\niTui+OWkd5OZjQVaA82BQcAbQEt3P6aK1/UFSoANgblAV6Ah4O7+VHRMd6AtoQtsF3cfk+JcShIi\nAoQk0L07rL46rL8+NGwIPXrAzJlw/fVw1llhn+QuSYxx9z3N7BpC76ZHzOwTd9+jpheuLiUJEYEw\nvqFDh9AI3aIF/PADLFoUtnXsWLe7s9ZErrrALjOzTsDZQPtom8YjikhKTzwBLVvCgQdm9rx33QXf\nfw/DhoUShGRXurPAdgH2A+5w96lmtjVQx5b7FpFMmTwZLrsMnnkms+cdOjQsGfryy0oQuZLVEdeZ\npOomkcLgHqbWbtECXn8dZs3KzFQX06fDPvuEpUMPPrj25ysWta1uqmow3UAza5+sm6uZbWNmt5rZ\nOTW9uIjUPa+/DjNmwEMPwRprwGef1f6cpaWw//7wt78pQeRaVW0S5wFXAg+a2ffAd0AjQi+nr4Du\n7v5GViMUkYKxeDFccUVY/7lBAzjiiFBFtMsuNTvfihVw++3w5JPw3HNw5JEZDVfSkHZ1UzTeYXPg\nV2CKuy/OXlhJr6/qJpE8d/PNoT3ixRfD8wEDwsjnIUOqd55ly+DNN8OsrY0awQsvwOabZz7eYpDX\ns8BmkpKESH5bvDh8kE+YAFtG8ygsXBgef/dd+LCvysKFcO+90LMnbLddmFajUyfNuVQbWW2TEBFJ\n17vvwp57rkwQAI0bw667huVAqzJ6NOyxB8ybF871/vtwxhlKEHFTkhCRjHjrLWjXbtXt5e0
SqbjD\ngw+G195/f+g2u9NO2YtTqiftJGFma5jZDtkMRkQKk3vqJHHkkamTxPjxYX/fvjBqFJx4YnbjlOpL\nK0mYWXtgLDAker67mb2ZzcBEpHB8+mno7rr99qvua9MGpk6FuQlLic2dC3/5S1gu9LjjQnXU1lvn\nLl5JX7oliW5AG2AhQLQOtf5LRQQIM64ee2zyQXMNGsAhh8A774TxE1dcEaqT1l479IS69NJwjOSn\ndJPEMnf/scI2dTUSESB1VVO5I4+E666D3XcPDdHjxoX2h/XXz12MUjPpTvD3mZmdBtQ3s+2Ay4D/\nZi8sESkU8+bB55/DQQelPuaUU2D58tBbSYmhsKQ7VfiawE3AkYAB/wJuc/cl2Q3vDzFonIRIHurV\nK6wl/corcUciyWgwnYjE6uSTQ3tE585xRyLJ5GrRodbAjYQ5m36vonL3VjW9cHUpSYjkn99+C0uE\nTp4Mm24adzSSTK4WHXoBuAYYD5TV9GIiUnesWAF33hl6KilB1F3pJonv3F3jIkQEgClTQvVSo0bQ\nr1/c0Ug2pVvddBjQCXgHWFq+3d0HZC+0VWJQdZNIzNyhe3e45Rbo1g0uugjqaXKfvJar6qYuwI6E\nda3Lq5scyFmSEJF4LVgAXbrAnDnwv/+FWVql7ks3Sezt7pq3SaRIPPssvPoq7LZbmNl19dXh4ovD\neIdXXtH60sUk3STxXzPb2d0nZjUaEYnVkiVhmowRI1YuINSnT1hf+skn4eij445Qci3dJLEvMNbM\nphLaJAzwdLrAmllb4EHCFCA93P2eCvu3BHoBjaNjbnD3wen/CiJSU+PHw9KloWSwdGloY2jePMzI\nus46cUcn+SDdhutmyba7+/QqXlcPmAIcBswGPgQ6uvukhGOeBMa4+5NmthMwyN1XmTxQDdcimeMe\nSgrPPANNmoTxDsuWwZ//DFdemXyiPilMWW24NrN13f0nYFENz98G+KI8mZhZf6ADMCnhmDJg3ehx\nY2BWDa8lImkoK4PLLw9VSp9+GgbDiaRSVXVTX6Ad8DGhN1NiNnJgmype3wSYkfB8JiFxJLoFGGpm\nlwFrAodXcU4RqaFly+Ccc0Ibw7BhsN56cUck+a7SJOHu7aJ/s7l2RCegp7s/YGb7As8DLbN4PZGi\ntHw5nH46/PwzDBkCa64Zd0RSCNJquDazd9z9sKq2JTEL2CrheVNWrU46FzgKwN1HmlkjM9vI3edX\nPFm3bt1+f1xSUkJJSUk64YsUvRUr4Oyz4aef4I03QpdWqZtKS0spLS3N2Pkqbbg2s0aEKqBhQAkr\nq5vWBYa4+46VntysPjCZ0HA9BxgNdHL3zxOOeRt4yd17RQ3X/3b3pknOpYZrkRooK4Nzz4VvvgmL\nA62xRtwRSS5le8T1X4C/AlsQ2iXKL/QT0L2qk7v7CjO7BBjKyi6wn5vZLcCH7v4WcDXwtJldQWjE\nPrtGv4mI/MHy5VBaCo8+GkZLDx6sBCHVl24X2Evd/ZEcxFNZDCpJiKRh3rwwr9Irr4QxD6ecAhdc\nENaUluJT25JEWlNzxZ0gROSPfv4ZnngChg8PYx7KvftumEZjzTXDgLjRo+Hqq5UgpOa0Mp1IAVm8\nGB5/HO67D/bdFyZODAngssvgq6+gR4+wnOgRR8QdqeSLXM0CKyIxmzoVDjggJId//xt23TU0Sv/r\nX/Dww6HH0iefaAEgyay0SxJm1gRoxh+XL30/S3Elu75KElLUjjsO9tkHbrop7kikkOSkJGFm9wCn\nAhOBFdFmB3KWJESK2cCBYUbWl1+OOxIpNun2bpoMtHL3pVUenCUqSUixWrwYWraEp5+GwzVpjVRT\nTno3AV8TVqUTkSyaOzcsDTpgQJhnCeDuu6FNGyUIiUe6JYlXgd1YdY3ry7IX2ioxqCQhddaPP4Ye\nS48/Dn/6E0yaBFOmQMeO8PzzMHYsNF1lHgKRquWqd9O
b0Y+IZNi4caGUcOyxMGYMNItWb5k0KXRp\nfeABJQiJT3V6NzUEto+eTnb3ZVmLKvn1VZKQOmfpUth7b7jiCujSJe5opC7KSZuEmZUAXwCPAo8B\nU8zsoJpeVKTYTJ8Ohx4Kb1Yoj998M7RoAZ07xxKWSJXSrW66HzjS3ScDmNn2QD9gr2wFJlJXjBsH\nxxwDnTqFOZSmTQsjpIcPhz59wupwWi5U8lW6SaJBeYIAcPcpZqbeTiJVKC0NE+w9/HBohL744tD2\nMHlymJX1ySdh443jjlIktXR7Nz1LmMb7+WjT6UB9dz8ni7FVjEFtElJQyhNE//6hqqncwoVw6qmw\nzTahN5NINtW2TSLdJLE6cDFwQLRpOPBYLgfXKUlIIZkxI4xt6NMn9fgGd1UzSfblJEnkAyUJKRRL\nlsCBB4ZSxDXXxB2NFLusJgkze8ndTzGz8YS5mv7A3VvV9MLVpSQhhcAdzjknTKXRv79KChK/bA+m\nuzz6t11NLyBSLNzhrrvg44/hf/9TgpC6odJxEu4+J3p4kbtPT/wBLsp+eCL5xR3++1947rmwOly5\nX36B004LS4a+9RastVZsIYpkVLoT/CVb5+roTAYiks/mz4d774WddgrVSa+8EqbPuOYaeO+9sBDQ\n6qvDiBGw1VZxRyuSOZVWN5nZhYQSwzZmNi5h1zrAiGwGJpIv3KFDh/Dh/+yzsN9+oSpp2jTo3h3O\nPBNuuCEMlFMVk9Q1VTVcrwesD9wFXJ+wa5G7f5/l2CrGooZricXAgXDjjWEm1vr1445GpHpy2gXW\nzDYBGpU/d/dvanrh6lKSkDisWAG77w533gnt28cdjUj15WqCv/Zm9gUwFXgPmAYMTvO1bc1skplN\nMbPrUhxzipl9Zmbjzez5ZMeIxKFvX1h3XWin/n1SpNIdcf0pcCjwH3ffw8wOAc5w93OreF09YApw\nGDAb+BDo6O6TEo7ZFngROMTdfzKzjdx9fpJzqSQhObV0Key4I/TuHQbHiRSiXC1fuszdFwD1zKye\nuw8DWqfxujbAF1G32WVAf6BDhWPOAx51958AkiUIkTg8+WRYW1oJQopZurPALjSztYH3gRfMbB7w\nSxqvawLMSHg+k5A4Em0PYGYfEJLWLe7+rzTjEsmYxx6Da6+FevWgQYNQkhihPnxS5NJNEh2AX4Er\nCDPArgfcmsEYtgUOArYC3jezXcpLFiK5MGQI3HZbGC292WawbBmstho0bhx3ZCLxSjdJXAk85+4z\ngF4AZnY+8FQVr5tF+OAv1zTalmgmMNLdy4BpZjYF2A74uOLJunXr9vvjkpISSkpK0gxfJLWJE+Gs\ns2DAANhhh7ijEamd0tJSSktLM3a+dBuu5wHfAZdE7RGY2Rh337OK19UHJhMarucAo4FO7v55wjFH\nRds6m9lGhOSwu7v/UOFcariWjJs/H/bZB7p2DYlCpK7JVcP1LMI0HHebWfnkx1Ve1N1XAJcAQ4HP\ngP7u/rmZ3WJm7aJj/gUsMLPPgHeAqysmCJFMcw8lh//7v7AAkBKESHLpliQ+ibq+NgIeB9YGdnX3\nHbMdYEIMKklIrbmHtaWvvz5M533PPXDkkZpOQ+qubE8VXu4jAHdfAnQxs4uBvWp6UZFcmzMHnn8e\nevUKiwJ17Qqnnx56MolIalqZTuq8Bx6AW2+FE0+Es8+GAw5QcpDikdWSRD6tTCdSE08/DQ8/DOPH\nQ9OmcUcjUniqmgV2c3efY2bNku2PFh/KCZUkpLpefBGuvDKs97DttnFHIxKPnM4CGyclCUnXb7/B\nSy/BVVfBf/4Du+4ad0Qi8cl2ddMiklQzEbq/uruvW9MLi9RWWRksWAALF8KPP8KMGfDGG2H9h+23\nD/8qQYjUjkoSEotly2DYsND9tDpGjQqJYPRo+OijsAjQ+uuH6TM23hiOOQZOOEHtDyLltOiQFKTB\ng+HYY+Hrr6F58/R
eM2YMHHUUXHhhGCW9996wySZZDVOk4OVq0aHjarrokEgyAwfCBhvAU1XN/hWZ\nPz90YX3ssdCd9dhjlSBEciHd3uK3AfsCU9x9a8JcTCOzFpXUae7w1lvwzDPw7LNhSu7KLF8eps7o\n2BFOPjk3MYpIkO1Fh0RW8emn0LAhdOgQFvUZMKDy46+9NkzbfccduYlPRFZKN0lUXHToIdJbdEhk\nFQMHQvv2Yb6kCy+Exx9Pfty8eaEEMWQI9OsXGqlFJLfSTRKJiw4NAb4C2mcrKKk7LrgglBwSDRwI\n7dqFxx06wJdfwoQJK/e7Q//+0KoVbLVVWAhogw1yF7OIrKQusJI1s2fDlltCmzZhGdB69cJEezvv\nDHPnhionCJPtLVgAjzwSRkffemvY/+yzoReTiNRcVns3RetOY2aLzOynhJ9FZqblRaVSb78NJ50U\nSgY9e67cduSRKxMEwHnnQd++cPDB4fFZZ8HYsUoQIvmg0hHX7n5A9O86uQlH6pI334ROnWCnnaBt\nWzj++FDVVLGHUtOmYX2HJk1CG8Rq6U5gLyJZl+6iQ33c/cyqtmWTqpsKy+LFsNlmMH16GBF96aXw\n00/w2mswdSpsuGHcEYoUh1wtOtSywkVXQ4sOSSXeeQf22iskCIDbboMdd4Tdd1eCECkkVU3wdwNw\nI7BGQhuEAb8BaY6VlWJU3s21XOPG0KdPmLNJRApHutVNd7n7DTmIp7IYVN1UIMrKQvvC++/DdtvF\nHY1Iccv2VOE7uvsk4GUz27PifncfU9MLS9318ceh5KAEIVL4qmqTuBI4H7g/yT4HDs14RFLwKlY1\niUjh0mA6ybjddw8D4w48MO5IRCQnU4VHF/o/MzvNzM4q/0nzdW3NbJKZTTGz6yo57k9mVpasWksK\nw4oVYZT0zJmw335xRyMimZBWF1gz6wO0AMYCK6LNDvSu4nX1gO6EqcVnAx+a2RtRO0ficWsDl6Hp\nxwuSexg4d9NNoS3i7bc1IE6krkj3T7k1sHMN6nvaAF+4+3QAM+tPmCxwUoXjbgPuBq6t5vklZhMn\nwkUXhbmX7r47LAZkNS7Yiki+Sbe6aQKwWQ3O3wSYkfB8ZrTtd2a2B9DU3bXSXQFZvBhuuCHMt3Ty\nyWGupXbtlCBE6pp0SxIbARPNbDTw+zpi7n5cbS5uZgb8Ezg7cXNtzinZtWBBWFGue/fQMD1uHGy+\nedxRiUi2pJskutXw/LOArRKeN422lVuHMOVHaZQwNgPeMLPjko3B6NZtZRglJSWUlJTUMCyprpkz\nwxTeL79V3Qp9AAANgElEQVQc1oB4/fUw7YaI5JfS0lJKS0szdr6sdoE1s/rAZELD9RxgNNDJ3T9P\ncfww4Ep3/yTJPnWBjYE7PP10aJQ+7zy4/HLYdNO4oxKRdGV7xPUiQi+mVXYB7u7rVvZ6d19hZpcA\nQwntHz3c/XMzuwX40N3fqvgSVN2UN776KiSGn3+GYcNgl13ijkhEck2D6WQVv/0G//gH/POfYZ2H\nv/5VXVpFClWupgqXIvHee3DhhdCiBXz0ETRvHndEIhInJQkB4Ouv4brrYNQoePBBOOEEdWcVkWpM\nyyF105w5cO21sPfesNtuMGkSnHiiEoSIBCpJFKGff4YBA+D55+HDD8O60hMmaLyDiKxKDddF5q23\n4IILYI894Mwzw5Tea6wRd1Qiki1quJa0fP996KU0YkRYRvSQQ+KOSEQKgdokisCAAbDrrrD++mEa\nDSUIEUmXShJ12LffwiWXhPaGF1+EAw6IOyIRKTQqSdRBy5fD44+H3krbbx9maFWCEJGaUEmiDnGH\nQYPgmmtgiy1g6NCQKEREakpJoo4YNQpuvBFmzw5TahxzjMY6iEjtqbqpwE2YAMcfDyedBJ06hYZp\nrQ4nIpmikkQB+uGHsK5D797w5ZdhxHT//tCoUdyRiUhdo8F0BcQd7rwT7r0XjjoqD
IZr2xYaNIg7\nMhHJVxpMVyQWL4YuXWDaNJg4EZo0qfIlIiK1pjaJAjBjRlhPumHDMJW3EoSI5IqSRJ576SVo3TpM\nwte7t9odRCS3VN2UpxYsCKOlP/kEBg6ENm3ijkhEipFKEnnGPUyh0aoVbLppSBJKECISF5Uk8sjk\nyaH0MHeu5loSkfygkkQemD8frr4a9t8/jJQeM0YJQkTyg5JEjH78EW6+GXbYAX79FcaPhyuugNVU\nvhORPKGPoxxzh5EjoUcPePVV6NABPvoItt467shERFaV9ZKEmbU1s0lmNsXMrkuy/woz+8zMxprZ\nv81sy2zHFJfS0rD4T+fOsN128Nln8NxzShAikr+yOi2HmdUDpgCHAbOBD4GO7j4p4ZiDgVHuvsTM\nLgBK3L1jknMV7LQcv/4KN90UGqOfeALatdMEfCKSG7WdliPbJYk2wBfuPt3dlwH9gQ6JB7j7e+6+\nJHo6EqhT44lHjgyD4WbODDO0tm+vBCEihSPbbRJNgBkJz2cSEkcq5wKDsxpRjnzzDVx/Pbz/Ptx3\nH3TsqOQgIoUnb3o3mdkZwF7AfXHHUhu//AJ//zvssUdod5g8OazzoAQhIoUo2yWJWcBWCc+bRtv+\nwMwOB24ADoqqpZLq1q3b749LSkooKSnJVJy15h7mWbrmmjAZ36efQtOmcUclIsWmtLSU0tLSjJ0v\n2w3X9YHJhIbrOcBooJO7f55wzB7Ay8BR7v5VJefK24brjz4KyeGHH+CRR0KSEBHJB3ndcO3uK4BL\ngKHAZ0B/d//czG4xs3bRYfcCawEvm9knZvZ6NmPKpIkT4U9/CmMdTj01JAslCBGpS7QyXTWtWAFD\nh8Izz8Dw4WHp0IsvhjXWiDsyEZFVaWW6LBoyBP7yF9h4Y9hqK9hoIxg8GDbbDP78Z+jZE9ZdN+4o\nRUSyRyWJFCZMgEMPDQv9bLBB6NI6ezYcfDDstlvOwhARqZXaliSUJJKYNw/22Qduuw3OOCMnlxQR\nyYq8brguREuWwAknwOmnK0GIiKgkkeCbb8LkextuGOZZqqcUKiIFTiWJDHAPs7HutRccfjj066cE\nISICRdy76euvw4R748eHKby/+w7+8x81SouIJCq66qa5c8O4hg8+CCWHXXeFVq3CoLjVV89AoCIi\neUTjJNLkDi+8AFddBeeeC88/D40axR2ViEh+K4okMXIk/O1voUpp0KBQghARkarV6ebZMWPg2GPh\nlFNWzq2kBCEikr46myR694ajjw4/X3wB550HDRrEHZWISGGpk9VNDzwQfkpLYaed4o5GRKRw1akk\n4R7aHgYMCL2Xttqq6teIiEhqdSZJfPghXHklLF8epvDeaKO4IxIRKXwF3ybxzTdw5plh4Z/OnUMJ\nQglCRCQzCjZJzJgBF10Eu+8OzZrB5Mlh/EP9+nFHJiJSdxRkkrjqqjB9xjrrwKRJcPvt4bGIiGRW\nwbVJjBsHL78cSg4bbxx3NCIidVvBlSReeCGs9aAEISKSfQU1wd+KFU7z5vD222FiPhERqVxRrScx\nfDg0bqwEISKSK1lPEmbW1swmmdkUM7suyf6GZtbfzL4ws/+ZWcohcH37hqomERHJjawmCTOrB3QH\njgJaAp3MbMcKh50LfO/u2wEPAvemOt+rr0KnTtmKtnCUlpbGHULe0Huxkt6LlfReZE62SxJtgC/c\nfbq7LwP6Ax0qHNMB6BU9fgU4LNXJdt5ZU22A/gAS6b1YSe/FSnovMifbSaIJMCPh+cxoW9Jj3H0F\nsNDMNkh2MlU1iYjkVj42XKdshT/ppFyGISIiWe0Ca2b7At3cvW30/HrA3f2ehGMGR8eMMrP6wBx3\n3yTJuQqjr66ISJ7J5zWuPwS2NbNmwBygI1Cx6XkgcDYwCjgZeDfZiWrzS4qISM1kNUm4+wozuwQY\nSqja6uHun5vZLcCH7v4W0APoY2ZfAAsIiURER
PJAwYy4FhGR3MvHhutVVDUgry4zs6Zm9q6ZfWZm\n483ssmj7+mY21Mwmm9m/zGy9uGPNBTOrZ2ZjzOzN6HlzMxsZ3Rv9zKzgJq2sKTNbz8xeNrPPo/tj\nn2K8L8zsCjObYGbjzOyFaIBu0dwXZtbDzOaa2biEbSnvAzN7OBq8PNbMdq/q/HmfJNIckFeXLQeu\ndPeWwH7AxdHvfz3wH3ffgdCOc0OMMebS5cDEhOf3APe7+/bAQsLgzGLxEDDI3XcCdgMmUWT3hZlt\nAVwK7OnurQhV6J0orvuiJ+HzMVHS+8DMjgZaRIOX/wI8UdXJ8z5JkN6AvDrL3b9197HR45+Bz4Gm\n/HEQYi/g+HgizB0zawocAzyTsPlQ4NXocS/ghFzHFQczWxc40N17Arj7cnf/kSK8L4D6wFpRaWEN\nYDZwCEVyX7j7B8APFTZXvA86JGzvHb1uFLCemW1a2fkLIUmkMyCvKJhZc2B3YCSwqbvPhZBIgFW6\nDddBDwDXAA5gZhsCP7h7WbR/JrBFTLHl2tbAfDPrGVW/PWVma1Jk94W7zwbuB74BZgE/AmOAhUV6\nX5TbpMJ9UJ4IKn6ezqKKz9NCSBICmNnahGlLLo9KFBV7HNTpHghmdiwwNypVJXaHLtau0asBewKP\nuvuewC+EKoZiuy8aE74dNyMkgrWAtrEGlZ9qfB8UQpKYBSTO2NQ02lY0omL0K0Afd38j2jy3vJho\nZpsB8+KKL0f2B44zs6+BfoRqpocIxeXy+7iY7o2ZwAx3/yh6/iohaRTbfXE48LW7fx9N6/Ma4V5p\nXKT3RblU98EsYMuE46p8bwohSfw+IM/MGhLGUbwZc0y59iww0d0fStj2JtA5enw28EbFF9Ul7n6j\nu2/l7tsQ7oF33f0MYBhhECYUwftQLqpKmGFm20ebDgM+o8juC0I1075m1sjMjJXvQ7HdF8YfS9WJ\n90FnVv7+bwJnwe8zYiwsr5ZKeeJCGCdhZm0J3xrLB+TdHXNIOWNm+wPvA+MJRUYHbgRGAy8RvhVM\nB05x94VxxZlLZnYwcJW7H2dmWxM6M6wPfAKcEXVwqPPMbDdCI34D4GugC6ERt6juCzPrSvjisIxw\nD/yZ8A25KO4LM+sLlAAbAnOBrsDrwMskuQ/MrDuhSu4XoIu7j6n0/IWQJEREJB6FUN0kIiIxUZIQ\nEZGUlCRERCQlJQkREUlJSUJERFJSkhARkZSUJEQyxMzOjka3itQZShIimdOZFJOlJUwRIVJQdONK\nnRZN5zIxmiV1gpkNiaZwGGZme0bHbGhmU6PHZ5vZa9GCLV+b2cXRojZjzOy/0YRyya7zJ6A18Hx0\nbCMzm2pmd5vZR8BJZraNmQ02sw/N7L3yKTXM7ORoQalPzKw02razmY2KzjXWzFrk4v0SqUhJQorB\ntsAj7r4LYQGaP1H5bKktCeswtAHuAH6OZlodSTTvTUXu/iphnrHT3H1Pd18S7Zrv7q3d/SXgKeAS\nd9+bMOX549ExfweOdPc9gOOibRcAD0bXbU2Y0E8k5+rskn4iCaa6+/jo8RigeRXHD3P3xcBiM1sI\nvBVtHw/sWsnrKk6yBvAigJmtBfwf8HI0ER2EOZcARgC9zOwlYEC07X/ATdFCS6+5+5dVxCySFSpJ\nSDFYmvB4BeHL0XJW3v+NKjneE56XUf0vVr9E/9YjLJC0p7vvEf3sAuDuFwI3ESZj+9jM1nf3fkB7\nYAkwyMxKqnldkYxQkpBikGxhommEahxYOaV0bf0ErJtsh7svAqaa2Um/B2XWKvp3G3f/0N27Eub9\n39LMtnb3qe7+CGGa51YZilGkWpQkpBgka3/4B3ChmX0MbFCN11amF/BEecN1kteeDpwbNURPYGX7\nw31mNs7MxgEj3H0ccErU0P4JoY2kdzXiEMkYTRUuIiIpqSQhIiIpqXeTSDVFK3vtT6hOsujfh9y9\nV6yBiWSBq
ptERCQlVTeJiEhKShIiIpKSkoSIiKSkJCEiIikpSYiISEpKEiIiktL/A/f1rtiUE6Po\nAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt, time\n", "x_cor = []\n", "y_cor = []\n", "for x in range(100):\n", "    start_time = time.time()\n", "    AnnoyIndexer(model, x)\n", "    y_cor.append(time.time()-start_time)\n", "    x_cor.append(x)\n", "\n", "plt.plot(x_cor, y_cor)\n", "plt.title(\"num_trees vs initialization time\")\n", "plt.ylabel(\"Initialization time (s)\")\n", "plt.xlabel(\"num_trees\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialization time of the annoy indexer increases linearly with num_trees. Initialization time will vary from corpus to corpus; the graph above was produced with the Lee corpus." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Relationship between num_trees and accuracy" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYMAAAEaCAYAAADzDTuZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xm8nPPd//HXJ0gsEUIImliCKCo0IlSUI2qtSuwllFBR\nSxf6cxelSevufYu2VNVaepc6BAlqi9ByVCmyIJFIokQIWckmgsT5/P74XlcyOZk5Z7Zr1vfz8ZjH\nmbnmWr5z5pz5zOe7mrsjIiL1rV25CyAiIuWnYCAiIgoGIiKiYCAiIigYiIgICgYiIoKCgYiIoGAg\nVcTMZphZ/3KXQ6QWKRhIzTCzdcpdhmpiZvr/l1X0xyBZib6V/9TMXjezhWZ2r5m1j547w8yeb7F/\ns5n1iO7/n5ndaGZPmNlSM3vezLqa2XVm9rGZTTGzPdu4/l3AtsCjZrbEzP6fmW0XXecsM5sJ/CPa\ndz8zeyEq56tmdlDKeTqZ2e1m9qGZvW9mV5mZRc/taGZNZrbIzOaZ2b0ZyvKEmZ3fYttrZjYwun+d\nmc01s8XR72u3DOc5M3rtS8zsP2Y2pMXzA6LyLzazt8zssGh7ZzP7s5l9YGYfmdmDObwPN5nZ42a2\nFGgws6PMbEJ0jZlmNrTF8Qek/C5nmtn3zKyPmc2Jf2/RfseZ2WuZ30GpeO6um25t3oAZwEtAV2BT\nYAowJHruDOCfLfb/EugR3f8/YB6wF9Ce8KH9DjAIMOAq4Jksy3BwyuPtgGbgL8AGQAdgG2ABcHi0\nzyHR482jxw8BNwHrA12i13RO9Nw9wGXR/fbA/hnKcTrwr5THuwEfA+sBhwFjgY2j53YBumY4z5HA\n9tH9bwLLgL2ix32BRUD/6PHWQM/o/uPAvUAnYB3gmzm8DwuB/VJe44HA7tHjrwGzgWNSfr9LgJOi\n63QGekXPvRH/jqPHDwI/KfffqW7535QZSC6ud/e57r4IeJTw4Z6JtXj8kLu/5u5fED6Ql7t7o4dP\nkvvaOFdr53VgqLsvd/fPgdOAx919DIC7/wMYBxxlZlsSPoAvcvfP3H0B8Hvgu9G5VgDbmdlX3P0L\nd38xQxkeAvY0s+7R41OBB919RXSOjYHdzMzcfZq7z013Encf7e7vRvefB54iBAWAs4A73P2Z6PnZ\n7j7dzLYCDgfOdfcl7v5ldGy2v6+/uftL0Tm/cPd/uvvk6PEbwAggzqROAZ529/uj6yx094nRc3cR\ngiJmtllUprSZlFQHBQPJReqH2qdAxzyPXZ7mcS7namlWyv3tgJOi6qePzWwh0I/wzXo7wrf32SnP\n3QJsER17CeF/4hUzm2Rmg9NdzN0/AZ5gdRA5BWiMnnsW+CNwIzDXzG4xs7SvzcyONLN/R1U9CwmB\nqkv0dHfg7TSHdQc+dvclbfxOMnm/RRn6mtkzUbXYIuDcLMoAcDdwtJltQMgc/pkp6El1UDCQYlgG\nbBg/iL69JiHTFLup298H7nL3zaJbZ3ff2N2viZ77jFBlFD+3qbv3AnD3ee4+xN2/AvwAuCmub0/j\nXuBUM9sP6BAFAaLz/NHd+xCqj3YhBJk1RO0tI4FrgC3cvTMwmtXf5N8Hdkxz3feBzcysU5rnsnkf\nWv4O7wEeBr7i7psCt7Yow05pzoG7fwj8GziekI39Nd1+Uj0UDKQYXgd2N7NeZtYBGErmD+5MWlZn\npDMHaPnh3PK4u4HvmNlhZtbOzNY3s4PMbBt3n0OoirnOzDa2oIeZHQhgZieY2Vei8ywitEc0ZyjL\nE4RM41eEai6ic/SJvm2vS8h4PstwjvbRbYG7N5vZkYT2htgdwGAzOzgq5zZmtkv0GkYTAtWmZrau\nmcVVS/m8Dx2Bhe6+wsz6Eqq8Yo3AIdHvZR0z28zWbOj/K/BfhLaGB9u4jlQ4BQPJVsYPFXd/i/Ch\n+A9gOtBaHXbO509xNXBlVMVzcbrj3H0WMAC4HJgPzAT+H6v/1
r9H+BCeQmj0fQCIv0HvA7xsZksI\n35Z/FNfpr1XY0PbxIKGB+p6UpzoBf4rOPYPQeP2bNMd/AvwIeMDMPiZUOf0t5fmxwGBCm8ZioInQ\nmwpCXf1KYCqhuu3H0TH5vA/nA1eZ2WLgClICm7u/DxxF+P19DLwK9Eo59iFCQHzQ3T/L4lpSwSy0\n3yV4AbN3CX/MzcAKd+9rZp0Jf3TbAe8CJ7n74kQLIiJFZ2b/IfQqe6bcZZHClCIzaAYa3P3r7t43\n2nYp8Hd33wV4BrisBOUQkSIys+OBZgWC2rBuCa5hrB10BrC6+9qdhBT40hKURSpY1FVzCmtW/Vj0\neLeoCkgqgJk9C+xKaDyWGlCKaqJ3CPWNDtzq7reb2cKo90S8z8fuvlmiBRERkYxKkRn0c/fZZrYF\n8JSZTWPtxsJkI5KIiLQq8WDg7rOjn/PN7GHCMPu5ZtbV3edGfaHnpTvWzBQkRETy4O7ZdNdeJdEG\nZDPbMB59aWYbEfpRTwIeAc6MdjuDlC51LZV7vo4kb0OHDi17GfTa9Pr0+mrvlo+kM4OuwEPRN/x1\ngUZ3f8rMxgH3m9lZhH7gJyVcDhERaUWiwcDdZ5BmAjJ3/xj4VpLXFhGR7GkEchk1NDSUuwiJqeXX\nBnp91a7WX18+Eu9aWogwA3Dllk9EpBKZGV5JDcgiIlIdFAxERETBQEQkk5kzYVadTIKiNgMRkQx+\n8ANYbz244YZylyQ3+bQZlGI6ChGRqjRxIjRnWt6oxigzEBFJo7kZNt0UvvwSFi6E9u3LXaLsqTeR\niEiRzJwJnTpBjx7wxhvlLk3yFAxERNKYOBF69YJ99oFXXil3aZKnYCAiksakSSEY9O0LY8eWuzTJ\nUzAQEUkjNTNQMBARqVMTJ8Iee4Tb22/DsmXlLlGyFAxERFpYvjw0IO+yS+hFtMceMGFCuUuVLAUD\nEZEWpkyBnj1Xdyeth6oiBQMRkRbi9oJYPfQoUjAQEWkhbi+I1UOPIgUDEZEW4m6lsZ49YcEC+Oij\n8pUpaQoGIiIp3OH119cMBu3awd5713Z2oGAgIpJi7twwL9HWW6+5vdarihQMRERSxI3H1mKat1rv\nUaRgICKSomV7QSzuUVSrEykrGIiIpGjZrTTWvXv4WeqVz778En71K/jii2Svo2AgIpKiZbfSmFl5\nqopGjICnnw4rriVJwUBEJLJyJUybBrvvnv75Ug8+i7OCYcPWbsMoNgUDEZHI9OnQrRtstFH650vd\no2jECNhyS+jfP/lraQ1kEZFIpiqiWJ8+MH586HraLuGv0nFWcNNNyWcFoMxARGSVTI3HsS5dYLPN\nQgaRtFJmBaBgICKySqZupalKUVVUyraCmIKBiEikrcwAStOjqNRZASgYiIgAsGhRmIhuhx1a3y/p\nHkXlyApAwUBEBIA33oCvfa3thuHevUN1UlKDwMqRFYCCgYgIkF0VEUDHjtCjRwgIxVaurAAUDERE\ngLa7laZKqt2gXFkBKBiIiADZZwaQTI+icmYFoGAgIkJzc2gzyCUzKHYjcjmzAihRMDCzdmY2wcwe\niR5vb2Yvmdl0M7vXzDQSWkTKZuZM2GSTMKAsG3vsAe+8A8uWFef65c4KoHSZwY+BKSmPhwO/c/ee\nwCLg7BKVQ0RkLbm0FwC0bx/2nzChONcvd1YAJQgGZtYNOAq4PWVzf2BUdP9O4NikyyEikkk2I49b\nKlZVUSVkBVCazOA64BLAAcxsc2ChuzdHz88CtilBOURE0sql8ThWrB5FlZAVQMLBwMy+Dcx199eA\n1JhXxvgnIrKmXKuJoDg9iiolK4Dkp7DuBxxjZkcBGwAbA9cDm5hZuyg76AZ8kOkEw4YNW3W/oaGB\nhoaGJMsrInVm+fLQgLzLLrkd17MnLFgQbl265HftYmUFTU1NNDU1FXQO8xKt7mxmBwE/dfdjzOw+\n4EF3v8/MbgZed/db0hzjp
SqfiNSn8ePhrLPg9ddzP7Z/f/iv/4Ijjsj92C+/hN12C+sVHHJI7se3\nxsxw95xyjXKNM7gUuNjMpgObAXeUqRwiUufyaS+IFVJVVCltBbGS9e939+eA56L7M4B9S3VtEcne\nsmXgHubgqQf5tBfE9tkH/vKX3I8r9Spm2dAIZBFZw/DhMGRIuUtROvl0K43FPYpyrc3+xS+ge/fK\nyQpAwUBEWpg4EUaOhDlzyl2S5LmHtoJ8g0H37uHnrFnZH9PYCPfeG26VkhWAgoGItDBlCuy7L9x+\ne9v7Vru5c0NA2Hrr/I43y23w2UsvwUUXwSOPwBZb5HfNpCgYiMgqn30G778P114Lt94KK1eWu0TJ\nitsLCvmGnu3gs/feg+OPhz//OSyiU2kUDERklalTYccdwwfcttvCo4+Wu0TJKqS9IJZNj6Jly2DA\ngJAVHH10YddLioKBiKwyZUro+w5wwQVw443lLU/SCulWGuvTB8aNC9Ngp9PcDKefDl//Ovz0p4Vd\nK0kKBiKyyuTJsPvu4f7xx4dvzlOnlrdMSSqkW2msSxfYfHOYPj3981deCfPnw803V1aDcUsKBiKy\nSmow6NABvv/98CFWi1auhGnTVr/eQmSqKop7Dj34YPh9VjIFAxFZJbWaCODcc+Huu+GTT8pXpqRM\nnw7dusFGGxV+rnQ9iiq551A6CgYiAqzuSbTzzqu3bbstfPOb4RturSlGe0GsZY+iSu85lI6CgYgA\nq3sSrbfemtsvuCBMm1Brc0YWo70g1rt3aF/54ouQRR1zTGX3HEpHwUBEgLWriGKHHBKyhhdeKH2Z\nklSMbqWxjh2hR48wmvn000NwqOSeQ+koGIgIsGbjcap27eC882qvm2kxq4kgVBWdeWZY36DSew6l\no2AgIkDmYADhQ+7JJ2tnvqJFi+Cjj2CHHYp3zv33h08/rY6eQ+koGIgIkLmaCGDTTeHEE2tnvqI3\n3ggNu+2K+Ak4eHCoJqqGnkPpKBiISNqeRC2df37tzFdU7CoigHXWgU6dinvOUlIwEJGMPYlS7bVX\n7cxXlEQwqHYKBiLSahVRqkLmK/rsM3juOVixIr/ji6mY3UprhYKBiLTaeJzq+ONDfXuu8xXNmgUH\nHhgWnt92W7j8cnjnnfzKWqjm5vAaFAzWpGAgIlkHgw4d4Oyzc5uv6Pnnw9w9xx0H//kPPPNMyBL2\n3RcOPxxGjSpttjBzJmyyCWy2WemuWQ0UDEQk62oiyH6+IvcwcvmEE8K0DJdeGvre77prWDzn/ffh\ne9+DP/yhtNmCqojSUzAQqXPZ9CRKFc9XdM89rZ8zziBeeAGOOGLtfdZfHwYNCu0IqdnCYYclmy0U\nc+RxLVEwEKlz2fQkailuSE43X9GsWXDQQbB0Kfz737DTTm2fLzVbOOOMkC3svHNYIazYxo2DPfcs\n/nmrnYKBSJ3LpYoolmm+orh94Nhj4f77w5w9uUjNFnbdFf72t9yOb8vixfDss+kzlXqnYCBS57Jt\nPE4Vz1d0003hcab2gUIMGlT8qbNHjYL+/aFz5+KetxYoGIjUuXyCAYT5ikaPDr1z2mofyMfAgeF8\n8+YV53wQgsugQcU7Xy1RMBCpc/lUE8Hq+Yr23DO39oFsdewI3/52qG4qhg8+gFdfra41BkrJvIJX\nrDAzr+TyiVS7zz4LVSZLluTWgBx791146ik455xkpmwePRp++cuwhGShfvtbePNNuOOOws9V6cwM\nd8/pHVFmIFLH8ulJlGr77WHIkOTm7j/0UJgxIwxWK1RjI5x2WuHnqVUKBiJ1LN8qolJZd104+eTW\nxzRkY/JkmD8/dHmV9BQMROpYvo3HpTRoUBjxXEiNcWMjnHpqcdcvqDX61YhUoebmMFL39dcLO8/k\nyZWdGUAYt+AeBovlo7lZvYiyoWAgUoVGjYKnn4YRIwo7TzVkBmars4N8vPACbLyxpqBoi3o
TiVSZ\n5ubwwXbiiSEYvPlmfudZvjz0JFq6NP8G5FJ5660wH9KsWaEdIRfnnhvWOr700mTKVonUm0ikDowa\nBRttBFdeGbqE5rq2QGzatMJ6EpXSzjvDdtvBP/6R23FffBF+X6eckky5akmbwcDM8p7s1cw6mNnL\nZvaqmU0ys6HR9u3N7CUzm25m95pZjrFepD41N4d+98OGhcbQAQPyn7+nGqqIUuVTVTR6dHiN222X\nTJlqSTaZwU1m9oqZnW9mm+Rycnf/HDjY3b8O7AUcaWb7AsOB37l7T2ARcHauBRepR3FWEE/5MHAg\nPPxwfueaMqW6gsHJJ4f1l3OZyfTuu9VwnK02g4G7fxMYBHQHxpvZPWZ2aLYXcPdPo7sdgHUBBw4G\nRkXb7wSOzaXQIvUoNSuIB3k1NIRqotmzcz9fNfQkStW1K3zjG9lnQosXh9HRJ56YbLlqRVZtBu7+\nFnAF8DPgIOAPZjbVzI5r61gza2dmrwJzgKeBt4FF7t4c7TIL2CafwovUk5ZZAUD79nDkkfDII7mf\nr9qqiSCMIM52JlPNUJqbbNoMepnZdcCbQH/gO+6+a3T/uraOd/fmqJqoG9AX+GphRRapP+mygtjA\ngbm3GyxfntvqZpViwIDQVXT+/Lb31diC3GTTcHsDcDtwubsvjze6+4dmdkW2F3L3JWbWBHwD2NTM\n2kXZQTfgg0zHDRs2bNX9hoYGGhoasr2kSM1IlxXEjjgCvv/90LOoU6fszldNPYlSdewYZh297z64\n8MLM+9XbDKVNTU00NTUVdI42xxmYWUdgubt/GT1uB6yf0hbQ2rFdgBXuvtjMNgDGAFcDZwAPuvt9\nZnYz8Lq735LmeI0zkLoXjyv4zW9ClVA6Rx0V1hc46aTsztnYGLKJYk0PXUqjR8OvfhWmzM6knmYo\nTSepcQZ/BzZIebxhtC0bWwPPmtlrwMvAGHd/ArgUuNjMpgObAXX6lom0rbWsIJZrr6Jq60mU6tBD\n4Z13Wp/J9O67NUNprrLJDF5z973a2pYEZQZS77LJCgDmzAlrBs+dGxqV2zJwYKhPr9aeNj/6EXTp\nAr/4xdrPTZ4Mhx8O771XvxPTJZUZLDOz3ikX2RtY3sr+IlIk2WQFAFttFYJBttXG1diTKFVrM5k2\nNoYRx/UaCPKVza/rJ8ADZva8mf0LuA9opelGRIqhtR5E6WRbVVStPYlSZZrJNJ6hVFVEuctm0NlY\nQnfQ84AfALu6+/ikCyZS77LNCmJxF9Pm5tb3q9aeRKnimUxbjjnQDKX5yzaR2gXYDegNnGJm30uu\nSCKSa1YA0LMnbLJJ2/P+V3sVUWzQoDBr68qVq7fFDcdJLcNZy7IZdDaUMNbgBsI0EtcAxyRcLpG6\nlmtWEMumqqiaexKlajmT6eefw8iRmqE0X9lkBicAhwBz3H0wsCeQ04R1IpK9fLKCWDbBoNrmJGrN\naaetnsl09Gj42tc0Q2m+sgkGy6ORwivNrBMwjzBpnYgkIN+sAKBPnzBB27RpmfeplWoiWHMmU00/\nUZhsgsE4M9sU+BMwHpgAtDL2T0TyVUhWAG2vcVALPYlSbbllmMn0r38NM5SecEK5S1S9Wg0GZmbA\n/7r7omi6iEOBM6LqIhEpskKyglhrVUW10JOopdNOg0suCTOUbrZZuUtTvVoNBtHw3ydSHr/r7hMT\nL5VIHVq5srCsINbQEOblSbfGQS1VEcUGDAgZlaqICpNNNdEEM9sn8ZKI1LkbbggjiQvJCmD1GgeP\nPrr2c7XSkyhVx45h0rpjtURWQbIJBvsC/zazt81sYrSWsbIDkSJ67z349a/h5puL00c+U1VRLfUk\nStWrF6yzTrlLUd2ymagubUctd5+ZSInWvLYmqpOa5x6qOvr2hSuyXiGkdUuWQLduMGvWmmsc7Lxz\naFyuxYAgqyU1UZ1nuIlIETz8cJiO+ZJLinfOTp3ggAP
gySdXb4t7Eu20U/GuI7Ujm5XOHid8+Buw\nPrADMA2osZpHkdJbsiRMx9zYCB06FPfccVVRvOBN3JMomymupf5kM1HdHu7eK/q5M2EdY40zECmC\nK68Mc+8feGDxz33MMWFU7hdfhMe12JNIiiebzGAN7j7BzPZNojAi9WTs2LDs5OTJyZw/dY2Dww4L\nPYnUViCZtBkMzOzilIftCDOXfphYiUTqwMqVcO65YQWzJAdKxVVFhx0Wgo764ksm2TQgb5xy60Bo\nQxiQZKFEat0NN4QgkPSHc+oaB6omkta02bW0nNS1VGrRe+9B795hoFQp5gjabbcwfuHww0ODtRqQ\na18iXUvN7Oloorr4cWczG5NPAUXqnTtceCH85Celmyxu4EAYPlw9iaR12VQTbeHui+IH7r4Q2DK5\nIonUriTGFLRl4MDQq0hVRNKabILBl2a2bfwgGpGsuhuRHMVjCm65pfhjClrTpw9ss416Eknrsula\n+nPgX2b2HGHg2TeBIYmWSqQGXXll6NWTxJiC1rRrB5ddBvtouklpRVYNyGbWBdgveviSuy9ItFSr\nr6sGZKkJY8fCd74TevRsvnm5SyO1LqkG5GOBFe7+mLs/Rlj+cmC+hRSpN6ljChQIpFJl02Yw1N0X\nxw+ixuShyRVJpLbccAN07hxW5BKpVNm0GaQLGDlPYyFSj6ZPD+sUvPhicdYpEElKNpnBODO71sx2\njG7XAuOTLphItVu4EI4+Gq6+Gnr2LHdpRFqXzeI2GwFXAt+KNj0N/Le7L0u4bGpAlqq1YkVYerJX\nL7j22nKXRupNPg3Imo5CJAEXXAAzZoR1iLUco5RaPsEgm1lLtwD+i7CYzfrxdnfvn3MJRerAjTeG\naaNffFGBQKpHNm0GjcBUwgpnvwTeBcYmWCaRqvX003DVVSEj2GSTcpdGJHvZtBmMd/e9zWyiu/eK\nto1198THM6qaSKrJtGlhdPEDD5R+lLFIqkSqiYAV0c/ZZvZtwsI2CS7HIVJ9Pv44jDD+n/9RIJDq\nlE1mcDTwPNAduAHoBPzS3R9JvHDKDKQKqOeQVJqK601kZt2Au4CuQDPwJ3f/g5l1Bu4DtiO0QZyU\nOso55XgFA1nl88/h1FNh0qTsj/nNb2BAwuvynX8+zJwJjzyiBmOpDJUYDLYCtnL318ysI2Gw2gBg\nMPCRu19jZj8DOrv7pWmOVzAQICwKc9ZZsGhRWKglGy+9BNdfD+PGJTf698Yb4aabwqplnTolcw2R\nXFVcMFjrYmYPA3+Mbge5+9woYDS5+1fT7K9gIAD89rfQ2AjPPw8dO2Z3THNzWE2ssRH226/t/XP1\n9NNw+umhC2mPHsU/v0i+Epm1tFjMbHtgL+AloKu7zwVw9zlo5TRpxWOPwXXXhWqYbAMBhHn8zzsv\nfHMvtmnTwsRz99+vQCC1IetgYGb7mdmTZtaU6xTWURXRSODH7v4Ja6+Upq//ktakSTB4MIwaBd27\n53784MGhz//8+cUr06JF6jkktSdj11Iz2yr61h67GDiWsNrZy8DD2VzAzNYlBIK/uvvfos1zzaxr\nSjXRvEzHDxs2bNX9hoYGGhoasrms1ID58+GYY+D3v8+/mmfzzcMawHfcAZeu1SqVn+HD4YAD4Oyz\ni3M+kUI1NTXR1NRU0DkythlE9fsTgGvc/TMzu43QxbQZON/d+2V1AbO7gAXufnHKtuHAx+4+XA3I\nks7nn8O3vhW+ef/614Wda9w4OOEEePvtwnv7LFgAu+wCr74K227b9v4i5VD0BmQz+w7wY0L30JHA\nqcCGwL3u3mbibWb9gH8CkwhVQQ5cDrwC3E8YuzCT0LV0UZrjFQzqUNxzaPFiGDky1P0Xat994Yor\nQvVOIS67LFQT3Xxz4WUSSUoivYnMbB3gfOBo4Nfu/s/8i5gbBYP6lE/PobbceSfcey88+WT+51BW\nINWiqL2JzOwYM3s
WeBJ4AzgZGGBmI8xsx8KKKtXmX/8K39aTlm/PobacfDJMmAD/+U/+5/jd7+Ck\nkxQIpDa11mYwEegLbACMcfe+0fadgavc/buJF06ZQUVwh4MOgpdfDqt3bbhhMtd54w3o3z8EgiTG\nBfzsZ2Fx+t/9LvdjlRVINSn2OIPFwHHA8aT09nH3t0oRCKRyPPsszJkDe+0VRvUmYf78UJ9/3XXJ\nBAKAH/wgVBd9+mnuxyorkFrXWmbQBTiFMGvpPe6+pJQFi8qgzKDM4qzgnHNg6tTQG+dXvyruNYrZ\nc6gtRx8Nxx2XW5WXsgKpNkXNDNx9gbvf4O63lCMQSGWIs4JTToGGhrCCV7FdeCFssUVYFCZpF1wQ\n5hPK5TuGsgKpB1oDWTJKzQpOPx2WLYOuXWHevOK1GyxYEKZz+PDD4jYYZxLPV3TPPaG7aTblU1Yg\n1aai5yaS6pOaFQBstBHsuWdx2w3+/vcQcEoRCGD1fEU33pjd/soKpF5ks9KZ1CF3GDYMrrwS1k35\nK4mrivr3L851xoyBww8vzrmyNXgw7LRTaLTeYovM+y1YALfdFrICkVqnzEDSapkVxIrZbuBenmAQ\nz1f05z+3vp+yAqknajOQtbRsK0hVzHaDiRPDh/Lbbye3+Ewmbc1XpLYCqWZqM5CiyJQVQHHbDeKs\noNSBAKBPnxDUnngi/fPKCqTeKBjIGjK1FaQqVlXRmDFwxBGFnydf55+ffuGbuK3gsstKXyaRclEw\nkDW0lhXEihEMli0L01scfHBh5ynEySfD+PFrz1ekrEDqkYKBrBJnBb/4ReasAGD//cOkb/lM6xB7\n7jno3bu8i8ivv37oWZQ6HbWyAqlXCgayyrPPwty58N02Zp4qRrtBOXoRpdNyviJlBVKvFAwEyK6t\nIFWhVUWVEgx22CFMjDdihLICqW8KBgJknxXECgkGM2fCxx/D17+e3/HFFs9X9NvfKiuQ+qVxBrJq\nXMGQIXDaadkdU8h4g9tuC20GjY25lzUJ8XxF8+bB5MkKBlL9NM5A8pJrVgCFtRs8+WRlVBHF2rWD\nyy+HH/5QgUDqlzKDOpdPVhD7+c9zX99gxYowH9DUqbDVVrldT0Syo8xAcpZPVhDLp93g5ZdDo60C\ngUhlUTCoY7n2IGopn/EGldKLSETWpGBQBf7yl9DY+tlnxTvn0qWh90y+WQHk126gYCBSmRQMKtys\nWXDxxXDXXdC9e7j/5pv5n2/cuNA+sO224UN8xIj8soJYLlVFCxaEtoJ+/fK/nogkQ8Ggwl19NZx9\ndvhG/cryqYktAAAMu0lEQVQrsMEGYWGZgw7KPltYuhRuvRX23htOPBG23x6mTIFRowrv659LMIhX\nNWvfvrBrikjxqTdRBZs1C3r1Ct+mt9xy9fYVK+DRR8MH/IQJYc2Bc86BXXdd8/jx48M+DzwQAsiQ\nIXDooaErZbHkMt5g8OAQkC68sHjXF5G15dObSMGggl14YcgEfvObzPvMmAG33x5W7dp55/CBv2xZ\nGNi1cGEIEmeeCVtvnVw5+/WDq65qfSlMd+jWLWQRO++cXFlERMGgpmTKCjKJs4Xbbw/f0IcMgW99\nq7hZQCbZjDeYNAkGDCjPqmYi9SafYFBA06EkKW4ryCYQAKy3Hhx3XLiVWkNDyAxa8+STYSEbBQKR\nyqRgUIFmzYJ77glZQTVIHW+Qqd1gzJgw3YOIVCb1JqpAuWYF5dbWeINKWNVMRFqnzKDCVFtWEIu7\nmKZrRK6EVc1EpHXKDCpMtWUFsdbGG2jUsUjlU2+iCpJrD6JK0tp4g69+NQyQ23vv8pRNpN5o1tIq\nN3x4dWYFkLndoNJWNROR9NRmUCE++CB8e662toJU6doNxowp/qhnESm+RP9FzewOM5trZhNTtnU2\ns6fMbJqZjTGzTZIsQ7Wo1raCVOnaDcaMCeMLRKSyJdpmYGYHAJ8Ad7l7r2jbcOAjd
7/GzH4GdHb3\nSzMcXxdtBh98AHvsUZ1tBalathvEq5pNmxa2i0hpVFybgbv/C1jYYvMA4M7o/p3AwCTLUA1qISuA\ntdsN4lXNFAhEKl852gy2dPe5AO4+x8yq/COwMLXQVpAqtd1AXUpFqkclNOvVfj1QK2olK4ilthso\nGIhUj3JkBnPNrKu7zzWzrYB5re08bNiwVfcbGhpoaGhItnQlVGtZAayep+j997WqmUipNDU10ZTt\nKlMZJD7ozMy2Bx519z2ix8OBj919eL03IP/wh7D++q2vV1CN+vWD3XeH2bPDtNoiUloVN4W1md0D\nNACbm9l7wFDgauABMzsLmAmclGQZKlUtZgWxhoYwgO73vy93SUQkW5qOokxqNSsAePppOOwwmD5d\nq5qJlEPFZQaS3tKlcNdd4cOyFvXrBxddBDvtVO6SiEi2lBmUwV13hUXqVZ8uIkmouEFnkt7dd8Np\np5W7FCIiqykzKLHZs2G33UIDcqYlIkVECqHMoAqMGAEDBigQiEhlUTAoscZGGDSo3KUQEVmTgkEJ\nTZ0KH36Yfp1gEZFyUjAoocZG+O53YZ11yl0SEZE1aZxBibiHYPDAA+UuiYjI2pQZlMi//w0dOkDv\n3uUuiYjI2hQMSiRuOLacOnuJiJSGxhmUwIoVsM02YeWvHj3KXRoRqXUaZ1ChxoyBnj0VCESkcikY\nlEBjo6afEJHKpmqihC1dCt26wdtvQ5cu5S6NiNQDVRNVoIceggMPVCAQkcqmYJAwzVAqItWg7qqJ\nPvwQ3nkHDjigqKdNa84c2HVXzVAqIqWlaqI2fPIJHHkkHH00/Pd/Q3NzstfTDKUiUi3qJhg0N4dB\nX337wpQp8PjjcMIJoYE3KXffrRlKRaQ61E0wuPxyWLwYbrwxDABragqNuvvum8xaxFOnhuohzVAq\nItWgLoJBvObwyJHQvn3Y1qED3HYb/OQnof3g8ceLe83GRjjlFM1QKiLVoeYbkF98EQYODJnAbrtl\n3ufEE+G880IG0a7AEOkOO+4YAtDeexd2LhGRXKkBuYWZM0O7wJ13Zg4EAPvvD2PHhuzgxBMLb0fQ\nDKUiUm1qNhh88gkccwxccknoQdSWuB1h881hv/3grbfyv7ZmKBWRalOT1UTNzXDssbDFFvCnP+X+\noXzbbXDFFfCXv8BRR+V2rGYoFZFyUzVR5PLLYdEiuOmm/L6dDxkCDz8cfv785zBvXvbHaoZSEalG\nNRcM4p5Do0at7jmUj/33h1degdmzYZdd4OST4Zln2h6oFlcRiYhUk5qqJop7Dj37LOy+e/HKsXhx\nGEB2662wfDmccw6ceSZsueWa+2mGUhGpBHVdTTRzJhx/fKjnL2YgANhkE7jgAnj99RAUpk5dnS38\n4x+rswXNUCoi1aomMoNPPoF+/cK39YsuSr5ckD5beOyxEDROPrk0ZRARSSefzKAmgsH//m+omsmn\n51Ch3EPbwq23wgsvwKuvamI6ESmvug0GX34ZboU0GIuI1Ip8gsG6SRWmlNZZR3MAiYgUomYakEVE\nJH9lCwZmdoSZTTWz6Wb2s3KVQ0REyhQMzKwd8EfgcGB34BQz+2o5ylJOTU1N5S5CYmr5tYFeX7Wr\n9deXj3JlBn2Bt9x9pruvAEYAA8pUlrKp5T/IWn5toNdX7Wr99eWjXMHgK8D7KY9nRdtERKQM1IAs\nIiLlGWdgZvsBw9z9iOjxpYC7+/AW+1XuIAgRkQpWFYPOzGwdYBpwCDAbeAU4xd3fLHlhRESkPIPO\n3P1LM7sQeIpQVXWHAoGISPlU9HQUIiJSGhXZgFzrA9LM7F0ze93MXjWzV8pdnkKZ2R1mNtfMJqZs\n62xmT5nZNDMbY2ablLOMhcjw+oaa2SwzmxDdjihnGfNlZt3M7Bkzm2xmk8zsR9H2mnj/0ry+H0bb\na+X962BmL0efJZPMbGi0fXszeyn6DL3XzNqsB
aq4zCAakDad0J7wITAW+K67Ty1rwYrIzN4B9nb3\nheUuSzGY2QHAJ8Bd7t4r2jYc+Mjdr4kCemd3v7Sc5cxXhtc3FFjq7teWtXAFMrOtgK3c/TUz6wiM\nJ4z5GUwNvH+tvL6TqYH3D8DMNnT3T6O22BeAHwMXAyPd/QEzuxl4zd1vbe08lZgZ1MOANKMyf/d5\ncfd/AS0D2wDgzuj+ncDAkhaqiDK8PgjvY1Vz9znu/lp0/xPgTaAbNfL+ZXh98Zimqn//ANz90+hu\nB0I7sAMHA6Oi7XcCx7Z1nkr8QKqHAWkOjDGzsWZ2TrkLk5At3X0uhH9IYMs29q9GF5jZa2Z2e7VW\no6Qys+2BvYCXgK619v6lvL6Xo0018f6ZWTszexWYAzwNvA0scvd4xfZZwDZtnacSg0E96OfufYCj\nCH+QB5S7QCVQWfWRhbsJ2NHd9yL8E1Z1dUNUhTIS+HH0Dbrl+1XV71+a11cz75+7N7v71wkZXV8g\nr3neKjEYfABsm/K4W7StZrj77OjnfOAhwhtYa+aaWVdYVW87r8zlKSp3n5+y8tKfgH3KWZ5CRI2L\nI4G/uvvfos018/6le3219P7F3H0J0AR8A9g0an+FLD9DKzEYjAV2MrPtzKw98F3gkTKXqWjMbMPo\nWwpmthFwGPBGeUtVFMaadbCPAGdG988A/tbygCqzxuuLPiBjx1Hd7+GfgSnufn3Ktlp6/9Z6fbXy\n/plZl7iKy8w2AA4FpgDPAidGu2X1/lVcbyIIXUuB61k9IO3qMhepaMxsB0I24ITGnsZqf31mdg/Q\nAGwOzAWGAg8DDwDdgZnASe6+qFxlLESG13cwof65GXgXODeuY68mZtYP+CcwifA36cDlhFkB7qfK\n379WXt+p1Mb7twehgbhddLvP3X8dfc6MADoDrwKnRR1yMp+rEoOBiIiUViVWE4mISIkpGIiIiIKB\niIgoGIiICAoGIiKCgoGIiKBgIFIQMzujxQAmkaqkYCBSmDPJMJFiynQAIhVPf6xSc6KpTKaY2W1m\n9oaZPWlm65vZs2bWO9pnczObEd0/w8weihZzecfMLjCzi6JFT140s00zXOd4oA9wd7Tv+mY2w8yu\nNrNxwAlm1sPMRkcz1D5nZj2jY7uY2choYZKXzewb0faDooVKJpjZ+GjKEpHEKRhIrdoJuMHdvwYs\nAo6n9Zk4dyfM2d8X+DXwibv3Jkzn/L10F3D3UYS5tE51997u/ln01AJ37+Pu9wO3ARe6+z7AJcDN\n0T7XA9e6+77ACcAd0fafAudH1/4msDyvVy+SozaXQhOpUjPcfVJ0fwKwfRv7PxstEvKpmS0CHou2\nTwL2aOW4lhP0AdwHqyYi3B94wMzifdaLfn4L2DVle0cz25CwUtV1ZtYIPOjuNTVjr1QuBQOpVZ+n\n3P8S2ABYyepseP1W9veUx83k/n+yLPrZDlgYfctvyYB900weNtzMHgO+DbxgZoe5+/Qcry+SM1UT\nSa1Kt6Thu4Q6flg9vW+hlgCd0j3h7kuBGWZ2wqpCmfWK7j5FWKs23r5n9LOHu09292sIVVB5LVQi\nkisFA6lV6doHfgucZ2bjgc1yOLY1dwK3xA3IaY4dBJwdLa/4BnBMtP3HQB8zez3afm60/SdmNsnM\nXgO+AEbnUBaRvGkKaxERUWYgIiJqQBbJipn9EehHqAay6Of17n5nWQsmUiSqJhIREVUTiYiIgoGI\niKBgICIiKBiIiAgKBiIigoKBiIgA/x+X3mvtI8m3lQAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "exact_results = [element[0] for element in model.most_similar([model.wv.syn0norm[0]], 
topn=100)]\n", "x_axis = []\n", "y_axis = []\n", "# Rebuild the index with 1 to 29 trees and count how many of the exact\n", "# top-100 neighbors the approximate search also returns\n", "for x in range(1, 30):\n", "    annoy_index = AnnoyIndexer(model, x)\n", "    approximate_results = model.most_similar([model.wv.syn0norm[0]], topn=100, indexer=annoy_index)\n", "    top_words = [result[0] for result in approximate_results]\n", "    x_axis.append(x)\n", "    # With topn=100, the overlap count is already a percentage\n", "    y_axis.append(len(set(top_words).intersection(exact_results)))\n", "\n", "plt.plot(x_axis, y_axis)\n", "plt.title(\"num_trees vs accuracy\")\n", "plt.ylabel(\"% accuracy\")\n", "plt.xlabel(\"num_trees\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This experiment also used the Lee corpus, which is relatively small. Accuracy here is the share of the exact top-100 neighbors that the Annoy index also returns. Results will vary from corpus to corpus." ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }