{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep Inverse Regression with Yelp reviews\n", "\n", "In this note we'll use [gensim](https://radimrehurek.com/gensim/) to turn the Word2Vec machinery into a document classifier, as in [Document Classification by Inversion of Distributed Language Representations](http://arxiv.org/pdf/1504.07295v3) from ACL 2015." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data and prep" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, download to the same directory as this note the data from the [Yelp recruiting contest](https://www.kaggle.com/c/yelp-recruiting) on [kaggle](https://www.kaggle.com/):\n", "* https://www.kaggle.com/c/yelp-recruiting/download/yelp_training_set.zip\n", "* https://www.kaggle.com/c/yelp-recruiting/download/yelp_test_set.zip\n", "\n", "You'll need to sign-up for kaggle.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can then unpack the data and grab the information we need. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tutorial Requirements:\n", "1. gensim (and all of its own requirements)\n", "1. pandas\n", "1. matplotlib" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# ### uncomment below if you want...\n", "# ## ... copious amounts of logging info\n", "# import logging\n", "# logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)\n", "# rootLogger = logging.getLogger()\n", "# rootLogger.setLevel(logging.INFO)\n", "# ## ... or auto-reload of gensim during development\n", "# %load_ext autoreload\n", "# %autoreload 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we define a super simple parser" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import re\n", "\n", "contractions = re.compile(r\"'|-|\\\"\")\n", "# all non alphanumeric\n", "symbols = re.compile(r'(\\W+)', re.U)\n", "# single character removal\n", "singles = re.compile(r'(\\s\\S\\s)', re.I|re.U)\n", "# separators (any whitespace)\n", "seps = re.compile(r'\\s+')\n", "\n", "# cleaner (order matters)\n", "def clean(text): \n", " text = text.lower()\n", " text = contractions.sub('', text)\n", " text = symbols.sub(r' \\1 ', text)\n", " text = singles.sub(' ', text)\n", " text = seps.sub(' ', text)\n", " return text\n", "\n", "# sentence splitter\n", "alteos = re.compile(r'([!\\?])')\n", "def sentences(l):\n", " l = alteos.sub(r' \\1 .', l).rstrip(\"(\\.)*\\n\")\n", " return l.split(\".\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And put everything together in a review generator that provides tokenized sentences and the number of stars for every review." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from zipfile import ZipFile\n", "import json\n", "\n", "def YelpReviews(label):\n", " with ZipFile(\"yelp_%s_set.zip\"%label, 'r') as zf:\n", " with zf.open(\"yelp_%s_set/yelp_%s_set_review.json\"%(label,label)) as f:\n", " for line in f:\n", " if type(line) is bytes:\n", " line = line.decode('utf-8')\n", " rev = json.loads(line)\n", " yield {'y':rev['stars'],\\\n", " 'x':[clean(s).split() for s in sentences(rev['text'])]}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For example:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'x': [['nice', 'place', 'big', 'patio'],\n", " ['now', 'offering', 'live', 'sketch', 'comedy'],\n", " ['wednesday',\n", " 'november',\n", " '17th',\n", " 'see',\n", " 'local',\n", " 'troupe',\n", " 'th',\n", " 'sic',\n", " 'sense',\n", " 'in',\n", " 'their',\n", " '2nd',\n", " 'annual',\n", " 'holiday',\n", " 'show'],\n", " ['lighter', 'snappier', 'take', 'on', 'the', 'holiday', 'times'],\n", " ['not', 'for', 'the', 'easily', 'offended'],\n", " ['sketches',\n", " 'include',\n", " 'the',\n", " 'scariest',\n", " 'holloween',\n", " 'costume',\n", " 'the',\n", " 'first',\n", " 'thanksgiving',\n", " 'and',\n", " 'who',\n", " 'shot',\n", " 'santa',\n", " 'claus'],\n", " ['as', 'well', 'as', 'the', 'infectious', 'song', 'mama', 'christmas']],\n", " 'y': 5}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "try:\n", " next(YelpReviews(\"test\"))\n", "except FileNotFoundError:\n", " raise ValueError(\"SKIP: Please download the yelp_test_set.zip\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, since the files are small we'll just read everything into in-memory lists. It takes a minute ..." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "229907 training reviews\n" ] } ], "source": [ "revtrain = list(YelpReviews(\"training\"))\n", "print(len(revtrain), \"training reviews\")\n", "\n", "## and shuffle just in case they are ordered\n", "import numpy as np\n", "np.random.shuffle(revtrain)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, write a function to generate sentences -- ordered lists of words -- from reviews that have certain star ratings" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def StarSentences(reviews, stars=[1,2,3,4,5]):\n", " for r in reviews:\n", " if r['y'] in stars:\n", " for s in r['x']:\n", " yield s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Word2Vec modeling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We fit out-of-the-box Word2Vec" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Word2Vec(vocab=0, size=100, alpha=0.025)\n" ] } ], "source": [ "from gensim.models import Word2Vec\n", "import multiprocessing\n", "\n", "## create a w2v learner \n", "basemodel = Word2Vec(\n", " workers=multiprocessing.cpu_count(), # use your cores\n", " iter=3, # iter = sweeps of SGD through the data; more is better\n", " hs=1, negative=0 # we only have scoring for the hierarchical softmax setup\n", " )\n", "print(basemodel)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Build vocab from all sentences (you could also pre-train the base model from a neutral or un-labeled vocabulary)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "basemodel.build_vocab(StarSentences(revtrain))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we will _deep_ copy each base model and do star-specific training. This is where the big computations happen..." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 stars ( 246207 )\n", "2 stars ( 295371 )\n", "3 stars ( 437718 )\n", "4 stars ( 883235 )\n", "5 stars ( 799704 )\n" ] } ], "source": [ "from copy import deepcopy\n", "starmodels = [deepcopy(basemodel) for i in range(5)]\n", "for i in range(5):\n", " slist = list(StarSentences(revtrain, [i+1]))\n", " print(i+1, \"stars (\", len(slist), \")\")\n", " starmodels[i].train( slist, total_examples=len(slist) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inversion of the distributed representations\n", "\n", "At this point, we have 5 different word2vec language representations. Each 'model' has been trained conditional (i.e., limited to) text from a specific star rating. We will apply Bayes rule to go from _p(text|stars)_ to _p(stars|text)_." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For any new sentence we can obtain its _likelihood_ (lhd; actually, the composite likelihood approximation; see the paper) using the [score](https://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec.score) function in the `word2vec` class. We get the likelihood for each sentence in the first test review, then convert to a probability over star ratings. Every sentence in the review is evaluated separately and the final star rating of the review is an average vote of all the sentences. This is all in the following handy wrapper." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", "docprob takes two lists\n", "* docs: a list of documents, each of which is a list of sentences\n", "* models: the candidate word2vec models (each potential class)\n", "\n", "it returns the array of class probabilities. Everything is done in-memory.\n", "\"\"\"\n", "\n", "import pandas as pd # for quick summing within doc\n", "\n", "def docprob(docs, mods):\n", " # score() takes a list [s] of sentences here; could also be a sentence generator\n", " sentlist = [s for d in docs for s in d]\n", " # the log likelihood of each sentence in this review under each w2v representation\n", " llhd = np.array( [ m.score(sentlist, len(sentlist)) for m in mods ] )\n", " # now exponentiate to get likelihoods, \n", " lhd = np.exp(llhd - llhd.max(axis=0)) # subtract row max to avoid numeric overload\n", " # normalize across models (stars) to get sentence-star probabilities\n", " prob = pd.DataFrame( (lhd/lhd.sum(axis=0)).transpose() )\n", " # and finally average the sentence probabilities to get the review probability\n", " prob[\"doc\"] = [i for i,d in enumerate(docs) for s in d]\n", " prob = prob.groupby(\"doc\").mean()\n", " return prob" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Test set example\n", "\n", "As an example, we apply the inversion on the full test set. " ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# read in the test set\n", "revtest = list(YelpReviews(\"test\"))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# get the probs (note we give docprob a list of lists of words, plus the models)\n", "probs = docprob( [r['x'] for r in revtest], starmodels )" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "import matplotlib" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtUAAAFWCAYAAACmf2GAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X2cXWV97/3vDwISG2GSgkbDgTkarcWH7ipG7pqarZxq\nRMW8fCxUzXi8LfepHJ0eT+/a3trM9EGk6oupYGuxnM4oFVS0o6CCtLKiqEBUJoBEFHQiBBIekqBE\nIyT87j/W2ps1+3HtmZ25rjXzeb9egb32Xnvt31xrzZrfvtZvXZe5uwAAAADM3mGhAwAAAADKjqQa\nAAAAmCOSagAAAGCOSKoBAACAOSKpBgAAAOaIpBoAAACYI5JqAKVgZgfN7PtmNmVm3zWzUw7BZ/yi\ny+snmtkZ/f7cQ83MNprZ+S2e32Rm/2uW25z3tjCz3zGzV8znZwJAUSTVAMpin7s/z90rkv5S0gcP\nwWd0G7j/v0o6cy4fYGahzrv9npSgbVuY2eF9/qyaiqTTennDIYwFAGYgqQZQFpZ7fIyk3fUXzD5k\nZjeb2VYze2P23AYz+4/s8ZPN7DYze2LWaztpZtdkz/1Vyw+buc03ZE+fI2lt1mP+7ob1zcz+0cxu\nNbOrzOzLZvba7LWfmtkHzey7kl6f9bh+J+t1/7yZHZOtd42ZPS97/Jtm9tPscduYzeyPzOz6LKZ/\nMjPLnn9btu51kl7UoV0rZvbtbN23Z++dMLPTc59xsZm9uuF9M9oii/GLZvafkv7DzNaZ2eW5bZxv\nZm/NHj/PzBIz22JmXzWzJ7Vo/zdk7X9jtu4Rkv5a0huzz3yDmb0gi/17ZnatmT091175WFaa2ebs\nfTeZWaf2AIBZWRI6AAAoaKmZfV/SUkkrJb1UkrLE9bnu/hwze6KkLWa22d0nzey1ZvZOSeslvd/d\n781yzhdIepak/dn6V7j792sfZGava7HNb0h6r6T3uPvpavZaSSe4+0lZkrhN0kW51+9395Oz7W+V\n9E53v9bMRiVtktSqDCPfu9wUs6RfSnqTpN9z94Nm9jFJf5R9mRiR9LuSfi4pkfR9tfYcSS+U9ARJ\nN5rZl7O4/1TSl8zsaEn/l6S3NrxvRluY2cbs857j7g+a2Tq16B03syWSzpd0urs/kH0J+oCktzes\n+n5JL3P3e8zsaHd/JPsy8Xx3f1e2rWWS1rr7o2Z2qtJE//XZ+/Ox/C9JV7r7OdmXjse3aQsAmDWS\nagBl8Ut3r/XiniLpU5KeLWmtpEskKUuaE6UJ6BWS3iXpFknfcffP5rZ1tbvvzbb1hWwb+aTzRW22\n2anmeq2kz2Xv2WVm1zS8/pns846WdIy7X5s9PyHps+ouH/Pns887KOn5SpNsk3SUpF1Kk+Rr3H13\ntv5nJD29zXa/6O4PS3rAzL4uaY27f8nMPmZmv6k0Sf28uz9aMMYHu6zzW0r329VZzIdJurvFetdK\nmjCzz0r6QpttDUj6ZNZD7Zr5Ny0fyxZJF2W93V90960FfhYA6AlJNYDScffrzOxYMzu2xcv5MpH/\nIulRSY3lBY09qN3qja3L60XsK7DOAT1WlndUw2v5GC23PO7u/19+RTN7jYrH3G67n5T0Fkl/KGmo\n4LbyP2P+Z5Ee+3lM0i3u3rEEw93/xMxeIOlVkr5XK4tp8DeSvu7urzWzEyXlv8jUY3H3b5rZiyW9\nUtK4mX3E3S8u+DMBQCHUVAMoi3qSaGbPVHr+ekDSNyW9ycwOM7PjJP2+pBuyMoOLlCaF28zsPblt\n/YGZDZjZUkkblPaK5j+j5TaV9lQ/oU1835L0Oks9SVK11Uru/nNJe3J1vW+RtDl7PC3p5OzxGxre\n2hjztyR9XWmN9nFZuyw3sxMkXS/pxdnyES22lfcaMzsy65Vep7RXV0p70IfTkP2HLd7XqS0kabuk\nk8zsCDMbkHRq9vxtko7LrjbIzJaY2UmNbzazp7r7FnffJOlepV+QfiHp6NxqR0vakT1+W7tAsja5\n190vkvQvklol6AAwJ/RUAyiLo7Ka6lri+1Z3d0n/niVoW5X2Sv9ZVrLxfknfcPdvm9lNShPtK7L3\n3qC0pGCVpE+5+43Z8y5J7t5um7slPWpmNyrtIf6HXHyfV1rn/QNJd0r6nqQH89vN2Sjpn7ME+Sd6\nLCH8sKTPmtk7JH254T2NMX9fkszsfZK+ZumoIg8rrdW+wcxGJF0naY+kqQ7tepPSmuvflPTX7r4z\na4N7zWybpH/v8L56W2SfU+fud2WlG7dI+qmy8pqsNvr1ks639AbNwyWNSbq1Yfsfqt14KOk/3f0m\nM7tT0nuz4+AcSX+vtPzjfS3aK68q6c/M7BGliXljfTgAzJmlf5MAYHHIbqir3+zW523/hrvvM7MV\nSnuLX+Tu9/Zhu4cs5g6f+XilXyqe5+4dx+8GANBTDQD9dEVW6nCE0l7fOSfUIWQjaVwk6SMk1ABQ\nDD3VAAAAwBxxoyIAAAAwRyTVAAAAwByRVAMAAABzRFINYNEzs6PM7HIz25vNPhglM7vGzP57BHGs\ny4a3CxnDX5jZhR1eP9PMrpzPmAAsboz+AaD0zOxRSavd/Sez3MTrJR0nablz93ZRQdvJ3c+pPc5m\nU/yppCW16dTd/dOSPh0oPACLED3VABaCuSZ4J0r6EQl1yswODx1Dj2rTq/djOnkAmBWSagBRMLNn\nZuUNe8zsZjN7de61GWUPZrbRzL6ZPd6sNJm6ycx+bmYtp+Rut/1s5sG/kvSH2ftbTndtZueZ2S4z\ne9DMttam1jaz08zs+9nz281sU+49J5rZo2Y2ZGY/M7MHzOwsMzs528ZuMzu/4ee61szOz0pRbjWz\nl3Zos/+erfOAmX01m4671Xq1ON5hZjuyf+/Jvb7JzD5nZp8ys72SNmZTl49l696V/fxHzNys/YWZ\n3WdmPzGzMzvEeY2ZfcDMrs/a6d+z8bxrr59uZrdk7fF1S6ehr73259nn/9zMtpnZS3IxfzJbrTbN\n+95svRc2HCP/aGYfaohp0syGs8dPNrPLzOxeM7vDzP5nu58FANohqQYQnJktkXS5pCuVlmG8S9K/\n2WPTVLdSm1J8Xbb8HHc/2t0/18v23X1E0gckXZq9/19bvP9lktYqLTE5RtIbJT2QvfyQpLdkz79S\n0v9jZqc3bGKNpNWS3qR0Su6/VDql+bMlvdHMfj+37gsl/VjptOEjkr6QT0BzMb1G0nslbch+pm9K\nuqR1U9VVJT1N0ssl/XlDwn66pM+6+4DSson3ZXE/V9LvZI/fl1t/paQVkp4iaUjShV3211uy9VZK\nOijp/OzneEb2ee/Kfo6vSrrczJZkr71T6WySR2dxT7fY9ouz/x+d7cPrs+XalYdLlO4zZZ85IOll\nki4xM1N6bNwo6cmSTpX0bjP7gw4/CwA0IakGEINTJP2Gu5/r7gfc/RpJV0g6o4dtdLr0P9ftPyLp\nCZJOMjNz99vcfZckufs33P0H2eNbJF0qaV3uva50dsWH3f0/JO2TdIm7P+DudytNhn83t/4ud/+o\nux90989Kuk1pst7oLEnnuPuPsjriD0qqmNl/6fBzjLj7/izOf234+b/j7pdnP8d+SWdKGs3ifEDS\nqNLEOP9zvd/dH3H3b0j6snKJawufcvdt7v4rSe+X9IYsoX2jpCvc/evuflDShyUtlfR7SpPvIyU9\n28yWuPvP3P2nHT6j5THg7t+U5Ga2Nnvq9ZK+ne3DNZKOdfe/y9p8WtK/SPrDDp8DAE1IqgHE4CmS\nGkeT2C5p1Ww2ZmZfMbNfZKUAZ/S6/awUofb+F2VJ+AWSPiZpl5l93MyWZeuuyUoW7s1KJ86SdGzD\nJvPTlf9K0q6G5WW55R0t4nxKizBPlPQPWcnEbqU9597uZ8peu6vDdhvb5ymSftZh/T1Z8t0tzlbb\n3650Kvdjs/dsrweZ1rXfKWmVu98haVhpj/0uM/u0ma3s8BmdfEaPfYk4U9K/ZY9PkLSq1o5mtkfS\nX0h64iw/B8AiRVINIAZ3S2rsYT1BjyWY+yQ9Pvdax8TK3U9z9ydkpQCXFNh+4/ufnXv/t7LnLnD3\nkyWdJOm3JP1ZtvqnJU0qTQIHJP2z5nbDXGNSfEIWf6M7JZ3l7iuyf8vdfZm7X9dmu6aZbdC43cab\nNHcoTdxrTmxYf7mZLS0QZ03+s09U2vt/f/aeE1usu0OS3P1Sd//93Drntth2kRtML5H0+qzu/IWS\nPp89f6eknzS04zHu/uq2WwKAFkiqAcTgekm/NLP/N6ulrUp6lR6rEZ6S9FozW2pmqyW9veH9OyU9\ndQ7b7yi7sXBNVpv9K0n7lZYmSGkv8x53f8TM1ijtBZ3x9iKfkfNEM/ufWZxvkPRMpaUVjT4u6S/t\nsRsmjzGz13fZ9vuzNnyWpLcpLVVp51JJ7zOzY83sWKUlG5/KvW6SRs3siKwm/JWSmurZc95s6c2i\nj1daSvK5rFf6s5JeaWYvyX7m/620fb9tZs/Inj9S0sNK2/7RFtu+L3v+ae0+3N2nlPbm/4ukK939\n59lLN0j6RXZsHGVmh5vZs8zs5A4/CwA0IakGEJy7PyLp1ZJOU9p7eYHSm/9+nK1yntKezZ1Ka4Ev\nbtjEiKRPZpfvmxLLAtvv5mhJn5C0W+l4yPcrrf2VpD+R9Ddm9qDSG/kaJ49p7EXttny9pKdnn/E3\nkl7n7nsb13X3SaV11JdmZSc3SVrf5efYLOl2SVdL+nt3/88O6/6tpO9m292aPf673Ov3SNqjtKf5\nU0p7zX/UYXufkjSRrX+kpHdnP8ePJL1Z6T65T2ly/mp3PyDpcdnPeF/2vuOUlmbMkNVp/52kb2XH\nwJo2MXxa6Y2I/5Z776NKv2BVlO7be5Xu66M7/CwA0MS6DctqZhcpPeHscvfntlnno5JeofQS7VDW\nIwAA6IGZbZT0dnd/cdeVe9vuiZJ+IumI2uQo88nMrlF6o+L/me/PBoD5UqSn+l+VDmPUkpm9QtLT\n3P3pSm/Q+XifYgMA9A8TowDAIdQ1qXb3a5Ve4mvnNZI+ma17vaRjzOxJ/QkPANAnIWeLZKZKAAve\nkj5sY5VmDpW0I3tuV+vVAQCtuPuE0rrjfm93u6RgU4+7e9tZIQFgoeBGRQAAAGCO+tFTvUMzxx89\nXm3GfjUzLgECAACgtNy95T0qRZNqU/ubXL4k6Z2SPmNmp0jaW5u+t00gBT9ycRgaGtL4+HjoMFAC\ny5Yt00MPPRQ6DERobGxMk5OTkqTNmzdr3bp0lvQNGzZoeHg4ZGiIVKUypKmp8dBhIHLkKM3M2t/z\n3TWpNrNPS6pK+k0z+5mkTUrHGHV3v9Ddv2Jmp5nZ7UqH1HtbX6JeJG699dbQISBiSZIoSRJJ0r59\n+zQyMiJJqlarqlarweJCXCqVivbuTYey3rx5c/3YqFQqAaNCzLZuDR0BsPB0TardvXF2sFbrnN2f\ncBafe++9N3QIAEpuamqq/uVLUv3xwMAAX77QxmDoAFACg4ODoUMolX7UVGMOBgYGQoeAiOV7pCcn\nJ+s91UAePdXoXTV0ACgBvpT3hqQ6gHz949atW+sHLfWP6GTlypWhQ0Ck6KkGgPBIqgMYHh6uJ8+V\nSmXGH0OgnfXr14cOAZGipxrAoTA+LvG9vDjGqQ6M8g8URYIEoF82baqGDgElMDFRDR1CqZBUB7Zh\nw4bQIaAkuKIBoF+4PQPoP8o/AqP3EUVNT0+HDgGRuuyyy3TFFVfUl2vjyt5///3UVKOlJEk4NlBA\nIm5qLY6kGohYfpzqiYmJ+vBGjFMNAEBcbD5nODQzZ0ZFYHZGRkYYUg8t5b98jY6OatOmTZL48gVg\nbswk0raZzKztNOUk1UBJkFSjiOyEHzoMAAvAyAj19406JdWUfwRGXRuKuv/++0OHgEjle6olMZ09\nuhoaSjQ+Xg0dBiJXrSaipro4Rv8ASuKhhx4KHQKABWJiInQEwMJD+QdQEpR/LHxmLa8oBsG5emGj\nVhaYHco/gJJqvAGthsv6C1M/EtmhoaH6kHoAgPlDUh0YNdXoJJ88X3fddfRUoyvGvkcxiaiVRTfk\nKL2hphooiZ07d4YOASUwNUVSDaA/uOjVG2qqgZLgsj6KoFYWRTBUGorgfNKMmmqgpJhREcChQEIN\n9B891YFRr4SiKpWKpqamQoeByJklcq+GDgOR428PiuB80qxTTzU11QAAAMAc0VMNRKxxSL1NmzZJ\novwD7VEDCaBfOJ80o6YaKKl88jw9Pc2Qeugq+94FAHPG+aQ3lH8EVuuFBLq59tprQ4eAEqhWk9Ah\noASGhpLQIaAEOJ/0hqQ6MG48AwDMt4mJ0BEACw/lH4GRVKOTfE31HXfcUS//oKYa7XBcoJhq6ABQ\nApxPekNPdWDT09OhQwAAAMAcMfpHAIzogNlgnGoUwfjDKILxh1EE55NmjFMNLAArV64MHQJKgJns\nAfQL55Pe0FMdWLVaZQQQFEKPAYpgXFkUMTLCVOXojvNJM3qqIzY4OBg6BJQECTWAfiGhBvqPpDqw\nZcuWhQ4BJcEVDRSThA4AJcD5BMUkoQMoFZLqwB566KHQIQAAAGCOSKoDo/wDRVH+gWKqoQNACXA+\nQTHV0AGUCpO/BNA4pF4NQ+qhE25URBHZCJ0ASmzFCmnPntBRpKzlLXnzb/lyaffu0FF0xugfgTH2\nMIoaGhrSOOMboQu+fKGIoaFE4+PV0GGgjVhG3YjpfBJLmzD6R8SoqUZRO3fuDB0CgAViYiJ0BMDC\nQ/lHAPnyjzvuuEMj2dhGlH+gUf5YueqqqzhW0BXHBYqphg4AJcD5pDeUfwRG+QeKGhwc1PT0dOgw\nACwAsVxKR2vsn2axtEmn8g96qgPI9z5u3bqV3ke0lT9Wtm/fzrGCrmKqgUTMEtFbjW44n/SGpDqA\nfEL0kY98pJ4oAcBcjY9L/A0EgPlH+UcAjUPqbcrGwKL3EZ0MDAxo7969ocNA5GK5RIq4jYwwVXnM\n+D1uFkubUP4BlNTY2JgmJyclSQ8++GD9S9eGDRs0PDwcMDIAZUZCDfQfSTUQseHh4XryvGrVqvoV\nDqC9RNTKohtqZVEEx0lvSKoDyJd5XHzxxdRUo5CDBw+GDgEAALTB5C+BrV27NnQIKIkTTjghdAgo\nhWroAFAC9D6iCI6T3tBTHdjQ0FDoEBCx/E2tW7ZsYUg9dJXd9wwAmGeFRv8ws/WSxpT2bF/k7uc2\nvH60pIslnSDpcEkfcffxFtth9I8GY2Nj3HCGQpYtW8a09uiKGkgUMTSUaHy8GjoMtBHLSBcxnU9i\naZM5jf5hZodJukDSqZLulrTFzL7o7j/MrfZOST9w99PN7FhJt5nZxe5+oA/xL2jMpohO8j3V+/bt\no6caQF9MTKRjmgPonyLlH2sk/djdt0uSmV0q6TWS8km1S3pC9vgJkh4goS5mcHAwdAiI2NTU1IwR\nP2qPBwYGSKrREscFiqmGDgAlwPmkN13LP8zsdZJe7u5/nC2/WdIad39Xbp1lkr4k6ZmSlkl6k7t/\ntcW2KP8Qk79gdpYsWaIDB/iuCmDuYrmUjtbYP81iaZP5mPzl5ZJudPeXmtnTJF1tZs91dwpAW8gn\nz5OTkwyph7byX8AOHjxI+Qe6iqkGEjFLRG81uuF80psiSfUOpTcg1hyfPZf3NknnSJK732FmP1Xa\na/3dxo0NDQ3VSx4GBgZUqVTqO6yWPCym5fvuu081McTDclzLl112mW655RbVTE5OamBgQAMDA/Xn\nYoqX5fDL4+NStRpPPCzHuSxNKUniiYflmctSEsX+qQndHo/FM/+fnySJxrMbELqV7BYp/zhc0m1K\nb1S8R9INks5w9225dT4m6V53HzWzJylNpn/H3Xc3bIvyD6U7q7bDKP9AUYODg5qeng4dBiIXyyVS\nNFuxQtqzJ3QU8Vm+XNq9u/t6iwm/x81iaZNO5R+9DKn3D3psSL0PmtlZktzdLzSzJ0sal/Tk7C3n\nuPslLbZDUt2gUqkwAggK4VhBEbH84UEz9k1rtEsz2qRZLG0y55pqd79S0m81PPfPucf3KK2rRgH5\nnuqtW7dSJ4tCli1bFjoElEKi2iVSoJ0kSfh7g644TnrDjIoB5JPniy++mBsVUcjOnTtDhwAAANoo\nVP7Rtw+j/EMSNdWYnZUrV5JYo6tYLpGiGfumNdqlGW3SLJY2mY8h9dCDfPI8PT1NTzXaGhsb0+Tk\npCRp165d9eNmw4YNTG+PlrLv6ACAeXZY6AAAtFcbcrKWTNceVyqVsIEhWtVqEjoElEDjkGlAKxwn\nvSGpDozkCAAAoPyoqQZKYtWqVdqxo3HeJQBlEUtNaGxol2a0SbNY2qRTTTU91YFxaQVFLV26NHQI\nAACgDZLqwGpTXwJAP/BFHUVwnKAIjpPeMPoHELH88It33HEHEwWhq/FxiUMDKDeXSS0LDBYvz/03\nViTVAeQTpYmJCQ0ODkoiUQIwdxMTVXEBDN3wtyZuJo+ifrgaOoAcs9hTam5UDG5kZIRxqlHI4OCg\npqenQ4eByMVyMw+asW9ao12a0SbNYmkTblSMGEkSilqyhAtLKCIJHQBKgFpZFMFx0hv+SgfGONXo\nhJpqAADKgfKPwJIkITlCIZQKoYhYLpGiGfumNdqlGW3SLJY2ofwjYgyph6Kuu+660CGgBDZtCh0B\nACxOJNWBXX311aFDQEncfvvtoUNACVSrSegQUALUyqIIjpPeUFMdQL5O9u6776ZOFgAAoORIqoGI\njY2NaXJyUlJ6o2LtS9eGDRs0PDwcMDLEii/mKILjBEVwnPSGGxUDyCdKmzdv1rp16ySRKKGzpUuX\n6le/+lXoMADMUiw3WsWGdmlGmzSLpU063ahIUh3YihUrtHv37tBhIFL5UqHR0VFtyu5Co1QI7TCi\nULxiSQqkuI6TmNolFrG0CcdJqzgY/SNahx3GLgDQPwwoBABh0FMd2OrVqxnVYYEza/mFNhh+Bxe2\nWHpz0Ix90xrt0ow2aRZLm3TqqeZGxcCOP/740CHgEOtXEmu2Uu47+7ItAADQX9QeBDA2Nlavid28\neXP98djYWOjQELHly5eFDgGlkIQOACXA+MMoguOkN/RUBzA8PFwf5WPp0qUctCjkzDPXhw4BAAC0\nQU11YEcddZT2798fOgwAC0QsdYdoxr5pjXZpRps0i6VNqKmOTH6c6l//+tdM6AGgb7JRFwEA84ya\n6gAqlcqMcYZrjyuVStjAEDXKhFBEtZqEDgFtuCztbovgXxJBDLV/rrhGSMJj+LvTG3qqA8gn1B/4\nwAc0MjISNB4AwKFn8iguX0uSkkSKaVKP0EEAfUBSHUB+lrxHHnmknlQzSx46SZJqLH8DETHOISiC\n4wRFcJz0hvIPoCRGR0NHAAAA2mH0j8BWrlypnTuZ0APdmSVyr4YOA5FLkoTepUjFMnqBFNdxElO7\nxCKWNuE4aRVH+9E/6KkO7MCBA6FDALCAjI+HjgAAFid6qgPI11SPjo5qUzYGFjXV6CSWb+mIG8dJ\nvNg3rdEuzWiTZrG0SaeeapLqwJYtW6aHHnoodBgogVhOKIgbx0m82Det0S7NaJNmsbQJ5R+RGRsb\nq/dK79u3r/54bGwsdGiI2MaNSegQUApJ6ABQAow/jCI4TnrDkHoBVCoV7d27V5K0efPmeskHk7+g\nk6Gh0BEAAIB26KkGSoJ6exRTDR0ASoDzCYrgOOkNPdUAsIBk9z0DKDlj9vYZli8PHUF39FQHMDU1\nNWMEkNrjqampsIEhatS2oYhqNQkdAkqA80nc3OP4JyXBY6j927079F7pjp7qAKipBgAAWFgYUm8O\nLLJrMwupbdFsZCT9B6CcYhkSLDa0S7zYN80Ypzpi69ev15VXXhk6DJQAJzeg3Pgdbo12iRf7phnj\nVEds/fr1oUNAaSShA0AJUCuLIjhOUEwSOoBSIakOjDpqAP00Ph46AgALxcaNoSMol0LlH2a2XtKY\n0iT8Inc/t8U6VUnnSTpC0n3u/pIW61D+AcwSl+FQBMdJvNg3rdEuKJM5lX+Y2WGSLpD0cknPknSG\nmT2zYZ1jJH1M0qvc/dmS3jDnqBcJbjwDAAAovyLlH2sk/djdt7v7I5IulfSahnXOlPR5d98hSe5+\nf3/DXLhGR5PQIaAkNm5MQoeAUkhCB4ASoKYaRXCc9KZIUr1K0p255buy5/KeIWmFmV1jZlvM7C39\nChBAamgodAQAAKCdrjXVZvY6SS939z/Olt8saY27vyu3zvmSni/ppZJ+Q9J3JJ3m7rc3bIua6gbU\nkgHoJ84p8WLftEa7oEw61VQXmVFxh6QTcsvHZ8/l3SXpfnffL2m/mX1D0u9Iur1hPQ0NDWlwcFCS\nNDAwoEqlUp9RsHaZYbEtS3HFwzLLLPe2vGKFtGdPulz7fX6sDGP+l9N5qcLGs2xZossvj2P/xLTM\n+Z7lMi0nSVUjI/HEE2I5SRKNZ8Mq1fLXdor0VB8u6TZJp0q6R9INks5w9225dZ4p6XxJ6yU9TtL1\nkt7k7rc2bIue6gZmidyrocNACSRJUv+FR1xi6mmL5TiJqU1iEVObxHKcSHG1C2YiR2k2p55qdz9o\nZmdL+poeG1Jvm5mdlb7sF7r7D83sKkk3SToo6cLGhBqtMQYkAABA+TFNOVASIyMMwRgretqa0SbN\naJPWaJd4sW+adeqpJqkGSoKTW7zYN81ok2a0SWu0S7zYN83mNPkLDq1aMTzQXRI6AJQA5xQUwXGC\nYpLQAZQKSTUAAACacN9Xbyj/AEqCy3DxYt80o02aWcsLxli+XNq9O3QUQDGUf0SMG88AYHFwj+df\nTPGQUGOhIKkObHQ0CR0CSmLjxiR0CCgBamVRTBI6AJQA55PekFQDJTE0FDoCAADQDjXVgVF3CJQf\nv8fNaJO4sX+A2aGmGgAAAD3hvq/ekFQHl4QOACVBbRuK4DhBEdyjgSK476s3JNWBMQYkAGC+cY8G\n0H/UVAMlMTLCpbhYUZ/ajDYByo/f42adaqpJqoGS4OQWL/ZNM9oEKD9+j5txo2LEqH9EcUnoAFAC\nnFNQBMcJiklCB1AqJNUAAABown1fvaH8AygJLsPFi33TjDaJG/doALND+UfEOKkBAObb6GjoCICF\nh6Q6MMbxfulAAAAXPklEQVSARFGMK4siqJVFMUnoAFACnE96Q1INlATjygIAEC9qqgOj7hAoP36P\nm9EmcWP/ALNDTTUAAAB6wn1fvSGpDi4JHQBKgto2FMFxgiK4RwNFcN9Xb0iqA2MMSADAfOMeDaD/\nqKkGSoJxZeNFfWoz2gQoP36Pm3WqqSapBkqCk1u82DfNaBOg/Pg9bsaNihGj/hHFJaEDQAlwTkER\nHCcoJgkdQKmQVAMAAKAJ9331hvIPoCS4DBcv9k0z2iRu3KMBzA7lHxHjpAYAmG+jo6EjABYekurA\nGAMSRTGubLxclnbNRvAviSAGmaVtgogloQNACVB735sloQMAUAzjysbL5PGUOiSJVK2GjiIt/wgd\nBADMI2qqA6PuECg/fo+b0SZxY/8As0NNNQAAAHrCfV+9IakOLgkdAEqC2jYUwXGCIrhHA0Vw31dv\nSKoDYwxIAMB84x4NoP+oqQZKgnFl40V9ajPaBCg/fo+bdaqpJqkGSoKTW7zYN81oE6D8+D1uxo2K\nEaP+EcUloQNACXBOQREcJygmCR1AqZBUAwAAoAn3ffWG8g+gJLgMFy/2TTPaJG7cowHMDuUfEeOk\nBgCYb6OjoSMAFh6S6sAYAzJuK1akPW4x/JOS4DGYpW2CeFEri2KS0AGgBDif9GZJ6ACAmO3ZE88l\n7CSRqtXQUdQSfAAAkEdNdWDUHcaN/dOMNmlGmzSjTeLG/gFmh5pqAAAA9IT7vnpTKKk2s/Vm9kMz\n+5GZ/XmH9V5gZo+Y2Wv7F+JCl4QOACVBbRuK4DhBERs3JqFDQAlw31dvuibVZnaYpAskvVzSsySd\nYWbPbLPeByVd1e8gFzLGgAQAzLehodARAAtP15pqMztF0iZ3f0W2/F5J7u7nNqz3bkkPS3qBpCvc\n/QsttkVNNUqFusNmtEkz2qQZbQKUH7/HzeZaU71K0p255buy5/If8BRJG9z9nyQxNgAAAAAWlX7d\nqDgmKV9rTWJdEPWPKIpjBUVwnKAIjhMUk4QOoFSKjFO9Q9IJueXjs+fyTpZ0qZmZpGMlvcLMHnH3\nLzVubGhoSIODg5KkgYEBVSoVVbPBd2u/5ItpeWpqKqp4WG5eluKIZ2pqKujnx9YeLLdergkdj5Qo\nHVs9zOez3Hk5lvMJy3Ev1+77iiWeEMtJkmh8fFyS6vlrO0Vqqg+XdJukUyXdI+kGSWe4+7Y26/+r\npMupqcZCQD1ZM9qkGW3SjDaJ28gIw6UBszGnmmp3PyjpbElfk/QDSZe6+zYzO8vM/rjVW+YU7SLD\nSQ0AMN9GR0NHACw8zKgYmFki92roMNBGTL1tSZLUL02FFFObxCKmNuE4QRH87UERsZxPYsKMigAA\nAMAhRE91YPTmxI3904w2aUabNKNN4sb+AWaHnmoAAAD0hPu+ekNSHVwSOgCURG2IH6ATjhMUsXFj\nEjoElMDoaBI6hFIpMk71grRihbRnT+goUhbJVDnLl0u7d4eOAiinWH6PY7F8eegI0MnQUOgIgIVn\n0dZUU0/WjDZpRps0o03ixv4B0C+cT5pRUw0AAAAcQiTVgVH/iKI4VlBMEjoAlADnExSThA6gVEiq\nAQAA0GTjxtARlAs11aijTZrRJs1ok7ixf1DEyAjDpQGzQU01ACwSmzaFjgBlMDoaOgJg4SGpDoy6\ntri5LO36i+BfEkEMMkvbBNGqVpPQIaAUktABoATIUXqzaMepBooweTyX0pNEqlZDR5GWF4QOAgCA\nyFBTjTrapBlt0ow2AcqP32NgdqipBgAAQE+4mbU3JNWBUa+EojhWUATHCYrYuDEJHQJKYHQ0CR1C\nqZBUA8ACMj4eOgKUwdBQ6AiAhYeaatTRJs1ok2a0SdzYPwD6xWxM7sOhw4gKNdUAAADo0QWhAygV\nkurAqH9EURwrKCYJHQBKgPPJwmdmc/4n3dGX7aTbWvhIqgEAABYYd5/Vv/POO0/r1q3TunXrJKn+\n+Lzzzpv1NmMp/T3USKoDq0YwmQfKgWMFxVRDB4ASSJJq6BCABYekGgAWkE2bQkeAMhgdDR0BsPCQ\nVAdGXRuK4lhBEdVqEjoElEISOgBEqlKpqFqt1q+O1h5XKpWwgZUASTUAAAAwR0tCB7DYUSeLojhW\nUATHCYqphg4AkZqamppxZbT2eGBggPNLFyTVAAAAkJSWf+zdu1eStHnz5noiTflHd4u2/MNl6dRj\ngf8lEcRQ++daHONIlhU11SiC4wRFbNyYhA4BWHAWbU+1yeOYyjdJpEgup5hJMTQJgNkbH4/mlIKI\nDQ2FjgCxovxj9mw+B+Q2M49lAHAzxZFUR4Q2aUabNKNN4sb+AdAvRx55pB5++OHQYUTFzOTuLS/t\nL9qeagAAAMw0NjamyclJSdIjjzxS753esGGDhoeHA0YWP3qqA0uSJJrLKbG0SUxiapNYjpWY2gTN\nzBK5V0OHgcjFcj5B3FatWqUdO3aEDiMqnXqqF+2NigAAAGjvuOOOCx1CqZBUB0ZPAYriWEEx1dAB\noASSpBo6BJTA2rVrQ4dQKiTVALCAbNoUOgKUweho6AhQBscee2zoEEqFpDowxpRFURwrKKJaTUKH\ngFJIQgeAEpieng4dQqkw+gfQhTEnzgzLl4eOAMDcTYlSIbSSJEm9E2diYkKDg4OS0hJEyhA7Y/QP\n1NEmcWP/AOgXsxG5j4QOA5EbGRnRyMhI6DCiwjjVAAAsENany2dmcy+sjqWjDIcG5R+9IakOjLFC\nUVwiLteiG84pC99sE9n8Zf3R0VFtyu5q5bI+2rn11ltDh1AqlH8EFtMfwFjaBK0xqQeKGBpKND5e\nDR0GIpddwg4dBiLH5C/NOpV/kFSjjjaJG/sHRXCcoJ21a9fqu9/9riTp17/+tR73uMdJkk4++WRd\ne+21IUNDRLii0Rk11cACwPjDAOaiUqnorrvukiRt375dK1eurD8PYO5IqgOLqfwDcUvHH64GjgLx\nS8RxglZWr15dHx5t+/bt9cerV68OFxSiMzU1NWNehNrjgYEB8pUuKP8ILKakOpY2QWsxHSuIF7X3\nKIKaahSxZMkSHThwIHQYUZlz+YeZrZc0pnQGxovc/dyG18+U9OfZ4i8k/Q93v3n2IS8eJEkoimMF\nxVRDB4BIjY2NaXJysr5cO6ds2LBBw8PDgaJCbPI11QcPHqyPU01NdXddk2ozO0zSBZJOlXS3pC1m\n9kV3/2FutZ9IerG7P5gl4J+QdMqhCBgA0B6192jn9ttvnzHucO3x7bffHiYgYIEp0lO9RtKP3X27\nJJnZpZJeI6meVLv7dbn1r5O0qp9BLmRc0kdRHCsogtp7AHOR75GenJxkRsUeFEmqV0m6M7d8l9JE\nu53/W9JX5xIUgGbj4xI5NYDZuuCCC3TBBRdISutCmS0P3ezfvz90CKXS19E/zOwlkt4maW0/t7uQ\n0fOIoiYmqhofDx0FYsc5Be3ka2UlUSuLro466qjQIZRKkaR6h6QTcsvHZ8/NYGbPlXShpPXuvqfd\nxoaGhurD+AwMDKhSqdR/mWu/7CyHWZYSJUk88bA8c5n9wzLLLM9leWpqSnnT09P1v8cxxMdyfMsD\nAwNRxRNiOUkSjWc9WrXfl3a6DqlnZodLuk3pjYr3SLpB0hnuvi23zgmS/lPSWxrqqxu3xZB6DZIk\nqe/E0GJpE7TGUGkoIqZzCuKSJEk9WWCmPLSTHyVm8+bNWrdunSRGiamZ05B67n7QzM6W9DU9NqTe\nNjM7K33ZL5T0fkkrJP2jmZmkR9y9U911FKxlkyxey5eHjgDAXFF7j3aY1ANFDA8P15PnSqUy45hB\nZ4Vqqt39Skm/1fDcP+cev0PSO/ob2qEVS4+sWTWaWBC7augAUALU3qOdSqWivXv3Skp7IGuJNNOU\no51a+QeKOSx0AACKYfxhAMB82rBhQ+gQSqWvo39gNhLRA4kiGH8YxSTiOEErlH+gV1zF6A1JNQAA\ni0C+VnbFihXUygJ9RvlHcNXQAaAk6ElCMdXQAaAETjjhhO4rYdFrHIYRnZFUB0adLIB+4pyCIoaG\nhkKHgBKoDa2HYkiqA0vrZIHuuFSLIjinoAhqZVHEXXfdFTqEUiGpBkqCYdIA9Mtll10WOgREamxs\nrD4h0B133FF/PDY2Fjq06HWdUbGvHxbRjIpA2TDjJYB+qU2/DHQyODio6enp0GFEZU4zKgIAAGBx\nyE9nv337do2MjEhiOvsi6KkOLEkSDlIUYpbIvRo6DESOcwraGRsbq994tnnzZq1bt05SOsFHbag9\nIG/NmjW64YYbQocRFXqqIzY+LvH3D0C/cE5BO/lxqlevXk35B7o66aSTQodQKvRUB0adLIriWEER\nHCcoglpZFMGVr2adeqoZ/QMoCcYfBtAvT3ziE0OHACw4lH8El4gZ0FBEOv5wNXAUiF8ijhO0kr8B\nbcuWLdyAhq7Gx8c5NnpAUg0AwCKQT54//OEP15NqAP1BUh1cNXQAKAl6C1BMNXQAiFS+p3rfvn30\nVKOl/HEyMTGhwcFBSRwnRZBUB0adLIB+4pyCdqampmaM+FF7PDAwQLKEunzyPD09zRWNHjD6R2Dc\nWYuiOFZQBMcJishGMAgdBiJXqVQ0NTUVOoyoMPoHsACMj4eOAECZnX322RocHKxfzq89Pvvss8MG\nhmitXLkydAilQvlHYPQooaiJiSqJNbrinIJ2Vq9eXU+ot2/fXn+8evXqcEEhOvma6quuuora+x5Q\n/gGUBJN6AOiXJUuW6MCBA6HDQOSGhoY0Tm/ODJR/RIxpYlFcEjoAlADnFBRxxBFHhA4BJUA9dW8o\n/whsfFziagqAfuGcgnbyl/X379/PZX10deSRR4YOoVQo/wiMS/ooimMFRXCcoIiRkRGGSkNL+S9f\no6Oj2pSN08mXr1Sn8g96qoGSYPxhAMChlk+ekyThy1cPSKqDS8QMaCiiWk3EsYLuEnGcoJuBgYHQ\nISBS+Z7qzZs3UybUA5JqAAAASJqZPH/0ox+lp7oHjP4RXDV0ACgJeghQTDV0ACgBRnVAEUcffXTo\nEEqFnurAqJMF0E+cU1AESTXaGRsb0+TkpKR0kqBah86GDRs0PDwcMLL4MfpHYEmS0AOJQjhWUATH\nCdphVAf0asWKFdq9e3foMKLC5C/AAsCkVgCA+fTwww+HDqFUKP8IjN4BFDUxUSWxRlecUwDMRf6K\nxr59+xj9owck1QAALAL5pOjjH/84ozoAfUZNdWDUP6Ios0Tu1dBhIHKcU9AONdXoVaVS4abWBsyo\nGLHxcYlzGYB+4ZyCdvLJ88UXX0xPNbpauXJl6BBKhZ7qwMwkmgRFcKygCI4TFFGtVuu91kA7XPlq\nRk81sAAw/jCAuWD6afSK46I39FQHRp0siqLHAEVwTkER69ev15VXXhk6DESOvzvNGKcaAADU3Xzz\nzaFDABYceqoDo/4RQD9xTkERK1eu1M6dO0OHAZQONdURo04WQD+9/OVjkoZDh4EI5Wuqd+3aRU01\n0GeUfwRWrSahQ0BJcKc+irj55g+FDgGRmpqampFY1x4zDjHa4e9Ob+ipBkqC8YcXPrOWVxSDbIdS\nvYVneHhYw8PpVYylS5eSMAF9RlIdGJfcUNTERFXj46GjwKE020R2bGxMk5OTktKh0tatWydJ2rBh\nQz2JAvK91Pv376f8A+gzkmoAABYBZlRErxhSrzeFRv8ws/WSxpTWYF/k7ue2WOejkl4haZ+kIXdv\nKtJi9I9mHLAoivGHUUR2Z3roMBChfE/16OioNmV3ytNTjXaGhoY0ziXSGeY0+oeZHSbpAkmnSrpb\n0hYz+6K7/zC3ziskPc3dn25mL5T0cUmn9CX6Be6DH5ziZIaCpiRVQweBCOWTJUlc1kdL+ePhE5/4\nBD3VaCl/PpmYmNDg4KAkzidFdO2pNrNTJG1y91dky++V5PneajP7uKRr3P0z2fI2SVV339WwLXqq\nG5iNyH0kdBgoAY4VtPOc5zxH27ZtkyQdPHhQhx9+uCTpt3/7t5nkAy0NDg5qeno6dBiIXKVSYXSY\nBnMdp3qVpDtzy3dJWtNlnR3Zc7sEoC+ye88AYFbyPZDbt2/niga62rt3b+gQSoUbFYObDh0ASmJw\ncDp0CIjU29/+9hmjf6xdu1ZSOvoHUJNPnicnJyn/QFdLlpAm9qJo+ceIu6/PlouUf/xQ0rpW5R99\njh8AAACYN3Mp/9giabWZnSjpHkl/KOmMhnW+JOmdkj6TJeF7GxPqTkEAAAAAZdY1qXb3g2Z2tqSv\n6bEh9baZ2Vnpy36hu3/FzE4zs9uVDqn3tkMbNgAAABCPQuNUAwAAAGjvsNABLFZmdpGZ7TKzm0LH\ngniZ2fFm9nUz+4GZ3Wxm7wodE+JjZo8zs+vN7MbsONkUOibEy8wOM7Pvm9mXQseCeJnZtJltzc4r\nN4SOpwzoqQ7EzNZKekjSJ939uaHjQZzMbKWkle4+ZWbLJH1P0mvyky8BkmRmj3f3X5rZ4ZK+Jeld\n7s4fQjQxsz+V9HxJR7v76aHjQZzM7CeSnu/ue0LHUhb0VAfi7tdK4kBFR+6+092nsscPSdqmdAx4\nYAZ3/2X28HFK75ehxwRNzOx4SadJ+pfQsSB6JvLEntBYQEmY2aCkiqTrw0aCGGWX9G+UtFPS1e6+\nJXRMiNJ5kv5MfOlCdy7pajPbYmbvCB1MGZBUAyWQlX5cJundWY81MIO7P+ruvyvpeEkvNLOTQseE\nuJjZKyXtyq5+WfYPaOdF7v48pVc23pmVraIDkmogcma2RGlC/Sl3/2LoeBA3d/+5pGskrQ8dC6Lz\nIkmnZ7Wyl0h6iZl9MnBMiJS735P9/z5J/y5pTdiI4kdSHRY9BSji/0i61d3/IXQgiJOZHWtmx2SP\nl0r6A0nczIoZ3P0v3f0Ed3+q0oncvu7ubw0dF+JjZo/PrpDKzH5D0ssk3RI2qviRVAdiZp+W9G1J\nzzCzn5kZE+agiZm9SNIfSXppNqzR982MHkg0erKka8xsSmnN/VXu/pXAMQEorydJuja7T+M6SZe7\n+9cCxxQ9htQDAAAA5oieagAAAGCOSKoBAACAOSKpBgAAAOaIpBoAAACYI5JqAAAAYI5IqgEAAIA5\nIqkGgHlkZseY2f+Y58880czOmM/PBIDFhqQaAObXckl/0uoFMzv8EH3mf5V0Zi9vOISxAMCCRFIN\nAPPrHElPzWbHPNfM1pnZN8zsi5J+kPUq31xb2czeY2Z/lT1+qpl91cy2mNlmM3tG48bN7MW52Te/\nl00xfI6ktdlz784+4xtm9t3s3ynZextjebyZXZFt7yYze8O8tBAAlNCS0AEAwCLzXknPcvfnSWki\nK+l3s+d+ZmYnSmo31e2Fks5y9zvMbI2kf5J0asM6/1vSn7j7d8zs8ZL2Z5/5Hnc/PfvMoyT9N3d/\n2MxWS7pE0guy9+djea2kHe7+qux9T+hLCwDAAkRSDQDh3eDuP+u0Qtbj/HuSPmdmlj19RItVvyXp\nPDP7N0lfcPcdj61ed6SkC8ysIumgpKe3ieVmSR82s3Mkfdndr+3ppwKARYSkGgDC25d7fEBSvp75\nqOz/h0naU+vhbsfdzzWzKyS9UtK3zOxlLVb7U0k73f25We30r1rF4u4/NrPnSTpN0t+a2X+4+98W\n/qkAYBGhphoA5tcvJHUqo9gl6TgzW25mj5P0Kkly919I+qmZvb62opk9t/HNZvZUd/+Bu/+9pC2S\nnpl95tG51Y6RdE/2+K2amcTnt/VkSb9y909L+pCkjgk9ACxm9FQDwDxy991m9i0zu0nSVyV9peH1\nA2b210oT4rskbcu9/GZJ/2Rm71N6/r5U0k0NHzFsZi9RWtbxg+wzXNJBM7tR0rikj0n6gpm9VdKV\nmtlTnvccSR8ys0clPSxpXocCBIAyMfd298MAAAAAKILyDwAAAGCOSKoBAACAOSKpBgAAAOaIpBoA\nAACYI5JqAAAAYI5IqgEAAIA5IqkGAAAA5oikGgAAAJij/x8uG/vkI+c/JwAAAABJRU5ErkJggg==\n" }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "\n", "probpos = pd.DataFrame({\"out-of-sample prob positive\":probs[[3,4]].sum(axis=1), \n", " \"true stars\":[r['y'] for r in revtest]})\n", "probpos.boxplot(\"out-of-sample prob positive\",by=\"true stars\", figsize=(12,5))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 1 }