{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "<script>\n", " function code_toggle() {\n", " if (code_shown){\n", " $('div.input').hide('500');\n", " $('#toggleButton').val('Show Code')\n", " } else {\n", " $('div.input').show('500');\n", " $('#toggleButton').val('Hide Code')\n", " }\n", " code_shown = !code_shown\n", " }\n", "\n", " $( document ).ready(function(){\n", " code_shown=false;\n", " $('div.input').hide()\n", " });\n", "</script>\n", "<form action=\"javascript:code_toggle()\"><input type=\"submit\" id=\"toggleButton\" value=\"Show Code\"></form>" ], "text/plain": [ "<IPython.core.display.HTML object>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "<script>\n", " function code_toggle() {\n", " if (code_shown){\n", " $('div.input').hide('500');\n", " $('#toggleButton').val('Show Code')\n", " } else {\n", " $('div.input').show('500');\n", " $('#toggleButton').val('Hide Code')\n", " }\n", " code_shown = !code_shown\n", " }\n", "\n", " $( document ).ready(function(){\n", " code_shown=false;\n", " $('div.input').hide()\n", " });\n", "</script>\n", "<form action=\"javascript:code_toggle()\"><input type=\"submit\" id=\"toggleButton\" value=\"Show Code\"></form>" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "%load_ext autoreload\n", "%autoreload 2\n", "%matplotlib inline\n", "# %cd .. 
\n", "import sys\n", "sys.path.append(\"..\")\n", "import statnlpbook.util as util\n", "import statnlpbook.sequence as seq\n", "import matplotlib\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "matplotlib.rcParams['figure.figsize'] = (10.0, 6.0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!---\n", "Latex Macros\n", "-->\n", "$$\n", "\\newcommand{\\Xs}{\\mathcal{X}}\n", "\\newcommand{\\Ys}{\\mathcal{Y}}\n", "\\newcommand{\\y}{\\mathbf{y}}\n", "\\newcommand{\\balpha}{\\boldsymbol{\\alpha}}\n", "\\newcommand{\\bbeta}{\\boldsymbol{\\beta}}\n", "\\newcommand{\\aligns}{\\mathbf{a}}\n", "\\newcommand{\\align}{a}\n", "\\newcommand{\\source}{\\mathbf{s}}\n", "\\newcommand{\\target}{\\mathbf{t}}\n", "\\newcommand{\\ssource}{s}\n", "\\newcommand{\\starget}{t}\n", "\\newcommand{\\repr}{\\mathbf{f}}\n", "\\newcommand{\\repry}{\\mathbf{g}}\n", "\\newcommand{\\x}{\\mathbf{x}}\n", "\\newcommand{\\prob}{p}\n", "\\newcommand{\\vocab}{V}\n", "\\newcommand{\\params}{\\boldsymbol{\\theta}}\n", "\\newcommand{\\param}{\\theta}\n", "\\DeclareMathOperator{\\perplexity}{PP}\n", "\\DeclareMathOperator{\\argmax}{argmax}\n", "\\DeclareMathOperator{\\argmin}{argmin}\n", "\\newcommand{\\train}{\\mathcal{D}}\n", "\\newcommand{\\counts}[2]{\\#_{#1}(#2) }\n", "\\newcommand{\\length}[1]{\\text{length}(#1) }\n", "\\newcommand{\\indi}{\\mathbb{I}}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Sequence Labelling \n", "\n", "Many real-world applications can be cast as *sequence labelling* problems that involve assigning a label to each element in a sequence. For example, in *Part-Of-Speech tagging* each token in a sentence is assigned a part-of-speech such as verb or determiner that indicates the syntactic type of the token. In *Named Entity Tagging* we label each token with the type of entity the token refers to, such as \"Person\" or \"Organisation\", or \"None\" if the token does not refer to an entity. 
\n", "\n", "## Sequence Labelling as Structured Prediction\n", "\n", "The problem of sequence labelling is an obvious (and somewhat canonical) instance of structured prediction. Here the input space \\\\(\\Xs\\\\) consists of sequences of words and the output space $\\Ys$ of sequences of output labels. Our goal is again to define a model \\\\(s_{\\params}(\\x,\\y)\\\\) that assigns high *scores* to the label sequence \\\\(\\y=y_1 \\ldots y_n\\\\) that fits the input text \\\\(\\x=x_1 \\ldots x_n\\\\), and lower scores otherwise. The model will be parametrized by \\\\(\\params\\\\), and we will learn these parameters from some training set \\\\(\\train\\\\) of \\\\((\\x,\\y)\\\\) pairs. In contrast to the classification scenario, the prediction problem $\\argmax_\\y s_{\\params}(\\x,\\y)$ is now non-trivial in general, as we have to search through an exponential number of label sequences. In practice this issue is overcome by making assumptions about the factorization structure of $s_{\\params}(\\x,\\y)$ and/or by using search approximations that give up the guarantee of finding the true optimum of the search problem in exchange for more expressiveness. \n", "\n", "In this chapter we will restrict ourselves to scoring functions $s_{\\params}(\\x,\\y)$ of the form $\\prob_\\params(\\y|\\x)$ that model probability distributions over label sequences $\\y$ conditioned on input sequences $\\x$. This is the sequence equivalent of the conditional models we discussed in the [text classification](doc_classify.ipynb) chapter. Other choices are possible but omitted here because they are either very similar (structured SVMs, cite) or less effective in *supervised* sequence labelling (HMMs, cite). \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part-of-Speech Tagging as Sequence Labelling\n", "Part-of-Speech (PoS) tagging is an important task within NLP. It is a standard pre-processing step in many tasks. For example, most dependency parsers assume PoS-tagged sentences as input. 
Likewise, [Reverb](reverb), one of the most effective relation extraction methods, defines relations in terms of PoS sequences.\n", "\n", "Traditionally, and based on the existence of corresponding annotated training sets, PoS tagging has been applied to quite restricted domains such as newswire or biomedical texts. Recently there has been increasing interest in NLP in general, and PoS tagging in particular, for social media data. Here we will focus on PoS tagging for tweets and use the [Tweebank dataset](http://www.cs.cmu.edu/~ark/TweetNLP/#pos) and the [\"october 27\" splits](https://github.com/brendano/ark-tweet-nlp/tree/master/data/twpos-data-v0.3/oct27.splits).\n", "\n", "Let us load the data and look at an example tagged sentence." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"I/O predict/V I/O won't/V win/V a/D single/A game/N I/O bet/V on/P ./, Got/V Cliff/^ Lee/^ today/N ,/, so/P if/P he/O loses/V its/L on/P me/O RT/~ @e_one/@ :/~ Texas/^ (/, cont/~ )/, http://tl.gd/6meogh/U\"" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train = seq.load_tweebank(\"../data/oct27.splits/oct27.train\")\n", "dev = seq.load_tweebank(\"../data/oct27.splits/oct27.dev\")\n", "test = seq.load_tweebank(\"../data/oct27.splits/oct27.test\")\n", "\" \".join([w + \"/\" + t for w,t in zip(train[0][0],train[0][1])])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have printed the tokens of the first training tweet, each paired with its PoS tag. The tags (such as \"O\", \"V\" and \"^\") are described in the [Tweebank annotation guideline](http://www.cs.cmu.edu/~ark/TweetNLP/annot_guidelines.pdf). For example, \"O\" denotes pronouns, \"V\" verbs and \"^\" proper nouns. 
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# count tags here?`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Local Models / Classifiers\n", "We will tackle sequence labelling as a (discriminative) structured prediction problem. This means we will build a model $p_\\params(\\y|\\x)$ that computes the conditional probability of output label sequence $\\y$ given input sequence $\\x$. We will first consider the simplest type of model: a *fully factorised* or *local* model. In this model the probability of $\\x$ and $\\y$ is a product of *local* probabilities for the label $y_i$ of each token:\n", "\n", "$$\n", "p_\\params(\\y|\\x) = \\prod_{i=1}^n p_\\params(y_i|\\x,i)\n", "$$\n", "\n", "In this model all labels are independent of each other. This assumption has a crucial benefit: inference (and hence training) in this model is trivial. To find the most likely assignment of labels under this model you can find the most likely tag for each token independently. Notice that each local term conditions on the complete input $\\x$, not just on $x_i$. This is important, as the sentential context at position $i$ is often important to determine the tag at $i$. \n", "\n", "It is common to indicate this independence structure in a factor graph, a graphical representation of the model. In this representation each variable of the model (our per-token tag labels and the input sequence $\\x$) is drawn using a circle, and *observed* variables are shaded. Each factor in the model (terms in the product) is drawn as a box that connects the variables that appear in the corresponding term. For example, the term $p_\\params(y_3|\\x,3)$ would connect the variables $y_3$ and $\\x$. 
" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n", "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n", " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n", "<!-- Generated by graphviz version 2.38.0 (20140413.2041)\n", " -->\n", "<!-- Title: %3 Pages: 1 -->\n", "<svg width=\"655pt\" height=\"178pt\"\n", " viewBox=\"0.00 0.00 655.00 178.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n", "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 173.997)\">\n", "<title>%3</title>\n", "<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-173.997 651,-173.997 651,4 -4,4\"/>\n", "<!-- x -->\n", "<g id=\"node1\" class=\"node\"><title>x</title>\n", "<ellipse fill=\"lightgrey\" stroke=\"black\" cx=\"323.5\" cy=\"-18\" rx=\"18\" ry=\"18\"/>\n", "<text text-anchor=\"middle\" x=\"323.5\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">x</text>\n", "</g>\n", "<!-- p(y1 | x, 1) -->\n", "<g id=\"node2\" class=\"node\"><title>p(y1 | x, 1)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"77,-95 0,-95 0,-72 77,-72 77,-95\"/>\n", "<text text-anchor=\"middle\" x=\"38.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y1 | x, 1)</text>\n", "</g>\n", "<!-- p(y1 | x, 1)--x -->\n", "<g id=\"edge2\" class=\"edge\"><title>p(y1 | x, 1)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M77.2282,-73.8458C80.0274,-73.2209 82.8071,-72.6006 85.5,-72 167.926,-53.6152 266.422,-31.6977 305.883,-22.9188\"/>\n", "</g>\n", "<!-- y1 -->\n", "<g id=\"node3\" class=\"node\"><title>y1</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"38.5\" cy=\"-150.498\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"38.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y1</text>\n", "</g>\n", "<!-- y1--p(y1 | x, 
1) -->\n", "<g id=\"edge1\" class=\"edge\"><title>y1--p(y1 | x, 1)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M38.5,-130.698C38.5,-119.15 38.5,-104.778 38.5,-95.1581\"/>\n", "</g>\n", "<!-- p(y2 | x, 2) -->\n", "<g id=\"node4\" class=\"node\"><title>p(y2 | x, 2)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"172,-95 95,-95 95,-72 172,-72 172,-95\"/>\n", "<text text-anchor=\"middle\" x=\"133.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y2 | x, 2)</text>\n", "</g>\n", "<!-- p(y2 | x, 2)--x -->\n", "<g id=\"edge4\" class=\"edge\"><title>p(y2 | x, 2)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M165.302,-71.8715C205.868,-58.3139 274.571,-35.3525 306.388,-24.7191\"/>\n", "</g>\n", "<!-- y2 -->\n", "<g id=\"node5\" class=\"node\"><title>y2</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"133.5\" cy=\"-150.498\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"133.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y2</text>\n", "</g>\n", "<!-- y2--p(y2 | x, 2) -->\n", "<g id=\"edge3\" class=\"edge\"><title>y2--p(y2 | x, 2)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M133.5,-130.698C133.5,-119.15 133.5,-104.778 133.5,-95.1581\"/>\n", "</g>\n", "<!-- p(y3 | x, 3) -->\n", "<g id=\"node6\" class=\"node\"><title>p(y3 | x, 3)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"267,-95 190,-95 190,-72 267,-72 267,-95\"/>\n", "<text text-anchor=\"middle\" x=\"228.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y3 | x, 3)</text>\n", "</g>\n", "<!-- p(y3 | x, 3)--x -->\n", "<g id=\"edge6\" class=\"edge\"><title>p(y3 | x, 3)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M244.615,-71.7281C262.614,-59.6974 291.495,-40.393 308.868,-28.7805\"/>\n", "</g>\n", "<!-- y3 -->\n", "<g id=\"node7\" class=\"node\"><title>y3</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"228.5\" cy=\"-150.498\" 
rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"228.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y3</text>\n", "</g>\n", "<!-- y3--p(y3 | x, 3) -->\n", "<g id=\"edge5\" class=\"edge\"><title>y3--p(y3 | x, 3)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M228.5,-130.698C228.5,-119.15 228.5,-104.778 228.5,-95.1581\"/>\n", "</g>\n", "<!-- p(y4 | x, 4) -->\n", "<g id=\"node8\" class=\"node\"><title>p(y4 | x, 4)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"362,-95 285,-95 285,-72 362,-72 362,-95\"/>\n", "<text text-anchor=\"middle\" x=\"323.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y4 | x, 4)</text>\n", "</g>\n", "<!-- p(y4 | x, 4)--x -->\n", "<g id=\"edge8\" class=\"edge\"><title>p(y4 | x, 4)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M323.5,-71.7281C323.5,-61.962 323.5,-47.4028 323.5,-36.056\"/>\n", "</g>\n", "<!-- y4 -->\n", "<g id=\"node9\" class=\"node\"><title>y4</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"323.5\" cy=\"-150.498\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"323.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y4</text>\n", "</g>\n", "<!-- y4--p(y4 | x, 4) -->\n", "<g id=\"edge7\" class=\"edge\"><title>y4--p(y4 | x, 4)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M323.5,-130.698C323.5,-119.15 323.5,-104.778 323.5,-95.1581\"/>\n", "</g>\n", "<!-- p(y5 | x, 5) -->\n", "<g id=\"node10\" class=\"node\"><title>p(y5 | x, 5)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"457,-95 380,-95 380,-72 457,-72 457,-95\"/>\n", "<text text-anchor=\"middle\" x=\"418.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y5 | x, 5)</text>\n", "</g>\n", "<!-- p(y5 | x, 5)--x -->\n", "<g id=\"edge10\" class=\"edge\"><title>p(y5 | x, 5)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M402.385,-71.7281C384.386,-59.6974 355.505,-40.393 
338.132,-28.7805\"/>\n", "</g>\n", "<!-- y5 -->\n", "<g id=\"node11\" class=\"node\"><title>y5</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"418.5\" cy=\"-150.498\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"418.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y5</text>\n", "</g>\n", "<!-- y5--p(y5 | x, 5) -->\n", "<g id=\"edge9\" class=\"edge\"><title>y5--p(y5 | x, 5)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M418.5,-130.698C418.5,-119.15 418.5,-104.778 418.5,-95.1581\"/>\n", "</g>\n", "<!-- p(y6 | x, 6) -->\n", "<g id=\"node12\" class=\"node\"><title>p(y6 | x, 6)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"552,-95 475,-95 475,-72 552,-72 552,-95\"/>\n", "<text text-anchor=\"middle\" x=\"513.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y6 | x, 6)</text>\n", "</g>\n", "<!-- p(y6 | x, 6)--x -->\n", "<g id=\"edge12\" class=\"edge\"><title>p(y6 | x, 6)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M481.698,-71.8715C441.132,-58.3139 372.429,-35.3525 340.612,-24.7191\"/>\n", "</g>\n", "<!-- y6 -->\n", "<g id=\"node13\" class=\"node\"><title>y6</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"513.5\" cy=\"-150.498\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"513.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y6</text>\n", "</g>\n", "<!-- y6--p(y6 | x, 6) -->\n", "<g id=\"edge11\" class=\"edge\"><title>y6--p(y6 | x, 6)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M513.5,-130.698C513.5,-119.15 513.5,-104.778 513.5,-95.1581\"/>\n", "</g>\n", "<!-- p(y7 | x, 7) -->\n", "<g id=\"node14\" class=\"node\"><title>p(y7 | x, 7)</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"647,-95 570,-95 570,-72 647,-72 647,-95\"/>\n", "<text text-anchor=\"middle\" x=\"608.5\" y=\"-79.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">p(y7 | x, 7)</text>\n", "</g>\n", 
"<!-- p(y7 | x, 7)--x -->\n", "<g id=\"edge14\" class=\"edge\"><title>p(y7 | x, 7)--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M569.767,-73.87C507.062,-59.899 385.81,-32.883 340.837,-22.8629\"/>\n", "</g>\n", "<!-- y7 -->\n", "<g id=\"node15\" class=\"node\"><title>y7</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"608.5\" cy=\"-150.498\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"608.5\" y=\"-146.798\" font-family=\"Times,serif\" font-size=\"14.00\">y7</text>\n", "</g>\n", "<!-- y7--p(y7 | x, 7) -->\n", "<g id=\"edge13\" class=\"edge\"><title>y7--p(y7 | x, 7)</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M608.5,-130.698C608.5,-119.15 608.5,-104.778 608.5,-95.1581\"/>\n", "</g>\n", "</g>\n", "</svg>\n" ], "text/plain": [ "<graphviz.dot.Graph at 0x7f44786c3748>" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seq.draw_local_fg(7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The independence assumptions of this model allow us to think of it not as a single sequence model, but a sequence of classification models. We have a classifier $p_\\params(y|\\x,i)$ that predicts the best class/tag for a token based on its sentence $\\x$ and position $i$.\n", "\n", "$$\n", " p_\\params(y|\\x,i) = \\frac{1}{Z_\\x} \\exp \\langle \\repr(\\x,i),\\params_y \\rangle\n", "$$\n", "\n", "Let us define a simple version of this model on the PoS tagging task defined above. Here we use a feature function (template) that has one active feature corresponding to the word $x_i$ observed at index $i$. 
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def feat_1(x,i):\n", " return {\n", " 'word':x[i]\n", " }\n", "local_1 = seq.LocalSequenceLabeler(feat_1, train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that in the above feature functions return a dictionary mapping a feature name (`word`) to a feature value (`x[i]`, the word at index `i`). This is not quite what we require `repr(\\x,i)` to be, namely a function from $\\x$ and $i$ to a real feature vector. However, internally this dictionary-based representation can be transformed into the desired form by defining a feature function template like so:\n", "\n", "$$\n", "f_{\\text{word},w}(\\x,i) = \\indi\\left[x_i = w\\right]\n", "$$\n", "\n", "This means that we have one feature function $f_{\\text{word},w}$ per word $w$. For example, the feature function $f_{\\text{word},\\text{\"the\"}}$ returns $1$ if $x_i = \\text{\"the\"}$ and $0$ otherwise. \n", "\n", "We can assess the accuracy of this model on the development set." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.6383993365125441" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seq.accuracy(dev, local_1.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a good start, but a macro-level evaluation score does not really show us where the approach fails and can be improved. One alternative view on the system is a confusion matrix that shows, for each pair of labels $l_1$ and $l_2$, how often $l_1$ was classified as $l_2$. 
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAfEAAAGoCAYAAABWs9xCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu8JGV56Pvfs+bCbQRnWMNsLoOgAoo3hCVe91EgCqgR\njAEHjeLxMlFxu0VNBJMdMZEtZ2tiTlRwz4kKMSoSFWWfJF6CsqPRqAMhICCCIAJBmEFMxMjAzHr2\nH1VL2pVZa3V1Vc3qrv5951Of6a6ueuvp7lr91PvWW29FZiJJkkbPxGIHIEmSBmMSlyRpRJnEJUka\nUSZxSZJGlElckqQRZRKXJGlEmcQlSRpRJnFJkkaUSVySpBG1tO0NrFw1mfuu3b9WGTst9VhjmDQx\nxl80UMawqPt5dOmzkLbniisu35yZqxc7DoAluz8sc+svGikrf7Hpi5l5XCOFDaj1JL7v2v256G+/\nVquMh++1W0PRqAlNDNUb0Z3UVffz6NJnIW3PLsvilsWOYUZu/QU7HXJyI2Xdd+UHJxspqIbWk7gk\nScMjILrTumsSlySNjwA61PrVncMRSZLGTK2aeEScBdybme9tJhxJklpmc7okSSPK5nRJkrTYWqmJ\nR8R6YD3A3vuubWMTkiQNoFu901t5J5m5ITOnMnNq1Z6LfhmdJEkPimhm6mtT8cOIuDoiroyIjeW8\nVRHx5Yi4ofx/Zc/yZ0bEjRFxfUQcu1D5tZJ4Zp5lpzZJkuZ1VGYelplT5fMzgEsz8yDg0vI5EXEo\nsA54DHAccG5ELJmv4O60KUiStJCgaE5vYhrcCcAF5eMLgBN75l+YmVsy82bgRuDI+QqqFUVEvDYi\nXl6nDEmSdpyGmtL77+GewN9FxOVlfzGANZl5R/n4x8Ca8vG+wK09695WzptTrY5tmfmhOutLkjTC\nJmfOc5c2ZOaGWcs8IzNvj4i9gC9HxPd6X8zMjIiBb8DgdeKSpPHSXO/0zT3nubcrM28v/78rIi6m\naB6/MyL2zsw7ImJv4K5y8duB3ku69ivnzclz4pKk8bKDmtMjYreIeMjMY+A5wHeBS4BTy8VOBT5f\nPr4EWBcRO0XEgcBBwLfn24Y1cUmS2rEGuLi83fBS4BOZ+YWI+A5wUUS8CrgFOBkgM6+JiIuAa4Gt\nwGmZuW2+DbSexJdMBA/ddVmtMqan69+/emKiO8PsLTbvf/2r/DykUbLjBnvJzJuAJ2xn/t3AMXOs\nczZwdr/bsCYuSRof3opUkiQNA2vikqTx0qGx003ikqQx0q0boAyUxCPi3cCXgD2AR2fmuxuNSpIk\nLWjQw5EnA/8IPBP4++bCkSSpZRPRzDQEKtXEI+I9wLHAgcA3gUcAx0TEpzPzD1uIT5IkzaFSEs/M\n3ykvRH858Gbgssx8+uzlykHe1wPsu9/+TcQpSVJ9M3cx64hB3snhwD8DjwKu294CmbkhM6cyc2rP\nyck68UmS1KwdexezVvVdE4+Iw4DzKQZk3wzsWsyOK4GnZuYvWolQkiRtV9818cy8MjMPA74PHAp8\nBTg2Mw8zgUuSRkN5iVkT0xCo2rFtNXBPZk5HxKMy89qW4pIkqR1D0hTehKod2zYBzysfP6WViCRJ\nUl8csU2SNF6GpCm8CSZxSdL4GKKe5U3ozuGIJEljZofUxLPm+hMNDG/38/u21lp/t51ttNDwmp6u\n+1fWzN+ZNBJsTpckaUTZnC5JkhabNXFJ0hjxfuKSJI0um9MhIg6IiFc0GIskSapgoJp4RLwOeCOw\nokzk6zLzx00GJklS4zp2K9LKSTwiHgK8EzgOeDxwGfDzZsOSJKkN3TonPsg7maa49HsVQGb+MDN/\n1rtARKyPiI0RsfHuzZsbC
FOSJM1WOYln5s+B1wDvBv4oIt4bEbvOWmZDZk5l5tSek5MNhSpJUgNm\nhl6tOw2BgdoUMvMS4CTgfwCrgbc0GZQkSa0Z1/uJA0TECmDP8unPgOsom9YlSdKOM0jv9GXA/6RI\n5JPAj4CXNBmUJEmtGZKm8CZUTuKZeQ9wXEQcADwrM89vOCZJktoR9k6f8VPgyqYCkSRJ1Qw87Gpm\nmsQlSaNnnJvTK29gIthzxfK2N7OguvcDv+mu+uPZPHyv3WqXIW2P9wKX+hcdSuLdOTEgSdKY8S5m\nkqSxEXSrJm4SlySNjyinjrA5XZKkEVWpJh4Rk8BfUQz0ch9wdGbe20ZgkiQ1L8a6Of11wN9n5jsi\nYh/g/hZikiSpNeOcxO8HDgDIzH9pPBpJktS3qufEfwD8RkS8to1gJElqW0Q0Mg2DvpN4ROwLnAk8\nEnh1RLyonH9VROwxa9n1EbExIjZu2ryp0YAlSaqjS0m8SnP604GrM/PuiHgecGlErAF+mJn/2rtg\nZm4ANgAcccRUNhatJEn6pSrN6VcBR0XEPpl5J3A68EHgE61EJklS06LBaQj0XRPPzO9FxO8BX4yI\nB4A7gXXAORFxRWZ+v60gJUlqQozzJWaZ+ZfAX86a/anmwpEkSf1y2FVJ0lgZ25q4JEmjrktJ3LHT\nJUkaUa3XxBPYNl3vKrMlE4t/1PTwvXarXcbKEz9Yu4x7Pnda7TIkqV/33rd1sUNoXJdq4janS5LG\nxxBdHtYEm9MlSRpR1sQlSWPF5nRJkkZQ1wZ7sTldkqQRZU1ckjRWulQTN4lLksZLd3J4O83pvfcT\n3+z9xCVJakUrSTwzN2TmVGZOTU6ubmMTkiRVF0VzehPTMLA5XZI0VoYlATehdk08Ii6NiH2bCEaS\nJPWvVk08IiaARwI/aSYcSZLa1aWaeN3m9EOBz2TmL5oIRpKkNnVtsJdaSTwzvwu8uaFYJElSBXZs\nkySNl+5UxE3ikqQxEp4Tr+SOn93Hu79yQ60yfv/XDm4omsV1z+dOq13G9HTWLmNiojs7sKR2rdjZ\nul5dEbEE2AjcnpnPj4hVwKeAA4AfAidn5j3lsmcCrwK2AW/MzC/OV7Y3QJEkjZVFGOzlvwLX9Tw/\nA7g0Mw8CLi2fExGHAuuAxwDHAeeWBwBzMolLksbKjkziEbEf8Dzgz3tmnwBcUD6+ADixZ/6Fmbkl\nM28GbgSOnK98k7gkSYOZnLlPSDmt384yfwr8LjDdM29NZt5RPv4xsKZ8vC9wa89yt5Xz5uTJDknS\neGmuW9DmzJyaczMRzwfuyszLI+JZ21smMzMiBu7s1HcSj4g1wNuBo4CtwBXAOzPz1nlXlCRpiOzA\n3ulPB14QEc8FdgZ2j4i/BO6MiL0z846I2Bu4q1z+dmBtz/r7lfPm1FdzekQ8AvgC8A/AVGYeDnwS\nuLh8TZIk9cjMMzNzv8w8gKLD2lcy87eAS4BTy8VOBT5fPr4EWBcRO0XEgcBBwLfn20a/NfHzgFMz\n86qe4C6NiN8C/pgHT8pLkjS0huQ2oucAF0XEq4BbgJMBMvOaiLgIuJaixfu0zNw2X0ELJvGIOBjY\nlJlXle37fwjcBERmvigipiNiMjM396yzHlgPsPte+wz0DiVJasNiJPHMvAy4rHx8N3DMHMudDZzd\nb7n91MSfAPxjea3aO4CjgT2A75av3wAcCPwyiWfmBmADwN4HP7b+6CSSJOk/6Lc5fRswCfwgM38K\n/DQiri1f24sHT8pLkjTUhqA5vTH9dGz7LvBkipr2IyJij4jYH3h0RDwO2Cszb2kzSEmSGhMNTUNg\nwZp4Zl5XJu1DgHcBX6U4J34J8Fbgla1GKEmStqvf5vTXAx8H3gYcUc47HNgnM+9sIzBJktowbs3p\nZOZ1wAuAF1EM8vLPwOuAq+ZbT5KkoRKLcgOU1vQ9Yltm3ga8tsVYJElSBY6dLkkaGwEMSSW
6Ea0n\n8VW7LOeUxzngS1MmJurvfVfcfE+t9Q8/cGXtGLpkerreUAhNfKdNqPs+YHjeS12Z9T+LYWlu1WzD\n0xTeBG9FKknSiLI5XZI0VjpUETeJS5LGi83pkiRp0VVO4hFxYkRkRDyqjYAkSWpNFM3pTUzDYJCa\n+CnA18v/JUkaGUFxFUUT0zColMQjYgXwDOBVwLpWIpIkSX2pWhM/AfhCZn4fuDsijtjeQhGxPiI2\nRsTGe+7evL1FJElaFOPcnH4KcGH5+ELmaFLPzA2ZOZWZUyv3nKwTnyRJjRrLsdMjYhVwNPC4iEhg\nCZAR8TvZxPBGkiSpkio18d8EPpaZD8vMAzJzLXAz8J/bCU2SpIaNce/0U4CLZ837DPZSlySNiOIG\nKGPYnJ6ZR21n3p81G44kSeqXw65KksbI8NSim2ASlySNlQ7l8PaT+E5LJ3jY5K5tb0YV1L0f+Cs+\n/k+1Yzj/pU+sXUYTunIP7a68j2HRpZqaus2auCRprHTpIM0kLkkaH0N0eVgTvBWpJEkjypq4JGls\nzFwn3hVVhl3dBlwNLAO2An8BvC8zp1uKTZKkxnUoh1eqif8iMw8DiIi9gE8AuwPvaCMwSZI0v4HO\niWfmXcB64A3RpXYJSVLnjeWwq7Nl5k0RsQTYC7izuZAkSWrPkOTfRrTSOz0i1kfExojYuHnTpjY2\nIUnS2Bs4iUfEw4FtwF2zX8vMDZk5lZlTk6tX14lPkqTmhM3pRMRq4EPABzKz/niPkiTtAMUlZosd\nRXOqJPFdIuJKHrzE7GPAn7QSlSRJWlCV+4kvaTMQSZLaNzxN4U1wxDZJ0ljpUA537HRJkkaVNXFJ\n0lixOV0DaaIj/zDsfOe/9Im1y/j5fVtrl7HbzvV334mJxf88m9CV96Hh4wVIw80kLkkaHx27n7hJ\nXJI0Nrp2K1I7tkmSNKKsiUuSxkqXauKVk3hEbAOu7pl1YWae01xIkiS1p0M5fKCa+C8y87DGI5Ek\nSZXYnC5JGitdak4fpGPbLhFxZc/04tkLeD9xSdJQKi8xa2IaBq00p2fmBmADwOFHTDlSgCRJLbA5\nXZI0NsK7mEmSNLo6lMMHSuK7RMSVPc+/kJlnNBWQJEnqT+UknplL2ghEkqQdYaJDVXGb0yVJY6VD\nOdyx0yVJGlXWxCVJY6O4xrs7VfHWk/h0JlsemK5VxtIl3Wgw2DZd/5L5pUu6sfPttnP9Xe/+rfX2\nK4DlS7uxb0nq30Q3fkYBm9MlSRpZNqdLksaKzemSJI2oDuVwm9MlSRpVlZJ4RKyJiE9ExE0RcXlE\nfDMiXthWcJIkNSkox09v4N8w6Ls5PYqTCJ8DLsjMl5TzHga8oKXYJElqXJd6p1c5J340cH9mfmhm\nRmbeAry/8agkSdKCqjSnPwa4op8FI2J9RGyMiI13b948WGSSJDUtiluRNjEtvKnYOSK+HRH/HBHX\nRMQ7y/mrIuLLEXFD+f/KnnXOjIgbI+L6iDh2oW0M3LEtIj5YBvad2a9l5obMnMrMqT0nJwfdhCRJ\njStGbas/9WELcHRmPgE4DDguIp4CnAFcmpkHAZeWz4mIQ4F1FJXm44BzI2Lem45VSeLXAIfPPMnM\n04BjgNUVypAkaSxk4d7y6bJySuAE4IJy/gXAieXjE4ALM3NLZt4M3AgcOd82qiTxrwA7R8Treubt\nWmF9SZIWVVDcirSJCZicOXVcTuv/w/YilkTElcBdwJcz81vAmsy8o1zkx8Ca8vG+wK09q99WzptT\n3x3bMjMj4kTgfRHxu8Am4OfA2/otQ5KkxdbgYC+bM3NqvgUycxtwWEQ8FLg4Ih476/WMiIFvrFFp\nxLbyyGHdoBuTJGkcZeZPI+KrFOe674yIvTPzjojYm6KWDnA7sLZntf3KeXNyxDZJ0ljZgb3TV5c1\ncCJiF+DZwPeAS4BTy8VOBT5fPr4EWBcRO0XEgcBBwLf
n24Zjp0uSxkaFnuVN2Bu4oOxhPgFclJn/\nf0R8E7goIl4F3AKcDJCZ10TERcC1wFbgtLI5fk4mcUmSWpCZVwFP3M78uymu7treOmcDZ/e7jdaT\n+EQEOy2z1R5g6RI/hyYtX1r/89z8sy21y5h8yE61y5CGVZdu2zljokPvyZq4JGmsdCeF27FNkqSR\nZU1ckjRWunSKwCQuSRobxYhtix1FcwZqTo+IexdeSpIktcmauCRpfPQ5UMuoMIlLksZKh3J4O73T\nI2L9zF1dNm/a1MYmJEkae60k8czckJlTmTk1udrbjUuShseOGjt9R7A5XZI0NuydLkmShsKgSXzX\niLitZ3pzo1FJktSSsW9Oz0xr8JKkkTQc6bcZJmNJkkaUHdskSWMjwluRSpI0sjqUw9tP4t+/82f8\n2vu+VquMy976zIaikX7V5EN2ql3GKz95Za31P3LKYbVjyMzaZTRhWDr71LXlgW21y9hp2ZIGIqmn\nif2iK99pV1kTlySNlS4dmJjEJUljpUM53N7pkiSNKmvikqSxEUSneqdXqolHREbEH/c8f2tEnNV4\nVJIktSGK5vQmpmFQtTl9C/AbETHZRjCSJKl/VZP4VmADcHoLsUiS1LoujZ0+SMe2DwIvjYg95log\nItZHxMaI2PjAz/918OgkSWrYREPTMKgcR2b+G/AXwBvnWWZDZk5l5tSy3ebM9ZIkqYZBe6f/KXAF\n8NEGY5EkqVVBtwZ7GahFIDN/AlwEvKrZcCRJatdENDMNgzrN+n8M2EtdkqRFUqk5PTNX9Dy+E9i1\n8YgkSWrRsNSim+CIbZKksVEM1NKdLD4sveQlSVJFrdfED17zEC598//V9ma0A22brn+P4iUdas+q\nez/wkz7yndoxfPLUI2qXsXSJx/QzhuFe4E3oUo2zSR36+bE5XZI0Xrp0bOOhtyRJI8qauCRpbAR0\n6lakJnFJ0ljpUhN01fuJ7xcRn4+IGyLiBxHx/0bE8raCkyRJc+s7iUfRzfGzwOcy8yDgYGAFcHZL\nsUmS1LjiWvH60zCoUhM/GrgvMz8KkJnbKO4r/sqIcOQ2SdLQiwgmGpqGQZUk/hjg8t4Z5W1JfwQ8\nssmgJEnSwlrp2BYR64H1AGv337+NTUiSNJAhqUQ3okpN/FrgV4aFiojdgf2BG3vnZ+aGzJzKzKnJ\nydX1o5QkqSHjeivSS4FdI+LlABGxhOJ2pOdn5r+3EZwkSZpb30k8MxN4IXBSRNwAfB+4D3h7S7FJ\nktSomcFeutKxrer9xG8Ffr2lWCRJat2Q5N9GdGngGkmSxorDrkqSxscQdUprgklckjRWgu5k8R2S\nxIs+cXV05wPvgiVdOowdAn/1yifVLmP/376odhk/PO+k2mVM1Nw3pqfr/lbUj0EaJdbEJUljo+id\nvthRNMckLkkaK11K4vZOlyRpRFkTlySNlejQheKVk3hEbAOuLte9DjjVYVclSaOga+fEB2lO/0Vm\nHpaZjwXuB17bcEySJKkPdZvTvwY8volAJElqXXRr2NWBk3hELAWOB76wndcevJ/4Wu8nLkkaHsNy\n85ImDNKcvktEXAlsBH4EfHj2Ar9yP/HV3k9ckqQ2DFIT/0VmHtZ4JJIktaxrHdu8xEySNFY61Jru\nYC+SJI2qyjXxzFzRRiCSJLUvmOjQTbVsTpckjY3A5nRJkrSAiFgbEV+NiGsj4pqI+K/l/FUR8eWI\nuKH8f2XPOmdGxI0RcX1EHLvQNkzikqTxEUXv9CamPmwF3pKZhwJPAU6LiEOBM4BLM/Mg4NLyOeVr\n64DHAMcB50bEkvk20Hpz+nQmWx6YrlXG0iUeazRpywPbaq2/07J596mxMz2dtdafaOB6l5vO/c3a\nZXzrpp/ULuOpj9yz1vpdaubU8NpRg71k5h3AHeXjn0XEdcC+wAnAs8rFLgAuA95Wzr8wM7cAN0fE\njcCRwDfn2obZUZK
kwUxGxMaeaf1cC0bEAcATgW8Ba8oED/BjYE35eF/g1p7VbivnzcmObZKksdFw\nx7bNmTm14DYjVgCfAd6Umf/WeyvUzMyIGLg5zyQuSRorO3Ls9IhYRpHAP56Zny1n3xkRe2fmHRGx\nN3BXOf92YG3P6vuV8+Zkc7okSS2Iosr9YeC6zPyTnpcuAU4tH58KfL5n/rqI2CkiDgQOAr493zYq\n1cQjYhtwdbnezcDLMvOnVcqQJGkx7cCK+NOBlwFXlzcOA3g7cA5wUUS8CrgFOBkgM6+JiIuAayl6\ntp+WmfP2RK7anP7Lm59ExAXAacDZFcuQJGlRBDuuCTozv15ucnuOmWOds6mQV+u8l2+yQK85SZLU\nnoE6tpUXnx/Ddu4lXr6+HlgPsN/a/QcOTpKkRgVEhwYkqFoT36Vs15+5ru3L21soMzdk5lRmTu05\nOVk3RkmStB1Vk/jMOfGHUbTzn9Z8SJIktScamobBQOfEM/PfgTcCb4kIrzWXJI2EoLhOvIlpGAzc\nsS0z/wm4CjiluXAkSVK/KtWiM3PFrOe/3mw4kiS1azjq0M2wKVySNFaGpCW8EQ67KknSiLImLkka\nI9Gp68RbT+ITEeyyfEnbm1EFOy3z+2jSxMTi/yAsXVK/Ue2pj9yzgUjq6dKPq4bTjhx2dUfo0nuR\nJGms2JwuSRorXWrxMYlLksZKd1K4zemSJI2sgZJ4RPxeRFwTEVdFxJUR8eSmA5MkqXHlXcyamIZB\n5eb0iHgq8Hzg8MzcEhGTwPLGI5MkqWFd650+yDnxvYHNmbkFIDM3NxuSJEnqxyAHJF8C1kbE9yPi\n3Ih45uwFImJ9RGyMiI2bN2+qH6UkSQ3pUnN65SSemfcCRwDrgU3ApyLiFbOW2ZCZU5k5NTm5upFA\nJUlqQpfuJz7QJWaZuQ24DLgsIq4GTgXOby4sSZK0kEE6th0CTGfmDeWsw4BbGo1KkqSWDElLeCMG\nqYmvAN4fEQ8FtgI3UjStS5I01Ire6d3J4pWTeGZeDjythVgkSVIFDrsqSRor496cLknSiApinJvT\nqwqG437LUpdNT2ftMobh7/Tn922tXcZuO1s30fhwb5ckjRWb0yVJGkFd653epXHgJUkaK9bEJUnj\nI8a4OT0i9gQuLZ/+J2AbxfjpAEdm5v0NxiZJUuPGNoln5t0Uw6wSEWcB92bme1uIS5IkLcDmdEnS\nWPE6cUmSRlAAQzAkQmNa6Z0eEesjYmNEbNy0edPCK0iSpMpaSeKZuSEzpzJzavXk6jY2IUnSQKKh\nf8PA5nRJ0ljpUu90B3uRJGlEDVwTz8yzGoxDkqQdYliawptgc7okaWzYO12SJA0Fa+KSpDEyPD3L\nm9B6Et+ydZqb7vp5rTIevtduDUUjddNER9oHd9vZeoVa1rEboNicLknSiPKwV5I0VjpUETeJS5LG\nR9E7vTtp3OZ0SZJGVKUkHhEHRMR3Z807KyLe2mxYkiS1IxqahoHN6ZKk8TIsGbgBNqdLkjSiWqmJ\nR8R6YD3A3vuubWMTkiQNpEuDvVStiWc/83vvJ75qz8nBIpMkqQURzUzDoGoSvxtYOWveKmBzM+FI\nkqR+VUrimXkvcEdEHA0QEauA44CvtxCbJEmNG/fe6S8HPhgRf1I+f2dm/qDBmCRJas+wZOAGVE7i\nmXktcFQLsUiSpAq8TlySNDaKpvDuVMVN4pKk8TFEPcub4GAvkiSNqNZr4j+7fyt/f8umWmU8fK/d\nGopGANPTc13u35+JiQ4dxjbgga3TtdZftrT+sXTd7xT8XnutfNIbapdxz3c+0EAkiy+z/r41bLq0\np9ucLkkaLx3K4janS5I0oqyJS5LGSNg7XZKkUTWWvdMj4qsRceyseW+KiPOaD0uSJC2kyjnxTwLr\nZs1bV86XJGnoNTVuej+V+Yj4SETcFRHf7Zm3KiK+HBE3lP+v7HntzIi4MSKun11pn
kuVJP5p4HkR\nsbzc2AHAPsDXKpQhSdLi2nF3QDmf4iZhvc4ALs3Mg4BLy+dExKEUFePHlOucGxFLFtpA30k8M38C\nfBs4vpy1Drgot3MRYUSsj4iNEbHx3nvu7ncTkiR1Rmb+PfCTWbNPAC4oH18AnNgz/8LM3JKZNwM3\nAkcutI2ql5j1NqnP2ZSemRsycyozp1as3LPiJiRJak809A+YnKmwltP6Pja/JjPvKB//GFhTPt4X\nuLVnudvKefOq2jv988D7IuJwYNfMvLzi+pIkLaoGe6dvzsypQVfOzIyIWkPiVaqJZ+a9wFeBj2CH\nNkmSqrozIvYGKP+/q5x/O7C2Z7n9ynnzGmTEtk8CT8AkLkkaQTuuX9t2XQKcWj4+laKFe2b+uojY\nKSIOBA6i6Ic2r8qDvWTm5+jUyLOSpLFRMwNX2lTEJ4FnUZw7vw14B3AOcFFEvAq4BTgZIDOviYiL\ngGuBrcBpmbltoW04YpskSS3IzFPmeOmYOZY/Gzi7yjZM4pKkseLY6ZIkjaCgW2Ont57EJ3ddziue\ndEDbm1EFExMd2oOHwLKli39H3ya+0+2M21RZdOTX8Z7vfGCxQxgaXflOu8qauCRprHTpsMQkLkka\nLx3K4ovfDihJkgZiTVySNFa61Du9Uk08ItZGxM0Rsap8vrJ8fkAbwUmS1LSIZqZhUHXs9FuB8yhG\nnKH8f0Nm/rDhuCRJ0gIGaU5/H3B5RLwJeAbwhmZDkiSpPUNSiW7EIGOnPxARvwN8AXhOZj4we5ny\nnqrrAdbuv3/tICVJakyHsvigvdOPB+4AHru9FzNzQ2ZOZebU6snVAwcnSZLmVjmJR8RhwLOBpwCn\nz9wXVZKkYVfcxKyZf8Ogau/0oOjY9qbM/BHwHuC9bQQmSVLjGuqZPpK904HXAD/KzC+Xz88FHh0R\nz2w2LEmStJBKHdsycwOwoef5NuDwpoOSJKktQ1KJboQjtkmSxkuHsrhjp0uSNKKsiUuSxsjw9Cxv\nQutJPIGt26ZrlbF0iQ0GUttiWLrbSi3r0q5udpQkaUTZnC5JGhtBp/q1mcQlSWOmQ1nc5nRJkkZU\n1WFXXxgRV86apiPi+LYClCSpSV0aO73qiG0XAxfPPC9vOfpS4IsNxyVJUiu61Dt94HPiEXEw8AfA\n0zKz3jVkkiSpsoHOiUfEMuATwFvKu5nNfn19RGyMiI2bN22qG6MkSY2JhqZhMGjHtj8CrsnMT23v\nxczckJlTmTk1uXr14NFJktSkjt2KtHJzekQ8C3gR3r1MkqRFVSmJR8RK4KPASzLzZ+2EJElSm4ak\nGt2AqjXx1wJ7AefNGmf53XM1rUuSNCyC4WkKb0LVS8zeDby7pVgkSVIFDrsqSRorHaqIm8QlSeNl\nbJvTB3GtheVdAAAOxUlEQVTHv93Hu/7uhlplnHXsIQ1FI4DMrLW+953+VVu31RvraOkSb2Ggdtx9\n7/21y9hzxfIGIlFbrIlLksbKsIx73gSrAJIkjShr4pKk8dKdirhJXJI0XjqUwyvfT3xpRPx1RGyO\niMe2FZQkSVpY1XPi5wHfA04EPhUR+zUfkiRJ7Wjq5ifDcpFO383pEfEO4F8z863l81cDn4yI52fm\nv7YVoCRJTepS7/S+k3hmvnPW828C/3l7y0bEemA9wO6r96kTnyRJmkMrl5j13k981z1WtrEJSZIG\nEw1NQ8De6ZKksTIk+bcRDvYiSdKIsiYuSRorw9KzvAkmcUnSGIlO9U63OV2SpBFlTVySNDaCbjWn\nWxOXJGlEtV4T32f3nTnr2EPa3sxIuPe+rbXLWLFz/a8sunQYOgSWLvFYWMNpzxXLa5fx7Zt+0kAk\naovN6ZKksdKleoxJXJI0VuydLkmSFp01cUnS+Bii24g2wSQuSRobQ3TvkkbYnC5J0oiyJi5JGi8d\nqor3ncQj4nRgHXA/8FHga8AJwD9k5jdnLbseW
A+wdv/9GwtWkqS6xrV3+hrg6cCrgaOA/wXsDnxr\n9oKZuSEzpzJzavXk6kYClSRJv6rvmnhmnlE+vB54WTvhSJLULnunS5I0ojqUw+2dLknSqDKJS5LG\nSzQ09bOpiOMi4vqIuDEizlh4jWpsTpckjZUd1Ts9IpYAHwSeDdwGfCciLsnMa5vahjVxSZLacSRw\nY2belJn3AxdSXJrdGGvikqSxEezQ3un7Arf2PL8NeHKTG2g9iV9xxeWbd1kWtyyw2CSwucZm6q7f\npTKGIYYulTEMMQxLGcMQw7CUMQwxjFIZD6tZfmOuuOLyL+6yLCYbKm7niNjY83xDZm5oqOy+tJ7E\nM3PB0V4iYmNmTg26jbrrd6mMYYihS2UMQwzDUsYwxDAsZQxDDF0rY0fJzON24OZuB9b2PN+vnNcY\nz4lLktSO7wAHRcSBEbGcYujyS5rcgOfEJUlqQWZujYg3AF8ElgAfycxrmtzGsCTxuucQmjgH0ZUy\nhiGGLpUxDDEMSxnDEMOwlDEMMXStjE7KzL8B/qat8iMz2ypbkiS1yHPikkZCRHhfY2kWk7jUYRHd\nuF9TRDwXuDQi9q1ZTq1TiBGNXZpUS0T42y1gxJN4RKxa7Bg0fIblBy4i9o+I3Roop04iXrKI2+4t\nZ9ca6x4LvBd4WWbePuj3GxEHA78fEXsOuP7DgPdExH6DrF+WcVREPG3Q9csyfg14eZ0y1B2L8mMX\nEQ+PiIfULGMv4PURsTwiDqhZ1lD86C+WiFgz63ntz6PGD+0REfGUGtt9BvCKiHjiAOs2VmstP9O3\nAK8bNJFHxH5lwhkoaUTEs4GPRcQZEfH8QcoAlg+4Xm8czwX+e0SsXXDh/7juc4C/AK4FfgKQmdMD\nflcrgVUU38kgFYAVFCNw7VXGNsg+/kzgZYOuHxFHA58DzuxKK4vq2eHJKyJ2Bt4AvCUiVgxYxn7A\nARS1jP8BvGOQsiLi0RHxEuD0iNhlgPWfGxFvLR8v2oFARCyLiGdExJkR8YIqSTAiHgXcERHvi4jX\nQPEjWb7W93uKiIMi4ikRcXRErBzkhzYijqfo5frvVdbrWf844P3AVmCPAYpYUpbTxFUbmyiuEd0H\neGXVRB4RJwB/BXwY+HREnFVeZ9rv+scBZwPfAHYDXhQRlYZ7LBPohRHxjoj4jSrr9pTxfODdwGWZ\neetCy89a9xjgA8CbKd7HK8uDNDIzq+5fmfkt4GPA7sAbqiby8tKgrwIfiojdZ/5OKvpGuX2qrt/T\nIvF64NvZUK9kDwZGXGbu8Ak4kCL5ngmsqLjuCuD/AQ4G3gXcCzx5wDj+ALgFeP0A6x4D/DPFYPYT\nvfN38Ge5nGLwgLcCZwEnAV8o/39IH+vvB3wdeBvwJYpazwuA3SvE8DzgCuBi4MsUYwU/sXwt+izj\nOOAfgOeUz1cCB1aI4ZnAjbP3BeAxfa4/CfwQWFU+Xzrg93EQcMjMewd+neLA4o397uvAUcD3gSOA\nh5b7+j9SJOUlfay/CpgGfr18vhb4FHBihfdxHPAt4DTgncD/Bzyy4mfxnyiS3pN69tVdy31u5z7W\nfxLwtPLxIcAfURwQPL1nmXn3L+BpwLpZ855c/v78/kJ/I+VnuaLn+W7Ah4CjyucTfbyPYygqLk+n\n+O37B2CfWcvMWw7FXbCuA55aPr8aOLyfz2BWOc8ATgZeW/5f6Ttd4LvaqUosTs1Mi1J7zMybKW7P\ntgp4e5XzZZl5L8UP0t8Az6H4Q3xxRJwQEaf0W05E7AE8heIH6tqIOCQinhwRh/VZA30m8JnM/Hw+\nWHPdFTg3Il7UbxzbiWtZhWWXAhcBF2fmezPzLOB/Ay8FDqP4fOaVmbcB3wYOB55L8bm+EvjriDgy\nIg5aIIbjgP8GnJ6ZL8zMZ1P84F8SEU/IzFzo8yxrRH8DvCczvxQRj6A4MKnSG/mJwPuzqG3NlPse\n4H9HMdjCv
DJzM/BfgG+ULQlbI2JplVpK2fR9PfC1iDgN+G3grymS4e7Aq/vc158G/FlmXg7cl5nf\nB15MkVjP7OO9/ITi4OGcssZ4K/AAsGb+NX/5Pma+j3dl5gcpvs/lFAc6VWwpt3tf2QL3dorv9ePA\neQvVhDPzO5n5jYiYyMzrKQ4wHwCeP3NeOcsMMo+VFE35J/WU+y3g0xStJEfPtWJEPLRc7p1lywiZ\n+XOKZv1Xl8/7qU0vp/j7+l2Kz/LRFK0KLyi3E/OVU/6dPwp4dWZ+M4rbW94FrC5jyJ7l5iojyv3z\n4xS/DYcAh/YR+4Ii4kDgxN5YtAMt5hEE8AiKGkqlozfghcBNwPnl8/cAd1CtpvEQiqatt1IckV5P\n8YN7AX3UzIGXUByRn0VRE30u8DiKI9zTKZL8sorva4LiVnWn97n8IcBZ5eOlwJ8APwZeQ/Hj9XfM\nUxOd+dwpfmQupKg5Pav8bD9Ece7tPGC3OdafqfE9v3y+c89r7yjL2aPP9zJTm388RW3+LX2uN/Me\n3k+RdGbmHw+cT5EQbwFO6LO844EfACtnPtfy/6MoWxcWWP/o8jP5LxSnBv4K+CjwPykOPk8Ddlrg\nvZzX870GZe27/Gwuozgnu+DfTPlebqBokv4sfdR+Z30f11C2yJR/G18H/pSieXtyof27jP0tFKNV\n3VZ+H6+mqAmfD7ygyt9HWeZB5b71Z/TZAld+DlcBL571Ob+G4iB4zlowxW/UyyjGuz6boka8tPzb\neskA8R8IfAL4X8BXKA4SvgG8aL7vtGc/nCj//0PgrT2vnwSsn++9lMt9CnhE1bjnKe9JFAdnf0uf\nf+tOzU6LHkDPzvB4YNc+ltut3GkeT5HM31X+UH1sgG3+GkWT+IXlj8JJFAcV7+5j3TXA/w38E0UC\n+TxFc/SXgO8BVzKryazPmB5HcS718D6W/W3gz8vHH6I4PfBkiua6UykOTvZdoIygSOJ/RHGU/j3K\ng6HyB3PlAus/j6Jpb8/y+U49r30FmKrw3o+jSIBnlM+X9Mw/aoF1j6FI/jNNjMuA5eXj3wdOqRDH\n7ET+eoqm+v37XH+m6XM5RVP2qeWP3N3Adxf6set5L0eUzyfK97MP8BnmOKiaZx+fBvYqn1dJ5DMH\nAe8v438hRaL4FvDn9HHKheL011PLfbF33/gw8FtV/z7KdR9F0SKxusI6zy3/Vl/cM28dxcHVgqdO\nKE5pnElxYPu35d/beytsv/eU2+nAR8vHe5f7x8P7LGfmAORtwGfLxy8t96tH9bHeZyhP99SdKA4I\nPlJOhzZRptMA38NiB1DuDCuANwGTfS4/czT6OxTnX4+jOFf2BwNs+yGU58WAx1IcFX+43zjKH7in\n9sx7KcWBwC41Po+jKZPiAss9B3hT+fjPePB82cHAv5Q/OC/uc5uHUNTi/9sA8c5OesvK/z8PPLZi\nWc+mOJB4aPn8FRTN/fOeH6c4uDuL4lznkT3zT6E4qOnrR3LWe7qKotZ5HXBYxfWfR3Fee+Yc+0qK\nGvQBfazb+16meuafTHGO+aEDvJdrKBN5xXVnDgLWzNrv+/pbnaPMk4CN1KgRUrGVq1znWIpWmbdT\ntBB8p8r+yYMHle8q96nN9NHvZDvlPAz4y0Hfe1nG4ylaRV5Uvo8FkyjFOes/HmQ/2E5ZLwfOKx8v\nr1ueU43vYrED+GUgFf4oKWqPqyia974InFP+sKytsf29KTpmfRr4zX7jKX9YP0ZxVP9yilrKnEfE\n/b6/Ppc7mOLc5SMoEvqnKGryh1I02T25/AHvq/ZVJsyz6KNFZDvrzk7kM5/FIInjeIra/euAr9F/\n57R9KVojLgPeB/x3ikQ6UC2BIhFPA08YcP3jy+0veEA2x3t5B0Ufh3Momk+vrRHLCRSnKyb63b9m\nvY9r6/74l39jb6I4oKh0cNfURNF34hyKvjCPrrhu9Dzei54Dm4rlPLT8PAf
qkFuWsX+5b15f5X00\nlXCBR1Ke9qu6Pzk1O43s2OllJ46HZubmsgPTOZl594BlTVB0NjkxM8+OYnjHzZm54KVOZeeXkyiO\niH9C0RR/9SBxDCIiTqZo2v8cRW36dIpzdqcDD6c4uHhdZv6sj7IeRVH7W9fPe9/O+seX659LcR5x\nfWZ+t2o5ZVnPpziP+8SscNef8lLBwylq9LdTXNp0wyAxlOXtOshn0bP+CRQHRkdk9UuKdgGmKGqQ\nm4G/zaKD16CxrMiiY+gg655AcVAxVfV99JSxC0Ur0/WZeeMgZSy2shNarR/NsrPk71Hc0epfBixj\nGUUfmA/U2SfqiIidMnPLYmxbDxrZJA4P/kFFxJLM3NZguZXLK/+oIjPvbyqOPrc7CfwGxXXzn6W4\nTGofilrPeykScpUkWDdpDZR824hlWNRJnsOkK+9jGETE0szcWrOMZZn5QFMxaTSNdBJXoRxI5EiK\no/tNFIOd/Bvwwcy8dhHi6UTylaRhZxLvkPJa3C1ttE5IkobPWI8Z3kFbes7XDXTeUpI0OqyJS5I0\noqyJS5I0okzikiSNKJO4JEkjyiQuSdKIMolLkjSi/g/BpKL9OyUh/AAAAABJRU5ErkJggg==\n", "text/plain": [ "<matplotlib.figure.Figure at 0x7f44786c32b0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "seq.plot_confusion_matrix(dev, local_1.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This matrix shows a couple of interesting properties of our current model. First, we notice a relatively strong diagonal. This corresponds to correctly predicted labels and is a good sign. We also notice that the 'N' (common nouns) column receives a lot of counts. This means that many words that aren't common nouns are labelled as common nouns. This may not be surprising, as the high frequency of common nouns in the data (observe the very dark 'N'-'N' dot) could have made 'N' the default class, chosen whenever there is too much uncertainty. \n", "\n", "The matrix enables us to spot systematic problems that, if fixed, may lead to substantial improvements. One problem is that '@' (used for tokens that address Twitter user IDs) is never labelled correctly (its diagonal entry is missing). Why could this be? To investigate this further, it is useful to look at the learned weight vector for this class. 
" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAk0AAAHBCAYAAAB0YI9mAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3XeYZVWd7//36kgHoGm6gSbHBposTZAMAt3kICAoAoIi\nEkYERRCRNESRKEFochJFQYIkCWNAQUBFMdxxHGdGxzsy987c370zd+aOsn9/fL6bver0qapVdc6p\nKtvP63nqqTqnztlx7bW+K+y1U1VVmJmZmdnAxo32BpiZmZn9KXDQZGZmZlbAQZOZmZlZAQdNZmZm\nZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRWY0IuFzpo1q1pzzTV7sWgzMzOz\nrnr11Vf/uaqq2YN9ridB05prrskrr7zSi0WbmZmZdVVK6e9KPufuOTMzM7MCDprMzMzMCjhoMjMz\nMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIz\nMzMr4KDJzMzMrICDJjMzM7MCE0Z7A4ZrzTMf7+ryfn3pPl1dnpmZmS1Z3NJkZmZmVsBBk5mZmVkB\nB01mZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZ\nAQdNZmZmZgUcNJmZmZkVcNBkZmZmVmBCyYdSSr8G/jfwR+APVVXN7+VGmZmZmY01RUFT2LWqqn/u\n2ZaYmZmZjWHunjMzMzMrUBo0VcA3UkqvppSO7+UGmZmZmY1Fpd1zO1RV9duU0grAMymln1dV9c38\nAxFMHQ+w+uqrd3kzzczMzEZXUUtTVVW/jd+/Bx4Ctm7zmZurqppfVdX82bNnd3crzczMzEbZoEFT\nSmlaSmnp+m9gT+Anvd4wMzMzs7GkpHtuReChlFL9+fuqqnqyp1tlZmZmNsYMGjRVVfUrYLMR2BYz\nMzOzMctTDpiZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEH\nTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkB\nB01mZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZ\nAQdNZmZmZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZ\nWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUmjPYGjHVrnvl4V5f360v36fk6\nRmo9S9K+jNR6lvR9Gan1LEnHbEnal5Fcj9locEuTmZmZWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbA\nQZOZmZlZAQdNZmZmZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRVw0GRmZmZW\nwEGTmZmZWYHioCmlND6l9IOU0mO93CAzMzOzsWgoLU0fBX7Wqw0xMzMzG8uKgqaU0qrAPsCi3m6O\nmZmZ2dhU2tJ0NXAG8FZ/H0gpHZ9SeiWl9Mqbb77ZlY0zMzMzGysGDZpSSvsCv6+q6tWBPldV1c1V\nVc2vqmr+7Nmzu7aBZmZmZmNBSUvT9sD+KaVfA18Edksp3dPTrTIzMzMbYwYNmqqqOquqqlWrqloT\nOBx4rqqqI3u+ZWZmZmZjiOdpMjMzMyswYSgfrqrqBeCFnmyJmZmZ2RjmliYzMzOzAg6azMzMzAo4\naDI
zMzMr4KDJzMzMrICDJjMzM7MCDprMzMzMCjhoMjMzMyvgoMnMzMysgIMmMzMzswIOmszMzMwK\nOGgyMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzMrICDJjMzM7MCDprMzMzM\nCkwY7Q0wMzMbijXPfLzry/z1pfv0fD0jsY4lbT3t1jGa3NJkZmZmVsBBk5mZmVkBB01mZmZmBRw0\nmZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUc\nNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYF\nHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZm\nBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYFBg2aUkpLpZReTin9KKX0Rkrp/JHYMDMzM7Ox\nZELBZ/4T2K2qqv+TUpoIfDul9ERVVd/r8baZmZmZjRmDBk1VVVXA/4mXE+On6uVGmZmZmY01RWOa\nUkrjU0o/BH4PPFNV1UttPnN8SumVlNIrb775Zre308zMzGxUFQVNVVX9saqqzYFVga1TShu3+czN\nVVXNr6pq/uzZs7u9nWZmZmajakh3z1VV9a/A88DC3myOmZmZ2dhUcvfc7JTSjPh7CrAH8PNeb5iZ\nmZnZWFJy99wc4M6U0ngUZH2pqqrHertZZmZmZmNLyd1zrwNbjMC2mJmZmY1ZnhHczMzMrICDJjMz\nM7MCDprMzMzMCjhoMjMzMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKyAgyYz\nMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzMrICDJjMzM7MCDprMzMzMCjhoMjMzMyvgoMnMzMysgIMm\nMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzMrICD\nJjMzM7MCDprMzMzMCjhoMjMzMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKyA\ngyYzMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzMrICDJjMzM7MCDprMzMzMCjhoMjMzMyvgoMnMzMys\ngIMmMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzM\nrMCgQVNKabWU0vMppZ+mlN5IKX10JDbMzMzMbCyZUPCZPwCnV1X1WkppaeDVlNIzVVX9tMfbZmZm\nZjZmDNrSVFXV76qqei3+/t/Az4BVer1hZmZmZmPJkMY0pZTWBLYAXurFxpiZmZmNVcVBU0ppOvAV\n4NSqqv6/Nv8/PqX0SkrplTfffLOb22hmZmY26oqCppTSRBQw3VtV1VfbfaaqqpurqppfVdX82bNn\nd3MbzczMzEZdyd1zCbgV+FlVVVf2fpPMzMzMxp6SlqbtgfcDu6WUfhg/e/d4u8zMzMzGlEGnHKiq\n6ttAGoFtMTMzMxuzPCO4mZmZWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUc\nNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYF\nHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZm\nBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZm\nZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01mZmZmBRw0mZmZmRVw0
GRmZmZWwEGTmZmZWQEHTWZm\nZmYFHDSZmZmZFXDQZGZmZlbAQZOZmZlZAQdNZmZmZgUcNJmZmZkVcNBkZmZmVsBBk5mZmVkBB01m\nZmZmBRw0mZmZmRVw0GRmZmZWwEGTmZmZWQEHTWZmZmYFBg2aUkq3pZR+n1L6yUhskJmZmdlYVNLS\ndAewsMfbYWZmZjamDRo0VVX1TeB/jsC2mJmZmY1ZXRvTlFI6PqX0SkrplTfffLNbizUzMzMbE7oW\nNFVVdXNVVfOrqpo/e/bsbi3WzMzMbEzw3XNmZmZmBRw0mZmZmRUomXLgfuC7wPoppd+klI7r/WaZ\nmZmZjS0TBvtAVVVHjMSGmJmZmY1l7p4zMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIzMzMr\n4KDJzMzMrICDJjMzM7MCDprMzMzMCjhoMjMzMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgyMzMz\nK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzMrICDJjMzM7MCDprMzMzMCjhoMjMz\nMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIz\nMzMr4KDJzMzMrICDJjMzM7MCDprMzMzMCjhoMjMzMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgy\nMzMzK+CgyczMzKyAgyYzMzOzAg6azMzMzAo4aDIzMzMr4KDJzMzMrICDJjMzM7MCDprMzMzMCjho\nMjMzMyvgoMnMzMysgIMmMzMzswIOmszMzMwKOGgyMzMzK+CgyczMzKxAUdCUUlqYUvpFSumXKaUz\ne71RZmZmZmPNoEFTSmk8cD2wFzAPOCKlNK/XG2ZmZmY2lpS0NG0N/LKqql9VVfX/gC8CB/R2s8zM\nzMzGllRV1cAfSOkQYGFVVR+M1+8Htqmq6uSWzx0PHB8v1wd+0f3NHZZZwD8vIetZkvZlpNazJO3L\nSK3H+/LnvR7vy5/3epakfRmKNaqqmj3YhyZ0a21VVd0M3Nyt5XVLSumVqqrmLwnrWZL2ZaTWsyTt\ny0itx/vy570e78uf93qWpH3phZLuud8Cq2WvV433zMzMzP5slARN3wfWSymtlVKaBBwOPNLbzTIz\nMzMbWwbtnquq6g8ppZOBp4DxwG1VVb3R8y3rnpHqMhyJ9SxJ+zJS61mS9mWk1uN9+fNej/flz3s9\nS9K+dN2gA8HNzMzMzDOCm5mZmRVx0GTWIymlNNrbYGZm3eOgyazLUkpTAaqqqhw4mZktORw0mXVR\nSmkm8MmU0j7gwKkX6uM51o9rtp3Ljfa22NiVUlo5pbTMML87I6W0Sre3aZB1dvX6Syn9ScUhf1Ib\na32llCbHNBAjuc76ghnR9ebrHuMmAROBHVNKe8DoB05/aplSgTVBx3W4C8jS8eQubdNiy4/zvgdw\nXkpp2V6sp2Q7urCMKd3Ylg7WP+R9yM7v+O5vUfeklNYF/gY4axjfXQq4GDgypbTaYJ/vhqgUrg7N\n9TfcNJZSOj+ltHVVVW/9KeVRfzIbuiRKKZ2VUvp4Smn/oV7cKaW5wOeBo1JKK/VmCxcXBcE+wPUp\npWt7ua7WY9LpRTrYOlJKE+L3sK+Lqqr+O3rA9f8G9hztwCmlNC4ypZRSmtWNwnuUA8BpwP0ppQWd\nLCfOxwLgxpTSabHcronlzwcWAg9VVfW/urn8VlmQMD+ltE1Kact6Ozpc7jzg6ymlOSN53rP92RZ4\nV31tln43jv+ewKdTSienlFbu1bYOV0ppA+B2NJ3PW/Fe8TGuquo/gC8DGwAHpZRW78V21lJKNwO3\nAM+mlK5JKb03tmO4edtbwBdSSpuPVOCUUlqjriillOYNZ50OmkZJSuk6YBvgDeBU4KKU0tqF350H\n3A38EHguCuoREQXBBcDjwKYpp
ftSSjN6sJ5xVVX9MaU0LqV0aUrpgyml7VJK47sZgGTruAYVoJtW\nVfXWELd1zZTSCfH3hKqqfgvcA/wrsGC0AqcoPN6KdX4LZdCfSyntMJRlxO/1U0pbQ+cF8VDlGVtV\nVf8GfAVYpvV/Q1zmtsCF6LgcCpyZUlqz022NZY+PQv5qYH/g9/F+z859pK39gRuBBcBlKaWOHqye\nUlofuA24v6qq343keY/9WQjcC/w78MchfndXdPy/AZwNnDiUwKvXUkobAncB1wKfAg6MFr2iNFKn\n+6qqnkfnaGt6GDillG4HVgCOAvYE/h7YKaV0SmxHcdpIKV2SUlq6qqrz0TG4fSQCp5TSu4DNgI+l\nlF4EdhxqXg8OmkZFUlPqBsBxVVU9AXwSOBZ432CtRimlWSih3VhV1fVVVf0q3t8npbRVj7d7PeBE\n4OGqqh6uqmoXNEHq51OXx21kF9AzwP8F5gOfQS1rEzoNQFpasW5E+/EL4NGU0vZDWM5E4FzghpTS\nxcBJKaVVqqr6O+BW4H8Ce0etd8QCjjq4jJcLgKeBDwGvoTFXO5UsJ47zvihQOSWl9GRKadM0At0e\nKaVlUkqTIy1slv3rx8CFKaXVhpPppZTWAs4E7qmq6nYUNK0DHBf/G+721ulxqaqq/oCO+98AJ0Nv\nz31KaXngJGB34HfAFODF4Z6npBbJu4BvxnNFSWoR3zEqbT2TZHngY8CxVVW9OMTvjgf2Ak4A/gs9\n9uummKh51Lvr4th+Jrbpy8D/QPnEf2WVnIG+X1eGNokg6SeoIrsVCr662lUX61gROKiqqn+LMucm\n4JvARkktZkOxDvD9lNK0qqquQhXMngZOKaU7gd2A/0Tp6v8BX4z/DakccdA0On6LIvVd4oS9DnwP\ntTztP8h3JwA/rKrqjqwV4CTgs6gVYWHvNptlUE1o65TSOwCqqjoMmIGaWbtSk8sS8fHA16NGsjHw\n34GdgcPqwGmYy89bsXYEfllV1UlVVV0BXALcVtoaU1XVf6Ga3k9RBj0beDildCg6VpcA/4TO9Wb9\nLqiLWvbv06h2+PfRInk/8BhwRt0CNsiy5qOgcA/gIZRGPwVs0suWkyhYzgSOSWrJPCml9FxK6RDU\nwnoV6voaTmvTcqj79KCU0tyqqn4DfBzYFPhw0liRIcu6/O5OKZ0d23cwsGVK6crhLHMI/gD8I3A0\nOt/HVFX1JrBrKmzBzkV34iPA6iml7VNKzwBHoAreqdGS0xNxXf8LCv7+Pc7veHi7VXex8ZRZy0tV\nVdUfUQXoo2gIw7urqvpNSun9wPt7td1D8B/AZ6qqug2gqqp/Qi1pW8W1WyW17K7X7svx/72BO4Ej\nUYXo31DlbwvgPV1ucfoXlJetltSKOj5afB8CViKuw8HUAWuUGd8EXovA6XMocLo1C5y6OQRjLWBZ\n4DxgGjpu3wU+GNd/PeyjKB9x0DSCUkq7pZTqJsFvA/uiQuwF4D5USBydUlp2gBO4LLBbSmmDuHiW\nAtYDDgGuQIXM9C5tbx2UbRy1y9+hTPMXwL51EFBV1b7ApVG77mR99UVVB0NfRl1mNwIPVlV1DLpI\njwaKu5ha15FdlE+iY3ZcSumCyLBuAq5E4zjmDrCcpbOX30Xn8SdVVX0a+Cp6RMCXUIDxDWAWqgn2\nXLZ/d6JBm1OBA1JKq1dV9S/AA8BzwCYFi/sbVGOfhwarzgMq1NXXy/35d+BXwLroOvkwcAOqpT4H\nHAYcCNrfgRaUpeP1I4D4MfAJVFE5IaW0XlVV/4haUb8YY0WGLKn17or4mQccWVXVv6PujN1TStcP\nZ7mDrHPTlNJ2EeS8ibqiTq6q6q9TSrugtFx800ZKaYUIlKmq6iLgR6iL7IdVVR2BzsO/AOt3eT/q\nczQt1v0WCgT3qqrqrWglmg/8Jaqk1d+bXH8+Wl7eGa1UvwLmAJdVVfV3kVd9EuVhoyalNLGqqv8
E\n/kcEIPW5+TdgzdiP7YBHiS7oNstYHbVU7YfOxR+B/6yq6jvAF4B30IWyPaV0bEppExTkTQX2r6rq\nj1Ehm1JV1f9Fed2gN1NEvvvHLI8/HuWLP8gCp3uBxyOf6lqrbFVVf4tuzvkdML+qqjNi3asB+yU9\nV/csoKxyUVWVf0bgBzVnPoei86vjvfmoGXnPeD0dNYlP7GcZ9WNvLkU145Xi9aT4vTca6zS9i9u9\nAPhvsf2/AXZCBdnlqBVliy6tp963cahA2zbbvy8Bh8Tf96Mm++GsY1q9LhSgnhGvD0LjH07PtmPX\nAZazNCrAP5C991HUhTUHeAk1Aa+DxswshVpr7kMthWkE0ttFwFPZ65tRoLNWvJ7S3zmIv2cBK2av\nPwWcmx2v7wIb9Gjbx8fv3VGm/CJqsZkY728Z5+p14COF6Wohaom5H9XMpwNroEL4JmBuF7b7cGBX\n4J3oQeerx/vTI81s18VjVO/X+ahFaFsUxF4KPA/8BRovud8QlrkRasV7Fngme3+PPL1E2r60Nc10\nYV/2Ap6Ia2U/1CL4QqTbq+N8H5R9b0UUyK+LAtPfxPn9KbAjcAZwR5zvF4EDepFeC/dxuezvLSKv\nWCZ775PAe1Fr56vAPgMsazbKe+vrcL14fwFqkVumC9t7ABqofRWqeG2Kgo7jWz73GPDxQZY1rv6N\ngr1TgXfEezfG+arz5r26eY3Q5CUXAr8Ebs/+vyCul1dQZalsuaOViP6cflCB+kT8vW4k+HFtTvCj\nwBcKlncQuovhVJpC8J2o5rxnF7d7BvAdYLd4fSDwt2jQ4XrA5+oLtsP11JlmQgHSU5GpnA2sgoLB\nfwL+Cg1K7fO9wnUcB2wTf78D9W1fG6+XjkzieuC8lu+Na7OsWWgM2h3A+7L3H4uM5hPZe5Pj98eB\nTXuYxsZlf09BrUP/BLw3e38R8DVgdptjvwJwePy9INLSG8B74r19UAF2PvAysEOv9iXWtzMK1vdF\nY8NuAI5B44VAhcP+RODb5vuTsr/noQG328XrayNdT0PTF1wOzBvGNo5veX0AKrh/CMyM9xai8Sbj\nh7r8Qdadn8PPoMrS9sBM4IOo22bn0usk0vRf0VROnm13jmMdPybyhC7uz+6oVWs71FryY9TNODnO\n8wez81en2VVQ68RFqFW1/v9xkc7nxbWwNk0+2fMKS5t9mxhp+Zx4vQJwc/xdBxQfQUHJD+knYALW\nAjaPv19AN5rMiNc7oi6vtbu0zasCX0ctleeg/H5rdFPDtZGmH0XjAkuWl1CjwZmo7HoRWC3+d2Ps\ny+Ts84vlu0Pc/tZrs04zT5MFSJHu57V+bsBlj3QC+nP8Qd1JH42/z0aF2X1okGX+mcUCgtaTn/2/\n7o77WSS6N+hiTQrVLFaMZW9OE7GfCNwbf8/ownrGZ39vBJwff78T1QLOQ603a5IFhEPN/GhqNu+K\n37uiJvz94/UU1FJw9ADLmAFMiL+no4LpHuCoeG9Byzmc0N/563L6qs9NimNYFxBHAA+jMR31Z4/r\nZxnvR8HJKSho3QjV/F+P4zIDtfbcAezdw32p0/3pwF/G35NQofIkKkiXyj7zImrJy1vJVkDdSEuj\nboVnUXd4njlegwrp6fXyhrCNM7K/d4ht2wZViC6OYzQ9/vcTBmg1GOYxWg2Nozske+/COBbbM4wC\nB1gedYdvEq/fQHnUl9AA2gkoT/hVD/ZnKRTkbxDX0MvAu1Fge8IgaX4OGrf0GvD+7P+fQZWYCb1K\nq0Pcx01RS+cZaIjFzS3/3xgFigv6uR62R11KX0EtP5ujwPAmVHb8kC7k/8DU7O+rUKvluahFdtX4\nOQ5VAk/MPttvmkP50gLgonj9LOpChqaFaVi9B4OkjXEof74R+Fj9P5S/3dfme0XXzagnpj+HH+Bd\nkUHfiQYzrxPvf4WI1OnbUjAO1SqmxusJ2f/ywmEC6uLbBNi
w9f8dbO/2qJDZGBWkn80S4t7Abd1Y\nF32bbf8KeBD4h5btuBzVdma1fq9wHXlQtiOqaRwdr/dCTbYHxeuJ2WdTy3JWQwHq4yh4mB/H/3DU\n9XUYKtzfqDOEEUpbeQbxLVTI/T0KGubHdj1G1iKW71/2eyJqyVkEPJZ9bgEqkI5s9/0u7kfr8d4P\n3Tm5Ufbed+NY1zXUY2jTQoRaGNZFQf8qqKXhCeA0+naJ3AC8c4jbORV1oX8YdRO+gYKk+1AN/HBU\nM38ptn+/bh4vFLyOQ11kN5MVlKib4R6GWJlBhdp01Ir4eBznO1EBcx7q8ppeH9tu7E+b8z0RBW5f\np2lN+RoKBNdutz4UOOyOgq5rUSVrfvxvJ9T6NqpBE33z6/WAf0Zdzl+Nc/gRdMPLccBm/RybPVEg\neTLq9r0UdfGtiPLGj9MM8Rj2eUHdoPfUaQpVOs5EFapLYr0btfleu9b4j6BgbtN4vSEaS/k6cGq8\ntwwKyKYMtKxh7su4SEsfj+P3FvDpLL3/gH4C8kGXPZoJakn+YfHmwZXiIj8/e28j4JqWz9WF2JWo\nKbSOxBcLnDrNuPrZ7k1Q8/jx8Xo5VDO4HbgM1WiKx0kUrvPSuCgnoybcvAVuV6LbqNNzEO/tggqB\nY+L1QjToeMtBlrVsZFj/irpbf4YKmasiw3sIjSvZgS53XRTs5wQ05ujz8Xpn4Do0xcB4lCmf3s93\n67S0Rvw+AhVWH6AJ2veJ/V2lR2mu3oZtY71zIy1cBHwatTqui7okNh9gOXnQu1yk1yvi2lsP1dQ/\nRgctpDTdRfeh7pC6S2hLNBavrtEuDyyd718Xjs884OeRxhJq9b0ZTZmwASqIt+1gPavGOm4Hts/e\nf5bo6uvyed8XtQJ8JEt/j6G5dLZGlcrFuv9j3yehoPkhlEcsg7rXv45anl4FDu72Ng/zvC0APhl/\nzwX+DvU27I8qhTcTQwf62c/bifGTcY4WoYBw1S5u6zIoqPnrSGMnolbnS+L4Lo9aUa8D1h1kWTfE\ntXoeCsCXQ62/t6KuuRXjcw8Ct3ZxH/IW9w1Ra+PSqKJ8Aeoi/WzH6xnNRLWk/tC39n826krbKRLP\n66hQm4BaBa4bYDlXRgKeEq8XqzXRpe6f7AJ/P2oW/zzNQOzpqMXiA8BO+ee7sN4PofEzR2bvPQZ8\nu79tHOL+jIvM9+LIaJZDrRh30wRO/Y41Qi1MH46/Z0Qm8CkUQOyB5vp4Bt3B8jKwfDePzwDbdQd9\nuwI/SQRN8XpXNP5sJbLxPf0sayEKijZAQdYH0cDbo2gCp1nd3oeWbai39xZUoLyTJhB5CbU4HDTA\n9yfFNbYFqlkeHq8vQV0LK6HA60VU+xz2dRPrWhDbe3X2/iGoEO9FYLkvKlyfR63Wu8b774/3fgLs\n26V1fQD5Lh4iAAAgAElEQVS1aqyGgs2fEC3ZXdyfNSKfOTPO0QNoPNYpcT39jKz7sZ9lrBJp9N5I\nP9NQhe9+mtaqER/DFOutW9F3RYHITtn/1kFj3z5YuKyzUGt/PU5u7Uh7Zw92bQ/jnHwMdYN/BHWZ\n/i+U501Dwexg5+QBYFH8PR61ys5Hwys2Q+Ngv4daNO/KvtetXovxqJt5blynFwCfi/99GLU4DXuY\nR1U5aOrZD03z4GUo4v5hvH9sXESP03ckf91EvnTLcv4HauFo1+JUB2czgD2GuZ11cLFy9t6CyIgO\no8uFJYu3wM2NDOF6sq4SFICc04X1fR2Nfdke+DXNmKYDInPYMT9nbb6/KeqeOiVez0atbRdln1kV\neA+w+wilrVkokHirTkNoAO1NZC0NaDzTWoMsawNUQO2QvTcBtZ7dHOl1XLtj08X9mYu6BXaO18eh\nloI6QF+eJoBvm8mhFqDD0HiFfyBaStAdZZejVsGViQGtw9jG+jqZQwzCRl3Vj2dp4x0oEFhxqMsf\nZN2roW7k7WL970XjX3a
J/0+jSwOAY3nvQt1zj6JujH6D1WEew7loPOCx8XoVlEfeE+d6IpEf0dKq\njqY6+AXNnZQr07Q47YhaTBbrQhqpn9iXOq2ORy1pH8xe12MiN0LTBcylfS/CRqjldRKqCNyMbsSZ\ngoKuZ8hu1Ohge1vz49XjWrkPtdLsEOttLZfadZdORZN01kHKSSjo+izqNdk93l+HGDsXr7uWt6Cx\ninnefDxq+ZqFegVO7Xgdo5W4lvQfdIfbJ+Lvb9F30NzywCp5oqG5jT0fzPhQJOBrIqN4O3CiCZiW\njeXv3MG27o3GMF2Kahr1nUl3oprscsNddst68ha4M2K9a6O++UtQd8yQxpgMsr5pKDMeh7pmTov3\n6zFlA3X1rE/TwrQ56jas++Jno0J9sabedplJj9LXdpEJ/Az4Urx3Jmq6vwU1fT9QsJz5wJ31ttNM\nXzEuzn3PCqBYx4RIc6+gsUB1jfEYFCgsLFlO/F4dBRNfQa1MdQH0jjozpYOaOSo8XkC3+F+IWq4W\notbg59C4qf27eHzyQO2r2fvjY3/eICoBw1z+YtNOZP9bI/Zvo26k62xf9kR52StxLFfI9vFSlOfl\nY1ymE91BRIUAtSS9RhOArIVa275KtMaMxk+k5Q+jsaD1dXQBMS0GTaBXz6M0tZ/lLEATIH8VtSpO\nQuXJItRV+gYKPE4mm/ZkGNub58f70FQ0lkctQndSWGnOrsFZqBXsZRTY1cHv4WgesZXapYsO9uH0\nOF4roIrwz8mmQEBj3q5D+fVdrds7rHWOVgJb0n5YPGLfB3XdfAP4i3hvGoq+l80TDc0gy2PiwjgM\ntYJcnn2unidpWvbejFj+kG//zjKxHVCX4XqxjlfQOJCJqLvhi8CcLh6ncahwuQZlkk+jZttZqAvt\n82R95kO5qGiZ34p4lAQafHl09v4dZHPmtK6D5q6Lf0VdFRNYPHCaFRfo50Ywje2Tbd9KKKBeIdJA\nfUfj5mg8wvHZ9/o9hqiV7Edk49RQMPvRHu5HnfbySsCHInPL7wg7lqwlcJBl7YYyzTmo4LqauGsQ\ntUItBNbvYJs3QWOYpqOxEi9F+pqAKhiPEAH/UNLsIPuUD1p/kuyOKxTQ3okCxBWGsY55NJWC8a3r\nbvP5jocBoGDhCRSMTUQB0iVEwYxajdZt+c5akUdcgioI9Q0vd0e6nYhagx/s5Px2MW1PQpWqes62\n/VGFdL1IK5uhnoO2c5yhytqtwFbxehFq2a6nLtkKtTzuGcdjWPtM3+ELX0f58QPo0S6gsuXySHfL\nDrKsOviqg9iZqEv3rpbPfY3uliV15XCHSAdz0M0eXyW7CxENyRjStAIDrne0E9mS8EPfu8AuQc2p\nk9AdYfkkcQ8DV7V8d0M0tqmujRyHIvRH2qxnEU0XxjQU4AxYqLRZxvKowK9rPYeiAmFPFI3vj5rl\nr0SFzZAz5EHWfzoafzMeFTy3olriRigQOHSYy60Dz4k0rVjj0diYF1EBvFwc67sLlvd+NOnatShw\nGkcTOOVddT2dryjbnk+h7rhr0ViDyWi8z/Xx/+8QLU7t0mY/y6wzzqPRYNPT0RiMH9Kl8TEDrLOe\nyPA8FJwnVKG4mmxuqfw7AyzzQNSNtFe8noHmMLsaVQBeQbMtD2U7VwauyF5vE8f76EhPa8f760fa\n6HY39p4oWLgRjV2ZEdflV2IbfoHGOX2BYbQEoyDyl2Td8i3/rwvCroyZiWvlA+jxNbvEe3NQAXdN\nu+OXpZWT0Lxq57T8/x7UWvXfgAN7kV6HsH91fjoTjd/5LBrPMxVVYp5CFdDv06a7E+VV01GFrs98\ne6hr7h9o8rhVUO/CJsPc1g3q44t6N85CPRYvojLgq/H/5Rmk+4++rVVHo3Gek2N5fwtcEv+/m5Yg\nqsPjfQnZXb7Z+6ugFribaDM1Cl3oChy1RLak/UQCfBp1mdQJaXMUqT+Bxj7c3PL5eZFI3
0/cRh3/\ney+ah2V/2jQ3x3fXYoiTJcb6vhMXcN1VtVQk8ttpmuLvRk3gHQ/+ZPEWuPo5QF8BPhXvPYdqTWvk\n+ziEdbwH1VpnxnY/jALWT6Pup20is7oduKG/dRAtB/H3hFjWA5GR1XcTboaCvdNGOH1tjmqs96Fu\npmcjg/oOUXtDjzz52DCWvQIax/L1yGy6ept8azqI7c4nMnyduP0XddVdT0sz/gDLnRnHYlWUcW8Z\n189EFExdxzC6zFBhN4/mrq5VUNDyfZruor3j+He7YvEOFAjsjYLYeyI/mIQGtX8Gta5sj7pBisdQ\nkc1JhbqOPoryk/zW+Hys5GP0E1gNYZ1ro4rSLNQ6+hhNS8qceN1fy8tqqGvqSDQW9NCW/8/MztFo\nTFyZT4WyCcpbl4308pdobOjScY1tTDZlAwqu6nFddcV7IxQknU3f1pHbiGAzXg/rqQ/oJqTv0txI\nsEWk9SdQcDoV5SNPtXxvoNbqcagV9rK4FuspaWaheb3+g6jcdes8ocrQFvH35Jb/rYMGsn+JYQaW\nA657pBPZkvqD+k4fak0YqAaxE31v4R0XGeB99D/Z4IdQIX84gzSPFm7fhijDfzfKlF8mGySMunjO\njovoe8DGXTw241DNeevsvdtomuVvpmUeoCEsO6Ha/hWo0L8i3t8ItTR8mmz8WL5NLa/XRAXVxTSt\nebuhlri6q7Ue0LkFbW4R7lG6Op6mFXJL1Cq3NypETkddhN0aqPv2gO9uZGzZcmehFqVV4/WJkR7z\niQy/nR3fNQqXuwwKbl+O838barl8k5iTJU8nw0xb99HM5n8KCqSPi+vyDXrQIodaRz+bnZPpqEUm\n71LeHnWBFFecIv2ciwKYcegu0htb9jcfK/ksQ2zJ7me9u6G8ZzlUQTuNmKIj/r9Ya1ZsywooqK67\nxPdAhfDCWObd7b47Uj8oMH+E5skC08lmyEbjwi6Mz6zZ5rv3oNbjC1DrSH2DwYaRls+mpdDv5PqM\nY7oUKlseI/LjOM53xP8SyjfPG8JyPwScFX+/SlamodaqS1q3vwvH/ul6nfmyabprN2SYN0cNuu7R\nSnB/6j8s3oKyEWrhWJ1mEOAytMybQhNMLYVqrru0vJ/X+I5CmXZRrXuAbZ2BMt17s/d+ggqAi1BL\nzNqo1vwNhtlF1rLOy2m6S55DNbBHgOfivdtjm/6KGIjcuv8F66ibxBOqxX0e1UaXi/c3QS0NnyOe\nA9bfOlAg9AoqvH+KgpJzUCC2Muo3/yrDnBCtg+N4FMpcj47XO6LA6X2oAJpBS9dGSxqq09VMBh+b\n0JOaOppzpx7Dtnxkbu0mMvweWYvrIMtcG7XqrotahK6nedxPfSfhkGb6bjledfAwlZi8Ml4fFmn7\nBrowoWC776NxK78jmzss9u/A7PWWeZou2J8tUQXiPSiwfDjS+D/Tcus7Cm6ep8OuZ7I7rlB31eVx\n7ldAgdvjKDjrd7wUCqhvpplBek/UJfcig9z+PhI/6O7TZ1F30QyyIDT+Xz9c9x1tvrs7amE/L47N\nS6h3YUb83I8Cp46fJUrfyZMvQkHH02jeuokofz6DuCGov7TZuqx4XU+H8W36zlN22EDfG851kqXn\n/eManN+a5lDAucZA+9DRdox2ovtT/CGL9onHVkTC+yLqaqub779My8MM4yKq7yi4gmZwb51Jjyee\nHxWvO20aXxPVjM9AzcVHolrGdajr6uOoCXkqqinVzzLqtCD4IGrmvZi+E3p+meZur4Po+2y04dSe\nJqJgaWPUJH5DXDR14PQOsjsS23x/Ds0jbnZEBcsiNJj4dDT/0tnx//0Y4I67LqexvDtrb/pOyLk9\nCkQ/yAA17SyDOTAyxSdR19VimXCW/qbRxUnzsuUfhgYuX4QKzXEUTGQ4wPI2jWVdQd9ujH3RHXRD\nagGKa66+q3IhKuQvRuPslkeB093Z57vawoGCgetRY
TAZTTL6N7E/u6AxW0N64G92/hegW78fbTkf\n70OVjLzrZDzqZun3gdWF694AjYu8AuWTO8XxrLu/59DPOLM4t3/Zck5vRePUEgq05uT7OJo/aJD3\nt1CF8yV0Q8/HUL67fz/XW0KtpOfQPJHgfDTFzLdQa+yudPF5lbHOb6BK5MJY9+NxrjZGXVrntaaf\nlmXkE0iui3pMNkN5y5XZ5x6hZeLmDre9tZFiFirDLs7TKipfBn1+a0fbMtoJ7k/tJ7voE4qsH48L\nZS/Ul3o3akF6ipYBx6gb6fvEADX0JPKf0xIYofk5HqfD+V5ifT+iafE5KS6aJ7PPzIl1DfmBpf2s\nczxNC9BCdMffbS2feYQooLL3hvJolM/Rd2zSwzQTgK4f/7+DmGgy+1xrjX4yamV7gmaM17tQgXkO\nKtj3paU2M4JpbTwq3Oq7v/LAaTvUWrjTIMt4F+q+moUGkf82Msfp+Xri9wzU2tZx12xsc1252Ce2\n4Q5UK78cjR87Hd308FMKWw2IcQzx98YoyL0CVQCmxHU3pDFM8b0LUKF1MDH/TVzH16BgemnUpfRo\nfsw6PEb5lAgvZun2AhSsHYxa4L4y1H3K1rEH6jJZiAKPaS3/rx8F8/bDcMnu2hvO/mRpaVsUKC+K\nY/t7sqlXBljOtrHfeWXrg2gm7VMYxS65lvO2MtHahwKnp9BA98NQ4HkNgwSfqNJ6P83cax+LdP1d\nBpl5u3Bb81nyx9N36ooVUAvZ07RM9UKb/Djb73oKl0Wxz7ujSvnnaXoU7ux021u3JdZ7d6z3ANST\ncwHKQx6P63VYvRZD2p7RTHx/qj+RsXwIuDBeL0C1wrrVaC36TtQ4DhXmL9Myrwbqsvg5ypi3QIXh\nj+nw4Yto0rR/oO+t9kuhMTJXEE2ncbG/TncKyhtRpv8KCghXo5lUco/sc99rvUiHuJ5V0fPV6knU\nniMLMFFr3vXELdX9LGPtuMhnojEkXyJaBVHN/qbIwCYPdzs7SFv5QzpvR2NAJtJMOnpMfe4Klnd4\n7N/+qPv1vZFWz0IZdZ0hLRuZTzfGsGyEArQVUEDyNZrxE7ug4OC82KdptExkOMByx8f+P529t1ks\n/05UY55asqw2y94dFXR3AWfGe5NQ10o9ncM04vlgHR6f5bO/N0FzDtUT/+2Mas8X0Ixxebsbeojr\nmYy6MOtuy9dpWtNWpKloXIdaRTp5blmdZndHQfFZdfqM9HcECvLvRAFVuy7kFWgmhtw60n6dx26A\nunNHbeLKlv3dD1VI354XCLW8PEG0TA9wjHam7zMDn0It2p/I3hty13Kb9W2ByoGlUKvsBFTR/2z2\nmQNRANT2MUv9LPcGdAPBODQtS52+5qJK5l7ZZzvuksv+/gLK1/eN9R4R789ElYJtu7XeAbdptBPf\nn8oPqnluGX9/CN2ue3r2/3ehwugjrScd1RrfoGnNSKgZsX4w4icjQXwTdVnU7w8rE0OD4F5GNfhv\ntyS8KajF6VKUOf+ALjxLDo27eRjdWnwUGnD6KGp92xMN4LwTteIMest/P+vIu6xWQEHhVaigOywy\nsh3j94AzJKPC6svx92QUoOSB006om+8cWu4u6mEaW6yGj4Kdq1CrzHjUotk6k3m7ZvT8LqmlUddx\nPfD2FjSWrA5WZsTrjqdPQBn0s6g1a0ocu2fJxoLF/36GgvfBHvHS2jq4VJzvtx9Xggrou+jnDqxB\nlp+P9dgh0ugzZJUI1ALQlbtwUKD4ZaJrCgX4vwC+1rIdV8X1OX0oaS87JivR0i2E7gRdCrUuPRDr\nrucU6sadsnUF6Qg0FvMqNCap3qbN0VipxZ6Phwrv76GW+GviGGwbaf2ZOEYLOt3GLp3DDVHLxvrx\n92tZvrE+qpzMo31rTT09Rj6lwH7EmLl4PWEo53yA7dwJVShformjbYM499dEWniAmEJlgOWcgQKV\nOqA9E+VLT2T7v
TIFs4Z3sC9nozmZ6uc5boO6Mhdruezmettuy2gnwD+Fn0hc9S2a9a2yn0EZ9zo0\ntfW9yOZ2yb47DdUmTkXdYY+S9f9mn51BFJzDPfGRCb59N1pcFK+1fGYKqil8h+4ETJsTdxhl760c\n+3t3vD4Q+H9kLW3tMpUB1pHPB3IKKnxWRIHhWyiouA8FB+dk32stdCeh4GMHNKD07YnZUKb/AE1w\nuysjVLNFAVv9UM4bWvZhXxSUfjS2c7ECHI27qaeM2Ad1xeRdmFej1rMFkdnlLaEHkd3O3IV9OQoV\nxP+ECv39UABad0tvi1qMBiyoaQrbhagb4/R4XQ/O/g5qpv8Bw3hQbbb8jdB4uKXQ9XwNur63Ri2x\nP6O7jylZCtXKPxOvV0WFW/7swJ0Y/sSF+6PuvkfiuNfjFK9BrUqvEd19cT31OzN46XGMn9NoWumm\no1btRfQN4G8ha1GJ99ZCAdOGNJMqXhR/T0FB2FbdOv4d7usc1KL+ney4boIqqfUUKv3N9D0rzvOq\ncbw2i32bioLCY7u0jXlF4PPAPxKVLFRBXAe1Lt1P33F67Spfi1CQvx1qjU6okvJzsmlXUGDb9k7w\nYe5D6ximM1CjwuE0ZeQ2KO/vaPzdkLdttBPhWP+h77w9h6HMuh4jdHkkvA1oX6tYPy6wyXGxfQ21\nOLXeYbE5BXfCFGzrTNRHvU/L+w8Ar7a8N5nmFvBOB31vSnNLcd6HPhfdcTY3Xs/P/jecQd/jUKvF\nlTRdC/XM3GcWfH951Ce+ZVxwj9G39Woq6kp4hJbbWXucxt6BWgRXRYO+63mlPpp95moU5OVN33Wh\nPxm17F2GJoqsJyl9FgVjM1EAcDWauLLjQHmQ/TkCje14EFUYpqIC9AcoE/51vh+DLGtv1F29A7rT\n6yaa7qorUQVhyPtDU9HZE/jvqKXqAdRCs2Fs5y9Ql1BHXeX9rH89NN6vvslgNRTMdvTU98hLvhlp\n/cQ4dvVNEVfGeamfv9hp10lrwXYoqhzWN8LU4z43jtfLoEpNPnB/Mmo1fp6mJWO52IcRnQttgP1s\nrXjtgwKJk2i6UDdD3Z/9PusR5VWvooBwESpL/hVNVruADh6J0+7coJuSto284Btk4+JoaeFtlxbQ\njUPtJpDcEo2D+iia0PJLnabbdukq0s8eNHfYHoMqgwtpJvrsWmWmePtGO0GO5Z+Wk3cjqtWejGpr\nda35skhAc1q+OxcVEsdm782IjPncLCPbBrWWDDigt3B756MujytaL95Y70s9Oj6roq651er3aQr0\ne4CjWr43rMwa3dGWT5tQF54roVas92frXWzQd/y+GLVy7I9q4VuhWvFUlIHPQrX8rt21UrBfM1CB\n9hpwR7y3HWpWr1tXbmWAQbSoleyS+FzeSvUA2TOkaJ711d3bcPt2Aa+OWsc+EdfKGlVzTRxI4cB6\nVMh+DQXlC1FrxHdRK9bbAeNQ9oe+D0fdAo0d2j7O+6dRt8scNObteprHdnTr0ShzacYVrYHG+ZyT\nvX6ODlo3UavH8WgG/O/V+QAab7MSMf6k0/Taku4+gvKx+agb5ZOxHeui8Y35fHCTsr/Xic/PRPnr\ne7L0eTTDmKi12z/ZedsVjfuqx63ujVpxPpJtc2t3aP3deShIXoZm8so94n87x3KGNW6t3fri73ei\nIR/1nFyHoMD0INQqdlq777Us7zJiPrrW6yyOx6ko3zo7+0635mEah3pkvobKlm+gcvhYlG8fSDbe\ntFvrLdq20U6UY/0HdQOdSDSdo+biv0ADXesWp/e0fGceGiRYj00ah6LySZFxPYKaOPdFBeVi070P\ncRvzgmAHNA7nKhafUO0R4Ec9OEYroqb/11i8X/txhtmy0XoxxwV/dfxdD/hdLs7R0gMsZ4PIQA6J\n1xejVoTfRwbyHMrcv4tqynNHKG3lmdy9KHg+gRgojGp0v47M7ov9fC9viq8Htz9
O3zl+HkVdyePp\ncrCUbw8aT3YkzYM/56Ja9VUM8+5MFMzUY/TGo+D2v1AwNqSMEgVDR6HWr0mo1v8acRMBKrzPRi16\nK9Nyp1kXjtNeqKX5VXSTwVKohekHxC32DHEAcEtaGB9p4K8iLa8S7+8R6XtWu+8NcX3TUaXsBFS5\n+G8o4HkaFdCnoDEv34t02+9dkShgeAV1++yLWhGvR5WjvyUGx4/2D+pefg0FSM+gyklCrZS30gwX\naNdas098917U8ntE9r+9UdDcUf4/wHYvREHNWagM2h+NKS16XiYKWC5oeW88zQSSrTNxdy1wQV29\n+UShtxN3faNGh650ZQ5r20Y7QY7FHxSVHxCZ6B5owPGlNF110+NCuZusRYKm8DgDeCt7/xn6Psdq\nDmqK/w0tXWnD2NZJaBD6pqiQPRIFTn9J+8Bpi07WF8vYJfv7UzStIfeivv4TUQHxMHD7MNeRd5vV\nM6jXt2bnzftfJJtNvPXCRQHsy6iGsk72/qmocNwxzseaqKbf9Wn3B9o/mm6iDVFN/Ro0Pqsu8GaQ\nDXCmbyGZP8ftbtSFvAXqNj6Xvrfnd3zeB9mfumA5G3V7fgoFBWujjPrzKOAZ6HEM9f7Mj32qW0k2\nQkHf8qj76U6GcZcfCp7noe6gZVGw/136zgm0PLqzrytjaLJ9moQCvXmxD7ejIGNKpLufM8RbzLNl\n74cqcbehYO9E1L11MGod+Qldmrkc5X1HorzlafpOKHouzbi8lWm621orP/mDiC+jeT7ZTigw+Rxd\n7KrqcH/noG7m1VFL2PdRnnM1Cpz2op+Wwbh2X45jMwl1kX8/lrM0Ciq72lWOAs98Bu4FqLX/DLIW\nwvjfgEFOfPdGFp+OYGpcg/nNEp22xLbm20cRc0Zl6fxBCp8W0NM0MdobMNZ+IpG8gAqhX0UC+SAK\ncjalKeyWpmWadlRrrP9/FZpX5FliwGfLZ6cRAVcnCS4ysb3iAvwNTXfCNmgK/z6BUxcS91ZoHFf+\nrLC8qfeEWO+NwEXZ+8Md9P0wKnSvQ83bR6JxEl+Ic3LPAMuZFp9p+4gWFAjfh4LNjufdGcL+5fOO\nfAkFGNehGlxdO/xYawaRZR5569JC1Gq2W/be2qgwupQ2MxH3YH9WR2PXVkbjWn6CxmyciwqLdSls\nvUMVgH+M7/9zvf2oBeKLaKqJekxOcZccTYVnGhq/dGFcO3NQq3A+J9CEkuUO4fjsj1onnqPpwn4n\nCnI+gwKnYd1iHsfrJdSq9y2ah62eiCpON9F0BXWlYEMDtw9HrUQ3Zf8/GLVy9dtCh8Z5XhVpfDyq\ntFxKSyWp12l2CPs8HXUjvgONB1wTtTD9LdmNFm2+NzfS/UP0nRn9BJru2HqIRif5f+u4sq1R/ph3\nmV2KyrT9svfaDfo+KfKOerzomnGuLgUWZp/r6gSS9B0Gs3Jck+uhsYY7ZZ97jr7TCozKxKajnijH\n0k9cyM9mr2+rL4zIgL6CavITWr6XUDfR79EtsnX/9AVo/o28H/+dKPNfvovbvQUqYB6n711RW6Fa\n82V0+CiWbJnLoILxZvTU8uPIbp/Nj0n295AzQRRQPIlaXXZD3QBPoO7NuWg+mPcMtI7I8B6nGUPS\n7lE19QNE++3e61FaS7F/Z6DWzL+nmWBwIarJLWzzvRXj+E+IY/RZNL5gSrz/ZByb9SM9F8+y3WGa\n2AAFnz9AhcVRqLvx0tLMDbW23Uhzp88pka7nooByO4Y4vxcap7YLas3ZDxXs26BulU9FGlkJTSFy\ncQ+OzWaoO+4E1D3+PM1t0zugVoDiwayoQMkLkk/Qdx6uOq1PyH93cX/WjfxkCmqNvw34i/jfPBQ0\nzc4+v3Scg9VRy/0ElC/dFefgfPSswFOy74yFWb63Ri2e9fHcGbg8/t4DtYZt2fKdOn/ZCgWwy6G8\n/pHsM8eiVsZxdFhRo2+wsU+k8ymoZfYFoiI
Qx/nkgnT6FirjnqSpfM9Bwz2eQEHLI8S4y26eqzge\nT6FW8h/Htb4vagi4MP53y2ini6py0JSftOVQC8r5NHd/7EfficAuQM3ea7T5/sxIVP+OuqnqwOkq\nNC6lnnb+dbrQh51doHWtYBX0WIRbacZSrRD7cBUd3lJO35rgZFRAX48ew/D3KDi7DdVCNmrdzsJ1\nfIgYJBwX/tGoNvo8anm5KC7e1odYtjb/L0tzW+qDRPcWTUEyDhWU747XPQ8sWrcz0tu58fezNA8l\nrWcYbjvZKAqI1ov0NhWN//gdCvw+g7rHXov/dfzMqiHu3z7AZfH3nqg1cNBJU2ketHk1CjCOpikQ\n/gIN8t9ymNs0HrW6PEM2Tga1HNyBxt8sjQqHjif2bFn3Rig4OK9Of6hF8SniOYC0dJkU7Mu7UYBa\np+/T4tw/Q1PAH4oKmol0XjC3XlvTiRtcUJ52KAqUvh3Xaf58vPVRC+QNqLB9EHWL13njzvH91+M4\nFR+LHqfj7dF0GXV+tgD1IvwaVVL+iX7GW8U+LaKZ8mV87Pe3Ix3+lDaVzA62dRwKlu+J83IJan1c\nF3X5Pg/c39/5zN6fiVrFjkV5yhNo7GddgZlCDPbP193F/bgbVZBnoik+6m7eLVC+csxg+zBi6WO0\nE1GVufoAACAASURBVOhY+KEJQNZCzdmfisT/bRafwfvgAZazFardfysSXV1IXwH8ZySGolutC7f7\ngMgsb6d58O/JKHC6CLWyTIrXn+pgPXl30tOo4F4Wjf2qJ7XcNN4/fJjruDky1jViPePRmJjjaAZ/\nHxHn5JMDLGc6qgn/BcrUL0CDb5eK/9fnZMfY9mE9MmIY+5fXCtdCBfWPUBfu0dnnbiKbdbpdBhHH\n/nr6PjNv7fh7TqS/Od3eh4J93AT4Q1wDv2GQwiG77uqWl2mosP8cMYN4vH/qYMsaJN2uirqwHkKt\nWXVa2ALdXXhOfn66eDzWRYXBl4lxeChwWoQCjfEMMaiJ63lmXNN7oorRX9MEZjuifKbjJ7zT9waT\n5YCZ8ffOKHhIcc4Oj+2pKzwJdWn9kL7zsi2FKldnkHXhobzjeYb4bL0up906Lc5EQXvd6nskmnZi\nM5Q3vZu+LX2rola+egzioeg5f2e3LP9U1DLf8VitOF71Q+HfS9zuj7rTTiIeuIsqTuu17uMAyz00\nzllClfDvowkkLyMLWrpxnbR+H83DNheVL2fEe/NoaYVlDHTdjurKx8IPGox5JVFDioR3M6r9XJZ9\nrnVei4RqfNegzHHZSKSfiQvplkgAdSF9PlktrAvbvR4KmA6Oi/yXNGMXDkY1vLol5YsMY7bkNvv7\nGHBV9t7UuNBuAE5q+fxQxjBdBDzcz/+ORMHNNFTA9XvXBM3cHe9DNfqj4/UNaJzNqqiQ2SrOb0/n\nK8qPXX1MUK3zuHh9CAqc3oUKlC8Bdw20jOz1IShw+ghNYfY+1LTdb2DfrX1p834doGyGxgAWtdqg\nJvjvRBrYEwUFn42f7UrWPcgx3xEF2yvGtfl5mgrGODSPTVeml8jW+U4UWGyJauiL0HQGb7d40tm0\nAnNQpeAWVPtfA90gcQ8KmDu6uSTWMRO1itRdo19ELYcLUUD0NM2YzGVYfMqVT9AE9XU3cj3g+xxa\nKj6x7DN6lW6HkBYfirzh8HhvOprK5K9o6S5H+f8bqEvrlzRTaxyCxsN2HLi22cY76Ptg3MNi2+q8\nb02Up2zTLm0OsuyJKEDaCpVrT8a5Pgc4vgf7ko/NvAUNZTkqe+9BWoK1sfAz6hswqjuvE/UQajZ9\nmuaW9NVRjf8T9HOrdGR816J+4HtQl9wqaPDrovjMvZGg8wkfuzE9/kaoif/S7L2D0bif/erty/7X\n8fPTiIkhs9d1bX3ZyFSGdQsoCsauoHnEx66o1vEQTW324Xj95f6OY2zHZcD74vUhaHzMB1CmfTlq\n/XsGdbE
e2G45PU5v95Ld7ou6Od+NBnLfSza4st12obFdJxGPO0EFWB04LRPL2qe/73e47W3nv2r5\nzLjWzzBA8Ixafe5G459ORl00B9HcaXY10Y01zG0+ABWA9cDxpVDgdD0qCP6aLkwq23J86sk4j0Et\nDjuhCs4XUGDY8eNKYj0roW6U21HgNAEFaKvl2zPMZW+A7vC7AnW9r4gqG+9G3afHozuK76GfsYBx\nXveMvx9HLYivobGhS9M8N3J8pN376NJDw4e5z1ujIGFnlPe/XdFEgdPRZHdUooryj4D3x+sbUCBb\nT3Z5AMqPuzadQFwTD2SvZ6AA+urYvnoalkcY5JFIaMzQgaiikj/S5TTgX4BnsvemZH932sL0cbI7\nqsnmp4rj/2VUiXyILj70t6tpZbQ3YNR2XIVq/uDPw1Gf8NszjUZGdzXR9NpmGeuiwOsC1Fp1F+qX\n/S1Nd8lDdOH2ZVoKIpTxPxPbWRdW74nMbEX6dgcNa/btlteroi6X/C6KSWjM1Lh22zmEdX0yjtln\nUfPwIlQg/CCObaLvQ07bDfpeBs2FdS1N8FsHTkfH62mosGl7K3QP0lifLp9IC3W6yG8OWIq+gfVi\nxxMVYr9A3SC3oS7keq6Y28ha+nq1X2hsx51xrbR9xEe2zxPanafsc6sB/wF8Ol6vhFot74zfk+ig\ndZRmluk6iHgHTUD9XhQQ7D/c5bekuzrjXw5l/GugAullmm6btVCAUzx+jkEm7kQF5glk4xi7sD8r\nxLVYTyVwMrqbsR4vNRcFGHegSsjsfpZzNcqHlqfvfDvfQBW+6+k79cDEbmz/MPd5NgrsHmzZ/vto\nZjPPr8mJKJ95hKYn4fVYxreIVjTUSv53dOGmH1Qh/E32+kwUqE2M9HwtarF9mKxy2c+yro/r7FKU\nzz6EGgjqgO/LNK2EXb2rGJVXT9O3xyLPC69D80pdmL036l1yffZhtDdg1HZcGVw9PfvEeP00fW8P\nXYvFxzQt2/J6Y1RoHYdqF9tGgiya9bhwW/OnYx9BPLgyEv8tsZ31Z1bswvrygGtvmoHx74kMbz9U\nQ3yALszDFK9PjH3ZnSaoeSfZQPz8WOTng2YyyOloIOONwKHx3iGoS+bE1nPX4/SVjwO7FTV534Ra\nBPNM4lSiYG+3f/Heu1DX3dbxetfIMM+imStm0AHXHe7PFqjV9II4TxfTMp1Blm5moEKnbWGBuq2m\noVaH/0nTvTgLdTHe30k6Rt3GE1AAc0ccq7tQzb8eH7dYq9gw1rMsCvhXzK6/KyOtfYsILFEXyioM\nocUX5Uf30rTA9hc4rYwKvhvowqSsKGi6Go07eireuwANfl635bN3kRVuLf+7FRXE4yLd1N1de8X3\nju51mh1kP/NK6AQUePyU7Plpcb0+SJspFFC+fwvqdn2emIcPdfE9TNO62bbCPYztPRh1A+6NWs+/\nRd/xSstHPrFPu33M3ruEbChEXKvjUYXzE/HeobFvPckvUU/OM/W12PK/PekboI6pgKmq/gyDJlRj\nbzt3DbqLaVr8fTB97xhLaGDa37RmFGhg3m2oEOs4aOln2/ZF86J8ADULHxsZ0s2oRtTVZ/DEsl9A\nrW3fRd1C8+K4/A0KmIb1vCGyAgsNopzZz+ceoiVoavn/KpGZfw9169S14w+jQr3OqN+LAqnVhrO9\nHR7Hz9LM9nwCCgiOQmN/vkL2kNYBlrEX6gaunyi+FAqgb6fNHGA92Id5aExYfVfm1mhA70U0D7Cu\nA6ZlUSGycz/LGo8K4Y/E6+tQbbyelXs2HUyPgbqWLkQtImujwnuX+N/maKxGx7fh0wxen4laYY+K\n15+Pc1U/WmNr1GK6+RCWPS5+zopr4O0B1v18fjkUVHdlQkhUYP6vPG2hYQf/QBY4xXX2hezc5/nl\nqpH256K76J5ErXs/QpW8I9DYtxG/E4omwN0xrsmFNLOSP0TfO7XyiXTXQ
EF9fT42jGvwGbLgCFVm\n6wl/O7178ZNx/a8d2/ktNLXNMi2f24W+gWB/aeXK7Fo7HHWZ3havb0DdyquTPYqpC8e7z/xP2bH8\nBnBd9t63GWDuq7HyM+obMKI7qwDjUXRXQD5+ZDJqpXgqLugvAte2fHfZOMlXoxap1sCprnmcT/OA\n2m7NYTEZFbYroFrri2RjF1B3VseDWekbzNxMU0j/Ct1BcjrNHTPTWr83jHV8HXVhPIVqN3WLw7Zo\n0PmiguU9hgqp02I5N6LM+vw4H/UYn67MU1WwPXnBcRwaA3Jq9t6H0BQQX6bvgM52M30vQzOlxH5o\nOov66fSTUYtTz2vrqAXoZeCb2Xtboub9z5LdPo8CpsHGUxxDTMIYry9HLU7daCXdKM77ufSd1HUf\n1IXSjS65peMc1hO8HoNal+vxWPeiCse5qLu5eJ1xjd+Cgqbl4pp7lH4CJ5ruofvy/e1w/05AweVF\nxM0k8f7FaN6sKSi//Bgx/QdN4DSOuLsMVVZOQwX6ZDS+ay4KVn7ACE310c8+7oLytQsiXXw80vR+\nqPJ8bMvnN0Bj1b6ABn/vG+/XgdOZKODfMv4/4DVQuI2LUEvX9nH8JqCWpseB92af+xrZrPb9LCuh\nQOVVdD0vQAHYligfvgRVjg5v/V6H+5DP//REHMd67NUaqCy9Et24dG8n6xqxtDPaGzBiO6oa7dfi\n79VRVN36nLSHUeR9bfZeXpjtizLMDSLBtQuc7iR7XEcXtru+tfTOSNgv1JkNyqS78tgPmmBmPAoc\nd0SB4jOoRrkLatU5j6xwG85FFRfwVTS3lh4f5+N9qJl5ATGRXL5tLcvIB7o/TNOVsDvK7F9DQcZP\n6OJEokPYv81ouqAWkd1NFsd4arv9owmYDkCB1UM0hdDe6BbgQ/PP9mDb3+7qBVaOv+uH5+YPTJ6f\npcVJaPBvvp8zacbCbEQMmo3XX6LvTPJX0sHDZNF0B8tn67om0uomkaa+StNS1mlBsDQKlK4nCi80\nfiXvFj4CtcruMNR1osJrLZqHLLcNnGiu2aXj/8sN95zH7/qhu9tl672KbLwUfVua3s4z6tcov6jH\n9ExANy/cH+f3PtSd9Swj9LiifvZ3fVQxPjhLOzcSE0CiAe/5fERz0JxH9dxBp6ExhfX5WQsFuovQ\nuMNu3L34l8Bjbd6fglqeHkQtdYvIZmUfZJlL1Z+NZeSz4L+GGgUuIcuburAf7eZ/uoimMrs6CuS+\n3pquxurPqG/AiOykamz588kuRE3N99O3efAZ+t6hNS7/HX/XGUwdOF0Ur1eLRDmlC9tbr2NDmkz5\nONRkXveV7xAXctcGmaMC835ijhF010+emJ+nZYzXMNe3Nbpz6YbsvQ+gWtzx9DMoOl7XMwyvAaya\nvf8s8UDHeL1OHKOOn+o+jP3bHdU2d0Y18gtRq8xitU/aFKaoX//7KLP+Cpp3py6MDwD+Lwpoepa5\noG7Tl1Et9Mp4byYKdh7q5zsrZX9PQoNTL0RBzAGoa+ZK1D15GFnQNNDx6GddqxOtxdm6HqQJnDaO\n6/ML8fe0oSx/gPXWecIKKDC6NbtGj0Ldc0fSMkVJwXJXpe9Yk6tRQVbvz+mo8KmfOl+3MC2HJtXd\nabj7FMvZF7X+nBJpd49Y9mkooKinLxlH34pkHjA9jlpspsU2rVlvK82s5avToyEMQ9jXY9C8XbfQ\nTBK6DWoVbDvBJs2kqAkFRo+hLsezUeVyHqrYdmXiSjTwuz7XrQ/GnYEqln9L34krB80PUOXhiLiW\nHyUeIBzXyZEo7+nKOKxsna3zP72MKn8Xx7kY8CafsfYz6hswYjva3OGyPhocujEq0F4ixpWQ3X4c\nmUBeQC/WvRMXypMos36dLtwySxPA7IYCgX9EzdyroVta/xrVBn5Mlx7Cma37rpaLcBJqPr0XteZc\n17qdhctdrF8fNYM/D3w4e+9kBp6Hq
XWG4S8RT/CO/z8FPNffMe1h2moN7KZGxvQC6pJYluZ5fP12\nSWT7cRrqoqwfjfGJOO/1uJm2dyx1uA8b0xT+u0R6nhPn5A/1uUetNg+TjdHp7/iiweO3oEHFc+Ja\nOgSNbfk74F/poJBBY5Tqh/rORpNi3klzF9AnUOtYt8f7bYG6dcahmyMW0bdycyPDKHhQXpTf6n0F\nGkRdt2ichYLYGfF6RqSxjrqCaG6CWQEFfi8S8y6hPLDP1Ctkd0jW6TauxzMi7T9N07U/6oUgTZ66\nGk3wvC8a6H18bP/6ca0t3/LdPhXmuC6vidfborx/r3g9NV9fh9v8NeCC1m1BAeiWKOgZcBLcdu9H\n2j0HVcjXQWXMA6j1px4Gsmqn29+yzoHmf/rwYPsw1n5GfQN6unNtBnzGRZ13jcwnmzunPnm0L6A/\nzuJPij4GPebhoC5u93aotrcVGptzPyqEx6FCeEcGGRxauJ7Wwv4z6DbwfD6StVCTej4odMgBU2z7\ntXGh1HNJ7ROZw0cKltPfDMPnosGSdcDxItnYmxFMawk1qdfB+VSaiee2ikzu3W2+1zYwRwXiozR3\nLj4Rmc0KwzkPg2z78igoOzT2Y724LvZGgzM3if/fhboWBxxITd+WiLXRmI9z6XuX4J6Rti9HwfmQ\nH+gcf78AvBR/z0KB0/MomHkJ2L4H53o2qrTsGefuCFQAHx3/H1LARN+u5meB57PXV8Z7dSBYB4kT\nUEG0cxf2ZwZqaTgBtc7VYzL3Qy0DeeAwC413XCXbjkdQi8tUskcCdTONdmEf90VzTt1LcwflIaii\n9UKkmX2zz+fXZb8tY5F+T4y/u/lYkQUo+H5ny/tTUMVgg+y9tuttuQ5PQWXHcigw/jRq/R2HgrAV\nUEvhLh1u96jM/zSiaWm0N6DnO6hM/tTIABa7ewqNG7m05b2BCujTaQa+/v/tnXm4JVV57n/f6Wbo\nBhroZgqGQS8CjQljbBpBQQHFZrIZFJRBjUhzc+0wSZiRCzQyCNg+jCYMAkGgUaRlanhoxgTsK4rE\nGIfEOAQCYbgavbQyfPePdxV77WLvc/ZQtWqffep9nnrO2bV3Va1Va/i+9Q3v2ihMzHOLaHgaK6Kj\niDYnRP7gf0YxP4X4m2kWPG+6elBA4/dpk+3Ty8RAw3R/KnLD/YGGu2lPtMIb1Y3G2AzD8a7es7ot\nY4/vMK90PoIU7Zi3ZxFKdY9jQbJ2bqWYfx5N2IZihE5Gro07iHb4Lrge6yGrxjxkyVgdCYyraWQg\nnozcAaPG60V12xwpRdOR4nQ1EqzxZP+uUOeOUvFpr2DeRfNG22ci4sW+Y0tyz18pattDCe7FUMfD\nQh073r6G9q7me4EHo8+XhTGyQu76fmOY1orOLUQxgFkM2o5hHviz3LUbIsX0egLVAlrETQ1lnJ9/\nTlVHVM+paGH1PmTx/AZwVfhuz9BusaLXbsG8Zu7+s9Hitu/tX8hlmCFm74uRWz/mxruVKIlprLpH\n934s9Lc/Def+HC1Yr0LZx1+iy82wWzyzMv6npP2q6gKUUim5Qi4I/9+GzMWXoJV7NilsHAbD1S2u\nH0tAnxD9NrME9EoiuT5aPcRKzA5o0o/Nr7chi1ORWXIjKPbgKuRnzjLy/hrFUxTCNYVW4qehWIcl\nyES7nIbitFUH9xiTYZiCd3Qfozwxl9WfRZ9vCJNFFsD/FaIA6Oj60RTzzyNBvEe4348oecsXZA36\nPc3C48QwER6OYjg62jQ3lPtOtLI/B61is2DZM2kwLe9KIGPt4J5juWaXELlmacSqFLWQ2QK5JY9B\nLo4tUIxWtsHyGnSnMHXiao5X5X0ziSOr0Zzw/xxklb0WWQfegebIR0Md22YaoqzNm5GykbkOP8kA\nKUxROXZH89stNPb/m47m0xvC50NQbNonEGHxqAtm5G7aBVGvFBH0HWeY3ZO1NVLwTkNW5gdCP7l2\nt
HdMTg4hC9NSFIryN0Rxn+H7Schy2u84qZz/KVmfqroApVRKE/QPULbGSeHcjDAp3YkUpvWI3EI0\nm6A7EdB9m2LRCujnyELxJRqkjtORsD0FmVC3Q2bva+iRTLLFsyfRoBFYGQnMJ6IBeyptiOs6uPce\nKLPvY9G5KcjcnAXOXxYmiu2i34y2RUdXDMOJ+tkIUpDuQ9k4J4bz14XzTxBtBZCbzEZTzE+nWXnZ\nMH99CXX5LLI2XIQC2UdQcOypaJXaUdo84iz7MVIkP0Yji21N5PZ7M7s0nNukg3t26pq9n2YLTVHu\ny12Q5WU/lLH049C/n0YWuK6U9S7q848U5GoO7bl/aMv5SAC/HwXjfiXUZwqyan+MRsZmPiZmH5Sk\ncHEYd39P5DLOnlVWP+2yztshJfCUUO/P0VgYzkALgU3RXPgJNL+MtWDOFIBNaHCU9atwtMowW0DI\nRA3tsj3NGX2tMopXj79Dyu/9SGE6DsmyzFJaNNN3cv6nyvpV1QUouOFia816SGhlqegWBsqp4fwa\n0W/zbpYkAhrFclyEVjnHokn4M0jgrBEG8E3IfLo18hNf2muHR8riRuH/lZEFISMjPIRGgG4/m4le\ngyx6i8J9v09DGTyTRuDlAqIgwA7uO3AMw+F9fTn0rZ3C5JCR2u1ECBDN+l/u2tEU82mhX5SpJGUW\nlNnIbbFD+Hwiim+JqQNWbVWH3P2yiXovYHF0fucw3s4NfS3bs7BjRYPuXLPfAhb0+W7WBy6NPl9A\nM9PytkjILgXu7+H+lbia0SJtPyIOtNB3j0VZf3MZxVWK4peup0EWuhlyxVxNQ2AOioVpU+QWzwL0\nd0MK+/+iMQe2inntdM+8QhVD3pphtgxlmJ1HbtPaVu8YKcT30Ajgn43k1WpIebqbRuB+YQoTFfE/\nVXmMMCQws0nu/rqZjZjZ51B6/lzgT8zsCy68iAI2z3X3/5td6+5v5G63GopReBnYwMwOCue/hLiK\nvoO05r7g7n9ECsZuaMV2EMpqeAAFKS5Dk/NuqGOeh/zZr3f7LDO7GilcN5jZoe6+3N2vQ4rYf7n7\nDQQLGjJpZ9dZF8+4FA3Mvd39AHd/P+JJusfMpqBJbEfEbL6Bu18ZrmvZD81sUvTxDOTm2QQpXp80\nswvRpH0GCsaf3U15u0WuPCCl9x9ds8B3kbtlazNbwd0fdfe7w3Uj4TcxXgSeMrMZwMvufpq7b4sm\nu1tRrMtqZdXF3d3M9kHCckfgfDObiwJbfwkcYWa7hvf5++ya+B5mtpqZrWRmG6G2ASlIr5rZfuGa\nh9Aqf0OkRC43M3P317oo7pbIRQkaL8uBD5nZ2WiBMz1qm8VozE/u4v55/AHY3MyuCZ8nIasYAO7+\npLvfGPr362b2V13ev5P6TA7P+k4f9cDMppjZh8LHt6PxfQuwq5ntH+bFiwj7SqLFWqv7zEKKyDqh\n/AA/Q3PhLOASM1uxRT9PjjCf/BolL5wQxt/9yBW5C/ARM1u5TR8cbVwuAlY2s2ktZEa/uB0tyP8C\nZSG+iN7r79A88yZajMNJyCr5XTSO10LjcG+0CN8Hufdfy+RkUYUOZXkOWObuLxBCPtz9u+4+BylR\nC4GtzGxquzqMJwyN0pQpTMhVshHwqrs/gywRe5nZOeF3L7j77dCsEKQW0Jmi4O5LkRVlJ6SkfRj5\n4HdCHX5jd/8dijs4zN1/0MOzLkUT/yHIFL+ema0Uvv45MMnM9kcC80J3vyS7ttPObWbzkLvz0PB5\nlXD9IYh59yp3vxGt2I6OfmetJqCcEvw+d/814pHZC/n690UuhgMRx80JwENlDcZcefaPJqqjzWxr\nd38FTVR/EsrzJtpMsEkU81wdpmR91symIXfArmiD1snA46GsC5BAfC4I1be8UzPbDK3cL0Z9aqGZ\nHR/u8yCwvZmdZmY7o778Y+BAM5vcQxuNKchQvByoP1/YpVLWhLC
4+igww8wuQwL4BTNb38xWN7ON\ngmACWZuml1CfqaPdoEt80MweQ0rD9wlcbGhemwvg7l9ECTHP5S82s5lI8D2PqBb2DgrX68CzyAW2\nICwCK4WZbYus3Y4WzU8Bd4R+txQtmh909+VtblH6uDSz95jZR8zsg2b2QQB3fzXcfwmKvdrD3f8V\n9eWrwnVvkTeREvSfaHGzMZrH1wlz0gvA7mUoTBnCu1zHzA5Gc+C24X/Qwv9aNE7WbH2HcYYizFVV\nHzRcDV+gkRWxMfKjvhetnp6hRcp3+G2SLQBok8aKrEpLEFnl0eHcKuRiBXp8N/OAN6LPC5H1ZymK\n8ZqKgvW+ThQvRQ/m01CHu6PPmStmFjIP50na2qXKDiTDMI39+BZEZTwSCaJPI0vT5aNcX9neXEiw\nL6QRIL0KcmueDjxMI6Hhw0R8ZW3u1S4m58zwPjZBFswbwjvZDilO36CH7E8SuGZbjc3wzm5GsXeP\nh/pkW//8Dxpu1K76Xor65J63K1LU7o3OzQh97AHgo6NcuzWaO7O0+mlIGfk3tLj7DyI3dNUHWhw+\ngLizpoR++VU037V0Caccl5SQYRb60WOhnPORYngTEZ9bp/fqoh6V8z9V1seqLkCfDTcp93kPJEgX\no6Djm8P/02m/43oSAU379PIsxfQmorTp0TpoD89egsy/R6JJemuUcvos8Nn8M/p5XniHS3PnNg/v\nb0y29Fx7DATDMA2l/Cs0YhrWRqbn7VHczqk0Z1XmJ5XK9+YK72ojGlt7/A0SiNnnnRG1xagZmowe\nk3MGzUHsK6G07ifpIEsy/77C/2ULsvzYvDX0u1Vp7AUXZwbFm7N2SpeQsj6xAjgdKQ7vCWMwjtF8\nJ4pxGjUrMoy9f8mdy7Za2qascddBPeOdA95BUDTR/LAYKaArhvpfR4ts4JTjkgIzzHLlfhuwKPpu\nY0LSEGEbpILfe3L+p0E6Ki9AHw0Xp3wfgNI2t0V+3M+FgTKFaJfwrJO16XilCWhGz5Y5IQzsmciM\nuTElZJ/Q2Ng2Jkecg9wr8YTeDXFlTGQWB8rejVxl2edvkCMQbdOWA8UwnH8WYnu+GgnXq8M7vZO3\nEp7mr6vUchY9fwNkDXok9OmZoQ9+Jzz3n+mAZZ6xg9i/RPPE+lkifqYuyptCkLUbm19AGUerIsVj\nKWHfvdBuvRK8ll2fvAJ4G+KpWwklxyxCysRfhL9vy10fb+G0M2F/ztDOD6Yaex3UcwZSPKeGNvpy\n6H8ZrYAhJfQuNKe3yjZLOi4pKMOMZtl3FloA/4LmrcKOR9bReQW/9+T8T4N2VF6APhtwBLkWLkFu\nk/NoUNlPpgUPExUIaMbOljk+TGpLKMC6QHtl5k6aTfSLCRQAPTyjlZn5RBpbPNwbJpmvEZGx8VYL\nzEAyDOcmpl1QrNJslIlyChJAayFrZluzMxVbzmgIwVlIWVoBCevF4Zmrhr6yL42NWkd9x5SYXUp6\nQTYW9UP2/LWBbQe5PoytAK4S2vsWlIzRUkFG7rcnkQXs7wnULGiMP17muOuirusg69KGYWzORNa6\nU7L3h5JofkoLhT3luKSEDLNwz/nAxeHznFCPw8PnG8ll3fX5vivhfxrEo/IC9NB4saXoOBpmzR9G\nHWYdpMk37ZVGRQKasVfmWcfvm/SL7pSZS3t8xmhm5thF9TDR5q60Xu0NLMMwzeSfd4dJKt7y4kZa\n8GYxYJYzFNPyDSKuJaQs3EEP23BQUkwOFSiYHYzNthbSQasPHSqA4XNGA5JfxExDC4GtwufdUSzc\n7uHzQ4SNZKs4yNFfIMvp3Uhx2gwpTpeGd30/zV6GysYlUl6viMbJmdF3T4aynssYMX9Rvc9BGJ0V\nLQAAFHRJREFUnoMdw+fp4b4/RaEYt+av6aPslfE/DeJReQG6bLys068QJp89aew1lW1psDYSuqtG\n12WNXYmAZuyV+WUUozAVpsy
M8ZyxzMzzW1zT9hkMEMNwNCkZmpCzwPx/J5B1hrb8HJEVM7pu4Cxn\nYTJ9g0hohvMnI0HRcmf33G9LicmhYgWzg7HZldWsyvrQmdt0tHE4E1FQLKMR5zYFWaouKrufdlC/\nzZDCeVGoyyQUz3MKcjuuHz4fgRYJ8V5ylY/LUKaDkYKzGDg4nL8SZTZ/kDb7FtJCGQn3WJY7Nw2Y\nHn3uq49REf/TIB+VF6CLxouF2bfDJLQjSvs+P/rdXeT2ksvdJ7mAJlG2DAUrM63agM7MzAe2ars2\n9xwYhmHkIo2Vg/nIlXUPDdLKdZBwWatV+RgAy1k0VqbRSDTYG/EP7Z377ds7uF8pMTkMhiArbGxW\nXR96UABpdrlsGcbwJSgDa5twfk8Ubzk11VhsUbd3ItfjPDSHX0/DArI6WuAsyvozIUA/GgvJx2X+\nPvSYYUZj0T+CLGh/TcMSeAfN2wcVktATrp+EFNFzwntbK7y3KUhhWsoEU5jcx4nSRLRJKApczfYM\nmox4Q74SOt+3iPbmaXGfZAKaklbmbZ5VuDIzyrMKMTNH7TcQDMMokPQOxHHy/nDuI4ie4azod29a\nCtqVjwotZzSExL4oC+ybNBSdOSj1vCX1xmj9mBJicqjO8luW1axShZkeFUCUAXpk+P8CZG06DG1F\ndSayslZGKxD62umEGB3kFnoaxdIchxJppiDZcBuwaZv7JBuXNCsvPWeY0VCYDO0VeDnKiruKhiv2\nLuCpgt95PO5PRvIknpcPZQIqTO7jQGkKgz0T0pshU+bPCGnRyE23DVr5HpDvbNHnZAKaklbmYzyz\nMGWmg2f1bGaO7jELKXL3ZpMWWtnshxSVmwib3iboY5eGfrAXckWcS0MRvQwp2QuQAvJ3Y9yrcstZ\neP/LUJzHbYh9OtsceV/gFTRRjxXwXXpMDokVzLLHZlX1Cf93rQCiUIfLgN+G97AP2oj2IKRMfYKQ\nIFDlQSMmczJSDq8PZb0ijMvMKnICIc41d33KBXPhGWZIufpq1Hd3QlxU7wznTirhnSfnfxoPR+UF\nGKPRLqV5w9MRtIJbgEzIW7Rr7NznZAKaElfmHTy7b2WmzX0LJTIL1z2OrGI7hve0f/huhzA4Syer\nDM/Lk39ujRSNGeHzqshydxxR+m6rCZaKLWc0VqXHotiDfcKk93kUIHpY+H7tUe6RLCaHxApm2WOz\n4vp0rQAiCoq1URbaMpTY8JEwNu+jBI6fLuv3p8g6ujnNCRg7Rf9viBJfMnfcsWguil3mScYlBWaY\n5e4zH2U8/ormDc5vBD7d7roC+lRS/qfxclRegFEaLy/MjiPELiE+ptNRVseo8QYkFNAkWJnnnlc6\nK2uLSaAvIjMGkGEYKQJLsjYMZXgYMUB/Omu76PetFKZKLGc0ExmuF51fAynOGdP33cjyEPN0VUb/\nQGIFs+yxWXF9ulYAkTvrbKRw7ITimS4I5T4xjIFNiu6vXdRvJrKQfg0pewe2+d0HkOtovfAudqNZ\nMUwyLikww4y3Lvr3CG34BaSQzw7nb6NYWoHK+J/G01F5AcZoxCXI9DoP8TBtEn23FTIbHjrK9aUL\naBKuzHPPLZ2VlRLMzOG6gWMYRkJlGVKW5oXJ92OIz+aIMa6txHJGe5b51cLEdz2KR9ghfD97jPsl\nickhnSBLMjZT1adFvfpSAFEA9W5oI+3zwjiemfWFovtrF/VbF8VnHh4+H4woZZqywpAF+ClgTpv7\nJBmXlJRhhjwERyPF6ybkeTkIxZjdAdxSwrtPyv80Ho/KC9BBIzYxWRNtxdHJwKZEAU11vE+lsrJS\ngpmZAWIYpj35583AM7nfrjrGvSqxnDE6y/znkYt2D2Qt+xG5rLlR7ltqTA7pBFmSsZmwPqUpgMh9\ndzFhQ9r4eVUcoe9+imaX3CJgs1z7LsrGb779Uo1LCswwo3nOnY6U8F+hHS+OQptRz0JWtU2j3
/a9\nGKcxTyfjfxqvR+UFyDVcO2H2bSJBTW7VFjV4UgFN4mwZErCyUgKRGQPEMEx78s81w/dLiFJ4O2k/\nKrCcMTaRYawQbNhJH6TkmBwSKpgpxmaq+pBAAQz3eTchNqrKI/TnzFqazUP3Ze+SEE8DrDJafcse\nl1BOhhlSkjZGsWaPIjfqQcBLyOK0evTbQmKYcudK538az0flBYgaYSwm63toIcxa3CepgCZRtgwJ\nWFkpwczMADEM0zn555O0yUZhQCxnjE1keFE3/Y50wbLJFMwUYzNFfahgcVZGnx3jmXFsXpydlc03\ni1CowZYoVGPjVmVOPS4pOMMstO//DuXeCxkQ5ofxeTIRJ2ERZY/qkIz/abwflRcgNELfwix8n1RA\nkyhbhgSsrJRAZMaAMQzTA/lnm/tUbjmjQCZrSorJoUIFs4yxWXF9BoY1v4S6tYrNO55mi8ppwFeR\nK3Rum/skGZdQboZZuO9sFNd1UxjTq+d+06+yXwn/0zAclRcgNE7fwozEApp0K/PSWVkp0MxM84pk\nIBiGKZD8kwGxnFEQkzUlx+RQgYJZ5tisqD6Vc3+VdTB6bN6xNNzml6NYm4x4tpI980iYYRbmrOOA\n3wCnReeLzMJMzv803o8RKoQJG6FV2+tm9iEU2HwEsJ6ZnYsUoWfz18V/A1YA/huZSg8xs23c/RU0\n2Uw3s6lmVkh9zWwWCp5cBykGIMLN76BV+yVmtqKHXtfHcya5++vAfwK/R6uY89GE+QrwApoUXot+\n2zXc/fXwbh5BgX+3An8M9Vjb3a/v9Bnu7ma2vZkd6e4/QAJsMlrNXGdmZyIryM3u/v/c/Y1eytwN\nQjs8h/z0LxA24nX377r7HKRELQS2NbOpuevehJnNRBxh70BWPlDMwUtIWcHdd3b3J8qoh5lNij6e\ngfrfJsBngU+a2YVIOTgDtd/s3BjJ329rwubN4b08jRYpF5jZVcgl8i13f7rH8k5Di6BPufseaCU7\n08x2d/e5wB/MbPte7j3KM0sbmxXVZzJwICJsPAa5an4JfNHM1g2/sRTjqCTshza9vsbMRszsAuAM\ndz8TZQTOC7+7AinBS6F5bKYcl2GuNBRPumoYG0cBB5vZ4eFn2yCi4Sv6fNYv0E4FW7j7WdH5nuVK\nPB+Y2XwUC7uHmW3n7m+4+6NoQf7e8Kxz89dNdFSqNPUrzKoQ0GGALgSeR77nvc1s/6BMPIsG6wJ3\n/2O/zypSmWlTl0nh7whij37W3Y9394VI+C4Hzjez9bPydHDPFZCV4wIz+3go+4dDuY9A6c0fd/e7\nuy1vP3D35cA6ZnYw8ATqUweHr5chy9dSRNvwJqpSzPPI2jgIlve5+68Rf81eqO32Ra6NAxEp4AnA\nQ6NNsO7+feBfkCDG3X/r7t8EdkWWmr16bacqFMwyx2ZF9UmyOKsYW6LsTlAA8nLgQ2Z2NrJQrxW+\n+yd3fxjeXGwnH5fRM7OyLQqfH0eL+1PN7Hbkyr42d01PcPfX3P0/wr36qoOZjeT6yk/QVjl/h97b\n7HB+ZWTVi8sxnvtYoahUaYLehRmkF9Blr8yj5xSuzLR6RrRqOhMpZe82s0PCPf8dcaPMRO6BTu65\nAYpHuxD4MXLDjQD/Bfwl8Ct3v9Hd/6Hb8vaCFhPWWcgysy7iP/mMmd2MLHa3ojTnpmsGwXIWK0zI\nPbND+OoW5No4ErkY5wEfRS7pw9z9p7n7ZBbamWa2s5mt5u4fAP7VzB6M6vxzd3/E3b/XZTkrUzDL\nGJsV1yfZ4qxivAg8ZWYzgJfd/TR33xbF9CwCVjazafEc5wGpxmU2H0eL9VNQHNsl4fNLQcZshxi6\nDwzX5ZWUvtDv3JJdb2ZXmtnRqA+9C2VYPwF83czuAF7PlL4aLeAV+AQpgMmairYAIF0aawqfeWFE\nZgwYw3Dcx+hvw8xK9+aiICLD6H6lxuRQ4eavZYzNKurDA
LLml3XQ+ybDScYlQ5BhlitXUv6nYTyq\nbsCehBmJBHRWVtKnsZbKyhrVq1AiMwaEYTjXx3om/6Q6xbwUIkNKDpYloYKZYmymrE+LZw8ca36B\ndet3k+Ek45IhyzAjIf/TMB+pG60wJmsSCWgSZstQkjIT/TYJkRkVMQxTLJN5JZYzSiIypOTsUipQ\nMMscmynrQ0WLsyoOmi2nvWwynHxcMgQZZiTkfxr2I1WDlcZkTYkCmsRprLlzhSozJDYzk5hhmHKY\nzJNbziiQyDDXjqXRP1CNICttbFZUn8q5v8o+aFaYut5kOLpPqeMyN27mo9jBXwHbRedvRPFLLa8b\nxIME/E8T4UjRUCmYrAsX0CTifSKBMkPFZuayByIlk3+S2HJGgUSGJIrJKVuQ5Z5V+thMXJ+B4P4q\n86Dg2Lzw+8LHJbmFA9q/8dTQty4mbHoN3MY43biWBPxPw3xkZuFSYGb7oyy2T7n7syGl8Wy0qvoM\nMhHu7X3yDOWead5jpeJrzWxL5E+fglIwr3T375nZniit+38Cy72/rIwRd38jZOk8hvz4KwOvAj90\n9y+b2V3IVbNVr8+JnrcdCh4/ImT6vAcJ4/Pc/admdpIHXo7xgpDZsi5y926IlKQX0Xuchybgvrms\nAuXFu9CG0Q8XUvjWz9kHJUU8igg3X0TuuOej34x00u9CdumXgUPQu/gdmvhPQcGfmwA/9wKzGc1s\nUxRc+nHgF+4+q4ixnXpsRs8tpT7R/Weivrs9cIy7P2pmU5AFfpq7H1vEc6pAlPU5OYy/ERRe8ChS\nem8Hlrj7hZ326RbPKGVcmtmViArhbxET+UsoO/uLaMGx3N0/WtTzUsPE/7WuBzqDGl2gLG2MBEzW\nJZW7imyZUnzmDKmZOe5j0TsrbMPMbt5rwfUpjMmaioLYw7NLcc1WMTbLqE9uXA4Ea34J76z0TYZH\ne699tsmEyjAbhjqkPEol4qNkJuuikYr3KeZ/sZJYWScCkZmXTP45ynMLfz9FEhkGS8WRKAtpfcSR\n9QxaOd+OLFhT296gT7h4cZa5+8P9kvtlSM3JFqPo+rhXz/2VAFMRr1XGXP46UnQvRkr7t9z9EujP\nOxCjn3tk15rZAchdehSietkaeBm5a48BXnH3n0TlHo9t04RhqENKlKI0VSXM+kEqYsZUyowPMZGZ\nJSD/TImiiQzDwuQCZLW6GqXGr45WlF9EGak/K74mLcvStzAcJNLUguozkKz5RcLdfwl8G1gRWdJm\nuPsjiHz15jBWC1OYikBQ7rZECusWaCw+j0g2L0T97DfZ7wel3DXSolClabwKs5Qr87KVmZwVazqy\n8h2HAhofRpxP/4aCpY/34Jcvks24TFgJTOZVogwmawB3/42734/Si1dE/GfXha/HTRxD1VazojFI\nCmCZCLF556O2ehuw0MzWcfdrI4WpUMbsfuHuz6GA77PQ3PIpNIes4u4L3P0E6H9rlBrjG0XS/o9b\nYZZiZZ5KmRl2M7Mn3DAzBbzE/d/C/X6C4kj2QhlLA7NY6QSDZDXrF8OmALaDjeNNhkN4xONoI+H/\ng2LZ5mffD5JlrEY1KDR7LhJmb3f3Y8xsDsoMucXdrzOzG4H7BtkdlCBb5gA0GEfQNgEPAv+ErAv3\nouy234Tf9jRAw8T0VyjA/HJk9t88POMEYI1s1TSekL0PMzsHOAl4r7s/FpTQ7ZGL64fAq97Y/2mg\nJrmoDjNR/NKT7v7fZnYnWtHukuL5ZT6jLJQ9NlPAzFZHgvgyxMG0JmL+/5GZbRjcWuMWITYvy9K8\n090XBg/Evsjd9TTa2WDg98yzOsOsRgsUojQNgzCLYeWlsSZTZoKFahbyz/8Euf/2j33yg9wGMVoJ\nRjNbDKzn7u+Ozk1D2XIvhc89pTGXDTObi2gFniekMrv75Wb2TcQ1NXvUG0xglDU2U2MYFMA8wkLg\nGmTV3Ay5uq5w99vMb
AcUx3Vpt67mQcCgziU10qMvpWnYhFkrFK1YpFZmzGwjlDZ7OnChu59V9DPK\nhDW4rEbQVguvoq0lngqxX6u6+wfCb2Mun4GsXxgLX0WB3U+Z2e7A3sBid7/PzB4CTnD3Jyot6DjA\noLZxpxgWBRDejM27Czjb3S8L/XxXxG5/P4rd+kw/ruYaNQYBPStNwybMUiOlMjNezcyWmPyzbNgQ\nExnW6A/DMC+a2QOI/2vz6Nzb0aa8v3P371VWuBo1CkJPStOwCbOqUIUyM56sfBlsHDOZ5xYMyZis\na9QoE1XH5tWoURV6yp6LJvVtkZJ0FMoE+Rrw52b2TnefA3y9mGIOJ9z9tUxhskQp/+NBIAdlPPu/\nFPLPVAiCZdiJDGtMMIR+PRcxzZ8EXGlmR7n7nsDLZvZ4tSWsUaMcdCWoh0mYDRpqYSnYkDGZ2wQg\nMqwx8RBilg5C+4rugQLAZ5rZ7u4+F/iDmW1faSFr1CgBHStNwybMagwmfIiYzG2CEBnWmFgILrkF\naF/D1cLpR1Em6IcB3H3nOpmhxjCiY6VpmIRZjcFDzoo57pnMbYIQGdaYGMh5C1ZA+8o9hhbM27hI\nSJcB081s6qCOyxo1+sWYHXvYhFmNwUQULD0UTOY+REzWNWrUsXk1aghjKjbDJsxqDC5syDbM9CHZ\n/61GjTo2r0YNoSPKARvSbTlqDB5Sk3+mwjARGdaYWAixectR/NLNaFzeCpyI3HSHu/sz1ZWwRo10\n6JinaViFWY3BREryz9QYhjrUmBgIsXmnABsgJvvfAocCfwvMRXuN7ly7mmtMFHRNbjnMwqzGYGG8\nMpnXqDFMsCHfZLhGjW7QKyN4LcxqJMV4ZDKvUWOYYEO4yXCNGt2irw17oRZmNWrUqDFRUMfm1Zjo\n6FtpqlGjRo0aEw91SEaNiYhaaapRo0aNGjVq1OgANQFljRo1atSoUaNGB6iVpho1atSoUaNGjQ5Q\nK001atSoUaNGjRodoFaaatSoUaNGjRo1OkCtNNWoUaNGjRo1anSAWmmqUaNGjRo1atToAP8fffjV\nLcvFs0UAAAAASUVORK5CYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x7f444c9e2240>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "local_1.plot_lr_weights([\"@\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see features for specific users such as \"word=@justinbieber\" corresponding to the feature function $f_{\\text{word},\\text{\"@justinbieber\"}}$. This is all the model can do right now: remembering words from the training set. This works for \"@justinbieber\", but for less popular twitter accounts that are unlikely to appear in the training set this means we cannot correctly classify them at test time. In other words, the model does not generalise well. \n", "\n", "In this particular case the problem can be easily solved. We notice that each of the \"@\" words starts (obviously) with a \"@\" symbol. 
If we turn this observation into a feature, it should be simple to learn the class. We can do this by introducing a new feature `first_at`, which returns `True` if $x_i \\text{ starts with @}$ and `False` otherwise. Internally this is again transformed into a proper real-valued feature template." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.6879535558780842" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# feat_2 keeps all features of feat_1 and adds a boolean that fires when the token starts with '@'\n", "def feat_2(x,i):\n", " return {\n", " **feat_1(x,i),\n", " 'first_at':x[i][0:1] == '@'\n", " }\n", "local_2 = seq.LocalSequenceLabeler(feat_2, train)\n", "seq.accuracy(dev, local_2.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This looks much better. Of course, this particular aspect of the problem is so deterministic that we could simply preprocess all words starting with @ and label them as '@' right away. But it is important to know how easily such observations can be incorporated into the probabilistic model. \n", "\n", "To confirm that these results actually stem from improved '@' prediction, let us look at the confusion matrix again."
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAfEAAAGoCAYAAABWs9xCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu8JVV54P3fc/rGpUG6OU0Pl0ZQEUWjCEfwNq8CUUCN\nYBywveLrpUfFcURJBJOJmMjIO5qYN14wPfGCGkXilZkkXoIy0WiUhiEgIIIgAkHoBkyEQEN3P/NH\n1ZHtmT7n7NpV1Wfv2r9vf+rTe9epWvXsffbZT61Vq9aKzESSJI2eiYUOQJIkDcYkLknSiDKJS5I0\nokzikiSNKJO4JEkjyiQuSdKIMolLkjSiTOKSJI0ok7gkSSNqcdsHWLFyMvdds3+tMpYt9lxDw6vu\nmIfRSBTS8Lrssks3ZeaqhY4DYNHuD8/ccl8jZeV9G7+Wmcc1UtiAWk/i+67Znwv+9tu1ynjEXrs2\nFI3UvLpDF0eYxtVtOy+JmxY6hmm55T6WHXxyI2Xdf/mHJhspqIbWk7gkScMjILrTumsSlySNjwA6\n1PrVndMRSZLGTK2aeEScBdyTme9rJhxJklpmc7okSSPK5nRJkrTQWqmJR8Q6YB3A3vuuaeMQkiQN\noFu901t5JZm5PjOnMnNq5Z4LfhudJEkPiWhm6etQ8dOIuDIiLo+IDeW6lRHxjYi4rvx/Rc/2Z0bE\n9RFxbUQcO1/5tZJ4Zp5lpzZJkuZ0VGYemplT5fMzgIsy8yDgovI5EXEIsBZ4HHAc8OGIWDRXwd1p\nU5AkaT5B0ZzexDK4E4DzysfnASf2rD8/Mzdn5o3A9cARcxVUK4qIeH1EvLJOGZIk7TgNNaX338M9\ngb+LiEvL/mIAqzPztvLxz4HV5eN9gZt79r2lXDerWh3bMvMjdfaXJGmETU5f5y6tz8z1M7Z5Rmbe\nGhF7Ad+IiB/1/jAzMyIGnoDB+8QlSeOlud7pm3quc29XZt5a/n9HRHyJonn89ojYOzNvi4i9gTvK\nzW8Fem/p2q9cNyuviUuSxssOak6PiF0jYrfpx8BzgB8CFwKnlJudAnylfHwhsDYilkXEgcBBwA/m\nOoY1cUmS2rEa+FI53fBi4DOZ+dWIuAS4ICJeA9wEnAyQmVdFxAXA1cAW4NTM3DrXAVpP4osmgj12\nWVKrjG3b6s3XDDAx0Z1h9jRcnA9cGiU7brCXzLwBeOJ21t8JHDPLPmcDZ/d7DGvikqTx4VSkkiRp\nGFgTlySNlw6NnW4SlySNkW5NgDJQEo+I9wBfBx4GPDYz39NoVJIkaV6Dno4cCfwj8Ezg75sLR5Kk\nlk1EM8sQqFQTj4j3AscCBwLfAx4JHBMRn8/MP2whPkmSNItKSTwzf6e8Ef2VwFuBizPz6TO3Kwd5\nXwew7377NxGnJEn1Tc9i1hGDvJLDgH8CHgNcs70NMnN9Zk5l5tSek5N14pMkqVk7dhazVvVdE4+I\nQ4FPUAzIvgnYpVgdlwNPzcz7WolQkiRtV9818cy8PDMPBX4MHAJ8Ezg2Mw81gUuSRkN5i1kTyxCo\n2rFtFXB3Zm6LiMdk5tUtxSVJUjuGpCm8CVU7tm0Enlc+fkorEUmSpL44YpskabwMSVN4E0zikqTx\nMUQ9y5vQndMRSZLGzA6piWfN/ScaGN7u3vu31Np/151stNDw2rat7l9ZM39n0kiwOV2SpBFlc7ok\nSVpo1sQlSWPE+cQlSRpdNqdDRBwQEa9qMBZJklTBQDXxiHgD8GZgeZnI12bmz5sMTJKkxnVsKtLK\nSTwidgPeBRwHPAG4GLi32bAkSWpDt66JD/JKtlHc+r0SIDN/mpm/7N0gItZFxIaI2HDnpk0NhClJ\nkmaqnMQz817gdcB7gD+KiP
dFxC4ztlmfmVOZObXn5GRDoUqS1IDpoVfrLkNgoDaFzLwQOAn4b8Aq\n4G1NBiVJUmvGdT5xgIhYDuxZPv0lcA1l07okSdpxBumdvgT4c4pEPgn8DHhpk0FJktSaIWkKb0Ll\nJJ6ZdwPHRcQBwLMy8xMNxyRJUjvC3unTfgFc3lQgkiSpmoGHXc1Mk7gkafSMc3N65QNMBHsuX9r2\nYeZVdz7wG+6oP57NI/batXYZ0vY4F7jUv+hQEu/OhQFJksaMs5hJksZG0K2auElckjQ+olw6wuZ0\nSZJGVKWaeERMAn9FMdDL/cDRmXlPG4FJktS8GOvm9DcAf5+Z74yIfYAHWohJkqTWjHMSfwA4ACAz\n/7nxaCRJUt+qXhP/CfDbEfH6NoKRJKltEdHIMgz6TuIRsS9wJvAo4LUR8aJy/RUR8bAZ266LiA0R\nsWHjpo2NBixJUh1dSuJVmtOfDlyZmXdGxPOAiyJiNfDTzPyX3g0zcz2wHuDww6eysWglSdKvVGlO\nvwI4KiL2yczbgdOADwGfaSUySZKaFg0uQ6Dvmnhm/igifg/4WkQ8CNwOrAXOiYjLMvPHbQUpSVIT\nYpxvMcvMTwOfnrH6c82FI0mS+uWwq5KksTK2NXFJkkZdl5K4Y6dLkjSiWq+JJ7B1W727zBZNLPxZ\n0yP22rV2GStO/FDtMu7+8qm1y5Ckft1z/5aFDqFxXaqJ25wuSRofQ3R7WBNsTpckaURZE5ckjRWb\n0yVJGkFdG+zF5nRJkkaUNXFJ0ljpUk3cJC5JGi/dyeHtNKf3zie+yfnEJUlqRStJPDPXZ+ZUZk5N\nTq5q4xCSJFUXRXN6E8swsDldkjRWhiUBN6F2TTwiLoqIfZsIRpIk9a9WTTwiJoBHAXc1E44kSe3q\nUk28bnP6IcAXMvO+JoKRJKlNXRvspVYSz8wfAm9tKBZJklSBHdskSeOlOxVxk7gkaYyE18QrCWDR\nRHfesDru/vKptcs462vX1i/j2INrlyFpPCzfybpeXRGxCNgA3JqZz4+IlcDngAOAnwInZ+bd5bZn\nAq8BtgJvzsyvzVW2E6BIksbKAgz28p+Ba3qenwFclJkHAReVz4mIQ4C1wOOA44APlycAszKJS5LG\nyo5M4hGxH/A84C96Vp8AnFc+Pg84sWf9+Zm5OTNvBK4HjpirfJO4JEmDmZyeJ6Rc1m1nmz8FfhfY\n1rNudWbeVj7+ObC6fLwvcHPPdreU62blxQ5J0nhprpvWpsycmvUwEc8H7sjMSyPiWdvbJjMzInLQ\nAPpO4hGxGngHcBSwBbgMeFdm3jznjpIkDZEd2Dv96cALIuK5wE7A7hHxaeD2iNg7M2+LiL2BO8rt\nbwXW9Oy/X7luVn01p0fEI4GvAv8ATGXmYcBngS+VP5MkST0y88zM3C8zD6DosPbNzHw5cCFwSrnZ\nKcBXyscXAmsjYllEHAgcBPxgrmP0WxM/FzglM6/oCe6iiHg58Mc8dFFekqShNSTTiJ4DXBARrwFu\nAk4GyMyrIuIC4GqKFu9TM3PrXAXNm8Qj4tHAxsy8omzf/0PgBiAy80URsS0iJjNzU88+64B1AGv2\n33+gVyhJUhsWIoln5sXAxeXjO4FjZtnubODsfsvtpyb+ROAfy3vV3gkcDTwM+GH58+uAA4FfJfHM\nXA+sBzj88KmBL9hLkqTZ9ducvhWYBH6Smb8AfhERV5c/24uHLspLkjTUhqA5vTH9dGz7IXAkRU37\nkRHxsIjYH3hsRPwGsFdm3tRmkJIkNSYaWobAvDXxzLymTNoHA+8GvkVxTfxC4HTg1a1GKEmStqvf\n5vQ3An8JvB04vFx3GLBPZt7eRmCSJLVh3JrTycxrgBcAL6IY5OWfgDcAV8y1nyRJQyUWZAKU1vQ9\nYltm3gK8vsVYJElSBY6dLkkaGwEMSSW6Ea0n8fu3bOMnt99Tq4xHrl7eUDSj76xjD65dxmU3
3l1r\n/8MOXFE7hi7Ztq3eUAgTE8PxjVL3dcDwvJa6Muu/F8PS3KqZhqcpvAlORSpJ0oiyOV2SNFY6VBE3\niUuSxovN6ZIkacFVTuIRcWJEZEQ8po2AJElqTRTN6U0sw2CQmvhLgO+U/0uSNDKC4i6KJpZhUCmJ\nR8Ry4BnAa4C1rUQkSZL6UrUmfgLw1cz8MXBnRBy+vY0iYl1EbIiIDXffuWl7m0iStCDGuTn9JcD5\n5ePzmaVJPTPXZ+ZUZk6t2HOyTnySJDVqLMdOj4iVwNHAb0REAouAjIjfySaGN5IkSZVUqYn/B+BT\nmfnwzDwgM9cANwL/vp3QJElq2Bj3Tn8J8KUZ676AvdQlSSOimABlDJvTM/Oo7az7s2bDkSRJ/XLY\nVUnSGBmeWnQTTOKSpLHSoRzefhJftniCh0/u0vZhVEHd+cBf+enLasfwyZcfVruMYTEMIzc5F3iz\nulRTU7dZE5ckjZUunaSZxCVJ42OIbg9rglORSpI0oqyJS5LGxvR94l1RZdjVrcCVwBJgC/BJ4P2Z\nua2l2CRJalyHcnilmvh9mXkoQETsBXwG2B14ZxuBSZKkuQ10TTwz7wDWAW+KLrVLSJI6byyHXZ0p\nM2+IiEXAXsDtzYUkSVJ7hiT/NqKV3ukRsS4iNkTEhk0bN7ZxCEmSxt7ASTwiHgFsBe6Y+bPMXJ+Z\nU5k5NblqVZ34JElqTticTkSsAj4CfDAz64/3KEnSDlDcYrbQUTSnShLfOSIu56FbzD4F/EkrUUmS\npHlVmU98UZuBSJLUvuFpCm+CI7ZJksZKh3K4Y6dLkjSqrIlLksaKzekaSBMd+Yfhw/fJlx9Wu4x7\n799Su4xdd/LjO21iYuE/F+omb0Aabn4LSpLGR8fmEzeJS5LGRtemIrVjmyRJI8qauCRprHSpJl45\niUfEVuDKnlXnZ+Y5zYUkSVJ7OpTDB6qJ35eZhzYeiSRJqsTmdEnSWOlSc/ogHdt2jojLe5YXz9zA\n+cQlSUOpvMWsiWUYtNKcnpnrgfUAhx0+5UgBkiS1wOZ0SdLYCGcxkyRpdHUohw+UxHeOiMt7nn81\nM89oKiBJktSfykk8Mxe1EYgkSTvCRIeq4janS5LGSodyuGOnS5I0qqyJS5LGRnGPd3eq4q0n8W2Z\nbH5wW60yFi/qRoPB1m31b5lfvKgbH75dd6r/0XtgS73PFcDSxd34bEnq30Q3vkYBm9MlSRpZNqdL\nksaKzemSJI2oDuVwm9MlSRpVlZJ4RKyOiM9ExA0RcWlEfC8iXthWcJIkNSkox09v4N8w6Ls5PYqL\nCF8GzsvMl5brHg68oKXYJElqXJd6p1e5Jn408EBmfmR6RWbeBHyg8agkSdK8qjSnPw64rJ8NI2Jd\nRGyIiA13bto0WGSSJDUtiqlIm1jmP1TsFBE/iIh/ioirIuJd5fqVEfGNiLiu/H9Fzz5nRsT1EXFt\nRBw73zEG7tgWER8qA7tk5s8yc31mTmXm1J6Tk4MeQpKkxhWjttVf+rAZODoznwgcChwXEU8BzgAu\nysyDgIvK50TEIcBaikrzccCHI2LOSceqJPGrgMOmn2TmqcAxwKoKZUiSNBaycE/5dEm5JHACcF65\n/jzgxPLxCcD5mbk5M28ErgeOmOsYVZL4N4GdIuINPet2qbC/JEkLKiimIm1iASanLx2Xy7r/63gR\niyLicuAO4BuZ+X1gdWbeVm7yc2B1+Xhf4Oae3W8p182q745tmZkRcSLw/oj4XWAjcC/w9n7LkCRp\noTU42MumzJyaa4PM3AocGhF7AF+KiMfP+HlGxMATa1Qasa08c1g76MEkSRpHmfmLiPgWxbXu2yNi\n78y8LSL2pqilA9wKrOnZbb9y3awcsU2SNFZ2YO/0VWUNnIjYGXg28CPgQuCUcrNTgK+Ujy8E1kbE\nsog4EDgI+MFcx3DsdEnS2KjQs7wJewPnlT3MJ4ALMvN/
RsT3gAsi4jXATcDJAJl5VURcAFwNbAFO\nLZvjZ2USlySpBZl5BfCk7ay/k+Luru3tczZwdr/HaD2JT0SwbImt9gCLF/k+NGnp4vrv56Zfbq5d\nxuRuy2qXIQ2rLk3bOW2iQ6/Jmrgkaax0J4XbsU2SpJFlTVySNFa6dInAJC5JGhvFiG0LHUVzBmpO\nj4h75t9KkiS1yZq4JGl89DlQy6gwiUuSxkqHcng7vdMjYt30rC6bNm5s4xCSJI29VpJ4Zq7PzKnM\nnJpc5XTjkqThsaPGTt8RbE6XJI0Ne6dLkqShMGgS3yUibulZ3tpoVJIktWTsm9Mz0xq8JGkkDUf6\nbYbJWJKkEWXHNknS2IhwKlJJkkZWh3J4+0n8x7f/kt98/7drlXHx6c9sKBrp103utqx2Ga/+7OW1\n9v/YSw6tHUNm1i6jCcPS2aeuzQ9urV3GsiWLGoikniY+F135nXaVNXFJ0ljp0omJSVySNFY6lMPt\nnS5J0qiyJi5JGhtBdKp3eqWaeERkRPxxz/PTI+KsxqOSJKkNUTSnN7EMg6rN6ZuB346IyTaCkSRJ\n/auaxLcA64HTWohFkqTWdWns9EE6tn0IeFlEPGy2DSJiXURsiIgND977L4NHJ0lSwyYaWoZB5Tgy\n81+BTwJvnmOb9Zk5lZlTS3adNddLkqQaBu2d/qfAZcDHG4xFkqRWBd0a7GWgFoHMvAu4AHhNs+FI\nktSuiWhmGQZ1mvX/GLCXuiRJC6RSc3pmLu95fDuwS+MRSZLUomGpRTfBEdskSWOjGKilO1l8WHrJ\nS5KkilqviT969W5c9Nb/p+3DaAfauq3+HMWLOtSeVXc+8JM+dkntGD57yuG1y1i8yHP6acMwF3gT\nulTjbFKHvn5sTpckjZcundt46i1J0oiyJi5JGhsBnZqK1CQuSRorXWqCrjqf+H4R8ZWIuC4ifhIR\n/39ELG0rOEmSNLu+k3gU3Ry/CHw5Mw8CHg0sB85uKTZJkhpX3CtefxkGVWriRwP3Z+bHATJzK8W8\n4q+OCEdukyQNvYhgoqFlGFRJ4o8DLu1dUU5L+jPgUU0GJUmS5tdKx7aIWAesA1iz//5tHEKSpIEM\nSSW6EVVq4lcDvzYsVETsDuwPXN+7PjPXZ+ZUZk5NTq6qH6UkSQ0Z16lILwJ2iYhXAkTEIorpSD+R\nmf/WRnCSJGl2fSfxzEzghcBJEXEd8GPgfuAdLcUmSVKjpgd76UrHtqrzid8M/FZLsUiS1Lohyb+N\n6NLANZIkjRWHXZUkjY8h6pTWBJO4JGmsBN3J4jskiRd94urozhveBYu6dBo7BP7q1U+uXcaadZ+r\nXcbP/vzk2mVEzYuN27bV/a6ACT+fGiPWxCVJY6Ponb7QUTTHJC5JGitdSuL2TpckaURZE5ckjZW6\nfTeGSeUkHhFbgSvLfa8BTnHYVUnSKOjaNfFBmtPvy8xDM/PxwAPA6xuOSZIk9aFuc/q3gSc0EYgk\nSa2Lbg27OnASj4jFwPHAV7fzs4fmE1/jfOKSpOExLJOXNGGQ5vSdI+JyYAPwM+CjMzf4tfnEVzmf\nuCRJbRikJn5fZh7aeCSSJLWsax3bvMVMkjRWOtSa7mAvkiSNqso18cxc3kYgkiS1L5jo0KRaNqdL\nksZGYHO6JEmaR0SsiYhvRcTVEXFVRPzncv3KiPhGRFxX/r+iZ58zI+L6iLg2Io6d7xgmcUnS+Iii\nd3oTSx+2AG/LzEOApwCnRsQhwBnARZl5EHBR+ZzyZ2uBxwHHAR+OiEVzHaD15vRtmWx+cFutMhYv\n8lyjSZsf3Fpr/2VL5vxMjZ1t27LW/hMN3O9y47kn1S7jH39yV+0ynvqoPWvt36VmTg2vHTXYS2be\nBtxWPv5lRFwD7AucADyr3Ow84GLg7eX68zNzM3BjRFwPHAF8b7ZjmB0lSRrMZERs6FnWzbZhRBwA\nPAn4PrC6TPAAPwdW
l4/3BW7u2e2Wct2s7NgmSRobDXds25SZU/MeM2I58AXgLZn5r71ToWZmRsTA\nzXkmcUnSWNmRY6dHxBKKBP6XmfnFcvXtEbF3Zt4WEXsDd5TrbwXW9Oy+X7luVjanS5LUgiiq3B8F\nrsnMP+n50YXAKeXjU4Cv9KxfGxHLIuJA4CDgB3Mdo1JNPCK2AleW+90IvCIzf1GlDEmSFtIOrIg/\nHXgFcGU5cRjAO4BzgAsi4jXATcDJAJl5VURcAFxN0bP91Mycsydy1eb0X01+EhHnAacCZ1csQ5Kk\nBRHsuCbozPxOecjtOWaWfc6mQl6t81q+xzy95iRJUnsG6thW3nx+DNuZS7z8+TpgHcB+a/YfODhJ\nkhoVEB0akKBqTXznsl1/+r62b2xvo8xcn5lTmTm15+Rk3RglSdJ2VE3i09fEH07Rzn9q8yFJktSe\naGgZBgNdE8/MfwPeDLwtIrzXXJI0EoLiPvEmlmEwcMe2zPzfwBXAS5oLR5Ik9atSLTozl894/lvN\nhiNJUruGow7dDJvCJUljZUhawhvhsKuSJI0oa+KSpDESnbpPvPUkPhHBzksXtX0YVbBsib+PJk1M\nLPwXwuJF9RvVnvqoPRuIpJ4ufblqOO3IYVd3hC69FkmSxorN6ZKksdKlFh+TuCRprHQnhducLknS\nyBooiUfE70XEVRFxRURcHhFHNh2YJEmNK2cxa2IZBpWb0yPiqcDzgcMyc3NETAJLG49MkqSGda13\n+iDXxPcGNmXmZoDM3NRsSJIkqR+DnJB8HVgTET+OiA9HxDNnbhAR6yJiQ0Rs2LRpY/0oJUlqSJea\n0ysn8cy8BzgcWAdsBD4XEa+asc36zJzKzKnJyVWNBCpJUhO6NJ/4QLeYZeZW4GLg4oi4EjgF+ERz\nYUmSpPkM0rHtYGBbZl5XrjoUuKnRqCRJasmQtIQ3YpCa+HLgAxGxB7AFuJ6iaV2SpKFW9E7vThav\nnMQz81LgaS3EIkmSKnDYVUnSWBn35nRJkkZUEOPcnF5VMBzzLUtdtm1b1i6jidpJ3Xtn7928pXYM\nuy6zbqLx4addkjRWbE6XJGkEda13epfGgZckaaxYE5ckjY8Y4+b0iNgTuKh8+u+ArRTjpwMckZkP\nNBibJEmNG9sknpl3UgyzSkScBdyTme9rIS5JkjQPm9MlSWPF+8QlSRpBAXRp6JJWeqdHxLqI2BAR\nGzZu2jj/DpIkqbJWknhmrs/MqcycWjW5qo1DSJI0kGjo3zCwOV2SNFa61DvdwV4kSRpRA9fEM/Os\nBuOQJGmHGJam8CbYnC5JGhv2TpckSUPBmrgkaYwMT8/yJrSexDdv2cYNd9xbq4xH7LVrQ9FI3TTR\nkfbBXZdZr1DLOjYBis3pkiSNKE97JUljpUMVcZO4JGl8FL3Tu5PGbU6XJGlEVUriEXFARPxwxrqz\nIuL0ZsOSJKkd0dAyDGxOlySNl2HJwA2wOV2SpBHVSk08ItYB6wD23ndNG4eQJGkgXRrspWpNPPtZ\n3zuf+Mo9JweLTJKkFkQ0swyDqkn8TmDFjHUrgU3NhCNJkvpVKYln5j3AbRFxNEBErASOA77TQmyS\nJDVu3HunvxL4UET8Sfn8XZn5kwZjkiSpPcOSgRtQOYln5tXAUS3EIkmSKvA+cUnS2CiawrtTFTeJ\nS5LGxxD1LG+Cg71IkjSiWq+J//KBLfz9TRtrlfGIvXZtKBoBbNs22+3+/ZmY6NBpbAMe3LKt1v5L\nFtc/l677OwV/r71WPPlNtcu4+5IPNhDJwsus/9kaNl36pNucLkkaLx3K4janS5I0oqyJS5LGSNg7\nXZKkUTWWvdMj4lsRceyMdW+JiHObD0uSJM2nyjXxzwJrZ6xbW66XJGnoNTVuej+V+Yj4WETcERE/\n7Fm3MiK+ERHXlf+v6PnZmRFxfURcO7PSPJsqSfzzwPMiYml5sAOAfYBvVyhDkqSFte
NmQPkExSRh\nvc4ALsrMg4CLyudExCEUFePHlft8OCIWzXeAvpN4Zt4F/AA4vly1Frggt3MTYUSsi4gNEbHhnrvv\n7PcQkiR1Rmb+PXDXjNUnAOeVj88DTuxZf35mbs7MG4HrgSPmO0bVW8x6m9RnbUrPzPWZOZWZU8tX\n7FnxEJIktSca+gdMTldYy2VdH4dfnZm3lY9/DqwuH+8L3Nyz3S3lujlV7Z3+FeD9EXEYsEtmXlpx\nf0mSFlSDvdM3ZebUoDtnZkZErSHxKtXEM/Me4FvAx7BDmyRJVd0eEXsDlP/fUa6/FVjTs91+5bo5\nDTJi22eBJ2ISlySNoB3Xr227LgROKR+fQtHCPb1+bUQsi4gDgYMo+qHNqfJgL5n5ZTo18qwkaWzU\nzMCVDhXxWeBZFNfObwHeCZwDXBARrwFuAk4GyMyrIuIC4GpgC3BqZm6d7xiO2CZJUgsy8yWz/OiY\nWbY/Gzi7yjFM4pKkseLY6ZIkjaCgW2Ont57EJ3dZyquefEDbh1EFExMd+gQPgSWLF35G3yZ+p9sZ\nt6my6Mi3492XfHChQxgaXfmddpU1cUnSWOnSaYlJXJI0XjqUxRe+HVCSJA3Emrgkaax0qXd6pZp4\nRKyJiBsjYmX5fEX5/IA2gpMkqWkRzSzDoOrY6TcD51KMOEP5//rM/GnDcUmSpHkM0pz+fuDSiHgL\n8AzgTc2GJElSe4akEt2IQcZOfzAifgf4KvCczHxw5jblnKrrANbsv3/tICVJakyHsvigvdOPB24D\nHr+9H2bm+sycysypVZOrBg5OkiTNrnISj4hDgWcDTwFOm54XVZKkYVdMYtbMv2FQtXd6UHRse0tm\n/gx4L/C+NgKTJKlxDfVMH8ne6cDrgJ9l5jfK5x8GHhsRz2w2LEmSNJ9KHdsycz2wvuf5VuCwpoOS\nJKktQ1KJboQjtkmSxkuHsrhjp0uSNKKsiUuSxsjw9CxvQutJPIEtW7fVKmPxIhsMpLbFsHS3lVrW\npY+62VGSpBFlc7okaWwEnerXZhKXJI2ZDmVxm9MlSRpRVYddfWFEXD5j2RYRx7cVoCRJTerS2OlV\nR2z7EvCl6efllKMvA77WcFySJLWiS73TB74mHhGPBv4AeFpm1ruHTJIkVTbQNfGIWAJ8BnhbOZvZ\nzJ+vi4gNEbFh08aNdWOUJKkx0dAyDAbt2PZHwFWZ+bnt/TAz12fmVGZOTa5aNXh0kiQ1qWNTkVZu\nTo+IZwEvwtnLJElaUJWSeESsAD4OvDQzf9lOSJIktWlIqtENqFoTfz2wF3DujHGW3zNb07okScMi\nGJ6m8CZUvcXsPcB7WopFkiRV4LCrkqSx0qGKuElckjRexrY5fRC3/ev9vPvvrqtVxlnHHtxQNALI\nzFr7O+9cYmyyAAAOsUlEQVT0r9uytd5YR4sXOYWB2nHnPQ/ULmPP5UsbiERtsSYuSRorwzLueROs\nAkiSNKKsiUuSxkt3KuImcUnSeOlQDq88n/jiiPjriNgUEY9vKyhJkjS/qtfEzwV+BJwIfC4i9ms+\nJEmS2tHU5CfDcpNO383pEfFO4F8y8/Ty+WuBz0bE8zPzX9oKUJKkJnWpd3rfSTwz3zXj+feAf7+9\nbSNiHbAOYPdV+9SJT5IkzaKVW8x65xPf5WEr2jiEJEmDiYaWIWDvdEnSWBmS/NsIB3uRJGlEWROX\nJI2VYelZ3gSTuCRpjESneqfbnC5J0oiyJi5JGhtBt5rTrYlLkjSiWq+J77P7Tpx17MFtH2Yk3HP/\nltplLN+p/q8sunQaOgQWL/JcWMNpz+VLa5fxgxvuaiAStcXmdEnSWOlSPcYkLkkaK/ZOlyRJC86a\nuCRpfAzRNKJNMIlLksbGEM1d0gib0yVJGlHWxCVJ46VDVfG+k3hEnAasBR4APg58GzgB+IfM/N6M\nbdcB6wDW7L9/Y8FKklTXuPZOXw08HXgtcBTwP4
Ddge/P3DAz12fmVGZOrZpc1UigkiTp1/VdE8/M\nM8qH1wKvaCccSZLaZe90SZJGVIdyuL3TJUkaVSZxSdJ4iYaWfg4VcVxEXBsR10fEGfPvUY3N6ZKk\nsbKjeqdHxCLgQ8CzgVuASyLiwsy8uqljWBOXJKkdRwDXZ+YNmfkAcD7FrdmNsSYuSRobwQ7tnb4v\ncHPP81uAI5s8QOtJ/LLLLt2085K4aZ7NJoFNNQ5Td/8ulTEMMXSpjGGIYVjKGIYYhqWMYYhhlMp4\neM3yG3PZZZd+beclMdlQcTtFxIae5+szc31DZfel9SSemfOO9hIRGzJzatBj1N2/S2UMQwxdKmMY\nYhiWMoYhhmEpYxhi6FoZO0pmHrcDD3crsKbn+X7lusZ4TVySpHZcAhwUEQdGxFKKocsvbPIAXhOX\nJKkFmbklIt4EfA1YBHwsM69q8hjDksTrXkNo4hpEV8oYhhi6VMYwxDAsZQxDDMNSxjDE0LUyOikz\n/wb4m7bKj8xsq2xJktQir4lLGgkR4bzG0gwmcanDIroxX1NEPBe4KCL2rVlOrUuIEY3dmlRLRPjd\nLWDEk3hErFzoGDR8huULLiL2j4hdGyinTiJetIDH7i1nlxr7Hgu8D3hFZt466O83Ih4N/H5E7Dng\n/g8H3hsR+w2yf1nGURHxtEH3L8v4TeCVdcpQdyzIl11EPCIidqtZxl7AGyNiaUQcULOsofjSXygR\nsXrG89rvR40v2sMj4ik1jvsM4FUR8aQB9m2s1lq+p28D3jBoIo+I/cqEM1DSiIhnA5+KiDMi4vmD\nlAEsHXC/3jieC/zXiFgz78b/977PAT4JXA3cBZCZ2wb8Xa0AVlL8TgapACynGIFrrzK2QT7jzwRe\nMej+EXE08GXgzK60sqieHZ68ImIn4E3A2yJi+YBl7AccQFHL+G/AOwcpKyIeGxEvBU6LiJ0H2P+5\nEXF6+XjBTgQiYklEPCMizoyIF1RJghHxGOC2iHh/RLwOii/J8md9v6aIOCginhIRR0fEikG+aCPi\neIperv9WZb+e/Y8DPgBsAR42QBGLynKauGtjI8U9ovsAr66ayCPiBOCvgI8Cn4+Is8r7TPvd/zjg\nbOC7wK7AiyKi0nCPZQI9PyLeGRG/XWXfnjKeD7wHuDgzb55v+xn7HgN8EHgrxet4dXmSRmZm1c9X\nZn4f+BSwO/Cmqom8vDXoW8BHImL36b+Tir5bHp+q+/e0SLwR+EE21CvZk4ERl5k7fAEOpEi+ZwLL\nK+67HPj/gEcD7wbuAY4cMI4/AG4C3jjAvscA/0QxmP1E7/od/F4upRg84HTgLOAk4Kvl/7v1sf9+\nwHeAtwNfp6j1vADYvUIMzwMuA74EfINirOAnlT+LPss4DvgH4Dnl8xXAgRVieCZw/czPAvC4Pvef\nBH4KrCyfLx7w93EQcPD0awd+i+LE4s39ftaBo4AfA4cDe5Sf9X+kSMqL+th/JbAN+K3y+Rrgc8CJ\nFV7HccD3gVOBdwH/HXhUxffi31EkvSf3fFZ3KT9zO/Wx/5OBp5WPDwb+iOKE4Ok928z5+QKeBqyd\nse7I8vvn9+f7Gynfy+U9z3cFPgIcVT6f6ON1HENRcXk6xXffPwD7zNhmznIoZsG6Bnhq+fxK4LB+\n3oMZ5TwDOBl4ffl/pd/pPL+rZVVicWlmWZDaY2beSDE920rgHVWul2XmPRRfSH8DPIfiD/HFEXFC\nRLyk33Ii4mHAUyi+oK6OiIMj4siIOLTPGugzgS9k5lfyoZrrLsCHI+JF/caxnbiWVNh2MXAB8KXM\nfF9mngX8L+BlwKEU78+cMvMW4AfAYcBzKd7XVwN/HRFHRMRB88RwHPBfgNMy84WZ+WyKL/wLI+KJ\nmZnzvZ9ljehvgPdm5tcj4pEUJyZVeiM/CfhAFrWt6XLfC/yvKAZbmFNmbgL+E/DdsiVhS0QsrlJL\nKZu+rwW+HR
GnAv8R+GuKZLg78No+P+tPA/4sMy8F7s/MHwMvpkisZ/bxWu6iOHk4p6wx3gw8CKye\ne89fvY7p38e7M/NDFL/PpRQnOlVsLo97f9kC9w6K3+tfAufOVxPOzEsy87sRMZGZ11KcYD4IPH/6\nunKWGWQOKyia8k/qKff7wOcpWkmOnm3HiNij3O5dZcsImXkvRbP+a8vn/dSml1L8ff0uxXv5WIpW\nhReUx4m5yin/zh8DvDYzvxfF9JZ3AKvKGLJnu9nKiPLz+ZcU3w0HA4f0Efu8IuJA4MTeWLQDLeQZ\nBPBIihpKpbM34IXADcAnyufvBW6jWk1jN4qmrdMpzkivpfjCPY8+aubASynOyM+iqIk+F/gNijPc\n0yiS/JKKr2uCYqq60/rc/mDgrPLxYuBPgJ8Dr6P48vo75qiJTr/vFF8y51PUnJ5Vvrcfobj2di6w\n6yz7T9f4nl8+36nnZ+8sy3lYn69lujb/BIra/Nv63G/6NXyAIulMrz8e+ARFQrwJOKHP8o4HfgKs\nmH5fy/+PomxdmGf/o8v35D9RXBr4K+DjwJ9TnHyeCiyb57Wc2/N7Dcrad/neXExxTXbev5nytVxH\n0ST9Rfqo/c74fVxF2SJT/m18B/hTiubtyfk+32Xsb6MYreqW8vfxWoqa8CeAF1T5+yjLPKj8bP0Z\nfbbAle/DFcCLZ7zPr6M4CZ61FkzxHfUKivGuz6aoES8u/7ZeOkD8BwKfAf4H8E2Kk4TvAi+a63fa\n8zmcKP//Q+D0np+fBKyb67WU230OeGTVuOco78kUJ2d/S59/6y7NLgseQM+H4QnALn1st2v5oXkC\nRTJ/d/lF9akBjvmbFE3i55dfCidRnFS8p499VwP/L/C/KRLIVyiao78O/Ai4nBlNZn3G9BsU11IP\n62Pb/wj8Rfn4IxSXB46kaK47heLkZN95ygiKJP5HFGfpP6I8GSq/MFfMs//zKJr29iyfL+v52TeB\nqQqv/TiKBHhG+XxRz/qj5tn3GIrkP93EuARYWj7+feAlFeKYmcjfSNFUv3+f+083fS6laMo+pfyS\nuxP44Xxfdj2v5fDy+UT5evYBvsAsJ1VzfMa3AXuVz6sk8umTgA+U8b+QIlF8H/gL+rjkQnH566nl\nZ7H3s/FR4OVV/z7KfR9D0SKxqsI+zy3/Vl/cs24txcnVvJdOKC5pnElxYvu35d/b+yocv/eS22nA\nx8vHe5efj0f0Wc70CcjbgS+Wj19Wfq4e08d+X6C83FN3oTgh+Fi5HNJEmS4D/B4WOoDyw7AceAsw\n2ef202ejv0Nx/fU4imtlfzDAsXejvC4GPJ7irPij/cZRfsE9tWfdyyhOBHau8X4cTZkU59nuOcBb\nysd/xkPXyx4N/HP5hfPiPo95MEUt/r8MEO/MpLek/P8rwOMrlvVsihOJPcrnr6Jo7p/z+jjFyd1Z\nFNc6j+hZ/xKKk5q+viRnvKYrKGqd1wCHVtz/eRTXtaevsa+gqEEf0Me+va9lqmf9yRTXmPcY4LVc\nRZnIK+47fRKwesbnvq+/1VnKPAnYQI0aIRVbucp9jqVolXkHRQvBJVU+nzx0Uvnu8jO1iT76nWyn\nnIcDnx70tZdlPIGiVeRF5euYN4lSXLP+40E+B9sp65XAueXjpXXLc6nxu1joAH4VSIU/Sora40qK\n5r2vAeeUXyxrahx/b4qOWZ8H/kO/8ZRfrJ+iOKt/JUUtZdYz4n5fX5/bPZri2uUjKRL65yhq8odQ\nNNkdWX6B91X7KhPmWfTRIrKdfWcm8un3YpDEcTxF7f4NwLfpv3PavhStERcD7wf+K0UiHaiWQJGI\ntwFPHHD/48vjz3tCNstreSdFH4dzKJpPr64RywkUlysm+v18zXgdV9f98i//xt5CcUJR6eSuqYWi\n78Q5FH1hHltx3+h5vBc9JzYVy9mjfD8H6pBblrF/+dm8tsrraCrhAo+ivOxX
9fPk0uwysmOnl504\n9sjMTWUHpnMy884By5qg6GxyYmaeHcXwjpsyc95bncrOLydRnBHfRdEUf+UgcQwiIk6maNr/MkVt\n+jSKa3anAY+gOLl4Q2b+so+yHkNR+1vbz2vfzv7Hl/t/mOI64rrM/GHVcsqynk9xHfdJWWHWn/JW\nwcMoavS3UtzadN0gMZTl7TLIe9Gz/wkUJ0aHZ/VbinYGpihqkJuAv82ig9egsSzPomPoIPueQHFS\nMVX1dfSUsTNFK9O1mXn9IGUstLITWq0vzbKz5O9RzGj1zwOWsYSiD8wH63wm6oiIZZm5eSGOrYeM\nbBKHh/6gImJRZm5tsNzK5ZV/VJGZDzQVR5/HnQR+m+K++S9S3Ca1D0Wt530UCblKEqybtAZKvm3E\nMizqJM9h0pXXMQwiYnFmbqlZxpLMfLCpmDSaRjqJq1AOJHIExdn9RorBTv4V+FBmXr0A8XQi+UrS\nsDOJd0h5L+7mNlonJEnDZ6zHDO+gzT3X6wa6bilJGh3WxCVJGlHWxCVJGlEmcUmSRpRJXJKkEWUS\nlyRpRJnEJUkaUf8H+JickQMFqJoAAAAASUVORK5CYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x7f444c93de80>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "seq.plot_confusion_matrix(dev, local_2.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clearly, and unsurprisingly, we are now doing very well at predicting '@' labels. We can check that the model learned what we expected by also replotting its weight. (Exercise: figure out what happened to \"Deja_fckn_Vu\"). 
" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAk8AAAHBCAYAAABwlV9bAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3XecJVWd///XmcQkYIZhgCEOSM5hyDnOwBAkiKAkwUWi\nKCg5I0FEoqQhZxAEJChBgquiICAGQFfXHFZx1w2/Td/dtX5/vD9Fnb5zu/tWd93uBt7Px6Mf3ff2\nvVWn0jmfE+pUKooCMzMzM+vMqOFOgJmZmdk7iYMnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1\nOHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlbDmG4sdNFFFy1mzpzZjUWbmZmZNeqVV175c1EU\n0zv9fFeCp5kzZ/Lyyy93Y9FmZmZmjUop/arO591tZ2ZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMys\nBgdPZmZmZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczM\nzKwGB09mZmZmNYwZ7gQM1MyTH290eb+8aG6jyzMzM7N3J7c8mZmZmdXg4MnMzMysBgdPZmZmZjU4\neDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09mZmZm\nNTh4MjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2Zm\nZmY1OHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdP\nZmZmZjU4eDIzMzOrwcGTmZmZWQ0dBU8ppU+mlF5PKf0opXRPSml8txNmZmZmNhL1GzyllJYCPg7M\nKopiTWA0sF+3E2ZmZmY2EnXabTcGmJBSGgNMBH7fvSSZmZmZjVz9Bk9FUfwOuAT4NfAH4F+Koniq\n9XMppcNTSi+nlF5+6623mk+pmZmZ2QjQSbfdVGAPYHlgSWBSSumA1s8VRTGvKIpZRVHMmj59evMp\nNTMzMxsBOum22wH4RVEUbxVF8T/Ag8Bm3U2WmZmZ2cjUSfD0a2CTlNLElFICtgfe7G6yzMzMzEam\nTsY8vQg8ALwK/DC+M6/L6TIzMzMbkcZ08qGiKM4CzupyWszMzMxGPM8wbmZmZlaDgyczMzOzGhw8\nmZmZmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7Ma\nHDyZmZmZ1eDgyczMzKwGB09mZmZmNTh4MjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMz\nsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1OHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgycz\nMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoOD\nJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09mZmZmNTh4MjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZW\ng4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1OHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZm\nZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBk\nZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKyGjoKnlNKUlNIDKaUfp5TeTClt2u2EmZmZmY1EYzr8\n3BXAE0VR7JNSGgdM7GKa
zMzMzEasfoOnlNLCwFbAIQBFUfw/4P91N1lmZmZmI1Mn3XbLA28Bt6SU\nvpdSujGlNKn1Qymlw1NKL6eUXn7rrbcaT6iZmZnZSNBJ8DQGWB+4tiiK9YB/B05u/VBRFPOKophV\nFMWs6dOnN5xMMzMzs5Ghk+Dpt8Bvi6J4MV4/gIIpMzMzs/ecfoOnoij+AfhNSmmVeGt74I2upsrM\nzMxshOr0brtjgbviTrufAx/pXpLMzMzMRq6OgqeiKF4DZnU5LWZmZmYjnmcYNzMzM6vBwZOZmZlZ\nDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1OHgyMzMzq8HBk5mZ\nmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOrwcGT\nmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09mZmZmNTh4MjMzM6vB\nwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1OHgyMzMz\nq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZmZjU4eDIz\nMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09mZmZmNTh4\nMjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1\nOHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGjoOnlJKo1NK30spPdbNBJmZmZmN\nZHVano4D3uxWQszMzMzeCToKnlJKSwNzgRu7mxwzMzOzka3TlqfLgROBv/b2gZTS4Smll1NKL7/1\n1luNJM7MzMxspOk3eEop7Qr8qSiKV/r6XFEU84qimFUUxazp06c3lkAzMzOzkaSTlqfNgd1TSr8E\n7gW2Synd2dVUmZmZmY1Q/QZPRVGcUhTF0kVRzAT2A54tiuKArqfMzMzMbATyPE9mZmZmNYyp8+Gi\nKJ4Hnu9KSszMzMzeAdzyZGZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOr\nwcGTmZmZWQ0OnszMzMxqqDVJ5nvRzJMfb3R5v7xobqPLMzMzs6HlliczMzOzGhw8mZmZmdXg4MnM\nzMysBgdPZmZmZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDg\nyczMzKwGB09mZmZmNTh4MjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV\n4ODJzMzMrAYHT2ZmZmY1OHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGhw8mZmZ\nmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZ\nmZmZ1eDgyczMzKwGB09mZmZmNTh4MjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxoc\nPJmZmZnV4ODJzMzMrAYHT2ZmZmY1OHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOz\nGhw8mZmZmdXg4MnMzMyshn6Dp5TSMiml51JKb6SUXk8pHTcUCTMzMzMbicZ08Jn/BU4oiuLVlNKC\nwCsppaeLonijy2kzMzMzG3H6bXkqiuIPRVG8Gn//G/AmsFS3E2ZmZmY2EtUa85RSmgmsB7zY5n+H\np5ReTim9/NZbbzWTOjMzM7MRpuPgKaU0GfgS8ImiKP619f9FUcwrimJWURSzpk+f3mQazc
zMzEaM\njoKnlNJYFDjdVRTFg91NkpmZmdnI1cnddgm4CXizKIpLu58kMzMzs5Grk5anzYEDge1SSq/Fzy5d\nTpeZmZnZiNTvVAVFUXwTSEOQFjMzM7MRzzOMm5mZmdXg4MnMzMysBgdPZmZmZjU4eDIzMzOrwcGT\nmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09mZmZmNTh4MjMzM6vB\nwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1OHgyMzMz\nq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZmZjU4eDIz\nMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09mZmZmNTh4\nMjMzM6vBwZOZmZlZDQ6ezMzMzGpw8GRmZmZWg4MnMzMzsxocPJmZmZnV4ODJzMzMrAYHT2ZmZmY1\nOHgyMzMzq8HBk5mZmVkNDp7MzMzManDwZGZmZlaDgyczMzOzGhw8mZmZmdXg4MnMzMysBgdPZmZm\nZjU4eDIzMzOrwcGTmZmZWQ0OnszMzMxqcPBkZmZmVoODJzMzM7MaHDyZmZmZ1eDgyczMzKwGB09m\nZmZmNYwZ7gQYzDz58caX+cuL5ja+TDMzM3PLk5mZmVktHQVPKaU5KaWfpJR+llI6uduJMjMzMxup\n+u22SymNBq4GdgR+C3w3pfRIURRvdDtx1qymuwfbdQ0OVRfkO3U97/ZtGar1vJv22btpW4ZqPe+2\nfWbvPJ2MedoI+FlRFD8HSCndC+wBOHgyMzMbhHdqIDhU6xmpAWcqiqLvD6S0DzCnKIqPxusDgY2L\nojim5XOHA4fHy1WAnzSf3AFZFPjzu2Q976ZtGar1vJu2ZajW4215b6/H2/LeXs+7aVvqWK4oiumd\nfrixu+2KopgHzGtqeU1JKb1cFMWsd8N63k3bMlTreTdty1Ctx9vy3l6Pt+W9vZ5307Z0UycDxn8H\nLJO9XjreMzMzM3vP6SR4+i6wUkpp+ZTSOGA/4JHuJsvMzMxsZOq3264oiv9NKR0DPAmMBm4uiuL1\nrqesOUPVlTgU63k3bctQrefdtC1DtR5vy3t7Pd6W9/Z63k3b0jX9Dhg3MzMzs4pnGDczMzOrwcGT\nmXUkpZSGOw1mZiOBgycz61NKaSJAURSFAygzMwdPrk2b9SGltAhwUkppLjiA6oZyf47k/Zqlcepw\np8VGrpTSkimlhQb43SkppaWaTlMH6x3Q9feeDZ5SSgvA24XBosOdnve6lNICMRXGUK6zvGiGdL35\nut8BxgFjgS1TSjvC8AZQKaV3Y541E7RfB7qA7FxeoKE09Vh2HPMdgbNTSgs3vY5O09HQciY0sZwB\nrrv2NmTHdnTzKWpOSmlF4O+BUwbw3fHABcABKaVl+vt8U6JyuCxU11+nx+jdmBH1K07CrVNKH0sp\n7QCc+k6vUaWUpgzz+k9JKX0qpbR73Ys8pbQy8AXgoJTSEt1J4fyiQJgLXJ1SurKb62rdJ3Uv1Lrr\nSCmNid+DusaLovgH9GDwfwN2Gs4AKqU0qiiKvyZZtIlCfLiD2JTSJOCelNLswSwnjsds4NqU0vGx\n3EbEsmcBc4CHiqL4l6aW3U4WLMxKKW2cUtqgTEcDy14d+EpKacZQHftsezYBti+vzU6/G/t/J+D0\nlNIxKaUlu5XWgUoprQrcgqY0+mu81/H+LYriv4D7gVWBPVNKy3YjnbmU0jzgBuCZlNIVKaUPRVo6\nytvek8ETOrivAwcC9wD3F0Xxl5Ee2fcmpbQCcGZKadthWv9VwMZon34COD/S1Ml3VwfuAF4Dno3C\nekhEgXAu8Diwdkrp7m4EoVHo/19KaVRK6aKU0kdTSp
ullEY3GYRk67gCFaJrF0Xx1wGkd2ZK6Yj4\ne0xRFL8D7gT+GZg9HAFUFCJ/jfV9A2XUn08pbVFnGfF7lZTSRtBMgVxXHtAWRfHvwJeAhVr/V3OZ\nmwDnoX3zAeDklNLMBtI6Ogr7y4HdgT/F+1077nFe7Q5cC8wGPptS2mOwy00prQLcDNxTFMUfhurY\nx/bMAe4C/gP4v5rf3Rbt/68BpwFH1QnAui2ltBpwO3AlcCrw/mjd6+gcKc/5oiieQ8dnI7ocQKWU\nbgEWAw4CdgJ+DWyVUjo20tLvufGeDJ5ix/xr/LwGbB/vd3xSD7eWi2c88P8BO6eUthridCyDaguH\nFUXxVeAk4FDgw/21IiV1l94OXFsUxdVFUfw83p+bUtqwy+leCTgKeLgoioeLotgGTRr7haZbIaPQ\nHwU8DfwnMAs4E7W0jRlsENIS9F+LtuMnwKMppc1rLmsscBZwTUrpAuDolNJSRVH8CrgJ+Cdgl6gJ\nD0nwUQaZ8XI28BTwN8CraDxWR+d87OddUbBybErpiZTS2kNVaUopLZRSWiDOh3Wyf/0QOC+ltMwA\ng93lgZOBO4uiuAUFT+8DDov/DSSt5fk4viiK/0X7/e+BY6C7xz2lNA04GtgB+AMwAXhhMMcpWilv\nB/42nsNKUiv5llGB64ok04BPAocWRfFCze+OBnYGjgD+Bz0a7bqYvHrYK/uxX8+MNN0P/CPKI/4n\nq+z09f2yUrRWBEs/QhXaDVEQ1ngXXqxncWDPoij+Pcqd64C/BdaIVrR+vaeCp6zmuWhRFP9WFMUu\nwGHAaimlz8b/lo0WiREppbR0SmnxuHg2SCmNLYriDdSC9hdgj24HHi1+h6L2bWL//gD4DmqJ2r2f\n744BXiuK4tbs2BwNfA61KszpXrJZCNWMNkoprQ9QFMW+wBTg+qZqdlnmcTjwlaIozgHWBP4B2BrY\ntwygBrj8vFVrS+BnRVEcXRTFJcCFwM11WmeKovgfVPt7A2XW04GHU0ofQPvrQuCP6Hiv0+uCGtKy\nfaejmuKvo4XyHuAx4MSyNayfZc1CgeGOwEPoHD0VWKvbLWhRyJwMHJLUunl0SunZlNI+qAJ3GeoW\nG0jr01TUrbpnSmnloih+C3wKWBv4WNJ4klqybsA7UkqnRdr2AjZIKV1ad3k1/S/we+BgdLwPKYri\nLWDb1GGLdqvoanwEWDaltHlK6Wlgf1TZ+0TqUqt9XNd/QUHgf8SxHQ1vt/DON94ya4kpokL/E+A4\nNLRh76IofptSOhD1nAy3/wLOLIriZoCiKP6IWtY2jGu3SGrpXandl+P/uwC3AQegitG/o0rgesAH\nu9AC9ReUly2T1LI6OlqAHwKWIK7D/ryngqc4ULsBX04p3ZNSOqsoil8DFwMzU0pfRjtwRIqaxlHA\nx6Nw/zjwdARQb6LMYQXguChIu5mW7VJKW0ZN+ZvArqgwex64GxUUB6eUFu6jMFgY2C6ltGocm/HA\nSsA+wCWooJncUHrL4GzNqGn+AWWcPwF2LQOBoih2BS6K2vZg1jc6llcGRfejrrRrgQeKojgEXagH\nAx0HN63ryGp3T6B9dlhK6dzIuK4DLkVjPFbuZ1kLZi+/jY7lj4qiOB14ED1K4Yso2PgasCiqHXZV\ntn23oYGdE1EFYdmiKP4C3Ac8C6zVweL+HtXgV0eDWlcHCtQF2O1t+Q/g58CK6Fr5GHANaiF6FtgX\neD9om/taUHYurxLBxA+BT6NKyxEppZWKovg9yivujfEktSS15l0SP6sDBxRF8R+oi2OHlNLVdZfZ\nwTrXTiltFoHOW6iL6piiKH6aUtoGncu1bu5IKS1WVoaLojgf+D7qPnutKIr90XH4C7BKg9tRHp9J\nsd6/ooBw56Io/hoV31nAZ1BlrfxeeRNT2RKzaVKr1c+BGcBni6L4VeRVJ6E8bNhEufPfwD9GEFIe\nm38HZsZ2bAY8Sn
RLt1nGsqjlajd0HP4P+O+iKL4FXA+sT0NxSkrp0JTSWijgmwjsXhTF/0XlbEJR\nFP+J8rrObrooiuI984O6515DAcaJaOzTZfG/6cAZwA7Dnc5+tuEg4K74ezQq1B4HFoj3PoWap1ft\nYhquQxn+Q8Dl8d4s1Ly8U7yeHOkY28syykcDXRRpXiJej4vfu6CxUJMbTPds4O8i/b8FtkKF2cWo\nRWW9htZTbtsoVKhtkm3fF4F94u97UFP+QNYxqVwXClRPjNd7ovERJ2Tp2LafZS2ICvKPZO8dh7q3\nZgAvom6H96ExNeNRC87dqPUwdfmcPx94Mns9DwU8y8frCb0dg/h7UWDx7PWpwFnZ/vp2l6+X0fF7\nB5Q5v4BaccbG+xvE8foBcGSH59Yc1DpzD6qtTwaWQwXydcDKg0zzfsC2wKbo4fDLxvuT43zZrMH9\nU27TOagCuAkKZi8CnkOVxNeB3Woudw2U3z8DPJ29v2N+zsS5fVHreTPIbdkZ+GpcJ7uh1sHn47y9\nPI71ntn3FkcB/YooQP1tHNs3gC1ReXVrHOsXgD26db52sI1Ts7/Xi3xioey9k4APoZbPV4C5fSxr\nOsp7y+twpXh/NirfFmoozXsQ5T2qhK2Ngs/DWz73GPCpjpY5XAdgmA76jsA6wFzUv7kW6qO9ouVz\nXS0MGtiOl4Cz4+8JKEJ/GbXYfB/YtIvrPg74avy9Ypz4o1r3H6ptXN/B8vZEdzx8gqow3BTVondq\nMN1TgG8B28Xr9wO/QIMTVwI+X164g1xPmXkmFCg9GZnLacBSKCj8I/B1NHC19jmHupo3jr/XB/4b\nuDJeLxgZxdXlOZJ9b1Qvy1sUjVO7Ffhw9v5jkeF8OnsvD9LX7tI5Nir7ewJqLfoj8KHs/RuBLwPT\n2+z7xYD94u/ZcS69Dnww3puLCrJz4lraolvXS5a2rVHgvisaO3YNcAgaUwQqKHYnguA23x+X/b06\nGpy7Wby+Ms7tSWjag4uB1Wumb3TL6z1QAf4asEi8NweNRxldZ9kdrDs/hmeiStPmwCLAR1F3ztZ1\nrpM4p79OVVF5pt1xjvX8kMgXGtqeHVA+vBnKm3+IKr0LxDH+aHbsynN2KdQidj5qZS3/f1ic56vH\ntbACVT455OUUmrbk74Az4vViwLz4e1T8PhIFJq/RS+AELA+sG38/j25GmRKvt0Tl8woNpntp4Cuo\n9fIMlOdvhG6AuDLO60fRuMHOljnUO384f1CT73jU1L97vPd5NIh3FXopXIY5zYujoG9S9t7WcVEu\nkb13Piowd+1yeg4Gjou/T0OF2t1oIGb+mfkCg94yXapuujdRX/frNFizQrWMxWPZ61K1BBxF1Yo3\npYH1jM7+XgM4J/7eFNWiz0YtOTPJAsO6mSCwfvzePn5vi5r2y3N6Amo5OLif5UwBxsTfk1EhdSdw\nULw3u+U4juntGDZ4rMpjk2IflgXF/sDDaMxH+dnDelnGgShAORYFr2ugloAfxH6Zglp+bgV26fL2\nlOf+CcBn4u9xqIB5AhWq47PPvIDyqLzlbDHUvbQg6m54BnWVr5595gpUYE8ul9dh+qZkf28R6doY\nVYwuiH00Of73I/poRRjg/lkGjbHbJ3vvvNgPmzPAPBmYhrrK14rXr6N86ovAdnEurx3XTWPbFMfu\nCHQTzWwUnO+Ngtsj+jnnZ6BxTa8CB2b/PxNVZMZ081ytsY1ro1bPE9HQi3kt/18TBYyze7kWNkdd\n/19CrUDrogDxOlR2vEZD+T8wMfv7MtSSeRZqoV06fg5DlcGjss/2e94N+4EYpoN/EWrp2DsuqFq1\ntCFO6+nAvSgSXxc1/y4YJ98HWj5bdnl1rUaCuj6/j2pH/wC8L97/EhG107PlYBSqZUyM12Oy/+UF\nxBjU9bcWsFpT2xEX6jfjgr4JDUYvM6tdgJubWBdVrWsUqvE+APymJR0Xo5rPoq3f
63AdeXC2JWrC\nPzhe7wz8jOgKIOsubbdtqNB6E3X57hX7fgwKLuahMTjjUKFzzBCd6+VxGYW6B7+Ibkb4WKRvX1SI\nfLjle6nl91jUqnMj8Fj2udmoYDqg3fcb3pbU8no3dLflGtl73459vUy8PoQ2eRFqdVgRVQCWQq0P\nXwWOp2d3yTXUaHVGgdjtsX83iGN9K8oTr4xz4QzUbfs00W3W1P5CQewo1G02j6zARC3pdzKASg0K\nvCejlsXHYz/fhlr3zkbdYZPLfTvYbWpzrMei4O0rVK0rX0YB4Qq9XI/rohar8bHvLwJmxf+2Qq1x\nwxo80TO/Xgn4M+qGfjCO4ZHoxpjDgHV62Tc7oYDyGNQdfBHq+lsc5Y2fohr6Mdg8+ZY4h/aI1wui\nIQ4Hoh6Ti/LrMfteR3nysB2IYT4J9kMtTq/Ss995xHXXxQU/MS76xyJz2wkV/C8BM4YiDS2vl4iL\n/ZzsvTXopfszLoo/UY3TmS+A6sa+R4HY9US/Ngo8n4mL6rOohlNrHEUH67woLswF0LiwvEVuW6I7\nabDHIN7bBhUEh8TrOWhg8gYdLG/hyLz+GXXFvokKm8si83sIjT3Zgga7NDpI1xg0JukL8Xpr4Co0\nNcHoyJxP6OW75bm0XPzeHxVaH6EK3ufGti7Vres9S8cmse6V43w4H1WGNkXB0PNE4drLcvIAeGqc\ns5fE9bcSqkB9kgG2mlJ1I92NKmdlV9EGaKzeJ+P1NGDBfNsa2DerAz+O8yuhVuB5aJqFVVGBvMkg\n17V0rOcWYPPs/WeIbsAGj/muqGX7yOz8ewwNE9kIVS7nGxYQ2z4OBc4PoTxiIdSL8BXUEvUKsFc3\nztUBHLfZwEnx98rAr1Dvw+6ocjiPGFLQy3beQoytjONzIwoMl244vQuhHqafxnl2FGqJvjD28TTU\nsnoVsOKA1jGcB6QLB3gCMQ4jLsD39fHZ0cDC+Ykx0n5a0xUZwd6oqXkeivxndzkNeWvAaaiLbSuU\nmf8AFW5jUCvBVX0s59I4iSfE6/lqUTTUJZRd6Aei5vIvUA3YnoxaMD4CbNXk8UcF/HfIWjYiA/1m\nf8e2w+0ZFZnwBZHhTEUtGndQBVB9jkNCLU4fi7+noFr5qSiY2BG1cj6N7np5CZjW5D7qJU230rN7\n8CQieIrX26LxaUuQjf3pZVlzUHC0alzjH0UDdA+iCqAWbXob2qSjTPMNqHDZlCooeRG1QuzZx/fH\nxXW2Hqos7RevL0RdDkugAOwFVFsf0LUT65kdab08e38fVJh3o1KzK8q/nkOt2NvG+wfGez+iweEH\nca0fE+f+SrH81Rpc/nKRz5wcx+c+NF7r2LiW3iTrluxlGUvFOXpXnDuTUMXvHqrWq2Epp6ha1bdF\nwchW2f/D24GvAAAgAElEQVTeh8bGfbTDZZ2CWv/LcXQrxLl3Wn/X9gCPyydR9/iRqDv1X1CeNwkF\ntn0elz6XPxwHo4sHeSZqUboedZ0s3+YzfY6/GYk/rRcN6hv/MCqou94qgArtr6Ca79noNl/QIOOf\noqbxW/L0EnfltCznH1FrR7sWqDJImwLsOJj9BCyZvTc7MqR9abjQbD2HUE3sc6jWuGn2/kvEAMtB\nru8raFzM5sAvqcY87REZxJb5MetlGWujFtdj4/V01AJ3fvaZpYEPMgR3nqKBvS+igem3xHubofEP\nm2Sfe7jd9dyyrFVRQbVF9t4Y1JI2L87XUb3tmwa3aWXUXbB1vD4MtR6Uwfo0qmC+bYGIWoX2RWO2\nfkO0nKC70C5GrYRLEgNfa6avvE5mEIO1UUv249l5sT4KCBavs+wO1r0M6l7eLNb/ITQ+Zpv4/yQa\nHCgcy9wedds9CnyPPoLWAezDldFYwUPj9VIoj7wzjvNYIj+ipZUdjbP9CdVdl0tStUBtiVpP5utW\nGqqf2JbyPB2NWtY+mr0ux0uugaYZWJn2vQpr
oFbYcagyMA/dsDMBBV9Pk93QMcg0t+bJy8a1cjfq\nttsi1t1aNtUOTIfloHTxYI9Cteh/Ay7N32+3g+PkHNFTE7Tbxjbb0e1bxfck7rhCY1HygXXTgKVa\njkF563s+6PGhOImviAzj7QAq246FY/lbDyKtu6AxThehWkd5F9NtqGY7daDL7uUcGoUGTu6CalGL\no9rn+TR41yMqVM6O9X0NOD7eL8ec9dr9E/9fharFaV3UpfiJeD0dFe6fa/O9rtd2UUF6GQp8vhjv\nnYya9G9A48fu62A5s4DbynRTjQEcFce+qwVRrGdMnHcvo/FCZa39EBQ0zOlkOfF7WRRYfAm1OpWF\n0fpxHZ3PAGvrqAB5Hk0NcB5qxZqDWoefRWOqdm9w3+QB24PZ+6NjW14nKgODWMd8U1Zk/1sutnGN\nwZ7X2bbshPKyl2NfLpZt40Uoz8unRJhMdBERFQPUsvQqVSCyPGp9e5BonRmOnziPP4bGipbX0bnE\nVBpUAV85D9PEXpYzG02k/CBqZRyHypMbUffp6yiAOoZsqpQBpjnPk+dSVTimoUaV22iwAj0sB6YL\nB7o8mRdGke0n4qQ8OvtMWVjnLRzP0sXb+od6+xtcXmv0Phd153wN+Hi5P9EjFBbO00E1EPOQuED2\nRa0iF2efK+dZyu8gnBLLr33beHb8t0BdiSvFOl5GY0TGom6Ie2lwjFhcpF9Fmf9FaAD3Oqg15QLU\nXbhiazo7XPbYltcTUBfNn8nuokNdXpv1tY5IZ/lsuo9ExtgaQC2KCs7PD9E5OzdL2xIosF4szoHy\nDsh10ViFw7Pv9boPUYvZ98nGsaGg9rgub0t5/uUVgr9B4ynyu8gOJWsd7GdZ26EWxRmoELucuNMQ\ntUrNAVYZYHrXQmOcJqM7uV6M82sMqmg8QuSLdc7ZfrYnH9j+BNkdWiiwvQ0FiYsNcD2rU1UORreu\nv83nB9XzgIKGr6KAbCwKlC4kCmfUirRiy3eWjzziQlRRKG+MuSPO27GoZfiBgR7bhs/rcahiVc73\ntjuqmK4U58o6qCeh7RxpqMJ2E7BhvL4RtXKX051siFoid4r9MeBtzs6zspfkCtR9el28PwW12D5B\nVmYNav8M9wFq8EDviqLYxeLA7hwn9CHAaqgWWI5xmoKaCvvMyIY4/eXBXye2pdeLe7AXfj/pyO8a\nuxAFo+NQN2g+0dzDxASj2XurobFPZe3ksNjPj7RZz41U3RqTUKBT63igGsWiVLWgD6CCYSfUkrI7\naqq/FBU4A8qY+1j/CWh8zmhUAN2Eao1roIDgAwNcbhmAjqVq1RqNxs28gArhqbGv7+hwmQeiCfeu\nRAHUKKoAKu/CG4o5j05F3XRXonEIC6CxQFfH/79FtEC1Ozd7WWZ5/RyMBqWegMZovEYXp+/I1ltO\ning2CtQTqlxcTjY/Vf6dPpb5ftS9tHO8noIqhJejysDLaAbnTtO4JHBJ9nrj2N8Hx/m0Qry/SpwX\nTXdv74SChmvR2JYpcV1+KdLwE5TnXc8AW4ZRMPkzsi77lv+XleZBj6uJ6+QjqIdjm3hvBmpduaLd\n/svOk6PRvGxntPz/TtR69XfA+7t1vna4fWV+uggaCvM5dFfmRFSZeRJVRL9Lmy5QlFdNRhW7HvP1\noS6731DlcUuh3oa1BpHeVct9jHo8TkENKS+gcuDB+P80GugafHu9w3mQGjzYm8UFWDaFToyfHVAU\n+iuiuTwO6ouMoMAp244dYzteQIHfOvTe5bgwyrAbH78RJ+FTqCulXN+6KGr/KhobMa/l86vHiXog\ncet1/O9DaB6X3WnTDB3fXZ6aEy7G+r4VF3LZhTUeFcS3UDXP34FaIQc9QJT5W+SWj+PwJeDUeO9Z\nVItaLt/GGuv4YBz7RSLdD6PA9XTULbVxZFq3ANf0tQ6iNSH+HhPLuy8ytfIOxHXiejh+CM/zdVEN\n9m7U9fRM
nPvfIloG0aNUPjmAZS+Gxrh8BbU+Nnp7fbtzIdKeT4r4A2JOH9SFdzXZnGz9LHeR2B9L\no0Bmg7iGxqKg6ipqdqehvHB1qrvAlkLBy3epupF2if3fdAVjfRQQ7IKC2TsjPxiHBr6fiVpbNkdj\nA2uNsSKb0wp1Kx2H8pT8tvq8t+ExegmwOlzfCqjCtChqLX2MqmVlRrzurSVmGdRldQAaK9o61cwi\n2TEajgkw8ylU1kJ568JxvnwGjR1dMK6xNcmmeUBBVjnuq6yAr4GCpdPoOSfZzUTQGa8H/BQJdMPS\nt6luOlgvzvevokB1IspLnmz53uCnwRnqA9Tggc5nZp2DIuJN0Kj676Nuk6XjYL89bxAqgNYZ7vS3\n2Z7VUFCycry+FLVkvB1A0TNw+g7Z7bcNp2UH4KE2+3o0GnuR3/Y7CmWEd9P7pIV/gwr7/WigyTT2\n1XfRnYfro0x3+ez/X4sLdr3YT2s2uG9GoZr0Rtl7N1M118+jZR6hOuc0qv1fggr/S+L9NVCrw+lk\n48vyNLV5byYqtC6gauHbDrXOld2w5eDP9Whze3EXzqvDqVolN0CtdLugwuQE1G046MG82XEqr5um\nA6dFUQvT0vH6qDgn80kRv5nt3+U6XO5CKMh9Kc6BmyMPeAs4vfVcGcC5dTfV0wGORcH0YXFdvk4X\nWuhQa+nnsmMyGbXQ5F3Nm6M74OpWoDZAEx6eFMveDbi2ZZvzPPMZBllpjmvou6jldwE019ZDVBX3\n+Vq2Ih2LocC67CbfEd01PSeWeUe77w7VDwrOH6F6UsFkstm20Zix8+IzM9t8907UmnwuGr9U3oiw\nWpzHp9HSujTY6zP263hUvjxG5Mmxr2+N/yWUd57d+D4broM1yANd7vQdUNfFh1Fh8yy6NXk2Khy2\navPdETMtAVVQsiBqlv81PcdIXBIZ3HptMoHGuleYv0VlDdTisSzVYMGFaJl3JUv/eFST3abl/bz2\ndxDKvDuqgfeR1iko870re+9HsZ/ORy0zK6Ba9HwTiQ5wnRdTdaE8i2pkjwDPxnu3RJq+TgxYrnuu\nUTWVJ1Sr+wKqnU6N99dCrQ6fJ54z1tc64px5GRXib6AA5Yy4TpZE48MepJdZj7t0vh+EMtmD4/WW\nKID6MCqIptDS5dFyDpXn1SL0E4R38zpH8/aU49ymoYK73aSI3yFrhe1nmSuglt4VUSvR1VSPEirv\nPux45vCW/VXmHROJSTDj9b5xbl9DcxMTtt4ZvCF6VMcG2XtXk3VNoSBo2TrLj++cjlpqb0L51QFo\nTOBHW74zNc6zAeeZZHdnoW6si+O4L4aCt8dR3tzXcIu9UeXqmHi9E+qqe4FB3DLf4Hm9KipbLoxr\n8dqW/5cP8V2/zXd3QC3uZ8e+eRH1NkyJn3tQANXIs0rpeePU+ain5Ck0991YlEefSNw81Nv5Oag0\nDPcBq7nD8qnWV0fRenn773LEoMTIiF6lg8kCh2k78gJhe1R4LIfuMJpH9iBXlEmXjxhYEBWITQZO\nb0f/xOMw4uS7F3XBlc3699PywMS4mMrbcC+hGgRcZtajiedTxesBN5fH92eimvKJqBn5AFTjuAq1\nKH4KNS1PRDWn8llJgy0QPoqafi+g58Sg91PdHbYnPZ+9Vv/WV+33L6DgaSlUqN1MFUCtT3YHYy/L\nmEH1+JwtUQFzIxp0fAKav+m0+P9u9HOXXkPnWN7FtQs9J/bcnKrS02vNm6rQfD/KGJ9A3VnzZcbZ\n+TeJhiffy9axLxrkfD4qQEfRwaSIfSxv7VjWJfTs4tgV3XHXcatQXHPlXZhzUGF/ARqHNw0FUHdk\nn296fp2dUIC0OwqK94/rZ1dUuH2PATxYODsHZqP5eh5tOR4fRhWOq/NzAU2xsu0gtmdV1BNwCcon\nt4r9WXaJz6CXMWhxXD/TcjxvQpXlhAKuGfn2DecPGgz+DVTxfBGNGf4kyn
d37+V6S6jF9AyqJxyc\ng6am+QZqmd2Whp+FGev9GqpQzon1Px7Ha03UC3V26/nT2PqH+2DV2FFro8CinNL/85GpHEbPAGA3\nFDgN66C7PrZjMVTjLwevH0NVE18xTtRrafNQXNT6M1/UP4i0lBd/QhH643HB7IxuH70DtSg9ScvA\nZNS99F3i2WDoyec/piVAQl2pjzPI+WJifd+nagE6Oi6cJ7LPzIh1NfK4HZTxli1Cc9Adgje3fOYR\nWiZjpcY4tDiP87FLD1NNJLpK/P9WYrLK7HPtxjgtgFrevko1Dmx7VHieEdfNrsRjH4b4vB+NCrjy\nTrE8gNoMtR7O11LcsoztUZfWomiw+e8ig5ycryd+T0EVjUa6bCPdZT4zN9JxK6qpX4zGl52AbpB4\ngw5bEoD1sr/XRMHuJagyMCGuvY7HOMV3zkWF117E/DlxHV+BAuoFUVfTo/k+G+T+yadReCE7b89F\nQdteqDXuS3W2p816dkRjK+egIGRSy//Lx8y8/eBdsjv96m5Pdi5tgoLlG2Pf/olsypY+lrNJbHde\n6fooGod7LMPYVddy3JYkWv9QAPUkGhC/Lwo+r6CfABRVXu+hmrvtk3FOf5sBzuLdZh35rPuj6Tnt\nxWKo1ewpWu6ipxtjg4fzwNXYYcuhWXoPjh00B42zuQDVCNbPPjuLqu9z2CP5NttSTlH/ZGRix5EN\njI2T+GQ0v800Wpreu5CehPqMz4vXs1EtsWxFWp6eEz6OQoX6S7TMy4G6MX6MMuj1UKH4Qwb5kEc0\n+dpv6HmL/ng0huYSYN94byU0WHfQBSYKYG9FBfDH0WDPcnLKHbPPfaf1Qq25nqVRd+3n4/WzZIEm\nat27mrgNu4/lrIBarRZB40y+SLQUotr+dZGZLTCE5/rbA3dj392CxoiMpZq89JDy2HWwvP1i23ZH\n3bIfinP1FJRhl8HNwjR4Ny1qkb0S5T0TUGFY5jHboEDh7NiuSbRMitjHckfHPngqe2+dWP5tqAY9\nsZNltSx3B1Tg3Q6cHO+NQ10u5TQQk2hg7CdZUI+6ll8l5s5Dg3kvQAFUOQbm7e7pAaxrAdS1WXZn\n/oCqhW1xqkrHVahSPeCxNNl+vDjOr5XivU1RS9qP4hhNgbZdy4tRTTC5UZz7ZR67KuriHbYJMFu2\ndzdUMf0xVZ6xIqqEndbPPtqans8kfBK1bn86e69Wd3Mf6VwPlQXjUSvtGFTp/1z2mfejSnXbRzg1\nut+G+8B1sMMSGuT6WVSD+SSKbndAGdml8b+uD3htcJsWQIPYHkO3bR8WF9TSqMVnSzocZDrA9X+Q\n6NJEgdPP8pMN1fD/nhjc23IslkA1u+Oz9+6nevjiSWhg5N/G9pXvDzQjWw0Fam/EhZJnVBNQC9RF\nKJP+Hg08qw6Ny3kY3ZJ8EBqU+mgcm53QQM/bUItOR1MFtFlH3pW1GAoOL0MF3r6RoW0Zv/udcRkV\nXPdn59fm9AygtkJdgGfQcjdSl86x+Wr7KOi5DLXQjEYtnK0zo7drUcvvqFoQdSmXA3RvQGPNyoBl\nSrxupGsbZdTPoBauCbHvniEbKxb/exMF8v09PqZ1TND4OOZvPwoFFda308tdW30sOx8HskWco0+T\nVSZQi8CAbwtvWd/YuPZnxutl0d3CX25Jx2VxfU6ue95l+2QJWrqM0J2j41Fr032x/nJeokHdXUtV\nUdofjdW8DI1ZKtOzLhpHNd/z91AB/h3UMn9F7INN4lx/OvZRVx+rVWM7yxuVVom/X83yjFVQJWV1\n2t+UUk6pkU9FsBsxpi5ej6l7zPtI61aocvki1QPdV41jf0WcC/cRU690fd8N98HrcKdNRYMO/xkV\nKIegSL4MoK5BQVQjk191If15DXxDVJOdiGqsf0WBwVWoufHluplmzbSMp7qts7zF9kyUeb+Pqva+\nM9ncMNl3J6HaxSdQN9mjZLO5Z5+dQj
UGbaCB03Syu9fiwni15TMTUOvdt2gmcFqXuCMpe2/J2N47\n4vX7gf9H1vLWLnPpYx35TLjHokJo8TgP/oqCi7tRkHBG9r12gcU4FIhsgQaflssegwqA+6gC3W0Z\ngtouCtrKh39e07INu6Lg9LhI43wFOWpxLaeamIu6Z/KuzctRS9psVDvOW0b3JLsNuqHtOQgVyH9E\nAcBuKBAtu6w3QS1IfRbYVHnAHNTFcUK8LgdyfwtNjPk9aj4UN1v2Gmi83Hh0PV+Bru+NUMvsmzT4\n+JNYz8rAmfF6aVTA5c8m3IrBTYC4O+oKfCT2ezmW8QqUb75KdAXGNdXrTOOd7Mf4OZ6q1W4yauW+\nkZ6B/A1kLSzx3vIocFqNamLG8+PvCSgY27DJ83MQ2zoDtbB/K9una6HKajn1Sm8zhy8ax3np2F/r\nxLZNRMHhoQ2mM68UfAH4PVHhQhXF96HWpnvoOZavuxXE4T6Ane481Mz5azQB3WTUb3xDZEQT+su4\nhjHteUtJ+QDOTahuszwH+NvsMwN6QnqHacnn/NkXZdjlGKKL4+Rblfa1jFXiQlsgLrovoxao1jsy\n1qXDO2f6SesiqP96bsv79wGvtLy3ANVt44MdHL421a3Ief/6ymisWjmVxKzsfwPpghiFWjAupepu\nKGf5PrnDZUxD49I2QGOdHqNni9ZE1M3wCHDKEJ3v66MWwqXR4PByXqrjss9cjgK9nVv3YRzLs1Br\n8j5Uk50+g4KyRVAgcDmaAHPQAXMH27Q/Gv/xAKo8TESF6fdQgfrLfFv6WdYuqCt7C3Rn2HVUXVmX\nospCrW2iqvDsBPwDarW6D7XWrBZp/AnKQwfVhd7L+ldC4wHLGxGWQUHtTQ0se13Uij0NDTz+IdUN\nFJfGcSmf8TjgcS3Mf8fxB1AlsbxhphwXuma8XghVbvLB/QugVuTnqLrspkb6h2wetX62s7Xlcy5q\nPTyaqmt1HdQl2uuzJFFe9QoKDG9EZck/o0lvZzPIR+20Oz7oBqZNIj/4GtnYOVpafAdzLnScpuE+\nmDV34HLoborDIwM7MjKKacOdtl7SuyqqwSwQGdm3qO4OLAuLCXHyfR21QIztUlrefg4eCoL2QIPV\nr6KqQX8WtX7NaPnuyqigODR7b0pk0GdlmdnGqPWkz4G/HaZ3FuoGuaT1Io71vtil/bM06rJbpnw/\nO1Z3Age1fG9AFym6+y2fbqEsQJdArVoHZuttOzg8fl+AWj12R7XyDVHlYmKcd4uimn+jd7r0sV1T\nUKH2KnBrvLcZamovW1puoo/BtqjF7ML4XN5qdR/Z86moniXWeA2TnpWeZVGL2afjelku3l8ZtUJ2\nNAAfFbhfRgH6HNRC8W3UqvV28NjpNtHzIazrobFFm8cxPx11x8xA4+GuJpvvrol9E9tfjjlaDo0D\nOiN7/SyDbOlELSGHo1n1v1PmBWhMzhIM8sHoZJXV2HdHonxsFrq1/qRIw4qoVyCfT25c9vf74vOL\noPz1g9n5eTADmPC1W+c0aoE+jGpc6y6oRefILM2tXaTld1dHwfJCVJNg7hj/2zqWM+Bxbe3WGX9v\nioaDlHN67YOC1D1RS9nx7b7X1f053Ad0ADt0XVRAl7OHduVW5AbSOTUu9nVRgDQVFQbTUABTZpKL\nxXZ0veUMBWdHEU3qka6Po8GwZQvUB1u+szoaTFiOXRqFulzGReb1CBqjsSsqMHcZZBrzAmELNEbn\nMuafmO0R4Ptd2EeLo+6AV5n/yduPM8CWjtYLOi76y+PvclDw1DhGC/azrFUjI9knXl+AWhb+FBnJ\nsyij/zaqPa88BOdWntHdFdfoEUTFBrWO/TIyvHt7+V7ePF8OgH+cnnMEPYq6mEfTpUySqqDYEk2H\nUT5gdGVU076MAd7RiQKbchzf6Lj2/wcFZXW6fmeg7sRJcS2+Eufs4vH/RVBh/jzqdp40kPT2sf6d\nUc
vzK2gc6njU4vQ94tZ8BjBQuOV8GB3nwdfjXF4q3t8xzu9F232vxromo8rZEaiC8Xco8HkKFdLH\nopt3vhPnba93UKLA4WV0s8KuqEXxalRJ+gUj5AH0VHejH4nGX12IyqOdUPlUDiNo1/swN757F2oJ\n3j/73y4oeB5U/t9P2uegitkpqBzaHY07HZLncc6XnuE+mAPcievHCdnRBHTDlMZl4mQ6Nk7KFdGc\nF6dkn9kUdT32WVgOMh37oFamRSLT+Q0aYF124U2ONN5B1jpBVYCcCPw1e/9pej4nawZqov8tLV1s\nA0jrODRYfW1U2B6AAqjP0D6AWm8w64tlbJP9fSpV68hdqKXwKFRQPAzcMsB15F1p5Yzs5S3debP/\nvWSzk/eSga2OCt5DyaZIQGOynkcF/gw0J9ZyNDQ4uJPto+o+Wg3V3K9A47fKQm8K2Xg+ehaU+TPi\n7kBdy+uh7uSz6HlL/6CPewfbVBYyp1Hd2DEeFeafQ4HdRPootLNtmhXbVbaarIECwGmocnUb9Z/r\nuGqcC4uhAntxFGDkcwpNQ3cBNjLGJtuecSjYWz3SfwsKNibEOfdjBnBrerb83VCF7mYU+B2Fur72\nQi0mP6KB2dBR3ncAylueouekpGdRjdtbkqobrrUSlD/w+LPAhfH3VihA+TwNd2ENYntnoK7nZVHL\n2HdRnnM5CqB2ppeWwrh2X4p9Mw51nX83lrMgCi4b70JHQeiF2evZqAfgRFqGuDAEXXU91jfcB3QQ\nO7VrAUeDabwQ+F+qwXfl2IBrULP692nocRS9rP9aVKDege4Qm4jGin0VBShlobcg2e338d4y2f8v\nQ/OSPEMMDG357CQi8GrNXGqmd3JcwM/Ffiq7GTZGjwboEUANZl3x/Q3ROK/8OWR58+8Rsd5rgfOz\n9wc6OPxhVPBehZq9D0DjKK6PY3JnP8uaFJ9r+/gXFBTfjQLPrj08umWd+YOkv4iCjKtQ7bWsKX6S\nlrtHqQrKvLVpDmpB2y57bwVUKF1Eg3Oc9bNNy6LxbUuisS8/Ql3rZ6GCY0U6bM1DlYHfx/f/XG4D\napW4F43jLMfsdNRVR1XxmYSGLZwX184MlKfkcwqN6XS7O9ye3VG+9ixV1/amKNA5EwVQA741PfbX\ni6iV7xtUD3U9ClWirqPqJhpMXlOet8ujKTBeBq7L/r8XavHqtcUOjQO9LM7x0ajichEtlaWhOGc7\n3ObJqHtxfTRecCZqcfoF2Q0Zbb63cpzzD9FzpvUjqLppy6Ebg82TW8eebYTyyNOy9y5C5dpu2XtD\n/yzA4T6gg9jJI24Op9a0oWkATkez4ZYPKF0GtRIcR8vjTBpOw6XAM9nrm8sLJDKhL6Ga/ZjWtKPu\noz+hW2vL/utz0fwdeT//pqgAaGzMWaTpz6i7Jr+LakNUi/4sg3zES7bMhVDhOA89Jf0w2k9O2rZr\nqcZ6RsW+PAE9x+rvUBC0RGRMO5B1l/a2jsj8HqcaZ9LuMTjlw0qHrHIR58wTqDa4IwoIykkK56CW\nlTltvrd47P8xsY8+h8ZjTIj3n4h9s0qczx3P2N3AebEqCkK/hwqOg1BX5EWdXq+oBe5aqjuDjo1z\ne2UUXG5GjTnC0Bi2bVDrzm6ogN8YtWyfGufHEmjqkQu6sF/WQd10R6Bu8+fK8wy1EN9JzTv5UMC3\nVfb60/Scy6s818fkvxvanhUjP5mAWudvBj4e/1sdBU/Ts88vGMdgWdSSPwblS7fHMTgHPYfw2Ow7\nw15OoQBkVrYvtwYujr93RK1jG7R8J787/BuoTLgXeCT7zKGo1XEUzUy0mo/LnRvn+gTUUvs8USmI\nfX3MsO/X4U7Ae+EHNW3+K9W4om7PsTMVtaicQ3W3yG70nEzsXNQUvlyb7y+Capb/gbqvygDqMjRu\nJUXG8wMa6OPOLtTyrrOl0KMWbqIaa7VYbMNlDPJWdHrWDBdABfXV
6GaEX6Mg7WZ0F8oarenscB1/\nQwwkjov/YFQ7fQ61xJyPAqjWh2W2Gxy+MNW0Dw8QXV9UBcooVGjuHa+7HmTQM2CbCpwVfz9D9fDT\ncsbitpOWosBopTjfJqLxIX9Awd+ZqMvs1fhfI8/EqrmNc4HPxt87oRbCfidgjeMxFnWHvFIe+/jf\nx9ENAbUfHRXnz1Go6/wXVJNRro/udjoZFfAzaGiC0Gzda6Ag4ezy3EMtjE9SPS2h1p3CsT17o0C1\nPL+Pj+P/NFVh/wHUujaWQRTSrdcWCja/h4KAcj7Br6OWjufo+fy9VVBr5DUocHwAVYLLvHHr+P4P\nYj917a7pmtu8OZpio8zPZqMK/C9RZeWP9DIeK7bpRqqpYkbHdn8zzsM3aFPZHGR6R6Gg+c44Nhei\n1sgVUXfwc8A9vR3TId23w31w380/cSKUgcHeaA6f+WrgDa+zXN/yqIn71LgIvsn8M4Lv1cdyNkS1\n/W+gQr4sqC8B/hvNF9PR7dkdpnuPyDBvoWqROwYFUOejFpdx8frUwRyT7Ng8hQrwhdHYsHJyzLXj\n/f0GuI55kcEuF+sZjcbLHEY1SHz/OCYn9bOsyah2/PHI4M9FA3XHx//L47JlpL/2oygGsH15DXF5\nVNoL7zgAACAASURBVGB/H3XtHpx97jqyWazbZXSx76+m5/P4Voi/Z8T5N6PpbehwO9dC3e6Xom7k\nPguK7NorW2MmoUL/88SM5PH+J/pbVh/n7dKoW+sh1LJVngfrobsRz8iPT4P7YkXU/X8/MU4PBVDl\nncKjGUBgE9f0InFd74QqST+lCtK2RHnNjoNMf34jylRgkfh7axREpDhe+0VayopPQl1dr9FzXrfx\nqJJ1IlnXHso7nmMAz+5r8FjlD9A+mKoV+AA0XcU6KG/am56tfkujFr9yjOIHUK/JaS3L/wRqqW9k\nLFfss/IB9B8iprhA3YpHEw/2RZWolVq3c9j283Cu/L3yk53MH6CLM8uiQZuXEjWmOPnmodrQZ7PP\ntc6JkVDt74rIJBeOE/XMuKBuQIFGWVCfQ4PPDkStD0+jboiDUbdDOa5hL1TjK1tV7mWQk4jG9j4G\nXJa9NzGOzzXA0S2frzPG6Xzg4V7+dwAKcCahgq7PieSI1hbUCncV1TMQr0HjcJZGhc2GcYyHYs6j\nt8cqoVroYfF6HxRAbY8Kli8Ct/e1jOz1PiiAOpKqUPswmtOn1wC/ye1p834ZrKyDxgl21JKD7rT6\nVpwHO6Hg4HPxs1kn6+5jn2+Jgu7F49r8AlVFYxSaA6eRKSmydW6KAowNqKZVOZ2s9ZPBT0cwA1UO\nbkDdkMuhmynuRMHzYG9EWQS1kpTdpfeiVsQ5KDB6imrM5kLMP1XLp6mC+7J7uRwYfgYtFaBY9ond\nPG87PA8finxhv3hvMpoC5eu0VOJR/v86Gs7xM6rpOPZB42UHFbz2kc5bySZZRvMPfp0q75uJ8pWN\nW743/N2hw52Ad/pPu4PI/IPeejwOo/V1Q+m4IS6WoyIzKG9jXxa1AHyaXm6vjgzwStQydifqqlsK\nDZC9MT5zV5zU+cSRg94G1B3wJFG7iPf2QuOCynFiea1x0M9mIyaXzF6XtfeFI3MZ0Oy4cVwvoXp0\nyLZoJumHqGq3D8fr+/s5hxZGLU4fjtf7oPEzH4kM/GLUIvg06n59f2/L6tJ5fxfZLcKo+3NvNOD7\nLuD6frZvO1Sr3CJez6EKoBaKZc3t7fsNpL/XObSyz4xq/Qx9BNKoJegOND7qGNR9syfV3WmXM8Cn\nIKCW2R9QDS4fjwKoq1EB/lMamJy2Zd+UE3oeglogtkIVnetRcNjY9Cqo2/ljqOV5Y5QnTaAalD7Q\npxSsiu4IvAR1yS+OKhx7oy7Vw9EdyHfSyzjBOKY7xd+Po9bEV9HY0QWpnks5Os7du2no4eQD3OaN\n0HjBrVHe/3aFEwVQB5PdgYkq
zN8HDozX16Bgtpw0cw+UHzc6DUFcE/dlr6egQPrySGM5hcsjNPS4\npUbTP9wJeCf/ZJnM3LigLqGlOyX7bOMDHrNl70PPh4vuh/qLy+h9hcjwLieaZNssY0UUgJ2LWq9u\nRwOcf0fVjfIQDdz2TEthhAqApyOdZYH1wcjUFqdnN9GAZvNueb006oaZk703Do2pGtUunTXWdVLs\ns8+hpv4bUaHwvdi3iZ4PU+1tcPhC6KaCK6kC4TKAOjheT0KFTtvbqBs+x3p0BcW5UJ4X+U0E4+kZ\nYM+3P1Fh9hPUPXIz6lou55q5mazlr8vbNBsNZt+PXh4fkm33mN6OVfx/GeC/gNPj9RKoJfO2+D2O\nAbaYUs1aXQYS61MF1R9C+c7uA1l2m3OuHMMzFRXAy6GJQF+i6s5ZHgU5tcbW0c8EoKjgPIJsrOMg\nt2exuBbLKQiOQXc+lmOpVkaBxq2oIjK9l+VcjvKhaWR3xKJZrp9E+Vc+ZUFXJjrucJunowDvgZb0\n3001O3p+TY5FecwjVGXUD2IZ3yBa1VCr+a9o6OYgVDH8bfb6ZBS0jY1z+krUgvswWUVzJP0MewLe\n6T8ocHoF3c3wIpprpRz4XJ6MZQZcPrS07UU6iDRMBdaNv8fG66foeVvp8sw/5mnhltdrosLrMFTb\n2ARl/h3NoNxhWvOnce9PdGNGBnRDpLP8zOINrC8PvHahGkD/wcj4dkM1xvtoYB6neH1UbMsOVIHN\npmQD9vN90XpMqCaWnIwGs14LfCDe2wd11xzVevy6eI7n48RuQt2E16EWwjxw+gTZ3Gu9bN/2qEtv\no3i9bWSap1DNNdPvoOwGtmm9uBbPjWN1AS1TIbRct5fTS8GBurQmoZaIf6LqelwUdT/eM9BzGXUn\nj0GBzK2xr25HLQHl+Ln5WsgGsJ6FUeC/eHb9XRrn2TeI4BJ1qyxFzRZglCfdRdUq21sAtSSqbFzD\nICd3RcHT5Whc0pPx3rlokPSKLZ+9HTivl+XchFruR8U5U3aD7RzfO3goztk+tjOvjI5BwccbRJd6\nvH8d8XihNt9fM66B01GQfkm8vysKXsrWzrYV7wGmeS/URbgLak3/Bj3HM02LvGJuu+0cCT/DnoB3\n2g8q3MuHUE5AY4rWRLWzZ1CN/MdUAVRZk5uCApqtG0zLuvQy902kZVL8vRc97zBL6Fbcv2/NMNDg\nvZtRYTbo4KWXtO2K5lX5CGouPjQypnmohtTYg0tjfaPQra7Xo+D26Nj+vWIf3McAn8NFVnChwZaL\n9PK5h2gJntp8ZqnI2L+DunvKGvPHUOFeZtofQgHVkE4Si1rTytmjj0BBwUFoXNCXyB4G28cydkbd\nw+WT28ejQPoW2swh1qXtWB2NGSvv5NwIDf49n+ph2WXgtDAqULbuZVmjUYF8ZLy+CtXQy5m+pzPA\nqTVQl9N5qIVkBVSIbxP/WxeNURx0SzbVAPdFUKvsQfH6C3Gsykd2bIRaUNetufxR8XNKXAdvD8bu\n5fNTUYA96AHJKCj4l/zcQsMRfkMWQMU1dn123PP8cuk491dGd909gVr7vo/Kg/3RuLihn2uo53i4\nI1D3dznL+UPAIfl5n/29HArsy2OxWlyDT5MFSahSW04c3MR0BCdFHrBCpPUbaFqchVo+tw09g8IR\nFTgVhYOngRz8WZGJlJOCTUVdXi9lmcxbkcmMyj7zNRq8fRgFGo+iWV7z8SULoBaLJ+PCvhe4suW7\nC0d6LkcBXWsAVdZEzqF6EG4jJ2+k7x5UK9wXDQx9e1wD6uYa9KBXegY186gK65+jO05OoLrDZlLr\n9wawjq/EOfAk6qYpWx82QYPTb+xwmY+hAuv4WNa1KOM+J45JOQ6okbmu+klLXoAchlpYP5G99zdo\n6oj76Tnos93M4QtRVSh2Q9NglJWQBVAL1JDU3lGL0Ev0fCD3BmgOp8+R3XqPAqc+x1ugMUEPZq
8v\nRi1Qg6p8oPGAN6DCfmb2/lzUtdJEV92CcQzLiWIPQa3N5Vitu1DF4yzUDV1rnXGd34CCp6lx3T1K\nLwEUVWv93fk2D2L7jkBB5vnETSfx/gVozq0JKL/8JDFtCFUANYrqWaQfimtymzhft0LB1JYorx+S\nOch62cZtUL52bpwXn4rzeTdUiT605fOrorFs16NB4rvG+2UAdTIK+jeI/zcy3gjl7Q+g6RMWQK1k\nu6Auwg9ln/sy2Uz5I/Vn2BPwTvxBBe4PswxnRmQ4M9HEdxdQDYQdFRfvoB5g2bL+q4Avx9/Loibu\n1uewPYwGNV6ZvZcXartGxrkqiv7bBVC3kT0CpIF0l7ej3obm73i+zHRQZt3Io0SogprRKIDcEgWM\nT6Ma5jaohedssgKOAQSIKHC6jLi7Bg1AvQbV6qahcTUXt6atzXLyQfEPU3Uz7IAy/ldRwPEjhvBB\n2LF961B1S91IVgmIfTyx3fZRBU57oADrIarCaBfgH6m6I7v2nLosHYsDS8bf5UN684czz8rOx3Fo\noHC+rYtQjZdZgxhgG6+/SM/Z6S9lgNc8miJhWraeK+JcXSvOqQepWs0GO6PzgihgupoowND4lryr\neH/USrvFQNaJWvqWp3qgc9sAiuq6XTD+P3Ugxzt+lw/33Sxb52VkY6no2fL0dp5Rvkb5RTnmZwy6\nyeGeOLZ3o26uZxiCRyD1sb2roAryXtm5cy0xiSQaGD8r+/wM1DNSPnrmeDTmsDw2y6Ng90Y0LnFQ\ndzpm6/0M8Fib9yeglqgHUOvdjWQzvY/kn2FPwDvlpzXDQIXid6lu1b4BdXf9iWwOF6J1o8F0TKXn\n88/OQ03Q9wBXZe8/Tc87ukblv/Ntogqgzo/Xy6DulAlN7TdUqykz58NQU3rZl75FXNCNDUaP/X4P\nMUcJukvoK9nnnqNlDNgA17cRutPpmuy9j6Ba3eH0Mng6e6+ctXg5sodcR6b8RPb6fbGfGgvCO9y+\nHVDtc2tUQz8PtdDMVxttvUbivZ3iOpmBuvbepCqU9wD+EwU1XX2MBepSfQm1EF4a7y2Cgp6HevnO\nEtnf49Ag1vNQQLMH6ra5FHVd7ksWPPW1T9p8Zlmi9ThbzwNUAdSacX1eH39P6nTZ/ay3zBMWQwHS\nTdk1ehDqtjuAlqlNOlz20mRTdqBW7lezbToBBdMbx+uyxWkqmqB3q4FsUyxjV9QadGycuzvGco9H\ngUU57cnb8/DF6zxwehy14EyK9Mws00k1A/qydGloQ41tPQSNtb2BaqLRjVErYduJOqkmV00oQHoM\ndUWehiqZq6MKbmMTYKIB4uWxXqDlf1NQefoLek6AOWIebdN2m4Y7Ae+EH6oCeXNUMG4ZF9hWkYG+\nP16vTIODq/tITzmOahU0iHRNVLC9SIw7IbttOdKWF9LzdfnEBfMEyrR/QAO32mb7bTsUDPweNX8v\ng26F/SlqgfohDTzos2Xdt7dciONQF+VdqGXnqtZ0drjc+fr9UfP4c8DHsveOof95nFpnLf4i8cTw\n+P+TwLO97dcunVutdyZORAXr83G+L0z1vL9euyqybTgedV2Wj9v4dBz3clxNozdPZOtfkyoQ2CbO\n6RlxXP63PP6oJedhsnE8ve1fNMj8BjQAeUZcT/ug8S+/Av6ZARY4aAxT+eDg6Whizduobhf/NGop\na3o84Hqou2cUuoniRnpWcq5lgAOFUX70dPb6EjTgumzlOAUFs1Pi9ZQ4zwbcTUR1s8xiKAB8gZi3\nCeWBPaZsIbubsjxv41o8Mc79p6i6/Ie9MKfKU5ehCqJ3RQPCD4/0rxLX2rSW7/aoOMd1eUW83gTl\n/eVTMCbm62sg3V8Gzm1NDwpGN0AVmT4n1B1pP8OegHfKD4qMf4KaFv+JqtlzOxRA5bPPdmNemvkG\nhsbFnXeZzCKbe6dMC+0L6U8x/1OpD0GPjmjsYcWoG/N1dI
fWOag1aH+q4HNL+hlA2uF6Wgv9M9Gt\n4/l8JsujpvZ88GjtwCnSfiWaX6eci2puZBBHdris3mYtPgsNqiyDjxfIxuYM0bmeUDN7GaRPpJq8\nbsPI6PZu8722AToqFB+lutPxqyhQX2wgx6GD9E9DAdoHYltWimtjFzSr+1rx/9tRt2Ofg67p2Tqx\nAhoXchY97yzcKc7vi1Gg3lFBS89xZc8DL8bfi6IA6jkU1LwIbN6FYz0dVV52imO3PyqID47/1w6c\n6NkF/QzwXPb60nivDArLgHEMap3YepDbMwW10B2BWuvKMZu7oRsy8gBiUTQecqksDY+gFpiJZI8a\navocHeQ27ormrLqL6o7LfVBl6/k4Z3bNPp9fl722lMW5e1T83WigiMrPa2l5liPqtruNbBqPptfd\nteMw3Al4J/ygwuI21L21EQqW8sJhe9Q10ejzpNqkYzS6HXw32txphcaVXNTyXl+F9AlUg2OXiwx6\nz3g92C6BsoZ0JHBD9v6h6DbaD5MFfoPdL9nfb3cBoYGPr9HL3UEDuUipmvRPR62Q/03VDTUX1fj6\n7Vqj/1mL86eIb1Q3nQPZrpbX30ABdz7vzwPoFvl8rEh5nNsF6J+OjDuh8UOnoi6PR4hb1ru0LUug\nVo4jUMvGwqjwuJnqjsVTUTdBn2P66Nm1fU7kBSvEsk6jZ6a/Rmx3v7fx03ug+RV6PtD7HDSBYyNj\nT7LlLpAd2wOJLsfYvoNi+2o9Fofeu6CfBJ7PXl8T18nYlu8PZozTotl7V6LxgeX4tM0jH1iz5bvL\nogD1DmKKBlSZmxjp+3jreobrJ9vOiahytRVq/XwQmBf/mxvHLQ/4eqs4T21Z/iaoktvIY2XQHc0r\nUN0kMhONObuInvPr3U92w9M76WfYEzASfyITmNqSAZyAxgS8RNX/vR/xgE80iPvDXUjLecQt7mjM\nyFNo/MCjWeYwMy6Km9t8//9v78zD7ajKdP/7TgImkARMmAQBtVEI3c2oEARvUEAxMhgBBUVRWwSv\n1yBThAZBLpMMAuJlUhsIgxoIikSIBgSZbCC3QRq9Knqbbm2ltRu52nqNA3z9x7sqe51in332ULWq\nzj71Pk895+y9q2qtVbXW+r71De8aT0gvjs7NLAP9klFuilYSsTKzG5r8Y5PsLcgCVWRW3QiKTfhs\neEdZBt8xKNaiEHcqWpl/HMVCrETBvKtpKVDbd3mfcVmLKYFQdYy6xFxYfxV9vgHFpWSB/p8hCpKO\nru+koJ+IBPK+4X7fJ802MtcAv2O0IDkJBUYfgeI8utqcN9T9drTaPxu5hLLA2jNosTfvRSB2Hed+\n47lrVxK5a2nFshS1oNkWuSqPRW67bdGCMNvIeX16V5y6cUHHLryB2MmRFWlB+H8BstBeiyzdr0Bz\n5AOhjWNmJqIsz6VI6cjcie+lRopTVI990Px2E639BWej+fSG8PlwJKfehbLAOy6cES/gnoiypajg\n8O1RxvAtyMI8N3z/EjR3rkBz9W3AtXV7zl23s+oK1O0IE8kd4aXfizTjqUiT/keCdSlMOj9AgbTZ\nthwD7bk2Rn1eHsr9AnBy+G5OmJhuR4rTJkTuIkabprsR0gObSdGK6ClksfgULXLI2UjonoKCbHdG\n5vBr6JOUsk3ZU2jRD0xDQvPhaNCeyhgEeF3ce1+UCfiO6LvpyASdBdhfHiaLnaNzOk4E9MhanKDf\njyBF6U6UvXNS+H5J+P5hYEm79tFZQT+N0QrMFvnrS2rPB5EF4iIU9D6CAmlPRVaFrlLuEe/ZD5FS\n+Q5amW8vRu7ANRmp4butxrlft+7auxhtrSkq9mRPZIl5GwpB+GHo308ga1zPCnsPbfp7CnBBh3d5\nUHiPi5AQfj3Kcv5MaM90ZOV+B60Mz3zSzwHIY3BxGHNfIHIlZ2WV2U97aPPOSBk8JbT7I7QWiHPQ\nYuBVaC58F5pbxls4nx
[base64-encoded PNG output of local_2.plot_lr_weights(['@']) omitted]", "text/plain": [ "<matplotlib.figure.Figure at 0x7f444c9450f0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "local_2.plot_lr_weights(['@'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are other rows in the confusion matrix with empty diagonal entries, suggesting problems. But another approach to improving the system is to look for errors with high frequency. One such error corresponds to the problem of distinguishing proper nouns (label '^') such as \"McDonalds\" or \"smash burger\" from common nouns (label 'N') such as \"wife\" or \"rain\". \n", "\n", "To better understand this problem it is useful to look at some actual predictions of the model where this error occurs. In particular, if we show for each error both its context (say, the surrounding words) and the feature representation used, we can both spot potential bugs and get inspiration for additional features. In fact, **this type of debugging and sanity checking should be performed for any model you ever train**. Even if (or maybe: in particular when) your model is doing well it is worth performing this type of micro analysis to spot potential bugs that lead to worse performance, or bugs that effectively mean you are cheating (e.g. because you are using gold data at test time). 
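
The kind of error listing described above can be sketched in a few lines. The helper below is a hypothetical illustration, not the statnlpbook implementation: `collect_errors`, its parameters and the toy data are all assumptions, and it assumes that each instance is a pair of a token list and a gold label list.

```python
def collect_errors(data, predictions, feat, gold_label, guess_label, window=2):
    # Hypothetical helper (not part of statnlpbook).
    # data        : list of (tokens, gold_tags) pairs
    # predictions : list of predicted tag sequences, aligned with data
    # feat        : feature function feat(tokens, i) -> dict
    # Returns one (context, token, features) triple per matching error.
    errors = []
    for (tokens, gold), guess in zip(data, predictions):
        for i, (g, p) in enumerate(zip(gold, guess)):
            if g == gold_label and p == guess_label:
                # Surrounding words, clipped at the sentence boundaries.
                context = tokens[max(0, i - window):i + window + 1]
                errors.append((context, tokens[i], feat(tokens, i)))
    return errors


# Toy data: one sentence in which two proper nouns are tagged as common nouns.
toy_data = [(['his', 'wife', 'own', 'smash', 'burger'], ['D', 'N', 'V', '^', '^'])]
toy_pred = [['D', 'N', 'N', 'N', 'N']]
toy_feat = lambda x, i: {'word': x[i]}

found = collect_errors(toy_data, toy_pred, toy_feat, gold_label='^', guess_label='N')
for context, token, features in found:
    print(token, context, features)
```

Listing the context next to the features makes it easier to see whether a mistake stems from missing information (no feature distinguishes the two labels) or from a bug in feature extraction.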
" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"fb4de120-25b0-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#fb4de120-25b0-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#fb4de120-25b0-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " \n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c5112b0>" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(local_2.errors(dev[:10], filter_guess=lambda y: y=='N',filter_gold=lambda y: y=='^'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By inspecting proper noun instances in this way we can notice that proper nouns tend to be capitalised. This suggests that the model could benefit from a feature representation that captures whether a word starts with a lower or upper case character. Indeed, adding such feature leads to better performance." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.7337756583039602" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def feat_3(x,i):\n", " return {\n", " **feat_2(x,i),\n", " 'is_lower':x[i].islower()\n", " }\n", "local_3 = seq.LocalSequenceLabeler(feat_3, train, C=10)\n", "seq.accuracy(dev, local_3.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This improvement indeed comes from being able to identify proper nouns when they are capitalised. This can be observed when inspecting the corresponding errors: some of previous errors are now fixed, and the specific error count on the first 10 development instances has reduced from 7 to 3. 
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"fd35a748-25b0-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#fd35a748-25b0-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#fd35a748-25b0-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " <div class=\"item active\"><table style=\"\"><tr><td>players</td><td>and</td><td>his</td><td>wife</td><td>own</td><td><b>smash</b></td><td>burger</td></tr><tr><td>N</td><td>&</td><td>D</td><td>N</td><td>V</td><td><b>^</b></td><td>^</td></tr><tr><td>N</td><td>&</td><td>D</td><td>N</td><td>N</td><td><b>N</b></td><td>N</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>smash</td></tr>\n", " <tr><td>2.02</td><td>1.71</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 1 / 3</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>and</td><td>his</td><td>wife</td><td>own</td><td>smash</td><td><b>burger</b></td></tr><tr><td>&</td><td>D</td><td>N</td><td>V</td><td>^</td><td><b>^</b></td></tr><tr><td>&</td><td>D</td><td>N</td><td>N</td><td>N</td><td><b>N</b></td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>burger</td></tr>\n", " <tr><td>2.02</td><td>1.71</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 2 / 3</div>\n", "<div class=\"item\"><table 
style=\"\"><tr><td>me</td><td>for</td><td>blowing</td><td>up</td><td>your</td><td><b>youtube</b></td><td>comment</td><td>section</td><td>.</td></tr><tr><td>O</td><td>P</td><td>V</td><td>T</td><td>D</td><td><b>^</b></td><td>N</td><td>N</td><td>,</td></tr><tr><td>O</td><td>P</td><td>N</td><td>T</td><td>D</td><td><b>N</b></td><td>N</td><td>N</td><td>,</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>youtube</td></tr>\n", " <tr><td>2.02</td><td>1.71</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 3 / 3</div>\n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c4daac8>" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(local_3.errors(dev[:10], filter_guess=lambda y: y=='N',filter_gold=lambda y: y=='^'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us look for further problems we could fix with more features, and consider the confusion matrix of the current model." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "image/png": "[base64-encoded PNG of the confusion matrix plot omitted]", "text/plain": [ "<matplotlib.figure.Figure at 0x7f444c5116d8>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "seq.plot_confusion_matrix(dev, local_3.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another high frequency error stems from misclassifying verbs ('V') as common nouns ('N'). We can understand these errors better by again inspecting examples. 
" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"fd83fb3c-25b0-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#fd83fb3c-25b0-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#fd83fb3c-25b0-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " <div class=\"item active\"><table style=\"\"><tr><td>the</td><td>players</td><td>and</td><td>his</td><td>wife</td><td><b>own</b></td><td>smash</td><td>burger</td></tr><tr><td>D</td><td>N</td><td>&</td><td>D</td><td>N</td><td><b>V</b></td><td>^</td><td>^</td></tr><tr><td>D</td><td>N</td><td>&</td><td>D</td><td>N</td><td><b>N</b></td><td>N</td><td>N</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>own</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 1 / 9</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>RT</td><td>@TheRealQuailman</td><td>:</td><td>Currently</td><td><b>laughing</b></td><td>at</td><td>Laker</td><td>haters</td><td>.</td></tr><tr><td>~</td><td>@</td><td>~</td><td>R</td><td><b>V</b></td><td>P</td><td>^</td><td>N</td><td>,</td></tr><tr><td>~</td><td>@</td><td>~</td><td>!</td><td><b>N</b></td><td>P</td><td>^</td><td>N</td><td>,</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>laughing</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 2 / 9</div>\n", "<div class=\"item\"><table 
style=\"\"><tr><td>@ShiversTheNinja</td><td><b>forgive</b></td><td>me</td><td>for</td><td>blowing</td><td>up</td></tr><tr><td>@</td><td><b>V</b></td><td>O</td><td>P</td><td>V</td><td>T</td></tr><tr><td>@</td><td><b>N</b></td><td>O</td><td>P</td><td>N</td><td>T</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>forgive</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 3 / 9</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>@ShiversTheNinja</td><td>forgive</td><td>me</td><td>for</td><td><b>blowing</b></td><td>up</td><td>your</td><td>youtube</td><td>comment</td></tr><tr><td>@</td><td>V</td><td>O</td><td>P</td><td><b>V</b></td><td>T</td><td>D</td><td>^</td><td>N</td></tr><tr><td>@</td><td>N</td><td>O</td><td>P</td><td><b>N</b></td><td>T</td><td>D</td><td>N</td><td>N</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>blowing</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 4 / 9</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>Question</td><td>:</td><td>How</td><td>CAN</td><td>you</td><td><b>mend</b></td><td>a</td><td>broken</td><td>heart</td><td>?</td></tr><tr><td>N</td><td>,</td><td>R</td><td>V</td><td>O</td><td><b>V</b></td><td>D</td><td>A</td><td>N</td><td>,</td></tr><tr><td>!</td><td>~</td><td>R</td><td>V</td><td>O</td><td><b>N</b></td><td>D</td><td>V</td><td>V</td><td>,</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>mend</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 5 / 9</div>\n", "<div class=\"item\"><table 
style=\"\"><tr><td>last</td><td>night</td><td>,</td><td>but</td><td>didn't</td><td><b>bother</b></td><td>calling</td><td>Shawn</td><td>because</td><td>I'd</td></tr><tr><td>A</td><td>N</td><td>,</td><td>&</td><td>V</td><td><b>V</b></td><td>V</td><td>^</td><td>P</td><td>L</td></tr><tr><td>A</td><td>N</td><td>,</td><td>&</td><td>V</td><td><b>N</b></td><td>V</td><td>!</td><td>P</td><td>L</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>bother</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 6 / 9</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>are</td><td>in</td><td>!</td><td>See</td><td>who</td><td><b>passed</b></td><td>and</td><td>who</td><td>made</td><td>the</td></tr><tr><td>V</td><td>P</td><td>,</td><td>V</td><td>O</td><td><b>V</b></td><td>&</td><td>O</td><td>V</td><td>D</td></tr><tr><td>V</td><td>P</td><td>,</td><td>V</td><td>O</td><td><b>N</b></td><td>&</td><td>O</td><td>V</td><td>D</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>passed</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 7 / 9</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>and</td><td>watch</td><td>the</td><td>news</td><td>and</td><td><b>tune</b></td><td>out</td><td>over</td><td>some</td><td>fresh</td></tr><tr><td>&</td><td>V</td><td>D</td><td>N</td><td>&</td><td><b>V</b></td><td>T</td><td>P</td><td>D</td><td>A</td></tr><tr><td>&</td><td>V</td><td>D</td><td>N</td><td>&</td><td><b>N</b></td><td>T</td><td>P</td><td>D</td><td>A</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>tune</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " 
<tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 8 / 9</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>that</td><td>,</td><td>regretfully</td><td>I</td><td>was</td><td><b>tied</b></td><td>up</td><td>,</td><td>Physed</td><td>at</td></tr><tr><td>P</td><td>,</td><td>R</td><td>O</td><td>V</td><td><b>V</b></td><td>T</td><td>,</td><td>N</td><td>P</td></tr><tr><td>P</td><td>,</td><td>N</td><td>O</td><td>V</td><td><b>N</b></td><td>T</td><td>,</td><td>!</td><td>P</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>tied</td></tr>\n", " <tr><td>-1.49</td><td>3.62</td><td>0.00</td></tr>\n", " <tr><td>-1.20</td><td>4.06</td><td>0.00</td></tr>\n", " </table> 9 / 9</div>\n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c159358>" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(local_3.errors(dev[:20], filter_guess=lambda y: y=='N',filter_gold=lambda y: y=='V'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We find several errors where verbs like \"laughing\", \"blowing\" or \"passed\" are misclassified as common nouns. When looking at the features of those instances and the corresponding weights, we see that the weights for the $f_{\\text{word},w}$ feature template are $0$. This usually suggests that the word has not appeared (or not appeared as a verb) in the training set. However, we can tell that these words may be verbs without having seen them before because they come with standard verb suffixes such as \"ing\" or \"ed\". We can capture such cases by adding features that look at the last two and last three characters of the target token."
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.7785610615799295" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def feat_4(x,i):\n", " return {\n", " **feat_3(x,i),\n", " 'last_3': \"\".join(x[i][-3:]),\n", " 'last_2': \"\".join(x[i][-2:]),\n", " }\n", "local_4 = seq.LocalSequenceLabeler(feat_4, train)\n", "seq.accuracy(dev, local_4.predict(dev))" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"001d7ff8-25b1-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#001d7ff8-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#001d7ff8-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " <div class=\"item active\"><table style=\"\"><tr><td>the</td><td>players</td><td>and</td><td>his</td><td>wife</td><td><b>own</b></td><td>smash</td><td>burger</td></tr><tr><td>D</td><td>N</td><td>&</td><td>D</td><td>N</td><td><b>V</b></td><td>^</td><td>^</td></tr><tr><td>D</td><td>N</td><td>&</td><td>D</td><td>N</td><td><b>N</b></td><td>V</td><td>N</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>last_2</td><td>last_3</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>wn</td><td>own</td><td>own</td></tr>\n", " <tr><td>-1.50</td><td>2.63</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " <tr><td>-2.20</td><td>2.62</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " </table> 1 / 4</div>\n", "<div class=\"item\"><table 
style=\"\"><tr><td>Question</td><td>:</td><td>How</td><td>CAN</td><td>you</td><td><b>mend</b></td><td>a</td><td>broken</td><td>heart</td><td>?</td></tr><tr><td>N</td><td>,</td><td>R</td><td>V</td><td>O</td><td><b>V</b></td><td>D</td><td>A</td><td>N</td><td>,</td></tr><tr><td>N</td><td>~</td><td>R</td><td>V</td><td>O</td><td><b>N</b></td><td>D</td><td>V</td><td>V</td><td>,</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>last_2</td><td>last_3</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>nd</td><td>end</td><td>mend</td></tr>\n", " <tr><td>-1.50</td><td>2.63</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " <tr><td>-2.20</td><td>2.62</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " </table> 2 / 4</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>last</td><td>night</td><td>,</td><td>but</td><td>didn't</td><td><b>bother</b></td><td>calling</td><td>Shawn</td><td>because</td><td>I'd</td></tr><tr><td>A</td><td>N</td><td>,</td><td>&</td><td>V</td><td><b>V</b></td><td>V</td><td>^</td><td>P</td><td>L</td></tr><tr><td>A</td><td>N</td><td>,</td><td>&</td><td>V</td><td><b>N</b></td><td>V</td><td>N</td><td>P</td><td>L</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>last_2</td><td>last_3</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>er</td><td>her</td><td>bother</td></tr>\n", " <tr><td>-1.50</td><td>2.63</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " <tr><td>-2.20</td><td>2.62</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " </table> 3 / 4</div>\n", "<div class=\"item\"><table 
style=\"\"><tr><td>and</td><td>watch</td><td>the</td><td>news</td><td>and</td><td><b>tune</b></td><td>out</td><td>over</td><td>some</td><td>fresh</td></tr><tr><td>&</td><td>V</td><td>D</td><td>N</td><td>&</td><td><b>V</b></td><td>T</td><td>P</td><td>D</td><td>A</td></tr><tr><td>&</td><td>V</td><td>D</td><td>N</td><td>&</td><td><b>N</b></td><td>T</td><td>P</td><td>D</td><td>A</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>last_2</td><td>last_3</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>ne</td><td>une</td><td>tune</td></tr>\n", " <tr><td>-1.50</td><td>2.63</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " <tr><td>-2.20</td><td>2.62</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " </table> 4 / 4</div>\n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c511198>" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(local_4.errors(dev[:20], filter_guess=lambda y: y=='N',filter_gold=lambda y: y=='V' ))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This indeed removed the earlier mistakes for unseen verbs.\n", "\n", "## Maximum Entropy Markov Models (MEMM)\n", "\n", "There is an observation we have not made use of. For PoS tagging as well as many other sequence labelling tasks we often have *dependencies* between consecutive labels. For example, after a non-possessive pronoun (\"O\") such as \"you\" a verb is more likely than a noun. This means that the current label $y_i$ does not only depend on the observation $\\x$ and index $i$, but also on the previous label $y_{i-1}$ (or even labels further in the past). Our *local* model above cannot capture this. \n", "\n", "One simple extension to the local model is the [Maximum Entropy Markov Model (MEMM)](http://www.ai.mit.edu/courses/6.891-nlp/READINGS/maxent.pdf). 
This model can again be understood as a product of local logistic regression (aka Maximum Entropy) classifiers $\\prob_\\params(y_i|\\x,y_{i-1},i)$, but now each classifier can use the previous label as an observed feature, and hence the model makes a first-order Markov assumption. Let $y_0=\\text{PAD}$ be a padding label, so that the first term $\\prob_\\params(y_1|\\x,y_{0},1)$ is well defined. The MEMM then defines the following distribution over label sequences $\\y$ of length $n=|\\x|$:\n", "\n", "$$\n", "p_\\params(\\y|\\x) = \\prod_{i=1}^n p_\\params(y_i|\\x,y_{i-1},i)\n", "$$\n", "\n", "The individual terms are defined as log-linear models as before, but this time the features have access to the previous label: \n", "\n", "$$\n", " p_\\params(y_i|\\x,y_{i-1},i) = \\frac{1}{Z_{\\x,y_{i-1},i}} \\exp \\langle \\repr(\\x,y_{i-1},i),\\params_{y_i} \\rangle\n", "$$\n", "\n", "where $Z_{\\x,y_{i-1},i}=\\sum_y \\exp \\langle \\repr(\\x,y_{i-1},i),\\params_{y} \\rangle $ is a *local* per-token normalisation factor.\n", "\n", "We show the factor graph of this model below. The observed \"P\" node corresponds to our setting of $y_0=\\text{PAD}$. 
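\n", "\n", "To make the factorisation concrete, consider a small illustration with a hypothetical three-token input $\\x=(\\text{you},\\text{mend},\\text{hearts})$ and label sequence $\\y=(\\text{O},\\text{V},\\text{N})$ (an invented example, not taken from the data above). Under these assumptions the product expands to\n", "\n", "$$\n", "p_\\params(\\y|\\x) = p_\\params(\\text{O}|\\x,\\text{PAD},1) \\, p_\\params(\\text{V}|\\x,\\text{O},2) \\, p_\\params(\\text{N}|\\x,\\text{V},3)\n", "$$\n", "\n", "so each factor is an ordinary multi-class classification decision that is conditioned on the preceding label, and the first factor conditions on the padding label. 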
" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n", "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n", " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n", "<!-- Generated by graphviz version 2.38.0 (20140413.2041)\n", " -->\n", "<!-- Title: %3 Pages: 1 -->\n", "<svg width=\"656pt\" height=\"142pt\"\n", " viewBox=\"0.00 0.00 656.00 141.89\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n", "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 137.89)\">\n", "<title>%3</title>\n", "<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-137.89 651.998,-137.89 651.998,4 -4,4\"/>\n", "<!-- x -->\n", "<g id=\"node1\" class=\"node\"><title>x</title>\n", "<ellipse fill=\"lightgrey\" stroke=\"black\" cx=\"322.5\" cy=\"-18\" rx=\"18\" ry=\"18\"/>\n", "<text text-anchor=\"middle\" x=\"322.5\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">x</text>\n", "</g>\n", "<!-- y0 -->\n", "<g id=\"node2\" class=\"node\"><title>y0</title>\n", "<ellipse fill=\"lightgrey\" stroke=\"black\" cx=\"63.5\" cy=\"-91.4983\" rx=\"18\" ry=\"18\"/>\n", "<text text-anchor=\"middle\" x=\"63.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">P</text>\n", "</g>\n", "<!-- t1 -->\n", "<g id=\"node9\" class=\"node\"><title>t1</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"27,-102.998 0,-102.998 0,-79.9983 27,-79.9983 27,-102.998\"/>\n", "<text text-anchor=\"middle\" x=\"13.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">t1</text>\n", "</g>\n", "<!-- y0--t1 -->\n", "<g id=\"edge2\" class=\"edge\"><title>y0--t1</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M45.3359,-91.4983C39.2812,-91.4983 33.2266,-91.4983 27.1719,-91.4983\"/>\n", "</g>\n", "<!-- y1 -->\n", "<g id=\"node3\" class=\"node\"><title>y1</title>\n", "<ellipse 
fill=\"none\" stroke=\"black\" cx=\"118.5\" cy=\"-91.4983\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"118.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">y1</text>\n", "</g>\n", "<!-- y1--t1 -->\n", "<g id=\"edge1\" class=\"edge\"><title>y1--t1</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M107.478,-107.637C100.971,-115.687 91.9629,-124.622 81.5,-128.997 66.7385,-135.169 60.0289,-135.698 45.5,-128.997 33.9982,-123.691 25.1,-111.987 19.6393,-103.056\"/>\n", "</g>\n", "<!-- t2 -->\n", "<g id=\"node10\" class=\"node\"><title>t2</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"183,-102.998 156,-102.998 156,-79.9983 183,-79.9983 183,-102.998\"/>\n", "<text text-anchor=\"middle\" x=\"169.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">t2</text>\n", "</g>\n", "<!-- y1--t2 -->\n", "<g id=\"edge5\" class=\"edge\"><title>y1--t2</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M138.023,-91.4983C143.925,-91.4983 149.827,-91.4983 155.729,-91.4983\"/>\n", "</g>\n", "<!-- y2 -->\n", "<g id=\"node4\" class=\"node\"><title>y2</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"220.5\" cy=\"-91.4983\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"220.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">y2</text>\n", "</g>\n", "<!-- y2--t2 -->\n", "<g id=\"edge4\" class=\"edge\"><title>y2--t2</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M200.896,-91.4983C194.946,-91.4983 188.996,-91.4983 183.047,-91.4983\"/>\n", "</g>\n", "<!-- t3 -->\n", "<g id=\"node11\" class=\"node\"><title>t3</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"285,-102.998 258,-102.998 258,-79.9983 285,-79.9983 285,-102.998\"/>\n", "<text text-anchor=\"middle\" x=\"271.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">t3</text>\n", "</g>\n", "<!-- y2--t3 -->\n", "<g id=\"edge8\" 
class=\"edge\"><title>y2--t3</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M240.023,-91.4983C245.925,-91.4983 251.827,-91.4983 257.729,-91.4983\"/>\n", "</g>\n", "<!-- y3 -->\n", "<g id=\"node5\" class=\"node\"><title>y3</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"322.5\" cy=\"-91.4983\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"322.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">y3</text>\n", "</g>\n", "<!-- y3--t3 -->\n", "<g id=\"edge7\" class=\"edge\"><title>y3--t3</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M302.896,-91.4983C296.946,-91.4983 290.996,-91.4983 285.047,-91.4983\"/>\n", "</g>\n", "<!-- t4 -->\n", "<g id=\"node12\" class=\"node\"><title>t4</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"387,-102.998 360,-102.998 360,-79.9983 387,-79.9983 387,-102.998\"/>\n", "<text text-anchor=\"middle\" x=\"373.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">t4</text>\n", "</g>\n", "<!-- y3--t4 -->\n", "<g id=\"edge11\" class=\"edge\"><title>y3--t4</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M342.023,-91.4983C347.925,-91.4983 353.827,-91.4983 359.729,-91.4983\"/>\n", "</g>\n", "<!-- y4 -->\n", "<g id=\"node6\" class=\"node\"><title>y4</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"424.5\" cy=\"-91.4983\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"424.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">y4</text>\n", "</g>\n", "<!-- y4--t4 -->\n", "<g id=\"edge10\" class=\"edge\"><title>y4--t4</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M404.896,-91.4983C398.946,-91.4983 392.996,-91.4983 387.047,-91.4983\"/>\n", "</g>\n", "<!-- t5 -->\n", "<g id=\"node13\" class=\"node\"><title>t5</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"489,-102.998 462,-102.998 462,-79.9983 489,-79.9983 489,-102.998\"/>\n", "<text text-anchor=\"middle\" x=\"475.5\" 
y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">t5</text>\n", "</g>\n", "<!-- y4--t5 -->\n", "<g id=\"edge14\" class=\"edge\"><title>y4--t5</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M444.023,-91.4983C449.925,-91.4983 455.827,-91.4983 461.729,-91.4983\"/>\n", "</g>\n", "<!-- y5 -->\n", "<g id=\"node7\" class=\"node\"><title>y5</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"526.5\" cy=\"-91.4983\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"526.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">y5</text>\n", "</g>\n", "<!-- y5--t5 -->\n", "<g id=\"edge13\" class=\"edge\"><title>y5--t5</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M506.896,-91.4983C500.946,-91.4983 494.996,-91.4983 489.047,-91.4983\"/>\n", "</g>\n", "<!-- t6 -->\n", "<g id=\"node14\" class=\"node\"><title>t6</title>\n", "<polygon fill=\"black\" stroke=\"black\" points=\"591,-102.998 564,-102.998 564,-79.9983 591,-79.9983 591,-102.998\"/>\n", "<text text-anchor=\"middle\" x=\"577.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"white\">t6</text>\n", "</g>\n", "<!-- y5--t6 -->\n", "<g id=\"edge17\" class=\"edge\"><title>y5--t6</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M546.023,-91.4983C551.925,-91.4983 557.827,-91.4983 563.729,-91.4983\"/>\n", "</g>\n", "<!-- y6 -->\n", "<g id=\"node8\" class=\"node\"><title>y6</title>\n", "<ellipse fill=\"none\" stroke=\"black\" cx=\"628.5\" cy=\"-91.4983\" rx=\"19.4965\" ry=\"19.4965\"/>\n", "<text text-anchor=\"middle\" x=\"628.5\" y=\"-87.7983\" font-family=\"Times,serif\" font-size=\"14.00\">y6</text>\n", "</g>\n", "<!-- y6--t6 -->\n", "<g id=\"edge16\" class=\"edge\"><title>y6--t6</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M608.896,-91.4983C602.946,-91.4983 596.996,-91.4983 591.047,-91.4983\"/>\n", "</g>\n", "<!-- t1--x -->\n", "<g id=\"edge3\" class=\"edge\"><title>t1--x</title>\n", "<path fill=\"none\" 
stroke=\"black\" d=\"M24.4484,-79.8712C28.0399,-76.9086 32.2154,-73.9665 36.5,-72 130.814,-28.7126 258.003,-20.7695 304.513,-19.3202\"/>\n", "</g>\n", "<!-- t2--x -->\n", "<g id=\"edge6\" class=\"edge\"><title>t2--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M181.673,-79.626C185.04,-76.9268 188.791,-74.1843 192.5,-72 230.513,-49.6134 279.759,-32.3885 305.352,-24.2173\"/>\n", "</g>\n", "<!-- t3--x -->\n", "<g id=\"edge9\" class=\"edge\"><title>t3--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M279.025,-79.9485C287.892,-67.5182 302.657,-46.8178 312.485,-33.0404\"/>\n", "</g>\n", "<!-- t4--x -->\n", "<g id=\"edge12\" class=\"edge\"><title>t4--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M365.975,-79.9485C357.108,-67.5182 342.343,-46.8178 332.515,-33.0404\"/>\n", "</g>\n", "<!-- t5--x -->\n", "<g id=\"edge15\" class=\"edge\"><title>t5--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M463.327,-79.626C459.96,-76.9268 456.209,-74.1843 452.5,-72 414.487,-49.6134 365.241,-32.3885 339.648,-24.2173\"/>\n", "</g>\n", "<!-- t6--x -->\n", "<g id=\"edge18\" class=\"edge\"><title>t6--x</title>\n", "<path fill=\"none\" stroke=\"black\" d=\"M566.5,-79.9798C562.907,-77.0216 558.743,-74.0547 554.5,-72 480.316,-36.0765 380.86,-23.956 340.573,-20.3558\"/>\n", "</g>\n", "</g>\n", "</svg>\n" ], "text/plain": [ "<graphviz.dot.Graph at 0x7f444bfedf28>" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seq.draw_transition_fg(7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training MEMMs\n", "We can train MEMMs by optimising the conditional likelihood of the gold label sequences, $\\sum_{(\\x,\\y) \\in \\train} \\log \\prob_\\params(\\y|\\x)$. 
Since the model factorises over tokens, this objective can be written as:\n", "\n", "$$\n", "\\sum_{(\\x,\\y) \\in \\train} \\sum_{i=1}^{|\\x|} \\log \\prob_\\params(y_i|\\x,y_{i-1},i) \n", "$$\n", "\n", "The objective is equivalent to a logistic regression objective for a classifier that assigns each label based on the observation and the previous *gold* label. This makes MEMMs easy to train: one can simply use a logistic regression classifier library and prepare a list of classifier training instances by iterating over all sequences and generating one training instance per token. \n", "\n", "### Prediction in MEMMs\n", "To predict the best label sequence we need to find a $\\y^*$ with maximal conditional probability given the observed word sequence $\\x$:\n", "\n", "$$\n", "\\y^* =\\argmax_\\y \\prob_\\params(\\y|\\x).\n", "$$\n", "\n", "Due to the label dependencies, we cannot simply choose each label in isolation in order to find the maximum. One solution to this problem is a forward greedy method:\n", "\n", "1. set $y_0 \\leftarrow \\text{PAD}$\n", "1. for $i$ in $1 \\ldots |\\x|$:\n", "    1. $y_i \\leftarrow \\argmax_{y} \\prob_\\params(y|\\x,y_{i-1},i)$\n", " \n", "This is only an approximation: a locally optimal $y_i$ need not be part of the best sequence, because a different label $y_i'$ at position $i$ could allow a next label $y_{i+1}'$ that leads to a much higher aggregate probability. In other words, you may find that \n", "\n", "$$\n", "\\prob_\\params(y_i'|\\x,y_{i-1},i) \\prob_\\params(y_{i+1}'|\\x,y_i',i+1) > \\prob_\\params(y_i|\\x,y_{i-1},i)\\prob_\\params(y_{i+1}|\\x,y_i,i+1)\n", "$$ \n", "\n", "even though $\\prob_\\params(y_i|\\x,y_{i-1},i) > \\prob_\\params(y_i'|\\x,y_{i-1},i)$. We will address this issue later.\n", "\n", "We provide an implementation of MEMMs, wrapping around a Scikit-Learn logistic regression model, in `seq.MEMMSequenceLabeler`. Below we first implement the greedy prediction algorithm above using this class. The class provides a local `predict_next` method that chooses the best label conditioned on a previous label. 
With this function at hand, the greedy prediction algorithm is easy to implement:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "def memm_greedy_predict(memm: seq.MEMMSequenceLabeler, data, use_gold_history=False):\n", "    result = []\n", "    for x, y in data:\n", "        y_guess = []\n", "        for i in range(len(x)):\n", "            # condition on the model's own earlier predictions, or on the gold labels if requested\n", "            prediction = memm.predict_next(x, i, y_guess if not use_gold_history else y)\n", "            y_guess.append(prediction)\n", "        result.append(y_guess)\n", "    return result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are ready to define a specific MEMM model. To specify the model we need to define its feature function. We reuse the feature function from before, which accesses only $\\x$ and $i$, and add to it (see the `**feat_4(x,i)` notation) a new feature that captures the first item of the label history so far. Notice that we could also access labels further in the past (after increasing the `order` appropriately), but we leave this for the reader to test. " ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.8090400165871864" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def memm_feat_1(x,i,hist):\n", "    return {\n", "        **feat_4(x,i),\n", "        'prev_y': hist[0],\n", "#        'prev_nom': hist[0] in {'N','^','O','S','Z'}\n", "    }\n", "\n", "memm_1 = seq.MEMMSequenceLabeler(memm_feat_1, train, order=1, C=10)\n", "seq.accuracy(dev,memm_greedy_predict(memm_1, dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have improved the predictions even further, and avoided some of the verb mistakes we observed earlier. To illustrate this, let us consider the remaining errors on the development subset used earlier. We see that instead of 4 errors we now encounter only 2. 
" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"04481688-25b1-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#04481688-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#04481688-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " <div class=\"item active\"><table style=\"\"><tr><td>the</td><td>players</td><td>and</td><td>his</td><td>wife</td><td><b>own</b></td><td>smash</td><td>burger</td></tr><tr><td>D</td><td>N</td><td>&</td><td>D</td><td>N</td><td><b>V</b></td><td>^</td><td>^</td></tr><tr><td>D</td><td>N</td><td>&</td><td>D</td><td>N</td><td><b>N</b></td><td>V</td><td>A</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>last_2</td><td>last_3</td><td>prev_y</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>wn</td><td>own</td><td>N</td><td>own</td></tr>\n", " <tr><td>-1.47</td><td>1.79</td><td>0.00</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " <tr><td>-1.82</td><td>1.22</td><td>0.00</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " </table> 1 / 2</div>\n", "<div class=\"item\"><table style=\"\"><tr><td>and</td><td>watch</td><td>the</td><td>news</td><td>and</td><td><b>tune</b></td><td>out</td><td>over</td><td>some</td><td>fresh</td></tr><tr><td>&</td><td>V</td><td>D</td><td>N</td><td>&</td><td><b>V</b></td><td>T</td><td>P</td><td>D</td><td>A</td></tr><tr><td>&</td><td>V</td><td>D</td><td>N</td><td>&</td><td><b>N</b></td><td>P</td><td>O</td><td>D</td><td>A</td></tr></table>\n", " <table>\n", " <tr><td>first_at</td><td>is_lower</td><td>last_2</td><td>last_3</td><td>prev_y</td><td>word</td></tr>\n", " <tr><td>False</td><td>True</td><td>ne</td><td>une</td><td>&</td><td>tune</td></tr>\n", " 
<tr><td>-1.47</td><td>1.79</td><td>0.00</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " <tr><td>-1.82</td><td>1.22</td><td>0.00</td><td>0.00</td><td>0.00</td><td>0.00</td></tr>\n", " </table> 2 / 2</div>\n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c4df518>" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "util.Carousel(seq.errors(dev[:20], memm_greedy_predict(memm_1, dev[:20]), 'V', 'N',model=memm_1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We should also inspect what transition weights the model learnt. For the case of verbs ('V') we observe a high weight for $f_{\\text{prev_y},\\text{O}}$, indicating that pronouns are often followed by verbs, as we expected earlier. " ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlYAAAGSCAYAAAAhNI8gAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X28LVVd+PHPl3sv4gNys3sU5EEgUVE0xRsQmlKKyUPS\ngxkoWqSRBIZlAj9LDCvDUkskRQxUjCAqQ1JIyeeHUC9EGGqFpolSXjUhkrQL398fa23Z7M7l7HP2\nmtn77Pt5v17zOvthzqw1s9as+e41a2YiM5EkSdLktpt2BiRJkuaFgZUkSVIjBlaSJEmNGFhJkiQ1\nYmAlSZLUiIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS1MjaaSW8YcOG3HPPPaeVvCRJ0tiuvvrqr2Xm\nwlLzTS2w2nPPPdm0adO0kpckSRpbRHxxnPk8FShJktSIgZUkSVIjBlaSJEmNGFhJkiQ1YmAlSZLU\niIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMGVpIkSY0YWEmSJDViYCVJktTI2mlnoGt7\nnvaupsv7wplHNF2eJEmaH/ZYSZIkNWJgJUmS1IiBlSRJUiNLBlYRsUNEfCIi/iEiro+IMxaZ55CI\nuDkirq3T6d1kV5IkaXaNM3j928CPZOatEbEO+EhEXJGZV43M9+HMPLJ9FiVJklaHJQOrzEzg1vp2\nXZ2yy0xJkiStRmONsYqINRFxLfBV4MrM/Pgisx0cEddFxBUR8YitLOf4iNgUEZs2b948QbYlSZJm\nz1iBVWbenpmPBnYDDoiI/UZmuQbYIzMfBbwOuHQryzk3Mzdm5saFhYVJ8i1JkjRzlnVVYGZ+E3g/\n8NSRz2/JzFvr68uBdRGxoVkuJUmSVoFxrgpciIj19fU9gUOBz47Ms3NERH19QF3u19tnV5IkaXaN\nc1XgLsBbI2INJWC6JDPfGRHPB8jMc4CnAydExBbgNuDoOuhdkiRpmzHOVYHXAY9Z5PNzhl6fDZzd\nNmuSJEmri3delyRJasTASpIkqREDK0mSpEYMrCRJkhoxsJIkSWrEwEqSJKkRAytJkqRGDKwkSZIa\nMb
CSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJasTASpIkqREDK0mSpEYMrCRJkhoxsJIkSWrE\nwEqSJKkRAytJkqRGDKwkSZIaMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJasTASpIkqRED\nK0mSpEYMrCRJkhoxsJIkSWrEwEqSJKkRAytJkqRGlgysImKHiPhERPxDRFwfEWcsMk9ExFkRcUNE\nXBcR+3eTXUmSpNm1dox5vg38SGbeGhHrgI9ExBWZedXQPIcB+9TpQOAN9a8kSdI2Y8keqyxurW/X\n1SlHZjsKuKDOexWwPiJ2aZtVSZKk2TbWGKuIWBMR1wJfBa7MzI+PzLIr8KWh9zfWz0aXc3xEbIqI\nTZs3b15pniVJkmbSWIFVZt6emY8GdgMOiIj9VpJYZp6bmRszc+PCwsJKFiFJkjSzlnVVYGZ+E3g/\n8NSRr74M7D70frf6mSRJ0jZjnKsCFyJifX19T+BQ4LMjs10GPKdeHXgQcHNm3tQ8t5IkSTNsnKsC\ndwHeGhFrKIHYJZn5zoh4PkBmngNcDhwO3AB8Cziuo/xKkiTNrCUDq8y8DnjMIp+fM/Q6gRPbZk2S\nJGl18c7rkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMGVpIkSY0YWEmSJDViYCVJktSIgZUkSVIjBlaS\nJEmNGFhJkiQ1YmAlSZLUiIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMGVpIkSY0YWEmS\nJDViYCVJktSIgZUkSVIjBlaSJEmNGFhJkiQ1YmAlSZLUiIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS\n1IiBlSRJUiMGVpIkSY0YWEmSJDViYCVJktTIkoFVROweEe+PiE9HxPURcfIi8xwSETdHxLV1Or2b\n7EqSJM2utWPMswV4UWZeExE7AldHxJWZ+emR+T6cmUe2z6IkSdLqsGSPVWbelJnX1Nf/BXwG2LXr\njEmSJK02yxpjFRF7Ao8BPr7I1wdHxHURcUVEPGIr/398RGyKiE2bN29edmYlSZJm2diBVUTcB/hL\n4IWZecvI19cAe2Tmo4DXAZcutozMPDczN2bmxoWFhZXmWZIkaSaNFVhFxDpKUHVhZr599PvMvCUz\nb62vLwfWRcSGpjmVJEmaceNcFRjAecBnMvM1W5ln5zofEXFAXe7XW2ZUkiRp1o1zVeDjgGcDn4qI\na+tnLwH2AMjMc4CnAydExBbgNuDozMwO8itJkjSzlgysMvMjQCwxz9nA2a0yJUmStBp553VJkqRG\nDKwkSZIaMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJasTASpIkqREDK0mSpEYMrCRJkhox\nsJIkSWrEwEqSJKkRAytJkqRGDKwkSZIaMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJamTt\ntDMwD/Y87V3Nl/mFM49ovkxJktQte6wkSZIaMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJ\nasTASpIkqREDK0mSpEYMrCRJkhoxsJIkSWrEwEqSJKmRJQOriNg9It4fEZ+OiOsj4uRF5omIOCsi\nboiI6yJi/26yK0mSNLvGeQjzFuBFmXlNROwIXB0RV2bmp4fmOQzYp04HAm+ofyVJkrYZS/ZYZeZN\nmXlNff1fwGeAXUdmOwq4IIurgPURsUvz3EqSJM2wZY2xiog9gccAHx/5alfgS0Pvb+T/Bl+SJElz\nbezAKiLuA/wl8MLMvGUliUXE8RGxKSI2bd68eSWLkCRJmlljBVYRsY4SVF2YmW9fZJYvA7sPvd+t\nfnYXmXluZm7MzI0LCwsrya8kSdLMGueqwADOAz6Tma/ZymyXAc+pVwceBNycmTc1zKckSdLMG+eq\nwMcBzwY+FRHX1s9eAuwBkJnnAJcDhwM3AN8CjmufVe152ruaLu8L
Zx7RdHmSJG3rlgysMvMjQCwx\nTwIntsqUJEnSauSd1yVJkhoZ51SgtiGtTzeCpxwlSdsOe6wkSZIaMbCSJElqxMBKkiSpEcdYaSoc\nyyVJmkf2WEmSJDViYCVJktSIgZUkSVIjBlaSJEmNOHhdc83nK0qS+mSPlSRJUiMGVpIkSY0YWEmS\nJDViYCVJktSIgZUkSVIjBlaSJEmNeLsFaUI+91CSNGCPlSRJUiMGVpIkSY0YWEmSJDViYCVJktSI\ngZUkSVIjBlaSJEmNGFhJkiQ1YmAlSZLUiIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMG\nVpIkSY0sGVhFxPkR8dWI+MetfH9IRNwcEdfW6fT22ZQkSZp9a8eY5y3A2cAFdzPPhzPzyCY5kiRJ\nWqWW7LHKzA8B3+ghL5IkSataqzFWB0fEdRFxRUQ8YmszRcTxEbEpIjZt3ry5UdKSJEmzoUVgdQ2w\nR2Y+CngdcOnWZszMczNzY2ZuXFhYaJC0JEnS7Jg4sMrMWzLz1vr6cmBdRGyYOGeSJEmrzMSBVUTs\nHBFRXx9Ql/n1SZcrSZK02ix5VWBEXAQcAmyIiBuBlwHrADLzHODpwAkRsQW4DTg6M7OzHEuSJM2o\nJQOrzDxmie/PptyOQZIkaZvmndclSZIaMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJasTA\nSpIkqREDK0mSpEYMrCRJkhpZ8s7rkmbDnqe9q+nyvnDmEU2XJ0myx0qSJKkZAytJkqRGDKwkSZIa\nMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJasTASpIkqREDK0mSpEZ8pI2ku/DROZK0cvZY\nSZIkNWJgJUmS1IinAiX1rvXpRvCUo6TZYI+VJElSIwZWkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMG\nVpIkSY0YWEmSJDWyZGAVEedHxFcj4h+38n1ExFkRcUNEXBcR+7fPpiRJ0uwb5wahbwHOBi7YyveH\nAfvU6UDgDfWvJE2Vzz2U1Lcle6wy80PAN+5mlqOAC7K4ClgfEbu0yqAkSdJq0WKM1a7Al4be31g/\n+z8i4viI2BQRmzZv3twgaUmSpNnR6+D1zDw3Mzdm5saFhYU+k5YkSepci8Dqy8DuQ+93q59JkiRt\nU1oEVpcBz6lXBx4E3JyZNzVYriRJ0qqy5FWBEXERcAiwISJuBF4GrAPIzHOAy4HDgRuAbwHHdZVZ\nSZKkWbZkYJWZxyzxfQInNsuRJEnSKuWd1yVJkhoZ5wahkqStaH0TUvBGpNJqZo+VJElSIwZWkiRJ\njRhYSZIkNWJgJUmS1IiBlSRJUiMGVpIkSY0YWEmSJDViYCVJktSINwiVpFWg9Y1IvQmp1A17rCRJ\nkhoxsJIkSWrEwEqSJKkRAytJkqRGDKwkSZIaMbCSJElqxNstSJK+y9s6SJOxx0qSJKkRAytJkqRG\nDKwkSZIaMbCSJElqxMBKkiSpEQMrSZKkRgysJEmSGjGwkiRJasQbhEqSetX6JqTgjUg1O+yxkiRJ\nasTASpIkqREDK0mSpEYMrCRJkhoZK7CKiKdGxD9FxA0Rcdoi3x8SETdHxLV1Or19ViVJkmbbklcF\nRsQa4I+AQ4EbgU9GxGWZ+emRWT+cmUd2kEdJkqRVYZzbLRwA3JCZnweIiIuBo4DRwEqSpJnR+rYO\n3tJB4xjnVOCuwJeG3t9YPxt1cERcFxFXRMQjFltQRBwfEZsiYtPmzZtXkF1JkqTZ1Wrw+jXAHpn5\nKOB1wKWLzZSZ52bmxszcuLCw0ChpSZKk2TBOYPVlYPeh97vVz74rM2/JzFvr68uBdRGxoVkuJUmS\nVoFxAqtPAvtExF4RsT1wNHDZ8AwRsXNERH19QF3u11tnVpIkaZYtOXg9M7dExEnAu4E1wPmZeX1E\nPL9+fw7wdOCEiNgC3AYcnZnZ
Yb4lSZo6n3uoUWM9hLme3rt85LNzhl6fDZzdNmuSJEmri3delyRJ\nasTASpIkqREDK0mSpEYMrCRJkhoxsJIkSWrEwEqSJKmRsW63IEmSpsf7Za0e9lhJkiQ1YmAlSZLU\niIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMGVpIkSY0YWEmSJDViYCVJktSIgZUkSVIj\nBlaSJEmNGFhJkiQ1YmAlSZLUiIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS1IiBlSRJUiMGVpIkSY0Y\nWEmSJDViYCVJktSIgZUkSVIjBlaSJEmNjBVYRcRTI+KfIuKGiDhtke8jIs6q318XEfu3z6okSdJs\nWzKwiog1wB8BhwEPB46JiIePzHYYsE+djgfe0DifkiRJM2+cHqsDgBsy8/OZ+R3gYuCokXmOAi7I\n4ipgfUTs0jivkiRJMy0y8+5niHg68NTMfF59/2zgwMw8aWiedwJnZuZH6vv3Aqdm5qaRZR1P6dFi\njz32eOwXv/jFlusiSZImsOdp72q6vC+ceUTnaWwtndYi4urM3LjUfL0OXs/MczNzY2ZuXFhY6DNp\nSZKkzo0TWH0Z2H3o/W71s+XOI0mSNNfWjjHPJ4F9ImIvSrB0NPDMkXkuA06KiIuBA4GbM/OmpjmV\nJEmd6uOU2rxbMrDKzC0RcRLwbmANcH5mXh8Rz6/fnwNcDhwO3AB8CziuuyxLkiTNpnF6rMjMyynB\n0/Bn5wy9TuDEtlmTJElaXbzzuiRJUiMGVpIkSY2MdSpQkiSphXkfIG+PlSRJUiMGVpIkSY0YWEmS\nJDViYCVJktSIgZUkSVIjBlaSJEmNGFhJkiQ1YmAlSZLUiIGVJElSIwZWkiRJjRhYSZIkNWJgJUmS\n1IiBlSRJUiMGVpIkSY1EZk4n4YjNwBenkvjiNgBfm4M05i2deVqXvtJxXbbtdFyXbTudeVqXPtMZ\nx4Myc2GpmaYWWM2aiNiUmRtXexrzls48rUtf6bgu23Y6rsu2nc48rUuf6bTkqUBJkqRGDKwkSZIa\nMbC607lzksa8pTNP69JXOq7Ltp2O67JtpzNP69JnOs04xkqSJKkRe6wkSZIaMbCSJElqxMBKklYg\nImLaeZA0ewysNHci4h49pNHLQTUidoyIdX2kNS8iYrv6t7MyiojvA06MiLVdpdG3rut0ROwSEUve\nXLFBOp3vL32Ve0RsX//21d6s6pggIh4YET9cX0/th8+q3ojTEhEPiIj9ImLviPieaeenhR4a1V7q\nWkSsB94VEY/rMI2HAC+OiFMj4t4dpnNf4O3AI+t7e0iWEBH3B86tB6ROtldEPBS4BPhqZm7pIo2a\nTtS/ayJiTVfpDNnQ1YIj4mHA+4GHd5VGTeehwHkR8f0dpvFg4BURcWiX7X9E3A94X0R8f2ZmV/t/\nFPcGyMw7Bp81TmN9y+XdjRcAzwTIKV6ZZ2C1TBGxL/AB4NeBi4FzIuJZHaTTea9LTede0G0lrI3d\nyyLipV3vYJn5TUr5vCoiDmi9/HqAuJhy0P5J4KyufiFn5i3APwC/GRH3mmZD0VpE3D8i9omIJ0XE\nDg0X/UBg58z8zuAg0VKty+8B3piZl3Qc7O4AkJm301GQOFDL4PSIeFAHy94XeBPwqsz8YMfb7ImU\nA+tRHe3/+1L2/68B/5mZ/9k6jYHM/AbwPuCtEbFfF/t/ROwPvLKm8caIeHFE/GDLQK4ey94TEb/W\nYnlLuAC4R013evFNZjqNOQG7Ap8Bjqvv96IcXP8eOLZhOvcDfh84qOP1eRjwRmC/jtO4BjgF+Dhw\nXkfprAe2G3p/MnA1cGDDNDYAnwV+t77fAbgWOKGD9dl+qI69Cdi3vt+ug7Sii+XeTXr7Ap8E/hj4\nCuU+NcdPuMx19e8a4KPAEzrK9yeA64G/AO492H4dpPWI2q68FfgT4FXAfVuXe/17X+AhlN7R72
uc\nxmCf+f2h8vkIcFRHdWv/2t5cApwKPLbhsncB/nHQ/g99fiDw0IbprB15f1qtc/vV9032VeAg4MOU\nXsQdgccBBwDvBZ7YuFwOAD4NnNhBme9R839v4Hsox+gHdVG/xp3ssVqe3YCPZeabATLzXzPz7cAZ\nwDMiYtdG6XwvcBtwbEQ8dvBhy196EbEb8DfAVZSKOPj8aRFxVKM0NgCXAu/OzN8DngDsFRE/PjLf\nRPWwjtv4F+C9EfHKiNgjM18LvAZ43fA2nNC9gb8C1kbEgZn5P8BlwH81Wj4RsUdE3Dszv1M/+hKl\n0Xgh3NlVP2Eag1NM+0TEfbO4IyLWR8QPTLr8JdJ+GPAW4A8z83mUA+HfAwdHxAkrXObOwKsj4vmU\nOvYpYPtF5lvx/hMRO1J+HPxBZj4C2AL8RS2rLk7TPBb4GPAG4PXA67P0YDZT830o8IeUH3JfyszP\ntUyDO/eZb0fEgcBFwCcy8x2tEoiIDfW0GZl5DXAe8G3gwcBP116ZFh4AXDVo/2vaL6b88Hlui32n\ntmUn1eEGAGTmmcDbgEsi4kF1X520zdwA/D/g6Zn5aUpA8mzgVuD5wMkRscckaQyltSYzPwEcS1m3\nF7dY7mDZlAD6ncCfAT8FfIsSbE1t+ISB1fLcDjwqIh4w8vlVlAbke1skkpn/AvwpcCPwvKHAYDAo\n9yENThXuAlyRmW/Ocqph4F7A3hMue2A0EPk25aD3oxHxqoh4dERsaBAsrKPsVHcAPwq8ICKuonTX\nbwZeWRv1iWTmFyldzV8Hjo6I3weeRBk7MrHaCJwCXBMRx9VttoXS+7ZPRBzRIo16QD0SeC2lbAan\nMg8GzoiITsbB1PEobwI2ZeaFAJn575S6/nfA/hGxkn3ogZSew8cBzwV+BHhLRPxERPxkTfueWX/e\nrtBtwBmZeVHN99HAN+kuuNoJ+O/MvCozP5aZn+9g3MujgTMp+85pmXly/bzZcWFon7mV0vNGZv7q\nUB4eFhH7rHT5NRD5LCXo+PX68YeBD1HW7R6U4KrFQ3yDofY/Iu5D6e05HfgG8KSYfFD7Qyg9oz8R\n5QIJ4LvB1cXAOyJixwZt5sOAzZn5H7VdeSWQlCEuD6Rsw4mHOETEdpl5e/17DXAc5TTtji3qc2be\nnpknUoK236Ycc24DTo2IHTr60bMkA6vl+Qrwr8BDBx/UA9W/A/9GOYU3kcGOmZmfpfy6+zdKcHVA\nraCHULq6Jz34fY1SwV8ZEa+IiF+JiGcAhwNPiYinRMRhkySwSCDySkrg8zFK43Eq8He1gZokna8A\nZ1F6x95X/55C6aK/jXKgvTwmGN81VC6foTRw3wR+BjgzM78cDQYX156jk4DfogS3b4uIF1F+Kb+H\nOrh4pQe+oaDqcEov68nARkpjB+VU7ccop4a6EJTxb1+LiB8dfJiZN1NO3TwR+OHlLjQzr8nM84Hn\nZOaxlLpwP8qg/1+NiEuBv42Ie660kc3MLZn5hZHPjqHUg4sHwdVKlr0V/0Y5WA+n13qMzXbAOZn5\n7lqvBwfCJmPTRvaZC4ALgZtqQEeUC0z+msnq27q63P8GjomI3wKeDPwy5VTXb1D2m2dEuRhkEjcx\n1P5n5q2UU9iXUtq4PYCJAqvM/Cjlh8b9gZ8ZCa5eDmyijiGa0ALlGABlOMufZeYJwJspQcoTmLCj\noPZUDXrXnlX3kauAJ2Xmf1F+3E9kKJD9cv0R8trMfDxwC/DnHeyXYzGwGsPgoFkP4B+i9IA8PiLW\n1wPV44AfYqQhXEk6mbklylVAz6yBydspp4N+KiJOooxHeU5m/v0E6WyXmf9KCXK+D3gUpet8b0oD\ntTfw45SgZKVpLBaIHAO8KDPfBhxZD0xH1gZq0nT+GXgXpYfqOcBNtSH6GeAI4PAsA9tXksZdygUY\nBIxvBJ4QEQeP9PqtdF3W1XX5k8x8KfCzlLEJPw/8EvDyiN
h3pQe+oaBqsF0eTvmlenn9/uuUMSQr\n2k5jpP8Nyqmtb1OC9+Hg6j9rPr68nGUOB7RDDejbKAeI36uN7AnA8zLztpU0soM0ImK7iDg2Iu4x\nCG5rHb4D+KuWPT2Ug3iTntCBQVAZ9RQwpb26SzvSMKga3meeRVmfi4H/oAxxeD7l9OMLM/PqlaZT\n2+TXA1cC76aclv8UpefihzPzvynB1VtXeip1qP3/d0pPzqD93ykz/7e2/z8P/HmW4QErMrT/f5Dy\nQ2OB0tu2b/3+8cD3U84qTOpfKcMyghKsfU9N/+OU+nw7ZRuuSC3/2+s+8W7gPsB3am/bt6NchfiG\nmOBiiaE6Ngjcdhgqq2fSzX45npziAK/VMAFr6t+1wM/U16dQfmm9hzKO53PA0xqlsx2lkfglykDP\ndZTg5+WUA97T63zBCgbNUgc+Dv09hNJzNHh/AGVQ8a4N1mUN5QqdoARrv0EZhPtDQ/OueODvaDr1\n9UMp3dlnA49rWP6j5bJ9LZfTKcHuRAOLR9blWOBe9f0GyliRc2o9+4EJ0ngK5ZTZg4GnUU5hP2hQ\nvyfdVltJc0/qgNVBGpSxii8BXg0cVj97LHAdy7iQYqRsjgXuMbTNPgk8tYPyP6G2BTuOzPeYBmkN\n9sEY+qzJwPjBcoAjgSuA9V2U91a22XBbtjelt/Qmapu50nUcrrN1X3wxJcjaE9iZcgHARPWarbf/\nl9V1ey1lfOePNUznmPr6CZTTW+8DfpMyFvbwRmW0E6Ut3q3uey+t++QfU8ZebaAMmt95wvJ/Z91e\n96YE1gcNzbfiNvNu9sv7jMw38X65ovxNI9HVMi1SeCcNfbcX5bTFkdSD3QQNxNYq4p9Rr2qrDdJD\nG6bzbEpwcAjwjtoonUQZr3BYw202HIjsTftAZGuN9yCdnToul32BvRuvywl1Pe47Mt+Kgt1BvQV+\nrtbbH6dc3fag4fRbT7WhO5/yy/FMygD8Pep3D6A04L9d/17NMn6c3M0226l+fgZwdMNyWfQA0Wrb\nsfUgceLAijuDqsPrdt6H0lt9fIvlj7nNhveZ7wMeMsn6sfiPqodRfrz9EQ2uBlykjg23/w+inLp+\nCvXg3aiduRI4sb5eQxlU/mTK6fofbFQPHkz5sfuLlHFv9wMeDzwG+LU6z7nAa1aQxnZD63I55bhy\nL8qVhi9aLD8N61jz/XLF23qaic/ydDeF9+e0vYR/7Io49D8r6alaLBAZpP0MSiDyeuDQDrZZl4FI\nJ+mspFwar8vFg4Z0kJcVLn/7utyDh7bLX9NxUDWU/sGUnqgzKL2umyinIXet359O6UX7sXHr9pjb\n7IVMcMuFvsp/kfXZaq/YhGkcXrf9g4GjKKcAV9wrPc1ttpW2bPhH1cuoP97GqU/LrGOt2/+tpXNJ\n43SGeyzfTf3RBjyP0nN1NuWH1xn19Xmj/7vM9C6k9HjdC/hb4ORJltd3HZt4e087A7M4TaPwlqiI\nE923ZIlG4geG5pv5Ct932XRcLp2vSz3YnA6cOpTW+uF60cE2u8upLEovwjPq+1Mop7Q/XD9/JDUA\nHqf+zVj5t+hNGuu0yYRp3N0p4C7uizaNtmw1/qjqK53FeiwPA36+fr43pfftlynjYJ+8kvICHk0Z\nywbwg8A9KeMEXzial47rWPP7yi07f9POwCxPPTSqnVfE5ey8jdaps0a1r3T6bCC6qmeUcWaDbvHd\nKT0UP9xF/kfSfQjlYHcqd95A80TKbTd2pRzUf4UyDuaD1Bufbovlv5x9c4XL7+0U8Ky1ZY3WqZeD\ndx/psHiP5R5L/M9YaXNn4LYL5Ya29xuqd89rWP69tssT5XXaGZi1qacGopeKOJJmZztvjweiTtPp\ns1y6XBfKaZFT6gHnFZSrC38BeO7g+1b1aiTdwV32T6VcXfTmoe8upYy1OmXos3Wzss36Lv+RdLsI\nrHs5BTyDbdlqCaq73P
9j5P1YPZYTpHfP+ve+wF8yNBatxTab1n45UflOOwOzMvVZeF1XxKFlzMWB\nqMd0Oi+XnuvZQyg3zXwv5XESX2DkqplWE1t/3M+g1+Qw4MKh+deOu47zVP5Dy+l63+zlFLBt2Wym\nU5fReY9lXe4n6v59L0rw9k7qxQmrqY61nKaegVmZemwg+qiIc3Ug6iOdPsqlx3UZ/cW6PeXRMedR\nf923boi1exPiAAANRklEQVQoV0n9LuXeRIOxLi8Hnl1fD57h9YIZ3WZ9lX+n+yY9ngK2LZvpdHq7\naIVyxeK53PlMy/OBQ+p3k/Yg9rJftp68QSgQEXsBH6x3Gt9COZXx1Bh6XhO0ufNxlhtzvgH4CUpl\nfD7wVcpjBFo9TmKH+ve/KTdme2ZN+y2Z+cc1nZhkffraZn2l00e5dL0ugxtADv9/vRnsd7I8TuJv\nqHc7blGXh+XWH/fzvvr9f1IGx356Ocudp/KvOts3680R/xf4nYh4BeWJA6+nPqKq3lCxWbnbls1m\nOoPFUAKSH6rv/4nyI+eLg5t3NkijJFSenXgK5bYp30u5L9arozxSZqIbzva4XzYVjdvXVSsijqN0\nMd8L+HfKvT0uyMwPtHzMw1B66ykP1305pTdhC+WGliu+c29d7l6UK2ReRhkc/EDKQ1Z/NcvdyZvp\na5v1WTZdlcvQ8puvS0ScRXnu4xVb+X5wF+Q3Un5B/sIEq7DY8tdmeaYhEbEnZUzXcyk3Bv3renf8\n2wcHjOUeCOel/PvaN+uBeoGyDrtQttt+OcETDpZIz7ZsRtKJ8rzE92bmVRGxO+VGpr+ame+v308a\nhC76/4PP693bFyg9p3+SmX+30rRGlt9pu9yagdWQLgpvGhVxXg5EXaYzrQaiptF6XY6n3NzvtCzP\n3Rv+7rvrGRGHZOYHRj+fxFDQtoZyf6qLKFf8PRNYD/xVZn6sQTpzUf5d7puj6xQR2wP7Ua7KvC4z\nXztJuduWzXY6dR98EeXmrx+n9FJtD9yRmee16KmKiIXM3LxYXRhpay4CPpqZZy9z+VNrl5vKGTgf\n2ffEVs5hDz6nDPx8IOUOvpPe7XZha2kOf0Y5IP2f8+0TpLueMsbizcA/UO5fssOsb7Me0+m8XHpc\nl4dSxk89qL5f7NEoxwK/R2lom9z2gg4e9zNP5X83aXe+b3LXq71+mvKw8EnzbVs2g+kssvzmF63U\nfXyB8tzard64tO7796HcDuOhq6WOtZ5m8vxkDzbAnWNSBjIza2T8v1ke7nk/yvniZYvywNYF4JqI\nODBrbVgkve0j4j6UR69cuYJ0Yitf3ZyZX6I8tuIwyiW2j1nu8od0vs36SKevcqk6W5eI+K2IeHOU\nB7T+M/B5ykOhyfJE+eFfj8dQfsm+Nct4q4l/5cddH7I6eG7aW4E/pTzi43OUG9Cemct7+O3clH/X\n+2ZEnBURh21lHe6IOx9Q/WTK2JcVsS2b+XQYTicz/zkzP0rZVsdSAqznLpaXMZe7XWbekZmbKadi\nD4rycO3tRuZbU9uXW4Hfzsx/Wk4aPbbLndumAqseG4jOK+KQuTgQ9ZFOX+XS0zY7izKI8yTK1TiX\nAJsj4gcGy695OZYSVD0rM69fZhqLqttxEFS9kzL+5WxKgPWJzPx4zcNnMvPz4y5zXsp/SNcH1n8E\nnhYRO41+UZc/OO1zUdZxdcs9sNqWzWZbNpJe1xetPHjo9bWUiyLWDH7A1fQGP7TuFxEfoDxdYdz8\n971fdm6bCax6LrxOK+JgfeblQNRj2fRSLl2uS0T8QUScBvxSZp5KGWtwA+XKmWPqOg3mfSTlF+vP\nZuayrsa7O3lnj9fbgA9RTmVcBlyWma+uaY/dtsxT+Q/Wp6cD6wcpp17WD9Ktf4d7K48FDq9pbbeC\nA6tt2QymU5fRaY9lLZd7AH8aEedHxBGZ+V7gXyg/7AblMij/9ZRHMb1sme1NL/tlr3IG
zkf2MTF0\n3wvKZeDvALYftD/172DMyP2ADwAPX2Ya2wH3oDw64HzgiPr5q4BzhuYbpLMeeA/wxOWmM/T6RcDJ\nlLEt243Mt2bo9cGzuM36SKevcul6XYA3UcYsPRL4H+DIoe/2owweXzvyPwvLXYcl8tD8Ro1zVv6d\n7pvAb1HGGu0LBCWw/pvFtj0l0P574BErWY8+tlnX26uvOtZ3OvX/j6f8oNppke+G68Ehi31+N8sd\n5PP+9e8OlHtVnU35IfUsyj24HjD0P/eltANjP/i8z/2y72nqGeh8BXsovL4q4tD/zsWBqOt0+iyX\nHtZlR+oAZMpYptfU17ssMu9aGj9cd2hbNrtR4zyV/9D/d71vLgCvpPQSXkh5oO7bGHqYep3v2Lpd\nl/vj0LZsBtuyraTZ2UUrlGcL/l2tZ+cNfX4SZezkHZRbeAw+/1nGfGh433VsGtPUM9DZivXfQHRW\nEYf+Zy4ORH2WTdfl0se6UJ7zdxBlEOpHKV3tg+9+F/ihHvanZneMnqfyH/q/rvfNP6A8H+9lQ/9/\nBuVZeVuoj66p3z2SMq5m2T1VfW2zHrbX3LVl9X8777GkjF+7hnJ14cnA9Qz1igH3pwZzK5362i+n\nNU09A52uXH+NaqcVsc+dt8dt1kfj3XkD0fW6UHqELq6vj6fcgXqv+v73KQebTh6qPJSH5o+VmJfy\n72PfpMdTwF1vsz62V591rOt0GPmxQkc9lkPlck/gUZQ7nR9JuR/WoL35P7eEGM3fLNSxWZimnoHO\nVqy/BqLzilj/by4ORF2nM4Vy6XJdXkr5xXnE0GcvplwNdhFwKfWASuNTf4vkpdnzwOap/Ov/dnlg\n7fwU8BT2GduyZZRLfd15jyXwFOC3gSMoN2P9FHeeln0CZWD6AyZZlz73y2lOU89A05Xpv4HorCKO\npLPqD0R9lk3X5dLHulB6hQ6gPLj45SPf7UW5NH0wpqLTHquhdFd8o8Z5Kv+RtLrcN3s7BdzXNut4\ne81dW1b/r/MeS8od4F83qFPAC6g3AwWOBq4Dfmw11LFZmKaegeYr1F8D0WlF7HPn7XGbdZ5OHw1E\n1+sC/DHlsuM/BN4FbGZoTBN3vZKqeU/V1urQUJ1c0R2j56X8+9g36fEUcNfbrI/t1Wcd6zod7tpT\n1VmP5VC5BCXYvYryg27w+UmUG/5eADx1pWXSRx2btWnqGWi6Mv01EJ1XxPq/c3Eg6jqdKZRLl+ty\nGvAH9fXBwC9Suvs/R4PHkoyZh+aPlZin8q//3+WBtfNTwFPYZ2zLllEu9XXnPZbA4yljJ4+j3J/q\nBaP5WS11bJamqWegQcXou4HorCKOLGfVH4j6LJuuy6WPdaFc6XML9aooysDe/Sljmn4UeEuXDQ+N\nnwc2T+U/sqwu983eTgH3tc063l5z15bV//05OuqxHMrzwbWeXUgZEP8x6tMcJi3zvuvYrE1Tz8Bq\nKLy+KmKfO2+PjWofv4g6byD62maU0yRfAX566LOPAY8aXe/G69bJjRrnpfz72Dfp4RSwbdlE69R5\nXa6v++ixPIBypeVB9f2DgRNrffgKcEaDsu+lXZ7FaeoZWC2F12VFHEln1R+I+iybrstlGo0EZfzJ\nl4BXU04NXgmsa53OSJrNbtQ4T+U/klaX+2Zvp4Bty2YrHe4aVPXSYwkcCtwOvKS+XwccBby8pvP4\n1VDHZnWaegZWS+F1WRH7aiT63GY9ptNpA9F3PRtK8ycowdVFQ5+t7SCdTm7UOC/l38e+Sc+ngG3L\nZicd7hpU9XrRSi3zG4Bj6vsnUnoXdxrN2yzVsdUwTT0DE1aMXguv44o4FweiaZRNl+UyjXo2lO6T\nKbc1OKaDZQ8OgJ3cqHHOyr/zfZOeTwHbls1WOkzpohXgxyhB/V8AlwBPWw11bNanqWdgtRVeVxVx\nng5E0yibLhuIadSzoXR/Cvidjpbd9eN+5qL8+9o3
6fkUsG3ZbKTD9C9aeRqlp+zFg3VpuN06bZdn\ndZp6BlZj4XVVEeflQDStsumygZhGPetyor87U89F+fd4AO/lFHDX28y2bEXl3vtFK0PLfgpwI/CT\nq6WOzfI09Qys1sLrqiLOy4FoWmXTZQMxjXrWOO+93ahxHsu/xwN4Z6eA+9xmtmXLXnbvF62MpH8o\nsPdqqmOzOk09A6u58LqqiPNyIJpW2XTZQPS9Lh3lvffHSsxL+fd4AO/sFHCf28y2bNnL7rXHss+p\n63Z5lqbBL9i5ERGHAp/LzM9POy+TiIinUK7S+uXMfHvHafWyzealbGB1rktE7E8ZoH5JZn44Il4A\nnAI8nTL25SXAr2fmX3eU/qrbZovpc9+cB7Zly172kym9Vmdm5kWtl6/uzV1gNU/m5UCk6YmIyMyM\niKA8NPk7wLGUepURcRLlKqQtwJ9m5t8M/meK2Z557pvL4/Zanoj4KWD/zPz1aedFy2dgJc25iHg8\n5WGuO1N6pc7KzNcNfT84PXPHlLIoSXNj7bQzIKm9oZ6qg4E3Ua7IupFy08GX1q/PBqi9U/7CkqQG\nDKykOVSDqgOA3wGOy8yrIuLBwL9RTv29JCIWMvNlU82oJM2Z7aadAUmd2Ylypd+P1PdfpPRafQ54\nHOVybklSQwZW0pzKzCuBnwR+PiKOycz/Bb5JuV/ONzLzI3V8lSSpEU8FSnMsM98REXcAF9Yrje4A\nfjMzb67fO7ZKkhqyx0qac/W+VMdSHob7ycy8LKopZ02S5o49VtI2oAZT/wOcHxGf88aWktQN72Ml\nbUO8UaMkdcvASpIkqRHHWEmSJDViYCVJktSIgZUkSVIjBlaSJEmNGFhJkiQ18v8Buuy8uGKEI0QA\nAAAASUVORK5CYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x7f444c4dfdd8>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "memm_1.plot_lr_weights('V',feat_filter=lambda s: s.startswith(\"prev_\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Beam Search\n", "As we discussed, the greedy search approach is an approximate solution to the $\\argmax$ problem. One way to improve over greedy search is to maintain a *beam* of $k$-best solutions in each step. This enables initially weaker solutions to remain in the beam and move up the ranks in later steps in case they are more consistent with future observations. \n", "\n", "Technically a $k$-best beam search proceeds as follows. Let $L$ be the label set.\n", "\n", "1. Initialise a beam $B \\leftarrow \\left[(\\text{PAD}, 0) \\right]$ of partial solutions $(\\y,s)$ where $s$ is the partial log-score of $\\y$. \n", "1. **for** $i$ in $1 \\ldots |\\x|$:\n", " 1. Let $C\\leftarrow \\{\\}$ be the next beam candidates.\n", " 1. **for** $\\y, s$ in $B$ and $y$ in $L$: \n", " 1. 
$C \\leftarrow C \\cup \\{ (\\y \\| y, s + \\log \\prob_\\params(y|\\x,y_{i-1},i)) \\} $\n", " 1. Let $B\\leftarrow k\\text{-highest-scoring}(C)$ be the $k$ pairs $(\\y, s)$ with the highest scores.\n", "1. **Return** the $\\y$ with the highest score in $B$. \n", "\n", "Note that a slightly faster version can use a priority queue. \n", "\n", "In Python we can implement this algorithm like so:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.808832676757205" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def memm_beam_search(memm, x, width=2):\n", "    # each hypothesis is a pair (labels so far, accumulated score)\n", "    beam = [([],0.)]\n", "    history = [beam]\n", "    for i in range(0, len(x)):\n", "        # a priority queue could replace the full sort below\n", "        candidates = []\n", "        for (prev,score) in beam:\n", "            scores = memm.predict_scores(x, i, prev)\n", "            for label_index,label_score in enumerate(scores):\n", "                candidates.append((prev + [memm.labels()[label_index]], score + label_score))\n", "        # keep only the `width` highest-scoring candidates\n", "        beam = sorted(candidates, key=lambda x: -x[1])[:width]\n", "        history.append(beam)\n", "    return beam, history\n", "    \n", "def batch_predict(data, beam_predictor):\n", "    # take the label sequence of the best hypothesis in the final beam\n", "    return [beam_predictor(x)[0][0][0] for x,y in data]\n", "\n", "seq.accuracy(dev, batch_predict(dev, lambda x: memm_beam_search(memm_1, x, 10)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With a beam of size 10, accuracy improves only marginally. You can try other beam sizes (leading to longer runtimes) but likely will not see substantial improvements. Is this because we are already finding the solutions with the highest probability, or because higher probability doesn't necessarily mean higher accuracy? 
\n", "\n", "We can test how many per-token predictions differ when comparing greedy search to a beam search of a given width, simply by calculating their accuracies relative to each other:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.9691063653327804" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seq.accuracy(memm_greedy_predict(memm_1, dev), batch_predict(dev, lambda x: memm_beam_search(memm_1, x, 10)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We notice that about 3% of the tokens receive different labels, simply as a result of searching for higher-scoring sequences. This suggests that we frequently find higher-probability sequences, but that these are not necessarily more correct. We can also calculate the average log probability of the argmax sequence using different beam sizes. Again we see that there is a substantial difference between the scores; it is just not reflected in task accuracy. " ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-3.2609996767845657" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum([memm_beam_search(memm_1, x, 1)[0][0][1] for x,y in dev]) / len(dev)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-3.1891722917612846" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum([memm_beam_search(memm_1, x, 5)[0][0][1] for x,y in dev]) / len(dev)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Beam search is a simple and often effective way to find sequences (or other structures) with higher probability. However, it is often also inefficient in the sense that it does not fully leverage the factorisation structure and conditional independences. 
To illustrate this problem recall that the conditional probability of a label $y_i$ only depends on the previous label $y_{i-1}$, any earlier labels have no impact on the term $\\prob(y_i|\\x,y_{i-1},i)$. With this in mind let us follow the beam for an example instance. " ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"28b8e268-25b1-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#28b8e268-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#28b8e268-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " <div class=\"item active\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>0.00</td></tr>\n", " </table>\n", " 1 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>-0.07</td></tr><tr><td>^</td><td>-4.37</td></tr><tr><td>N</td><td>-4.51</td></tr>\n", " </table>\n", " 2 / 17</div>\n", 
"<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>-0.54</td></tr><tr><td>A</td><td>A</td><td>-1.19</td></tr><tr><td>A</td><td>^</td><td>-3.37</td></tr>\n", " </table>\n", " 3 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>-1.06</td></tr><tr><td>A</td><td>A</td><td>N</td><td>-1.30</td></tr><tr><td>A</td><td>N</td><td>V</td><td>-2.05</td></tr>\n", " </table>\n", " 4 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " 
<tr><td>A</td><td>N</td><td>N</td><td>P</td><td>-1.06</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>-1.30</td></tr><tr><td>A</td><td>N</td><td>V</td><td>P</td><td>-2.05</td></tr>\n", " </table>\n", " 5 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>-1.41</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>-1.65</td></tr><tr><td>A</td><td>N</td><td>V</td><td>P</td><td>N</td><td>-2.40</td></tr>\n", " </table>\n", " 6 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>-1.41</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>-1.65</td></tr><tr><td>A</td><td>N</td><td>V</td><td>P</td><td>N</td><td>,</td><td>-2.41</td></tr>\n", " </table>\n", " 7 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " 
<tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>-1.50</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>-1.74</td></tr><tr><td>A</td><td>N</td><td>V</td><td>P</td><td>N</td><td>,</td><td>O</td><td>-2.49</td></tr>\n", " </table>\n", " 8 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>-2.01</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>-2.25</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>-2.53</td></tr>\n", " </table>\n", " 9 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " 
<tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>-2.50</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>-2.54</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>-2.74</td></tr>\n", " </table>\n", " 10 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>-2.50</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>-2.54</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>-2.74</td></tr>\n", " </table>\n", " 11 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " 
<tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>-3.06</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>N</td><td>-3.11</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>-3.30</td></tr>\n", " </table>\n", " 12 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>-3.08</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>N</td><td>P</td><td>-3.12</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>-3.32</td></tr>\n", " </table>\n", " 13 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " 
<tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>-3.12</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>N</td><td>P</td><td>V</td><td>-3.16</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>-3.36</td></tr>\n", " </table>\n", " 14 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>-3.31</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>-3.35</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>-3.55</td></tr>\n", " </table>\n", " 15 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " 
<tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>-3.33</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>-3.37</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>-3.57</td></tr>\n", " </table>\n", " 16 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>Z</td><td>-4.06</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>Z</td><td>-4.10</td></tr><tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>Z</td><td>-4.30</td></tr>\n", " </table>\n", " 17 / 17</div>\n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c55fb00>" ] }, "execution_count": 29, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "example = 56\n", "beam, history = memm_beam_search(memm_1, dev[example][0],3)\n", "seq.render_beam_history(history, dev[example], end=17)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the search *frontier*, the most recent label in each of the hypotheses, often has very little diversity. Sometimes this makes sense: for the word \"of\" it is very certain that the label \"P\" for preposition should be assigned, and the frontier reflects that. However, for the adjective \"better\" of \"better way\" the frontier contains the label \"R\" (adverb) twice, and the gold label \"A\" (adjective) not at all. This leads to an error in this case. We can fix this error by simply increasing the beam size to 4. You can test this above. In this case \"A\" barely makes it into the beam, and becomes the winning label in the next step as it fits better to the noun \"way\". \n", "\n", "One can generally avoid search errors by increasing the width, but for many models this is sub-optimal because it ignores the factorization or dependency structure of the model. In this particular case labels only depend on the previous label. This means that it makes no sense to maintain more than one hypothesis with the same frontier label in the beam. One only needs to remember the highest scoring sequence with that frontier label. To prove this consider two partial sequences $\\y$ and $\\y'$ of length $l$ with the same last label $t=y_l=y'_l$. Assume that the log probability $s = \\sum_{i}^0 \\log \\prob(y_i|\\x,y_{i-1},i)$ of $\\y$ is larger than the log probability $s' = \\sum_{i}^0 \\log \\prob(y'_i|\\x,y'_{i-1},i)$ of $\\y'$. Further assume that the label $y_{l+1}$ maximises $\\prob(y_{l+1}|\\x,t,i+1)$. Then the log probability of $\\y \\| y_{l+1}$ is larger than the log probability of $\\y' \\| y_{l+1}$ and hence there is no need to carry around $\\y'$. 
\n", "\n", "### Viterbi \n", "\n", "The Viterbi algorithm leverages the conditional independences of the model directly. It does so by maintaining a map $\\alpha_i(l)$ from label $l$ and token index $i$ to the score $\\log \\prob(\\y|\\x)$ of the highest scoring sequence $\\y$ ending in label $l$ at token $i$. For each pair $(l,i)$ we also keep a map $\\beta_i(l)$ that lets us recover the sequence $\\y$ that yielded that score. The algorithm initialises $\\alpha_{1}(l) =\\log \\prob(l|\\x,\\text{PAD},1)$ and then updates the $\\alpha$ map via the following recursion:\n", "\n", "$$\n", "\\alpha_i(l) = \\max_y \\left[ \\alpha_{i-1}(y) + \\log \\prob(l|\\x,y,i) \\right]\n", "$$\n", "\n", "and in $\\beta_i(l)$ we store the 'winning' $y$ from the $\\max$ term. Once we reach the end of the sequence, the result can be recovered by finding the label $l$ with the highest $\\alpha_{|\\x|}(l)$ and then back-tracking using $\\beta$. It is easy to show that this algorithm returns the *optimal* solution to the prediction/search problem, assuming that labels only depend on the previous label. (Exercise: extend to $n$ previous labels) \n", "\n", "Below we implement a beam version of the Viterbi algorithm. In this version we restrict the maximisation that defines $\\alpha_i(l)$ to only range over the top $k$ highest scoring previous labels $y$. 
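\n", "\n", "The restricted recursion can be written out explicitly. As a sketch, let $B_{i-1}$ (a symbol introduced here only for illustration) denote the set of the $k$ labels with the highest values of $\\alpha_{i-1}$:\n", "\n", "$$\n", "\\alpha_i(l) = \\max_{y \\in B_{i-1}} \\left[ \\alpha_{i-1}(y) + \\log \\prob(l|\\x,y,i) \\right]\n", "$$\n", "\n", "If $k$ equals the number of labels, nothing is pruned and this reduces to the exact recursion. For smaller $k$ each step scores at most $k$ previous labels against every candidate label instead of all pairs of labels, at the price of possibly missing the optimal sequence. 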
" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " <div id=\"28c9a062-25b1-11ec-9009-0242ac110002\" class=\"carousel\" data-ride=\"carousel\" data-interval=\"false\">\n", " <!-- Controls -->\n", " <a href=\"#28c9a062-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"prev\">Previous</a>\n", " \n", " <a href=\"#28c9a062-25b1-11ec-9009-0242ac110002\" role=\"button2\" data-slide=\"next\">Next</a>\n", " <div class=\"carousel-inner\" role=\"listbox\">\n", " <div class=\"item active\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>0.00</td></tr>\n", " </table>\n", " 1 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>-0.07</td></tr><tr><td>^</td><td>-4.37</td></tr><tr><td>N</td><td>-4.51</td></tr>\n", " </table>\n", " 2 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " 
<tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>-0.54</td></tr><tr><td>A</td><td>A</td><td>-1.19</td></tr><tr><td>A</td><td>^</td><td>-3.37</td></tr>\n", " </table>\n", " 3 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>-1.06</td></tr><tr><td>A</td><td>N</td><td>V</td><td>-2.05</td></tr><tr><td>A</td><td>N</td><td>P</td><td>-2.81</td></tr>\n", " </table>\n", " 4 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>-1.06</td></tr><tr><td>A</td><td>N</td><td>P</td><td>N</td><td>-8.40</td></tr><tr><td>A</td><td>N</td><td>P</td><td>V</td><td>-8.61</td></tr>\n", 
" </table>\n", " 5 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>-1.41</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>^</td><td>-2.44</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>A</td><td>-5.13</td></tr>\n", " </table>\n", " 6 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>-1.41</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>^</td><td>^</td><td>-5.04</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>N</td><td>-9.48</td></tr>\n", " </table>\n", " 7 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " 
<tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>-1.50</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>D</td><td>-4.43</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>#</td><td>-5.87</td></tr>\n", " </table>\n", " 8 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>-2.01</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>-2.53</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>V</td><td>-4.98</td></tr>\n", " </table>\n", " 9 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " 
<tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>-2.50</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>-2.54</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>V</td><td>-4.06</td></tr>\n", " </table>\n", " 10 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>-2.50</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>V</td><td>T</td><td>-8.25</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>A</td><td>-10.32</td></tr>\n", " </table>\n", " 11 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " 
<tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>-3.06</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>V</td><td>-3.91</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>A</td><td>-4.80</td></tr>\n", " </table>\n", " 12 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>-3.08</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>V</td><td>V</td><td>-7.65</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>V</td><td>A</td><td>-8.92</td></tr>\n", " </table>\n", " 13 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " 
<tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>-3.12</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>N</td><td>-6.55</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>A</td><td>-8.16</td></tr>\n", " </table>\n", " 14 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>-3.31</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>T</td><td>-4.92</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>&</td><td>-8.77</td></tr>\n", " </table>\n", " 15 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " 
<tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>-3.33</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>T</td><td>-8.94</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>N</td><td>-9.03</td></tr>\n", " </table>\n", " 16 / 17</div>\n", "<div class=\"item\">\n", " <table>\n", " <tr><td>Happy</td><td>International</td><td>Year</td><td>of</td><td>Biodiversity</td><td>!</td><td>What</td><td>better</td><td>way</td><td>to</td><td>celebrate</td><td>than</td><td>tuning</td><td>in</td><td>to</td><td>CropLife's</td><td>Biodiversity</td></tr>\n", " <tr><td>A</td><td>A</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>A</td><td>N</td><td>P</td><td>V</td><td>P</td><td>V</td><td>T</td><td>P</td><td>Z</td><td>^</td></tr>\n", " <tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>Z</td><td>-4.06</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>L</td><td>-4.61</td></tr><tr><td>A</td><td>N</td><td>N</td><td>P</td><td>N</td><td>,</td><td>O</td><td>R</td><td>R</td><td>P</td><td>N</td><td>P</td><td>V</td><td>P</td><td>P</td><td>S</td><td>-5.72</td></tr>\n", " </table>\n", " 17 / 17</div>\n", " </div>\n", " </div>\n", " " ], "text/plain": [ "<statnlpbook.util.Carousel at 0x7f444c80aac8>" ] }, "execution_count": 30, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "from collections import defaultdict\n", "import math\n", "def memm_viterbi_search(memm, x, width=2):\n", " labels = memm.labels()\n", " # initialise\n", " alpha = [{}]\n", " beta = [{}]\n", " for label_index, label_score in enumerate(memm.predict_scores_hist(x, 0, [\"PAD\"])):\n", " label = labels[label_index]\n", " alpha[0][label] = label_score\n", " beta[0][label] = \"PAD\"\n", " \n", " # prune\n", " seq.prune_alpha_beta(alpha[0], beta[0], width)\n", " \n", " # recursion \n", " for i in range(1, len(x)):\n", " alpha.append(defaultdict(lambda: -math.inf))\n", " beta.append({})\n", " for p in alpha[i-1].keys():\n", " for label_index, label_score in enumerate(memm.predict_scores_hist(x, i, [p])):\n", " label = labels[label_index]\n", " new_score = alpha[i-1][p] + label_score\n", " if new_score > alpha[i][label]:\n", " alpha[i][label] = new_score\n", " beta[i][label] = p\n", " # prune\n", " seq.prune_alpha_beta(alpha[i], beta[i], width)\n", " \n", " # convert to beam history to be used in the same way beam search was used. \n", " history = seq.convert_alpha_beta_to_history(x, alpha, beta)\n", " return history[-1], history\n", "\n", "beam, history = memm_viterbi_search(memm_1, dev[example][0],3)\n", "seq.render_beam_history(history, dev[example], 17)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Crucially, for the same beam size we now keep the correct labelling of \"better\" in the beam and reach a better solution, both in terms of log probability and actual accuracy. 
\n", "\n", "This improvement in log probabilities does not always lead to higher global accuracy:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.8090400165871864" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seq.accuracy(dev, batch_predict(dev, lambda x: memm_viterbi_search(memm_1, x, 10)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Label Bias Problem\n", "\n", "MEMMs multiply several *locally* normalised transition probabilities to arrive at a sequence probability. That is, for each token $i$ and a given previous state $y_{i-1}$ the sum of transition scores into next states $\\sum_{y_i} \\prob_\\params(y_i|\\x,y_{i-1},i)$ equals 1. This local normalisation makes training easy (why?), but it also leads to a problem. Consider two simple sequences \"that works\" and \"that house\". The former is a pronoun (\"O\") followed by a verb (\"V\"), the latter is a determiner (\"D\") followed by a noun (\"N\"). Let us assume that \n", "\n", "$$\n", "\\prob_\\params(\\text{D}|\\x,\\text{PAD},0) =\\prob_\\params(\\text{O}|\\x,\\text{PAD},0) \\approx 0.5,\n", "$$\n", "\n", "meaning that at the beginning of a sentence both the determiner and pronoun label for \"that\" have roughly the same probability 0.5. Now assume that in the training set determiners are always followed by nouns, and pronouns always by verbs. This would mean that \n", "\n", "$$\n", "\\prob_\\params(\\text{N}|\\x,\\text{D},i) = \\prob_\\params(\\text{V}|\\x,\\text{O},i) \\approx 1\n", "$$ \n", "\n", "and hence transitions from these two states are completely independent of the observation. \n", "\n", "Now we have $\\prob_\\params(\\text{D N}|\\, \\text{that works}) \\approx 0.5$ and $\\prob_\\params(\\text{O V}|\\, \\text{that works}) \\approx 0.5$, and the same for the input \"that house\". 
This means that once we enter the \"D\" or \"O\" state, the following observations have no effect on the sequence probability. The reason is that MEMMs require *all* incoming probability mass (0.5 in the above example) to a given state (such as \"D\" or \"O\") to be distributed among the outgoing states. If there is only one possible next state, then that next state will receive all the mass, regardless of the observation. In particular, the model cannot say \"in state 'D' and for observation 'works', *all* labels are impossible\". More generally, states with few outgoing transitions effectively ignore observations, and this creates a bias towards such states. This problem is known as the *label bias problem*. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CRFs\n", "\n", "[Conditional Random Fields (CRFs)](http://www.seas.upenn.edu/~strctlrn/bib/PDF/crf.pdf) have been developed to overcome the label bias problem. The core problem of MEMMs is local normalisation. CRFs replace this with *global* normalisation. That is, instead of normalising across all possible next states $y_{i+1}$ given a current state $y_i$ and observation $\\x$, the CRF normalises across all possible *sequences* $\\y$ given observation $\\x$. Formally the CRF is hence defined as follows:\n", "\n", "$$\n", "p_\\params(\\y|\\x) = \\frac{1}{Z_{\\x}} \\prod_i^{|\\x|} \\exp \\langle \\repr(\\x,y_{i-1},i), \\params_{y_i} \\rangle\n", "$$\n", "\n", "where $Z_{\\x}=\\sum_\\y \\prod_i^{|\\x|} \\exp \\langle \\repr(\\x,y_{i-1},i), \\params_{y_i} \\rangle$ is the *partition function*, a *global* normalisation constant depending on $\\x$. Notably each term $\\exp \\langle \\repr(\\x,y_{i-1},i), \\params_{y_i} \\rangle$ in the product can now take on values in $[0,\\infty)$ as opposed to the MEMM terms in $[0,1]$. 
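\n", "\n", "To see how this addresses the label bias, it can help to expand the definition for the two-token input \"that works\". As an illustrative sketch, write $\\psi_i(y_{i-1},y_i) = \\exp \\langle \\repr(\\x,y_{i-1},i), \\params_{y_i} \\rangle$ (a shorthand introduced only for this example) and assume, purely for illustration, that the only candidate labels are \"D\" and \"O\" for the first token and \"N\" and \"V\" for the second:\n", "\n", "$$\n", "Z_{\\x} = \\psi_1(\\text{PAD},\\text{D}) \\left[ \\psi_2(\\text{D},\\text{N}) + \\psi_2(\\text{D},\\text{V}) \\right] + \\psi_1(\\text{PAD},\\text{O}) \\left[ \\psi_2(\\text{O},\\text{N}) + \\psi_2(\\text{O},\\text{V}) \\right], \\qquad p_\\params(\\text{O V}|\\x) = \\frac{\\psi_1(\\text{PAD},\\text{O}) \\, \\psi_2(\\text{O},\\text{V})}{Z_{\\x}}\n", "$$\n", "\n", "In an MEMM the corresponding locally normalised terms force each bracket to sum to 1, so the relative mass of the \"D\" and \"O\" branches is fixed by the first factor alone. In the CRF the brackets are unconstrained: if the features for \"works\" make both $\\psi_2(\\text{D},\\text{N})$ and $\\psi_2(\\text{D},\\text{V})$ small, the whole \"D\" branch loses mass relative to the \"O\" branch. 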
\n", "\n", "The name CRF stems from the fact that they correspond to [Markov random fields](http://www.statslab.cam.ac.uk/~grg/books/hammfest/3-pdc.ps), globally conditioned on the observation $\\x$. While in this chapter we focus on cases where the dependency structure corresponds to a linear chain, CRFs are more general and encompass any graphical structure. \n", "\n", "### Training Linear Chain CRFs\n", "\n", "CRFs can be trained (as usual) by maximising the conditional log-likelihood of the data \n", "\n", "$$\n", "CL(\\params) = \\sum_{(\\x,\\y) \\in \\train} \\log \\prob_\\params(\\y|\\x).\n", "$$\n", "\n", "This is substantially harder than for MEMMs because the partition function makes it impossible to break up the objective into only per-token logistic regression terms. Instead the objective needs to be treated on a per-sequence basis. Conceptually this is not difficult: just as for logistic regression we need to calculate the gradient of the objective, and once we have this gradient, we choose a gradient descent/ascent method to optimise the function. The general CRF conditional log-likelihood is in fact a generalisation of the logistic regression objective, and hence the CRF gradient will look very similar to the gradient of logistic regression. In this chapter we will only briefly discuss the gradient and what is necessary to calculate it. The gradient with respect to the weights $\\params_{y'}$ of a label $y'$ is\n", "\n", "\\begin{split}\n", " \\nabla_{\\params_{y'}} CL(\\params) = \\sum_{(\\x,\\y) \\in \\train} \\sum^{|\\x|}_i \\left[ \\repr(\\x,y_{i-1},i) \\delta(y_i,y') - \\sum_{\\hat{y}} p_\\params(y',\\hat{y}|\\x,i) \\repr(\\x,\\hat{y},i) \\right] \n", "\\end{split}\n", "\n", "Here $\\delta(y_i,y')$ is 1 if $y_i = y'$ and 0 otherwise, and $p_\\params(y',\\hat{y}|\\x,i)$ is the model's marginal probability of label $\\hat{y}$ at position $i-1$ followed by label $y'$ at position $i$. The first term collects the features observed with the gold labels, the second their expectation under the model. The marginals in the second term can be computed with the forward-backward algorithm, which replaces the maximisation in Viterbi with a sum.\n", "\n", "### Prediction in Linear Chain CRFs\n", "\n", "From the perspective of finding $\\argmax_\\y \\prob_\\params(\\y|\\x)$ we can treat the CRF just like the MEMM. They share the same factorisation/dependency structure and are just differently normalised. That is, we can again simply perform greedy search, use a beam or apply Viterbi. 
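\n", "\n", "Concretely, $Z_{\\x}$ does not depend on $\\y$ and the logarithm is monotonic, so the maximisation can ignore the partition function. As a sketch that reuses the $\\alpha$ notation of the Viterbi section, the MEMM log probabilities are simply replaced by unnormalised CRF scores:\n", "\n", "$$\n", "\\argmax_\\y \\prob_\\params(\\y|\\x) = \\argmax_\\y \\sum_i^{|\\x|} \\langle \\repr(\\x,y_{i-1},i), \\params_{y_i} \\rangle, \\qquad \\alpha_i(l) = \\max_y \\left[ \\alpha_{i-1}(y) + \\langle \\repr(\\x,y,i), \\params_l \\rangle \\right]\n", "$$\n", "\n", "with $\\alpha_1(l) = \\langle \\repr(\\x,\\text{PAD},1), \\params_l \\rangle$. The values in $\\alpha$ are then scores rather than log probabilities, but the back-tracking step is unchanged. 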
\n", "\n", "Below we train a CRF model using the same feature vectors as used for the MEMM model, and then do prediction via the Viterbi algorithm (this is the standard algorithm in most CRF libraries). " ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.8216877462160481" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crf_1 = seq.CRFSequenceLabeler(feat_4, train)\n", "seq.accuracy(dev, crf_1.predict(dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a notable improvement of about 1 percentage point over the MEMM model, and it essentially comes for free in the sense that we are using exactly the same feature representation and are just changing from local to global normalisation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test Data\n", "Let us run the above models on the test data to see whether our findings generalise.\n" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>0</th>\n", " <th>1</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>word</td>\n", " <td>0.650727</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>+ first @</td>\n", " <td>0.695749</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>+ cap</td>\n", " <td>0.741331</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>+ suffix</td>\n", " <td>0.788870</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>MEMM</td>\n", " <td>0.810962</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " 
<td>CRF</td>\n", " <td>0.822707</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " 0 1\n", "0 word 0.650727\n", "1 + first @ 0.695749\n", "2 + cap 0.741331\n", "3 + suffix 0.788870\n", "4 MEMM 0.810962\n", "5 CRF 0.822707" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "pd.DataFrame([\n", " [\"word\", seq.accuracy(test, local_1.predict(test))],\n", " [\"+ first @\", seq.accuracy(test, local_2.predict(test))],\n", " [\"+ cap\", seq.accuracy(test, local_3.predict(test))],\n", " [\"+ suffix\", seq.accuracy(test, local_4.predict(test))],\n", " [\"MEMM\", seq.accuracy(test, memm_1.predict(test))],\n", " [\"CRF\", seq.accuracy(test, crf_1.predict(test))] \n", " ])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Background Material\n", "* [Tackling the Poor Assumptions of Naive Bayes Text Classifiers](https://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf), Rennie et al, ICML 2003 " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 1 }