{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Evaluation metrics in NLP"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"__author__ = \"Christopher Potts\"\n",
"__version__ = \"CS224u, Stanford, Spring 2023\""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Contents\n",
"\n",
"1. [Overview](#Overview)\n",
"1. [Set-up](#Set-up)\n",
"1. [Classifier metrics](#Classifier-metrics)\n",
" 1. [Confusion matrix](#Confusion-matrix)\n",
" 1. [Accuracy](#Accuracy)\n",
" 1. [Precision](#Precision)\n",
" 1. [Recall](#Recall)\n",
" 1. [F scores](#F-scores)\n",
" 1. [Macro-averaged F scores](#Macro-averaged-F-scores)\n",
" 1. [Weighted F scores](#Weighted-F-scores)\n",
" 1. [Micro-averaged F scores](#Micro-averaged-F-scores)\n",
" 1. [Precision–recall curves](#Precision–recall-curves)\n",
" 1. [Average precision](#Average-precision)\n",
" 1. [Receiver Operating Characteristic (ROC) curve](#Receiver-Operating-Characteristic-(ROC)-curve)\n",
"1. [Regression metrics](#Regression-metrics)\n",
" 1. [Mean squared error](#Mean-squared-error)\n",
" 1. [R-squared scores](#R-squared-scores)\n",
" 1. [Pearson correlation](#Pearson-correlation)\n",
" 1. [Spearman rank correlation](#Spearman-rank-correlation)\n",
"1. [Sequence prediction](#Sequence-prediction)\n",
" 1. [Word error rate](#Word-error-rate)\n",
" 1. [BLEU scores](#BLEU-scores)\n",
" 1. [Perplexity](#Perplexity)\n",
"1. [Other resources](#Other-resources)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Overview\n",
"\n",
"1. Different evaluation metrics __encode different values__ and have __different biases and other weaknesses__. Thus, you should choose your metrics carefully, and motivate those choices when writing up and presenting your work.\n",
"\n",
"1. This notebook reviews some of the most prominent evaluation metrics in NLP, seeking not only to define them, but also to articulate what values they encode and what their weaknesses are.\n",
"\n",
"1. In your own work, __you shouldn't feel confined to these metrics__. Per item 1 above, you should feel that you have the freedom to motivate new metrics and specific uses of existing metrics, depending on what your goals are.\n",
"\n",
"1. If you're working on an established problem, then you'll feel pressure from readers (and referees) to use the metrics that have already been used for the problem. This might be a compelling pressure. However, you should always feel free to argue against those cultural norms and motivate new ones. Areas can stagnate due to poor metrics, so we must be vigilant!"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"This notebook discusses prominent metrics in NLP evaluations. I've had to be selective to keep the notebook from growing too long and complex. I think the measures and considerations here are fairly representative of the issues that arise in NLP evaluation.\n",
"\n",
"The scikit-learn [model evaluation usage guide](http://scikit-learn.org/stable/modules/model_evaluation.html) is excellent as a source of implementations, definitions, and references for a wide range of metrics for classification, regression, ranking, and clustering.\n",
"\n",
"This notebook is the first in a two-part series on evaluation. Part 2 is on [evaluation methods](evaluation_methods.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Set-up"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"from nltk.metrics.distance import edit_distance\n",
"from nltk.translate import bleu_score\n",
"import numpy as np\n",
"import pandas as pd\n",
"import scipy.stats\n",
"from sklearn import metrics\n",
"import utils"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"utils.fix_random_seeds()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Classifier metrics"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Confusion matrix\n",
"\n",
"A confusion matrix gives a complete comparison of how the observed/gold labels compare to the labels predicted by a classifier.\n",
"\n",
"`ex1 = `\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
15
\n",
"
10
\n",
"
100
\n",
"
\n",
"
\n",
"
neg
\n",
"
10
\n",
"
15
\n",
"
10
\n",
"
\n",
"
\n",
"
neutral
\n",
"
10
\n",
"
100
\n",
"
1000
\n",
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"For classifiers that predict real values (scores, probabilities), it is important to remember that __a threshold was imposed to create these categorical predictions__. \n",
"\n",
"The position of this threshold can have a large impact on the overall assessment that uses the confusion matrix as an input. The default is to choose the class with the highest probability. This is so deeply ingrained that it is often not even mentioned. However, it might be inappropriate:\n",
"\n",
" 1. We might care about the full distribution.\n",
" 1. Where the important class is very small relative to the others, any significant amount of positive probability for it might be important.\n",
"\n",
"Metrics like [average precision](#Average-precision) explore this threshold as part of their evaluation procedure. "
]
},
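{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small sketch of this point, here is a hypothetical array of predicted probabilities over the classes `['pos', 'neg', 'neutral']`. The default argmax rule and a lower threshold for __pos__ (chosen here purely for illustration) can yield different label predictions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical predicted probabilities for three examples:\n",
"probs = np.array([\n",
"    [0.45, 0.20, 0.35],\n",
"    [0.30, 0.10, 0.60],\n",
"    [0.05, 0.15, 0.80]])\n",
"\n",
"classes = ['pos', 'neg', 'neutral']\n",
"\n",
"# Default rule: predict the class with the highest probability:\n",
"argmax_preds = [classes[i] for i in probs.argmax(axis=1)]\n",
"\n",
"# Alternative rule: predict 'pos' whenever it gets at least 0.25\n",
"# probability, on the grounds that 'pos' is small but important:\n",
"threshold_preds = [\n",
"    'pos' if p[0] >= 0.25 else classes[p.argmax()]\n",
"    for p in probs]\n",
"\n",
"argmax_preds, threshold_preds"
]
},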
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"This function creates the toy confusion matrices that we will use for illustrative examples:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"def illustrative_confusion_matrix(data):\n",
" classes = ['pos', 'neg', 'neutral']\n",
" ex = pd.DataFrame(\n",
" data,\n",
" columns=classes,\n",
" index=classes)\n",
" ex.index.name = \"observed\"\n",
" return ex"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"ex1 = illustrative_confusion_matrix([\n",
" [15, 10, 100],\n",
" [10, 15, 10],\n",
" [10, 100, 1000]])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Accuracy\n",
"\n",
"[Accuracy](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score) is the sum of the correct predictions divided by the sum of all predictions:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def accuracy(cm):\n",
" return cm.values.diagonal().sum() / cm.values.sum()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an illustrative confusion matrix:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ex1 = `\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
15
\n",
"
10
\n",
"
100
\n",
"
\n",
"
\n",
"
neg
\n",
"
10
\n",
"
15
\n",
"
10
\n",
"
\n",
"
\n",
"
neutral
\n",
"
10
\n",
"
100
\n",
"
1000
\n",
"
\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.8110236220472441"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"accuracy(ex1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Accuracy bounds\n",
"\n",
"[0, 1], with 0 the worst and 1 the best."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by accuracy\n",
"\n",
"Accuracy seems to directly encode a core value we have for classifiers – how often they are correct. In addition, the accuracy of a classifier on a test set will be negatively correlated with the [negative log (logistic, cross-entropy) loss](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html#sklearn.metrics.log_loss), which is a common loss for classifiers. In this sense, these classifiers are optimizing for accuracy."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of accuracy\n",
"\n",
"* Accuracy does not give per-class metrics for multi-class problems.\n",
"\n",
"* Accuracy fails to control for size imbalances in the classes. For instance, consider the variant of the above in which the classifier guessed only __neutral__:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"ex2 = illustrative_confusion_matrix([\n",
" [0, 0, 125],\n",
" [0, 0, 35],\n",
" [0, 0, 1110]])"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
\n",
"
\n",
"
observed
\n",
"
\n",
"
\n",
"
\n",
"
\n",
" \n",
" \n",
"
\n",
"
pos
\n",
"
0
\n",
"
0
\n",
"
125
\n",
"
\n",
"
\n",
"
neg
\n",
"
0
\n",
"
0
\n",
"
35
\n",
"
\n",
"
\n",
"
neutral
\n",
"
0
\n",
"
0
\n",
"
1110
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" pos neg neutral\n",
"observed \n",
"pos 0 0 125\n",
"neg 0 0 35\n",
"neutral 0 0 1110"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ex2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Intuitively, this is a worse classifier than the one that produced `ex1`. Whereas `ex1` does well at __pos__ and __neg__ despite their small size, this classifier doesn't even try to get them right – it always predicts __neutral__. However, its accuracy is higher!"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.8110236220472441\n",
"0.8740157480314961\n"
]
}
],
"source": [
"print(accuracy(ex1))\n",
"print(accuracy(ex2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Related to accuracy\n",
"\n",
"* Accuracy is inversely proportional to the [negative log (logistic, cross-entropy) loss](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html#sklearn.metrics.log_loss) that many classifiers optimize:\n",
"\n",
"$$\n",
"-\\frac{1}{N} \\sum_{i=1}^{N} \\sum_{k=1}^{K} y_{i,k} \\log(p_{i,k})\n",
"$$\n",
"\n",
"* Accuracy can be related in a similar way to [KL divergence](https://en.wikipedia.org/wiki/Kullback–Leibler_divergence): \n",
"$$\n",
"D_{\\text{KL}}(y \\parallel p) = \n",
" \\sum _{k=1}^{K} y_{k} \\log\\left(\\frac {y_{k}}{p_{k}}\\right)\n",
"$$\n",
" Where $y$ is a \"one-hot vector\" (a classification label) with $1$ at position $k$, this reduces to \n",
" $$\n",
" \\log\\left(\\frac{1}{p_{k}}\\right) = -\\log(p_{k})\n",
" $$\n",
" Thus, KL-divergence is an analogue of accuracy for soft labels."
]
},
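{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is a small numerical check of that reduction, using a hypothetical one-hot label and predicted distribution: the KL divergence from the one-hot label to the prediction is just the negative log probability of the true class."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical one-hot label (true class in final position) and a\n",
"# predicted distribution over the same three classes:\n",
"y_onehot = np.array([0.0, 0.0, 1.0])\n",
"p = np.array([0.2, 0.1, 0.7])\n",
"\n",
"# KL divergence D(y || p):\n",
"kl = scipy.stats.entropy(y_onehot, p)\n",
"\n",
"# Negative log probability of the true class:\n",
"neg_log_p = -np.log(p[2])\n",
"\n",
"kl, neg_log_p"
]
},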
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Precision\n",
"\n",
"[Precision](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score) is the sum of the correct predictions divided by the sum of all guesses. This is a per-class notion; in our confusion matrices, it's the diagonal values divided by the column sums:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"def precision(cm):\n",
" return cm.values.diagonal() / cm.sum(axis=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ex1 =`\n",
"
"
],
"text/plain": [
" pos neg neutral\n",
"observed \n",
"pos 0 0 125\n",
"neg 0 0 35\n",
"neutral 0 0 1110"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ex2"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pos NaN\n",
"neg NaN\n",
"neutral 0.874016\n",
"dtype: float64"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"precision(ex2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's common to see these `NaN` values mapped to 0."
]
},
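{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the toy `precision` function above, one way to apply that convention is simply to fill the `NaN` values in the resulting Series:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Map undefined (0/0) precision values to 0:\n",
"precision(ex2).fillna(0)"
]
},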
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Precision bounds\n",
"\n",
"[0, 1], with 0 the worst and 1 the best. (Caveat: undefined values resulting from dividing by 0 need to be mapped to 0.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by precision\n",
"\n",
"Precision encodes a _conservative_ value in penalizing incorrect guesses."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of precision\n",
"\n",
"Precision's dangerous edge case is that one can achieve very high precision for a category by rarely guessing it. Consider, for example, the following classifier's flawless predictions for __pos__ and __neg__. These predictions are at the expense of __neutral__, but that is such a big class that it hardly matters to the precision for that class either."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"ex3 = illustrative_confusion_matrix([\n",
" [1, 0, 124],\n",
" [0, 1, 24],\n",
" [0, 0, 1110]])"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
\n",
"
\n",
"
observed
\n",
"
\n",
"
\n",
"
\n",
"
\n",
" \n",
" \n",
"
\n",
"
pos
\n",
"
1
\n",
"
0
\n",
"
124
\n",
"
\n",
"
\n",
"
neg
\n",
"
0
\n",
"
1
\n",
"
24
\n",
"
\n",
"
\n",
"
neutral
\n",
"
0
\n",
"
0
\n",
"
1110
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" pos neg neutral\n",
"observed \n",
"pos 1 0 124\n",
"neg 0 1 24\n",
"neutral 0 0 1110"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ex3"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pos 1.000000\n",
"neg 1.000000\n",
"neutral 0.882353\n",
"dtype: float64"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"precision(ex3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These numbers mask the fact that this is a very poor classifier!"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Compare with our less imbalanced `ex1`; for \"perfect\" precision on `pos` and `neg`, we incurred only a small drop in `neutral` here:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
\n",
"
\n",
"
observed
\n",
"
\n",
"
\n",
"
\n",
"
\n",
" \n",
" \n",
"
\n",
"
pos
\n",
"
15
\n",
"
10
\n",
"
100
\n",
"
\n",
"
\n",
"
neg
\n",
"
10
\n",
"
15
\n",
"
10
\n",
"
\n",
"
\n",
"
neutral
\n",
"
10
\n",
"
100
\n",
"
1000
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" pos neg neutral\n",
"observed \n",
"pos 15 10 100\n",
"neg 10 15 10\n",
"neutral 10 100 1000"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ex1"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pos 0.428571\n",
"neg 0.120000\n",
"neutral 0.900901\n",
"dtype: float64"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"precision(ex1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Recall\n",
"\n",
"[Recall](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score) is the sum of the correct predictions divided by the sum of all true instances. This is a per-class notion; in our confusion matrices, it's the diagonal values divided by the row sums. Recall is sometimes called the \"true positive rate\"."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"def recall(cm):\n",
" return cm.values.diagonal() / cm.sum(axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ex1 =`\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
recall
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
15
\n",
"
10
\n",
"
100
\n",
"
0.12
\n",
"
\n",
"
\n",
"
neg
\n",
"
10
\n",
"
15
\n",
"
10
\n",
"
0.43
\n",
"
\n",
"
\n",
"
neutral
\n",
"
10
\n",
"
100
\n",
"
1000
\n",
"
0.90
\n",
"
\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"observed\n",
"pos 0.120000\n",
"neg 0.428571\n",
"neutral 0.900901\n",
"dtype: float64"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"recall(ex1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Recall trades off against precision. For instance, consider again `ex3`, in which the classifier was very conservative with __pos__ and __neg__:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ex3 =`\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
recall
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
1
\n",
"
0
\n",
"
124
\n",
"
0.008
\n",
"
\n",
"
\n",
"
neg
\n",
"
0
\n",
"
1
\n",
"
24
\n",
"
0.040
\n",
"
\n",
"
\n",
"
neutral
\n",
"
0
\n",
"
0
\n",
"
1110
\n",
"
1.000
\n",
"
\n",
"
\n",
"
\n",
"
precision
\n",
"
1.00
\n",
"
1.00
\n",
"
0.88
\n",
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Recall bounds\n",
"\n",
"[0, 1], with 0 the worst and 1 the best."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by recall\n",
"\n",
"Recall encodes a _permissive_ value in penalizing only missed true cases."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of recall\n",
"\n",
"Recall's dangerous edge case is that one can achieve very high recall for a category by always guessing it. This could mean a lot of incorrect guesses, but recall sees only the correct ones. You can see this in `ex3` above. The model did make some incorrect __neutral__ predictions, but it missed none, so it achieved perfect recall for that category.\n",
"\n",
"`ex3 =`\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
recall
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
1
\n",
"
0
\n",
"
124
\n",
"
0.008
\n",
"
\n",
"
\n",
"
neg
\n",
"
0
\n",
"
1
\n",
"
24
\n",
"
0.040
\n",
"
\n",
"
\n",
"
neutral
\n",
"
0
\n",
"
0
\n",
"
1110
\n",
"
1.000
\n",
"
\n",
"
\n",
"
\n",
"
precision
\n",
"
1.00
\n",
"
1.00
\n",
"
0.88
\n",
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### F scores\n",
"\n",
"[F scores](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.fbeta_score.html#sklearn.metrics.fbeta_score) combine precision and recall via their harmonic mean, with a value $\\beta$ that can be used to emphasize one or the other. Like precision and recall, this is a per-category notion.\n",
"\n",
"$$\n",
"(\\beta^{2}+1) \\cdot \\frac{\\textbf{precision} \\cdot\n",
" \\textbf{recall}}{(\\beta^{2} \\cdot \\textbf{precision}) +\n",
" \\textbf{recall}}\n",
"$$\n",
"\n",
"Where $\\beta=1$, we have F1:\n",
"\n",
"$$\n",
"2 \\cdot \\frac{\\textbf{precision} \\cdot \\textbf{recall}}{\\textbf{precision} + \\textbf{recall}}\n",
"$$"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"def f_score(cm, beta):\n",
" p = precision(cm)\n",
" r = recall(cm)\n",
" return (beta**2 + 1) * ((p * r) / ((beta**2 * p) + r))"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"def f1_score(cm):\n",
" return f_score(cm, beta=1.0)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
" pos neg neutral\n",
"observed \n",
"pos 1 0 124\n",
"neg 0 1 24\n",
"neutral 0 0 1110"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ex3"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pos 0.015873\n",
"neg 0.076923\n",
"neutral 0.937500\n",
"dtype: float64"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f1_score(ex3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Bounds of F scores\n",
"\n",
"[0, 1], with 0 the worst and 1 the best, and guaranteed to be between precision and recall."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by F scores\n",
"\n",
"The F$_{\\beta}$ score for a class $K$ is an attempt to summarize how well the classifier's $K$ predictions align with the true instances of $K$. Alignment brings in both missed cases and incorrect predictions. Intuitively, precision and recall keep each other in check in the calculation. This idea runs through almost all robust classification metrics."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of F scores\n",
"\n",
"* There is no normalization for the size of the dataset within $K$ or outside of it.\n",
"\n",
"* For a given category $K$, the F$_{\\beta}$ score for $K$ ignores all the values that are off the row and column for $K$, which might be the majority of the data. This means that the individual scores for a category can be very misleading about the overall performance of the system. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ex1 = `\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
F1
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
15
\n",
"
10
\n",
"
100
\n",
"
0.187
\n",
"
\n",
"
\n",
"
neg
\n",
"
10
\n",
"
15
\n",
"
10
\n",
"
0.187
\n",
"
\n",
"
\n",
"
neutral
\n",
"
10
\n",
"
100
\n",
"
1,000
\n",
"
0.90
\n",
"
\n",
"
\n",
"\n",
"\n",
"`ex4 =`\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
predicted
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
pos
\n",
"
neg
\n",
"
neutral
\n",
"
F1
\n",
"
\n",
"
\n",
"
gold
\n",
"
pos
\n",
"
15
\n",
"
10
\n",
"
100
\n",
"
0.187
\n",
"
\n",
"
\n",
"
neg
\n",
"
10
\n",
"
15
\n",
"
10
\n",
"
0.187
\n",
"
\n",
"
\n",
"
neutral
\n",
"
10
\n",
"
100
\n",
"
100,000
\n",
"
0.999
\n",
"
\n",
"
"
]
},
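{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check of the `ex4` numbers above, here is a sketch that constructs `ex4` (not otherwise defined in code) with `illustrative_confusion_matrix` and recomputes the per-class F1 scores:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ex4 from the table above: the same as ex1 except that the\n",
"# neutral class is made vastly larger:\n",
"ex4 = illustrative_confusion_matrix([\n",
"    [15, 10, 100],\n",
"    [10, 15, 10],\n",
"    [10, 100, 100000]])\n",
"\n",
"f1_score(ex4)"
]
},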
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Related to F scores\n",
"\n",
"* Dice similarity for binary vectors is sometimes used to assess how well a model has learned to identify a set of items. In this setting, [it is equivalent to the per-token F1 score](https://brenocon.com/blog/2012/04/f-scores-dice-and-jaccard-set-similarity/).\n",
"\n",
"* The intuition behind F scores (balancing precision and recall) runs through many of the metrics discussed below."
]
},
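{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick illustration of the first point, using two hypothetical binary vectors: the Dice coefficient computed directly agrees with `sklearn.metrics.f1_score` for the positive class."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical binary vectors (e.g., gold and predicted token-level tags):\n",
"gold = np.array([1, 1, 0, 1, 0, 0, 1])\n",
"pred = np.array([1, 0, 0, 1, 1, 0, 1])\n",
"\n",
"# Dice similarity: 2 * |intersection| / (|A| + |B|):\n",
"dice = 2 * (gold * pred).sum() / (gold.sum() + pred.sum())\n",
"\n",
"dice, metrics.f1_score(gold, pred)"
]
},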
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Macro-averaged F scores\n",
"\n",
"The [macro-averaged F$_{\\beta}$ score](http://scikit-learn.org/stable/modules/model_evaluation.html#multiclass-and-multilabel-classification) (macro F$_{\\beta}$) is the mean of the F$_{\\beta}$ score for each category:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"def macro_f_score(cm, beta):\n",
" return f_score(cm, beta).mean(skipna=False)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
" pos neg neutral\n",
"observed \n",
"pos 1 0 124\n",
"neg 0 1 24\n",
"neutral 0 0 1110"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ex3"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pos 0.015873\n",
"neg 0.076923\n",
"neutral 0.937500\n",
"dtype: float64"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f1_score(ex3)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.34343203093203095"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"macro_f_score(ex3, beta=1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Bounds of macro-averaged F scores\n",
"\n",
"[0, 1], with 0 the worst and 1 the best, and guaranteed to be between precision and recall."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by macro-averaged F scores\n",
"\n",
"Macro F$_{\\beta}$ scores inherit the values of F$_{\\beta}$ scores, and they additionally say that we care about all the classes equally regardless of their size. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of macro-averaged F scores\n",
"\n",
"In NLP, we typically care about modeling all of the classes well, so macro-F$_{\\beta}$ scores often seem appropriate. However, this is also the source of their primary weaknesses:\n",
"\n",
"* If a model is doing really well on a small class $K$, its high macro F$_{\\beta}$ score might mask the fact that it mostly makes incorrect predictions outside of $K$. So F$_{\\beta}$ scoring will make this kind of classifier look better than it is.\n",
"\n",
"* Conversely, if a model does well on a very large class, its overall performance might be high even if it stumbles on some small classes. So F$_{\\beta}$ scoring will make this kind of classifier look worse than it is, as measured by sheer number of good predictions."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Weighted F scores\n",
"\n",
"[Weighted F$_{\\beta}$ scores](http://scikit-learn.org/stable/modules/model_evaluation.html#multiclass-and-multilabel-classification) average the per-category F$_{\\beta}$ scores, but it's a weighted average based on the size of the classes in the observed/gold data:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"def weighted_f_score(cm, beta):\n",
" scores = f_score(cm, beta=beta).values\n",
" weights = cm.sum(axis=1)\n",
" return np.average(scores, weights=weights)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.828993812624765"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weighted_f_score(ex3, beta=1.0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Bounds of weighted F scores\n",
"\n",
"[0, 1], with 0 the worst and 1 the best, but without a guarantee that it will be between precision and recall."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by weighted F scores\n",
"\n",
"Weighted F$_{\\beta}$ scores inherit the values of F$_{\\beta}$ scores, and they additionally say that we want to weight the summary by the number of actual and predicted examples in each class. This will probably correspond well with how the classifier will perform, on a per example basis, on data with the same class distribution as the training data."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of weighted F scores\n",
"\n",
"Large classes will dominate these calculations. Just like macro-averaging, this can make a classifier look artificially good or bad, depending on where its errors tend to occur."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Micro-averaged F scores\n",
"\n",
"[Micro-averaged F$_{\\beta}$ scores](http://scikit-learn.org/stable/modules/model_evaluation.html#multiclass-and-multilabel-classification) (micro F$_{\\beta}$ scores) add up the 2 $\\times$ 2 confusion matrices for each category versus the rest, and then they calculate the F$_{\\beta}$ scores, with the convention being that the positive class's F$_{\\beta}$ score is reported. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"This function creates the 2 $\\times$ 2 matrix for a category `cat` in a confusion matrix `cm`:"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"def cat_versus_rest(cm, cat):\n",
" yes = cm.loc[cat, cat]\n",
" yes_no = cm.loc[cat].sum() - yes\n",
" no_yes = cm[cat].sum() - yes\n",
" no = cm.values.sum() - yes - yes_no - no_yes\n",
" return pd.DataFrame(\n",
" [[yes, yes_no],\n",
" [no_yes, no]],\n",
" columns=['yes', 'no'],\n",
" index=['yes', 'no'])"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
" yes no\n",
"yes 1030 240\n",
"no 240 2300"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum([cat_versus_rest(ex1, cat) for cat in ex1.index])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"For the micro F$_{\\beta}$ score, we just add up these per-category confusion matrices and calculate the F$_{\\beta}$ score:"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
"def micro_f_score(cm, beta):\n",
" c = sum([cat_versus_rest(cm, cat) for cat in cm.index])\n",
" return f_score(c, beta=beta).loc['yes']"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.8110236220472442"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"micro_f_score(ex1, beta=1.0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Bounds of micro-averaged F scores\n",
"\n",
"[0, 1], with 0 the worst and 1 the best, and guaranteed to be between precision and recall."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Value encoded by micro-averaged F scores\n",
"\n",
"* Micro F$_{\\beta}$ scores inherit the values of weighted F$_{\\beta}$ scores. (The resulting scores tend to be very similar.)\n",
"\n",
"* For two-class problems, this has an intuitive interpretation in which precision and recall are defined in terms of correct and incorrect guesses ignoring the class. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Weaknesses of micro-averaged F scores\n",
"\n",
"The weaknesses too are the same as those of weighted F$_{\\beta}$ scores, with the additional drawback that we actually get two potentially very different values, for the positive and negative classes, and we have to choose one to meet our goal of having a single summary number. (See the `'yes'` in the final line of `micro_f_score`.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Related to micro-averaged F scores\n",
"\n",
"* Micro-averaging is equivalent to accuracy.\n",
"\n",
"* F1 is identical to both precision and recall on the 2 $\\times$ 2 matrix that is the basis for the calculation."
]
},
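{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick check of the first point using the toy functions defined above: for `ex1`, accuracy and the micro-averaged F1 score agree (up to floating-point noise)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Micro-averaged F1 and accuracy coincide for this kind of\n",
"# multi-class confusion matrix:\n",
"accuracy(ex1), micro_f_score(ex1, beta=1.0)"
]
},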
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Precision–recall curves\n",
"\n",
"I noted above that confusion matrices hide a threshold for turning probabilities/scores into predicted labels. With precision–recall curves, we finally address this.\n",
"\n",
"A precision–recall curve is a method for summarizing the relationship between precision and recall for a binary classifier. \n",
"\n",
"The basis for this calculation is not the confusion matrix, but rather the raw scores or probabilities returned by the classifier. Normally, we use 0.5 as the threshold for saying that a prediction is positive. However, each distinct real value in the set of predictions is a potential threshold. The precision–recall curve explores this space."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Here's a basic implementation; [the sklearn version](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_curve.html) is more flexible and so recommended for real experimental frameworks."
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
"def precision_recall_curve(y, probs):\n",
" \"\"\"`y` is a list of labels, and `probs` is a list of predicted\n",
" probabilities or predicted scores -- likely a column of the\n",
" output of `predict_proba` using an `sklearn` classifier.\n",
" \"\"\"\n",
" thresholds = sorted(set(probs))\n",
" data = []\n",
" for t in thresholds:\n",
" # Use `t` to create labels:\n",
" pred = [1 if p >= t else 0 for p in probs]\n",
" # Precision/recall analysis as usual, focused on\n",
" # the positive class:\n",
" cm = pd.DataFrame(metrics.confusion_matrix(y, pred))\n",
" prec = precision(cm)[1]\n",
" rec = recall(cm)[1]\n",
" data.append((t, prec, rec))\n",
" # For intuitive graphs, always include this end-point:\n",
" data.append((None, 1, 0))\n",
" return pd.DataFrame(\n",
" data, columns=['threshold', 'precision', 'recall'])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"I'll illustrate with a hypothetical binary classification problem involving balanced classes:"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"y = np.random.choice((0, 1), size=1000, p=(0.5, 0.5))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Suppose our classifier is generally able to distinguish the two classes, but it never predicts a value above 0.4, so our usual methods of thresholding at 0.5 would make the classifier look very bad:"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"y_pred = [np.random.uniform(0.0, 0.3) if x == 0 else np.random.uniform(0.1, 0.4)\n",
" for x in y]"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The precision–recall curve can help us identify the optimal threshold given whatever our real-world goals happen to be:"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [],
"source": [
"prc = precision_recall_curve(y, y_pred)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"def plot_precision_recall_curve(prc):\n",
" ax1 = prc.plot.scatter(x='recall', y='precision', legend=False)\n",
" ax1.set_xlim([0, 1])\n",
" ax1.set_ylim([0, 1.1])\n",
" ax1.set_ylabel(\"precision\")\n",
" ax2 = ax1.twiny()\n",
" ax2.set_xticklabels(prc['threshold'].values[::100].round(3))\n",
" _ = ax2.set_xlabel(\"threshold\")"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/45/77p9r7r13q7_pwzzlsv85fxr0000gn/T/ipykernel_21512/1028661969.py:7: UserWarning: FixedFormatter should only be used together with FixedLocator\n",
" ax2.set_xticklabels(prc['threshold'].values[::100].round(3))\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAksAAAHZCAYAAACW+3/XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAABF2UlEQVR4nO3deVxWZf7/8fctq6BgAiIGIpSmZGaCJpJjtlDaaDVNaou2aN+cFheySXMqdZqxpml3m8y0Jk0ys2kxk2/jCuZCoBX+yhFUMIhAZTUUOL8//HKPBB5Z7gW4X8/H4348vC+uc87ndIX323Ou+zoWwzAMAQAAoF7tnF0AAABAS0ZYAgAAMEFYAgAAMEFYAgAAMEFYAgAAMEFYAgAAMEFYAgAAMEFYAgAAMEFYAgAAMEFYAmB3mzdvlsVi0YkTJxx63BUrVqhTp07N2sehQ4dksViUnp5+zj7OOj8AjkFYAmBzV199taZNm+bsMgDAJghLAFqk06dPO7sEAJBEWAJgY/fee6+2bNmiV199VRaLRRaLRYcOHZIkpaamKiYmRj4+PhoyZIi+//5763Zz5sxR//799dZbbykyMlJeXl4yDENFRUX6n//5H3Xp0kV+fn665pprtHfvXut2e/fu1fDhw9WxY0f5+fkpOjpae/bsqVXTF198oT59+qhDhw668cYblZuba/1ZdXW15s2bp9DQUHl5eal///7asGGD6TmuX79evXr1Uvv27TV8+HDr+QFomwhLAGzq1VdfVWxsrB544AHl5uYqNzdXYWFhkqTZs2frxRdf1J49e+Tu7q7777+/1rb/+c9/9P7772vt2rXWOUI33XST8vLytH79eqWmpmrAgAG69tprdezYMUnSXXfdpdDQUO3evVupqamaOXOmPDw8rPssLy/X3//+d/3zn//U1q1bdeTIEc2YMaNWvS+++KL+/ve/a9++fbrhhhs0evRoHThwoN7zy87O1u9+9zuNHDlS6enpmjRpkmbOnGnL/4QAWhoDAGxs2LBhxtSpU63vN23aZEgy/vd//9fa9tlnnxmSjJMnTxqGYRjPPPOM4eHhYeTn51v7fPnll4afn5/xyy+/1Nr/RRddZPzjH/8wDMMwOnbsaKxYsaLeOpYvX25IMv7zn/9Y2xYuXGgEBwdb33fr1s34y1/+Umu7gQMHGg899JBhGIaRlZVlSDLS0tIMwzCMWbNmGX369DGqq6ut/Z944glDknH8+PHz/acB0ApxZQmAw/Tr18/655CQEElSfn6+tS08PFxBQUHW96mpqSotLVVAQIA6dOhgfWVlZengwYOSpISEBE2aNEnXXXednnvuOWt7DR8fH1100UW1jltzzOLiYv3444+Ki4urtU1cXJz2799f7zns379fgwcPlsVisbbFxsY26r8DgNbF3dkFAHAdZ98eqwkb1dXV1jZfX99a/aurqxUSEqLNmzfX2VfNkgBz5szRnXfeqc8++0yff/65nnnmGa1evVq33nprnWPWHNcwjDptZzMMo07b2T8D4Fq4sgTA5jw9PVVVVdXs/QwYMEB5eXlyd3fXxRdfXOsVGBho7derVy9Nnz5dGzdu1O9+9zstX768Qfv38/NTt27dtH379lrtKSkp6tOnT73bREVF6auvvqrV9uv3ANoWwhIAm+vRo4d27typQ4cOqaCgoNbVo8a47rrrFBsbq1tuuUVffPGFDh06pJSUFP3pT3/Snj17dPLkST3yyCPavHmzDh8+rOTkZO3evfucQac+jz/+uJ5//nklJibq+++/18yZM5Wenq6pU6fW23/y5Mk6ePCgEhIS9P3332vVqlVasWJFk84PQOtAWAJgczNmzJCbm5uioqIUFBSkI0eONGk/FotF69ev129+8xvdf//96tWrl8aNG6dDhw4pODhYbm5uKiws1IQJE9SrVy+NGTNGI0aM0Ny5cxt8jClTpuixxx7TY489pssuu0wbNmzQxx9/rJ49e9bbv3v37lq7dq0++eQTXX755VqyZIn++te/Nun8ALQOFoMb8AAAAOfElSUAAAAThCUAAAAThCUAAAAThCUAAAAThCUAAAAThCUAAAATrS4sLVq0SBEREfL29lZ0dLS2bdtm2n/Lli2Kjo6Wt7e3IiMjtWTJkjp91q5dq6ioKHl5eSkqKkrr1q2r9fOtW7dq1KhR6tatmywWiz766CNbnlKrZuvxWLFihSwWS53XL7/8Uqvf0aNHdffddysgIEA+Pj7q37+/UlNTbX5+rUljxiI3N1d33nmnLrnkErVr107Tpk2r0+e7777Tbbfdph49eshiseiVV16p04ffjfrZeizOtnr1alksFt1yyy212hmL+tl6LK6++up6/4666aab6t3n/PnzZbFYzjuurqAxY/Hhhx/q+uuvV1BQkPz8/BQbG6svvviiTr/zfX6frTlj0arCUmJioqZNm6bZs2crLS1NQ4cO1YgRI8654F1WVpZGjhypoUOHKi0tTU8++aSmTJmitWvXWvvs2LFDY8eO1fjx47V3716NHz9eY8aM0c6dO619ysrKdPnll2vBggV2P8fWxB7jIZ15BEVubm6tl7e3t/Xnx48fV1xcnDw8PPT5558rIyNDL774ovVZYa6osWNRUVGhoKAgzZ49W5dffnm9fcrLyxUZGannnntOXbt2rbcPvxt12WMsahw+fFgzZszQ0KFD6/yMsajLHmPx4Ycf1vq76dtvv5Wbm5tuv/32On13796tN954o9YDpF1VY8di69atuv7667V+/XqlpqZq+PDhGjVqlNLS0qx9GvL5XaPZY2G0IoMGDTImT55cq613797GzJkz6+3/xz/+0ejdu3ettgcffNAYPHiw9f2YMWOMG2+8sVafG264wRg3bly9+5RkrFu3rgnVtz32GI/ly5cb/v7+psd94oknjKuuuqppRbdRjR2Lsw0bNsyYOnWqaZ/w8HDj5ZdfNu3D78YZ9hqLyspKIy4uznjzzTeNe+65x7j55pvPuR/G4gx7/14YhmG8/PLLRseOHY3S0tJa7SUlJUbPnj2NpKSkBu+rLWvOWNSIiooy5s6da33f0M9vW4xFq7mydOrUKaWmpio+Pr5We3x8vFJSUurdZseOHXX633DDDdqzZ49Onz5t2udc+8QZ9hoPSSotLVV4eLhCQ0P129/+tta/JCTp448/VkxMjG6//XZ16dJFV1xxhZYuXWqjM2t9mjIWsA97jsW8efMUFBSkiRMnNms/rsJRvxfLli3TuHHj5OvrW6v94Ycf1k033aTrrrvOZsdqrWwxFtXV1SopKVHnzp2tbQ39/LbFWLg3eUsHKygoUFVVlYKDg2u1BwcHKy8vr95t8vLy6u1fWVmpgoIChYSEnLPPufaJM+w1Hr1799aKFSt02WWXqbi4WK+++qri4uK0d+9e67O6MjMztXjxYiUkJOjJJ5/Url27NGXKFHl5eWnChAn2OeEWrCljAfuw11gkJydr2bJlSk9Pb2aFrsMRvxe7du3St99+q2XLltVqX716tb7++mvt3r3bJsdp7WwxF
i+++KLKyso0ZswYa1tDPr9tNRatJizVsFgstd4bhlGn7Xz9f93e2H3iv2w9HoMHD9bgwYOtP4+Li9OAAQP0+uuv67XXXpN05l8YMTEx1oeXXnHFFfruu++0ePFilwxLNfj/uOWw5ViUlJTo7rvv1tKlSxUYGGiL8lyKPX8vli1bpr59+2rQoEHWtuzsbE2dOlUbN26sNdcSTR+L9957T3PmzNG//vUvdenSpcH7tOVYtJqwFBgYKDc3tzopND8/v06yrNG1a9d6+7u7uysgIMC0z7n2iTPsNR6/1q5dOw0cOFAHDhywtoWEhCgqKqpWvz59+tSZKO4qmjIWsA97jMXBgwd16NAhjRo1ytpWXV0tSXJ3d9f333+viy66qOlFt1H2/r0oLy/X6tWrNW/evFrtqampys/PV3R0tLWtqqpKW7du1YIFC1RRUSE3N7dmH781ac5YJCYmauLEiVqzZk2d22jn+/y25Vi0mjlLnp6eio6OVlJSUq32pKQkDRkypN5tYmNj6/TfuHGjYmJi5OHhYdrnXPvEGfYaj18zDEPp6ekKCQmxtsXFxen777+v1e+HH35QeHh4U06l1WvKWMA+7DEWvXv31jfffKP09HTra/To0Ro+fLjS09MVFhZmi9LbHHv/Xrz//vuqqKjQ3XffXav92muvrTNeMTExuuuuu5Senu5yQUlq+li89957uvfee7Vq1ap6l2Y43+e3Tcei0VPCnWj16tWGh4eHsWzZMiMjI8OYNm2a4evraxw6dMgwDMOYOXOmMX78eGv/zMxMw8fHx5g+fbqRkZFhLFu2zPDw8DA++OADa5/k5GTDzc3NeO6554z9+/cbzz33nOHu7m589dVX1j4lJSVGWlqakZaWZkgyXnrpJSMtLc04fPiw406+BbLHeMyZM8fYsGGDcfDgQSMtLc247777DHd3d2Pnzp3WPrt27TLc3d2Nv/zlL8aBAweMlStXGj4+Psa7777ruJNvYRo7FoZhWP+fjo6ONu68804jLS3N+O6776w/r6iosPYJCQkxZsyYYaSlpRkHDhyw9uF3oy57jMWv1fdtOMaiLnuOxVVXXWWMHTu2QXXwbbjGj8WqVasMd3d3Y+HChUZubq71deLECWufhnx+/1pTx6JVhSXDMIyFCxca4eHhhqenpzFgwABjy5Yt1p/dc889xrBhw2r137x5s3HFFVcYnp6eRo8ePYzFixfX2eeaNWuMSy65xPDw8DB69+5trF27ttbPN23aZEiq87rnnnvscYqtiq3HY9q0aUb37t0NT09PIygoyIiPjzdSUlLqHPeTTz4x+vbta3h5eRm9e/c23njjDbucX2vS2LGo7//p8PBw68+zsrLq7XP2fvjdqJ+tx+LX6gtLjEX97DEW33//vSHJ2LhxY4NqICyd0ZixGDZsWIP+fz7f5/evNXUsLIbxfzNsAQAAUEermbMEAADgDIQlAAAAE4QlAAAAE4QlAAAAE4QlAAAAE4QlAAAAEy4XlioqKjRnzhxVVFQ4uxSI8WhJGIuWg7FoORiLlsOZY+Fy6ywVFxfL399fRUVF8vPzc3Y5Lo/xaDkYi5aDsWg5GIuWw5lj4XJXlgAAABqDsAQAAGDC3dkFOFplZaUkKTs7W/7+/k6uBiUlJZKko0ePqri42MnVuDbGouVgLFoOxqLlKCoqkvTfz3FHcrk5S9u3b9fQoUOdXQYAAGiCbdu26aqrrnLoMV3uylL37t0lSbt27VJISIiTqwEAAA2Rm5urQYMGWT/HHcnlwlK7dmemaYWEhCg0NNTJ1QAAgMao+Rx36DEdfkQAAIBWhLAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABggrAEAABgwqlhaevWrRo1apS6desmi8Wijz766LzbbNmyRdHR0fL29lZkZKSWLFli/0IBAIDLcnfmwcvKynT55Zfrvvvu02233Xbe/llZWRo5cqQeeOABvfvuu0pOTtZDDz2koKCgBm1/treSM3VBUHlTSwfsLrCDtwZHBigi0NfZpQCAS3NqWBoxYoRGjBjR4P5LlixR9+7d9corr0iS+vTpoz179ujvf/97o8PSG1uy5O5X0qhtAGeIjQzQkruj5e/j4exSAMAltao5Szt27FB8fHytthtuuEF79uzR6dOnnVQVYF87Mgv16Htpzi4DAFxWqwpLeXl5Cg4OrtUWHBysyspKFRQU1LtNRUWFiouLra+SEq4mofXZeuBn7cs+4ewyAMAltaqwJEkWi6XWe8Mw6m2vMX/+fPn7+1tfUVFRdq8RsIcn133j7BIAwCW1qrDUtWtX5eXl1WrLz8+Xu7u7AgIC6t1m1qxZKioqsr4yMjIcUSpgc9/+WKysgjJnlwEALsepE7wbKzY2Vp988kmtto0bNyomJkYeHvVPfvXy8pKXl5f1fXFxsV1rBOwp42gR344DAAdz6pWl0tJSpaenKz09XdKZpQHS09N15MgRSWeuCk2YMMHaf/LkyTp8+LASEhK0f/9+vfXWW1q2bJlmzJjhjPIBh1uy5aCzSwAAl+PUK0t79uzR8OHDre8TEhIkSffcc49WrFih3Nxca3CSpIiICK1fv17Tp0/XwoUL1a1bN7322muNXjZAkh4cFqkLgro2/yQAG3pnR5byS879zc5vfizWvuwT6hfWyXFFAYCLsxg1M6RdRE5OjsLCwpSdna3Q0FBnlwPUsn7fj3polfkyAX27+enTKUMdVBEAtAzO/PxuVRO8gbauvdf5L/Yy0RsAHIuwBLQg4Z19GtTvUCFhCQAchbAEtCCRQR00sMcF5+3n3q7+dcUAALZHWAJamDcnDFQHLzfTPl/uz+dWHAA4CGEJaGH8fTz0t9v6mfZZkXJIw/++WXe88ZWKynkuIgDYE2EJaIEaMtFbOvOQ3T+sTLVzNQDg2ghLQAvUmF/MlIOF3JIDADsiLAEtUHUj++/MLLRLHQAAwhLQIjV0CYEaLrWyLAA4GGEJaIEigzroNz2DGtz/nZRDTPQGADshLAEt1Ot3XKEhFwU0qO/+vBImegOAnRCWgBbK38dDqx4YrE0zrtZ9Q3qctz8TvQHAPghLQAsXEeir31zSsFtyE1fs5nYcANgYYQloBRo64TuzoIzbcQBgY4QloBVozIRvbscBgG0RloBW4vU7rlCfkI4N6nv7khTtyzlh34IAwEUQloBWwt/HQwvvHNCgvgWlpzR6QbJuXbidOUwA0EyEJaAVaez6S2nZRfrNC5sITADQDIQloJVpzO04SSo6eVqT3t5tx4oAoG0jLAGtTGNux9XYffi49mWfsE9BANDGEZaAVqixt+Mk6Ym1++xUDQC0bYQloJVqzONQpDOPRGFJAQBoPMIS0ErVPA7l44fjFBno26BtdmYW2rkqAGh7CEtAK9cvrJP+
PeNqLb7rivP2PXmq0gEVAUDb4u7sAgDYxojLuqnvhQf17dHic/b56+f7VVBWIW8PNwV28NbgyABFNPCqFAC4KsIS0Ib87ooLTcPS6Spp4abMWm2xkQFacne0/H087F0eALRK3IYD2pD2no3/98+OzELdu2KXHaoBgLaBsAS0IVdGdG7SdmlHTuj6lzef83lymT+X6r1dh/XeriN8ow6Ay+E2HNCGRAZ10BVh/krLLmr0tgd+KtPoBcnq2aWD4i/tIm8PN/1yqlof7/1R2cdP1uobdkF73RcXoeG9uzDnCUCbZzEMw3B2EY6Uk5OjsLAwZWdnKzQ01NnlADZXVH5aQ57/UmUVVQ45HnOeADiCMz+/uQ0HtDH+Ph5aNelKhx1vR2ah/rAy1WHHAwBHIywBbdDlYRc0+nEozZFysJC5TADaLMIS0EY19nEozcXq4ADaKiZ4A21UzeNQsgrKtDOzUIakD1NztPvwcbscL+dYuV32CwDOxpUloI2LCPTVuEHddceg7nrznoF2u9r0+bd5dtkvADgbYQlwIU15+G5DHSwoY94SgDaJ23CAC6p5+O6+7BOalpiuzF+FnLAL2uvm/t3k7eEmSQro4KWC0gq9uPEH0/1OWPaVPn30NywjAKBNISwBLqwmNJ09r+lcD9fN/Ln0vGEp+/gvuur5f2v7E9cQmAC0GYQlAIoI9D3vStyRQR0UHuCjw4XmE7lLKio1ZP6XSpl1LYEJQJvAnCUADXZPbI8G9Ss7XaW45/+tovLT9i0IAByAsASgwa6+pOELXZZWVOre5bvsWA0AOAZhCUCDRQZ1UGxkw5ceSMs+wTfkALR6hCUAjbLk7uhGrdXEyt4AWjvCEoBGOXutJi+38/8VUlBa4YCqAMB+CEsAmqRfWCftmn2dPNwszi4FAOyKsASgyfx9PPTA0AjTPvuyi5i3BKBVY50lAM1Ss8r3uWzc/5M27v9JfUI66vnb+qlfaCfHFAYANsKVJQAOsT+3RKMXJOv6l7ZoX84JZ5cDAA3GlSUAzRLU0btR/Q/kl2r0gmSFXdBeo/uHWK9MBXbwPuejVgDAmQhLAJqlm3/jwlKN7OMntXBTZp32K8L8teK+K3lUCoAWg9twAJql2sb7S8su0lU8KgVAC0JYAtAs4Z19bL7PkopKDfxLEnObALQIhCUAzRIZ1EG/6RkkWy+3dKrK0OgFybrjja+4ygTAqQhLAJrt9TuuUNzFDX/IbmPsyCzUvSt4IC8A52GCN4Bm8/fx0DsTBymroEyHCssU4OOp5zb8P6UctM1z4dKOnNAtC5P19n2DmPgNwOEISwBsJiLQ1/rV/1UPDFZWQZl2ZhYq51i5/rX3R2UfP9nkfadnn9AfVqZq1QODbVUuADQIYQmA3Zwdnmbc2Fv7sk/osTV7dSC/tEn7SzlYqKyCMtZiAuBQTp+ztGjRIkVERMjb21vR0dHatm2baf+VK1fq8ssvl4+Pj0JCQnTfffepsNA2l/oB2Fe/sE5KShimTTOu1vL7BmrTjKu17fHhjZocvjOT33cAjuXUsJSYmKhp06Zp9uzZSktL09ChQzVixAgdOXKk3v7bt2/XhAkTNHHiRH333Xdas2aNdu/erUmTJjm4cgDNERHoq+GXdFFEoK/CAnz09VPx6urn1aBtC0or7FwdANTm1LD00ksvaeLEiZo0aZL69OmjV155RWFhYVq8eHG9/b/66iv16NFDU6ZMUUREhK666io9+OCD2rNnj4MrB2BL/j4eDZ6L9N6uwywlAMChnBaWTp06pdTUVMXHx9dqj4+PV0pKSr3bDBkyRDk5OVq/fr0Mw9BPP/2kDz74QDfddNM5j1NRUaHi4mLrq6SkxKbnAcA2IoM6KDYy4Lz9jp6o0JDnvyQwAXAYp4WlgoICVVVVKTg4uFZ7cHCw8vLy6t1myJAhWrlypcaOHStPT0917dpVnTp10uuvv37O48yfP1/+/v7WV1RUlE3PA4DtLLk7WuEB518RvKyiSgP+vFHZheUOqAqAq3P6BG+LpfbMTsMw6rTVyMjI0JQpU/T0008rNTVVGzZsUFZWliZPnnzO/c+aNUtFRUXWV0ZGhk3rB2A7/j4eum3AhQ3qW2VII18z/0IIANiC05YOCAwMlJubW52rSPn5+XWuNtWYP3++4uLi9Pjjj0uS+vXrJ19fXw0dOlTPPvusQkJC6mzj5eUlL6//ThwtLi624VkAsLWgjt4N7ltSUaltB37W0J72WT0cACQnXlny9PRUdHS0kpKSarUnJSVpyJAh9W5TXl6udu1ql+zm5ibpzBUpAK3flRGdG9X//uW7uR0HwK6cehsuISFBb775pt566y3t379f06dP15EjR6y31WbNmqUJEyZY+48aNUoffvihFi9erMzMTCUnJ2vKlCkaNGiQunXr5qzTAGBDDZ3oXeN0taFRC7bbsSIArs6pK3iPHTtWhYWFmjdvnnJzc9W3b1+tX79e4eHhkqTc3Nxaay7de++9Kikp0YIFC/TYY4+pU6dOuuaaa/T888876xQA2MGSu6P1h5WpDX623ImTp7kdB8BuLIaL3b/KyclRWFiYsrOzFRoa6uxyAJjIKijTi1/8P336Tf3fkD3bFWH+GjOwuwZHBvA4FKANcubnN8+GA9BiRQT6KiH+kgaFpbTsIqVlfyNJio0M0JK7o+Xv42HvEgG4AKcvHQAAZho7h0mSdmQW6t7lu7Tp+3xlFZTZqTIAroIrSwBavCV3R+u3r29T9vGTDd4mLfuE7lu+W5LUs0sHxV/aRaEX+HKbDkCjEZYAtHj+Ph66PSZULyUdaNL2B/JLdSC/1Pqe23QAGoPbcABahcYsVnk+OzIL9cA7PIAbQMMQlgC0Co1drPJ8dh06xnwmAA1CWALQKjRlovf5jH59m/blnLDpPgG0PYQlAK3Gkruj9RsbLjxZUlGl0QuSFTf/y3OGpsyfS/lWHeDimOANoNXw9/HQOxMHKaugTIcKy9QjwFfTE9OVnn2iWfs9WvSLRi9I1oX+3rplQDd5e7jpl1PV+nRfrg4f++9z537TM0iv33EFE8MBF8MK3gBataLy0416NEpzDbkoQKseGOyQYwH4L1bwBoAm8vfx0KoHBiuroEw7MwuVc6xc/9r7Y6PWZGqMlIOFSkhMk197D0WF+CnIz1s9AnxZuwlowwhLANqEiMD/BpYZN/a2hqc3tmYq08bzjT5M+7FO28AeF+jNCQO5RQe0QUzwBtAmRQT6atyg7lr3UJwG9bDtsgP12X3ouH7zwiYVlZ+2+7EAOBZhCUCb5u/jofcnx+rjh+N0YSfbLWxZn6KTpzV+2U67HgOA4xGWALiEfmGdlDzzWn38cJyCO3ra7Tj7jhaxzADQxhCWALiUfmGdtHP29eof1slux/hsX905TQBaL8ISAJf09n2DTBe49HK3NHnf+7KLmrwtgJaHb8MBcEn1LXApyfrnzj6eDl2/CUDLRVgC4NLOXnKg5n2Ns9dvOvBTiX4q+UV7Dh1XXnGF+U4t0qbv81l/CWgjCEsAYOLXYUqSHnh7t5L2559zm40ZP2l
jxk+SpC4dvXR7TKh+Hx1GcAJaKcISADRSR++G/9WZX1KhhZsOauGmg+rS0VMPXX2xwgN9ueoEtCKEJQBopOJfKpu0XX7JKc35JMP6vkeAjybE9tDFXXxVZYgABbRQhCUAcJJDheWa92lGrbZewR00bmCYyk5V6ljpKRmSru0TrKEm39wDYF+EJQBopLAL2ttt3z/8VKp5n+6v1bYi5bD827vr00eGKizAx27HBlA/whIANFKvrn4OP2bRyUpd+9Jm3XVldxWdPK1OPh7qGeynwZEB3LoD7IywBACNdGWE/R/MW59TVYaWpxyu094npKOev62f+oV2cnxRgAsgLAFAI0UGdVBsZIB2ZLaMBSv355Zo9IJkBfh6Kjays8YO6s4cJ8CGCEsA0ARL7o7Wo++laeuBn61tPQJ89Nt+IfL2cNOJ8tN6O+WQTlcbDqupsOyUPv0mT59+kyc3SddGdVG3Tu2ZIA40k8UwDMf9JrcAOTk5CgsLU3Z2tkJDQ51dDoBW7uzHpdQ3d2jDN7ma88l351/1287cLdI1fQhPaL2c+flNWAIAB6h5bMq3OSe04bs8FZSddmo9vp7ttGHqML5dh1aDsORAhCUALUFNeMo5Vq5/7f1R2cdPOqWOF2/vp9uiw5xybKAxCEsORFgC0BLVhCdD0qUhfnr2s/3adeiYQ47d3qOdlt0zUEMuDnTI8YCmcObnNxO8AaAF+PUDe9+fHKusgjJ9tu9H7csukiR16eipxD3ZOl1t22OfPF2tO9/cqdjIAC25O1r+Ph62PQDQynFlCQBakaLy07p3+S6lZZ+wy/4H9eis9yfHKvPnUh0+Vs7z6tBicGUJANAg/j4eWvdwnPW2XUFphX45VaU1qdn6qeRUs/e/69AxDXo2Sfml/93Xb3oG6fU7ruCKE1wWV5YAoI2ouW2XeuiYvs4+oaKTlTbbd7Cfl0b07cqyA3AaJng7EGEJgKs4Ozxt+qHAZvv1dLNofGx3dfLxlCQFdvDmGXWwO27DAQBsLiLQV49c01OSlF1Yrhtf2aqy01XN3u+pKkPLttd9Rl2In5eG9w6Sp7sbV6DQpnBlCQBcSPxLm/VDfplDjuXpZtHVlwTpstBO+m2/blx5QrM48/O7nUOPBgBwqjWT4zSwxwUOOdapKkMbM/L14sYfNPzvmzXq9W0qKnfuyuVAUxCWAMCF+Pt4aM3kIeoT3MHhx/7maLGin92oDd/mOvzYQHMwZwkAXNDqB4fogXf2OGyV8BqV1dLkd7+Wm6R7r+qhuwf3kGEY2plVKMnCRHG0SIQlAHBB/j4etVYJz/y5TO0s0i+VVdrwbZ4qbbxK+K9VSVq2/ZCWbT9U52eRgb56ZVx/9QvtZN8igAYiLAGACzv7G3M1ispPa9Lbu7X78HGn1JRZUKbRC5J5/ApaDMISAKAWfx8PrfnDEGUVlOlQYZl6BPjq+9xiPf2vb2ut7G1vOzILddviZE0cGqGC0gpJrOkE5yAsAQDqdfbDfSMCfXXjZSHWAOXezqLi8tN69csf7LoUwX9+LtOsD7+t0x52QXstvGuA+oV2UubPpdqZdebRL4Qp2ANhCQDQYGcHKEm66fJuyiooU8bRIq1IOeSwW3fZx09q9IJkuenM/Kdfu8DHQ1Ov7aniX07rUEGZOvl4qGewH0EKTUJYAgA0S02AqglONbfu3C0WXffyZlVU2m/t43OtR368/LTmfJJR788CO3jqhku7yNPNTYakqBA/Bfl5q0eAL0EK9SIsAQBs5tdXnr5/dqTWpuboibV77f4Nu4YqKD2llTtz6v1ZgK+nLgryUd8LO2l8bA/CEyQ1Iyz98MMP2rx5s/Lz81VdXfs34Omnn252YQCAtuG26FDdFh2qbQd+1rs7Disp4ye1kNxUR2HZKRWWndKuQyf0VvIhhV3QXp8+OtT6jbya+VE/5JXIkHgGnoto0rPhli5dqj/84Q8KDAxU165dZbFY/rtDi0Vff/21TYu0JZ4NBwDOt+3Az/r3/nydrqzSv3/I148nKpxdkqkbo4L1XW6xso+frPOzdpKui+qi8bE9dGGn9tbJ5lLtb+9l/lyqw8fKud3XRM78/G5SWAoPD9dDDz2kJ554wh412RVhCQBankuf/lxlp1rq9abm83KzqKLqvx+3l13op2dv6auM3GJJFl3YyVtVhghSJpz5+d2k23DHjx/X7bffbutaAAAuasPUYRq1YLtOnGybD9o9OyhJZ56Td/PClHr79gruoHEDw1R2qlLHSk+p6P/+m5T8UimLpA7e7ir5pVIdvd01ODKAyekO0KQrSxMnTtTAgQM1efJke9RkV1xZAoCWq+b2XDuLdIGvpwI6eCn0gvaqrDa04z+FenN7pqrt9+W6Vm1gjwv05oSBbXbF81Z3Zeniiy/WU089pa+++kqXXXaZPDxqD8yUKVNsUhwAwLUM7Rl0zgnTwy/poidv6mMNVCW/nFa1IUUE+ap/WCf956dSLdx0QAVlbfPq1PnsPnRcA/68UVHd/BTi560O3mc+4vOKf9HJU1Xy8XSXIUMWWWTIUO+ufrr6kiAdPXFSP+SVqOjkadajOocmXVmKiIg49w4tFmVmZjarKHviyhIAtG1ZBWXamVkoQ9LgyABJ0rs7Dumbo0U6mF+iwvJK5xbYSoRd0F5PjuytEydPq6C0Qlk/lyuv+KRC/L3VI9BXx0pP1flGoD2/LdjqJnjb0qJFi/TCCy8oNzdXl156qV555RUNHTr0nP0rKio0b948vfvuu8rLy1NoaKhmz56t+++/v0HHIywBgGs7+5EtR4+f1IGfSpR97KTKT1Vq+8FCZ5fXKrm3kzr5eKigtO5VPXeL1LNrR3X28dAlXTsq59gvyj5eprDOPrqwU3vrnCy/9h6KCvFTlWFIstT6FuGn+37UocPZennida3nNtzZarLW2csHNFRiYqKmTZumRYsWKS4uTv/4xz80YsQIZWRkqHv37vVuM2bMGP30009atmyZLr74YuXn56uykn8lAAAa5tcLZ54t42iRfvv69ha7DlRLVVmteoOSJFUa0v7cEklS8sFj1vb9eaXn3a+nm3Tq/5ZprywuaH6hTdTkK0vvvPOOXnjhBR04cECS1KtXLz3++OMaP358g/dx5ZVXasCAAVq8eLG1rU+fPrrllls0f/78Ov03bNigcePGKTMzU507d25K2VxZAgCc15o92dr4XZ7aWSzy9XLXBT4e1gnnNbf2lm09qHVpR1V2mmjlCJXFBTq6+N7Wc2XppZde0lNPPaVHHnlEcXFxMgxDycnJmjx5sgoKCjR9+vTz7uPUqVNKTU3VzJkza7XHx8crJaX+r1N+/PHHiomJ0d/+9jf985//lK+vr0aPHq0///nPat++fb3bVFRUqKLiv4udlZSUNOJMAQCu6PaYMN0eE2ba59nf9dOzv+unrIIyfbbvR2X+XGadbF5ze+94+Wld4OMhi8Wit5OzdJpv8rVKTQpLr7/+uhYvXqwJEyZY226++WZdeumlmjNnToPCUkFBgaqqqhQcHFyrPTg4WHl5efVuk5mZqe3bt8vb21
vr1q1TQUGBHnroIR07dkxvvfVWvdvMnz9fc+fObcTZAQDQcBGBvnrkmp7n7fen30bVWhrh4uCOcm9nUdqR4/r8m1wdP1l3SkmIn5eiwzvL26Odik9W6lR1lb7NKXLZb/w5S5PCUm5uroYMGVKnfciQIcrNzW3Uvn4918kwjHPOf6qurpbFYtHKlSvl7+8v6cxVrt///vdauHBhvVeXZs2apYSEBOv7o0ePKioqqlE1AgBgC/UtjXB7TJj++n9XqHZmnnlUSs3tvnPNrdqXfUJTVqfpUGG5I8p2eU1eZ+n999/Xk08+Was9MTFRPXueP11LUmBgoNzc3OpcRcrPz69ztalGSEiILrzwQmtQks7McTIMQzk5OfUe28vLS15eXtb3xcXFDaoPAABHMpt4/mv9wjpp8+PDa90CbGeRcotOqvxUlbp09Javl3utNi/3dsr4sVjFFVV2PpO2p0lhae7cuRo7dqy2bt2quLg4WSwWbd++XV9++aXef//9Bu3D09NT0dHRSkpK0q233mptT0pK0s0331zvNnFxcVqzZo1KS0vVoUMHSdIPP/ygdu3aMVkbAOByGnoL8Gz1XcGSpM3/L/+ci3r++tl2rqbJ34ZLTU3Vyy+/rP3798swDEVFRemxxx7TFVdc0eB9JCYmavz48VqyZIliY2P1xhtvaOnSpfruu+8UHh6uWbNm6ejRo3rnnXckSaWlperTp48GDx6suXPnqqCgQJMmTdKwYcO0dOnSBh2Tb8MBAHBu57odePb6VJXVhgpKKpTxY7F1JfULfDx0cXBHDY4MUM7xcq37+qhKf6mUxSLll1TIw92inGMn9WPRL9ZjBft5qbC4Qg1ZAKjVfRtOkqKjo/Xuu+826+Bjx45VYWGh5s2bp9zcXPXt21fr169XeHi4pDNzo44cOWLt36FDByUlJenRRx9VTEyMAgICNGbMGD377LPNqgMAAJxxrtuBjblNGBHoe86Vu2tC19kP/62Z+N65g6fcLGcmvfu399C6tKNqCRe0Gnxlqbi4WH5+ftY/m6np1xJxZQkAgNbh0qc+t65j1SquLF1wwQXKzc1Vly5d1KlTp3q/sVbzTbaqKiaPAQCAtqHBYenf//63ddXsTZs22a0gAAAASS3msTMNDkvDhg2r988AAAB20bTvoNlcu6ZstGHDBm3fvt36fuHCherfv7/uvPNOHT9+3GbFAQAAF3aORaodrUlh6fHHH7dO8v7mm2+UkJCgkSNHKjMzs9Zq2QAAAE3WQq4sNWnpgKysLOsjQ9auXatRo0bpr3/9q77++muNHDnSpgUCAADXVN1CwlKTrix5enqqvPzM82j+93//V/Hx8ZKkzp078zgRAABgGy3kNlyTrixdddVVSkhIUFxcnHbt2qXExERJZx49wtpFAADAFpp0RccOmlTHggUL5O7urg8++ECLFy/WhRdeKEn6/PPPdeONN9q0QAAA4Jpaym24Jl1Z6t69uz799NM67S+//HKzCwIAAJCk6paRlRoeltrK404AAAAag8edAAAAmOBxJwAAACZ43AkAAICJJn0bbvny5VqzZk2d9jVr1ujtt99udlEAAAAtRZPC0nPPPafAwMA67V26dNFf//rXZhcFAADQUjQpLB0+fFgRERF12sPDw3XkyJFmFwUAANBCVg5oWljq0qWL9u3bV6d97969CggIaHZRAAAALSUtNSksjRs3TlOmTNGmTZtUVVWlqqoq/fvf/9bUqVM1btw4W9cIAABcUAvJSk1bwfvZZ5/V4cOHde2118rd/cwuqqurNWHCBOYsAQAAm2gZj9FtYljy9PRUYmKi/vznP2vv3r1q3769LrvsMoWHh9u6PgAA4KJa9ZWlGj169JBhGLrooousV5gAAABsoaVcWWrSnKXy8nJNnDhRPj4+uvTSS63fgJsyZYqee+45mxYIAABcU0u5stSksDRr1izt3btXmzdvlre3t7X9uuuuU2Jios2KAwAArstoIWmpSffOPvroIyUmJmrw4MG1HqgbFRWlgwcP2qw4AADgulr1bbiff/5ZXbp0qdNeVlZWKzwBAAA0WQuJFE0KSwMHDtRnn31mfV8TkJYuXarY2FjbVAYAAFxaq74NN3/+fN14443KyMhQZWWlXn31VX333XfasWOHtmzZYusaAQCAC3Jzk6qrnF1FE68sDRkyRCkpKSovL9dFF12kjRs3Kjg4WDt27FB0dLStawQAAC7oosAOzi5BUhOuLJ0+fVr/8z//o6eeekpvv/22PWoCAADQrJF9dM/y3c4uo/FXljw8PLRu3Tp71AIAAGA17JIu8mkBa1436Tbcrbfeqo8++sjGpQAAANT2xfThusDHw6k1NCmvXXzxxfrzn/+slJQURUdHy9fXt9bPp0yZYpPiAACAawsL8FHa0/Fau22vfr/YOTVYDKPxX8yLiIg49w4tFmVmZjarKHvKyclRWFiYsrOzFRoa6uxyAABAAzjz87tJV5aysrKsf67JWixGCQAA2qImzVmSpGXLlqlv377y9vaWt7e3+vbtqzfffNOWtQEAADhdk64sPfXUU3r55Zf16KOPWlfs3rFjh6ZPn65Dhw7p2WeftWmRAAAAztKksLR48WItXbpUd9xxh7Vt9OjR6tevnx599FHCEgAAaDOadBuuqqpKMTExddqjo6NVWVnZ7KIAAABaiiaFpbvvvluLF9f9/t4bb7yhu+66q9lFAQAAtBRNXhdz2bJl2rhxowYPHixJ+uqrr5Sdna0JEyYoISHB2u+ll15qfpUAAABO0qSw9O2332rAgAGSpIMHD0qSgoKCFBQUpG+//dbaj+UEAABAa9eksLRp0yZb1wEAANAiNXmdJQAAAFdAWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADBBWAIAADDh9LC0aNEiRUREyNvbW9HR0dq2bVuDtktOTpa7u7v69+9v3wIBAIBLc2pYSkxM1LRp0zR79mylpaVp6NChGjFihI4cOWK6XVFRkSZMmKBrr73WQZUCAABX5dSw9NJLL2nixImaNGmS+vTpo1deeUVhYWFavHix6XYPPvig7rzzTsXGxjqoUgAA4KqcFpZOnTql1NRUxcfH12qPj49XSkrKObdbvny5Dh48qGeeeaZBx6moqFBxcbH1VVJS0qy6AQCAa3FaWCooKFBVVZWCg4NrtQcHBysvL6/ebQ4cOKCZM2dq5cqVcnd3b9Bx5s+fL39/f+srKiqq2bUDAADX4fQJ3haLpdZ7wzDqtElSVVWV7rzzTs2dO1e9evVq8P5nzZqloqIi6ysjI6PZNQMAANfRsMszdhAYGCg3N7c6V5Hy8/PrXG2SpJKSEu3Zs0dpaWl65JFHJEnV1dUyDEPu7u7auHGjrrnmmjrbeXl5ycvLy/q+uLjYxmcCAADaMqddWfL09FR0dLSSkpJqtSclJWnIkCF1+vv5+embb75Renq69TV58mRdcsklSk9P15VXXumo0gEAgAtx2pUlSUpISND48eMVE
xOj2NhYvfHGGzpy5IgmT54s6cwttKNHj+qdd95Ru3bt1Ldv31rbd+nSRd7e3nXaAQAAbMWpYWns2LEqLCzUvHnzlJubq759+2r9+vUKDw+XJOXm5p53zSUAAAB7shiGYTi7CEfKyclRWFiYsrOzFRoa6uxyAABAAzjz89vp34YDAABoyQhLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJpwelhYtWqSIiAh5e3srOjpa27ZtO2ffDz/8UNdff72CgoLk5+en2NhYffHFFw6sFgAAuBqnhqXExERNmzZNs2fPVlpamoYOHaoRI0boyJEj9fbfunWrrr/+eq1fv16pqakaPny4Ro0apbS0NAdXDgAAXIXFMAzDWQe/8sorNWDAAC1evNja1qdPH91yyy2aP39+g/Zx6aWXauzYsXr66acb1D8nJ0dhYWHKzs5WaGhok+oGAACO5czPb6ddWTp16pRSU1MVHx9fqz0+Pl4pKSkN2kd1dbVKSkrUuXNne5QIAAAgd2cduKCgQFVVVQoODq7VHhwcrLy8vAbt48UXX1RZWZnGjBlzzj4VFRWqqKiwvi8pKWlawQAAwCU5fYK3xWKp9d4wjDpt9Xnvvfc0Z84cJSYmqkuXLufsN3/+fPn7+1tfUVFRza4ZAAC4DqeFpcDAQLm5udW5ipSfn1/natOvJSYmauLEiXr//fd13XXXmfadNWuWioqKrK+MjIxm1w4AAFyH08KSp6enoqOjlZSUVKs9KSlJQ4YMOed27733nu69916tWrVKN91003mP4+XlJT8/P+urY8eOza4dAAC4DqfNWZKkhIQEjR8/XjExMYqNjdUbb7yhI0eOaPLkyZLOXBU6evSo3nnnHUlngtKECRP06quvavDgwdarUu3bt5e/v7/TzgMAALRdTg1LY8eOVWFhoebNm6fc3Fz17dtX69evV3h4uCQpNze31ppL//jHP1RZWamHH35YDz/8sLX9nnvu0YoVKxxdPgAAcAFOXWfJGVhnCQCA1scl11kCAABoDQhLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJpwelhYtWqSIiAh5e3srOjpa27ZtM+2/ZcsWRUdHy9vbW5GRkVqyZImDKgUAAK7IqWEpMTFR06ZN0+zZs5WWlqahQ4dqxIgROnLkSL39s7KyNHLkSA0dOlRpaWl68sknNWXKFK1du9bBlQMAAFdhMQzDcNbBr7zySg0YMECLFy+2tvXp00e33HKL5s+fX6f/E088oY8//lj79++3tk2ePFl79+7Vjh07GnTMnJwchYWFKTs7W6Ghoc0/CQAAYHfO/Px22pWlU6dOKTU1VfHx8bXa4+PjlZKSUu82O3bsqNP/hhtu0J49e3T69Gm71QoAAFyXu7MOXFBQoKqqKgUHB9dqDw4OVl5eXr3b5OXl1du/srJSBQUFCgkJqbNNRUWFKioqrO+LiookSbm5uc09BQAA4CA1n9vV1dUOP7bTwlINi8VS671hGHXazte/vvYa8+fP19y5c+u0Dxo0qLGlAgAAJ8vOzlb37t0dekynhaXAwEC5ubnVuYqUn59f5+pRja5du9bb393dXQEBAfVuM2vWLCUkJFjfHzt2TBEREfr222/l7+/fzLNAc5WUlCgqKkoZGRnq2LGjs8txaYxFy8FYtByMRctRVFSkvn37qk+fPg4/ttPCkqenp6Kjo5WUlKRbb73V2p6UlKSbb7653m1iY2P1ySef1GrbuHGjYmJi5OHhUe82Xl5e8vLyqtMeFhYmPz+/ZpwBbKG4uFiSdOGFFzIeTsZYtByMRcvBWLQcNf/93d0dH12cunRAQkKC3nzzTb311lvav3+/pk+friNHjmjy5MmSzlwVmjBhgrX/5MmTdfjwYSUkJGj//v166623tGzZMs2YMcNZpwAAANo4p85ZGjt2rAoLCzVv3jzl5uaqb9++Wr9+vcLDwyWdmcx19ppLERERWr9+vaZPn66FCxeqW7dueu2113Tbbbc56xQAAEAb5/QJ3g899JAeeuihen+2YsWKOm3Dhg3T119/3eTjeXl56Zlnnqn31hwcj/FoORiLloOxaDkYi5bDmWPh1EUpAQAAWjqnPxsOAACgJSMsAQAAmCAsAQAAmCAsAQAAmGiTYWnRokWKiIiQt7e3oqOjtW3bNtP+W7ZsUXR0tLy9vRUZGaklS5Y4qNK2rzFj8eGHH+r6669XUFCQ/Pz8FBsbqy+++MKB1bZ9jf3dqJGcnCx3d3f179/fvgW6kMaORUVFhWbPnq3w8HB5eXnpoosu0ltvveWgatu2xo7FypUrdfnll8vHx0chISG67777VFhY6KBq266tW7dq1KhR6tatmywWiz766KPzbuOwz2+jjVm9erXh4eFhLF261MjIyDCmTp1q+Pr6GocPH663f2ZmpuHj42NMnTrVyMjIMJYuXWp4eHgYH3zwgYMrb3saOxZTp041nn/+eWPXrl3GDz/8YMyaNcvw8PAwvv76awdX3jY1djxqnDhxwoiMjDTi4+ONyy+/3DHFtnFNGYvRo0cbV155pZGUlGRkZWUZO3fuNJKTkx1YddvU2LHYtm2b0a5dO+PVV181MjMzjW3bthmXXnqpccsttzi48rZn/fr1xuzZs421a9cakox169aZ9nfk53ebC0uDBg0yJk+eXKutd+/exsyZM+vt/8c//tHo3bt3rbYHH3zQGDx4sN1qdBWNHYv6REVFGXPnzrV1aS6pqeMxduxY409/
+pPxzDPPEJZspLFj8fnnnxv+/v5GYWGhI8pzKY0dixdeeMGIjIys1fbaa68ZoaGhdqvRFTUkLDny87tN3YY7deqUUlNTFR8fX6s9Pj5eKSkp9W6zY8eOOv1vuOEG7dmzR6dPn7ZbrW1dU8bi16qrq1VSUqLOnTvbo0SX0tTxWL58uQ4ePKhnnnnG3iW6jKaMxccff6yYmBj97W9/04UXXqhevXppxowZOnnypCNKbrOaMhZDhgxRTk6O1q9fL8Mw9NNPP+mDDz7QTTfd5IiScRZHfn47fQVvWyooKFBVVZWCg4NrtQcHBysvL6/ebfLy8urtX1lZqYKCAoWEhNit3rasKWPxay+++KLKyso0ZswYe5ToUpoyHgcOHNDMmTO1bds2pzy4sq1qylhkZmZq+/bt8vb21rp161RQUKCHHnpIx44dY95SMzRlLIYMGaKVK1dq7Nix+uWXX1RZWanRo0fr9ddfd0TJOIsjP7/b1JWlGhaLpdZ7wzDqtJ2vf33taLzGjkWN9957T3PmzFFiYqK6dOlir/JcTkPHo6qqSnfeeafmzp2rXr16Oao8l9KY343q6mpZLBatXLlSgwYN0siRI/XSSy9pxYoVXF2ygcaMRUZGhqZMmaKnn35aqamp2rBhg7KysqwPgIdjOerzu039czEwMFBubm51/kWQn59fJ33W6Nq1a7393d3dFRAQYLda27qmjEWNxMRETZw4UWvWrNF1111nzzJdRmPHo6SkRHv27FFaWpoeeeQRSWc+sA3DkLu7uzZu3KhrrrnGIbW3NU353QgJCdGFF14of39/a1ufPn1kGIZycnLUs2dPu9bcVjVlLObPn6+4uDg9/vjjkqR+/frJ19dXQ4cO1bPPPsvdCAdy5Od3m7qy5OnpqejoaCUlJdVqT0pK0pAhQ+rdJjY2tk7/jRs3KiYmRh4eHnarta1rylhIZ64o3XvvvVq1ahVzAGyosePh5+enb775Runp6dbX5MmTdckllyg9PV1XXnmlo0pvc5ryuxEXF6cff/xRpaWl1rYffvhB7dq1U2hoqF3rbcuaMhbl5eVq1672R6ebm5uk/17VgGM49PPb5lPGnazma6DLli0zMjIyjGnTphm+vr7GoUOHDMMwjJkzZxrjx4+39q/56uH06dONjIwMY9myZSwdYCONHYtVq1YZ7u7uxsKFC43c3Fzr68SJE846hTalsePxa3wbznYaOxYlJSVGaGio8fvf/9747rvvjC1bthg9e/Y0Jk2a5KxTaDMaOxbLly833N3djUWLFhkHDx40tm/fbsTExBiDBg1y1im0GSUlJUZaWpqRlpZmSDJeeuklIy0tzbqMgzM/v9tcWDIMw1i4cKERHh5ueHp6GgMGDDC2bNli/dk999xjDBs2rFb/zZs3G1dccYXh6elp9OjRw1i8eLGDK267GjMWw4YNMyTVed1zzz2OL7yNauzvxtkIS7bV2LHYv3+/cd111xnt27c3QkNDjYSEBKO8vNzBVbdNjR2L1157zYiKijLat29vhISEGHfddZeRk5Pj4Krbnk2bNpl+Bjjz89tiGFw3BAAAOJc2NWcJAADA1ghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAAAAJghLAFzanDlz1L9/f+v7e++9V7fccovT6gHQ8hCWAAAATBCWALRYp06dcnYJAEBYAtByXH311XrkkUeUkJCgwMBAXX/99crIyNDIkSPVoUMHBQcHa/z48SooKLBuU11dreeff14XX3yxvLy81L17d/3lL3+x/vyJJ55Qr1695OPjo8jISD311FM6ffq0M04PQCtFWALQorz99ttyd3dXcnKynnvuOQ0bNkz9+/fXnj17tGHDBv30008aM2aMtf+sWbP0/PPP66mnnlJGRoZWrVql4OBg6887duyoFStWKCMjQ6+++qqWLl2ql19+2RmnBqCV4kG6AFqMq6++WkVFRUpLS5MkPf3009q5c6e++OILa5+cnByFhYXp+++/V0hIiIKCgrRgwQJNmjSpQcd44YUXlJiYqD179kg6M8H7o48+Unp6uqQzE7xPnDihjz76yKbnBqD1cnd2AQBwtpiYGOufU1NTtWnTJnXo0KFOv4MHD+rEiROqqKjQtddee879ffDBB3rllVf0n//8R6WlpaqsrJSfn59dagfQNhGWALQovr6+1j9XV1dr1KhRev755+v0CwkJUWZmpum+vvrqK40bN05z587VDTfcIH9/f61evVovvviizesG0HYRlgC0WAMGDNDatWvVo0cPubvX/euqZ8+eat++vb788st6b8MlJycrPDxcs2fPtrYdPnzYrjUDaHuY4A2gxXr44Yd17Ngx3XHHHdq1a5cyMzO1ceNG3X///aqqqpK3t7eeeOIJ/fGPf9Q777yjgwcP6quvvtKyZcskSRdffLGOHDmi1atX6+DBg3rttde0bt06J58VgNaGsASgxerWrZuSk5NVVVWlG264QX379tXUqVPl7++vdu3O/PX11FNP6bHHHtPTTz+tPn36aOzYscrPz5ck3XzzzZo+fboeeeQR9e/fXykpKXrqqaeceUoAWiG+DQcAAGCCK0sAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAmCEsAAAAm/j/+6ctRiXY4OQAAAABJRU5ErkJggg==\n",
"text/plain": [
"