{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Problem\n",
"\n",
"[Pandas](http://pandas.pydata.org) and [scikit-learn](http://scikit-learn.org/stable/) have largely overlapping but distinct data models.\n",
"Both are built on NumPy arrays, but the extensions pandas has made to NumPy's type system have created a slight rift between the two. Most notably, pandas supports heterogeneous data and has added several extension data types on top of NumPy.\n",
"\n",
"### 1. Homogeneity vs. Heterogeneity\n",
"\n",
"NumPy `ndarray`s (and so scikit-learn feature matrices) are *homogeneous*: they must have a single dtype, regardless of the number of dimensions.\n",
"Pandas `DataFrame`s are potentially *heterogeneous*, and can store columns of multiple dtypes within a single DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"dtype('float64')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([\n",
"    [10, 1.0],  # mix of integers and floats\n",
"    [20, 2.0],\n",
"    [30, 3.0],\n",
"])\n",
"x.dtype"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0 int64\n",
"1 float64\n",
"dtype: object"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame([\n",
"    [10, 1.0],\n",
"    [20, 2.0],\n",
"    [30, 3.0]\n",
"])\n",
"df.dtypes"
]
},
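{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sketch of what this means in practice (an illustration added here, using a small made-up frame named `mixed`), converting a mixed-dtype DataFrame to a NumPy array forces every column into one common dtype.\n",
"With one integer column and one float column, the integers are upcast to floats."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Illustrative frame with one int64 column and one float64 column.\n",
"mixed = pd.DataFrame([[10, 1.0], [20, 2.0], [30, 3.0]])\n",
"\n",
"# The conversion picks a single dtype able to hold every column.\n",
"as_array = np.asarray(mixed)\n",
"as_array.dtype"
]
},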
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Extension Types\n",
"\n",
"Pandas has implemented some *extension dtypes*: `Categoricals` and datetimes with timezones.\n",
"In statistics, categorical data refers to a variable that can take only a limited, fixed set of distinct values (e.g. days of the week).\n",
"These extension types cannot be expressed natively as NumPy arrays, and must go through some kind of (potentially lossy) conversion process."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0 a\n",
"1 b\n",
"2 c\n",
"3 a\n",
"dtype: category\n",
"Categories (4, object): [d < a < b < c]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = pd.Series(pd.Categorical(['a', 'b', 'c', 'a'],\n",
"                             categories=['d', 'a', 'b', 'c'],\n",
"                             ordered=True))\n",
"s"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Converting this to a NumPy array loses both the categories and the ordering information."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array(['a', 'b', 'c', 'a'], dtype=object)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.asarray(s)"
]
},
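{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the category information is needed downstream, one option is to read it off the `.cat` accessor before converting.\n",
"This is only a sketch of how the pieces could be carried along separately, not something the rest of this notebook relies on; the names `codes`, `categories`, and `is_ordered` are chosen here for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"s = pd.Series(pd.Categorical(['a', 'b', 'c', 'a'],\n",
"                             categories=['d', 'a', 'b', 'c'],\n",
"                             ordered=True))\n",
"\n",
"codes = s.cat.codes.values     # integer position of each value within the categories\n",
"categories = s.cat.categories  # the full set of allowed labels, in order\n",
"is_ordered = s.cat.ordered     # whether the categories carry an ordering\n",
"codes, categories, is_ordered"
]
},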
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"\"Real-world\" data is often complex and heterogeneous, making pandas the tool of choice.\n",
"However, tools like scikit-learn, which do not depend on pandas, can't use its\n",
"richer data structures.\n",
"We need a way of bridging the gap between pandas' DataFrames and the NumPy arrays that scikit-learn expects.\n",
"Fortunately, the tools are all there to make this conversion smooth."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Data\n",
"\n",
"For our example we'll work with a simple dataset on tips:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>total_bill</th><th>tip</th><th>sex</th><th>smoker</th><th>day</th><th>time</th><th>size</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>16.99</td><td>1.01</td><td>Female</td><td>No</td><td>Sun</td><td>Dinner</td><td>2</td></tr>\n",
"    <tr><th>1</th><td>10.34</td><td>1.66</td><td>Male</td><td>No</td><td>Sun</td><td>Dinner</td><td>3</td></tr>\n",
"    <tr><th>2</th><td>21.01</td><td>3.50</td><td>Male</td><td>No</td><td>Sun</td><td>Dinner</td><td>3</td></tr>\n",
"    <tr><th>3</th><td>23.68</td><td>3.31</td><td>Male</td><td>No</td><td>Sun</td><td>Dinner</td><td>2</td></tr>\n",
"    <tr><th>4</th><td>24.59</td><td>3.61</td><td>Female</td><td>No</td><td>Sun</td><td>Dinner</td><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" total_bill tip sex smoker day time size\n",
"0 16.99 1.01 Female No Sun Dinner 2\n",
"1 10.34 1.66 Male No Sun Dinner 3\n",
"2 21.01 3.50 Male No Sun Dinner 3\n",
"3 23.68 3.31 Male No Sun Dinner 2\n",
"4 24.59 3.61 Female No Sun Dinner 4"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(\"tips.csv\")\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 244 entries, 0 to 243\n",
"Data columns (total 7 columns):\n",
"total_bill 244 non-null float64\n",
"tip 244 non-null float64\n",
"sex 244 non-null object\n",
"smoker 244 non-null object\n",
"day 244 non-null object\n",
"time 244 non-null object\n",
"size 244 non-null int64\n",
"dtypes: float64(2), int64(1), object(4)\n",
"memory usage: 13.4+ KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our target variable is the tip amount. The remainder of the columns make up our features."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"y = df['tip']\n",
"X = df.drop('tip', axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice the feature matrix is a mixture of numeric and categorical (in the statistical sense) columns."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Stats\n",
"\n",
"We'll use a linear regression to predict `tip`.\n",
"When you fit a linear regression, you end up having to solve an equation like\n",
"\n",
"$$\n",
"\\hat{\\boldsymbol{\\beta}} = \\left(\\boldsymbol{X}^T\\boldsymbol{X}\\right)^{-1} \\boldsymbol{X}^T \\boldsymbol{y}\n",
"$$\n",
"\n",
"where\n",
"\n",
"- $\\hat{\\boldsymbol{\\beta}}$ is our estimate for the vector of coefficients describing the best-fit line\n",
"- $\\boldsymbol{X}$ is the feature matrix\n",
"- $\\boldsymbol{y}$ is the target array (tip amount)\n",
"\n",
"There's no need to worry about that equation; it likely won't make sense unless you've seen it before.\n",
"The only point I want to emphasize is that finding the optimal set of coefficients requires doing a matrix multiplication.\n",
"This means we (or our library) need to somehow convert our *categorical* data (`sex`, `smoker`, `day`, and `time`) into numeric data.\n",
"The next two sections offer some possible ways of doing that conversion."
]
},
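{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the matrix arithmetic concrete, here is a minimal NumPy sketch of that equation on made-up numbers.\n",
"The arrays `X_demo` and `y_demo` are illustrative and are not the tips data; the point is only that the computation works because every entry of the feature matrix is numeric."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Made-up design matrix: a column of ones (intercept) plus one numeric feature.\n",
"X_demo = np.array([[1.0, 10.0],\n",
"                   [1.0, 20.0],\n",
"                   [1.0, 30.0],\n",
"                   [1.0, 40.0]])\n",
"y_demo = np.array([1.5, 2.5, 3.5, 4.5])\n",
"\n",
"# Solve (X^T X) beta = X^T y instead of forming the inverse explicitly.\n",
"beta_hat = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)\n",
"beta_hat"
]
},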
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Factorize\n",
"\n",
"One option [often suggested](http://stackoverflow.com/q/25530504/1889400) is to *factorize* the non-numeric columns.\n",
"Factorization maps each distinct value to an integer code."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Codes: [0 0 1 1 1 0 1 1 2 3 3 1 2 2 2 0 0 1 0 0 2 1 3 1 1]\n",
"\n",
"Labels: Index(['Sun', 'Sun', 'Sat', 'Sat', 'Sat', 'Sun', 'Sat', 'Sat', 'Thur', 'Fri',\n",
"       'Fri', 'Sat', 'Thur', 'Thur', 'Thur', 'Sun', 'Sun', 'Sat', 'Sun', 'Sun',\n",
"       'Thur', 'Sat', 'Fri', 'Sat', 'Sat'],\n",
"      dtype='object')\n"
]
}
],
"source": [
"codes, labels = pd.factorize(df['day'])\n",
"print('Codes: ', codes[::10], end='\\n\\n')\n",
"print('Labels: ', labels[codes[::10]])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So every occurrence of `'Sun'` becomes `0`, each `'Sat'` becomes `1`, and so on.\n",
"\n",
"We could assign the factorized values into a new DataFrame:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>total_bill</th><th>sex</th><th>smoker</th><th>day</th><th>time</th><th>size</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>16.99</td><td>0</td><td>0</td><td>0</td><td>0</td><td>2</td></tr>\n",
"    <tr><th>1</th><td>10.34</td><td>1</td><td>0</td><td>0</td><td>0</td><td>3</td></tr>\n",
"    <tr><th>2</th><td>21.01</td><td>1</td><td>0</td><td>0</td><td>0</td><td>3</td></tr>\n",
"    <tr><th>3</th><td>23.68</td><td>1</td><td>0</td><td>0</td><td>0</td><td>2</td></tr>\n",
"    <tr><th>4</th><td>24.59</td><td>0</td><td>0</td><td>0</td><td>0</td><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" total_bill sex smoker day time size\n",
"0 16.99 0 0 0 0 2\n",
"1 10.34 1 0 0 0 3\n",
"2 21.01 1 0 0 0 3\n",
"3 23.68 1 0 0 0 2\n",
"4 24.59 0 0 0 0 4"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat_cols = ['sex', 'smoker', 'day', 'time']\n",
"\n",
"X_factorized = X.copy()\n",
"X_factorized[cat_cols] = X_factorized[cat_cols].apply(\n",
"    lambda x: pd.factorize(x)[0]\n",
")\n",
"X_factorized.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And fit the regression:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from sklearn.linear_model import LinearRegression"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/tom.augspurger/Envs/mtg/lib/python3.5/site-packages/scipy/linalg/basic.py:884: RuntimeWarning: internal gelsd driver lwork query error, required iwork dimension not returned. This is likely the result of LAPACK bug 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to 'gelss' driver.\n",
"  warnings.warn(mesg, RuntimeWarning)\n"
]
},
{
"data": {
"text/plain": [
"LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lm = LinearRegression()\n",
"lm.fit(X_factorized, y)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"total_bill 0.094076\n",
"sex -0.029216\n",
"smoker -0.081041\n",
"day -0.007834\n",
"time 0.005721\n",
"size 0.179367\n",
"dtype: float64"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.Series(lm.coef_, X_factorized.columns)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We successfully fit the model.\n",
"However, there are several problems with this approach.\n",
"\n",
"First, the order in which the unique values appear *before* factorization becomes important.\n",
"If we wanted to predict for a related dataset using the same model, we would need\n",
"to ensure that the original values get factorized to the same numerical values.\n",
"If the order differed, say `sex='Male'` came first instead of `sex='Female'`, values for `'Male'` would be factorized to `0` instead of `1`, and our predictions would be incorrect.\n",
"We would like our data container to store the relationship between category and numeric code for us.\n",
"\n",
"Second, it asserts that the difference between any two \"adjacent\" categories is the same.\n",
"In our linear model, this implies that a jump from `'Sun'` to `'Sat'` changes the predicted tip by the same amount as a jump from `'Sat'` to `'Thur'`.\n",
"\n",
"$$\n",
"\\frac{\n",
"    \\Delta{\\text{tip}}\n",
"}{\n",
"    \\Delta({\\text{Sun.} \\rightarrow \\text{Sat.}})\n",
"} = \\frac{\n",
"    \\Delta{\\text{tip}}\n",
"}{\n",
"    \\Delta({\\text{Sat.} \\rightarrow \\text{Thur.}})\n",
"}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can plot what this looks like:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAoMAAAHRCAYAAAARwLPvAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xl8FPX9P/DX7Jns5r4TkhCSkBDuI9ynBFQU630WK63a\nim09aq1ttbWt2mq/amtL0Vpbfipab/CoggLKodz3kUBCLnLf59478/sjZJNlNxBwk9nsvJ4+fJj5\n7MzueyUkr/3M5xAkSZJARERERIqkkrsAIiIiIpIPwyARERGRgjEMEhERESkYwyARERGRgjEMEhER\nESkYwyARERGRgskWBg8fPoy5c+e6jmtra/HjH/8Y06dPx5w5c/Dkk0/CbrfLVR4RERGRIsgSBt97\n7z3ceeedcDgcrraf//znSExMxPbt2/Hhhx/iyJEjWLVqlRzlERERESnGoIfBl156CWvWrMGKFStc\nbXa7HUajEStWrIBWq0V0dDSuuuoqHDhwYLDLIyIiIlKUQQ+DN9xwA9atW4exY8e62rRaLV566SVE\nR0e72r788kuMGjVqsMsjIiIiUpRBD4MxMTHnPefJJ59ESUkJfvjDHw5CRURERETKpZG7gN6sVise\nfvhhFBYWYs2aNYiKiur3tc3NzWhpaXFrczqdsFqtyM7OhkbjV2+ViIiIyC/4TUJqbW3FXXfdhZCQ\nELzzzjsIDQ29oOvXrFmDlStXen1s06ZNSE5O9kWZRERERAHFb8LgT37yE8TGxuLvf/871Gr1BV+/\nbNkyLF261K2tpqYGy5cv91GFRERERIHHL8LggQMHsHfvXuj1euTm5kIQBADAmDFj8Prrr/frOSIj\nIxEZGenWptVqfV4rERERUSARJEmS5C5ioFRUVCAvL4+3iYmIiIj6wO3oiIiIiBSMYZCIiIhIwRgG\niYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhI\nwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCI\niIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSM\nYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiI\niBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgG\niYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhI\nwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCIiIhIwRgGiYiIiBSMYZCI\niIhIwRgGiYiIiBRMtjB4+PBhzJ0713Xc1taGn/zkJ8jNzcXChQvx3nvvyVUaERERkWJo5HjR9957\nD8888ww0mp6Xf+yxx2A0GrFjxw7k5+fj7rvvRlZWFsaPHy9HiURERESKMOg9gy+99BLWrFmDFStW\nuNpMJhM2bdqE++67D1qtFuPHj8dVV12FdevWDXZ5RHQRJEnCyYZinG6tkrsUIiK6QIPeM3jDDTfg\nnnvuwe7du11tpaWl0Gq1GDZsmKttxIgR+OKLLwa7PCK6QCabGX/46q8obi4HAMxKzcX9M34AQRBk\nroyIiPpj0MNgTEyMR5vZbIZer3drCwoKgsVi6ffzNjc3o6Wlxa2tpqbm4ookon774tQ2VxAEgG/K\n92JR+hyMjc+WsSoiIuovWcYMni04OBg2m82tzWKxwGAw9Ps51qxZg5UrV/q6NCI6jyZzS7/aiIjI\nP/lFGBw+fDjsdjtqamqQkJAAACgpKUFGRka/
n2PZsmVYunSpW1tNTQ2WL1/uy1KJ6CyzU3Oxvugr\nSJIEADDqDJicOFbmqoiIqL/8IgwajUYsXLgQzz33HJ544gmcPHkSn3zyCV5++eV+P0dkZCQiIyPd\n2rRara9LJaKzZMWk49F5P8XGU9sRpNXjquxFCNEb5S6LiIj6yS/CIAA88cQTePzxxzF//nwYjUY8\n8sgjXFaGaIgYn5CD8Qk5cpdBREQXQZC67+0EoIqKCuTl5WHTpk1ITk6WuxwiIiIiv8Pt6IiIiIgU\njGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomI\niIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEY\nBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiI\nSMEYBomIiIgUjGGQiPpNlES5SyAiIh/TyF0AEfm/baW7sebQB2i3dWJe2nTcNeVWaFRqucsiIiIf\nYM8gEZ1To6kZq3a/imZLKxyiA5uLv8aGwq/kLouIiHyEYZCIzqm4uRzOs24PFzaVylMMERH5HMMg\nEZ3TyKg0aFTuI0pGx2bKVA0REfkawyARnVNEcDgenHUXkkLjYdQZsDQrD4vS58pdFhER+QgnkBDR\neU0dNgFTh02QuwwiIhoA7BkkIiIiUjCGQSIiIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIiUjCGQSIi\nIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIiUjCG\nQSIiIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIiUjCGQSIiIiIFYxgkIiIi\nUjCGQSIiIiIFYxgkIiIiUjCGQSIiIiIF86swuHnzZlx11VWYPHkylixZgk8++UTukoiIiIgCmkbu\nArpZLBY88MADeO6557B48WLs3bsXy5cvx+TJk5GUlCR3eUREREQByW96BgVBgNFohN1udx1rtVqo\n1WqZKyMiIiIKXH7TM6jX6/H000/jvvvuw8MPPwxJkvDUU08hPj5e7tKIiIiIApbf9AxWVlbioYce\nwlNPPYVDhw7hxRdfxFNPPYUTJ07IXRoRERFRwPKbnsGNGzciJycHS5cuBQDMnz8fCxYswLp16/DI\nI4+c9/rm5ma0tLS4tdXU1AxIrURERESBwm/CoF6vh81mc2vTaDTQaPpX4po1a7By5cqBKI2IiIgo\nYPlNGFywYAGee+45rF27Ftdeey12796NjRs34rXXXuvX9cuWLXP1KnarqanB8uXLB6BaIiIiosDg\nN2EwISEBL730Ep5++mn88Y9/REJCAp555hmMHj26X9dHRkYiMjLSrU2r1Q5EqUREREQBw2/CIABM\nmTIF7777rtxlEBERESmG38wmJiIiIqLBxzBIREREpGAMg0REREQKxjBIREREpGAMg0REREQKxjBI\nREREpGAMg0REREQK5lfrDBJRYDpYfQwfn/gCkgQszc7D5KRxcpdERERnMAwS0YA63VqFp7etgiiJ\nAIDj9YV45tJfYXhEssyVERERwNvERDTA9lUdcQVBABAlEfuqjshYERER9cYwSEQDKjE0zqMtIcSz\njYiI5MEwSEQDamrSBMwdPg3CmX/mpE7FjORJcpdFRERncMwgEQ0olUqFn874Pr47/lpIkBBtiJS7\nJCIi6oVhkIgGRZQhQu4SiIjIC94mJiIiIlIwhkEiIiIiBWMYJCIiIlIwhkEiIiIiBWMYJCIiIlIw\nhkEiIiIiBWMYJCIiIlIwhkEiIiIiBWMYJCIiIlIwhkEiIiIiBWMYJCIiIlIwhkEiIiIiBWMYJCIi\nIlIwhkEi
IiIiBWMYJCIiIlIwhkEiIiIiBWMYJCIiIlIwhkEiIiIiBWMYJCIiIlIwhkEiIiIiBWMY\nJCIiIlIwjdwFENHQZXPYsLVsF+o6GzE9eRIyoobLXRIREV0ghkEiumjPbH8RR2oLAAAfFXyBX879\nMSYmjpa5KpJT485daC84gdCcUYiePk3ucoioHxgGieiiVLRWu4IgAIiSiPVFXzEMKlj5m2/h9Nvv\ndh2s/RApt96M1FtukrcoIjovjhkkoouiUak92rQqfr5UsqpP/ud2XP3x//o4k4j8CcMgEV2UhNA4\nzEmd6jrWqbW4KnuRjBWR3FQarduxoOWHA6KhgH9Tieii/WTGcsxNm4b6zkZMThqHGEOU3CWRjFJu\nvhHFL7/idkxE/o9hkIgumkpQYVLiWLnLID+ReOUShI7KRntBAUJHjUJIRrrcJRFRPzAMEhGRz4Rk\npDMEEg0xDIN0QZxOEWvWF2DrgQrERATj+1eNwajhvDUYCOxOO944tBa7Kg8iISQWd0y8AWmRKXKX\nRUREA4wTSOiCfLi1GO9tLkRdsxnHS5rwh1d2wmJzyF0W+cD7xz/Dp4VfotHUjGN1J/HMthfhFJ1y\nl0VERAOMYZAuyMGTdW7H7SY7TlW0ylQN+dKRmny340ZzMyrbamSqhoiIBgvDIF2Q9GHhbsdajQrJ\ncSEyVUO+NDwi2e04WBuEuJAYmaohIqLBwjBIF+TGvCzk5sQDAEINOtx300SEh+hlrop84ZZx38GY\nuCwAQERQGH487Q4EafhnS0QU6ARJkiS5ixgoFRUVyMvLw6ZNm5CcnHz+C6jfOs126HVqaNT8PBFo\nOm0mBGn0UHvZYYSIiAIPZxPTRTEGa89/Eg1JRp1B7hKIiGgQsVuHiIiISMEYBomIiIgUjGGQiIiI\nSMEYBomIiIgUjGGQiIiISMEYBomIiIgUjGGQiIiISMH6vc6gzWbD1q1bUVhYCLVajezsbMyZMwdq\nte8Wpq2trcXjjz+OPXv2IDQ0FHfeeSduv/12nz0/EREREbnrVxgsKSnBnXfeiebmZowYMQKiKOLF\nF19EcnIyXnnlFcTHx/ukmHvvvRczZ87EqlWrUFJSgttuuw3jxo3DxIkTffL8REREROSuX7eJf/vb\n32Ls2LHYtm0bPvjgA6xbtw5btmxBcnIyfve73/mkkEOHDqG+vh4PPfQQVCoVMjIy8Pbbb2PEiBE+\neX7q4hQlHCtuRHlNm9ylEBERkR/oV8/gkSNH8P777yMkJMTVFhYWhp/97Ge46aabfFLIsWPHkJmZ\niT//+c/4+OOPERISgnvuuQfXXHONT56fgJZ2K361ajsq6joAAIunpeK+myfJXBURERHJqV9hMD09\nHUeOHEFGRoZbe1lZGVJSUnxSSGtrK3bt2oWZM2fiq6++wpEjR3DXXXchJSUFU6ZMOe/1zc3NaGlp\ncWurqanxSW2B4qNtp1xBEAC+2F2OK2aNQGZKhIxVERERkZz6FQavvfZaPPXUUzh58iSmTJkCjUaD\no0eP4tVXX8X111+PdevWuc692J48nU6HiIgI3H333QCASZMm4dJLL8WmTZv6FQbXrFmDlStXXtRr\nK0VTm6VfbUTt1g5sKNqKFnMr5gyfilGxmXKXREREA6RfYXD16tUIDQ3F+vXrsX79eld7SEgINmzY\ngA0bNgAABEG46DA4YsQIOBwOSJIEQRAAAKIoQpKkfl2/bNkyLF261K2tpqYGy5cvv6h6AtGCycnY\nvPc0uv+XRobqMX5kjLxFkd8RRRG/+/IvON1aBQD4ongbHp33U4xPyJG5MiIiGgj9CoObN28e6Dow\ne/ZsBAcHY+XKlbj33ntx6NAhbNy4EatXr+7X9ZGRkYiMjHRr02q1A1HqkDUxKw6/vXMGNu4uR6hR\nh+sWZCJI1+/VhUghTjSecgVBAJAkCZuLv2YYJJ9xdHai/cRJGFJToY+Jlr
scIsXrMwmUlJQgLS0N\ngiCgpKTknE/iixm/er0er7/+On7/+99j1qxZCAkJwW9+8xuMHz/+Wz839cjNiUdujm+WAqLAZNQa\nPNoMOs82oovRdjwfx5/4I5wmE6BSIeOeu5Fw2aVyl0WkaH2GwSVLluDrr79GdHQ0lixZAkEQvN6y\nFQQB+fn5PikmJSUFr7zyik+ei4guTmrEMMxJnYrt5XsAAGH6ECzNzpO5KgoUZa+/0RUEAUAUUfrq\n64hbeAlUvJNDJJs+w+CmTZsQFRUFoGsCyR133IHQ0FC3c1pbW7Fq1aqBrZCIBt19M3+AxZlz0Wxu\nw8TE0TBog+UuiQKErbnZ7dhpMkO0WhkGiWTUZxisqqrC9u3bAQDr1q3DyJEjYTQa3c4pLi7Gjh07\nBrZCIpJFTuxIuUugABR3yQKUv/mW6zgydzI0vdawJaLB12cYDA0NxcsvvwxJkiBJEl577TWoVD0b\nlgiCAIPBgF/84heDUigREQ19yTfdAG14OJoPHIQxbTiGXfMduUsiUjxB6sfaLbfffjtWrlyJ8PDw\nwajJZyoqKpCXl4dNmzYhOTlZ7nKIiIiI/E6/1hV5/fXXB7oOGgD7Cmqx/WAV4qIMuGrOCIQYdHKX\nRERERH6Gi8wFqK8PVeHp1/a4jncfr8FfHpgvY0VERETkjxgGA9Tnu8vcjotOt6CkqhUjkjxv9Zut\nDqx67xB2HK1GYrQR91w3HmPSuRAsERGREqjOfwoNRaHB7reEBQEwBnlfuuHNDQX4an8FrDYnSqvb\n8KdXd8PucA5GmURERCQzhsEAdWPeSIQaesLflbNHIC7K+y4Sx4ob3Y5bO2yorO8c0PqIiIjIP/A2\ncYAanhiGVx5djIMn6xEXZUBmckSf545Ki0Lh6RbXcahBh6QYY5/nExERUeBgGAxghiAtZo1POu95\nyy4fhaZWC3YerUZCtBH33jAeOq16ECokIiIiuTEMEgxBWvzyjqlyl0FEREQy4JhBIiIiIgVjGCQi\nIiJSMIZBIiIiIgVjGCQiIiJSMIZBIiIiIgVjGCQiIiJSMIZBIiIiIgXjOoPUL1v2V2Dn0WokxYbg\n2vkZCDHozn8RERER+T2GQQIASJIEABAEweOx9TtK8Y/3DrmOjxQ14M8/nTtYpdEgqemoR3FTObJj\n0hFtiJS7HFKohm92oGH7NwiKj0PS1d+BLiJc7pKIAh7DoALVNZmwdksR2jptWDQ1FcdLmvDh1iKo\nVCrclDcS110y0u38zXtPux3nlzahqr4DSbEhg1k2DaBNp7bj5X1vQpIkqFVqPDjzLkxLnih3WaQw\ndV9tReFfXnAdtxw4iAl/edbrh1Qi8h2GQYWx2p14ZOU2NLRaAABbD1T2etSJ1Z8cx6i0KIweEe1q\nDQ9xvyWsUQu8TRxAREnEm0c+dPUOO0Un/nv4Q4ZBGnT1X37ldtxZUorOklKEpI+QpyAihWAYVACn\nKOHgyTpIEuBwiq4g2JcTZc1uYfDWS0fhWHET2k02CAJw8+JshBkZBgOFJEkw2c1ubR12k0zVkJJp\nz74lrFJBGxYmTzFECsIwGOBsdid++Y/tKDzdAgBIjDGe95ox6dFux+nDwvHvxxbjWHEjkmKMvD0c\nYNQqNRakzcSm4u2utrz0WTJWREqVfOP1aDl0GPbmrp9Xw669GvqY6PNcRUTfFsNgAHM6RXx9uMoV\nBAGguqET2cMjcaKsGQCQGG3EvEnD8PmuMqhVAm5clIWMYZ4DtoP1GuTmxLuObTYHdDp++wSKu6bc\ngrSIZBQ1lWJMXBbmp83o81yH6IRGpT5vGxEAiA4HAECl0biOu78+myE5GVP+uQptx45DFx0NQ2rK\noNVJpGSC1D1QKABVVFQgLy8PmzZtQnJystzlDBqTxY6/vnUAu45WwxisRbvJ7vb4967IQW5OPNo6\nbBiTEQ2Numu5yR1HqvDy2iNoardi9v
gk3HfzRASdFfi+3HcaK985CJtDhF6nxkO3TcbMcUmD9t5I\nPo2mZvx952ocry9ESlgifjz9DkQZIvH3natxpLYASaHxWDHtdmTHZMhdKvkBa2Mj8p98Gp3FxQCA\n4ORhAASYKyoQmjMKWT+7H0FxcR7XSaKIkldWo/aLjVDpdUi97VYkXnH5IFdPpCxcdDoAvbnhBHYc\nqYYowSMIBuvVmDtxGEYkhWNCVqwrCHaYbHj2jf1oaLVAFCVsO1iJ9zYVul3bYbLhhbcOwOYQAQBW\nmxP/9/peWKyOwXljJKv/7H8bx+u7vidOt1XjhR3/wav738WR2gIAQFV7Lf66498QRVHOMslPFP/z\nFVcQBABzRSXMFRUAgPb8Apx68WWv19V/tQXV//sUos0GR3sHiv/5L3SWlQ9KzURKxft8AehkebNH\n24LJyQgz6nD5zDQkRHuOGyyraYfN7jzn85RWt8Epunck250SKuo7kJkc4YPKyZ8VNZW6HVd31MEp\nuX/PNJqa0WJpQ5SB3w9K136y8JyPdxR6f7z9ZJGXc4tgHJ7qk7qIyBN7BgPQ2Az3AddhRh1+etNE\n3H3NOKTEh3q9ZkRSGAxB7p8NxmbEuB2nDwt39SR202vVSO3jOSmw5MS6rz+ZGj4MY+Oy3doSQmIR\nGcxFggkIHzf2nI+HjRnTR/to9waVCmFjcnxVFhF5wZ7BAHTz4my0ddrw9aEqJEQbcPc146DTnntw\nvyFIi18vn4Z/f3QUDS1mzJuUjGsXZHqc8+j3p+H5N/eh3WRHuFGHX3wv97zPTYHhB5NvhlN04nBt\nPkZEpODu3NsQERQGi9OGg9XHkBKehLum3MIFggkAkH73nXC0taHl8BEAQGh2FgRBQGdpGcLHjUHG\nih96vS527myYKytR89kGqIODkHrbLQhOTBzM0okUhxNIiIiIiBSMt4mJiIiIFIxhkIiIiEjBGAaJ\niIhIEWxcCs0rTiAhD06niH0FdbA5nJg6OgF6ThAhIqIhrL6mHR+8sR+1VW2ITwzDtcsmIy6BK2F0\nYxgkN3aHiF/9YztOlPdsV/fs/fMQZtTJXBkNZTaHDRqVBioVb0YQ0eD76J1DqK1qAwDUVrfh47cP\n4s7758pclf9gGCQ3u4/VuIIgAFQ3dmLj7jJcd8nIc1xF5J3FbsHKXa9iT+UhhAWFYvmkGzA7darc\nZRGRwlRXtLgdV1W0ylSJf+LHdHJjtto92kwcY0EXaV3BBuyuPAgJElotbVi16zW0WTvkLouIFCbt\nrE0URmRG93GmMjEMkpsZYxMRFaZ3HQfr1Vg4JUXGimgoK25y31PWLjpQ0VolUzVEpFRX3zIRWWPi\nYTDqkDUmHt+5ZaLcJfkV3iYmNyEGHZ67fz7W7yyF3S5i0bRUJMWGyF0WDVFj47NxsOa469igDUZ6\nJPeYJaLBFRoehFt+ME3uMvwWwyB5iIkIxrLLuRcofXtXZuWhxdKOr8v2IMoQgdsnXI8gbZDcZRER\nUS8MgwHG6RSxN78WJqsD08ckwBCklbskUjC1So3vTbwe35t4vdylEBFRHxgGA4hTlPDoS9/gWHEj\nACAqLAjP3T8PMRHBMldGRERE/ooTSALIwZN1riAIAE1tFqzfUSpbPUREROT/GAYDiMXm7FcbERER\nUTeGwQCSmxOP+CiD61inVWPxNM7cJCIior5xzGAA0WvVeO7+ediwswwmix0Lc1OQmhDmetxsdUCl\nErjXMBEREbkwDAaY8BA9blqU5dbmFCW8+P4hbNxdDrVahesWZOK7l4+SqUIiIiLyJwyDCrDtYCU2\n7CwDADhFJ9764gQmZsViTDq346FzK2+pxNr89ei0mbAwfTZmpEx2e7zD2ondlYcQojNgctI4aFTs\ndSaiwXXsQCUO7jkNg1GHOXkjEZsQKndJQw7DoAKUVHpuyF1a1cowSOdkspnx+JfPo9NmAgAcrDmO\nR+
f/FBMSRgMA6job8esvnnbtNZwTm4nHL3kQKoFDkYlocJw8Xov31+x3HZ86UY/7Hs2DTs94cyH4\nU1sBJmbFuh2rVALGj4zt42yiLodr811BsNuO8n2urz8v2uIKggCQX1+Eo7UnBq0+IqLjB933Ojd1\n2lBS1CBTNUMXo7MCTMqOw73Xj8dH24qh06hx06IspMSzG53OLdbo2XPcu83mtHs87q2N6EI5TCY0\n7dwFSZQQMWUS9JGRcpdEfio80nNThYheq2pQ/zAMKsSSWSOwZNYIucugISQjajguy5yPz4u2QoKE\njKjhuHzkAtfjeemzsbn4a1cATAqNx8Qzt5CJLpa9rQ2HHvoFrHX1rraYuXMw8oGfQqXhryxyN31e\nOgrza1FT2QYIwPS56YhPDDv/heRGkCRJkruIgVJRUYG8vDxs2rQJycnJcpdDNCTVdzai02ZGWqTn\n36HKthpsK9sFo9aIS9JnIkRnlKFCCiSV6z5C6epXPdoz7v0REi67VIaKyN9JkoTaqjYEG7QIj2Sv\n4MXgxywiOqdYYzRi+8h4w8IScMu4qwe3IApoot37UIPO0rJBroSGCkEQkDAsXO4yhjS/nEDS0NCA\nWbNmYcuWLXKXQkQXqMXShs+LtmLn6f2wOGzYeXo/vijahlZLm9yl0RAQt2A+1AbP3p3ISRNlqIZI\nGfyyZ/DRRx9Fa6vncihE5N8q22rw2MY/o9NuBgCoBBVESQQAvHlkHf646BEkhsbJWSL5OX1sDCb9\n7XmUvf4mWo8eg0qvQ9LSKxA1barcpREFLL8Lg2+99RaMRiMSEhLkLoWILtBnhV+6giAAVxAEgE6b\nCRsKv8LyyTfJURoNIfrYWGT97H65yyBSDL8KgyUlJVi9ejXeffddXHPNNXKXM6RU1LXjT/9vD+pb\nzMgYFo7f3DkdOq0a728uxL6COgxPDMN3LxuF03Xt+ODLIoiihKvmpiM3J/6cz7tlfwU+31UGY7AW\nNy/KQkZyxCC9I/J3dqcda/PX40hNAdIiU3DT2KVwiM5zXyM6Bqk68heSKKLq40/QtHM3ghISkHzj\ndaj66BM0fL0DgkaN0Kws2NvboY+KRMrNN8GQ4j5RqXHnLlR/uh7qID2Sr78OodlZfbwSKZ0kSdjz\ndSmOH6pCRKQB8y/LQmQ0J7X1h9+EQafTiUceeQS/+c1vEBbGaeEX6ucvbEWnpesX7dHiRvxq1deY\nnB2H9zYXAgDyS5tQeLoZZdXtcDi7emsOFtbjufvnIbOPgLc3vxbPvtGzyPDhwnq88uhihBh0A/xu\naChYc2hK4PQBAAAgAElEQVQtPiv8EgBworEYNR11uHXc1dhWtht2L+sNalUaLM6YO9hlksyqPvwY\npf/vNQBA2/F8NOzcBdHUs5h5085dAIB2AK1HjyH35Reh0nX9jGk9dhwFT/8fcGbRi5aDhzH5xZXQ\nR0cN7pugIWHvN2VYv/YoAKAcTThd2oQf/3IhVCpB5sr8n99MIPnHP/6BnJwczJkz56Kub25uRklJ\nidu/p0+f9nGV/qmqocMVBLuVVLXim8PuK7Ofqmh1BUEAEEUJO49W9/m8Z1/faXHgYGF9H2eT0uyq\nOOB2fKgmH4mh8fjzpb/G0uw8RAV3fcgQICAndiT+77JHkRaZIkepJKPGHTvdjnsHwbPZm1vQll/g\nOm7aucsVBAFAtFrRsn+/t0uJkH/Y/fdZc6MJNV62YyVPftMz+Nlnn6GhoQGfffYZAKC9vR0PPvgg\nVqxYgbvvvvu8169ZswYrV64c6DL9UmRoEAQAvReMDNZrkBBjRFVDp1ub2eoeGhP76EIXRQll1Z6z\nPxPY5U5nxIfEoMnc4jqODAqHXq3DsLAEfG/iDbh9wvU43VqFMH0IIoK57INSBSUkoP3Eyf6dLAgI\niu+ZYBSU4DmMJYjjyakPkdEGlBb1HKvUAsIjPHcoIU9+FQZ7W7hw
IR5//HHMnz+/X9cvW7YMS5cu\ndWurqanB8uXLfVWi3wrWa3D1/Ays23IKACAIwIrrxyMtMRzl1W1oaLUgSKfGvdePx74TdfhqXwUA\nYNroBMyb5H0x7t3Ha3DydItb2/QxCX3eUibluWPiDXh62yq0WNoQrAnCnVNugUrVc7NBEASkRgyT\nsULyB6m33Yz2wkJYqqohaLVIXLoENZ99DtFiAQAIajUkpxOCWo2Um290C3txi/LQtHsvWg4eAgQB\n8ZcuQvi4sXK9FfJz8xZn4XRpExpqO6BWq5C3NAfGUL3cZQ0JfrsDSV5eHn7729/2Owx6o7QdSBpb\nLThW3ICpoxMQrO/K+U6niLKadiREG2AI0gIAaptMcIoikmJC+nyutzeewJrPCtzabl+Sg5sWcfA2\n9XA4Hahoq0ZCSCyCtEFyl0N+ShJFmMrLoYuOhjY0FKLdjraCAqh0eoSOzIS5ogKasHDoIrz3IJsr\nq6DS6aCPjRnkymmokSQJdTXtCA3VwxDCINhffhsGfUFpYdCXik634GcvbHEN1xEE4PkH5rNnkIiI\nKMD4zW1i8i+ZKRF4eFku1n7VNQDjuksyGQSJiIgCEMNgAKprMmHH0WpEhQZh5vhEaNQXN2l87sRh\nmDuRY74C3Z7KQ/j05GZoVGpcPepSjI0fNSCvU1B/Cmvz18PisOKyzHmYlZo7IK9DREQXhmEwwJyq\naMEv/7EdFlvX4r8TR8biiXtmoa7JhPBQPfRatcwVkj8paizFs9v/CenMXPRjdYV4fslvkRAS69PX\naTK34MktL8B2Zv3B/PpChOlDBix4EhFR//nNOoPkGx9tK3YFQaBrYekf/Wkj7nzqC9zxu/XYeqBC\nxurI3+ytOuQKggDgEB3YX3XE569zuCbfFQS77a485PPXISKiC8cwGGC8zQfqXmuw0+LAyncPeaw1\nSMqVEBLn0ZYY6tn27V/Hs6cx0ctrExHR4GMYDDDz+1g3sJvZ6kBDi3mQqiF/Nyd1KqYNmwiga6eQ\nhSNmYWLCGJ+/zqjYTCwZeQkEoWtbqIkJo5GXPtvnr0NERBeOYwYDTPdagn1JijFiWKz7+oKNrWa8\n9mk+ymrakDsqHjcvzoZWw88JSqBRa/DzOT9CfWcj1IIaUYYIiJKIPRWHUNVei8mJY322cPT3J9+E\na3Mug81pR1wI14sjIvIXDIMBJjMlHJGhejS3W11tE7Ni0dxmQVJsCJZfOdpj0+4nV+9G0ZndRrr3\nL16+1Pe9Q+S/Yo3Rrq9f2rMGX5XsAAC8deQjPDznHkxJGueT1+G2dERE/ofdPwFGq1HjDz+ahWmj\nEzAiKQzfuyIHv797Jn7/w5kI1mvwzGt78er/jsPu6Jpk0thqdgXBbjuP1shROvmBFnMrtpTsdB2L\nkoiPC76QsSIiIhpo7BkMIJ1mO9pNNqQlhuE3d053tdc0duKZ1/aiqKIr9BVXtcLuEHFj3kjYHU6E\nGrRoN/XM9EyMNri+bmozo6K2A+MyY1zjvUhZAnaLIpKFpa4OGoMBmpC+t8MkGijtbRYIgoAQ7lns\nhmEwQKzbUoTXP82HzSEiJy0Kj/1gOiRJwhP/2YUTZc0e53++sxQfby+GKEpIjQ91C4OHihpQVtOG\nD74swua9pwEAWo0Kl05PRfbwKMyZMIxjCgNURHA45qZNw9bSXQAAQRBwVXaeT1/jQPVRlDSfxti4\nbGTFpPv0ucl/OUwmFPzxGbQeOQpBo0HKTTcg5eYb5S6LAoTd7sSxA1UwdVoxekISIqIMbo+LThHr\n/nsQRw9WQgAwaXoqrrxhPDs5zuDexAGgrtmEu5/6AmKvP8nrL8mEU5Swbsupi3rOEYlhKKlu8/pY\n90LWFJhEUcTuyoOobKvBlKRxSItM8dlzv3l4Hdblb3Ad/yj3u8jLmOOz5yf/dfrtd1H+5ltubZP+\n8QIMAfyzmQaHJEpYvfJrVJzp
+NDq1PjBT+cgPinMdc7R/ZX44I39btfdetc0jMyJH9Ra/RW7dwJA\ndUOnWxAEgIq6DlTWd1z0c9Y1m/p87GBhPU5VtPT5OA1tKpUKM1Im4/oxV3yrIGhz2FDVVgNRFAEA\ndqcdn57c7HbORxyPqBjmyqp+tRFdqPKSJlcQBAC7zYk9X5e4ndNQ5/n70FubUvE2cQAYlRaF8BAd\nWjtsrrbpYxIgShL2HK/1eo1KECCeo1N43qRkfL6rDM6zUyYFPJvDhq/L96LV2o4ZKZMvamu6vZWH\n8Y/dr6LTZkKsMRqPzFkxIItZ09ARNS0X9Vu2uo7VRgPCx/a9akH9lm2o/t+nUOl0SL7hOkRMnDAY\nZVKAGjk6Hls3nnQNglapBIwcxZ9J3RgGA4Beq8YTP5qFN9YXoLHNgksmJ2Px9OEAALPViS37TyMk\nWAe1WkBzmxVTR8djdHo0PviyEDa7iCtnj0BclAH/XHsYbR02zJ6QhNuX5GDG2AT87Z2DaOu0weEU\n0Z0dJ2bFIiM5QsZ3TANFkiQ8seVvONHQNbzg/WOf4g95P8eIfvQQtlnaYXHaEB0cgZf3voFOW1fv\ncn1nI147+D4eW3Afrsha6Hab+OqcSwfmjZDfiZkzG/b2DtRt+hLasFCk3HITNEaj13NbjxzFyef/\n6jpuyy/A5FV/R1A8f3mTp9QRUUgeHul2m3jqnBFu5wxLjcCN38vFzq3FUKkEzLokAzHxoXKU65c4\nZpD6paqhAzsOVyM6IhizxydxAkmAKqg/hd9uftat7ZIRs7Bi2u3nvO6NQ2vxyYmNcEoiRsdm4Xj9\nSbfHYwxRWHXVUwCAg9XHUNxczgkk1KeS1a+iat1Hbm0ZK36EhMv54YG8s9udOH6wCp0dNoyekOgx\ngYTOTRE9g79e9TUio+MRpFcjSK9BsE6DIL36zH/dvw4+82+QTu06DtKpEazXQK/TQK0aWjOPNu4u\nx6fflCBIp8HNi7MwYeSF3/IDgKSYEFy/cKSPq6Oh4Hyz7YqbyvFhweeu4+P1JxFnjEZdZ6OrLXfY\neNfXExPHYGIiFzWnvgUP89z1JjjZNzvhUGDSatWYMNV3k92URhFhsK7ZhGZrq0+eS6dVI1ivRpDu\n3MExSK85c85Z5/Y67gqmaqjVA9PLtv9EHV54+4DruKCsCf/85SLERgYPyOvR0Jcdk46c2JHIry8E\nAOg1elyeueCc11S1e45LzYrOQGr4MBQ1lSIjajhuHXf1QJRLASrukvloOXAQjd/sgKBWI+GKJecc\nX0hE344iwqAv2exO2OxOtMJ2/pP7SatR9QTH3j2XrsDYExyD9O5fe5x7JoRqNSrsOe6+k4jdIeLr\nw5W4Zn6mz2qnwCIIAh6b/1PsOL0frZZ2zEiZ5LZVnTfj40dBr9HD6ujZAjEhJBbv538KSZKwr+oI\nXtqzBg/Oumugy6chSHI6Uf7WO2jauQtBiQlIuOIKGFNTMOqRn8PW1AxBo4E2jGO76OK1NJmw8ZPj\nqK9pR2ZOPC5Zkg2NRi13WX5FEWMGRyz8JbSGKLnLGVQatQC1SoDVLno8lhBtwPjMGBiCtG4h9Oye\nzKBePZ3B+q6AyQU6yZuTDcV4//hnMNlMWJQxF1tKd+Jo3Qm3c1YtfQoWhxU7Kw4gOjgCs4dPhU6t\nlali8hen33kP5W/8171RJSBxyeVI/yE/QNC399KzX6Guut11PHNBBhZfNVrGivyPInoGn7xnFiKi\n4mG2OWCxOmCxOl1fd/3XCYvVAZPVAcuZY3Pvr7uvsznlfiv95nBKcDi95/yaRhNqGssv+DlVKqGn\nd9IjRLoHR8+ezF49mL3GbOp1agbMIcBit2Bb2R602zowKzXXY7mZrJh0/Grej13HW8t2uT0uQMCp\npjL8dee/4RS7/h5tL9+N3yx4YOCLJ7/WtGevZ6Moofp/nyF69iyEj+Evbbp4rc0mtyAIAIX5tQ
yD\nZ1FEGEyINiLZB0uhiKIEq93pCpFmS1dA7AmODpitTlhsjjNtZ8619pxntnYHy55zh0rfrChK6LQ4\n0Glx+Ow5BQFuPZK9x156C5FdYdN7CO3uyQzSaaAaYhN9/Jkoinj8y+dR0ty1NeHa4+vx5KKHMTyi\n7xn6c4dPw9HaE5DOLOql1+iw+sA7riAIAEdqT6C0+bRPdzihoceQkoKOk4VeH7NUVzMM0rdiDNUj\n2KCFudeWq7FcUsaDIsKgr6hUgqvnK9JHzylJ3QGzV4g80xvpCo5WB8w2p0dPprlXz2VXMO05Vxwi\ni0VLElwhGbCe9/z+0uu6eh49J+2o+wiRGhi6H/fWkzmAE3383dG6E64gCABWpw1fFG3DXbm3ej2/\nvKUS/97/tisIAoDFYYXF4fnnqxKU+f+UeqR+9xZ0lpah85T71pkqnQ4RkybJVBUFCo1GjatumoCP\n3zkEs8mOuIRQLFqaI3dZfodhUGaCIJwJGxoAep88pyRJsDtEt97J3fk12HagEoIgYMqoOBSWN+Ng\nYYPrmpBgLSZlxcFid2Bffq3b9naCAKhVQp+3nf2R1eaE1eZES4fvAqZOo3KbwOM+aUftPtmnd09n\nr9nmPT2cXW0aPw+YNocN28t2e7SfK8RtPLXdbTJJXyYnjUNqBJcLUTp9dDQmPv9nWOvr0Xr0OGo3\nboJar0fyjddDH62ssd40MEaNS0RmThw62qxcf7APDIMBSBAE6LRq6LRqhJ9pG54YhhsXZgEAnE4R\nN/zqf27XdJjtmDwqFoumDcebGwrw3897Bv+vuG48IAhobDEjNyceUeFBrjGU5rN6Lrtvf/e0exun\n6XT1eNocnhNc/JXNIcLmsKGt03czyTVqVZ8TeM6exON1ss+ZANoTTNXQqH0z0afD2ol7P3nUo0dP\nq9Lg0sx5fV6nUp074I6LH4XFGXMxdRi3F6Me+thYxF0yH3GXzJe7FApAGo2aQfAcGAYVQhQllNW0\nISYiGIYgLbxlhaqGTgDAbZeNwtiMaBSdbsW4zGj884MjOFHetc3PB18V4ekfz0FWqm9ulDudovt4\nyrMm7XSPq+x77KXnZB/rkJroI6LdJKK913iWb0utEvqewKPTIDjIfRKPZ09m12NrT73v9dauXXTg\nWN1JJIcnen39yzLnY0vJDnTazV4fv3Xc1ciMTvPZ+yUiom+HYVABqhs68NALW9FuskMlAHddMw5X\nzErDh1uL3c6bkBkDACipasXhogbERgTDZLG7giDQtVbhx9uL8dBtU3xSm1qtgjFYBWOw75YYcYoS\nrDb3CTxmj55Mh9dxmq7Q6TaLvCuUDhVOUUKn2Y5O88UHTEFngi7nCFR9jFxY880WfPl5V4+lTieh\nXV8Ch8oElVpCi1gNvRCGqOBEZIRmIVRvxNbaTbA5rYgKjsCmop2AqEZ6dDIn+gQQp9mMsjffQkfR\nKURPm4ak71wJQa2G02xG3ZdbYG9tRczc2TD0c2tQyelE/dZtMFdWIWpqLkKzswb4HZC/sFod+PzD\nY2iobcek6amYOC3V63m1VW04frgK4RHBGDclGVrtha0d2NTQiaMHKqEP0mBCbgqCfPh7aKhRxDqD\nSt+b+Hu/W4/m9p4eHgHAG08swbqvivDu5kLXbOZhsUb88Nrx+MMrO+E8M2gwY1g4TlW6796SEG3A\nv369eLDK9wuiKMFmd3oPkb16NfuaRd57tnnvY3+c5yPoO6Ef+w0Edd8B2NkWCVvBdACAbtRuqMOa\nvJ4n2bWQRBVUevceRkkUYCuYDr092qNH8lxbRnodpznEt4wMBJIoYt89P4a1ts7VFjV9Kkb98hc4\n/PAv0VHUNTlE0Gox7k9PInTk+Re+P/Hs82jY9nXXgSAg++GHEDN75oDUT/7l+d99jo5ev7NmL8xA\n3pXus8pLihrwxj93uiZLpmXG4Hsr+v/9UV/Tjn//bRtsZz
7ox8aH4Ic/mw+1xr/HcQ8U9gwGOIvV\n4RYEAUACUF7TDqcouS1rU1nfiTc3FLiCIACcqmxFkE7ttsZibaMJze0WRIYGDXT5fkN15tZrkF4D\n+GhVAkmSYHOIbsHR4hYsz4ROi/eezJ7weebcM9c6v2XCVMdWnDMIAoAquGvdLsHQ1mcQBABBa4e3\naCaoJKjjymEujujqdW334USfi90yMqhXIB2kLSMDRVt+gVsQBICm3XvRvP+AKwgCgGS3o2b9hvOG\nQWtjExq2f9PTIEmo+vgThkEFKDvV4BYEAWDfjjKPMLhne4nbqhmlRQ2oqWpFQlI4+mP/rjJXEASA\n+toOFJ2oQ/aYhG9R/dDFMBjghD56SVLjQ3HgRJ1Hu0bteX5SrBHFlW1uz6niQtHfmiAI0GvV0GvV\nCA/xzUxyALA7nD3h8DwTeExeQmiN9jTazvci0pnbMdK3+D4QByZgDdaWka6livqxZWTXOM3egbRr\nR59AodJ4/1UieGkX1BqYKipRuvpVmCsqETk1F2l3LINKq+11jrprGYNen1b7eg0KLGov28R5mxDn\n7QPahXxo8zbRTckf+vi3K8DptWrMGp+Ibw5Xu9rGZUQj1KjDZTPSsGFnmWv5lczkcCy/cgwe++c3\nrkkYM8Ym4LIZaXjyP7tcPU5LZqb5NLyQb2k1amg1aoQZdRd1fYMpB7/Y8Ed02DpdbfHGWNR21ruO\nv5N9GSbMzIXF6sT7JfUo6TzpvRbRCEhq2NXu8VJyquGoHX5R9cnB7hBhd9jQbvLdc2rUQq+F1nuv\nf9kTNPucRe5nW0aGZmfBMHw4TGVlrra4hQsQMWE8wsePQ+vhIwAAtcGAxKVLkP/U07BUVQEAqj/+\nBCqtBml33O66VhcRjoRLF6Fm/ecAukLlsOuuGbw3RLJJHh6JyGgDmht7/rLNXujZkzxjfjpOHKuB\n48yWq6PGJVzQYtK5s9JwcHe5azHqYakRSM+KPc9VgYtjBhVi055y7DxajSk58bh8RpqrvbXDim+O\nVCNYr8GscYnQadWobzZj17FqxEQEY+roBKhVAirrO7AvvxbJ8aGYlBXLLeQC3P6qo3hm2yrXwtEj\no9Jwx6QbUdhYguyYDLfZwKIoYl/1ETR0NkGjUuNwTQEEAOMSRmHO8GlQCSpsL9uNgoZTUAkqDAuL\nx5zh0xCuD4fV1scs8T62jDR7zDYfultGDoR+bRkZ1FdP5rfbMlJ0OFD96Wdozz+B6NkzETtntqu9\nadce2FtbET1jOpxWC/bf8xO3aw1pwzHphefd2iRJQsuBgzBXViJyymQEJyX57n8U+TVRlLBt40nU\nVrUhd3Ya0kd6D2mtzSacOFaL8IhgjBwdf8ET0jo7rMg/XI2gYC1GjUuAxkuvpFIwDBKRh7/t+A+2\nl+9xa3sy72FkxaTLVFH/nL1lpMc+41b7ebeM9DbZJ3B/Sp5ff7eMjGwoR+y+zVBbTRAnTodmwaVe\nd/vRwYmjK+6Fs6PD9Rqx8+ch62f3y/guiZSNt4mJyING5fmjwVubvxnsLSP7Xmjde09m95hNs9UR\nUFtGBjstuLf0fWilrp5Z9eZP8NGxFhwP9f7hYWRILpaYd8LgtKDBGIePLOmQ/r6NW0YSycT/f7oT\n0aC7MnshdlUcgNlhAQBMThyL9Cjva331pcncghZzG9Iik4f0HsSDtWWk94XWewJon7PIe53rcMqz\no0+KudYVBLulm6pcYTDYaUG4vQO1+ihIggqFIak4ZUxGsNOKTk0w0CgCjX3PSu+v3ltGGrzMIj/X\nlpHdi7EPtS0jiXyBYVDhPtx6Cp99UwpDkAa3XTYKuTnxcpdEfmB4RDKeWvQLbCndicyotAveOu6d\no5/gg+OfQZREpIQl4rEF9yMyuH9LPiiBty0jfcHuEM+Mw+y6JX6uLSPdZpvbnL2WMLrwLSPrdZGQ\nALdlhOp0XX2zE1tPYF
H9HmggolVjxNtJi9CkC4coqLqCoA8N1paR7hN9+rFl5Fkh1FdbRhL5CsOg\ngn1zuAqvfHjUdfzU6t34168XISbCtz+gaejZWroL/9z7BuxOOyKCwhAXEoMRkSn9urauowHvH/vU\nNfnkdFs1PszfgOWTbxrIkgldS+BoNTqEGADAN3+Pu7eM9Fho/azey9ZdNoTt2gjBbkNzcjZ0U+Zj\nqtmO+Rv3QYOuQBnu6MT3T3+CU4Zh2Bg7DR0a/98r1p+3jOy9h7lOppnkFBgYBhXswMl6t2OHU8Th\nogYszO3fL30KTHanHav3vw27s+uXX4ulDW8cWovHFtzXr+vrTU2uINitzvTtbwGSPPq9ZeS8DDit\nyyDZ7dCEhOAqAObqauz/3OF2mlZyYlRnOXLTQpHyy197ncDT10Lrbj2Z1t5jNrvahspEH19sGXk2\nlYBzzyI/a/ees5cxclvS6Ezo1GvV3DJSIRgGFSwtMcyjbUSSZxspi8luRqfd7NZW39kIoGus247T\n+1DYWIqc2ExMS57ocX12dDqigyPRaO7Z03pWyuSBLZr8glqvB/Q94yqDExNhTB+BzuISj3M7jh1D\nhEEDVZhvdjI655aRXibwnGsWec/X/rllpDeiBJgsDpgsjvOf3E9dM8k9eyT73DLS7XY4t4wcShgG\nFezS6cNRUNqErQcqoNOqcfPibOi1arz+WT50GhVGpkRgeGIYosN521hJwoPCMDp2JI7XF7raZqZ2\nhbk3Dq/FRwVfAAD+d3ITbhn3HVw3eonb9Rq1Br+95AG8f/xTNJtbMXf4NMwZPm3w3gD5ldG/eRTl\nb72Dhu3b4ezsWUg4OCXZp7uKDNSWkd0Tfc61ZWS/J/v4aMvIwdI1k9w5YFtGBp/Ve3muLSN7tofk\nlpEDgesMEjrNdmg0KtQ3m/DgX7a4LdwrCMDtS3JwY16WjBXSYGuzduD9Y5+ivLUSExPGYGl2HgRB\nwB3vPwirs2dwfnhQGP519TMyVkpDhan8NE7833MwlZ9GUEI8sh56EKFZI+UuSxZ2h3hWcOx7y0iz\nx2zzsxde7wqd9n5M9AlkHltGeu3J7GPLSK89mYG1ZeT5sGcwgNjsTpyqaD2zl3Ar9hbUYnhCGBbm\nppxzeYTusUAbd5d77OAgScAb6wuwaGoqIn10O4f8X5g+BN8/a8KHJEnQqrVuYVCnPs84MlIcU0Ul\nSl75D0zl5YicMhkjfrAc6uBgGFJTMOnvf4W9rQ2a0FBFT3bonugTari4LSO9cXRP9OnVc+ma7OO1\nJ/PssZee59rsQ2dHn8HcMtJtl55+bBnZu7dTri0jz4dhMEAUVbTg9//aiZYOK1SCALFXh++hwno8\nvCz3vM+h13rfiscpSmhsszAMKpwgCLhhzBX4fwfe7TqGgBvHXClzVeRPJElCwZ+egbmiEgBQ+/lG\nCBoNMn50t+scbVjPuOSmvfvQcvAQjCPSELdgPgS1crcD+7Y0ahVCglUIOd9EnwvgFCWPLSPPnsDT\nM7O8r4XWz9o+cghtGelwSugw29Hhy4k+59ky8uxJPOfaMrL7XL22f1tGngvDYIB49X/H0dLRNaZD\nPOvO/7aDlfjhNeMQHqJH4elm/PujY6hp7MTs8UlYvnSMqyv80hnDsWFXGRpbLW7Xq1UCjp1qQGZy\nxOC8GZJdUWMpXjv4Huo7mzAzZTJum3AtNCo1rshaiHB9GN459jE6bWYUNZZiVsoU6DQ6HKg+ireO\nfIQOmwl56bM9xhJS4LM1NLqCYLeWg4e9nlv1yaco+de/XcfN+w5AtFnRUXgKKr0eTosFIRnpSL/7\nB9yXWCZqlQBDkBaGIN8FzPNtGdnebkHR3gq01XVAG6KHMT0SDkE472Sf8w3DNAJIgQA9gCYAFR5r\nHgwOUZTQaXGg0+cTfc6/ZeSPrh3f53MwDAaI+mZzn4+pVSpo1CrYHSKe+PcuNJ8ZCPzR
tmIYg7W4\n7bJRAIDo8GCs+sVC7Dxag/0FtdhyoOuHulOU8MpHx5AUG4KpoxMG/s2QrGxOO57e9g+0Wbv2jv3k\n5CYcry/C7xY+CJ1KizcPr0X9maViPj+1FXqNDgvSZuDP216C88wuFG8d+QgxhijMS5su2/ugwaeN\njIA2MgL25hZXW0j6CK/nVv/vM7fjxq+/8TinZf8BFDzzLCa98LxvCyXZnG/LyLVv7EfT6VYAgMNq\nQqRRh3vvn3vO55QkCTaHCLPF4XUWuclsw+4Pj8Nh7fr5lABgRHI4EGNUzJaRABgGlWDepGH47+cn\nvD52zfwMGIO1KKpocQXBbgdP1rvCIAAYgrRYmJsCm93pCoOucwvrGQYVoLyl0hUEuxU3l+G/hz/E\nZZnzXEGw247T+/Dpyc1wSu4D2A/X5jMMKoxKo0HWA/eh8O+rYGtoQGh2NtK+f4fXc9X6/m3tZyot\ng721Fdpw7mCjBMWFDW7HVeUtsFoc0Af1HVcEQYBeqz4z1Mnz+6rqdAu+sbrfno7VabD89r6HT0mS\nBAC3EO0AACAASURBVIdTdNutp+fWuPdZ5J6zzf1ny8jzYRgMEDcvzkawXoM16/Nhs/d8s40aHok7\nrhwNAEiKMSJYr+5aJuCMjGHef8Cme2nv61wKLImhcdCptbA53cfJHKktwHcnXIsQnREdtk5Xe7O5\nzSMIAsCICC5erkQREycg918vwmkyQxNi7PO8lFtuRMEzzwJi1/eOJjQUjvZ2j/P0sTHQhPporRjy\newnDwnCqoGdDhKgYI3T6bzeWNDrWCK1ODXuv8YqJ5/l9JggCtBo1tBo1woy+m+jTe8vI3ksPWaxO\nmNwWUveyZaS3yT793DLyfBgGA4RaJeDaBZkYNTwKz/93H2oaTchMicBD353iOscQpMWDt07Gi+8f\nRnO7FZOyYnHb5aO8Pl9WaiRuX5KDdzedhMMpIm9qKuZP5i93JTDqDLhn6u34+87VbqNq0iJToFNr\n8ZPpy/Hy3jfQZG5Bdkw6TjQUezzHnOHTcGnmvMEsm/yIoFKdMwgCQPSM6Zi86m9oPXIUxrQ0QJJw\n8i8vwFJdA5VeD9FqhT4uDiMf+AkElXKW+FC6JdeOw/uv70N1RSsiow245rZJ33pyhD5Ii2tunYhP\nPziKznYrMrJjMe9SeZZLc98y0je8bhl51gSerlvIfeM6gwFIkiSYLI4+t4/qniHWn0HBdocTTlFC\nkI6fGwKZKIrYWrYLp5rKMCYuCzNSJuNg9TG8tGcNmswtyInNxAMz70JkcNenaVESYXFYYdAG4w9f\n/hVH63qGKFyTcxluG3+NXG+FhjBJkuDsNEETYoSjsxNqg8Evl+GggWcx26EP0vj0z18UJdhtznPe\nclYqhkEiwit7/4vPT211Hd82/hpck3OZW+jrS4etE2uPr0d5axUmJozGkqxLoBK89+Q0mpqx8dR2\n2EU7Fo6YhaQwjkENJP+/vTuPj6q+9z/+miWTlWyQECAkkECAsCMEwbALCkJUBGkFaVXo1apU77W/\nilalWnrVWkTFpcWNKl57r7UWFURAQWUHBZQAISGBJBAIWck6k8z8/ogODIkQYshMMu/n4+Hj4fme\nJZ8DJ8N7zvme77e2upqyw+n4deqEb/vwJh+npqKCk2vXUX36NB2uGklwYp9mrFJEzqd4LOLlampr\n+CzT9U3OtembuKHPNRgNxgsGQYAgSyC3Drrpoj+nzFrOwnVPUlxVCsCn6V/w9KSHiGoX2fTixWOU\nZ2ax/7HHsZWUYDCZ6H7HL+l03ZQmHSt10ROcOZQG1L11nPjIQ4RdofmtRS4XdcTwQlXVNexLz6eo\ntOriG0ubZzAY6s0k4mdu3Juel2Jnzl5nEASoqqnmi6Pbm/3niHscXfkOtpK6IUEctbVkrXibmoof\nH/Lqx5QdOeIMgnUHc5D3yafNVaaINEB3Br3MoaOF
LFq+jbJKG2aTgbtuGsik4bHuLkvcyGQ00b9j\nb7bnfONsG9PtyiYd65/7V7M+4yv8ffyY1X8aw6MHO9f5+dQPmFuzv2ZCXDLtAxoacUxaE+tp1yGH\n7NXV1JaXYQ648J3l85l86890ZPK/tGOIyKXRnUEvs+LjA86pdWpqHbz+4X5sNa1neiBpftU1Vvbm\npbq07T+V9iNb/7jNx3byj+8+pKCyiJzSE/xl89+4+6Pf80VW3d2/oZ0H0CO8m8s+uaV5vLh9RZNr\nF8/RYXSyy3K7Pr3xjYi45OP4d+lMxLixzmVTQABdbrzeuVyRk0Np6gEctfrc8iZ2u4N1H6byl0Wf\n8re/bOJIWv7Fd5JG051BL1NQ4vrYprzSRmV1LT5mzQnqrSpslVTVuA5GXlhRdMnH2X+yfoDMLy/g\nxe0riA3tQmxoNI9PeIBb3/uNc6YSaFrwFM/T5cbrMfn7U7RzF/5do4m+aXqTj5Vw3710vHo81fn5\nhF0xxDmfcfpLf+Xk2rpHxv7RXei3+HEsoZom0xvs/CqTrRszACg/U80/3tjJ/Y9OxK8Z52L2Zroz\n6GXGDHF9q3pIr8hmHVBTWp8w/xD6RrqOuZUcm3TJx4kLb7i7gQMHqacOA2A2mujRvtt5+8Vc8s8S\nz2MwGOg0+RoSH32Y7rf9Ap/gnzZQdEi/vkSOG+sMgmVHMp1BEKAyJ5fjqz76ST9DWo+sdNeZSWzW\nWnKPXfqXVmmY7gx6mZ9N7EVwoIXdB0/RvXMwN43r6e6SxAP811W/4oMDn5JTcpwhnfsxMf7SBoyu\nqqmmQ0AYyTHD2JK9G/t5M5LEnxMU7xw2h+e3vk5mcTaxIV34ddLcZjkHadushYX12wrqt0nb1Klr\nKIf2n3QuG00GOnYKdmNFbYtHhcFdu3bx9NNPc+TIEcLDw7njjjuYNWuWu8tqU4xGA1OT45iaHOfu\nUsSDBFkCmTPwxibtm1F4lMWbXqDMWo7RYGTOgBspqCzi04wvMRtM3Jh4LQkdzl5vXYKjeOqah6iu\nseJr1l1paZyQ/v3wCQvDVnT2blDEmFFurEha0oix8eTnnSF13wkCAi1MmpZIUHD9l42kaTxm0OnS\n0lImTpzIY489xpQpU0hNTeW2225j6dKljBgxoknH1KDTjZOeU8xf399H9qkykhI7cuf0AY2anUQE\nYPGm59mbd8C57Gv2ZXnKk5hNPhioe1tZvE/liRNU558muE9vjD7N83lSlZdHzvv/pqa0hMgJ4wkf\nNrRZjiutR01NLSajEYNRM9M0J4+5M3j8+HHGjh3LlCl1g5QmJiYyfPhwvvnmmyaHQW+UnlNMtbWW\nPt3CMZ7zy1Jtq+VAZgFB/hZKy63EdQkhtJ0vtXYHi9/YzuniujEHP9+dQ3mljTtS+pFXWEFYOwup\nmUV0CPWjX1wHDh0rIjoyiMiwZpxYUTxerb2W1PzDFFYUERMaTfewunmqs4pyyCg86rJtdU01+RWF\nnCorwFZrpWtoF7oER7Et+2v25KUSHRxFoE8AdoedUd2G1xvjUFo3W0kJ6a8sp3DLVgCMfr70vG8B\nHUZcSU1ZGYdfeInir7/BNyKChAfuIygujopjx7AWlxCc2Aej+ew/S6VphynPOEJQjziC4uMp/vY7\ngnrGEz70CvLWrqNo99dEjBlNcJ+G51iXtsdsNlFUUEHh6XJiuofho6lSm4XH3Bk8X0lJCZMnT2bx\n4sWMGzeuScfwpjuDdruD/16xg23f5QEQ1zmEP/36KgL9fTiWV8rDr2yh+MzZN0bNJiP/ecsQfExG\nFr+5o1E/wwA4AKMB7kjpR8ro+MtwJuJO1horB09nEBnY3jkzSFFlCY9ueIaT5Wc7cA+PHowRA1tz\nvq53DB+jmVqH3aXfYJ+IHhzIT6+3rcXkw1OTHqKLpqVrEwq2bufgn/8CDQz70j55JLUVlRR/fXY8\nS4xG2l81goIv
NwPgFxVFvz89gSU0hNTH/0jxnn0u22L//poyGOCHf7qMRvo9/hgh/ftdrtMSD7L5\ns3Q2rD4ADggItDDnziuJ6hzi7rJaPY+M1GfOnOHOO++kf//+jQ6CRUVFFBcXu7Tl5eVdjvI80p60\nfGcQBDhyvIS1244yfVwP3l2X5hIEAWpq7by26jtioxrfAfeHbw12B/x9zQEmDo/F39cjLyFpgtzS\nPP7w+bMUV5ViwMBNfSdzc79pfJy2wSUIAi4DVJ/PZq+p19ZQEASw1tpYsec9Hhp9z08rXjxC5utv\nNhgEAQq+2gKm87oM2O3OIAh1j4GPr/qQ4D69XYPg99s6nXsPw27n5LoNCoNeoLLCysZPDjn/Maoo\nt7Lpk0PMuv3SRz8QVx73L3l2djZ33XUXsbGxPPvss43e7+2332bZsmWXsTLPVnSm/tRyP7Q1tA6g\n+Ew1wQFNm5Ku2lpLRZVNYbAN+ef+1c7p4hw4eD/1EybGj6a4svQie/40RRXFF99IWgVb8UX+Lg0X\n7+dlKyrGWnRpQ4aYAtVtxRtUVtiorXUdqaDsTDVlpVXs3JxFRbmVgcO6Eh2rGY0ulUeNM7h//35m\nzZrFqFGjePHFF7FYGv+m4Zw5c/jkk09c/nvzzTcvX7EeZlhiFEHnDL5pMhoYPbgLAOOu6NrgPmOG\nRDN+WMPrLmZAjw60D9EUUW3JufMGA9gddkqrzzCqWxLn/xMeYPYnyBLY6GOH+gVjNjT8Isk1Pcde\nYqXiqSLG/PiQROZ2QXSacm0D7a7jEUaMHU14UhJG38bNj+0TGkrnlGmXVqi0SuEdAusFvb6DO/PG\nss18uf4wu7ce5Y1lm8nO1JBDl8pjbuucPn2a+fPnc/vttzNv3rxL3j8sLIywMNeLxKeZ3mBrDYID\nLTx97yj+/UUGVdW1TB7ZjZ5d6/48Jg2PxdfHxFd7cymvrMHiY6RvXHtuGBOPj9lEoJ8PH23OpMpa\nQ/tgP3wtJoxGA2UVNlLP+6XqER3CkN4dmT62hztOUy6j0d2G892pQ87lbqHRxITUzRzy4Oi7WXVw\nHYUVxcSFxTCz33UYDEY+PLSeY8W5FFeWYjAYuCpmKNZaKx+lbQDAgIGkLoP45ZCZGA1G9p9K43R5\nIRuztlJrr2VKwniujk/+sZKklYn7j3n4de7EqU1fUH0iD7vNhsFkIjC+O71/+wCW9uFYQkI4sXYd\nJl8LXW+eSbveCRz/YBXW4hIix48lbEjdfNYDnv5vjr61korsbAJiYmjXK4FT6z8DIPpnM/GLjMRW\nXELo4EGXPP+xtF4/n5fEls8zKMgvo1e/KAICLRQVVDjXO+wO9uzIpmv3cDdW2fp4zAskf/3rX1m6\ndCn+/v78UJLBYGDu3Lncd999TTqmN71Acrl8vDmTlZ8coNpay7UjunFHSj+Xt5SlbdlybBdbs7+m\nY1AHUnpNJNivabNIHCk8xomyk/Tv2Idg36BmrlK8Tcn+/eRv/AKf0FA6XTcFS6heGJA62ZmFvLFs\ns0vbiLHxTJyW6KaKWiePCYOXg8Jg83A4HNgddY+eRURaUvHefexf9ITzBRJzUCAD//Jn/KI6urky\n8RT/eGMnh75/gbJdiB+335tMSJjuFl8Kj3lMLJ7LYDBgUg6U81hrbaSeOky4fwgxoV3cXY60USfX\nf+byJnFNWTl7H/h/XPG3V/R4WACYddswjmYUUFlhJb5XhMYebAL9iYnIJTtZls9jny2hsLLu7dFr\ne4zl5n5TeWXX2+w5sZ+uIZ351dDZzsGpRc5VeSKPytxcgvv0xhx44ReRzEH119ecKaNo124iRqu/\nqdSJjW/v7hJaNYVBL7N573F2pOYRHRnE1OQ4DQ0jTfLvA586gyDAJ+kbKawsZkfuHqBuvuIlW5bz\n/JQ/YGjEcCLiPXI/WEXWm38HhwNTQAB9Fz1Cu14JP7p9l+unceqzjdirXIfBMr
dTX1SR5uJRQ8vI\n5bVmSyZP/n0nn+3K5u+rD7Dwxa84mnd5x5CTtqmk+ky9tvTCLJflk2X5FFWVtFBF4umqTp0i42+v\nkbXiLeeg0bUVFRx7590L7ucXFcXgF57FJzTU2RY6aCChAwdc1npFvIluC7Vx7647xKovMvAxGzGZ\nXLN/Rm4J9/z5c24c24Pbp/V1U4XSGqQXZFFmraBfZAJmk5kx3a5kZ+5e5/rO7ToSFxbDV8d2Otsi\nAtsT6tf4GW6k7aqpqGTf/3sIWwODSdtKLv6FwS8ykqGvvkLxnr2Y/P0J7puoO84izUhhsA3bsT+P\nlZ8cvOh2/96UzrTkOCL09pVXSzt9hMqaKvpG9sJsPDtA9JIty9mWXTcHccegCJ4Y/18kRQ/id6N+\nzVdHd9A+IIypCRMwGAyU2yrZk7efrsGduXPYHIwGPXwQKNr9dYNBECBi3FgAKnJySV/2EmWH0wnp\n15ce996Nb4ez/cCMPj6EDxvaEuWKeB2FwTYsNbOgXpuP2YitxnU6H7sDSsurFQa9lMPh4KmvXubr\n498CdXf5Hp/wAMG+QRw6neEMglD36PeT9I38rP/1XNG5P1d07u9yrIWj727R2qV18Gmgf59/TFei\np99A5Pdh8PCzz1GWngFA8Z69ZLz0ComPPtySZYoHqK6ykZVeQHhEIBEdmzbOqVw6hcE2KPN4Cau3\nZJFfVFFvXecOgRzLO8O5g0vGdQ4hrosGcfVW35486AyCAMfPnGR9xpdMT5xMaXVZve1Lquq3iVxI\nyID+hF0xhKLddV8s/DpF0f+Pf8AnpO5zx26zOYPgD0oPXvyphrQtJ3JKeOuVrVRV2gAYdXVPxk3u\n7eaqvIPCYBtz/HQZv33hS6qttQDf9xU04Otjws9i5mje2Y7/oe18mZgUw/Wj49X/xos1FPhKq+qu\nkwEd+xDuH+p8c9hgMDCm2/AWrU9aP4PRSOKjD1OaeoCaigpCBw7AeM50oUYfHwLj4ynPOBsI2/Xq\n5Y5SxY2+XJ/mDIIAmz9LJym5O4HtGjdPtTSdwmAbYbc7WLv9KB9vPuIMggC2Gju3Te3P1OTupDyw\nymWfiqoa5k7RlD3ebkinfoT4tnO+IWwyGBn1feDzNVt4YsIDfHxoA4cKjuBn9iOrOIeE9nEYja79\nAWvstXyavom0gkz6dOjBxPhR9baRtstWWsrxVR9RfSqfDskjCU8aVm+b2spKTn+5mdLUA3ROmYYl\nNIQTqz/hxEcf4wD8o7tQefxEXZ/BX9/Z8ichblVeZnVZttsdVFbaGhUGj2UWsmf7MXz9zCSNiiOs\nfYBzXWWFlW1fHKG4oII+AzrRu3+nZq+9tVMYbCPeWnOA9z473OC64EALBoOB3rFhHDx6thN3n25h\nLVWeeLAAiz9/vPq3rEn7nMqaaibEXUV8eKxzfURge2rstWQUHgVg/6lDnDhzituG3OxynNd3v8v6\nI18BdXMc55Xl84vBM1y2Ka0u4+NDG8ivKGRk1yEM7TLwMp+dtASHw8H+Rx+nPDMTgPxNX5DwwH8S\nMeoq5zYF23dw8E9POZcLd+wk6ropZP51uevBDAbs1VbsVtdgIG3f4KSuZGcWOpeju4XRIfLi40nm\nHivi7y9twW6v6wC1f89x7n5wPL5+dRFn5d+2czy77unGt1/ncsMtgxlwhaaoPZe+trcR63cea7A9\nsXs4IwfUfQu6/5YhJHYPx2Q00D++A/fePLglSxQPVm6tpKrWitloIsBS/0Wiz7O2uixvzHRddjgc\nbMra5tK2Om2Dy3YOh4M/bnyOfx34hK+O7uDpr15hy7FdzXgW4i7lmZnOIPiD4/92fRJxasPnLsuV\n2Tnkvv+v+gdzODhz6BBpS59v9jrFsw1KimHmL4bSf0gXxkxK4JZ5jeuSsm9XjjMIApSdqSb94CkA\nTp0odQbBH+zZkd18RbcRujPYRoQEWig+U+
1cDvQzs/C2JAbEd3D2B+zcIYin7hnlrhLFQx0vzeOR\nz57BVlvXV+erYztZOnkRYf5nXyoK9g2ioKLIZflcBoOBIN9AiirPjhnnAF7e+Ra9I3oQFRRBVnEO\nWcU5Lvt9nrmVkTEaLqS182lX/63PssPplB44SHCfuhcAfELqjzlpCQnGeiq/wWOWpR3GYbdjUFcD\nr9JnQCf6DLi0x7gBQfUfIwcEWQDwD7BgMDjHOQcg8Pt1cpZ+y9qIX1yXiI+57q/TaDQw7/r+DOwR\noRdD5KK2ZO92BkGASlsV23O+cdnm1oHTMX0/ZqDJaGL2wBvrHefWgdMx4Hq9ORwO0gvq7hi1swTW\nWx/iq6Ej2gLfiAjCGhgDMH/jJuf/d5l+Az5hZ7umRE2+lvi7f43BZKq3H0C7XgkKgtIoQ0fG0j7i\n7BzWvfp2pNv3cxW3C/FjxNgeznX+AT4kX92zxWv0dLoz2EYMS4zitd9P5NDRIuK6hBAZFnDxnUSo\nf5cPIMTPNaSNjBlK7w49yCg6So/wbi53DX+QHJtEUWUJb+1939lmMBhIaB8HQIfAcK7rNYGPDq2v\n+xm+7bgh8ZrmPBVxoy43Xk/RTtfH/ubgs3cDS/Z9C/ZaDD4+dEgeSdx/zMNgMDB85QpOrFkLDjtF\ne/ZxJvUA7RJ60mOBxqyUxgkM8uXO344lK/00vn4+RMe69oe/emofBgyNpriwgm7x7bH4Kvqcz+Bw\nnHvztG3JyclhwoQJbNiwgehodRY93960fNbvPEa7QAs3jI4nMlwB0htV2apY9PmzHCmq63faNzKB\nh0ffi9l06R+Ydoedd/Z9wPqMrwj08efnA64nOTbJZZtjxbmcrigkMTIBP7OGjGhLDj71DAVb6vqJ\n+nXuRP///iOW0FAqjmXzzYL7XZ7V9bx/AZFjx7irVBE5h8Kgl/o2/TQPv7LZ+dncPsSPvy68Gl+f\nhh/ZSNtmd9g5kJ+OyWCkVweNOylNd+ZwOrXl5QT364vRXPeF4uT6DaS/8JLLdlGTryX+zvnuKFFE\nzqN7pV7q893ZLh1qC0qq2JuWT1LfKPcVJW5jNBjpG5lw0e3KrRV8e/IgEYHtiQ+PxVprY19eKv4+\n/iRG9FSIFNr17FG/rXdvzu/FH9QzviXLEpELUBj0UiENvH0VqlHe5QKOFuew6PNnKbfWTXM4rttI\nUvPTOFl+GoCBUX1YOPoejAZ1+hdXAdFd6LngHjJe+Rv26rpRD7JWvE1wYiL+nfQFVMTd9KntpVJG\nxdGp/dm3r8YOiSYhRoNQy4/714G1ziAI8HnWFmcQBNibd4DvTh5yR2nSCgR27+YMggA1JSUc//eH\n7itIRJx0Z9BLhQX78dLvxvNdxmnaBViIjw51d0ni4c4Ngj/mTHU56zO+Ys+J/cSEdmZar4n4+/i1\nQHXi6WrK6s+BXVNev01EWp7uDHoxs8nIoIRIBUFplAlxV7ksdw3uhNl49vtkmH8IuaUn+NuulezI\n3cN7+1ezdOurLV2meKjgxD74R3c522A00vHqCe4rSEScdGdQROqx1trYm5eKxeRD/469MRqMXNl1\nCA+Nvpdt2buJCGzPtT3HkleWz+dHtuDv48c1PceweOMLLsf55sR+SqvLGhzLULyLwWSi/5+e4MTH\na7AWFRExdjQhffu6uywRQWFQRM5zprqMh9c/TV5Z3TRhfSJ68OjY+zAZTQzqlMigTonObePDY4kP\nj3Uuh/oHk3smz7nsb/bDz6Spn6SOT0gIMbf8zN1liMh59JhYRFxsOLLZGQQBDuSn8/WJ7xq178/6\npxDg4w/UDVdzy4AbsJgVBkVEPJnuDMqPysgpJvvkGQb0jCA8WC8BeAOHw0GZtbxee7m1AofDQa29\nFrPJjN1hx+5wYDa6DlLeq0M8L01bTNrpTKJDougQEN5SpYuINMheaweDAaNR46D+GIVBadA7aw/y\nP5/WDR
NiMRt5bP6VDOgR4eaq5HLalbuX13b/g8LKYgwYcFA3QHCoXzBGg5G7PnyIoqoSuod25XR5\nIeW2Cq6KHcZ/DJ2Nj8nHeZwAH3+XR8kiIu7gcDhY/9EBdm7OxGQykjyhJ1eNrz8ouigMSgMqqmy8\n99lh57K1xs7/fHpIYbANq7BV8vy2N6iqOTsOXPfQrgzslEhyzDAe3vBnqr9f98McxgBfZG0nJqQL\nKb0ntnjNIiIXcvDbPLZuzACgxmZnw8cHiOkeTtfuemJxPvUZlHqqbbXYauwubRVVNW6qRlrC8dKT\nLkEQwN/Hj1sG3EBVTbUzCDYko/Do5S5PROSSHc8ublSbKAxKA8La+XFlP9cpoq4d0c09xUiLiAnp\nTJAl0KUt8fu5imNCuxBoCfjRfftG9rystYmINEVsfHvXBkMDbQKAweE4Z+bwNiYnJ4cJEyawYcMG\noqOj3V1Oq2K11fLJtiyO5Z0hKTGKpL6aP7StO5ifzopv3uNURQEjug7hF4NmOPsCpp46zN/3vEd+\nRSE9w7uTX15AqbWMsd2u5OcDrtd8xCLikbZuymD7F0cwmYyMujqBQUld3V2SR1IYFBEREfFi+jov\nIiIi4sUUBkVERES8mMKgiIiIiBdTGBQRERHxYgqDIiIiIl5MYVBERETEiykMioiIiHgxhUERERER\nL6YwKCIiIuLFFAZFREREvJjCoIiIiIgXUxgUERER8WIKgyIiIiJeTGFQRERExIspDIqIiIh4MYVB\nERERES+mMCgiIiLixRQGRURERLyYwqCIiIiIF1MYFBEREfFiCoMiIiIiXkxhUERERMSLKQyKiIiI\neDGFQREREREv5lFhMDU1lZkzZzJ48GBuvPFG9u7d6+6SRERERNo0jwmDVquVu+66ixkzZrBr1y7m\nzJnDXXfdRWVlpbtLExEREWmzPCYMbtu2DZPJxKxZszCZTNx00020b9+eTZs2ubs0ERERkTbLY8Lg\nkSNHiI+Pd2nr3r07R44ccVNFIiIiIm2fx4TByspK/P39Xdr8/f2pqqpyU0UiIiIibZ/Z3QX8oKHg\nV1lZSUBAQKP2Lyoqori42KUtNzcXgLy8vOYpUkRERKSVioqKwmyuH/08JgzGxcWxcuVKl7bMzExS\nUlIatf/bb7/NsmXLGlw3e/bsn1yfiIiISGu2YcMGoqOj67V7TBi88sorsVqtrFy5klmzZvHBBx9Q\nWFhIcnJyo/afM2cOU6dOdWmzWq0cP36cuLg4TCbT5Si7VcvOzuaXv/wlb775Jl27dnV3OdIK6RqS\nn0rXkDQHXUeNExUV1WC7x4RBi8XC8uXLefTRR1myZAmxsbG8/PLL+Pn5NWr/sLAwwsLC6rX36tWr\nuUttM2w2G1B3cTT0TUHkYnQNyU+la0iag66jn8ZjwiBAQkIC7777rrvLEBEREfEaHvM2sYiIiIi0\nPIVBERERES9mWrRo0SJ3FyHu4+fnR1JSUr0xHkUaS9eQ/FS6hqQ56DpqOoPD4XC4uwgRERERcQ89\nJhYRERHxYgqDIiIiIl5MYVBERETEiykMioiIiHgxhUERERERL6YwKCIiIuLFFAZF5LIpKSmhrKzM\n3WWIiMgFaJzBNiorK4unn36aXbt2UVtbS9euXZkzZw4zZsxwd2nSyvyUa+nKK6/k7bffpkePHi1Q\nqXiC+fPns2vXLgwGA9XV1RgMBiwWCwAdO3akqKiIbdu2ublKaa3Gjx9PQUEBJpMJAIfDgcFg0h6q\nywAAB3tJREFU4KmnnmLixIku286fP59JkyYxc+ZMd5TaqpjdXYA0P4fDwbx585gxYwZLly7FYrGw\nc+dO7rnnHkJCQur9woj8mJ96LRUXF7dQpeIpli9f7vz/BQsWkJCQwD333APAjh07+M1vfuOu0qSN\neP755xkzZsxFtzv3WpQL02PiNqioqIjc3FymTp3q/EY+bNgwfvvb32Kz
2Vi2bBkLFixwbn/48GF6\n9+4N1H1Yp6Sk8OSTTzJ8+HDGjh3Lq6++6pbzEPe72LVUXV3NokWLmDRpEoMHD+aaa65hw4YNAEyf\nPh2AmTNnOttE7HY7S5YsYfTo0YwcOZLXX3/dua53796kp6c7lxcsWMCyZcsAuPXWW1m4cCHJycnc\neeedLV63eLbc3FyGDh3KwoULSUpK4sMPP+TWW29l5cqV7i6tVVAYbIPCw8NJSkritttu44UXXmD7\n9u1UVlYyY8YMpkyZAoDBYHDZ59zltLQ0wsLC2Lp1K7///e9ZsmQJJ0+ebNFzEM9wsWvptddeIzMz\nk3/96198/fXXTJ8+nSeeeAKA999/H4D33nuPCRMmuPM0xIOUlJQQFBTEpk2bePLJJ3n66aedny/n\nfy6dLzU1lbVr1/LMM8+0RKnSypSVlREdHc2WLVv0BOwS6TFxG7V8+XLeffdd1q1bx/Lly3E4HEya\nNIlHHnnkovuazWbmzZuH0Wjk6quvJiAggOzsbDp27NgClYunudC1NGfOHGbPno2/vz/Hjx8nMDCQ\nU6dOueyvbslyLovFwrx58zAYDIwePZrAwEByc3Pp2LHjRa+VcePGERgY2EKViqe6//77MZvNzv6C\nEyZMcHZFmDZtGmazGbNZ8eZS6E+rjbJYLMydO5e5c+ditVrZvXs3zzzzDA899BCJiYkX3Lddu3bO\nzrmA85dOvNOPXUsPP/wwCxcu5A9/+AP79u0jJiaG6OhoXStyQYGBgRiNZx9K+fj4UFtb26h9IyIi\nLldZ0oo8++yz9foM5ubmYjAY6NChg5uqat30mLgNWr16NSkpKc5li8XCiBEjuPfeezl48CAmkwmb\nzeZcX1RU5I4ypRW40LV04MABHnvsMeLi4ti2bRv/93//x+zZs91YrbR2RqPR5bNJLyDJpbpYVwNp\nmMJgGzRy5EhOnz7NkiVLKCwsBOqGB3nrrbcYP348sbGx7Nu3j1OnTlFWVsaKFSvcXLF4qotdS+Xl\n5fj5+WEwGDhx4gTPPfccgPNOj4+Pj8YZlEbr1q2b82WjzZs3s2fPHjdXJK2Jnko0ncJgGxQaGso7\n77zD0aNHmTp1KoMHD+aOO+5g4MCBPPjgg0ycOJFRo0aRkpLC9ddfz9ixYy94PH3T8l4Xu5YefPBB\nPvvsM5KSkrj//vuZP38+gYGBZGRkAHVvFN9222188MEHbj4TcYfGfHacu80jjzzC2rVrGTp0KO+8\n8w7Tpk27pGNJ23eh6+BCL0bKhWnQaREREREvpjuDIiIiIl5MYVBERETEiykMioiIiHgxhUERERER\nL6YwKCIiIuLFFAZFREREvJjCoIiIiIgXUxgUEWkmq1atYvz48e4uQ0TkkigMiog0I816ICKtjcKg\niIiIiBdTGBQRaaKsrCzmzp3LoEGDuOmmmzh27Jhz3aZNm5gxYwYDBw5k8ODBzJs3j5MnTwJw3XXX\n8eKLL7oca8GCBSxevLhF6xcRAYVBEZEmsdlszJ8/n7CwMP75z38yb9483njjDQBycnK4++67ufHG\nG1mzZg2vvvoqOTk5zgA4bdo01qxZ4zxWeXk5mzZtIiUlxS3nIiLeTWFQRKQJNm/ezOnTp1m8eDHx\n8fFMnjyZOXPmAGC323n44YeZPXs2nTt35oorruDaa68lPT0dgKlTp5KRkUFaWhoA69atIyoqiv79\n+7vtfETEe5ndXYCISGuUkZFBdHQ0QUFBzrb+/fvz0UcfERMTg6+vL8uXLyctLc0Z/Pr16wdAdHQ0\ngwYNYvXq1SQkJLB69WqmTp3qrlMRES+nO4MiIk1gMBhwOBwubWZz3ffrtLQ0Jk+ezP79+xk0aBCP\nPPIIt99+u8u2KSkprFmzhpKSErZs2aIwKCJuozAoItIECQkJZGdnU1xc7Gz77rvvAPjf//1fBg4c\nyNKlS5k9ezaDBw/m6NGjLuFx8uTJ
5ObmsmLFCnr16kX37t1b/BxEREBhUESkSUaMGEFsbCy/+93v\nOHz4MOvXr+ett94CIDIykvT0dL755huys7N5+eWX2bhxI1ar1bl/aGgoycnJvPbaa0ybNs1dpyEi\nojAoItIUJpOJ5cuXA3DzzTezdOlS5s2bB8DcuXMZOnQov/rVr5g5cyYnT57kueeeIysri8rKSucx\npk6dSk1NDVOmTHHLOYiIABgc53d6ERGRFvHGG2/w5Zdf8vrrr7u7FBHxYrozKCLSwg4fPsyqVat4\n7bXX+PnPf+7uckTEyykMioi0sIMHD/LYY48xbtw4Jk6c6O5yRMTL6TGxiIiIiBfTnUERERERL6Yw\nKCIiIuLFFAZFREREvJjCoIiIiIgXUxgUERER8WIKgyIiIiJe7P8DUziZSrtXB2AAAAAASUVORK5C\nYII=\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"\n",
"sns.set(style='ticks', context='talk')\n",
"\n",
"# Fit just \"tip ~ day\", to make plotting easier\n",
"lm = LinearRegression()\n",
"lm.fit(X_factorized[['day']], y)\n",
"b0, b1 = lm.intercept_, lm.coef_.ravel()[0]\n",
"\n",
"ax = sns.stripplot(x='day', y='tip', data=df, jitter=1)\n",
"xlim = ax.get_xlim()\n",
"xx = np.linspace(*xlim, 2)\n",
"ax.plot(xx, b0 + b1 * xx, lw=4)\n",
"sns.despine()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This may not be true in practice, and our data representation should not force our model to assume it.\n",
"Better to let the model *discover* that relationship, if it actually happens to be there."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dummy Encoding\n",
"\n",
"There's a better approach: dummy encoding.\n",
"This expands each categorical column to *multiple* columns, one per distinct value.\n",
"The values in these new dummy-encoded columns are either 1, indicating the presence of that value in that observation, or 0.\n",
"Versions of this are implemented in both scikit-learn and pandas."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 1, 0],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 0, 0, 1],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [1, 0, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 1, 0, 0],\n",
" [0, 0, 0, 1]])"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.preprocessing import LabelBinarizer\n",
"enc = LabelBinarizer()\n",
"enc.fit_transform(df['day'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So the `day` column has expanded to four columns, one for each unique value."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array(['Sun', 'Sat', 'Thur', 'Fri'], dtype=object)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.day.unique()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I recommend the pandas version, `get_dummies`. It offers a few conveniences:\n",
"\n",
"- Operates on multiple columns at once\n",
"- Passes through numeric columns unchanged\n",
"- Preserves row and column labels\n",
"- Provides a `drop_first` keyword for dropping a level per column. You might want this to avoid [perfect multicollinearity](https://en.wikipedia.org/wiki/Multicollinearity) if you have an intercept."
]
},
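{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the conveniences listed above, here is `get_dummies` on a tiny, made-up frame (the `toy` data is an assumption for illustration, not the tips dataset). It shows the numeric column passing through unchanged and `drop_first=True` dropping one level per encoded column.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: one numeric column and two string columns.\n",
"toy = pd.DataFrame({\n",
"    'total_bill': [16.99, 10.34, 21.01],\n",
"    'sex': ['Female', 'Male', 'Male'],\n",
"    'day': ['Sun', 'Sat', 'Sun'],\n",
"})\n",
"\n",
"# Both string columns are encoded in one call; total_bill is left alone.\n",
"full = pd.get_dummies(toy)\n",
"\n",
"# drop_first removes the first (sorted) level of each encoded column.\n",
"reduced = pd.get_dummies(toy, drop_first=True)\n",
"\n",
"print(list(full.columns))\n",
"print(list(reduced.columns))\n",
"```\n",
"\n",
"The row index of `full` is the same as that of `toy`, which is what makes it easy to line the dummies back up with the original rows."
]
},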
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"total_bill 0.094487\n",
"size 0.175992\n",
"sex_Female 0.016220\n",
"sex_Male -0.016220\n",
"smoker_No 0.043204\n",
" ... \n",
"day_Sat -0.044159\n",
"day_Sun 0.051819\n",
"day_Thur -0.084960\n",
"time_Dinner -0.034064\n",
"time_Lunch 0.034064\n",
"dtype: float64"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_dummy = pd.get_dummies(X)\n",
"\n",
"lm = LinearRegression()\n",
"lm.fit(X_dummy, y)\n",
"\n",
"pd.Series(lm.coef_, index=X_dummy.columns)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This version solves both of our issues with `factorize`.\n",
"\n",
"1. The ordering of rows doesn't matter anymore, because the dummy columns are computed from the sorted categories rather than from order of appearance.\n",
"2. \"Adjacent\" categories each have their own parameter, so they're free to have their own effect."
]
},
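{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first point can be checked with a small sketch. The two series below are made up for illustration: they hold the same four days in opposite row order. `pd.factorize` numbers values by first appearance, so its mapping changes with row order, while `pd.get_dummies` produces the same sorted columns either way.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical data: the same values in two different row orders.\n",
"days = pd.Series(['Sun', 'Sat', 'Thur', 'Fri'])\n",
"shuffled = days.iloc[::-1].reset_index(drop=True)\n",
"\n",
"# factorize assigns integer codes in order of first appearance.\n",
"codes_a, uniques_a = pd.factorize(days)\n",
"codes_b, uniques_b = pd.factorize(shuffled)\n",
"print(list(uniques_a), list(uniques_b))\n",
"\n",
"# get_dummies lays the columns out in sorted order for both inputs.\n",
"cols_a = list(pd.get_dummies(days).columns)\n",
"cols_b = list(pd.get_dummies(shuffled).columns)\n",
"print(cols_a, cols_b)\n",
"```"
]
},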
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Refinements\n",
"\n",
"Our last approach worked, but there's still room for improvement.\n",
"\n",
"1. We can't easily go from dummies back to categoricals.\n",
"2. It doesn't integrate with scikit-learn `Pipeline` objects.\n",
"3. When working with a larger dataset and `partial_fit`, some categories could be missing from a given subset of the data, so the dummy columns would differ between subsets.\n",
"4. It's memory inefficient if there are many records relative to distinct categories.\n",
"\n",
"To solve these, we'll store additional information in the *type* of the column and write a [Transformer](http://scikit-learn.org/stable/modules/generated/sklearn.base.TransformerMixin.html) to handle the conversion to and from dummies."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Pandas `Categorical` dtype\n",
"\n",
"Pandas provides a `Categorical` dtype, which stores:\n",
"\n",
"- All the possible values the column can take\n",
"- Whether there is an ordering on the categories\n",
"- A relationship between each distinct categorical value and an integer code\n",
"\n",
"Let's convert the categorical columns to `Categorical` dtype.\n",
"With `.astype('category')` we get the defaults:\n",
"\n",
"- The set of categories is just the set present in the column\n",
"- There is no ordering"
]
},
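{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the three stored pieces concrete, here is a hedged sketch on made-up data. Unlike the `.astype('category')` defaults, it passes the full set of categories explicitly (the `all_days` list is an assumption for illustration), which shows why the dtype helps when a subset of the data is missing some values.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Assumed full set of possible values, in a hand-picked order.\n",
"all_days = ['Thur', 'Fri', 'Sat', 'Sun']\n",
"\n",
"# A hypothetical chunk of data that only contains two of the four days.\n",
"chunk = pd.Series(['Sat', 'Sun', 'Sat'])\n",
"as_cat = pd.Series(pd.Categorical(chunk, categories=all_days, ordered=False))\n",
"\n",
"print(list(as_cat.cat.categories))  # all possible values\n",
"print(as_cat.cat.ordered)           # whether the categories are ordered\n",
"print(list(as_cat.cat.codes))       # integer code for each row\n",
"\n",
"# Without the dtype, only observed values get a dummy column.\n",
"plain = pd.get_dummies(chunk)\n",
"# With the dtype, every category gets a column, observed or not.\n",
"typed = pd.get_dummies(as_cat)\n",
"print(list(plain.columns), list(typed.columns))\n",
"```"
]
},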
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 244 entries, 0 to 243\n",
"Data columns (total 6 columns):\n",
"total_bill 244 non-null float64\n",
"sex 244 non-null category\n",
"smoker 244 non-null category\n",
"day 244 non-null category\n",
"time 244 non-null category\n",
"size 244 non-null int64\n",
"dtypes: category(4), float64(1), int64(1)\n",
"memory usage: 4.9 KB\n"
]
}
],
"source": [
"columns = ['sex', 'smoker', 'day', 'time']\n",
"X[columns] = X[columns].apply(lambda x: x.astype('category'))\n",
"X.info()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lm = LinearRegression().fit(pd.get_dummies(X), y)\n",
"lm"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from sklearn.base import TransformerMixin"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 244 entries, 0 to 243\n",
"Data columns (total 6 columns):\n",
"total_bill 244 non-null float64\n",
"sex 244 non-null category\n",
"smoker 244 non-null category\n",
"day 244 non-null category\n",
"time 244 non-null category\n",
"size 244 non-null int64\n",
"dtypes: category(4), float64(1), int64(1)\n",
"memory usage: 4.9 KB\n"
]
}
],
"source": [
"X.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Goals\n",
"\n",
"1. Convince you that writing your own transformers isn't impossibly hard\n",
"2. Give a concrete example of going from pandas to NumPy and back again"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['sex', 'smoker', 'day', 'time'], dtype='object')"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.select_dtypes(include=['category']).columns"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.day.cat"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['Female', 'Male'], dtype='object')"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.sex.cat.categories"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 16.99, 2. , 1. , ..., 0. , 1. , 0. ],\n",
" [ 10.34, 3. , 0. , ..., 0. , 1. , 0. ],\n",
" [ 21.01, 3. , 0. , ..., 0. , 1. , 0. ],\n",
" ..., \n",
" [ 22.67, 2. , 0. , ..., 0. , 1. , 0. ],\n",
" [ 17.82, 2. , 0. , ..., 0. , 1. , 0. ],\n",
" [ 18.78, 2. , 1. , ..., 1. , 1. , 0. ]])"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"class DummyEncoder(TransformerMixin):\n",
"\n",
"    def fit(self, X, y=None):\n",
"        # Record what transform and inverse_transform will need.\n",
"        self.columns_ = X.columns\n",
"        self.cat_cols_ = X.select_dtypes(include=['category']).columns\n",
"        self.non_cat_cols_ = X.columns.drop(self.cat_cols_)\n",
"\n",
"        # Keep each column's .cat accessor for its categories and ordering.\n",
"        self.cat_map_ = {col: X[col].cat for col in self.cat_cols_}\n",
"        left = len(self.non_cat_cols_)  # number of non-categorical columns (2 here)\n",
"        self.cat_blocks_ = {}\n",
"\n",
"        # Each categorical column maps to a contiguous block of dummy columns.\n",
"        for col in self.cat_cols_:\n",
"            right = left + len(X[col].cat.categories)\n",
"            self.cat_blocks_[col] = slice(left, right)\n",
"            left = right\n",
"        return self\n",
"\n",
"    def transform(self, X, y=None):\n",
"        # Force a float array so the result is homogeneous.\n",
"        return np.asarray(pd.get_dummies(X), dtype=float)\n",
"\n",
"    def inverse_transform(self, trn, y=None):\n",
"        # NumPy array back to a pandas DataFrame:\n",
"        # map column positions back to the original column names.\n",
"        numeric = pd.DataFrame(trn[:, :len(self.non_cat_cols_)],\n",
"                               columns=self.non_cat_cols_)\n",
"        series = []\n",
"        for col, slice_ in self.cat_blocks_.items():\n",
"            # The position of the 1 within each block is the category code.\n",
"            codes = trn[:, slice_].argmax(1)\n",
"            cat = pd.Categorical.from_codes(codes,\n",
"                                            self.cat_map_[col].categories,\n",
"                                            ordered=self.cat_map_[col].ordered)\n",
"            series.append(pd.Series(cat, name=col))\n",
"        return pd.concat([numeric] + series, axis=1)[self.columns_]\n",
"\n",
"\n",
"de = DummyEncoder()\n",
"trn = de.fit_transform(X)\n",
"trn"
]
},
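{
"cell_type": "markdown",
"metadata": {},
"source": [
"The inverse step rests on one idea: within a block of dummy columns, the position of the 1 in each row is that row's integer category code. The sketch below isolates that round trip on a single made-up column (the values and the `categories` list are assumptions for illustration), using `argmax` and `pd.Categorical.from_codes` as in `inverse_transform`.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical single categorical column with a known set of categories.\n",
"categories = ['Fri', 'Sat', 'Sun', 'Thur']\n",
"original = pd.Series(pd.Categorical(['Sun', 'Sat', 'Thur', 'Sun'],\n",
"                                    categories=categories))\n",
"\n",
"# Forward: one dummy column per category, as a plain float array.\n",
"block = np.asarray(pd.get_dummies(original), dtype=float)\n",
"\n",
"# Backward: the column index of the 1 in each row is the category code.\n",
"codes = block.argmax(axis=1)\n",
"recovered = pd.Categorical.from_codes(codes, categories=categories)\n",
"\n",
"print(list(codes))\n",
"print(list(recovered))\n",
"```\n",
"\n",
"This only works because the categories, and therefore the column order of the block, are known ahead of time. That is the information `fit` records."
]
},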
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" total_bill sex smoker day time size\n",
"0 16.99 Female No Sun Dinner 2.0\n",
"1 10.34 Male No Sun Dinner 3.0\n",
"2 21.01 Male No Sun Dinner 3.0\n",
"3 23.68 Male No Sun Dinner 2.0\n",
"4 24.59 Female No Sun Dinner 4.0\n",
".. ... ... ... ... ... ...\n",
"239 29.03 Male No Sat Dinner 3.0\n",
"240 27.18 Female Yes Sat Dinner 2.0\n",
"241 22.67 Male Yes Sat Dinner 2.0\n",
"242 17.82 Male No Sat Dinner 2.0\n",
"243 18.78 Female No Thur Dinner 2.0\n",
"\n",
"[244 rows x 6 columns]"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"de.inverse_transform(trn)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}