{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "

Machine Learning Methods

\n", "

Seminar: Linear Classification Models

\n", "

Dimensionality Reduction Methods

" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "plt.style.use('ggplot')\n", "plt.rcParams['figure.figsize'] = (12,8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Logistic Regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Toy Example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a sample and try out logistic regression on it" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "np.random.seed(0)\n", "X = np.r_[np.random.randn(20, 2) + [2, 2],\n", " np.random.randn(20, 2) + [-2, -2]]\n", "y = [-1] * 20 + [1] * 20" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(figsize=(7, 7))\n", "ax.scatter(X[:, 0],\n", " X[:, 1],\n", " c=y,\n", " cmap=plt.cm.Paired)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fit logistic regression on this data and draw the separating hyperplane (a straight line in two dimensions)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n" ] }, { "data": { "text/plain": [ "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n", " intercept_scaling=1, max_iter=100, multi_class='warn',\n", " n_jobs=None, penalty='l2', random_state=None, solver='warn',\n", " tol=0.0001, verbose=0, warm_start=False)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = LogisticRegression(C=1.0, \n", " fit_intercept=True, \n", " penalty='l2')\n", "model.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "w_0 = -0.183954\n", "w_1, w_2 = [[-1.06097157 -1.00171289]]\n" ] } ], "source": [ "print('w_0 = %f' % model.intercept_[0])\n", "print('w_1, w_2 = ', model.coef_)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Draw this hyperplane: solve w_0 + w_1*x_1 + w_2*x_2 = 0 for x_2\n", "w_0 = model.intercept_[0]\n", "w_1 = model.coef_[0][0]\n", "w_2 = model.coef_[0][1]\n", "\n", "x_1 = np.linspace(-4, 4, 10)\n", "x_2 = - (w_0 + w_1*x_1)/w_2" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "[]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(figsize=(7, 7))\n", "ax.scatter(X[:, 0],\n", " X[:, 1],\n", " c=y,\n", " cmap=plt.cm.Paired)\n", "plt.plot(x_1, x_2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Text Example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take the text data from [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00331/). The archive contains 3 files with positive and negative reviews from the following sites\n", "* imdb.com\n", "* amazon.com\n", "* yelp.com\n", "\n", "Each line of a file has the following format:\n", "<review>\\t<label>\\n\n", "\n", "\n", "### Task\n", "1. Load the texts and the class labels into separate variables\n", "2. Choose a classification quality metric\n", "3. Train a logistic regression (without hyperparameter tuning). The texts are represented as a bag of words\n", "4. Print the most significant words in the texts\n", "5. Use cross-validation to find good hyperparameter values for `CountVectorizer` and `LogisticRegression`" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('data/sentiment/imdb_labelled.txt', sep='\\t', header=None, names=['text', 'label'])" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th>text</th><th>label</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>0</th><td>A very, very, very slow-moving, aimless movie ...</td><td>0</td></tr>\n", "<tr><th>1</th><td>Not sure who was more lost - the flat characte...</td><td>0</td></tr>\n", "<tr><th>2</th><td>Attempting artiness with black &amp; white and cle...</td><td>0</td></tr>\n", "<tr><th>3</th><td>Very little music or anything to speak of.</td><td>0</td></tr>\n", "<tr><th>4</th><td>The best scene in the movie was when Gerardo i...</td><td>1</td></tr>\n", "</tbody>\n", "</table>
" ], "text/plain": [ " text label\n", "0 A very, very, very slow-moving, aimless movie ... 0\n", "1 Not sure who was more lost - the flat characte... 0\n", "2 Attempting artiness with black & white and cle... 0\n", "3 Very little music or anything to speak of. 0\n", "4 The best scene in the movie was when Gerardo i... 1" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "from sklearn.feature_extraction.text import CountVectorizer\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.model_selection import StratifiedKFold\n", "from sklearn.model_selection import RandomizedSearchCV, GridSearchCV" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "model = Pipeline([\n", " ('vect', CountVectorizer(max_df=0.95, stop_words='english', ngram_range=(2,2))),\n", " ('clf', LogisticRegression())\n", "])" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "X = df.text.values\n", "y = df.label.values" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. 
Specify a solver to silence this warning.\n", " FutureWarning)\n" ] }, { "data": { "text/plain": [ "Pipeline(memory=None,\n", " steps=[('vect', CountVectorizer(analyzer='word', binary=False, decode_error='strict',\n", " dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',\n", " lowercase=True, max_df=0.95, max_features=None, min_df=1,\n", " ngram_range=(2, 2), preprocessor=None, stop_words='english',\n", " ...penalty='l2', random_state=None, solver='warn',\n", " tol=0.0001, verbose=0, warm_start=False))])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "coefs = model.named_steps['clf'].coef_[0] # Logistic regression weights, one per feature (here, per bigram)\n", "words = model.named_steps['vect'].get_feature_names()" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "word_coefs = pd.Series(index=words, data=coefs).sort_values()" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "waste time -0.945444\n", "bad film -0.801509\n", "avoid costs -0.655693\n", "just bad -0.621132\n", "don waste -0.585915\n", "film just -0.565970\n", "worst series -0.554764\n", "90 minutes -0.549202\n", "hour half -0.545175\n", "cover girl -0.521187\n", "story stupid -0.513814\n", "avoid avoid -0.512680\n", "movie pretty -0.488534\n", "ve seen -0.462487\n", "just awful -0.456271\n", "acting bad -0.447700\n", "film started -0.444112\n", "really bad -0.424706\n", "hate movies -0.418471\n", "just didn -0.414149\n", "action scenes -0.413278\n", "couldn seriously -0.393361\n", "worse ticker -0.393361\n", "just lame -0.393361\n", "plot storyline -0.393361\n", "just painful -0.393361\n", "great disappointment -0.393361\n", "script script -0.393361\n", "plot whatsoever -0.393361\n", "engaging exciting -0.393361\n", " ... 
\n", "transfers good 0.408732\n", "like movie 0.412289\n", "rate movie 0.417140\n", "movie 10 0.417140\n", "film like 0.417562\n", "think disappointed 0.420384\n", "characters interesting 0.423554\n", "wonderful story 0.436351\n", "feel good 0.455643\n", "cinematography film 0.463604\n", "excellent job 0.487130\n", "thought provoking 0.487886\n", "excellent film 0.491852\n", "wind lion 0.496507\n", "short film 0.496581\n", "cast good 0.500491\n", "movie really 0.507134\n", "great film 0.519239\n", "music film 0.526220\n", "liked movie 0.540232\n", "definitely worth 0.559673\n", "highly recommended 0.568294\n", "saw film 0.573340\n", "movie good 0.574355\n", "long time 0.598076\n", "excellent performance 0.643658\n", "gave 10 0.643658\n", "worth checking 0.761545\n", "film great 0.837645\n", "10 10 1.126112\n", "Length: 5693, dtype: float64" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "word_coefs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dimensionality Reduction Methods" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "# Load data\n", "df_wine = pd.read_csv('data/winequality-red.csv', sep=';')\n", "\n", "# Make classification target feature\n", "df_wine.loc[:, 'quality_cat'] = (df_wine.quality > 5).astype(int)\n", "df_wine = df_wine.drop('quality', axis=1)\n", "\n", "# Get descriptive and target features\n", "X = df_wine.iloc[:, :-1].values\n", "y = df_wine.iloc[:, -1].values" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th>fixed acidity</th><th>volatile acidity</th><th>citric acid</th><th>residual sugar</th><th>chlorides</th><th>free sulfur dioxide</th><th>total sulfur dioxide</th><th>density</th><th>pH</th><th>sulphates</th><th>alcohol</th><th>quality_cat</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>0</th><td>7.4</td><td>0.70</td><td>0.00</td><td>1.9</td><td>0.076</td><td>11.0</td><td>34.0</td><td>0.9978</td><td>3.51</td><td>0.56</td><td>9.4</td><td>0</td></tr>\n", "<tr><th>1</th><td>7.8</td><td>0.88</td><td>0.00</td><td>2.6</td><td>0.098</td><td>25.0</td><td>67.0</td><td>0.9968</td><td>3.20</td><td>0.68</td><td>9.8</td><td>0</td></tr>\n", "<tr><th>2</th><td>7.8</td><td>0.76</td><td>0.04</td><td>2.3</td><td>0.092</td><td>15.0</td><td>54.0</td><td>0.9970</td><td>3.26</td><td>0.65</td><td>9.8</td><td>0</td></tr>\n", "<tr><th>3</th><td>11.2</td><td>0.28</td><td>0.56</td><td>1.9</td><td>0.075</td><td>17.0</td><td>60.0</td><td>0.9980</td><td>3.16</td><td>0.58</td><td>9.8</td><td>1</td></tr>\n", "<tr><th>4</th><td>7.4</td><td>0.70</td><td>0.00</td><td>1.9</td><td>0.076</td><td>11.0</td><td>34.0</td><td>0.9978</td><td>3.51</td><td>0.56</td><td>9.4</td><td>0</td></tr>\n", "</tbody>\n", "</table>
" ], "text/plain": [ " fixed acidity volatile acidity citric acid residual sugar chlorides \\\n", "0 7.4 0.70 0.00 1.9 0.076 \n", "1 7.8 0.88 0.00 2.6 0.098 \n", "2 7.8 0.76 0.04 2.3 0.092 \n", "3 11.2 0.28 0.56 1.9 0.075 \n", "4 7.4 0.70 0.00 1.9 0.076 \n", "\n", " free sulfur dioxide total sulfur dioxide density pH sulphates \\\n", "0 11.0 34.0 0.9978 3.51 0.56 \n", "1 25.0 67.0 0.9968 3.20 0.68 \n", "2 15.0 54.0 0.9970 3.26 0.65 \n", "3 17.0 60.0 0.9980 3.16 0.58 \n", "4 11.0 34.0 0.9978 3.51 0.56 \n", "\n", " alcohol quality_cat \n", "0 9.4 0 \n", "1 9.8 0 \n", "2 9.8 0 \n", "3 9.8 1 \n", "4 9.4 0 " ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_wine.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Recursive Feature Elimination" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "from sklearn.feature_selection import RFECV\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.model_selection import StratifiedKFold\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.preprocessing import StandardScaler" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "# Define the CV splitter first: RFECV below takes it as an argument\n", "cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=123)\n", "\n", "model_rfe = Pipeline([\n", " ('scaler', StandardScaler()),\n", " ('clf', RFECV(LogisticRegression(), step=1, cv=cv, verbose=1, scoring='roc_auc'))\n", "])" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fitting estimator with 11 features.\n", "Fitting estimator with 10 features.\n", "Fitting estimator with 9 features.\n", "Fitting estimator with 8 features.\n", "Fitting estimator with 7 features.\n", "Fitting estimator with 6 features.\n", "Fitting estimator with 5 features.\n", "Fitting estimator with 4 features.\n", "Fitting estimator with 3 
features.\n", "Fitting estimator with 2 features.\n", "Fitting estimator with 11 features.\n", "Fitting estimator with 10 features.\n", "Fitting estimator with 9 features.\n", "Fitting estimator with 8 features.\n", "Fitting estimator with 7 features.\n", "Fitting estimator with 6 features.\n", "Fitting estimator with 5 features.\n", "Fitting estimator with 4 features.\n", "Fitting estimator with 3 features.\n", "Fitting estimator with 2 features.\n", "Fitting estimator with 11 features.\n", "Fitting estimator with 10 features.\n", "Fitting estimator with 9 features.\n", "Fitting estimator with 8 features.\n", "Fitting estimator with 7 features.\n", "Fitting estimator with 6 features.\n", "Fitting estimator with 5 features.\n", "Fitting estimator with 4 features.\n", "Fitting estimator with 3 features.\n", "Fitting estimator with 2 features.\n", "Fitting estimator with 11 features.\n", "Fitting estimator with 10 features.\n", "Fitting estimator with 9 features.\n", "Fitting estimator with 8 features.\n", "Fitting estimator with 7 features.\n", "Fitting estimator with 6 features.\n", "Fitting estimator with 5 features.\n", "Fitting estimator with 4 features.\n", "Fitting estimator with 3 features.\n", "Fitting estimator with 2 features.\n", "Fitting estimator with 11 features.\n", "Fitting estimator with 10 features.\n", "Fitting estimator with 9 features.\n", "Fitting estimator with 8 features.\n", "Fitting estimator with 7 features.\n", "Fitting estimator with 6 features.\n", "Fitting estimator with 5 features.\n", "Fitting estimator with 4 features.\n", "Fitting estimator with 3 features.\n", "Fitting estimator with 2 features.\n", "Fitting estimator with 11 features.\n", "Fitting estimator with 10 features.\n", "Fitting estimator with 9 features.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 
'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. 
Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n", "/Users/andrey.shestakov/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n", " FutureWarning)\n" ] }, { "data": { "text/plain": [ "Pipeline(memory=None,\n", " steps=[('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)), ('clf', RFECV(cv=StratifiedKFold(n_splits=5, random_state=123, shuffle=True),\n", " estimator=LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n", " intercept_scaling=1, max_iter=100, multi_clas...m_start=False),\n", " min_features_to_select=1, n_jobs=None, scoring='roc_auc', step=1,\n", " verbose=1))])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_rfe.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "rfe = model_rfe.named_steps['clf']" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.75720164, 0.7955729 , 0.80091782, 0.81279034, 0.81248978,\n", " 0.81463012, 0.8164726 , 0.81848756, 0.81790802, 0.81704451,\n", " 0.81717928])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rfe.grid_scores_ # Cross-validated ROC AUC for 1, 2, 3, ... selected features" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "array([ True, True, True, False, True, True, True, False, False,\n", " True, True])" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rfe.support_" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "num_of_features = range(1, 12)" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0,0.5,'ROC AUC')" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAt0AAAHmCAYAAACvYjIBAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzs3Xt8VfWd7//3d++dC7dA2EEiEG6BoIA3TDkQvICk2ou2SL2MztGjONOL9jLT6YzFYn+dKiPHajt9eGm1pWBHe0or7fTMWGknY/WMoe3QKrU7XHZCAiEmGpJwCQkkO1nf3x8rBAIhBMhea19ez8eDR7LXWkne2V9b3yw/ay1jrbUCAAAAEDcBvwMAAAAAqY7SDQAAAMQZpRsAAACIM0o3AAAAEGeUbgAAACDOKN0AAABAnFG6AQAAgDijdAMAAABxRukGAAAA4ozSDQAAAMRZyO8A8VJfX+93hLSRl5enpqYmv2Mgzljn1McapwfWOT2wzt6ZMGHCoI7jTDcAAAAQZ5RuAAAAIM4o3QAAAECcUboBAACAOKN0AwAAAHHm2d1Ltm7dqnXr1slxHC1dulTLli3rs7+pqUnPPPOM2tra5DiO7rzzTs2bN0/vvPOOXnrpJXV1dSkUCumuu+7S3LlzvYoNAAAAnDdPSrfjOFq7dq1WrVqlcDislStXqri4WJMmTeo9ZuPGjVq4cKGuv/561dXV6bHHHtO8efM0atQoPfjggxo7dqxqa2u1evVqPffcc17EBgAAAIaEJ+MlVVVVys/P1/jx4xUKhVRSUqItW7b0OcYYo/b2dklSe3u7cnNzJUnTpk3T2LFjJUkFBQWKxWKKxWJexAYAAACGhCdnultaWhQOh3tfh8NhVVZW9jnm1ltv1aOPPqpNmzapo6NDDz/88Cnf5/e//72mTZumjIyMU/aVlZWprKxMkrRmzRrl5eUN8W+B0wmFQrzfaYB1Tn2scXpgndMD65x4PCnd1tpTthlj+rwuLy/X4sWLddNNNykajeqpp57Sk08+qUDAPRm/d+9evfTSS/rKV77S788oLS1VaWlp72uewuQdnnqVHljn1McapwfWOT2wzt5JqCdShsNhNTc3975ubm7uHR855rXXXtPChQslSUVFRYrFYmptbe09/oknntADDzyg/Px8LyIDAAAAQ8aT0l1YWKiGhgY1Njaqq6tLmzdvVnFxcZ9j8vLyFIlEJEl1dXWKxWLKyclRW1ub1qxZozvuuEMXXXSRF3EBAACAIeXJeEkwGNSKFSu0evVqOY6jJUuWqKCgQBs2bFBhYaGKi4t1991367nnntMrr7wiSbr//vtljNGmTZv03nvvaePGjdq4caMkadWqVRo9erQX0QEAAIDzZmx/A9cpoL6+3u8IaYO5sfTAOqc+1j
g9sM7pgXX2TkLNdAMAAADpjNINAAAAxBmlGwCANGG7u2U7Ovq9lS+A+PLkQkoAAOAta63U3ChbUynV7JStiUq1u9TY2SkFQ9LwEdKwET0fh0vDR8gMH+l+fmz78BEyw0b0ea1hI6TsYac8bwPAwCjdAACkANt+WNpdKVsdld1dKVXvlFoPujtDGdKUQplrPqQR+RPU1rxPam+T2ttkj7RJR9qlAy2y7W3SkTaps+P49+3vh5lAb1F3P46Uho2QObbthIJuTi7sPV9jAkFP3hcgUVC6AQBIMrYrJtXtds9e10Tdj++9e/yA/Ekyc6+UphXJTC+SJk6VCbn/yh+Rl6cjZ7irhe2KuUX8WAnv+WhPfN3ulnV7pE1qPyzta5A90u7uP9J+/Hud7odkDzuhhB87q36syI+Uhrtn3M2Jhf2E401Gxnm+i4C3KN0AACQwa620772+Bbu2WuqKuQeMGi1NnyWzYInMtCJp6gx3TOQ8mFCG+31H9X0mxmAHSqzTLR09ckI5P6G095R19+PhniLfc6a9Ye/xUu847vc63Q/JyDz17HnPGXedeMZ92AiZUTlSzhgpJ1caOYqz7PAFpRsAgARi21rdcn1sTKQmKh0+5O7MzJQmz5C57qPS1J6z2GPHJdx8tQkE3ZGTk8r/oEu7tVLH0eMF/cjhnlGYnrLefvj4GfX242XeNjceL/rH/lKik4q7CUijctwCnjNGZvSYnkLulnKTM0Ya7e7TiFEyAe45gaFB6QYAwCc2FpP2Vve92LGxwd1pjHRhgcxlH5CmzXLPYk+Y3DsmksqMMe74SfawvtvP4nvYWKdbzNva3L+0HDoge2i/dOiAdHC/7KED7rb333W3xTrdrzvxmwQC0qgxUs5oaXSuzKgTCnnOGLeg5+RKo3sKeoL95QeJJfX/lwsAQAKw1kqNDbI1O6VjZ7H3VktdXe4Bo8e6M9hXfVBm6kxp6kx3xhnnxGRkuiMoObnHt53mWGute9b80AGpp5jbg24pV+sB2YM92+pr3W09a9anoAeDPQXdLeYmZ/TxkZacMTInlHUNH0lBT0OUbgAA4sC2HuyZwa50i3ZNpTsWIUlZ2dKUGTJLb5KZNkuaViTlhiliPjHGHJ8Bz5/objvNsdZad3zl2JnyE86eu697ttftdrd391PQQ6HjBf3kQt474tLzetgI/rlIEZRuAADOk411SrXVfc9i73vP3WkC0oQCmStL3DPZ02ZKF06WCXIxXzIyxkgjRrp/Lpw04MiLW9APHy/kB/dLrT1n0A/2FPSDLbJ7q91t/V08Gso4XshH9xTyk8dbjs2hD/H9063jSE631O1Ituej0+3m7O4+/vk5HHP8e3dL9jTHDnjMsdfHjzHX3CAz4+Ih+/2HGqUbAICzYB1Her++9+y1rYlKdTVuAZCk3Dxp2ky3AEyb5d4f+6TZZKQHt6CPcv9MmDxwQXccqe1w74iLPeHsee/rln3uX+gOHZSsc+qdXTIyewt5S1a2ujuOHi+lJ5fUU7Y5J5Xe7ji+M+coGHLn7ANBKRg4/nnPR3P5fL8TDojSDQDAAOyh/W65ro66RXt3lXuBniRlDXNv0Xf9MpmpRdL0IpkxYV/zIjmZQM9dVUblSBPPVNC7pcOtfQt5z9nzYyMuJhR058yPldJgUMYETtnmFtZjRTbo/peZYPCUQuuW3GDfY08uvYM4pm9ZPjHHwMekwl1kKN0AAPSwHR1S7a6+Z7GbG92dgYA0cYrMB652z2RPmyVdOJF7PsNzJhA8PnKiqf0W9Ny8PDWd4SFI8BalGwCQlqzTLTW8K7s76s5h1+yU3t3TO1ersePc2/Rd91GZqUXumEhWtq+ZASQvSjcAIOXZri7pvb2ye6rdM9m1u6S9Ne4DWCT3CYZTZ8p86BNu0Z5W5N5RAgCGCKUbAJBSbKxTenePW6z3VLsf63Yff0JhVrZUME1mUak0udB9quP4iSkxMwogcVG6AQBJy3YclfbWuMW6dpd7Jr
uh9vidF4aNkCZPdx+bPrlQZnKhNP5C5rABeI7SDQBICrb9sFuw9/QU7Npq6b06yfbcOG1kjjt3fcmVMlMKpcmFUt54HiwCICFQugEACce2HnQfNlO7S9rTM4N97GEzkjQm7Bbs4kXu2evJhTzREUBCo3QDAHxjrZUOtvTOXh8bE1HLCbc6yxvvjoYsKu05gz1dJoeLHAEkF0o3AMAT1lqp6f3eM9i21r2TiA4dcA8wxr2gccYcacp0mYLpbtkeMdLf4AAwBCjdAIAhZx1Haqzvmb/uGROprZbaD7sHBALuY7HnXukW6ynTpUnTeFw6gJRF6QYAnBfb3S017O0t1nbPsXtgH3EPCIWkiVNlihcdv4PIxMkymVn+BgcAD1G6AQCDZmMxqX5P3zuI1O2WYp3uAZlZ7j2wS65zL3ScXChdWCAT4l83ANIb/y8IAOiX7eiQ6mr63kGk/sR7YA93z1wv/nDPiEihNH4C98AGgH5QugEAskePqDPytpw/v9XzkJld0nvvStZxDxiZ4xbr6+cdv0Vf3nie4ggAg0TpBoA0Y62Vmhtld+2Qdm13P+7drf3HCvaYsW7BvnKRe4Hj5EIpN497YAPAeaB0A0CKs7GYe/Z6146eor3DvTe2JGVlS9NnyXz0Vo2+/AM6NGaczGjugQ0AQ43SDQApxh46IFXvkK3qKdm7K6WumLszb7zMRZdIhRfLFF4kTZwiE3RnsLPy8mSamgb4zgCAc0XpBoAkZp1uqX5v31GRxgZ3ZzDk3kFkyUdkCi+WCi+SGTPW38AAkKYo3QCQROyRdqlm5/Gz2DU7pSPt7s5Ro90z2Nfc4J7FnjJDJiPT38AAAEmUbgBIWMcem253bZeqes5iv7tHstZ9ZPrEKTLzrzk+KjIun4sdASBBUboBIEHYWKd7P+xdO9yivWuHdOiAuzN7mDT9IpkrFsrMuEiaWiQzfIS/gQEAg0bpBgCf2IP7e+ew7a4d0p4qqavL3TkuX2b2Fe4c9oyLpAmTeegMACQxSjcAeMA63dK7tbJV249f8Nj0vrszlCFNnSGz9KaeCx5nyeRw2z4ASCWUbgCIA9t+WKqOHh8VqY5KHUfcnaNz3TnsJR91Z7EnF8pkZPgbGAAQV5RuADhP1lqpsaF3Dtvu2iHV1/Zc8BiQJk2RWbikZ1TkYil8ARc8AkCaoXQDGJB1HKlqmzpqs2WPHJEyMnv+ZBz/GMqUMjOlYCgtyqTt7JB2V/W94PHwIXfnsBHueEjxIndUZNpMmezh/gYGAPiO0g1gQPbffiz77z/WgcEcbMzxEt6nmJ/6uQn1bMvs2RY6qcj3HGtOLvjHjs086XVGhkwgEJ/34ECzewa7qqdk11ZL3T0XPI6fKHPpB9yz2IUXSxdOilsOAEDyonQDOC279Xey//5jmQVLNGbZHTrYtE+KdUqxmHu2tyvW+/r4x46TXnfKdsWkzp7tR9qlrph7e7wTj+/slKxzaoazCRwMnVTu+z8jbzIy3YsXTyn0WcdfxzqlGncmW82N7vfPyHQveLz+427Bnj5LZtToIXmvAQCpjdINoF/2vTo5a7/lPtXw7geUeeEEmXBT7/54DJHY7u4By7u6OqXOTtkTX5+p8J+4/UibdOiA7LGvO/EvDsdu1XeiMWPdcl3ac1eRgmkyIS54BACcPUo3gFPYI+1ynvknKSNTgftXevYocRMMSsFh7oNgBjouDj/bOk5PAe8p6SYg5YxJixl1AED8UboB9GEdR866f5Ya6xX44iMyY8f5HckTJhCQMrPcPxrpdxwAQIrhah8AfdhXX5be/p3MrffKzLrE7zgAAKQESjeAXvbPf5T9xUsy/+NamaUf8zsOAAApg9INQJJkGxvkfP8JadJUmbs+yywzAABDiNINQLbjqJxn/0kyAQU+s1ImK8vvSAAApBRKN5DmrLWyLzwl1e9V4JNfkhmX73ckAABSDqUbSHP21/8qu+W/ZJbfJTP7Cr/jAACQkijdQBqz27bKbn
xB5spFMjcs9zsOAAApi9INpCnb9L6c731DunCSzD2f58JJAADiiNINpCHb2SHnO49J3Y4CDzwkc4YnQAIAgPND6QbSjLVW9l+elfbWKPBXX5S5YILfkQAASHmePQZ+69atWrdunRzH0dKlS7Vs2bI++5uamvTMM8+ora1NjuPozjvv1Lx589Ta2qpvfvObqqqq0uLFi3Xfffd5FRlISfa1V2R/9xuZj98pc+kH/I4DAEBa8KR0O46jtWvXatWqVQqHw1q5cqWKi4s1adKk3mM2btyohQsX6vrrr1ddXZ0ee+wxzZs3TxkZGbr99ttVW1urvXv3ehEXSFk2GpH9yfely+bLfOQ2v+MAAJA2PBkvqaqqUn5+vsaPH69QKKSSkhJt2bKlzzHGGLW3t0uS2tvblZubK0nKzs7WRRddpMzMTC+iAinLtjTJ+e7/li64UIEVfysTYLoMAACveHKmu6WlReFwuPd1OBxWZWVln2NuvfVWPfroo9q0aZM6Ojr08MMPn9XPKCsrU1lZmSRpzZo1ysvLO//gGJRQKMT7neBsrFMtj39ZNhbT2Ie+oVDBlLP+Hqxz6mON0wPrnB5Y58TjSem21p6y7eTbk5WXl2vx4sW66aabFI1G9dRTT+nJJ59UYJBn40pLS1VaWtr7uqmp6fxCY9Dy8vJ4vxOc88OnZSu3KfCZlTowbKR0DuvFOqc+1jg9sM7pgXX2zoQJg7shgSf/fTkcDqu5ubn3dXNzc+/4yDGvvfaaFi5cKEkqKipSLBZTa2urF/GAlOb8v02y//VrmY/cJjNvod9xAABIS56U7sLCQjU0NKixsVFdXV3avHmziouL+xyTl5enSCQiSaqrq1MsFlNOTo4X8YCUZXftkP3R89LcK2U+foffcQAASFuejJcEg0GtWLFCq1evluM4WrJkiQoKCrRhwwYVFhaquLhYd999t5577jm98sorkqT777+/dwTlgQceUHt7u7q6urRlyxatWrWqz51PAJzKHmiR85010tg8Bf7q72QCQb8jAQCQtoztb+A6BdTX1/sdIW0wN5Z4bFdMzpOrpNpqBVZ+Q2bS1PP+nqxz6mON0wPrnB5YZ+8k1Ew3AG/Zn6yVqrbL3POFISncAADg/FC6gRTjlP+n7G9+KXPDzQp84Cq/4wAAAFG6gZRid1fKvvisdPFlMjff7XccAADQg9INpAh76ICc7zwmjc5V4K//XibIhZMAACQKSjeQAmx3t5znvyG1HlLg/pUyo7jdJgAAiYTSDaQA+/J6aeefZe56QGZyod9xAADASSjdQJJzfv+GbNkvZJbepMDCJX7HAQAA/aB0A0nM1lbL/vApqWiOzC33+h0HAACcBqUbSFL28CH3wsnhoxT41D/IhDx5wCwAADgHlG4gCVmnW873npQONLsXTubk+h0JAAAMgNINJCH7ry9K296WufPTMtOK/I4DAADOgNINJBn7x3LZVzfKXPMhBa6+3u84AABgECjdQBKx79bKWfdtqfAimb/4a7/jAACAQaJ0A0nCth+W8+xqKXuYAp9+UCYjw+9IAABgkCjdQBKwjiNn7bek5ka3cI8J+x0JAACcBUo3kATsv/9YemeLzO1/LTNjtt9xAADAWaJ0AwnObv297L/9WGbRUpnFH/Y7DgAAOAeUbiCB2ffq5PzgW9KUGTJ/+RkZY/yOBAAAzgGlG0hQ9mi7nGcfk0IZ7gNwMjL9jgQAAM4RpRtIQNZx5Pzgn6X333Uf8T52nN+RAADAeaB0AwnIvvqy9PbvZG65V2bWJX7HAQAA54nSDSQYG/mj7C9ekpl/rUzpx/yOAwAAhgClG0ggtrFBzveekCZOlbn7s1w4CQBAiqB0AwnCdhyV8+w/SSbgXjiZleV3JAAAMEQo3UACsNbKvvCUVL9Xgb/+ksy4fL8jAQCAIUTpBhKA/Y9/ld3yXzI33yUz5wq/4wAAgCFG6QZ8Zrf/SfblF6QrS2Q+tNzvOAAAIA4o3YCPbHOjnO
cfly6cpMA9X+DCSQAAUhSlG/CJ7exwL5zsdhS4/yGZ7GF+RwIAAHFC6QZ8YK2VffFZaW+NAn/1RZnxE/yOBAAA4ojSDfjAvvaK7G9/I3PTHTKXfsDvOAAAIM4o3YDHbLRC9qdrpcvmy3z0Nr/jAAAAD1C6AQ/ZliY5310j5eUrsOJvZQL8TxAAgHTAv/EBj9hYzC3cnZ0KPPCQzPARfkcCAAAeoXQDHrH/5zmpJqrAir+RubDA7zgAAMBDlG7AA87/2yT7X7+W+citMvMW+h0HAAB4jNINxJndtUP2R89Lc+fJfPxOv+MAAAAfULqBOLIH97tz3GPzFPirL8kEgn5HAgAAPqB0A3Fiu2Jyvvu/pfY2Be5fKTNipN+RAACATyjdQJzYn6yVqrbJ/K/PyUya5nccAADgI0o3EAdO+X/K/uaXMtffrMD8a/yOAwAAfEbpBoaY3V0p++Kz0sWXySy/2+84AAAgAVC6gSFkWw/K+c5j0uhcBf7672WCXDgJAAAo3cCQsd3dcp57XGo9pMBnVsqMyvE7EgAASBCUbmCI2I3rpZ1/lrnrAZkphX7HAQAACYTSDQwB5/dvyP7HL2Suu1GBhUv8jgMAABIMpRs4T3ZvjewPn5Jmzpa5dYXfcQAAQAKidAPnwba1ynn2n6ThoxT49IMyoZDfkQAAQAKidAPnyDrdcp5/QjrQrMBnviyTk+t3JAAAkKAo3cA5sv/6krTtbZk7Py0zfZbfcQAAQAKjdAPnwP5xs+yrL8tcc4MCV1/vdxwAAJDgKN3AWbLv1spZ98/S9Fkyf/FJv+MAAIAkwFVfwFmwuyvlPP8NKSvbnePOyPA7EgAASAKUbmAQbEuT7M//RfZ3v5FGjVbg/odkxoT9jgUAAJIEpRsYgO04Kvurn8n+6meSY2U+/AmZD98qM2y439EAAEASoXQD/bCOI/u712V//i/SgWaZ4qtklt8tMy7f72gAACAJeVa6t27dqnXr1slxHC1dulTLli3rs7+pqUnPPPOM2tra5DiO7rzzTs2bN0+S9POf/1yvvfaaAoGA7r33Xl1++eVexUYaspXb5Gz4vrSnSpo6U4FP/b3MjNl+xwIAAEnMk9LtOI7Wrl2rVatWKRwOa+XKlSouLtakSZN6j9m4caMWLlyo66+/XnV1dXrsscc0b9481dXVafPmzfrmN7+p/fv365FHHtG3v/1tBQLceAVDy+57T87G9dIfN0tjwjL3/a3M/Gtl+GcNAACcJ09Kd1VVlfLz8zV+/HhJUklJibZs2dKndBtj1N7eLklqb29Xbq77dL8tW7aopKREGRkZuuCCC5Sfn6+qqioVFRV5ER1pwB5pl33lJ7L/+X+lQFDmY3fKXH+zTFaW39EAAECK8KR0t7S0KBw+fqeHcDisysrKPsfceuutevTRR7Vp0yZ1dHTo4Ycf7v3amTNn9h43duxYtbS0nPIzysrKVFZWJklas2aN8vLy4vGroB+hUCgp32/b3aUjZf+uwz96XvbQAWUv+YhG/uWnFAyP8ztaQkrWdcbgscbpgXVOD6xz4vGkdFtrT9lmjOnzury8XIsXL9ZNN92kaDSqp556Sk8++WS/X9uf0tJSlZaW9r5uamo6v9AYtLy8vKR7v+22t+X85AfSu3ukmbMV+PxXFZsyQ/utpCT7XbySjOuMs8MapwfWOT2wzt6ZMGHCoI7zpHSHw2E1Nzf3vm5ubu4dHznmtdde00MPPSRJKioqUiwWU2tr6ylf29LSorFjx3oRGynINtTJ+ekPpD//Qcobr8CnvyzNW3jKXwIBAACGkidXiBUWFqqhoUGNjY3q6urS5s2bVVxc3OeYvLw8RSIRSVJdXZ1isZhycnJUXFyszZs3KxaLqbGxUQ0NDZoxY4YXsZFC7OFDcv7P83L+8XNS1TaZW+5R4OvPylxZQuEGAABx58mZ7mAwqBUrVmj16tVyHEdLlixRQUGBNmzYoMLCQhUXF+vuu+/Wc889p1deeUWSdP
/998sYo4KCAi1cuFBf/OIXFQgEdN9993HnEgya7YrJvv5L2X/bIB1pl7nmevdCyZwxfkcDAABpxNjBDk0nmfr6er8jpI1EnBuz1kp/+m85P10nNdZLs69Q4LYVMhOn+B0taSXiOmNoscbpgXVOD6yzdxJqphvwkt1bI+cna6Ud70j5kxT4/FeluVcyRgIAAHxD6UbKsAf3y/7iJdk3/0MaPlLmjk/KXPMhmRD/mAMAAH/RRpD0bKxT9j9+IfvLl6WuTpmlH5O58XaZESP9jgYAACCJ0o0kZq2V/cObshtfkJobpcv/hwK33CszfnCzVQAAAF6hdCMp2ZqonA3fl3btkCZNVeCLj8hcfJnfsQAAAPpF6UZSsS37ZH/2Q9nfvyHljJG5+7Myi5bKBIJ+RwMAADgtSjeSgj16RPZXP5P99c8lx8p8+BaZj9wikz3c72gAAABnROlGQrOOI/vb38j+/F+kgy0yH7haZvndMnnj/Y4GAAAwaJRuJCwbjcjZsFaq3SVNK1Lg0w/KzLjY71gAAABnjdKNhGMbG+RsXC+99VspN0/mvi/KzL9GJhDwOxoAAMA5oXQjYdj2NtlXfiL72r9JgaDMx++U+eDNMllZfkcDAAA4L5Ru+M52d8v+169kf/Ejqa1VZuF1Mjf/T5kxYb+jAQAADAlKN3xlI2/J+ekPpPpaqWiOArf9lcyUQr9jAQAADClKN3xhG/bK+ckPpMgfpXH5Cnzmy9IVC2WM8TsaAADAkKN0w1O29ZDsv/1I9o1NUla2zC33ylx3o0xGht/RAAAA4obSDU/Yrpjsa6/I/vsG6egRmWtvkPnYnTKjRvsdDQAAIO4o3Ygra6209fdyXl4nNTZIc65Q4Nb7ZCZO9jsaAACAZyjdiBtbWy3nJ2ulnX+WLixQ4Av/n8zcK/2OBQAA4DlKN4acPdAi+68vym7+T2nESJk7Py1zzQ0ywaDf0QAAAHxB6caQsZ0dsv/xC9lXX5a6umQ++HGZj94mM3yk39EAAAB8RenGebPWyvn9G7I/+6HUsk+6YoECt9wjc8EEv6MBAAAkBEo3zoutqdT+b6yUjVZIk6crsOJvZGZd4ncsAACAhELpxjmzHR1ynlwlDR8uc8/nZRYukQkwtw0AAHAySjfOXfUOqeOIcv7+EbVOKfI7DQAAQMIK+B0AyctGI5IJKOPiy/yOAgAAkNAo3ThnNhpx57iHj/A7CgAAQEKjdOOc2FinVB2VKZrjdxQAAICER+nGuamOSl0xmaK5ficBAABIeJRunBN3nttIMznTDQAAcCaUbpwTW1khTZwqM4KnTQIAAJwJpRtnzXbFpF3bZWYxWgIAADAYlG6cvd1VUmcnF1ECAAAMEqUbZ81GI+4nMznTDQAAMBiUbpw1G41IEybLjMrxOwoAAEBSoHTjrNjubqlqB7cKBAAAOAuUbpyd2l1SxxGJ0g0AADBolG6clWPz3FxECQAAMHiUbpwVuzMi5U+UGZ3rdxQAAICkQenGoFmnW6raxjw3AADAWaJ0Y/D27paOtDPPDQAAcJYo3Ri04/PclG4AAICzQenGoNloRBqXL5Mb9jsKAABAUqF0Y1Cs40iVzHMDAACcC0o3Bqd+j9TWyjw3AADAOaB0Y1DszgpJkplF6QYAADhblG4Mio1GpLHjZMIX+B0FAAAg6VC6cUbWWqmygnluAAAFqOWQAAAgAElEQVSAczRg6W5qatJvfvObfve9/vrram5ujksoJJiGvVLrQYlHvwMAAJyTAUv3yy+/rFgs1u++WCyml19+OS6hkFh678/NPDcAAMA5GbB0RyIRXX311f3uu/rqq/XOO+/EJRQSTLRCGjNWGneh30kAAACS0oCl+9ChQ8rKyup3X2ZmplpbW+MSConDWisbjcgUzZUxxu84AAAASWnA0p2bm6vdu3f3u2/37t0aM2ZMPDIhkbxfLx3cz/25AQAAzsOApXvRokV6/vnn1dLS0md7S0uLvv/975929ASpw1
b23J+b0g0AAHDOQgPtXL58uWpqavSFL3xBM2bM0JgxY3TgwAFVVVXpkksu0fLly73KCb9EI1LOGCl/ot9JAAAAktaApTsUCunBBx/UO++8o0gkotbWVs2cOVPLly/XJZdc4lVG+KR3nnvmHOa5AQAAzsOApfuYSy+9VJdeeul5/aCtW7dq3bp1chxHS5cu1bJly/rsX79+vSoq3FGGzs5OHTx4UOvXr5ckvfjii3r77bclSZ/4xCdUUlJyXlkwSE3vSy1N0oc+4XcSAACApDZg6f7qV796yhnOYDCovLw8XXXVVYMu4o7jaO3atVq1apXC4bBWrlyp4uJiTZo0qfeYe+65p/fzV199VTU1NZKkt956SzU1NXr88ccVi8X0ta99TZdffrmGDx8+2N8R58hGmecGAAAYCgOW7uuuu+6Ubd3d3WpsbNTTTz+tO++8U4sXLz7jD6mqqlJ+fr7Gjx8vSSopKdGWLVv6lO4TlZeX67bbbpMk1dXVafbs2QoGgwoGg5oyZYq2bt3K2W4vRCPSyFHShQV+JwEAAEhqA5bugQr1/Pnz9eyzzw6qdLe0tCgcDve+DofDqqys7PfYffv2qbGxUXPnumdXp0yZopdfflk33nijOjo6VFFR0W9ZLysrU1lZmSRpzZo1ysvLO2MuDKxp13aF5szTmAsuGPC4UCjE+50GWOfUxxqnB9Y5PbDOiWdQM939KSwsVHNz86COtdaesu10F+aVl5drwYIFCgTcuxledtll2rVrl1atWqWcnBwVFRUpGAye8nWlpaUqLS3tfd3U1DSobOifbdkn5/16OYs/csb3Mi8vj/c7DbDOqY81Tg+sc3pgnb0zYcKEQR13zqW7vr5eo0ePHtSx4XC4T0Fvbm5Wbm5uv8du3rxZ9913X59ty5cv77094be//W3l5+efY2oMlo1GJDHPDQAAMBQGLN2RSOSUbV1dXdq3b59++ctf9jvz3Z/CwkI1NDSosbFRY8eO1ebNm/X5z3/+lOPq6+vV1tamoqKi3m2O46itrU2jRo3Snj17VFtbq8suu2xQPxfnIVohDR8hTZridxIAAICkN2Dp/s53vnPKtmN3L/noRz+qpUuXDuqHBINBrVixQqtXr5bjOFqyZIkKCgq0YcMGFRYWqri4WJL05ptvqqSkpM/oSVdXl7761a9KkoYPH67Pfe5z/Y6XYGjZnRFp5hyZAO81AADA+TK2v4HrFFBfX+93hKRlD7TI+ft7ZG69V4Hrbz7j8cyNpQfWOfWxxumBdU4PrLN3BjvTHTiXb3748GFt2rRJK1euPJcvR4JjnhsAAGBoDfpCyu7ubr311lt644039Pbbb2vs2LH64Ac/GM9s8Es0ImUPkwqm+50EAAAgJZyxdFdXV+v1119XeXm5HMfR/PnzlZGRoUcffXTQdy9BcrHRCmnGbBlm5wEAAIbEgKX77/7u7/T+++/riiuu0Cc/+UldeeWVCoVCevvtt73KB4/ZQwekhr0yC5f4HQUAACBlDDjT3dHRoUAgoMzMTGVlZXHXkHRQWSGJeW4AAIChNOCZ7qefflrbtm3TG2+8oW9961vKzMzUwoULFYvFTvtESSQ3uzMiZWZJU2b4HQUAACBlnPHuJbNnz9ZnPvMZfe9739Ndd92l+vp6HTlyRF/72tf0q1/9youM8JCNRqTCi2RC5/ywUgAAAJxk0M0qMzNT11xzja655hq1tLTojTfe0KZNm3TDDTfEMx88ZA8fkt7dI1N8ld9RAAAAUso5nc4cO3asbr75Zt1885kfnIIkUrlNEvPcAAAAQ+2cHo6D1GSjESkjU5pW5HcUAACAlELpRi8brZCmz5LJyPA7CgAAQEqhdEOSZNvbpL01MkVz/I4CAACQcgYs3YcPH9bWrVv73bd161YdPnw4LqHgg6ptknWY5wYAAIiDAUv3xo0bVV1d3e++mpoa/exnP4tLKHjPRiNSKCRNn+V3FAAAgJQzYOl+6623VFpa2u++0tJS/e
EPf4hLKHjPRiukqUUymVl+RwEAAEg5A5buAwcOKCcnp999I0eO1MGDB+MSCt6yR9ulPVWMlgAAAMTJgKV7xIgRqq+v73dfQ0ODhg8fHpdQ8FjVDslxZGZxESUAAEA8DFi658+fr3Xr1qmzs7PP9s7OTr3wwgtasGBBXMPBGzYakYJBqfBiv6MAAACkpAGfSHn77bfr61//uj772c/q8ssv15gxY3TgwAH96U9/Ujgc1m233eZVTsSRjUakKTNksrL9jgIAAJCSBizdw4YN0yOPPKI33nhDf/7zn1VdXa2RI0fq9ttv1zXXXKNQ6JyeIo8EYjs6pN1VMh/8uN9RAAAAUtYZW3MoFNLSpUu1dOlSL/LAa9U7pO4uLqIEAACIozOW7sbGRv30pz/VO++8o9bWVo0aNUqXXHKJbrnlFuXn53uREXFkoxHJBKQZzHMDAADEy4AXUtbV1enBBx/UoUOHdMcdd+gf/uEfdMcdd6i1tVUrV65UXV2dVzkRJzYakSZPlxnGnWgAAADiZcAz3T/60Y90ww036C/+4i/6bF+8eLF+/OMf68UXX9SXv/zluAZE/NhYp1Qdlbnuo35HAQAASGkDnunevn27brrppn733XjjjdqxY0dcQsEj1VGpK8Y8NwAAQJwNWLodx1EwGOx3XygUkuM4cQkFb7jz3EaaMdvvKAAAACltwNJdWFio119/vd99r7/+ugoLC+ORCR6x0Yg0carMiJF+RwEAAEhpZ3w4zurVq1VfX68FCxb0Phznt7/9rd544w195Stf8SonhpjtiknVO2SuvsHvKAAAAClvwNI9a9YsrVq1Si+99JJ+/etfy1orY4yKior00EMPadasWV7lxFDbXSl1dsoUzfE7CQAAQMo74326i4qK9I//+I/q7OzU4cOHNWLECGVlZXmRDXFkd0bcT2ZyESUAAEC8DTjTfaLMzEyNHTu2t3Dv2bNH3/zmN+MWDPFloxXShMkyo3L8jgIAAJDyBjzT3dHRoZ///OfavXu3LrzwQt16661qbW3VD3/4Q73zzju69tprvcqJIWS7uqRd22UWXud3FAAAgLQwYOleu3atampqdNlll2nr1q2qra1VfX29rr32Wn3qU59STg5nSZNS7S6p46jE/bkBAAA8MWDp/tOf/qTHH39co0eP1oc//GHdf//9+trXvqaLL77Yq3yIA1tZIUlcRAkAAOCRAWe6jx49qtGjR0uSwuGwsrOzKdwpwO6MSPkTZUbn+h0FAAAgLQx4pru7u1uRSKTPtpNfz53LiEIysU63VLVN5gNX+x0FAAAgbQxYukePHq3vfOc7va9HjhzZ57UxRk8//XT80mHo7d0tHWlnnhsAAMBDA5buZ555xqsc8IiNuv+lwlC6AQAAPDPo+3QjNdhoRBqXL5Mb9jsKAABA2qB0pxHrOFLlNs5yAwAAeIzSnU7q90htrcxzAwAAeIzSnUbszp77c8+idAMAAHiJ0p1GbDQihS+QCV/gdxQAAIC0QulOE9ZaqbKCp1ACAAD4gNKdLhr2Sq0HmecGAADwAaU7TXB/bgAAAP9QutNFtEIaE5bG5fudBAAAIO1QutOAtVY2GpEpmiNjjN9xAAAA0g6lOx28Xy8d3M88NwAAgE8o3WmAeW4AAAB/UbrTQTQi5YyR8if6nQQAACAtUbpTnDvPXSEzk3luAAAAv1C6U13T+9L+JolHvwMAAPiG0p3imOcGAADwH6U71e2MSCNHSRcW+J0EAAAgbVG6U5yNRqSZc2QCLDUAAIBfQl79oK1bt2rdunVyHEdLly7VsmXL+uxfv369KioqJEmdnZ06ePCg1q9fL0l68cUX9dZbb8laq0suuUT33nsvFwUOgm3eJzU3ypR+zO8oAAAAac2T0u04jtauXatVq1YpHA5r5cqVKi4u1qRJk3qPueeee3o/f/XVV1VTUyNJ2rlzp3bu3KknnnhCkvTwww9r27ZtmjNnjhfRk5qtZJ4bAAAgEXgyc1BVVa
X8/HyNHz9eoVBIJSUl2rJly2mPLy8v11VXXSVJMsaos7NTXV1disVi6u7u1ujRo72InfyiFdLwEdKkKX4nAQAASGuenOluaWlROBzufR0Oh1VZWdnvsfv27VNjY6PmznXPzhYVFWnOnDn65Cc/KWutPvShD/U5Q35MWVmZysrKJElr1qxRXl5eHH6T5NJUtV3BOVco94Lxcf05oVCI9zsNsM6pjzVOD6xzemCdE48npdtae8q2081kl5eXa8GCBQr0XPj33nvv6d1339V3v/tdSdIjjzyibdu2afbs2X2+rrS0VKWlpb2vm5qahip+UrIHWuQ07JVzVWnc34u8vLy0f7/TAeuc+ljj9MA6pwfW2TsTJkwY1HGejJeEw2E1Nzf3vm5ublZubm6/x27evFmLFi3qff3f//3fmjlzprKzs5Wdna0rrrjitGfJcRz35wYAAEgcnpTuwsJCNTQ0qLGxUV1dXdq8ebOKi4tPOa6+vl5tbW0qKirq3ZaXl6ft27eru7tbXV1d2rZtmyZOnOhF7OQWjUjZw6SC6X4nAQAASHuejJcEg0GtWLFCq1evluM4WrJkiQoKCrRhwwYVFhb2FvA333xTJSUlfUZPFixYoEgkoi996UuSpMsvv7zfwo6+bLRCmjFbJhj0OwoAAEDaM7a/gesUUF9f73cE39hDB+T83d0yy/+XAh/+RNx/HnNj6YF1Tn2scXpgndMD6+ydhJrphscq3YcMmSLuZQ4AAJAIKN0pyO6MSFnZ0pQZfkcBAACAKN0pyUYjUuFFMiFPRvYBAABwBpTuFGMPH5Le3cOtAgEAABIIpTvVVG6TJJmZzHMDAAAkCkp3irHRiJSRKU0rOvPBAAAA8ASlO8XYaESaPksmI8PvKAAAAOhB6U4htv2wtLeGWwUCAAAkGEp3KqncLlnLRZQAAAAJhtKdQmw0IoVC0vRZfkcBAADACSjdKcRGI9LUIpnMLL+jAAAA4ASU7hRhj7ZLtbsYLQEAAEhAlO5UUbVdchyZWVxECQAAkGgo3SnCRiukYFAqvNjvKAAAADgJpTtF2GhEmjJDJivb7ygAAAA4CaU7BdiODml3FfPcAAAACYrSnQqqd0jdXZRuAACABEXpTgE2GpFMQJrBPDcAAEAionSnABuNSJOnywwb7ncUAAAA9IPSneRsrFOqjsrMYrQEAAAgUVG6k111VOqKMc8NAACQwCjdSc6d5zbSzNl+RwEAAMBpULqTnI1GpElTZYaP9DsKAAAAToPSncRsV0yq3sFoCQAAQIKjdCez3ZVSZyelGwAAIMFRupOY3RlxP5k5x98gAAAAGBClO4nZaIU0YbLMqBy/owAAAGAAlO4kZbu6pF3bGS0BAABIApTuZFW7S+o4KlG6AQAAEh6lO0nZqDvPbYqY5wYAAEh0lO4kZaMVUv5EmdG5fkcBAADAGVC6k5B1uqWqbcxzAwAAJAlKdzLaWyMdaWeeGwAAIElQupPQsftzc6YbAAAgOVC6k5CNRqRx+TK5Yb+jAAAAYBAo3UnGOo5UyTw3AABAMqF0J5v6PVL7Yea5AQAAkgilO8nYnRWSJDOL0g0AAJAsKN1JxkYjUvgCmfAFfkcBAADAIFG6k4i1Vqqs4CmUAAAASYbSnUwa9kqtB5nnBgAASDKU7iRio9yfGwAAIBlRupNJtEIaE5bG5fudBAAAAGeB0p0krLWy0YhM0VwZY/yOAwAAgLNA6U4W79dLB/dLs7iIEgAAINlQupME89wAAADJi9KdLKIRKWeMNH6i30kAAABwlijdScCd565gnhsAACBJUbqTQdP70v4miYfiAAAAJCVKdxJgnhsAACC5UbqTwc6INHKUdGGB30kAAABwDijdScBGI9LMOTIBlgsAACAZ0eISnG3eJzU3MloCAACQxCjdCY55bgAAgORH6U500Yg0fIQ0aYrfSQAAAHCOQl79oK1bt2rdunVyHEdLly7VsmXL+uxfv369KioqJEmdnZ06ePCg1q
9fr0gkohdeeKH3uPr6en3hC1/Q/PnzvYruq+Pz3EG/owAAAOAceVK6HcfR2rVrtWrVKoXDYa1cuVLFxcWaNGlS7zH33HNP7+evvvqqampqJElz587VN77xDUnS4cOH9bnPfU6XXXaZF7F9Zw80S40NMtd+yO8oAAAAOA+ejJdUVVUpPz9f48ePVygUUklJibZs2XLa48vLy3XVVVedsv13v/udrrjiCmVlZcUzbsKwO5nnBgAASAWenOluaWlROBzufR0Oh1VZWdnvsfv27VNjY6Pmzj21aJaXl+vGG2/s9+vKyspUVlYmSVqzZo3y8vKGILm/Du2t1tFhw5V3xQdkgp5NAp21UCiUEu83BsY6pz7WOD2wzumBdU48njQ5a+0p24wx/R5bXl6uBQsWKHDSPan379+v2tra046WlJaWqrS0tPd1U1PTeSRODN3v/EEqvFjN+w/4HWVAeXl5KfF+Y2Csc+pjjdMD65weWGfvTJgwYVDHeTJeEg6H1dzc3Pu6ublZubm5/R67efNmLVq06JTtv/3tbzV//nyFQol7xnco2UMHpIa9jJYAAACkAE9Kd2FhoRoaGtTY2Kiuri5t3rxZxcXFpxxXX1+vtrY2FRUVnbKvvLy83zKesirdO7mYojk+BwEAAMD58uS0cTAY1IoVK7R69Wo5jqMlS5aooKBAGzZsUGFhYW8Bf/PNN1VSUnLK6EljY6Oampo0e/ZsL+ImBLszImVlS1Nm+B0FAAAA58nY/gauU0B9fb3fEc5L99c+J43OVfBvv+53lDNibiw9sM6pjzVOD6xzemCdvZNQM904O/bwIendPcxzAwAApAhKdyKq3CaJ+3MDAACkCkp3ArLRiJSRKU2d6XcUAAAADAFKdwKy0Yg0fZZMRobfUQAAADAEKN0JxrYflvbWMFoCAACQQijdiaZyu2StzCxKNwAAQKqgdCcYG41IoZA07dQHBAEAACA5UboTjI1GpKlFMplZfkcBAADAEKF0JxB7tF2q3cU8NwAAQIqhdCeSqu2S48jMmuN3EgAAAAwhSncCsdGIFAxKhRf7HQUAAABDiNKdQGy0QpoyQyYr2+8oAAAAGEKU7gRhO45KuyuZ5wYAAEhBlO5EsWuH1N1N6QYAAEhBlO4EYaMRyQSkGcxzAwAApBpKd4Kw0Yg0ebrMsOF+RwEAAMAQo3QnANvZIdVEefQ7AABAiqJ0J4KaSqmri3luAACAFEXpTgDuPLeRZs72OwoAAADigNKdAGw0Ik2aKjN8pN9RAAAAEAeUbp/ZrphUvYPREgAAgBRG6fbb7kqps5PSDQAAkMIo3T6zOyPuJzPn+BsEAAAAcUPp9pmNVkgTp8iMyvE7CgAAAOKE0u0j29Ul7douU8RZbgAAgFRG6fZT7S6p4yjz3AAAACmO0u0jG+2Z5+ZMNwAAQEqjdPvIRiuk/EkyObl+RwEAAEAcUbp9Yp1uqWob89wAAABpgNLtl7010pF2iXluAACAlEfp9smx+3NzESUAAEDqo3T7xEYj0rh8mdyw31EAAAAQZ5RuH1jHkSq3cZYbAAAgTVC6/fDuHqn9MPPcAAAAaYLS7YNj9+c2syjdAAAA6YDS7QMbjUjhC2TCF/gdBQAAAB6gdHvMWitFK7g/NwAAQBqhdHutfq90+BDz3AAAAGmE0u2x3nluSjcAAEDaoHR7rbJCGhOWxuX7nQQAAAAeoXR7yForG43IFM2VMcbvOAAAAPAIpdtL79dLB/dLs7iIEgAAIJ1Quj3EPDcAAEB6onR7KRqRcsZI4yf6nQQAAAAeonR7xJ3nrmCeGwAAIA1Rur3S9L60v4n7cwMAAKQhSrdHmOcGAABIX5Rur+yMSCNzpAkFficBAACAxyjdHrHRiFQ0h3luAACANETp9oBt3ic1NzJaAgAAkKYo3R7oneeeyUNxAAAA0hGl2wvRiDR8hDRpit9JAAAA4ANKtwdsNCLNnCMTCPodBQAAAD6gdMeZPdAsNTbIFDFaAg
AAkK4o3XFmd3J/bgAAgHRH6Y63aIWUPUwqmO53EgAAAPgk5NUP2rp1q9atWyfHcbR06VItW7asz/7169eroqJCktTZ2amDBw9q/fr1kqSmpiZ997vfVXNzsyRp5cqVuuCCC7yKfl5sNCLNmC0TZJ4bAAAgXXlSuh3H0dq1a7Vq1SqFw2GtXLlSxcXFmjRpUu8x99xzT+/nr776qmpqanpfP/3001q+fLkuvfRSHT16NGkeMGMP7Zfeq5MpWep3FAAAAPjIk/GSqqoq5efna/z48QqFQiopKdGWLVtOe3x5ebmuuuoqSVJdXZ26u7t16aWXSpKys7OVlZXlRezzF3XP3HMRJQAAQHrz5Ex3S0uLwuFw7+twOKzKysp+j923b58aGxs1d6574WF9fb1GjBihJ554Qo2Njbrkkkv0l3/5lwoE+v59oaysTGVlZZKkNWvWKC8vL06/zeAdqt2lo9nDlHflApmQZ5M8nguFQgnxfiO+WOfUxxqnB9Y5PbDOiceTJmitPWXb6UZEysvLtWDBgt5S7TiOtm/frscff1x5eXn61re+pddff13XXXddn68rLS1VaWlp7+umpqYh/A3OTfc7f5Cmz1LzgQN+R4mrvLy8hHi/EV+sc+pjjdMD65weWGfvTJgwYVDHeTJeEg6Hey+ClKTm5mbl5ub2e+zmzZu1aNGi3tdjx47VtGnTNH78eAWDQc2fP1/V1dVxz3y+bOsh6d093CoQAAAA3pTuwsJCNTQ0qLGxUV1dXdq8ebOKi4tPOa6+vl5tbW0qKirq3TZjxgy1tbXp0KFDkqRIJNLnAsyEVbVNEvfnBgAAgEfjJcFgUCtWrNDq1avlOI6WLFmigoICbdiwQYWFhb0F/M0331RJSUmf0ZNAIKC77rpLX//612Wt1fTp0/uMkSQqG41IGZnS1Jl+RwEAAIDPjO1v4DoF1NfX+/rzux/5G2nYCAW/tNrXHF5gbiw9sM6pjzVOD6xzemCdvZNQM93pxrYflvbWMFoCAAAASZTu+KjcLlkrM4vSDQAAAEp3XNhoRAqFpGlFZz4YAAAAKY/SHQc2GpGmFclkJsmTMwEAABBXlO4hZo+2S7W7mOcGAABAL0r3UKvaLjkOpRsAAAC9KN1DzEYjUjAoFV7kdxQAAAAkCEr3ELPRCmnKDJmsbL+jAAAAIEFQuoeQ7Tgq7a5ktAQAAAB9ULqH0q4dUnc3pRsAAAB9ULqHkI1GJBOQZlzsdxQAAAAkEEr3ELLRiDR5usyw4X5HAQAAQAIJ+R0gVVhrpZE5MpOm+h0FAAAACYbSPUSMMQre/5DfMQAAAJCAGC8BAAAA4ozSDQAAAMQZpRsAAACIM0o3AAAAEGeUbgAAACDOKN0AAABAnFG6AQAAgDijdAMAAABxRukGAAAA4ozSDQAAAMQZpRsAAACIM0o3AAAAEGeUbgAAACDOKN0AAABAnFG6AQAAgDijdAMAAABxRukGAAAA4ozSDQAAAMSZsdZav0MAAAAAqYwz3ThvX/7yl/2OAA+wzqmPNU4PrHN6YJ0TD6UbAAAAiDNKNwAAABBnlG6ct9LSUr8jwAOsc+pjjdMD65weWOfEw4WUAAAAQJxxphsAAACIM0o3AAAAEGchvwMgOf3/7d17TNX1H8fx5zngEZTOEThcgmQMyJYuJC6TLiiBdiFN/yAqw0V/NFEotbZw/dHc2CzduDoQh5dGLdL1B4WTPxpei7UIcCBCXkZaw+KqSHI//P5ondXot99PPOf3/Y1ej//4fPl8Pq9zGON9Pud9vvT19VFWVsbNmzcxmUysXr2atLQ0o2OJmzgcDnbu3Imfn59uQzVH/fbbb1RUVPDTTz9hMpnYsmULS5YsMTqWuNDx48c5efIkJpOJxYsXs3XrViwWi9GxxAXKy8tpbm7GZrNRUFAAwPDwMEVFRfT29hIQEMCOHTvw8fExOOk/m4pumRUPDw82bdpEREQEIyMj7Ny5k+joaB544AGjo4kbnDhxgt
DQUEZGRoyOIm5y5MgRYmJieOedd5icnGRsbMzoSOJCAwMD1NXVUVRUhMViobCwkIaGBpKTk42OJi6QnJzMs88+S1lZmXOspqaGRx55hA0bNlBTU0NNTQ2ZmZkGphS1l8is+Pr6EhERAYC3tzehoaEMDAwYnErcob+/n+bmZlJTU42OIm5y584dOjo6SElJAcDT05OFCxcanEpczeFwMD4+ztTUFOPj4/j6+hodSVxk6dKlM06xGxsbWbVqFQCrVq2isbHRiGjyJzrplnvW09NDV1cXUVFRRkcRN/joo4/IzMzUKfcc1tPTg9Vqpby8nGvXrhEREUFWVhZeXl5GRxMX8fPzY926dWzZsgWLxcLy5ctZvny50bHEjW7duuV8YeXr68vQ0JDBiUQn3XJPRkdHKSgoICsriwULFhgdR1ysqakJm83mfFdD5qapqSm6urp4+umn2bt3L/Pnz6empsboWOJCw8PDNDY2UlZWxoEDBxgdHeXs2bNGxxL5R1HRLbM2OTlJQUEBSUlJrFixwug44gY//PAD33//PTk5ORQXF3PhwgVKS0uNjiUu5u/vj7+/Pw8++CAAiYmJdHV1GZxKXKmtrY3AwECsViuenp6sWLGCS5cuGR1L3P2FCJMAAAd4SURBVMhmszE4OAjA4OAgVqvV4ESi9hKZlenpaSoqKggNDWXt2rVGxxE32bhxIxs3bgSgvb2d2tpa3nrrLYNTiastWrQIf39/uru7CQkJoa2tTR+KnmPsdjuXL19mbGwMi8VCW1sbkZGRRscSN4qPj+fMmTNs2LCBM2fOkJCQYHSkfzz9R0qZlc7OTt5//33CwsIwmUwAvPLKK8TGxhqcTNzlj6Jbtwycm3788UcqKiqYnJwkMDCQrVu36vZic8yxY8doaGjAw8OD8PBwsrOzmTdvntGxxAWKi4u5ePEit2/fxmazkZGRQUJCAkVFRfT19WG323n77bf1O20wFd0iIiIiIm6mnm4RERERETdT0S0iIiIi4mYqukVERERE3ExFt4iIiIiIm6noFhERERFxM92nW0TkH2B6epr9+/fT2NhIcHAwH3zwwYzv+eyzz/jqq68wm81UVlYakFJEZO5S0S0i4iI5OTmMj4+zb98+vLy8AKivr+fcuXPs2rXL0GydnZ20trayf/9+Z7Y/6+vro7a2lvLycmw22z3t1d7ezr59+6ioqLindURE5hK1l4iIuNDU1BQnTpwwOsYMvb29BAQE/G3BDb8X3ffdd989F9yuMDU1ZXQEERGX00m3iIgLvfDCC3zxxRc888wzLFy48C/Xenp6yM3Npbq6Gg8PDwB27dpFUlISqampnD59mvr6eiIjIzl9+jQ+Pj68+eab3Lhxg6NHjzIxMUFmZibJycl/u/fAwACVlZV0dnbi4+PD+vXrWb16NSdPnuTQoUNMTk6yadMm1q1bR0ZGhnNea2sre/bscV5PTEwkJyeHS5cuUVVVxc8//0xAQABZWVksW7YMgFOnTvHll1/S39+P1Wpl/fr1rFmzhtHRUXbv3u1cC6CkpITq6mr8/f15+eWXgZmn4Tk5OaxZs4avv/6a7u5uPv74Y27dusXhw4fp6OjAy8uL559/nrS0NACuXLnCwYMHuXHjBhaLhSeffJLXXnvNdT9IEREXU9EtIuJCERERLFu2jNraWmeBeTcuX75MSkoKhw8f5tixYxQXFxMXF0dpaSkXL16koKCAxMTEvz2xLikpYfHixRw4cIDu7m7y8/MJCgoiJSUFs9lMfX09+fn5M+ZFR0fz3nvv/aUIHhgY4MMPPyQ3N5eYmBguXLhAQUEBxcXFWK1WbDYbeXl5BAUF0dHRwe7du4mMjCQiImLGWv+tb775hp07d2K1WjGZTOzZs4eEhAS2b99Of38/+fn5hISEEBMTw5EjR0hLS2PlypWMjo5y/fr1u36uRUT+l9ReIiLiYhkZGdTV1TE0NHTXcwMDA3nqqacwm808/vjj9Pf3k56ezrx581
i+fDmenp788ssvM+b19fXR2dnJq6++isViITw8nNTUVM6ePTurx3D27FkeffRRYmNjMZvNREdHExkZSXNzMwCxsbEEBwdjMplYunQp0dHRdHZ2zmqvPzz33HPY7XYsFgtXr15laGiI9PR0PD09CQoKIjU1lYaGBgDn8zA0NISXlxdLliy5p71FRNxNJ90iIi4WFhZGXFwcNTU1hIaG3tXcP/dUWywWABYtWvSXsdHR0RnzBgcH8fHxwdvb2zlmt9u5evXq3cYHfi/iv/32W5qampxjU1NTzvaSlpYWPv/8c7q7u5menmZsbIywsLBZ7fXnvH/o7e1lcHCQrKws55jD4eDhhx8GIDs7m6NHj7Jjxw4CAwNJT08nLi7unvYXEXEnFd0iIm6QkZFBXl4ea9eudY790RIyNjbGggULALh586ZL9vP19WV4eJiRkRFn4d3X14efn9+s1vP39ycpKYns7OwZ1yYmJigoKCA3N5f4+Hg8PT3Zu3ev87rJZJoxZ/78+YyNjTm//k+P2263ExgYSGlp6d9ev//++9m+fTsOh4PvvvuOwsJCDh069G8/KCoiYjS1l4iIuEFwcDCPPfYYdXV1zjGr1Yqfnx/nzp3D4XBw8uRJfv31V5fsZ7fbeeihh/j0008ZHx/n2rVrnDp1iqSkpFmtl5SURFNTE+fPn8fhcDA+Pk57ezv9/f1MTk4yMTGB1WrFw8ODlpYWWltbnXNtNhu3b9/mzp07zrHw8HBaWloYHh7m5s2b//EOL1FRUXh7e1NTU8P4+DgOh4Pr169z5coV4Pf2l6GhIcxms/MFjNmsP2ki8v9LJ90iIm6Snp7OuXPn/jK2efNmDh48SHV1NSkpKS7tRd62bRuVlZVs3rwZHx8fXnzxRaKjo2e1lt1u59133+WTTz6hpKQEs9lMVFQUb7zxBt7e3rz++usUFRUxMTFBXFwc8fHxzrmhoaE88cQT5Obm4nA4KCwsZOXKlbS1tZGTk0NAQADJyckcP3783+5vNpvJy8ujqqqKnJwcJicnCQkJ4aWXXgLg/PnzVFVVMTY2RkBAANu2bXO244iI/D8yTU9PTxsdQkRERERkLtN7cSIiIiIibqaiW0RERETEzVR0i4iIiIi4mYpuERERERE3U9EtIiIiIuJmKrpFRERERNxMRbeIiIiIiJup6BYRERERcbN/AYNv1FkTEtRvAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(num_of_features, rfe.grid_scores_ )\n", "plt.xlabel('Num of features')\n", "plt.ylabel('ROC AUC')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PCA" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Попробуем получить PCA разными способами" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PCA через sklearn" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.decomposition import PCA" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pca = PCA(n_components=5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PCA через ковариационную матрицу" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy.linalg import eig" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Оценка качества при разных количествах компонент\n", "\n", "Реализуйте 2 пайплайнк:\n", " * StandartScaler + LogisticRegression\n", " * StandartScaler + PCA + LogisticRegression\n", "\n", "Оцените качество пайплайна с PCA при разных количествах компонент и сравните его с первым" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.model_selection import StratifiedKFold, cross_val_score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Singular Value Decomposition\n", "\n", "Для любой матрицы $A$ 
размера $n \\times m$ и ранга $r$ можно найти разложение вида:\n", "$$ A = U \\Sigma V^\\top ,$$\n", "где \n", "* $U$ - унитарная матрица, состоящая из собственных векторов $AA^\\top$\n", "* $V$ - унитарная матрица, состоящая из собственных векторов $A^\\top A$\n", "* $\\Sigma$ - диагональная матрица с сингулярными числами $s_i = \\sqrt{\\lambda_i}$\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SVD via PCA\n", "Матрицы $U$ и $V$ ортогональны и могут быть использованы для перехода к ортогональному базису:\n", "$$ AV = U\\Sigma $$\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy.linalg import svd" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Рационы питания в странах" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Загрузите набор данных о пищевом рационе в разных странах мира `diet.csv`\n", "* Примените на данных PCA с 2 компонентами\n", "* Изобразите объекты в сжатом пространстве" ] }, { "cell_type": "code", "execution_count": 162, "metadata": { "collapsed": true }, "outputs": [], "source": [ "df = pd.read_csv('data/diet.csv', sep=';')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Скорее всего вы обнаружите некоторые выбросы, с этим ничего не поделать - PCA чувствителен к выбросам\n", "* Удалите объекты-выборосы и повторите процедуру" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# PCA для текстов\n", "#### Он же Latent Semantic Index или Latent Semantic Allocation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Когда мы работаем с текстами, бы обычно переходим к модели мешка слов\n", "* Что если мы ходим применить для текстов PCA?\n", "* Центрировать данные мы не можем, так как тогда наша огромная матрица с признаками будет плотной, а значит будет очень много весить\n", "* Давайте не будем центрировать! =D" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from sklearn.decomposition import TruncatedSVD" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# T-SNE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Задание\n", "Сожмите признаковое пространство данных с цифрами с помощью tsne." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import load_digits" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Your Code Here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "142px", "width": "252px" }, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": "block", "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }