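{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "***\n", "***\n", "# Computational Communication and Machine Learning\n", "\n", "***\n", "***\n", "\n", "王成军\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "![](./img/machine.jpg)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 1. Supervised learning\n", "\n", "How it works:\n", "- The algorithm has a target variable (also called the outcome or dependent variable).\n", "- This variable is predicted from a given set of predictors (independent variables).\n", "- Using these variables, we build a function that maps inputs to the desired outputs.\n", "- Training continues until the model reaches the desired accuracy on the training data.\n", "- Examples of supervised learning: regression, decision trees, random forests, k-nearest neighbors, logistic regression." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 2. Unsupervised learning\n", "\n", "How it works:\n", "- There is no target or outcome variable to predict or estimate.\n", "- The algorithm is used to cluster a population into groups.\n", "- It is widely used for customer segmentation, splitting customers into groups that receive different interventions.\n", "- Examples of unsupervised learning: association rule algorithms and K-means." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 3. Reinforcement learning\n", "\n", "How it works:\n", "- The algorithm trains a machine to make decisions.\n", "- The machine is placed in an environment where it can train itself through repeated trial and error.\n", "- The machine learns from past experience and tries to use the best available knowledge to make accurate business decisions.\n", "- An example of reinforcement learning is the Markov decision process; AlphaGo is a well-known application.\n", "\n", "> Chess. 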
Here, the agent decides upon a series of moves depending on the state of the board (the environment), and the\n", "reward can be defined as win or lose at the end of the game:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Linear regression\n", "- Logistic regression\n", "- Decision trees\n", "- SVM\n", "- Naive Bayes\n", "---\n", "- K-nearest neighbors\n", "- K-means\n", "- Random forests\n", "- Dimensionality reduction\n", "- Gradient Boost and Adaboost\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> # Linear regression with sklearn\n", "***\n", "\n", "王成军\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Linear regression\n", "- Typically used to estimate the value of a continuous variable (house prices, number of calls, total sales, etc.).\n", "- It models the relationship between the independent variable X and the dependent variable Y by fitting a best-fit line.\n", "- This line is called the regression line and is written as the linear equation $Y = \\beta X + C$.\n", "- The coefficient $\\beta$ and the intercept $C$ can be estimated by ordinary least squares." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:10:39.010055Z", "start_time": "2018-04-29T07:10:39.002664Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "%matplotlib inline\n", "import sklearn\n", "from sklearn import datasets\n", "from sklearn import linear_model\n", "import matplotlib.pyplot as plt\n", "from sklearn.metrics import classification_report\n", "from sklearn.preprocessing import scale" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:11:24.244682Z", "start_time": "2018-04-29T07:11:24.234905Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Load the Boston housing data: 13 features, target is the median house value\n", "# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an earlier release\n", "boston = datasets.load_boston()\n", "y = boston.target\n", "X = boston.data" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": 
"2018-04-29T07:11:34.160791Z", "start_time": "2018-04-29T07:11:34.154953Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'__class__ __contains__ __delattr__ __delitem__ __dict__ __dir__ __doc__ __eq__ __format__ __ge__ __getattr__ __getattribute__ __getitem__ __gt__ __hash__ __init__ __iter__ __le__ __len__ __lt__ __module__ __ne__ __new__ __reduce__ __reduce_ex__ __repr__ __setattr__ __setitem__ __setstate__ __sizeof__ __str__ __subclasshook__ __weakref__ clear copy fromkeys get items keys pop popitem setdefault update values'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "' '.join(dir(boston))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:11:45.142201Z", "start_time": "2018-04-29T07:11:45.137656Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD',\n", " 'TAX', 'PTRATIO', 'B', 'LSTAT'], \n", " dtype='|t| [95.0% Conf. 
Int.]\n", "-----------------------------------------------------------------------------------\n", "Intercept 36.4911 5.104 7.149 0.000 26.462 46.520\n", "boston.data[0] -0.1072 0.033 -3.276 0.001 -0.171 -0.043\n", "boston.data[1] 0.0464 0.014 3.380 0.001 0.019 0.073\n", "boston.data[2] 0.0209 0.061 0.339 0.735 -0.100 0.142\n", "boston.data[3] 2.6886 0.862 3.120 0.002 0.996 4.381\n", "boston.data[4] -17.7958 3.821 -4.658 0.000 -25.302 -10.289\n", "boston.data[5] 3.8048 0.418 9.102 0.000 2.983 4.626\n", "boston.data[6] 0.0008 0.013 0.057 0.955 -0.025 0.027\n", "boston.data[7] -1.4758 0.199 -7.398 0.000 -1.868 -1.084\n", "boston.data[8] 0.3057 0.066 4.608 0.000 0.175 0.436\n", "boston.data[9] -0.0123 0.004 -3.278 0.001 -0.020 -0.005\n", "boston.data[10] -0.9535 0.131 -7.287 0.000 -1.211 -0.696\n", "boston.data[11] 0.0094 0.003 3.500 0.001 0.004 0.015\n", "boston.data[12] -0.5255 0.051 -10.366 0.000 -0.625 -0.426\n", "==============================================================================\n", "Omnibus: 178.029 Durbin-Watson: 1.078\n", "Prob(Omnibus): 0.000 Jarque-Bera (JB): 782.015\n", "Skew: 1.521 Prob(JB): 1.54e-170\n", "Kurtosis: 8.276 Cond. No. 1.51e+04\n", "==============================================================================\n", "\n", "Warnings:\n", "[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n", "[2] The condition number is large, 1.51e+04. 
This might indicate that there are\n", "strong multicollinearity or other numerical problems.\n" ] } ], "source": [ "import numpy as np\n", "import statsmodels.api as sm\n", "import statsmodels.formula.api as smf\n", "\n", "# Fit an OLS regression of the target on all 13 features\n", "results = smf.ols('boston.target ~ boston.data', data=boston).fit()\n", "\n", "print(results.summary())" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:13:21.823618Z", "start_time": "2018-04-29T07:13:21.812795Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Fit the same linear model with scikit-learn\n", "regr = linear_model.LinearRegression()\n", "lm = regr.fit(boston.data, y)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:13:29.286705Z", "start_time": "2018-04-29T07:13:29.280511Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "(36.491103280363603,\n", " array([ -1.07170557e-01, 4.63952195e-02, 2.08602395e-02,\n", " 2.68856140e+00, -1.77957587e+01, 3.80475246e+00,\n", " 7.51061703e-04, -1.47575880e+00, 3.05655038e-01,\n", " -1.23293463e-02, -9.53463555e-01, 9.39251272e-03,\n", " -5.25466633e-01]),\n", " 0.74060774286494269)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Intercept, coefficients, and R^2 on the data used for fitting\n", "lm.intercept_, lm.coef_, lm.score(boston.data, y)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:14:24.251725Z", "start_time": "2018-04-29T07:14:24.248401Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# In-sample predictions\n", "predicted = regr.predict(boston.data)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:14:33.380349Z", "start_time": "2018-04-29T07:14:32.952670Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYwAAAEXCAYAAAC+mHPKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJztnXl8FdXZ+L9PkgtZBAJCESOghCpIVQK4FLeqIFqXoraKVWvtj6qtWmstGpUKilYsWn211VfU1r4uKIqlWt4WF0R5sbIZ0LIJVEHDjoQ1QJJ7fn/MneQuM/fO3NwtyfP9fO4n3DNnZs4MyXnOeVYxxqAoiqIoicjL9gAURVGUloEKDEVRFMUTKjAURVEUT6jAUBRFUTyhAkNRFEXxhAoMRVEUxRMqMBRFURRPqMBQFEVRPKECQ1EURfFEQbYHkEq6du1qDj/88GwPQ1EUpUWxaNGircaYbon6tSqBcfjhh7Nw4cJsD0NRFKVFISJrvfRTlZSiKIriCRUYiqIoiidUYCiKoiieUIGhKIqieEIFhqIoiuKJnBEYIvKFiKwOfeaE2m4WkXUislJEzs32GBVFUdoyOSMwAIwxfUOfU0WkHLgBGABcBDwrIoHsjlBRFCV32LlzJ2+//XbG7pdTAiOKi4CpxphdxphlwBfA4OwOSVEUJfvs37+fxx57jPLyci688EK++uqrjNw3lwRGrYisEZGPRGQE0BMIDyb5CugRfZKIXCsiC0Vk4ZYtWzI1VkVRlIwTDAZ58cUX6d+/PzfffDNbt25l37593HPPPRm5f84IDGNMf2NMOTAGeBFoBwTDugSBBofzJhtjhhhjhnTrljCyXVEUpUUSDAY59dRTufLKK/n8888jjv3pT39i+fLlaR9DzggMG2PMHCz10wagLOzQYcCX2RiToihKtsnLy+O0006LaS8pKeHuu+/msMMOS/8Y0n4HD4hIiYj0CP27Akv19C4wSkSKRaQ/0AVYnMVhKoqiZJXbb7+dzp07A1BQUMCNN97ImjVrGDduHB06dEj7/XMl+WAx8L6I5AM7gCuNMXNF5AVgKbAPGG2MMdkcpKIoSrrZuHEjmzdv5thjj405VlpaytixY1mwYAETJkygb9++GR2btKY5eMiQIUaz1SqK0hLZuXMnDz30EL///e8pLy+nqqqKvLxYJZAxBhFJ6b1FZJExZkiifjmhklIURWmrhLvITpgwgT179vDJJ5/w0ksvOfZPtbDwgwoMRVGULBAMBnnppZciXGTDGTt2LPv378/S6JzJFRuGoihKm8AYw1tvvUVlZSWLFzv78XTr1o0xY8Y4qqSyiQoMRVGUDLFgwQIqKyuZNWuW4/GDDjqIX//61/zqV7/KiNeTX1RgKIqipJlVq1YxduxYpk6d6ni8oKCA66+/nrFjx9K9e/cMj847KjAURVHSyOOPP86vfvUr6uvrHY+PGjWK++67j/Ly8gyPzD8qMBRFUdLIkCFDHIXFsGHDmDhxIoMHt5ycqrllUVEURWllfPvb3+aiiy5q/D5o0CDeeust3n777RYlLEAFhqIoSrMJBoPMnz/f9fj999/PkUceyZQpU1iwYAHDhw/P4OhShwoMRVGUJDHGMHPmTAYPHszQoUP57LPPHPv179+f5cuXM2rUqJxzlfVDyx25oihKFlmwYAHDhg3jnHPOYfHixTQ0NHDXXXe59m/JgsKm5T+BoihKBlm1ahWXXnopJ5xwQkw8xWuvvca8efMyNpbpVdWcPHEWR1TO4OSJs5heVZ3W+6mXlKIoigc2btzIvffey9NPPx3XRTZTcRTTq6q54/VPqa2z6spV19Ryx+ufAjCyoizeqUmjOwxFUZQ47Ny5k7vvvpvy8nKefPJJR2ExfPhwFi5cyJQpUzj88MMzMq5JM1c2Cgub2roGJs1cmbZ76g5DURTFgf379/PUU08xYcKEmMSANoMGDWLixIlZ8XpaX1Prqz0VqMBQFEW
JYtmyZZx//vkxtbNt+vTpw/3338+ll16aNWP2oaVFVDsIh0NLi9J2T1VJKYqiRNGnTx9H1VO3bt14/PHHc8JFdsyIo8jPi6yNkZ8njBlxVNruqQJDURQlisLCQu69997G7yUlJYwbN441a9Zw44030q5duyyOzmLh2q9pCEZWTG0IGhau/Tpt91SBoShKm2X16tWuRYquuuoqKioquPHGG1mzZg3jx4/PqZTjU+Z96as9FajAUBSlzbFx40ZuuOEG+vfvz5NPPunYJz8/n3nz5vH444/nZMrxBmN8tacCFRiKorQZbBfZvn378sQTT1BfX899993Hjh07HPsHAoEMj9DCS0Bevkttb7f2VKACQ1GUVs/+/ft57LHHKC8vZ8KECezZs6fx2LZt23jooYeyOLpI7IC86ppaDE0BedFC4/ITezqe79aeClRgKIrSagkGg7z00kv079+fm2++2TGeok+fPgwcODALo3PGa0DekN5dYibwvFB7ulCBoShKqyM8i+wVV1zhGE/RrVs3/vCHP7B8+XIuueSSLIzSGa8BeZNmriQY1ScYak8XKjAURWlVRGeRjeaggw5i/PjxrFmzhhtuuCEnXGTDcQu8yxOJUEtlI9JbBYaiKK2CYDDID3/4Q8cssgAFBQXceOONrF69mnHjxuWUi2w4Y0YcRVEgP6a9wZgIW4abYNFIb0VRlATk5eVRUlLieGzUqFGsWLEiIy6yzU05PrKijAcuPsbR2ynclnH4wc6Cwa09FeSMwBCRdiKyTESeCX2/WUTWichKETk32+NTFCX3GT9+PIWFhY3fhw0b1phFtry8PO339+rhlIiRFWUEXeIpbJXTv/7jHNHt1p4KckZgAHcCXwCISDlwAzAAuAh4VkSy4xCtKEpOsX//ftavX+94rKysjJtvvplBgwbx1ltv8fbbbzN48OCMjS2VKccTqZyCLvF5bu2pICcEhoj0B44HpoaaLgKmGmN2GWOWYQmSzP2vK4qScwSDQV588UX69+/PFVdcgXFZgY8fP54FCxa0+JTjTraMokB+WpMLJiLrAkNEBHgMuDmsuSewNuz7V0CPTI5LUZTcINxF9sorr+Tzzz9n9uzZzJw507F/YWFhs7LINscGkUpDtG3LKCstQoCy0iIeuPiYtFXT80Iu1MO4HphtjFktIqeE2tpBhItxEGiIORMQkWuBawF69eqVznEqipJhFixYQGVlpaPXU2VlJWeffXZKU4w3t+zpmBFHRZwPzdsVjKwoy6qAiCYXBMZVQAcR+QHQBSjB2nGEv6XDAMcUjMaYycBkgCFDhqRRe6coSqZYtWoVd911F6+++qrj8YKCAk455RRqa2tdPaOSIZ4Nwm3inl5VzaSZK1lfU8uhpUVcMriM91Zsafw+ZsRRaZn0y1wKKJWl0a026wLDGDPU/reI/Bg4Bfg78LyIPAT0xhIksRE4iqK0KjZu3Mi9997L008/7VjACCwX2QkTJtC3b9+U39+vDcJpRzJtUXVGVEep3s14IesCwwljzCIReQFYCuwDRhs3C5eiKC2enTt38tBDD/Hwww+zd+9exz5nnXUWDz74YFq9nryUPQ3fUeSJxKQTT7QjSRX29cN3N+nazdhIa5qHhwwZYhYuXJjtYSiK4oOXX36Zm266yTExIEBFRQUPPvhgUl5P0eqiRBNq9I4BrFW7vWNwOu6EAJ9PPM/3eLOFiCwyxgxJ1C8ndxiKorReoifx7xTuc80ie//993PppZcmZdhOxoCdaNXuZONwIp3pObKJ7jCUVoHflaSSHZxW6IUFeeTNGM/yxfMBK4vs3XffzbXXXtusxIAnT5zlahSeW3mm5/GG/145XS+a8B1JS0F3GEqbobmukEpm2LRpk+MKfV99kA4nXclBq5dx6623cuutt6YkMaCbobq6ppYjKmdELCycFhxAzO+VAE5L7HwRgsakZLHiZ/GT6YWSCgylxZOMK6SSOVatWsXYsWOZMWMGnX/8JPkHxRb
42d2pD1999RWdOnVK2X3j7QjC8zwtXPs10xZVxyw42hfkxfxeGYgRGqncUfhZ/EyvqmbMa0uoazCNfce8tsSxb6rIeqS3ojSXbNQFUCJxio7euHEjP//5zzn66KOZOnUqe/bsoW7ha47nH1palFJhAe5pwsOprWtgyrwvHRccNbV1jucYSFv0tZ9cVPe8ubRRWNjUNRjueXNpSsbihO4wlBaPF1dIJX1Er4q/3LSN0b8Yw+6F09m/L/L/ZcvCGRw+6AIaOjZl+klX7EC0AdvNWhvtFpsIPzYQv/hZ/Gzf6yzQ3NpTgQoMpcWTjQCmlkQ8PXcqdOD2qtjU17Fr8T/Y8eHLBGt3OvY97thjueKUw3h9XbuM6N3DU2u4GcHzHWIpADoXB9hXF8zo71WuL35UYCgtnmwEMLUU4unEIdaom4yzQPX2Pexe9j475rxA/Y5Njn2iXWRvjTPedP0/ui0sLhlcFmHDsNvHXTAAyOzvlZ/FT2lRwFFtVlqUvkoQKjCUVkGuJWnLFeLpxPfsr29W3qQenQoZ3nEjW58fz54Nqx375xV3otupP2TivWP4wQlHxB1rur3d4i0shvTu4ioYUmXM9iJ4/Cx+zj+uBy98tM6xPV1oHIaitGKOqJzhqruPxxcOUcrTq6oZ8+oS6oKGupqNfP3Px9i39hPH8yVQSMcTLqLj8ReR177Yk94/FXETqSRVu51E0ePJksr3pXEYitKKSHbyctOJu8UTAI61pAHGv7GUulA5t7zCgziw6T+xnfLy6TDwXDoNvYz8ks6NzV481nLJ2y2Vu510uX1n432pwFCUHKc5k5eTTjyQJ40TvxMNxnDyxFmNenMnL6P8woPoeNKl1Mz+U2PbqFGjWFn2Xb4uiI2z8GK09Zv4L502hVRO8uma2LNhII8rMETkNWAR8DHwsTFmS9pGoiiKI4l88+NNoE468b0H6hO6XtpBYMF9e6kvaI9IbMhWx8Hns2vRGwS6HMZ99/+WW68411X9MmbEUQkn+0QG30xG9Kdykk/XxH5Gv26ONowz+nVr1nXjkWiHcXHoYwBEpJqQ8CAkSIwxG9I2OkVR4qa48DKBRjsEHFE5I+E9TX0d2xZaLrKdz/opBw04I6aPFLSjx9WPkF/SmWdWCOVV1a5GW0jskZVM4r90RfSncpJPl9v3eyuc1+9u7akgkcDoDQwKfQaHfl4IXGB3EJFNxAoRx+p4iqL4x23yyheJO4G6rejjpswwQfYse5+aOS/QEHKRrZnzAiVHnYIUxLpr2naK8Ps6eaydPHGW41hvnbqEW15ZHDE+t8k/kzr7VE7y6XL7zjkbRmji/xL4m90mIpOB0cByYBvQB/gucG5Yn63GmO7pGLCitDXcJi+3NNvra2rjqm+crocx1H7+Mdvff466zZ9HXK9hxyZ2Lf5fOg75Xtxxxpuo3I7ZAXNe1EuZ1NmnepJPh9t3ztkwohGRMcAVwGnGmP8Lax8K3AOchSVgWo+vrqJ4JJliPc3xzZ80c6XjhFFaHODWqUtcK8HZLpf29TrsWsuuOX9h86fzHccpgULw4H6fJxKTBdbGS2rwROolP6v+VBjHcz22JxsZDnzFYYjIF8A7xpjRLsdvB+4GTjbGZLwGt8ZhKNnCr699Knzzna6Rnyc0xPGAgqYYi1WrVnHXXXfx6quvOvbLLyjg7Iuv4N/dh5FX3Nmxjxu2225ZaLJeuPZrXvxoXcKVZKJKdV4EQbriHnKRVHmNpSsO4xtYaihHjDEPisjFwJ3ApT6vrSgtFjeD7K1TY9NNT6+qjrsD8FP74IGLj2lsK26Xz54D8avB5YuwceNGrvnF7cyc9iIm6Nx/1KhRTJgwgb59+0YE7NnkAZ2KA9TsrXOsa21/a0y5bbypHRKpU7ys+jXdffrwKzBWAcMS9JkN/Cip0ShKCyWejj5cN2+vft0ypEZfxxYS0cV7bJ3/Axcfw9zKM5leVc0tr8Tf1BsTZNuclzj80e/HZJG1GTZsGBM
nTmTw4MGNbYn0+Ym8rqJTcNs41ZVIhTollwIA00k2Cof5FRh/Ah4RkTuNMb916XMIEBu5oyitmHg6+tq6Bn75ymImzVzJ3gOx+Zuir2MTPSFET7vhsRi3Tl2SWN0jeeTXrHMUFu26l9NzxP/j7b/c4XhuvJW919Kl0dgqq1QH4eV6xtdUkY2dlN8CSn8A3gEmiMgsETkl/KCInAeMwjJ8K0qbwUuxnuqa2rgBcxLqYxcgcpoQnK4Zb8cSTlEgn/H33gdhQXgFpYfQ9YIxHHL1I9T3OJbpVdUJrxONl2d3ws559PnE85hbeWZKs9JGj6c1prvPObfaaIwxDSGh8ATw/4D3RWQbsA7LvlGG9Xv/X6keqKLkMvZk52Sb8Eq4uumXCdRLNk6xGMYYTP0B8gLtI9oLA3n07HMk3xg8gq3LPqTT0FF0GHgOkt8UX5HM6jRcZeVU9zqQL2CIsIGkcwJvK+nus7GTSjpbrYicANwEDMcSFg1YNo4HjDHPp2yEPlAvKSXbOHnopItAvsTYB/Zv+Iya958jr7Aj3UZWxpxTFMjn3G+W8FrVRvLaF8ccT+Sl5AUn4zy0/gk806TSGyzt2WqNMfOBq0I3aw80GGPqk72eorQGolfb6aJdvnAgTFjUfV1NzZwX2LtiTmPb/g2f0b7HkRHn1dY18Lfluyg56CBq64Ix1/W6Oo3nzulm71ABkVqysZNKaochIv2AE7EWJGuNMe+lemDJoDsMJVWkwr89E7uNht3bqflwCruXzIQoF9n2vY6l+6j7EYd05W5qIi+r02RiTnR3kdukZYchVsrKZ2lymxUgaF9HRMS0popMSqslUZ3rVLgrJtLtN4fg/r3snP86Oxf8FVO337GPSB7mQC3ioHqyVVl2PeuyOBN59Lty8vRy887Jhuunkj78qqRuB64GPgT+ApyNlc3WZqiIvAhckyu7DkWJZuz0TyOijqMnsVS6K4arZ8JjKpLF1Nexa7GVRTZYu9OxT7vu5ZSe/mOKjqhIeL0GYxoN0F4i0uON3ck7R4PoWhd+3WqvAVYCpxtjngb+HX7QGDMXqMdnlLeI5InI2yLymYisFJERofabRWRdqO3cRNdRlERMr6p2TFFhx0q4lb0Ea7JMxu3UZmRFGXMrz8S5nl18jAmye+l7rH/mera/O9lRWIS7yNrCorQo4FpBzyY8niMaL669Nk72j1wPopteVc3JE2dxROWMRndmxR2/O4zewB+NMfF+gxYBQ31e1wA/MsZsEJFzgPtFZDVwAzAA6Am8IyK9jTHxK78oigNeV/eJVEepUKf4DXTbt+4Ttr/7DAc2O5REBfKKO9Fp6Cg6VZyDyWtykS0K5DP+wgGN44438Td3Yndzk83lIDpVl/nH7w5jJ9A+QZ9qoIefixoLuxBTb2AJcBEw1RizyxizDPgCqyaHovjCnhi8TtIGXHcB8VbjXvEb6GZ2bnYUFhIopNPJl1N27dN0HHwBj1x+PGWlRQhWUJxthB5ZUcYDFx8T9x5uE3i8iT1fJOZe0eRyEF2iSoZKLH53GAuAYSKSZ4yJ9cmzCAKd/A5ERG7DspFsAUYAvyJS5fUVDoJIRK4FrgXo1auX39sqrZjm2AziGaebq06JNobnOyTvs8kXoceQs9k+73XqtobKcebl02HgOXQaOqqxgFFZaVHc9B22bcbpXQi4TuBjRhzlGkQYNCZhzEYuB9HlurosF/ErMJ4FXgXuBca69DmWOBlt3TDG/A74XSjb7UzgPSzhYxPECg6MPm8yMBkst1q/91VaD+HePKXFAXbvq49wG00V0avuZNxGoyf36VXVVL5axb5g097GdlW95ZXFlJ5+NVumTaC4/2mUnnolgc6HRvTzsmJ3qp8gwBUn9YoraO55c6ljShOvaqVcrSuRy+qyXMVvapBpIvIKcIeI9Af2hR8XkYuwstm+luyAjDGvi8hjwAasVCM2h6E5qhQXovXR8XI2NYfo1XiyevBwIfONwiC917/L9hm
vc/hP/8jmfXkRgmfSzJV8VX4CPX7yR9p16x1xnbLSIs7o141JM1fGlDqNxmm1f0a/bry3Yotr4SOAcRcMyHihnkyQjQJELZ1kIr2vwNpB/MxuEJH3gK7A0UAd8KCfC4pIH2CvMWajiHwbSxDNAJ4XkYew7BpdgIwXZVJaBn68eZqDIVIQ+HUbnV5Vzfg3llJTW9foIrvuw5eZH/J6OnLZDD6f+t8R5zRObGHCwt59AL4EVrSbr5dzc1mt1Bxa63OlE98CI2S7uFFEnqcpl9TpocOfAGOMMYt8XrYU+KeI5AObgcuMMYtE5AVgKZYAGa1BgYobmdI7R7uout3XSdVhT9B7D9Sxd/kH1HzwPPU7NkX0mTv9OX7558t49JozGtviTWwnT5yVdJyDH2GXq2ql5tJanytd+I30/iHwvjGm2hgzD5gXam+PlWZkX9wLuGCM+Rg40qH9t4Bb3Q1FaSTZmgx+iTZOx7vv9KrqiMnod/9cwdcr57P9/b9Q5+IiK4FCpr49L0JggPvE1hzDbaqNvmOnf8qUeV/SYAz5Ilx+Yk/uGxnfO0tpWfh1q30BK615BMaY/ckKC0VJBWNGHGXlRwojPy+ZELn4lEUZROPpu8e/sbTx3wsXLqTqqV+x+dVxjsIi3EW2XS/vk2wy7rCpODeasdM/5YWP1jUK1AZjeOGjdYyd/qnvaym5i1+BkRAR+b6ITEj1dRUlEdGpvhvS4CG1fc/+iGjgeOqMmto6DrvuaboeewbHH388+9Z+EtspL58Og86j7LqnKT3lCvLaFyeMzA6nOXEOqYyRmDLP2R/FrV1pmSRUSYnIKGAhsMbjNQcAdwK/aca4FCWCRK6rd7zuMBmngb11QU9eUA17d1Dzfy+xe8k/Y7LI2ji5yAJcfmJPz+NpjuE2lUZftziSZItJKbmJFxvGS1jOIbtDP4eLyHrgY+BTh1QdJYBz+kxFSQIv3jxOtR3SRbRhuHNxIMaN19TXsefTtx2FRae+gyg++Ucc0e8YDj+4iI/+s71Zev/mGG5TZfR1Cz70s1tSch8vAuPXQAVWWo5+wMk05YqqE5FlWMKjCit1yOXA2tQPVWmr5GLG03DD8LgLBjDmtSURKrGCjl3pMPhCds5rCklq172cvz//JMOHD8/oWDPB5Sf25IWP1jm2K62HhALDGPN7+98iEgSeA/4FDAp9jgEGRp12Y+qGqLR1vHjz5AmkwWThim0YDgaDfG+gpVKyVTt5odV2x5O+z+7F/yCvqAOlp17FN086m+HDh2VukBnE3hWpl1Trxm8cxu+AhcaYxmVTKHbiaCyhcTCwyBgzx+V8RfGNlxQO3+7Thblrvs7IeIoC+fz67COZOXMmlZWVTJw4kZEjRsQGxHEQ3S//LYGuvSguLOS2c/tnZHzZ4r6Rx6iAyDCZrmaYVInWXEVLtLZcElXAG/Pqkoi8UIE8YdIPjgNSXz87X6DB5c+irLSIkYfV8s8//55Zs2YBMHDgQH7zzBs8/PaqmJQbzf1D1vKmiht+S+XGw2uJVl8CQ0QuxLJf/NEYk3P+ciowmk82JqhEv/jTq6pjbASBfOGy43sybVF1RlKCANR9Xc0pO95l6tSpMcd6jLyNdked5jj+cKLfbzzB4lYTvLQowPgLB6jgaOO4FfsqKy1ibuWZvq6VlprewM+x7Bb3utxUa3q3YDJRUMZJICUyak+auTImxqKuwTTqy9NNw+7t1Hz4MruX/JOpLi6ye9avjhAYTkZ5p/cbbij2UioWrPgOLfSjZCM9u9/AvWOBt4wxe12O9xWR6lDshtLCSHdBmfBCRoamCdJNnWT/4rv9AaRbWAT376VmzotUT/4pu6tmOLrIDhs2jB5XP0rnM34Scyx63F4SJIa/73hqNi30o5QWB3y1pwK/O4wuxHGZNcasEpGvgB8DLzdjXEoWSPeKxU0gufnw20btTOWJsrGzyO748GXH2tkAJYf25a/PPcHw4cNdVQOHlhZF7Ki8irf1NbW
eUmpooZ+2jdt6KZ3rKL87jC0kLr+6GMvVVmlhpDK3kBN+dgrhKSqcUlgkCgdLJlzMGMOeZbNZ/8z1bH93sqOwKCg9hK4XjKHrlb9vjKdwS7FxRr9uETsqrxxaWuQppYYW+mnb7Kh1rvni1p4K/AqMD4HzRCTeb2oNVm0MpYWR7vrLXie40qJAjMG4fUHTr2rn4kDCCTjZRdauj/83JuU4QF5xJzoPu45DRz9JydGnU9a5BGiyydg7JWiqcf3eii2+DfL2+06kbtNCP0q6F3hO+BUYTwLdgOdExE2dNQDY1axRKVlhZEUZD1x8DGWlRQhNE1+qDKtOAsmJkvbWr9bJE2dxeOUMfvnKYmrCVk376oKUtEt8Hb+ICKXf+XFkW1gW2Y6DL0DyA42TdbhNBqydkn1sZEVZXJWR/X6vPKmX4/uOl1Ij1f8vSsvEKUNzIF/SupDwW6J1dqh86i+A3iJSaYyZbR8XkR8A5wL/m9JRKhkjnQVl7Ou61Yi2qa6pjYm7CCcVbrTGGMRhUi487GiKvnkStWsW0GHgOXQaOor8ks6Nx/NFGifrRMWL3GwvXtwe3VJtXHlSr5QEx2l8Rysh+k8kzU6DvtObG2N+CUwEhgDvishWEVkkIl9iGbqDoeOKEsPIijKK2yVep7gJi+bSsHs72956ku3vTnbt0/nM0Rw6+km6DP9ZhLAoCuTz8KXHNU6siZwEmqPiu2/kMVx5Uq/GnUa+SEqFhZO3WnjadiX3mTRzZczfSV3QpNV7Lpma3hhj7hSRqcCtwNlYyQkB/g1UGmPmpmh8SiskG949wf172Tn/r+xc8FdM3b5QHYrzCXSJXVUHSg+JaStzWIUnSlnS3PTh6Uq1kYvJHBX/ZCMOI67AEJHLsHJDrY4+ZoxZDFwV6tesEq1K6yFc1dGpKIAI1Oyti5gsM+km6+oiG2ygZs4LdPve7Z6us2FHLQvXfh0xoY4ZcVRMJLYAZ/Tr1vg9F2tGZ2OiUVKPlxxrqSbRDmMKEBSRo40xn4nIRKw05h8bY1bZnYwxWv8iC+SKHtoeR3VNLUKTGjXcUB0exew00aYaY4LsXf4BNR887+j1BHBg42qC+/eS17444fWChkabgr3qH1ntgH5iAAAgAElEQVRRxsK1X/PiR+san9kA0xZVM6R3l5wTFDbZmGiU1OP0d5Ru77lEAuPXWHUw7Mju2wjNByKyGyvmwq6F8TGwzBiTuUo2bZhMpPFIdH8nIRHP8mCrPWyDb6qTBoJlzN73RRXbZz/nWDsbLBfZTkNH0WHgOUi+v6jYKfO+jFATvbdiS8wz19Y1MP6NpTkrMLIx0SipJ5UVE70SV2CE18IIcSZNdTAGYRVTOpWmeWKfiHyKpca6IcVjVcLIph46Wlj5MU/bag9bVeOUiTZZ9m9YRc37z7Fv7RLH4xIopOMJF9Hx+IscdxWBPKHemLiRstHxEW5qnJraOqZXVeek0MjGRKOkh0yrPH271QKz7e8iUgwcR6QQqQCOB1RgpJFs6KHDdxXJYqfLSORa64e67eup+eB59q5wKcOSl0+HgefSaehlEV5P4dhGbSCuuiw6PiKePSaXjci5aFtRcp+kvKRsQkkI/xX6ACAi7YBvNXNcSgIyrYd2S7XtBztdRnSq8uay97N/uQqL4v6nUXrqlQQ6H+p6fnhcxPSqagoDea7PGV1ydMyIo/jlK4sd+6oRWWltxI3DEJHXROQOERkhIt3i9bUxxhwwxnycmuEpbqQ7jUc0XjKt2thr8NKiAJ2LAxFRzO+t2JJSYQHQYdD55HeIzEZT2Hsgh1z9KN0uvC2usICmid0Wik47H7c4iJEVZXR2yQ6qRmSltZFoh3Fx6GMbuquxjNsfA4uwvKU2pHWEiiOZ1kMnWi3bhm+neIVwbnFZjTeHvEB7Sk/5Idv+8RjtupdTevqPKTqiIvGJIToVBVwzzkLiyOxxFwxQI7LSJkgkMHrTZJsYHPp5IXCB3UFENhErRHKuGl9rJJE
eOpVut4liJwxWUkC3e9hjSWZvYbnIzmH3J2/xjR+Md/RsKvnWWeQVdqDomyciErtxdkuhHsgT9hyoj3ABjiaRzUaNyEpbwXdNbxGZDIwGlgPbgD7AoUQ6y2w1xnT3cc1C4DHgdKAQeNQY84iI3IwVTV4L/NIY849419ESrU042RwEGFrehS+21Sac2MIN3G6TrRP5ecLDYbW219fUUlocYPe+et+eULaLbM37f+HApjUAdB52HR0HX5DgzCZKiwLsrw86qtMEKG6Xz54D8VVt+SKseeC7vsauKC2JtJRoFZExwBXAacaY/wtrHwrcA5wFfIn/FFglwEzgOuBgYKmIfIzlaTUA6Am8IyK9jTHpS/beinCyORhg7pqvG7+7xW5ECxs/le0agoZbXllMQb402iqS8YZyc5HdMXcKB33rLE/BdgAi7skKDSQUFtC8yn7hu7zS4gDGWPUKdBeitET8ekndAEwJFxYAxpgPgeEicjtwN1Z8hmeMMduAaaGvW0OJDE8DphpjdgHLROQLLLXYRz7H3Cbx6qHjFLvhx8DthIGkDduJXGSD+/ew78t/U9z3hITXKmmXT00KXHfLkjBeT6+qZvwbSyNUXeGCM9OBloqSCvwKjG9gqaEcMcY8KCIXA3cClyYzIBH5FpZaqitWMkObr3Co9ici1wLXAvTq1SuZW7YawlezeT7USNU1tZw8cVajmiqT5VBtGvZsp2buy+xe8k/H2tngzUU2HC+7h3gqKxu/xmuvLsia8E9pafgVGKuAYQn6zAZ+lMxgRKQr8DxwDfATrFTpNkEg5i/QGDMZmAyWDSOZ+7YGmqNGEpoMu16ERXgqkOYSk0XWgcLex1H6nWtof0jfFN3VQoDxFw4A4NapSxzfWWlRwPeE7meHprEaSkvCr8D4E/CIiNxpjPmtS59DgC5+ByIinYE3gTuNMQtE5Fwg/C/1MCz7iOJAc9RIfib/QJ6AJK9yarxnQx27Fv/TyiK7d4djHy8usmWlRREGfK+R6AJccVKvCGHg5BprCxQ/+BECGquhtCT8Cow/AOcBE0RkGHB3lPH7PGAUPid2EekIvAHcH+YJNQN4XkQewnLv7YKV7LBN4dU1Nt4kFT6pntGvG39fsiGuG6kbgbzUFTaq276B7e8+DQ65KgtKD6H01Kso7n+qo4usTefiQEx8hJc4D6dYkVS6xnpV62mshtLS8JtLqiEkFJ4A/h/wvohsA9Zh2TfKsBZv/+VzHL/AivF4VEQeDbWdDbwALAX2AaONXx/gFo6XjLSJ4htKi2In1fdWbPEtMNrlCwdSGKHdrmsvSr51Fns+fbuxzW8W2R0OCf4STdbxgvBSlV/JLX17Sbt8Avl56iWltFh8x2E0nihyAnATMBxLWDRg2TgeMMY8n7IR+iDTcRjprkfhFn1sT3pejKt2XET4uA6vnJGyMTaH+p1bWP/0dSB5cbPIxqNzcYCqu89u/B7vnQTyhUnfPy4jk3Su1CpRFC+kJQ4jHGPMfCIr7jUYY+qTvV5LIxP1KBJlpPVit2gIGu55c2nEjiRdRBvDbRfZjidcTPse34zpX9CxG10vvI32hx7lmkU2EdExHvZzOmbDzeD+VLPBKq2RuMkH3RCRfiJytYj8WETOMMbsb0vCAuLXo0gVbgZRu92rcdWeOG0hly7s+bhhz3a2vfUk65/5GXtXzKHm/T/jtpMt/uZJSQsLN0ZWlFHcLnYtVBc03Dp1CUdUzuDkibPSKjwVpTXiN9I7D3iWJrdZwXJ3LQgdl7ZiZ8hEPYpEldH8xEzYKpJ0lkV1c5Hdt/YT9n1RRdERg1J+z9IiZ1uH2/+D7TqrgXOK4h+/O4zbgaux6l9cjxWdHV5RZqiIfCEiZ6RofDlLotV/MkyvqubkibMaV8AAD1x8DGWlRREpwu0JzinFuRt3vP5p2gLyTEMdOxe9SfXkn7LjwymO8RR7lr7X7PvkRdYuIpAnrm6vXv4fUr0jVJTWjl8bxjXASuD0kMf
UoVjpzwEwxswVkXqsKO/mzxA5TKrrIrvZRB64+Ji4Xj0QmeRvx946nIqqp2JnURTIbxRY06uqqZy2hG2fzKbmg/+hfscmx3PCXWSbS/jetXNxgHEXDHDdHbh5KkWTjah2RWmp+BUYvYE/GmPi/RUuAoYmP6SWgVe/fa/eMsnW6I42rk6vqnatANdcbGFhjKF4y1Lqpt3O1hX/duzr10XWC+G6zn11TmKxiej/Hzc9aXTJVUVR3PErMHYC7RP0qQZavUoKvNWj8OpJ5bbSdWt3E0QjK8qaXXfbibLSIkZWlLFw4UIqKyt59913HftJoDBpF1k/+BWmbq7EzclEqyhtDb82jAXAMIkXfmsZwTslP6TWgx9PKreVrlO7LYiqQytnWxDZXj9+bBteqa6ppU/lDL578ShnYZGXT4dB51F23dOUnnJFWoWFjR8HA7eMs8lkolWUtopfgfEs8E3g3jh9jiVORtu2hB9PKreVrlN7IkE0sqKMSwb78/wR4OTyLq71qSGUCfL4y2Pai/udyqGjn6TL8J+l3EU2HuGG7WiHgWiX2UzXQFeU1ogvgWGMmQa8AtwhItOAI8OPi8hFWNls/8/h9DaHH08qPytgL4LovRVbvAyxEQMs27ArYcqQor4n0r6sP2BlkT3k6kfp9r3bPaccj0fn4gCPXjYQL1aF8Mk+0Y4LLCEaz+NMUZTEJBPpfQXWDuJndoOIvIdVv+JooA54MCWja+H48aTy09ct/iJcECUTD7J9b11jFlmCDXQ8fmRMHxGh81nXEty3O24W2WQoblfgyQYTnTzQq8OARl8rSvPwtcMQkf6AMcbciOUJNQXYilWLewDwKXC+MWZRqgfaUikMNL3i0qKA66rWzwp4zIijCORHrsMD+RIhXPzGgxgTZM+y2ax/+nq2v/MUNXNeoGHPdse+7Xt805ewuPKkXp5sBbaQc1IfSeg6X0w8j7mVZ0a8l0wEUSqK4n+HsRQrg+yPjDHzgHnQmEtKjDHOFXDaIE5J8PbXB2P6RHs62UkFJ81cyS2vLGbSzJWc0a8b763YEpGiPMZPNOr7mBFHMea1JY51K8JrWhhj2PdFFdtnP0fd5v80Xa5uHzs+fJkuw38Wc74fOhcHGNK7CzM+2ZCwb6dQ1LbfVONedlzZQpMQKq0JX9lqQ6nM/9sYc1f6hpQ8mc5WG49kMs0WBfK5ZHAZ0xZVJxVoF566e3pVNWNeXeJav6K0KEDt+s9YN/MZ9q1d4tgnr30JZT/7c7M9nooC+Z6eJ5AvXHZ8zwjh6GWCdXuX2bZR5Oq4FCWadGWrnQP0S25IbYtkMs3W1jUwZd6XSccGhN9z0syVrsKibvt6Vv3tefaumON8obx8Ogw8h05DRyUUFnaG2nyXGuL5Io7CIk8genh1DYYXP1rXuFnymu8plcWPUkmywZiKkqv4FRj3A3NE5HhjzIJ0DKi1kEhNkig5XrL3tHF03d29nZoPX2b3Esuo7URxv1MpPe0qT15P4cbnsdM/5YWP1sXe0+V53Ar3RTd7nWBz0aCtthWlteFXYHwfmAW8IyK/MMb8JQ1janE46amTzTTrtlJPRLRHVaeiQKOLrFsW2XAKex9H6Xeuof0hfRuvd8ngMv76cTV7DkQKFye1ipsbb7LPE05LnWBz2baiKMngN3BvDDAC6AD8SUQ2iMgzIvJTEakQkaQLMrVU3GIAAC4ZXNYYqZ0vwiWDm1bBbp5Ol5/YM6ko7UG9OjFp5kqOqJzBwHveYkdYPMXOedNcs8i2617ONy6dQPdR9zcKC7DyRg3p3SVmJyCh54pezcfbMXl9Hrf4i5Y6wWqwoNLa8CswzsQSGi9hZa3tBvwE+G9gIbBLROaLyBMpHWUO46anvufNpUxbVN24um4whmmLqiMjkB08nYb07hIhaLwyd83XjUKrprYu4tIdjh9JXvuSiP4FnbrT9YIxHHL1IzEusnbeKKdnMzjvJtwmdds9OJFbbVlpEVec1KtVTbAaLKi
0NnztCIwxs4HZ9ncRKQaOAwaFfY4DBgM/T9Ugcxm3lXVMeVAi9fFORum6oGH8G0vZXx+MUONElz71S35RBzp++wfUzH6OvKKOdDr5ctcssuETtB8dfDwVnG1fOKJyhuNzCDR6dw3p3SXnjNfNIRdtK4qSLM1SIRlj9mIVU/qX3SYi7YBvNXNcLQY/Ve+gabJ1m4ydUnMYrHiGfXVBV/fU/RtWsfvTd+gy/DqcckN2GHQBGEOHivNcPZ+iI6j96OC9eCp5uZ5OsIqSu3gSGCIyHrgOOBhYC/wFeNAYEzO7GWMOAB+ncIwZxW+gldvKun1BnuPkb0+OfgVNzd46HrlsYEzajLrt66n5oMlFtvCwoyk5+vSY8/MC7el00g8cr33lSb24b+Qxjd/td+A2vsMPLuLkibMcU6sn865aqspJUdoaCW0YIvIT4G6gO5aAKQfuAV5L79Ayj5ckdtG46anHXzggrj7ezSDqli02L2TTmFt5Jo9eNpDA/p1se+tJ1j/zs4h4ipoP/gfTED+BYDiBfGFI7y6O78CND8PsJdU1tdzyymLGTv804b1Up68oLZuEkd4iMh8rZfk1wPtYgXsPYtkrRhljXk33IL3S3EjvRNHZfkm0W3E6DriWFi0K5PObsw9n5TtT+N2kh9hXu9fxvl2/V0lJv1MavwfyIF6BuvDnc3sHiRDgkcsG6uSvKC2QVEZ6lwOvGWOmhL6vF5HhwGrgR0DOCIzmkupAq0QqmnjHb526JMLwbRrq2LToTa5+5BXq99Q4ntOuezmlp/84wuspkCdcdkJPXpn/pWvkd/jzJfusBlpcBLPmeVIUf3gRGJ2xhEMjxpgaEZmBVfui1eBmVygtDjjq7NPFyIoybgnV5TYmyN7lH1DzwfPU79jk2L+g9BBKT72K4v6nxhi864KG91ZsYdIPjosRQjbhRudEtpV4Hlu5HmAXLiBKiwPs3lffKES9piFRlLaM1zgMJ4XGOiwjeKvBya4QyBd276v3ZddIBT06FVL7+cdseO6XbH3zIUdhUVBSSudh13Ho6CcpOfp0R+8osCbykRVlPHzpcQnjHNxSi0NTrERLDLCLtk9t31sXs+NyK5+rKIpFc9xq6wH3ep5JICJFQE9jzGepvK5XnFxD9+yvj/F2ykQCuUv7BLnlzrsdj5WUlPDrX/+ao4Zdzr0zP0+YCdaeyL24vnpN5BeeJBBy39vJKQjRiVzfJSlKNvFi9A4CDcAKrPoX80OfS4A7jTH+81jE3qMj8D9YkeRTjTGjQ+03A7cCtcAvjTH/iHeddKQ3jxds9vnE8+Ke60VHHq/PCaefzYIP3m7sm19QwPXXXcdvfvMbunfvHnN+tJoF0pdOu6Xp/93+H6NJ1sFBUVoyqTR6vwtUYFXUG4DlLRV+o4eAxUAVsNwYE8cfx5Ug8Djwd+Ck0HXLgRtC9+yJlfCwt1PsRzpJNoFcdC0EJx15oj5/+uPvOe644wgGg1x22WXcd9999O3bN+I+0YbzTE3kXgLsckmoeIl7yfVdkqJkG88FlESkDzAk7FMBdAodti+yH/g3UGWMuc73YER+DJxijBktIr8GSo0xY0PHPgR+ZYz5yO38dOww4hXBAXfVjRcX3RPHv8nKt6dQ8q0zItKJh/d55JFHOPXUUxkyJKHwzylyrXiQ03gC+UJJuwJ21NZlXaApSjZJeQElY8x/gP8AU8NuciSRQmRg6OdgrMjw5tATS/jYfAX0iO4kItcC1wL06tWrmbeMxU2nD8TdHcRz0T1w4ABPPfUUCyf9huDeHdRtr6bbhbdF9LG55ZZbUv5M4aRrF5BrxYNytciSorQkmptL6jPgM6zstYiIAP2xBEZzaUekd5ZtS4kew2RgMlg7jBTcNwYn9cvJE2fFnRCdVCDGBKldPodefX7OpuqmYkN7l3/A/hMubkwvnkpvo3gCwYvaLFlysXiQ5qlSlObhN715XIzFMmPM8ym43AYg/K/7MODLFFw3JSSaEMPdU40
xjS6ym9+cFCEsbHZ+ZMU/etWjT6+q5uSJsziicgYnT5zl6OabKNWJ2y5g/BtLE94/EW5CL5ddbxVFiU9KBUaKmQGMEpFiEekPdMEyrqcULxOvE52KnD2Kw11YH7j4GOo3rmbzK2PZPPVu6jb/J6Z/YVExh531I7qee7Pn3Epec17FUwtB/Iy5zY0z0eJBitL6yIkKeSLSAcvLqgNQKCLfAX4KvAAsBfYBo41XC71HklXJTK+qZs+B+pj2PKFxQly9ejVTJt5F9dSpMf2szvnc8LPrI1xkvRJvZxA+7kS7oHieQ821NajNQFFaHzkhMIwxu4C+DofeA36brvsma5idNHMldQ2xsitoYPvWzdxww2+ZPHky9fWxQgWguN+pHHXeT/nDQ1clNe5EOwOv9SzGjDiKX77ivGlLha1BbQaK0rrIZZVU2knWMOt23BjDDVd8jyeeeMJRWBT2Po5Drn6UXt+/k7uvSD44LJ4dIDy1RSK10MiKMtd06mprUBQlmjYtMJI1zLodFxGKB18U096ueznfuHQC3UfdT+EhfblkcPNW3vHsAOHCzEv9iXEXxK/boSiKYpMTKqlskWwFuDEjjuKWVxY7ppo48pTzqVn7NsuXL6d95x50OOXKiCyyBnhvxZa4108UGzGyoox73lzqWDc8Wph5SbEOamtQFCUxbVpgjKwoY+Har5ky70sajCFfJOHq3xhD8ZalDGpYwcf5/WIS8N323aMpHPx7Vq1axcPrekJ+rMonnsrLqyF+3AUDUlbuVG0NiqJ4oU2rpKZXVTNtUXVjjYgGY5i2qNrVpXThwoUMHz6cESNGMOcvD/Lb8/s6qnvOOeccbrrpJsoO7uh4HTeV1vSqam6duiSuK6yNljtVFCXTtOkdhlcvqdWrVzN27FheeeWVxrYtW7bw2bsvM3fcONfrO6m8APYeqI/wZoKmnYVTgSNw3pUkszPIpYSAiqK0LNr0DiORl9SmTZu44YYb6N+/f4SwsHn44YfZuXOn6/XtXUBpVJDf9r11MYF2ieo1lLp4M/nBa8CfoiiKE21aYLiphr5RGGTcuHGUl5e7usieeeaZzJo1i1lrdsWNFB9ZUUZJ+9iNXLSaKZErbypCFhNFfiuKosSjTQuM6DgF01BH7eK/s+Kxa7j33nvZs2dPzDkVFRXMnDmTd955h6/ye3hasXuJ90jkyrujNtYjyi+5mBBQUZSWQ5sWGLbK6NCO7dmzbDabnv05m2f+Nzu3b4vpe8QRR/DSSy+xcOFCzj77bETE84rdS7yHU5Cdl2v4QRMCKorSHNq0wABLaAzfP4etbz7E/u0bYo5369aNxx57jBUrVnD55ZeTl9f0yhKt2O3EhtU1tUhUn2gXWFt4OUVepyqQThMCKorSHNq8wAAYPXo07du3j2grKSlh3LhxrFmzhptuuol27drFnBdvxR5uYAYrYM8WGm4usCMryqi6+2wevWxgWtxl1RVXUZTm4LlEa0ugOSVax4wZw0MPPURBQQHXXXedpyyy8cqQTpq5MmGJVkVRlFwg5SVaWzt33HEHW7du5a677qJvX6fEubHES6txSxqzwKYSjctQFMUrKjBCdOnShT//+c++z3MKnpteVU2eiGMQXi4ZmNNZolVRlNaH2jBSTLyIbcGalP1U9ksnGpehKIofVGCkmHgR27YIyZUIa43LUBTFDyowUozXyTYXVvIal6Eoih9UYKQYP5NttlfyGpehKIofVGCkGKdJODpozybbK3mNy1AUxQ/qJZVinFxtz+jXjWmLqlNS7CjVaPEkRVG8ogIjDThNwkN6d2lz8Q4a46EorQsVGBmira3kNcZDUVofasNQ0oLGeChK60N3GG2MTKmJNMZDUVofusNoQ2SyRKvGeChK60MFRhsik2oijfFQlNZHzgsMEblURD4XkdUi8pNsj6clk0k1kcZ4KErrI6dtGCLSAXgYOAloABaLyJvGmC3ZHVnL5NDSIscaHelSE7U1zzBFae3k+g5
jBPC+MabaGLMRmAWcleUxtVhUTaQoSnPI6R0G0BNYG/b9K6BHeAcRuRa4FqBXr16ZG1kLJF7BJ0VRlETkusBoBwTDvgexVFONGGMmA5PBKtGauaG1TFRNpChKsuS6SmoDED67HQZ8maWxKIqitGlyXWDMBEaIyDdE5BBgKPBWlsekKIrSJslplZQxZpOI3AX8K9R0qzFmTzbHpCiK0lbJaYEBYIx5Dnguy8NQFEVp8+S6SkpRFEXJEVRgKIqiKJ5QgaEoiqJ4QgWGoiiK4gkVGIqiKIonVGAoiqIonlCBoSiKongi5+MwcpVMlTpVFEXJFVRgJIFd6tSuXmeXOgVUaCiK0mpRlVQSZLLUqaIoSq6gAiMJMlnqVFEUJVdQgZEEbiVN01XqVFEUJRdQgZEEWupUUZS2iBq9k0BLnSqK0hZRgZEkWupUUZS2hqqkFEVRFE+owFAURVE8oQJDURRF8YQKDEVRFMUTKjAURVEUT4gxJttjSBkisgVYm+1xNJOuwNZsDyKH0PfRhL6LSPR9NNHcd9HbGNMtUadWJTBaAyKy0BgzJNvjyBX0fTSh7yISfR9NZOpdqEpKURRF8YQKDEVRFMUTKjByj8nZHkCOoe+jCX0Xkej7aCIj70JtGIqiKIondIehKIqieEIFRg4gIkUicmS2x6EoihIPFRhZREQ6ish0YBNwW1j7zSKyTkRWisi52Rth5hCRQhGZHHrmtSJyS6i9zb0LABHJE5G3ReSz0LOPCLW3yfcBICLtRGSZiDwT+t6W38UXIrI69JkTakv7+9D05tklCDwO/B04CUBEyoEbgAFAT+AdEeltjKnL2igzQwkwE7gOOBhYKiIf0zbfBYABfmSM2SAi5wD3i8hq2u77ALgT+ALa9N9JI8aYvva/M/U+dIeRRYwxu40x7wL1Yc0XAVONMbuMMcuw/kAGZ2N8mcQYs80YM81YbAW+BE6jDb4LgNB72BD62htYQhv93QAQkf7A8cDUUFObfRcuZOR9qMDIPXoSmd7kK6BHlsaSFUTkW0AhVrqDNvsuROQ2EdkG3ALcSxv93RARAR4Dbg5rbpPvIoxaEVkjIh+F1JUZeR8qMHKPdliqKpsg0JClsWQcEekKPA9cQxt/F8aY3xljDsZSxcyk7b6P64HZxpjVYW1t9V0AYIzpb4wpB8YAL5Kh96ECI/fYAITXfj0MSz3T6hGRzsCbwJ3GmAW04XcRjjHmdeAg2u77uAoYJSKLsXZaFwEbaZvvIgJjzBws9VNGfjdUYOQeM7D+OIpDetsuwOIsjyntiEhH4A3gfmPMP0LNbfJdAIhIHxE5JPTvbwP7aKPvwxgz1BhzjDFmIHA38FcsR5E29y4ARKRERHqE/l2BpXp6lwy8D/WSyiIi0gGoAjoAhSLyHeCnwAvAUqxJYrRpG+H4vwAGAY+KyKOhtrNpm+8CoBT4p4jkA5uBy4wxi0Skrb6PCNr4uygG3g/9buwArjTGzM3E+9DUIIqiKIonVCWlKIqieEIFhqIoiuIJFRiKoiiKJ1RgKIqiKJ5QgaEoiqJ4QgWGoiiK4gkVGIqiKIonVGAoiuKKiPQSESMir2d7LEr2UYGhtChCRYVM6POLOP2eDev3bCbH2MoYFPr5cVZHoeQEKjCUlsYgmuqHHOvUQUROxMp2a2frXJiBcbVWbIGxKKujUHICFRhKiyFUVawLMB+rrG2MwBCRPOCPwBaaBIUKjOSxi/DoDkNRgaG0KIaEfi7CSto4ICQgwrkOa5K7DSgH6oBPoi8kIpeIyD9EZKuIHBCRVSJyZyihW3TfH4rIi6H62rtEZLuILBCRa5wGKSKnisjroQI3+0Rks4jMF5HfRvW7M6Qyu8jhGr2jbQcickaobZKInCAifxORr0NtA5rxfAWhetCfiEitWDXVbwsVLhoEVBtjNjk9q9K2UIGhtCTCBcbHWFk7y+2DoeJL9wP/At7Hqtj3qTFmf1iffBGZArwG9AVeBZ7AKjhzP/Cn8Bu
GMgr/BegDzAH+ALwOHAH8SURuj+p/J/ABltB6F/g9Vo2P9sA5Uc8TT91jP7SOH8EAAATkSURBVGuVQ/9vhcYSBJ4CXgJWJPl87YD/BR7FUuH9ITTue4DJwCHo7kKxMcboRz8t4gO8BxisQvffD/37krDjz2BNehXAJaHjT0Vd4w+h9geAgrD2ADA3dOzosPaDgEMcxtID2AWsCGvrjmVfmQO0czina9T3z4EtLs/6QGgs54e1vRhq2wWc5HKe3+d7OtT2G0LZq0Ptp4XaDTA+2//3+smNj+4wlBZBSD1SAezFWk3bq95jQ8dPBH6CJSCqaNK9Lwy7xonAz4G/GWPuMMbYxnOMMXVYOwmAE8PadxtjNkaPxxizAViPZVOx6QfkA58ZYw44nLM1bCxdgMNxNybbO4zw1b29w7jZGPNR9Al+n09ETgBGAx8YYyYYY0xY/w+A5aGvavBWAC2gpLQcjgQ6AR8aYxqA/4hIDXBsmKF7GzA21N+ecMMN3jcBAuwVkfEO9/hW6KfYDaGysTcC5wFHAR2JVOWGq4yWYhW0+YmIdMPaEbxljNnucC978nczyA8CNhtj1ofGUYL1DjYDz7mc4/f5bgr9vNvlettCP1UlpQAqMJSWg5MAWAwcA1yLtaP4qTHm69CxQcB+4N9h/c8O/bw8wb3WAYjIscBbWKqm+cDLwNdYhvQjgB8BS+yTjDFbReQUYBzwXeACoEFE3gbuMsaET7z2Dihm9S4ifbB2Lv8Maz4OS1DNMMYEXcbt6/lC/bdh2Vyc6ANsMsZUJ7ie0kZQgaG0FMIN3jZVwOnAb4EFwLMAInIEcDAwP6SKQUQKgW5Y6pfTPd7zeaxSqWcYY2aHHxCRe0P/jNghGGP+DfwgZEw+DUuY/QA4XkTKTJMBviL002n1/t2w57OxdyTznAbq9/lC/b8BVIWrosKODwUOBf4RfUxpu6gNQ2kpOAmMj7HUK6XAjWETX4z9giY1TFcvNxORnlj2kdkOwqIUKzAwejyNGGMOGGPeMcZcCvwflgDrHtalH1BnjFkbde32WK7B9vPZJFJh+Xo+LOeABiyh4cQ9DmNQ2jgqMJScJ2SjGAjsockQC5Y76EXAmcaY+WHtMeorY0wtVjzG0SJysct9TgmLU9gX+tlHRAJhfQ4GXgEOw/KIWhxqrwgFFkZfsy+W7WAd8FXYoQNAQESODOtbguX2atsaoncYB4BPncbu9/lCO69VQJmIXBDV73ZgWOirGryVRlQlpbQE+mO5t84N19+H7BXTHfo77TAAxgAzgGki8g7WBJsHlIXOCRhjeoWuvUVEZgFnAvNC/XsA52K59waBZcYYW7D8ArhaROZjGb83Y9k5Lgwd/0mU7WEmcDzwvoj8NfR8Z4XGtAErxuQ/0LjrOBr4xMn7KpnnC/EAlufUNBF5GdgIfAfLLvQl0BPdYSjhZNuvVz/6SfTBMi4b4L889t+G5X6b73DseKygto1YxuutWKv2p4Czovp2A/4HK83ILuDD0FgqQuN5NqzvSCybx0pgJ9Zu4Aus2JBvOoyjEPgvLNfcvVjC7Tos9VoQSxVm9x2MQ0yJy7N7fr5Q/5uxBFMdVrqVaVi7mbXA1mz/3+sntz5iTIy9S1EURVFiUBuGoiiK4gkVGIqiKIonVGAoiqIonlCBoSiKonhCBYaiKIriCRUYiqIoiidUYCiKoiieUIGhKIqieEIFhqIoiuIJFRiKoiiKJ1RgKIqiKJ74/3bVliX0c9t2AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots()\n", "ax.scatter(y, predicted)\n", "ax.plot([y.min(), y.max()], [y.min(), y.max()], 'k--', lw=4)\n", "ax.set_xlabel('$Measured$', fontsize = 20)\n", "ax.set_ylabel('$Predicted$', fontsize = 20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 训练集和测试集" ] }, { "cell_type": "code", "execution_count": 190, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 6.32000000e-03, 1.80000000e+01, 2.31000000e+00, ...,\n", " 1.53000000e+01, 3.96900000e+02, 4.98000000e+00],\n", " [ 2.73100000e-02, 0.00000000e+00, 7.07000000e+00, ...,\n", " 1.78000000e+01, 3.96900000e+02, 9.14000000e+00],\n", " [ 2.72900000e-02, 0.00000000e+00, 7.07000000e+00, ...,\n", " 1.78000000e+01, 3.92830000e+02, 4.03000000e+00],\n", " ..., \n", " [ 6.07600000e-02, 0.00000000e+00, 1.19300000e+01, ...,\n", " 2.10000000e+01, 3.96900000e+02, 5.64000000e+00],\n", " [ 1.09590000e-01, 0.00000000e+00, 1.19300000e+01, ...,\n", " 2.10000000e+01, 3.93450000e+02, 6.48000000e+00],\n", " [ 4.74100000e-02, 0.00000000e+00, 1.19300000e+01, ...,\n", " 2.10000000e+01, 3.96900000e+02, 7.88000000e+00]])" ] }, "execution_count": 190, "metadata": {}, "output_type": "execute_result" } ], "source": [ "boston.data" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:16:27.403480Z", "start_time": "2018-04-29T07:16:27.398197Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from sklearn.cross_validation import train_test_split\n", "Xs_train, Xs_test, y_train, y_test = train_test_split(boston.data,\n", " boston.target, \n", " test_size=0.2, \n", " random_state=42)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:16:43.427978Z", "start_time": "2018-04-29T07:16:43.423656Z" }, "slideshow": { "slide_type": "subslide" 
} }, "outputs": [], "source": [ "# Fit ordinary least squares on the training split only\n", "regr = linear_model.LinearRegression()\n", "lm = regr.fit(Xs_train, y_train)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:16:47.859814Z", "start_time": "2018-04-29T07:16:47.854257Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "(30.288948339369036,\n", " array([ -1.12463481e-01, 3.00810168e-02, 4.07309919e-02,\n", " 2.78676719e+00, -1.72406347e+01, 4.43248784e+00,\n", " -6.23998173e-03, -1.44848504e+00, 2.62113793e-01,\n", " -1.06390978e-02, -9.16398679e-01, 1.24516469e-02,\n", " -5.09349120e-01]),\n", " 0.75088377867329148)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Intercept, coefficients, and R^2 measured on the training data\n", "lm.intercept_, lm.coef_, lm.score(Xs_train, y_train)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:17:35.601265Z", "start_time": "2018-04-29T07:17:35.598315Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Predict on the held-out test split\n", "predicted = regr.predict(Xs_test)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:17:43.752187Z", "start_time": "2018-04-29T07:17:43.605493Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYwAAAEXCAYAAAC+mHPKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3Xt4VOW1+PHvSggQEIkoCAQEIwKiiAEUvEG9oqgIaEWPIoIWsVattoggd8Kl6Gm91NOfqBwrWAWRIhaBHsULKIiBoBhsFESQCHIzXAOEZP3+2DNhkuxJZpK5JbM+z5Mnyd7v7HlnK3vlva1XVBVjjDGmIgnRroAxxpjqwQKGMcaYgFjAMMYYExALGMYYYwJiAcMYY0xALGAYY4wJiAUMY4wxAbGAYYwxJiAWMIwxxgSkVrQrEEqnnXaatm7dOtrVMMaYamXNmjW7VbVxReVqVMBo3bo1mZmZ0a6GMcZUKyKyJZBy1iVljDEmIBYwjDHGBMQChjHGmIBYwDDGGBMQCxjGGGMCEjMBQ0R+EJGNnq/lnmOPiMhWEckRkeujXUdjjIlnMRMwAFS1jefrchE5C3gQOBfoB7wiIknRraExxsSOvLw83nvvvYi9X0wFjFL6AXNV9YCqbgB+ALpEt0rGGBN9hw4dYtq0aaSlpdGvXz+2bt0akfeNpYCRLyKbRGSViPQCWgK+i0m2Ac1Kv0hEhopIpohk7tq1K1J1NcaYqBk0aBAjR47kl19+4dixY0yaNCki7xszAUNVz1HVs4DhwOtAbaDIp0gRUOjyuhmq2lVVuzZuXOHKdmOMqfYeffTREr//7//+L999913Y3zdmAoaXqi7H6X7aDqT6nGoB/BiNOhljTCy59NJL6d27NwANGzZkwoQJNGtWpgMm5GIil5SI1AdOVtXtIpKO0/X0AfCSiDwNtAIaAeuiWE1jjIkIVWXRokVkZ2czYsQI1zIZGRl06tSJ4cOHc8opp0SkXjERMIB6wMcikgjsA+5S1U9FZDaQDRwB7lNVjWYljTEm3JYtW8aTTz7JqlWrSExM5JZbbqFNmzZlyqWnp5Oenh7RusVEl5Sq7lLVtqp6lqp2VtUPPcenqOqZnvGNT6NdT2OMCZfPP/+cq6++mquuuopVq1YBUFhYyPjx46NbMR8xETCMMSZeffXVV/Tp04fu3bvzwQcflDn/5ptvsm3btijUrCwLGMYYEwXffvstd9xxB506deLdd991LdOnTx/Wrl1LixYtIlw7d7EyhmGMMXFh69atTJw4kVdffZXCwjIrBQC46qqryMjIoHv37hGuXfksYBhjTATs2LGDKVOm8OKLL3Ls2DHXMhdffDGTJ0/miiuuiHDtAmMBwxhjIuDNN9/k+eefdz3XqVMnMjIyuOGGGxCRCNcscDaGYYwxETBs2DBSU1NLHGvbti1z5sxh7dq13HjjjTEdLMAChjHGRETdunUZM2YMAK1atWLmzJlkZ2dz2223kZBQPR7F1iVljDEhUFBQwMyZM3nttddYtmwZderUKVNmyJAh1KpVi7vuusv1fKyrHmHNGGNiVGFhIbNnz6Z9+/YMGzaMzz77jBkzZriWTUpK4t57762WwQIsYBhjTKWoKv/85z/p1KkTAwcO5Pvvvy8+l5GRwaFDh6JYu/CwgGGMMUFQVZYuXcpFF11E//79yc7OLlNm3759fP7551GoXXhZwDDGmACtWLGCX/3qV1x33XVkZmaWOV+rVi2GDRvGpk2buPLKK6NQw/CyQW9jjKnA2rVrGT16NIsXL3Y9LyLcddddjB8/nrS0tAjXLnIsYBhjjB+bN2/m8ccfZ968eX7L3HLLLUycOJEOHTpEsGbRYQHDGGP8OHbsGPPnz3c9d91115GRkUGXLl0iXKvosTEMY4zxo127dgwaNKjEscsuu4xPPvmExYsXx1WwAAsYxhjDgQMH/J4bO3YsSUlJdOnShcWLF/PJJ59w+eWXR7B
2scMChjEmbu3fv5/x48eTmprqdxps69at+eKLL/jiiy+47rrrYirf04KsXC6dtowzn1jEpdOWsSArN6zvZwHDGBN3Dh8+zPTp0znzzDOZMGECBw4cYPTo0X7Ld+rUKaYCBTjBYuT89eTm5aNAbl4+I+evD2vQsIBhjIkbx44d44UXXuCss85ixIgR7N27t/jc+++/z4cffhjF2gXnqaU55BeU3IApv6CQp5bmhO09bZaUMabGO378OLNnz2b8+PFs2bLFtcypp57Krl27IlyzyvspLz+o46FgLQxjTI1VVFTE3LlzOe+88xg8eLBrsDj55JOZOHEimzdv5rbbbotCLSuneUpyUMdDwVoYxpgaR1V57733GD16NOvWrXMtk5yczMMPP8zw4cM59dRTI1zD8i3IyuWppTn8lJdP85RkhvdqR9/0kpsvDe/VjpHz15folkpOSmR4r3Zhq5cFDGNMjfLzzz/Tr18/Vq5c6Xo+KSmJ+++/n1GjRtGsWbMI165i3sFsbyDwDmYDJYKG9+eKAksoWcAwxtQojRs3dl1XkZCQwKBBgxg7diytW7eOfMUCVN5gdulg0Dc9NawBojQbwzDG1CgJCQlMmjSpxLHbbruN7OxsZs6cGdPBAoIbzI7bdRgiUltENojIy57fHxGRrSKSIyLXR7t+xpjYsnHjRvLy8lzP3XzzzVx00UXccMMNZGVlMWfOHNq3bx/hGlZOoIPZ8b4OYxTwA4CInAU8CJwL9ANeEZGk6FXNGBMrfvzxR4YOHUr79u15+umnXcuICMuWLeNf//oXF1xwQYRrWDXDe7UjOSmxxDG3wexorMOIiYAhIucAFwJzPYf6AXNV9YCqbsAJJPGV5csYU8LOnTt59NFHOfvss3nppZcoLCzkmWeeYefOna7l69evX6n3iXQ3T2l901OZ2r8jqSnJCJCakszU/h3LjFVEYx1G1Ae9xVlv/xzwAHCZ53BL4GufYtsA1+kMIjIUGApwxhlnhK+ixpio+OWXX3j66ad59tlny+yTfejQIaZOncpf/vKXkLxXoDOUwi2QwezmKcnkugSHcK7DiIUWxjDgI1Xd6HOsNlDk83sRULLt5aGqM1S1q6p2bdy4cRiraYyJpIMHDzJlyhTS0tKYMmVKmWAB0KZNGy655JKQvWc0unkqK9Cuq1CKegsDGAg0EJFfA42A+jgtDt/w2gL4MQp1M8ZE2JEjR3jxxReZMmWK3+6mli1bMnbsWAYNGkRSUuiGN6PRzVNZcbkOQ1WL/zwQkXtwuqX+BcwSkaeBVjiBxH25pjGmRigoKODvf/87EyZMYNu2ba5lmjRpwqhRo7j//vupW7duyOsQjW6eqoj0OoyoBww3qrpGRGYD2cAR4D5V1ShXyxgTJkVFRVx00UV+03ikpKTw+OOP89BDD3HSSSeFrR7RSLdRncRUwFDVV4FXPT9PAaZEsz7GmMhISEjguuuuKxMw6tevz+9//3v++Mc/kpKSEvZ6RKObpzqRmvSHe9euXTUzMzPa1TDGVMLevXtJS0tj37591KlTh9/+9rc88cQTNGnSJNpVq/FEZI2qdq2oXEy1MIwxNdtnn31GUVERl112WZlzjRo1YuTIkXz//feMGTOGFi1aRKGGpjwWMIwxYbdu3TpGjx7NokWLOP/888nKyiIhoeys/hEjRkShdiZQsbAOwxhTQ+Xk5DBgwADS09NZtGgRAF999RVz586t4JUmFlnAMMaE3A8//MDgwYPp0KGDa3AYO3YshYWua3FNDLMuKWNquEB2bwuV7du3M3nyZGbMmEFBQYFrmUsvvZTJkyeTmJjoet7ELgsYxtRgkcqNtGfPHqZPn87zzz9Pfr77quj09HQyMjK4/vrrcVLImerGuqSMqcHCnRvpwIEDTJw4kbS0NKZPn+4aLNq3b89bb71FZmYmvXv3tmBRjVkLw5gaLNy5ke655x7mz5/veq5169aMHz+eO++8k1q17FFTE1gLw5gaLNDd2yr
rj3/8Y5ljTZs25YUXXiAnJ4dBgwZZsKhBLGAYU4OFOwX2xRdfzI033gg4C++mT5/Opk2b+O1vf0vt2rVD8h4mdljoN6YGq2puJFVl/vz5fPPNN4wePbr4uO/Mq4ZpfRgw9GxmPDWek08+OSyfw8QGyyVljClDVVmyZAmjR49m7dq1JCYm8s0333D22WeXmXkFTqvFbRtRUz0Emkuq3C4pEZknIiNFpJeI2HZ2xsSBTz75hB49etC7d2/Wrl0LQGFhIePGjQOq1650JrQqGsPoD0wG3gN2iMhWEVkgImNF5AYRcd1n2xhT/WRmZtKrVy969uzJihUrypx/6623yM3NrVa70pnQqmgMoxXQ2fPVxfO9D3CTt4CI/Ays9XytAdaqqm2nakw1kZ2dzZgxY/jnP//pt8ytt97KxIkTSU1NpXlKTrXalc6ETrkBw/Pg/xF4x3tMRGYA9wHfAHuANKA3cL1Pmd2qeno4KmyMCY1NmzYxfvx4Xn/9dfyNZV5//fVkZGTQuXPn4mO2K138CmqWlIgMB+4EeqjqCp/jlwATgKtwAkzNGUk3JkwimePJ17Zt25g0aRIzZ87k+PHjrmV69OjB5MmTXfetsF3p4lew02ofBN7wDRYAqvoZcI2IjADGApeGqH7G1EiRyvHkZt68ecyYMcP1XNeuXZk8eTLXXHNNuSk8+qanWoCIQ8Eu3GuC0w3lSlX/BHwNjKpKpYypyRZk5fKHuV9GbabRsGHDyuxmd+655zJ//nxWr17Ntddea/mejKtgA8Z3wNUVlPkIuLxStTGmhvO2LAr9jBnk5uVz6bRlLMjKrfJ7+RuXqFu3LmPHjgUgLS2NWbNm8eWXX9KvXz8LFKZcwXZJzQT+IiKjVHWKnzJNgUZVq5YxNZPbGobSKts95R0Tyd29n4Sc90ncvIL1mauoW7dumbL33HMPderU4Y477iApKSm4D2HiVrAtjL8C7wOTRGSZiJQYERORG4DbcQa+jTGlBLpWIdjuqQVZuTwxbx3/+fgdtr00lM3/eoGN2V/y4Jg/uZZPSkri7rvvDihYLMjK5dJpyzjziUUha/2Y6imogKGqhcANOC2NXwEfi8hOEckUka3AQiAJeDbUFTWmJghmrUKgwaWoqIgR0//Gpr8NZe+S5yjcv6v43Kz/9wwHDx4Mup5e3i603Lx8lBOtHwsa8SnobLWqWqCqvwG6A/8AinAW9DUDcoBBqvp8SGtpTA3hlj3W36hBRcFFVXn33XdJT0/n2zcnc/yXn8qUOX40ny+++KKy1bU0IKaESmerVdXVwEAAEakDFKqq+6RuYwzgvobhivaNeXtNblAL4ZYtW8YDj/yRb7/Oci+QWIsGF1xPu153c8UVV1S6vv5aOd7BeVuHEV8qFTBEpD3QDeePoy2q+mFIa2VMDea2hqFrq0ZMeDebXw4XAFCnlnvjf9WqVTz55JMsW7bM/eKSwEkdr6bhpbfT4NRmPNm/Y5Xq2jwl2TUNiEDx8UiuITHRFexK7wTgFeBu7yGcLqlanvOilciX7rnuUpzcVQo8rKpLReQR4A9APvB7VV0c7LWNqUi0VlyXdqSgqPjnvPyCEg/hjRs38thjj/Huu+/6fX29c3rS6PI7qXVK85B9Drc0IELZVA7ebioLGDVbsC2MEcAg4DPg78C1OBltvS4RkdeBwUG2OhS4W1W3i8h1wGQR2YizsvxcoCXwvoi0UtWCIOtsjF/RXHHtq7yxgr7pqRQVFfHee++5vja5TTdSLr+L2k3ORIDN024IWb3cutDcWhxg2WrjQbABYzDOwHZPVS0Ukeb4BAxV/VREjgO3AQEHDE+rZLvn11bAl0A/YK6qHgA2iMgPOBlzVwVZZ2P8quhBHSkVpQxv27YtgwYNYubMmcXn6ra6gJQeA6nT/MRYRzgyxpbuQrt02jLLVhungp0l1QpY7Jle688a4JJgKyIij4vIHuBRYCJOq2KLT5FtODOxSr9uqGdab+auXbtKnzbGlXd
tQXl/LUdy/YH3YVt05KDrcYCxY8dSu3ZtLr74Yia+OIfWA6eWCBaRyhgb7n3CTewKNmDsB+pUUCYXlwd7RVR1uqqeipOHailQG2d8xKsIKBOoVHWGqnZV1a6NG9umgKZivmsL/GmYnBTR9QfDujfhwPLX2PY/gzia+x+g7EO4VatWZGVl8emnnzJm6G1M7d+R1JRkBEhNSY7YFql901Oj9t4muoLtkvoCuFpEElS1yE+ZIqBhZSukqvNF5DmcLirf/wNbYCvITQhUlJ4jOSkRESLSVXXgwAGeffZZnnrqKfbv3w9A3vLX6DzsL66D1h06dCj+OZoZYy1bbXwKtoXxCnA2TpeRP+dTTkZbNyKSJiJNPT9fDBwBFgG3i0g9ETkHJz/VuiDra0wZ5Q3Oev9azjvsPrciVAO7+fn5/PnPfyYtLY0xY8YUBwuAI1u+YtJFNkXVxJ6gWhiq+raIzAFGeh7iR3zPi0g/nGy284KsRwqwREQSgZ3AAFVdIyKzgWzP+9xXmSm7xpTmb6ZPakoynz5xJeC0QsIxsFtQUMDMmTOZNGkSubnu3Vunn346eXl5VXofY8Ih6NQgODvu/Q3oi5NoEBH5UETW4wSKAsA945kfqrpWVduq6lmqerGqrvEcn6KqZ6rqOar6aSXqakwZgQzahnpgt7CwkNmzZ9O+fXuGDRvmGixOOeUUpk2bxqZNm+jfv7/LVYyJrqBXenvGLn4nIrOAh4BrgJ6e018Bw70PfGOiyd+CPG9Xz/iF2eTlO11PdZNK/u3ku/4gNy+fRJESOZQC7S5SVRYsWMCYMWPIzs52LXPSSSfx2GOP8dhjj9GwYaWH/4wJu2BXev8X8LGq5qrq58DnnuN1AFHVI+VewJgICWRB3tHjJ+Zt/HK4oMx57/fKLuzbsWMHN910E5mZma7n69aty4MPPsiIESOwGX6mOgi2S2o2cG/pg6p61IKFiSUVZVkNNAtrVbK1NmnShKNHj5Y5XqtWLYYNG8bGjRt5+umnLViYaqMyYxjlEpFbRWRSqK9rTDAqWjld0flAr1OehIQEJk068U9BRBg4cCA5OTn87W9/IzXVZkGZ6qXCLikRuR3IBDYFeM1zcRbfjalCvYypEn8zobyznCo6H+h1AL755hvW7Czibyt/LjNe0qdPH7p160aLFi2YOHFiiXUUgYiVxIjGQGBjGP/ASQ540PP9GhH5CVgLrHdJBlgfKNsON9VCTXlAuWVZ9Z3l5C8Lq3efhyvaN+bD/+wiNy+/THZW73U2b97M+PHjmT17Nqdc/GtOumwgUHac48MPPyQ5OfjpuLGSGNEYL6loaYOIPAak4yT+a+857H1RAbABJ3hk4aQOmQIcVNVzwlHh8nTt2lX9DTCaipV+QIHzcKyuaR9KBz9vECj9u1tQ8Cc1JZl7Ozckc8ErvPzyyxQUOH8vSVIdUu9/mcT6p5Qo613XURn+cl1V9brGlCYia1S1a0XlKmxhqOqffS5aBLwKrMTZlrUz0BG4oNTLfhdMZU1siJXMraHiO4XW7a/1t9fkMrV/R7+L9EorOryPxj+9y4MT/s6RIyXneGjBUfatfItGVw8tPlbVVeFVGT8xJhyCXYcxHchU1eKV3J7V2R1wgsapwBpVXR66KppIqakPqAVZufxh7pcUlmpN5xcU8ujcdVSUP6Do6CH2r17A/swF/HjM/V4kndqSuq3OL3GsqqvCAx1nMSZSgk0N8oTLsUJgvefLVGPV8QFV0ZiLt2VROlh4lRcsigqOcGDtv9i/6m2KjhxwLZOWlsZN9zzMkiNtOOLTOAtFuu+KxmGMifSYY7AL9/rg7HXxgqpa5tgapro9oEYvWM/rq7YWjz24DQpXlJnWjR4v4MCXS9i/ci6Fh35xLdO8eXPGjh3LkCFDSEpKCss/XLfd7qrrJAQTetGYFFHhoHeJwiJLcMYtWqvqYZfzldrTO1Rs0LvqqsssqQVZuTw6Z53rQHWqp96Bjk340qJCtv/vQxT
s3up6/uSURowb8yQPPPBApWY+GRMqoZwUEbJB71LOB/7tFiw82ojIR8AfVPXNIK9tYkB12efgqaU5fmc1ef/SCrZlASAJiTQ57zJyP/pHieOJdepz25AHePFPY2nQoEElamxMaEVjzDHYgNGIktumlqCq34nINuAewAKGCSnf1k95zVhvosDKkk59SFz1DoVHDpGcnMwjjzzC8OHDadSoUaWvaUyoRWPMMdiAsYuKt19dB9xYueoY43BbQ/H2mtwKA4GA3wFucIJJoSpHtn4FkkDdlueVLVP3JBp0H4Ac2s1/Tx7P4GvSq/pxjAm5aIw5BhswPgNuEJFkVfXX7skDTqtatUy8WZCVy4R3s/nFZae73Lz8EoPb/ghwZ/czihfjlZaakszNqYeZMG4shzdnkdS4Nc0GP0e92knUqZVQnOocoGE3Zz+Kl9f8wuBrqvLJjAmPaEyKCHbQ+1fAMuAt4E5VPe5S5l9Ad1WNeNCwQe/qaUFWLsPnfUlBYeXmSwiU+MfitmJdftlKk2/fYfVHS0u89uwBTzJ9xAN+B9Ddrm9MTRPooHdQ2WpV9SPgOeDXwApPAPF9018D1+OsBDcmIE8tzal0sABIqZfEoaPHeXTOOi6dtgyAqf07kpqSzPG9uRxa8me2vPRgmWABoGvmcNP5Tcvt91VODKQvyHLfVtWYeBB0enNV/T0wDegKfCAiu0VkjYj8iDPQXeQ5b0xAqjqr45fDBeTlF5R4sO/e8RP1Vr9M7isPsPvLZbi1pOu17sR9o/+bxMRE1y1ZSwt0Hwxjaqqgt2gFUNVRIjIX+ANwLU5yQoCvgSds/20TDH+zPSqj8NAv5K58i6FTF6OFZcdDAGo3a0dKj4Ekt76AhduTGUHZ/mB/7Z3qnibFmKooN2CIyACc3FAbS59T1XXAQE8526LVVNrwXu2qNIYBUHjkIPtXz+dA5jtogXt2/aTGrZ1AcdZFiAhQMgD4rkHxtygqltOkGBNuFbUw3gCKRKSDqn4rItNw0pivVdXvvIVU1fa/MEEpPW12wIUtWfTVdtdZUr6SkxJJEDh0rOT02j3vPUP+d6tcX1PrlOakXHYn9c65HJGSvbD+AkB1S5NiTCRUFDD+iLMPhndl9+N4tg0QkYM4ay68e2GsBTaoalF4qmpqivJSjfdNT/X7132iCFP7d2T8wmygZMBo2O3WMgEjsUFjGl56Byd1vApJKDs+UV4AsDxOxpRVbsDw3QvD40pO7IPRGbgUuJwTe88cEZH1ON1YD4a4rqaGqGjfDX/jBEWqZG7ZW2K9hFed1PYkt7mI/I2radjoNG699yGWJ17AUXUfyE5JTmJ8n3PLDQDVJU2KMZESbHrzj4CPvL+LSD2gEyWDSDpwIWABI8oikUiwMu9RUQ4ct0Fw1SKOfbucv67YRspld7q+vm3v+xjQtC8PP/ww9evXL66b2456R49bQ9iYYAW1cC+gC4rUBs5T1bUhvXAAbOHeCZHYbtXtPQBOqZfEuJv8//WePvHfrmMVCeLsT1ErAQo8z3NVJX/j5+R9MouC3VtAEmh+399IalTy2uV9toqyelaXDL3GhEtIFu6JyDwRGSkivUSkcSBvrKrHgg0WIlJXRGaISI6IbBGRRz3HHxGRrZ7j1wdzzXhXXrdPON8DnHUR5S1y8/c3SpE6rYCCIk+g2JzFjll/YNf8DCdYAGgReSteL/Pa8gJheS0ab9DL9UyltQV6xvhXUZdUf8+Xd6A7F2dwey2wBme21PYQ1KM+sBS4H2eb12wRWYvTrXUu0BJ4X0RaqWr502gMEJnUx+Vdq7y9wPe5jEH4OrLtG/KWv8bRre6bOB7O+YzjB/ZQq8GpgDMYXl6LoLysnjVtH3NjwqmigNGKE2MTXTzf+wA3eQuIyM+UDSJB7canqnuAtz2/7vasGu8BzFXVA8AGEfnBUwf3uZOmhEikPq5owZ2/gOLvdcd+3kTeJ7PI/95ft6JQv0NPGl72X8XBAuCObi3LrWd
5U2QfnbMuqLobE88qmiX1I/Aj8I73mIjMAO4DvgH2AGlAb5wcUt4yu1X19MpUSETOA+riZLz92ufUNlxSq4vIUGAowBlnnFGZt6yRwrmOoLzBZF++wanEa6RkuYI9P5K3/HUO56zw+5712l7MlIxJ7Epqwhuf/0ihKoki3NGtJRl9O5Zb3/KmyPrblc8W6BlTVrB7eg8H7gR6qOoKn+OXABOAq3ACTKVG0kXkNGAWMBgYgpOXyquI0pPvAVWdAcwAZ9C7Mu8byyo7IBuOdQQLsnIZvzC7xLRWfzfcNziVHhz3jmEc3/czeSve4FD2MvCzfKdu63RO6TGQIf2u5hFPYKgoQLjxN0XWFugZE7hgc0k9CLzhGywAVPUz4BoRGQGMxVmfERQROQV4Fxilql94Brl9/4W3wAlGcaOqm7yHch2BvxlRXinJSdSvU8s1OPkbHD/83ecc+vp91+vVSe1ASo+BNG3XucL1ElVhC/SMCVywAaMJTjeUK1X9k4j0B0YBtwV6URE5GVgITFbVxZ7Di4BZIvI0zlhKI5yV5XEjlgZk/T30vfblF7Bu3LWu5/yNBzS44Hr2r/4nhQd2FR+r37wN9S+5k7M6X87j17WPyOe0BXrGBCbYgPEdcHUFZT4C7g7yug/jDKg/IyLPeI5dC8wGsoEjwH0a6kUjMS6cM52C7eqq6D3L6/Nv1rAuP+0rm5dSaiXR8NLb2bvkeWo1akHadUP4ZtY4EhKCzrpvjImAYAPGTOAvIjJKVaf4KdMUpzUQMFXNADJcTk3xfMWlcM10Kr3DXW5ePsPnfQn47+oqb0aUvz7/OSs3MiLjaX7K/D+aDXwaqVW7TJmTOl5NQlJd6rW/jGMJieUGC1tgZ0x0Bfun3F+B94FJIrJMRC7zPSkiNwC3E2djDeHitqlPKAZkJ7ybXSaVeEGhMuHd7KDqAs6qbt9Fcwuycrk4Yymn9nqQO6/txpb3XqRg5/ccyFpc5rUAkpBI/Q49kYTEcgOhLbAzJvqCzSVV6AkK/wPcC3wsInuArTjjG6k4WyA/G+qKxqNwDcj6SyFeXmrxQOryduZWHpr0HLs+ns3xfT+XeP2+VXM5qdO1JNWpR6Fqmem4FQXCWBq2aiolAAAY3klEQVTPMSZeVTqXlIhcBDwEXIMTLApxxjimquqskNUwCJZLKjCtn1jk99wP024IuuunqKiI+fPnc/eDfyB/51bXMlI7mSa3jiO55Xls9nmP3Lx8EkWK11UUqpKakswV7Rvz4X92FdfBX3eYAJun3RDU5zfGlBRoLqlKbdEKoKqrKbnjXqGqHq/s9UzkpCQnuaYIT0lOCmoqr6qyZMkSRo8ezdq17unDpFZtGnS+kZO73UJivYbF3U7ea/m+V6GeGFOZvepE4ClvgaAtsDMmcio1HUVE2ovIIBG5R0SuUNWjFiyqj/F9ziUpoeRy66QEYXyfcwNOWvjJJ5/Qo0cPevfu7R4sEmpxUvoNNB/6EqdcMYTEeg3LdDtVNFXXl+K0JnzZAjtjIivYld4JwCucmDYrOCuwa3nOS7xNfa2OyhuPqCi3Uk5ODg8//DD//ve/3S8uCdQ/9woaXnoHSSlNiw+nunRtBTs9WD3XsVlSxkRHsF1SI4BBwGfA33HWSvT3OX+JiLwODFbVD0NTRRNKpccn/jLgghIP3YZ+uqsaJicBkJiYyAcffOB67XrtLiPlsjtJOu1EMsBT6iWRNdZ9QV9FyQvd5ObluwYfY0z4BdslNRjIAXqq6kuUTA6Iqn4KHCeIVd4mcgKZmlo6MWDp423atGHw4MElzvXu3Ztm9zxL475PlAgWAHnlzLzyN1W3Ijal1pjoCDZgtAIWq2p5Hc9rgEsqXyUTLoGMT3gf8IX5+0uU833wjx07ltq1a9OzZ09WrFjBokWLOLPdea7vWd6gdN/0VKb270iqp0yiJyqlBjCQHerNoIwxFQu2S2o/UKeCMrnAFZWrjgmXBVm5frt/fMcSGtc6yn+Wvsb
BdYtpMmAydVucA5R88Lds2ZKvv/6aNm3aIJ6H/PBe7UqsHgdISpQKB6X95XHyt62qv3obY8Iv2BbGF8DVnsFvf4qAhpWvkgk1b1eUP81TksnLy+O23/yeNdPv5EDmO+jxY+Qtfw1VdZ2NdPbZZxcHi2KlpztUYfpDIN1VNqXWmMgKNmC8ApwNTCynzPmUk9HWRF5501f12BFqZ79Dy1ateevlZykqOJEk8OjW9STt+Lrc/bJ936OgqFS6kSKtdLdR6e4qm1JrTPQFmxrkbRGZA4wUkXNwssgWE5F+ONls54WuiqayfFdTl6bHCziwbjH7Vs1l66E819cnNjiNJD0e0GykUGXW9bfK3BIPGhN9lVnpfSdOC+IB7wER+RBnS9UOQAHwp5DUroYL50PQ34ZHWlTIwfUfsO+zNyjcv8v1tQn1GtKw+200SL+eIy4ZZt2EIrNuRavMLUAYE11BdUl5WhWqqr/DmQn1BrAb6AmcC6wHblTVNaGuaE0T7uyrpbuhVIs4tOFjfnr5AfYuec41WEid+qRcPpDU+1/m5AtvRmrVDviBH4rMuoGuMjfGREewLYxsnE2N7lbVz4HPoTiXlKhq2V1yjKtwZ1/17Qo6fmAPO98aR8GuH1zLSlIdGnfrS8Nut3CsVr3i48E88Pump5K5ZS9vfP5jcSLBW7oE1yoI54ZRxpiqCzZg/ILLXheqejQ01Ykf4X44+nYRJZ50inuhxFo0uKA3p19+O0/d3QOofCr1BVm5vL0mtziBYKEqb6/JpWurRiWuUV43XLg2jDLGhEawAWM50D4cFYk34X44Du/Vrng8QCSBlB4D2fX2JAASEhM5rXMvane9lTPOaFXioV3Z1k0gLaaKxih86+xV3WdD2WC9qUmCDRiTgeUicqGqfhGOCsWLcDwcv/zyS1q2bEmjRo3KJBhs06Unp25cTHqHs5kwYQJnn312lT+Dr0BaTBUFlXBtGBUtwaSKN6Y6CDZg3AosA94XkYdV9e9hqFNcCOXDMScnh3HjxjFnzhyeeOIJpk6dWvwevtc7+tgK6tSpaKF+5QTSYgokqNSk2VC2S6CpaYINGMM5sTXBTBGZBizCGfzOBNbbvhiBq+rDccuWLUycOJFXX32VoqIiAJ577jkeeeQRmjZtWqZ8uIIFBNZiircxChvENzVNsCu9r8QJGv/AyVrbGBgC/D+cgHFARFaLyP+EtJamhB07dvDQQw/Rtm1bZs6cWRwsAA4fPky7G4fS+olFnDXyPUYv8J8SJJR8V2YLTgLB0ivEQzH1tjrxFwhraoA0NV+wK70/Aj7y/i4i9YBOQGefr05AF+C3oaqkcezdu5fp06fz3HPPkZ/v/ldqUpMzqXtmZ8CZqeTd6jSjb8ew16+iFlNNG6OoSE0cxDfxTUK9QZ6I1AbOU1X3TZ7DqGvXrpqZmRnptw0b7wybbT/vQb9+jz2r3ubwwQOuZdu2bUveOf2p2+4SSueGTBRh09TekaiyKcVmSZnqQETWqGrXisoF1MIQkfHA/cCpwBac3fb+pKpldsdR1WNAxINFTbMgK5cRczLZtfpd9q16i6JS+1N4tWrVinHjxjFw4EDajF7qWqbQds2Nmpo0iG9MhQFDRIYAY30OnQVMAC4Ebg5lZUQkGWipqt+G8rrV0fTFG/j+pd9xfO821/Onn346o0eP5je/+U3xYHaiiGtwSPS3jZ4xxgQhkEHvYcAxnKSDLXCy0a4FbhSRX4eiEiJysogsAH4GHvc5/oiIbBWRHBG5PhTvVV1s33+Meu3KblyYUPckpk2bxqZNm/jd735XYubTHd1alilf3nFjjAlGIF1SZwHzVPUNz+8/icg1wEbgbuCtENSjCHge+BfQHUBEzgIexElq2BJn7Ucrt26wmqh5SjLHL+rPwbWLKDp6CKmdzMld+9LumtsZMaKP62u8A9u++Zzu6NYyIgPexpiaL5CAcQpOcCimqnkisgintVFlqnoQ+EBE7vE53A+Yq6oHgA0
i8gPO7KtVoXjPWKCqjHvhdeZ/tZNDjdqVGBR1Ztgco+Elt1N4cC8nd7+Vkxo2YuTN5T/8M/p2tABhjAmLQKfVFrkc24ozCB4uLYGvfX7fBjQrXUhEhgJDAc4444wwVie0li9fzrBHhrMh63OSTmtFs8HPuaaOeKp+bZthY4yJCZXZQMnrOJAUqoq4qE3JQFUElNlnVFVnADPAmVYbxvqUUNnpkmvWrGH06NEsWbKk+FjB7i0c/s9y6nf4VZncShYgjDGxItCV3mNEZL2IvCwiQ0XkAqoWbAKxHfB9WrbAJbV6NFRm86MNGzZwyy230LVr1xLBwitvxT9QdeKjpY4wxsSiQALGB8A+nMHnIcDfgDXAKAAReVpE7hKRc6X0irGqWQTcLiL1PDv9NQLWhfD6lRbMznDff/89d999N+eddx7z5893vV7dM7twWp/HixfcWeoIY0wsqrCVoKrXAIhIGtDV5ysdaAg8hpOQEOCoiHwNZKnq/YFWQkQaAFlAA6CuiPwK+A3O7n7ZwBHgPg31svRKCiSpXG5uLhkZGbz88sscP+6ej7FDejcOdrwVaXZO8TFLHWGMiVUBdyup6vfA98Bc7zERaUvJIHKB53sXnJXhgV77ANDG5dSHwJRArxMp5WVd3b17N9OmTeOFF17gyBH3HWu7dOnC5MmTufbaa3ln3U+WOsIYUy1UaRzCsyL7W5zstYiIAOfgBIwaq7ykcvfeey8LFy50fV2HDh2YNGkS/fr1Qzyrr21g2xhTXYRyzAF1bFDVWaG8bqwpL5X3qFGjypRPS0vjtdde46uvvqJ///7FwcIYY6qTkGerjaZYyVZ78803s3DhQlJTUxkzZgxDhgwhKSmcM5CNMabyQpqt1pxw/PhxZs2axaZNm8jIyHAtk5GRQc+ePXnggQdITrYZT8aYmiHuA0agC/CKioqYN28eY8eOJScnh4SEBAYOHEi7dmVnNHXs2JGOHS09hzGmZgnpGEZ1E8gCPFVl0aJFdO7cmQEDBpCT46y1KCoqYty4cVGquTHGRF5ctzDKW4DXNz2Vjz76iFGjRrFy5UrX1y9YsIAdO3bQtGnTkNTH29rJzcsv3tsi1abaGmNiRFy3MPwtwNu8YR3XXHMNV1xxhWuwSEhIYMiQIeTk5IQ0WHhbO3Bil7xA0o4YY0wkxHULo/QCvGO7fiBv+Wzyv1vFdj+vGTBgABMmTHAdu6gKt9aOl2+rxxhjoiWuA4Z3Ad7+nT+St+J1Dm/4hBNZTkq66aabmDRpEp06dQpLXSpKOGgJCY0x0RbXXVLeBXh1fvqSwxs+xi1YXHnllaxcuZKFCxeGLVhAxQkHLSGhMSba4jpggBM0vpn337RsWXLf627duvH+++/zwQcf0L1797DXY3ivdiQnJbqes4SExphYEPcBA6BOnTrFU2Q7duzIwoULWblyJVdddVXE6uCbbgQg0ZM+xDftiDHGRFNcj2H4GjRoEI0aNeLmm28mISE6cdQSERpjYpkFDI9atWrRr1+/aFfDGGNilnVJGWOMCYgFDGOMMQGxgGGMMSYgFjCMMcYExAKGMcaYgFjAMMYYExALGMYYYwJiAcMYY0xALGAYY4wJiAUMY4wxAbGAYYwxJiAxHzBE5DYR2SwiG0VkSLTrY4wx8Sqmkw+KSAPgv4HuQCGwTkTeVdVd0a2ZCcSCrFyeWprDT3n5NE9JZnivdpaN15hqLNZbGL2Aj1U1V1V3AMuAyG1SYSptQVYuI+evJzcvHwVy8/IZOX89C7Jyo101Y0wlxXrAaAls8fl9G9DMt4CIDBWRTBHJ3LXLGh6x4qmlOeQXFJY4ll9QyFNLc6JUI2NMVcV6wKgNFPn8XoTTNVVMVWeoaldV7dq4ceOIVs7491NeflDHjTGxL9YDxnbAt9O7BfBjlOpigtDcs9VsoMeNMbEv1gPGUqCXiDQRkabAJcC/o1wnE4DhvdqRnJRY4lhyUiLDe7WLUo2MMVUV07OkVPVnEXkSWOk
59AdVPRTNOpnAeGdD2SwpY2oOUdVo1yFkunbtqpmZmdGuhjHGVCsiskZVu1ZULta7pIwxxsQICxjGGGMCYgHDGGNMQCxgGGOMCYgFDGOMMQGxgGGMMSYgFjCMMcYExAKGMcaYgFjAMMYYExALGMYYYwJiAcMYY0xALGAYY4wJiAUMY4wxAYnp9OaxbEFWrqXuNsbEFQsYlbAgK5eR89cX71mdm5fPyPnrASxoGGNqLOuSqoSnluYUBwuv/IJCnlqaE6UaGWNM+FnAqISf8vKDOm6MMTWBBYxKaJ6SHNRxY4ypCSxgVMLwXu1ITkoscSw5KZHhvdpFqUbGGBN+NuhdCd6BbZslZYyJJxYwKqlveqoFCGNMXLEuKWOMMQGxgGGMMSYgFjCMMcYExAKGMcaYgFjAMMYYExBR1WjXIWREZBewJdr1qKLTgN3RrkQMsftxgt2Lkux+nFDVe9FKVRtXVKhGBYyaQEQyVbVrtOsRK+x+nGD3oiS7HydE6l5Yl5QxxpiAWMAwxhgTEAsYsWdGtCsQY+x+nGD3oiS7HydE5F7YGIYxxpiAWAvDGGNMQCxgxAARSRaRttGuhzHGlMcCRhSJyMkisgD4GXjc5/gjIrJVRHJE5Pro1TByRKSuiMzwfOYtIvKo53jc3QsAEUkQkf8TkW89n72X53hc3g8AEaktIhtE5GXP7/F8L34QkY2er+WeY2G/H5bePLqKgOeBfwHdAUTkLOBB4FygJfC+iLRS1YKo1TIy6gNLgfuBU4FsEVlLfN4LAAXuVtXtInIdMFlENhK/9wNgFPADxPW/k2Kq2sb7c6Tuh7UwokhVD6rqB8Bxn8P9gLmqekBVN+D8A+kSjfpFkqruUdW31bEb+BHoQRzeCwDPfdju+bUV8CVx+v8GgIicA1wIzPUcitt74UdE7ocFjNjTkpLpTbYBzaJUl6gQkfOAujjpDuL2XojI4yKyB3gUmEic/r8hIgI8Bzziczgu74WPfBHZJCKrPN2VEbkfFjBiT22criqvIqAwSnWJOBE5DZgFDCbO74WqTlfVU3G6YpYSv/djGPCRqm70ORav9wIAVT1HVc8ChgOvE6H7YQEj9mwHfPd+bYHTPVPjicgpwLvAKFX9gji+F75UdT5wEvF7PwYCt4vIOpyWVj9gB/F5L0pQ1eU43U8R+X/DAkbsWYTzj6Oep9+2EbAuynUKOxE5GVgITFbVxZ7DcXkvAEQkTUSaen6+GDhCnN4PVb1EVTuq6gXAWOCfOBNF4u5eAIhIfRFp5vk5Hafr6QMicD9sllQUiUgDIAtoANQVkV8BvwFmA9k4D4n7ND6W4z8MdAaeEZFnPMeuJT7vBUAKsEREEoGdwABVXSMi8Xo/Sojze1EP+Njz/8Y+4C5V/TQS98NSgxhjjAmIdUkZY4wJiAUMY4wxAbGAYYwxJiAWMIwxxgTEAoYxxpiAWMAwxhgTEAsYxhhjAmIBwxjjl4icISIqIvOjXRcTfRYwTLXi2VRIPV8Pl1PuFZ9yr0SyjjVMZ8/3tVGthYkJFjBMddOZE/uHnO9WQES64WS79WbrzIxAvWoqb8BYE9VamJhgAcNUG55dxRoBq3G2tS0TMEQkAXgB2MWJQGEBo/K8m/BYC8NYwDDVSlfP9zU4SRvP9QQIX/fjPOQeB84CCoCvSl9IRG4RkcUisltEjonIdyIyypPQrXTZ/xKR1z37ax8QkV9E5AsRGexWSRG5XETmeza4OSIiO0VktYhMKVVulKfLrJ/LNVqVHjsQkSs8x54SkYtE5B0R2es5dm4VPl8tz37QX4lIvjh7qj/u2bioM5Crqj+7fVYTXyxgmOrEN2CsxcnaeZb3pGfzpcnASuBjnB371qvqUZ8yiSLyBjAPaAO8BfwPzoYzk4GZvm/oySj8dyANWA78FZgPnAnMFJERpcqPAj7BCVofAH/G2eOjDnBdqc9TXneP97NmuZQ/z1OXIuBF4B/Afyr5+WoD7wHP4HTh/dVT7wnADKA
p1rowXqpqX/ZVLb6ADwHF2ej+Vs/Pt/icfxnnoZcO3OI5/2Kpa/zVc3wqUMvneBLwqedcB5/jJwFNXerSDDgA/Mfn2Ok44yvLgdourzmt1O+bgV1+PutUT11u9Dn2uufYAaC7n9cF+/le8hwbgyd7ted4D89xBcZH+7+9fcXGl7UwTLXg6R5JBw7j/DXt/av3fM/5bsAQnACRxYm+90yfa3QDfgu8o6ojVdU7eI6qFuC0JAC6+Rw/qKo7StdHVbcDP+GMqXi1BxKBb1X1mMtrdvvUpRHQGv+Dyd4Whu9f994WxiOquqr0C4L9fCJyEXAf8ImqTlJV9Sn/CfCN51cb8DaAbaBkqo+2QEPgM1UtBL4XkTzgfJ+B7j3AaE957wPXd8D7IUCAwyIy3uU9zvN8F+8Bz7axvwNuANoBJ1OyK9e3yygbZ0ObISLSGKdF8G9V/cXlvbwPf38D8p2Bnar6k6ce9XHuwU7gVT+vCfbzPeT5PtbP9fZ4vluXlAEsYJjqwy0ArAM6AkNxWhS/UdW9nnOdgaPA1z7lr/V8v6OC99oKICLnA//G6WpaDbwJ7MUZSD8TuBv40vsiVd0tIpcB44DewE1AoYj8H/Ckqvo+eL0toDJ/vYtIGk7LZYnP4U44gWqRqhb5qXdQn89Tfg/OmIubNOBnVc2t4HomTljAMNWF74C3VxbQE5gCfAG8AiAiZwKnAqs9XTGISF2gMU73S88A33MWzlapV6jqR74nRGSi58cSLQRV/Rr4tWcwuQdOMPs1cKGIpOqJAfh0z3e3v957+3w+L2+L5HO3igb7+TzlmwBZvl1RPucvAZoDi0ufM/HLxjBMdeEWMNbidK+kAL/zefCVGb/gRDfMaYG8mYi0xBkf+cglWKTgLAwsXZ9iqnpMVd9X1duAFTgB7HSfIu2BAlXdUuradXCmBns/n1dFXVhBfT6cyQGFOEHDzQSXOpg4ZwHDxDzPGMUFwCFODMSCMx20H3Clqq72OV6m+0pV83HWY3QQkf5+3ucyn3UKRzzf00QkyafMqcAcoAXOjKh1nuPpnoWFpa/ZBmfsYCuwzefUMSBJRNr6lK2PM+3VO9ZQuoVxDFjvVvdgP5+n5fUdkCoiN5UqNwK42vOrDXibYtYlZaqDc3Cmt37q23/vGa9Y4FLerYUBMBxYBLwtIu/jPGATgFTPa5JU9QzPtXeJyDLgSuBzT/lmwPU403uLgA2q6g0sDwODRGQ1zuD3Tpxxjj6e80NKjT0sBS4EPhaRf3o+31WeOm3HWWPyPRS3OjoAX7nNvqrM5/OYijNz6m0ReRPYAfwKZ1zoR6Al1sIwvqI9r9e+7KuiL5zBZQWeDbD8Hpzpt4ku5y7EWdS2A2fwejfOX+0vAleVKtsYeA0nzcgB4DNPXdI99XnFp2xfnDGPHGA/TmvgB5y1IWe71KMu8CzO1NzDOMHtfpzutSKcrjBv2S64rCnx89kD/nye8o/gBKYCnHQrb+O0ZrYAu6P9396+YutLVMuMdxljjDFl2BiGMcaYgFjAMMYYExALGMYYYwJiAcMYY0xALGAYY4wJiAUMY4wxAbGAYYwxJiAWMIwxxgTEAoYxxpiAWMAwxhgTEAsYxhhjAvL/ARaWPc4EJdAvAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots()\n", "ax.scatter(y_test, predicted)\n", "# Dashed diagonal marks perfect prediction (predicted == measured)\n", "ax.plot([y.min(), y.max()], [y.min(), y.max()], 'k--', lw=4)\n", "ax.set_xlabel('$Measured$', fontsize=20)\n", "ax.set_ylabel('$Predicted$', fontsize=20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Cross-validation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# k-fold cross-validation\n", "\n", "In k-fold CV, the training set is split into k smaller sets (other approaches exist, but generally follow the same principles). The following procedure is repeated for each of the k “folds”:\n", "- A model is trained using k-1 of the folds as training data;\n", "- the resulting model is validated on the remaining part of the data (i.e., it is used as a test set to compute a performance measure such as accuracy).\n", "\n", "The performance reported by k-fold CV is the average of the k per-fold scores. For a regressor such as `LinearRegression`, the default score is $R^2$, which is negative whenever a fold is predicted worse than by a constant equal to the mean of its targets." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:21:10.344979Z", "start_time": "2018-04-29T07:21:10.333153Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "-1.5787701857180245" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection\n", "from sklearn.model_selection import cross_val_score\n", "\n", "regr = linear_model.LinearRegression()\n", "# 3-fold CV; each entry of scores is the R^2 on one held-out fold\n", "scores = cross_val_score(regr, boston.data, boston.target, cv=3)\n", "scores.mean()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:22:09.298535Z", "start_time": "2018-04-29T07:22:09.294530Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function cross_val_score in module sklearn.cross_validation:\n", "\n", "cross_val_score(estimator, X, y=None, scoring=None, cv=None, n_jobs=1, verbose=0, fit_params=None, 
pre_dispatch='2*n_jobs')\n", " Evaluate a score by cross-validation\n", " \n", " Read more in the :ref:`User Guide `.\n", " \n", " Parameters\n", " ----------\n", " estimator : estimator object implementing 'fit'\n", " The object to use to fit the data.\n", " \n", " X : array-like\n", " The data to fit. Can be, for example a list, or an array at least 2d.\n", " \n", " y : array-like, optional, default: None\n", " The target variable to try to predict in the case of\n", " supervised learning.\n", " \n", " scoring : string, callable or None, optional, default: None\n", " A string (see model evaluation documentation) or\n", " a scorer callable object / function with signature\n", " ``scorer(estimator, X, y)``.\n", " \n", " cv : int, cross-validation generator or an iterable, optional\n", " Determines the cross-validation splitting strategy.\n", " Possible inputs for cv are:\n", " \n", " - None, to use the default 3-fold cross-validation,\n", " - integer, to specify the number of folds.\n", " - An object to be used as a cross-validation generator.\n", " - An iterable yielding train/test splits.\n", " \n", " For integer/None inputs, if ``y`` is binary or multiclass,\n", " :class:`StratifiedKFold` used. If the estimator is a classifier\n", " or if ``y`` is neither binary nor multiclass, :class:`KFold` is used.\n", " \n", " Refer :ref:`User Guide ` for the various\n", " cross-validation strategies that can be used here.\n", " \n", " n_jobs : integer, optional\n", " The number of CPUs to use to do the computation. -1 means\n", " 'all CPUs'.\n", " \n", " verbose : integer, optional\n", " The verbosity level.\n", " \n", " fit_params : dict, optional\n", " Parameters to pass to the fit method of the estimator.\n", " \n", " pre_dispatch : int, or string, optional\n", " Controls the number of jobs that get dispatched during parallel\n", " execution. 
Reducing this number can be useful to avoid an\n", " explosion of memory consumption when more jobs get dispatched\n", " than CPUs can process. This parameter can be:\n", " \n", " - None, in which case all the jobs are immediately\n", " created and spawned. Use this for lightweight and\n", " fast-running jobs, to avoid delays due to on-demand\n", " spawning of the jobs\n", " \n", " - An int, giving the exact number of total jobs that are\n", " spawned\n", " \n", " - A string, giving an expression as a function of n_jobs,\n", " as in '2*n_jobs'\n", " \n", " Returns\n", " -------\n", " scores : array of float, shape=(len(list(cv)),)\n", " Array of scores of the estimator for each run of the cross validation.\n", "\n" ] } ], "source": [ "help(cross_val_score)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:25:40.617010Z", "start_time": "2018-04-29T07:25:39.304291Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYMAAAD+CAYAAADYr2m5AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3XmUXHWZ//H3kw4haQjDksaErEBkC2BCOpgwKtEggcBhwJ8IGNCwBUTFYYAZNc64Rn8/BwHDIBBEkpA+hEUUIWwOSwSVgQ6LCrI0kA0J0xBBSCeQpJ/fH98qu7qrbu1r38/rnDpVfW/dut+6J7lPfZ/vZu6OiIjE24BaF0BERGpPwUBERBQMREREwUBERFAwEBERFAxERAQFAxERQcFARERQMBAREWBgrQuQr2HDhvm4ceNqXQwRkYaxcuXKN9y9JZ/3NkwwGDduHO3t7bUuhohIwzCz1fm+V2kiEREpPRiY2WfM7BUz6zCzM/rsW2Rmryb2dZjZmMT2w83shcRx80otg4iIlKakNJGZDQV+BEwFtgFPmdkd7t6Z8rbZ7v5QyjEG/BT4P8BLwJNmttzdnyqlLCIiUrxSawYzgRXu/qq7rwceAGbkOGYy8Lq7/8HdNwK3AkeVWA4RESlBqcFgNJDaQLEOGJHy9xZgsZk9Y2YX5nnM35nZXDNrN7P2zs7OTG8REZEyKDUYDAK6U/7uJqSLAHD3s919LOGX/9lmdkSuY1K5+0J3b3X31paWvHpH1UZbG4wbBwMGhOe2tlqXSESkIKUGg9eAkSl/jwLW9n2Tu68F7gQOzPeYhtHWBnPnwurV4B6e585VQBCRhlJqMLgXmGlmu5vZcOAw4L7kTjMbn3jejVA7eBx4FNjXzPY1sx2ATwG3lViO2pk3D7q6em/r6grbRUQaREm9idz99UTX0N8nNl0IHGlme7v7JcACMzsAeA+4wt1/C2BmZwJ3EFJG/8/d8x4YUXfWrClsu4hIHSp5BLK7LwIWReybFbH9HmCfUs9dc93dsNNO8Pbb6fvGjKl+eUREiqQRyMX629/ghBNCIGhq6r2vqQnmz69NuUREiqBgUIjUXkO77QZ33AE//jEsXgxjx4IZ7LILbNsGW7fWurT5UU8oEUHBIH99ew1t3QqDBoWgMHs2rFoV0kZvvAEf+xicfz6szdJJqh5uwuoJJSIJ5u61LkNeWltbvaazlo4dm7lReOzYEAhSvfwyHHwwHHYY3HtvqDGkSt6EU3shNTfDwoUhsFTLuHEhAPSV6TuJSMMxs5Xu3prPe1UzyMdzzxXWa2ivveCSS+DXv4ZrrknfXy/dUdUTSkQSFAz6Sk3fjB0LZ5wBhxwS/s4kqtfQOefAJz8JX/kKjBzZOx1ULzfhPfbIvF09oURiR8EgVd8c+po1cP31sPfesGBBSOWkam6O7jVkBsceC++/D3/5S09O/rTTwutMqn0TPvDA9G3ZvpOI9FsKBqkypW8gdCP94hdDTj/Za2js2Nw5/ksvTd/mDjvsAEOG9N4+ZEh1b8Lr18OKFaGxe5ddwraRI6vfbiEidUHBIFVUmibZKyi119CqVblvmlGf19UF117bE1gAjjuuujfhSy4JtZbrroPly8O2K69UIBCJKQWDVFFpmmLTN9k+LzWwHHYYPP10dPqo3Do74aqr4JRTYPx4mDgRBg6Exx6rzvlFpO4oGKSaPz+MHUhVSg59/vz82hlOPz30WPqf/ynuPIW67DLYtKmn99KQIXDQQfD449U5v4jUHQWDVLNnh1/JTU35twvk+rx82hk+85lwQ77++tK/Qy4bNsB//ReceCLsv3/P9ilToL29erWTUtXDoD2RfkTBINXWrfD88zBnTv7tArnk086w007w6U/DsmXhF3slLVgA77wD3/hG7+1TpsBf/wovvVTZ85eDRk6LlJ2CQapHHw0Tzx1VgyWZ58wJvZZ+8YvKnePtt+Hyy8MEewcd1HvflCn
hOZ92g1r/Kq+XQXsi/YiCQap77gkpoiOOqP65p08PN9ZKpIqSN++ddw4BYfLk9PdMmBBSVbnaDerhV3m9DNoT6UdKDgZm9hkze8XMOszsjD77vmJmfzazVWZ2g5kNTGx/KLGtI/FoyvzpVXbPPTBtWrhpVtuAAfD5z8P995f3ppZ68076/vfTb94DB8KkSbmDQT38Ki93ry8RKS0YmNlQ4EfARxKP75tZ6sr17wAfAsYDHwBOStk33d3HJx7bSilHWbz+OqxcWZsUUdLnPx9+bS9ZUr7PLOTmfeih8MQT2affrodf5fPnp68hoZHTIiUptWYwE1jh7q+6+3rgAWBGcqe7/8zd33f3rcAfgF1LPF/l3JdYuvnoo2tXhj33DOmiRYvK16unkJv3lCmhAfvZZ6M/L+rXd9Q8R5Vw/PEhGOy4Y/i7qSlMCKgBcyJFKzUYjAZS50BeB4zo+yYzawaOIax7DGFN5AfN7EkzOzXqw81srpm1m1l7Z2dniUXN4e67YffdQ9fSWjr99NCj5+GHCzuub6Pu1VfDl75U2DxI+TQiz5+fPiU3hMbvb3wjumG5nI3Ot98eRk8vXx4+Z9u26gYjkf7I3Yt+AP8GfDfl7/8LnN/nPQOAW4HzMhx/APAasG+uc02ePNkrZutW9912cz/ttMqdI1/vvus+dKj7nDn5H7N0qXtzs3u49fc8zNyPOMJ9yJDe25ubwzF9dXe777yz+9y50ef605/CZ+yyS/j8sWPd//M/3UeMSD9/8jyZyhdVhnwcfbT7mDHu27a5d3WFMp9ySnGfJdKPAe2e5/281JrBa8DIlL9HAX9f3svMDPgp8Ky7/yRDIHoW+C2wf999VbVyJbz5Zm1TREk77BCmzF68OP9f0VET7A0fHtZUSJ0HKdtAOjNobc3eiHzLLeF9zz7bM3biootCA3RfXV1w5pmh22y5Gp1ffz2k9E49NVyfIUPC69tuCwPqRKQopQaDe4GZZra7mQ0HDgPuS9n/E2C9u/9H6kFmNj7xPBb4MPBUieUozd13hxvcJz9Z02IA4cb/6KM9v6Hz6boZ1S6wfn14LmSCvUMPhT/+ETZvTt/nDjffDIcfHgJNqnXrMn/ee+9FN0ivWRPKVEgK6aabQloo9TuceWY4jwadiRQv3ypE1AOYA7yUeJyQeFxE6F3UDXSkPE5JHPNH4BXgGeDEfM5T0TTR1Knuhx5auc8vxNix6ekWCNvLeUyUX/wiHPv736fv+8Mfwr6f/KSwMkTtA/fRo9233z7/FNKUKe6HHJK+ffJk94MPDqkuEXH3wtJEJQeDaj0qFgzeeMN9wAD3b36zMp9fKLPMN02z6GOWLnXfbrvy5OTXrQvH//jH6fu+8Y1wrdavz1yGqHaBqH3nnec+cGD+gey558K+Sy9N33fVVWHf448X/p1F+qlCgoFGIP/61yFVUcvxBamKGVD12c+GnlCDBpU+wd7IkTBiRHq7gSdSRNOnwwc+kH5ctkn5ovZdeWVI+WSSKfXV1hZSSSefnL7vlFNC+8FPf1rwVy5araflECmnfKNGrR8Vqxl8/vPuu+4aehTVg0y/oocMyf4r/3e/C+9buLA8ZTjuOPd99+297amnwjmuvro850jKN8XV3e2+557uRx4Z/Vmf+1zoifXuu+UtYybl7iElUgGoZpCn7u4wBcWRR6aPaK2Vvr+iIZQv26/8a68NvZAy/WIuxpQpYfbWt9/u2XbzzeEafepT5TlHUqY1HwYNSh9N/PvfwyuvhJ5DUc46K8zIeuut5S1jJvUwLYdIGcU7GDz9dOiqWC8poqTU3j/HHx8GoGXqOgphsNdNN4U0ydCh5Tn/oYeG5/b28JxMEX3iE9DSEn1cMfoGv+23D9+776yqS5eGoHHCCdGf9ZGPwD77VCdVVA/TcoiUUbyDwT33hOeZM2tbjmwuuCD0n7/hhsz7b7wxBIqzzy7fOVtbw3Oy3eCpp6CjIyzCUwmpwW/1ahg2LCy+8847Yf/
774eAd/zxPVNQZGIWupk+8kho+6hkLl+T5Uk/E89gkGz4+/rXYbvtwkyh9eqjHw2D0C6/PNws+7r22vArOjmVRDnsuivsvXdPMEimiLL9Ki+XD3wgLPLT0RHGV7iHcSAbNmRPESUlg8Vf/lLZKba/+930aTmGDMk9WZ4anaVe5du4UOtH2RqQG7Hhb+nSUM7ly3tvf+KJsH3BgvKf8+STwxiA7m73vfZynzmz/OfI5vvf979PewGhS+uSJbmPK3bMxdKl4T3JKTZy/Xu4/vrwuS0tPd2Bjz469zka7d+eNDQ0ziCLcg7Qqpb33nPfY48wz1CqL3zBffBg9w0byn/OH/0oXJc77wzP111X/nNkc8MNIQAUeuMsdpxGITfpLVvc9947DH5LDnI788ww1uOFF6LP04j/9qShKRhkU8zNoh784AehnE8/Hf5+9133nXZyP/XUypzv4YfD+Q48MAwMe/PNypwnSrE3zmqM4E7WCn71q55tr73mvuOO7scfH32eRv23Jw2rkGAQvzaDRm34mzs39Ka5/PLw9623hp5EZ51VmfO98EJ4/tOfwiR0d99dmfNEKba3TqauqrkWvinkXFu3wve+F9pxjj22Z/vw4fC1r8EvfwkPPpj58/rO55Q0enR02USqJd+oUetHrNsMks47z33QoDAdxD/+o/s++1RmLp56uEalpFSWLg3tHfmWe8yY/M+1aFHYd/vt6fu6usJnTZyYPojxhRdCTS5T7eCQQ9w3b879vUQKhNJEOSxd6v4P/xC+/pgxjREI3N2ff77nBgdhHv9KlL0ectvlCEgXXBBSXOvWZX/fWWelf1cz9yuv7P2+ZFvBpEnRQfjGGz2tjWX16vDvbNgw9x/+sHdD9ec+539vfO7qyv+7ieRBwSAfF18cGl8bydKl7k1Npd0g81Evue1Ce/j09fLLoRH661+Pfs+WLe7jx7uPGxdu2Gbuw4eHxuBp09w3bep57+LF4Tr88pfRn9fdHY7baadQOzELAWnIkND7K5NrrgnvmzCh55hivq9IHwoG+fjiF8OcRI2kWr/Y66FmUC4nnBBWsdu4MfP+5C/5n/+89/ZbbgnbTzst3OCTQWPixNypuW99K/3abb999pv73LnpxzRK+rLUoC0Vo2CQj9NPdx81qryfWWnV+sVeD20G5bJiRSh/pkn8tm1zP+gg9/33D6/7+s53/O/puOR1+Od/zn3OWq9JUU396d9KP1RIMCi5N5GZfcbMXjGzDjM7o8++A83saTNbbWZXmNmAxPbDzeyFxHG1mdlr48YwuVsjqVZPqGzTUTeaj34UJk0KvbDce++7886wqtvXvhZGBPe1115h5PVbb/VsW7gw96jhYnpCNepcR5qwr//IN2pkegBDCWsejwSGA+uBlpT9vwGOBpqAFcDxgAEvAgcDOwAvABNznavsNYNjjw0NgY1Ev8KKk8z133tvz7bu7rC63Z57hhRQJvU81qFe1Ev7kmREFWsGM4EV7v6qu68HHgBmAJhZC7Cnu9/t7tuANuAoYDLwurv/wd03ArcmtldXV1d6f/R6159+sVfTSSeFOY+SYzQgzEf12GPwb/8WxlFkUs2xDsUcUw8addyOpCk1GIwGVqf8vQ4YkXg9CliTYV+2Y3oxs7lm1m5m7Z2dnSUWtY9GTBNBYYvbS7D99nDeeWHg3HPPhW3f/35Y0W3OnOjjir3RFRO0k8ckB6YNG9YYgX7+/HB9UzVCEJM0pQaDQYRF75O6gW059mU7phd3X+jure7e2lLuefQbsWYgxTv33LBozoIFYaGcBx+Eiy5Kv5GlKuXXejFBe/ZsePnl0E5x7rn1HwgglPHjH+/5e8cdGyOISZpSg8FrhPaCpFGENoRs+7IdUz0KBvGy++4wdSpcfTUcdlhoMN555+zH1CItN2QI7LdfWEOiUaxaFdYEOeqoMC23AkFDKjUY3AvMNLPdzWw4cBhwH4C7rwE2mtl0M2sCTgNuAR4F9jWzfc1sB+B
TwG0llqNwXV2NmSaS4rS1hTaCZI+i7m748pdz9wyqRVpu4sTGCQYvvxxSb7NmwbRp8MwzvZdLlYZRUjBw99eBecDvgd8CFwJHmtlFibd8HrgCWAX8xt0fcff3gTOBO4BngCvcfXXfz664jRtVM4iTefNg8+be2+q1C+TEibBuHbzxRmU+v5wL7CQnMJw1K9S83HsWRZKGEtGNIn/uvghYFLHvCeCgDNvvAfYp9dwlUZooXhqpH//EieH56adhxozyfnZbW5gBNzk2ILkSHBRX67nrLhg/PjyGDQvbHn0UjjiiPOWVqonfFNYAW7aEqYiVJoqPRuoC+aEPhedKpIrKOUhs0yZ44IFQK4DQBnPAAaGBXhpOPIPBxo3hWTWD+GikfvwtLTByZGWCQTlrSA89FFJvyWAAIVX06KPpo72l7sUzGCR/GSkYxEejDdirVCPyiIxDeoqrId11V+j9dPjhPdumTYMNG6Cjo7jySc3EOxgoTRQvjTRgb+JE+POf0xu9S/HMM/Duu+nbi6khuYdgMGMGDB7cs33q1PCsVFHDiWcwUJpI6t3EibBtW7iBZ5Nvz6A//CEMDmtuhh/+MNSMIBx3zTWFB8YXXwzdSlNTRAD77w9Dh4ZUkTSUeAYDpYmk3iV7FGVLFSV7Bq1eHX6pJ3sGJQNCaqCYNCl0nFixAi6+ONSMrr8+1JIOPrjw8t11V3g++uje25ua4MMfVjBoQAoGIvVor73C1A7ZgkFUz6AvfjHMxXTWWT2Bors7pJxSxwAku63ef3/h5bvrrlALGDcufd/UqaEmkqyBS0OIZzBI/iNVm4HUqwEDQhfTbMEgqgfQ22/DVVeltzds3ty7C+no0bDPPoUHg3ffDTWMvimipKlTQ4qrvb2wz5WaimcwUM1AGsHEiWHgWXd35v3Zxk6YZd7XN4DMmBFu7Fu25F+uBx6A99/PHgxAqaIGo2AgUq8mToR33oFXXsm8f/78kKNP1dwcpufOd5DdjBnhl/5jj+VfrrvuCimsj3wk8/7ddoMPflA9ihpMPIOB0kTSCHI1Ih91VKg17LRT+tiJfAfZffzj4dh8U0XuYT6iT34yTAkeRYPPGk48g4FqBtIIJkwIv/yjgsFtt4Wb7YMPpo+dyHeQ3a67wiGH5B8Mnn02pJqiUkRJ06bB66+HMklDiHcwSB0sI1Jvcq1tcOONoQF40qTM+/MdZDdjRkjp5Or909bWkxr61reyz3aqdoOGE89gkJy+ekA8v740kKhpKV57LcwNdPLJ0Y3F+ZoxIzQgP/xw9HuSYxreeiv8/eqrvcc09HXQQeH/mIJBw4jn3VDTV0ujiFrb4OabQ4ro5JNLP8dHPhLy/9lSRYXOdjpwIEyZEt2IXM41FaQsFAxE6lnq2gapli0L4xD237/0czQ3h6VAswWDYmY7nToVnnwyTHWdKtfIaamJooOBmY00s0fMbK2ZLTOzwX32TzKzh82sw8yeNLMPJbbPMbMNie0dZnZiqV+iYFryUhpFprUNXnklpF9OOaV855kxI5wjanW1PfbIvD3bbKfTpoV1Q558svf2cq6pIGVTSs3gB0Cbu48GtgLn9tm/O3Cau48HLgMuSdm3wN3HJx63lFCG4mjJS2kUmdY2WLYsPJ90UvnOM2NGT8+kTEaOTN+Wa7bTD384PPdNFTXSqnMxUkowOBZYnHi9BDgqdae73+vuqxJ/PgHsWsK5yktpImkkfRuRly0Lv7ozzQtUrClTwmyjmVJF998fBqV9+tOFrQcxfHgoY99G5F0jbgX1uOpcjBQVDMxsF2CTuyfreuuAiFUzADgNuDXxuhuYa2Yvmtk1ZhaZrzGzuWbWbmbtnZ2dxRQ1M6WJpJGkrm3w7LNhErhypoggNPhOn54eDLZsgS9/OUycd8MNha8HMWIE/OIXoaF4zJhwjjffzNyT79hjS/8ehVAjdi85g0Hiht2e+gAOJtzUk7qBbRHHnwBMI5Emcvcl7r4HMBHYGfha1LndfaG
7t7p7a0tLS95fKieliaSRpK5tsGxZuHmdWIGmthkzwgplq1f3bLviihCILr+88HE5bW1hsrpt20IKau3aMA/S8cfDokU9tYzRo2HffcO6CsmpsStNjdjp3L3gByGIvAMMSvx9BHB7hvfNBP4H2CXic44FfpnPOSdPnuxls+ee7qeeWr7PE6mkF190B/drr3UfP959xozKnOePfwznue668Pdf/uI+dKj7rFnu3d2Ff97YseHz+j7Gjk1/79tvux9yiPvgwe6/+U0p36L8ZWtgQLvneV8vKk3k7t3AQ8BnE5vmAL0ags3sE8B3gGPc/a8p2/e2YCBwElDADFllojSRNJLk2gY/+1n45V6OsQWZTJgAH/hAT6roq1+F994LtYJiBrYV0lC8005wzz0hXXPkkSG9VMn0jRqx05TSgHw+cK6ZrQM2Azea2VAzW25mTcBCYCzwaKILaTJYnAKsAZ4HuoAflVCG4ihNJI3kxhtD7j7ZKydqSutSmYXAk0xFLVkSJsP74AeL+7x8Z05NamkJi/K89x6sX1/Z9E2hZYuDfKsQtX6ULU3U3e0+YID7vHnl+TyRSlq61L25uXcqo7k5bK/EuQYN6n2uIUOKP1cxZa9W+ubSS9PPUanrWkNUOk3U0N5/P/yyUs1AGkE1B2jNmxf+f6TatKn4c+U7c2qqaqRv3END9fbb94yf2G673GXr5+IXDLSWgTSSaua2K3GufGdOTapG+mbRIvjv/4bLLgvzPl18cUiLVaotpkHELxhoLQNpJNXMbddDHj3fRXmKtX49/Mu/wEc/CuecE7YdcEBop3jppfKco0EpGIjUs0rfHGt1riipqSUIv9ivvrp86Zsvfzmkvq69tmfg24QJ4fmZZ8pzjgYVv2CgNJE0kmLy7o1wrlzlWLUq9KLq7i6+N1NS6kjjW2+F444Lg9ySkjO/xjwYDKx1AapONQNpNLNnV++GXM1z5XLkkeEGvnx5z8pphUqONE5thF++PGxPfs8ddwzB4tlnSy5yI4tfzUDBQKQx7LprWGehlCkq8u2NNWFC7GsG8QsGShOJNI5Zs+CJJ8Iyn8XIt4fUhAnw3HNh/YWYil8wUM1ApHHMmhWe7767uOMzrcMA6T2kDjggjLGIcY8iBQMRqV8HHxxu6MWmivbcM31bph5S6lEU42CgNJFI/TMLtYP77gvzMxXi9tvh4YfDlNm5ekipR1EMg0GyzUA1A5HGcMwx8M478Mgj+R/T2Rl6EU2aBDfdlHsU9A47hFpEjHsUxS8YdHWFXwjbb1/rkohIPmbMCHMH5ZsqcocvfAHeegsWL4ZBg/I7LuY9iuIZDHbYobj52UWk+nbcEQ4/PIwPyMeyZfDzn8N3vgMHHZT/eSZMgOefj22PovgFA61lINJ4jjkmLL/5yiuZ96eOMj71VNh7b7joosLOkexR1NFRcnEbUdHBwMxGmtkjZrbWzJaZ2eA++6eb2d8SC9t0mNkFie1DzexOM1tnZveZ2W6lfomCdHUpGIg0mmxdTPuuZ9zdDa++GmoIhYh5j6JSagY/ANrcfTSwFTg3w3tuc/fxicdliW0XA8+4+yjgceDfSyhD4bTkpUjj2WcfGD8+c6oo0yjjzZsLX4dh//1D+ljBoGDHAosTr5cAR+V53KeAaxOvFxdwXHkoTSTSmGbNggceCLOOpirXOgzNzaFHkYJB/sxsF2CTuyfD8TpgRJ+3OTDLzF5KpJF2T2wfRVgDOeq41PPMNbN2M2vv7OwspqjplCYSaUyzZoVf/A8+2LNtzRoYGDHfZjHrMEyYENvupTmDgZldk7whJx/AwUDqqtzdwLbU49x9hbvvDuwHvAZcmtg1KOXYtOP6fMZCd29199aWlpa8v1RWCgYijenww0MX05NOCg3Fw4fDgQeG1327ihe7DkOyR1GhA9z6gZzBwN3PSd6Qkw/gYWBnM0t24B0FrI04fgtwHXBgYtN6YI9cx1XMxo1qMxBpRD//eWgcfvfd0FD8+uv
h9Xe/C9ddV551GA44IASCavYoSu0JNW5c+LsGikoTuXs38BDw2cSmOcAtqe8xs3FmNtDMDJgNPJbYtRw4I/H69L7HVZxqBiKNad482NYnkeAOV15Z+FrLUardo6hvT6jVq8PfbW1VDxKlNCCfD5xrZuuAzcCNiW6jy82sCZgOrAI6gL2AryaO+yYwLXHcgcBlfT+4ohQMRBpTuRqKs9lvv+r2KIpab+H886ODRIUUvdKZu78C9F1+6B3gmMTrRYlH3+M2ADOLPW/JlCYSaUxjxoSbYqbt5dLcDHvtVb1gEBXINmxI35ZclKdCK9HFawSyu2oGIo1q/vz0/7vFNhRnU80eRaNGFfb+ctaC+ohXMNi8OTwrGIg0ntmzQ8NwORqKs5kwAV54oTo9ij72sfRtzc2wW8TEDOWsBfURr2CgJS9FGlu5GoqzSfYoevHFwo4rtMF369aw3sK++6YHuB//uDq1oBRFtxk0JK1yJiK5pPYoOuCA/I5J9gpK3mOSDb4QHbBuvTWkfW6/HY47LvN75s0L7xkzJgSCCrUXQNxqBgoGIpLLfvuFX/eFNCJH9QqKmh/JHS65JMy5dOyxmd9TjVpQinjVDJQmEpFchgwpvEdRod1ef/MbWLkSrrkmBJ46UB+lqBbVDEQkH4X2KIrqFRTV4HvJJdDSAqedVnjZKkTBQESkr2SPovffz/1edxg9OvO+009P3/bnP8Odd8KXvhRqIXUinsFAaSIRyeaAA0Jvn3x6FC1YAL/7HXz60z29gkaNgmHDwpxJb7zR+/2XXgqDB8N551Wm7EWKVzBIthmoZiAi2SRHOh90UPZuoitWwIUXwj/9E9x0U0+D79q1cM89YTK92bN75lRavx6WLIE5c0KwqCPxCgZKE4lILm1tPf35M80LlDqe4BOfCLn/JUvSG4InT4YrroD77gszq0KYVG/LFrjggqp9nXzFqzeR0kQikkuubqKp4wnc4a234I47Mnf9PPts+O1v4dvfDoHgjTdCO8Hjj4dupXUkXjUDpYlEJJeo7qCrV8NZZxW23rIZTJ8enpNtB5s2VXwG0mLEKxh0dUFTU1gtSUQkk6juoEOG9Mxv1le2CeS+/e1Qg0iVbUBajcQECmBdAAALVUlEQVQvGDQ3hygtIpJJ1Oyo114begtlkm0CuWqsw1AGRQcDMxtpZo+Y2drEgveD++x/zMw6Eo81Zva/ie1zzGxDyr4TS/0SedNaBiKSS7bZUYuZRjsqUFRwBtJilFIz+AHQ5u6jga3Auak73f1Qdx/v7uOB7wI3pOxekNzn7tVb9lJrGYhIPqLmBSpmGu1qrcNQolKCwbHA4sTrJcBRWd57FvCzEs5VHgoGIlKqQieQq9Y6DCUqKhiY2S7AJndPNquvA0ZEvPdgoNvdk7M+dQNzzexFM7vGzCLzNmY218zazay9s7OzmKL2pjSRiNRClWcgLUbOYJC4YbenPoCDCTf1pG5gW8RHnE1KrcDdl7j7HsBEYGfga1HndveF7t7q7q0tLS15fJ0cVDMQEcko56Azdz+n7zYzGwDsbGaD3P19YBSwNsP7BgMnAF/P8LkbzewGQgqpOrq6YETGCoyISKwVlSZy927gIeCziU1zgEwNwScCv3b3d5IbzGxvCwYCJwGPFVOGoihNJCKSUSkNyOcD55rZOmAzcKOZDTWz5WbWlHhPrxRRwinAGuB5oAv4UQllKIzSRCIiGRU9N5G7vwJM7bP5HeCYlPd8LMNx3wO+V+x5S6JgICKSUbxGICtNJCKSUXyCQXd3mFdENQMRkTTxCQabNoVnBQMRkTTxCQZay0BEJFJ8goHWMhARiRSfYKAlL0VEIsUvGChNJCKSJj7BQGkiEZFI8QkGShOJiERSMBARkRgFg2SaSG0GIiJp4hMMVDMQEYmkYCAiIjEKBupNJCISqeRgYGYfKkdBKq6rC7bbLjxERKSXooOBmV1oZi8BKyP2DzSzRWb2qpk9amZ7JrYPNbM7zWydmd1
nZrsVW4aCaC0DEZFIpdQM2oFDs+z/HDCYsD7ydcBlie0XA8+4+yjgceDfSyhD/rSWgYhIpKKDgbuvcPc3s7zlU8BP3d2BNuCIlO3XJl4vBo4qtgwFUc1ARCRSJRuQRwOrAdy9C+gys10INYU1ifesA0ZUsAw9FAxERCLlXAPZzK4BJvfZfKa7P53j0EFAd8rf3cC2PtuT26LOPReYCzBmzJhcRc1OaSIRkUg5g4G7n1PkZ78GjAReMrMhwEB3/5uZrQf2INQORgFrs5x7IbAQoLW11YssR6CagYhIpEqmiZYDpydenwr8MmX7GYnXpwO3VLAMPRQMREQildK19Coz6wCazKzDzK5IdBtdbmZNwJXAYDNbSwgG30gc+k1gmpmtAw6kp5dRZXV1KU0kIhIhZ5ooirt/IWLXMYnnbcApGY7bAMws9rxF27hRNQMRkQjxmY5CaSIRkUjxCgZKE4mIZBSPYLBtG7z3nmoGIiIR4hEMNH21iEhWCgYiIhKTYKAlL0VEsopHMFDNQEQkKwUDERGJSTBQmkhEJKt4BAPVDEREslIwEBGRmAQDpYlERLKKRzBQzUBEJCsFAxERiUkwSKaJFAxERDIqORiY2YfKUZCK6uqC7beHpqZal0REpC6VstLZhWb2ErAyYv8kM3s4sQrak8mgYWZzzGxDYnuHmZ1YbBnyprUMRESyKqVm0A4cmmX/7sBp7j6esLTlJSn7Frj7+MSj8msgay0DEZGsig4G7r7C3d/Msv9ed1+V+PMJYNdiz1UyLXkpIpJVtRqQTwNuTbzuBuaa2Ytmdo2ZRf5kN7O5ZtZuZu2dnZ3Fn11pIhGRrHIGg8QNu73PI+9GYzM7AZhGIk3k7kvcfQ9gIrAz8LWoY919obu3untrS0tLvqdMpzSRiEhWA3O9wd3PKfbDzWwm8FXgKHff0udzN5rZDcBZxX5+3jZuhKFDK34aEZFGVbE0kZl9AvgOcIy7/zVl+94WDAROAh6rVBn+TmkiEZGsSulaepWZdQBNiS6iV5jZUDNbbmZNwEJgLPBoYn+y19ApwBrgeaAL+FGJ3yE3BQMRkaxypomiuPsXInYdk3geH3Hc94DvFXveomzcqDYDEZEs4jEdhWoGIiJZKRiIiEgMgsGWLeGhNJGISKT+Hww0fbWISE4KBiIiEoNgoCUvRURy6v/BQDUDEZGcFAxERCQGwUBpIhGRnPp/MFDNQEQkJwUDERGJQTBQmkhEJKf+HwxUMxARyUnBQERESg8GhSyBWRPJYDBkSG3LISJSx0pZ3OZCM3sJWBmxf7qZ/S2xsE2HmV2Q2D7UzO40s3Vmdp+Z7VZsGfKycWMIBAP6fyVIRKRYpdwh24FDc7znNncfn3hclth2MfCMu48CHgf+vYQy5Kbpq0VEciplpbMVAGZW6KGfAo5PvF4M/KrYMuRFwUBEJKdK5k4cmGVmL5nZMjPbPbF9FGENZIB1wIioDzCzuWbWbmbtnZ2dxZVCS16KiOSUMxiY2TXJG3LKI2ejsbuvcPfdgf2A14BLE7sGAd2J193AtiyfsdDdW929taWlJeeXyUg1AxGRnHKmidz9nFJO4O5bzOw6YGli03pgD0LtYBSwtpTPz0nBQEQkp4qlicxsnJkNtNCoMBt4LLFrOXBG4vXpwC2VKgOgNJGISB5K6Vp6lZl1AE2JrqNXJLqNLjezJmA6sAroAPYCvpo49JvANDNbBxwIXJb+6WWkmoGISE7m7rUuQ15aW1u9vb29sIPa2mDOHNi6FcaOhfnzYfbsipRPRKTemNlKd2/N5739dyRWWxvMnRsCAcDq1eHvtrbalktEpA7132Awb17PVBRJXV1hu4iI9NJ/g8GaNYVtFxGJsf4bDMaMKWy7iEiM9d9gMH9+ei+i5uawXUREeum/wWD2bFi4MPQiMgvPCxeqN5GISAZFT1TXEGbP1s1fRCQP/bdmICIieVMwEBERBQMREVEwEBERFAxERIQGmqjOzDqB1bUuR5U
NA96odSHqgK5DoOvQQ9ciyHUdxrp7XiuDNUwwiCMza893xsH+TNch0HXooWsRlPM6KE0kIiIKBiIiomBQ7xbWugB1Qtch0HXooWsRlO06qM1ARERUMxAREQWDumNmQ8xsn1qXQ0TiRcGgTpjZTmb2S+B14F9Ttn/FzNaY2fNmdnTtSlgdZjbYzBYmvu9qM7sgsT1u12GAmf3azF5IfOeZie2xug5JZjbIzJ41s58m/o7rdVhlZh2Jx8OJbWW5Fv17CuvG0g1cAdwJTAUws72BLwITgNHAf5vZWHffUrNSVt4OwL3AOcBuwDNm9gTxuw4OfM7dXzOzo4D5ZtZB/K5D0teBVRDb/xd/5+7jk6/LeS1UM6gT7v6uu98PbE3ZfAJws7u/4+7PEv4zTK5F+arF3d9095978AawFvgY8bsO7u6vJf4cCzxNDP89AJjZ/sAU4ObEplhehwhluxYKBvVtNL2n4FgHjKhRWarOzA4EBhOG3MfuOpjZv5rZm8AFwHeI4b8HMzNgAfCVlM2xuw4pNpnZS2b2aCJ1WLZroWBQ3wYR0kdJ3cC2GpWlqsxsGHADcDoxvQ7u/kN3342QIrmXeF6Hc4GH3L0jZVscrwMA7r6/u+8NXAy0UcZroWBQ314DRqb8PYqQNunXzGwX4A7g6+7+ODG9DknufhuwI/G8DqcBJ5vZU4Ta0QnAeuJ3HXpx94cJKaGy/ZtQMKhvywn/EZoTedNdgadqXKaKMrOdgF8B89397sTmOF6HvcxseOL1NGAzMbwO7n6Yux/k7hOB/wB+QehkEavrAGBmO5jZiMTrSYR00P2U6VqoN1GdMLOhwJPAUGCwmU0HzgaWAs8QbgZnef8fMn4+cAhwuZldnth2JPG7DjsD95hZE/C/wEnuvtLM4nYd0sT4OjQDKxL/Jt4GTnX335brWmg6ChERUZpIREQUDEREBAUDERFBwUBERFAwEBERFAxERAQFAxERQcFARERQMBAREeD/A8fJdsiAXvJGAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "scores = [cross_val_score(regr, data_X_scale,\\\n", " boston.target,\\\n", " cv = int(i)).mean() \\\n", " for i in range(3, 50)]\n", "plt.plot(range(3, 50), scores,'r-o')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:25:34.856887Z", "start_time": "2018-04-29T07:25:34.840623Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.45384871359695633" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_X_scale = scale(boston.data)\n", "scores = cross_val_score(regr,data_X_scale, boston.target,\\\n", " cv = 7)\n", "scores.mean() " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# 使用天涯bbs数据" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:26:04.816677Z", "start_time": "2018-04-29T07:26:04.799349Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
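<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr style=\"text-align: right;\">\n",
 "      <th></th>\n",
 "      <th>title</th>\n",
 "      <th>link</th>\n",
 "      <th>author</th>\n",
 "      <th>author_page</th>\n",
 "      <th>click</th>\n",
 "      <th>reply</th>\n",
 "      <th>time</th>\n",
 "    </tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr>\n",
 "      <th>0</th>\n",
 "      <td>【民间语文第161期】宁波px启示:船进港湾人应上岸</td>\n",
 "      <td>/post-free-2849477-1.shtml</td>\n",
 "      <td>贾也</td>\n",
 "      <td>http://www.tianya.cn/50499450</td>\n",
 "      <td>194675</td>\n",
 "      <td>2703</td>\n",
 "      <td>2012-10-29 07:59</td>\n",
 "    </tr>\n",
 "    <tr>\n",
 "      <th>1</th>\n",
 "      <td>宁波镇海PX项目引发群体上访 当地政府发布说明(转载)</td>\n",
 "      <td>/post-free-2839539-1.shtml</td>\n",
 "      <td>无上卫士ABC</td>\n",
 "      <td>http://www.tianya.cn/74341835</td>\n",
 "      <td>88244</td>\n",
 "      <td>1041</td>\n",
 "      <td>2012-10-24 12:41</td>\n",
 "    </tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "</div>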
" ], "text/plain": [ " title link author \\\n", "0 【民间语文第161期】宁波px启示:船进港湾人应上岸 /post-free-2849477-1.shtml 贾也 \n", "1 宁波镇海PX项目引发群体上访 当地政府发布说明(转载) /post-free-2839539-1.shtml 无上卫士ABC \n", "\n", " author_page click reply time \n", "0 http://www.tianya.cn/50499450 194675 2703 2012-10-29 07:59 \n", "1 http://www.tianya.cn/74341835 88244 1041 2012-10-24 12:41 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "df = pd.read_csv('../data/tianya_bbs_threads_list.txt', sep = \"\\t\", header=None)\n", "df=df.rename(columns = {0:'title', 1:'link', 2:'author',3:'author_page', 4:'click', 5:'reply', 6:'time'})\n", "df[:2]" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:26:13.615377Z", "start_time": "2018-04-29T07:26:13.600130Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# 定义这个函数的目的是让读者感受到:\n", "# 抽取不同的样本,得到的结果完全不同。\n", "def randomSplit(dataX, dataY, num):\n", " dataX_train = []\n", " dataX_test = []\n", " dataY_train = []\n", " dataY_test = []\n", " import random\n", " test_index = random.sample(range(len(df)), num)\n", " for k in range(len(dataX)):\n", " if k in test_index:\n", " dataX_test.append([dataX[k]])\n", " dataY_test.append(dataY[k])\n", " else:\n", " dataX_train.append([dataX[k]])\n", " dataY_train.append(dataY[k])\n", " return dataX_train, dataX_test, dataY_train, dataY_test, " ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:26:27.510118Z", "start_time": "2018-04-29T07:26:27.485883Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Variance score: 0.74\n" ] } ], "source": [ "import numpy as np\n", "\n", "# Use only one feature\n", "data_X = df.reply\n", "# Split the data into training/testing sets\n", "data_X_train, data_X_test, data_y_train, data_y_test = 
randomSplit(np.log(df.click+1), \n", " np.log(df.reply+1), 20)\n", "# Create linear regression object\n", "regr = linear_model.LinearRegression()\n", "# Train the model using the training sets\n", "regr.fit(data_X_train, data_y_train)\n", "# Explained variance score: 1 is perfect prediction\n", "print('Variance score: %.2f' % regr.score(data_X_test, data_y_test))" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T11:56:23.976497Z", "start_time": "2018-04-29T11:56:23.936410Z" } }, "outputs": [ { "data": { "text/plain": [ "[[194675, 2703],\n", " [88244, 1041],\n", " [82779, 625],\n", " [45304, 219],\n", " [38132, 835],\n", " [27026, 122],\n", " [24026, 115],\n", " [21497, 378],\n", " [15366, 375],\n", " [8513, 41],\n", " [7191, 61],\n", " [6756, 16],\n", " [6368, 86],\n", " [4990, 0],\n", " [4241, 0],\n", " [3995, 19],\n", " [3720, 2],\n", " [3468, 104],\n", " [3421, 7],\n", " [3233, 70],\n", " [3126, 50],\n", " [2699, 59],\n", " [2456, 2],\n", " [2433, 4],\n", " [2342, 23],\n", " [2257, 142],\n", " [2164, 35],\n", " [2153, 0],\n", " [2151, 35],\n", " [2116, 70],\n", " [2077, 18],\n", " [1981, 24],\n", " [1875, 28],\n", " [1809, 15],\n", " [1795, 1],\n", " [1772, 18],\n", " [1599, 75],\n", " [1516, 44],\n", " [1414, 10],\n", " [1319, 28],\n", " [1306, 36],\n", " [1294, 5],\n", " [1268, 4],\n", " [1219, 3],\n", " [1214, 24],\n", " [1156, 24],\n", " [1154, 16],\n", " [1099, 77],\n", " [1046, 0],\n", " [1033, 6],\n", " [1033, 0],\n", " [998, 35],\n", " [998, 15],\n", " [987, 0],\n", " [947, 4],\n", " [910, 0],\n", " [891, 39],\n", " [852, 0],\n", " [813, 11],\n", " [768, 20],\n", " [746, 7],\n", " [707, 10],\n", " [705, 29],\n", " [702, 18],\n", " [677, 12],\n", " [668, 42],\n", " [667, 0],\n", " [655, 3],\n", " [652, 0],\n", " [624, 9],\n", " [622, 82],\n", " [608, 7],\n", " [601, 16],\n", " [597, 18],\n", " [596, 11],\n", " [584, 0],\n", " [567, 10],\n", " [544, 17],\n", " [531, 7],\n", " [525, 10],\n", " [515, 
62],\n", " [508, 12],\n", " [508, 7],\n", " [498, 0],\n", " [496, 1],\n", " [482, 5],\n", " [462, 0],\n", " [458, 41],\n", " [444, 7],\n", " [433, 0],\n", " [421, 1],\n", " [420, 12],\n", " [419, 35],\n", " [410, 8],\n", " [405, 0],\n", " [405, 0],\n", " [400, 14],\n", " [397, 16],\n", " [388, 12],\n", " [381, 7],\n", " [381, 1],\n", " [379, 0],\n", " [362, 1],\n", " [352, 4],\n", " [349, 0],\n", " [348, 8],\n", " [331, 1],\n", " [328, 0],\n", " [327, 6],\n", " [324, 1],\n", " [320, 1],\n", " [315, 0],\n", " [306, 14],\n", " [306, 0],\n", " [300, 0],\n", " [300, 4],\n", " [289, 0],\n", " [288, 1],\n", " [287, 0],\n", " [286, 5],\n", " [278, 3],\n", " [275, 4],\n", " [272, 0],\n", " [269, 1],\n", " [269, 7],\n", " [265, 0],\n", " [261, 0],\n", " [261, 6],\n", " [255, 9],\n", " [252, 7],\n", " [250, 0],\n", " [241, 0],\n", " [235, 5],\n", " [235, 4],\n", " [234, 9],\n", " [232, 7],\n", " [224, 3],\n", " [216, 2],\n", " [214, 24],\n", " [207, 1],\n", " [205, 4],\n", " [197, 4],\n", " [190, 0],\n", " [188, 2],\n", " [187, 0],\n", " [183, 6],\n", " [181, 5],\n", " [176, 0],\n", " [172, 3],\n", " [170, 5],\n", " [170, 0],\n", " [166, 0],\n", " [166, 5],\n", " [165, 0],\n", " [164, 3],\n", " [164, 1],\n", " [161, 0],\n", " [154, 1],\n", " [151, 1],\n", " [151, 2],\n", " [149, 1],\n", " [149, 0],\n", " [149, 3],\n", " [147, 0],\n", " [146, 5],\n", " [145, 0],\n", " [143, 0],\n", " [142, 0],\n", " [139, 5],\n", " [137, 4],\n", " [137, 1],\n", " [136, 0],\n", " [135, 1],\n", " [134, 0],\n", " [133, 1],\n", " [131, 1],\n", " [127, 0],\n", " [125, 0],\n", " [123, 0],\n", " [119, 0],\n", " [118, 0],\n", " [118, 0],\n", " [118, 0],\n", " [116, 0],\n", " [116, 0],\n", " [114, 7],\n", " [113, 0],\n", " [110, 0],\n", " [110, 0],\n", " [109, 0],\n", " [108, 8],\n", " [107, 8],\n", " [106, 0],\n", " [105, 0],\n", " [105, 0],\n", " [105, 10],\n", " [103, 0],\n", " [101, 5],\n", " [100, 6],\n", " [100, 0],\n", " [99, 3],\n", " [99, 1],\n", " [98, 0],\n", " [98, 1],\n", " [98, 1],\n", " 
[97, 2],\n", " [96, 0],\n", " [96, 0],\n", " [95, 0],\n", " [94, 3],\n", " [93, 0],\n", " [93, 3],\n", " [92, 2],\n", " [90, 1],\n", " [90, 2],\n", " [89, 0],\n", " [88, 3],\n", " [86, 0],\n", " [86, 3],\n", " [85, 0],\n", " [85, 0],\n", " [84, 0],\n", " [84, 1],\n", " [83, 1],\n", " [83, 0],\n", " [82, 0],\n", " [81, 9],\n", " [81, 5],\n", " [81, 2],\n", " [81, 10],\n", " [81, 0],\n", " [80, 0],\n", " [80, 0],\n", " [80, 5],\n", " [78, 0],\n", " [78, 0],\n", " [77, 0],\n", " [76, 3],\n", " [76, 0],\n", " [76, 0],\n", " [75, 0],\n", " [74, 1],\n", " [74, 0],\n", " [73, 0],\n", " [73, 3],\n", " [73, 3],\n", " [73, 0],\n", " [73, 5],\n", " [73, 0],\n", " [73, 0],\n", " [72, 1],\n", " [72, 0],\n", " [64, 2],\n", " [64, 0],\n", " [64, 1],\n", " [64, 0],\n", " [64, 0],\n", " [63, 1],\n", " [62, 3],\n", " [62, 0],\n", " [62, 0],\n", " [61, 1],\n", " [61, 0],\n", " [61, 0],\n", " [61, 0],\n", " [61, 0],\n", " [60, 2],\n", " [60, 3],\n", " [59, 0],\n", " [59, 0],\n", " [59, 0],\n", " [59, 4],\n", " [59, 0],\n", " [59, 0],\n", " [59, 2],\n", " [58, 0],\n", " [58, 0],\n", " [58, 0],\n", " [58, 0],\n", " [57, 1],\n", " [57, 0],\n", " [57, 1],\n", " [57, 4],\n", " [57, 0],\n", " [57, 0],\n", " [56, 0],\n", " [56, 1],\n", " [56, 0],\n", " [56, 0],\n", " [55, 0],\n", " [55, 0],\n", " [54, 0],\n", " [54, 0],\n", " [53, 4],\n", " [53, 0],\n", " [53, 0],\n", " [52, 0],\n", " [52, 0],\n", " [52, 0],\n", " [52, 0],\n", " [52, 1],\n", " [52, 0],\n", " [51, 0],\n", " [51, 0],\n", " [50, 0],\n", " [50, 0],\n", " [50, 1],\n", " [50, 0],\n", " [50, 0],\n", " [50, 0],\n", " [49, 0],\n", " [49, 0],\n", " [49, 0],\n", " [49, 0],\n", " [49, 0],\n", " [48, 0],\n", " [47, 0],\n", " [47, 0],\n", " [47, 0],\n", " [47, 0],\n", " [46, 0],\n", " [46, 0],\n", " [46, 0],\n", " [45, 1],\n", " [45, 1],\n", " [45, 0],\n", " [45, 0],\n", " [44, 0],\n", " [43, 0],\n", " [43, 0],\n", " [43, 0],\n", " [43, 0],\n", " [43, 1],\n", " [43, 1],\n", " [42, 0],\n", " [42, 0],\n", " [42, 1],\n", " [42, 1],\n", " 
[42, 1],\n", " [42, 2],\n", " [42, 3],\n", " [41, 0],\n", " [41, 0],\n", " [41, 0],\n", " [41, 0],\n", " [40, 0],\n", " [40, 0],\n", " [40, 0],\n", " [40, 1],\n", " [40, 0],\n", " [39, 0],\n", " [39, 0],\n", " [39, 0],\n", " [39, 0],\n", " [39, 1],\n", " [39, 0],\n", " [39, 0],\n", " [38, 1],\n", " [38, 0],\n", " [38, 0],\n", " [38, 0],\n", " [38, 1],\n", " [37, 0],\n", " [37, 0],\n", " [37, 0],\n", " [37, 0],\n", " [36, 0],\n", " [36, 0],\n", " [36, 0],\n", " [36, 0],\n", " [36, 0],\n", " [36, 0],\n", " [36, 1],\n", " [36, 0],\n", " [35, 0],\n", " [35, 0],\n", " [35, 0],\n", " [34, 0],\n", " [34, 2],\n", " [34, 0],\n", " [34, 2],\n", " [34, 0],\n", " [33, 0],\n", " [33, 0],\n", " [33, 0],\n", " [33, 0],\n", " [33, 0],\n", " [33, 0],\n", " [33, 1],\n", " [33, 0],\n", " [33, 0],\n", " [32, 0],\n", " [31, 0],\n", " [31, 0],\n", " [31, 0],\n", " [30, 0],\n", " [30, 0],\n", " [29, 0],\n", " [29, 0],\n", " [29, 0],\n", " [29, 0],\n", " [29, 0],\n", " [29, 0],\n", " [28, 0],\n", " [28, 0],\n", " [28, 0],\n", " [28, 0],\n", " [28, 0],\n", " [27, 0],\n", " [26, 0],\n", " [26, 0],\n", " [26, 0],\n", " [25, 0],\n", " [25, 0],\n", " [25, 0],\n", " [25, 0],\n", " [25, 0],\n", " [24, 0],\n", " [24, 0],\n", " [24, 0],\n", " [24, 0],\n", " [24, 0],\n", " [24, 0],\n", " [23, 0],\n", " [23, 0],\n", " [23, 0],\n", " [23, 0],\n", " [22, 0],\n", " [22, 1],\n", " [21, 0],\n", " [21, 0],\n", " [21, 0],\n", " [20, 0],\n", " [20, 0],\n", " [20, 0],\n", " [19, 0],\n", " [19, 0],\n", " [19, 0],\n", " [17, 0],\n", " [17, 0],\n", " [17, 0],\n", " [17, 0],\n", " [17, 0],\n", " [17, 0],\n", " [15, 0],\n", " [14, 0],\n", " [11, 0]]" ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_X_train\n" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:26:38.754002Z", "start_time": "2018-04-29T07:26:38.751117Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "y_true, y_pred 
= data_y_test, regr.predict(data_X_test)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:26:41.635527Z", "start_time": "2018-04-29T07:26:41.541620Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW0AAAD+CAYAAADxhFR7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAADiFJREFUeJzt3cGLJId1x/Hf690d1iWtkDMzjkLiroowIgHn4u0Q2/ggCET5A5I40CRohV1kdbBxEuSwTW6ui3KISG5NkATeyiFgxdgHWzGOk1MO7o1jcAwyQlGPY9by7p7iNEZy9uUwM8vM7PR0dU9VV7/u7wcGNNVF1Zse9quiqqfK3F0AgBg6bQ8AAKiOaANAIEQbAAIh2gAQCNEGgECINgAEQrQBIBCiDQCBEG0ACORi3Rvc2dnxLMvq3iwArLVbt27ddffdWevVHu0syzQajereLACsNTMbV1mP0yMAEAjRBoBAiDYABEK0ASAQog0AgRBtAAiEaAPYeGVZKssydTodZVmmsizbHmmq2j+nDQCRlGWpPM81mUwkSePxWHmeS5L6/X6bo52KI20AG20wGDwI9qHJZKLBYNDSRGcj2gA22t7e3lzL20a0AWy0brc71/K2EW0AG60oCiVJcmxZkiQqiqKlic5GtAFstH6/r+FwqDRNZWZK01TD4XAlL0JKkrl7rRvs9XrOXf4AYD5mdsvde7PW40gbAAIh2gAQCNEGgECINgAEQrQBIBCiDQCBEG0ACIRoA0AgRBsAAiHaABAI0QaAQIg2AARSKdpmtmVm3zezv2t6IADAdFWPtG9IervBOQAAFcyMtpn9uqTflPQPzY8DADjLmdE2M5P0N5I+O2O93MxGZja6c+dOnfMBAI6YdaT9J5L+xd3fPGsldx+6e8/de7u7u/VNBwA45uKM1/9I0hUz+31JvyDpETN7w93/qvnRAAAnnRltd//44X+b2bOSPkGwAaA9fE4bAAKZdXrkAXd/VdKrjU0CAJiJI20ACIRoA0AgRBsAAiHaAHAOZVkqyzJ1Oh1lWaayLBvdX+ULkQCA48qyVJ7nmkwmkqTxeKw8zyVJ/X6/kX1ypA0ACxoMBg+CfWgymWgwGDS2T6INAAva29uba3kdiDYALKjb7c61vA5EGwAWVBSFkiQ5tixJEhVF0dg+iTYALKjf72s4HCpNU5mZ0jTVcDhs7CKkJJm717rBXq/no9Go1m0CwLozs1vu3pu1HkfaABAI0QaAQIg2AARCtAEgEKINAIEQbQAIhGgDQCBEGwACIdoAEAjRBoBAiDYABEK0ASAQog0AgRBtAAiEaANAIEQbAAIh2gAQCNEGgECINgAEQrQBIBCiDQCBEG0ACIRoA0AgRBsAApkZbTPrmNk3zOwHZvaGmT2zjMEAAA+rcqTtkv7Y3Z+S9FlJRbMjAQCmuThrBXd3SbcPvk0lfbfRiQAAU82MtiSZ2QuSPi/pjqSHTo+YWS4pl6Rut1vnfACAIypdiHT3F919W9INSa+bmZ14fejuPXfv7e7uNjEnAEBzfnrE3V+T9Kik7WbGAQCcpcqnR540sycO/vtjkn7m7ncbnwwA8JAq57Qfl/R1M7sg6SeSPtnsSACAaap8euTfJT21hFkAADPwF5EAEAjRBoBAiDYABEK0ASAQog0AgRBtAAiEaANAIEQbAAIh2g
AQCNEGgECINgAEQrQBIBCiDQCBEG0ACIRoA0AgRBsAAiHaABAI0QaAQIg2AARCtAEgEKINAIEQbQAIhGgDQCBEGwACIdoAEAjRBoBAiDYABEK0ASAQog0AgRBtAAiEaANAIEQbAAIh2gAQCNEGgECINgAEMjPaZnbZzIZm9oaZjc3sc8sYDADwsCpH2o9Iel3Sr0m6KukvzOyDjU4FADjVzGi7+z13/5Lvuyvph5Ieb340AMBJc53TNrMPS7os6XsnludmNjKz0Z07d+qcD3hIWZbKskydTkdZlqksy7ZHApamcrTNbEfSFyVdc3c/+pq7D9295+693d3dumcEHijLUnmeazwey901Ho+V5znhxsaoFG0ze7+kr0q64e7fbnYkYLrBYKDJZHJs2WQy0WAwaGkiYLmqfHrkMUlfkVS4+9eaHwmYbm9vb67lwLqpcqT9GUkfkfSSmb158PVkw3MBp+p2u3MtB9ZNlU+PfMHdH3H3Dx35emsZwwEnFUWhJEmOLUuSREVRtDQRsFz8RSRC6ff7Gg6HStNUZqY0TTUcDtXv99seDVgKO/FBkHPr9Xo+Go1q3SYArDszu+XuvVnrcaQNAIEQbQAIhGgDQCBEGwACIdoAEAjRBoBAiDYABEK0ASAQog0AgRBtAAiEaANAIEQbAAIh2gAQCNEGgECINgAEQrQBIBCi3bCyLJVlmTqdjrIsU1mWbY8EILCLbQ+wzsqyVJ7nmkwmkqTxeKw8zyWJx2MBWAhH2g0aDAYPgn1oMploMBi0NBGA6Ih2g/b29uZaDgCzEO0GdbvduZYDwCxEu0FFUShJkmPLkiRRURQtTQQgOqLdoH6/r+FwqDRNZWZK01TD4ZCLkAAWZu5e6wZ7vZ6PRqNatwkA687Mbrl7b9Z6HGkDQCBEGwACIdoAEAjRBoBAiDYABEK0G1SWpXZ2dmRmMjPt7OxwwygA58INoxpSlqWee+45vfvuuw+W3bt3T9euXZPEDaMALIYj7YYMBoNjwT703nvvccMoAAurHG0ze5+ZPdXkMOvkrJtCccMoAIuaGW0ze8zMvizpHUkvND/SejjrplDcMArAoqocad+X9LeS/rThWdZKURTa2tp6aPmlS5e4YRSAhc2Mtrv/1N2/KennS5hnbfT7fb388sva3t5+sGx7e1uvvPIKFyEBLKzyDaPM7FlJn3D3T53yWi4pl6Rut3t1PB7XOSMArL2l3jDK3Yfu3nP33u7ubh2bBACcgo/8AUAgRBsAApn5F5FmdkXSdyRdkXTZzJ6W9Gl3/1bDswEATpgZbXf/H0kfWsIsAIAZOD0CAIEQbQAIhGgDQCBEGwACIdoAEAjRxlRlWSrLMnU6HWVZttSn7rS5b2CV8eQanKosS+V5rslkIkkaj8fK81xS80/daXPfwKqrfMOoqnq9no9Go1q3ieXLskyn3fgrTVO9/fbba7tvoC1LvWEU1s+0p+ss46k7be4bWHVEG6ea9nSdZTx1p819A6uOaONURVEoSZJjy5IkWcpTd9rcN7DqiDZO1e/3NRwOlaapzExpmmo4HC7lQmCb+wZWHRciAWAFcCESANYQ0QaAQIg2AARCtAEgEKINAIEQbQAIhGgDQCBEGwACIdoAEAjRBoBAiDYABEK0ASAQog0AgRBtAAiEaANAIEQbAAIh2gAQCNEGgECINgAEQrQBIBCiDQCBEG0ACKRStM3sD8zsv8zsTTN7rolByrJUlmXqdDrKskxlWU5dx8x08eJFmdmDdc967Tz7XGT2559/vpbtzrvfo/uZ9lpdP3PToswJLJ27n/kl6YqkH0r6ZUlPSPqxpN1p61+9etXndfPmTU+SxCU9+EqSxG/evHnmOodfW1tbfunSpVNfO7mdefa56OxVZziPs+af9tr169dr+ZmbVtfvBohE0shn9NjdK0X79yTdPPL930v6w2nrLxLtNE1PjV2apjPXqfJ1dDvz7PM8s593u4vuN03Tqa9duHBhKbOdV12/Gy
CSqtG2/XWnM7PPSdpx98HB9y9Kuu3uf31knVxSLkndbvfqeDw+c5sndTodnTaHmen+/ftnrlPF0e3Ms88qqs4173YX3a+ZSdJc71Xds51XXb8bIBIzu+XuvVnrVTmnvSXp6L+U+5L+7+gK7j50956793Z3d+ebVFK32525fNo6i26/yj4X3XYd2110e91ud+prFy5cmGtbbanrdwOsoyrRvq3989mHfkX757hrUxSFkiQ5tixJEhVFceY6h7a2tnTp0qVTXzu5nXn2uejsVWc4j7Pmn/Zanue1/MxNq+t3A6ylWedPJP2ipB9J+oD2L0S+JemRaesvck7bff/iU5qmbmaepunUi4eH5zsPz88ernvWa+fZ5yKzX79+vZbtzrvfkxduT3utrp+5aVHmBOqius5pS5KZPSvpLw++/XN3/8dp6/Z6PR+NRov/XwQANlDVc9oXq2zM3V+V9Oo5ZwIAnBN/EQkAgRBtAAiEaANAIEQbAAIh2gAQSKWP/M21QbM7kub7O/b27Ei62/YQAfA+VcP7NBvv0XSpu8/8k/Laox2JmY2qfC5y0/E+VcP7NBvv0flxegQAAiHaABDIpkd72PYAQfA+VcP7NBvv0Tlt9DltAIhm04+0ASAUoo2ZzOx9ZvZU23MA2NBoL+Pp8uvAzB4zsy9LekfSC23Ps4rM7LKZDc3sDTMbHzyeDyeYWcfMvmFmPzh4r55pe6aoNu6ctpldkfR9SR/V/mPT/kPSb7j7nVYHW0Fm9qik35L0q5I+6u6fanmklWNm25KelvSapG1J/ymp5+61Pt0pOtt/eOkT7n7bzH5X0hf4vPZiNvFI+xlJ/+ruP3L3H0v6Z0m/3fJMK8ndf+ru35T087ZnWVXufs/dv3Tw8JG72n8U3+Ntz7VqDt6f2wffppK+2+Y8kVV6CMKa+aCO/5n9f0v6pZZmwRoxsw9Luizpe23PsorM7AVJn5d0R/sHT1jAJh5pz3y6PDAvM9uR9EVJ13zTzjlW5O4vuvu2pBuSXj84ZYI5bWK0G3+6PDaLmb1f0lcl3XD3b7c9z6pz99ckPar9awCY0yZG+3VJz5jZB8zsCUkfl/RPLc+EoMzsMUlfkVS4+9fanmdVmdmTB//eZGYfk/Szg2sAmNPGndN293fMbCDp3w4W/Zm7/2+bM62qg0/afEfSFUmXzexpSZ9292+1Othq+Yykj0h6ycxeOlj2O+7+VoszraLHJX3dzC5I+omkT7Y8T1gb95E/AIhsE0+PAEBYRBsAAiHaABAI0QaAQIg2AARCtAEgEKINAIEQbQAIhGgDQCD/D/+oO+KxGV+rAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.scatter(y_pred, y_true, color='black')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:27:00.422795Z", "start_time": "2018-04-29T07:27:00.326748Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW0AAAD+CAYAAADxhFR7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAGRpJREFUeJzt3XuQXGWZx/HvM2Fi6IBhNxMlItO9clkhqMCMF2BZuQpSUIosEBhuYjIq4AoYQRhFRBowoFCIKENWUKcXCxQotSAIVNjyFt0JicpFINFcZIkmqTW1JoSE5Nk/3plkejIz3TNzTp9zun+fqi6Sc076PD0Zfjnznvc8r7k7IiKSDU1JFyAiItVTaIuIZIhCW0QkQxTaIiIZotAWEckQhbaISIYotEVEMkShLSKSIQptEZEM2SXqN2xpafFCoRD124qI1LVFixatdfdplY6LPLQLhQK9vb1Rv62ISF0zsxXVHKfhERGRDFFoi4hkiEJbRCRDFNoiIhmi0BYRyRCFtohIhii0RUSAUqlEoVCgqamJQqFAqVRKuqQhRT5PW0Qka0qlEp2dnWzcuBGAFStW0NnZCUBHR0eSpe1EV9oi0vC6urq2B3a/jRs30tXVlVBFw1Noi0jDW7ly5ai2J0mhLSINr7W1dVTbk6TQFpGGVywWyeVyZdtyuRzFYjGhioan0BaRhtfR0UF3dzf5fB4zI5/P093dnbqbkADm7pG+YXt7u6vLn4jI6JjZIndvr3ScrrRFRDJEoS0ikiEKbRGRDFFoi4hkiEJbRCRDFNoiIhmi0BYRyRCFtohIhii0RUQyRKEtIpIhCm0RkQypKrTNbKKZPWdm8+IuSEREhlftlfbVwPIY6xARkSpUDG0zOwB4N3B//OWIiMhIRgxtMzPgduDTFY7rNLNeM+tds2ZNlPWJiMgAla60PwE85e5LRzrI3bvdvd3d26dNmxZddSIiUmaXCvvPBXY3s9OBfwQmm9kL7n5z/KWJiMhgI4a2ux/e/2szuwD4FwW2iEhyNE9bRCRDKg2PbOfu9wL3xlaJiIhUpCttEZEMUWiLiGSIQltEZBxKpRL5/NswO5N8/p8olUqxnq/qMW0RESnX01Ni1qwHeO21h4F3snLleXR2dgLQ0dERyzl1pS0iMgZLlsDs2a3bAzsosnHjNrq6umI7r0JbRGQUVq2C88+HQw+FTZuOHLDn78DdAKxcuTK282t4RESkCuvXw1e+ArfeCps2DdyzlRDW1wJ/AaC1NR9bHQptEZERbNkCd90FX/oSrF1bvu+QQ/7M889/iE2bnt6+LZfLUSwWY6tHwyMiIkNwh4ceghkz4FOfKg/stjZYsACefvqtzJt3Ofl8HjMjn8/T3d0d201IAHP3SN+wvb3de3t7I31PEZFaWrgQ5syBX/yifHtrK9x4I8ycCU0RX/Ka2SJ3b690nIZHRET6LFsGV10FDzxQvn3KFOjqClfckyYlU1s/hbaINLx16+D66+Eb3whj2P2am+Gii+ALX4CpU5OrbyCFtog0rE2b4Otfh2IxzA4Z6PTTw1DIPvskU9twFNoi0nC2bYPvfx+uvhpWrCjfd/jhcMstc
NhhydRWiUJbRBrKU0+Fm4yLFpVv33ffMA/71FPBLJHSqqIpfyLSEJ5/Hk45BY4+ujywp06F22+HZ5+Fj3wk3YENutIWkTq3ejVcey3Mmwdbt+7YPmkSXHopfO5zYXZIVii0RaQubdgAX/0qzJ0bft3PDM45J8wWaW1Nrr6xUmiLSF3ZuhXuvTdM03vllfJ9xx4LN98MhxySSGmRUGiLSF1wh/nz4Yor4JlnyvfNmBHC+sQT0z9mXYluRIpI5i1ZAh/4AJx0UnlgT58Od98d9n/wg9kPbNCVtohk2KpV8PnPw/e+F660+02eHK64P/OZ8Ot6otAWkcwZrrd1UxPMmhXaqO65Z3L1xUmhLSKZMVJv65NPDkF+4IHJ1FYrCm0RST13ePhhuPJKeOml8n2HHhoeOz/66GRqqzWFtoik2ki9rW+4Ac46K/re1mmm0BaRVMpCb+skKLRFJFWy1Ns6CQptEUmFLPa2ToJCW0QSleXe1klQaItIYrLe2zoJFe+5mlmTmT1uZi+a2QtmdkItChOR+lUvva2TUM2VtgPnufsrZnYiUAQei7csEalH9dbbOgkVQ9vdHehvcJgHfhtrRSJSd+q1t3USqhrTNrMrgCuBNYCGR0SkKvXe2zoJVT1H5O5z3X0qcDXwmFn5SJOZdZpZr5n1rlmzJo46RSRD3OHRR+Hgg0MDp4GBPWMGPPIIPP64AnssRvXwp7s/COwGTB20vdvd2929fdq0aVHWJyIZ00i9rZNQcXjEzN4GbHT31WZ2GLDJ3ddW+nMi0lgasbd1EqoZ094DmG9mE4C/AmfGW5KIZEkj97ZOQjWzR54G9q9BLSKSIeptnQw9ESkio6Le1slSaItI1dTbOnkKbRGpSL2t00OhLSLDUm/r9FFoi8hO1Ns6vRTaIrKdelunn0JbRAD1ts4K3ecVaXDqbZ0tutIWaVDqbZ1NCm2RBqPe1tmm0BZpEOptXR8U2iJ1zh3mzw+d9ga2SgU46KBwxX3iiRqzzgqFtkgdW7IEPvtZeOKJ8u3Tp8OXvwwXXAATJiRSmoyRQlukDqm3df1SaIvUkZF6W8+eHWaLqLd1tim0ReqAels3DoW2SIaN1Nu6rS08dn7UUYmUJjFRaItk1Ei9rW+8EWbOVG/reqTQFskY9bZubAptkYxYty5M07vzzp17W198cZgtot7W9U+hLZJy6m0tAym0RVJq2za4774w5DG4t/URR4SbjO97XzK1SXIU2iIptGBBeJJxcG/r/faDm25Sb+tGpnvLIiny3HOht/Uxx5QHdktLGCJRb2vRlbZICqxeDV/8YuhtvW3bju3qbS2DKbRFEjRSb+tzzw29rffeO7n6JH0U2iIJ2LoV7rkHrrlGva1ldBTaIjWk3tYyXgptkRpZvDjMCHnyyfLt6m0to6HQFomZeltLlBTaIjFZvz7Mqb7tNvW2lugotEUipt7WEqeKD9eY2SQz6zazF8xshZldVovCRLLGHR58EGbMCJ32BgZ2W1t4yvHHP1Zgy/hU80TkZOAx4O1AG/A5M9PMUZEBFi6EI4+E004rX4ygtRVKJfjNb7QYgUSjYmi7+zp3/6EHa4FVwB7xlyaSfsuWwRlnwGGHlS9GMGVKmL73wgtw9tlajECiM6pvJTM7CJgEPDNoe6eZ9ZpZ75o1a6KsT2QnpVKJQqFAU1MThUKBUqlU8xrWrQuPlx9wQPliBM3NYfuyZWF6nxYjkMi5e1UvoAVYDLx7pOPa2tpcJC49PT2ey+Uc2P7K5XLe09NTk/O/+qr73LnuU6a4h1HsHa/TT3dfurQmZUgdAnq9iiw2HzhxdBhm9g/AI8B17v7oSMe2t7d7b2/vOP8pERlaoVBgxeDm0kA+n2f58uWxnVe9rSVuZrbI3dsrHVdxyp+ZvRH4EVCsFNgicVu5cuWotkdBva0lTaoZ0/534FDgNjNb2vd6W8x1iQyptbV1VNvHQ72tJY2qmT1yvbtPdvd9B7z+WIviRAYrFovkc
rmybblcjmKxGNk5Vq+Gj38c3vEO+MlPdmyfNCn0tV66FC65JNx0FKk1TUSSTOno6KC7u5t8Po+Zkc/n6e7upqOjY9zvvWEDXHcd7LsvdHfvWIzADM47D158MSyiq8UIJElV3YgcDd2IlKxRb2tJg8huRIrUK/W2lixSaEtDUm9rySqFtjQU9baWrFNoS0NQb2upFwptqWvqbS31RqEtdckdHnoozKse2CoVQm/rW25Rq1TJJoW21J2FC2HOnPJWqRB6W994I8ycqVapkl0Kbakby5bBVVeVt0qF8DBMV1dYTUatUiXrFNqSeevWhWl6d94ZxrD7NTfDxReH2SJTpyZXn0iUFNqSWZs2hcZNxWKYHTLQ6aeHoZB99kmmNpG4KLQlc9TbWhqZQlsyRb2tpdHpHrpkgnpbiwS60pZUW70avvhFmDdvR6tUCLNALr00zMNWq1RpJLrSjlkaVg7PIvW2FhmarrRjVCqV6OzsZOPGjQCsWLGCzs5OgEia9tejkXpbH3dcaJeq3tbSyLQIQoySWjk8iyr1tr75ZjjhBI1ZS/2qdhEEDY/EKImVw7No8WI4/ng46aTywJ4+He6+G5Ys0WIEIv0U2jGq5crhWbRqFZx/fmjgNHAxgsmTQ1e+l16CWbO0GIHIQArtGNVi5fAsWr8+9AjZf3/47nd3LEbQ1BRWQV+6NIxpazECkZ0ptGMU58rhWbRlC9xxR5gRctNN5YsRnHwy/P738K1vaTECkZHoRqTETr2tRSrTauySCuptLRIthbbEQr2tReKh0JZIqbe1SLwU2hIJ9bYWqQ2NJsaoVCrR0tKCmWFmtLS01F3vkW3boFSCt789PM04MLCPOAJ+9Su4/34FtkhUdKUdk1KpxIUXXsjmzZu3b1u3bh0f/ehHgfroPaLe1iK1pyvtmHR1dZUFdr8tW7bQ1dWVQEXRUW9rkeRUfaVtZrsCe7v7izHWUzdG6i+S1d4j6m0tkryKV9pm9kYzexj4C3BF/CXVh5H6i2St94h6W4ukRzXDI9uArwOXx1xLXSkWi0ycOHGn7c3NzZnpPbJ1a7iq3m+/cIW9YcOOfcceG4ZGvvMd2Hvv5GoUaTQVQ9vd/+7uTwKv16CeutHR0cG3v/1tpg6YlDx16lTuueee1N+EdIdHH4WDD4bZs8sXIzjoIHjkEXj8cS1GIJKEqnuPmNkFwL+4+6wh9nUCnQCtra1tQzX+l2xYvDjMCBnYKhVCb+svfxkuuECtUkXiUNNFENy9293b3b192rRpUbyl1Fg1va0/9jEFtkjSNE+7wa1fH+ZU33ZbeavUpqYwNHLttWqVKpImCu0GtWUL3HVXuIpeu7Z838knw1e+AgcemExtIjK8iqFtZrsDi4HdgUlmdhQw290XxFybxEC9rUWyrWJou/v/AfvWoBaJ2a9+FW4yDtXb+oYb4Kyz1NtaJO00PNIAli4Nva1/8IPy7eptLZI9Cu06pt7WIvVHoV2H1NtapH4ptOvItm1w331hyGPw801HHBFuMr7vfcnUJiLR0G2nOrFgAbznPXDOOeWBvd9+8OCD8LOfjT6wS6UShUKBpqYmCoVCTRdwSPLcIqnm7pG+2traXGrn2WfdTz7ZPUzm2/FqaXG/4w73zZvH9r49PT2ey+Uc2P7K5XLe09MT7QdI2blFkgL0ehUZW3XvkWq1t7d7b29vpO8pOxupt/Vll8GVV46vVWqhUGCoHjL5fJ7ly5eP/Y1Tfm6RpFTbe0Rj2hmzYQN89aswd255q1QzOPdcuP76aFqlDrdQQy0WcEjy3CJppzHtjBipt/Vxx8HTT0fb23q4hRpqsYBDkucWSTuFdsq5h/7V73rX0L2tH30UfvrT0Ps6SsVikVwuV7Ytl8vVZAGHJM8tknrVDHyP5qUbkdF5+mn3Y4/d+Sbj9Onu8+a5v/56vOfv6enxfD7vZub5fL6mNwKTPLdIEtCNyOxauTI8rdjTE2K63+TJ4Qbj5ZeHX4tI/dCNyAzq7
219663w2ms7tk+YALNmqbe1iCi0U2Gk3tannBJ6Wx9wQDK1iUi6KLQTpN7WIjJaCu2ELFwIc+bs3Ns6nw+9rWfOVG9rEdmZQrvGli0Lva0feKB8+5Qp4ebjJZeot7WIDE+hXSPqbS0iUVBox2yk3tZnnBGGQtTbWkSqpdCOiXpbi0gcFNoxWLAgLKC7aFH59v32C9P3Pvzh0OBJRGS0ND8hQs89F+ZVH3NMeWC3tMAdd8Czz8KppyqwRWTsdKUdgdWrw9OKd98dT29rEZF+Cu1xqFVvaxGRfgrtMdi6Fe65B665prxVKoTe1jffHH2rVBERUGiPijvMnw9XXAHPPFO+76CDQlifcILGrEUkPgrtKi1eHGaEPPlk+fbp08MwyPnnh258IiJxUmhXsGpVeFrxe99Tb2sRSZ5Cexj9va1vuy081dhPva1FJEkK7UHU21pE0kyh3Ue9rUUkCxTaqLe1iGRHVVFkZmeY2Z/MbKmZXRhHIaVSiUKhQFNTE4VCgVKpNOwxZsYuu+yCmW0/dqR9w51vr73+FbP7Oeyw8sCeMiU8MPOHP8DZZ1cO7MG1X3TRRRU/SxRG+poNta+ar3FaZKlWkZqqtFw7sDuwCtgL2BNYDUwb7vi2trZRLx3f09PjuVzOge2vXC7nPT09Ix7T/5o4caI3NzcPuW/w+7i7f/Ob9/suu9zh8JqHgZHwmjDhdb/0Uve1a8dXezU1jNdIX7Oh9g31NYqjrihU8/0gUm+AXq+Qxx6iqmJo/xvQM+D3/wnMHO74sYR2Pp8fMuzy+XzFY6p59b/Pq6+6z53rbra+LKzD6/v+lrccGVntI32WKIz0NRvN1yrquqJQzfeDSL2pNrTNB04+HoKZXQa0uHtX3+/nAq+4+60DjukEOgFaW1vbVgxuIF1BU1MTQ9VhZmzr68A03DFVnoGenq1D9raGnwNzgF+Xna/qd66yrrG891jOa32PY1b7tYq6rihU8/0gUm/MbJG7t1c6rpox7YnAwP9TtgFbBx7g7t3u3u7u7dOmTRtdpUBra2vF7cMdU9lRTJy4mHPOGRzYLwKnAkcCvx7zOar9M2Ovf3Tv19raOqpzRV1XFKr5fhBpVNWE9iuE8ex+byWMcUemWCySy+XKtuVyOYrF4ojH9Js4cSLNzc2Dth4A/AhYwObN79y+taUFzj//v9l113cDDw97vvHUPthY33u05+0/z1D7hvoaxVFXFKr5fhBpWJXGT4A3Ay8DbyLciPwjMHm448cypu0ebj7l83k3M8/n80PedOo/BvAJEyZsH+fsv/kW9r3Zze5yeL1szHrSJPerrnL/29+qP99Ya//kJz8Z2XuP5ryDb9wO3hflZ45blmoViQJRjWkDmNkFwBf6fjvH3R8a7tj29nbv7e0dz78jYzJSb+vzzgsroau3tYikVbVj2lU9XOPu9wL3jrOmWKi3tYg0ksw+Eane1iLSiDL5cPbixXD88XDSSeWBPX06zJsHS5bAiScqsEWk/mTqSnvlytDbuqenvLf1bruFK271thaRepeJ0O7vbX3rrfDaazu2T5gAs2eH3tZvfnNi5YmI1EyqQ3vz5tDb+rrr1NtaRARSGtoj9bZubw+9rd///mRqExFJUupCe+FC+Mxn4Je/LN+ez8ONN8KZZ6q3tYg0rtSE9rJlcNVV8MAD5dv32AO6uuCSS2DSpGRqExFJi1SE9ssvw4EHhjHsfs3NIai7umDq1ORqExFJk1QMNOy1F5x22o7fn3EGPP88fO1rCmwRkYFScaUNYS3Gv/4VikV473uTrkZEJJ1SE9qFAjzxRNJViIikWyqGR0REpDoKbRGRDFFoi4hkiEJbRCRDFNoiIhmi0BYRyRCFtohIhlS1sO+o3tBsDbBilH+sBVhb8ahs0GdJn3r5HFA/n6VePgdE91ny7j6t0kGRh/ZYmFlvNasQZ4E+S/rUy+eA+vks9fI5oPafRcMjIiIZotAWEcmQtIR2d9IFREifJX3q5XNA/
XyWevkcUOPPkooxbRERqU5arrRFRKQKCm0RkQxJNLTNrMnMHjezF83sBTM7Icl6xsvMJprZc2Y2L+laxsvMlpvZ0r7Xz5KuZ6zMbIqZfd/MXjazZWY2MemaRsvMPjfg72KpmW0ys5OSrmuszOxyM3vJzP5kZhcnXc9YmdnVA7LrQzU7b5Jj2mZmwJ7u/oqZnQhcn+W5m2Z2LfAe4H/cfVbC5YyLmS1390LSdYyXmX0XeBEoAm8AXvMM38gxsynAYmB/d3896XpGy8wKwFPADGAS8EfgLe6+IbmqRs/MjgZuAo4iPFzzc2CGu/897nMneqXtwSt9v80Dv02ynvEwswOAdwP3J12LBGa2J3A4cEPf99qmLAd2nw7gB1kM7D5b+v67jbBy1t+BzcMfnlrtwBPu/qq7rwJ+B9RkocTEx7TN7AozWwdcBlyXdD1j0fcTw+3Ap5OuJUKv9g0nLMzwsNUM4E/AD/t+hL2l7+8qyz4GfDvpIsbK3V8GrgUWAk8AZ7n7lhH/UDo9C3zAzHYzs+nAIUDFR9CjkHhou/tcd58KXA08ltH/qT4BPOXuS5MuJCrufoC77wN8FiiZ2R5J1zQGbwIOBD4FHAocAZySaEXjYGZtwCZ3/0PStYyVmb0ROJtwgfM1YI6ZpWat2mq5+yPAfKAX+AbhSntdLc6deGj3c/cHgd2AqUnXMgbnAjPNbAnhp4VTzeyzCdcUCXf/GbAcKCRbyZj8FVjk7n/uGzN9HPjnhGsaj9nAfyRdxDidA/zO3Z9y93v6th2fZEFj5e5fcPe3u/tHgLcCNfnHNOnZI2/rG3fEzA4jXEVkrvOXux/u7u9w94OBa4CH3P3mpOsaKzOb3PcjH2Z2CDAdeCnZqsZkIXCgmb3FzN4AHEe4MsocM5tM+Ckh6/dMNgEHm1mzme0O7A/8b8I1jZqZ7dL3d4KZdQJ/6hvbjl3SP5bsAcw3swmEq6IzE65HghzwX31/L+uBc7J2dx/A3TeY2acIV9hvAO519wUJlzVWZwLzazE7IWY9wDGEWSOvAt9x94XJljQmOWCRme1KmEBxYa1OrMfYRUQyJDVj2iIiUplCW0QkQxTaIiIZotAWEckQhbaISIYotEVEMkShLSKSIQptEZEMUWiLiGTI/wPIVrB6p3RDjAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plot outputs\n", "plt.scatter(data_X_test, data_y_test, color='black')\n", "plt.plot(data_X_test, regr.predict(data_X_test), color='blue', linewidth=3)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:27:36.147084Z", "start_time": "2018-04-29T07:27:36.142088Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "('Coefficients: \\n', array([ 0.68334304]))" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The coefficients\n", "'Coefficients: \\n', regr.coef_" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:27:48.770254Z", "start_time": "2018-04-29T07:27:48.765411Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Residual sum of squares: 0.40'" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The mean square error\n", "\"Residual sum of squares: %.2f\" % np.mean((regr.predict(data_X_test) - data_y_test) ** 2)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:27:56.521151Z", "start_time": "2018-04-29T07:27:56.496715Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "df.click_log = [[np.log(df.click[i]+1)] for i in range(len(df))]\n", "df.reply_log = [[np.log(df.reply[i]+1)] for i in range(len(df))]" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:28:02.712616Z", "start_time": "2018-04-29T07:28:02.701169Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'Variance score: 0.62'" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.cross_validation import 
train_test_split\n", "Xs_train, Xs_test, y_train, y_test = train_test_split(df.click_log, df.reply_log,test_size=0.2, random_state=0)\n", "\n", "# Create linear regression object\n", "regr = linear_model.LinearRegression()\n", "# Train the model using the training sets\n", "regr.fit(Xs_train, y_train)\n", "# Explained variance score: 1 is perfect prediction\n", "'Variance score: %.2f' % regr.score(Xs_test, y_test)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:28:16.645996Z", "start_time": "2018-04-29T07:28:16.549017Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD+CAYAAAAqP/5ZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAHKJJREFUeJzt3XuQXFWdB/DvryfThJ4EQqbDQ3BuG6hEiAqaQQWyJRrWlGhQKUHYFiq1YOtEQQQWlXG3oKzJLouFgmXJtrKATlOIG1FQyRqBUMhLJpKVQBlMyMwY5DGAvDIJkPRv/ziZMD3T99F9b99Xfz9VXZXpe/ue39xJfefOOefeI6oKIiJKvkzUBRARUTAY6EREKcFAJyJKCQY6EVFKMNCJiFKCgU5ElBIMdCKilGCgExGlBAOdiCglZoTZWD6f10KhEGaTRESJt379+udVdZ7bfqEGeqFQwNDQUJhNEhElnoiMeNmPXS5ERCnBQCciSgkGOhFRSjDQiYhSwlegi8jXRWTzpNdOETk5qOKIiMg7X4Guqv+hqkeo6hEAFgP4G4DfBlIZERE1JMgulyKA/1HVXQEek4go0SqVCgqFAjKZDAqFAiqVSsvaCnIe+jkwoV5DREoASgDQ09MTYHNERPFWqVRQKpUwPj4OABgZGUGpVAIAFIvT4tI3CWJNURFZDOAaVT3Bab/e3l7ljUVE1C4KhQJGRqbfE2RZFoaHhz0fR0TWq2qv235Bdbl8HsB1AR2LiCgVRkdHG3rfL9+BLiJdAJYDuMV/OURE6WHXzdyq7ucgrtA/C2CNqr4WwLGIiFJjYGAAuVyu5r1cLoeBgYGWtOc70FX1v1X1nCCKISJKk2KxiHK5DMuyICKwLAvlcrklA6JAQIOiXnFQlIiocWEPihIRUcQY6EREKcFAJyJKCQY6EVFKMNCJiFKCgU5ElBIMdCKilGCgExGlBAOdiCglGOhERCnBQCciSgkGOhFRSjDQiYhSgoFORJQSDHQiopRgoBMRpUQQa4ruLyI3i8hTIrJFRLJBFEZERI0J4gr9ewA2AjgMwCIAbwZwTCIiatAMPx8WkYMBHA9ghZq17HYGUhURETXM7xX6IgBbAawWkU0i8m0Rkck7iEhJRIZEZGhsbMxnc0REZMdvoB8I4CgA5wF4H4ATACyfvIOqllW1V1V7582b57M5IiKy46vLBcBzANar6jYAEJG1ABb6roqIiBrm9wr9QQBHicjbRGQfACcBGPJfFhERNcrXFbqqbheR8wCsBbAPgBtU9e5AKiMioob47XKBqt4B4I4AaiEiI
h94pygRUUow0ImIUoKBTkSUEgx0IqKUYKATEaUEA52IKCUY6EREKcFAJyJKCQY6EVFKMNCJiFKCgU5E1CKqwAUXAKecAvzkJ61vz/ezXIiIqJYq8M1vAqtWvfXe7bcDH/0ocNBBrWuXgU5EFKBVq4D+/unvd3YCuVxr22agExEF4LvfBb76VfvtDz8MzJ7d2hrYh05E5MMPfwiI2If5vfeaLpijj259LbxCJyJqwuAgcNZZ9tvXrgVOOim8egAGOhFRQ1avBj7zGfvtt98OfOIT4dUzme9AF5FhALv2fPm0qv6D32MSEcXNr3/tHNS33AKcdlp49dQTyBW6qh4RxHGIiOLmrruApUvtt994I3D22eHV44SDokREddx3nxnstAvzH/zADHbGJcyBYAJ9h4hsEZEHRWTZ1I0iUhKRIREZGhsbC6A5IqLWGRoyQb5kSf3tV11lgvyLXwy3Li98B7qqHqmqhwP4FwAVEZkzZXtZVXtVtXfevHl+myMiaolHHzVBfuyx9bd/61smyJ3mmkctsC4XVb0XwDCAQlDHJCJqtU2bTJC/5z31t3/jG0C1am7ljztfgS4iXSJyyJ5/vxfAIQD+EkRhRERTVSoVFAoFZDIZFAoFVCqVpo+1dasJ8ne+s/728883Qb5qldkvCfzOcskBuEdEOgC8DOBzqrrdf1lERLUqlQpKpRLGx8cBACMjIyiVSgCAYrHo+TjbtgHveAewa1f97eecA5TLQCaBU0ZEVUNrrLe3V4eGhkJrj4jSo1AoYGRkZNr7lmVheHjY9fPPPgssXAi8/HL97WecYe7+7OjwWWgLiMh6Ve112493ihJRIoyOjjb0/oQXXjD943/7W/3ty5ebuz87O/1WGL0E/lFBRO2op6enofdffhlYsADI5+uH+Uc+AuzcCdx2WzrCHGCgE1FCDAwMIDflgeK5XA4DAwM1723fDrz3vcCcOcBf6kzR+MAHzD533gnss08rKw4fA52IEqFYLKJcLsOyLIgILMtCuVzeOyC6YwdwwgnArFnAhg3TP/+udwGvvAI8+GDrF5qICgOdqE0FOQUwLMViEcPDw6hWqxgeHkaxWMQbbwDLlpmQvv/+6Z+ZPx/4+9/NjUOtXmAiahwUJWpDQU0BjNKuXcDppwO33lp/+4EHAhs3Au10gzqv0CnRkniVGQf9/f17w3zC+Pg4+usthhkz1apZWKKzs36Yd3WZQdBnn22vMAd4hU4JloarzKg0OwUwStWq+xzxkRHAZtJLW+AVOiVWHK4yk/oXQqNTAKOkam69dwrzzZvNfjEsP1QMdEqsqK8yJ/5CGBkZgaru/QshCaHudQpg1EScb8F/7DET5IcfHl5NccZAp8SK+iozDn8hNMttCmDUurudH4h1/fUmyI86KryakoCBTokV9VVm1H8h+FVvCmDUFi40Qf7ii/W3X3ONCfIVK0ItKzEY6JRYUV9lRv0XQposWWKC/Ikn6m+fWFzivPPCrStpGOiUaFFeZUb9F0LQohjgXb7cBPl999XfftFFJsiTsLhELKhqaK/FixcrUZoMDg6qZVkqImpZlg4ODkZdUlMGBwc1l8spgL2vXC7Xsu/n7LNVTVTXf51zTkuaTSwAQ+ohY/k8dCLy/axxry64ALj6avvtp55qHmVLtbw+Dz2QLhcRyYrI4yLyoyCOR0ThavUA72WXma4VuzD/0IfMtTnD3J+g+tAvhVkgmogSqFUDvFddZYL88svrb1+0yAT5unW+mqE9fAe6iBwJ4FgAt/gvh4gmhDlIGfQA73XXmSC/6KL62w8+2AT5xo1NHZ7seOlot3sBEABrARwBYAWAH9XZpwRgCMBQT09PGOMHRIkX9iDlRJt+B3hvvtl5sLOjowWFtwGEMSgqIn0A5qrqgIisALBEVc+125+DokTehDVIGZTf/Ab4+Med96lWne/+JHthLRJ9FoDZInIagLkAukRkk6pe6fO4RG0tKXeh3nMPcOKJzvvs3u38PBYKj
q9AV9XjJ/496QqdYU7kU09PT90r9LjchTo0BBx7rPM+b74JzOADukPF35tEMRTXu1A3bjTdJk5h/vrrpsecYR6+wAJdVW9w6j8nIu+ifk7NVFu2mCB/97vt99m+3QR5NhteXVSLd4oSka2nngIOO8x5n5deAvbfP5x62lVYg6JElEJjY2aRZbd98vlw6iFvGOhEtNfLLwNz5jjvs20bcOih4dRDjeGgKBHhpZdMH7lTmG/ZYvrIGebxxUAnamPj4ybIDzjAfp+NG02Qz58fXl3UHAY6URt6/XUT5F1d9vs8/LAJ8kWLwquL/GGgE7WR3btNkM+cab/Pj39sgrzXdU4FxQ0DnagNqJogd7rZZ2IB5rPOCq8uChZnuRClnNsDsU4/HfjpT8OphVqLgU6UUm5BvmQJcO+94dRC4WCgE6WMW5AXCsDWraGUQiFjoBOlhFuQd3QAu3aFUwtFg4FOlHBeFo0I8ZFNFCEGOlFCMchpKgY6UcIwyMkO56ETJURnp3uYTyzHTO2JgU4Uc4cfboLcaUCTQU6Az0AXkYyIrBWRJ0Rkk4gsC6owona3334myJ980n4fyypAJINCoYBKpRJecRRLfq/QFcDZqroAwFcARLvgIVEKHH20CfJXX7XfZ3CwglyuCyMjI1BVjIyMoFQqMdTbnK9AV+PpPV9aAP7Pf0lE7iqVCgqFAjKZ9FydLl9ugvxPf7Lfp1o1XSv9/f0YHx+v2TY+Po7+/v4WV0lx5nuWi4hcAuBrAMYATOtyEZESgBIA9PT0+G2OCJVKBaVSaW+gTVydAohsEWU/+vqAa6913mfXLnNj0ITR0dG6+9m9T+0hsEWiReRUAKsAHKk2B+Ui0RSEQqGAkZGRae9bloXh4eHwC2rS5ZcDl13mvM/4OLDvvtPfT8s5IG+8LhId2CwXVf05gFkAuoM6JlE9Sb86vfZa07XiFOYvvmi6VuqFOQAMDAwgl8vVvJfL5TAwwGGsduZ3lst8ETl4z7+PA7BTVZ8PpDIiG3Zdd3Hv0lu92gR5X5/9Ptu2mSB3WhIOMF1L5XIZlmVBRGBZFsrlciK7nCg4fvvQ5wBYIyIdAJ4D8Fn/JRE5GxgYqOlDB+J9dbpuHfDhDzvvs2kTsGBBY8ctFosMcKrhd5bLH1V1gaoerqrHqer6oAqj9tHojJWkXJ0+8oi5IncK8z/8wVyRNxrmRHWpamivxYsXK9Fkg4ODmsvlFOaeBgWguVxOBwcHQ2nbsiwVEbUsK7A2N2+euG/T/vXb3wbSFLUJAEPqIWMDm+XiBWe50FRRzdaYOvURMN02fq70n3kGOOQQ531uugk488ymDk9tzOssFwY6RSqTyaDe/0ERQbVabVm7Qf4ieeUVYP/9nfe5+mrg/PMbOizRXqFPWyRqRlQzVoKY+rhjh+kjdwrzSy81nSwMcwoDA50iFcZ86nqDrn5+kezebYJ8Stk1VqwwQR7TiTeUUgx0ilSrZ6xM9JVPfYjVySef3PAvElUT5DMcJvsuXWr2u/76QMonagj70CnVnPrKBwYG0N/fj9HRUfT09GBgYMD2FwlXCaIosQ89hdzma6fxCYSNmnoO6oU5YPrKi8UihoeHUa1WMTw8XDfMRbhKECWIl7mNQb04D715bvO1o5zPHRf1zoGI1Hw98bIsy/FYbvPI9z48migE4Dz0dHGbZsen79mfIxGpmRrpNN+cXSsUR+xySRm3aXZJfwJhEOy+V1V1HXRl1wqlge8FLigcPT09da8+J6bZuW1vB3bnwOmvFF6RU5rwCj0h3OZr8/nYjZ0DXpFTKnnpaA/qxUFRf9weJtWqh00lids54GAnJRE4KEr0FnatUJJxUJRCF/Y8eC/tsWuF2gkHRSkQUx9HO3GLPYCWLDzh1h6vyKkd+epyEZGZAK4B8CEAMwF8V1W/Y7c/u1zSK+x58PZ3gbr/f969G8jwb1NKkLC6XLoA/C+AdwJYDODrIvJ2n8ekBHK6xb4Vph934iZQezt2m
Ktyhjmlld81RV9Q1dV7BmKfB/BXmIWjqY1UKhWITR9Hq+bBv3Vc9yB/4QUT5DNntqQUotgI7FpFRN4F0+2yccr7JREZEpGhsbGxoJqjGOnv77dddahV8+Cff34T3IJ861YT5HPntqQEotgJZNqiiOQBrAVQUtWH7fZjH3o62S0jB8D2/WYtXQrcdZfzPvffDxx3XKDNEkUqtGmLInIAgNsBXOoU5uRPGFMCJ7eRz+eRz+c9tWfXrWJZVmC1n3++mX7oFOY//7m5ImeYU9vycveR3QvAfgDuBfAJL/vzTtHmhPFo3HpteG3PqT6/tV9xhfudnZdeGthpIIoleLxT1G+gfxPAdgCbJ73m2+3PQG+OZVlNPdM7iDa8tmd3y32ztf/sZ+5BfuqpgX37RLHmNdB5638C2PVRiwiq1WpL2/DbXqO1P/QQ8MEPOh9z/nxgy5aGyiBKNN76nyJ+Vqj324bf9rzWvmWL6SN3C3NVhjmRHQZ6yJoZIAzj0bj12nBrz8v34lb7iy+aID/iCOf6gn7eCtdfpVTy0i8T1Kvd+9D9DBCG8WjcyW10d3drd3e346N6vX4v9WrfudO9j7xVj7Ll+quUNGAfevykad3PZr8X9XjrfSv/W6bp50DtwWsfOgM9RGEMboalme8lLk9ATNPPgdoDB0VjKIzBzbA08r3E7Znkafo5EE3GQA9Rmtb99PK9xC3IJ6Tp50BUw0tHe1Cvdh8UVQ1mcLORwctW1mL3+SSs28n1VylJEMadoo2+GOj++blF3+04fmd6JCHIiZLIa6BzUDRh7FfqeYuX2RpBzvSIy2AnUVp5HRTlmqIJ42UFID/7NLLCEIOcKF44KJowQd2i72emR1wHO4naHQM9YZq5Rd/rcdw+yyAnijcGesIUi0WUy2VYlgURQXd3N7q7uyEisCwL5XIZxWKx4eM4fZZBTpQMHBRNqEqlgv7+foyOjqKnpwcDAwOegrwRXvrIq1Vv+xFR80IdFBWRfQG8XVWfCOJ45KxSqaBUKmF8fBwAMDIyglKpBACBhLqXgH79dSCb9d0UEQXIV5eLiOwnIr8A8CyAS4Ipidz09/fvDfMJ4+Pj6O/v93VcL10rL7xgulYY5kTx47cPvQrgewAuDKAW8iiIKYeTeQnyP//ZBPncuU01QUQh8BXoqvqaqt4JYFdA9ZAHQT1cykuQr1ljgnzhwoYOTUQR4CyXBPL7cCkvQX7FFSbIly1rtkoiClvLA11ESiIyJCJDY2NjrW6uLTQy5XCyfN49yE8/3QT5JRwRIUqcQKYtisgKAEtU9Vyn/ThtMRqf+hTwy18673PoocC2beHUQ0SN4QIXhFWrzBW5W5irMsyJ0sDXPHQRmQ3gEQCzAcwUkRMBfF5V7w6gNmrSbbcBn/yk+368s5MoXfzOcnlVVY9Q1YNUdf89/26bMK9UKigUChARzJgxAyKCQqGAlStX1n2/UqkE3nYmk9l77EcfNVfkbmFuWQWIZAKvqdF6vWx3+xwRTeLloelBvdK0wIXbQhP1Xn4XkLBvO+9pcYlWLGrRXL217dpt7+vri6ReorgBF7hoLS8LTdTTzAIS9m13AnjDdf+JH3GQi1o0wq1du+0dHR3YvXu37eeI2oXXQVEGepMymQyaOXcigmq16qttkQzMTbrOppZnV3MQNTlxa7fRc9nqeonihrNcWqzRuzL9fm6CmUfuHGZ2j7IN6g7TRrm1a7e9o6OjoeMRtTsGepPcFpqop5G7Oafycnfn4GDFceaK3ztMm+XWrt32UqkUSb1EieWloz2oV5oGRVXNYJ5lWQpAOzo6FIBalqV9fX11329mMM/LYGcjx56oWUSarqkZbu3abY+qXqI4AQdFk40LMBPRhFAXuKDgMMiJqFkM9JhgkBORXwz0iDHIiSgoDPSIMMiJKGgM9JAxyImoVRjoIWGQE1GrMdBbzEuQV6ve9iMicsI7RVvEy52d4+PmqpxhTkRBYKAHzEuQP/OMCfJ99w2nJ
iJqDwz0gHgJ8g0bTJAfdFA4NRFRe/Ed6CJyuohsFZHNIvLPQRQ1VbOr1tT7nN17s2bNgohMe2UymZp/r1y5subYIhtcgzyXK6K7O49jjpl+/HqvWbNm1a0rk8mgs7OzZt/Zs2dPOy8rV67cu1LSjBkzsHLlymnf98qVK5HP52vazOfzyGQyyOfze//t9Xy7HT+fz9uuUhTEyk7NrIjE1ZAodbw88MXuBbOW6F8BHArgYADPAJhnt38zD+dqdpWdep/r7OzUbDZb8142m1URaWjloaVLl+qMGd/28OCsMxs67uSXiDRcVy6X06VLl9bdlslkmq7F7Xx7Xb0pm806rlLUyM+3kf8j9bZns1nt7Oz01S5RWODx4Vx+A/0zAAYnfX0TgDPs9m8m0CeeWjj1ZVlWU5/z//qChyC/qEVtR/dyOt+NnOuJ47h9xu3n28j/kWbqI4oTr4Hu62mLIvJVAHlV7d/z9X8CeFpVvzNpnxKAEgD09PQsbnTZtmZX2Wl2RSF7HwfwK5d9ygC+EGCb8eF0vhs5115XKWpkVaIgV0TiakgUR2GtWJRF7fI5VQA1i0CqallVe1W1d968eQ030OwqO8GtanMkzMWbU5jfCECQ1jAHnM9nI+fabZUiP8dstq1m2yWKG7+B/jRM//mEw2D61APT7Co79T7X2dmJbDZb8142m4XUHdU8BCbIH3do5W6YIF/hWEszJgYTG5HL5bB06dK62zKZ5n/Ubufb6+pN2WzWcZUir+15ad9tRaRsNovOzk5f7RLFjpd+GbsXgIMAPAXgQJhB0ScBdNnt3+yKRc2uWlPvc3bvdXV17elHneXaR37ggbWrFbm95syZo93d3Z77cbu6uurUZQZKZ8yYUbPvrFmzpp2Xvr6+vSsldXR0aF9f37Tvu6+vr6amrq4u7e7uVhHR7u7uvf/2er7djt/d3W27StFEnUDzKzs1syISV0OipEBYKxaJyAoA/7rny4tV9Va7feO8YtGuXcCUC7a6Au2WJyLyILQVi1T1BgA3+D1OVFQBL70RDHIiiru2fjgXn4BIRGnSloHOICeiNGqrZ7ksXOgc5u9//1tDn0RESdMWgX788SbIn3ii/vZly0yIP/RQuHUREQUp1YFeLpsgf+CB+tuPPtoE+Zo14dZFRNQKqQz0W281Qf4Fmxs3zz3XBPmGDeHWRUTUSqkaFF2zBvjYx+y3n3oqsHp1ePUQEYUpFVfo69aZK3K7ML/wQnNFzjAnojRL9BX6Aw+YAU873/8+sGc9CiKi1EvkFfof/2iuyO3C/MorzRU5w5yI2kmiAv2xx0yQL15cf/vll5sgv/jicOsiIoqDRHS5qAKFAjA6Wn/7174G/Pu/e7sDlIgorRIR6DfdVD/Mv/xl4JprGOREREBCAv2AA2q/XrECuO46b09JJCJqF4kI9JNPBn7/e+C554BTTgE6OqKuiIgofhIR6ABwwglRV0BEFG++Oy1E5OggCiEiIn+aDnQRuUhEtgBYH2A9RETUJD9X6EMA3h9UIURE5E/Tfeiqeg8ACOcMEhHFQssn/olISUSGRGRobGys1c0REbUt10AXkf+aCORJL88DoapaVtVeVe2dN2+ev2qJiMiWa5eLqtosE0FERHEi6nNFZBHZpaqe+uJFZAzAiK8G0yUP4Pmoi4gRno9aPB+12vl8WKrq2sXR9KCoiPwAwD8C6BCRzQDuUNXznD7jpaB2IiJDqtobdR1xwfNRi+ejFs+HOz+zXPqCLISIiPzh462IiFKCgR6tctQFxAzPRy2ej1o8Hy58D4oSEVE88AqdiCglGOhEFFsisq+ILIi6jqRgoEdERLIi8riI/CjqWuJARPYXkZtF5CkR2SIi2ahripKIXCgifxGRrSLypajrCZuI7CcivwDwLIBLJr3/FREZFZFNIvKx6CqMp8QscJFClwIYjrqIGPkegI0AzgSwD4A3oy0nOiJSAHA+gEUAZgJ4UkRuUNXtUdYVsirM/4lfAfggA
IjI4QC+BHNe3g7gdyJiqWrb/l+ZilfoERCRIwEcC+CWqGuJAxE5GMDxAFapsVPbe7R+IqCqMBddrwF4I7pywqeqr6nqnQB2TXr70wBuUdVXVfVxmAuixVHUF1cM9JCJed7wNQC+EnUtMbIIwFYAq/f8Kf1taePnMqvqUwAuA/AggN8BOJNXoQDMVfnkR4dsA3BIRLXEEgM9fF8EsE5VN0ddSIwcCOAoAOcBeB+AEwAsj7SiCInIfgD+CeaX/lUALhYRdo8CWZi/WiZUAeyOqJZY4n+S8J0FYLaInAZgLoAuEdmkqldGXFeUngOwXlW3AYCIrAWwMNqSIvU5AH9S1XUA1onIp2Gem3RHpFVF72kAh076+jAAf42olljiFXrIVPV4VX23qh4D4N8A3NrmYQ6YroWjRORtIrIPgJNgljhsVzsBHCMinSIyG8ACAH+PuKY4+DWAM0Qkt2ccai6ADRHXFCu8QqfIqep2ETkPwFqYGS43qOrdEZcVpUEAHwHwJIAdAG5U1QejLSlce36RPQJgNoCZInIigM/DnJvHYH7pndvmg+fT8NZ/IqKUYJcLEVFKMNCJiFKCgU5ElBIMdCKilGCgExGlBAOdiCglGOhERCnBQCciSgkGOhFRSvw/1Z5KVW1nE4AAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plot outputs\n", "plt.scatter(Xs_test, y_test, color='black')\n", "plt.plot(Xs_test, regr.predict(Xs_test), color='blue', linewidth=3)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:28:41.441426Z", "start_time": "2018-04-29T07:28:41.428476Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "-0.68370073919430563" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.cross_validation import cross_val_score\n", "\n", "regr = linear_model.LinearRegression()\n", "scores = cross_val_score(regr, df.click_log, \\\n", " df.reply_log, cv = 3)\n", "scores.mean() " ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:29:00.237224Z", "start_time": "2018-04-29T07:29:00.220565Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "-0.71881497228209845" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "regr = linear_model.LinearRegression()\n", "scores = cross_val_score(regr, df.click_log, \n", " df.reply_log, cv =5)\n", "scores.mean() " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> # 使用sklearn做logistic回归\n", "***\n", "\n", "王成军\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- logistic回归是一个分类算法而不是一个回归算法。\n", "- 可根据已知的一系列因变量估计离散数值(比方说二进制数值 0 或 1 ,是或否,真或假)。\n", "- 简单来说,它通过将数据拟合进一个逻辑函数(logistic function)来预估一个事件出现的概率。\n", "- 因此,它也被叫做逻辑回归。因为它预估的是概率,所以它的输出值大小在 0 和 1 之间(正如所预计的一样)。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "$$odds= \\frac{p}{1-p} = 
\\frac{probability\\: of\\: event\\: occurrence} {probability \\:of \\:event\\: not\\: occurring}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$ln(odds)= ln(\\frac{p}{1-p})$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$logit(p) = ln(\\frac{p}{1-p}) = b_0+b_1X_1+b_2X_2+b_3X_3+\\cdots+b_kX_k$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/logistic.jpg)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:46:50.277195Z", "start_time": "2018-04-29T07:46:50.272229Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Flag titles containing '转载' (meaning reposted) as 1, all others as 0\n", "repost = []\n", "for i in df.title:\n", "    if u'转载' in i:\n", "        repost.append(1)\n", "    else:\n", "        repost.append(0)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:47:06.292994Z", "start_time": "2018-04-29T07:47:06.270715Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[[194675, 2703], [88244, 1041], [82779, 625]]" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Each row of data_X is [click count, reply count] for one post\n", "data_X = [[df.click[i], df.reply[i]] for i in range(len(df))]\n", "data_X[:3]" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:47:45.269303Z", "start_time": "2018-04-29T07:47:45.259792Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.61241970021413272" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.linear_model import LogisticRegression\n", "df['repost'] = repost\n", "model = LogisticRegression()\n", "model.fit(data_X, df.repost)\n", "# Mean accuracy on the training data\n", "model.score(data_X, df.repost)" ] }, { "cell_type": "code", "execution_count": 53, 
"metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:47:59.648431Z", "start_time": "2018-04-29T07:47:59.633936Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def randomSplitLogistic(dataX, dataY, num):\n", " dataX_train = []\n", " dataX_test = []\n", " dataY_train = []\n", " dataY_test = []\n", " import random\n", " test_index = random.sample(range(len(df)), num)\n", " for k in range(len(dataX)):\n", " if k in test_index:\n", " dataX_test.append(dataX[k])\n", " dataY_test.append(dataY[k])\n", " else:\n", " dataX_train.append(dataX[k])\n", " dataY_train.append(dataY[k])\n", " return dataX_train, dataX_test, dataY_train, dataY_test, " ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:48:27.726443Z", "start_time": "2018-04-29T07:48:27.710922Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'Variance score: 0.45'" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Split the data into training/testing sets\n", "data_X_train, data_X_test, data_y_train, data_y_test = randomSplitLogistic(data_X, df.repost, 20)\n", "# Create logistic regression object\n", "log_regr = LogisticRegression()\n", "# Train the model using the training sets\n", "log_regr.fit(data_X_train, data_y_train)\n", "# Explained variance score: 1 is perfect prediction\n", "'Variance score: %.2f' % log_regr.score(data_X_test, data_y_test)" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:48:56.873331Z", "start_time": "2018-04-29T07:48:56.870219Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "y_true, y_pred = data_y_test, log_regr.predict(data_X_test)\n" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:39:12.344043Z", "start_time": "2018-04-29T07:39:12.338223Z" }, "slideshow": { 
"slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "([1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],\n", " array([0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]))" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_true, y_pred" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:39:13.175680Z", "start_time": "2018-04-29T07:39:13.171386Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " precision recall f1-score support\n", "\n", " 0 0.50 0.17 0.25 6\n", " 1 0.72 0.93 0.81 14\n", "\n", "avg / total 0.66 0.70 0.64 20\n", "\n" ] } ], "source": [ "print(classification_report(y_true, y_pred))" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:51:43.039620Z", "start_time": "2018-04-29T07:51:43.034812Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from sklearn.cross_validation import train_test_split\n", "Xs_train, Xs_test, y_train, y_test = train_test_split(data_X, df.repost, test_size=0.2, random_state=42)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:51:47.690742Z", "start_time": "2018-04-29T07:51:47.683127Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'Variance score: 0.60'" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create logistic regression object\n", "log_regr = LogisticRegression()\n", "# Train the model using the training sets\n", "log_regr.fit(Xs_train, y_train)\n", "# Explained variance score: 1 is perfect prediction\n", "'Variance score: %.2f' % log_regr.score(Xs_test, y_test)" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "ExecuteTime": { "end_time": 
"2018-04-29T07:51:55.780061Z", "start_time": "2018-04-29T07:51:55.771924Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Logistic score for test set: 0.595745\n", "Logistic score for training set: 0.613941\n", " precision recall f1-score support\n", "\n", " 0 1.00 0.03 0.05 39\n", " 1 0.59 1.00 0.74 55\n", "\n", "avg / total 0.76 0.60 0.46 94\n", "\n" ] } ], "source": [ "print('Logistic score for test set: %f' % log_regr.score(Xs_test, y_test))\n", "print('Logistic score for training set: %f' % log_regr.score(Xs_train, y_train))\n", "y_true, y_pred = y_test, log_regr.predict(Xs_test)\n", "print(classification_report(y_true, y_pred))" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:52:53.880925Z", "start_time": "2018-04-29T07:52:53.866672Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.53333333333333333" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logre = LogisticRegression()\n", "scores = cross_val_score(logre, data_X, df.repost, cv = 3)\n", "scores.mean() " ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T07:53:26.825100Z", "start_time": "2018-04-29T07:53:26.810871Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.62948717948717947" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logre = LogisticRegression()\n", "data_X_scale = scale(data_X)\n", "# The importance of preprocessing in data science and the machine learning pipeline I: \n", "scores = cross_val_score(logre, data_X_scale, df.repost, cv = 3)\n", "scores.mean() " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> # 使用sklearn实现贝叶斯预测\n", "***\n", "\n", "王成军\n", "\n", 
"wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Naive Bayes algorithm\n", "\n", "It is a classification technique based on Bayes’ Theorem with an assumption of independence among predictors. \n", "\n", "In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. \n", "\n", "why it is known as ‘Naive’? For example, a fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or upon the existence of the other features, all of these properties independently contribute to the probability that this fruit is an apple." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "贝叶斯定理为使用$p(c)$, $p(x)$, $p(x|c)$ 计算后验概率$P(c|x)$提供了方法:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$\n", "p(c|x) = \\frac{p(x|c) p(c)}{p(x)}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).\n", "- P(c) is the prior probability of class.\n", "- P(x|c) is the likelihood which is the probability of predictor given class.\n", "- P(x) is the prior probability of predictor." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/Bayes_41.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Step 1: Convert the data set into a frequency table.\n", "\n", "Step 2: Create a likelihood table by computing probabilities such as:\n", "- p(overcast) = 0.29, p(rainy) = 0.36, p(sunny) = 0.36\n", "- p(playing) = 0.64, p(rest) = 0.36\n", "\n", "Step 3: Use the Naive Bayes equation to calculate the posterior probability of each class. The class with the highest posterior probability is the prediction." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Problem: Players will play if the weather is sunny. Is this statement correct?\n", "\n", "We can answer this with the posterior probability method discussed above.\n", "\n", "$P(Yes | Sunny) = \\frac{P(Sunny | Yes) * P(Yes)}{P(Sunny)}$\n", "\n", "Here we have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.\n", "\n", "Now, $P(Yes | Sunny) = \\frac{(3/9) * (9/14)}{5/14} = 0.60$, which is higher than $P(No | Sunny) = 1 - 0.60 = 0.40$, so the model predicts that players will play." 
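, "\n", "\n", "As a check, the complementary posterior can be worked out the same way. This is a sketch that assumes the counts implied by the figures above: 5 sunny days out of 14, 3 of them with play, so 2 sunny days fall among the 5 days without play.\n", "\n", "$$P(No | Sunny) = \\frac{P(Sunny | No) * P(No)}{P(Sunny)} = \\frac{(2/5) * (5/14)}{5/14} = 0.40$$\n", "\n", "The two posteriors sum to 1, and $P(Yes | Sunny) = 0.60$ is the larger of the two."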
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "'ABCMeta BaseDiscreteNB BaseEstimator BaseNB BernoulliNB ClassifierMixin GaussianNB LabelBinarizer MultinomialNB __all__ __builtins__ __doc__ __file__ __name__ __package__ _check_partial_fit_first_call abstractmethod binarize check_X_y check_array check_is_fitted in1d issparse label_binarize logsumexp np safe_sparse_dot six'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn import naive_bayes\n", "' '.join(dir(naive_bayes)) " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- naive_bayes.GaussianNB\tGaussian Naive Bayes (GaussianNB)\n", "- naive_bayes.MultinomialNB([alpha, ...])\tNaive Bayes classifier for multinomial models\n", "- naive_bayes.BernoulliNB([alpha, binarize, ...])\tNaive Bayes classifier for multivariate Bernoulli models." 
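, "\n", "\n", "As a rough guide, stated here from the standard definitions rather than from this notebook: the three variants share the naive independence assumption and differ only in the form assumed for $p(x_j|c)$. For `GaussianNB`, each continuous feature $x_j$ is modelled within class $c$ as normal, with mean $\\mu_{jc}$ and variance $\\sigma_{jc}^2$ (symbols introduced here for illustration):\n", "\n", "$$p(c|x_1, \\dots, x_n) \\propto p(c) \\prod_{j=1}^{n} p(x_j|c), \\qquad p(x_j|c) = \\frac{1}{\\sqrt{2\\pi\\sigma_{jc}^2}} \\exp\\left(-\\frac{(x_j-\\mu_{jc})^2}{2\\sigma_{jc}^2}\\right)$$"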
] }, { "cell_type": "code", "execution_count": 61, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:02:37.644606Z", "start_time": "2018-04-29T08:02:37.635952Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Import the Gaussian Naive Bayes model\n", "from sklearn.naive_bayes import GaussianNB\n", "import numpy as np\n", "\n", "# Assign predictor and target variables\n", "x = np.array([[-3,7],[1,5], [1,2], [-2,0], [2,3], [-4,0], [-1,1], [1,1], [-2,2], [2,7], [-4,1], [-2,7]])\n", "Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:02:52.828101Z", "start_time": "2018-04-29T08:02:52.818463Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([4, 3])" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a Gaussian classifier\n", "model = GaussianNB()\n", "\n", "# Train the model on the first 8 samples\n", "model.fit(x[:8], Y[:8])\n", "\n", "# Predict the class of two new points\n", "predicted = model.predict([[1,2],[3,4]])\n", "predicted" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Cross-validation\n", "\n", "In k-fold CV, the training set is split into k smaller sets. The following procedure is repeated for each of the k “folds”:\n", "- a model is trained using k-1 of the folds as training data;\n", "- the resulting model is validated on the remaining fold (i.e., that fold is used as a test set to compute a performance measure such as accuracy).\n", "\n", "The performance reported by k-fold CV is then the average of the k values computed in the loop." 
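, "\n", "\n", "The loop described above can be sketched by hand, as an illustration under stated assumptions rather than the notebook's own procedure. It assumes scikit-learn 0.18 or later, where `KFold` is available in `sklearn.model_selection`; the names `X_demo`, `y_demo`, `fold_scores` and `mean_score` are hypothetical, and the toy arrays repeat the twelve points from the Gaussian Naive Bayes example." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [
"import numpy as np\n",
"from sklearn.model_selection import KFold\n",
"from sklearn.naive_bayes import GaussianNB\n",
"\n",
"# Toy data (hypothetical names): the twelve points from the GaussianNB example\n",
"X_demo = np.array([[-3, 7], [1, 5], [1, 2], [-2, 0], [2, 3], [-4, 0],\n",
"                   [-1, 1], [1, 1], [-2, 2], [2, 7], [-4, 1], [-2, 7]])\n",
"y_demo = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])\n",
"\n",
"# Split the samples into k = 3 shuffled folds\n",
"kf = KFold(n_splits=3, shuffle=True, random_state=0)\n",
"fold_scores = []\n",
"for train_idx, test_idx in kf.split(X_demo):\n",
"    clf = GaussianNB()\n",
"    # Train on the k-1 folds that are not held out\n",
"    clf.fit(X_demo[train_idx], y_demo[train_idx])\n",
"    # Validate on the remaining fold (mean accuracy for a classifier)\n",
"    fold_scores.append(clf.score(X_demo[test_idx], y_demo[test_idx]))\n",
"\n",
"# The cross-validated score is the mean over the k folds\n",
"mean_score = np.mean(fold_scores)\n",
"mean_score"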
] }, { "cell_type": "code", "execution_count": 63, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:04:04.297675Z", "start_time": "2018-04-29T08:04:04.273413Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([41, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0])" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_X_train, data_X_test, data_y_train, data_y_test = randomSplit(df.click, df.reply, 20)\n", "# Train the model using the training sets \n", "model.fit(data_X_train, data_y_train)\n", "\n", "#Predict Output \n", "predicted= model.predict(data_X_test)\n", "predicted" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:04:34.184513Z", "start_time": "2018-04-29T08:04:34.178511Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.65000000000000002" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.score(data_X_test, data_y_test)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:05:04.297453Z", "start_time": "2018-04-29T08:05:04.249311Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/datalab/Applications/anaconda/lib/python3.5/site-packages/sklearn/cross_validation.py:516: Warning: The least populated class in y has only 1 members, which is too few. 
The minimum number of labels for any class cannot be less than n_folds=7.\n", " % (min_labels, self.n_folds)), Warning)\n" ] }, { "data": { "text/plain": [ "0.53413410073295453" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection\n", "from sklearn.model_selection import cross_val_score\n", "\n", "model = GaussianNB()\n", "scores = cross_val_score(model, [[c] for c in df.click],\n", "                         df.reply, cv=7)\n", "scores.mean()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> # Decision trees with sklearn\n", "***\n", "\n", "王成军\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Decision tree\n", "- This supervised learning algorithm is mostly used for classification problems.\n", "- It works for both categorical and continuous dependent variables.\n", "- In this algorithm, we split the population into two or more homogeneous groups.\n", "- The split is made on the most significant attributes (independent variables), so that the groups are as distinct from each other as possible.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/tree.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/playtree.jpg)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## In the figure above, the population is split into four groups based on several attributes, in order to decide “whether they will play”.\n", "### Several criteria can be used to split the population into groups, such as Gini, information gain, chi-square and entropy." ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:10:20.871345Z", "start_time": "2018-04-29T08:10:20.855125Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from sklearn import tree\n", "model = tree.DecisionTreeClassifier(criterion='gini')" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:10:49.988277Z", "start_time": "2018-04-29T08:10:49.973060Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { 
"text/plain": [ "0.91275167785234901" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_X_train, data_X_test, data_y_train, data_y_test = randomSplitLogistic(data_X, df.repost, 20)\n", "model.fit(data_X_train,data_y_train)\n", "model.score(data_X_train,data_y_train)" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:11:12.730866Z", "start_time": "2018-04-29T08:11:12.725782Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Predict\n", "model.predict(data_X_test)" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:11:28.411441Z", "start_time": "2018-04-29T08:11:28.397481Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.33461538461538459" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# crossvalidation\n", "scores = cross_val_score(model, data_X, df.repost, cv = 3)\n", "scores.mean() " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> # 使用sklearn实现SVM支持向量机\n", "***\n", "\n", "王成军\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/svm.jpg)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- 将每个数据在N维空间中用点标出(N是你所有的特征总数),每个特征的值是一个坐标的值。\n", " - 举个例子,如果我们只有身高和头发长度两个特征,我们会在二维空间中标出这两个变量,每个点有两个坐标(这些坐标叫做支持向量)。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/xyplot.png)" ] }, { "cell_type": 
"markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- 现在,我们会找到将两组不同数据分开的一条直线。\n", " - 两个分组中距离最近的两个点到这条线的距离同时最优化。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![](./img/sumintro.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 上面示例中的黑线将数据分类优化成两个小组\n", "- 两组中距离最近的点(图中A、B点)到达黑线的距离满足最优条件。\n", " - 这条直线就是我们的分割线。接下来,测试数据落到直线的哪一边,我们就将它分到哪一类去。" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:17:29.788250Z", "start_time": "2018-04-29T08:17:29.785022Z" } }, "outputs": [], "source": [ "from sklearn import svm\n", "# Create SVM classification object \n", "model=svm.SVC() " ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:17:31.035310Z", "start_time": "2018-04-29T08:17:31.030713Z" } }, "outputs": [ { "data": { "text/plain": [ "'LinearSVC LinearSVR NuSVC NuSVR OneClassSVM SVC SVR __all__ __builtins__ __cached__ __doc__ __file__ __loader__ __name__ __package__ __path__ __spec__ base bounds classes l1_min_c liblinear libsvm libsvm_sparse'" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "' '.join(dir(svm))" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:17:41.872379Z", "start_time": "2018-04-29T08:17:41.849759Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0.90380313199105144" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_X_train, data_X_test, data_y_train, data_y_test = randomSplitLogistic(data_X, df.repost, 20)\n", "model.fit(data_X_train,data_y_train)\n", "model.score(data_X_train,data_y_train)" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "ExecuteTime": { "end_time": 
"2018-04-29T08:17:47.661313Z", "start_time": "2018-04-29T08:17:47.655841Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1])" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Predict\n", "model.predict(data_X_test)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:18:00.419986Z", "start_time": "2018-04-29T08:17:58.671257Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# crossvalidation\n", "scores = []\n", "cvs = [3, 5, 10, 25, 50, 75, 100]\n", "for i in cvs:\n", " score = cross_val_score(model, data_X, df.repost,\n", " cv = i)\n", " scores.append(score.mean() ) # Try to tune cv\n", " " ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:18:05.493658Z", "start_time": "2018-04-29T08:18:05.359658Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAZQAAAEXCAYAAACK4bLWAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3X2YVWW9//H3BxUVRU0BAeWh9Ij5kA9MSWhCYoDaycfQItOwsCxPKmoeOVbWhT9NOZdHT9ZFppiSZaAdFRVUBBUfckjUk6Wh8ijgIJopKjDz/f1x7zlsxhmYPew9a8/en9d1zTWse62157tA/LDue637VkRgZma2pTplXYCZmVUGB4qZmRWFA8XMzIrCgWJmZkXhQDEzs6JwoJiZWVE4UMzMrCgcKGZmVhQOFDMzK4qtsy6gPXXr1i369++fdRlmZh3KvHnzVkVE980dV1WB0r9/f2pra7Muw8ysQ5G0qDXHucvLzMyKwoFiZmZFUTaBImmUpNckLZA0ppn9P5a0RNJCSYNzbUMkvZw7b3z7V21mZo3KYgxFUldgIjAIqAfmS7onIupy+8cANcA+wAfAtpIE3AicDLwCPCtpekTMz+IazMyqXbncoYwA5kTEsohYAcwChuXtPx84LyLej+QDYCCwMiKej4j3gKnAyHav3MzMgPIJlD5A/lMES4FeAJK2AXoCYyS9JOkuSbtt6px8ksZKqpVUW1dXV7ILMDMrR1OmQP/+0KlT+j5lSul+VrkESmegIW+7gdT1BdAN+BjwCLAvsBgYv5lz/k9ETIqImoio6d59s49Rm5lVjClTYOxYWLQIItL3sWNLFyrlEijLgT3ytvcEluR+vQp4NyIejLRe8f8AAzZzjplZ1bv0UlizZuO2NWtgfIkeYSqXQJkBjJDUQ1JPYDAwEyAi1gFPS2ocH/ki8AzwFDBA0gBJOwAnAXe2f+lmZuVjzRqYPh2++11YvLj5Y1pq31Jl8ZRXRKzMPfb7ZK5pHDBc0l4RcQ3wHeBWSf9NCpPLImKtpLOAe0jdX1dFRKve5jQzqyQLF6YQue8+mDULPvgAunSB7beH99//6PF9+5amjrIIFICImAxMbmHfq8DhzbQ/QHqU2MysaqxbB3PnpgCZPh1efDG177VXGiM57jg48kiYNi1t53d7dekCEyaUpq6yCRQzM2vZypXwwAMpQGbOhH/8A7bZJgXHN7+ZQmSfJv+8Hj06fR8/PnVz9e2bwqSxvdgcKGZmZaihAebN23AX8swzqb1XLzjllBQgRx8NXbtu+nNGjy5dgDTlQDEzKxP/+Ac8+GAKkPvvT3clEhx2GPz0pylEDj44tZUjB4qZWUYi4G9/SwEyfTo8/jisXw+77AIjR6YAGTkSunXLutLWcaCYmbWj99+H2bM3hMjChan9wAPhwgtTiAwaBFt3wP87d8CSzcw6lsWLNwTIrFkpVLp0gWHD4JJL4NhjoU+frKvccg4UM7MiW78ennhiQ4j85S+p/ROf2PBE1pAhsN122dZZbA4UM7MiqKtLA+nTp8OMGWmAfeut02O9Y8aku5ABA8p3QL0YHChmZm3Q0ADPPrvhLuSZZ9Ige8+ecPLJGx7r3WmnrCttPw4UM7NWeuedjR/rXbEi3XF85jNw+eUbHuvtVC6zJLYzB4qZWQsi4KWXNsyT9dhjadqTXXaBESM2PNbrlTESB4qZWZ4PPtjwWO9998Grr6b2Aw6ACy5IIfLZz3bMx3pLzb8lZlb1lizZECAPP5wmU9x++/RY70UXwTHHQL9+WVdZ/hwoZlZ11q+HJ5/cME/WCy+k9o9/fMMTWUOHplCx1nOgmFlVWLUqDaTfd196rPett1K31ec+B9dck0Jk330r+7HeUnOgmFlFikiP9TbehTz9dGrbfXc44YQUIF/4Auy8c9aVVo6yCRRJo4CrgHrgioi4KW/fZOALQOPaY0dFxGJJs4H+wPpc+4CIqG+vms2svPz
zn/DQQxvGQ5YvT3ccn/40/OhHaUD90EOr97HeUiuLQJHUFZgIDCIFynxJ90REXd5hoyNidjOnD42IhaWv0syyMmVK84tERcDLL2+4C3n00fRY7847p8d6jz02Daj36JH1FVSHsggUYAQwJyKWAUiaBQwDfpdpVWaWuSlTNl7GdtEiOOssuPVWWLAAXnklte+/P5x/fgqRwYPTaobWvsolUPoAi/K2lwK98rbXAbdIehe4KSIm5to/BB6R9DYwMSJua/rBksYCYwH69u1bitrNrMg+/DB1V73+Opx33sZrojfunzEjdWFdcEEKkf79MynV8pRLoHQGGvK2G0hdXwBExLcAJPUBHpT0XEQ8FBEjcu37AQ9LeiYiXsr/4IiYBEwCqKmpidJehpltytq1abqS11/fEBjNfa1evfnPkuDee0tfs7VeuQTKcmBo3vaewNNND4qIJZLuBQ4AHsprf1HSXOCTwEtNzzOz0lq/Pi1X2zQYmoZGXd1Hz91qq7ROeu/esPfeaXbe3r03tI0Zkz6nKXc4lJ9yCZQZwP+T1APoBAwGzm7cKWnviFggaTdgZOO+vPZ+wGHAhe1fulnlqq+HN97Y/B3FG2+kAfJ8nTqlmXd7907/8x80KP266Ve3bpt+6urqqzceQ4G0ONWECaW5Zmu7sgiUiFgpaTzwZK5pHDBc0l4RcQ1wXa5b60Pg+oiYmzvuLkk7AmuAC/y0l1nrNDSkF/2aC4f84FixIh2bT0pPTTUGQk3Nhl833lX07p2O2WqrLa919Oj0vbmnvKy8KJr+s6KC1dTURG1tbdZlmJVMBLz5ZstdTvlBsX79R8/v3v2jwdD0q0cPP0FVbSTNi4iazR1XFncoZtWipfcpNicC3n5783cUy5enge+mdt11QyDst1/zdxQ9e0LnzsW/ZqseDhSzdtLc+xRjx8L778Phh2/6jmL58jStelO77LIhGBoHs5t+9exZeWuXW3lyl5dZO+nfP4VIa3Tt2nw45N9R9OqVBqfNSs1dXmZlZvHilvf99rcbB8WOO7ZfXWbF4kAxawdPP50eja1vZurSfv3gK19p/5rMis1zbpqV2E03pfGNj33so2MZfp/CKokDxaxE1q6Fc85JExkOGQIvvQQ33pjuSKT0fdIkv09hlcNdXmYlsGIFfPnL8PjjaU3yK65IqwOOHu0AscrlQDErsj/9CU46KS0x+7vfwamnZl2RWftwl5dZEd10U1qjfJtt4IknHCZWXRwoZkWwbh1873tpvOTII6G2Fg46KOuqzNqXA8VsC61cCcOGwc9/nsZL7r8fdtst66rM2p/HUMy2QON4yerVcPvtcNppWVdklh3foZi10c03p+6txvESh4lVOweKWYHWrYNzz00rCR5xRBovOfjgrKsyy54DxawAjeMl//3fcOGF8MADHi8xa1Q2gSJplKTXJC2QNKbJvsmSluX2LZDUN9c+RNLLufPGZ1O5VYtnnkmrE9bWpqnor746vaxoZklZ/HWQ1BWYCAwC6oH5ku6JiLq8w0ZHxOy8cwTcCJwMvAI8K2l6RMxvv8qtWtxyC5x9dlpb5Ikn3MVl1pxyuUMZAcyJiGURsQKYBQzbzDkDgZUR8XxEvAdMBUaWuE6rMuvWwb/9G5x5ZloEy+MlZi0rl0DpA+QvPbQU6JW3vQ64RdJfJI1r5TkASBorqVZSbV1dXdPdZi164w04+mi4/nq44AKYMQO6dcu6KrPyVRZdXkBnoCFvu4HU9QVARHwLQFIf4EFJz23unLxzJwGTIK3YWPTKrSLV1qb3S1atSuMlX/1q1hWZlb9yuUNZDuyRt70nsKTpQRGxBLgXOKC155gV6je/SY8Dd+oEc+c6TMxaq1wCZQYwQlIPST2BwcDMxp2S9s593400TvIM8BQwQNIASTsAJwF3tnvlVjHWrYPvfx/OOGPDeMkhh2RdlVnHURZdXhGxMvfY75O5pnHAcEl7RcQ1wHWS9gM+BK6PiLkAks4C7iF1f10
VEYua+XizzXrjDRg1CubMgfPPh5/9zI8EmxWqbP7KRMRkYHIL+45tof0BYJ/SVWXVYN48OPFEqKuD227zAlhmbVUuXV5mmfjNb1L3lpTGSxwmZm3nQLGqtG4dnHdeGi8ZPDiNlxx6aNZVmXVsZdPlZdZe6urSeMns2SlUPIWKWXH4r5FVlT//OY2XvPFG6u46/fSsKzKrHO7ysqpx221pvATSeInDxKy4HChW8davT48Cn346DBrk8RKzUnGgWEWrq4Phw+Haa9NLizNnQvfuWVdlVpk8hmIVy+MlZu3LdyhWkaZMSeMlEfD44w4Ts/bgQLGKsn59mmr+a1+Dww5L4yUDB2ZdlVl1cJeXVYxVq+DUU2HWrLQo1jXXwDbbZF2VWfVwoFhFePbZNF6yYkVarvfrX8+6IrPq4y4v6/B++9s0XlJfn8ZLHCZm2XCgWIe1fj2MG5cmdPz0p9OswTU1WVdlVr3c5WUd0qpVcNpp8PDDcO65MHGix0vMsuZAsQ5n/nw44YQ0XnLzzXDmmVlXZGZQRl1ekkZJek3SAkljWjjm15IW5G3PlrQwd84CSVu1X8WWhdtvT9PN19fDY485TMzKSVncoUjqCkwEBgH1wHxJ90REXd4xnwd6NnP60IhY2C6FWmbWr4dLLkldW5/7HPzhD7D77llXZWb5yuUOZQQwJyKWRcQKYBYwrHGnpO2AnwKXZVSfZejNN2HkyBQm3/teGjdxmJiVnzbdoUjaFzgMELA4ImZtYR19gEV520uBXnnbPwR+Aaxuct6HwCOS3gYmRsRtzdQ6FhgL0Ldv3y0s09rbc8+l8ZLlyz1eYlbuCgoUSZ2AXwONT/oLaGj8HEmKiGhDHZ1zn9OogdT1haQDgYMi4lJJ/fNPiogRuWP2Ax6W9ExEvNTkmEnAJICampq21GYZ+d3vYMwY2HXXNF7y6U9nXZGZbUqhXV4/AM4AngS+DUwjhUqjwblB8s8X+LnLgT3ytvcEluR+fQawt6T5wH1AH0m/zz85Il4E5gKfLPDnWhlavx4uugi+8pX0Xsm8eQ4Ts46g0ED5BvASMCQifgX8b/7OiJgLrAdGFfi5M4ARknpI6gkMBmbmPvPCiBgQEQcDxwJLIuJUAEl75773I3XBzS/w51qZefNNOOaYNA/Xd78LDz3k8RKzjqLQMZR+wM8jon4Tx8wjBUKrRcRKSeNJdz4A44DhkvaKiGs2cepdknYE1gAX+Gmvju2559J8XMuWwU03wTe+kXVFZlaIQgPlHWDbzRyzDCi0y4uImAxM3swxC4G987YPLPTnWHn6/e9TgDSOl3zmM1lXZGaFKrTL6xng6NzgfEsagJ3bXpJVk/p6uPjiNI3KwIFp/RKHiVnHVGig/Br4F+AnmzjmU8Cbba7Iqsbq1Wm85Oqr4Zxz0vslPZt7ddXMOoSCurwiYlruCat/l/RJ4IP8/ZJOBI4GphavRKtEzz+f3i9ZtgxuvBHOOivrisxsS7XlxcbRpDuQ7zQ2SHoE6AbsB6wDripKdVaR7rgjjZfssgs8+mhaqtfMOr6Cp16JiIaI+B7pSa7bgVXAEGB/4AXgixExr6hVWkWor4cf/CAt03vIIen9EoeJWeUo9E35TwJ/i+Rp4Olc+7aAIuKDTX6AVa3Vq9OLijNnwne+A9deC507Z12VmRVToXcofwFuadoYER86TKwlL7yQ3nSfPTuNl9xwg8PErBIVGihvsWFKFLPNuuMOGDQIPvgA5szx4LtZJSs0UB4D9i1FIVZZ6uvT+iWnngoHH5zGSwYNyroqMyulQgNlAnCcJE/VZy1avRqOOw6uugq+/W145BG/X2JWDQoNlFNIi189JOmMEtRjHVzjeMkjj8CvfgW/+IXHS8yqRaHvoVwEBGnK+pskXQlMJz3tVQu8EBHri1uidRR/+EN6v2SnndJ4ibu4zKpLoYFyFHBo3tc+wBjStPYAayW9ANRGxDlFq9LKWn09/Md/wJVXwuDBMHUq9Oq1+fP
MrLIUOvXKbGB247akLsBBbBwyBwEDAQdKhZoyBcaPh8WLYY890gzBzz8PZ58N113nLi6zatWmNeUbRcQa0homjeuYIKkzcMAW1mVlasoUGDsW1qxJ20uXpq+zzoJf/jLb2swsW20KFEk7k+5GtgIWR8TLjfsiYi3w5+KUZ+Vm/PgNYZLvoYfavxYzKy8Fz+Ul6RLSGvAPkZbu/auk5ZImSOra1kIkjZL0mqQFksa0cMyvJS3I2x4i6eXceePb+rOt9RYvLqzdzKpHQYEi6evAFaQld28FrgV+T1pU69+BZyV9vNAickE0ETgi93WFpO5Njvk80DNvW8CNpEeZDwDOkHRwoT/bCrPLLs239+3bvnWYWfkp9A7lPNLdyYCIODMixkXEV4E+wLeA3YEHJe1Q4OeOAOZExLKIWEF612VY405J2wE/BS7LO2cgsDIino+I90hrsIws8OdaK0XAj38Mb70FW2218b4uXWDChEzKMrMyUmig7AvcFREbrciYm9L+18AJwMeBcQV+bh9gUd72UiD/wdMfAr8AVhdwDgCSxkqqlVRbV1dXYFkG0NAA554Ll18OY8bAzTdDv34gpe+TJsHo0VlXaWZZK3RQfg1NVmnMFxEPS5oBnMymlwluqjOp26xRA1APIOlA4KCIuFRS/9ac06SmScAkgJqamiigJgPWroUzz4Tbb4eLLkrTqUhw+ulZV2Zm5abQO5T/Ja8rqgXPke5SCrEc2CNve082zGp8BrC3pPnAfUCf3DLEmzrHiuC99+D441OYXHUV/OxnKUzMzJpTaKBMBg6S9INNHNO7DXXMAEZI6iGpJ2k1yJkAEXFhRAyIiIOBY4ElEXEq8BQwQNKA3JjNScCdbfjZ1oy33oLhw9OCWL/6FVx8cdYVmVm5KyhQImIy8CDpKazbmz5VJekoYBTwpwI/dyUwnvSC5FzSGMxwSRdu4py1wFnAPaSFv66PiEUtHW+t9/rrcOSRUFub5uf65jezrsjMOgJFFDaskHvi6nbgeNJEke8AC4GPkQbKG4CjIuKxolZaBDU1NVFbW5t1GWVtwYJ0Z1JXB//zP3DUUVlXZGZZkzQvImo2d1zBLzZGxAcRcSLwJdJMwyLN37Un6c5kRDmGiW3ec8/BEUfAO+/ArFkOEzMrTJvn8oqIe4F7ASTtCHwYEeuKVZi1r8ceg3/9V+jaNa39vq/X5TSzAhX6pvyXJF0pqU9+e0S86zDpuO69N3Vz9eoFTzzhMDGztim0y+sc0vonbza3MzcdinUgt94KJ5wABxyQ7lL69Nn8OWZmzSk0UD4FzMxNW9+cvSUtk3TaFtZl7eC//gu+/nUYMiSNmXTrlnVFZtaRFRoou7LxdCcbiYi/k6ZAOXMLarISi4Af/hDOOw9OOgnuuy+NnZiZbYlCA6WOZubLamI+cGDbyrFSq6+Hc86Bn/40vV9yxx2w7bZZV2VmlaDQQHkCOE7S9ps45m3AnSdlaO3aNInjL38JP/hBmtSx6czBZmZtVWig/ALoDkyW1NIjx/sD/9yiqqzo3nsvPRb8+9/D1VfDlVd6Xi4zK65Cp16ZDVwHfBl4XNLQ/P2SvgwcQ94a85a91avh6KPTMr033QQXtjihjZlZ2xX8YmNEnCdpDXAx8LCkt0gD9T1IE0M2AFcWtUprs2XLYMSINKXKtGnpEWEzs1IoeOoVgIi4FKgBfktag+QQ0lTyfwGOj4i5RavQ2uzvf4fDD0/rvd9/v8PEzEprS6ZemQ+cDiBpW9JEky0uvmXt69lnYeTItNriI4/AwIFZV2RmlW6zdyiSzpO0ODc1fUu6OEzKx5w5MHQobLcdPP64w8TM2kdruryGkB4D/nPTHZLOyo2hrJJUJ+lST7+SrbvvTmMme+wBc+fCgAFZV2Rm1aI1gbI/8FhEvJ3fmFtcaxKwM2kK+92AnwL/VewirXVuuSW9+X7QQWlerj33zLoiM6smrQmU3YEFzbSfTQqS3wB9gUHAC8B3JR1WaCG
SRkl6TdICSWOa7LtS0t9yXW8X57XPlrQwd84CSVX7mt5//ieceWZaw+Thh2G33bKuyMyqTWsCpTPwbjPtxwDrgPMiYmlE/Im0rnsDUNCisZK6AhOBI3JfV0jqnnfIdRGxL3AocFnu+EZDI2Lv3Fd9IT+3EkTA+PEwbhx8+ctwzz2w445ZV2Vm1ag1gbIC6JffIGkP0l3J0/ldYRHxKvAQ8LkC6xgBzImIZRGxApgFDMv73Ndzv+xNeuflvQI/vyLV18O3vw1XXAFnnw233+55ucwsO60JlCeBY5rcFYzIfZ/TzPEvk5YDLkQfNp7FeCl5k1BKGippCfAocGFENOR2fQg8IulZSV9r7oMljZVUK6m2rq6uwLLK14cfwmmnpfm4Lr0UfvELz8tlZtlqTaBMAroCt0jqLuljwHeBAGY0c/y7rfzcfJ1JXWWNGkgvTAJpypeI6AN8FvilpE/k2kdExMeB0cDVkj7yTFNETIqImoio6d69e9PdHdK778IXvwhTp8LEiTBhguflMrPsbfZ//Ln5u24ETiB1f60CDgZeiYjHmzmlD7CywDqWk960b7QnsKSZWv4KPE4aS8lvfxGYC3yywJ/b4bz5Jgwbll5WnDwZLrgg64rMzJLW3kl8G/gx8Bbpya5lwNdbOPZIYHGBdcwARkjqIaknMBiYCSBpO0kDc7/uQXqabH5ue+/c937AYY3tlWrpUvjc5+C55+DOO+GMM7KuyMxsg1ZNvZIbs/gJ8BNJO0XEO80dJ2kYabD+lkKKiIiVksazYZbiccBwSXsBPwdukLQ7aTD+RxHR+BjzXZJ2BNYAF0TEwkJ+bkfy0kswfDi8/TbMmJGW7TUzKydtmW242TDJ6QXMBu5tw+dOBia3sLvZ91oioipWhpw3D445Jo2TzJ4NhxySdUVmZh/VptmGWxIRt0XEURHxTDE/t5o98gh8/vPQpUual8thYmblqqiBYsX1xz+mGYP79k3zcv3Lv2RdkZlZyxwoZermm+Hkk+HQQ+HRR9Nkj2Zm5cyBUoauuQbGjNmwbO+uu2ZdkZnZ5jlQykgEXHIJXHQRnHpqmpdrhx2yrsrMrHXavGKjFVfjvFw33gjf+Q5cf72nUjGzjsV3KBmZMgX694dOndKg+6BBKUwuuwx+/nOHiZl1PL5DycCUKTB2LKxZk7aXLElfX/sa/OQn2dZmZtZWvkPJwPjxG8Ik32OPtX8tZmbF4kDJwOIWZjprqd3MrCNwoGSgb9/C2s3MOgIHSgYmTIDtt9+4rUuX1G5m1lE5UDIwejQcf3z6tQT9+qWVF0ePzrYuM7Mt4ae8MvLqqzBwINTWZl2JmVlx+A4lA4sXw5/+BKecknUlZmbF40DJwJ13pu8nn5xtHWZmxVQ2gSJplKTXJC2QNKbJvisl/U3SYkkX57UPkfRy7rzx7V9120ybBp/6lKejN7PKUhaBIqkrMBE4Ivd1haTueYdcFxH7AocCl0nqKknAjcApwAHAGZIObufSC/b662ltE9+dmFmlKYtAAUYAcyJiWUSsAGYBwxp3RsTruV/2BhaR1pYfCKyMiOcj4j1gKjCyfcsu3F13pVmFPX5iZpWmXAKlDykoGi0lrU8PgKShkpYAjwIXRkTD5s7JO3espFpJtXV1dSUpvhDTpsG++8J++2VdiZlZcZVLoHQGGvK2G4D6xo2ImB0RfYDPAr+U9InNnZN37qSIqImImu7duzfd3a7q6mDOHN+dmFllKpdAWQ7kL3K7J7Ck6UER8VfgcdJYSqvOKSd//CM0NDhQzKwylUugzABGSOohqScwGJgJIGk7SQNzv+4BDALmA08BAyQNkLQDcBJwZybVt9LUqbDXXukJLzOzSlMWb8pHxMrcY79P5prGAcMl7QX8HLhB0u6kwfgfRcQCAElnAfeQur+uiohFH/308rB6NcyaBePGpelWzMwqTVkECkBETAYmt7D7sBbOeQDYp0QlFdXdd8P69X5c2MwqV7l0eVW8adPS9PQ
1NVlXYmZWGg6UdvDOOzBzZro7cXeXmVUqB0o7uPdeWLvWT3eZWWVzoLSDqVOhd28YNCjrSszMSseBUmLvvgv33w8nnQSd/LttZhXM/4srsfvvhw8+cHeXmVU+B0qJTZ0KPXrAEUdkXYmZWWk5UEro/fdh+nQ48UTYaqusqzEzKy0HSgnNmAHvveeXGc2sOjhQSmjaNNh1Vxg6NOtKzMxKz4FSIh9+mKZbOf542GabrKsxMys9B0qJPPxwekPeT3eZWbVwoJTI1Kmw004wbNjmjzUzqwQOlBJYty4tpvWlL8G222ZdjZlZ+3CglMDs2fDWW+7uMrPq4kApgalTYYcdYPjwrCsxM2s/ZRMokkZJek3SAkljmuz7vqS/Sloo6VZJW+faZ+faFuS+Mn99sL4e7roLvvhF2H77rKsxM2s/ZREokroCE4Ejcl9XSOqed8g/gYOAvYHdgVPz9g2NiL1zX/XtVXNLHnsM6ur8MqOZVZ+yCBRgBDAnIpZFxApgFvB/z0dFxE0RsTYi1gPPA7tmVOdmTZuW7kyOOSbrSszM2le5BEofYFHe9lKgV9ODJHUBjgPuyTV9CDwi6VlJX2vugyWNlVQrqbaurq7IZW+soSEFysiRsOOOJf1RZmZlp1wCpTPQkLfdAGzUfSWpE/Ab4PqIWAgQESMi4uPAaOBqSQOafnBETIqImoio6d69e9PdRfXUU7B8uZ/uMrPqVC6BshzYI297T2BJ44YkATcCL0bEDU1PjogXgbnAJ0tc5yZNnQqdO6cBeTOzalMugTIDGCGph6SewGBgZt7+G4AVEfHD/JMk7Z373g84DJjfTvV+RETq7ho+PL0hb2ZWbbbOugCAiFgpaTzwZK5pHDBc0l7AU8DZwKuSRuX2XxYRtwN3SdoRWANc0NgVloXaWli8GC6/PKsKzMyyVRaBAhARk4HJLexu9k4qIg4sVT2FmjoVtt46TbdiZlaNyqXLq0Nr7O4aNiytf2JmVo0cKEXw3HPwyit+mdHMqpsDpQimTYNOneCEE7KuxMwsOw6UIpg6FYYMgRK/5mJmVtYcKFtgyhTo3Rv+9jeYPz9tm5lVq7J5yqujmTIFxo6FNWvS9ltvpW2A0aOzq8v7dyDmAAAINUlEQVTMLCu+Q2mj8eM3hEmjNWtSu5lZNXKgtNHixYW1m5lVOgdKG/XtW1i7mVmlc6C00Te+8dG2Ll1gwoT2r8XMrBw4UNroqafSmid9+oAE/frBpEkekDez6uWnvNrgqafggQfgqqvg4ouzrsbMrDz4DqUNLr8cunWDc87JuhIzs/LhQClQ493JRRd5mV8zs3wOlAL57sTMrHkOlAL47sTMrGVlEyiSRkl6TdICSWOa7Pu+pL9KWijpVklb59qHSHo5d17J31H33YmZWcvKIlAkdQUmAkfkvq6QlD937z+Bg4C9gd2BUyUJuBE4BTgAOEPSwcWubcoU6N8/TU//wANpES3fnZiZfVRZBAowApgTEcsiYgUwCxjWuDMiboqItRGxHnge2BUYCKyMiOcj4j1gKjCymEU1TgC5aFFalRHg7rs9q7CZWXPKJVD6AIvytpcCvZoeJKkLcBxwTwHnjJVUK6m2rq6uoKKamwDy/fc9AaSZWXPKJVA6Aw152w1Aff4BkjoBvwGuj4iFrTkHICImRURNRNR0L3AFLE8AaWbWeuUSKMuBPfK29wSWNG7kjZe8GBE3tOacYvAEkGZmrVcugTIDGCGph6SewGBgZt7+G4AVEfHDvLangAGSBkjaATgJuLOYRU2YkCZ8zOcJIM3MmlcWc3lFxMrcY79P5prGAcMl7UUKjrOBVyWNyu2/LCJul3QWaTylM3BVRCxq+tlbonGix/HjUzdX374pTDwBpJnZRykaH1+qAjU1NVFbW5t1GWZmHYqkeRFRs7njyqXLy8zMOjgHipmZFYUDxczMisKBYmZmReFAMTOzoqiqp7wk1bHxdC2NugGr2rmccuFrrz7Vet3ga2/rtfe
LiM1ONVJVgdISSbWteSSuEvnaq+/aq/W6wdde6mt3l5eZmRWFA8XMzIrCgZJMyrqADPnaq0+1Xjf42kvKYyhmZlYUvkMxM7OicKCYmVUQSdtL2ieLn131gSJplKTXJC2QNCbrekpF0naSJkl6SdIiSefn2r8vaXGu/Zis6ywlSZ0lvSjpxtx2VVy7pJ0l/U7SMkmv5H4fquXaL5D099zf8e/m2iry2iXtJOmPwErg4rz2Zq9X0pWSlkp6QdLAohQREVX7BXQlrfK4B9ATWAF0z7quEl3rbsDJgEgvOK0EhgAv534f9gNeB7bJutYS/h78GLiPtPrnXtVy7aSls/8j92e/XbVcO9AfWAjskPvv/x/A/pV67cCOwDDgm8CNubZm/6yBo4DHSWtifQGYX4waqv0OZQQwJyKWRcQKYBbpD6TiRMSbETEtklWkID0SuCMi/hkRL5L+8hXnXyplRtIngU8Dd+SaTqQKrj1vBdQrcn/2H1Al1w6sy31vIP2P813gWCr02iPi3Yh4GFif19zSn/VJwOSIWB8RDwLdc/+tbJFqD5Q+bDwVy1KgV0a1tBtJB5D+pdqNKrh+SQKuA76f11wtf/b7A68B03JdHtdQJdceEctId6VPAQ8BXwH2pAquPU9Lf9ZN25dRhN+Hag+UzqR/vTRqAOozqqVdSOoG3Ap8g+q5/m8DsyNiQV5btVx7D1JXx7nAocDhwJeogmuXtBPwVdI/JP4TuJDq+XNv1NL1luT3oSzWlM/QcmBo3vaewNPZlFJ6kj4G3ANcGhHP5Abo9sg7ZE9SV1ilOR3oKunLwK6kPvXrqI5rfwOYFxFLASQ9SPofRzVc+9eA5yNiNjBb0omkcdJquPZGy2n+epu29ybdvWyRar9DmQGMkNQjr695ZsY1lUTuX2t3AxMi4v5c83TgNEldcmMMuwLzs6qxVCJicEQcGBEHAz8E7gLupQqundTds5+k3pK2BY4mjSVUw7V/ABwsaRtJXYF9SF1f1XDtjVr6Oz4dOEPSVpK+ALwcEau39IdV9R1KRKyUNB54Mtc0LiLey7KmEvo3UpfHtZKuzbUNB24D/kL6y/fNyD0aUukiYp6kir/2iHhP0rnAg8C2pIHYiblwqehrJ/23fRTwKvA+cEtEzK3UP/dcaD5LeqJrO0lDgW/RzN9xSXeRnvJ8FXiT1DW45TVUyO+lmZllrNq7vMzMrEgcKGZmVhQOFDMzKwoHipmZFYUDxczMisKBYmZmReFAMTOzonCgmJlZUThQzMysKBwoZmZWFA4UMzMrCgeKWTuRNFTS73PreH8oabmkGZJOkPTvkiI3kWNz534id86fcguGmZWdqp5t2Ky95GZ4/j6wijR1+HKgH2nG58HA7NyhB7TwEVeRFkU6v1Jmx7XK40AxKzFJV5DCZBpwRv4SCZJ2BHZjw2p5+zdz/uHAKaS1weeWvmKztvH09WYlJOkQoBb4M3B4RKzdxLFvAp0i4mN5bSItknUQsG9ELCxtxWZt5zEUs9K6gPT37JJNhUnO88AukvbMa/sK8BngWoeJlTsHillpjQDeAh5pxbHP5b7vDyBpO+AK0rrwV5SkOrMicqCYlUguELoDiyKioRWnPJ/73jgwfx5p4P6yiHinBCWaFZUDxax0Gh/v7dHK4xvvUA6Q1AP4d+B/gV8XuzCzUnCgmJVIRLxPCoTekkY13S9pH0lb5TX9hfS01/7A5cBOwAURUd/0XLNy5Ke8zEpI0nHA3aS7lftJobELcDDQJyJ6NTn+ReATpEf6H4iIL7ZvxWZt5zsUsxKKiOnAUFKYfBY4H/hX4G3gwmZOeQ7YFogW9puVLd+hmJlZUfgOxczMisKBYmZmReFAMTOzonCgmJlZUThQzMysKBwoZmZWFA4UMzMrCgeKmZkVhQPFzMyKwoFiZmZF4UAxM7Oi+P/gdhYMizDZCAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(cvs, scores, 'b-o')\n", "plt.xlabel('$cv$', fontsize = 20)\n", "plt.ylabel('$Score$', fontsize = 20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "\n", "> # 泰坦尼克号数据分析\n", "\n", "王成军\n", "\n", "wangchengjun@nju.edu.cn\n", "\n", "计算传播网 http://computational-communication.com" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:18:51.356108Z", "start_time": "2018-04-29T08:18:51.352719Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "#Import the Numpy library\n", "import numpy as np\n", "#Import 'tree' from scikit-learn library\n", "from sklearn import tree" ] }, { "cell_type": "code", "execution_count": 140, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T12:26:42.061508Z", "start_time": "2018-04-29T12:26:42.054162Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "import pandas as pd\n", "train = pd.read_csv('../data/tatanic_train.csv', sep = \",\")" ] }, { "cell_type": "code", "execution_count": 141, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T12:26:43.031257Z", "start_time": "2018-04-29T12:26:43.028590Z" } }, "outputs": [], "source": [ "from sklearn.naive_bayes import GaussianNB" ] }, { "cell_type": "code", "execution_count": 142, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T12:26:46.589016Z", "start_time": "2018-04-29T12:26:46.581334Z" } }, "outputs": [], "source": [ "train[\"Age\"] = train[\"Age\"].fillna(train[\"Age\"].median())\n", "train[\"Fare\"] = train[\"Fare\"].fillna(train[\"Fare\"].median())\n", "# x = [[i] for i in train['Age']]\n", "y = train['Age']\n", "y = train['Fare'].astype(int) \n", "#y = [[i] for i in y]" ] }, { "cell_type": "code", "execution_count": 145, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T12:27:37.001150Z", "start_time": 
"2018-04-29T12:27:36.993798Z" } }, "outputs": [], "source": [ "#Create a Gaussian Classifier\n", "model = GaussianNB()\n", "\n", "# Train the model using the training sets \n", "nb = model.fit(x[:80], y[:80])\n", "# nb.score(x, y)" ] }, { "cell_type": "code", "execution_count": 135, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T12:21:14.104428Z", "start_time": "2018-04-29T12:21:14.098794Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on class GaussianNB in module sklearn.naive_bayes:\n", "\n", "class GaussianNB(BaseNB)\n", " | Gaussian Naive Bayes (GaussianNB)\n", " | \n", " | Can perform online updates to model parameters via `partial_fit` method.\n", " | For details on algorithm used to update feature means and variance online,\n", " | see Stanford CS tech report STAN-CS-79-773 by Chan, Golub, and LeVeque:\n", " | \n", " | http://i.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf\n", " | \n", " | Read more in the :ref:`User Guide `.\n", " | \n", " | Attributes\n", " | ----------\n", " | class_prior_ : array, shape (n_classes,)\n", " | probability of each class.\n", " | \n", " | class_count_ : array, shape (n_classes,)\n", " | number of training samples observed in each class.\n", " | \n", " | theta_ : array, shape (n_classes, n_features)\n", " | mean of each feature per class\n", " | \n", " | sigma_ : array, shape (n_classes, n_features)\n", " | variance of each feature per class\n", " | \n", " | Examples\n", " | --------\n", " | >>> import numpy as np\n", " | >>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])\n", " | >>> Y = np.array([1, 1, 1, 2, 2, 2])\n", " | >>> from sklearn.naive_bayes import GaussianNB\n", " | >>> clf = GaussianNB()\n", " | >>> clf.fit(X, Y)\n", " | GaussianNB()\n", " | >>> print(clf.predict([[-0.8, -1]]))\n", " | [1]\n", " | >>> clf_pf = GaussianNB()\n", " | >>> clf_pf.partial_fit(X, Y, np.unique(Y))\n", " | GaussianNB()\n", " | >>> print(clf_pf.predict([[-0.8, 
-1]]))\n", " | [1]\n", " | \n", " | Method resolution order:\n", " | GaussianNB\n", " | BaseNB\n", " | abc.NewBase\n", " | sklearn.base.BaseEstimator\n", " | sklearn.base.ClassifierMixin\n", " | builtins.object\n", " | \n", " | Methods defined here:\n", " | \n", " | fit(self, X, y, sample_weight=None)\n", " | Fit Gaussian Naive Bayes according to X, y\n", " | \n", " | Parameters\n", " | ----------\n", " | X : array-like, shape (n_samples, n_features)\n", " | Training vectors, where n_samples is the number of samples\n", " | and n_features is the number of features.\n", " | \n", " | y : array-like, shape (n_samples,)\n", " | Target values.\n", " | \n", " | sample_weight : array-like, shape (n_samples,), optional\n", " | Weights applied to individual samples (1. for unweighted).\n", " | \n", " | .. versionadded:: 0.17\n", " | Gaussian Naive Bayes supports fitting with *sample_weight*.\n", " | \n", " | Returns\n", " | -------\n", " | self : object\n", " | Returns self.\n", " | \n", " | partial_fit(self, X, y, classes=None, sample_weight=None)\n", " | Incremental fit on a batch of samples.\n", " | \n", " | This method is expected to be called several times consecutively\n", " | on different chunks of a dataset so as to implement out-of-core\n", " | or online learning.\n", " | \n", " | This is especially useful when the whole dataset is too big to fit in\n", " | memory at once.\n", " | \n", " | This method has some performance and numerical stability overhead,\n", " | hence it is better to call partial_fit on chunks of data that are\n", " | as large as possible (as long as fitting in the memory budget) to\n", " | hide the overhead.\n", " | \n", " | Parameters\n", " | ----------\n", " | X : array-like, shape (n_samples, n_features)\n", " | Training vectors, where n_samples is the number of samples and\n", " | n_features is the number of features.\n", " | \n", " | y : array-like, shape (n_samples,)\n", " | Target values.\n", " | \n", " | classes : array-like, shape 
(n_classes,)\n", " | List of all the classes that can possibly appear in the y vector.\n", " | \n", " | Must be provided at the first call to partial_fit, can be omitted\n", " | in subsequent calls.\n", " | \n", " | sample_weight : array-like, shape (n_samples,), optional\n", " | Weights applied to individual samples (1. for unweighted).\n", " | \n", " | .. versionadded:: 0.17\n", " | \n", " | Returns\n", " | -------\n", " | self : object\n", " | Returns self.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data and other attributes defined here:\n", " | \n", " | __abstractmethods__ = frozenset()\n", " | \n", " | ----------------------------------------------------------------------\n", " | Methods inherited from BaseNB:\n", " | \n", " | predict(self, X)\n", " | Perform classification on an array of test vectors X.\n", " | \n", " | Parameters\n", " | ----------\n", " | X : array-like, shape = [n_samples, n_features]\n", " | \n", " | Returns\n", " | -------\n", " | C : array, shape = [n_samples]\n", " | Predicted target values for X\n", " | \n", " | predict_log_proba(self, X)\n", " | Return log-probability estimates for the test vector X.\n", " | \n", " | Parameters\n", " | ----------\n", " | X : array-like, shape = [n_samples, n_features]\n", " | \n", " | Returns\n", " | -------\n", " | C : array-like, shape = [n_samples, n_classes]\n", " | Returns the log-probability of the samples for each class in\n", " | the model. The columns correspond to the classes in sorted\n", " | order, as they appear in the attribute `classes_`.\n", " | \n", " | predict_proba(self, X)\n", " | Return probability estimates for the test vector X.\n", " | \n", " | Parameters\n", " | ----------\n", " | X : array-like, shape = [n_samples, n_features]\n", " | \n", " | Returns\n", " | -------\n", " | C : array-like, shape = [n_samples, n_classes]\n", " | Returns the probability of the samples for each class in\n", " | the model. 
The columns correspond to the classes in sorted\n", " | order, as they appear in the attribute `classes_`.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Methods inherited from sklearn.base.BaseEstimator:\n", " | \n", " | __repr__(self)\n", " | Return repr(self).\n", " | \n", " | get_params(self, deep=True)\n", " | Get parameters for this estimator.\n", " | \n", " | Parameters\n", " | ----------\n", " | deep: boolean, optional\n", " | If True, will return the parameters for this estimator and\n", " | contained subobjects that are estimators.\n", " | \n", " | Returns\n", " | -------\n", " | params : mapping of string to any\n", " | Parameter names mapped to their values.\n", " | \n", " | set_params(self, **params)\n", " | Set the parameters of this estimator.\n", " | \n", " | The method works on simple estimators as well as on nested objects\n", " | (such as pipelines). The former have parameters of the form\n", " | ``__`` so that it's possible to update each\n", " | component of a nested object.\n", " | \n", " | Returns\n", " | -------\n", " | self\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data descriptors inherited from sklearn.base.BaseEstimator:\n", " | \n", " | __dict__\n", " | dictionary for instance variables (if defined)\n", " | \n", " | __weakref__\n", " | list of weak references to the object (if defined)\n", " | \n", " | ----------------------------------------------------------------------\n", " | Methods inherited from sklearn.base.ClassifierMixin:\n", " | \n", " | score(self, X, y, sample_weight=None)\n", " | Returns the mean accuracy on the given test data and labels.\n", " | \n", " | In multi-label classification, this is the subset accuracy\n", " | which is a harsh metric since you require for each sample that\n", " | each label set be correctly predicted.\n", " | \n", " | Parameters\n", " | ----------\n", " | X : array-like, shape = (n_samples, 
n_features)\n", " | Test samples.\n", " | \n", " | y : array-like, shape = (n_samples) or (n_samples, n_outputs)\n", " | True labels for X.\n", " | \n", " | sample_weight : array-like, shape = [n_samples], optional\n", " | Sample weights.\n", " | \n", " | Returns\n", " | -------\n", " | score : float\n", " | Mean accuracy of self.predict(X) wrt. y.\n", "\n" ] } ], "source": [ "help(GaussianNB)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# fit() needs both the features and the labels\n", "model.fit(x, y)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:19:06.585035Z", "start_time": "2018-04-29T08:19:06.568397Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table>\n", "<thead>\n",
"<tr><th></th><th>Unnamed: 0</th><th>PassengerId</th><th>Survived</th><th>Pclass</th><th>Name</th><th>Sex</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Ticket</th><th>Fare</th><th>Cabin</th><th>Embarked</th></tr>\n", "</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>0</td><td>1</td><td>0</td><td>3</td><td>Braund, Mr. Owen Harris</td><td>male</td><td>22.0</td><td>1</td><td>0</td><td>A/5 21171</td><td>7.2500</td><td>NaN</td><td>S</td></tr>\n",
"<tr><th>1</th><td>1</td><td>2</td><td>1</td><td>1</td><td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td><td>female</td><td>38.0</td><td>1</td><td>0</td><td>PC 17599</td><td>71.2833</td><td>C85</td><td>C</td></tr>\n",
"<tr><th>2</th><td>2</td><td>3</td><td>1</td><td>3</td><td>Heikkinen, Miss. Laina</td><td>female</td><td>26.0</td><td>0</td><td>0</td><td>STON/O2. 3101282</td><td>7.9250</td><td>NaN</td><td>S</td></tr>\n",
"<tr><th>3</th><td>3</td><td>4</td><td>1</td><td>1</td><td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td><td>female</td><td>35.0</td><td>1</td><td>0</td><td>113803</td><td>53.1000</td><td>C123</td><td>S</td></tr>\n",
"<tr><th>4</th><td>4</td><td>5</td><td>0</td><td>3</td><td>Allen, Mr. William Henry</td><td>male</td><td>35.0</td><td>0</td><td>0</td><td>373450</td><td>8.0500</td><td>NaN</td><td>S</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Unnamed: 0 PassengerId Survived Pclass \\\n", "0 0 1 0 3 \n", "1 1 2 1 1 \n", "2 2 3 1 3 \n", "3 3 4 1 1 \n", "4 4 5 0 3 \n", "\n", " Name Sex Age SibSp \\\n", "0 Braund, Mr. Owen Harris male 22.0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n", "2 Heikkinen, Miss. Laina female 26.0 0 \n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n", "4 Allen, Mr. William Henry male 35.0 0 \n", "\n", " Parch Ticket Fare Cabin Embarked \n", "0 0 A/5 21171 7.2500 NaN S \n", "1 0 PC 17599 71.2833 C85 C \n", "2 0 STON/O2. 3101282 7.9250 NaN S \n", "3 0 113803 53.1000 C123 S \n", "4 0 373450 8.0500 NaN S " ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train.head()" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:19:14.097652Z", "start_time": "2018-04-29T08:19:13.866494Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/datalab/Applications/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:3: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", " app.launch_new_instance()\n", "/Users/datalab/Applications/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:4: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", "/Users/datalab/Applications/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:9: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: 
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", "/Users/datalab/Applications/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:10: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", "/Users/datalab/Applications/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:11: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n" ] } ], "source": [ "train[\"Age\"] = train[\"Age\"].fillna(train[\"Age\"].median())\n", "#Convert the male and female groups to integer form\n", "train[\"Sex\"][train[\"Sex\"] == \"male\"] = 0\n", "train[\"Sex\"][train[\"Sex\"] == \"female\"] = 1\n", "\n", "#Impute the Embarked variable\n", "train[\"Embarked\"] = train[\"Embarked\"].fillna('S')\n", "#Convert the Embarked classes to integer form\n", "train[\"Embarked\"][train[\"Embarked\"] == \"S\"] = 0\n", "train[\"Embarked\"][train[\"Embarked\"] == \"C\"] = 1\n", "train[\"Embarked\"][train[\"Embarked\"] == \"Q\"] = 2" ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:20:40.202068Z", "start_time": "2018-04-29T08:20:40.189583Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.13031677 0.31274009 0.23443048 0.32251266]\n", "0.977553310887\n" ] } ], "source": [ "#Create the target and features numpy arrays: target, features_one\n", "target = train['Survived'].values\n", "features_one = train[[\"Pclass\", \"Sex\", \"Age\", \"Fare\"]].values\n", "\n", "#Fit your first decision tree: my_tree_one\n", "my_tree_one = tree.DecisionTreeClassifier()\n", "my_tree_one = 
my_tree_one.fit(features_one, target)\n",
"# Look at the importance of the included features and print the accuracy on the training data\n",
"print(my_tree_one.feature_importances_)\n",
"print(my_tree_one.score(features_one, target))" ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:22:30.084256Z", "start_time": "2018-04-29T08:22:29.811884Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [
"test = pd.read_csv('../data/tatanic_test.csv', sep = \",\")\n",
"# Impute the missing Fare value (row 152) with the median.\n",
"# .loc is used throughout instead of chained indexing, which may write to a temporary copy.\n",
"test.loc[152, \"Fare\"] = test[\"Fare\"].median()\n",
"test[\"Age\"] = test[\"Age\"].fillna(test[\"Age\"].median())\n",
"# Convert the male and female groups to integer form\n",
"test.loc[test[\"Sex\"] == \"male\", \"Sex\"] = 0\n",
"test.loc[test[\"Sex\"] == \"female\", \"Sex\"] = 1\n",
"\n",
"# Impute the Embarked variable\n",
"test[\"Embarked\"] = test[\"Embarked\"].fillna('S')\n",
"# Convert the Embarked classes to integer form\n",
"test.loc[test[\"Embarked\"] == \"S\", \"Embarked\"] = 0\n",
"test.loc[test[\"Embarked\"] == \"C\", \"Embarked\"] = 1\n",
"test.loc[test[\"Embarked\"] == \"Q\", \"Embarked\"] = 2\n",
"\n",
"# Extract the features from the test set: Pclass, Sex, Age, and Fare.\n",
"test_features = test[[\"Pclass\", \"Sex\", \"Age\", \"Fare\"]].values\n",
"\n",
"# Make your prediction using the test set\n",
"my_prediction = my_tree_one.predict(test_features)\n",
"\n",
"# Create a data frame indexed by PassengerId with one column, Survived, holding the predictions\n",
"PassengerId = np.array(test['PassengerId']).astype(int)\n",
"my_solution = pd.DataFrame(my_prediction, PassengerId, columns = [\"Survived\"])\n" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:23:10.968349Z", "start_time": "2018-04-29T08:23:10.962036Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
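<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Survived</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>892</th>\n",
"      <td>0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>893</th>\n",
"      <td>0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>894</th>\n",
"      <td>1</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>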
\n", "
" ], "text/plain": [ " Survived\n", "892 0\n", "893 0\n", "894 1" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_solution[:3]" ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:23:24.168495Z", "start_time": "2018-04-29T08:23:24.164200Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "(418, 1)" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# Check that your data frame has 418 entries\n",
"my_solution.shape" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [
"# Write your solution to a csv file named tatanic_solution_one.csv\n",
"my_solution.to_csv(\"../data/tatanic_solution_one.csv\", index_label = [\"PassengerId\"])" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:26:02.502954Z", "start_time": "2018-04-29T08:26:02.489152Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.905723905724\n" ] } ], "source": [
"# Create a new array with the added features: features_two\n",
"features_two = train[[\"Pclass\", \"Age\", \"Sex\", \"Fare\",\n",
"                      \"SibSp\", \"Parch\", \"Embarked\"]].values\n",
"\n",
"# Control overfitting by setting \"max_depth\" to 10 and \"min_samples_split\" to 5: my_tree_two\n",
"max_depth = 10\n",
"min_samples_split = 5\n",
"my_tree_two = tree.DecisionTreeClassifier(max_depth = max_depth,\n",
"                                          min_samples_split = min_samples_split,\n",
"                                          random_state = 1)\n",
"my_tree_two = my_tree_two.fit(features_two, target)\n",
"\n",
"# Print the training accuracy of the new decision tree\n",
"print(my_tree_two.score(features_two, target))" ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:27:06.386839Z", 
"start_time": "2018-04-29T08:27:06.372711Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.979797979798\n" ] } ], "source": [
"# Create a new train set with the new variable.\n",
"# copy() gives an independent DataFrame; plain assignment would only alias train.\n",
"train_two = train.copy()\n",
"train_two['family_size'] = train_two.SibSp + train_two.Parch + 1\n",
"\n",
"# Create a new decision tree my_tree_three\n",
"features_three = train_two[[\"Pclass\", \"Sex\", \"Age\",\n",
"                            \"Fare\", \"SibSp\", \"Parch\", \"family_size\"]].values\n",
"\n",
"my_tree_three = tree.DecisionTreeClassifier()\n",
"my_tree_three = my_tree_three.fit(features_three, target)\n",
"\n",
"# Print the training accuracy of this decision tree\n",
"print(my_tree_three.score(features_three, target))\n" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:28:40.445208Z", "start_time": "2018-04-29T08:28:40.246927Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.939393939394\n", "418\n", "[0 0 0]\n" ] } ], "source": [
"# Import the `RandomForestClassifier`\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"\n",
"# We want the Pclass, Age, Sex, Fare, SibSp, Parch, and Embarked variables\n",
"features_forest = train[[\"Pclass\", \"Age\", \"Sex\", \"Fare\", \"SibSp\", \"Parch\", \"Embarked\"]].values\n",
"\n",
"# Build the forest: my_forest\n",
"n_estimators = 100\n",
"forest = RandomForestClassifier(max_depth = 10, min_samples_split=2,\n",
"                                n_estimators = n_estimators, random_state = 1)\n",
"my_forest = forest.fit(features_forest, target)\n",
"\n",
"# Print the training accuracy of the random forest\n",
"print(my_forest.score(features_forest, target))\n",
"\n",
"# Compute predictions and print the length of the prediction vector: test_features, pred_forest\n",
"test_features = test[[\"Pclass\", \"Age\", \"Sex\", \"Fare\", \"SibSp\", \"Parch\", \"Embarked\"]].values\n",
"pred_forest = my_forest.predict(test_features)\n", 
"print(len(test_features))\n",
"print(pred_forest[:3])" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "ExecuteTime": { "end_time": "2018-04-29T08:29:11.346182Z", "start_time": "2018-04-29T08:29:11.319726Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.14130255 0.17906027 0.41616727 0.17938711 0.05039699 0.01923751\n", " 0.0144483 ]\n", "[ 0.10384741 0.20139027 0.31989322 0.24602858 0.05272693 0.04159232\n", " 0.03452128]\n", "0.905723905724\n", "0.939393939394\n" ] } ], "source": [
"# Request and print the `.feature_importances_` attribute\n",
"print(my_tree_two.feature_importances_)\n",
"print(my_forest.feature_importances_)\n",
"\n",
"# Compute and print the mean accuracy on the training data for both models\n",
"# (features_two holds the same columns, in the same order, as features_forest)\n",
"print(my_tree_two.score(features_two, target))\n",
"print(my_forest.score(features_two, target))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [
"# Reading materials\n",
"Essentials of Machine Learning Algorithms (with Python and R code) http://blog.csdn.net/a6225301/article/details/50479672\n",
"\n",
"The \"Python Machine Learning\" book code repository and info resource https://github.com/rasbt/python-machine-learning-book\n",
"\n",
"An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani, 2013): Python code https://github.com/JWarmenhoven/ISLR-python\n",
"\n",
"BuildingMachineLearningSystemsWithPython https://github.com/luispedro/BuildingMachineLearningSystemsWithPython" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [
"# Homework\n",
"https://www.datacamp.com/community/tutorials/the-importance-of-preprocessing-in-data-science-and-the-machine-learning-pipeline-i-centering-scaling-and-k-nearest-neighbours" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python [conda env:anaconda]", "language": "python", "name": "conda-env-anaconda-py" }, "language_info": { 
"codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.4" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 0, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": false, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "780px", "left": "1279px", "top": "168.667px", "width": "341px" }, "toc_section_display": false, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 1 }