{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python examples and notes for Machine Learning for Computational Linguistics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**(C) 2017-2024 by [Damir Cavar](http://cavar.me/damir/)**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Download:** This and various other Jupyter notebooks are available from my [GitHub repo](https://github.com/dcavar/python-tutorial-for-ipython)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Version:** 1.2, January 2024" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**License:** [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Prerequisites:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -U numpy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -U matplotlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a tutorial related to the discussion of SpamAssassin in the textbook [Machine Learning: The Art and Science of Algorithms that Make Sense of Data](https://www.cs.bris.ac.uk/~flach/mlbook/) by [Peter Flach](https://www.cs.bris.ac.uk/~flach/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial was developed as part of my course material for the course Machine Learning for Computational Linguistics in the [Computational Linguistics Program](http://cl.indiana.edu/) of the [Department of Linguistics](http://www.indiana.edu/~lingdept/) at [Indiana University](https://www.indiana.edu/)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SpamAssassin" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The linear classifier can be described as follows. A test $x$ returns $1$ (for true) if it succeeds; otherwise it returns $0$. The $i^{th}$ test in the set of tests $x$ is referred to as $x_i$. The weight of the $i^{th}$ test is denoted as $w_i$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The total score of test results for a specific e-mail can be expressed as the sum of the products of $n$ test results and corresponding weights, that is $\\sum_{i=1}^{n} w_i x_i$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we assume two tests $x_1$ and $x_2$ with the corresponding weights $w_1 = 4$ and $w_2 = 4$, for some e-mail $e_1$ the tests could result in one positive and one negative, $x_1 = 1$ and $x_2 = 0$. The computation of the equation above for the results can be coded in Python in the following way:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n" ] } ], "source": [ "x = (1, 0)\n", "w = (4, 4)\n", "\n", "result = 0\n", "for e in range(len(x)):\n", " result += x[e] * w[e]\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we specify a threshold $t$ that separates spam from ham, with $t = 5$ in our example, the decision for spam or ham could be coded in Python as:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ham 4\n" ] } ], "source": [ "t = 5\n", "\n", "if result >= t:\n", " print(\"spam\", result)\n", "else:\n", " print(\"ham\", result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the code example above we define $x$ and $w$ as vectors of the same length. 
The result can be computed even more easily using linear algebra, as the dot product of $x$ and $w$ (here for an e-mail on which both tests succeed, $x_1 = x_2 = 1$):" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy\n", "\n", "wn = [4, 4]\n", "xn = [1, 1]\n", "\n", "numpy.dot(wn, xn)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use a trick to manipulate the data to be rooted in the origin of an extended coordinate system. We can add a new dimension by adding a new virtual test result $x_0 = 1$ and a corresponding weight $w_0 = -t$. This way the decision boundary $t$ can be moved to $0$, so that an e-mail is classified as spam if $\\sum_{i=0}^{n} w_i x_i \\geq 0$:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x0 = (1, 1, 1)\n", "w0 = (-t, 4, 4)\n", "\n", "numpy.dot(w0, x0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This kind of transformation of the vector space is useful for other purposes as well. More on that later." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating and Using an SVM Classifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following example is inspired by and partially taken from the page [Linear SVC Machine learning SVM example with Python](https://pythonprogramming.net/linear-svc-example-scikit-learn-svm-python/)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To start learning and classifying, we will need to import some Python modules in addition to the ones above:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import matplotlib.pyplot\n", "from matplotlib import style\n", "\n", "style.use(\"ggplot\")\n", "\n", "from sklearn import svm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use two features that represent the axes of a graph. The samples are tuples taken from the ordered arrays $x$ and $y$, that is, the $i^{th}$ sample is $X_i = (x_i, y_i)$, sample $X_1 = (1,2)$, sample $X_2 = (5, 8)$, and so on." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x = [1, 5, 1.5, 8, 1, 9]\n", "y = [2, 8, 1.8, 8, 0.6, 11]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can plot the datapoints now:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAgoAAAFqCAYAAAB73XKSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAG/NJREFUeJzt3X9sVfX9x/HXbe9taSmX3mtvi7ZCU9TKYk0V3NL10pLNDBHjUDYhc2bVUUXiFv6QOSVMcKnLyNQlTv8pxM4fTEiUpobIFDELt5B0oiYQUay1RgfUq+3laumv257vH4Z+rfJRCudHe/t8/GXP9Zzz5pMr9+k95976LMuyBAAAcAYZXg8AAAAmLkIBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGI07FI4cOaK//vWvuuuuu7RixQq98cYbo48NDw/r2Wef1b333qvbbrtNd911l/7xj3+op6fnnIaLxWLntB/OHWvuPtbcfay5+1hz99m15uMOhYGBAZWWlmrVqlVnfOyjjz7SL37xC23evFnr1q3T8ePHtXnz5nMarrW19Zz2w7ljzd3HmruPNXcfa+4+u9bcP94dKisrVVlZecbHcnNztX79+jHb7rjjDj3wwAP6/PPPdcEFF5zblAAAwBOO36PQ29srn8+n6dOnO30qAABgM0dDYWhoSNu2bVM0GtW0adOcPBUAAHCAY6EwPDysRx99VD6f74z3M5yNefPm2TwVvk9RUZHXI0w5rLn7WHP3sebus+s11GdZlnWuO69YsULr1q3TggULxmw/HQnxeFx/+tOflJeX953HicVi37rpYt68ebrxxhvPdTQAAKa8lpYWHTlyZMy26upqRaPRsz6G7aFwOhI+/fRTPfjgg98bCd+np6dHqVTqvI6BsxcMBpVMJr0eY0phzd3HmruPNXeX3+9XKBSy51jj3aG/v18nTpwY/bmrq0udnZ3Ky8tTKBTSI488os7OTv3xj39UKpVSIpGQJOXl5cnvH/fplEqlNDQ0NO79cG4sy2K9Xcaau481dx9rPnmN+5W7o6NDmzZtGv356aefliTV1tbql7/8pQ4ePChJWrdu3Zj9HnzwQf3gBz84n1kBAIDLzuvSgxvi8TgV6qJwOKzu7m6vx5hSWHP3sebuY83dFQgEFIlEbDkWv+sBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwMjv9QAAAEw0qVSmenqylEz6FAxaCoUG5fcPez2WJ3hHAQCAr0mlMrVvX66qqvJVU5Ovqqp87duXq1Qq0+vRPEEoAADwNT09Waqvz1Nfn0+S1NfnU319nnp6sjyezBuEAgAAX5NM+kYj4bS+Pp+SSZ9hj/RGKAAA8DXBoKWcHGvMtpwcS8GgZdgjvREKAAB8TSg0qMbGL0djITfXUmPjlwqFBj2ezBt86gEAgK/x+4e1cOEpHTiQ4lMPIhQAAPgWv39YkUifIhGvJ/Eelx4AAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABG/vHucOTIEbW0tKijo0OJRELr1q3TggULxvw727dv1969e9Xb26vy8nLV19dr1qxZtg0NAADcMe53FAYGBlRaWqpVq1ad8fHm5mbt3r1bd955px5++GFlZ2eroaFBqVTqvIcFAADuGncoVFZWasWKFbrmmmvO+PjLL7+s5cuXa/78+Zo9e7buuecedXd3q62t7byHBQAA7rL1HoVPP/1UiURCFRUVo9tyc3N16aWX6ujRo3aeCgAAuMDWUEgkEpKkmTNnjtk+c+bM0ccAAMDkwaceAACA0bg/9fBd8vPzJUknT54c/efTP5eWlhr3i8Viam1tHbOtqKhIdXV1CgaDsizLzjHxHQKBgMLhsNdjTCmsuftYc/e
x5u7y+XySpKamJnV1dY15rLq6WtFo9KyPZWsoFBYWKj8/X4cOHdKcOXMkSadOndL777+vxYsXG/eLRqPGoZPJpIaGhuwcE98hHA6ru7vb6zGmFNbcfay5+1hzdwUCAUUiEdXV1Z33scYdCv39/Tpx4sToz11dXers7FReXp4KCgp0/fXX68UXX9SsWbNUWFio559/XhdccIHxUxIAAGDiGncodHR0aNOmTaM/P/3005Kk2tparVmzRj//+c81MDCgxsZG9fb2at68eXrggQfk99v65gUAAHCBz5rgNwDE43EuPbiItwfdx5q7jzV3H2vurtOXHuzApx4AAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADDy233AkZER7dixQ7FYTIlEQqFQSIsWLdLy5cvtPhUAAHCY7aHQ3NysPXv26J577lFJSYk++OADPfnkk5o+fbquu+46u08HAAAcZHsoHD16VAsWLFBlZaUkqaCgQLFYTO3t7XafCgAAOMz2exTKy8t1+PBhHT9+XJLU2dmp9957T1dddZXdpwLggVQqU/F4jj74IFfxeI5SqUyvR8IEdvr5cvDgIM+XScr2dxSWLVumvr4+rV27VhkZGbIsSytXrlR1dbXdpwLgslQqU/v25aq+Pk99fT7l5FhqbPxSCxeekt8/7PV4mGB4vqQH20Nh//79isViWrt2rUpKStTZ2ammpiaFw2HV1NTYfToALurpyRr9S1+S+vp8qq/P04EDKUUifR5Ph4mG50t6sD0Unn32Wd10002qqqqSJF188cWKx+PauXOnMRRisZhaW1vHbCsqKlJdXZ2CwaAsy7J7TBgEAgGFw2Gvx5hSJtOaf/jh4Ohf+qf19fnU25up8vLJ8WeQJteaT2bp8nyZjHy+r9a9qalJXV1dYx6rrq5WNBo962PZHgqDg4PKyBh764PP5/vOF/toNGocOplMamhoyNYZYRYOh9Xd3e31GFPKZFrzvLwc5eRYY/7yz8mxNH368KT5M0iTa80ns3R5vkxGgUBAkUhEdXV1530s229mnD9/vl544QW9+eabisfjamtr065du/TDH/7Q7lMBcFkoNKjGxi+Vk/NV+OfmfnXNORQa9HgyTEQ8X9KDz7L5ff3+/n5t375dbW1tSiaTCoVCikajWr58uTIzx3+3azwe5x0FF/F/Wu6bbGueSmWqpydLyaRPwaClUGhw0t2YNtnWfDI7/Xzp7c3U9OnDk/L5MhmdfkfBDraHgt0IBXfxF6j7WHP3sebuY83dZWco8LseAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAICR34mDdnd367nnntPbb7+tgYEBXXjhhbr77rtVVlbmxOkAAIBDbA+F3t5ebdiwQRUVFVq/fr1mzJih48ePKy8vz+5TAQAAh9keCs3NzSooKNDq1atHt0UiEbtPAwAAXGB7KBw8eFCVlZV69NFHdeTIEYXDYf3sZz/TT3/6U7tPBQAAHGZ7KHR
1demVV17RDTfcoJtvvlnt7e166qmnFAgEVFNTY/fpAACAg2wPBcuyNHfuXK1cuVKSVFpaqo8//livvvoqoQAAwCRjeyiEQiEVFxeP2VZcXKy2tjbjPrFYTK2trWO2FRUVqa6uTsFgUJZl2T0mDAKBgMLhsNdjTCmsuftYc/ex5u7y+XySpKamJnV1dY15rLq6WtFo9KyPZXsolJeX69ixY2O2HTt2TAUFBcZ9otGocehkMqmhoSFbZ4RZOBxWd3e312NMKay5+1hz97Hm7goEAopEIqqrqzvvY9n+hUtLly7V+++/r507d+rEiROKxWLau3evrrvuOrtPBQAAHGb7Owpz587Vvffeq23btumFF15QYWGh6urqVF1dbfepAACAwxz5Zsarr75aV199tROHBgAALuJ3PQAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI8dDobm5WStWrNA///lPp08FAABs5mgotLe3a8+ePZozZ46TpwEAAA5xLBT6+/v1+OOPa/Xq1Zo+fbpTpwEAAA5yLBS2bNmi+fPn64orrnDqFAAAwGGOhEJra6s++ugj/epXv3Li8AAAwCW2h8Lnn3+upqYm/e53v5Pf77f78AAAwEU+y7IsOw/43//+V3/729+UkfH/DTIyMiJJysjI0LZt2+Tz+cbsE4vF1NraOmZbUVGR6urqNDAwIJtHxHcIBAIaGhryeowphTV3H2vuPtbcXT6fT9nZ2WpqalJXV9eYx6qrqxWNRs/+WHaHQn9/vz777LMx25544gkVFxdr2bJlKikpGdfx4vE4Ty4XhcNhdXd3ez3GlMKau481dx9r7q5AIKBIJGLLsWy/NjBt2rRvxcC0adM0Y8aMcUcCAADwFt/MCAAAjFy52/DBBx904zQAAMBmvKMAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAICR3+4D7ty5U21tbTp27JiysrJ02WWX6dZbb9VFF11k96kAAIDDbA+Fd999V0uWLFFZWZlGRka0bds2NTQ06LHHHlNWVpbdpwMAAA6y/dLD/fffr5qaGpWUlGj27Nlas2aNPvvsM3V0dNh9KgAA4DDb31H4plOnTkmS8vLynD6V51KpTPX0ZCmZ9CkYtBQKDcrvH/Z6LAAAzpmjNzNalqWmpiZdfvnlKikpcfJUnkulMrVvX66qqvJVU5Ovqqp87duXq1Qq0+vRAAA4Z46GwpYtW/TJJ59o7dq1Tp5mQujpyVJ9fZ76+nySpL4+n+rr89TTw30ZAIDJy7FLD1u3btVbb72lhx56SKFQ6Dv/3VgsptbW1jHbioqKVFdXp2AwKMuynBrTNh9+ODgaCaf19fnU25up8vKwR1ONXyAQUDg8eeZNB6y5+1hz97Hm7vL5vno9ampqUldX15jHqqurFY1Gz/5YlgOvwlu3btUbb7yhjRs3qqio6LyOFY/HNTQ0ZNNkzonHc1RVlT8mFnJyLB04kFAk0ufhZOMTDofV3d3
t9RhTCmvuPtbcfay5uwKBgCKRiC3Hsv3Sw5YtWxSLxfT73/9e2dnZSiQSSiQSGhwctPtUE0ooNKjGxi+Vk/NVd+XmWmps/FKhkH1/7lQqU/F4jj74IFfxeA73PwAAHGf7pYdXX31VkrRx48Yx29esWaPa2lq7Tzdh+P3DWrjwlA4cSDnyqYfTN0uevg8iJ+erEFm48BSfrAAAOMaRSw92miyXHpzm1qUN3h50H2vuPtbcfay5uyb0pQc4I5n0nfFmyWTSZ9gDAIDzRyhMEsGgNXr/w2k5OZaCwQn9hhAAYJIjFCYJN26WBADgmxz/CmfYw+mbJQEAOBNCYRLx+4cVifTJpvtTAAD4Xlx6AAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaEAgAAMCIUAACAEaEAAACMCAUAAGBEKAAAACNCAQAAGBEKAADAiFAAAABGhAIAADAiFAAAgBGhAAAAjAgFAABgRCgAAAAjQgEAABj5vR4gnaRSmerpyVIy6VMwaCkUGpTfP+z1WAAAnDPeUbBJKpWpfftyVVWVr5qafFVV5WvfvlylUplejwYAwDkjFGzS05Ol+vo89fX5JEl9fT7V1+eppyfL48kAADh3hIJNkknfaCSc1tfnUzLpM+wBAMDERyjYJBi0lJNjjdmWk2MpGLQMewAAMPERCjYJhQbV2PjlaCzk5lpqbPxSodCgx5MBAHDu+NSDTfz+YS1ceEoHDqT41AMAIG0QCjby+4cVifQpEvF6EgAA7MGlBwAAYEQoAAAAI0IBAAAYEQoAAMCIUAAAAEaOfeph9+7deumll5RIJFRaWqrbb79dl1xyiVOnAwAADnDkHYX9+/frmWee0S233KLNmzdrzpw5amhoUDKZdOJ0AADAIY6Ewq5du3TttdeqtrZWxcXFqq+vV3Z2tl5//XUnTgcAABxieyikUil1dHSooqJidJvP51NFRYWOHj1q9+kAAICDbA+FL774QiMjI5o5c+aY7TNnzlQikbD7dAAAwEET/iuc/f4JP2Ja8fl8CgQCXo8xpbDm7mPN3ceau8vO107bX4VnzJihjIwMnTx5csz2kydPKj8//4z7xGIxtba2jtk2b9483XjjjQqFQnaPiO8R4ZdVuI41dx9r7j7W3H0tLS06cuTImG3V1dWKRqNnfQzbQ8Hv96usrEyHDh3SggULJEmWZenw4cNasmTJGfeJRqNnHLqlpUU33nij3SPiOzQ1Namurs7rMaYU1tx9rLn7WHP3nX4NPd/XUUc+9bB06VK99tpr+s9//qP//e9/amxs1MDAgBYtWjSu43yzguC8rq4ur0eYclhz97Hm7mPN3WfXa6gjNwD8+Mc/1hdffKEdO3aMfuHS+vXrFQwGnTgdAABwiGN3Ci5evFiLFy926vAAAMAF/K4HAABglLlx48aNXg/xXWbPnu31CFMOa+4+1tx9rLn7WHP32bHmPsuyLBtmAQAAaYhLDwAAwIhQAAAARoQCAAAwIhQAAIDRhPyNS7t379ZLL700+mVNt99+uy655BKvx0pLO3fuVFtbm44dO6asrCxddtlluvXWW3XRRRd5PdqU0dzcrH/961+6/vrr9Zvf/MbrcdJWd3e3nnvuOb399tsaGBjQhRdeqLvvvltlZWVej5aWRkZGtGPHDsViMSUSCYVCIS1atEjLly/3erS0cuTIEbW0tKijo0OJRELr1q0b/fUJp23fvl179+5Vb2+vysvLVV9fr1mzZp31OSbcOwr79+/XM888o1tuuUWbN2/WnDlz1NDQoGQy6fVoaendd9/VkiVL1NDQoA0bNmh4eFgNDQ0aHBz0erQpob29XXv27NGcOXO8HiWt9fb2asOGDQoEAlq/fr0ee+wx3XbbbcrLy/N6tLTV3NysPXv2aNWqVfr73/+uX//612ppadHu3bu9Hi2tDAwMqLS0VKtWrTrj483Nzdq9e7fuvPNOPfzww8rOzlZ
DQ4NSqdRZn2PChcKuXbt07bXXqra2VsXFxaqvr1d2drZef/11r0dLS/fff79qampUUlKi2bNna82aNfrss8/U0dHh9Whpr7+/X48//rhWr16t6dOnez1OWmtublZBQYFWr16tsrIyRSIRXXnllSosLPR6tLR19OhRLViwQJWVlSooKNCPfvQjXXnllWpvb/d6tLRSWVmpFStW6Jprrjnj4y+//LKWL1+u+fPna/bs2brnnnvU3d2ttra2sz7HhAqFVCqljo4OVVRUjG7z+XyqqKjQ0aNHPZxs6jh16pQk8X9aLtiyZYvmz5+vK664wutR0t7Bgwc1d+5cPfroo6qvr9d9992n1157zeux0lp5ebkOHz6s48ePS5I6Ozv13nvv6aqrrvJ4sqnj008/VSKRGPOampubq0svvXRcr6kT6h6FL774QiMjI5o5c+aY7TNnztSxY8c8mmrqsCxLTU1Nuvzyy1VSUuL1OGmttbVVH330kf7yl794PcqU0NXVpVdeeUU33HCDbr75ZrW3t+upp55SIBBQTU2N1+OlpWXLlqmvr09r165VRkaGLMvSypUrVV1d7fVoU0YikZCkM76mnn7sbEyoUIC3tmzZok8++UR//vOfvR4lrX3++edqamrShg0b5Pfzn6AbLMvS3LlztXLlSklSaWmpPv74Y7366quEgkP279+vWCymtWvXqqSkRJ2dnWpqalI4HGbNJ5kJ9bfUjBkzlJGRoZMnT47ZfvLkSeXn53s01dSwdetWvfXWW3rooYcUCoW8HietdXR0KJlM6r777hvdNjIyonfeeUe7d+/Wtm3b5PP5PJww/YRCIRUXF4/ZVlxcPK7rtBifZ599VjfddJOqqqokSRdffLHi8bh27txJKLjk9OvmN19DT548qdLS0rM+zoQKBb/fr7KyMh06dGj04x2WZenw4cNasmSJx9Olr61bt+qNN97Qxo0bVVBQ4PU4aa+iokKPPPLImG1PPPGEiouLtWzZMiLBAeXl5d+6fHns2DGe7w4aHBxURsbY2+B8Pp/49ULuKSwsVH5+vg4dOjT6yapTp07p/fff1+LFi8/6OBMqFCRp6dKlevLJJ1VWVqZLLrlEu3bt0sDAgBYtWuT1aGlpy5Ytam1t1R/+8AdlZ2ePXrfKzc1VVlaWx9Olp2nTpn3rHpBp06ZpxowZ3BvikKVLl2rDhg3auXOnqqqq1N7err179+quu+7yerS0NX/+fL3wwgsKh8O6+OKL9eGHH2rXrl36yU9+4vVoaaW/v18nTpwY/bmrq0udnZ3Ky8tTQUGBrr/+er344ouaNWuWCgsL9fzzz+uCCy4wfkriTCbkb4/897//rZaWltEvXLrjjjs0d+5cr8dKSytWrDjj9jVr1qi2ttblaaauTZs2qbS0lC9cctCbb76pbdu26cSJEyosLNQNN9zAi5aD+vv7tX37drW1tSmZTCoUCikajWr58uXKzMz0ery08c4772jTpk3f2l5bW6s1a9ZIknbs2KHXXntNvb29mjdvnn7729+O6wuXJmQoAACAiWFCfY8CAACYWAgFAABgRCgAAAAjQgEAABgRCgAAwIhQAAAARoQCAAAwIhQAAIARoQAAAIwIBQAAYEQoAAAAI0IBAAAY/R9hHeXyP21ZXAAAAABJRU5ErkJggg==", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "matplotlib.pyplot.scatter(x,y)\n", "matplotlib.pyplot.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can create an array of features, that is, we convert the coordinates in the $x$ and $y$ feature arrays above to 
an array of tuples that represent the datapoints or features of samples:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "X = numpy.array([[1,2],\n", " [5,8],\n", " [1.5,1.8],\n", " [8,8],\n", " [1,0.6],\n", " [9,11]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Assuming two classes represented by $0$ and $1$, we can encode the assignment of the datapoints in $X$ to classes $0$ or $1$ by using a vector with the class labels in the order of the samples in $X$. The $i^{th}$ datapoint of $X$ is assigned to the $i^{th}$ class label in $y$." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "y = [0,1,0,1,0,1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We define a classifier as a linear Support Vector Classifier using the *svm* module of [Scikit-learn](http://scikit-learn.org/):" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "classifier = svm.SVC(kernel='linear')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We train the classifier using our features in *X* and the labels in *y*:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n", " decision_function_shape=None, degree=3, gamma='auto', kernel='linear',\n", " max_iter=-1, probability=False, random_state=None, shrinking=True,\n", " tol=0.001, verbose=False)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "classifier.fit(X,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now create a new sample and ask the classifier for a guess to which class this sample belongs. Note that in the following code we generate a *numpy array* from the features $[0.58, 0.76]$. 
This array needs to be reshaped to an array that contains one element: an array with the features of a single sample." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sample: [[ 0.58 0.76]]\n", " Class: [0]\n" ] } ], "source": [ "sample = numpy.array([0.58,0.76]).reshape(1,-1)\n", "\n", "print(\"Sample:\", sample)\n", "\n", "print(\" Class:\", classifier.predict(sample))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of using the *reshape()* function, we could have also defined the sample directly as an array with a sample feature array:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": true }, "outputs": [], "source": [ "sample = numpy.array( [ [0.58,0.76] ] )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following code will visualize the data and the identified hyperplane that separates the two classes." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.1380943 0.24462418]\n" ] }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAg0AAAFqCAYAAACZAWnrAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAIABJREFUeJzt3XtcjvfjP/DXdVcOiXTr4JAhZx982sfdgY6ykEOMiVkotMiZmWENM6f2yWFOM1FbLTKH5FAWGYqOm2E2h4WPoUQqC+lw//7YT981p0vuq6v77vV8PD6Px6frvu/r/Xp3W7263td13YJarVaDiIiI6CUUcgcgIiIi7cDSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKJUujT8+uuvWLFiBfz9/TF8+HCkp6eXP1ZaWoqIiAh88MEHGDVqFPz9/bFu3Trcu3ev0kETExMr/VptwnnqFs5Tt3CeuqWmzBPQ3FwrXRqKiorQsmVLjB8//pmPXbt2De+88w6CgoIwe/Zs3Lp1C0FBQZUOmpSUVOnXahPOU7dwnrqF89QtNWWegObmql/ZF1pbW8Pa2vqZjxkaGmL+/PkVto0dOxbz5s3D3bt30ahRo8oOS0RERDKpsnMaCgsLIQgC6tWrV1VDEhERkQZVSWkoLi5GZGQkHB0dUadOnaoYkoiIiDRM8tJQWlqKlStXQhCEZ57/IFbHjh01mKr6srCwkDtCleA8dQvnqVs4T92jqd+hglqtVr/uToYPH47Zs2dDpVJV2P6kMOTk5OCTTz6BkZHRS/eVmJj41AkbHTt2hKen5+vGJCIiqrFiYmLw66+/Vtjm4OAAR0dH0fuQrDQ8KQy3b9/GggULRBWGl7l37x5KSkpeez/VWYMGDVBQUCB3DMlxnrqF89QtnKdu0dfXh4mJiWb2VdkXPnr0CFlZWeVfZ2dn4+rVqzAyMoKJiQmCg4Nx9epVfPTRRygpKUFeXh4AwMjICPr6lRu2pKQExcXFlY2sFdRqtc7PEeA8dQ3nqVs4T3qeSpeGzMxMLFq0qPzrb775BgDg4uKCYcOGISMjAwAwe/bsCq9bsGABOnXqVNlhiYiISCaVLg2dOnVCVFTUcx9/0WNERESkffjZE0RERCQKSwMRERGJwtJAREREolT6nAYiIik1bNgQCkX1+rtGoVBAqVTKHUNynKd2KSsrK79CUWosDURULSkUCuTm5sodg6jaq8riU71qPBEREVVbLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1ERCSZP/74A5aWlvjuu+8q/dpNmzZJkEw8Ozs7zJw5s1Kvfdb8g4ODYWlpqal4VYqlgYiIJCUIguRjJCQkYOXKlZLsW9P5BUGodvcgEUs7UxMRkVawtLTE77//jnfeeUfScRISErBq1SpJx9CU6dOn4/Lly3LHqBTe3ImIiCRVq1YtycdQq9WSj6EpCoWiSr4nUuCRBiKiKvRkPfvq1auYPn06OnXqhI4dO2LmzJl49OhRheeWlpZi1apVcHBwgJWVFezt7bF8+XI8fvy4wvPs7Ozg4+ODtLQ0DBgwAK1bt0aPHj2wc+fOl+bp27cv/Pz8Kmzr1asXLC0t8dtvv5Vv27t3LywtLSv8hZyVlYWZM2fC2toaVlZWcHNzQ1RUVIV9Pe+chn379qFnz55o3bo13nrrLcTFxWH69Omwt7d/Zs5vv/22/PvQv39//Pzzz+WPzZgxA19//TWAv45sWFpaonnz5uWPq9VqbN68GW5ubmjdujWsra0xZ84c5OfnPzXO6tWroVKp0KZNG3h5eeHixYsv+xaWKygowPTp09GxY0d06tQJM2bMeOYY/zynoVevXvDy8nrqeWq1Gt26dYO/v7/oDFLjkQYioir0ZH18woQJeOONNzB37lycPXsW27Ztg5mZGebOnVv+3FmzZmHnzp0YOHAg/P398dNPP2HdunX4/fffsXnz5gr7vHLlCvz9/TFixAgMGzYMUVFRmDlzJv7973+jbdu2z81ja2uLmJiY8q/
z8vJw8eJF6OnpISUlBR06dAAApKWlwdTUFG3atAEA3LlzBwMGDICenh7Gjh0LpVKJo0ePYtasWfjzzz8xbty45455+PBhBAQEoFOnTpg7dy7y8/Mxa9YsNGnS5JnnD+zZsweFhYUYNWoUAGDDhg3w8/PDqVOnoKenh1GjRiE7OxsnTpzAunXrnjrq8OGHH2Lnzp0YPnw4xo0bh//9738IDQ3FL7/8gr1790JPTw8AEBQUhC+++AJvvfUWevbsiXPnzuHdd99FSUnJc+fyd76+vkhPT8fo0aPRunXr8iL0zzkJglBh28CBA7Fq1SrcuXMHpqam5dtTUlKQnZ2NwYMHixq/KrA0EBHJoGvXrggKCir/+t69e9i2bVt5aTh//jx27tyJ9957DytWrAAAjB49Go0aNcKmTZtw6tQpdO/evfz1mZmZ2L17N2xsbAD89YvIxsYGUVFR+Pjjj5+bw87ODqGhobh8+TLatGmDtLQ01KpVC66urkhNTcWYMWMA/PUL7Mm+AWD58uVQq9X4/vvvYWxsDADw9vbGpEmTsHLlSnh7e6N27drPHHP58uVo0qQJoqOjUbduXQCAo6Mjhg4dWuEIwRM3b95EUlIS6tevDwCwsrLCuHHj8MMPP6BXr174z3/+AysrK5w4ceKpX7CpqanYtm0b1q9fj0GDBpVvd3BwwMiRI7F//34MGjQIubm5+PLLL+Hu7o7Q0NDy561YsQJr16597vfviUOHDiElJQWBgYHlRwbGjBkj6lwOT09P/Pe//8X+/fvh4+NTvj0mJgZGRkZwc3N76T6qCksDEWm9hw8fSn5iWZs2bcp/wb0uQRDg7e1dYZutrS3i4uJQWFiIevXqISEhAYIgPLV04O/vjy+//BJHjhypUBratWtX4Ze6UqmElZUVrl279sIsdnZ2UKvVSElJQZs2bZCamgpra2s4OTlh3bp1AP467H7hwgUMHz68/HWxsbEYOHAgSktLK3ywmLOzM2JiYnD27FmoVKqnxsvOzsZvv/2GadOmVfh+2tnZoUOHDigsLHzqNYMGDSovDH/P/L///e+FcwOA/fv3w9jYGE5OThVydu7cGfXq1UNSUhIGDRqE48ePo7i4GL6+vhVe7+fnJ6o0JCQkwMDAAKNHjy7fJggCfH19kZKS8sLXWllZ4V//+hf27dtXXhrKyspw8OBBuLu7P7d8yYGlgYi03uXLl9G3b19Jx4iLi0OXLl00tr9mzZpV+PrJX+t5eXmoV68e/vjjDygUCrRq1arC88zMzGBsbIw//vijwvamTZs+NYaxsfEz19T/ztTUFK1atUJKSgree+89pKSkwMHBAXZ2dvj4449x/fp1XLhwAWq1GnZ2dgCAu3fvIj8/H99++y0iIiKe2qcgCLh79+4zx3uSu0WLFk891qpVK5w7d+6p7f+c29+/Vy9z5coV5Ofno2vXri/MeePGjfIMf6dUKsvHe5EbN27A3Nz8qWLZunXrl74W+Otow4oVK5CdnQ0LCwskJSXhzp078PT0FPX6qsLSQERar02bNoiLi5N8DE0Se52+2HsEPFmX/ycxVxXY2toiKSkJjx49wtmzZzFr1ix06NABxsbGSElJwaVLl1CvXj107twZwF9/BQPAkCFDMGzYsGfus1OnTqJyi/E69zRQq9UwMzN75rkOANCoUaPXiaYxnp6eWLZsGfbv349x48Zh3759MDY2hqurq9zRKmBpICKtV7duXY0eBagOLC0tUVZWhszMzAqF5c6dO8jPz9foHQVtbW2xY8cO7N27F2VlZejWrRsEQYCNjU15aVCpVOUFplGjRjAyMkJZWRkcHR1feV4AcPXq1aceu3LlSqXn8Lxy1aJFCyQmJkKlUr3wMP+TXFeuXKlwXkVubu5Lj9YAfx05SkpKwsOHDyscbRC7bNa8eXNYW1sjJiYGPj4+iIuLQ9++fWFgYCDq9VWFl1wSEVVDbm5uUKvVCAkJqbB906ZNEAQBvXr10thYT84R2LB
hAzp27AgjI6Py7YmJiTh79ixsbW3Ln69QKNCvXz8cPHgQFy5ceGp/fz934J8sLCzQoUMH7Ny5Ew8fPizffurUqQqXeL4qQ0NDAMD9+/crbB84cCBKSkqeeeOn0tJSFBQUAACcnJygr6+PrVu3VnjOV199JWr8Xr16obi4uPzST+CvIzKhoaGijxZ5enrixx9/xPbt25Gbm1vtliYAHmkgIqqWOnXqhGHDhuHbb79Ffn4+7O3t8dNPP2Hnzp3w8PCocBLk62rZsiXMzc2RmZlZ4URAOzs7LFmyBIIglJ/P8MS8efNw6tQpDBgwACNHjkS7du2Ql5eHM2fOICkp6ZnnJjwxZ84cjBs3Dp6enhg+fDjy8vIQFhaGDh064MGDB5WaQ5cuXaBWq/Hxxx/D1dUVCoUCgwYNgr29Pby9vbF+/Xr88ssvcHFxgb6+PjIzM3HgwAEsXrwY/fr1g1KphL+/P9avX4/Ro0fDzc0Nv/zyC44ePSpqCcPd3R02NjZYtmwZrl+/jrZt2yI2NhZ//vmn6DkMHDgQixcvxuLFi2FiYvLKR3GqAksDEVE1FRwcjBYtWuC7777DoUOHYGZmhqlTp2LGjBkVnvfP6/7/+ZgYtra2OHDgQIUjCl27dkXdunVRVlaGN998s8LzTU1NceDAAaxatQpxcXEIDw+HiYkJ2rVr99Qlnv/M4O7ujvXr12PlypVYtmwZWrZsiZUrV2Lnzp24dOmSqLn9c1u/fv0wduxYxMTEYM+ePVCr1eWXWC5fvhz//ve/ERERgRUrVkBfXx+WlpZ45513Klxx8tFHH6FOnToIDw/HqVOn8J///Afbtm3D6NGjX/p9FAQBYWFhWLBgAfbs2QNBENC7d28sWLAAffr0eWl+AGjSpAlUKhXS09MxcuTI556nIidBrUX33szJyUFxcbHcMSSlVCpfeGhPV3CeukWKedaU7x39n969e8PU1BSRkZFyR9EqL/tvxcDAAGZmZhoZi+c0EBFRlSopKUFpaWmFbSdPnsT58+fRo0cPmVKRGFyeICKiKpWVlYXhw4dj6NChsLCwwKVLlxAREQELC4unbnpF1QtLAxERVSljY2P8+9//xrZt25Cbm4u6devC3d0dc+fORcOGDeWORy/A0kBERFWqfv362LBhg9wxqBJ4TgMRERGJwtJAREREorA0EBERkSgsDURERCRKpU+E/PXXXxETE4PMzEzk5eVh9uzZT312elRUFBISElBYWIj27dvDz88PjRs3fu3QREREVPUqfaShqKgILVu2xPjx45/5eHR0NOLi4vD+++9j6dKlqF27NpYsWYKSkpJKhyUiIiL5VPpIg7W1NaytrZ/7eGxsLIYOHYpu3boBACZPngw/Pz+kpqbyjl9E9FJlZWVQKpVyx6hAoVCgrKxM7hiS4zy1S1XOQZL7NNy+fRt5eXkVPt/e0NAQbdu2xcWLF1kaiOil8vLy5I7wlJryeRicJz2PJCdCPvmP3djYuMJ2Y2PjavmDgIiIiF6OV08QERGRKJIsTzy5d3h+fn6F+4jn5+ejZcuWL3xtYmIikpKSKmyzsLCAj48PGjRoAC36JO9KMTAwqHbruFLgPHUL56lbOE/dIggCACAsLAzZ2dkVHnNwcICjo6PofUlSGszNzdGwYUOcPXsWLVq0AAA8ePAAly5dQp8+fV74WkdHx+dOoKCgAMXFxRrPW53UlDU2zlO3cJ66hfPULQYGBjAzM4OPj89r76vSpeHRo0fIysoq/zo7OxtXr16FkZERTE1N0a9fP+zevRuNGzeGubk5tm/fjkaNGsHGxua1QxMREVHVq3RpyMzMxKJFi8q//uabbwAALi4uCAgIwKBBg1BUVITNmzejsLAQHTt2xLx586Cvzw/WJCIi0kaV/g3eqVMnREVFvfA5Xl5e8PLyquwQREREVI3w6gkiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEg
UlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRGFpICIiIlFYGoiIiEgUlgYiIiIShaWBiIiIRNGXcudlZWXYsWMHEhMTkZeXBxMTE7i6umLo0KFSDktEREQSkLQ0REdH4/Dhw5g8eTIsLS3x+++/Y8OGDahXrx769u0r5dBERESkYZKWhosXL0KlUsHa2hoAYGpqisTERFy+fFnKYYlqpJL8fDzOz4e6tBS1TUyg37Ch3JF0RmlhIYrz8/FHZib0jY1Rq1EjuSMRyULS0tC+fXscOXIEt27dQpMmTXD16lVcuHABY8aMkXJYohqn5N49xI0di1upqQAAZYcO8NyxAwb85fbaSgoKkLJ0Kc6HhwMA6pqa4u29e2HUqhXUarXM6YiqlqSlYfDgwXj48CGmT58OhUIBtVqNESNGwMHBQcphiWoUhUKBy/v2lRcGAMj97Tf8vGkTVHPmAHp6MqbTfvevXi0vDADw8M4dJEyfjn7h4dCrX1/GZERVT9LScPLkSSQmJmL69OmwtLTE1atXERYWBqVSCWdn52e+JjExEUlJSRW2WVhYwMfHBw0aNND5Zm9gYAClUil3DMlxnppT9OABbhw//tT2m8nJMFCrUb8Kvs+6+n6WlZXh4s8/P7U95+efoSgr08k5A7r7fv5TTZmnIAgAgLCwMGRnZ1d4zMHBAY6OjqL3JWlpiIiIwNtvv43u3bsDAJo3b46cnBzs2bPnuaXB0dHxuRMoKChAcXGxZHmrA6VSidzcXLljSI7z1BxBENCyTx9kxsZW2P5Gz54oViiq5Pusy++nhUr11LYm9vYo09fX2Tnr8vv5dzVlngYGBjAzM4OPj89r70vS+zQ8fvwYCkXFIQRB0PmjBURVSa1Wo4W7O6z69y/f1qxHD3T28QEUvBXL6zKytIRq5kwI/3+Zx7hVK/QMDoZevXoyJyOqepIeaejWrRt27doFpVKJ5s2b48qVKzhw4ADc3NykHJaoxtFv2BCuK1fCYeFCqMvKUKtBA+g1aCB3LJ2gV78+/j1pEjp5e0NdXAy9evVQS6nkHz9UI0laGsaOHYuoqChs2bIFBQUFMDExQe/evXlzJyIJ6BkZQc/ISO4YOklRpw5q16lTfjibhYFqKklLQ506dTBmzBheYklERKQDuOBJREREorA0EBERkSgsDURERCQKSwMRERGJwtJAREREorA0EBERkSgsDURERCQKSwMRERGJwtJAREREorA0EBERkSgsDURERCQKSwMRERGJwtJAREREorA0EBERkSgsDURERCQKSwMRERGJwtJAREREorA0EBERkSgsDURERCQKSwMRERGJwtJAREREorA0VCNpaWl49OiR3DGIiIieiaWhmnjw4AFGjhyJtm3bYvny5bh586bckYiIiCpgaagmDA0N8f3332PEiBEICwuDvb09JkyYgLS0NKjVarnjERERsTRUJ61atcLnn3+O9PR0LFq0CL/88gsGDx4MDw8P7Nixg0sXREQkK5aGasjIyAi+vr44duwYIiIiYGZmhhkzZsDW1hZBQUHIysqSOyIREdVALA3VmEKhQM+ePREeHo7jx49j0KBBCAkJgZ2dHSZNmoSMjAwuXRARUZVhadASrVu3xuLFi5GRkYHAwECcPn0anp6eGDBgAHbu3ImioiK5IxIRkY5jadAy9evXx/jx43HixAl8/fXXMDY2xrRp02BnZ4fg4GBkZ2fLHZGIiHQUS4OWUigUeOuttxAZGYkffvgB/fr1w5dffgk7OztMmTIFP/30k9wRiYhIx7A06IC2bdti6dKlSE9Px9y5c5Geno4BAwZgwIAB2LNnDx4/fix3RCI
i0gEsDTrE2NgY/v7+SExMRGhoKOrVq4fJkyfD3t4eq1atQk5OjtwRiYhIi7E06CA9PT307t0bUVFROHLkCNzd3bFu3TrY2tpi2rRpOHPmjNwRiYhIC0leGnJzc7F27VqMGzcO3t7emD17NjIzM6Uelv6/Dh06YMWKFUhPT8ecOXOQnJwMDw8PDBo0CHv37kVxcbHcEYmISEtIWhoKCwsRGBgIAwMDzJ8/H6tWrcKoUaNgZGQk5bD0DCYmJpgwYQJOnjyJLVu2wMDAAAEBAbC3t8cXX3yBu3fvyh2RiIiqOUlLQ3R0NExNTTFhwgRYWVnBzMwMXbt2hbm5uZTD0gvo6emhb9++2LlzJ+Lj49GrVy+sWbMGNjY2mDlzJs6dOyd3RCIiqqYkLQ0ZGRlo3bo1Vq5cCT8/P8yZMwdHjhyRckh6BZ06dUJQUBDS0tIwa9YsnDhxAn369MGQIUOwf/9+lJSUyB2RiIiqEUlLQ3Z2Nr7//ns0bdoU8+fPh7u7O0JDQ3H8+HEph6VXpFQqMWnSJJw6dQpfffUVBEGAv78/unfvjnXr1iE3N1fuiEREVA1IWhrUajWsrKwwYsQItGzZEm+99RZ69eqF+Ph4KYelStLX10f//v2xa9cuHDp0CE5OTli5ciVsbGwwe/ZsnD9/Xu6IREQkI30pd25iYoJmzZpV2NasWTOkpqY+9zWJiYlISkqqsM3CwgI+Pj5o0KCBzn9Ak4GBAZRKpdwx4OzsDGdnZ+Tk5CA0NBSbNm1CZGQknJ2dERAQgAEDBkBPT6/S+68u85Qa56lbOE/dUlPmKQgCACAsLOypjxpwcHCAo6Oj+H2pJfwt/OSs/EWLFpVvCwsLw++//47Fixe/8v5ycnJ0/hJBpVJZLZcDiouLERsbiy1btiA9PR2Wlpbw8fHBiBEjYGJi8sr7q67z1DTOU7dwnrqlpszTwMAAZmZmGtmXpMsT/fv3x6VLl7Bnzx5kZWUhMTERCQkJ6Nu3r5TDkgQMDAzg6emJvXv3IjY2Fvb29ggKCoJKpcKcOXNw4cIFuSMSEZHEJD3SAAA//vgjIiMjkZWVBXNzcwwYMABubm6V2hePNFQvOTk5iIiIQHh4OLKzs+Ho6Ihx48ahV69eL1260KZ5vg7OU7dwnrqlpsxTk0caJC8NmsTSUD09fvwYBw8exJYtW/Djjz/ijTfeKF+6MDY2fuZrtHGelcF56hbOU7fUlHlqzfIE1Qy1atXC4MGDsW/fPuzfvx8qlQrLli2DSqXCvHnzcPnyZbkjEhGRBrA0kEa9+eabWLt2LVJTUzFx4kQcPHgQLi4uGDlyJA4fPoyysjK5IxIRUSWxNJAkzM3NMXPmTKSkpGDt2rXIz8/HmDFj4OTkhJCQEOTn58sdkYiIXhFLA0mqdu3aGDJkCA4cOICYmBhYW1tj8eLFaNOmDT7++GMuXRARaRGWBqoy3bp1w/r165GSkoIpU6YgJiYGLi4u8Pb2RkJCApcuiIiqOZYGqnKNGzfGJ598gtTUVKxevRp37tzBqFGj4OzsjK1bt+LPP/+UOyIRET0DSwPJpk6dOhg2bBhiY2MRHR2Nzp07Y+HChejWrRs++eQTXLlyRe6IRET0NywNJDtBEGBjY4Mvv/wSycnJ8PX1xe7du+Hk5ITRo0fj2LFjOv+ZI0RE2oClgaqVpk2b4qOPPkJaWhqCg4Nx69YtjBw5Eq6urvj6669RWFgod0QiohqLpYGqpbp162L48OH4/vvvsXv3brRv3x6BgYFQqVRYtGgRrl27JndEIqIah6WBqjVBEGBnZ4evvvoKp06dwujRo7Fjxw44ODjA19cXJ06c4NIFEVEVYWkgrdGsWTPMnTsX6enpCAoKwvXr1zFixAj06tUL4eHhePDggdwRiYh0GksDaZ26deti5MiRiI+Px3fffQcrKyvMmzcPKpUKixcvxvXr1+WOSESkk1gaSGsJgoAePXogJCQEJ0+exMiRI7Ft2zb06NED48aNQ1J
SEpcuiIg0iKWBdELz5s3x8ccfIz09HUuXLsWVK1fg5eUFd3d3REZG4uHDh3JHJCLSeiwNpFMMDQ0xatQoHDlyBNu3b0fz5s3x4YcfQqVSYenSpbhx44bcEYmItBZLA+kkQRDg5OSE0NBQJCUlwcvLC+Hh4ejevTvef/99pKSkcOmCiOgVsTSQzmvRogUWLFiA9PR0fPrpp7hw4QKGDBmCPn36ICoqCo8ePZI7IhGRVmBpoBqjXr168PHxwdGjRxEZGYkmTZpg1qxZsLGxwfLly3Hz5k25IxIRVWssDVTjKBQKuLi44Ouvv8aJEyfw9ttvIzQ0FPb29pgwYQLS0tK4dEFE9AwsDVSjtWrVCp9++ikyMjKwaNEi/PLLLxg8eDD69euH7777DkVFRXJHJCKqNlgaiAAYGRnB19cXx44dQ0REBExNTTF9+nTY2tri888/R1ZWltwRiYhkx9JA9DcKhQI9e/ZEeHg4jh8/joEDB2Lz5s2ws7PDpEmTkJGRwaULIqqxWBqInqN169b47LPPkJ6ejsDAQJw+fRqenp4YMGAAdu3axaULIqpxWBqIXqJBgwYYP348Tpw4gbCwMBgbG2Pq1Kmws7NDcHAwbt++LXdEIqIqwdJAJJJCoSi/LfXRo0fh4eGBjRs3wtbWFlOmTMHp06fljkhEJCmWBqJKaNeuHZYtW4aMjIzyj+vu378/Bg4ciOjoaDx+/FjuiEREGsfSQPQajI2N4e/vj8TERGzduhWGhoaYNGkS7O3tsWrVKmRnZ8sdkYhIY1gaiDRAT0+v/LbUR44cgbu7O9atW4d27dph2rRpOHv2rNwRiYheG0sDkYZ16NABK1asQHp6OhYuXIjk5GT07dsXgwcPRkxMDIqLi+WOSERUKSwNRBIxMTHBjBkzcPLkSWzZsgX6+vqYOHEi7O3t8cUXX+Du3btyRyQieiUsDUQS09PTQ9++fbFz507Ex8fDzc0Na9asgY2NDWbOnIlz587JHZGISBSWBqIq1KlTJ3z++edIS0vDrFnGkAHmAAAgAElEQVSzcOLECfTp0wdDhgzB/v37UVJSIndEIqLnqrLSEB0djeHDh+Prr7+uqiGJqi2lUolJkybh1KlT+OqrrwAA/v7+6N69O9avX4/c3FyZExIRPa1KSsPly5dx+PBhtGjRoiqGI9Ia+vr66N+/P3bv3o1Dhw7B2dkZwcHBsLGxwezZs3H+/Hm5IxIRlZO8NDx69Ahr167FhAkTUK9ePamHI9JanTt3RnBwMNLS0jBt2jQkJCTA3d0d77zzDmJjY1FaWip3RCKq4SQvDSEhIejWrRs6d+4s9VBEOqFRo0aYOnUqkpOTsXHjRhQXF2P8+PHo0aMHNm7ciHv37skdkYhqKElLQ1JSEq5du4aRI0dKOQyRTjIwMICnpyf27t2L2NhYdO/eHUFBQVCpVJgzZw4uXLggd0QiqmEkKw13795FWFgYpkyZAn19famGIaoRunbtitWrVyMtLQ1Tpkwpv3Rz+PDh+P7777l0QURVQlCr1WopdpyWlob//ve/UCj+r5eUlZUB+OvTAiMjIyEIwlOvS0xMRFJSUoVtFhYW8PHxQVFRESSKW20YGBjUiDsGcp6v5/Hjx9izZw82bNiA1NRUtGzZEhMmTMCYMWPQsGFDjY/3Mnw/dQvnqVsEQUDt2rURFhb21OfhODg4wNHRUfy+pCoNjx49wp07dypsW79+PZo1a4bBgwfD0tLylfeZk5Oj82+wUqmsEZfbcZ6a89NPP2Hr1q3Yt28fDAwMMGzYMPj6+qJt27aSjvt3fD91C+epWwwMDGBmZqaRfUm2PFGnTh1YWlpW+F+dOnVQv379ShUGInq2N998E2vXrkVqaiomTpyIgwcPwtXVFe+++y7i4+PLj/AREb0u3hGSSEeYm5tj5syZSElJwRdffIH8/Hz4+PjAyckJmzdvRkFBgdwRiUjLSbY8IQUuT+gOzlN6arUaGRkZCA0Nxf79+1G7dm1
4eXnBx8cHbdq00ehYfD91C+epW7RieYKI5CUIAlQqFdavX4+UlBT4+fkhJiYGLi4u8Pb2RkJCApcuiOiVsDQQ1QCNGzfG7NmzkZqailWrViEnJwejRo2Cs7Mztm7dij///FPuiESkBVgaiGqQOnXqwMvLC3FxcYiOjsa//vUvLFy4EN26dcMnn3yCK1euyB2RiKoxlgaiGkgQBNjY2GDTpk1ITk6Gr68vdu/eDScnJ4wePRrHjh3T+XuiENGrY2kgquGaNm2Kjz76CGlpaQgODsatW7cwcuRIuLq6IiwsDIWFhXJHJKJqgqWBiAAAdevWLb8t9e7du9GuXTsEBgZCpVJh0aJFuHbtmtwRiUhmLA1EVIEgCLCzs8PmzZuRnJyMUaNGYceOHXBwcICvry+OHz/OpQuiGoqlgYieq1mzZpg3bx7S09MRFBSE69ev491334Wbmxu++eYbPHjwQO6IRFSFWBqI6KXq1q2LkSNHIj4+Ht999x2srKwwf/58qFQqLF68mEsXRDUESwMRiSYIAnr06IEtW7bg5MmTePfdd7Ft2zZ06tQJ48aNQ1JSEpcuiHQYSwMRVUrz5s0RGBiI9PR0rFmzBpmZmfDy8oK7uzsiIyPx8OFDuSMSkYaxNBDRazE0NMT48eORkJCA7du3w9LSEh9++CFUKhWWLl2KGzduyB2RiDSEpYGINEIQBDg5OSEsLAxJSUnw8vJCeHg47O3t4efnh+TkZC5dEGk5lgYi0rgWLVpgwYIFSE9Px2effYaLFy9i6NCh6NOnD6KiovDo0SO5IxJRJbA0EJFk6tWrhzFjxuDo0aOIjIxE48aNMWvWLNjY2GD58uW4efOm3BGJ6BWwNBCR5BQKBVxcXPDNN9/gxIkTePvttxEaGgp7e3tMmDABaWlpXLog0gIsDURUpVq1aoVPP/0UGRkZWLRoEX755RcMHjwYHh4e2LFjB5cuiKoxlgYikoWRkRF8fX1x7NgxREREwNTUFDNmzICtrS2CgoKQlZUld0Qi+geWBiKSlUKhQM+ePREREYFjx47B09MTISEhsLOzQ0BAADIyMrh0QVRNsDQQUbXRpk0bfPbZZ0hPT0dgYCB+/vlneHp6YsCAAdi1axeKiorkjkhUo7E0EFG106BBA4wfPx7Hjx9HWFgYGjRogKlTp8LOzg4rV67E7du35Y5IVCOxNBBRtaWnpwd3d3ds27YNR48ehYeHBzZs2ABbW1tMmTIFp0+fljsiUY3C0kBEWqFdu3ZYtmwZMjIyMHfuXKSnp6N///4YOHAgoqOj8fjxY7kjEuk8lgYi0irGxsbw9/dHYmIiQkNDUbduXUyaNAn29vZYtWoVcnJy5I5IpLNYGohIK+np6aF3797YsWMHEhIS4O7ujnXr1sHW1hbTpk3DmTNn5I5IpHNYGohI67Vv3x4rVqxAeno65syZg+TkZHh4eGDQoEHYu3cviouL5Y5IpBNYGohIZ5iYmGDChAlISkpCSEgIDAwMEBAQAHt7e3zxxRe4e/eu3BGJtBpLAxHpHH19fXh4eGDnzp2Ij4+Hm5sb1qxZAxsbG8ycORPnzp2TOyKRVmJpICKd1qlTJ3z++edIS0vDzJkzcfz4cfTp0wdDhgzB/v37UVJSIndEIq3B0kBENYJSqcTkyZORnJyMTZs2AQD8/f3RvXt3rF+/Hrm5uTInJKr+WBqIqEbR19fHgAEDsHv3bhw6dAhOTk4IDg6GjY0NZs+ejfPnz8sdkajaYmkgohqrc+fOWLlyJdLS0jBt2rTySzffeecdxMbGorS0VO6IRNUKSwMR1XiNGjXC1KlTkZycjI0bN6KkpATjx49Hjx498OWXXyIvL0/uiETVgqCW8DNn9+zZg9TUVNy8eRO1atVCu3bt8N5776Fp06aV2l9OTo7OX2+tVCprxNoq56lbdHGeZ86cwdatW7F3714oFAoMHToUM2bMQJMmTeSOJjldfD+fpabM08DAAGZmZhrZl6RHGn777Td
4eHhgyZIlCAwMRGlpKZYsWcJ7xBNRtde1a1esXr0aaWlpmDJlCg4fPgyVSgUvLy8cOnSISxdUI0laGubOnQtnZ2dYWlrijTfeQEBAAO7cuYPMzEwph6VXIAiC3BGIqjVTU1NMnz4dycnJCAsLw8OHDzF27Fg4Ojpi06ZNyM/PlzsiUZWp0nMaHjx4AAAwMjKqymHpGUpLgdu3H+PIkRs4c+Ye8vJ4rTrRi9SqVQvDhw/Hvn37sH//fqhUKixbtgzdunXD3LlzcenSJbkjEkmuykqDWq1GWFgYOnToAEtLy6oalp5BEARcvJgPe/sIjBp1AB4eOzF69EHcu6fb54sQacqbb76JtWvXIjU1FQEBAYiNjYWrqyveffddxMfHo6ysTO6IRJKostIQEhKCP/74A9OnT6+qIek58vOL8cEHP6Co6P/WZDMysvHzzzlcriB6Bebm5pg5cyZSUlLwxRdfID8/Hz4+PnByckJISAgKCgrkjkikUZJePfHEli1bkJGRgU8//RSmpqYvfG5iYiKSkpIqbLOwsICPjw+KiopQBXFlZWBgIPkVIlev3oW9fTjy84sqbP/kkx748MMe0NPTk3R8oGrmWR1wnrrlZfNUq9VISUnBxo0bsXv3btSpUwfe3t6YOHEi2rVrV4VJXw/fT90iCAJq166NsLAwZGdnV3jMwcEBjo6O4vcldWnYsmUL0tPTsXDhQlhYWLzWvnjJpWY8fqzGvHmJ2Lbttwrbf/hhBNq2rf/S1wuCgHv3ilFaqkbdunowNHz1A1Y15VInzlO3vMo8s7KyEB4ejvDwcNy9exc9e/bEuHHj4OLiAoWiet8ih++nbtGaSy5DQkKQmJiIqVOnonbt2sjLy0NeXh4vuZRZrVoC5s2zR9++rSAIQMOGtbFmjRuaNjV86WtLStT48ce7GDw4Gm+++TWmTEnAnTu6XeSIKqNx48aYPXs2UlNTsXr1aty5cwfe3t5wcXFBaGgo/vzzT7kjEr0ySY80DB8+/JnbAwIC4OLi8sr745EGzXrwoBSFhaVQKAQolbUgCC//p3D7dhFsbSNQXPx/J3o5OVkiJKQ3jIzEL2vUlIbPeeqW15mnWq1Geno6tm7digMHDqBu3boYPnw4fH190apVKw0nfT18P3WLJo806GtkL88RFRUl5e7pNRka6sHQ8MkvenHd8ddfcysUBgA4ceIPPHhQ8kqlgaimEQQBNjY2sLGxwc2bN/HNN98gIiICW7duhZubG8aNGwdnZ2eejEzVWvVeWKNqx9z86SUMpbIO9PT4T4lIrKZNm+Kjjz5CWloagoODcevWLYwcORKurq4ICwtDYWGh3BGJnok/6emVNG1aD66uzStsW7bMGSYmtWRKRKS9nixRfP/999i1axfat2+PwMBAqFQqLFy4ENeuXZM7IlEFki5PkO4xNtbHhg1v4dKlPPz++z3Y2zeDuXltKBS6fSkskZQEQYC9vT3s7e1x48YNfP311/j2228REhICd3f38ttWc+mC5MYjDfTKjI31oVKZYsSIdmjRwhB16/JcBiJNadasGebNm4f09HQEBQXh+vXrGDFiBNzc3BAeHl5+O34iObA0UKXp+o22iORUt25djBw5EvHx8dixYwdat26NefPmQaVSYfHixbh+/brcEakGYmkgIqrGBEGAg4MDQkJCcPLkSYwcORLbtm1Djx49MH78eJw8eZIFnqoMSwMRkZZo3rw5Pv74Y6Snp2PZsmXIzMzEsGHD4O7ujsjISDx8+FDuiKTjWBqIiLSMoaEhvL29ceTIEWzfvh1vvPEGPvzwQ6hUKixduhQ3btyQOyLpKJYGIiItJQgCnJycsHXrViQlJcHLywvh4eGwt7eHn58fkpOTuXRBGsXSQESkA1q0aIEFCxYgPT0dn332GS5evIihQ4eid+/e2L59O5cuSCNYGoiIdEi9evUwZswY/PDDD9i2bRuaNGmCDz74ADY2Nli2bBlu3rwpd0TSYiwNREQ6SBA
EODs745tvvsGJEycwZMgQhIWFwd7eHhMmTEBaWhqXLuiVsTQQEem4Vq1a4dNPP0VGRgYWLVqEX375BYMHD4aHhwd27NiBoqIiuSOSlmBpICKqIYyMjODr64tjx44hIiICpqammDFjBmxtbfH5558jKytL7ohUzbE0EBHVMAqFAj179kRERASOHTuGgQMHYvPmzbCzs8OkSZOQkpIid0SqplgaiIhqsDZt2uCzzz5Deno6AgMDcfr0abi6uqJ///7YtWsXly6oApYGIiJCgwYNMH78eJw4cQK7d++GsbExpk6dCjs7O6xcuRK3b9+WOyJVAywNRERUTqFQwMPDA5GRkfjhhx/Qr18/bNy4Eba2tpgyZQpOnz4td0SSEUsDERE9U9u2bbF06VKkp6eXf1x3//79MWDAAOzZswePHz+WOyJVMZYGIiJ6IWNjY7z//vtITExEaGgoDA0NMXnyZNjb22PVqlXIycmROyJVEZYGIiISRU9PD71798aOHTtw5MgRuLu7Y926dbC1tcW0adNw5swZuSOSxFgaiIjolXXo0AErVqxAeno6PvzwQ5w6dQoeHh4YNGgQ9u7di+LiYrkjkgRYGmo4QRDkjkBEWszExAQTJ07EyZMnERISAgMDAwQEBMDe3h5ffPEF7t69K3dE0iCWhhqqtBS4ffsxjhy5gTNn7iEvr0TuSESkxfT19eHh4YGdO3ciPj4ebm5uWLNmDWxsbDBz5kycO3dO7oikASwNNZAgCLh4MR/29hEYNeoAPDx2YvTog7h3j4cTiej1derUCZ9//jnS0tIwa9YsnDhxAn369MHQoUNx4MABlJTwjxRtxdJQA+XnF+ODD35AUVFp+baMjGz8/HMOlyuISGOUSiUmTZqEU6dO4auvvoJarcb777+P7t27Y/369cjNzZU7Ir0iloYaqLhYjStX8p/a/ttvuSwNRKRx+vr66N+/P3bv3o1Dhw7B2dkZwcHBsLGxwQcffIDz58/LHZFEYmmogerX10e/fq2e2t6rVwuUlZXJkIiIaorOnTsjODgYaWlpmDZtGo4ePQp3d3e88847OHjwIJcuqjmWhhqoVi0B8+bZo2/fVhAEoGHD2lizxg1NmxrKHY2IaohGjRph6tSpSE5OxsaNG1FcXAw/Pz84ODhg48aNuHfvntwR6RkEtVqtljuEWDk5OTp/7a9Sqayydb4HD0pRWFgKhUKAUlkLglB1/xSqcp5y4jx1C+cprTNnzmDr1q3Yu3cvFAoFhg4dirFjx6JDhw6SjFdT3k8DAwOYmZlpZF880lCDGRrqwcysFho1MqjSwkBE9Cxdu3bF6tWrkZqaismTJ+Pw4cPo1asXvLy8cOjQIZSWlr58JyQplgYiIqpWzMzMMGPGDCQnJ2P9+vV48OABxo4dC0dHR2zatAn5+U+fyE1Vg6WBiIiqpVq1amHw4MHYv38/9u/fD5VKhWXLlqFbt26YO3cuLl26JHfEGoelgYiIqr0333wTa9euRWpqKgICAhAbGwtXV1e8++67iI+P55VfVUTy0hAXF4dJkybhvffew/z583H58mWphyQiIh1lbm6OmTNnIiUlBWvXrkVBQQF8fHzg5OSEkJAQFBQUyB1Rp0laGk6ePInw8HB4eXkhKCgILVq0wJIlS/imEhHRa6lduzaGDBmCAwcOICYmBtbW1li8eDFUKhU+/vhj/oEqEUlLw4EDB/DWW2/BxcUFzZo1g5+fH2rXro2jR49KOSwREdUg3bp1w/r165GSkgI/Pz/ExMTAxcUF3t7eSEhI4NKFBklWGkpKSpCZmYkuXbqUbxMEAV26dMHFixelGpaIiGqoxo0bY/bs2UhNTcXq1atx584djBo1Cs7Ozti6dSvu378vd0StJ1lpuH//PsrKymBsbFxhu7GxMfLy8qQaloiIarg6depg2LBhiI2NRXR0NDp37oyFCxdCpVLhk08+QWZmptwRtRavniAiIp0kCAJsbGzw5ZdfIjk5Gb6+vti
9ezecnZ0xevRoHD58GFp0U+RqQV+qHdevXx8KheKpm3Dk5+ejYcOGz31dYmIikpKSKmyzsLCAj48PGjRooPNvsIGBAZRKpdwxJMd56hbOU7fo4jyVSiWCgoKwaNEifPfdd1i/fj0GDhyIY8eOwdbWVu54knry6cVhYWHIzs6u8JiDgwMcHR3F70vKz56YP38+2rRpA19fXwCAWq1GQEAAPDw84Onp+cr742dP6A7OU7dwnrqlJsxTrVYjMzMTVlZW5b9UdZXWfPZE//79ceTIERw7dgw3btzA5s2bUVRUBFdXVymHJSIieqEnSxe6Xhg0TbLlCQDo0aMH7t+/jx07diAvLw8tW7bE/Pnz0aBBAymHJSIiIglIWhoAoE+fPujTp4/UwxAREZHEePUEERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERicLSQERERKKwNBAREZEoLA1EREQkCksDERERiaIvxU5zcnKwa9cunDt3Dnl5eVAqlXB0dMSQIUOgry/JkERERCQxSX6D37hxA2q1Gv7+/rCwsMD169fx5Zdf4vHjx/D29pZiSCIiIpKYJKXB2toa1tbW5V+bm5tj4MCBiI+PZ2kgIiLSUlV2TsODBw9gZGRUVcMRERGRhlVJacjKykJcXBzc3d2rYjgiIiKSwCstT0RGRmLv3r0vfM6qVavQtGnT8q9zc3OxdOlS9OjRA25ubpVL+f/VhJMoBUGAgYGB3DEkx3nqFs5Tt3CeukWTvzsFtVqtFvvk+/fv4/79+y98joWFBfT09AD8VRgWLVqE9u3bIyAgQNQYiYmJSEpKqrCtY8eO8PT0FBuTiIiI/iEmJga//vprhW0ODg5wdHQUvxO1RO7evaueOnWqes2aNeqysrLX3t/evXs1kKr6Cw0NlTtCleA8dQvnqVs4T92jqd+hkpzTkJubi4ULF8LMzAze3t7Iz89HXl4e8vLyKr3Pf7YjXZWdnS13hCrBeeoWzlO3cJ66R1O/QyU5SeDMmTPIzs5GdnY2Jk6cWOGxqKgoKYYkIiIiiUlSGlxdXeHq6irFromIiEgm/OwJIiIiEkVv4cKFC+UOIdYbb7whd4QqwXnqFs5Tt3CeuqWmzBPQzFxf6ZJLIiIiqrm4PEFERESisDQQERGRKCwNREREJApLAxEREYmiVZ8AlZOTg127duHcuXPIy8uDUqmEo6MjhgwZovUfZhUXF4d9+/YhLy8PLVu2hK+vL9q0aSN3LI3Zs2cPUlNTcfPmTdSqVQvt2rXDe++9V+HDzXRRdHQ0tm3bhn79+mHMmDFyx9G43NxcfPvttzh9+jSKiorQpEkTTJw4EVZWVnJH05iysjLs2LEDiYmJyMvLg4mJCVxdXTF06FC5o72WX3/9FTExMcjMzEReXh5mz54NlUpV4TlRUVFISEhAYWEh2rdvDz8/PzRu3FimxJXzonmWlpZi27ZtOH36NLKzs2FoaIguXbrgvffeg4mJiczJX42Y9/OJr776CkeOHMGYMWPQr1+/VxpHq4403LhxA2q1Gv7+/li5ciXGjBmD+Ph4bN++Xe5or+XkyZMIDw+Hl5cXgoKC0KJFCyxZsgQFBQVyR9OY3377DR4eHliyZAkCAwNRWlqKJUuW4PHjx3JHk8zly5dx+PBhtGjRQu4okigsLERgYCAMDAwwf/58rFq1CqNGjYKRkZHc0TQqOjoahw8fxvjx47F69Wp4e3sjJiYGcXFxckd7LUVFRWjZsiXGjx//zMejo6MRFxeH999/H0uXLkXt2rWxZMk
SlJSUVHHS1/OieRYVFeHatWt45513EBQUhNmzZ+PWrVsICgqSIenredn7+URqaiouX74MpVJZqXG06s9za2trWFtbl39tbm6OgQMHIj4+Ht7e3jImez0HDhzAW2+9BRcXFwCAn58ffvzxRxw9ehSDBg2SOZ1mzJ07t8LXAQEB8PPzQ2ZmJjp06CBTKuk8evQIa9euxYQJE7Br1y6540giOjoapqammDBhQvk2MzMzGRNJ4+LFi1CpVOU/e0xNTZGYmIjLly/LnOz1/PPn6T/FxsZi6NCh6NatGwBg8uTJ8PPzQ2pqKnr06FFVMV/bi+ZpaGiI+fPnV9g2duxYzJs3D3fv3kWjRo2qIqJGvOz9BP46MhgaGor58+dj2bJllRpHq440PMuDBw+0+i+bkpISZGZmokuXLuXbBEFAly5dcPHiRRmTSevBgwcAoNXv3YuEhISgW7du6Ny5s9xRJJORkYHWrVtj5cqV8PPzw5w5c3DkyBG5Y2lc+/btce7cOdy6dQsAcPXqVVy4cAFvvvmmzMmkc/v2beTl5VX4uWRoaIi2bdvq9M8l4K8jaIIgoF69enJH0Si1Wo1169Zh0KBBsLS0rPR+tOpIwz9lZWUhLi4Oo0ePljtKpd2/fx9lZWUwNjausN3Y2Bg3b96UKZW01Go1wsLC0KFDh9f6x1tdJSUl4dq1a5Vu8toiOzsb33//PQYMGIAhQ4bg8uXLCA0NhYGBAZydneWOpzGDBw/Gw4cPMX36dCgUCqjVaowYMQIODg5yR5PMk08kftbPpdf5tOLqrri4GJGRkXB0dESdOnXkjqNR0dHR0NfXR9++fV9rP9WiNERGRmLv3r0vfM6qVasqnDSXm5uLpUuXokePHnBzc5M6ImlQSEgI/vjjDyxevFjuKBp39+5dhIWFITAwUOtPzn0ZtVqN1q1bY8SIEQCAli1b4vr164iPj9ep0nDy5EkkJiZi+vTpsLS0xNWrVxEWFgalUqlT86zpSktLsXLlSgiC8NLzArRNZmYmYmNjNXKuRrX4qTZw4MCXfiqmhYVF+f/Pzc3FokWL0KFDB7z//vsSp5NW/fr1oVAokJ+fX2F7fn4+GjZsKFMq6WzZsgU//fQTPv30U607O1mMzMxMFBQUYM6cOeXbysrKcP78ecTFxSEyMhKCIMiYUHNMTEzQrFmzCtuaNWuG1NRUmRJJIyIiAm+//Ta6d+8OAGjevDlycnKwZ88enS0NT372/PPnUH5+Plq2bClTKuk8KQx3797FJ598onNHGX777TcUFBRg4sSJ5dvKysrwzTff4ODBg1i3bp3ofVWL0lC/fn3Ur19f1HOfFIbWrVtX+AZoK319fVhZWeHs2bPll8eo1WqcO3cOHh4eMqfTrC1btiA9PR0LFy6Eqamp3HEk0aVLFwQHB1fYtn79ejRr1gyDBw/WmcIA/LXW/88ltJs3b+rce/v48WMoFBVP/xIEAbr8sT3m5uZo2LAhzp49W371z4MHD3Dp0iX06dNH5nSa9aQw3L59GwsWLNDJ86ycnZ3RtWvXCts+++wzODs7o2fPnq+0r2pRGsTKzc3FwoULYW5uDm9v7wp/nWvzX+X9+/fHhg0bYGVlhTZt2uDAgQMoKip66dEXbRISEoKkpCR8+OGHqF27dvm6qKGhIWrVqiVzOs2pU6fOU+dp1KlTB/Xr19e58zf69++PwMBA7NmzB927d8fly5eRkJAAf39/uaNpVLdu3bBr1y4olUo0b94cV65cwYEDB7R+WfTRo0fIysoq/zo7OxtXr16FkZERTE1N0a9fP+zevRuNGzeGubk5tm/fjkaNGsHGxkbG1K/uRfM0MTFBcHAwrl69io8++gglJSXlP5uMjIy0aonxZe/nP8uQnp4eGjZsiCZNmrzSOFr1KZc//PADNm7c+MzHoqKiqjiNZh06dAgxMTHlN3caO3YsWrduLXcsjRk+fPgztwcEBC5YSuoAAAD2SURBVJRfaqqrFi1ahJYtW+r
kzZ1+/PFHREZGIisrC+bm5hgwYIDW/zL9p0ePHiEqKgqpqakoKCiAiYkJHB0dMXToUOjp6ckdr9LOnz+PRYsWPbXdxcUFAQEBAIAdO3bgyJEjKCwsRMeOHTFu3Ditu7nTi+Y5bNgwTJ48+ZmvW7BgATp16iR1PI0R837+3eTJk9GvX79XvrmTVpUGIiIiko/W36eBiIiIqgZLAxEREYnC0kBERESisDQQERGRKCwNREREJApLAxEREYnC0kBERESisDQQERGRKCwNREREJApLAxEREYnC0kBERESisDQQERGRKP8P+/xG05hyFYIAAAAASUVORK5CYII=", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Weight vector of the fitted linear classifier\n", "w = classifier.coef_[0]\n", "print(w)\n", "\n", "# The decision boundary w[0]*x + w[1]*y + b = 0, solved for y, has slope a\n", "a = -w[0] / w[1]\n", "\n", "xx = numpy.linspace(0, 12)\n", "# ... and offset -b / w[1], where b is the intercept\n", "yy = a * xx - classifier.intercept_[0] / w[1]\n", "\n", "h0 = matplotlib.pyplot.plot(xx, yy, 'k-', label=\"non weighted div\")\n", "\n", "# Draw the data points in X, colored by their labels in y\n", "matplotlib.pyplot.scatter(X[:, 0], X[:, 1], c=y)\n", "matplotlib.pyplot.legend()\n", "matplotlib.pyplot.show()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "(C) 2017-2024 by [Damir Cavar](http://cavar.me/damir/) - [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)); portions taken from the referenced sources." 
] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" }, "latex_metadata": { "affiliation": "Indiana University, Department of Linguistics, Bloomington, IN, USA", "author": "Damir Cavar", "title": "Python examples and notes for Machine Learning for Computational Linguistics" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": false, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": false, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }