{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false, "deletable": true, "editable": true }, "source": [ "##
Support Vector Machines (SVMs) in scikit-learn
\n", "\n", "- for classification and regression (SVCs, SVRs)\n", "- can be applied on linear and non-linear data\n", "- look for the best separating line or decision boundary\n", "- look for the largest margin " ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "![](SVM.png)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "![](SVM2.png)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "deletable": true, "editable": true }, "source": [ "![](SVM3.png)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Commonly used kernels:\n", "\n", "- linear\n", "- polynomial\n", "- radial basis function (RBF) - Gaussian RBF\n", "- sigmoid\n", "- etc." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The accuracy on the training subset: 1.000\n", "The accuracy on the test subset: 0.629\n" ] } ], "source": [ "from sklearn.svm import SVC \n", "from sklearn.model_selection import train_test_split\n", "from sklearn.datasets import load_breast_cancer\n", "\n", "cancer = load_breast_cancer()\n", "X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, random_state=0)\n", "\n", "svm = SVC()\n", "svm.fit(X_train, y_train)\n", "\n", "print('The accuracy on the training subset: {:.3f}'.format(svm.score(X_train, y_train)))\n", "print('The accuracy on the test subset: {:.3f}'.format(svm.score(X_test, y_test)))" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEKCAYAAAAFJbKyAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu0XGV9//H3JyGQI/AjVaLCCTHhFqQEjB5BwCWgRZAS\niPxsJeqqXCRgQaJdQoNttT9bmljUFgQvUWjUQigipglgI/WGWFtICCZcjKQRyzkgobEJiIkm8P39\nsfeQSXoue8/MPntmz+e11qwz85yZvb87kzPPPLfvo4jAzMwsqzFlB2BmZp3FFYeZmeXiisPMzHJx\nxWFmZrm44jAzs1xccZiZWS6uOMzMLBdXHGZmlosrDjMzy2W3sgMowr777htTpkwpOwwzs46ycuXK\n/46IiSM9r5IVx5QpU1ixYkXZYZiZdRRJP8/yvEp1VUmaKWnh5s2byw7FzKyyKlVxRMSyiJizzz77\nlB2KmVllVariMDOz4lVyjMPMrBnbtm2jv7+frVu3lh1KIcaPH8+kSZMYN25cQ6+vVMUhaSYw8+CD\nDy47FDPrYP39/ey9995MmTIFSWWH01IRwcaNG+nv72fq1KkNHaNSFUdELAOW9fX1XVB2LGY2ij7/\nRvjFmv9d/srpcNE9uQ+3devWSlYaAJJ42ctextNPP93wMTzGYWadb9LRMHb3ncvG7p6UN6iKlUZN\ns9fmisPMOt8Jl4N2+TjTGDjhT8uJp+JccZhZ59v7lfCad+9odYzdPXm89yvKjasJknjPe97z4uPt\n27czceJETj/9dACWLl3KggULSomtUmMcHhw362InXA4P3JjcH+XWxpJVA1y1fC1PbNrC/hN6uOyU\nacya0dvUMffcc08efPBBtmzZQk9PD3fddRe9vTuOecYZZ3DGGWc0G3pDKtXi8AJAsy5Wa3VozKi2\nNpasGuCK29YwsGkLAQxs2sIVt61hyaqBpo992mmncccddwCwePFiZs+e/eLvFi1axCWXXALAOeec\nw6WXXspxxx3HgQceyK233tr0uYdTqYrDzLrcCZfD5DeMamvjquVr2bLt+Z3Ktmx7nquWr2362Gef\nfTY333wzW7duZfXq1RxzzDFDPvfJJ5/knnvu4fbbb2fevHlNn3s4leqq6lgtnkpo1rX2fiWc+81R\nPeUTm7bkKs/jyCOP5LHHHmPx4sWcdtppwz531qxZjBkzhsMPP5ynnnqq6XMPxy2OdlDAVEIzGx37\nT+jJVZ7XGWecwYc//OGduqkGs8cee7x4PyJacu6huOJoB55KaNaxLjtlGj3jxu5U1jNuLJedMq0l\nxz/vvPP42Mc+xvTp01tyvFZwxdEOKjiV0KxbzJrRy/yzptM7oQcBvRN6mH/W9KZnVdVMmjSJSy+9\ntCXHahUV3aQpQ19fX3TcRk7P/gKuPgq2b4XdxsPc1a44zEryyCOP8OpXv7rsMAo12DVKWhkRfSO9\n1i2OdlHSVEIzs7wqVXF0/A6AJUwlNDPLq1IVR8cvAKxNJXRrw8zaWKUqDjMzK54rDjMzy8UVh5mZ\n5dL2FYekV0v6vKRbJb2/7HjMzEbDSGnVy1RKripJNwCnAxsi4oi68lOBq4GxwJciYkFEPAJcJGkM\n8MUy4jUzG1JBueZGSqteprJaHIuAU+sLJI0FrgPeBhwOzJZ0ePq7M4B7gG+PbphmZiMoMNfccGnV\n7733Xo499lhmzJjBcccdx9q1STbev/u7v+O8884DYM2aNRxxxBH8+te/bjqWeqVUHBFxN/DLXYqP\nBtZFxPqI+C1wM3Bm+vylEXEc8O7RjdTMbAQF5pobLq36YYcdxg9+8ANWrVrFxz/+cT7ykY8AMHfu\nXNatW8c3vvENzj33XL7whS/wkpe8pOlY6rVTWvVe4PG6x/3AMZJOBM4C9gDuHOrFkuYAcwAmT55c\nXJRmZvVqWR9WfRWe/21Lc80Nl1Z98+bNvPe97+XRRx9FEtu2bQNgzJgxLFq0iCOPPJILL7yQ448/\nvuk4dtX2g+MR8b2IuDQiLoyI64Z53sKI6IuIvokTJ45miGbW7epbHS3ObD1UWvW/+Iu/4KSTTuLB\nBx9k2bJlbN269cXfPfroo+y111488cQTLYuj3ogVhxLvkfTR9PFkSUVsFDEAHFD3eFJaZmbW3grM\nNTdUWvXNmze/OFi+aNGincovvfRS7r77bjZu3FjINrJZWhyfBY4FatXdsySD2K12H3CIpKmSdgfO\nBpbmOUDH56oys85VUK65odKqX3755VxxxRUcf/zxPP/8jq1rP/ShD3HxxRdz6KGHcv311zNv3jw2\nbNjQ0phGTKsu6f6IeK2kVRExIy37cUQc1fBJpcXAicC+wFPAxyLiekmnAX9PMh33hoi4spHjd2Ra\ndTNrG06rPrwsg+Pb0qmykR54IvBCI4HWRMSgeyBGxJ0MMwBuVgrvCW+2kyxdVdcA3wBeLulKkvUU\nf1NoVA1yV5UVwnvCm+1kxIojIm4ELgfmA08CsyLia0UH1oiOT6tu7cl7wnelKu6OWtPstQ1ZcUh6\nae0GbAAWAzcBT6VlZt3Be8J3nfHjx7Nx48ZKVh4RwcaNGxk/fnzDxxhujGMlybiG6s+ZPg7gwIbP\natZpTrgcHrgxue/WRuVNmjSJ/v5+nn766bJDKcT48eOZNGlSw68fsuKIiKkNH9WsamqtjpX/4NZG\nFxg3bhxTp/ojcCiZUo5I+h3gEODFtk2ab8qse5xwOTz9iFsb1vVGrDgkvQ+YS7KS+wHgDcCPgDcX\nG1p+kmYCMw8++OCyQ7Eqqu0Jb9blskzHnQu8Hvh5RJwEzADasuPPs6rMzIqXpeLYGhFbASTtERE/\nAaYVG5aZmbWrLGMc/ZImAEuAuyT9D1BMykUzs3petd+WRqw4IuLt6d2/lPRdYB/gXwqNyswMktX5\nT69N9rmo8ar90mVJq/4GSXsDRMT3ge+RjHOYmRXLq/bbUpauqs8Br617/KtBymxXbmKbNa/A3fWs\ncVkGxxV16+4j4gXaa8vZ9uTEeGatkXF3vSWrBjh+wXeYOu8Ojl/wHZas8j5wRclScayXdKmkcelt\nLrC+6MA6npvYZq2RYXe9JasGuOK2NQxs2kIAA5u2cMVta1x5FCRLxXERcBzJNq4DwDHAnCKDalRb\npVV3Yjyz1hlhd72rlq9ly7bndyrbsu15rlq+djSi6zoj7gDYidpmB8BnfwFXHwXbt8Ju42Hu6iEr\njiWrBrhq+Vqe2LSF/Sf0cNkp05g1o3eUAzbrTFPn3cFgn2QCfrbg90c7nI6VdQfA4dKqXyDpkPS+\nJN0gabOk1ZI8MJ5Fxg3s3cw2a87+E3pylVtzhuuqmgs8lt6fDRxFkkr9T4Criw2rQjJsYO9mtllz\nLjtlGj3jxu5U1jNuLJed4iQXRRiu4tgeEdvS+6cDX4mIjRHxr8CexYdWEbXEeMOMbTyxaUuucjPb\n2awZvcw/azq9E3oQ0Duhh/lnTXd3b0GGm1b7gqT9gP8B3gJcWfe7UWv/SZoF/D7wcuC6iPjWaJ17\ntOw/oYeBQSoJN7PNsps1o9cVxSgZrsXxUWAFSXfV0oh4CEDSCTQ5HTcdL9kg6cFdyk+VtFbSOknz\nACJiSURcAJwDvLOZ87YrN7PNrJMMWXFExO3Aq4BXpx/cNSto/gN8EXBqfYGkscB1wNuAw4HZkg6v\ne8qfp7+vHDezzayTDLsCPCK2k3RV1Zc91+xJI+JuSVN2KT4aWBcR6wEk3QycKekRYAHwzYi4f6hj\nSppDur5k8uTJzYY46tzMNrNOkWUB4GjpBR6ve9yfln0A+D3gHZIuGurFEbEwIvoiom/ixInFRmpm\n1sXaPudURFwDXFN2HGZmlsiy5/hgi/02k2wlu72FsQwAB9Q9npSWZeY9x83Mipelq+qzwL8DC4Ev\nAj8CbgbWSnprC2O5DzhE0lRJuwNnA0vzHMB7jpuZFS9LxfEYMCMdP3gdySZODwInA3/byEklLSap\ngKZJ6pd0ftp6uQRYDjwC3FKbAmxmZu0jyxjHYfUf4BHxsKQZEbFeUkMnjYjZQ5TfCdzZ0EHNrDPk\n3OTMCUDbT5aKY62kz5F0T0GyhuOnkvYAtg39stHnMQ6zDpBjH/FaAtBaLrdaAlDAlUeJsnRVnQOs\nAz6Y3tanZduAk4oKrBHtNsbhHcnMBpFjkzMnAG1PI7Y4ImKLpM8A3wICWFuX/PBXRQbXyfxNyWwI\nOfYRdwLQ9jRii0PSicCjwLUkM6x+KulNBcfV8fxNyWwYGfcR9z4b7SnLGMengLdGxFoASYcCi4HX\nFRlY28o4sOdvStZ18gx611odK/9h2E3OLjtl2k4td3AC0HaQZYxjXK3SAIiInwLjigupcaOy5/ik\no3fsI14zyMCevylZ18n4t/GiDJucOQFoexpxz3FJNwAvAP+YFr0bGBsR5xUcW8MK3XO8fh/xmkH2\nE991jAOSb0r+T2+VlfFvw9pX1j3Hs3RVvR+4GLg0ffwDqpbevJEm9ggDe7XKwfPPrWvkGPS2zpZl\nVtVvgE+nNwAk/RNV2lQpx7xyIGliP3Bjcn+YgT2nSreuk/Fvwzpbo2nVj21pFGXLMa8c2PHNSmP8\njcqsnv82ukLbp1UfFY00sU+4HJ5+xN+oOlnO1BeWkf82Km/IimOIdOoAok1nVTUlbxN771fCud8s\nPi4rTt4uSsvGfxuVN1yL41PD/O4nrQ6kdBnnlVuF1H9ZqHG/vNmIhqw4IqKt8lCNCjexu4tnAZk1\npJ32HG9a0wsAa01sf3B0j4ypL8xsh0pVHO2WHdc6gGcBmeXmWVVm7qI0yyVTxSGpF3hV/fMj4u6i\ngjIbVZ4FZJbLiBWHpE+QrBJ/GKglXgrAFYeZNcTbwXa2LC2OWcC0NPXIqJN0IPBnwD4R8Y4yYjCz\n1vEmZ50vy+D4elq84E/SDZI2SHpwl/JTJa2VtE7SPICIWB8R57fy/M3ylrBmjfMmZ50vS4vj18AD\nkr4NvNjqiIhLh37JiBaR7Cj4lVqBpLEkWXdPBvqB+yQtjYiHmzhPy/nbkllzvMlZ58vS4lgK/BXw\nb8DKulvD0oH1X+5SfDSwLm1h/Ba4GTizmfMUwd+WzJrjTc46X5a06l8ejUCAXuDxusf9wDGSXgZc\nCcyQdEVEzB/sxZLmAHMAJk+eXFiQ/rZkNrQsg97eDrbzDZfk8JaI+ENJa0hmUe0kIo4sNLId59kI\nXJTheQuBhZDsAFhUPPtP6GFgkErC35aqzbOARpa1G9ebnHW+4Vocc9Ofp49GIMAAcEDd40lp2ajI\n+sHgb0vdx+Na2QzXjbvrv5M3OetswyU5fDL9+fNRiuU+4BBJU0kqjLOBd+U5gKSZwMyDDz4414nz\nfDD421L3yfOB2M3cjds9Skk5ImkxcCKwr6R+4GMRcb2kS4DlwFjghoh4KM9xI2IZsKyvr++CPK/L\n+8Hgb0vdxR+I2bgbt3uUkuQwImZHxH4RMS4iJkXE9Wn5nRFxaEQcFBFXjlY8/mCw4XgWUDaXnTKN\nnnFjdypzN241Zc1V1QNMjohKzjn1N6UKKWA7WI9rZeNu3O6RJVfVTOCTwO7AVEmvAT4eEWcUHVxe\njY5x+IOhQgrYDtYfiNm5G7c7KGL4mauSVgJvBr4XETPSsjURMX0U4mtIX19frFixItdrPN2yIp79\nBVx9FGzfuqNst/Ewd7X32jAbgaSVEdE30vOydFVti4jNkurLClsnURZ/U6oIbwdrVrgsg+MPSXoX\nMFbSIZI+Q5J+xKw9eTtYs0JlqTg+APwuSYLDxcAzwAeLDMqsKd4O1qxQWXJV/ZpkP4w/Kz6c5jQ6\nOG4V5O1gzQoz5OC4pGUMM5bRjrOqahoZHLc2V8A0WzPbWSsGxz+Z/jwLeCXwj+nj2cBjTUVnllcB\n02zNrDHD5ar6PoCkv4qIN9X9apkk7zduo+uEy+GBG3cu88C3WSmyDI5PTPf9BiBNQjixuJDMBlEb\n8B67e/LY02zNSpNlHceHgO9JWg8IeBVwYaFRmQ2mvtXh1oZZabLMqvoXSYcAh6VFP4mI3wz3GrNC\n1FodK//BrQ2zEmXJVfVHuxQdJYmI+EpBMZkNzdNszUqXpavq9XX3xwNvAe4HXHHY6Nv7lXDuN8uO\nwqyrZemq+kD9Y0kTgC8XFlETvADQzKx4jWzk9BxwaKsDaYWIWBYRc/bZZ5+yQzEzq6wsYxz1K8jH\nAIcDXysyKDMza19Zxjg+WXd/O/DziOgvKB4zs87TZSlxsnRVnRYR309vP4yIfkmfKDwyM7NOMeno\nHYtTayqcEidLi+NkYNe5j28bpKwQkvYEPgv8lmQXwhtHeImZWfPytCK6LCXOkC0OSe+XtAaYJml1\n3e1nwOpmTirpBkkbJD24S/mpktZKWidpXlp8FnBrRFwAtG1GXjOrmDytiC5LiTNcV9VNwExgafqz\ndntdRLynyfMuAk6tL5A0FriOpDVzODBb0uHAJODx9GnPN3leM7Ns6neSrBmuFdFFO08OV3FERDwG\nXAw8W3dD0kubOWlE3A38cpfio4F1EbE+In4L3AycCfSTVB4jxWtm1jp5WxFdtPPkcGMcNwGnAytJ\npuOq7ncBHDjYi5rQy46WBSQVxjHANcC1kn4fWDbUiyXNAeYATJ48ucWhWVUtWTXAVcvX8sSmLew/\noYfLTpnGrBm9ZYdl7SJvYs0uSYkz3H4cp6c/p45eOIPG8RxwbobnLQQWQrIDYNFxWedbsmqAK25b\nw5ZtSQ/owKYtXHFbMhjqysOA/Ik1uyQlTpZZVUjqJUmn/uLz0+6mVhoADqh7PCktMyvEVcvXvlhp\n1GzZ9jxXLV/risN2KKsV0cZrQ7KsHP8E8E7gYXYMTgfQ6orjPuCQdKOoAeBs4F15DuBcVZbHE5u2\n5Cq3LlVWK6KNt0vOMtg8C5gWEadFxMz01tS0WEmLgR+RTPXtl3R+RGwHLgGWA48At0TEQ3mO61xV\nlsf+E3pylZuNqryzukZRlopjPTCulSeNiNkRsV9EjIuISRFxfVp+Z0QcGhEHRcSVrTyn2a4uO2Ua\nPePG7lTWM24sl50yraSIzOq08dqQLGMcvwYekPRt4MWd/yLi0sKiapC7qiyP2jiGZ1VZ22rT7ZIV\nMfwEJEnvHaw8ItpyTw5IZlWtWLGi7DDMzJp3+58ks7pedy6c/ulCTyVpZUT0jfS8LBs5tW0FYS3Q\nxjM3zIy2XBuSZVbVGnbsx1GzGVgB/HVEbCwiMBslbTxzw1Ku3LtbG64NyTLG8U2Sabg3pY/PJllF\nvpkk59TMQiKz0VHBrJ6VWw3uyt3aTJZZVcdHxBURsSa9/RlwQkR8AphSbHj5SJopaeHmzZvLDqVz\ntPHMjUbUVoMPbNpCsGM1+JJVHbyWtI2nZVp3ytLi2EvS0RFxL4Ck1wN7pb/bXlhkDYiIZcCyvr6+\nC8qOpXSN7iXQig+kErtWKrkavFa5r/pq0uro8MrdOl+WiuN9wA2S9iLponoGeF+6wdL8IoOzJuTp\n3sibj6eV526xyq4Gb9NpmZZTRcarRuyqioj7ImI68BrgqIg4MiLujYjnIuKW4kO0hjSyl8DkN7Tm\nA6nErpXKrgbvopTdlVaRLWYz7W+RpjS/EJgr6aOSPlpsWNa0RvYSOPebrflAKnHcpNKrwVtZuVs5\nKjJelWU67ueBlwAnAV8C3gHcW3Bc1gpldm+UdO5KrwZvw2mZHanM7qKKjFdlGeM4LiKOlLQ6Iv6f\npE8BtxUdmLVAq8cu8vzBtfrcOcya0VuNisKKUfb05gpMRsnSVVUbVfy1pP2BbUCpmztZDq3s3sjb\nP+uuFWtHZXcXtXq8qoRxkywVx+2SJgBXAfcDj5HsB26doJVjF3n/4Fp5brNWaYe1Sx0+GSXLrKq/\niohNEfF1kl0AD4uIvygsoiZ4AWDB2uEPzqwV6j9syxic7vDJKENmx5V01nAvjIi2HedwdtwCPfsL\nuPoo2L4VdhsPc1e74rDONIpZZxuVOX1Oi/4uW5Ed91bggfQGyeK/msAD5KOvHRYPlTjobdZSbZh1\ntl4tfU4tE0ItfQ7wvyuPUf67HK7iOIskoeGRwD8DiyNiXaHR2PDKng1S0+Z/cGaZtPn05tzpc0bx\n73LIMY6IWBIRZwMnAP8JfErSPZJOKDwqG1zZs0FqPOhtVrjc6XNG8e8yyzqOrSQp1J8hGRwfX2hE\nNrSKLB5qVKXSpbdDt6O1tf0n9DAwSCXRDulzhmxxSHqzpIXASpJV41dHxGsiYvmoRZfEcaCk6yXd\nOprnbVtlzwbJaMmqAY5f8B2mzruD4xd8p+m05pVLl16RnEVWnHZOnzNci+NfgdXAPcAewB9J+qPa\nLyPi0pEOLukG4HRgQ0QcUVd+KnA1MBb4UkQsGOoYEbEeON8VR6oDBqczD+rl+NZduXTpFdxAy1qr\nndPnDFdxnNuC4y8CrgW+UiuQNBa4DjgZ6Afuk7SUpBLZNU37eRGxoQVxVEubD05n/pDPMdhfuXTp\nXd7taNm0a/qcISuOiPhyswePiLslTdml+GhgXdqSQNLNwJkRMZ+kddIQSXOAOQCTJ09u9DCdoc1n\ng2T+kM/xrbud+3sb5j02rENlSqveYr3A43WP+9OyQUl6WZqhd4akK4Z6XkQsjIi+iOibOHFi66K1\n3DLviZFjxWs79/c2zHtsWIfKMquqVBGxEbio7Dgsu8tOmbbTGAcM8yGf8Vt3O/f3NqXNux2rqlIz\n9EpQRsUxABxQ93hSWtY0STOBmQcffHArDmcNyvUhn2Owv137e5vS5t2OVZRrRbYNashcVS8+QToU\n+Bzwiog4QtKRwBkR8deZTpCMcdxem1UlaTfgp8BbSCqM+4B3RcRDjV7ErpyrqsM8+wu49Vx4xyJ3\n14yWLl5HcvyC7ww6XtY7oYcfzntzCRG1j1bkqqr5InAZ8AWAiFgt6SZgxIpD0mLgRGBfSf3AxyLi\nekmXAMtJZlLd0MpKwzqQv3W3Rp7KoF3S15SgcjP0GP2utywVx0si4l6pPsch27McPCJmD1F+J3Bn\nlmPYzqrWN1u16ylVnsqgi9eRVG2GXhldb1lmVf23pINIMuIi6R3Ak4VE06Sq78dRtdXTVbue0uXJ\nZdbFe6tUbYbecOumipKl4riYpJvqMEkDwAdp01lOEbEsIubss88+ZYdSiDL+gxSpatdTuryVQYek\nr2m1WTN6mX/WdHon9CCSsY35Z03v2JZuGV1vw3ZVSRoD9EXE70naExgTEc8WFo0Nq2p9s1W7nraQ\nZ1FhB6SvKUqVZuiV0fU2bIsjIl4ALknvP+dKo1yZF9Z1iKpdT1vIu6iwlXtfWynK6HrL0lV1l6QP\nSzpA0ktrt8IisiFVrW+2atfTNvJUBt5bpeOV0fWWZR3HzwYpjog4sJiQGle3APCCRx99tOxwClG1\nWUhVux6zTpZ1HceIFUcn8gJAsxbp4oWC3ahlCwDr9+CoFxFfGazczCqkixcKglvEQ8myAPD1dffH\nk6QKuZ+6PTbMrKK6eKGgc1oNbcSKIyI+UP9Y0gSg6b06zCqpal07XbzhVOV2nWyhRvbjeA44tNWB\nmFVCFfcS79KFgl5nNLQsYxzLSNONkFQ0hwNfKzIoaw33z5agil07XbpQsGo5rVopyxjHJ+vubwd+\nHhH9BcVjLeL+2ZJUtWunCzecyrUhWZfJ0lV1WkR8P739MCL6JX2i8MisKc4DVaIqdu104ULBquW0\naqUsLY6TgV3/579tkLLSeQfAHdw/W6Iu7dqpoirltGqlIVsckt4vaQ0wTdLqutvPgNWjF2J2Vc+O\nm4fzQJXMOaCswobrqroJmAksTX/Wbq+LiPeMQmzWBOeBKlkXdu1Y9xiyqyoiNgObgdkAkl5OsgBw\nL0l7RcR/jU6I1oha89qzqsys1bJMx50JfBrYH9gAvAp4BPjdYkOzZrl/1syKkGVW1V8DbwB+GhFT\nSVKO/LDQqOpImiXpi5L+WdJbR+u8ZmY2uCwVx7aI2AiMkTQmIr4LvCbLwSXdIGmDpAd3KT9V0lpJ\n6yTNG+4YEbEkIi4AzgHemeW8ZmZWnCzTcTdJ2gv4AXCjpA0kCwGzWARcS11CREljgetIpvn2A/dJ\nWgqMBebv8vrzImJDev/P09eZmVmJslQcZwJbgA8C7wb2AT6e5eARcbekKbsUHw2si4j1AJJuBs6M\niPnA6bseQ5KABcA3I+L+LOc1M7PiZMmO+5ykVwGHRMSXJb2EpHXQqF7g8brH/cAxwzz/A8DvAftI\nOjgiPj/YkyTNAeYATJ48uYnwzLqDc5lZo7LMqrqA5AP5pcBBJB/8nycZJC9cRFwDXJPheQuBhZDs\nAFh0XGadzLnMrBlZBscvBo4HngGIiEeBlzdxzgHggLrHk9IyMxslzmVmzchScfwmIl7cN1LSbuxI\ns96I+4BDJE2VtDtwNsnq9KZJmilp4ebNm1txOLPKci4za0aWiuP7kj4C9Eg6mWQvjmVZDi5pMfAj\nknxX/ZLOj4jtwCXAcpKFhLdExEONhb8z56oyy8a5zKwZWSqOecDTwBrgQuBOkqmxI4qI2RGxX0SM\ni4hJEXF9Wn5nRBwaEQdFxJWNBm9mjXEuM2vGkIPjkiZHxH9FxAvAF9NbW3NadbNsnMvMmqGIwYcr\nJN0fEa9N7389Iv7vqEbWhL6+vlixYkXZYZiZdRRJKyOib6TnDddVpbr7BzYfkpmZVcFwFUcMcd/M\nzLrYcAsAj5L0DEnLoye9T/o4IuL/FB6dmZm1neE2cmomrYiZmVVUlum4HcMLAM3MilepisMLAM3M\nilepisPMzIrnisPMzHLJspGTmRXA+2FYp3LFYVYC74dhncxdVWYl8H4Y1slccZiVwPthWCdzV5VZ\ni2UZu9h/Qg8Dg1QS3g/DOkGlWhxeAGhlq41dDGzaQrBj7GLJqp13R/Z+GNbJKlVxeAGglS3r2MWs\nGb3MP2s6vRN6ENA7oYf5Z033wLh1BHdVmbVQnrGLWTN6XVFYR6pUi8OsbN7L27qBKw6zFvLYhXWD\ntu+qkvRqYC6wL/DtiPhcySGZDcl7eVs3KLTikHQDcDqwISKOqCs/FbgaGAt8KSIWDHWMiHgEuEjS\nGOCLRcZciVrCAAAH3ElEQVRr1goeu7CqK7rFsQi4FvhKrUDSWOA64GSgH7hP0lKSSmT+Lq8/LyI2\nSDoDmJcey8xGkXNq2a4KrTgi4m5JU3YpPhpYFxHrASTdDJwZEfNJWieDHWcpsFTSHcBNgz1H0hxg\nDsDkyZNbEr9Zt3NOLRtMGYPjvcDjdY/707JBSTpR0jWSvgDcOdTzImJhRPRFRN/EiRNbF61ZF3NO\nLRtM2w+OR8T3gO+VHIZZV3JOLRtMGS2OAeCAuseT0rKmOeWIWWt5XYoNpoyK4z7gEElTJe0OnA0s\nbcWBnXLErLW8LsUGU2jFIWkx8CNgmqR+SedHxHbgEmA58AhwS0Q8VGQcZtYY59SywSgiyo6h5fr6\n+mLFihVlh2Fm1lEkrYyIvpGeV6mUIx7jMDMrXqUqDo9xmJkVr1IVh5mZFc8Vh5mZ5eKKw8zMcmn7\nleN5SJoJzASekfRog4fZF/jv1kXVFqp2Tb6e9le1a6ra9cDg1/SqLC+s5HTcZkhakWU6Wiep2jX5\netpf1a6patcDzV2Tu6rMzCwXVxxmZpaLK47/bWHZARSgatfk62l/Vbumql0PNHFNHuMwM7Nc3OIw\nM7NcXHHUkXSqpLWS1kmaV3Y8zZL0mKQ1kh6Q1JFZHyXdIGmDpAfryl4q6S5Jj6Y/f6fMGPMY4nr+\nUtJA+j49IOm0MmPMQ9IBkr4r6WFJD0mam5Z38ns01DV15PskabykeyX9OL2e/5eWT5X0H+nn3T+l\n21xkO6a7qhKSxgI/BU4m2c72PmB2RDxcamBNkPQY0BcRHTv/XNKbgF8BX4mII9KyvwV+GREL0gr+\ndyLiT8uMM6shrucvgV9FxCfLjK0RkvYD9ouI+yXtDawEZgHn0Lnv0VDX9Id04PskScCeEfErSeOA\ne4C5wJ8At0XEzZI+D/w4Ij6X5ZhucexwNLAuItZHxG+Bm4EzS46p60XE3cAvdyk+E/hyev/LJH/U\nHWGI6+lYEfFkRNyf3n+WZI+dXjr7PRrqmjpSJH6VPhyX3gJ4M3BrWp7rPXLFsUMv8Hjd4346+D9L\nKoBvSVopaU7ZwbTQKyLiyfT+L4BXlBlMi1wiaXXaldUx3Tr1JE0BZgD/QUXeo12uCTr0fZI0VtID\nwAbgLuA/gU3pxnqQ8/POFUe1vTEiXgu8Dbg47SaplEj6Wju9v/VzwEHAa4AngU+VG05+kvYCvg58\nMCKeqf9dp75Hg1xTx75PEfF8RLwGmETSu3JYM8dzxbHDAHBA3eNJaVnHioiB9OcG4Bsk/2Gq4Km0\nH7rWH72h5HiaEhFPpX/YLwBfpMPep7Tf/OvAjRFxW1rc0e/RYNfU6e8TQERsAr4LHAtMkFTLV5jr\n884Vxw73AYekMw12B84GlpYcU8Mk7ZkO7CFpT+CtwIPDv6pjLAXem95/L/DPJcbStNoHbOrtdND7\nlA68Xg88EhGfrvtVx75HQ11Tp75PkiZKmpDe7yGZAPQISQXyjvRpud4jz6qqk06v+3tgLHBDRFxZ\nckgNk3QgSSsDkizIN3Xi9UhaDJxIksnzKeBjwBLgFmAy8HPgDyOiIwach7ieE0m6PwJ4DLiwbnyg\nrUl6I/ADYA3wQlr8EZIxgU59j4a6ptl04Psk6UiSwe+xJI2FWyLi4+lnxM3AS4FVwHsi4jeZjumK\nw8zM8nBXlZmZ5eKKw8zMcnHFYWZmubjiMDOzXFxxmJlZLq44rGtIer4us+kDaTqJvMeYIOmPWx/d\ni8c/R9K1OV+zSNI7Rn6mWWvsNvJTzCpjS5p2oRkTgD8GPpvnRZLGRsTzTZ7brC24xWFdLU3+dpWk\n+9LkdRem5XtJ+rak+5XsaVLLlLwAOChtsVwl6URJt9cd71pJ56T3H5P0UUn3AH8g6SBJ/5ImnfyB\npGHzBaUtiWsk/Zuk9bVWhRLXpvtF3AG8vO41r5P0/fQcyyXtJ2m39PpOTJ8zX1LHLQa19uEWh3WT\nnjRDKMDPIuLtwPnA5oh4vaQ9gB9K+hZJpuS3R8QzkvYF/l3SUmAecESt5VL7MB7G1oh4Y/rcbwMX\nRcSjko4habW8eYTX7we8kSQp3VKSNNhvB6YB00myzj4M3JDmV/oMcGZEPC3pncCVEXFeWpndKukD\nwKnAMSP/c5kNzhWHdZPBuqreChxZN0awD3AISZrpv0kzCr9AknK6kdTg/wQvZlo9DvhakgoJgD0y\nvH5JmlTvYUm1878JWJx2fT0h6Ttp+TTgCOCu9BxjSbK4EhEPSfoqcDtwbLrnjFlDXHFYtxPwgYhY\nvlNh8g19IvC6iNimZDfF8YO8fjs7d/nu+pzn0p9jSPY/yDvGUp87SEM+a8fvH4qIY4f4/XRgE3Vd\nW2aN8BiHdbvlwPvTbh4kHZpmE94H2JBWGicBr0qf/yywd93rfw4cLmmPNAPpWwY7Sbqfw88k/UF6\nHkk6qsGY7wbemY7P7AeclJavBSZKOjY9xzhJv5veP4skmd2bgM/UsqWaNcIVh3W7L5GMEdwv6UHg\nCyQt8RuBPkkrgHcDPwGIiI0k4yAPSroqIh4nyQK7GvgqSZbRobwbOF/Sj4GHaHxr4m8Aj5Jkb/0c\n8P00tt+SpMn+RHqOB4Dj0jGaBcD7IuKnwLXA1Q2e28zZcc3MLB+3OMzMLBdXHGZmlosrDjMzy8UV\nh5mZ5eKKw8zMcnHFYWZmubjiMDOzXFxxmJlZLv8fcHAMunKnwvgAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline \n", "\n", "plt.plot(X_train.min(axis=0), 'o', label='Min')\n", "plt.plot(X_train.max(axis=0), 'v', label='Max')\n", "plt.xlabel('Feature Index')\n", "plt.ylabel('Feature Magnitude in Log Scale')\n", "plt.yscale('log')\n", "plt.legend(loc='upper right')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Minimum per feature\n", "[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n", " 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", "Maximum per feature\n", "[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n", " 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n" ] } ], "source": [ "min_train = X_train.min(axis=0)\n", "range_train = (X_train - min_train).max(axis=0)\n", "\n", "X_train_scaled = (X_train - min_train)/range_train\n", "\n", "print('Minimum per feature\\n{}'.format(X_train_scaled.min(axis=0)))\n", "print('Maximum per feature\\n{}'.format(X_train_scaled.max(axis=0)))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/plain": [ "SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n", " decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',\n", " max_iter=-1, probability=False, random_state=None, shrinking=True,\n", " tol=0.001, verbose=False)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_test_scaled = (X_test - min_train)/range_train\n", "\n", "svm = SVC()\n", "svm.fit(X_train_scaled, y_train)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The accuracy on the training subset: 0.948\n", "The accuracy on the test subset: 0.951\n" ] } ], "source": [ "print('The accuracy on the training subset: {:.3f}'.format(svm.score(X_train_scaled, y_train)))\n", "print('The accuracy on the test subset: {:.3f}'.format(svm.score(X_test_scaled, y_test)))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The accuracy on the training subset: 0.988\n", "The accuracy on the test subset: 0.972\n" ] } ], "source": [ "svm = SVC(C=1000)\n", "svm.fit(X_train_scaled, y_train)\n", "\n", "print('The accuracy on the training subset: {:.3f}'.format(svm.score(X_train_scaled, y_train)))\n", "print('The accuracy on the test subset: {:.3f}'.format(svm.score(X_test_scaled, y_test)))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "deletable": true, "editable": true }, "source": [ "##
Advantages and Disadvantages of SVMs (scikit-learn)
" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Stronger points:\n", " - they are versatile\n", " - can build complex decision boundaries on low-dimensional data\n", " - can work well on high-dimensional data with relatively small sample size\n", " - etc.\n", "\n", " \n", "### Weaker points:\n", " - don't perform well on high-dimensional data with many samples (i.e. > 100k)\n", " - preprocessing may be required => implies knowledge and understanding of hyper-parameters\n", " - harder to inspect and visualize\n", " - etc.\n", " \n", " \n", "### Alternatives:\n", " \n", " - DT and Random Forests (require less/no preprocessing of data, easier to understand, inspect, and visualize)\n", " \n", "### Good practices: \n", "- data scaling\n", "- other pre-processing\n", "- choosing an appropriate kernel\n", "- tuning hyper-parameters: C, gamma, etc. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.0" } }, "nbformat": 4, "nbformat_minor": 2 }