{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Clustering and Unsupervised Analysis\n", "\n", "## Clustering Overview\n", "<p>This notebook covers different clustering methods. We'll cover both k-means and hierarchical clustering. We'll also cover how to incorporate SVD into clustering methods. <br><br>\n", "\n", "Clustering is an unsupervised technique that doesn't require a particular outcome variable. The core idea of clustering follows this logic:<br>\n", "<ul>\n", " <li>Identify an object/instance $i$ by its feature vector $X_i=\\langle X_i^1,X_i^2,\\ldots,X_i^m\\rangle$ </li>\n", " <li>Define a distance function $d(X_i,X_j)$ to measure distance between $X_i$ and $X_j$.</li>\n", " <li>Create distinct groupings of instances that:</li>\n", " <ul>\n", " <li>Minimize intra-group distances</li>\n", " <li>Maximize inter-group distances</li>\n", " </ul>\n", "</ul>\n", "<br>\n", "Note that in the above, if we use a similarity function instead of a traditional distance metric (e.g., cosine similarity as opposed to Euclidean distance), we want to maximize similarity within clusters and minimize it between clusters.\n", "<br><br>\n", "The key questions we often consider when clustering are:\n", "<ul>\n", " <li>What is a good distance/similarity function?</li>\n", " <li>How many clusters are appropriate?</li>\n", " <li>How do we interpret/assess the clustering?</li>\n", "</ul>\n", "<br>\n", "We'll explore these in the sections below.\n", "\n", "<br><br>\n", "The data we'll use is from the student survey that everyone filled out at the beginning of the class. The survey asked each student to rank themselves on a scale of 1 to 10 in each of the following data science skill sets: Visualization, Computer Science, Math, Statistics, Machine Learning, Business, Communication. 
In the next section we load and do some basic distributions of the results.\n", "</p>" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAscAAAEnCAYAAABFWu4xAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3X20ZXV95/n3JzxIFOTBqkEEysLR\nmNDpFTUlDdF2GFBbwRbWLJuBJAoJ3TVO6ywcs6LkyYx2kgWZ7qhJk8lUlLYwhocQIoToKI3YrqSV\nWCAgDyqgRQspqCLyaJsI+p0/zu/ipby37uPZe5973q+1zjr77L3PPd+777d2fc9v/377l6pCkiRJ\nEvxI3wFIkiRJQ2FxLEmSJDUWx5IkSVJjcSxJkiQ1FseSJElSY3EsSZIkNRbHA5Dkj5L8Rt9xSJLm\nluSsJH/ddxySxs/iuANJ/r8k75tj/SlJ7gfeXlX/rofQpCVL8rNJtiV5PMmOJJ9M8sokByW5MMn9\nSR5L8rUk5/Ydr6ZPku1Jvptk3W7rv5Skkmxc4P0b2357jzNOaU/aefW/JnkkybeS/E2Sly/1i5r5\nvHQWx93YCvx8kuy2/s3Ax6rqyR5ikpYsyTuBDwC/AxwKbAD+EDgFeD+wP/ATwIHAG4G7+olU4hvA\nGTMvkvxT4Jn9hSMtXpJnA1cDfwAcAhwOvBf4xz7jmhYWx934OPAc4J/PrEhyMPAG4KIkH0nyW239\nX7YWuZnH95Oc1UvU0ixJDgTeB7ytqq6oqm9X1RNV9ZdV9cvAy4E/raqHqur7VfWVqrq836g1xT4K\nvGXW6zOBi2ZeJDm5tSQ/muSbSf6vWft+rj0/3M7Dx816379P8lCSbyR5/Th/AU21HwOoqour6ntV\n9Z2q+jTwBPBHwHEtNx+Gpedzkhcm+S+tVfrBJJd2+csNncVxB6rqO8BlPP1EfRrwlaq6ebd9/2VV\n7V9V+wP/CrgfuLazYKX5HQfsB/zFPNu/APx2kl9I8qLuwpLm9AXg2Ul+IslewOnAn8za/m1G5+SD\ngJOB/z3JqW3bq9rzQe18/Pn2+p8BXwXWAb8LfHiOK4LSavga8L0kW5O8vjWoUVV3AG8FPt9y86C2\n/1Lz+d8BnwYOBo5g1EKtxuK4O1uBNyXZr71+S1s3pyQ/1rafVlXf7CA+aSHPAR7cQzeg/wP4GPB2\n4PYkd9mypp7NtB6/BrgDuG9mQ1V9tqq+3K5y3AJcDPxPC/y8e6rqj6vqe4zOz4cx6l4kraqqehR4\nJVDAHwO7klyVZM58W0Y+PwE8H3heVf1DVTnYdBaL4460xHsQODXJ/wgcA/zpXPu2y9dXAr9uwmpA\n/h5YN9+gjnbZ73eq6qcZFdKXAX+W5JAug5Rm+Sjws8BZzOpSAZDknyW5LsmuJI8wao1b98M/4mnu\nn1moqv/eFvdfvXClH6iqO6rqrKo6AvhJ4HmMxnz8kGXk87uAAH+b5LYkv7ja8U8yi+NuXcSoFePn\ngU9V1QO775DkRxgVzddV1ZaO45P25POMBoOcutCOrdXjd4BnAUeNOS5pTlV1D6OBeScBV+y2+U+B\nq4Ajq+pARv04Z7pIVGdBSotQVV8BPsKoSJ4rP5eUz1V1f1X9m6p6HvC/AX+Y5IXjiH0SWRx36yLg\n1cC/Yf4uFb/NqKA4p6ugpMWoqkeA9wAXJDk1yTOT7NP6w/1ukt9otxnat3UfOgd4mFEfTakv
ZwMn\nVNW3d1t/APCtqvqHJMcwamGesQv4PvCCjmKUnibJjyf5pSRHtNdHMrr7yheAB4Ajkuw76y1Lyuck\n/2rmZwMPMSqgvz+2X2jCWBx3qKq2A/+VUfF71Ty7nQEcCzw0644VP9dRiNIeVdV/AN4J/DqjE+43\nGfUx/jijk+t/YtR96O8Y9fM8uaoe7ydaCarq7qraNsemfwu8L8ljjL70XTbrPf+dUUPF3yR5OMmx\n3UQrPeUxRgNAr0/ybUZF8a3ALwGfAW4D7k/yYNt/qfn88vazH2dUj5xTVV/v5lcbvlR59UiSJEkC\nW44lSZKkp1gcS5IkSY3FsSRJktRYHEuSJEmNxbEkSZLUzDnT1bisW7euNm7c2OVHasLdcMMND1bV\n+r7jmGEOa6nMYa0F5rEm3VJyuNPieOPGjWzbNtftJqW5Jbmn7xhmM4e1VOaw1oLVyOM2OdDngGcw\nqj8ur6rfTHIUcAmjaedvAN5cVd/d088yj7VUS8lhu1VIkqQu/COj2Qp/CngJ8Lo2IcX5wPur6oWM\nZms7u8cYJYtjSZI0fjUyM2PmPu1RwAnA5W39VuDUHsKTntJpt4r5bDz3r5b8nu3nnTyGSKRhW86/\nleXw35c0PtP8f16SvRh1nXghcAFwN/BwVT3ZdrkXOHye924GNgNs2LBh/MFOoGnOrdVky7EkSepE\nVX2vql4CHAEcA/z4Et67pao2VdWm9esHMzZQa5DFsSRJ6lRVPQxcBxwHHJRk5kr2EcB9vQUmYXEs\nSZI6kGR9koPa8o8CrwHuYFQkv6ntdiZwZT8RSiOD6HMsSZLWvMOAra3f8Y8Al1XV1UluBy5J8lvA\nl4AP9xmkFrbW+zZbHEuSpLGrqluAl86x/uuM+h9Lg2C3CkmaAEn2S/K3SW5OcluS97b1RyW5Psld\nSS5Nsm/fsUrSJLM4lqTJ4AQKktQBi2NJmgBOoCBJ3ViwOPZSniQNQ5K9ktwE7ASuYZETKCTZnGRb\nkm27du3qLmBJmkCLaTn2Up4kDcByJ1Bw8gRJWrwFi2Mv5UnSsDiBgiSNz6L6HC/3Ul57r5fzJGmF\nnEBBkrqxqOLYudAlqXeHAdcluQX4InBNVV0NvBt4Z5K7gOfgBAqStCJLmgSkqh5O8rRLea312Et5\nkjRGTqAgSd1YsDhOsh54ohXGM5fyzucHl/IuwUt5kqQxWetT1UoalsW0HDsXuiRJkqbCgsWxl/Ik\nSZI0LZbU51iSJP3Acrp8SBo2p4+WJEmSGluOJUmSppRXP36YxbGkVeEdBSRJa4HdKiRJkqTG4liS\nJElqLI4lSZKkxuJYa16SI5Ncl+T2JLclOaetPyTJNUnubM8H9x2rJEnql8WxpsGTwC9V1dHAscDb\nkhwNnAtcW1UvAq5tryVJ0hTzbhVa86pqB7CjLT+W5A7gcOAU4Pi221bgs8C7ewhR0gB4SytJYHGs\nKZNkI6Pp0K8HDm2FM8D9wKHzvGczsBlgw4YN4w9yACwSJEnTym4VmhpJ9gf+HHhHVT06e1tVFVBz\nva+qtlTVpqratH79+g4ilaS1x/EfmhS2HGsqJNmHUWH8saq6oq1+IMlhVbUjyWHAzv4ilKQ1b2b8\nx41JDgBuSHINcBaj8R/nJTmX0fgPu7jhVby+2HKsNS9JgA8Dd1TV783adBVwZls+E7iy69gkaVpU\n1Y6qurEtPwbMHv+xte22FTi1nwilEVuONQ1eAbwZ+HKSm9q6XwXOAy5LcjZwD3BaT/FJ0lRZzvgP\nqSsWx1rzquqvgcyz+cQuY5Gkabf7+I/Rxb2Rqqokc47/mPTB0XaRmBx2q5AkSZ3Y0/iPtn3e8R8O\njlZXJrbleDnfwLafd/IYIlm5tfS7SBqPJEcCFzG65FzAlqr6YJJDgEuBjcB24LSqeqivOKX5LGL8\nx3k4/kMDYMuxJE0GZ3rUpJsZ/3FCkpva4yRGRfFrktwJ
vLq9lnqzYMuxrRWS1D9netSkc/yHJsVi\nulV4X8IlstO9pHFypL8kjc+C3Sq8L6EkDcdyZnpMsjnJtiTbdu3a1VGkkjSZltTneDmtFZ6UJWl1\nLHekv6P8JWnxFl0cL6e1om3zpCxJK+RMj5LUjUXdym1PrRVVtWNP9yWUpNUy5bc9dKZHSerAYu5W\nsWbuSzjl/7FKmmCO9Jekbiym5djWCkmSJE2FBYtjWyskSZI0LSZ2+mhJk897gkuShsbpoyVJkqTG\nlmNJUqe8YiBpyCyOtUfe4WN8LBAkSRoeu1VIkiRJjcWxJEmS1FgcS5IkSY3FsSRJktRYHEuSJEmN\nxbEkSZLUeCs3SZKkJfBWnGubLceSJElSY3EsSZIkNRbHkiRJUmNxrKmQ5MIkO5PcOmvdIUmuSXJn\nez64zxglSVL/LI41LT4CvG63decC11bVi4Br22tJkjTFLI41Farqc8C3dlt9CrC1LW8FTu00KEma\nIl7B06TwVm5TxFvP/JBDq2pHW74fOHSunZJsBjYDbNiwoaPQJGnN+QjwH4GLZq2buYJ3XpJz2+t3\n9xCb9BRbjiWgqgqoebZtqapNVbVp/fr1HUcmSWuDV/A0KWw51jR7IMlhVbUjyWHAzr4DkuaT5ELg\nDcDOqvrJtu4Q4FJgI7AdOK2qHuoyLq9IaYUWdQUPvIo36ZZzrth+3sljiGRhCxbHQz0ha7gm6B/A\nVcCZwHnt+co+gpAW6SN4SVprWFVVkjmv4LXtW4AtAJs2bZp3P2mlFtOt4iM4yl8TLsnFwOeBFye5\nN8nZjIri1yS5E3h1ey0NkpektUY90K7c4RU8DcWCLcdV9bkkG3dbfQpwfFveCnwWWys0YFV1xjyb\nTuw0EGl1Oah0Siz1ilxfl6OXwSt4GpzlDshbUh+hJNuSbNu1a9cyP06StCcOKtXQeQVPk2LFA/Ls\nIyRpyCaoD/xyOKhUE8MreJoUy205to+QJPVv5pI0eElaklbFcotjT8iS1CEvSUtSNxZzK7eLGQ2+\nW5fkXuA3GZ2AL2sn53uA08YZpCRNOy9JS1I3FnO3Ck/IkiRJmgpOHy1JkiQ1FseSJElSY3EsSZIk\nNSu+z7EkSdKkWs690LW22XIsSZIkNbYcS9Ju1visepKkPbDlWJIkSWosjiVJkqTG4liSJElqLI4l\nSZKkxuJYkiRJarxbhSQJ8H6va4l3XJGWz5ZjSZIkqbHlWJIkrQle/dBqsOVYkiRJaiyOJUmSpMZu\nFdIq8XKeJEmrp6+BpbYcS5IkSY3FsSRJktSsqDhO8rokX01yV5JzVysoqUvmsSadOaxJZw5rSJZd\nHCfZC7gAeD1wNHBGkqNXKzCpC+axJp05rElnDmtoVtJyfAxwV1V9vaq+C1wCnLI6YUmdMY816cxh\nTTpzWIOykuL4cOCbs17f29ZJk8Q81qQzhzXpzGENythv5ZZkM7C5vXw8yVfH/JHrgAdX64fl/BX/\niFWNZ8YK4xpLTCuR8+eN6fldx7K7JeTw4I5rx6b6918jObyQof+Nn4pvFc7d4zDo47eHHIa1lcfL\nMaS/3ZBigYHFsxrn4pUUx/cBR856fURb9zRVtQXYsoLPWZIk26pqU1eft5ChxQPGtJsF83ixOTzE\n49olf//Jz+GFDP1vbHwrM+Qchu7ridmG9LcbUiywNuNZSbeKLwIvSnJUkn2B04GrVhKM1APzWJPO\nHNakM4c1KMtuOa6qJ5O8HfgUsBdwYVXdtmqRSR0wjzXpzGFNOnNYQ7OiPsdV9QngE6sUy2rp5ZLL\nHgwtHjCmp1nFPB7ice2Sv39POjwXD/1vbHwrMw05vFxD+tsNKRZYg/GkqlYjEEmSJGniOX20JEmS\n1KyJ4jjJkUmuS3J7ktuSnNN3TDOS7JXkS0mu7jsWgCQHJbk8yVeS3JHkuJ7j+T/b3+zWJBcn2a/P\neJZj2qc9TbI9yZeT
3JRkW9/xdCHJhUl2Jrl11rpDklyT5M72fHCfMS7XYs6nSY5P8kj7m9+U5D0d\nx7jHnMvI77d/k7ckeVmHsb141nG5KcmjSd6x2z6dHr+V5GuSM9s+dyY5c5xx9mmIeT+kPB9CXnea\nx1U18Q/gMOBlbfkA4GvA0X3H1eJ5J/CnwNV9x9Li2Qr867a8L3BQj7EcDnwD+NH2+jLgrL6P0RJ/\nh72Au4EXtON581Byr8NjsB1Y13ccHf/OrwJeBtw6a93vAue25XOB8/uOc5m/24LnU+D4Ps9pC+Uc\ncBLwSSDAscD1PcW5F3A/8Pw+j99y8xU4BPh6ez64LR/c1999zMdocHk/1DzvK6+7zOM10XJcVTuq\n6sa2/BhwBwOYXSfJEcDJwIf6jgUgyYGMkuvDAFX13ap6uN+o2Bv40SR7A88E/q7neJbKaU+nUFV9\nDvjWbqtPYfTlk/Z8aqdBrZKhnk+X6BTgohr5AnBQksN6iONE4O6quqeHz37KCvL1XwDXVNW3quoh\n4BrgdWMLtEcTmvd95Xkved1lHq+J4ni2JBuBlwLX9xsJAB8A3gV8v+9AmqOAXcB/al09PpTkWX0F\nU1X3Af8e+G/ADuCRqvp0X/Esk9OeQgGfTnJDRjNYTatDq2pHW74fOLTPYFbDAufT45LcnOSTSf5J\np4EtnHND+Xd5OnDxPNv6PH6wuHwdynHs1IDyfqh5PqS8Hkser6niOMn+wJ8D76iqR3uO5Q3Azqq6\noc84drM3o0sS/09VvRT4NqPLEL1ofYNOYVS0Pw94VpKf7yseLdsrq+plwOuBtyV5Vd8B9a1G1/Im\n+lZAC5xPb2R0SfWngD8APt5xeIPPuYwms3gj8GdzbO77+D3NWsjX1TKwvB9cng85r1czj9dMcZxk\nH0YJ/bGquqLveIBXAG9Msp3RpfYTkvxJvyFxL3BvVc18G76cUbHcl1cD36iqXVX1BHAF8DM9xrMc\ni5r2dC1rVwCoqp3AXzDqajKNHpi5pNmed/Ycz7ItdD6tqker6vG2/AlgnyTruopvETk3hH+Xrwdu\nrKoHdt/Q9/FrFpOvQziOnRla3g80z4eW12PJ4zVRHCcJo360d1TV7/UdD0BV/UpVHVFVGxldgvhM\nVfXaKlpV9wPfTPLitupE4PYeQ/pvwLFJntn+hicy6uc1SaZ62tMkz0pywMwy8Frg1j2/a826CpgZ\nBX0mcGWPsSzbYs6nSZ7b9iPJMYz+L/n7juJbTM5dBbyljeY/llGXrR106wzmufTc5/GbZTH5+ing\ntUkOblf6XtvWrTlDy/sB5/nQ8no8ebxaowj7fACvZNSUfgtwU3uc1Hdcs+I7nuHcreIlwLZ2rD5O\nzyOPgfcCX2H0j/6jwDP6PkbL+B1OYjSy+W7g1/qOp+Pf/QWM7tBxM3DbtPz+jP5z2AE8weiKzNnA\nc4BrgTuB/wwc0necy/zd5jyfAm8F3tr2eXv7e98MfAH4mQ7jmzPndosvwAXt3+SXgU0dH8NnMSoK\nDpy1rrfjt5R8BTYBH5r13l8E7mqPX+g7P8d4jAaV90PM877zuss8doY8SZIkqVkT3SokSZKk1WBx\nLEmSJDUWx5IkSVJjcSxJkiQ1FseSJElSY3EsSZIkNRbHkiRJUmNxLEmSJDUWx5IkSVJjcSxJkiQ1\nFseSJElSY3EsSZIkNRbHkiRJUmNx3JEk25N8N8m63dZ/KUkl2ZjkI0l+q68Ypbm03P1OkseTPJTk\nr5Ic2Xdc0mIk+dkk21r+7kjyySSv7DsuScNlcdytbwBnzLxI8k+BZ/YXjrRo/7Kq9gcOAx4A/qDn\neKQFJXkn8AHgd4BDgQ3AHwKn9BmXpGGzOO7WR4G3zHp9JnBRT7FIS1ZV/wBcDhwNkOSzSf71zPYk\nZyX567acJO9PsjPJo0m+nOQn+4lc0ybJgcD7gLdV1RVV9e2qeqKq/rKqfjnJM5J8IM
nftccHkjyj\nvff4JPcmeVfL3x1JTk1yUpKvJflWkl/t9zeUNC4Wx936AvDsJD+RZC/gdOBPeo5JWrQkzwT+V0a5\nvJDXAq8Cfgw4EDgN+PvxRSc9zXHAfsBfzLP914BjgZcAPwUcA/z6rO3Pbe8/HHgP8MfAzwM/Dfxz\n4DeSHDWWyCX1yuK4ezOtx68B7gDu6zccaVE+nuRh4BFGuft/L+I9TwAHAD8OpKruqKodY4xRmu05\nwINV9eQ8238OeF9V7ayqXcB7gTfP2v4E8NtV9QRwCbAO+GBVPVZVtwG3MyqqJa0xFsfd+yjws8BZ\n2KVCk+PUqjqIUUva24H/kuS5e3pDVX0G+I/ABcDOJFuSPHv8oUrA6CrFuiR7z7P9ecA9s17f09Y9\n9f6q+l5b/k57fmDW9u8A+69GoJKGxeK4Y1V1D6OBeScBV/QcjrQkVfW9qroC+B7wSuDbPH1Q6XN3\n2//3q+qnGfVR/jHgl7uKVVPv88A/AqfOs/3vgOfPer2hrZM05eb7Rq3xOhs4uKq+PUerxl5J9pv1\n+vtV9d0OY5PmlSTAG4GDGXULugn4X5J8iFGr29m01rUkL2f0BfxGRkX0PwDf7yFsTaGqeiTJe4AL\nkjwJfJpRV4lXA/8zcDHw60m+CBSjfsWOAZFkcdyHqrp7D5vPbY8Zf8OohU7q018m+R6jIuIe4Myq\nui3J+4GXMyqIbwE+xqj4AHg28H7gBYwK40+xuL7K0qqoqv+Q5H5GA+0+BjwG3AD8NqMvbc9mlLcA\nfwZ4n3lJpKr6jkGSJEkaBPscS5IkSY3FsaZGkr3adN1Xt9dHJbk+yV1JLk2yb98xSpKkflkca5qc\nw2gQ2YzzgfdX1QuBhxgNJpMkSVPM4lhTIckRwMnAh9rrACcwmgoZYCvz3/JJkiRNCYtjTYsPAO/i\nB7cSew7w8KzZs+5lNE3sD0myOcm29tg8/lAlSVJfOr2V27p162rjxo1dfqQm3A033PBgVa1fyc9I\n8gZgZ1XdkOT4pb6/qrYAW2CUw5s2bfp/VxKPpstq5PBq8jys5RhaHkvj1GlxvHHjRrZt29blR2rC\nJbln4b0W9ArgjUlOYjT98bOBDwIHJdm7tR4fAdy30A8yh7VUq5TDMz9rL2AbcF9VvSHJUcAljK6E\n3AC8eaFJg8xhLcdq5rE0dHar0JpXVb9SVUdU1UbgdOAzVfVzwHXAm9puZwJX9hSitFgOKpWkMbM4\n1jR7N/DOJHcxann7cM/xSPNyUKkkdcPpozVVquqzwGfb8teBY/qMR1qCmUGlB7TXix5UKklavIkt\njjee+1dLfs/2804eQyTSyFJz0nzUYq10UGm7y8pmgA0bNqxydFoq//+Shs1uFZI0fDODSrczGoB3\nArMGlbZ95h1UWlVbqmpTVW1av94bDkjSnlgcS9LAOahUkrpjcSxJk8tBpZK0yia2z7EkTSMHlUrS\neNlyLEmSJDUWx5IkSVJjcSxJkiQ1FseSJElSY3EsSZIkNRbHkiRJUmNxLEmSJDWLLo6T7JXkS0mu\nbq+PSnJ9kruSXJpk3/GFKUmSJI3fUlqOzwHumPX6fOD9VfVC4CHg7NUMTJIkSeraoorjJEcAJwMf\naq8DnABc3nbZCpw6jgAlSZKkriy25fgDwLuA77fXzwEerqon2+t7gcNXOTZJkiSpUwsWx0neAOys\nqhuW8wFJNifZlmTbrl27lvMjJEmSpE4spuX4FcAbk2wHLmHUneKDwEFJ9m77HAHcN9ebq2pLVW2q\nqk3r169fhZClpUmyX5K/TXJzktuSvLetd1CpJEl6mgWL46r6lao6oqo2AqcDn6mqnwOuA97UdjsT\nuHJsUUor84/ACVX1U8BLgNclORYHlUqSpN2s5D7H7wbemeQuRn2QP7w6IUmrq0Yeby/3aY/CQaWS\nJGk3ey+8yw9U1WeBz7blrwPHrH5I0upLshdwA/
BC4ALgbhY5qDTJZmAzwIYNG8YfrDSHJPsBnwOe\nwejcfXlV/WaSoxh1eXsOoxx/c1V9t79IJWmyOUOepkJVfa+qXsKof/wxwI8v4b32m9cQ2D1Ikjpg\ncaypUlUPM+ovfxyLHFQqDYHdgySpGxbHWvOSrE9yUFv+UeA1jGZ7dFCpJkqSvZLcBOwErmEJ3YMk\nSYtjcaxpcBhwXZJbgC8C11TV1TioVBNmud2DvN+8JC3ekgbkSZOoqm4BXjrHegeVaiJV1cNJntY9\nqLUez9k9qKq2AFsANm3aVJ0GK0kTxpZjSZoAdg+SpG7YcixJk+EwYGu7LeGPAJdV1dVJbgcuSfJb\nwJewe5AkrYjFsSRNALsHSVI37FYhSZIkNRbHkiRJUmNxLEmSJDUWx5IkSVLjgDxpgmw896+W/J7t\n5508hkgkSVqbbDmWJEmSGotjSZIkqbFbhSRJy7Scrk6Shs2WY0mSJKmxOJYkSZIai2NJkiSpsTiW\nJEmSGgfkSZKEg+skjdhyrDUvyZFJrktye5LbkpzT1h+S5Jokd7bng/uOVZIk9cuWY02DJ4Ffqqob\nkxwA3JDkGuAs4NqqOi/JucC5wLt7jHPqLLWlztn+JEnjZsux1ryq2lFVN7blx4A7gMOBU4Ctbbet\nwKn9RChJkobC4lhTJclG4KXA9cChVbWjbbofOLSnsKQF2T1Ikrpht4oFLGeAhpd+hynJ/sCfA++o\nqkeTPLWtqipJzfO+zcBmgA0bNnQRqjQXuwdJUgdsOdZUSLIPo8L4Y1V1RVv9QJLD2vbDgJ1zvbeq\ntlTVpqratH79+m4ClnZj9yBJ6obFsda8jJqIPwzcUVW/N2vTVcCZbflM4MquY5OWY6ndg5JsTrIt\nybZdu3Z1FqckTSKLY02DVwBvBk5IclN7nAScB7wmyZ3Aq9tradB27x40e1tVFfBD3YO8+iFJi2ef\nY615VfXXQObZfGKXsUgrsafuQVW1Y0/dgyRJi2NxLEkTYBHdg87D7kFPcbY7SctlcSxJk2Gme9CX\nk9zU1v0qo6L4siRnA/cAp/UUnyStCQsWx0mOBC5iNMijgC1V9cEkhwCXAhuB7cBpVfXQ+EKVNGS2\n1I2X3YMkqRuLGZA3c2/No4FjgbclOZrRvTSvraoXAde215IkSdLEWrA49t6akiRJmhZL6nO8nKl3\nnV1M6pezPEqStHiLvs/xcu6t2bZ5f01JkiRNhEUVxyuZeleSJEmaFAsWx069K0mSpGmxmD7H3ltz\niZbax9P+nZIkScOwYHHsvTUlDYWDCyVJ47boAXmSJEnSWuf00ZKkQXP2RUldmqri2BOsJEmS9sRu\nFZIkSVIzVS3Ha4kDkyRJklafLceSJElSY8vxANgXevySXAi8AdhZVT/Z1h0CXApsBLYDp1XVQ33F\nKE0ir2JJWmtsOda0+Ajwut3WnQtcW1UvAq5tr6VBSnJhkp1Jbp217pAk1yS5sz0f3GeMkrQWWBxr\nKlTV54Bv7bb6FGBrW94KnNppUNLSfAS/4EnS2Fkca5odWlU72vL9wKFz7ZRkc5JtSbbt2rWru+ik\nWfyCJ0ndsM+xBFRVJal5tm24UHdYAAAGs0lEQVQBtgBs2rRpzn2kniz6Cx6wGWDDhg0dhabVtNS+\n3fbrlpbPlmNNsweSHAbQnnf2HI+0bFVVwLxf8KpqU1VtWr9+fceRSdJkseVY0+wq4EzgvPZ8ZZcf\nPuS7lAw5Nj3NA0kOq6odfsGTpNVhy7GmQpKLgc8DL05yb5KzGRXFr0lyJ/Dq9lqaJDNf8KCHL3iS\ntBbZcqypUFVnzLPpxE4DkZapfcE7HliX5F7gNxl9obusfdm7BzitvwglaW2wOJ4i3qxfmlxr6Que\n3XYkDZndKiRJkqTGlmNJa5pXTCRJS2HLsSRJktTYcqw9stVNkiRNE1uOJUmSpMbiWJIkSWosjiVJ\nkqTGPseSJK
0xjheRls+WY0mSJKmxOJYkSZIai2NJkiSpsTiWJEmSGotjSZIkqRnE3SqWM6pWkiRJ\nWm2DKI4laUim9TZYNlRI0gq7VSR5XZKvJrkrybmrFZTUJfNYk84clqTVs+ziOMlewAXA64GjgTOS\nHL1agUldMI816cxhSVpdK+lWcQxwV1V9HSDJJcApwO2rEZgm14RdkjaPNenMYUlaRSvpVnE48M1Z\nr+9t66RJYh5r0pnDkrSKxj4gL8lmYHN7+XiSr86x2zrgwXHHsgTGM7+xxJLz5930/NX+rKVaZA6P\ny5D+9jCseIYUCzl/3njM4QH9nTCeee0hh2EAeSx1ZSXF8X3AkbNeH9HWPU1VbQG27OkHJdlWVZtW\nEMuqMp75DSmWVbJgHi8mh8dlaMd7SPEMKRboNR5zeAmMZ35DikXq00q6VXwReFGSo5LsC5wOXLU6\nYUmdMY816cxhSVpFy245rqonk7wd+BSwF3BhVd22apFJHTCPNenMYUlaXSvqc1xVnwA+sQpx9HK5\nbw+MZ35DimVVrGIej8PQjveQ4hlSLNBjPObwkhjP/IYUi9SbVFXfMUiSJEmDsKIZ8iRJkqS1pNPi\neKEpTpM8I8mlbfv1STaOMZYjk1yX5PYktyU5Z459jk/ySJKb2uM9Y4xne5Ivt8/ZNsf2JPn9dmxu\nSfKyMcby4lm/801JHk3yjt326ezYrHXm4h5j6T0Xk1yYZGeSW2etOyTJNUnubM8Hz/PeM9s+dyY5\nczXjGpKh5XD7vEHksTksTaCq6uTBaKDI3cALgH2Bm4Gjd9vn3wJ/1JZPBy4dYzyHAS9rywcAX5sj\nnuOBqzs6PtuBdXvYfhLwSSDAscD1Hf7d7gee39exWesPc3HRcfWSi8CrgJcBt85a97vAuW35XOD8\nOd53CPD19nxwWz6473wb0zEaVA63zxtcHpvDPnxMxqPLluOnpjitqu8CM1OcznYKsLUtXw6cmCTj\nCKaqdlTVjW35MeAOhj2r1CnARTXyBeCgJId18LknAndX1T0dfNZUMhcXrZdcrKrPAd/abfXsc9VW\n4NQ53vovgGuq6ltV9RBwDfC6sQXaownMYegnj81haQJ0WRwvZorTp/apqieBR4DnjDuwjLpvvBS4\nfo7NxyW5Ocknk/yTMYZRwKeT3JDRbFa762uK2NOBi+fZ1tWxmRrm4h4NKRcPraodbfl+4NA59pnK\naZ0HksMwzDw2h6UJMPbpo4cuyf7AnwPvqKpHd9t8I6PLX48nOQn4OPCiMYXyyqq6L8n/AFyT5Cvt\n235vMppQ4I3Ar8yxuctjMxXMxfkNORerqpJ42x8GlcMwsDw2h6XJ0WXL8WKmm35qnyR7AwcCfz+u\ngJLsw+hE/rGqumL37VX1aFU93pY/AeyTZN04Yqmq+9rzTuAvGHVDmW1R03WvstcDN1bVA7tv6PLY\nTANzcUFDy8UHZi7Bt+edc+zTx3HqzZByuH3G0PLYHJYmRJfF8WKmOL0KmBkN+ybgM1U1lm+zrS/z\nh4E7qur35tnnuTN9npMcw+h4rXqxnuRZSQ6YWQZeC9y6225XAW9pI6yPBR6ZdUlsXM5gnkuAXR2b\naWAuLsrQcnH2uepM4Mo59vkU8NokB7c7Aby2rVtzhpTD7ecPMY/NYWlSdDn6j9Ho4K8xumvFr7V1\n7wPe2Jb3A/4MuAv4W+AFY4zllYz6pN0C3NQeJwFvBd7a9nk7cBujO2t8AfiZMcXygvYZN7fPmzk2\ns2MJcEE7dl8GNo35b/UsRifnA2et6/zYTMPDXBx2LjIqaHYATzDqc3k2o7EQ1wJ3Av8ZOKTtuwn4\n0Kz3/mI7n90F/ELfuTYNOTzEPDaHffiYrIcz5EmSJEmNM+RJkiRJjcWxJEmS1FgcS5IkSY3FsSRJ\nktRYHEuSJEmNxbEkSZLUWBxLkiRJjcWxJEmS1Pz/+3Wti9c6p1UAAAAASUVO
RK5CYII=\n", "text/plain": [ "<Figure size 1000x600 with 7 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import os\n", "\n", "\n", "# The data directory sits next to the notebook's working directory\n", "cwd = os.getcwd()\n", "datadir = os.path.join(os.path.dirname(cwd), 'data')\n", "\n", "\n", "d = pd.read_csv(os.path.join(datadir, 'survey_responses_2018.csv'), header=0, sep=',')\n", "# Keep the seven self-assessment columns and give them short names\n", "dpro = d[['profile_{}'.format(k + 1) for k in range(7)]].copy()\n", "dpro.columns = ['Viz', 'CS', 'Math', 'Stats', 'ML', 'Bus', 'Com']\n", "\n", "# One histogram per skill\n", "fig = plt.figure(figsize=(10, 6))\n", "for i in range(7):\n", "    plt.subplot(3, 4, i + 1)\n", "    plt.hist(dpro[dpro.columns.values[i]])\n", "    plt.title(dpro.columns.values[i])\n", "\n", "fig.tight_layout()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>We can see that most categories have a full range of values. An important question is how correlated the values are with each other. We'll explore this question in two different ways.<br><br>\n", "\n", "First let's look at the correlation of the different categories.\n", "</p>" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYEAAAD8CAYAAACRkhiPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAFCZJREFUeJzt3X/wXXV95/HnCxCCDWAhiCJdUEeg\nmIYIEa3tKhRdFXUolpIw3RkznZ3UnVZAl6KOjmg7q2xL1QWmslkL2HEGf6AWxehYNaGjYiHIr+C6\njCAsAWsJIILVCMl7/7gneEm/Se73e+/NSfw8HzPfyTmf87n3vG/uj9f5nHPvOakqJElt2qPvAiRJ\n/TEEJKlhhoAkNcwQkKSGGQKS1DBDQJIaZghIUsMMAUlqmCEgSQ3bq+8CduTAgw6q5/zG4X2XMSdP\ne2xD3yWM5b47f9R3CWN55jN/re8S5mzTxif6LmEsd+39zL5LGMszH1zfdwljuXfzxg1VdfAofXf5\nEHjObxzOZ79ybd9lzMlh37687xLG8vbTPtR3CWM5e9lL+i5hzn5yz0N9lzCWM57zX/suYSxnf/wd\nfZcwlrf85I57Ru3r7iBJapghIEkNMwQkqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwBSWqYISBJ\nDZt1CCR5VpJPJLkzyY1JViU5chrFSZKma1bnDkoS4HPAx6pqWdd2LHAIcMfky5MkTdNsRwInAY9X\n1aVbGqrqFuAbSf46yboktyVZCpDkxCTXJrk6yV1JLkjyR0mu7/o9f4KPRZI0S7M9i+hC4MYZ2t8I\nLAaOBRYANyT5p27ZscBvAg8BdwEfraoTkpwNvAU4Zy6FS5LGN6kDw78LXFlVm6rqR8C1wIu7ZTdU\n1Q+raiNwJ/CVrv024IiZ7izJiiRrk6x96MHd+5z8krQrm20I3A4cP8vbbBya3jw0v5ltjESqamVV\nLamqJQcetGCWq5MkjWq2IfB1YJ8kK7Y0JFkE/BhYmmTPJAcDLweun1yZkqRpmNUxgaqqJKcBH07y\nduDnwN0M9uvPB24BCjivqv4lydETrleSNEGzvrxkVd0PnDHDoj/v/ob7rgHWDM2fuK1lkqSdz18M\nS1LDDAFJapghIEkNMwQkqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwBSWqYISBJDZv1uYN2tsd+\n8QTfXv9I32XMyUu/9E877rQL+8Xm6ruEsRx26il9lzBn933hy32XMJazL3tH3yWM5X/+5wv6LmE8\nf/vGkbs6EpCkhhkCktQwQ0CSGmYISFLDDAFJapghIEkNMwQkqWGGgCQ1zBCQpIYZApLUsLFCIMmm\nJDcnuSXJd5K8bFKFSZKmb9xzB/2sqhYDJHk18AHgFWNXJUnaKSa5O2h/4GGAJCcmuWbLgiSXJFne\nTV+Q5LtJbk1y4QTXL0mapXFHAvsmuRmYBzwb+L3tdU5yEHAacHRVVZJnjLl+SdIYxh0J/KyqFlfV\n0cBrgL9Pku30fwT4OfB3Sd4I/NtMnZKsSLI2ydpHH35ozBIlSdsysd1BVXUdsAA4GHhiq/ue1/V5\nAjgBuAp4PTDjSdOramVVLamqJfv9+oGTKlGStJWJXVQmydHAnsCDwD3AMUn2AfYFTga+kWQ+8PSq\nWpXkm8Bdk1q/JGn2JnVMACDAm6pqE3Bvkk8B64AfADd1ffYDrk4yr+v/tjHXL0kaw1ghUFV7bmfZ\necB5Myw6YZx1SpImx18MS1LDDAFJapghIEkNMwQkqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwB\nSWqYISBJDZvYWUSn5Rnz9uINR+6ep5N+/ynv67uEsVx8/ry+SxjLz/c/tO8S5mzVaf+97xLGcsof\nH993CWP5nR98pO8SxnLcLPo6EpCkhhkCktQwQ0CSGmYISFLDD
AFJapghIEkNMwQkqWGGgCQ1zBCQ\npIYZApLUsImGQJJK8vGh+b2SPJDkmm5+eZJLJrlOSdLcTXok8FNgYZJ9u/lXAfdNeB2SpAmZxu6g\nVcDruukzgSunsA5J0gRMIwQ+ASxLMg9YBPzzFNYhSZqAiYdAVd0KHMFgFLBqLveRZEWStUnWbtiw\nYZLlSZKGTOvbQZ8HLmSOu4KqamVVLamqJQsWLJhsZZKkJ03rojKXAT+uqtuSnDildUiSxjSVkUBV\nra+qi7axeHmS9UN/h02jBknSjk10JFBV82doWwOs6aavAK6Y5DolSXPnL4YlqWGGgCQ1zBCQpIYZ\nApLUMENAkhpmCEhSwwwBSWqYISBJDTMEJKlhhoAkNcwQkKSGTessohOzx6bHmf/o7nmFyj+++l19\nlzCWX3vPgX2XMJYP3vXNvkuYszf98Oa+SxjL3t/+dN8ljKWuntOlUHZLjgQkqWGGgCQ1zBCQpIYZ\nApLUMENAkhpmCEhSwwwBSWqYISBJDTMEJKlhhoAkNWykEEjyriS3J7k1yc1JXpLknCRPH+G2I/WT\nJO18OwyBJL8NvB44rqoWAa8E7gXOAUb5cB+1nyRpJxtlJPBsYENVbQSoqg3A6cChwOokqwGSfCTJ\n2m7E8L6u7azhfkn2THJFknVJbkvy1qk8KknSSEY5i+hXgPckuQP4KvDJqrooyduAk7pQAHhXVT2U\nZE/ga0kWbd0vyfHAc6pqIUCSZ0zhMUmSRrTDkUBVPQYcD6wAHgA+mWT5DF3PSPId4CbghcAxM/S5\nC3hekouTvAb4yUzrTLKiG1Ws3fDgQ6M9EknSrI10YLiqNlXVmqo6H/gz4A+Glyd5LnAucHJ33OCL\nwLwZ7udh4FhgDfBm4KPbWN/KqlpSVUsWHLR7n9NeknZloxwYPirJC4aaFgP3AI8C+3Vt+wM/BR5J\ncgjw2qH+T/ZLsgDYo6o+A7wbOG7sRyBJmrNRjgnMBy7u9t8/AXyfwa6hM4EvJ7m/qk5KchPwPQbf\nHBq+pNPKLf0YfFPo8iRbwuedE3ockqQ52GEIVNWNwMtmWHRx97el3/Jt3P4p/XDrX5J2Gf5iWJIa\nZghIUsMMAUlqmCEgSQ0zBCSpYYaAJDXMEJCkhhkCktQwQ0CSGmYISFLDDAFJatgoJ5Dr1cPf/T6f\nWnRq32XMyR9e8/6+SxjLBz/99r5LGMvbnvc7fZcwZ3/yk/v7LmEsbzn5PX2XMJa990jfJew0jgQk\nqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwBSWqYISBJDTMEJKlhhoAkNWykEEhSST4+NL9XkgeS\nXLOD2y1OcsrQ/HuTnDv3ciVJkzTqSOCnwMIk+3bzrwLuG+F2i4FTdthLktSL2ewOWgW8rps+E7hy\ny4IkJyS5LslNSb6V5KgkewN/ASxNcnOSpV33Y5KsSXJXkrMm8igkSXMymxD4BLAsyTxgEfDPQ8u+\nB/zHqnoR8B7g/VX1i276k1W1uKo+2fU9Gng1cAJwfpKnjfsgJElzM/KppKvq1iRHMBgFrNpq8QHA\nx5K8AChgex/sX6yqjcDGJP8KHAKsH+6QZAWwAmDBHrv82a4labc1228HfR64kKFdQZ2/BFZX1ULg\nDcC87dzHxqHpTcwQRFW1sqqWVNWS/QwBSZqa2X7CXgb8uKpuS3LiUPsB/PJA8fKh9keB/eZcnSRp\nqmY1Eqiq9VV10QyL/gr4QJKbeGqwrGZwIHj4wLAkaRcx0kigqubP0LYGWNNNXwccObT43V37Q8CL\nt3O/C0cvVZI0af5iWJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAJDXMEJCkhhkCktQwQ0CSGmYISFLD\nDAFJatguf57mX3/hCzj9q1tfvmD38L+ef3LfJYzlT77yP/ouYSyv/N9X9F3CnL3rhp/3XcJY/tub\nl/RdwlgOe+3L+y5hLBefe
vbIfR0JSFLDDAFJapghIEkNMwQkqWGGgCQ1zBCQpIYZApLUMENAkhpm\nCEhSwwwBSWqYISBJDZtKCCR5VpJPJLkzyY1JViU5MslFSdYluS3JDUmeO431S5JGM/ETyCUJ8Dng\nY1W1rGs7FlgKHAosqqrNSQ4Dfjrp9UuSRjeNkcBJwONVdemWhqq6hcEH/g+ranPXtr6qHp7C+iVJ\nI5pGCCwEbpyh/VPAG5LcnORvkrxoW3eQZEWStUnWbnjwoSmUKEmCnXhguKrWA0cB7wQ2A19LMuMJ\n96tqZVUtqaolCw46cGeVKEnNmcZFZW4HTp9pQVVtBL4EfCnJj4DfB742hRokSSOYxkjg68A+SVZs\naUiyKMkrkhzaze8BLALumcL6JUkjmngIVFUBpwGv7L4iejvwAQYf+l9Isg64FXgCuGTS65ckjW4q\n1xiuqvuBM2ZYdPE01idJmht/MSxJDTMEJKlhhoAkNcwQkKSGGQKS1DBDQJIaZghIUsMMAUlqmCEg\nSQ0zBCSpYRmc6mfX9R/2nFfnzT+87zLmZMWdu/cJUq/6zf/UdwljOePWq/suYc4e2+85fZcwli/c\nsXtfB+Slhx3QdwljOfKQ/W+sqiWj9HUkIEkNMwQkqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwB\nSWqYISBJDTMEJKlhhoAkNWxiIZBkdZJXb9V2TpLLk1w1qfVIkiZnkiOBK4FlW7UtAy6vqtMnuB5J\n0oRMMgSuAl6XZG+AJEcAhwL3JlnXtX00yc3d3wNJzp/g+iVJszSxEKiqh4Drgdd2TcuATwE11Oe/\nVNVi4FRgA3DFTPeVZEWStUnWPlabJlWiJGkrkz4wPLxLaFk3/xRJ5gGfBt5SVffMdCdVtbKqllTV\nkvnZc8IlSpK2mHQIXA2cnOQ44OlVdeMMfS4FPltVX53wuiVJszTREKiqx4DVwGXMPAr4U2C/qrpg\nkuuVJM3NNH4ncCVwLDOEAHAu8FtDB4ffPIX1S5JGtNek77Cq/gHI0PzdwMJu+rmTXp8kae78xbAk\nNcwQkKSGGQKS1DBDQJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAJDXMEJCkhqWqdtyrR0keAGY85fQE\nLGBwXYPdlfX3y/r7tTvXP+3aD6+qg0fpuMuHwDQlWVtVS/quY66sv1/W36/duf5dqXZ3B0lSwwwB\nSWpY6yGwsu8CxmT9/bL+fu3O9e8ytTd9TECSWtf6SECSmvYrHwJJVid59VZt5yS5PMlVfdU1W0me\nleQTSe5McmOSVUmOTHJRknVJbktyQ5Jer96WpJJ8fGh+ryQPJLlmB7dbnOSUofn3Jjl3mrVuo453\nJbk9ya3dJVBf0r1enj7CbUfqt7Pt6DlJsjzJJf1V+O8l2dT9/9+S5DtJXtZ3TaPa1nu177q25Vc+\nBBhc63jZVm3LgMur6vQe6pm1JAE+B6ypqudX1fHAO4GlwKHAoqr6LeA04Mf9VQrAT4GFSfbt5l8F\n3DfC7RYDp+yw1xQl+W3g9cBxVbUIeCVwL3AOMMqH+6j9dra5Pid9+llVLa6qYxm81j/Qd0Gj2M57\n9ZB+K9u2FkLgKuB1SfYGSHIEgw/Oe5Os69o+2m113NxtIZ3fW7UzOwl4vKou3dJQVbcweHP/sKo2\nd23rq+rhnmoctgp4XTd9JoMgBiDJCUmuS3JTkm8lOap7bv4CWNo9B0u77sckWZPkriRn7YS6nw1s\nqKqNAFW1ATidwetldZLV3WP4SJK13YjhfV3bWcP9kuyZ5IqhUdpbd0L927PN52Q3sD/wMECSE4dH\nlUkuSbK8m74gyXe7UdyF/ZS6zffqN5L89dDrYWlX84lJrk1ydfc6vyDJHyW5vuv3/KlXXFW/8n/A\nNcCp3fQ7gAuBI4B1W/U7HPg/DH5t13vdQ3WdBXxohvbDgLuBm4G/AV60C9T6GLCIQfjO62o
7Ebim\nW74/sFc3/UrgM930cuCSoft5L/AtYB8Gv658EHjalGuf39V7B/C3wCu69ruBBUP9Duz+3RNYw2Ak\n9pR+wPHAPw7d5hm78HPylP/7XeEP2NTV+T3gEeD4rv3Jurv5S7r6DwL+L7/8sksv/9/bea/+AfCP\n3WvmEOD/MdjoOJHB6P3Z3Wv9PuB93W3OBj487ZpbGAnAU3cJLWOGraAk84BPA2+pqmmdpmKiqmo9\ncBSD4eZm4GtJTu63KqiqWxmE7JkMtkCHHQB8uhuFfQh44Xbu6otVtbEGW+T/ypSH1FX1GIMP7xXA\nA8Ant2xlbuWMJN8BbmJQ/zEz9LkLeF6Si5O8BvjJdKoezQ6ek13Rlt1BRwOvAf6+29WyLY8APwf+\nLskbgX/bGUXOwu8CV1bVpqr6EXAt8OJu2Q1V9cMajEDvBL7Std/G4DmbqlZC4Grg5CTHAU+vqhtn\n6HMp8Nmq+urOLW0ktzP4cPp3ug/JL1XVnwPvB35/p1a2bZ9nMOLaOnD/ElhdVQuBNzDYMt2WjUPT\nm4C9JlrhDLo36ZqqOh/4MwZbcE/qDryfC5xcg+MGX2SGx1CD3XLHMhgpvBn46JRLH8W2npNdWlVd\nx2A0eDDwBE/93JrX9XkCOIHBaOf1wJd3cplbbPO9uh3Dr/PNQ/Ob2Qmv+SZCoNvCWw1cxsyjgD8F\n9quqC3Z2bSP6OrBPkhVbGpIsSvKKJId283swGPLvKqOYyxgMa2/bqv0AfnlQcvlQ+6PAfjuhrm3q\njk+8YKhpMYP/z+Ha9mdwLOaRJIcArx3q/2S/JAuAParqM8C7geOmXP4otvWc7NKSHM1gN8qDDJ6P\nY5Lsk+QZwMldn/nAAVW1CngrgwDuw4zvVQa7fJZ2x4oOBl4OXN9TjU8x9ZTZhVzJ4Kj91t8UgsGW\n3eNJbu7mL62hAzt9q6pKchrw4SRvZzDsvZvB1s4Hk+zTdb2ewT7S3nW7qi6aYdFfAR9L8m4GW9Fb\nrAbe0T0HfX0TZD5wcffh8gTwfQa7hs4Evpzk/qo6KclNDPZV3wt8c+j2K7f0Y/BNocu7cIbBLrte\nbec5AVieZHgU+dKuf1/2HXo/BnhTVW1i8IWOTwHrgB8w2CUHg/C9ututG+BtO7tg2O579RwGr69b\ngALOq6p/6QKuV/5iWJIa1sTuIEnSzAwBSWqYISBJDTMEJKlhhoAkNcwQkKSGGQKS1DBDQJIa9v8B\nxImvhyIuFeMAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 600x400 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Viz</th>\n", " <th>CS</th>\n", " <th>Math</th>\n", " <th>Stats</th>\n", " <th>ML</th>\n", " <th>Bus</th>\n", " <th>Com</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " 
<th>Viz</th>\n", " <td>0.000000</td>\n", " <td>0.282906</td>\n", " <td>0.015144</td>\n", " <td>0.270820</td>\n", " <td>0.328538</td>\n", " <td>0.437584</td>\n", " <td>0.368496</td>\n", " </tr>\n", " <tr>\n", " <th>CS</th>\n", " <td>0.282906</td>\n", " <td>0.000000</td>\n", " <td>0.215316</td>\n", " <td>0.101657</td>\n", " <td>0.481906</td>\n", " <td>0.107712</td>\n", " <td>0.185016</td>\n", " </tr>\n", " <tr>\n", " <th>Math</th>\n", " <td>0.015144</td>\n", " <td>0.215316</td>\n", " <td>0.000000</td>\n", " <td>0.670016</td>\n", " <td>0.272858</td>\n", " <td>-0.044132</td>\n", " <td>-0.037995</td>\n", " </tr>\n", " <tr>\n", " <th>Stats</th>\n", " <td>0.270820</td>\n", " <td>0.101657</td>\n", " <td>0.670016</td>\n", " <td>0.000000</td>\n", " <td>0.321812</td>\n", " <td>0.201824</td>\n", " <td>0.114225</td>\n", " </tr>\n", " <tr>\n", " <th>ML</th>\n", " <td>0.328538</td>\n", " <td>0.481906</td>\n", " <td>0.272858</td>\n", " <td>0.321812</td>\n", " <td>0.000000</td>\n", " <td>0.117302</td>\n", " <td>0.083175</td>\n", " </tr>\n", " <tr>\n", " <th>Bus</th>\n", " <td>0.437584</td>\n", " <td>0.107712</td>\n", " <td>-0.044132</td>\n", " <td>0.201824</td>\n", " <td>0.117302</td>\n", " <td>0.000000</td>\n", " <td>0.589699</td>\n", " </tr>\n", " <tr>\n", " <th>Com</th>\n", " <td>0.368496</td>\n", " <td>0.185016</td>\n", " <td>-0.037995</td>\n", " <td>0.114225</td>\n", " <td>0.083175</td>\n", " <td>0.589699</td>\n", " <td>0.000000</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Viz CS Math Stats ML Bus Com\n", "Viz 0.000000 0.282906 0.015144 0.270820 0.328538 0.437584 0.368496\n", "CS 0.282906 0.000000 0.215316 0.101657 0.481906 0.107712 0.185016\n", "Math 0.015144 0.215316 0.000000 0.670016 0.272858 -0.044132 -0.037995\n", "Stats 0.270820 0.101657 0.670016 0.000000 0.321812 0.201824 0.114225\n", "ML 0.328538 0.481906 0.272858 0.321812 0.000000 0.117302 0.083175\n", "Bus 0.437584 0.107712 -0.044132 0.201824 0.117302 0.000000 0.589699\n", "Com 
0.368496 0.185016 -0.037995 0.114225 0.083175 0.589699 0.000000" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from matplotlib import pyplot as plt\n", "\n", "# Get correlation\n", "c_mat = dpro.corr()\n", "\n", "# Mask diagonal & duplicates (for plotting purposes)\n", "plot_data = c_mat.values\n", "mask = np.triu(np.ones_like(c_mat, dtype=bool))\n", "plot_data = np.ma.masked_where(np.asarray(mask), plot_data)\n", "\n", "fig, ax = plt.subplots()\n", "ax.pcolormesh(plot_data, cmap=plt.cm.RdBu)\n", "\n", "# Set the tick labels and center them\n", "ax.set_xticks(np.arange(c_mat.shape[0])+0.5, minor=False)\n", "ax.set_yticks(np.arange(c_mat.shape[1])+0.5, minor=False)\n", "ax.set_xticklabels(c_mat.index.values, minor=False)\n", "ax.set_yticklabels(c_mat.index.values, minor=False)\n", "\n", "plt.title(\"Category Correlation\")\n", "plt.show()\n", "\n", "c_mat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>We can see a range of correlations, including two pairs of strongly correlated features: \"business/communication\" (about 0.59) and \"math/stats\" (about 0.67). It is perhaps not surprising that students who rank themselves highly on one skill in each pair also tend to rank themselves highly on the other. We can also see that these two groups of correlated features are nearly uncorrelated with each other. This suggests that distinct segments of the student population might exist.\n", "<br><br>\n", "\n", "## Latent Variables\n", "\n", "With this range of correlation in the data, we might wonder whether certain latent features might exist that can explain the above observations. 
A latent feature (or variable) is described by <a href=\"http://en.wikipedia.org/wiki/Latent_variable\">Wikipedia</a> as: \"...latent variables (or hidden variables, as opposed to observable variables), are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables that are observed (directly measured).\" As we see above, students rank themselves very similarly in \"math\" and \"stats\". Both \"math\" and \"stats\" are observed features. The latent feature might be some sort of intellectual capacity for abstraction and logic. This hidden feature is of course manifested, and thus observed, in the form of skill in two related academic disciplines. <br><br>\n", "\n", "One way to detect and define the latent features is through a decomposition of the observed features. Our student survey results are stored in a matrix $X$. Ideally, latent features will all be independent, and each observed feature might be a linear combination of the latent features. One straightforward mechanism to arrive mathematically at the properties just described is the singular value decomposition. See the notebook titled \"Lecture_PhotoSVD_3\" to explore SVD and an example of a potential use case.<br><br>\n", "For our exploratory analysis of the survey data, we'll use the SVD to define mutually orthogonal features, which we treat as our latent features.<br><br>\n", "This starts with the basic decomposition. 
We'll also generate a scree plot to get a sense of how important the various latent features are to the overall distribution of the data.\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/briand/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: Passing one of 'on', 'true', 'off', 'false' as a boolean is deprecated; use an actual boolean (True/False) instead.\n", " warnings.warn(message, mplDeprecation, stacklevel=1)\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAEWCAYAAACKSkfIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XmcXFWd///Xu7uzkAUCJGAgkIRF\nEFABA7gA4rAIyAAzOBpQBL8wEUbU2YUZv8Lg6I9ZHNSvyCZIGNlBnMgiIE5AVkkEkd0kJCSBkJAQ\nIAlZuvvz++OcSm4q1Wuqu7q73s9HOl33nLt87q3qT5177q1TigjMzKw+NNQ6ADMz6z1O+mZmdcRJ\n38ysjjjpm5nVESd9M7M64qRvZlZHnPSt5iSdLumhKq3rc5Lurca6OtjOBEkhqamnt9UZkq6R9K+1\njsP6Pid9qzpJcyUdUZieLOlNSR+vRrKUdLCkRyS9JWmZpIclHQAQEddFxFHV2I/eIunDklZKGlGh\n7klJ59QiLhuYnPStR0k6DbgE+FREPFCF9W0J3AH8P2AbYEfgX4A1m7vu3lL+hhcRjwELgE+XzbcP\nsBdwQ+9FZwOdk771GElfAr4LfDIiHsnFD+bfyyWtkPSRwvz/mc8IXpZ0TBurfS9ARNwQES0R8W5E\n3BsRT+d1bNRVlM8qzpL0R0nLJV0iSbmuUdJ3Jb2Rt3lO8SykwhnLBZJ+2sa+flHS85LekTQn73up\n7jBJCyR9XdIi4CcVVjEV+EJZ2ReAuyJiaV7PLZIW5TOcByXt3UYsm3SX5f3aLT8eko/1K5Jel3SZ\npC0qHm0bcJz0raecDVwIHB4RMwrlh+bfoyJiREQ8mqcPAl4ERgP/DlxVSs5lXgJaJE2VdIykrTsR\ny3HAAcAHgM8An8zlfwkcA+wL7A+c2Om929TivJ0tgS8CF0vav1D/HtKZyXhgSoXl/xs4VNJOAJIa\ngFNIbwYldwO7A9sBvwOu62asF5HePPcFdiOdLX2zm+uyfsZJ33rKkcBjwB86Of+8iLgyIlpIiW4s\nsH35TBHxNnAwEMCVwBJJ0yRtMm/BRRGxPCJeAf6XlOwgvQF8PyIWRMSbpGTYLRFxZ0TMjuQB4F7g\nkMIsrcD5EbEmIt6tsPx8YDpwai46HBgC3FmY5+qIeCci1gAXAB+UtFVX4sxvpFOAv4mIZRHxDvAd\nYHJX1mP9l5O+9ZSzSa3JH7fRYi+3qPQgIlblh5tc2Mz1z0fE6RExDtgH2AH4XmfWDawqrHcHYH6h\nrvi4S/JZx2P5wvJy4FjSWUvJkohY3cFqprIh6Z8K3
BgR6/L6GyVdJGm2pLeBuXm+0Zuupl1jgGHA\nzNzdtRz4ZS63OuCkbz3ldVJr9RDgR4Xyqg7rGhEvANeQkn9XvQaMK0zvVFa/kpQgS95TaSWShgC3\nAf8JbB8Ro4C7gOKbXWf2+2fAOEmfAP6cjbt2TgFOAI4AtgImlDZfYT0bxS2pGPcbwLvA3hExKv9s\nFREV32Bt4HHStx4TEa+SEv/Rki7OxUtIXR27dGedkvaU9HeSxuXpnYCTSV1JXXUz8DVJO0oaBXy9\nrP4pYLKkQZImUXZ3TcFgUlfMEqA5X4Tu8m2jEbESuJV0oXde2bWQkaQ7lJaSEvp32lnV74G9Je0r\naSipK6i0jVZSt9jFkrYDyPv/yYprsgHHSd96VO5H/xPg05L+v9x1823g4dy98OEurvId0kXfxyWt\nJCX7Z4C/60Z4V5L63p8GniS1zpuBllz/f4FdgTdJt4VeX2kluV/8q6Q3kTdJrfJp3YgHUut+PHBt\nWfm1wDxgIfAc7bzJRcRLpIvovwL+CJR/8O3rwCzgsdxV9Ctgj27Ga/2M/CUqZkluoV8WEeNrHYtZ\nT3FL3+qWpC0kHSupSdKOwPnA7bWOy6wnuaVvdUvSMOABYE/Sxc07ga/l20LNBiQnfTOzOuLuHTOz\nOtInhoUtGj16dEyYMKHWYZiZ9SszZ858IyI6/JBdn0v6EyZMYMaMGR3PaGZm60ma15n53L1jZlZH\nnPTNzOqIk76ZWR3pc336m2vCuXd2PFM75l70qSpFYmbW97ilb2ZWR5z0zczqiJO+mVkdcdI3M6sj\nTvpmZnXESd/MrI446ZuZ1ZEuJ31JV0taLOmZQtkFkhZKeir/HFuoO0/SLEkv+ns4zcxqqzst/WuA\noyuUXxwR++afuwAk7QVMBvbOy/xIUmN3gzUzs83T5aQfEQ8Cyzo5+wnAjRGxJiJeJn0Z84Fd3aaZ\nmVVHNfv0z5H0dO7+2TqX7QjML8yzIJdtRNIUSTMkzViyZEkVQzIzs6JqJf1LgV2BfYHXgO92ZeGI\nuCIiJkXEpDFjOvwOADMz66aqJP2IeD0iWiKiFbiSDV04C4GdCrOOy2VmZlYDVUn6ksYWJv8MKN3Z\nMw2YLGmIpInA7sBvq7FNMzPrui4PrSzpBuAwYLSkBcD5wGGS9gUCmAt8CSAinpV0M/Ac0Ax8OSJa\nqhO6mZl1VZeTfkScXKH4qnbm/zbw7a5ux8zMqs+fyDUzqyNO+mZmdcRJ38ysjjjpm5nVESd9M7M6\n4qRvZlZHnPTNzOqIk76ZWR1x0jczqyNO+mZmdcRJ38ysjjjpm5nVESd9M7M64qRvZlZHnPTNzOqI\nk76ZWR1x0jczqyNO+mZmdcRJ38ysjnQ56Uu6WtJiSc8UyraRdJ+kP+bfW+dySfqBpFmSnpa0fzWD\nNzOzrulOS/8a4OiysnOB+yNid+D+PA1wDLB7/pkCXNq9MM3MrBq6nPQj4kFgWVnxCcDU/HgqcGKh\n/NpIHgNGSRrb3WDNzGzzVKtPf/uIeC0/XgRsnx/vCMwvzLcgl21E0hRJMyTNWLJkSZVCMjOzclW/\nkBsRAUQXl7kiIiZFxKQxY8ZUOyQzM8uqlfRfL3Xb5N+Lc/lCYKfCfONymZmZ1UC1kv404LT8+DTg\nfwrlX8h38XwYeKvQDWRmZr2sqasLSLoBOAwYLWkBcD5wEXCzpDOAecBn8ux3AccCs4BVwBerELOZ\nmXVTl5N+RJzcRtXhFeYN4Mtd3YaZmfUMfyLXzKyOOOmbmdURJ30zszripG9mVkec9M3M6oiTvplZ\nHXHSNzOrI076ZmZ1xEnfzKyOOOmbmdURJ30zszripG9mVkec9M3M6oiTvplZHXHSNzOrI076ZmZ1\nxEnfzKyOOOmbmdWRLn9dYnskzQXeAVqA5oiYJGkb4CZgAjAX+ExEvFnN7ZqZWef0REv/ExGxb0RM\nytPnAvdHxO7A/
XnazMxqoDe6d04ApubHU4ETe2GbZmZWQbWTfgD3SpopaUou2z4iXsuPFwHbV3mb\nZmbWSVXt0wcOjoiFkrYD7pP0QrEyIkJSlC+U3yCmAOy8885VDsnMzEqq2tKPiIX592LgduBA4HVJ\nYwHy78UVlrsiIiZFxKQxY8ZUMyQzMyuoWtKXNFzSyNJj4CjgGWAacFqe7TTgf6q1TTMz65pqdu9s\nD9wuqbTe6yPil5KeAG6WdAYwD/hMFbdpZmZdULWkHxFzgA9WKF8KHF6t7ZiZWff5E7lmZnXESd/M\nrI5U+5ZN20wTzr1zs5afe9GnqhSJmQ1EbumbmdURJ30zszripG9mVkfcp29V5WsSZn2bW/pmZnXE\nSd/MrI64e8cs29yuKXD3lPV9bumbmdURt/TNBgifqVhnOOmbWZ/gN63e4e4dM7M64qRvZlZH3L1j\nZlYF/aV7yi19M7M64qRvZlZHnPTNzOpIryR9SUdLelHSLEnn9sY2zcxsUz2e9CU1ApcAxwB7ASdL\n2qunt2tmZpvqjZb+gcCsiJgTEWuBG4ETemG7ZmZWRhHRsxuQPg0cHRFn5ulTgYMi4pzCPFOAKXly\nD+DFHg3KzGzgGR8RYzqaqU/cpx8RVwBX1DoOM7OBrje6dxYCOxWmx+UyMzPrZb2R9J8Adpc0UdJg\nYDIwrRe2a2ZmZXq8eycimiWdA9wDNAJXR8SzPb1dMzPbVK/cpx8Rd0XEeyNi14j4dm9ssz+QdIGk\nn+bHO0takW9xreY25ko6oprr7A2S/kzS/HxM9qt1PH2BpH+V9IakRVVa33RJZ1ZpXXdLOq0a6+pg\nO+v/ZvoCSSFpt1rH0RUD+hO5OeEtljS8UHampOk1DKuiiHglIkZEREtvbVPSOEm35UTylqRnJJ3e\nW9vvwH8C5+Rj8mR5paRRkq6WtEjSO5JeGsgf/JO0M/B3wF4R8Z4K9YdJWlCYHizpZ5IelrRlNZKl\npH+S9HJ+I14g6aZSXUQcExFTN2f9vU3SZZKurVD+QUlrJG1Ti7h62oBO+lkj8LXNXYmSgXa8/huY\nD4wHtgVOBV6vaUQbjAfa6wa8GBgBvA/YCjgemNULcW1EUm/dAbczsDQiFnc0o6QhwM+AUcBREfH2\n5m48t+JPBY6IiBHAJOD+zV1vb6pwFj0V+PNiozA7FbgjIpb1TmS9a6AlsUr+A/h7SaMqVUr6qKQn\nckv3CUkfLdRNl/RtSQ8Dq4Bdctm/Snokt3h+IWlbSddJejuvY0JhHd/P3RRvS5op6ZA24piQTxWb\nJH0kr7v0s1rS3Dxfg6RzJc2WtFTSzcUWiaRTJc3Ldf/cwbE5ALgmIlZGRHNEPBkRd+f1bNRyzGXr\nu4pyy/EWST/NLe0/SHqvpPPy2dV8SUe1teG8H9/IsS6WdK2krSQNkbSC9Gb9e0mz24n9+oh4MyJa\nI+KFiLi1sP4jJb2Qn9cfSnqg1JVR3uotHvs8/UVJz+f9miPpS4V5D8ut3K8rdbP8JJcfJ+kpScvz\na+MDhWW+LmlhXt+Lkg5v45hslY/DknxcvpGP0xHAfcAO+fVwTTvHdRjwC9L1uk9FxEpJRwP/BHw2\nL//7wiLjlc4G3pF0r6TR7RzveyJiNkBELMq3Wpe2u76rSNLpkh6S9J+S3lQ6OzimMO9ESQ/mbf5K\n0iXa0M3Z7uuuwv7eonS291Ze596FumskXSrpLkkrgU8Ul42IR0l3Ep5UWKYROAW4Nk8fKOnR/Ly+\nll9Lg9uIZaPustJxKEzvKek+Scvy6+AzbRzrHlUPSX8GMB34+/KKnCzvBH5Aaun+F3CnpG0Ls51K\n+uDYSGBeLpucy3cEdgUeJf3xbwM8D5xfWP4JYN9cdz1wi6Sh7QUcEY/mbo0RwNbA48ANuforwInA\nx4EdgDdJw1ygNLzFpTm2HfI+jWtnU48Bl0iarNR90FV/Sjpb2Bp4knSxvoF0XC4
ELm9n2dPzzyeA\nXUit9h9GxJq83wAfjIhd24n92zlB716syInrZ8A3gNHAbOBjXdivxcBxwJbAF4GLJe1fqH8P6fkc\nD0xRuuZwNfAl0jG/HJiW38D2AM4BDoiIkcAngbltbPf/kc5adiE9v18AvhgRvyINY/Jqfl2c3sby\nQ4C7gdXACRHxLkBE/BL4DnBTXv6DhWVOyfu4HTCYCn8n2WPAFyT9g6RJ6vja00GkD1mOBv4duEqS\nct31wG9Jx+oC0uu1u+4Gds/x/w64rqz+FODbpL/fh9jUtaTjXHIEMAi4K0+3AH+T9+MjwOHAX3U1\nSKWziftI+74dKYf8SLUYkiYiBuwP6Y/rCGAf4C1gDHAmMD3Xnwr8tmyZR4HT8+PpwIVl9dOBfy5M\nfxe4uzD9p8BT7cT0JimZQXrB/zQ/ngAE0FQ2/6XAHUBDnn4eOLxQPxZYR2rZfRO4sVA3HFhLOiWv\nFMvWwEWkbpQW4ClScgI4DFhQ6XgWYr+vbL9XAI15emTen1FtbPt+4K8K03uU9iNPB7BbO8dxC1Lr\ndWZebhZwTK77AvBYYV4BC4Azy497e8e+UP9z4GuF47IWGFr2HH2rbJkXSYl7N9KbyBHAoHb2pzGv\nd69C2ZfY8Frd5PkoW/4wUrJfC5xUoX6jfS68lr9RmP4r4JftbONzwK+AlcBS4Otl6yod39NJQ6+U\n6obl4/seUjdVMzCsUP9TNvwddOZ199M24huVt7NVnr4GuLaDHLFzfv2My9PXAd9vZ/6/Bm4vTK9/\nnRaPQeE4PJQffxb4Tdm6LgfOby++nviph5Y+EfEMKXGWX+jbgQ2t95J5pJZqyfwKqyz2e79bYbrU\nUkXS3+eugrckLSe15No6hd5I7lY4DDglIlpz8Xjg9ny6uZz0JtACbJ/3Z328EVH646woUtfIuRGx\nd17+KeDnhRZZR8r3+43YcCH63fx7hKRDtKGrqtRPX37s55HeuLYv34ikzxWWvzvH/m5EfCciPkRq\nMd5MOovapsJxCCo/jxVJOkbSY/k0fDlwLBs/Z0siYnVhejzwd6XnJC+zE7BDRMwiJYoLgMWSbpS0\nQ4XNjia1MMuPyY4V5m3LG6QW5FRJn+zkMsU7gVZReO2Wi4jrIuIIUnI9C/hWO9tZVFhuVX44gvTc\nLCuUQReemyJJjZIuUurqfJsNZ1DF56rddUfEK8CDwOcljSCdRa+/uKvUZXlH7kJ6m3TG1Km/3zLj\ngYPKXiOfI70R9qq6SPrZ+cBfsvEf0aukJ6NoZzb+xHC3BydS6r//R+AzwNYRMYp0xtFhUs3Lfot0\nml68EDef1KIdVfgZGhELgdcofPo59+9uSydExBukO2Z2IHVdrCS10ErraiSdKXVZRPwmcndVfoOB\nTY99qQW4yYXknGxKyx9Tob70xzgcmMimx0Fs/KnwjfaNwh+e0kXQ20jHYvv8nN3Fxs9Z+WtiPvDt\nsudkWETckOO7PiIOzvsbwL+V7wMpYa+rcEy69On1iPgZ6XV+q6RiH3bVBtmKiHURcQvwNOksuite\nA7bJr82SNp+bDl53p5AGbzyC1JiaUFqsGG4nYppKOus/CXg5ImYW6i4FXgB2j4gtSWeXbf39tvm6\nIr1GHih7jYyIiLM7EV9V1U3Szy2um4CvForvAt4r6RSlC6ifJQ3/fEeVNjuSlMiWAE2SvknqJ26X\npJ1ILdcvRMRLZdWXkfqyx+d5x0gqjVp6K3CcpIPzxaYLaec5lvRvkvbJ+z4SOJt0Wr4UeAkYKulT\nkgaR+seHdH7XO3QD8Df5ot4INvQ5N3dmYUn/V9IBSrcmDiXdobWc1K1yJ7C3pD9Xujj7VTb+A3wK\nOFTpsxFbAecV6gaT9nMJ0Kx0AbLNC9LZlcBZkg5SMjwft5GS9pD0J/nNZDXpDKi1fAX5DOlm0nM7\nMj+/f0vq+uiS/GZzDvA/kkrXMl4HJqibd6D
li5KlfWrIx2Vv0vWmrsQ2j3Sd7YL83H2E1DVY0pXX\n3UhgDelsdhjpNdQdt5HeYP+F9AZQvo23gRWS9iT9jbTlKdLdQMOU7t0/o1B3BynXnCppUP45QNL7\nuhlzt9VN0s8uJLUGAcjJ7TjS/c9LSa3y43KrtxruAX5JeiHPI/3Rd+ZU9nBSN8etFbpFvk8axuJe\nSe+QLrAdlPfnWeDLpItFr5GuHywoX3nBMOB2UrKcQ2plHp/X9Rapj/fHpNbmyg7W1VVXky4CPwi8\nTDo2X+nC8kG6eP4G6azhSNLdKivy8/cXpOsVS0kX+h5ev2DEfaQGwNOkawJ3FOreIb1J3Ew6fqfQ\nwbAhETGD1Lr+YV5mFqk/F1LCuijHuYh0Ee+8TdcCpP1fSXouHiI9j1d3cBzaimkq6XV9p6QDgVty\n1VJJv+vGKt8mtXJfIb1e/h04OyIqXRztyOdIF0WXAv9Kei7W5Li78rq7lvR3tRB4jvS30GW5G/Q2\n0k0P5ReC/570GniH9OZ+E227mHRN5XXSm8f6deXX1VGk7rdXSa+Ff6O6DalO6fGhlc36AqUP5P00\nIn5c61hsY0of8nohIs7vcGbbbPXW0jezGsvdGrvmbqKjSf3yP691XPWiw6Sv9FH3xZKeaaNekn6g\n9P23T6twP7Ok0yT9Mf/0+LgcZtYvvId0e+MK0mdkzo4KQ21Yz+iwe0fSoaQn59qI2ORKvaRjSX2R\nx5L6lr8fEQflW+dmkD6uHaS+0w9FxJvV3QUzM+usDlv6EfEg0N4YFCeQ3hAiIh4DRkkaS/rk4X0R\nsSwn+vuAo6sRtJmZdU81BovakY3vSFmQy9oq34QK35E7fPjwD+25555VCMusZwUQAUGk3/lx/pc/\nAVl4TBvzrC9PD0rTG7aRJtavq7A8hbL1y2yyjdK6orCdPE9xvaXtFJeJ0lz9i/J/KtxSr/X/FaY3\nWkZt16uN+Tta5yZ1qlhXMnRQI2O3aneUljbNnDnzjeiP35E7adKkmDFjRo0jslprbQ3WNLeyprmF\nNc2trF6Xfq9Z18rq5hbWrKtUV5puZW1LC80twbqWoLm1lXUtrelxSyvrWtPv5pZY/3h9fWsub2ml\nuTXWP163fv70u7m1Z1Ohyn43NojGBtHUIBolGhvz4wbR1NCwvq6hoVheVt8oGlQobxSNDQ3rpzdd\n74b64npL8zY0iEal2Er1DdowT+nxhjI2PC4tX6yXaGigbP1av/719YVtbLp8Zz9MPvBIKh9doKJq\nJP22vgN3IWkIgWL59Cpsz3pRa2uwal0Lq9Y0szon2tWFhLvR9LpCAq5UV0jMG+o21G/43cralk0+\nv9QlpaQ1qLGBpsaU+AY3iqY8Paghlzc2MCjPN3RQnr9suUGNG6ZL9aXlmhrL6svWu1F9Ybq03qbC\n9tYn2sYNya6UaNXpkTHM2leNpD8NOEfSjaQLuW9FxGuS7gG+I2nrPN9RtP2hFKuCiNQ6XrmmmVVr\nW1ixpplVa5tZuaaFlWuaWbm29LuZVWvK6tc2p7o1Laxa28yK/HvV2u5/p0tTgxjS1MCQQY0Mzb+H\nNDWsLxs5tInRTY0MHdTAkKZGhgxKdUPXz1eoa2pgyKAGhq6fr3HjeQt1gxsbaGr03chmlXSY9CXd\nQGqxj1Ya5/p80sBQRMRlpKEMjiV9CnEVaZhWImKZpG+RhhaGNFrlgPxSgu5qbmll5dqWnHg3Ts4p\n8W6cnEuJuDTf+nnWbliupZPdDg2C4UOaGD64ieFDGhk+pIlhgxvZYdRQhg1uynWNDBvSxIghjWwx\nuIktBpUl4VLSHVQhQTc58Zr1RX3uE7n10qf/X/e+yA9+3fkvetpiUErMw4c0MmxwSsTpd0rWG9el\nshFDmhiWk3d5gh/S1OAuA7MBRNLMiJjU0Xx94kJuvXlz5Vqu/M3LHDRxG47ca/ucsDdOzsOGNK5P\n3sMGN9F
YxxeozKx6nPRrYOqjc3l3XQvfOnEf3rv9yFqHY2Z1xJ2uvWzV2maueWQuR7xvOyd8M+t1\nTvq97KYn5rN81TrOPqytr341M+s5Tvq9aF1LK1c+OIcDJ2zDh8ZvU+twzKwOOen3omlPvcqrb63m\nrMN2qXUoZlannPR7SWtrcPmDs9lj+5F8Yo/tah2OmdUpJ/1e8usXFvPS6ys4+7BdfX+8mdWMk34v\nueyB2ew4aguO+8DYWodiZnXMSb8XPDF3GTPmvcmUQ3fx0ARmVlPOQL3g0umz2Wb4YD4zaaeOZzYz\n60FO+j3shUVv8+sXFnP6RyewxeDGWodjZnXOSb+HXf7AHIYNbuQLHxlf61DMzJz0e9L8ZauY9vtX\nOfnAnRk1bHCtwzEzc9LvSVc99DINgjMPmVjrUMzMACf9HrN0xRpufOIVTtx3R8ZutUWtwzEzAzqZ\n9CUdLelFSbMknVuh/mJJT+WflyQtL9S1FOqmVTP4vmzqI3NZva6VL33cQy6YWd/Rma9LbAQuAY4E\nFgBPSJoWEc+V5omIvynM/xVgv8Iq3o2IfasXct+3ck0zUx+dx1F7bc9u23n4ZDPrOzrT0j8QmBUR\ncyJiLXAjcEI7858M3FCN4PqrG377Cm+9u46zPHyymfUxnUn6OwLzC9MLctkmJI0HJgK/LhQPlTRD\n0mOSTmxjuSl5nhlLlizpZOh909rmVq56KH0V4v47b13rcMzMNlLtC7mTgVsjoqVQNj5/We8pwPck\nbdL8jYgrImJSREwaM2ZMlUPqXf/z1EJee2u1vyTFzPqkziT9hUBx/IBxuaySyZR17UTEwvx7DjCd\njfv7B5TW1uCyB2bzvrFb8vH39u83LzMbmDqT9J8Adpc0UdJgUmLf5C4cSXsCWwOPFsq2ljQkPx4N\nfAx4rnzZgeJXz7/O7CUrOevju3j4ZDPrkzq8eycimiWdA9wDNAJXR8Szki4EZkRE6Q1gMnBjRERh\n8fcBl0tqJb3BXFS862cgiQh+NH02O22zBZ96v4dPNrO+qcOkDxARdwF3lZV9s2z6ggrLPQK8fzPi\n6zcef3kZT81fzrdO2NvDJ5tZn+XsVCWXPTCbbYcP5i88fLKZ9WFO+lXw3KtvM/3FJfyfgycydJCH\nTzazvstJvwoue2A2wwc38vmDPHyymfVtTvqb6ZWlq7jj6Vf53IfHs9WwQbUOx8ysXU76m+nK38yh\nqaGBMw728Mlm1vc56W+GN1as4eYZ8/mz/XZk+y2H1jocM7MOOelvhmsensvallamePhkM+snnPS7\nacWaZq59dC6f3Os97DpmRK3DMTPrFCf9brrh8Vd4e3Wzh082s37FSb8b1jS38OOH5vDRXbdl351G\n1TocM7NOc9Lvhp8/uZDX317DWR93K9/M+hcn/S5qaQ0uf3AOe++wJYfsPrrW4ZiZdYmTfhfd99wi\n5ixZydmH7erhk82s33HS74KI4NLpsxm/7TCO2cfDJ5tZ/+Ok3wWPzlnK7xe8xZRDd6Gxwa18M+t/\nOpX0JR0t6UVJsySdW6H+dElLJD2Vf84s1J0m6Y/557RqBt/bLp0+m9EjhnDS/uNqHYqZWbd0+CUq\nkhqBS4AjgQXAE5KmVfgGrJsi4pyyZbcBzgcmAQHMzMu+WZXoe9EzC9/iN398g388eg8Pn2xm/VZn\nWvoHArMiYk5ErAVuBE7o5Po/CdwXEctyor8POLp7odbWZQ/MZuSQJj7/YQ+fbGb9V2eS/o7A/ML0\nglxW7iRJT0u6VVLp66M6taykKZJmSJqxZMmSTobee+YtXcldf3iNUz68M1sO9fDJZtZ/VetC7i+A\nCRHxAVJrfmpXFo6IKyJiUkRMGjNmTJVCqp4rHszDJ3/MwyebWf/WmaS/ECh+8eu4XLZeRCyNiDV5\n8sfAhzq7bF+3+J3V3DJzASd9aBzbefhkM+vnOpP0nwB2lzRR0mBgMjCtO
IOk4k3rxwPP58f3AEdJ\n2lrS1sBRuazf+MnDc1nX0spPYEy0AAANoklEQVSUQz18spn1fx3evRMRzZLOISXrRuDqiHhW0oXA\njIiYBnxV0vFAM7AMOD0vu0zSt0hvHAAXRsSyHtiPHvH26nX89NF5HLvPWCaOHl7rcMzMNluHSR8g\nIu4C7ior+2bh8XnAeW0sezVw9WbEWDPXP/4K76xp9sBqZjZg+BO5bVi9roWrHnqZg3cbzfvHbVXr\ncMzMqsJJvw23P7mQJe+s4Wx/SYqZDSBO+hW0tAaXPzCbD4zbio/uum2twzEzqxon/QrueXYRc5eu\n4qyPe/hkMxtYnPTLlIZPnjh6OJ/c+z21DsfMrKqc9Ms8PGspf1jo4ZPNbGBy0i9z2QOz2W7kEP58\n/0rDC5mZ9W9O+gVPL1jOQ7Pe4IyDJzKkycMnm9nA46RfcNkDsxk5tIlTDtq51qGYmfUIJ/3s5TdW\ncvczizj1w+MZ6eGTzWyActLPrnhwNoMaG/iih082swHMSR9Y/PZqbpu5kL/40DjGjBxS63DMzHqM\nkz5w1cMv09zq4ZPNbOCr+6T/1rvruO6xVzj2/WMZv62HTzazga3uk/51j89jhYdPNrM6UddJf/W6\nFq5+aC6HvncM++zo4ZPNbODrVNKXdLSkFyXNknRuhfq/lfScpKcl3S9pfKGuRdJT+Wda+bK1dOvM\nBbyxYg1nfdx9+WZWHzr85ixJjcAlwJHAAuAJSdMi4rnCbE8CkyJilaSzgX8HPpvr3o2Ifasc92Zr\nbmnligfn8MGdRvGRXTx8spnVh8609A8EZkXEnIhYC9wInFCcISL+NyJW5cnHgHHVDbP67n5mEa8s\nW8XZHj7ZzOpIZ5L+jsD8wvSCXNaWM4C7C9NDJc2Q9JikEystIGlKnmfGkiVLOhHS5ikNn7zLmOEc\ntdf2Pb49M7O+oqoXciV9HpgE/EeheHxETAJOAb4naZPbZCLiioiYFBGTxowZU82QKvrNH9/gudfe\n5qxDd6XBwyebWR3pTNJfCOxUmB6XyzYi6Qjgn4HjI2JNqTwiFubfc4DpwH6bEW9VXDp9Nu/Zcign\n7LdDrUMxM+tVnUn6TwC7S5ooaTAwGdjoLhxJ+wGXkxL+4kL51pKG5MejgY8BxQvAve6p+ct5dM5S\nD59sZnWpw7t3IqJZ0jnAPUAjcHVEPCvpQmBGREwjdeeMAG7JF0VfiYjjgfcBl0tqJb3BXFR210+v\nu2z6bLYc2sTJHj7ZzOpQh0kfICLuAu4qK/tm4fERbSz3CPD+zQmwmmYvWcE9zy3iy4ftxoghndp1\nM7MBpa4+kXvFA3MY3NjA6R+bUOtQzMxqom6S/qK3VvOzJxfw2QN2YvQID59sZvWpbpL+VQ/NoTXg\nLw/xkAtmVr/qIum/tWod1z/+Csd9YCw7bTOs1uGYmdVMXST9/35sLivXtnj4ZDOrewM+6a9e18JP\nHp7LYXuM4X1jt6x1OGZmNTXgk/4tM+azdOVaznYr38xsYCf95pZWLn9wDvvvPIoDJ25T63DMzGpu\nQCf9O//wGgvefJezPHyymRkwgJN+afjk3bYbwRHv8/DJZmYwgJP+9JeW8MKid/jSobt4+GQzs2zA\nJv3Lps9m7FZDOWHf9r7vxcysvgzIpD9z3ps8/vIyzjxkFwY3DchdNDPrlgGZES97YDZbbTGIyQfs\n1PHMZmZ1ZMAl/VmL3+G+517ntI9OYLiHTzYz28iAS/qXPTCHoYMaOP2jE2odiplZn9OppC/paEkv\nSpol6dwK9UMk3ZTrH5c0oVB3Xi5/UdInqxf6pl5d/i4/f3Ihkw/YmW2GD+7JTZmZ9UsdJn1JjcAl\nwDHAXsDJkvYqm+0M4M2I2A24GPi3vOxepO/U3Rs4GvhRXl+PuOqhlwngzEMm9tQmzMz6tc609A8E\nZkXEnIhYC9wInFA2zwnA1Pz4VuBwp
Y/AngDcGBFrIuJlYFZeX9UtX7WWG377Csd/cAfGbe3hk83M\nKunMlc4dgfmF6QXAQW3Nk79I/S1g21z+WNmym9w4L2kKMCVPrpD0Yqeir+B5GP29ybzR3eV72Wjo\nN7FC/4q3P8UK/Sve/hQr9K94NyfW8Z2ZqU/c3hIRVwBXVGNdkmZExKRqrKun9adYoX/F259ihf4V\nb3+KFfpXvL0Ra2e6dxYCxRvex+WyivNIagK2ApZ2clkzM+slnUn6TwC7S5ooaTDpwuy0snmmAafl\nx58Gfh0Rkcsn57t7JgK7A7+tTuhmZtZVHXbv5D76c4B7gEbg6oh4VtKFwIyImAZcBfy3pFnAMtIb\nA3m+m4HngGbgyxHR0kP7UlKVbqJe0p9ihf4Vb3+KFfpXvP0pVuhf8fZ4rEoNcjMzqwcD7hO5ZmbW\nNid9M7M6MmCSfkdDRfQlkq6WtFjSM7WOpSOSdpL0v5Kek/SspK/VOqb2SBoq6beSfp/j/Zdax9QR\nSY2SnpR0R61j6YikuZL+IOkpSTNqHU97JI2SdKukFyQ9L+kjtY6pLZL2yMe09PO2pL/ukW0NhD79\nPLTDS8CRpA+APQGcHBHP1TSwNkg6FFgBXBsR+9Q6nvZIGguMjYjfSRoJzARO7MPHVsDwiFghaRDw\nEPC1iHisg0VrRtLfApOALSPiuFrH0x5Jc4FJEdHnP+wkaSrwm4j4cb7zcFhELK91XB3J+WwhcFBE\nzKv2+gdKS78zQ0X0GRHxIOkupz4vIl6LiN/lx+8Az1PhU9V9RSQr8uSg/NNnWzaSxgGfAn5c61gG\nEklbAYeS7iwkItb2h4SfHQ7M7omEDwMn6VcaKqLPJqb+Ko+euh/weG0jaV/uLnkKWAzcFxF9Od7v\nAf8ItNY6kE4K4F5JM/PwKX3VRGAJ8JPcdfZjScNrHVQnTQZu6KmVD5Skbz1M0gjgNuCvI+LtWsfT\nnohoiYh9SZ8AP1BSn+xCk3QcsDgiZtY6li44OCL2J426++XcVdkXNQH7A5dGxH7ASqBPX+sDyN1Q\nxwO39NQ2BkrS93APPSj3jd8GXBcRP6t1PJ2VT+f/lzSsd1/0MeD43E9+I/Ankn5a25DaFxEL8+/F\nwO300Ki5VbAAWFA4y7uV9CbQ1x0D/C4iXu+pDQyUpN+ZoSKsG/KF0auA5yPiv2odT0ckjZE0Kj/e\ngnRx/4XaRlVZRJwXEeMiYgLpNfvriPh8jcNqk6Th+WI+uavkKKBP3oEWEYuA+ZL2yEWHk0YG6OtO\npge7dqCPjLK5udoaKqLGYbVJ0g3AYcBoSQuA8yPiqtpG1aaPAacCf8j95AD/FBF31TCm9owFpuY7\nIBqAmyOiz98K2U9sD9ye2gE0AddHxC9rG1K7vgJclxuCc4Av1jieduU30iOBL/XodgbCLZtmZtY5\nA6V7x8zMOsFJ38ysjjjpm5nVESd9M7M64qRvZlZHnPStJiStKDw+VtJLksZLOlHSXoW66ZLa/aJo\nSQ2SfiDpmTwC5BP56zmRdFfpvv0qxz9X0uhuLntavm23WDZa0hJJQ9pZ7hpJn+7ONs1KnPStpiQd\nDvwAOCYPMHUisFf7S23is8AOwAci4v3AnwHLASLi2L4w0Fb+3EDJ7cCRkoYVyj4N/CIi1vRuZFZv\nnPStZvK4LVcCx0XEbEkfJY078h95TPFd86x/kcfIf0nSIRVWNRZ4LSJaASJiQUS8mbcxN7eiJ+Qx\n1a/M4+zfmz+xi6QDJD2dt/kfpe85kHS6pB8W4r1D0mEV9uPneQCyZ4uDkElaIem7kn4PrB/LPY9d\n9ADwp4XVrB9kS9I389nKM5KuyJ+KLt/m+jMNSZMkTc+Phyt9X8Nv80BjfXa0WasNJ32rlSHAz0lj\n878AEBGPkIbP+IeI2DciZud5myLiQOCvgfMrrOtm4E9z0v6upP3a2ObuwCURsTfpTOCkXP4T4Et5\nk
LaWbuzL/4mID5HGxP+qpG1z+XDg8Yj4YEQ8VLbMDaREj6QdgPcCv851P4yIA/J3LWwBdGWM/X8m\nDedwIPAJ0htofxld0nqBk77VyjrgEeCMTsxbGuRtJjChvDIiFgB7AOeRhii+P3cblXs5IkpDScwE\nJuT+/pER8Wguv77Te7DBV3Nr/jHSwH+75/IW0kB1ldwJfEzSlsBngNsiovSG8wlJj0v6A/AnwN5d\niOUo4Nw8ZMZ0YCiwc1d2xga2ATH2jvVLraRkd7+kf4qI77Qzb6mfu4U2XrO5L/xu4G5Jr5OuDdzf\nxnpK69qigxib2bhhNLR8htzdcwTwkYhYlbtZSvOtLiTy8njflfRL0vWHycDf5vUNBX5E+naq+ZIu\nqLTdstiK9QJOiogXO9g3q1Nu6VvNRMQq0rdGfU5SqcX/DjCyK+uRtH/uIkFSA/ABoFPfOpQv8r4j\n6aBcNLlQPRfYN98dtBOVhxHeCngzJ/w9gQ93IfQbSMl+e6B0plFK4G8ofYdBW3frzAU+lB+fVCi/\nB/hK6TpAO11dVqec9K2mImIZabz7b0g6njSu/D/ki5C7tr/0etsBv8gXYJ8mtYJ/2P4iGzkDuDJ3\niQwH3srlDwMvk4bk/QHwuwrL/hJokvQ8cBGpi6ez7iPddXRT5JEP85vQlaQhi+8hDRteyb8A31f6\ncvLi2cS3SF8R+bSkZ/O02XoeZdPqnqQRpe/VlXQu6Yvgv1bjsMx6hPv0zeBTks4j/T3MA06vbThm\nPcctfTOzOuI+fTOzOuKkb2ZWR5z0zczqiJO+mVkdcdI3M6sj/z8Y3fvJI3Cm5gAAAABJRU5ErkJg\ngg==\n", "text/plain": [ "<Figure size 600x400 with 2 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import sys\n", "import course_utils as bd\n", "\n", "U, sig, Vt = np.linalg.svd(dpro, full_matrices=False)\n", "\n", "bd.plotSVD(sig)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>This is a fairly extreme outcome. Most of the data can be explained by the first singular vector and value. We can guess that this first (and very dominant) latent feature might be related to having self-reported business/communication skills or math/stats skills. Remember from the above correlations, that this is almost an either/or scenario.<br><br>\n", "So while we have good evidence that skills based segments exist, we have no principled way to identify them right now. 
Fortunately, there are tools to solve this problem.\n", "</p>\n", "\n", "\n", "## Clustering Examples\n", "<p><a href=\"http://scikit-learn.org/stable/modules/clustering.html\">Clustering</a> can be performed using the <a href=\"http://scikit-learn.org/stable/modules/classes.html#module-sklearn.cluster\">sklearn.cluster</a> module. We'll show two examples below: one using <a href=\"http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans\">sklearn.cluster.KMeans</a> for k-means clustering, and one using <a href=\"http://docs.scipy.org/doc/scipy-0.14.0/reference/cluster.hierarchy.html\">scipy.cluster.hierarchy</a> for hierarchical clustering. For the latter we demo scipy rather than sklearn because scipy has better support for plotting cluster dendrograms.\n", "\n", "\n", "</p>\n", "\n", "\n", "### K-means clustering\n", "<p>\n", "One decision we have to make is what data to use. We have our original feature matrix $X$, but we have also computed the SVD of $X$, which gives us a matrix of user latent features $U$ with orthonormal columns. If we use $U$, the features are normalized (each column has unit length) and independent (the columns are mutually orthogonal). The normalization is important because clustering methods use distance metrics that are sensitive to scale. The independence means each feature will hold equal weight in the clustering. This may or may not be a good thing. For example, if we use $U$, we know that the first singular vector is by far the most important. We might want this feature to dominate the clustering process. The good news is that we can weight the columns of $U$ by the singular values, i.e., cluster on $U\\Sigma$ instead of $U$.<br><br>\n", "A subtle corollary of this last point is that using $U$ or $U\\Sigma$ gives us a great tool to overcome the curse of dimensionality. 
If $X$ happened to be very high dimensional, but most of the sum-of-squares can be explained by the first $k$ singular vectors, then we might be better off clustering on $U_k$ or $U_k\\Sigma_k$ (the rank-$k$ approximations). \n", "<br><br>\n", "We start with a basic clustering. It is fairly easy to implement. In general, you'll always get a result, so the major question is how you know whether it is a good fit. Ultimately, this is both a qualitative and a quantitative issue. Some criteria might be:<br>\n", "<ul>\n", " <li>Do the clusters make sense? (this is decidedly qualitative)</li>\n", " <li>Are the clusters well balanced? (a quantitative check, though the need for balance may itself be arbitrary).</li>\n", "</ul><br>\n", "A related question is how to choose the optimal $k$. For choosing $k$, we think of the above two questions, but we can also see how well the clusters minimize the within-cluster sum of squares. This criterion is also called 'inertia' and is defined as:<br><br>\n", "\n", "<center>$\\text{inertia} = \\sum\\limits_{j=1}^k\\:\\sum\\limits_{x_i \\in C_j}\\|x_i-\\mu_j\\|^2$\n", "</center>\n", "<br>\n", "where $C_j$ is the set of points assigned to cluster $j$ and $\\mu_j$ is the centroid (mean) of that cluster.\n", "\n", "\n", "\n", "</p>" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,\n", " n_clusters=2, n_init=5, n_jobs=1, precompute_distances='auto',\n", " random_state=None, tol=0.0001, verbose=0)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn import cluster\n", "\n", "# Note: n_clusters sets k; init and n_init (number of random restarts) help stabilize the fit\n", "km = cluster.KMeans(n_clusters = 2, init = 'k-means++', n_init = 5)\n", "km.fit(dpro)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>Let's loop through different values of $k$ to get the inertia as a function of $k$. 
We'll also compute another metric that has been used to evaluate clusters where true cluster labels are not known (which is usually the case). This is called the Silhoette Coefficient. More details can be found <a href=\"http://scikit-learn.org/stable/modules/clustering.html#clustering-evaluation\">here</a></p>" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA3wAAAIYCAYAAAA2Bu7aAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzs3X28Z2VdL/zP10G08gGQORzkweEo\n2sHOHXqPkKcnjyYPamInH/CYolmT90vu8pWdwk6JodwHe7Is0yhItBJJMyfBkHyorFQGRROUGBFj\nRpARkDQLG/zef/zWeH5s957Zv5m998xe836/Xr/XXuta17XWtfaPvfd8uK51/aq7AwAAwPjca293\nAAAAgOUh8AEAAIyUwAcAADBSAh8AAMBICXwAAAAjJfABAACMlMAHAIOqekNV/eJutr2xqn5gCfrw\nuKrasqfnAYBE4ANgH7ZUIWqBcz+/qj44XdbdL+ruVy7H9QBgbxD4ANjvVNUBe7sPALASBD4AVoUd\nI3JV9atVdUdVfbaqTp06/sCquqCqbq6qrVX1qqpaM9X2b6vqNVV1W5K3JnlDksdW1Veq6ktDvTdW\n1auG7YOr6l1VtW243ruq6shddPMxVXXtUP8Pquq+w7k+WVU/ONXXe1fVF6vqUYu4758czrmrawPA\nNxH4AFhNTkxyXZJDk/xykguqqoZjb0yyPcnDkjwqyUlJfmxO2xuSHJbkR5K8KMnfd/f9uvugea51\nryR/kOQhSY5O8q9JfnsX/XtOkpOTPDTJw5P8wlD+puGaOzwpyc3d/bGdnayqXp7k+Um+v7s91wfA\nzAQ+AFaTz3X373X33UkuSnJ4ksOq6rBMQtRLuvtfuvvWJK9JcvpU289392919/bu/tddXai7b+vu\nt3f3V7v7y0nOTfL9u2j22919U3ffPtR/9lD+h0meVFUPGPafm+TNOzlPVdWvZxJa/1t3b9tVfwFg\nPp5hAGA1uWXHRnd/dRjcu1+SQ5LcO8nN/2fAL/dKctNU2+ntXaqqb80kNJ6S5OCh+P5VtWYInPOZ\nvsbnkjx46Ovnq+pvk/xwVb0jyalJfmonlz8oyYYkz+ruO2fpNwBME/gAGIObktyV5NDu3r5And7F\n/lwvTfKIJCd29y1VdXySjyWpnbQ5amr76CSfn9q/KJMppgdkMpV0607Oc0cmU0Avqaof6u6/3UVf\nAWBepnQCsOp1981J3pPk16rqAVV1r6p6aFXtbArmF5IcWVUHLnD8/pk8t/elqjokydmL6MqLq+rI\nof7/ymRxmB3+LMmjMxnZe9OuTtTdH8jkmcA/raoTFnFtAPgmAh8AY/G8JAcmuTaTEbK3ZfKM30Le\nl+SaJLdU1RfnOf4bSb4lyReTfCjJXyyiD3+cSfC8Iclnkrxqx4HhucG3JzkmyZ8u4lzp7iuS/GiS\nP6+qRy+mDQBMq+5dzWgBAJbCsOrmw7v7R3ZZGQCWgGf4AGAFDNM8X5jJCp0AsCJM6QSAZVZVP57J\nw
jLv7u6/3tv9AWD/YUonAADASBnhAwAAGCmBDwAAYKT26UVbDj300F63bt3e7gYAAMBecdVVV32x\nu9fubvt9OvCtW7cumzZt2tvdAAAA2Cuq6nN70t6UTgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4A\nAICREvgAAABGSuADAAAYqUUHvqpaU1Ufq6p3DfvHVNWHq2pzVb21qg4cyu8z7G8ejq+bOsfLhvLr\nqurkpb4ZAAAA/o9ZPnj9p5J8KskDhv1XJ3lNd19cVW9I8sIkrx++3tHdD6uq04d6z6qq45KcnuSR\nSR6c5C+r6uHdffcS3cuKWXfWpXvU/sbznrxEPQEAAFjYokb4qurIJE9O8vvDfiV5fJK3DVUuSvK0\nYfu0YT/D8ScM9U9LcnF339Xdn02yOckJS3ETAAAAfLPFTun8jSQ/m+Trw/6Dknypu7cP+1uSHDFs\nH5HkpiQZjt851P9G+TxtvqGqNlTVpqratG3bthluBQAAgGm7DHxV9ZQkt3b3VSvQn3T3+d29vrvX\nr127diUuCQAAMEqLeYbvu5M8taqelOS+mTzD95tJDqqqA4ZRvCOTbB3qb01yVJItVXVAkgcmuW2q\nfIfpNgAAACyxXY7wdffLuvvI7l6XyaIr7+vu5yR5f5KnD9XOSPLOYXvjsJ/h+Pu6u4fy04dVPI9J\ncmySjyzZnQAAAHAPs6zSOdfPJbm4ql6V5GNJLhjKL0jy5qranOT2TEJiuvuaqrokybVJtid58Wpc\noRMAAGC1mCnwdfcHknxg2L4h86yy2d3/luQZC7Q/N8m5s3YSAACA2S36g9cBAABYXQQ+AACAkRL4\nAAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuAD\nAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKR2Gfiq6r5V9ZGq+nhVXVNVvzSUv7Gq\nPltVVw+v44fyqqrXVtXmqvpEVT166lxnVNX1w+uM5bstAAAADlhEnbuSPL67v1JV907ywap693Ds\nf3b32+bUPzXJscPrxCSvT3JiVR2S5Owk65N0kquqamN337EUNwIAAMA97XKErye+Muzee3j1Tpqc\nluRNQ7sPJTmoqg5PcnKSK7r79iHkXZHklD3rPgAAAAtZ1DN8VbWmqq5Ocmsmoe3Dw6Fzh2mbr6mq\n+wxlRyS5aar5lqFsoXIAAACWwaICX3ff3d3HJzkyyQlV9R1JXpbk25M8JskhSX5uKTpUVRuqalNV\nbdq2bdtSnBIAAGC/NNMqnd39pSTvT3JKd988TNu8K8kfJDlhqLY1yVFTzY4cyhYqn3uN87t7fXev\nX7t27SzdAwAAYMpiVulcW1UHDdvfkuSJST49PJeXqqokT0vyyaHJxiTPG1br/K4kd3b3zUkuT3JS\nVR1cVQcnOWkoAwAAYBksZpXOw5NcVFVrMgmIl3T3u6rqfVW1NkkluTrJi4b6lyV5UpLNSb6a5AVJ\n0t23V9Urk1w51Dunu29fulthIevOunSP2t943pOXqCcAAMBK2mXg6+5PJHnUPOWPX6B+J3nxAscu\nTHLhjH0EAABgN8z0DB8AAACrh8AHAAAwUgIfAADASAl8AAAAIyXwAQAAjJTABwAAMFICHwAAwEgJ\nfAAAACMl8AEAAIyUwAcAADBSAh8AAMBICXwAAAAjJfABAACMlMAHAAAwUgIfAADASAl8AAAAIyXw\nAQAAjNQuA19V3beqPlJVH6+qa6rql4byY6rqw1W1uareWlUHDuX3GfY3D8fXTZ3rZUP5dVV18nLd\nFAAAAMkBi6hzV5LHd/dXqureST5YVe9O8tNJXtPdF1fVG5K8MMnrh693dPfDqur0JK9O8qyqOi7J\n6UkemeTBSf6yqh7e3Xcvw32xiq0769I9an/jeU9eop4AAMDqtss
Rvp74yrB77+HVSR6f5G1D+UVJ\nnjZsnzbsZzj+hKqqofzi7r6ruz+bZHOSE5bkLgAAAPgmi3qGr6rWVNXVSW5NckWSzyT5UndvH6ps\nSXLEsH1EkpuSZDh+Z5IHTZfP0wYAAIAltqjA1913d/fxSY7MZFTu25erQ1W1oao2VdWmbdu2Lddl\nAAAARm+mVTq7+0tJ3p/ksUkOqqodzwAemWTrsL01yVFJMhx/YJLbpsvnaTN9jfO7e313r1+7du0s\n3QMAAGDKYlbpXFtVBw3b35LkiUk+lUnwe/pQ7Ywk7xy2Nw77GY6/r7t7KD99WMXzmCTHJvnIUt0I\nAAAA97SYVToPT3JRVa3JJCBe0t3vqqprk1xcVa9K8rEkFwz1L0jy5qranOT2TFbmTHdfU1WXJLk2\nyfYkL7ZCJwAAwPLZZeDr7k8kedQ85TdknlU2u/vfkjxjgXOdm+Tc2bsJAADArGZ6hg8AAIDVQ+AD\nAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGKnFfA4fsBPrzrp0t9veeN6Tl7AnAABwT0b4AAAA\nRkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAY\nKYEPAABgpHYZ+KrqqKp6f1VdW1XXVNVPDeWvqKqtVXX18HrSVJuXVdXmqrquqk6eKj9lKNtcVWct\nzy0BAACQJAcsos72JC/t7o9W1f2TXFVVVwzHXtPdvzpduaqOS3J6kkcmeXCSv6yqhw+HX5fkiUm2\nJLmyqjZ297VLcSPA4qw769LdbnvjeU9ewp4AALDcdhn4uvvmJDcP21+uqk8lOWInTU5LcnF335Xk\ns1W1OckJw7HN3X1DklTVxUNdgQ8AAGAZzPQMX1WtS/KoJB8eis6sqk9U1YVVdfBQdkSSm6aabRnK\nFiqfe40NVbWpqjZt27Ztlu4BAAAwZdGBr6rul+TtSV7S3f+c5PVJHprk+ExGAH9tKTrU3ed39/ru\nXr927dqlOCUAAMB+aTHP8KWq7p1J2Puj7v7TJOnuL0wd/70k7xp2tyY5aqr5kUNZdlIOAADAElvM\nKp2V5IIkn+ruX58qP3yq2g8l+eSwvTHJ6VV1n6o6JsmxST6S5Mokx1bVMVV1YCYLu2xcmtsAAABg\nrsWM8H13kucm+Yequnoo+/kkz66q45N0khuT/ESSdPc1VXVJJouxbE/y4u6+O0mq6swklydZk+TC\n7r5mCe8FAACAKYtZpfODSWqeQ5ftpM25Sc6dp/yynbUDAABg6cy0SicAAACrx6IWbQHYW3xQPADA\n7jPCBwAAMFICHwAAwEgJfAAAACMl8AEAAIyURVsAlpBFZgCAfYkRPgAAgJES+AAAAEZK4AMAABgp\ngQ8AAGCkBD4AAICREvgAAABGSuADAAAYKZ/DB7Cf2pPPDEx8biAArAZG+AAAAEZK4AMAABipXQa+\nqjqqqt5fVddW1TVV9VND+SFVdUVVXT98PXgor6p6bVVtrqpPVNWjp851xlD/+qo6Y/luCwAAgMWM\n8G1P8tLuPi7JdyV5cVUdl+SsJO/t7mOTvHfYT5JTkxw7vDYkeX0yCYhJzk5yYpITkpy9IyQCAACw\n9Ha5aEt335zk5mH7y1X1qSRHJDktyeOGahcl+UCSnxvK39TdneRDVXVQVR0+1L2iu29Pkqq6Iskp\nSd6yhPcDwEhZZAYAZjfTM3xVtS7Jo5J8OMlhQxhMkluSHDZsH5HkpqlmW4ayhcrnXmNDVW2qqk3b\ntm2bpXsAAABMWfTHMlTV/ZK8PclLuvufq+obx7q7q6qXokPdfX6S85Nk/fr1S3JOAFhpRiQB2Bcs\naoSvqu6dSdj7o+7+06H4C8NUzQxfbx3KtyY5aqr5kUPZQuUAAAAsg8Ws0llJLkjyqe7+9alDG5Ps\nWGnzjCTvnCp/3rBa53cluXO
Y+nl5kpOq6uBhsZaThjIAAACWwWKmdH53kucm+Yequnoo+/kk5yW5\npKpemORzSZ45HLssyZOSbE7y1SQvSJLuvr2qXpnkyqHeOTsWcAEAAGDpLWaVzg8mqQUOP2Ge+p3k\nxQuc68IkF87SQQAAAHbPohdtAQDGa08WmbHADMC+a6aPZQAAAGD1EPgAAABGypROAGDVMQUVYHGM\n8AEAAIyUwAcAADBSAh8AAMBIeYYPAGCZeeYQ2FuM8AEAAIyUwAcAADBSAh8AAMBICXwAAAAjZdEW\nAADuwSIzMB5G+AAAAEZK4AMAABgpUzoBAFi19mT6aWIKKuNnhA8AAGCkdjnCV1UXJnlKklu7+zuG\nslck+fEk24ZqP9/dlw3HXpbkhUnuTvKT3X35UH5Kkt9MsibJ73f3eUt7KwAAsG8zIslKW8wI3xuT\nnDJP+Wu6+/jhtSPsHZfk9CSPHNr8TlWtqao1SV6X5NQkxyV59lAXAACAZbLLEb7u/uuqWrfI852W\n5OLuvivJZ6tqc5IThmObu/uGJKmqi4e6187cYwAAABZlTxZtObOqnpdkU5KXdvcdSY5I8qGpOluG\nsiS5aU75ifOdtKo2JNmQJEcfffQedA8AANgTpqCufrsb+F6f5JVJevj6a0l+dCk61N3nJzk/Sdav\nX99LcU4AAGD89iSgjjWc7lbg6+4v7Niuqt9L8q5hd2uSo6aqHjmUZSflAAAALIPd+liGqjp8aveH\nknxy2N6Y5PSquk9VHZPk2CQfSXJlkmOr6piqOjCThV027n63AQAA2JXFfCzDW5I8LsmhVbUlydlJ\nHldVx2cypfPGJD+RJN19TVVdksliLNuTvLi77x7Oc2aSyzP5WIYLu/uaJb8bAAAAvmExq3Q+e57i\nC3ZS/9wk585TflmSy2bqHQAAALttt6Z0AgAAsO8T+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgA\nAABGSuADAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMA\nABgpgQ8AAGCkBD4AAICR2mXgq6oLq+rWqvrkVNkhVXVFVV0/fD14KK+qem1Vba6qT1TVo6fanDHU\nv76qzlie2wEAAGCHxYzwvTHJKXPKzkry3u4+Nsl7h/0kOTXJscNrQ5LXJ5OAmOTsJCcmOSHJ2TtC\nIgAAAMtjl4Gvu/86ye1zik9LctGwfVGSp02Vv6knPpTkoKo6PMnJSa7o7tu7+44kV+SbQyQAAABL\naHef4Tusu28etm9JctiwfUSSm6bqbRnKFioHAABgmezxoi3d3Ul6CfqSJKmqDVW1qao2bdu2balO\nCwAAsN/Z3cD3hWGqZoavtw7lW5McNVXvyKFsofJv0t3nd/f67l6/du3a3eweAAAAuxv4NibZsdLm\nGUneOVX+vGG1zu9Kcucw9fPyJCdV1cHDYi0nDWUAAAAskwN2VaGq3pLkcUkOraotmay2eV6SS6rq\nhUk+l+SZQ/XLkjwpyeYkX03ygiTp7tur6pVJrhzqndPdcxeCAQAAYAntMvB197MXOPSEeep2khcv\ncJ4Lk1w4U+8AAADYbXu8aAsAAAD7JoEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAA\ngJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACAkRL4AAAA\nRkrgAwAAGKk9CnxVdWNV/UNVXV1Vm4ayQ6rqiqq6fvh68FBeVfXaqtpcVZ+oqkcvxQ0AAAAwv6UY\n4ftv3X18d68f9s9K8t7uPjbJe4f9JDk1ybHDa0OS1y/BtQEAAFjAckzpPC3JRcP2RUmeNlX+pp74\nUJKDqurwZbg+AAAA2fPA10neU1VXVdWGoeyw7r552L4lyWHD9hFJbppqu2UoAwAAYBkcsIftv
6e7\nt1bVf0hyRVV9evpgd3dV9SwnHILjhiQ5+uij97B7AAAA+689GuHr7q3D11uTvCPJCUm+sGOq5vD1\n1qH61iRHTTU/ciibe87zu3t9d69fu3btnnQPAABgv7bbga+qvq2q7r9jO8lJST6ZZGOSM4ZqZyR5\n57C9McnzhtU6vyvJnVNTPwEAAFhiezKl87Ak76iqHef54+7+i6q6MsklVfXCJJ9L8syh/mVJnpRk\nc5KvJnnBHlwbAACAXdjtwNfdNyT5znnKb0vyhHnKO8mLd/d6AAAAzGY5PpYBAACAfYDABwAAMFIC\nHwAAwEgJfAAAACMl8AEAAIyUwAcAADBSAh8AAMBICXwAAAAjJfABAACMlMAHAAAwUgIfAADASAl8\nAAAAIyXwAQAAjJTABwAAMFICHwAAwEgJfAAAACMl8AEAAIyUwAcAADBSKx74quqUqrquqjZX1Vkr\nfX0AAID9xYoGvqpak+R1SU5NclySZ1fVcSvZBwAAgP3FSo/wnZBkc3ff0N1fS3JxktNWuA8AAAD7\nherulbtY1dOTnNLdPzbsPzfJid195lSdDUk2DLuPSHLdinUQAABg3/KQ7l67u40PWMqeLIXuPj/J\n+Xu7HwAAAKvdSk/p3JrkqKn9I4cyAAAAlthKB74rkxxbVcdU1YFJTk+ycYX7AAAAsF9Y0Smd3b29\nqs5McnmSNUku7O5rVrIPAAAA+4sVXbQFAACAlbPiH7wOAADAyhD4AAAARkrgAwAAGCmBDwAAYKQE\nPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACAkRL4\nAAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuAD\nAAAYKYEPgHuoqudU1Xum9ruqHjZsv7GqXrX3erdvqKpvqao/r6o7q+pPhrJXVdUXq+qWqjq6qr5S\nVWt2cZ7vrarrVqjPj6uqLUt0Lv8dAKwSAh/Afqiqvqeq/m4ILLdX1d9W1WOSpLv/qLtPWuH+PL+q\nPjinbI9DRVWdUFWXVdWXhvv8SFW9YM96myR5epLDkjyou59RVUcneWmS47r7P3b3P3X3/br77p2d\npLv/prsfsQT9SVXdWFU/sBTnAmA8BD6A/UxVPSDJu5L8VpJDkhyR5JeS3LU3+7XUquqxSd6X5K+S\nPCzJg5L8P0lOXYLTPyTJP3b39mH/6CS3dfetS3BuAFgyAh/A/ufhSdLdb+nuu7v7X7v7Pd39iWT+\n0bY5Dq6qS6vqy1X14ap66I4DVfVfq+rKYeTwyqr6r1PHHlhVF1TVzVW1dZgCuaaq/nOSNyR57DAN\n8ktVtSHJc5L87FD258M5HlxVb6+qbVX12ar6yZ3081eSXNTdr+7uL/bEVd39zKk+/XhVbR5G/zZW\n1YOnjn17VV0xHLuuqp45lP9SkpcnedbQt59IckWSBw/7b6yqdcNU2AOGNodU1R9U1eer6o6q+rOh\n/B7TLHd2f1X1iqq6pKreNHzvr6mq9cOxN2cSOv986MPPLvRNqaqfH6ae3lhVzxnKHlNVX5ieglpV\n/72qPr6T7++OevevqvdX1WurqnZVH4CVJfAB7H/+McndVXVRVZ1aVQfP2P70TEYED06yOcm5ySTU\nJLk0yWszGU379SSXVtWDhnZvTLI9k9G2RyU5KcmPdfenkrwoyd8P0yAP6u7zk/xRkl8eyn6wqu6V\n5M+TfDyTUcknJHlJVZ08t4NV9a1JHpvkbQvdRFU9Psn/TvLMJIcn+VySi4dj35ZJiPvjJP9huOff\nqarjuvvsJP9fkrcOffvdTEYNPz/sP3+ey705ybcmeeRwvtfM05/F3N9Thz4elGRjkt9Oku5+bpJ/\nSvKDQx9+eYHb/o9JDh3Of0aS86vqEd19ZZLbMnlPdnhuk
jctcJ4dfX5Qkvcm+dvu/snu7p3VB2Dl\nCXwA+5nu/uck35Okk/xekm3D6NZhizzFO7r7I8N0xj9KcvxQ/uQk13f3m7t7e3e/Jcmnk/zgcO4n\nJXlJd//LMPXxNZkEqcV6TJK13X1Od3+tu28Y+j/fOQ7O5G/czTs533OSXNjdH+3uu5K8LJNRxnVJ\nnpLkxu7+g+FePpbk7UmeMUN/kyRVdXgmgfBF3X1Hd/97d//Vbt7fB7v7suHZwDcn+c5Z+5PkF7v7\nrqEPl2YSeJPkoiQ/MvT5kCQnZxJ4F/LgTKbL/kl3/8Ju9AOAFXDA3u4AACtvGFV7fjKZupjkD5P8\nRpJnL6L5LVPbX01yv2H7wZmMkk37XCajSQ9Jcu8kN0/N+rtXkptm6PZDMpk2+aWpsjVJ/maeunck\n+XomI3efXuB8D07y0R073f2Vqrptqr8nzrnWAZmErFkdleT27r5jF/UWc39zv/f3raoDpp4l3JU7\nuvtfpvY/l8n3IZn8N/CpYXTzmUn+prt3FpifnOQrmUzHBWAfJfAB7Oe6+9NV9cYkP7GHp/p8JqFl\n2tFJ/iKTYHdXkkMXCCfzTQWcW3ZTks9297G76kh3f7Wq/j7JDyd5/2L6OwSdByXZOlzrr7r7ibu6\n1iLclOSQqjqou7+0i3qLur8FLGY65cFV9W1Toe/oJJ9Mku7eOnzP/nsm0zlfv4tz/V4mI6mXVdUp\nc4IkAPsIUzoB9jPDYiQvraojh/2jMhnZ+9AenvqyJA+vqv9RVQdU1bOSHJfkXcNI0XuS/FpVPaCq\n7lVVD62q7x/afiHJkVV14NT5vpDkP03tfyTJl6vq52ryOXhrquo7avg4iXn8bJLnV9X/3PEcYVV9\nZ1VdPBx/S5IXVNXxVXWfTJ7L+3B335jJKqYPr6rnVtW9h9djarLAzEyGe393Js8AHjyc6/vmqTrr\n/c019/u1kF+qqgOr6nszmbr6J1PH3pTJ9+2/JPnTRZzrzCTXZbJYzLcssp8ArCCBD2D/8+UkJyb5\ncFX9SyZB75OZfI7cbuvu2zIJEC/NZAGQn03ylO7+4lDleUkOTHJtJlMu35bJlMtk8vEJ1yS5pap2\n1L8gyXE1WbXzz4bn1p6SyTODn03yxSS/n+SBC/Tn75I8fnjdUFW3Jzk/k2Ca7v7LJL+YybN5Nyd5\naIbn5br7y5ksYHJ6JiOBtyR5dZL77Oa357lJ/j2T6aW3JnnJPP2d6f7m8b+T/MLw/fqZBercksn3\n/vOZPH/5ou6envL6jkxGPd/R3V/d1QWHRVo2JNmS5J1Vdd9F9hWAFVIW1AIAdqiqzyT5iSEQA7DK\nGeEDAJIkVfXDmTwL+L693RcAloZFWwCAVNUHMnnm8rnd/fW93B0AlogpnQAAACNlSicAAMBI7dNT\nOg899NBet27d3u4GAADAXnHVVVd9sbvX7m77fTrwrVu3Lps2bdrb3QAAANgrqupze9LelE4AAICR\nEvgAAABGSuADAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAAgJE6YG93YDVa\nd9ale9T+xvOevEQ9AQAAWJgRPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuAD\nAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8A\nAGCkBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACAkRL4AAAARuqAvd0Blt+6sy7do/Y3nvfk\nJeoJAACwkozwAQAAjNTMga/9gYAhAAARgUlEQVSqTqmq66pqc1WdNc/xn66qa6vqE1X13qp6yNSx\nM6rq+uF1xp52HgAAgIXNFPiqak2S1yU5NclxSZ5dVcfNqfaxJOu7+/9K8rYkvzy0PSTJ2UlOTHJC\nkrOr6uA96z4AAAALm
XWE74Qkm7v7hu7+WpKLk5w2XaG739/dXx12P5TkyGH75CRXdPft3X1HkiuS\nnLL7XQcAAGBnZg18RyS5aWp/y1C2kBcmefdutgUAAGAPLNsqnVX1I0nWJ/n+GdttSLIhSY4++uhl\n6BkAAMD+YdYRvq1JjpraP3Iou4eq+oEk/yvJU7v7rlnadvf53b2+u9evXbt2xu4BAACww6yB78ok\nx1bVMVV1YJLTk2ycrlBVj0ryu5mEvVunDl2e5KSqOnhYrOWkoQwAAIBlMNOUzu7eXlVnZhLU1iS5\nsLuvqapzkmzq7o1JfiXJ/ZL8SVUlyT9191O7+/aqemUmoTFJzunu25fsTgAAALiHmZ/h6+7Lklw2\np+zlU9s/sJO2Fya5cNZrAgAAMLuZP3gdAACA1UHgAwAAGCmBDwAAYKQEPgAAgJFatg9eB/ZN6866\ndLfb3njek5ewJwAALDcjfAAAACMl8AEAAIyUwAcAADBSnuGDPeSZOAAA9lUCH7BPE6gBAHafKZ0A\nAAAjJfABAACMlCmdAPupPZkum6z8lNnV1l8A2BcIfAAArFr+Z9Dy8v1d/UzpBAAAGCkjfOxz/J8k\nYAz8LgNgX2CEDwAAYKSM8AEsIZ8bCMDOGP1npRnhAwAAGCmBDwAAYKQEPgAAgJHyDB8AAPfgeWRW\nK//tfjMjfAAAACMl8AEAAIyUwAcAADBSAh8AAMBICXwAAAAjJfABAACMlI9lAABYZpaKB/YWI3wA\nAAAjJfABAACMlMAHAAAwUgIfAADASFm0BQBYdSyCArA4RvgAAABGSuADAAAYKVM6AQBTJAFGauYR\nvqo6paquq6rNVXXWPMe/r6o+WlXbq+rpc47dXVVXD6+Ne9JxAAAAdm6mEb6qWpPkdUmemGRLkiur\namN3XztV7Z+SPD/Jz8xzin/t7uN3s68AAADMYNYpnSck2dzdNyRJVV2c5LQk3wh83X3jcOzrS9RH\nAAAAdsOsUzqPSHLT1P6WoWyx7ltVm6rqQ1X1tPkqVNWGoc6mbdu2zdg9AAAAdljpVTof0t3rk/yP\nJL9RVQ+dW6G7z+/u9d29fu3atSvcPQAAgPGYNfBtTXLU1P6RQ9midPfW4esNST6Q5FEzXh8AAIBF\nmjXwXZnk2Ko6pqoOTHJ6kkWttllVB1fVfYbtQ5N8d6ae/QMAAGBpzRT4unt7kjOTXJ7kU0ku6e5r\nquqcqnpqklTVY6pqS5JnJPndqrpmaP6fk2yqqo8neX+S8+as7gkAAMASmvmD17v7siSXzSl7+dT2\nlZlM9Zzb7u+S/Jfd6CMAAAC7YaUXbQEAAGCFCHwAAAAjJfABAACMlMAHAAAwUgIfAADASAl8AAAA\nIyXwAQAAjJTABwAAMFICHwAAwEgJfAAAACMl8AEAAIyUwAcAADBSAh8AAMBICXwAAAAjJfABAACM\nlMAHAAAwUgIfAADASAl8AAAAIyXwAQAAjJTABwAAMFICHwAAwEgJfAAAACMl8AEAAIyUwAcAADBS\nAh8AAMBICXwAAAAjJfABAACMlMAHAAAwUgIfAADASAl8AAAAIyXwAQAAjJTABwAAMFICHwAAwEgJ\nfAAAACMl8AEAAIyUwAcAADBSAh8AAMBIzRz4quqUqrquqjZX1VnzHP++qvpoVW2vqqfPOXZGVV0/\nvM7Yk44DAACwczMFvqpak+R1SU5NclySZ1fVcXOq/VOS5yf54zltD0lydpITk5yQ5OyqOnj3ug0A\nAMCuzDrCd0KSzd19Q3d/LcnFSU6brtDdN3b3J5J8fU7bk5Nc0d23d/cdSa5Icspu9hsAAIBdmDXw\nHZHkpqn9LUPZkrWtqg1VtamqNm3btm3G7gEAALDDPrdoS3ef393ru3v92rVr93Z3AAAAVq1ZA9/W\nJEdN7R85lC13WwAAAGY0a+C7MsmxVXVMVR2Y5PQkGxfZ9vIkJ1XVwcNiLScNZQAAACy
DmQJfd29P\ncmYmQe1TSS7p7muq6pyqemqSVNVjqmpLkmck+d2qumZoe3uSV2YSGq9Mcs5QBgAAwDI4YNYG3X1Z\nksvmlL18avvKTKZrztf2wiQXznpNAAAAZrfPLdoCAADA0hD4AAAARkrgAwAAGCmBDwAAYKQEPgAA\ngJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACAkRL4AAAA\nRkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAY\nKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCk\nBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACAkZo58FXVKVV1XVVtrqqz5jl+n6p663D8w1W1\nbihfV1X/WlVXD6837Hn3AQAAWMgBs1SuqjVJXpfkiUm2JLmyqjZ297VT1V6Y5I7uflhVnZ7k1Ume\nNRz7THcfvwT9BgAAYBdmHeE7Icnm7r6hu7+W5OIkp82pc1qSi4bttyV5QlXVnnUTAACAWc0a+I5I\nctPU/pahbN463b09yZ1JHjQcO6aqPlZVf1VV3zvfBapqQ1VtqqpN27Ztm7F7AAAA7LCSi7bcnOTo\n7n5Ukp9O8sdV9YC5lbr7/O5e393r165du4LdAwAAGJdZA9/WJEdN7R85lM1bp6oOSPLAJLd1913d\nfVuSdPdVST6T5OG702kAAAB2bdbAd2WSY6vqmKo6MMnpSTbOqbMxyRnD9tOTvK+7u6rWDou+pKr+\nU5Jjk9yw+10HAABgZ2ZapbO7t1fVmUkuT7ImyYXdfU1VnZNkU3dvTHJBkjdX1eYkt2cSCpPk+5Kc\nU1X/nuTrSV7U3bcv1Y0AAABwTzMFviTp7suSXDan7OVT2/+W5BnztHt7krfvRh8BAADYDSu5aAsA\nAAArSOADAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMA\nABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACAkRL4AAAARkrgAwAAGCmBDwAA\nYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICREvgAAABGSuADAAAYKYEPAABgpAQ+AACA\nkRL4AAAARkrgAwAAGCmBDwAAYKQEPgAAgJES+AAAAEZK4AMAABgpgQ8AAGCkBD4AAICRmjnwVdUp\nVXVdVW2uqrPmOX6fqnrrcPzDVbVu6tjLhvLrqurkPes6AAAAOzNT4KuqNUlel+TUJMcleXZVHTen\n2guT3NHdD0vymiSvHtoel+T0JI9MckqS3xnOBwAAwDKYdYTvhCSbu/uG7v5akouTnDanzmlJLhq2\n35bkCVVVQ/nF3X1Xd382yebhfAAAACyDWQPfEUlumtrfMpTNW6e7tye5M8mDFtkWAACAJVLdvfjK\nVU9Pckp3/9iw/9wkJ3b3mVN1PjnU2TLsfybJiUlekeRD3f2HQ/kFSd7d3W+bc40NSTYMu49Ict3u\n3dooHZrki3u7E8zM+7b6eM9WJ+/b6uR9W328Z6uT9211OjTJt3X32t09wQEz1t+a5Kip/SOHsvnq\nbKmqA5I8MMlti2yb7j4/yfkz9mu/UFWbunv93u4Hs/G+rT7es9XJ+7Y6ed9WH+/Z6uR9W52G923d\nnpxj1imdVyY5tqqOqaoDM1mEZeOcOhuTnDFsPz3J+3oyjLgxyenDKp7HJDk2yUd2v+sAAADszEwj\nfN29varOTHJ5kjVJLuzua6rqnCSbuntjkguSvLmqNie5PZNQmKHeJUmuTbI9yYu7++4lvBcAAACm\nzDqlM919WZLL5pS9fGr735I8Y4G25yY5d9Zr8g2
muq5O3rfVx3u2OnnfVifv2+rjPVudvG+r0x6/\nbzMt2gIAAMDqMeszfAAAAKwSAt8+pqqOqqr3V9W1VXVNVf3UPHUeV1V3VtXVw+vl852LlVVVN1bV\nPwzvyaZ5jldVvbaqNlfVJ6rq0Xujn0xU1SOmfoaurqp/rqqXzKnjZ20fUFUXVtWtw8f+7Cg7pKqu\nqKrrh68HL9D2jKHO9VV1xnx1WB4LvG+/UlWfHn4HvqOqDlqg7U5/n7I8FnjPXlFVW6d+Dz5pgban\nVNV1w9+4s1au1yzwvr116j27saquXqCtn7W9YKF/7y/X3zZTOvcxVXV4ksO7+6NVdf8kVyV5Wndf\nO1XncUl+prufspe6yTyq6sYk67t73s+4Gf5I/r9JnpTJZ1P+ZnefuHI9ZCFVtSaTj4k5sbs/N1X+\nuPhZ2+uq6vuSfCXJm7r7O4ayX05ye3efN/zj8uDu/rk57Q5JsinJ+iSdye/T/7u771jRG9hPLfC+\nnZTJ6t3bq+rVSTL3fRvq3Zid/D5leSzwnr0iyVe6+1d30m5Nkn9M8sQkWzJZ1f3Z0/92YfnM977N\nOf5rSe7s7nPmOXZj/KytuIX+vZ/k+VmGv21G+PYx3X1zd3902P5ykk8lOWLv9oolclomv4y7uz+U\n5KDhB5697wlJPjMd9th3dPdfZ7Lq87TTklw0bF+UyR/KuU5OckV33z78IbwiySnL1lHuYb73rbvf\n093bh90PZfKZvOwjFvhZW4wTkmzu7hu6+2tJLs7kZ5QVsLP3raoqyTOTvGVFO8VO7eTf+8vyt03g\n24dV1bokj0ry4XkOP7aqPl5V766qR65ox1hIJ3lPVV1VVRvmOX5Ekpum9rdEmN9XnJ6F/xj6Wds3\nHdbdNw/btyQ5bJ46fub2bT+a5N0LHNvV71NW1pnDNNwLF5hi5mdt3/W9Sb7Q3dcvcNzP2l4259/7\ny/K3TeDbR1XV/ZK8PclLuvuf5xz+aJKHdPd3JvmtJH+20v1jXt/T3Y9OcmqSFw9TLNjHVdX/3979\nvNgUhgEc/z6ZYTErpUgoyd5CUjYWTEhTZEHyK8ooa4WFYmPDlmLshlAmFiL/gFI2iIUFRZopFhIb\nPBbn0O12z+TX3HMc38/mzpzzLt7b0/Oe97n3fd87GxgBrve4ba79A7LYm+D+hH9IRByn+E3e8Yom\njqfNcQ5YBqwA3gBn6u2OftEOpv92z1yr0XTz/b/5bLPga6CIGKQI/nhm3ui+n5nvM/ND+fdtYDAi\n5vW5m+qSma/L1ylggmKJS6fXwOKO/xeV11SvjcDDzJzsvmGuNdrk9yXR5etUjzbmXANFxF5gM7Az\nKw4S+InxVH2SmZOZ+SUzvwIX6B0Lc62BImIA2ApcrWpjrtWnYr4/I882C76GKddajwFPM/NsRZsF\nZTsiYhVFHN/2r5fqFhFD5aZbImIIGAYedzW7BeyOwmqKDdRvUN0qP/001xrtFvD9ZLI9wM0ebe4C\nwxExt1yGNlxeU00iYgNwBBjJzI8VbX5mPFWfdO0130LvWDwAlkfE0nLVxHaKHFW91gHPMvNVr5vm\nWn2mme/PyLNt4M+7rL9sDbALeNRxhO4xYAlAZp4HtgGHIuIz8AnYXvUpqfpmPjBR1gYDwOXMvBMR\no/AjbrcpTuh8DnwE9tXUV5XKB9x64GDHtc6YmWsNEBFXgLXAvIh4BZwATgPXImI/8JLiUAIiYiUw\nmpkHMvNdRJyimIwCnMzM3zmQQr+hIm5HgTnAvXK8vJ+ZoxGxELiYmZuoGE9reAv/nYqYrY2IFRRL\ny15QjpedMStPXT1MMemcBVzKzCc1vIX/Uq+4ZeYYPfanm2uNUTXfn5Fnmz/LIEmSJEkt5ZJOSZIk\nSWopCz5JkiRJaikLPkmSJElqKQs+SZIkSWopCz5JkiRJaikLPkmSJElqKQs+SZIkSWopCz5JkiRJ\naqlvv/2mkqb
xrDMAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 1500x900 with 2 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from sklearn.metrics import pairwise_distances\n", "from sklearn import metrics\n", "\n", "# Track inertia and silhouette score for each candidate k\n", "inert_k = []\n", "sil_k = []\n", "\n", "for k in range(2, 20):\n", " km = cluster.KMeans(n_clusters = k, init = 'k-means++', n_init = 5)\n", " km.fit(dpro)\n", " inert_k.append(km.inertia_)\n", " sil_k.append(metrics.silhouette_score(dpro, km.labels_, metric = 'euclidean'))\n", "\n", "fig = plt.figure(figsize = (15, 9))\n", "\n", "# Top panel: inertia (within-cluster sum of squares)\n", "ax1 = plt.subplot(211)\n", "plt.bar(range(2,20), inert_k, 0.35)\n", "plt.title('Inertia by k')\n", "plt.tick_params(axis='x',which='both',bottom=False,top=False,labelbottom=False)\n", "\n", "# Bottom panel: silhouette coefficient\n", "ax2 = plt.subplot(212)\n", "plt.bar(range(2,20), sil_k, 0.35)\n", "plt.title('Silhouette Coefficient by k')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>We can see that increasing $k$ steadily decreases the inertia, but the Silhouette Coefficient does not increase along with it. We should always expect inertia to decrease as $k$ increases, but the Silhouette Coefficient shows that this does not necessarily result in better clusters.\n", "<br><br>\n", "One method for determining a good value of $k$ is called the <a href=\"http://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set#The_Elbow_Method\">elbow method</a>. This amounts to looking at the first plot above and finding the $k$ at which the within-cluster sum of squares (the inertia) starts to level off. This is somewhat subjective, but tends to be useful.\n", "</p>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>Next we'll look at a qualitative evaluation. 
For a small range of $k$, we first look at the distribution of clusters.\n", "</p>" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA1kAAAKACAYAAABuR7JgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzs3X2UZXV95/v3J92gxidAKkxfWmxy\nIRqSOzykFoGYyRgeMihGuEuWA6NOm9vejpno6DgzsU3uTGLirMBac4Pm4Sarl5h0JkYgRAcCIYZB\nGG8ebC3kSUCladvYfYEuDa0SExT83j/OblOUVV2nq/Y5+zy8X2udVXvvs0/V5+w6tX/96b3PPqkq\nJEmSJEnt+K6uA0iSJEnSJLFkSZIkSVKLLFmSJEmS1CJLliRJkiS1yJIlSZIkSS2yZEmSJElSiyxZ\nUsuS7ElyXtc5JElaiuOUNHiWLGmEJHlGkquSfCHJ15LcleTlXeeSJOmgJH+Q5OEkX03yuSRv7DqT\nNGosWdJoWQ98EfjnwPOB/wu4NsmmDjNJkrTQrwKbqup5wKuAdyf5oY4zSSPFkiUNUJLvT/L5JJf1\ns35V/V1V/VJV7amqb1XVjcDnAQcvSVLrDnecAqiq+6rqiYOzze1/HUhAaUxZsqQBSXIG8BHgLVX1\nwSQ3JjmwzO3GZb7HccD3AfcNM7skafKtZZxK8v8k+TrwGeBh4E87eArSyEpVdZ1BmihJ9gA7gC3A\n66rq9lV+nyOAm4GHquqnWwsoSZpqLY5T64CzgZcBV1TVN1uKKI09S5bUsmbwehbwP6vqNav8Ht8F\n/CHwPOAiBy5JUlvaGKcWfb/fAe6vql9f6/eSJoWnC0qD8SbghCRXHlyQ5OYkjy9zu3nBegGuAo4D\nXm3BkiQNwKrHqSWsx/dkSU+zvusA0oT6GnABcGuSy6tqW1X1eyn23wa+Hzivqv5+YAklSdNsVeNU\nku8BzgFuBP4eOA+4rLlJangkSxqQqjoAnA+8PMmv9POYJC8Cfho4DXhkwf8gvnaAUSVJU2g14xS9\nKwn+DLAXeAz4r8DbquqGwaSUxpPvyZIkSZKkFnkkS5IkSZJaZMmSJEmSpBZZsiRJkiSpRZYsSZIk\nSWqRJUuSJEmSWjTUz8k69thja9OmTcP8kZKkMXLHHXd8qapmuszgWCVJWk6/49RQS9amTZuYm5sb\n5o+UJI2RJF/oOoNjlSRpOf2OU54uKEmSJEktsmRJkiRJUossWZKksZfkxUnuWnD7apK3JTkmyS1J\nHmy+Ht11VknS5LNkSZLGXlV9tqpOq6rTgB8Cvg58GNgG3FpVJwO3NvOSJA2UJUuSNGnOBR6qqi8A\nFwE7muU7gIs7SyVJmhqWLEnSpLkU+GAzfVxVPdxMPwIc100kSdI06esS7kn+HfBGoIB7gZ8CNgBX\nAy8A7gBeX1XfGFBOSdICm7bd1HWEp9lz+YVdRwAgyZHAq4B3Lr6vqipJLfO4rcBWgBNOOGHNOfz9\nSNJ0W/FIVpLjgX8LzFbVDwLr6P0v4RXAlVV1EvAYsGWQQSVJ6sPLgU9V1aPN/KNJNgA0X/cv9aCq\n2l5Vs1U1OzPT6WchS5ImQL+nC64HnpVkPfDdwMPAOcB1zf2e5y5JGgWX8Y+nCgLcAGxupjcD1w89\nkSRp6qxYsqpqH/Bfgb+hV66+Qu/0wANV9WSz2l7g+KUen2Rrkrkkc/Pz8+2kliRpkSTPBs4HPrR
g\n8eXA+UkeBM5r5iVJGqgV35PVfKbIRcCJwAHgj4AL+v0BVbUd2A4wOzu75LnwkiStVVX9Hb33CS9c\n9mV6VxuUJGlo+jld8Dzg81U1X1XfpPc/hC8FjmpOHwTYCOwbUEZJkiRJGhv9lKy/Ac5K8t1JQu9/\nBO8HbgMuadbxPHdJkiRJor/3ZO2kd4GLT9G7fPt30Tv97x3A25Psond6xlUDzClJkiRJY6Gvz8mq\nql8EfnHR4t3Ama0nkiRJkqQx1u8l3CVJkiRJfbBkSZIkSVKL+jpdcNRs2nZT1xGeZs/lF3YdQZIk\nSdKI8EiWJEmSJLXIkiVJkiRJLbJkSZIkSVKLLFmSJEmS1CJLliRJkiS1yJIlSZIkSS2yZEmSJElS\niyxZkqSJkOSoJNcl+UySB5KcneSYJLckebD5enTXOSVJk8+SJUmaFO8F/qyqXgKcCjwAbANuraqT\ngVubeUmSBsqSJUkae0meD/wYcBVAVX2jqg4AFwE7mtV2ABd3k1CSNE0sWZKkSXAiMA/8bpI7k7wv\nybOB46rq4WadR4Djlnpwkq1J5pLMzc/PDymyJGlSWbIkSZNgPXAG8NtVdTrwdyw6NbCqCqilHlxV\n26tqtqpmZ2ZmBh5WkjTZLFmSpEmwF9hbVTub+evola5Hk2wAaL7u7yifJGmKWLIkSWOvqh4Bvpjk\nxc2ic4H7gRuAzc2yzcD1HcSTJE2Z9V0HkCSpJW8BPpDkSGA38FP0/jPx2iRbgC8Ar+kwnyRpSliy\nJEkToaruAmaXuOvcYWeRJE03TxeUJEmSpBZZsiRJkiSpRZYsSZIkSWqRJUuSJEmSWmTJkiRJkqQW\nWbIkSZIkqUWWLEmSJElqkSVLkiRJklpkyZIkSZKkFlmyJEmSJKlFlixJkiRJalFfJSvJUUmuS/KZ\nJA8kOTvJMUluSfJg8/XoQYeVJEmSpFHX75Gs9wJ/VlUvAU4FHgC2AbdW1cnArc28JEmSJE21FUtW\nkucDPwZcBVBV36iqA8BFwI5mtR3AxYMKKUmSJEnjop8jWScC88DvJrkzyfuSPBs4rqoebtZ5BDhu\nUCElSVpJkj1J7k1yV5K5ZpmntkuShq6fkrUeOAP47ao6Hfg7Fp0aWFUF1FIPTrI1yVySufn5+bXm\nlSTpUH68qk6rqtlm3lPbJUlD10/J2gvsraqdzfx19ErXo0k2ADRf9y/14KraXlWzVTU7MzPTRmZJ\nkvrlqe2SpKFbsWRV1SPAF5O8uFl0LnA/cAOwuVm2Gbh+IAklSepPAX+e5I4kW5tlfZ3a7lkXkqQ2\nre9zvbcAH0hyJLAb+Cl6Be3aJFuALwCvGUxESZL68qNVtS/J9wC3JPnMwjurqpIseWp7VW0HtgPM\nzs4uuY4kSf3qq2RV1V3A7BJ3ndtuHEmSVqeq9jVf9yf5MHAmzantVfXwoU5tlySpTf1+TpYkSSMr\nybOTPPfgNPATwKfx1HZJUgf6PV1QkqRRdhzw4STQG9v+sKr+LMkn8dR2SdKQWbIkSWOvqnYDpy6x\n/Mt4aru0apu23dR1hKfZc/mFXUeQ+uLpgpIkSZLUIkuWJEmSJLXIkiVJkiRJLbJkSZIkSVKLLFmS\nJEmS1CJLliRJkiS1yJIlSZIkSS2yZEmSJElSiyxZkiRJktQiS5YkSZIktciSJUmSJEktsmRJkiRJ\nUossWZIkSZLUIkuWJEmSJLXIkiVJmghJ1iW5M8mNzfyJSXYm2ZXkmiRHdp1RkjQdLFmSpEnxVuCB\nBfNXAFdW1UnAY8CWTlJJkqaOJUuSNPaSbAQuBN7XzAc4B7iuWWUHcHE36SRJ02Z91wEkSWrBe4Cf\nA57bzL8AOFBVTzbze4Hjl3twkq3AVoATTjhhgDElaTps2nZT1xGeZs/lFw7153kkS5I01pK8Ethf\nVXes9ntU1faqmq2q2ZmZmRbTSZKmkUeyJEnj7qXAq5K8Ang
m8DzgvcBRSdY3R7M2Avs6zChJmiIe\nyZIkjbWqemdVbayqTcClwEer6rXAbcAlzWqbges7iihJmjKWLEnSpHoH8PYku+i9R+uqjvNIkqaE\npwtKkiZGVd0O3N5M7wbO7DKPJGk6eSRLkiRJklpkyZIkSZKkFlmyJEmSJKlFlixJkiRJalHfJSvJ\nuiR3JrmxmT8xyc4ku5Jck+TIwcWUJEmSpPFwOEey3go8sGD+CuDKqjoJeAzY0mYwSZIkSRpHfZWs\nJBuBC4H3NfMBzgGua1bZAVw8iICSJEmSNE76PZL1HuDngG818y8ADlTVk838XuD4lrNJkiRJ0thZ\nsWQleSWwv6ruWM0PSLI1yVySufn5+dV8C0mSJEkaG/0cyXop8Koke4Cr6Z0m+F7gqCTrm3U2AvuW\nenBVba+q2aqanZmZaSGyJEmSJI2uFUtWVb2zqjZW1SbgUuCjVfVa4Dbgkma1zcD1A0spSZIkSWNi\nLZ+T9Q7g7Ul20XuP1lXtRJIkSZKk8bV+5VX+UVXdDtzeTO8Gzmw/kiRJkiSNr7UcyZIkSZIkLWLJ\nkiSNvSTPTPKJJHcnuS/Ju5rlJybZmWRXkmuSHNl1VknS5LNkSZImwRPAOVV1KnAacEGSs4ArgCur\n6iTgMWBLhxklSVPCkiVJGnvV83gze0RzK3ofO3Jds3wHcHEH8SRJU8aSJUmaCEnWJbkL2A/cAjwE\nHKiqJ5tV9gLHL/PYrUnmkszNz88PJ7AkaWJZsiRJE6Gqnqqq04CN9K5++5LDeOz2qpqtqtmZmZmB\nZZQkTQdLliRpolTVAeA24GzgqCQHP65kI7Cvs2CSpKlhyZIkjb0kM0mOaqafBZwPPECvbF3SrLYZ\nuL6bhJKkaXJYH0YsSdKI2gDsSLKO3n8gXltVNya5H7g6ybuBO4GrugwpSZoOlixJ0tirqnuA05dY\nvpve+7MkSRoaTxeUJEmSpBZZsiRJkiSpRZYsSZIkSWqRJUuSJEmSWmTJkiRJkqQWWbIkSZIkqUWW\nLEmSJElqkSVLkiRJklpkyZIkSZKkFlmyJEmSJKlFlixJkiRJapElS5IkSZJaZMmSJEmSpBZZsiRJ\nkiSpRZYsSdLYS/LCJLcluT/JfUne2iw/JsktSR5svh7ddVZJ0uSzZEmSJsGTwL+vqlOAs4CfTXIK\nsA24tapOBm5t5iVJGihLliRp7FXVw1X1qWb6a8ADwPHARcCOZrUdwMXdJJQkTRNLliRpoiTZBJwO\n7ASOq6qHm7seAY5b5jFbk8wlmZufnx9KTknS5LJkSZImRpLnAH8MvK2qvrrwvqoqoJZ6XFVtr6rZ\nqpqdmZkZQlJJ0iSzZEmSJkKSI+gVrA9U1YeaxY8m2dDcvwHY31U+SdL0sGRJksZekgBXAQ9U1a8t\nuOsGYHMzvRm4ftjZJEnTZ8WS5WVxJUlj4KXA64FzktzV3F4BXA6cn+RB4LxmXpKkgVrfxzoHL4v7\nqSTPBe5IcgvwBnqXxb08yTZ6l8V9x+CiSpK0tKr6CyDL3H3uMLNIkrTikSwviytJkiRJ/Tus92R5\nWVxJkiRJOrS+S5aXxZUkSZKklfVVsrwsriRJkiT1p5+rC3pZXEmSJEnqUz9XFzx4Wdx7k9zVLPt5\nepfBvTbJFuALwGsGE1GSJEmSxseKJcvL4kqSJElS/w7r6oKSJEmSpEOzZEmSJElSiyxZkiRJktQi\nS5YkSZIktciSJUmSJEktsmRJkiRJUossWZIkSZLUIkuWJGnsJXl/kv1JPr1g2TFJbknyYPP16C4z\nSpKmhyVLkjQJfg+4YNGybcCtVXUycGszL0nSwFmyJEljr6o+BvztosUXATua6R3AxUMNJUmaWpYs\nSdKkOq6qHm6mHwGOW27FJFuTzCWZm5+fH046SdLEsmRJkiZeVRVQh7h/e1XNVtXszMzMEJNJkiaR\nJUuSNKkeTbIBoPm6v+M
8kqQpYcmSJE2qG4DNzfRm4PoOs0iSpoglS5I09pJ8EPhr4MVJ9ibZAlwO\nnJ/kQeC8Zl6SpIFb33UASZLWqqouW+auc4caRJIkPJIlSZIkSa2yZEmSJElSiyxZkiRJktQiS5Yk\nSZIktciSJUmSJEktsmRJkiRJUossWZIkSZLUIkuWJEmSJLXIkiVJkiRJLbJkSZIkSVKLLFmSJEmS\n1CJLliRJkiS1yJIlSZIkSS2yZEmSJElSi9ZUspJckOSzSXYl2dZWKEmS2uJYJUkatlWXrCTrgN8C\nXg6cAlyW5JS2gkmStFaOVZKkLqzlSNaZwK6q2l1V3wCuBi5qJ5YkSa1wrJIkDd36NTz2eOCLC+b3\nAj+8eKUkW4GtzezjST67hp950LHAl1r4Pq3IFV0neJqR2jYjyO1zaG6f5bltDiFXtLZ9XtTC91io\nq7FqpF4vfY5TI5W5T2YevJHK62t5pIxV5mGPU2spWX2pqu3A9ja/Z5K5qppt83tOCrfNobl9Ds3t\nszy3zaGN+/Zpe6wax+1h5uEYt8zjlhfMPCzjlnnYeddyuuA+4IUL5jc2yyRJGhWOVZKkoVtLyfok\ncHKSE5McCVwK3NBOLEmSWuFYJUkaulWfLlhVTyZ5M/ARYB3w/qq6r7Vkh9bq6YcTxm1zaG6fQ3P7\nLM9tc2gjuX06HKtGcnuswMzDMW6Zxy0vmHlYxi3zUPOmqob58yRJkiRpoq3pw4glSZIkSU9nyZIk\nSZKkFo10yUpyQZLPJtmVZNsS9z8jyTXN/TuTbBp+ym70sW3enuT+JPckuTVJ2589M9JW2j4L1nt1\nkkoyNpcgXat+tk2S1zSvn/uS/OGwM3apj7+tE5LcluTO5u/rFV3k7EKS9yfZn+TTy9yfJL/ebLt7\nkpwx7IzDsJaxKck7m+WfTfIvRijzsmNGkqeS3NXchnLRkD7yviHJ/IJcb1xw3+YkDza3zcPI22fm\nKxfk/VySAwvu62Ibr/rvucNtvFLm1zZZ703yV0lOXXDfnmb5XUnmRijzy5J8ZcHv/z8vuK+vf8sM\nOe9/XJD1081r95jmvq628Qubcfngv1veusQ6w389V9VI3ui9Qfkh4HuBI4G7gVMWrfNvgN9ppi8F\nruk69whtmx8HvruZ/plp2Tb9bp9mvecCHwM+Dsx2nXtUtg1wMnAncHQz/z1d5x6x7bMd+Jlm+hRg\nT9e5h7h9fgw4A/j0Mve/ArgZCHAWsLPrzB29RpYcm5rXy93AM4ATm++zbkQyLztmAI+P4DZ+A/Cb\nSzz2GGB38/XoZvroUci8aP230LsISyfbuPmZq/p77mob95n5RxaMXS9fuA8C9gDHjuB2fhlw41pf\nU8PKu2jdnwQ+OgLbeANwRjP9XOBzS+wzhv56HuUjWWcCu6pqd1V9A7gauGjROhcBO5rp64Bzk2SI\nGbuy4rapqtuq6uvN7MfpfTbMtOjntQPwK8AVwD8MM1zH+tk2/yfwW1X1GEBV7R9yxi71s30KeF4z\n/Xzg/xtivk5V1ceAvz3EKhcBv189HweOSrJhOOmGZi1j00XA1VX1RFV9HtjVfL/OM4/YmNHvPnwp\n/wK4par+ttmH3QJcMKCcCx1u5suADw4h17LW8Pfc1TZeMXNV/dXBsYvuX8dAX9t5OWv5O1i1w8zb\n+esYoKoerqpPNdNfAx4Ajl+02tBfz6Ncso4Hvrhgfi/fucG+vU5VPQl8BXjBUNJ1q59ts9AWeu19\nWqy4fZrDxC+sqpuGGWwE9PPa+T7g+5L8ZZKPJxnK4Dki+tk+vwS8Lsle4E/p/Y+0eg533zSO1jI2\ndbV91jpmPDPJXLM/uHgQARfpN++rm9N+rkty8AOnR34bN6dingh8dMHiYW/jfiz3nMbl73zx67iA\nP09yR5KtHWVaztlJ7k5yc5IfaJaN9HZO8t30ysgfL1jc+TZO7/Ts04Gdi+4a+ut51Z+Tp
fGQ5HXA\nLPDPu84yKpJ8F/Br9E430XdaT++UwZfR+1/AjyX536rqwCEfNT0uA36vqv7vJGcD/y3JD1bVt7oO\nJq3VMmPGi6pqX5LvBT6a5N6qeqibhN/2J8AHq+qJJD9N78jhOR1n6telwHVV9dSCZaO4jcdWkh+n\nV7J+dMHiH2228fcAtyT5THPUpmufovf7fzy99/j+d3pj8Kj7SeAvq2rhUa9Ot3GS59ArfW+rqq8O\n6+cuZ5SPZO0DXrhgfmOzbMl1kqynd+rOl4eSrlv9bBuSnAf8AvCqqnpiSNlGwUrb57nADwK3J9lD\n79zcGzIdF7/o57WzF7ihqr7ZnNL0OcZjh9+GfrbPFuBagKr6a+CZwLFDSTf6+to3jbm1jE1dbZ81\njRlVta/5uhu4nd7/Eg/Sinmr6ssLMr4P+KF+Hzsgh/NzL2XRKVYdbON+LPecRvrvPMk/pfeauKiq\nvv1vwgXbeD/wYYZzqu6KquqrVfV4M/2nwBFJjmXEtzOHfh0PfRsnOYJewfpAVX1oiVWG/3pu441d\ng7jR+9/03fQOqR98w98PLFrnZ3n6m4uv7Tr3CG2b0+m9YfLkrvOO4vZZtP7tTM+FL/p57VwA7Gim\nj6V3GP0FXWcfoe1zM/CGZvr76b0nK11nH+I22sTyb+C+kKe/sfgTXeft6DWy5NgE/ABPv/DFboZz\n4YtVjxn03gj+jGb6WOBBBvzm+z7zblgw/b8DH2+mjwE+3+Q+upk+ZhS2cbPeS+hdHCALlg19Gy/4\n2Yf999zVNu4z8wn03uv4I4uWPxt47oLpvwIuGJHM/+Tg64FeKfmbZpsf1r9lhpW3uf/59N639exR\n2MbN9vp94D2HWGfor+ehvLjWsNFeQe9/0R8CfqFZ9sv0/pcNev+D/EfNH9QngO/tOvMIbZv/ATwK\n3NXcbug68yhtn0Xr3s6UlKw+Xzuhdzrl/cC9wKVdZx6x7XMK8JfNgHcX8BNdZx7itvkg8DDwTXpH\nPLcAbwLetOC181vNtrt3Uv+u1jI20TtS9BDwWeDlI5R5yTGD3tXa7m1e7/cCW0Yk768C9zW5bgNe\nsuCx/0ez7XcBPzUq27iZ/yXg8kWP62obr/rvucNtvFLm9wGPLXgdzzXLv7fZvnc3r5tfGKHMb17w\nWv44CwriUq+prvM267yB3kV8Fj6uy238o/TeD3bPgt/9K7p+PR9szpIkSZKkFozye7IkSZIkaexY\nsiRJkiSpRZYsSZIkSWqRJUuSJEmSWmTJkiRJkqQWWbIkSZIkqUWWLEmSJElqkSVLkiRJklpkyZIk\nSZKkFlmyJEmSJKlFlixJkiRJapElS5IkSZJaZMmSWpZkT5Lzus4hSdJSHKekwbNkSSMqyclJ/iHJ\nH3SdRZKkg5Lc3oxPjze3z3adSRo1lixpdP0W8MmuQ0iStIQ3V9VzmtuLuw4jjRpLljRASb4/yeeT\nXHaYj7sUOADcOphkkiStfpySdGiWLGlAkpwBfAR4S1V9MMmNSQ4sc7txweOeB/wy8PauskuSJt9q\nx6nGryb5UpK/TPKy4aeXRtv6rgNIE+qfAVuA11XV7QBV9co+H/srwFVVtTfJgOJJkqbcWsapdwD3\nA98ALgX+JMlpVfXQIIJK48gjWdJgvAn4q4MDV7+SnAacB1w5iFCSJDVWNU4BVNXOqvpaVT1RVTuA\nvwRe0XZAaZxZsqTBeBNwQpJvl6UkNy+4EtPi283Nai8DNgF/k+QR4D8Ar07yqWE/AUnSRFvtOLWU\nAjz1QlogVdV1BmmiJNkDvBGYo3fhiluqalufj/1u4HkLFv0HeqXrZ6pqvt2kkqRptMZx6ijgh4H/\nCTwJ/EtgO3B6VX1uIIGlMeR7sqQBqaoDSc4Hbkvyzar6T3085uvA1w/OJ3kc+AcLliSpbasZp4Aj\ngHcDLwGeAj4DXGzBkp7OI1mSJEmS1CLfkyVJkiRJL
bJkSZIkSVKLLFmSJEmS1CJLliRJkiS1aKhX\nFzz22GNr06ZNw/yRkqQxcscdd3ypqma6zOBYJUlaTr/jVF8lq/k8ha/Ru1Tnk1U1m+QY4Bp6n+Gz\nB3hNVT12qO+zadMm5ubm+vmRkqQplOQLXWdwrJIkLaffcepwThf88ao6rapmm/ltwK1VdTK9D7Lr\n60PsJEmSJGmSreU9WRcBO5rpHcDFa48jSZIkSeOt35JVwJ8nuSPJ1mbZcVX1cDP9CHBc6+kkSZIk\nacz0e+GLH62qfUm+B7glyWcW3llVlaSWemBTyrYCnHDCCWsKK0mSJEmjrq8jWVW1r/m6H/gwcCbw\naJINAM3X/cs8dntVzVbV7MxMpxeMkiRJkqSBW7FkJXl2kucenAZ+Avg0cAOwuVltM3D9oEJKkiRJ\n0rjo53TB44APJzm4/h9W1Z8l+SRwbZItwBeA1wwu5tNt2nbTsH5UX/ZcfmHXESRJkqSRMe3/Xl+x\nZFXVbuDUJZZ/GTh3EKEkSZIkaVyt5RLukiRJkqRFLFmSJEmS1CJLliRJkiS1yJIlSZIkSS2yZEmS\nJElSiyxZkiRJktQiS5YkSZIktciSJUmSJEktsmRJkiRJUossWZKkiZBkXZI7k9zYzJ+YZGeSXUmu\nSXJk1xklSdPBkiVJmhRvBR5YMH8FcGVVnQQ8BmzpJJUkaepYsiRJYy/JRuBC4H3NfIBzgOuaVXYA\nF3eTTpI0bSxZkqRJ8B7g54BvNfMvAA5U1ZPN/F7g+C6CSZKmjyVLkjTWkrwS2F9Vd6zhe2xNMpdk\nbn5+vsV0kqRpZMmSJI27lwKvSrIHuJreaYLvBY5Ksr5ZZyOwb7lvUFXbq2q2qmZnZmYGnVeSNOEs\nWZKksVZV76yqjVW1CbgU+GhVvRa4DbikWW0zcH1HESVJU8aSJUmaVO8A3p5kF733aF3VcR5J0pRY\nv/IqkiSNh6q6Hbi9md4NnNllHknSdLJkSfq2Tdtu6jrCt+25/MKuI0iSJK2KpwtKkiRJUossWZIk\nSZLUIkuWJEmSJLXIkiVJkiRJLbJkSZIkSVKL+i5ZSdYluTPJjc38iUl2JtmV5JokRw4upiRJkiSN\nh8M5kvVW4IEF81cAV1bVScBjwJY2g0mSJEnSOOqrZCXZCFwIvK+ZD3AOcF2zyg7g4kEElCRJkqRx\n0u+RrPcAPwd8q5l/AXCgqp5s5vcCxy/1wCRbk8wlmZufn19TWEmSJEkadSuWrCSvBPZX1R2r+QFV\ntb2qZqtqdmZmZjXfQpIkSZLGxvo+1nkp8KokrwCeCTwPeC9wVJL1zdGsjcC+wcWUJEmSpPGw4pGs\nqnpnVW2sqk3ApcBHq+q1wG3AJc1qm4HrB5ZSkiRJksbEWj4n6x3A25PsovceravaiSRJkiRJ46uf\n0wW/rapuB25vpncDZ7YfSZIkSZLG11qOZEmSJEmSFjmsI1mSpNGwadtNXUd4mj2XX9h1BEkD4L5G\nWh2PZEmSJElSiyxZkiRJktQiS5YkSZIktciSJUmSJEktsmRJkiRJUossWZIkSZLUIkuWJEmSJLXI\nkiVJkiRJLbJkSZIkSVKLLFlYSJRKAAAgAElEQVSSJEmS1KL1XQeQJEnTZdO2m7qO8DR7Lr+w6wiS\nJoxHsiRJYy/JM5N8IsndSe5L8q5m+YlJdibZleSaJEd2nVWSNPksWZKkSfAEcE5VnQqcBlyQ5Czg\nCuDKqjoJeAzY0mFGSdKUsGRJksZe9TzezB7R3Ao4B7iuWb4DuLiDeJKkKWPJkiRNhCTrktwF7Adu\nAR4CDlTVk80qe4Hjl3ns1iRzSebm5+eHE1iSNLEsWZKkiVBVT1XVacBG4EzgJYfx2O1VNVtVszMz\nMwPLKEmaDpYsSdJEqaoDwG3A2cBRSQ5eSXcjsK+zYJKkqWHJkiSNvSQzSY5qpp8FnA88QK9sXdKs\nthm4vpuEkqRp4
udkSZImwQZgR5J19P4D8dqqujHJ/cDVSd4N3Alc1WVISdJ0sGRJksZeVd0DnL7E\n8t303p8lSdLQeLqgJEmSJLXIkiVJkiRJLVqxZCV5ZpJPJLk7yX1J3tUsPzHJziS7klyT5MjBx5Uk\nSZKk0dbPkawngHOq6lTgNOCCJGcBVwBXVtVJwGPAlsHFlCRJkqTxsGLJqp7Hm9kjmlsB5wDXNct3\nABcPJKEkSZIkjZG+3pOVZF2Su4D9wC3AQ8CBqnqyWWUvcPwyj92aZC7J3Pz8fBuZJUmSJGlk9VWy\nquqpqjoN2EjvUrgv6fcHVNX2qpqtqtmZmZlVxpQkSZKk8XBYVxesqgPAbcDZwFFJDn7O1kZgX8vZ\nJEmSJGns9HN1wZkkRzXTzwLOBx6gV7YuaVbbDFw/qJCSJEmSNC7Wr7wKG4AdSdbRK2XXVtWNSe4H\nrk7ybuBO4KoB5pQkSZJGyqZtN3Ud4Wn2XH5h1xHUWLFkVdU9wOlLLN9N7/1ZkiRJkqTGYb0nS5Ik\nSZJ0aP2cLiitmofRJUmSNG08kiVJkiRJLbJkSZIkSVKLLFmSJEmS1CJLliRJkiS1yJIlSZIkSS2y\nZEmSJElSiyxZkiRJktQiPydLkqSW+RmBkjTdPJIlSZIkSS2yZEmSJElSiyxZkiRJktQiS5YkSZIk\ntciSJUmSJEktsmRJkiRJUossWZKksZfkhUluS3J/kvuSvLVZfkySW5I82Hw9uuuskqTJZ8mSJE2C\nJ4F/X1WnAGcBP5vkFGAbcGtVnQzc2sxLkjRQlixJ0tirqoer6lPN9NeAB4DjgYuAHc1qO4CLu0ko\nSZomlixJ0kRJsgk4HdgJHFdVDzd3PQIct8xjtiaZSzI3Pz8/lJySpMllyZIkTYwkzwH+GHhbVX11\n4X1VVUAt9biq2l5Vs1U1OzMzM4SkkqRJZsmSJE2EJEfQK1gfqKoPNYsfTbKhuX8DsL+rfJKk6WHJ\nkiSNvSQBrgIeqKpfW3DXDcDmZnozcP2ws0mSps/6rgNIktSClwKvB+5Nclez7OeBy4Frk2wBvgC8\npqN8kqQpsmLJSvJC4PfpvVm4gO1V9d4kxwDXAJuAPcBrquqxwUWVJGlpVfUXQJa5+9xhZpEkqZ/T\nBf3sEUmSJEnq04oly88ekSRJkqT+HdaFL/zsEUmSJEk6tL5Llp89IkmSJEkr66tk+dkjkiRJktSf\nFUuWnz0iSZIkSf3r53Oy/OwRSZIkSerTiiXLzx6RJEmSpP4d1tUFJUmSJEmHZsmSJEmSpBZZsiRJ\nkiSpRZYsSZIkSWqRJUuSJEmSWmTJkiRJkqQWWbIkSZIkqUWWLEmSJElqkSVLkiRJklpkyZIkSZKk\nFlmyJEmSJKlFlixJkiRJapElS5IkSZJaZMmSJEmSpBZZsiRJkiSpRZYsSZIkSWqRJUuSJEmSWmTJ\nkiRJkqQWWbIkSZIkqUWWLEmSJElqkSVLkjT2krw/yf4kn16w7JgktyR5sPl6dJcZJUnTw5IlSZoE\nvwdcsGjZNuDWqjoZuLWZlyRp4CxZkqSxV1UfA/520eKLgB3N9A7g4qGGkiRNrRVLlqdgSJLG1HFV\n9XAz/Qhw3HIrJtmaZC7J3Pz8/HDSSZImVj9Hsn4PT8GQJI2xqiqgDnH/9qqararZmZmZISaTJE2i\nFUuWp2BIksbUo0k2ADRf93ecR5I0JVb7nqy+T8GQJKkjNwCbm+nNwPUdZpEkTZE1X/hipVMwPM9d\nkjRoST4I/DXw4iR7k2wBLgfOT/IgcF4zL0nSwK1f5eMeTbKhqh5e6RSMqtoObAeYnZ1dtoxJkrRa\nVXXZMnedO9QgkiSx+iNZnoIhSZIkSUvo5xLunoIhSZIkSX1a8XRBT8GQJEmSpP6t+cIXkiRJkqR/\nZMmSJEmSpBZZsiRJkiSpRZYsSZIkSWqRJUuSJEmSWmTJkiRJkqQWWbIkSZIkqUW
WLEmSJElqkSVL\nkiRJklpkyZIkSZKkFlmyJEmSJKlFlixJkiRJapElS5IkSZJaZMmSJEmSpBZZsiRJkiSpRZYsSZIk\nSWqRJUuSJEmSWmTJkiRJkqQWWbIkSZIkqUWWLEmSJElqkSVLkiRJklpkyZIkSZKkFlmyJEmSJKlF\nlixJkiRJatGaSlaSC5J8NsmuJNvaCiVJUlscqyRJw7bqkpVkHfBbwMuBU4DLkpzSVjBJktbKsUqS\n1IW1HMk6E9hVVbur6hvA1cBF7cSSJKkVjlWSpKFLVa3ugcklwAVV9cZm/vXAD1fVmxettxXY2sy+\nGPjs6uN+27HAl1r4PuNgmp4r+Hwn2TQ9V/D5rtaLqmqmhe8DdDpWjePv38zDMW6Zxy0vmHlYxi3z\nUMep9S38oEOqqu3A9ja/Z5K5qppt83uOqml6ruDznWTT9FzB5ztu2h6rxnF7mHk4xi3zuOUFMw/L\nuGUedt61nC64D3jhgvmNzTJJkkaFY5UkaejWUrI+CZyc5MQkRwKXAje0E0uSpFY4VkmShm7VpwtW\n1ZNJ3gx8BFgHvL+q7mst2aG1evrhiJum5wo+30k2Tc8VfL4jocOxaiS3xwrMPBzjlnnc8oKZh2Xc\nMg8176ovfCFJkiRJ+k5r+jBiSZIkSdLTWbIkSZIkqUUjXbKSXJDks0l2Jdm2xP3PSHJNc//OJJuG\nn7IdfTzXNySZT3JXc3tjFznbkOT9SfYn+fQy9yfJrzfb4p4kZww7Y5v6eL4vS/KVBb/b/zzsjG1J\n8sIktyW5P8l9Sd66xDoT8/vt8/lO0u/3mUk+keTu5vm+a4l1Jma/3I9xHKfGbbwZxzFj3Pb747jv\nHsf977jtQ/vMO1L7i4OSrEtyZ5Ibl7hvONu4qkbyRu8Nyg8B3wscCdwNnLJonX8D/E4zfSlwTde5\nB/hc3wD8ZtdZW3q+PwacAXx6mftfAdwMBDgL2Nl15gE/35cBN3ads6XnugE4o5l+LvC5JV7LE/P7\n7fP5TtLvN8BzmukjgJ3AWYvWmYj9cp/bY+zGqXEcb8ZxzBi3/f447rvHcf87bvvQPvOO1P5iQa63\nA3+41O9/WNt4lI9knQnsqqrdVfUN4GrgokXrXATsaKavA85NkiFmbEs/z3ViVNXHgL89xCoXAb9f\nPR8HjkqyYTjp2tfH850YVfVwVX2qmf4a8ABw/KLVJub32+fznRjN7+zxZvaI5rb46kmTsl/uxziO\nU2M33ozjmDFu+/1x3HeP4/533PahfeYdOUk2AhcC71tmlaFs41EuWccDX1wwv5fv/OP59jpV9STw\nFeAFQ0nXrn6eK8Crm0P01yV54RL3T4p+t8ckObs5HH9zkh/oOkwbmsPvp9P7n6+FJvL3e4jnCxP0\n+21OwbgL2A/cUlXL/n7HfL/cj3EcpyZxvBnXfcpI7hfGcd89TvvfcduH9pEXRm9/8R7g54BvLXP/\nULbxKJcsPd2fAJuq6p8Ct/CPDVzj71PAi6rqVOA3gP/ecZ41S/Ic4I+Bt1XVV7vOM2grPN+J+v1W\n1VNVdRqwETgzyQ92nUmtc7wZvJHcL4zjvnvc9r/jtg/tI+9I7S+SvBLYX1V3dJkDRrtk7QMWtuGN\nzbIl10myHng+8OWhpGvXis+1qr5cVU80s+8DfmhI2brQz+9+YlTVVw8ejq+qPwWOSHJsx7FWLckR\n9Aa8D1TVh5ZYZaJ+vys930n7/R5UVQeA24ALFt01KfvlfozjODWJ483Y7VNGcb8wjvvucd7/jts+\ndLm8I7i/eCnwqiR76J0OfU6SP1i0zlC28SiXrE8CJyc5McmR9N6YdsOidW4ANjfTlwAfraqRP1d0\nCSs+10XnPb+K3rnHk+oG4F83VzI6C/hKVT3cdahBSfJPDp4LnORMen+Xne9QV6N5HlcBD1TVry2z\n2sT8fvt5vhP2+51JclQz/SzgfOAzi1ablP1
yP8ZxnJrE8Wbs9imjtl8Yx333OO5/x20f2k/eUdtf\nVNU7q2pjVW2it3/7aFW9btFqQ9nG69v+hm2pqieTvBn4CL2rIb2/qu5L8svAXFXdQO+P678l2UXv\nDaaXdpd49fp8rv82yauAJ+k91zd0FniNknyQ3hV/jk2yF/hFem+mpKp+B/hTelcx2gV8HfipbpK2\no4/newnwM0meBP4euHSM/1H6UuD1wL3NOdwAPw+cABP5++3n+U7S73cDsCPJOnr/WLm2qm6cxP1y\nP8ZxnBrH8WYcx4wx3O+P4757HPe/47YP7SfvSO0vltPFNs74jvWSJEmSNHpG+XRBSZIkSRo7lixJ\nkiRJapElS5IkSZJaZMmSJEmSpBZZsiRJkiSpRZYsSZIkSWqRJUuSJEmSWmTJkiRJkqQWWbIkSZIk\nqUWWLEmSJElqkSVLkiRJklpkyZIkSZKkFlmypJYl2ZPkvK5zSJK0FMcpafAsWdIISnJpkgeS/F2S\nh5L8s64zSZIEkOTxRbenkvxG17mkUbK+6wCSni7J+cAVwL8EPgFs6DaRJEn/qKqec3A6yXOAR4A/\n6i6RNHo8kiUNUJLvT/L5JJcdxsPeBfxyVX28qr5VVfuqat+gMkqSptcqx6mFXg3sB/7fFmNJY8+S\nJQ1IkjOAjwBvqaoPJrkxyYFlbjc2j1kHzAIzSXYl2ZvkN5M8q8vnIkmaPKsZp5awGfj9qqrhJZdG\nX/ybkNqVZA+wA9gCvK6qbj+Mx/4vwD7gDuAngW8C1wO3V9UvtB5WkjR11jJOLfo+LwJ2AydV1edb\nCyhNAI9kSYPxJuCvVjFw/X3z9Teq6uGq+hLwa8Ar2gwnSZp6qx2nFno98BcWLOk7WbKkwXgTcEKS\nKw8uSHLzEldkOni7GaCqHgP2AgsPMXu4WZLUtlWNU4v8a3pHxCQt4tUFpcH4GnABcGuSy6tqW1W9\nvM/H/i7wliR/Ru90wX8HLHcuvCRJq7GWcYokPwIcj1cVlJZkyZIGpKoONJdjvy3JN6vqP/X50F8B\njgU+B/wDcC3wXwYUU5I0pdYwTkHvghcfqqqvDSieNNa88IUkSZIktcj3ZEmSJElSiyxZkiRJktQi\nS5YkSZIktciSJUmSJEktGurVBY899tjatGnTMH+kJGmM3HHHHV+qqpkuMzhWSZKW0+84NdSStWnT\nJubm5ob5IyVJYyTJF7rO4FglSVpOv+OUpwtKkiRJUossWZIkSZLUIkuWJEmSJLXIkiVJkiRJLbJk\nSZIkSVKLLFmSJEmS1KKhXsJdg7dp201dR3iaPZdf2HUESZK0Sv67Qlodj2RJkiRJUossWZIkSZLU\nIkuWJEmSJLXIkiVJkiRJLbJkSZIkSVKLLFmSpLGX5JlJPpHk7iT3JXlXs/z3knw+yV3N7bSus0qS\nJp+XcJckTYIngHOq6vEkRwB/keTm5r7/WFXXdZhNkjRlLFmSpLFXVQU83swe0dyqu0SSpGnm6YKS\npImQZF2Su4D9wC1VtbO5678kuSfJlUmescxjtyaZSzI3Pz8/tMySpMnkkSxNFT+5XppcVfUUcFqS\no4APJ/lB4J3AI8CRwHbgHcAvL/HY7c39zM7OegRMkrQmHsmSJE2UqjoA3AZcUFUPV88TwO8CZ3ab\nTpI0DVYsWYe4YtOJSXYm2ZXkmiRHDj6uJEnfKclMcwSLJM8Czgc+k2RDsyzAxcCnu0spSZoW/RzJ\nOnjFplOB04ALkpwFXAFcWVUnAY8BWwYXU5KkQ9oA3JbkHuCT9N6TdSPwgST3AvcCxwLv7jCjJGlK\nrPierENcsekc4F81y3cAvwT8dvsRJUk6tKq6Bzh9ieXndBBHkjTl+npP1uIrNgEPAQeq6slmlb3A\n8cs81is2SZIkSZoafZWsqnqqqk4DNtJ70/BL+v0BVbW9qmaranZmZmaVMSVJkiRpPBzW1QUXXLHp\nbOCoJAd
PN9wI7Gs5myRJkiSNnX6uLrjUFZseoFe2LmlW2wxcP6iQkiRJkjQu+vkw4g3AjiTr6JWy\na6vqxiT3A1cneTdwJ3DVAHNKkiRJ0ljo5+qCy12xaTd+qKMkSZIkPc1hvSdLkiRJknRolixJkiRJ\napElS5IkSZJaZMmSJEmSpBZZsiRJkiSpRZYsSZIkSWqRJUuSNPaSPDPJJ5LcneS+JO9qlp+YZGeS\nXUmuSXJk11klSZPPkiVJmgRPAOdU1anAacAFSc4CrgCurKqTgMeALR1mlCRNCUuWJGnsVc/jzewR\nza2Ac4DrmuU7gIs7iCdJmjKWLEnSREiyLsldwH7gFuAh4EBVPdmsshc4vqt8kqTpYcmSJE2Eqnqq\nqk4DNgJnAi/p97FJtiaZSzI3Pz8/sIySpOlgyZIkTZSqOgDcBpwNHJVkfXPXRmDfMo/ZXlWzVTU7\nMzMzpKSSpEllyZIkjb0kM0mOaqafBZwPPECvbF3SrLYZuL6bhJKkabJ+5VUkSaNm07abuo7wNHsu\nv7DrCBuAHUnW0fsPxGur6sYk9wNXJ3k3cCdwVZchJUnTwZIlSRp7VXUPcPoSy3fTe3+WJElD8/+3\nd78hltX3Hcffn7qWtmobJdNl0d1uEBGk0DUMkrIhmNoE84eqT0KEWinSyQMt2gpl65PYZ6Yk2kKL\ndJO1bqhJCFGJNJJksYIVUutot9nVNVVkQ3ZZ3QlpUftEVr99MEc66szO7Nxz7zn3nvcLhnv+3Xu+\n98x4fn72/M7vrNtdMMn2JI8neb55wOOtzfI7kxxPcrD5+fT4y5UkSZKkftvIlaxTwO1V9WyS84Bn\nkhxo1t1TVV8eX3mSJEmSNF3WDVlVdQI40Uy/nuQIPmdEkiRJklZ1RvdkJdnJcp/3p4DdwC1J/ghY\nZPlq13+v8p4FYAFgx44dI5YrSZLULgeSkdS2DQ/hnuRc4EHgtqp6DbgXuBjYxfKVrq+s9j6fPSJJ\nkiRpSDYUspKczXLAeqCqHgKoqler6q2qehv4Ko7eJEmSJEkbGl0wLD9X5EhV3b1i+bYVm10HHG6/\nPEmSJEmaLhu5J2s3cANwKMnBZtkdwPVJdgEFHAW+MJYKJUmSJGmKbGR0wSeBrLLq0fbLkdSlPt38\n7Y3fkiRpWm144AtJkiRJ0voMWZIkSZLUIkOWJEmSJLXIkCVJkiRJLdrI6IKSNHh9GhRE75dkO/B1\nYCvLo97uraq/TXIn8CfAUrPpHVXlwE2SpLEyZEmSZsEp4PaqejbJecAzSQ406+6pqi93WJskaWAM\nWZKkqVdVJ4ATzfTrSY4AF3ZblSRpqLwnS5I0U5LsBC4HnmoW3ZLkx0nuS3L+Gu9ZSLKYZHFpaWm1\nTSRJ2jBDliRpZiQ5F3gQuK2qXgPuBS4GdrF8pesrq72vqvZW1XxVzc/NzU2sXknSbDJkSZJmQpKz\nWQ5YD1TVQwBV9WpVvVVVbwNfBa7oskZJ0jAYsiRJUy9JgH3Akaq6e8XybSs2uw44POnaJEnD48AX\nkqRZsBu4ATiU5GCz7A7g+iS7WB7W/SjwhW7KkyQNiSFLkjT1qupJIKus8plYkqSJs7ugJEmSJLVo\nKq9k7dzzva5LeJejd32m6xIkSZIk9cS6V7KSbE/yeJLnkzyX5NZm+QVJDiR5sXld9dkjkiRJkjQk\nG+kueAq4vaouAz4C3JzkMmAP8FhVXQI81sxLkiRJ0qCtG7Kq6kRVPdtMvw4cAS4ErgH2N5vtB64d\nV5GSJEmSNC3OaOCLJDuBy4GngK1VdaJZ9QqwtdXKJEmSJGkKbThkJTkXeBC4rapeW7muqorlZ5Cs\n9r6FJItJFpeWlkYqVpIkSZL6bkOjCyY5m+WA9UBVPdQsfjXJtqo6kWQbcHK191bVXmAvwPz8/KpB\nTJIkSdJo+jQC99BH397I6IIB9gFHquruFaseAW5spm8Evtt+eZIkSZI0X
TZyJWs3cANwKMnBZtkd\nwF3At5PcBPwU+Nx4SpQkSZKk6bFuyKqqJ4GssfqqdsuRJEmSpOl2RqMLSpIkSZJOz5AlSZIkSS0y\nZEmSpl6S7UkeT/J8kueS3NosvyDJgSQvNq/nd12rJGn2GbIkSbPgFHB7VV0GfAS4OcllwB7gsaq6\nBHismZckaawMWZKkqVdVJ6rq2Wb6deAIcCFwDbC/2Ww/cG03FUqShmRDDyOWJGlaJNkJXA48BWyt\nqhPNqleArWu8ZwFYANixY8f4i5S0KX162C74wF2tzStZkqSZkeRc4EHgtqp6beW6qiqgVntfVe2t\nqvmqmp+bm5tApZKkWWbIkiTNhCRnsxywHqiqh5rFrybZ1qzfBpzsqj5J0nAYsiRJUy9JgH3Akaq6\ne8WqR4Abm+kbge9OujZJ0vB4T5YkaRbsBm4ADiU52Cy7A7gL+HaSm4CfAp/rqL5OeR+LpEkb+nnH\nkCVJmnpV9SSQNVZfNclaJEmyu6AkSZIktciQJUmSJEktMmRJkiRJUosMWZIkSZLUIkOWJEmSJLVo\n3ZCV5L4kJ5McXrHsziTHkxxsfj493jIlSZIkaTps5ErW/cDVqyy/p6p2NT+PtluWJEmSJE2ndUNW\nVT0B/GICtUiSJEnS1Bvlnqxbkvy46U54/lobJVlIsphkcWlpaYTdSZIkSVL/bTZk3QtcDOwCTgBf\nWWvDqtpbVfNVNT83N7fJ3UmSJEnSdNhUyKqqV6vqrap6G/gqcEW7ZUmSJEnSdNpUyEqybcXsdcDh\ntbaVJEmSpCHZyBDu3wR+BFya5FiSm4C/TnIoyY+BjwN/NuY6JUlak48bkST1yZb1Nqiq61dZvG8M\ntUiStFn3A38HfP09y++pqi9PvhxJ0pCNMrqgJEm94ONGJEl9YsiSJM2yDT1uRJKkNhmyJEmzasOP\nG/GZjpKkNhmyJEkz6UweN+IzHSVJbTJkSZJmko8bkSR1Zd3RBSVJ6rvmcSNXAh9Mcgz4InBlkl1A\nAUeBL3RWoCRpUAxZkqSp5+NGJEl9YndBSZIkSWqRIUuSJEmSWmTIkiRJkqQWeU+WJEkt27nne12X\nIEnqkFeyJEmSJKlFhixJkiRJapEhS5IkSZJaZMiSJEmSpBatG7KS3JfkZJLDK5ZdkORAkheb1/PH\nW6YkSZIkTYeNXMm6H7j6Pcv2AI9V1SXAY828JEmSJA3euiGrqp4AfvGexdcA+5vp/cC1LdclSZIk\nSVNps/dkba2qE830K8DWtTZMspBkMcni0tLSJncnSZIkSdNh5IEvqqqAOs36vVU1X1Xzc3Nzo+5O\nkiRJknptsyHr1STbAJrXk+2VJEnSmXGQJklSn2w2ZD0C3NhM3wh8t51yJEnalPtxkCZJUk9sZAj3\nbwI/Ai5NcizJTcBdwCeSvAj8fjMvSVInHKRJktQnW9bboKquX2PVVS3XIklSm85okCZgAWDHjh0T\nKE2SNMtGHvhCkqS+c5AmSdIkGbIkSbPKQZokSZ0wZEmSZpWDNEmSOmHIkiRNPQdpkiT1yboDX0iS\n1HcO0iRJ6hOvZEmSJElSiwxZkiRJktQiQ5YkSZIktciQJUmSJEktMmRJkiRJUosMWZIkSZLUIkOW\nJEmSJLXIkCVJkiRJLTJkSZIkSVKLDFmSJEmS1KIto7w5yVHgdeAt4FRVzbdRlCRJkiRNq5FCVuPj\nVfXzFj5HkiRJkqae3QUlSZIkqUWjXskq4IdJCviHqtr73g2SLAALADt27Bhxd5IknRm7tkuSJm3U\nkPXRqjqe5DeBA0leqKonVm7QBK+9APPz8zXi/iRJ2gy7tkuSJmak7oJVdbx5PQk8DFzRRlGSJEmS\nNK02HbKSnJPkvHemgU8Ch9sqTJKklrzTtf2Zpgv7+yRZSLKYZHFpaWnC5UmSZs0o3QW3Ag8needz\nvlFV32+lKkmS2mPXdknSRG06ZFXVy
8DvtFiLJEmtW9m1Pck7XdufOP27JEnaPIdwlyTNLLu2S5K6\n0MbDiCVJ6iu7tkuSJs6QJUmaWXZtlyR1we6CkiRJktQiQ5YkSZIktciQJUmSJEktMmRJkiRJUosM\nWZIkSZLUIkOWJEmSJLXIkCVJkiRJLTJkSZIkSVKLDFmSJEmS1CJDliRJkiS1yJAlSZIkSS0yZEmS\nJElSi0YKWUmuTvKTJC8l2dNWUZIktcW2SpI0aZsOWUnOAv4e+BRwGXB9ksvaKkySpFHZVkmSujDK\nlawrgJeq6uWqehP4FnBNO2VJktQK2ypJ0sSNErIuBH62Yv5Ys0ySpL6wrZIkTdyWce8gyQKw0My+\nkeQnLXzsB4Gft/A5rciXuq7gXTw2p+fxOb3eHB+PzXTJl1o7Pr/VwmecsTG0VUP7ezmj79vD/77P\n1Fh/vz07Pv4tn0bPflebMZjf76TbqVFC1nFg+4r5i5pl71JVe4G9I+znfZIsVtV8m585Kzw2p+fx\nOT2Pz9o8NqfX4+PTSVvV4+MxFn7f2TWk7wp+31k26e86SnfBp4FLknwoyS8DnwceaacsSZJaYVsl\nSZq4TV/JqqpTSW4BfgCcBdxXVc+1VpkkSSOyrZIkdWGke7Kq6lHg0ZZqOROtdj+cMR6b0/P4nJ7H\nZ20em9Pr7fHpqK3q7fEYE7/v7BrSdwW/7yyb6HdNVU1yf5IkSZI000a5J0uSJEmS9B5TFbKSXJ3k\nJ0leSrKn63r6JMl9SU4mOdx1LX2UZHuSx5M8n+S5JLd2XVNfJPmVJP+e5D+bY/NXXdfUN0nOSvIf\nSf6561r6JsnRJIeSHD7z5vEAAANdSURBVEyy2HU9fTCktmpIbc/Q2pEhtg1DOtcP7dyd5ANJvpPk\nhSRHkvzu2Pc5Ld0Fk5wF/BfwCZYfJvk0cH1VPd9pYT2R5GPAG8DXq+q3u66nb5JsA7ZV1bNJzgOe\nAa717weSBDinqt5IcjbwJHBrVf1bx6X1RpI/B+aBX6+qz3ZdT58kOQrMV9UgnrOynqG1VUNqe4bW\njgyxbRjSuX5o5+4k+4F/raqvNSPN/lpV/c849zlNV7KuAF6qqper6k3gW8A1HdfUG1X1BPCLruvo\nq6o6UVXPNtOvA0eAC7utqh9q2RvN7NnNz3T868sEJLkI+Azwta5r0VQYVFs1pLZnaO3I0NoGz/Wz\nK8lvAB8D9gFU1ZvjDlgwXSHrQuBnK+aPMcMnN41Pkp3A5cBT3VbSH00XiYPASeBAVXls/t/fAH8B\nvN11IT1VwA+TPJNkoetiesC2agCG0o4MrG0Y2rl+SOfuDwFLwD823UG/luScce90mkKWNLIk5wIP\nArdV1Wtd19MXVfVWVe0CLgKuSDLT3X42KslngZNV9UzXtfTYR6vqw8CngJub7mPSzBpSOzKUtmGg\n5/ohnbu3AB8G7q2qy4H/BcZ+v+w0hazjwPYV8xc1y6QNafqUPwg8UFUPdV1PHzWXzx8Hru66lp7Y\nDfxB03f9W8DvJfmnbkvql6o63ryeBB5mubvckNlWzbChtiMDaBsGd64f2Ln7GHBsxZXY77AcusZq\nmkLW08AlST7U3LD2eeCRjmvSlGhu4N0HHKmqu7uup0+SzCX5QDP9qyzfsP9Ct1X1Q1X9ZVVdVFU7\nWT7n/EtV/WHHZfVGknOaAQBoul58Epj5UebWYVs1o4bWjgypbRjauX5o5+6qegX4WZJLm0VXAWMf\nsGbLuHfQlqo6leQW4AfAWcB9VfVcx2X1RpJvAlcCH0xyDPhiVe3rtqpe2Q3cABxq+pcD3FFVj3ZY\nU19sA/Y3o6L9EvDtqpr54WvViq3Aw8v/78kW4BtV9f1uS+rW0NqqgbU9Q2tHbBtm1xDP3X8KPND8\n49fLwB+Pe4dTM4S7JEmSJE2DaeouKEmSJEm9Z8iSJEmSpBYZsiRJkiSpRYYsSZIkSWqRIUuSJEmS\nW
mTIkiRJkqQWGbIkSZIkqUWGLEmSJElq0f8BoVn1N0RoQKUAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 1200x900 with 6 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Histogram of cluster labels (cluster sizes) for each k from 2 to 7\n", "fig = plt.figure(figsize = (12, 9))\n", "for k in range(2, 8):\n", " km = cluster.KMeans(n_clusters = k, init = 'k-means++', n_init = 5)\n", " km.fit(dpro)\n", " ax = fig.add_subplot(3, 2, k - 1)\n", " ax.hist(km.labels_)\n", " ax.set_title('k={}'.format(k))\n", "\n", "fig.tight_layout()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>We can see that the clusters are fairly well balanced for all values of $k$. Each choice of $k$ has a clear largest cluster, but the differences in size aren't too extreme.<br><br>\n", "The next thing we might do is characterize the clusters by their centroids. A centroid essentially describes the average student within its cluster, which we can use to understand and then label the clusters. For comparison's sake, we'll also show this for clusters derived using $U$ or $U\\Sigma$. When doing this, though, we have to remember to report the cluster means in the original $X$ space so that we can interpret them. To project the $U\\Sigma$ centroids back into the $X$ space, we just right-multiply the centroid by $V^T$ from the SVD.<br><br>\n", "We also subtract the mean of $X$ from each centroid, because we're more interested in how each cluster differs from the average student profile.\n", "</p>" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/briand/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. 
Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.\n", " warnings.warn(message, mplDeprecation, stacklevel=1)\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtMAAAE/CAYAAACEmk9VAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xm4ZHV95/H3h2YVG5AtIItNgESB\nsKUVRUa6tYkiEBOCCCpIjIP4iGDU0XFwROKCkUQEiWEYFFAUFQ1DlCWgoXEbxQYRWcwEJIZGkUWg\nAQGb7u/8Uee2l6bv7erDre32+/U89+k6p06d860fRdWnfvU7v5OqQpIkSdKqW2PQBUiSJEmjyjAt\nSZIktWSYliRJkloyTEuSJEktGaYlSZKklgzTkiRJUkuGaUlaDSWZlaSSrDmdjiVJ/WaYlrRaSDI/\nyf1J1unhMV6V5Poki5Lcm+Rfk2zXq+MNkyR/kOTC5nk/mOSGJO9IMmMKjzE/yZuman+SNBUM05Km\nvSSzgP8CFPCnPTrGDsBngXcCGwLbAf8ALJnCYyTJ0L1vJ9ke+AFwB/BHVbUh8GpgNjBzkLWNN5XB\nXpLGDN2bsiT1wJHA94FzgTeMrUyyV5K7xoesJH+e5Ibm9npJzmt6tG9J8u4kCyc4xu7A7VX1zep4\nqKq+WlX/2exrRpL/keS2JA8luTbJNs19eyf5YdOj+8Mke4+rZ36SDyf5LvAb4PeTbJjk00l+meTO\nJB8aew5JdkhydbOve5N8aSVt88Ykv2j29a5mH1sk+U2STcbVsWeSe5KstYJ9nAR8r6reUVW/BKiq\nf6uq11bVA8tvnOQ/kswbt/yBJOc3t9dNcn6S+5I80LTH7yX5MJ0vRGckeTjJGc32z01yZZJfJ/m3\nJIeO2++5Sf4xyaVJHgHmrqQtJGmVGaYlrQ6OBD7f/L08ye8BVNUPgEeAl47b9rXAF5rbJwKzgN8H\n9gNeP8kxrgOem+TUJHOTPHO5+98BHA68EtgAeCPwmyQbA5cApwObAB8HLhkfZIEjgKPp9PL+nM6X\ngieAHYA9gD8BxoY/fBC4AngWsDXwyUlqhk7A3LHZx3uSzKuqu4D5wKHjtjsC+GJVLV7BPuYBX1nJ\ncbr1Bjo9+9vQaY9jgEer6gTg28CxVfXMqjo2yfrAlXT+e20OHAZ8KslO4/b3WuDDdNruO1NUoyQt\nY5iWNK0l2Qd4DvDlqroWuI1OwBpzAZ2QS5KZdMLuBc19hwIfqar7q2ohncC7QlX1M2AOsBXwZeDe\npmd0LFS/CXhf02NbVfXjqroPOAD496r6XFU9UVUXAD8FDhq3+3Or6qaqegLYuKnx7VX1SFXdDZxK\nJ0gCLG6e77Or6rGqWlmAPKnZz0+Ac8baAjiP5stD0+t9OPC5CfaxCfDLlRynW4ub/e1QVUuq6tqq\nWjTBtgcC/1FV5zRt9yPgq3SGmIy5uKq+W1VLq+qxKapRkpYxTEua7t4AXFFV9zbLX2DcUI9m+eDm\nxMSDgeuq6ufNfc+mMw54zPjbT1FV36+qQ6tqMzpDEl4CnNDcvQ2dIL+8Z9PpbR7v53RC+YqO+xxg\nLeCXzTCIB4D/RadnFuDdQIBrktyU5I2T1bzcvn/e1ANwMbBTcwLlfsCDVXXNBPu4D9hyJcfp1ueA\nfwG+2Aw/+dgEQ0ug0xZ7jbVD0xavA7YYt82k/80k6elymiJJ01aS9ej0Ls9Iclezeh
1goyS7Nb3D\nNyf5ObA/Tx7iAZ3e1q2Bm5vlbbo9dlX9MMk/Abs0q+4AtgduXG7TX9AJheNtC1w+fnfjbt8BPA5s\n2vRUL3/cu4D/Cst65b+R5FtVdesEpW5Dpyd87Li/aPbzWJIv0+mdfi4T90oDfAP4Czo92914BHjG\nuOVl4bcZRnIScFJz4uilwL8Bn+bJ7QCdtri6qvab5FjLP0aSppQ905Kmsz+jM5vGTnROENwdeB6d\nsbdHjtvuC8DxdHqSLxy3/svAe5M8K8lWwLETHSjJPkn+a5LNm+Xn0pk55PvNJmcDH0yyYzMrx67N\nuOhLgT9I8tokayZ5TVPv11d0nOYEvyuAv0+yQZI1kmyfZN/muK9OsnWz+f10wuTSSdrofyZ5RpKd\ngb8Exp+w+FngqOZ5TBamTwT2TnJKki2aOnZoTiTcaAXbXw8clmStJLOBQ8buaMab/1EztGQRnWEf\nY/X/is749TFfp9N2RzT7WivJ85M8b5JaJWlKGaYlTWdvAM6pqv+sqrvG/oAzgNfldxcRuQDYF/jX\nccNBAP4GWAjcTqf39St0eoVX5AE6ofMnSR6m07N8EfCx5v6P0wnnV9AJiZ8G1mvGTR9IZ0q9++gM\n0zhwuTqWdySwNp0e8/ubusaGWTwf+EFTwz8DxzfjuSdyNXAr8E3g76rqirE7quq7dILs+KEvT1FV\ntwEvonOy5k1JHqQzdnkB8NAKHvI/6fTS30+nF3r8rwFbNM9nEXBLU99YkD8NOCSd2VVOr6qH6Jw4\neRidHvW7gL+l8+uDJPVFqvwFTJK6keQtwGFVte+ga+mXJP8KfKGqzh50LZI0jOyZlqQJJNkyyYub\noRR/SKf3+KJB19UvSZ4P7MmTh35IksbxBERJmtjadGbK2I7OMI4vAp8aaEV9kuQ8OmPOj2+GU0iS\nVsBhHpIkSVJLDvOQJEmSWjJMS5IkSS2N1JjpTTfdtGbNmjXoMiRJkjTNXXvttfc2V7Sd1EiF6Vmz\nZrFgwYJBlyFJkqRprrk67ko5zEOSJElqyTAtSZIktWSYliRJkloyTEuSJEktGaYlSZKklgzTkiRJ\nUkuGaUmSJKklw7QkSZLUkmFakiRJaskwLUmSJLU0UpcTlyRpuspJafW4OrGmuBJJq8KeaUmSJKkl\nw7QkSZLUkmFakiRJaskwLUmSJLVkmJYkSZJaMkxLkiRJLRmmJUmSpJYM05IkSVJLAwvTSdZNck2S\nHye5KclJg6pFkiRJamOQV0B8HHhpVT2cZC3gO0kuq6rvD7AmSZIkqWsDC9NVVcDDzeJazZ/XRJUk\nSdLIGOiY6SQzklwP3A1cWVU/GGQ9kiRJ0qoYaJiuqiVVtTuwNfCCJLssv02So5MsSLLgnnvu6X+R\nkiRJ0gSGYjaPqnoAuAp4xQruO6uqZlfV7M0226z/xUmSJEkTGORsHpsl2ai5vR6wH/DTQdUjSZIk\nrapBzuaxJXBekhl0Qv2Xq+rrA6xHkiRJWiWDnM3jBmCPQR1fkiRJerqGYsy0JEmSNIoM05IkSVJL\nhmlJkiSpJcO0JEmS1JJhWpIkSWrJMC1JkiS1ZJiWJEmSWhrkRVskSdJ0lKz6Y6qmvg6pD+yZliRJ\nkloyTEuSJEktGaYlSZKklgzTkiRJUkuGaUmSJKklw7QkSZLUkmFakiRJaskwLUmSJLVkmJYkSZJa\nMkxLkiRJLRmmJUmSpJYM05IkSVJLhmlJkiSpJcO0JEmS1JJhWpIkSWrJMC1JkiS1ZJiWJEmSWjJM\nS5IkSS2tOegCpFGVk9LqcXViTXElkiRpUOyZliRJkloyTEuSJEktGaYlSZKklgzTkiRJUkuGaUmS\nJKklw7QkSZLUkmFakiRJaskwLUmSJLVkmJYkSZJaGliYTrJNkquS3JzkpiTHD6oWSZIkqY1BXk78\nCeCdVXVdkpnAtUmurKqbB1iTJEmS1LWB9UxX1S
+r6rrm9kPALcBWg6pHkiRJWlVDMWY6ySxgD+AH\ng61EkiRJ6t7Aw3SSZwJfBd5eVYtWcP/RSRYkWXDPPff0v0BJkiRpAoMcM02StegE6c9X1T+taJuq\nOgs4C2D27NnVx/JGTk5Kq8fViTarJElSG4OczSPAp4Fbqurjg6pDkiRJamuQwzxeDBwBvDTJ9c3f\nKwdYjyRJkrRKBjbMo6q+A7QblyBJkiQNgYGfgChJkiSNKsO0JEmS1JJhWpIkSWrJMC1JkiS1ZJiW\nJEmSWhroRVskSZIGwQudaarYMy1JkiS1ZJiWJEmSWjJMS5IkSS11HaaTPKOXhUiSJEmjZqVhOsne\nSW4Gftos75bkUz2vTJIkSRpy3fRMnwq8HLgPoKp+DLykl0VJkiRJo6CrYR5Vdcdyq5b0oBZJkiRp\npHQzz/QdSfYGKslawPHALb0tS5IkSRp+3fRMHwO8FdgKuBPYvVmWJEmSVmuT9kwnmQEcUVWv61M9\nkqQh0eYKcV4dTtLqZtKe6apaAry2T7VIkiRJI6WbMdPfSXIG8CXgkbGVVXVdz6qSJEmSRkA3YXr3\n5t+/GbeugJdOfTmSJEnS6FhpmK6quf0oRJIkSRo13VwBccMkH0+yoPn7+yQb9qM4SZIkaZh1MzXe\nZ4CHgEObv0XAOb0sSpIkSV1K2v1pSnQzZnr7qvqLccsnJbm+VwVJkiRJo6KbnulHk+wztpDkxcCj\nvStJkiRJGg3d9Ey/BThv3Djp+4GjelaRJEmSNCK6mc3jemC3JBs0y4t6XpUkSZI0ArqZzeMjSTaq\nqkVVtSjJs5J8qB/FSZIkScOsmzHT+1fVA2MLVXU/8MrelSRJkiSNhm7C9Iwk64wtJFkPWGeS7SVJ\nkqTVQjcnIH4e+GaSsbml/xI4r3clSZIkSaOhmxMQ/zbJj4F5QAEfrKp/6XllkjROTmp3gYE6saa4\nEkmSfqebnmmq6vIkPwReAtzb25IkSZKk0TDhmOkkX0+yS3N7S+BG4I3A55K8vU/1SZIkSUNrsp7p\n7arqxub2XwJXVtWRSWYC3wU+0fPqJEmSNPKm81C9ycL04nG3Xwb8b4CqeijJ0p5WJWkkzJ/f7s1x\nzpzhf3OUJKkbk4XpO5K8DVgI7AlcDsumxlurD7VJkiRJQ22yeab/CtgZOAp4zbgLt7wQOGeiB62K\nJJ9JcneSG1e+tSRJkjRcJuyZrqq7gWNWsP4q4KopOv65wBnAZ6dof5IkSVLfdHMFxJ6pqm8Bvx5k\nDZIkSVJbAw3TkiRJ0igb+jCd5OgkC5IsuOeeewZdjiRJkrTMhGOmk3ySzuXDV6iqjutJRU89zlnA\nWQCzZ892Pi1JkiQNjcl6phcA1wLr0pka79+bv92BtXtfmiRJkjTcJpvN4zyAJG8B9qmqJ5rlM4Fv\nT8XBk1wAzAE2TbIQOLGqPj0V+5YkSZJ6bbKLtox5FrABv5t145nNuqetqg6fiv30XNpd5Y1yVIok\nSdJ01k2Y/ijwoyRXAQFeAnygl0VJkiRJo2ClYbqqzklyGbBXs+o9VXVXb8uSJEmSht9Kp8ZLEmAe\nsFtVXQysneQFPa9MkiRJGnLdzDP9KeBFwNj45oeAf+hZRZIkSdKI6GbM9F5VtWeSHwFU1f1JnBpP\nkiRJq71ueqYXJ5lBcwGXJJsBS3talSRJkjQCugnTpwMXAZsn+TDwHeDknlYlSZIkjYBuZvP4fJJr\ngZfRmRrvz6rqlp5XJkmSJA25lYbpJJ+rqiOAn65gnSSpT+bPb3cBqTlzvICUJPVKN8M8dh6/0Iyf\n/uPelCNJkiSNjgnDdJL3JnkI2DXJoubvIeBu4OK+VShJkiQNqQnDdFWdXFUzgVOqaoPmb2ZVbVJV\n7+1jjZIkSdJQ6maYx9eTrA+Q5PVJPp7kOT2uS5IkSRp63YTpfwR+k2Q34J3AbcBne1qVJEmSNAK6\nuQLiE1VVSV
4FnFFVn07yV70uTFpVznQgSZL6rZsw/VCS9wKvB16SZA1grd6WJUmSJA2/boZ5vAZ4\nHPirqroL2Bo4padVSZIkSSOgmysg3gV8fNzyf+KYaUmSJKmrKyA+BIwNKl2bzhCPh6tqw14WJkmS\nJA27bnqmZ47dThLgVcALe1mUJEmSNAq6GTO9THX8H+DlPapHkiRJGhndDPM4eNziGsBs4LGeVSRJ\nkiSNiG6mxjto3O0ngP+gM9RDkiRJWq11M2b6L/tRiCRJw8QLQUnqxoRhOsm7q+pjST7J72bzGFPA\nr4Hzq+q2XhYoSZIkDavJeqZvaf5dMMH9mwD/BOw2pRVJkiRJI2LCMF1VX2v+PW+ibZI80ouiJEmS\npFEw2TCPr/HU4R3LVNWfVtX/6klVkiRJ0giYbJjH3zX/HgxsAZzfLB8O/KqXRUmSJEmjYLJhHlcD\nJPn7qpo97q6vJZloHLUkSZK02ujmCojrJ/n9sYUk2wHr964kSZIkaTR0c9GWvwbmJ/kZEOA5wJt7\nWpUkSZI0Arq5aMvlSXYEntus+imwtKdVSZIkSSOgm2EeVNXjwA3ApsCngIW9LEqSJEkaBSsN00le\nmOR04OfAxcC3+F0vtSRJkrTamjBMJ/lIkn8HPkynV3oP4J6qOq+q7p+Kgyd5RZJ/S3Jrkv8+FfuU\nJEmS+mWynuk30ZlP+h+Bz1XVfUxyEZdVlWQG8A/A/sBOwOFJdpqq/UuSJEm9NlmY3hL4EHAQcFuS\nzwHrJelmBpBuvAC4tap+VlW/Bb4IvGqK9i1JkiT13GQXbVkCXA5cnmQd4EBgPeDOJN+sqtc+zWNv\nBdwxbnkhsNfT3KckSZLUN131MjezeXwV+GqSDYA/62lV4yQ5GjgaYNttt+3XYZ+s2o1umT8/rR43\nd26741XLOodSVr3t5rR8/i0OBdje0N82t719jbdme/dfi+fiZ+bT0MecYns/VVdT441XVYuq6rNT\ncOw7gW3GLW/drFv+eGdV1eyqmr3ZZptNwWElSZKkqbHKYXoK/RDYMcl2SdYGDgP+eYD1SJIkSatk\nqk4mXGVV9USSY4F/AWYAn6mqmwZVjyRJkrSqugrTSfYGZo3ffiqGelTVpcClT3c/kiRJ0iCsNEw3\nU+JtD1wPLGlWFzAV46YlSZKkkdVNz/RsYKeazqdhSpIkSS10cwLijcAWvS5EkiRJGjXd9ExvCtyc\n5Brg8bGVVfWnPatKkqYzf+iTpGmjmzD9gV4XIUmSJI2ilYbpqrq6H4VIkiRJo2alY6aTvDDJD5M8\nnOS3SZYkWdSP4iRJkqRh1s0JiGcAhwP/DqwHvAn4h14WJUmSJI2Cri4nXlW3AjOqaklVnQO8ordl\nSZIkScOvmxMQf5NkbeD6JB8DfkmXIVySJEmazroJxUc02x0LPAJsA/xFL4uSJEmSRkE3s3n8PMl6\nwJZVdVIfapIkSZJGQjezeRwEXA9c3izvnuSfe12YJEmSNOy6GebxAeAFwAMAVXU9sF0Pa5IkSZJG\nQjdhenFVPbjcOq+FK0mSpNVeN7N53JTktcCMJDsCxwHf621ZkiRJ0vDrpmf6bcDOwOPABcAi4O29\nLEqSJEkaBd3M5vEb4ITmT5IkSVJjwjC9shk7qupPp74cSZIkaXRM1jP9IuAOOkM7fgCkLxVJkiRJ\nI2KyML0FsB9wOPBa4BLggqq6qR+FSZIkScNuwhMQq2pJVV1eVW8AXgjcCsxPcmzfqpMkSZKG2KQn\nICZZBziATu/0LOB04KLel6XVXjmVuSRJGn6TnYD4WWAX4FLgpKq6sW9VSZIkSSNgsp7p1wOPAMcD\nxyXLzj8MUFW1QY9rkyRJkobahGG6qrq5oIskSZK02jIwS5IkSS0ZpiVJkqSWVno5cUmrAWdPkSSp\nFXumJUmSpJYM05IkSVJLhmlJkiSpJcO0JEmS1JJhWpIkSWrJMC1JkiS1ZJiW
JEmSWhpImE7y6iQ3\nJVmaZPYgapAkSZKerkH1TN8IHAx8a0DHlyRJkp62gVwBsapuAUgyiMNLkiRJU8LLiUuSpreqQVcg\naRrrWZhO8g1gixXcdUJVXbwK+zkaOBpg2223naLq+mPOHN/AJUmSprOehemqmjdF+zkLOAtg9uzZ\nplNJkiQNDafGkyRJkloa1NR4f55kIfAi4JIk/zKIOiRJkqSnY1CzeVwEXDSIY0uSJMlzu6aKwzwk\nSZKklgzTkiRJUkuGaUmSJKklw7QkSZLUkmFakiRJaskwLUmSJLU0kKnxJEmSNHrK2fSewp5pSZIk\nqSXDtCRJktSSwzyGkD+hSJIkjQZ7piVJkqSW7JmW1Hf++iJJmi7smZYkSZJaMkxLkiRJLRmmJUmS\npJYM05IkSVJLhmlJkiSpJcO0JEmS1JJhWpIkSWrJMC1JkiS1ZJiWJEmSWjJMS5IkSS0ZpiVJkqSW\nDNOSJElSS2sOugBp0KoGXYEkSRpVhmlJkqaQX9Cl1YvDPCRJkqSWDNOSJElSS4ZpSZIkqSXDtCRJ\nktSSYVqSJElqydk8JGmac3YJSeode6YlSZKklgzTkiRJUkuGaUmSJKklw7QkSZLU0kDCdJJTkvw0\nyQ1JLkqy0SDqkCRJo62q3Z80VQbVM30lsEtV7Qr8P+C9A6pDkiRJam0gYbqqrqiqJ5rF7wNbD6IO\nSZIk6ekYhjHTbwQuG3QRkiRJ0qrq2UVbknwD2GIFd51QVRc325wAPAF8fpL9HA0cDbDtttv2oFJJ\nkiSpnZ6F6aqaN9n9SY4CDgReVjXxqQBVdRZwFsDs2bM9ZUCSJElDYyCXE0/yCuDdwL5V9Zu2+1m8\neDELFy7ksccem7riJGlErbvuumy99dastdZagy5FklYbAwnTwBnAOsCVSQC+X1XHrOpOFi5cyMyZ\nM5k1axbNfiRptVRV3HfffSxcuJDttttu0OVI0mpjIGG6qnaYiv089thjBmlJApKwySabcM899wy6\nFElarQzDbB5Pi0Fakjp8P5Sk/hvUMI/emKoPkpVcGumGG27gPe95D48++ii//e1vOeSQQ3jHO97B\nDjvswK233rpKhzr99NM57rjjWpd6+eWXc9JJJwHwgQ98gJe//OWt97W8+fOnpj3nzBmd9nzjG9/I\nZZddxgEHHMDZZ5/dej8r0qeX59C058KFC3nd617H0qVLWbp0KaeddhqzZ89uta8VyUlT06B14ui8\nPufNm8cTTzzBww8/zDvf+U4OP/zw1vuSJE2N6RWm++DBBx/k9a9/PRdddBHbb789VcUVV1zRen+r\n+uG6ZMkSZsyYsez2u9/9br71rW8BsO+++zJv3rxl94+CYWpPgA9+8IMceeSRnH/++a1rGKRhas+Z\nM2dy4YUXsvnmm3PzzTfz5je/mW9/+9utaxmEYWpPgEsvvZS1116bRYsWsdtuuxmmJWkIjPwwj367\n5JJLOOigg9h+++2Bzs+qy/cGn3vuuXzoQx8COr1zc+bMAeDUU09lr732Yu7cuZx22ml84Qtf4M47\n72TOnDl8+MMfZvHixbzpTW9i7ty57LPPPlxzzTUAHHXUURxzzDEceOCBTwojt956K9tttx0bbbQR\nG220EbNmzVrlnrJBG6b2BNhqq616/Ix7a5jac8MNN2TzzTcHYJ111mHNNUfvu/swtSfA2muvDcAj\njzzCzjvv3MunLknq0uh9ug3YHXfcwTbbbNPqsZ///Oe56qqrmDlzJkuXLmWNNdbg/e9/P/Pnzwfg\nzDPPZIcdduDss8/mV7/6FQcffDDf/e53AXjOc57DmWee+aT93XfffTzrWc9atrzRRhvx61//ut0T\nG5Bhas/pYBjbc8mSJRx33HGccMIJreoapGFrzyVLlvDSl76Um266iZNPPrn185IkTR3D9CraZptt\nuPHGGyfdZvxJQOOvR/OJT3yC4447jsWL
F3PMMcewzz77POlxP/nJT/je977H5ZdfDnR+Yh6z9957\nP+U4G2+8MQ888MCy5QcffJCNN9541Z7QgA1Te04Hw9ieb37zm9l///2ZN2/S6zgNpWFrzxkzZnD1\n1Vdz33338fznP59DDz2UDTfccJWflzSMVnZ+jTSsHOaxig444AC+9rWvcdttty1bd+WVVz5pm403\n3piFCxcCcO211y5bv+eee3LOOefw0Y9+lOOPPx6ANddck6VLlwKw8847c+SRRzJ//nzmz5/Pdddd\nt+yxKxoHveOOO3L77bezaNEiFi1axO23384OO0zJrIN9M0ztOR0MW3u+613vYsstt+TYY4+dmifY\nZ8PUnosXL2bJkiUArL/++qy77rqsu+66U/RMJUlt2TO9ijbccEPOP/983vrWt/LYY4/x29/+lle/\n+tXst99+y7bZb7/9OPXUU/mTP/kT9thjj2XrjzjiCO69914ee+wx3vrWtwJwyCGHcMABB7D//vvz\nlre8hbe97W3MnTsXgNmzZ3PKKadMWMuMGTM4+eSTl43hPPnkk0cuJA5TewK8733v47LLLuOuu+5i\n3rx5XHzxxay//vo9eOa9MUztuWDBAk477TRe/OIXM2fOHDbbbDMuvPDCHj3z3him9rz77rs5/PDD\nmTFjBo8//jjvf//7WWeddXr0zCVJ3UqtbJ6tITJ79uxasGDBsuVbbrmF5z3veQOsSJKGi++LkjQ1\nklxbVSud09VhHpIkSVJLhmlJkiSpJcO0JEmS1NLIh+lHH32UURr3LUm9UFU8+uijgy5DklY7Iz2b\nx5Zbbsmdd97J4sWLB12KJA3cWmutxZZbbjnoMiRptTLSYXrsMtqSJEnSIIz8MA9JkiRpUAzTkiRJ\nUkuGaUmSJKmlkboCYpJ7gJ8Puo4B2xS4d9BFrEZs7/6yvfvPNu8v27u/bO/+mm7t/Zyq2mxlG41U\nmBYkWdDNpS01NWzv/rK9+8827y/bu79s7/5aXdvbYR6SJElSS4ZpSZIkqSXD9Og5a9AFrGZs7/6y\nvfvPNu8v27u/bO/+Wi3b2zHTkiRJUkv2TEuSJEktGaaHUJKrkrx8uXVvT3JOkq8Mqq7pLskWSb6Y\n5LYk1ya5NMkfJDk9yY1JfpLkh0m2G3StoyRJJTl/3PKaSe5J8vWVPG73JK8ct/yBJO/qZa3TRZIT\nktyU5IYk1yfZq3kPeUYXj+1qO01uZa/7JEclOWNwFU4PSZY0r/EfJ7kuyd6Drmm6mugzctB1DQPD\n9HC6ADhsuXWHAedU1SEDqGfaSxLgImB+VW1fVX8MvBd4DfBsYNeq+iPgz4EHBlfpSHoE2CXJes3y\nfsCdXTxud+CVK91KT5LkRcCBwJ5VtSswD7gDeDvQTUjudjtNru3rXqvm0aravap2o/OeffKgC5qO\nJvmM/L3BVjYcDNPD6SvAAUnWBkgyi06guyPJjc26s5tv49c3vR0nDqza6WEusLiqzhxbUVU/pvOB\n+MuqWtqsW1hV9w+oxlF2KXBAc/twOl8YAUjygiT/N8mPknwvyR82r/2/AV7TvMZf02y+U5L5SX6W\n5Lj+PoWRsSVwb1U9DlBV9wLj1y3AAAAEeElEQVSH0HkPuSrJVQBJ/jHJgqYH+6Rm3XHjt0syI8m5\n436Z+evBPKWRNeHrXj2xAXA/QJI543/9SnJGkqOa2x9NcnPzy83fDabUkTPRZ+R3kpwy7j3iNbCs\n/a9OcnHzfv3RJK9Lck2z3faDeiK9YJgeQlX1a+AaYP9m1WHAl4Eat82bqmp34FV0rjZ0bp/LnG52\nAa5dwfovAwc1ge7vk+zR57qmiy8ChyVZF9gV+MG4+34K/Jeq2gN4P/CRqvptc/tLTa/Tl5ptnwu8\nHHgBcGKStfr2DEbHFcA2Sf5fkk8l2beqTgd+AcytqrnNdic0F1fYFdg3ya4r2G53YKuq2qX5Zeac\nATyf
UTbZ615TY73m/fmnwNnAByfbOMkmdH5h3Ln55eZDfahxOpjoM/JgOu8Tu9H5FeyUJFs29+0G\nHAM8DzgC+IOqegGd/05v63nFfWSYHl7jh3ocxgp6NJo36AuBt1XV6n6Z9Z6oqoXAH9L5OWsp8M0k\nLxtsVaOnqm4AZtHpnbt0ubs3BC5sfnU5Fdh5kl1dUlWPN72td+NPjE9RVQ8DfwwcDdwDfGmsR245\nhya5DvgRnTbfaQXb/Az4/SSfTPIKYFFvqp6eVvK619QYG+bxXOAVwGebIQkTeRB4DPh0koOB3/Sj\nyGlsH+CCqlpSVb8Crgae39z3w6r6ZfMr2W10vugD/ITO/xfThmF6eF0MvCzJnsAzqmpF3wjPBP6p\nqr7R39KmpZvoBJCnaMLbZVX134CPAH/W18qmj38G/o6nfjH8IHBVVe0CHASsO8k+Hh93ewmw5pRW\nOE00H2zzq+pE4FjgL8bf35xE+y7gZU3v3CWsoN2bIU27AfPp9DCd3ePSp6OJXveaYlX1f4FNgc2A\nJ3hyxlm32eYJOr9sfYXOuQWX97nMUTXhZ+Qkxr9fLx23vJRp9t5tmB5STe/SVcBnWHGv9FuBmVX1\n0X7XNk39K7BOkqPHViTZNcm+SZ7dLK9B56dafwVo5zPASVX1k+XWb8jvTsw6atz6h4CZfahrWmnG\nnO84btXudF6z49tzAzrnAzyY5Pf43ZAyxm+XZFNgjar6KvA+YM8elz8dTfS61xRL8lxgBnAfndf8\nTknWSbIR8LJmm2cCG1bVpcBf0/myqJVb4WcknRPyX9OcX7EZ8BI6w1RXK9Pqm8E0dAGds2eXn9kD\nOr1Ki5Nc3yyfOf7EAK2aqqokfw58Isl76PwM+B90ei0+nmSdZtNrAKezaqEZMnP6Cu76GHBekvfR\n6SEdcxXw35vXuGfod++ZwCebAPEEcCudIR+HA5cn+UVVzU3yIzrj1e8Avjvu8WeNbUdnZo9zmi+S\n0BnupFUwyese4Kgk43/pemGzvbq33rjPwQBvqKoldE7Y/zJwI3A7neFM0PmieHEzTDLAO/pd8Cia\n5DPy7XTec35M57yud1fVXc0Xm9WGV0CUJEmSWnKYhyRJktSSYVqSJElqyTAtSZIktWSYliRJkloy\nTEuSJEktGaYlSZKklgzTkiRJUkuGaUmSJKml/w8/ffiq6NR8YAAAAABJRU5ErkJggg==\n", "text/plain": [ "<Figure size 1200x500 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAtMAAAE/CAYAAACEmk9VAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xm8ZGV95/HPl2YVWWQxIIJNwETB\nyJJWFFG6tQkiEpWggIoSYxBHRKNOHAdHIC440UhANAyDQRTFNQwqi1to3KIIiMiiESRKI6gg0Igs\nTfdv/jinsWj63q4+favq1u3P+/W6L+pUnTrnV08XVd96znOek6pCkiRJ0qpba9QFSJIkSePKMC1J\nkiR1ZJiWJEmSOjJMS5IkSR0ZpiVJkqSODNOSJElSR4ZpSRqiJLOTVJK1Z9K+hinJBUleOeo6JAkM\n05IGLMmCJLcnWW+A+3hBkiuSLEpya5J/T7L9oPY3nST5kySfbV/3nUmuTPKmJLOmcB8Lkrx6qrbX\nbvO/ksxf7r7Dk3xrZc+tqv2q6syprKenhv+Z5IYkv0uyMMmn2/tPTfKxFay/S5L7kmyW5Lgki5Pc\n1f79Z5JTkmw9iFolTQ+GaUkDk2Q28EyggL8c0D52BD4GvBnYBNge+BCwZAr3kSTT7vMyyQ7A94Ab\ngT+rqk2AFwNzgI1GWVuvqQz2g9T2dh8GzK+qR9K049fbh88EDkyy4XJPOwz4UlX9tl3+dFVtBGwG\nvAjYCrjMQC3NXNPuy0HSjPIK4LvAR4EHD8sn2SPJLb0hK8mLklzZ3t4gyZltj/a1Sf4+ycIJ9rEr\ncENVfb0ad1XV56vqF+22ZrW9jde3vYWXJdm2fWzPJN9ve3S/n2TPnnoWJHl3km8Dvwf+OMkmST6S\n5OYkNyV517LXkGTHJBe327p1WY/mJF6V5Jfttt7SbmOrJL9PsnlPHbsn+U2SdVawjeOB71TVm6rq\nZoCq+klVvbSq7lh+5eV7g9ue1LPa2+snOSvJbUnuaNvjj5K8m+YH0Sltb+0p7fpPSPLVJL9N8pMk\nL+nZ7keT/EuS85PcDcxbSVs8zET1tI892FO+rDc7yfvb98sNSfbr2c72Sb7R/tt/LcmHlr3mFXgK\n8OWqur5ty1uq6rT29n8ANwF/1bPtWcBLaX7MPURVLa6qq4GDgd/Q/NiTNAMZpiUN0iuAT7R/+y4L\nQ1X1PeBu4Nk9674U+GR7+1hgNvDHwD7AyyfZx+XAE5KcmGRekkcu9/ibgEOB5wEbA68Cfp9kM+A8\n4GRgc+ADwHm9QZam1/EIml7en9P8KHgA2BHYDfgLYNnwh3cCXwEeBTwW+OAkNUMTMB/fbuOtSeZX\n1S3AAuAlPesdBnyqqhavYBvzgc+tZD/9eiVNz/62NO1xJHBPVR0DfBM4qqoeWVVHtb2zX6X593o0\ncAjw4SQ79WzvpcC7adpupUM3+q1ngnX3AH4CbAH8I/CRJGkf+yRwSbuN42jacyLfBV6R5L8nmbOC\nHvWP0bynl5kPrAOcP9EGq2oJcC7NDxJJM5BhWtJAJNkLeBzwmaq6DLieJmAtczZNyCXJRjRh9+z2\nsZcA76mq26tqIU3gXaGq+hkwF9gG+Axwa9szuixUvxp4e9tjW1X1w6q6Ddgf+GlVfbyqHqiqs4Ef\nAwf0bP6jVXV1VT1Ac9j+ecAbq+ruqvo1cCJNkARY3L7ex1TVvVW1sgB5fLudHwFnLGsLmuEEL2/b\nZVZ7/8cn2MbmwM0r2U+/Frfb27GqllTVZVW1aIJ1nw/8V1Wd0bbdD4DP0wwxWebcqvp2VS2tqnsH\nXM/Pq+r/tsH1TGBr4I+SbEfT2/yOqrq//Tf5wkQ7rKqzgNcD+wIXA79O8taeVT4O7J3kse3yK4BP\nTvBDp9cvad4/kmYgw7SkQXkl8JWqurVd/iQ9Qz3a5QPTnJh4I
HB5Vf28fewxNOOAl+m9/TBV9d2q\neklVbUnTA/gs4Jj24W1pgvzyHkPT29zr5zShfEX7fRxNL+TN7bCDO4D/Q9MzC/D3QIBLklyd5FWT\n1bzctn/e1gNNL+ZOaU6g3Ae4s6oumWAbt9EEx6nwceDLwKfa4Sf/OMHQEmjaYo9l7dC2xctoxgcv\nM+m/GU0P//LbX4cmRK9qPbcsu1FVv29vPpKmTX/bc99K66qqT1TVfGBTmt7wdybZt33sF8A3gJe3\nP9ZeyAqGeKzANsBvV7qWpLFkmJY05ZJsQNO7vHeasdG3AH8H7JJkF4CquoYmRO7HQ4d4QNPb+tie\n5W373XdVfR/4N+BJ7V03AjusYNVf0oTCXtvRjIt9cHM9t28E7gO2qKpN27+Nq2rndr+3VNXfVtVj\ngNfQDHvYcZJSe1/Tdm09tL24n6HpnT6MiXulAb5GzxjePtwNPKJn+cHw247xPb6qdgL2pOl9Xjak\nobcdoGmLi3vaYdN2CMhre9ZZ/jnL+wXNUJ5e29P+wFlJPf26GdgsSe9r7uu91O7/s8CV/OG9BE3P\n92E07X5De9RlQmlOXD2AZqiMpBnIMC1pEF5IM5vGTjQnCO4KPJEmUPQGok8Cb6DpSf5sz/2fAd6W\n5FFJtgGOmmhHSfZK8rdJHt0uP4Fm5pDvtqucTtO7+Pg0ntyOiz4f+JMkL02ydpKD23q/tKL9tCf4\nfQX4pyQbJ1kryQ5J9m73++Kew/+304TJpZO00f9K8ogkOwN/DfSesPgx4PD2dUwWpo8F9kzyviRb\ntXXs2J64t+kK1r8COCTJOknmAActe6Adb/5n7dCSRTQ9xMvq/xXN+PVlvkTTdoe121onyVOSPHGS\nWpf3aeCN7YmMaet5FfCpPurpS3uk41LguCTrJnk6Dx3G8xDtyYz7J9mo/ffdD9iZZsaUZT5P8+Pn\neJpgPdG21m7b42yaHy0fWJXaJY0Pw7SkQXglcEZV/aLtsb2lPbnuFOBl+cNFRM4G9gb+vWc4CMA/\nAAuBG2h6Xz9H0yu8InfQhM4fJfkdcCFwDs2JaNCEmM/QBOFFwEeADdpx08+nmWXhNpphGs9fro7l\nvQJYF7iGJjB/jj8Ms3gK8L22hi8Ab2jHc0/kYuA6mqnX3l9VX1n2QFV9myY49g59eZh21omn0/Tw\nXp3kTpqwdylw1wqe8r9oeulvpwmDvUcDtmpfzyLg2ra+ZUH+JOCgdraMk6vqLpoTJw+h6VG/Bfjf\nwKrMJf5/acaKfxG4k+YHxDFVdWEf9ayKl9G00W3Au2hC/ETvpUXA/6TpNb+D5j302t7x71V1N00b\nP5bmxNrlHdy+B+6keR/cBvx5Vf2yQ+2SxkCqVnYkTpJGK8lrgUOqau9R1zIsSf6d5uS200ddy0yS\nZsrCH1fVsaOuRdLMYM+0pGknydZJntEeav9Tmt7jc0Zd17AkeQqwOw8d+qEO2uEnO7TvpecCLwD+\n36jrkjRzrL3yVSRp6NalmSlje5rD7Z8CPjzSioYkyZk0Y87f0A6n0OrZiuaE1M1phg69tp3KT5Km\nhMM8JEmSpI4c5iFJkiR1ZJiWJEmSOhqrMdNbbLFFzZ49e9RlSJIkaYa77LLLbm2vrDupsQrTs2fP\n5tJLLx11GZIkSZrhkkw4z38vh3lIkiRJHRmmJUmSpI4M05IkSVJHhmlJkiSpI8O0JEmS1JFhWpIk\nSerIMC1JkiR1ZJiWJEmSOjJMS5IkSR0ZpiVJkqSOxupy4tIgJN2eVzW1dUiSpPFjz7QkSZLUkWFa\nkiRJ6sgwLUmSJHVkmJYkSZI6MkxLkiRJHRmmJUmSpI6cGk+SJI3cggXd5imdO9d5SjVahmlJkqaQ\nc9dLaxaHeUiSJEkdGaYlSZKkjgzTkiRJUkeGaUmSJKkjw7QkSZLUkWFakiRJ6sgwLUmSJHU0sjCd\nZNskFyW5JsnVSd4wqlokS
ZKkLkZ50ZYHgDdX1eVJNgIuS/LVqrpmhDVJkiRJfRtZz3RV3VxVl7e3\n7wKuBbYZVT2SJEnSqpoWY6aTzAZ2A7432kokSZKk/o08TCd5JPB54I1VtWgFjx+R5NIkl/7mN78Z\nfoGSJEnSBEYappOsQxOkP1FV/7aidarqtKqaU1Vzttxyy+EWKEmSJE1ilLN5BPgIcG1VfWBUdUiS\nJEldjbJn+hnAYcCzk1zR/j1vhPVIkiRJq2RkU+NV1beAjGr/kiRJ0uoa+QmIkiRJ0rgyTEuSJEkd\nGaYlSZKkjgzTkiRJUkeGaUmSJKkjw7QkSZLUkWFakiRJ6sgwLUmSJHVkmJYkSZI6MkxLkiRJHRmm\nJUmSpI4M05IkSVJHhmlJkiSpI8O0JEmS1JFhWpIkSerIMC1JkiR1ZJiWJEmSOjJMS5IkSR0ZpiVJ\nkqSODNOSJElSR4ZpSZIkqSPDtCRJktSRYVqSJEnqyDAtSZIkdWSYliRJkjoyTEuSJEkdGaYlSZKk\njgzTkiRJUkeGaUmSJKkjw7QkSZLUkWFakiRJ6sgwLUmSJHVkmJYkSZI6MkxLkiRJHRmmJUmSpI4M\n05IkSVJHhmlJkiSpI8O0JEmS1JFhWpIkSerIMC1JkiR1ZJiWJEmSOhppmE7yr0l+neSqUdYhSZIk\ndTHqnumPAs8dcQ2SJElSJ2uPcudV9Y0ks0dZg6TxkOPT6Xl1bE1xJZIk/cGoe6YlSZKksTXSnul+\nJDkCOAJgu+22G3E1kiRJ00y6HbmjPHI3FaZ9z3RVnVZVc6pqzpZbbjnqciRJkqQH9R2mkzxikIVI\nkiRJ42alwzyS7AmcDjwS2C7JLsBrquq/re7Ok5wNzAW2SLIQOLaqPrK625UkSdLkFixY9eEhc+c6\nNGR5/YyZPhHYF/gCQFX9MMmzpmLnVXXoVGxHkjT1usyg4uwpktY0fQ3zqKobl7tryQBqkSRJksZK\nPz3TN7ZDPSrJOsAbgGsHW5YkSZI0/fXTM30k8DpgG+AmYNd2WZIkSVqjTdoznWQWcFhVvWxI9UiS\nJEljY9Ke6apaArx0SLVIkiRJY6WfMdPfSnIK8Gng7mV3VtXlA6tKkiRJGgP9hOld2//+Q899BTx7\n6suRJEmSxsdKw3RVzRtGITNRl8nQAebN6zZPazm9qyRJ0lCtdDaPJJsk+UCSS9u/f0qyyTCKkyRJ\nkqazfqbG+1fgLuAl7d8i4IxBFiVJkiSNg37GTO9QVX/Vs3x8kisGVZAkSZI0Lvrpmb4nyV7LFpI8\nA7hncCVJkiRJ46GfnunXAmf2jJO+HTh8YBVJkiRJY6Kf2TyuAHZJsnG7vGjgVUmSJEljoJ/ZPN6T\nZNOqWlRVi5I8Ksm7hlGcJEmSNJ31M2Z6v6q6Y9lCVd0OPG9wJUmSJEnjoZ8wPSvJessWkmwArDfJ\n+pIkSdIaoZ8TED8BfD3Jsrml/xo4c3AlSZIkSeOhnxMQ/3eSHwLzgQLeWVVfHnhlkiRJ0jTXT880\nVXVhku8DzwJuHWxJkiRJ0niYcMx0ki8leVJ7e2vgKuBVwMeTvHFI9UmSJEnT1mQnIG5fVVe1t/8a\n+GpVHQDsQROqJUmSpDXaZGF6cc/t5wDnA1TVXcDSQRYlSZIkjYPJxkzfmOT1wEJgd+BCeHBqvHWG\nUJskSZI0rU3WM/03wM7A4cDBPRdueRpwxkRPkiRJktYUE/ZMV9WvgSNXcP9FwEWDLEqSJEkaB/1c\nAVGSJEnSChimJUmSpI4M05IkSVJHE46ZTvJBmsuHr1BVHT2QiiRJkqQxMdnUeJe2/30GsBPw6Xb5\nxcA1gyxKkqQ1TY5Pp+fVsRP2e0kagslm8zgTIMlrgb2q6oF2+VTgm8MpT5IkSZq++hkz/Shg457l\nR7b3SZIkSWu0yYZ5LPNe4AdJLgICPAs4bpBFSZIkSeNgpWG6qs5IcgGwR3vXW6vqlsGWJUm
SJE1/\nKx3mkSTAfGCXqjoXWDfJUwdemSRJkjTN9TNm+sPA04FD2+W7gA8NrCJJkqQ+Jd3+pKnSz5jpPapq\n9yQ/AKiq25OsO+C6JEmSpGmvn57pxUlm0V7AJcmWwNKBViVJkiSNgX7C9MnAOcCjk7wb+BZwwkCr\nkiRJksZAP7N5fCLJZcBzaKbGe2FVXTvwyiRJkqRprp/ZPD5eVT+uqg9V1SlVdW2Sjw+jOEkzkycL\nSZJmin6Geezcu9COn/7zqdh5kucm+UmS65L8j6nYpiRJkjQsE4bpJG9Lchfw5CSL2r+7gF8D567u\njttQ/iFgP2An4NAkO63udiVJkqRhmTBMV9UJVbUR8L6q2rj926iqNq+qt03Bvp8KXFdVP6uq+4FP\nAS+Ygu1KkiRJQ9HPPNNfSrJhVd2d5OXA7sBJVfXz1dz3NsCNPcsL+cMlyx+U5AjgCIDttttuNXfZ\nUccBm3OrOj2v49PI8d3qrGM77nCQOrT5gou67ao6Nvia3t4w7Dafhu3W1ZDbe968rp9FM6TNx+L9\nDWv6e9zvzNXQsRHmdnhO53NYjptB7b2cfsZM/wvw+yS7AG8Grgc+NtCqelTVaVU1p6rmbLnllsPa\nrSRJkrRS/YTpB6r5mf0C4JSq+hCw0RTs+yZg257lx7b3Sc2v7FX9kyRJGrJ+wvRdSd4GvBw4L8la\nwDpTsO/vA49Psn17efJDgC9MwXYlSZKkoegnTB8M3Af8TVXdQtOD/L7V3XFVPQAcBXwZuBb4TFVd\nvbrblSRJkoalnysg3gJ8oGf5F0zRmOmqOh84fyq2JUmSJA3bZPNMf6v9710980wv+7szyQ1J/tvw\nSpUkSZKmlwl7pqtqr/a/KzzZMMnmwHeADw+mNEmSJGl6W+kwjyQrnNy5qn6RZO6UVyRJkiSNiX4u\n2nJez+31ge2BnwA7V9XNA6lKkiRJGgP9nID4Z73LSXYHHCstSUM2d67zqUvSdNNPz/RDVNXlSR52\n2W9J0vTkNY0kaXD6GTP9pp7FtYDdgV8OrCJJkiRpTPTTM907m8cDNGOoPz+YciRJkqTx0c+Y6eOH\nUYgkSZI0biYM00m+CEw40q6q/nIgFUmSJEljYrKe6fe3/z0Q2Ao4q10+FPjVIIuSJEmSxsFkV0C8\nGCDJP1XVnJ6Hvpjk0oFXJkmSpGml6+xAmcGDhtfqY50Nk/zxsoUk2wMbDq4kSZIkaTz0M5vH3wEL\nkvwMCPA44DUDrUqSJEkaA/3M5nFhkscDT2jv+jGwdKBVSZIkSWOgn2EeVNV9wJXAFsCHgYWDLEqS\nJEkaBysN00meluRk4OfAucA3+EMvtSRJkrTGmjBMJ3lPkp8C76bpld4N+E1VnVlVtw+rQElDUNXt\nT5KkNdxkY6ZfDfwn8C/AF6vqviR+e0rS6vKHiCTNGJMN89gaeBdwAHB9ko8DGyTpZwYQSZIkacab\n7KItS4ALgQuTrAc8H9gAuCnJ16vqpUOqUdI0NXeuPaySpDVbX73M7Wwenwc+n2Rj4IUDrUqSJEka\nA6s8ZKOqFgEfG0AtkiRJ0ljpa55pSZIkSQ9nmJYkSZI66muYR5I9gdm961eVQz0kSZK0RltpmG6n\nxNsBuAJY0t5dOG5akiRJa7h+eqbnADtVeZUBSZIkqVc/Y6avArYadCGSJEnSuOmnZ3oL4JoklwD3\nLbuzqv5yYFVJkiRJY6CfMH3coIuQJEmSxtFKw3RVXTyMQiRJkqRxs9Ix00meluT7SX6X5P4kS5Is\nGkZxkiRJ0nTWzwmIpwCHAj8FNgBeDXxokEVJkiRJ46CvKyBW1XXArKpaUlVnAM8dbFmSJEnS9NfP\nCYi/T7IucEWSfwRuxsuQS5IkSX2F4sPa9Y4C7ga2Bf5qkEVJkiRJ46Cf2Tx+nmQDYOuqOn4INUmS\nJEljoZ/ZPA4ArgAubJd3TfKFQRcmSZIkTXf9DPM4Dng
qcAdAVV0BbD/AmiRJkqSx0E+YXlxVdy53\nXw2iGEmSJGmc9BOmr07yUmBWkscn+SDwndXZaZIXJ7k6ydIkc1ZnW5IkSdKo9BOmXw/sDNwHnA0s\nAt64mvu9CjgQ+MZqbkeSJEkamX5m8/g9cEz7NyWq6lqAJFO1ycEqR7WMg7lz/XeSJEnDNWGYXtmM\nHVX1l1NfjiRJkjQ+JuuZfjpwI83Qju8Bq9SNnORrwFYreOiYqjp3FbZzBHAEwHbbbbcqJUiSJEkD\nNVmY3grYBzgUeClwHnB2VV3dz4arav7qlwdVdRpwGsCcOXM8ji9JkqRpY8ITEKtqSVVdWFWvBJ4G\nXAcsSHLU0KqTJEmSprFJZ/NIsl6SA4GzgNcBJwPnrO5Ok7woyUKaoSTnJfny6m5TkiRJGrbJTkD8\nGPAk4Hzg+Kq6aqp2WlXnMAWhXJIkSRqlycZMvxy4G3gDcHTPNHYBqqo2HnBtkiRJ0rQ2YZiuqn4u\n6CJJkiStsQzMkiRJUkeGaUmSJKmjlV5OXJKkNdHcuV7aQNLK2TMtSZIkdWSYliRJkjoyTEuSJEkd\nGaYlSZKkjgzTkiRJUkfO5iF1VMd6pr8kSWs6e6YlSZKkjgzTkiRJUkeGaUmSJKkjw7QkSZLUkWFa\nkiRJ6sgwLUmSJHXk1HgziFO1SZIkDZc905IkSVJHhmlJkiSpI4d5SJJmtnIInKTBsWdakiRJ6sgw\nLUmSJHVkmJYkSZI6MkxLkiRJHRmmJUmSpI4M05IkSVJHhmlJkiSpI8O0JEmS1JFhWpIkSerIMC1J\nkiR1ZJiWJEmSOjJMS5IkSR0ZpiVJkqSODNOSJElSR4ZpSZIkqaO1R12AJEmSZrY6tkZdwsDYMy1J\nkiR1ZJiWJEmSOjJMS5IkSR0ZpiVJkqSORhKmk7wvyY+TXJnknCSbjqIOSZIkaXWMqmf6q8CTqurJ\nwH8CbxtRHZIkSVJnIwnTVfWVqnqgXfwu8NhR1CFJkiStjukwZvpVwAWjLkKSJElaVQO7aEuSrwFb\nreChY6rq3HadY4AHgE9Msp0jgCMAtttuuwFUKkmSJHUzsDBdVfMnezzJ4cDzgedU1YSXxamq04DT\nAObMmTNzL58jSZKksTOSy4kneS7w98DeVfX7UdQgSZLWXDP58tYarlGNmT4F2Aj4apIrkpw6ojok\nSZKkzkbSM11VO45iv5IkSdJUmg6zeUiSJEljyTAtSZIkdWSYliRJkjoyTEuSJEkdGaYlSZKkjgzT\nkiRJUkeGaUmSJKkjw7QkSZLUkWFakiRJ6sgwLUmSJHVkmJYkSZI6MkxLkiRJHRmmJUmSpI7WHnUB\nq2Px4sUsXLiQe++9d9SlSNLIrb/++jz2sY9lnXXWGXUpkrTGGOswvXDhQjbaaCNmz55NklGXI0kj\nU1XcdtttLFy4kO23337U5UjSGmOsh3nce++9bL755gZpSWu8JGy++eYeqZOkIRvrMA0YpCWp5eeh\nJA3fWA/zeJip+iKpmvThK6+8kre+9a3cc8893H///Rx00EG86U1vYscdd+S6665bpV2dfPLJHH30\n0Z1LvfDCCzn++OMBOO6449h33307b2t5CxZMTXvOnTs+7fmqV72KCy64gP3335/TTz+983ZWZEhv\nz2nTngsXLuRlL3sZS5cuZenSpZx00knMmTOn07ZWJMdPTYPWsePz/pw/fz4PPPAAv/vd73jzm9/M\noYce2nlbkqSpkVrZN/M0MmfOnLr00ksfXL722mt54hOf+IcVhpBW7rzzTp75zGdyzjnnsMMOO1BV\nfOUrX2Hfffft9OW6qs9ZsmQJs2bNevD2brvtxje+8Q0A9t57by6//PIHH19dwwjT06k9AW666SZ+\n+tOfctZZZ41lmJ5O7XnnnXdy33338ehHP5prrrmG17zmNXzzm99cpf1PZhhhejq1J8D999/Puuuu\ny6JFi9hll1244YY
bHvach30uSqPQ5QNvjPKI1gxJLquqlfYCjf0wj2E777zzOOCAA9hhhx2A5rDq\n8r3BH/3oR3nXu94FNL1zc+fOBeDEE09kjz32YN68eZx00kl88pOf5KabbmLu3Lm8+93vZvHixbz6\n1a9m3rx57LXXXlxyySUAHH744Rx55JE8//nPf0gYue6669h+++3ZdNNN2XTTTZk9e/Yqf7mP2nRq\nT4BtttlmwK94sKZTe26yySY8+tGPBmC99dZj7bXH70DYdGpPgHXXXReAu+++m5133nmQL12S1Kfx\n+3YbsRtvvJFtt92203M/8YlPcNFFF7HRRhuxdOlS1lprLd7xjnewYMECAE499VR23HFHTj/9dH71\nq19x4IEH8u1vfxuAxz3ucZx66qkP2d5tt93Gox71qAeXN910U3772992e2EjMp3acyaYju25ZMkS\njj76aI455phOdY3SdGvPJUuW8OxnP5urr76aE044ofPrkiRNHcP0Ktp222256qqrJl2n9ySg3mE0\n//zP/8zRRx/N4sWLOfLII9lrr70e8rwf/ehHfOc73+HCCy8EmkPMy+y5554P289mm23GHXfc8eDy\nnXfeyWabbbZqL2jEplN7zgTTsT1f85rXsN9++zF//vxVei3TwXRrz1mzZnHxxRdz22238ZSnPIWX\nvOQlbLLJJqv8uqSBc8iG1iAO81hF+++/P1/84he5/vrrH7zvq1/96kPW2WyzzVi4cCEAl1122YP3\n77777pxxxhm8973v5Q1veAMAa6+9NkuXLgVg55135hWveAULFixgwYIFXH755Q8+d0XjoB//+Mdz\nww03sGjRIhYtWsQNN9zAjjvuOHUvdgimU3vOBNOtPd/ylrew9dZbc9RRR03NCxyy6dSeixcvZsmS\nJQBsuOGGrL/++qy//vpT9EolSV3ZM72KNtlkE8466yxe97rXce+993L//ffz4he/mH322efBdfbZ\nZx9OPPFE/uIv/oLddtvtwfsPO+wwbr31Vu69915e97rXAXDQQQex//77s99++/Ha176W17/+9cyb\nNw+AOXPm8L73vW/CWmbNmsUJJ5zw4BjOE044YexC4nRqT4C3v/3tXHDBBdxyyy3Mnz+fc889lw03\n3HAAr3wwplN7XnrppZx00kk84xnPYO7cuWy55ZZ89rOfHdArH4zp1J6//vWvOfTQQ5k1axb33Xcf\n73jHO1hvvfUG9MolSf2aWbN5SNIazs9FSZoazuYhSZIkDZhhWpIkSerIMC1JkiR1NPZh+p577mGc\nxn1L0iBUFffcc8+oy5CkNc5Yz+ax9dZbc9NNN7F48eJRlyJJI7fOOuuw9dZbj7oMSVqjjHWYXnYZ\nbUmSJGkUxn6YhyRJkjQqhmmIcVPyAAAGiElEQVRJkiSpI8O0JEmS1NFYXQExyW+An4+6jhHbArh1\n1EWsQWzv4bK9h882Hy7be7hs7+Gaae39uKracmUrjVWYFiS5tJ9LW2pq2N7DZXsPn20+XLb3cNne\nw7WmtrfDPCRJkqSODNOSJElSR4bp8XPaqAtYw9jew2V7D59tPly293DZ3sO1Rra3Y6YlSZKkjuyZ\nliRJkjoyTE9DSS5Ksu9y970xyRlJPjequma6JFsl+VSS65NcluT8JH+S5OQkVyX5UZLvJ9l+1LWO\nkySV5Kye5bWT/CbJl1byvF2TPK9n+bgkbxlkrTNFkmOSXJ3kyiRXJNmj/Qx5RB/P7Ws9TW5l7/sk\nhyc5ZXQVzgxJlrTv8R8muTzJnqOuaaaa6Dty1HVNB4bp6els4JDl7jsEOKOqDhpBPTNekgDnAAuq\naoeq+nPgbcDBwGOAJ1fVnwEvAu4YXaVj6W7gSUk2aJf3AW7q43m7As9b6Vp6iCRPB54P7F5VTwbm\nAzcCbwT6Ccn9rqfJdX3fa9XcU1W7VtUuNJ/ZJ4y6oJloku/IPxptZdODYXp6+hywf
5J1AZLMpgl0\nNya5qr3v9PbX+BVtb8exI6t2ZpgHLK6qU5fdUVU/pPlCvLmqlrb3Layq20dU4zg7H9i/vX0ozQ9G\nAJI8Ncl/JPlBku8k+dP2vf8PwMHte/zgdvWdkixI8rMkRw/3JYyNrYFbq+o+gKq6FTiI5jPkoiQX\nAST5lySXtj3Yx7f3Hd27XpJZST7ac2Tm70bzksbWhO97DcTGwO0ASeb2Hv1KckqSw9vb701yTXvk\n5v2jKXXsTPQd+a0k7+v5jDgYHmz/i5Oc235evzfJy5Jc0q63w6heyCAYpqehqvotcAmwX3vXIcBn\ngOpZ59VVtSvwApqrDX10yGXONE8CLlvB/Z8BDmgD3T8l2W3Idc0UnwIOSbI+8GTgez2P/Rh4ZlXt\nBrwDeE9V3d/e/nTb6/Tpdt0nAPsCTwWOTbLO0F7B+PgKsG2S/0zy4SR7V9XJwC+BeVU1r13vmPbi\nCk8G9k7y5BWstyuwTVU9qT0yc8YIXs84m+x9r6mxQfv5/GPgdOCdk62cZHOaI4w7t0du3jWEGmeC\nib4jD6T5nNiF5ijY+5Js3T62C3Ak8ETgMOBPquqpNP9Orx94xUNkmJ6+eod6HMIKejTaD+jPAq+v\nqjX9MusDUVULgT+lOZy1FPh6kueMtqrxU1VXArNpeufOX+7hTYDPtkddTgR2nmRT51XVfW1v66/x\nEOPDVNXvgD8HjgB+A3x6WY/ccl6S5HLgBzRtvtMK1vkZ8MdJPpjkucCiwVQ9M63kfa+psWyYxxOA\n5wIfa4ckTORO4F7gI0kOBH4/jCJnsL2As6tqSVX9CrgYeEr72Per6ub2KNn1ND/0AX5E8//FjGGY\nnr7OBZ6TZHfgEVW1ol+EpwL/VlVfG25pM9LVNAHkYdrwdkFV/XfgPcALh1rZzPEF4P08/IfhO4GL\nqupJwAHA+pNs476e20uAtae0whmi/WJbUFXHAkcBf9X7eHsS7VuA57S9c+exgnZvhzTtAiyg6WE6\nfcClz0QTve81xarqP4AtgC2BB3hoxlm/XecBmiNbn6M5t+DCIZc5rib8jpxE7+f10p7lpcywz27D\n9DTV9i5dBPwrK+6Vfh2wUVW9d9i1zVD/DqyX5IhldyR5cpK9kzymXV6L5lCtRwG6+Vfg+Kr60XL3\nb8IfTsw6vOf+u4CNhlDXjNKOOX98z1270rxne9tzY5rzAe5M8kf8YUgZvesl2QJYq6o+D7wd2H3A\n5c9EE73vNcWSPAGYBdxG857fKcl6STYFntOu80hgk6o6H/g7mh+LWrkVfkfSnJB/cHt+xZbAs2iG\nqa5RZtQvgxnobJqzZ5ef2QOaXqXFSa5ol0/tPTFAq6aqKsmLgH9O8laaw4D/RdNr8YEk67WrXgI4\nnVUH7ZCZk1fw0D8CZyZ5O00P6TIXAf+jfY97hn7/Hgl8sA0QDwDX0Qz5OBS4MMkvq2pekh/QjFe/\nEfh2z/NPW7YezcweZ7Q/JKEZ7qRVMMn7HuDwJL1Hup7Wrq/+bdDzPRjglVW1hOaE/c8AVwE30Axn\nguaH4rntMMkAbxp2weNoku/IN9J85vyQ5ryuv6+qW9ofNmsMr4AoSZIkdeQwD0mSJKkjw7QkSZLU\nkWFakiRJ6sgwLUmSJHVkmJYkSZI6MkxLkiRJHRmmJUmSpI4M05IkSVJH/x/wFyTq/kc4lgAAAABJ\nRU5ErkJggg==\n", "text/plain": [ "<Figure size 1200x500 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#Clustering on original X space\n", "km = cluster.KMeans(n_clusters = 4, init = 'k-means++', n_init = 10)\n", "km.fit(dpro)\n", 
"cols = ['r', 'y', 'b', 'g']\n", "w = 0.15 # bar width\n", "\n", "fig = plt.figure(figsize = (12, 5))\n", "# Create the axes once; calling add_subplot inside the loop triggers a matplotlib deprecation warning\n", "ax = fig.add_subplot(1, 1, 1)\n", "for i, cc in enumerate(km.cluster_centers_):\n", "    # Plot each cluster center relative to the class mean\n", "    ax.bar(np.arange(len(cc)) + i*w, cc - dpro.mean(), w, color = cols[i], label='Cluster {}'.format(i))\n", "\n", "# Set tick positions before assigning the skill names as labels\n", "ax.set_xticks(np.arange(len(dpro.columns)) + 2*w)\n", "ax.set_xticklabels(dpro.columns.values)\n", "plt.ylabel('Mean Adjusted Score')\n", "plt.title('Avg Scores by Cluster')\n", "plt.legend(loc = 3, ncol = 4, prop={'size':9})\n", "\n", "#Now cluster on the SVD space\n", "km_u = cluster.KMeans(n_clusters = 4, init = 'k-means++', n_init = 10)\n", "#km_u.fit(pd.DataFrame(U.dot(np.diag(sig))))\n", "km_u.fit(pd.DataFrame(U))\n", "\n", "fig = plt.figure(figsize = (12, 5))\n", "ax = fig.add_subplot(1, 1, 1)\n", "for i, cc in enumerate(km_u.cluster_centers_):\n", "    # Map each center from U space back to the original feature space before plotting\n", "    #ax.bar(np.arange(len(cc))+i*w, cc.dot(Vt), w, color = cols[i], label='Cluster {}'.format(i))\n", "    ax.bar(np.arange(len(cc)) + i*w, cc.dot(np.diag(sig).dot(Vt)) - dpro.mean(), w, color = cols[i], label='Cluster {}'.format(i))\n", "\n", "ax.set_xticks(np.arange(len(dpro.columns)) + 2*w)\n", "ax.set_xticklabels(dpro.columns.values)\n", "plt.ylabel('Mean Adjusted Score')\n", "plt.title('Avg Scores by Cluster Using SVD')\n", "plt.legend(loc = 3, ncol = 4, prop={'size':9})\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>Above we chose $k=4$. Suppose we did so because we want to assign students to study groups of 4. To maximize the skill diversity of each group, we'd form 4 clusters and draw one student from each cluster for every group. The plots above give a sense of how the average student within each cluster differs from the average student in the class. 
When we build our profiles, we'll use the top chart, as its clusters tend to follow a more intuitive line of reasoning (note the subjectivity here)!\n", "\n", "<br><br>\n", "<b><u>Student Profiles</u></b>\n", "<ul>\n", " <li><u>Cluster 0</u> appears to be strongest in business, communication and visualization. </li>\n", " <li><u>Cluster 1</u> is the math/stats group.</li>\n", " <li><u>Cluster 2</u> is the group that tends to rate itself low on all dimensions!</li>\n", " <li><u>Cluster 3</u> is the group that tends to be very confident on all technical subjects!</li>\n", " \n", "</ul>\n", "Note that k-means cluster labels are arbitrary and depend on the random initialization, so the numbering above may change if the cell is re-run.\n", "</p>\n", "\n", "### Hierarchical Clustering\n", "<p>\n", "Above we showed how to compute and evaluate K-Means, and we also came up with a use case for the clustering. That use case (i.e., putting students into study groups of size 4) essentially dictated the choice of $k$. In a more general setting, we might not have such an application-specific best $k$. One way to be more general is to use hierarchical clustering. In this type of clustering, the individual clusters are embedded in a taxonomy (a tree of nested clusters). We can use this taxonomy to see whether any values of $k$ arise naturally from the data. We can also use it to check that each final cluster we choose is well balanced in size. 
Additionally, we can use this to get a sense of any outlier clusters (those with very small counts).<br><br>\n", "\n", "Using Scipy isn't as straightforward as using Sklearn, but again, scipy has a good procedure for displaying the dendrogram.\n", "\n", "</p>" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAsoAAAGpCAYAAACUHhApAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XucJFV99/HPz11WQIibCIJGZTUi\nCggjrICiccF7giR4iYIaL0k2RvMYRI3ES2JMoj5JND4x3jbRoOiqMYIgXhCVxSuXRVfuCCKKgsuC\nKCDCyvJ7/qhqtxnOTPdUdXXP7H7er9e8pqemTp3T1dXV3z596nRkJpIkSZLu7G6TboAkSZI0HxmU\nJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS9IMImJNRGRErJl0WwDqtmREvHHSbZmL\niLiybvdxk26LJM2FQVnS2ETENhFxbERcFBG/iIgbI+J7EXFSRBww6fZtzSLiuDrMXtnB5r8NnAV8\nr4NtS1JnFk+6AZK2Kv8MHF3fvhz4JbAbcDhwInD2hNqlDkTEkszcmJlHTLotktSEPcqSxunI+vc/\nZObumbkPsBR4FH0hOSKeHxFnR8R1EfGriLghIk7t73WOiBV9QxFeHBGnR8QvI+KsiHhYRDwxIi6I\niJsi4rMRsWtf2V8PqYiI/xMRP6zLfjYi7jfbHYiIJRHxhoi4NCJui4jrI2J1f7mI2CUijo+Iq+t1\nro2Ir0bE8wZse5eIeG/dno11uU/Psn7/PljRt/xOQzQiYlFE/GNEXF7fzxsiYl1EvLX+/5XAC+ri\nu03fZkTsGhH/FRE/rtv1g4h4a0TcfYZ9+pqIuBpY39t+/9CLiFjWV8crI+LD9eP044h4/bT7uFe9\n726NiEsi4giHckgaF3uUJY1T7835EyPiHOCczPwJcOa09Q4EHg78EPgR8FDgScCjIuIhdZl+7wZ+\nUG//AOAUYNe6/D2ApwJvA547rdxB9fpXAver1/tkXf9MPgkcBtwBXFiXOxI4OCKmMvOGuj1PB34B\nXADcC3g0cCnw4dJGI+JeVMMTdqsXXU51jj5slrYM66XA64BNdZu3pdqnOwDHUg2NuAewE7Cx/hvg\nxrpdZ9bt+gVwcV32NcBewNOm1fUo4DFU9/XWIdr2FuC6et37Av8QEWdl5mkRsS3wOeD+wO1U+/wj\n2MkjaUw82Ugap3fXvw8CTgauqXtm3xQR2/et9+/AvTJzj8ycAvaul+8I/H5hu8dn5h7Av9R/Pwh4\nc2Y+jCpYATy+UG4RsF9m7gkcUy87ICIOKTU+In6XzcH1qXWP+IOogt4DqAIpwEPq33+Rmftn5jJg\nF+Cdpe3WXsbmkPzcusf9gcD+s5QZVq89H8zMfet99ZvAHwPUQyM+U69zTWYeVP98q69dPwV2z8x9\ngUPrdQ+LiIOn1bUEOCwz9wJ2H6Jta4FlwMOAX9XLeo/VUVQhGeA59eN0BHB3JGkMDMqSxiYz30jV\n03oScGO9+CHAG4AP9a26FDgpIn4aEXcAl/X9776FTfeGJ1xZWHZF/
XuXQrnzMvOS+vbH+5bvXVgX\n7tzTfGpEJHADVU8sVG8A+us+rr5Y8bPAnwNXz7Dd/m1fmZmrewvrsNrWKUACL46IayLiDOCf2PwY\nzKbXrt8Crq7v89f6/n/QtPUvzczPA2TmpiG2/z/1OObrgGvrZb3Hqvc4bAROqLd5KtU+l6TOOfRC\n0lhl5onAiRERwH7Aqvr30yLibsD2wKlUYflWqmEAv2JzYFtU2Gwv8N1eWJYjvQObnV3Y9g/r368D\nvg48mSrsPYZqWMezgKkRtqG//kUAEXHPu6yUeWpE7FfXvy/wCOB3gT+LiD0z86oh6rqZatjGdD+b\n9vf6YRo+Q/ne4xfT1snM7OpxlKQZ2aMsaWzqC8qmoE4+mecCvR7dmzLzDmAPqpAM8OLM3J/NM2WM\n2j4RsUd9+1l9yy+YYf1z+m6/vTdEgWpc7l8D76v/dzBwRma+PDMPBVbWy/etx/yWnFX/XhYRf9Rb\n2NtfM7i27/bv1L/vMsNEROwDbMjM12XmYWwezrED1RhtgFvq39vXb2J6evc5gef13edDgH+lGrPd\nb5SB9vz6990j4mkAEfFkqmEjktQ5g7KkcfpT4NsRsSEizo2IH1CNQwX4aP37CqqLxgDeHxHnAZ/q\nqD23Ad+KiAuBd9TL1mbm6aWVM3MN1cVlAB+LiO9GxPnAz4EzqHrGAd4KXF/PMnEu8IF6+Y+oxvqW\nvIvqgkSAj0fEZRHxPWC2oReXsbkX++0RcTrwnsJ6fwRcVc+mcS6bA2jv4j7Y/IZlZ+CSiDgzIrYD\n/gO4imp8+EURcV5EXEY1/OETbH5T04WP1nUDfLJ+nD5F9bhJUucMypLG6fVU45Nvopo5YVeqsPdm\n4FUA9awRzwIuojpHbeSuMyuMylrgFVQ9qxuBz1ONoZ7NEcDfUQXL3ahmvbiCalaNNfU6H6camrEj\n1ewdN1Hd76fONIQgM6+nGu/7PqpwuAz4DeCzMzUkM28Hnk01PGUR1TjiZxRWPaPeTlANBVkMfAN4\nRt8Y7Q9Q9Q7/nGrc+IHAonrs8EHAf1H1YD+sbtc5wGuZ+1CLoWXmrVRDVr5G1VO9BHg+1f6Eah5u\nSepMOOxL0tYmqq+kfhzV8IgVk22NZhMRuwOX995g1DOPnFH/+yWZ+b4ZC0tSS17MJ0maz/4FmKqH\nuNwDeGy9/GLg+Im1StJWwaEXkqT57HSqCw0PpbpI8krg7cDBmXnLLOUkqTWHXkiSJEkF9ihLkiRJ\nBQZlSZIkqcCgLEmSJBUYlCVJkqQCg7IkSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJ\nkiSpwKAsSZIkFRiUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmSJKnA\noCxJkiQVGJQlSZKkAoOyJEmSVGBQliRJkgoMypIkSVKBQVmSJEkqMChLkiRJBQZlSZIkqcCgLEmS\nJBUYlCVJkqSCxZNuQL+ddtoply1bNulmSJIkaQt27rnnXpeZOw9ab14F5WXLlrF27dpJN0OSJElb\nsIj4wTDrOfRCkiRJKjAoS5IkSQUGZUmSJKnAoCxJkiQVGJQlSZKkAoOyJEmSVGBQliRJkgoMypIk\nSVKBQVmSJEkqMChLkiRJBQZlSZIkqcCgLEmSJBUsnnQDhrFqFaxePf56jzoKVq4cf72SJEmavAXR\no7x6NaxbN946162bTDiXJEnS/LAgepQBpqZgzZrx1bdixfjqkiRJ0vyzIHqUJUmSpHEzKEuSJEkF\nBmVJkiSpwKAsSZIkFRiUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmS\nJKnAoCxJkiQVGJQlSZKkAoOyJEmSVGBQliRJkgoMypIkSVKBQVmSJEkqWNzlxiPiSuAmYBNwe2Yu\n77I+SZIkaVQ6Dcq1QzLzujHUI
0mSJI3MOILyxKxaBatXNyt79tmwcSMsXdqs/NRUs3JHHQUrVzYr\nK0mSpNHpeoxyAl+IiHMjohj/ImJlRKyNiLUbNmwYaeWrV8O6dc3KLlky0qYMZd265sFekiRJo9V1\nj/JjMvPHEXFv4LSIuCQzv9K/QmauAlYBLF++PEfdgKkpWLNm7uVWrKh+NynbVK9OSZIkTV6nPcqZ\n+eP697XAicABXdYnSZIkjUpnQTki7hERO/ZuA08CLuiqPkmSJGmUuhx6sQtwYkT06lmdmZ/vsD5J\nkiRpZDoLypl5BbBvV9ufz5rOttG78LDpWGVnzJAkSRodv5mvA01n25iaaj6tnDNmSJIkjdYWPY/y\nJDWdbaMpZ8yQJEkaLXuUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmS\nJKnAoCxJkiQVGJQlSZKkAoOyJEmSVGBQliRJkgoMypIkSVKBQVmSJEkqMChLkiRJBQZlSZIkqcCg\nLEmSJBUYlCVJkqQCg7IkSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSpwKAsSZIk\nFRiUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSweJJN0CbrVoFq1c3K3v22bBxIyxdOrdyGzdWPwA7\n7NCs7p6pqXblAY46ClaubL8dSZKktuxRnkdWr4Z165qVXbKkWbmNG2HTpmZlR23duuZvFCRJkkbN\nHuV5ZmoK1qyZe7kVK6rfcy3btFwXem2RJEmaD+xRliRJkgoMypIkSVKBQVmSJEkqMChLkiRJBV7M\np8baTGdX0pvxY489YP360W13agquuWZ023QaPEmStg72KKuxNtPZlUxNVT/r18PNN49uu9DNNpty\nGjxJkhYGe5TVStPp7GbTxZR1ToMnSZLmyh5lSZIkqcCgLEmSJBUYlCVJkqQCg7IkSZJU4MV80gBd\nTYM3yov6nG5OkqTRs0dZGqCrafBGxenmJEnqhj3K0hC6mAZvVJxuTpKkbtijLEmSJBUYlCVJkqSC\nzoNyRCyKiG9HxCld1yVJkiSNyjh6lP8KuHgM9UiSJEkj0+nFfBFxP+D3gX8CjumyLmmhGdW0c9On\nm7vmGli/fu7b2bix+tlhh2btaDuTh1PcSZLmm657lN8B/DVwx0wrRMTKiFgbEWs3bNjQcXOk+WNU\n085Nn25u/Xq4+ea5b2fjRti0qX17mnCKO0nSfNRZj3JEHAZcm5nnRsSKmdbLzFXAKoDly5dnV+2R\n5qMupp3r9SzPdbtNy42CU9xJkuajLnuUDwYOj4grgY8Bh0bEhzusT5IkSRqZzoJyZv5NZt4vM5cB\nzwG+nJnP66o+SZIkaZScR1mSJEkqGMtXWGfmGmDNOOqSJEmSRmEsQVnaWs02Bdz0ad2mW0jTpTWd\n6q43lV1vlo6lS4cv25vOrmfYae2ml+s3aBuzlW2yvVFoOy1fFxbSsStJs3HohdSh2aaAmz6tW7+F\nNl1a06nueiF5hx3mHiqbTmfXZhq8SU6ht1AstGNXkmZjj7LUsSZTwC3E6dLa3M8mU9JNYhq8SU6h\nt1AsxGNXkmZij7IkSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFznqhsVt17ipWnz/z/FHr\nfvIOAFYcd3Tx/0c9/ChW7u8krdqCNJ2IepDeRNXjdNPJcEfC4sePt94ujGMi7C3ZfJzkW+O3wCdW\nNyhr7Fafv5p1P1nH1K7lk+jUseWADLDuJ9VkvQZlbVF6E1GPOlj0T1Q9JmviUGATsGhsdUqap3oT\n7BuUpbmZ2nWKNS9cM+dyK45bMfK2SPNCk4moB5nExM9ONi2pZwuYWN0xypIkSVKBQVmSJEkqMChL\nkiRJBQZlSZIkqcCgLEmSJBU464UkjcNscyX3plCa6Qrxcc9D2mZe57PPho0bYenSuZXbuBGWLGl
W\nZ5uybabkW+Dzw0pz0uS8MOjcNpN59NyyR1mSxqE3V3LJ1NTMgW3dum6+jGQ2s7V1kDZh9+abx1+2\nqUk8LtIkNTkvzHZum8k8e27ZoyxJ49JkruRJzUPadF7npvMot5l/eZLzRUtbky7me59unj237FGW\nJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSpwFkvJElbp0nMFw3N531uM180NJ8zeh7NaSuN\nmz3KkqSt0yTmi4bm8z47X7Q0dvYoS5K2XuOeL7pNWeeLlsbOHmVJkiSpwKAsSZIkFRiUJUmSpAKD\nsiRJklRgUJYkSZIKnPVCkqQtXdM5oxfafNGTqHOXXWD9+rmXm5qCa64Zvmz/PNiDyg2aM3vcc2PP\n5fjrTdm4YsVw+2e2+zqC+2mPsiRJW7qmc0YvtPmiJ1Hn+vXjL9umzknMjT2X429qanP4nQf30x5l\nSZK2Bk3mjF5o80UvpDrblB1FneM2qeOvJXuUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJ\nKjAoS5IkSQUGZUmSJKlg6KAcEbtFxBPq29tFxI7dNUuSJEmarKGCckT8GfC/wPvqRfcDPtVVoyRJ\nkqRJG7ZH+WXAwcCNAJl5GXDvrholSZIkTdqwQfm2zNzY+yMiFgPZTZMkSZKkyRs2KJ8REa8FtouI\nJwKfAD7dXbMkSZKkyRo2KB8LbADOB/4c+Czw+q4aJUmSJE3a4iHX2w74QGb+J0BELKqX3dJVwyRJ\nkqRJGrZH+UtUwbhnO+CLsxWIiG0j4uyI+E5EXBgRf9+0kZIkSdK4DdujvG1m3tz7IzNvjojtB5S5\nDTi0Xncb4GsR8bnMPLNpYyVJkqRxGbZH+RcRsV/vj4jYH/jlbAWy0gvX29Q/zpQhSZKkBWHYHuWj\ngU9ExNVAALsCzx5UqB7LfC7wYOBdmXlWYZ2VwEqABzzgAUM2R5IkSerWUEE5M8+JiIcCe9SLLs3M\nXw1RbhMwFRFLgRMjYu/MvGDaOquAVQDLly+3x1mSJEnzwrA9ygCPBJbVZfaLCDLzQ8MUzMyfRcTp\nwFOACwatL0mSJE3aUEE5Io4HfgdYB2yqFycwY1COiJ2BX9UheTvgicD/bddcSZIkaTyG7VFeDuyZ\nmXMZGnEf4IP1OOW7Af+TmafMtYGSJEnSJAwblC+guoDvmmE3nJnnAY9o0ihJkiRp0oYNyjsBF0XE\n2VTzIwOQmYd30ipJkiRpwoYNym/sshGSJEnSfDPs9HBndN0QSZIkaT4Z6pv5IuKgiDgnIm6OiI0R\nsSkibuy6cZIkSdKkDPsV1v8BHAlcBmwH/Cnwrq4aJUmSJE3asEGZzLwcWJSZmzLzv6m+PESSJEna\nIg17Md8tEbEEWBcR/0w1TdzQIVuSJElaaIYNu8+v1/1L4BfA/YGnd9UoSZIkadKGDcp/mJm3ZuaN\nmfn3mXkMcFiXDZMkSZImadig/ILCsheOsB2SJEnSvDLrGOWIOBI4CnhgRJzc96/fAH7aZcMkSZKk\nSRp0Md83qC7c2wl4W9/ym4DzumqUJEmSNGmzBuXM/AHwg4h4AvDLzLwjIh4CPBQ4fxwNlCRJkiZh\n2DHKXwG2jYjfBr5ANQvGcV01SpIkSZq0YYNyZOYtVFPCvTsznwXs1V2zJEmSpMkaOihHxKOA5wKf\nqZct6qZJkiRJ0uQNG5SPBv4GODEzL4yIBwGnd9csSZIkabKG+grrzDwDOKPv7yuAl3fVKEmSJGnS\nBs2j/I7MPDoiPg3k9P9n5uGdtUySJEmaoEE9ysfXv/+164ZIkiRJ88mgeZTPrX+fERE717c3jKNh\nkiRJ0iQNvJgvIt4YEdcBlwLfjYgNEfG33TdNkiRJmpxZg3JEHAMcDDwyM38rM38TOBA4OCJeMY4G\nSpIkSZMwqEf5+cCRmfn93oJ6xovnAX/cZcMkSZKkSRoUlLf
JzOumL6zHKW/TTZMkSZKkyRsUlDc2\n/J8kSZK0oA2aHm7fiLixsDyAbTtojyRJkjQvDJoebtG4GiJJkiTNJwOnh5MkSZK2RgZlSZIkqcCg\nLEmSJBUYlCVJkqQCg7IkSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSpwKAsSZIk\nFRiUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmSJKnAoCxJkiQVGJQl\nSZKkgs6CckTcPyJOj4iLIuLCiPirruqSJEmSRm1xh9u+HXhlZn4rInYEzo2I0zLzog7rlCRJkkai\nsx7lzLwmM79V374JuBj47a7qkyRJkkZpLGOUI2IZ8AjgrML/VkbE2ohYu2HDhnE0R5IkSRqo86Ac\nETsAnwSOzswbp/8/M1dl5vLMXL7zzjt33RxJkiRpKJ0G5YjYhiokfyQzT+iyLkmSJGmUupz1IoD3\nAxdn5tu7qkeSJEnqQpc9ygcDzwcOjYh19c/vdVifJEmSNDKdTQ+XmV8DoqvtS5IkSV3ym/kkSZKk\nAoOyJEmSVGBQliRJkgoMypIkSVKBQVmSJEkqMChLkiRJBQZlSZIkqcCgLEmSJBUYlCVJkqQCg7Ik\nSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSpwKAsSZIkFRiUJUmSpAKDsiRJklRg\nUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmSJKnAoCxJkiQVGJQlSZKkAoOyJEmSVGBQliRJ\nkgoMypIkSVKBQVmSJEkqMChLkiRJBQZlSZIkqcCgLEmSJBUYlCVJkqQCg7IkSZJUYFCWJEmSCgzK\nkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSpwKAsSZIkFRiUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElS\ngUFZkiRJKjAoS5IkSQUGZUmSJKnAoCxJkiQVGJQlSZKkgs6CckR8ICKujYgLuqpDkiRJ6kqXPcrH\nAU/pcPuSJElSZzoLypn5FeCnXW1fkiRJ6tLExyhHxMqIWBsRazds2DDp5kiSJEnAPAjKmbkqM5dn\n5vKdd9550s2RJEmSgHkQlCVJkqT5yKAsSZIkFXQ5PdxHgW8Ce0TEjyLiT7qqS5IkSRq1xV1tODOP\n7GrbkiRJUtcceiFJkiQVGJQlSZKkAoOyJEmSVGBQliRJkgoMypIkSVKBQVmSJEkqMChLkiRJBQZl\nSZIkqcCgLEmSJBUYlCVJkqQCg7IkSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSp\nwKAsSZIkFRiUJUmSpAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmSJKnAoCxJ\nkiQVGJQlSZKkAoOyJEmSVGBQliRJkgoMypIkSVKBQVmSJEkqMChLkiRJBQZlSZIkqcCgLEmSJBUY\nlCVJkqQCg7IkSZJUYFCWJEmSCgzKkiRJUoFBWZIkSSowKEuSJEkFBmVJkiSpwKAsSZIkFRiUJUmS\npAKDsiRJklRgUJYkSZIKDMqSJElSgUFZkiRJKjAoS5IkSQUGZUmSJKnAoCxJkiQVdBqUI+IpEXFp\nRFweEcd2WZckSZI0Sp0F5YhYBLwLeCqwJ3BkROzZVX2SJEnSKHXZo3wAcHlmXpGZG4GPAX/QYX2S\nJEnSyCzucNu/DVzV9/ePgAOnrxQRK4GV9Z83R8SlM20wollDmpabVNmtps4XNa+0aVmPBeucZJ2t\nCm8tO2lrqbNNWevcsupsU9Y625TdbZjiXQbloWTmKmDVpNshSZIk9ety6MWPgfv3/X2/epkkSZI0\n73UZlM8Bdo+IB0bEEuA5wMkd1idJkiSNTGdDLzLz9oj4S+BUYBHwgcy8sKv6JEmSpFGKzJx0GyRJ\nkqR5x2/mkyRJkgoMypI
kSVKBQVmSJEkqmPg8ypI0n0XE/sCjgKXAz4AzM3PtZFulkoh4ZGaeM+l2\nSNpyzMuL+SIigMOAR1O9OK0HPjPMCTAiHj6t3KmZeU3DdrwsM9815Lpzrjci7k51Py8Dvg+8GPgl\n8KHMvHVA2Udk5rcjYjvgJcBD6228NzN/NqDsNsBTgOsz8xsR8TzgnsBHBpUtbOtNmfm3A9Z5OXBK\nZl4xl223aWtEHA58MTNvmWudM2xvLsfCXsCmzLykb9mBmXnWLGXuk5nX1Mf+HwAPo3o8/zczbx9Q\nX+OyM2zvaZn56SHWaxQg63b+HrAJ+EJm3lEv/4PMPGmWcr8FPBe4HjgBeDXwG8C7M/P7A+pc2jte\nIuIwYG/ge1T7aMaTYET8G3B34IvAz+v6ngDcnpl/NcR93aFe/6H1okuAL2XmTUOUneovl5nrBpWZ\nVn5v6vs55Lmz0T6q19+f6ptYr6c6p/0yM78wRJ2NnqcRUfo0NIDPZ+YTh9zGnPZv23NKk+dLm2O+\nr84mj0vjY6EuM9d9uwj4Q6btH+BTw5zDRvy6P8xrWuP2Nj3/zbK9od8ctj2n9G1n4Othm+dLm/Nm\nXX4k9/PX25unQfm/gAuB7wCHAjsCPwVuy8y3zlLurcB2dblDgFupDsZvZOaHBtT5VaC3M3rfd7gX\ncEFm/u6Aso3qjYhPAd+imj7vEOBTwI3AkzPzWQPq/HJmHhoRHwS+CXwZmAJemJm/N6DsiVTzXC8F\n9gc+C1wHHJWZT56l3A+BHwJ3MId9FBFXUO2bXYHPAydk5vmztXEEbb0a+AHVSfNE4OTMvGHIOtsc\nC28DdgF+BewEvDgzN/Qer1nK9R7P/0f1Zqn3eC7PzD8aUGejshHxoNJi4LjMfOyAOhsHyIj4MFWQ\nv70u86eZeekQ++gLwHFUx8JLgDdSBYC/z8wVA+rs7aO31OVPAg4G7peZL5ql3FdKj/lMy6et8zaq\nY+BMoPcm8UFUL+jXZuYrZyj3aqoX4O9S7adeud2Bb2bmv8xS5+cz8ykRcTTweOAz9f38UWb+zYD2\nNt1H76c6bm4D7k31xVI3AvfOzJUD6mz0PI2IW6j2a3Dn5+o+mXmvAWVfDRzE5g4KGGL/tjynNHq+\ntDzm2zwuTY+Fpvv2eOA84Evcef/sm5nPG9DWNq/7TV/T2rS36fmv8ZvDlueURq+HLZ7bjc6bddlG\nx99AmTnvfoDTp/39pfr3aQPKfWna36fVv784RJ2voDohrehb9rkh29uo3v77CZwz0/ZmqpPqoD2V\n+g1PvfyMuexfqoO9uN8L5Z4OfAR4IbB42H3U2y6wPfAM4MPAWuCfO2xrr84HAq8E1tT76qUdHwtf\n6bu9T13vcuDLA8p9sXTMDLqfbcpSvWh+APjvaT8/msv9HGb5tHXW9N2+L/AF4PAh9tEZfbcvmuM+\n+vL0bUxvywzl3g68D3gm8KT693uAdwxR584N/7dPk/9Nv5/A3fqWf63DfdT/uJw/x8fl9Pr3nJ6n\nwLnAPQvLZ32NaLN/W55TGj1fWh7zbR6XpsdC03371bksn7ZOm9f9pq9pbdq7pu/2XM5/t1B1gpxe\n/+7dvn6IOtucUxq9HrZ4bjc6b7a9n7P9zNcxyudHxHuo3rGtoDoYYPCY6msj4jV95S6qly8aVGFm\n/lv9DYJ/EhEvAVbPob1N613Sd/ulfbcHthd4C/A/VB/5rImIr1F95H7iEGV/ERGvB+4BXB8Rr6Tu\nsZ+tUGaeAJwQEU8Fjo+IbwIehEpCAAALzklEQVTbDFFfr/wtwCeBT0bEYqpPC4Zt6w5zaWtfnd8H\n3ga8LSJ2oRqaMKhMm2NhUUQsycyNmXleRBxB9cZgrwHlPlh/knJV3eNwBrAv1RuKQUpl9xmi7AXA\nazJzQ//CiPj4EHWujYj3AadRBe5ej8q3hih7t4jYMTNvysyr6492V1F9YjCbH9b3cxHVO
eKdVMfC\ndUPUuV9EfAXYs/exct1Ds+NshTLzmIh4BFUvxe5UvUerMvPbgyrM6pOEOQ/DqY+b4kfmmXnegGr3\njIgPAb9D1YP5y3r5toPaS8N9xJ3Py6/tux3TV5xJg+fpYWy+b/2eOkRdd9mHvY+Sh9i/jc4plJ8v\nj2fw86XNMd/mcdmv7kl82ByfL0337ckRcQpVmOrtn8cBA4eA0e51v+lr2kkt2tv0/HcxcERm/rx/\nYUScNqjCvnPKXYb+DDrmW74ezvn50vS8WZdt9dyeybwcegEQEY+h+hjlpN4djIhHZ+Y3ZimzGDia\n6mOYi6g+TvkF1bvpoccr1dt5PrAH8MkcMP4nqvFKR1C9c7oU+HRmZkTcNzOvnqXcvYCfZt+DENWY\n3Kkh6jycqlf5AKqP+n9O1Ss98AQa1bjmp1CNObsMeAHVyXPtoHqnbecQYO/MfOeA9fbNzO/Ut+c6\nbvI3qMaCfZ8q+L0auAE4Pmcfo/yknDYWL4YcZxx946LrOl8DvBx48PSTVKHsAVTHzfdy85jqpVTv\n+j86oOx9gSez+fH8Rm+/DdHm/rI/o/qYaday9XG+Bw1OSPV6vQC5tG7vN4cJkBGxDLihcMI/KDPP\nnKVcUA0p+THVMJ69qQLhP2U9zm9AvXtT3deL67+3p+plmLHONqL5MJw2H5nv1vfn1Zn5q6jG+z02\nMz/X4D5sD+w+27FUv6hdkpmb+pYtAZ6SmScP2P6TM/PUubarjRYfJbdqa9/z5Z5Uz5czBz1f6mP+\nsXV7v0b1HA/gxsz8+oCyjR+XGbY3zLHQZtjazlSfvu1P9dp0+ZCvEUuA5wC7AedThdYdqYb4DbpO\n6E7jaOfwmnY41Zuch7P5/HfO9E6HGcouoz7/9b8eAosGnP/uS5UxLqPvDTRVz+2vBtTZ6lqLvu38\nOhtl5rED1m30fGl63qzLNj7+Zt3ufAzK9Y66N9UYnnG9wLS6OKR+t/ZoqpPgnK+MbxAg24yXG8WF\nMEO3N+48bvIJwCkMP26y6RjlNifsUp3XA0fOVmeb9tZle72IP6U6Cd4yPewPKNu7mOUGhrtQqPEJ\naYbtdX3B40zjb6/KzNfOVK4u2+ic0kb0jWOOiH2AfwdeRTXkaLbz2BmZ+bj69vmZ+fD69umZeUjD\ntsxpNoi5no8K5edyLDS64KypiHgF1ac1x2XmmnrZ5zJzYG/0DNsbat+2eI42Pm7b7NuGz9FG+3bE\nrxGfobq/w5xzm46jbfP62+g6gpb5pvG1FrNsc+BxP1Mv9rBtnct5s15/pM/tnvk69OKR03bUJyLi\nVUOUe/C0F5hn1LdPn70YADez+eIQqEJWUH2EPav63doSqh7ei6jerb0oIp4/27u1GZ4wL4+IgScH\n4NLMPCQiHkg1zurEiLiNqgf+3QPK9u4r9X0c6r7OcDIbpr29ISZHAIfUvX/vjWq4yCBLM/PNdf0X\nZObb6tsvHFDuBJo/YWaq8wVdlZ3pJBgRzxziJNjfWzD08Ufz59n0NyLUbd8zIp49xBuRXwf0iPj1\niz/VcKLZToRtjqPG97WF6cNwng4cz+BhOI0/Mp/lTfCbgUEX/DQ6H830pnTIY6Hxcd9UNvwoueW+\nbfQaQbvnaJtzSqPnaNN9y+heI87PzLfXt184RNmmr6NtXn+b3tc2+abp0J+ZjnsYcNxPe126mOGP\n+abnzdbDRGbb8Lz7Ab4OLOn7+zepTtrrB5Xru/20vttrhqizzcUhTS/UaHPhzemFZbsAK7u6r03b\nC/wE+BDwI2C7vuVrh2jrKcDrqU7QZ1BdFPAiqt7vQWWXAH8BfIzqYolhL8hrU2ejsrS78Kbp8dfo\neVavO4kLHtscR43va9MfqmFR9562bBHVx8T7zlLuMVQfw/Yv26Y+hmcsV6/X5oKfps/vNsdC4+O+\nxeOyb9/txfXz8y3T/zfifTuJ5+hIzilzfI423beTe
o24y75giNfRpuXa3Fda5Jt6vUdQvR4eW/9+\nxJDlGh33LY75RufNNsffoJ/52qP8Cqqu+msBMvOGqMYEzTplGrAyIhZl5qas54Gt3128fYg6G18c\nQvN3a20uvLnLNHmZuZ7qooBBmt7Xpu09sP79BqqPEIlq3OQbhmjrs9g8nvpNVOOptwWePahgZm4E\n3hMR/0k1rmqo8b5t6mxRts2FN02Pv6bPM3IyFzy2OY4a39cWHgscGxGXAVfWy5ZRjQv/CjMfjwcC\nx0wr90DgIQPKQYsLfmj4/G55LLS+ELCBJ0TE31GN9byyrmtR/RH+V5l5/7bZt2N/jtJu3zZ9jjbd\ntxN5jaD562ib19+m97VNvoHqm5gXU73pXsRwkwZA8+O+6THf9LwJzY+/Wc3LMcoLURQubKKabmbG\nMTxx5wtvrsnMjfUT5pjMfFOnDW5gobV3IYmZL7w5dph92+T4G5WY+8WvBwBXZua107bx2i3tOIqI\noBoCtHu96DLgOzngxNui3H2oeno2Tls+zHjC1s/vBsdCq+O+qSb7t82+rdcb63O0zb5t8xxteuyq\nO9OG/sx13vs255RGx3ybY6iL48+gPAKzjF2b9eK4puUmZaG1dyFps28n8bh00F6ovk3L46iFSRxH\nC+3YbWqh3U+fo+qJdl+ctFXkm9nM16EXC03/xXE9w1wI2H8BYf/FMAMvIJyQhdbehaTxxaQ0P/7a\nGEV7PY5Gr82+bVp2VMfCXMuO2yj2bT+foxqXxhfz0f680G9BHkMG5dFoOoanzZi3SVho7V1I2uzb\nSTwuC629W4tJPC5by7Gw0O7nQmuvOpItvjiJrSffzMihFyMwyxiexZl5+6jLTcpCa+9C0mbfTuJx\nWWjt3VpM4nHZWo6FhXY/F1p7NX5DjjPeKvLNbAzKkiRJW6gtabzwJDj0QpIkacu1kK4FmHcMypIk\nSVuui4GnZ+bP+hcuxPHCk+DQC0mSpC1URDya6pv/7jJeGNgrMxt9EcfWwh5lSZKkLdejgFfFXb/t\nc3dafGPd1sIeZUmSpC2Y35jYnEFZkiRJKpjpayolSZKkrZpBWZIkSSowKEvSmEXE6yLiwog4LyLW\nRcSBHda1JiKWd7V9SdqSOeuFJI1RRDwKOAzYLzNvi4idgCUTbpYkqcAeZUkar/sA12XmbQCZeV1m\nXh0RfxsR50TEBRGxqr5Kvdcj/G8RsTYiLo6IR0bECRFxWUT8Y73Osoi4JCI+Uq/zvxGx/fSKI+JJ\nEfHNiPhWRHwiInaol781Ii6qe7j/dYz7QpLmNYOyJI3XF4D7R8R3I+LdEfG4evl/ZOYjM3NvYDuq\nXueejZm5HHgvcBLwMmBv4IURca96nT2Ad2fmw4AbgZf2V1r3XL8eeEJm7gesBY6pyx9B9cUD+wD/\n2MF9lqQFyaAsSWOUmTcD+wMrgQ3AxyPihcAhEXFWRJwPHArs1Vfs5Pr3+cCFmXlN3SN9BXD/+n9X\nZebX69sfBh4zreqDgD2Br0fEOuAFwG7Az4FbgfdHxNOBW0Z2ZyVpgXOMsiSNWWZuAtYAa+pg/OfA\nPsDyzLwqIt4IbNtX5Lb69x19t3t/987j0yfFn/53AKdl5pHT2xMRBwCPB54J/CVVUJekrZ49ypI0\nRhGxR0Ts3rdoCri0vn1dPW74mQ02/YD6QkGAo4CvTfv/mcDBEfHguh33iIiH1PXdMzM/C7yC6tu7\nJEnYoyxJ47YD8M6IWArcDlxONQzjZ8AFwE+Acxps91LgZRHxAeAi4D39/8zMDfUQj49GxN3rxa8H\nbgJOiohtqXqdj2lQtyRtkfwKa0la4CJiGXBKfSGgJGlEHHohSZIkFdijLEmSJBXYoyxJkiQVGJQl\nSZKkAoOyJEmSVGBQliRJkgoMypIkSVLB/we3BWxmvoBUBQAAAABJRU5ErkJggg==\n", "text/plain": [ 
"<Figure size 1200x600 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from scipy.spatial.distance import pdist, squareform\n", "from scipy.cluster.hierarchy import linkage, dendrogram\n", "\n", "#This function gets pairwise distances between observations in n-dimensional space.\n", "dists = pdist(dpro)\n", "\n", "#This function performs hierarchical/agglomerative clustering on the condensed distance matrix y.\n", "links = linkage(dists)\n", "\n", "p = 46\n", "#Now we want to plot the dendrogram\n", "fig = plt.figure(figsize = (12, 6))\n", "den = dendrogram(links, truncate_mode = 'lastp', p = p)\n", "plt.xlabel('Samples')\n", "plt.ylabel('Distance')\n", "plt.suptitle('Samples clustering', fontweight='bold', fontsize=14);\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>Let's try to understand the output of the linkage function. The first few records look like:\n", "</p>" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 16. , 121. , 1.73205081, 2. ],\n", " [ 93. , 110. , 1.73205081, 2. ],\n", " [ 13. , 159. , 1.73205081, 3. ],\n", " [ 35. , 157. , 1.73205081, 2. ],\n", " [ 92. , 130. , 1.73205081, 2. 
]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "links[:5,:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>The way we read this is as follows:\n", "<ul>\n", " <li>At the lowest level, each record is assigned to its own cluster, and the cluster number is just its original index (0 through n-1).</li>\n", " <li>The linkage function returns an (n-1) by 4 matrix L where:</li>\n", " <ul>\n", " <li>The ith row (counting from 0) corresponds to a new cluster, whose id is n+i</li>\n", " <li>L[i,0] and L[i,1] are the ids of the two clusters that are joined to make cluster (n+i)</li>\n", " <li>L[i,2] is the distance between clusters L[i,0] and L[i,1]</li>\n", " <li>L[i,3] is the number of original records in cluster (n+i)</li>\n", " </ul>\n", "</ul><br>\n", "In the rows above, records 16 and 121 are combined first (row 0), followed by records 93 and 110 (row 1). In row 2, record 13 is joined with id 159 and the new cluster has 3 members, so 159 must refer to a 2-member cluster created in one of the earlier rows rather than to an original record. Because the dendrogram above is truncated (truncate_mode = 'lastp'), early merges like these are contracted into its leaf nodes.\n", "\n", "</p>" ] }, { "cell_type": "code", "execution_count": 191, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Note code to follow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>If we want to use a hierarchical clustering technique to get a specific number of clusters, we can use <a href=\"http://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html\">sklearn.cluster.AgglomerativeClustering</a> (with its default Ward linkage) for a concise process that returns exactly what we need. 
We'll continue with our student study group example and choose $k=4$.\n", "\n", "\n", "</p>" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "AgglomerativeClustering(affinity='euclidean', compute_full_tree='auto',\n", " connectivity=None, linkage='ward', memory=None, n_clusters=4,\n", " pooling_func=<function mean at 0x10c8ca950>)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Agglomerative clustering with k=4 clusters (Ward linkage is the default)\n", "ka = cluster.AgglomerativeClustering(n_clusters = 4)\n", "ka.fit(dpro)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>Let's do another plot of mean-adjusted centroids to get a sense of what each cluster represents. AgglomerativeClustering does not return the centroids, so we'll have to compute them ourselves from the cluster labels (ka.labels_).\n", "</p>" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/briand/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. 
Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.\n", " warnings.warn(message, mplDeprecation, stacklevel=1)\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtMAAAE/CAYAAACEmk9VAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xu4XHV97/H3h3AVYxAIBbkYCrQq\nlFujWORI0FBFoBeKCCpIrQfxEcGqR48Hj0i9YKWVgmg5HBRQFBUtB5VLQUvwVsWAiFxsC1JLUOQi\nEEDAkHzPH7N23ITsncliz23n/Xqe/WTWmjXr910/hpnv/NZ3/VaqCkmSJEmrb61BByBJkiSNKpNp\nSZIkqSWTaUmSJKklk2lJkiSpJZNpSZIkqSWTaUmSJKklk2lJWgMlmZOkkqw9ndqSpH4zmZa0Rkiy\nIMl9SdbrYRt/muS6JIuT3JPkX5Js26v2hkmS30tyQXPcDyS5PsnbksyYwjYWJHnDVO1PkqaCybSk\naS/JHOC/AQX8SY/a2B74NPB2YBawLfBxYOkUtpEkQ/e5nWQ74PvA7cAfVNUs4JXAXGDmIGMbbyoT\ne0kaM3QfypLUA0cA3wPOAV43tjLJHknuHJ9kJfnzJNc3jzdIcm4zon1zkncmWTRBG7sCt1XVN6rj\nwar6clX9V7OvGUn+V5JbkzyY5JokWzfP7ZnkB82I7g+S7DkungVJPpjkO8Cvgd9NMivJJ5P8Iskd\nST4wdgxJtk9yVbOve5J8YRV98/okP2/29Y5mH5sn+XWSTcbFsXuSu5Oss5J9nAh8t6reVlW/AKiq\nf6uqV1fV/StunOQ/k8wft/y+JOc1j9dPcl6Se5Pc3/TH7yT5IJ0fRKcneSjJ6c32z0lyRZJfJfm3\nJIeM2+85Sf4xySVJHgb2WUVfSNJqM5mWtCY4Avhs8/eyJL8DUFXfBx4GXjJu21cDn2senwDMAX4X\n2Bd47SRtXAs8J8kpSfZJ8vQVnn8bcBjwCuAZwOuBXyfZGLgYOA3YBPgocPH4RBY4HDiKzijvz+j8\nKHgc2B7YDfhjYKz84f3A5cAzga2Aj00SM3QSzB2afbwryfyquhNYABwybrvDgc9X1ZKV7GM+8KVV\ntNOt19EZ2d+aTn8cDTxSVccD3wKOqaqnV9UxSTYErqDz32sz4FDgE0meN25/rwY+SKfvvj1FMUrS\ncibTkqa1JHsBzwa+WFXXALfSSbDGnE8nySXJTDrJ7vnNc4cAH6qq+6pqEZ2Ed6Wq6qfAPGBL4IvA\nPc3I6FhS/QbgPc2IbVXVj6rqXmB/4D+q6jNV9XhVnQ/8BDhw3O7Pqaobq+pxYOMmxrdW1cNVdRdw\nCp1EEmBJc7zPqqpHq2pVCeSJzX5+DJw91hfAuTQ/HppR78OAz0ywj02AX6yinW4tafa3fVUtrapr\nqmrxBNseAPxnVZ3d9N0PgS/TKTEZc1FVfaeqllXVo1MUoyQtZzItabp7HXB5Vd3TLH+OcaUezfJB\nzYWJBwHXVtXPmueeRacOeMz4x09SVd+rqkOqajadkoQXA8c3T29NJ5Ff0bPojDaP9zM6SfnK2n02\nsA7wi6YM4n7g/9AZmQV4JxDg6iQ3Jnn9ZDGvsO+fNfEAXAQ8r7mAcl/ggaq6eoJ93AtssYp2uvUZ\n4J+BzzflJx+ZoLQEOn2xx1g/NH3xGmDzcdtM+t9Mkp4qpymSNG0l2YDO6PKMJHc2q9
cDNkqySzM6\nfFOSnwH78cQSD+iMtm4F3NQsb91t21X1gyT/BOzUrLod2A64YYVNf04nKRxvG+Cy8bsb9/h24DFg\n02akesV27wT+Oywflf96km9W1S0ThLo1nZHwsXZ/3uzn0SRfpDM6/RwmHpUG+DrwF3RGtrvxMPC0\nccvLk9+mjORE4MTmwtFLgH8DPskT+wE6fXFVVe07SVsrvkaSppQj05Kmsz+jM5vG8+hcILgr8Fw6\ntbdHjNvuc8BxdEaSLxi3/ovAu5M8M8mWwDETNZRkryT/PclmzfJz6Mwc8r1mk7OA9yfZoZmVY+em\nLvoS4PeSvDrJ2kle1cT7tZW101zgdznw90mekWStJNsl2btp95VJtmo2v49OMrlskj7630melmRH\n4C+B8Rcsfho4sjmOyZLpE4A9k5ycZPMmju2bCwk3Wsn21wGHJlknyVzg4LEnmnrzP2hKSxbTKfsY\ni/+XdOrXx3yNTt8d3uxrnSTPT/LcSWKVpCllMi1pOnsdcHZV/VdV3Tn2B5wOvCa/vYnI+cDewL+M\nKwcB+BtgEXAbndHXL9EZFV6Z++kknT9O8hCdkeULgY80z3+UTnJ+OZ0k8ZPABk3d9AF0ptS7l06Z\nxgErxLGiI4B16YyY39fENVZm8Xzg+00MXwGOa+q5J3IVcAvwDeDvqurysSeq6jt0EtnxpS9PUlW3\nAn9E52LNG5M8QKd2eSHw4Epe8r/pjNLfR2cUevzZgM2b41kM3NzEN5bInwocnM7sKqdV1YN0Lpw8\nlM6I+p3A39I5+yBJfZEqz4BJUjeSvAk4tKr2HnQs/ZLkX4DPVdVZg45FkoaRI9OSNIEkWyR5UVNK\n8ft0Ro8vHHRc/ZLk+cDuPLH0Q5I0jhcgStLE1qUzU8a2dMo4Pg98YqAR9UmSc+nUnB/XlFNIklbC\nMg9JkiSpJcs8JEmSpJZMpiVJkqSWRqpmetNNN605c+YMOgxJkiRNc9dcc809zR1tJzVSyfScOXNY\nuHDhoMOQJEnSNNfcHXeVLPOQJEmSWjKZliRJkloymZYkSZJaMpmWJEmSWjKZliRJkloymZYkSZJa\nMpmWJEmSWjKZliRJkloymZYkSZJaMpmWJEmSWhqp24lLkjTsknavq5raOCT1hyPTkiRJUksm05Ik\nSVJLJtOSJElSSwNLppOsn+TqJD9KcmOSEwcViyRJktTGIC9AfAx4SVU9lGQd4NtJLq2q7w0wJkmS\nJKlrA0umq6qAh5rFdZo/r2WWJEnSyBhozXSSGUmuA+4Crqiq769km6OSLEyy8O677+5/kJIkSdIE\nBppMV9XSqtoV2Ap4QZKdVrLNmVU1t6rmzp49u/9BSpIkSRMYitk8qup+4Erg5YOORZIkSerWIGfz\nmJ1ko+bxBsC+wE8GFY8kSZK0ugY5m8cWwLlJZtBJ6r9YVV8bYDySJEnSahnkbB7XA7sNqn1JkiTp\nqRqKmmlJkiRpFJlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5Ik\nSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2ZTEuSJEktmUxLkiRJLa096AAkSZIWLEir182bV1McibR6\nTKa1xku7z2/Kz29JktZ4lnlIkiRJLZlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2ZTEuSJEkt\nmUxLkiRJLZlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2t\nPegA9GRJu9dVTW0ckiRJmpwj05IkSVJLJtOSJElSSybTkiRJUksDS6aTbJ3kyiQ3JbkxyXGDikWS\nJElqY5AXID4OvL2qrk0yE7gmyRVVddMAY5IkSZK6NrCR6ar6RVVd2zx+ELgZ2HJQ8UiSJEmrayhq\nppPMAXYDvj/YSCRJkqTuDTyZTvJ04MvAW6tq8U
qePyrJwiQL77777v4HKEmSJE1goMl0knXoJNKf\nrap/Wtk2VXVmVc2tqrmzZ8/ub4CSJEnSJAY5m0eATwI3V9VHBxWHJEmS1NYgR6ZfBBwOvCTJdc3f\nKwYYjyRJkrRaBjY1XlV9G8ig2pckSZKeqoFfgChJkiSNKpNpSZIkqSWTaUmSJKklk2lJkiSpJZNp\nSZIkqSWTaUmSJKklk2lJkiSppYHNMy1J6o+0nNG/amrjkKTpyJFpSZIkqSWTaUmSJKklyzwk9V2b\nsgNLDiRJw8iRaUmSJKklk2lJkiSpJZNpSZIkqSWTaUmSJKklL0CUJEkjy3nUNWiOTEuSJEktdZ1M\nJ3laLwORJEmSRs0qk+kkeya5CfhJs7xLkk/0PDJJkiRpyHUzMn0K8DLgXoCq+hHw4l4GJUmSJI2C\nrso8qur2FVYt7UEskiRJ0kjpZjaP25PsCVSSdYDjgJt7G5YkSZI0/LoZmT4aeDOwJXAHsGuzLEmS\nJK3RJh2ZTjIDOLyqXtOneCRJkqSRMenIdFUtBV7dp1gkSZKkkdJNzfS3k5wOfAF4eGxlVV3bs6gk\nSZKkEdBNMr1r8+/fjFtXwEumPhxJkiT1w4IFq38v9nnzvA/7ilaZTFfVPv0IRJIkSRo13dwBcVaS\njyZZ2Pz9fZJZ/QhOkiRJGmbdTI33KeBB4JDmbzFwdi+DkiRJkkZBNzXT21XVX4xbPjHJdb0KSJIk\nSRoV3YxMP5Jkr7GFJC8CHuldSJIkSdJo6GZk+k3AuePqpO8DjuxZRJIkSdKI6GY2j+uAXZI8o1le\n3POoJEmSpBHQzWweH0qyUVUtrqrFSZ6Z5ANT0XiSTyW5K8kNU7E/SZIkqZ+6qZner6ruH1uoqvuA\nV0xR++cAL5+ifUmSJEl91U3N9Iwk61XVYwBJNgDWm4rGq+qbSeZMxb6GUZs7C3V4dyFJkqRR0E0y\n/VngG0nG5pb+S+Dc3oUkSZIkjYZuLkD82yQ/AubTGTJ9f1X9c88jayQ5CjgKYJtttulXs5IkSdIq\ndVMzTVVdBpwEfBe4p6cRPbntM6tqblXNnT17dj+bliRJkiY1YTKd5GtJdmoebwHcALwe+EySt/Yp\nPkmSJGloTTYyvW1VjU1Z95fAFVV1ILAHnaT6KUtyPvCvwO8nWZTkr6Ziv5IkSVI/TFYzvWTc45cC\n/xegqh5MsmwqGq+qw6ZiP5IkSdIgTJZM357kLcAiYHfgMlg+Nd46fYhNkiRJQyQtZ/2taTzr72Rl\nHn8F7AgcCbxq3I1bXgicPdGLJEmSpDXFhCPTVXUXcPRK1l8JXNnLoCRJkqRR0NXUeJIkSZKezGRa\nkiRJaslkWpIkSWppwprpJB+jc/vwlaqqY3sSkSRJkjQiJhuZXghcA6xPZ2q8/2j+dgXW7X1okvom\nafcnSdIabrLZPM4FSPImYK+qerxZPgP4Vn/CkyRJkoZXNzXTzwSeMW756c06SZIkaY022R0Qx3wY\n+GGSK4EALwbe18ugJEmSpFGwymS6qs5OcimwR7PqXVV1Z2/DkiRJkobfKss8kgSYD+xSVRcB6yZ5\nQc8jkyRJkoZcNzXTnwD+CDisWX4Q+HjPIpKk6c7ZUyRp2uimZnqPqto9yQ8Bquq+JE6NJ0mSpDVe\nNyPTS5LMoLmBS5LZwLKeRiVJkiSNgG6S6dOAC4HNknwQ+DZwUk+jkiRJkkZAN7N5fDbJNcBL6UyN\n92dVdXPPI5MkSZKG3CqT6SSfqarDgZ+sZJ0kSZK0xuqmzGPH8QtN/fQf9iYcSZIkaXRMmEwneXeS\nB4Gdkyxu/h4E7gIu6luEkiRJ0pCaMJmuqpOqaiZwclU9o/mbWVWbVNW7+xijJEmSNJS6KfP4WpIN\nAZK8NslHkzy7x3FJkiRJQ6+bZPofgV8n2QV4O3Ar8OmeRiVJkiSNgG6S6cerqoA/BU6vqo8DM3sb\nliRJkjT8ur
md+INJ3g28FnhxkrWAdXobliRJkjT8uhmZfhXwGPBXVXUnsBVwck+jkiRJkkZAN3dA\nvBP46Ljl/8KaaUmSJKmrOyA+CFSzuC6dEo+HqmpWLwOTJEmShl03I9PLLzZMEjoXIr6wl0FJkiRJ\no6CbmunlquP/AS/rUTySJEnSyOimzOOgcYtrAXOBR3sWkSRJkjQiupka78Bxjx8H/pNOqYckSZK0\nRuumZvov+xGIJEmSNGomTKaTvLOqPpLkY/x2No8xBfwKOK+qbm3beJKXA6cCM4CzqurDbfclSZIk\n9dtkI9M3N/8unOD5TYB/AnZp03CSGcDHgX2BRcAPknylqm5qsz9JkiSp3yZMpqvqq82/5060TZKH\nn0LbLwBuqaqfNvv6PJ1abJNpSZIkjYTJyjy+ypPLO5arqj+pqv/zFNreErh93PIiYI+nsD9JkiSp\nryYr8/i75t+DgM2B85rlw4Bf9jKo8ZIcBRwFsM022/Sr2RWDaPWyeTXhb5FJtXwZObFdnHVCywZ7\nqUWfL7iyXVPVssOnVX+37IMFC1r2QYv27O/+9je06/Oh7O+Wn+F+pjwFLfrc78ynoI95iv39ZJOV\neVwFkOTvq2ruuKe+mmSiOurVcQew9bjlrZp1K8ZxJnAmwNy5c4e/RyVJkrTG6OYOiBsm+d2xhSTb\nAhtOQds/AHZIsm2SdYFDga9MwX4lSZKkvujmpi1/DSxI8lMgwLOBNz7Vhqvq8STHAP9MZ2q8T1XV\njU91v5om2pxHankKXNI01/a8tJ8pkrrQzU1bLkuyA/CcZtVPgGVT0XhVXQJcMhX7kiRJkvqtmzIP\nquox4HpgU+ATdGbekCRJktZoq0ymk7wwyWnAz4CLgG/y21FqSZIkaY01YTKd5ENJ/gP4IJ1R6d2A\nu6vq3Kq6r18BSpIkScNqsprpNwD/Dvwj8NWqeiyJU9NJkiRJjcnKPLYAPgAcCNya5DPABkm6mQFE\nkiRJmvYmu2nLUuAy4LIk6wEHABsAdyT5RlW9uk8xSl2ZN88TJ5Ikqb+6GmVuZvP4MvDlJM8A/qyn\nUUmSnsQfjJI0fFa7ZKOqFgOf7kEskiRJ0kjpap5pSZIkSU9mMi1JkiS11FWZR5I9gTnjt68qSz0k\nSZK0RltlMt1MibcdcB2wtFldWDctSZKkNVw3I9NzgedVlZeRS5IkSeN0UzN9A7B5rwORJEmSRk03\nI9ObAjcluRp4bGxlVf1Jz6KSJGnAnNdbUje6Sabf1+sgJEmSpFG0ymS6qq7qRyCSJEnSqFllzXSS\nFyb5QZKHkvwmydIki/sRnCRJkjTMurkA8XTgMOA/gA2ANwAf72VQkiRJ0ijo6g6IVXULMKOqllbV\n2cDLexuWJEmSNPy6uQDx10nWBa5L8hHgF3gbckmSJKmrpPjwZrtjgIeBrYG/6GVQkiRJ0ijoZjaP\nnyXZANiiqk7sQ0ySJEnSSOhmNo8DgeuAy5rlXZN8pdeBSZIkScOumzKP9wEvAO4HqKrrgG17GJMk\nSZI0ErpJppdU1QMrrPMeq5IkSVrjdTObx41JXg3MSLIDcCzw3d6GJUmSJA2/bkam3wLsCDwGnA8s\nBt7ay6AkSZKkUdDNbB6/Bo5v/iRpuXnzrPiSJK3ZJkymVzVjR1X9ydSHI0mSJI2OyUam/wi4nU5p\nx/eB9CUiSZIkaURMlkxvDuwLHAa8GrgYOL+qbuxHYJIkSdKwm/ACxKpaWlWXVdXrgBcCtwALkhzT\nt+gkSZKkITbpBYhJ1gP2pzM6PQc4Dbiw92FJkiRpuqgTpu8F65NdgPhpYCfgEuDEqrphqhpN8ko6\nd1Z8LvCCqlo4VfvuiZq+bwBJkiS1N9k8068FdgCOA76bZHHz92CSxU+x3RuAg4BvPsX9SJIkSQMz\n4ch0VXVzQ5dWqupmgMQJQiRJkjS6epYwT5UkRyVZmGTh3XffPehwJEmSpOVW
eQfEtpJ8nc70eis6\nvqou6nY/VXUmcCbA3LlzLV6WJEnS0OhZMl1V83u1b0mSJGkYDH2ZhyRJkjSsBpJMJ/nzJIvo3LL8\n4iT/PIg4JEmSRl5Vuz9NiZ6VeUymqi7Em79IkiRpxFnmIUmSJLVkMi1JkiS1ZDItSZIktWQyLUmS\nJLVkMi1JkiS1ZDItSZIktWQyLUmSJLVkMi1JkiS1ZDItSZIktWQyLUmSJLVkMi1JkiS1ZDItSZIk\ntWQyLUmSJLVkMi1JkiS1ZDItSZIktWQyLUmSJLVkMi1JkiS1ZDItSZIktWQyLUmSJLVkMi1JkiS1\nZDItSZIktWQyLUmSJLVkMi1JkiS1ZDItSZIktWQyLUmSJLVkMi1JkiS1ZDItSZIktbT2oAOQRlWd\nUIMOQZIkDZjJtKSR4I8XSdIwssxDkiRJaslkWpIkSWrJZFqSJElqyWRakiRJamkgyXSSk5P8JMn1\nSS5MstEg4pAkSZKeikGNTF8B7FRVOwP/Drx7QHFIkiRJrQ1karyqunzc4veAgwcRhyRJw8LpH6XR\nNAw1068HLh10EJIkSdLq6tnIdJKvA5uv5Knjq+qiZpvjgceBz06yn6OAowC22WabHkQqSZIktdOz\nZLqq5k/2fJIjgQOAl1bVhOe2qupM4EyAuXPneg5MkiQ9ZZbVaKoMpGY6ycuBdwJ7V9WvBxGDJEnq\nkYnHyKRpZ1A106cDM4ErklyX5IwBxSFJkiS1NqjZPLYfRLuSJEnSVBpIMq3esP5LkiSpv4ZhajxJ\nkiRpJJlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2ZTEuSJEktmUxLkiRJLZlMS5IkSS2ZTEuS\nJEktmUxLkiRJLZlMS5IkSS2ZTEuSJEktrT3oAJ6KJUuWsGjRIh599NFBhyJJA7f++uuz1VZbsc46\n6ww6FElaY4x0Mr1o0SJmzpzJnDlzSDLocCRpYKqKe++9l0WLFrHtttsOOhxJWmOMdJnHo48+yiab\nbGIiLWmNl4RNNtnEM3WS1GcjnUwDJtKS1PDzUJL6b6TLPJ5kqr5IqiZ9+vrrr+dd73oXjzzyCL/5\nzW84+OCDedvb3sb222/PLbfcslpNnXbaaRx77LGtQ73ssss48cQTAXjf+97Hy172stb7WtGCBVPT\nn/PmjU5/vv71r+fSSy9l//3356yzzmq9n5Xp09tzaPpz0aJFvOY1r2HZsmUsW7aMU089lblz57ba\n18rkxKnp0DphdN6f8+fP5/HHH+ehhx7i7W9/O4cddljrfUmSpsb0Sqb74IEHHuC1r30tF154Idtt\ntx1VxeWXX956f6v75bp06VJmzJix/PE73/lOvvnNbwKw9957M3/+/OXPj4Jh6k+A97///RxxxBGc\nd955rWMYpGHqz5kzZ3LBBRew2WabcdNNN/HGN76Rb33rW61jGYRh6k+ASy65hHXXXZfFixezyy67\nmExL0hAY+TKPfrv44os58MAD2W677YDOadUVR4PPOeccPvCBDwCd0bl58+YBcMopp7DHHnuwzz77\ncOqpp/K5z32OO+64g3nz5vHBD36QJUuW8IY3vIF99tmHvfbai6uvvhqAI488kqOPPpoDDjjgCcnI\nLbfcwrbbbstGG23ERhttxJw5c1Z7pGzQhqk/AbbccsseH3FvDVN/zpo1i8022wyA9dZbj7XXHr3f\n7sPUnwDrrrsuAA8//DA77rhjLw9dktSl0ft2G7Dbb7+drbfeutVrP/vZz3LllVcyc+ZMli1bxlpr\nrcV73/teFixYAMAZZ5zB9ttvz1lnncUvf/lLDjroIL7zne8A8OxnP5szzjjjCfu79957eeYzn7l8\neaONNuJXv/pVuwMbkGHqz+lgGPtz6dKlHHvssRx//PGt4hqkYevPpUuX8pKXvIQbb7yRk046qfVx\nSZKmjsn0atp666254YYbJt1m/EVANa7A
9R/+4R849thjWbJkCUcffTR77bXXE1734x//mO9+97tc\ndtllQOcU85g999zzSe1svPHG3H///cuXH3jgATbeeOPVO6ABG6b+nA6GsT/f+MY3st9++zF//vzV\nOpZhMGz9OWPGDK666iruvfdenv/853PIIYcwa9as1T6ubq2qnlySZJnHatt///356le/yq233rp8\n3RVXXPGEbTbeeGMWLVoEwDXXXLN8/e67787ZZ5/Nhz/8YY477jgA1l57bZYtWwbAjjvuyBFHHMGC\nBQtYsGAB11577fLXrqwOeocdduC2225j8eLFLF68mNtuu43tt99+6g62D4apP6eDYevPd7zjHWyx\nxRYcc8wxU3OAfTZM/blkyRKWLl0KwIYbbsj666/P+uuvP0VHKklqy5Hp1TRr1izOO+883vzmN/Po\no4/ym9/8hle+8pXsu+++y7fZd999OeWUU/jjP/5jdtttt+XrDz/8cO655x4effRR3vzmNwNw8MEH\ns//++7Pffvvxpje9ibe85S3ss88+AMydO5eTTz55wlhmzJjBSSedtLyG86STThq5JHGY+hPgPe95\nD5deeil33nkn8+fP56KLLmLDDTfswZH3xjD158KFCzn11FN50YtexLx585g9ezYXXHBBj468N4ap\nP++66y4OO+wwZsyYwWOPPcZ73/te1ltvvR4duSSpW6lVzbM1RObOnVsLFy5cvnzzzTfz3Oc+d4AR\nSdJw8XNRkqZGkmuqapVzulrmIUmSJLVkMi1JkiS1ZDItSZIktTTyyfQjjzzCKNV9S1IvVBWPPPLI\noMOQpDXOSM/mscUWW3DHHXewZMmSQYciSQO3zjrrsMUWWww6DElao4x0Mj12G21JkiRpEEa+zEOS\nJEkaFJNpSZIkqSWTaUmSJKmlkboDYpK7gZ8NOo4B2xS4Z9BBrEHs7/6yv/vPPu8v+7u/7O/+mm79\n/eyqmr2qjUYqmRYkWdjNrS01Nezv/rK/+88+7y/7u7/s7/5aU/vbMg9JkiSpJZNpSZIkqSWT6dFz\n5qADWMPY3/1lf/effd5f9nd/2d/9tUb2tzXTkiRJUkuOTEuSJEktmUwPoSRXJnnZCuvemuTsJF8a\nVFzTXZLNk3w+ya1JrklySZLfS3JakhuS/DjJD5JsO+hYR0mSSnLeuOW1k9yd5GureN2uSV4xbvl9\nSd7Ry1iniyTHJ7kxyfVJrkuyR/MZ8rQuXtvVdprcqt73SY5McvrgIpwekixt3uM/SnJtkj0HHdN0\nNdF35KDjGgYm08PpfODQFdYdCpxdVQcPIJ5pL0mAC4EFVbVdVf0h8G7gVcCzgJ2r6g+APwfuH1yk\nI+lhYKckGzTL+wJ3dPG6XYFXrHIrPUGSPwIOAHavqp2B+cDtwFuBbpLkbrfT5Nq+77V6HqmqXatq\nFzqf2ScNOqDpaJLvyN8ZbGTDwWR6OH0J2D/JugBJ5tBJ6G5PckOz7qzm1/h1zWjHCQOLdnrYB1hS\nVWeMraiqH9H5QvxFVS1r1i2qqvsGFOMouwTYv3l8GJ0fjAAkeUGSf03ywyTfTfL7zXv/b4BXNe/x\nVzWbPy/JgiQ/TXJsfw9hZGzvidaiAAAEgElEQVQB3FNVjwFU1T3AwXQ+Q65MciVAkn9MsrAZwT6x\nWXfs+O2SzEhyzrgzM389mEMaWRO+79UTzwDuA0gyb/zZrySnJzmyefzhJDc1Z27+bjChjpyJviO/\nneTkcZ8Rr4Ll/X9Vkouaz+sPJ3lNkqub7bYb1IH0gsn0EKqqXwFXA/s1qw4FvgjUuG3eUFW7An9K\n525D5/Q5zOlmJ+Calaz/InBgk9D9fZLd+hzXdPF54NAk6wM7A98f99xPgP9WVbsB7wU+VFW/aR5/\noRl1+kKz7XOAlwEvAE5Isk7fjmB0XA5sneTfk3wiyd5VdRrwc2Cfqtqn2e745uYKOwN7J9l5Jdvt\nCmxZ
VTs1Z2bOHsDxjLLJ3veaGhs0n88/Ac4C3j/Zxkk2oXOGccfmzM0H+hDjdDDRd+RBdD4ndqFz\nFuzkJFs0z+0CHA08Fzgc+L2qegGd/05v6XnEfWQyPbzGl3ocykpGNJoP6AuAt1TVmn6b9Z6oqkXA\n79M5nbUM+EaSlw42qtFTVdcDc+iMzl2ywtOzgAuasy6nADtOsquLq+qxZrT1LjzF+CRV9RDwh8BR\nwN3AF8ZG5FZwSJJrgR/S6fPnrWSbnwK/m+RjSV4OLO5N1NPTKt73mhpjZR7PAV4OfLopSZjIA8Cj\nwCeTHAT8uh9BTmN7AedX1dKq+iVwFfD85rkfVNUvmrNkt9L5oQ/wYzr/X0wbJtPD6yLgpUl2B55W\nVSv7RXgG8E9V9fX+hjYt3UgnAXmSJnm7tKr+B/Ah4M/6Gtn08RXg73jyD8P3A1dW1U7AgcD6k+zj\nsXGPlwJrT2mE00Tzxbagqk4AjgH+YvzzzUW07wBe2ozOXcxK+r0padoFWEBnhOmsHoc+HU30vtcU\nq6p/BTYFZgOP88QcZ/1mm8fpnNn6Ep1rCy7rc5ijasLvyEmM/7xeNm55GdPss9tkekg1o0tXAp9i\n5aPSbwZmVtWH+x3bNPUvwHpJjhpbkWTnJHsneVazvBadU7WeBWjnU8CJVfXjFdbP4rcXZh05bv2D\nwMw+xDWtNDXnO4xbtSud9+z4/nwGnesBHkjyO/y2pIzx2yXZFFirqr4MvAfYvcfhT0cTve81xZI8\nB5gB3EvnPf+8JOsl2Qh4abPN04FZVXUJ8Nd0fixq1Vb6HUnngvxXNddXzAZeTKdMdY0yrX4ZTEPn\n07l6dsWZPaAzqrQkyXXN8hnjLwzQ6qmqSvLnwD8keRed04D/SWfU4qNJ1ms2vRpwOqsWmpKZ01by\n1EeAc5O8h84I6Zgrgf/ZvMe9Qr97Twc+1iQQjwO30Cn5OAy4LMnPq2qfJD+kU69+O/Cdca8/c2w7\nOjN7nN38kIROuZNWwyTve4Ajk4w/0/XCZnt1b4Nx34MBXldVS+lcsP9F4AbgNjrlTND5oXhRUyYZ\n4G39DngUTfId+VY6nzk/onNd1zur6s7mh80awzsgSpIkSS1Z5iFJkiS1ZDItSZIktWQyLUmSJLVk\nMi1JkiS1ZDItSZIktWQyLUmSJLVkMi1JkiS1ZDItSZIktfT/AVFl/HL+UazcAAAAAElFTkSuQmCC\n", "text/plain": [ "<Figure size 1200x500 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#Clustering on original X space using Hierarchical clustering\n", "\n", "cols = ['r','y','b','g']\n", "\n", "fig = plt.figure(figsize = (12, 5))\n", "for i in range(4):\n", " w = 0.15\n", " ax = fig.add_subplot(1,1,1)\n", " cc = dpro[(ka.labels_==i)].mean()\n", " plt.bar(np.arange(len(cc))+i*w, cc - dpro.mean(), w, color = cols[i], label='Cluster {}'.format(i))\n", " ax.set_xticklabels(dpro.columns.values)\n", " ax.set_xticks(np.arange(len(cc))+2*w)\n", "\n", "plt.ylabel('Mean Adjusted Score')\n", "plt.title('Avg Scores by Cluster')\n", "plt.legend(loc = 3, ncol = 4, prop={'size':9})\n", "\n", "plt.show()\n", " " ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "<p>When we plot the 4 clusters using hierarchical clustering, we get similar conceptual groupings as we did with k-means. However, I find this latter plot more easy to interpret (remember, this is a bit subjective).\n", "<br><br>\n", "<b><u>Student Profiles using Hierarchical Clustering</u></b>\n", "<ul>\n", "\n", " <li><u>Cluster 0:</u> This group has most of the math/stats experience</li>\n", " <li><u>Cluster 1:</u> These are the business and strategy minded students. </li>\n", " <li><u>Cluster 2:</u> These are the math/stats folks in the group.</li>\n", " <li><u>Cluster 3:</u> This group is a little below average in skill across all categories (or at least is the group that underrates their own skill levels).</li>\n", "\n", "</ul>\n", "\n", "<br><br>\n", "So now we have to make a choice - which clustering method to use for the student profiles. One last thing to compare is the distribution of students across clusters for each method.\n", "</p>" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAA1oAAAEYCAYAAABFrYfQAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAHy1JREFUeJzt3X+0JHV55/H3JwzqHmBFZYLIT42s\nBo0SnTPRaBISFGFiRDdshLgJJpCJRDa6Z3ezbNyoMdkjWde4iZhwJsiOegwYNRiS4I/ZiFESQQcW\nEEQEOUZmRBgBB0iMOvrsH10T20vfuT23v123uf1+ndPnVld9q+rpYu6H+3R1VaeqkCRJkiS1830r\nXYAkSZIkrTY2WpIkSZLUmI2WJEmSJDVmoyVJkiRJjdloSZIkSVJjNlqSJEmS1JiNlppJcmOS41a6\nDkmSJA0kOS7JjStdxzyy0dJYknwxyfMWzHt5kit2P6+qp1TVx6a0/59J8pUkjx6ad3KS7UkeOY19\nSprcwuxIcmqSe5P8xErWNS1JXpLky0kOHJp3SpIvJTlgJWuTHmrmMD/OTPKxEfO3Db+RneSJSf77\ngjE/lGRLd3zuTbI1yQsAqupjVfWUadevB7PR0opLsmapMVX1l8BHgbd06xwI/DFwVlXtnG6FklpI\ncjrwNuCnq+pvV7qeaaiqS4ArgDcDdG8OnQe8oqruX8napIeyeciPpSR5TpJzgH2658clOSdJgL8C\nLgO+H3gs8B+BB1asWAE2Wmpo+J2nJN/X/fJ/IcndSf5s99moJEclqSRnJPkSgwZqHL8OnNS9Q/MW\n4G+r6tKpvBhJTSX5VQbNxwuq6u/3MO6KJG9IcmWSf0zygSSPSXJRkvuSXJXkiKHxxyT5v0nuSfK5\nJD87tOxFSa7t1vtSkt8aWvbELod+sXu3eEf3B8zu5c9Kck237p1J3rQXL/ds4OQkxwN/AGypqsv2\nYn1JQ+YsPxZVVX8H3Myg4XwZ8HzgrcDBwBHAn1TVt6rqG1X1iW48SZ6X5ItD9a3rXtv9SS5O8t4k\nr29Ro76XjZam5T8ALwZ+AngccC+DYBj2E8APAi8ASHJ9kp9fbINV9VXgVcC7gRcyaLwkzb6zgDcA\nx1fV1jHGvxT4eeAw4MnA3wObgEcDXwB+CyDJ/sAW4J0M3sV9GbApyZO67TzQzTsQ+BngVUleuGBf\nPwo8kUEO/XaSo7v5bwXeVFX/ulv+vt0rZHA96s8tVnxV3cXg3eT3ACcArx7jNUsaba7yYww1NP3t\n7vldwG3AuzO4rOL7F1s5ycOBDwAXMDgm72fw95qmwEZLe+MDSb62+wH80R7GvgJ4TVVtq6pvAK8H\nTlnwMcHXV9U/VtXXAarqaVX1p0vUcCXwSOAjVbVj+S9FUo+ez+B39zNjjr+wqm6rqnuBDwOfr6rL\nq2oX8F7gh7txJ3fL3llVu6rqagZ/QJwCUFUfraobq+o7VXUdcDGDN3iGvb6q/rmqrgFuBJ7ezf8W\ncHSSx1TV/VV11e4VuutR/2yJ1/D3DP5A+1BV3T3m65b0YPOYHyMleQ6DN6jPZvCm8+XAr1fVd4Dj\ngO0MPvFzR5LLk/zAiM08B/hOVZ3Xnf16L3D1curR0my0tDdeXFUH7n4Av7aHsUcClww1ZTcxeOfl\n4KExty+jhk0M3n3akOTZy1hfUv/OAv4NcEF3LQEASS5I8kD3+I2h8XcOTX99xPP9u+kjgecseAPo\npcAh3fafneRj3cd6dgJnAgcNF1ZVXxl6+k9D2/4l4Bjg5iSfSrJh3BfbvcYLgP/D4COE68ddV9KD\nzFN+7AL2HTF/X+BbVfV3VfXGbhxdA3luN317Vf1aVT0BeDyDZm/ziG09Dti2YN5y/h7TGGy0NC23\nAycNN2ZV9Yiq2j40phZbeZQkZwCHM2jwfpNB6D6sXcmSpuRO4
Hjgxxg6E15VZ1bV/t3jfy5ju7cD\nf7MgZ/avqrO75Rcz+FjM4VX1SAbNTxbb2LCqurmqTmXwkaI3A+9P8ogx69rYrfdKBh9TuiDJqD+e\nJC1tnvLjS8CRCxrK/Rk0eP8wtP1bq+p397D/LzE4Vk8dsfgO4NAF8w4fozYtg42WpuV84H8kORIg\nydokJy93Y0keB7wJ+JXuo4jnA3cDr2lRrKTpqqovM/hj6cQkb2m02UuBpyT5+ST7do/1Q9dYHADc\nU1X/nORZwKnjbjjJLyQ5qPtIzk4Gbwx9Z4z1DgPOBc6sqm8yuDb1AeCcPa4oaVHzkh/AJ7tx/yXJ\nw7sm6/eAT1bVwrNQw/s7KMnrkjwhA2sZnFW7csTwK4A1Sc5KsiaDG4A8c9zXpr1jo6Vp+QMGIfaR\nJPcz+GX/kT2t0F0g+rJFFv8RcHFVfQKgqgr4FeDVSfxuCOkhoHuX9acYXK/5xgbb28ngIvR/z+Bd\n2q8AbwQe3g05C3hjl0G/CezNdREbgJu6df8X8NKucSLJzUleush65wPvqqpPdjV+BzgD+M9Jnrw3\nr0/Sd81DfnTXrG8AnsfgeqsvAGsZfKRxT74B/ACDa7YeYHA92wPAL4/YxzeAlzC4lv5e4OcY3Bb+\nG3vx+jSmDP5elSRJkjRvklwN/O+qetdK17LaeEZLkiRJmhMZfNHxwd1HB89gcBv8D690XavRko1W\nksO7W0R+tvto16u6+Y9OsiXJLd3PRy2y/undmFsy+FZvSVqUmSOpL+aN5tQPAtcDX2PwnaQ/233/\nnxpb8qODSQ4BDqmqa5IcwOBe+y8GXs7gIsFzM/g27EdV1X9dsO6jga3AOgYXAl4NPLP7bgNJehAz\nR1JfzBtJ07TkGa2quqP7Ijaq6n4G34d0KIMventHN+wdjP5W6RcAW6rqni54tgAntihc0upk5kjq\ni3kjaZr26hqtJEcx+Ebtq4CDq+qObtFX+N4vot3tUL73S9C28eB790vSSGaOpL6YN5JaWzPuwO5e\n/u8HXl1V9w19lxpVVUkmun1hko0MvuSR/fbb75lPfvJ4d8H9zPadk+x2UT906COnst1p8TgMeBwG\nZuE4XH311V+tqrXL3dc0M8e8mYzHYcDjMDALx8G8GZ//vgY8DgMeh4Fp5c1YjVYG32j/fuDdVfXn\n3ew7kxxSVXd0n3EedRHdduC4oeeHAR8btY+q2gRsAli3bl1t3bp1nNI46py/Hmvc3tp67k9PZbvT\n4nEY8DgMzMJxSPIPS49adN2pZo55MxmPw4DHYWAWjoN5Mz7/fQ14HAY8DgPTyptx7joY4O3ATVX1\n+0OLLgV232HndOAvRqz+YeCEJI/q7thzAt4+UtIemDmS+mLeSJqmca7Reg7wC8BPJbm2e2wAzgWe\nn+QWBt9gfS5AknVJLgCoqnuA3wE+3T3e0M2TpMWYOZL6Yt5ImpolPzpYVVcAWWTx8SPGbwXOHHp+\nIXDhcguUNF/MHEl9MW8kTdNe3XVQkiRJkrQ0Gy1JkiRJasxGS5IkSZIas9GSJEmSpMZstCRJkiSp\nMRstSZIkSWrMRkuSJEmSGrPRkiRJkqTGbLQkSZIkqTEbLUmSJElqzEZLkiRJkhqz0ZIkSZKkxmy0\nJEmSJKkxGy1JkiRJasxGS5IkSZIas9GSJEmSpMZstCRJkiSpMRstSZIkSWrMRkuSJEmSGrPRkiRJ\nkqTG1iw1IMmFwAuBu6rqqd289wBP6oYcCHytqo4dse4XgfuBbwO7qmpdo7olrVJmjqS+mDeSpmnJ\nRgvYDJwHvHP3jKp66e7pJG8Gdu5h/Z+sqq8ut0BJc2czZo6kfmzGvJE0JUs2WlX18SRHjVqWJMDP\nAT/VtixJ88rMkdQX80bSNE16jdaPAXdW1S2LLC/gI0muTrJxwn1JkpkjqS/mjaSJjPPRwT05Dbho\nD8ufW1Xbk3w/sCXJ56rq4
6MGdiG1EeCII46YsCxJq1STzDFvJI3BvJE0kWWf0UqyBvi3wHsWG1NV\n27ufdwGXAOv3MHZTVa2rqnVr165dblmSVqmWmWPeSNoT80ZSC5N8dPB5wOeqatuohUn2S3LA7mng\nBOCGCfYnab6ZOZL6Yt5ImtiSjVaSi4BPAk9Ksi3JGd2iU1lwSj3J45Jc1j09GLgiyXXAp4C/rqoP\ntStd0mpk5kjqi3kjaZrGuevgaYvMf/mIeV8GNnTTtwFPn7A+SXPGzJHUF/NG0jRNetdBSZIkSdIC\nNlqSJEmS1JiNliRJkiQ1ZqMlSZIkSY3ZaEmSJElSYzZakiRJktSYjZYkSZIkNWajJUmSJEmN2WhJ\nkiRJUmM2WpIkSZLUmI2WJEmSJDVmoyVJkiRJjdloSZIkSVJjNlqSJEmS1JiNliRJkiQ1ZqMlSZIk\nSY3ZaEmSJElSYzZakiRJktSYjZYkSZIkNWajJUmSJEmN2WhJkiRJUmNLNlpJLkxyV5Ibhua9Psn2\nJNd2jw2LrHtikpuT3JrknJaFS1qdzBxJfTFvJE3TOGe0NgMnjpj/lqo6tntctnBhkn2AtwEnAccA\npyU5ZpJiJc2FzZg5kvqxGfNG0pQs2WhV1ceBe5ax7fXArVV1W1V9E7gYOHkZ25E0R8wcSX0xbyRN\n0yTXaJ2d5PrutPujRiw/FLh96Pm2bt5ISTYm2Zpk644dOyYoS9Iq1SxzzBtJSzBvJE1suY3WHwM/\nABwL3AG8edJCqmpTVa2rqnVr166ddHOSVpemmWPeSNoD80ZSE8tqtKrqzqr6dlV9B/gTBqfQF9oO\nHD70/LBuniTtFTNHUl/MG0mtLKvRSnLI0NOXADeMGPZp4Ogkj0/yMOBU4NLl7E/SfDNzJPXFvJHU\nypqlBiS5CDgOOCjJNuB1wHFJjgUK+CLwq93YxwEXVNWGqtqV5Gzgw8A+wIVVdeNUXoWkVcPMkdQX\n80bSNC3ZaFXVaSNmv32RsV8GNgw9vwx40G1RJWkxZo6kvpg3kqZpkrsOSpIkSZJGsNGSJEmSpMZs\ntCRJkiSpMRstSZIkSWrMRkuSJEmSGrPRkiRJkqTGbLQkSZIkqTEbLUmSJElqzEZLkiRJkhqz0ZIk\nSZKkxmy0JEmSJKkxGy1JkiRJasxGS5IkSZIas9GSJEmSpMZstCRJkiSpMRstSZIkSWrMRkuSJEmS\nGrPRkiRJkqTGbLQkSZIkqTEbLUmSJElqbMlGK8mFSe5KcsPQvDcl+VyS65NckuTARdb9YpLPJLk2\nydaWhUtancwcSX0xbyRN0zhntDYDJy6YtwV4alU9Dfg88N/2sP5PVtWxVbVueSVKmjObMXMk9WMz\n5o2kKVmy0aqqjwP3LJj3kara1T29EjhsCrVJmkNmjqS+mDeSpqnFNVq/DHxwkWUFfCTJ1Uk27mkj\nSTYm2Zpk644dOxqUJWmVmjhzzBtJYzJvJC3bRI1WktcAu4B3LzLkuVX1DOAk4JVJfnyxbVXVpqpa\nV1Xr1q5dO0lZklapVplj3khainkjaVLLbrSSvBx4IfCyqqpRY6pqe/fzLuASYP1y9ydpvpk5kvpi\n3khqYVmNVpITgd8AXlRV/7TImP2SHLB7GjgBuGHUWEnaEzNHUl/MG0mtjHN794uATwJPSrItyRnA\necABwJbutqbnd2Mfl+SybtWDgSuSXAd8CvjrqvrQVF6FpFXDzJHUF/NG0jStWWpAVZ02YvbbFxn7\nZWBDN30b8PSJqpM0d8wcSX0xbyRNU4u7DkqSJEmShthoSZIkSVJjNlqSJEmS1JiNliRJkiQ1ZqMl\nSZIkSY3ZaEmSJElSYzZakiRJktSYjZYkSZIkNWajJUmSJEmN2WhJkiRJUmM2WpIkSZLUmI2WJEmS\nJDVmoyVJkiRJjdloSZIkSVJjNlqSJEmS1JiNliRJkiQ1ZqMlSZIkSY3ZaEmSJElSYzZakiR
JktSY\njZYkSZIkNTZWo5XkwiR3JblhaN6jk2xJckv381GLrHt6N+aWJKe3KlzS6mTeSOqLeSNpmsY9o7UZ\nOHHBvHOAv6mqo4G/6Z5/jySPBl4H/AiwHnjdYoElSZ3NmDeS+rEZ80bSlIzVaFXVx4F7Fsw+GXhH\nN/0O4MUjVn0BsKWq7qmqe4EtPDjQJOlfmDeS+mLeSJqmSa7ROriq7uimvwIcPGLMocDtQ8+3dfMe\nJMnGJFuTbN2xY8cEZUlahcwbSX0xbyQ10eRmGFVVQE24jU1Vta6q1q1du7ZFWZJWIfNGUl/MG0mT\nmKTRujPJIQDdz7tGjNkOHD70/LBuniTtDfNGUl/MG0lNTNJoXQrsvsvO6cBfjBjzYeCEJI/qLhI9\noZsnSXvDvJHUF/NGUhPj3t79IuCTwJOSbEtyBnAu8PwktwDP656TZF2SCwCq6h7gd4BPd483dPMk\naSTzRlJfzBtJ07RmnEFVddoii44fMXYrcObQ8wuBC5dVnaS5Y95I6ot5I2mamtwMQ5IkSZL0XTZa\nkiRJktSYjZYkSZIkNWajJUmSJEmN2WhJkiRJUmM2WpIkSZLUmI2WJEmSJDVmoyVJkiRJjdloSZIk\nSVJjNlqSJEmS1JiNliRJkiQ1ZqMlSZIkSY3ZaEmSJElSYzZakiRJktSYjZYkSZIkNWajJUmSJEmN\n2WhJkiRJUmM2WpIkSZLUmI2WJEmSJDVmoyVJkiRJjS270UrypCTXDj3uS/LqBWOOS7JzaMxrJy9Z\n0jwycyT1xbyR1MKa5a5YVTcDxwIk2QfYDlwyYugnquqFy92PJIGZI6k/5o2kFlp9dPB44AtV9Q+N\ntidJe2LmSOqLeSNpWVo1WqcCFy2y7NlJrkvywSRPWWwDSTYm2Zpk644dOxqVJWmVmihzzBtJe8G8\nkbQsEzdaSR4GvAh474jF1wBHVtXTgbcCH1hsO1W1qarWVdW6tWvXTlqWpFWqReaYN5LGYd5ImkSL\nM1onAddU1Z0LF1TVfVX1QDd9GbBvkoMa7FPS/DJzJPXFvJG0bC0ardNY5JR6kscmSTe9vtvf3Q32\nKWl+mTmS+mLeSFq2Zd91ECDJfsDzgV8dmvcKgKo6HzgFOCvJLuDrwKlVVZPsU9L8MnMk9cW8kTSp\niRqtqvpH4DEL5p0/NH0ecN4k+5Ck3cwcSX0xbyRNqtVdByVJkiRJHRstSZIkSWrMRkuSJEmSGrPR\nkiRJkqTGbLQkSZIkqTEbLUmSJElqzEZLkiRJkhqz0ZIkSZKkxmy0JEmSJKkxGy1JkiRJasxGS5Ik\nSZIas9GSJEmSpMZstCRJkiSpMRstSZIkSWrMRkuSJEmSGrPRkiRJkqTGbLQkSZIkqTEbLUmSJElq\nzEZLkiRJkhqz0ZIkSZKkxiZutJJ8MclnklybZOuI5Unyh0luTXJ9kmdMuk9J88m8kdQX80bSpNY0\n2s5PVtVXF1l2EnB09/gR4I+7n5K0HOaNpL6YN5KWrY+PDp4MvLMGrgQOTHJID/uVNH/MG0l9MW8k\n7VGLRquAjyS5OsnGEcsPBW4fer6tm/c9kmxMsjXJ1h07djQoS9IqZN5I6ot5I2kiLRqt51bVMxic\nQn9lkh9fzkaqalNVrauqdWvXrm1QlqRVyLyR1BfzRtJEJm60qmp79/Mu4BJg/YIh24HDh54f1s2T\npL1i3kjqi3kjaVITNVpJ9ktywO5p4ATghgXDLgV+sbs7z7OAnVV1xyT7lTR/zBtJfTFvJLUw6V0H\nDwYuSbJ7W39aVR9K8gqAqjofuAzYANwK/BPwSxPuU9J8Mm8k9cW8kTSxiRqtqroNePqI+ecPTRfw\nykn2I0nmjaS+mDeSWujj9u6SJEmSNFdstCRJkiSpMRstSZIkSWrMRkuSJEmSGrPRkiRJkqTGbLQk\nSZIkqTEbLUmSJElqzEZLkiRJkhqz0ZIkSZKkxmy0JEm
SJKkxGy1JkiRJasxGS5IkSZIas9GSJEmS\npMZstCRJkiSpMRstSZIkSWrMRkuSJEmSGrPRkiRJkqTGbLQkSZIkqTEbLUmSJElqzEZLkiRJkhpb\ndqOV5PAklyf5bJIbk7xqxJjjkuxMcm33eO1k5UqaV2aOpL6YN5JaWDPBuruA/1RV1yQ5ALg6yZaq\n+uyCcZ+oqhdOsB9JAjNHUn/MG0kTW/YZraq6o6qu6abvB24CDm1VmCQNM3Mk9cW8kdRCk2u0khwF\n/DBw1YjFz05yXZIPJnlKi/1Jmm9mjqS+mDeSlmuSjw4CkGR/4P3Aq6vqvgWLrwGOrKoHkmwAPgAc\nvch2NgIbAY444ohJy5K0SrXIHPNG0jjMG0mTmOiMVpJ9GQTQu6vqzxcur6r7quqBbvoyYN8kB43a\nVlVtqqp1VbVu7dq1k5QlaZVqlTnmjaSlmDeSJjXJXQcDvB24qap+f5Exj+3GkWR9t7+7l7tPSfPL\nzJHUF/NGUguTfHTwOcAvAJ9Jcm037zeBIwCq6nzgFOCsJLuArwOnVlVNsE9J88vMkdQX80bSxJbd\naFXVFUCWGHMecN5y9yFJu5k5kvpi3khqocldByVJkiRJ32WjJUmSJEmN2WhJkiRJUmM2WpIkSZLU\nmI2WJEmSJDVmoyVJkiRJjdloSZIkSVJjNlqSJEmS1JiNliRJkiQ1ZqMlSZIkSY3ZaEmSJElSYzZa\nkiRJktSYjZYkSZIkNWajJUmSJEmN2WhJkiRJUmM2WpIkSZLUmI2WJEmSJDVmoyVJkiRJjdloSZIk\nSVJjNlqSJEmS1NhEjVaSE5PcnOTWJOeMWP7wJO/pll+V5KhJ9idpvpk5kvpi3kia1LIbrST7AG8D\nTgKOAU5LcsyCYWcA91bVE4G3AL+33P1Jmm9mjqS+mDeSWpjkjNZ64Naquq2qvglcDJy8YMzJwDu6\n6fcBxyfJBPuUNL/MHEl9MW8kTWySRutQ4Pah59u6eSPHVNUuYCfwmAn2KWl+mTmS+mLeSJrYmpUu\nYLckG4GN3dMHktw85qoHAV9tXs/yPwAwlXomtOyaJjgOS5m147THeqZ4HPZk1o4R+b29qunIadYy\niVWUNzB7/07Mm/EsWpN5M2DemDdjWlZN5s3ACuUNzNhxmlbeTNJobQcOH3p+WDdv1JhtSdYAjwTu\nHrWxqtoEbNrbIpJsrap1e7vetMxaPWBN45i1esCaRmiWOaslb2D2apq1esCaxjFr9cCK12TejGBN\nS5u1esCaxjGteib56OCngaOTPD7Jw4BTgUsXjLkUOL2bPgX4aFXVBPuUNL/MHEl9MW8kTWzZZ7Sq\naleSs4EPA/sAF1bVjUneAGytqkuBtwPvSnIrcA+DoJKkvWbmSOqLeSOphYmu0aqqy4DLFsx77dD0\nPwP/bpJ9jGGvT8dP2azVA9Y0jlmrB6zpQWYgc/xvsrRZqwesaRyzVg+YN/43Gc+s1TRr9YA1jWMq\n9cSz3JIkSZLU1iTXaEmSJEmSRnhINFpJTkxyc5Jbk5wzYvnDk7ynW35VkqNmoKaXJ9mR5NruceaU\n67kwyV1JblhkeZL8YVfv9UmeMc16xqzpuCQ7h47Ra0eNa1jP4UkuT/LZJDcmedWIMb0epzFr6u04\nJXlEkk8lua6r57dHjOn9961vs5Y55k2TmsybGcubbn9znzmzljdj1jTXmTNredPtc6Yyx7zpVNVM\nPxhchPoF4AnAw4DrgGMWjPk14Pxu+lTgPTNQ08uB83o8Tj8OPAO4YZHlG4APAgGeBVw1AzUdB/xV\nj8foEOAZ3fQBwOdH/Hfr9TiNWVNvx6l73ft30/sCVwHPWjCm19+3vh+zljnmTbOazJsZy5tuf3Od\nObOWN3tR01xnzqzlTbfPmcoc82bweCic0VoP3FpVt1XVN4GLgZMXjDkZeEc3/T7g+CRZ4Zp6VVUf\nZ3DXo8WcDLyzBq4
EDkxyyArX1KuquqOqrumm7wduAg5dMKzX4zRmTb3pXvcD3dN9u8fCCzn7/n3r\n26xljnnTpqZemTdj1zTvmTNreTNuTb2atcyZtbyB2csc82bgodBoHQrcPvR8Gw/+D/UvY6pqF7AT\neMwK1wTws92p2fclOXzE8j6NW3Pfnt2dwv1gkqf0tdPuVPAPM3g3Y9iKHac91AQ9Hqck+yS5FrgL\n2FJVix6jnn7f+jZrmWPetGPeLF0T9Hyc5jxzZi1vxq0JzJylrEjewOxlzjznzUOh0Xqo+kvgqKp6\nGrCF73bH+q5rgCOr6unAW4EP9LHTJPsD7wdeXVX39bHPpSxRU6/Hqaq+XVXHAocB65M8dZr7UxPm\nzdLMm84s5Q2YOQ9RZs6erUjewOxlzrznzUOh0doODL9Tclg3b+SYJGuARwJ3r2RNVXV3VX2je3oB\n8Mwp1jOOcY5jr6rqvt2ncGvwfSX7JjlomvtMsi+DX/h3V9WfjxjS+3FaqqaVOE7dvr4GXA6cuGBR\n379vfZu1zDFvGjBvxqtppfKm2988Zs6s5c1YNZk5e7ZSv0ezljnmzUOj0fo0cHSSxyd5GIML0y5d\nMOZS4PRu+hTgo1U1zS8IW7KmBZ95fRGDz6aupEuBX+zuOPMsYGdV3bGSBSV57O7PvSZZz+Df49T+\n59Ht6+3ATVX1+4sM6/U4jVNTn8cpydokB3bT/wp4PvC5BcP6/n3r26xljnnTgHkze3nT7WPeM2fW\n8masmsycPev796jbz0xljnnTqR7viLLcB4O7pHyewV1wXtPNewPwom76EcB7gVuBTwFPmIGa3gjc\nyOBuPZcDT55yPRcBdwDfYvCZ2zOAVwCvqO/eaeVtXb2fAdb1cIyWqunsoWN0JfCjU67nuQwuerwe\nuLZ7bFjJ4zRmTb0dJ+BpwP/r6rkBeO2If9u9/771/Zi1zDFvmtRk3sxY3nT7m/vMmbW8GbOmuc6c\nWcubbp8zlTnmzeCRbqOSJEmSpEYeCh8dlCRJkqSHFBstSZIkSWrMRkuSJEmSGrPRkiRJkqTGbLQk\nSZIkqTEbLUmSJElqzEZLkiRJkhqz0ZIkSZKkxv4/YF424bM5A0YAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 1200x400 with 3 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "\n", "fig = plt.figure(figsize = (12, 4))\n", "\n", "plt.subplot(1,3,1)\n", "plt.hist(ka.labels_)\n", "plt.ylim([0,20])\n", "plt.title('Hier: X')\n", "\n", "plt.subplot(1,3,2)\n", "plt.hist(km.labels_)\n", "plt.ylim([0,20])\n", "plt.title('K-means: X')\n", "\n", "plt.subplot(1,3,3)\n", "plt.hist(km_u.labels_)\n", "plt.ylim([0,20])\n", "plt.title('K-means: U*Sig')\n", "\n", "fig.tight_layout()\n", "\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p>Comparing the above methods, we see fairly even representation in the clusters. Given this result, it is reasonable to choose the clustering method that gives us a more interpretable result. 
Since cluster balance does not separate the three methods, one could, for instance, prefer the method whose cluster centers map onto recognizable skill profiles across the seven surveyed skills. 
\n", "</p>" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [py35]", "language": "python", "name": "Python [py35]" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }