This week we will review some unsupervised learning models. Starting with the Apriori algorithm, we will see how to estimate how likely customers are to purchase particular items together at a grocery store or supermarket. We will then implement K-means and K-medoids and study how to choose the value of K with the elbow method. We will also implement the main hierarchical clustering algorithms. Finally, we will apply those algorithms to image segmentation and community detection.
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Part I: Kernels and SVM\n",
"\n",
"\n",
"##### Exercise 1.1. A linear classifier\n",
"\n",
"Consider the dataset given below. Start by learning a linear classifier for this dataset by minimizing the RSS criterion. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD6CAYAAACxrrxPAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAbNUlEQVR4nO2dbahlV3nHf88kDXSqaDBTKUnu3CiJOhYDzfiWthDrB5N8qC0kYHqJEFqGwReU0mLSgfaDHagfCqVGK0OQIA4NrRUbwRrEohZsrDeQF2NIOolmMk3BSZUK5oPEWf1wzm3OnDn7nH3OXi/PWvv/g82dc86ec9Zae63/ftbzPGttCyEghBCifvaVLoAQQog4SNCFEKIRJOhCCNEIEnQhhGgECboQQjSCBF0IIRphpaCb2WfN7Edm9r2Oz83M/tbMTpnZo2b2G/GLKYQQYhUX9zjnXuBu4HMdn98EXD093g783fTvUi677LKwvb3dq5BCCCEmPPTQQy+EEA4s+myloIcQvmVm20tOeS/wuTBZofSgmb3azH4thPDfy753e3ub3d3dVT8vhBBiBjN7tuuzGD70y4HnZl6fmb63qCBHzGzXzHbPnj0b4aeFEELsEUPQbcF7C/cTCCGcCCEcDiEcPnBg4YxBCCHEhsQQ9DPAlTOvrwCej/C9Qggh1iCGoN8PvH+a7fIO4H9X+c+FEELEZ2VQ1Mz+HrgBuMzMzgB/AfwSQAjhM8BXgJuBU8CLwB2pCiuEEKKbPlkut634PAAfjFYiISrh5Ek4dgxOn4atLTh+HHZ2SpdKjJk+eehCiDlOnoQjR+DFFyevn3128hok6qIcWvovxAYcO/aymO/x4ouT94UohQRdiA04fXq998fGyZOwvQ379k3+njxZukTjQIIuxAZsba33fgq8iuaeO+rZZyGEl91RXsrXMhJ0EQ2vApOC48dh//7z39u/f/J+DjyLptxR5ZCgiyh4FpgU7OzAiRNw8CCYTf6eOJEvIOpZNOWOKocEXUTBs8CkYmcHfvhDOHdu8jdndotn0fTgjhpExVNNCbqIgmeBaRHPolnaHTWIyqeaEnQRBc8C0yKeRbO0O2oQlU81JegjIMcM0rPAtIh30YztjsrmBUk91UxdkRBCkeO6664LIj2f/3wI+/eHMJk/To79+yfvp/itgwdDMJv8TfEbYnzk7MPh4MHzf2jvOHhw+HdHqgiwGzp0VYLeOCn7Z2p0gxAhZO7DKe8ekSqyTNDlcmmcWoOVlcemRESy9uGUvqwMFZGgN06twcrKY1MiItn7cKp81AwVkaA3Tq3BylpnFiI+tfbhC8hQEQl643jPhuii1pmFiE+tffgCMlTEJj72/Bw+fDjs7u4W+W3hn/n9xmFizFQ5kMV56MEgwzCzh0IIhxd9JgtduMSLVVbxKvC1yFVPBbsT05X+kvpQ2qLwTowMthpSL9et55A61ZxG6wWWpC3K5SJEB9vbEwtynoMHJ8kPq6jFbbROPYfWad++iYTPYzZJKhGrkctlhIzFVZCSoZk2taRerlPPoXVSsDstEvQGkZ8yDkPFp5bUy3XqObROzaQgOkWC3iC1WIYxSDkTGSo+tVij69RzaJ28BLubpcu5nvpQUDQdZosDT2alSxaXHJs2DQkAZt1UaiB961lTnVoFbc41LsaSSVBDPWvIclmXFutUE8sEXS6XBonpp/QcXC3to+7TNiUfU5fq2pWsUzE8D4RZupQ+9SELPS0xrKhU0+tYFl5JC92768F7+arCWWMil4vYhBSCGXNslBxn3t093stXFc4ac5mga2GR6CTFIpChi3XmKbUviPcFMt7LVxXOGlMLi8RGpEi7i+33LuXP9Z6S6L18VVFRY0rQRScpFoFUNDaW4n2BjPfyVcWQxswdTO3yxaQ+5EOvg9gpas7iS4Pwnr7nvXxVsUljJursyIcuPKH9sFejNmqA2AGjKct86BJ0IZxRyy6NYgWJgqkKig
pREWPai6dpCgSMegm6md1oZk+a2Skzu3PB568ysy+b2SNm9riZ3RG/qEKMg9IrYEUkCkSmVwq6mV0EfAq4CTgE3GZmh+ZO+yDw/RDCtcANwF+b2SWRyyrEKGglE2j0FNhaso+F/jbgVAjhmRDCz4H7gPfOnROAV5qZAa8Afgy8FLWkjVDLlhCiHEo5zECugZh5oUQfQb8ceG7m9Znpe7PcDbwJeB54DPhICOECr7+ZHTGzXTPbPXv27IZFrhc9eEL0QXuGJ6bhgbgyy8XMbgXeE0L4o+nr24G3hRA+PHPOLcBvAn8MvB74GnBtCOGnXd87xiyXRFlMQoh1qHwgDs1yOQNcOfP6CiaW+Cx3AF+c5r2fAn4AvHGTwraMgl3dyBUlstHwQOwj6N8Frjazq6aBzvcB98+dcxp4N4CZvRZ4A/BMzIK2gIJdi2l4Biw80vBAXCnoIYSXgA8BDwBPAP8QQnjczI6a2dHpaR8Hrjezx4CvAx8LIbyQqtC1omDXYrzkXWuWkI+ibd3yQOzaEyD1Mda9XGrYXyN3GT08A7WlPWa846KtaxiIHaC9XERfSiw79xCj8lCGsaC2HoaW/ovelHB/eJgBNxwnc4faOh0SdHEeJQabh7zrhuNk7lBbp0OCLs6j1GAr/SR5D7OElHgK+Lbe1iWRoBdmyEBLMUjHOtg8zBJS4S0ttOW2Lk5XtDT1MdYsl1mGRPtTZgpUnAAgFuDsofXtk3gAoSwXnwyJ9itTQPTF2UPr2yZDmpiyXJwyJACpTAHRFwUhM1J4lZwEvSBDBpoGqejLWOMiGzMkOFXY0pKgF2TIQGt9kHrKyqgdBSHXYGgEubSl1eVcT30oKDphSPyk1eCli6XhYpwMjSBn6LwoKCpqQgFfUYwYEeSTJyc+89OnJ5b58eNRp0MKioqqUMBXFCOGy2R+lRxk8x9K0IU7SrshxYiJHZzKvKpLgi7c0XrAdwxUG9SOHUHOnMY4CkGvtnONFGVl1I23rQbWJubGQpn9h80HRUvs7y3EmFFQe4YEjTHqoKiXx5uJcTOmWaKC2jNk9h82L+jqXKI0uVwQXm4aCmrPkNl/2LzLRdM/UZocfdCTa9FTWVpk1C4XZUyI0uSYJXpyLSqoXY7mBV2dS5QmhwvCm2txaKKIF/dRbTQv6FD+8WZi3OSYJbbkt64+7bEgoxD0GLRiMbRSj5rIMUtsybXoyX1UHV27dqU+atptsZXd/1qph1hMK7tvmi3e8NCsdMl8gHZbHEYrmTIp6pF4YzkxQloZb6kYdZZLDFIHnHK5QWLXQ75OkYLq3Ucl/ZpdpnvqoyaXS8qnpud0g8Suh54mL1JRrftID7jwT8qFEjmnl7HroafJCzFHhgEtl8tAUmYp5Mwfjl2PllLlhIhC4QUBstALU3MASEu8hZhDFvq4qTkApFW440HrF3pSekB3OddTHzUFRVNTbQBIjAKtX1iTxAMaBUWFEJsSw4ug9QrxGOxyMbMbzexJMztlZnd2nHODmT1sZo+b2TeHFFjUhabjbTM0zqf1CvlYKehmdhHwKeAm4BBwm5kdmjvn1cCngd8NIbwZuDV+UcUQUomuBqt/hl77odlM2pslI12+mL0DeCfwwMzru4C75s75APCXq75r9qjBh96KbzulD1SLi3wT49oP/Q7tzRIXlvjQ+7hcLgeem3l9ZvreLNcAl5rZN8zsITN7/7DbTBrWsVRasjxTWkje9uEeSmvuoxjXfmg2k9YrZKRL6fcOJu6Te2Ze3w58cu6cu4EHgV8BLgP+E7hmwXcdAXaB3a2trYz3tPWtjJYsz5QWUkvt1GI2hwfruMV2LQkDLfQzwJUzr68Anl9wzldDCD8LIbwAfAu4dsHN40QI4XAI4fCBAwf63XEisa6l0pLlmdJCKp12G5MWfb0erGOtV8hIl9LvHcDFwDPAVcAlwCPAm+fOeRPw9em5+4HvAb++7Htz+9DXtVRkea73/S3EGnJas7naTNZxe7DEQu
8VwARuBp4CngaOTd87ChydOedPge9Pxfyjq74zt6CvK9CtDYRWRDcluW7iufuWrn1bDBb0FEduQd9kEGkgjItcQtvS7E/kZ5mgj2qlqFariVXk6CPadlgMYdlK0VEJuhAeqHmHTVEe7bbogNbym8XmtJQZJNYksRBI0DPQ0iIlMZzcaXwyJiIxtCFzCEGXcz31UcPS/1goCCZKUWu2lruEhBgNGUkIGLiwSAykpUVKoi5yLJaKPQNwOaON0ZAZhECCngEPq/XEOEmtISnE1+WK3RgNmUEIJOgZUBBMlCK1hqQQX5cz2hgNmUEImhR0b0Eg7WUhSpFaQ1KIr8sZbYyGzCEEXc711EeqoGiqIJC7II0QPUnZd1ME/N0Gcp2IAGNa+j+qDiZEYWRA5WeZoDe3UjTFsmqt7BOiG22pkZdlK0Uvzl2Y1GxtLRbfIf43l0EaIZywsyMB90JzQdEUQSCXQRohhJijKkHvk72SIpCstEMhRA1U43LZW8Cwl/O6t4ABLhTr2FPAve+Sn1AI4ZlqLPTSq8d2diYB0HPnJn8l5kKItciwQKYaC12BSSFEtazjYhhANRa6ApON4W05rxApyeRiqEbQFZhsiFLb6Xm7iXgrj0hHLhdD14qj1McmK0W1eqwRSmwQ7225r7fyiH5sKkIR+zxjWikqKqDEU5K9Lff1Vh6xmnk/OEzcBH3yoof83zn0TFHhixIBEW9RdW/lEasZ4gfPtOWqBF3kp0RAxFtU3Vt5xGqG3oQz5D6PStBbjUFVV68SG8R7i6p7K49YTQ034S7neuoj90OiU8egSgVsFVtbA29RdW/laY3Y7etksDGm/dC7SJlYUfI6l0gYEcI9DW/UvkzQR5PlkjKxomTCQomEEVEIbTzen4aziJTlQlr3V8mEhRrceiICpRZj1cpIs4hGI+gpY1AlRVWxtZFQene62hippTMaQU+ZWFFSVEskjDRBbalBI7U4N2aslk6Xcz31kTsomhoHsRLRFyfZCmuh6Pf6NDooWRIUHY2Fnhrtl74hJSzlGO6L3OUem8UZo33HOCi7lD710ZqFLjaglKVsttjaNfNd7kYtzguocQaVEZS2KFxSKrVs6O82nBLnArXvUganLZrZjWb2pJmdMrM7l5z3VjP7hZndsmlhxYgoFegb6r5QgDItat+NWSnoZnYR8CngJuAQcJuZHeo47xPAA7ELuUdtiQliBTFSyzbpFENTg1pNifMywFpt3xx0+WL2DuCdwAMzr+8C7lpw3keBDwL3Ares+t51fehyqxUkle926EXt+/8b3dMjKp7q5Kkse+VxFLtgyF4uwC3APTOvbwfunjvncuCbwEXLBB04AuwCu1tbW2tVQllbhfC8q1mfTtHwnh5R8TbAvLSvt5tLGC7oty4Q9E/OnfOPwDum/05ioQ9NTBAb4m2gz9KnU3gu/7qkFDkNsMU47D/LBL1PUPQMcOXM6yuA5+fOOQzcZ2Y/nFr0nzaz3+vx3b2RW60QngNUfTqF5/KvQ+q9XDTAFlNZ/+kj6N8Frjazq8zsEuB9wP2zJ4QQrgohbIcQtoEvAB8IIXwpZkHHtq7CDZ4Hep9O4bn865B6LxcNsMVs2n9KBZi7TPfZA7gZeAp4Gjg2fe8ocHTBufeSwOUSgh+32qhw6EM8j1Wdwnv5+5LDJaIBdiGb9J/EfQ494GIkpMxGqXmg117+EFz6ckfDuv0n8bWSoI+BVizRVhl6U9H1rYfEs6llgq7NuUoS08+m/bL9EiOgmWKfZC8LiXKQs64l4zZdSp/6GL2FHtviUtqZXzy6S8Zk8eeuq3zoIyT2IB/yfaV8zC34tvvg8Wbr8SaTihJ1Tdi3JegeiT3IN7UKSm4FOxYL0aN4erzJpKKxui4TdPnQSxHbz7apj7WU731MPn+POd7e8vNT+ri91TUlXUqf+vBkoReZ+XuxUEtZL41ZTSvx5l6K1f9i1CvHfkEexlokkMulm6LX2sMgL+UO8OiGGBteUi
lz9AUPYy0SywR99E8sGv3DUfZS6mbdH/v3D0+J8/q7Ih6xBs++fRMJn8ds8jxQcR6Dn1jUMpXtvROfFPnNJX93TLnVpYk1eMbk406MLPTtkVvoLSGrPy+xBo+u21rIQl+CxwQEsSFjypzxQKzBU2qW2CCjt9BhYiAcOzaZKW5tTfqj+lKFyBebHw2e7MhCX8HOzmSGeO7c5K/6Y6Xk9sWm9tfXEA/Q4PF1nbrSX1IfXtIWxQZ4TQHLmYOq3GkRQpHrhPLQG6K0mHoXmlztkzp3Wnn6dVDgOi0TdPnQa8JDNkDutCCvPtrU/nrFA+qgwHWSD70VPGRx5EzcT/1g5CGk9tePNTfbkz+6D86ukwS9BJt2Wg+roHJ2YA83sC5S57uOMZ92yA281I3A23Xq8sWkPkbrQx/ig/bgV83pQ/e+gVdqf/3Y9qnftH+Xjutkbi8UFHXE0AdReAhIthJ4bI0adj5cxqY38JH1Ewm6J4ZanSnEtHTmTBdebmA1UNPOh7F/2/tMLjISdE94sya8i6bXm403YvWrkuK4aV/0NqYSI0H3hDcBHdlgaJZYQly6P2xyA/c2phKzTNCV5ZIbbxsRecicEcOJlX20adZGrCyTTbYSSD2makql7FL61MdoLXRvlLbIRBxiWqnrWsktW8gO64ZcLqKT3B1WPvF01JZuWAMO6yZBF8vJJQQpbx66UZSj5SwTh3VbJujV+tBrcmu5J9cWqKlWfqbeIkCdbTnOlr9Hpba6dSl96mOIhe7QrSX6kMraSTktVmdbTctt5LButOZycejW8odHF0SqC5dyWqzO1g+P/S0Wzuq2TNCr3D5XO4uuwMM2uznLlXJLX3U24Yzmts+tza2VHa+7FKbKF0654506m6iIKgXd246V7vC8WChFADblwhJ1NlERvQTdzG40syfN7JSZ3bng8x0ze3R6fNvMro1f1JfxttjSHa09LLkPqTJ1xtLZSlxDD/2mNbqc63sHcBHwNPA64BLgEeDQ3DnXA5dO/30T8J1V39tkHrqX4ElLD0sW6SlxDWP/ZsmxV9N+6MA7gQdmXt8F3LXk/EuB/1r1vc0Jujdh057loi8lrmHM3xw69oaMlQLjfqig3wLcM/P6duDuJef/yez5XUdzgj5WYcu1ks7L7GeeoWLgoU4lVkPG/M2SD40pMO6HCvqtCwT9kx3nvgt4AnhNx+dHgF1gd2trK1mFi+BwiXAWcnRob7OfGOXyVKfaLfQhY29oOQqM+ywuF+AtU1/7Nau+MwRZ6M2QQ5i8tu2QcnmqU+0+9CFtOVSQK7TQLwaeAa6aCYq+ee6cLeAUcP2q79s7mhN0TxZXblK7DrzOfoaUy1udSrh/Yv1myQev1+ZDn/x/bgaemlrgx6bvHQWOTv99D/AT4OHp0fmDe0dzgh6CH59oa3iyZmOVy2udamXTsRdDkGvKckl1NCnoteP1huR19tOKD33seO33HUjQxWq8C4zXQecxy8VrW4koLBP0KjfnEglIucGVyIfXjdlENJrbnEskwPP+L6I/uTZm07J9l0jQxQTtKtgGOW7MMZ4QpRtCEiTonsnZ6bWrYBvkuDEPnQWkfmTgiJGgeyV3px/LroKtk+PGPHQW4HW//gaQoHulRKfP9bDoGqjVJZDjxjx0FpAzXlPrddyUrvSX1IfSFlfgbSXhmPCewlmaWja0avQ6siRtURa6V1oPUnq2nOQSWM7QWUCueM0Yr2OX0qc+ZKGvoFHrIoTgv26aHaUnx+KnRq8jY7bQXRiCmxQipi/URSPM4N1yyjU78nZdcpIjXtP6LHcRXUqf+shhobswBEsXovTvLyK15TTU+svRZh6vS2s02saMdS8XFxvalS5E6d/PXaZYgzi1S8DjdWmRBve1WSboTe/lsm/fZJTMYzaZ6WWhdCFK//4iUu43UsueNB6vi0dOnpy44k6fnrhKjh8fdzotI97LxYULrXQhSv/+IlLmSteyJ43H6+IN7ytKPcZAukz31I
d86CP2oaecBtfiyvB4Xbzh+VoWvH6M1YceghMXWulClP79+bKkHAg1CaWn6+IRz2mHBW82ywS9aR+6cEgOH7f8rm3gOR5SMAYyWh969Xj00Q0lh49be9K0gecdQJ3GQCToXvEeENoUpwNBOMTzDqBObzYSdK94X025KU4HgnCK19mW05uNfOheaTlPWT5uITZGPvQY5PZnt+ya8Gp1CVE59Qt6DqEt4c+Wa0IIsSZ1C3ouoS319CCHPjohhF/q9qHnylNt2Z8thKiKdn3oufbtaNmfLYRohroFPZfQyp8thKiAugU9l9DKny2EqICLSxdgEHuCmiOneWdHAi6EcE3dgg4SWiGEmFK3y0UIIcT/I0EXQohGkKALIUQjSNCFEKIRJOhCCNEIxZb+m9lZYMG6/aa5DHihdCGcoTa5ELXJhahNXuZgCOHAog+KCfoYMbPdrj0Yxora5ELUJheiNumHXC5CCNEIEnQhhGgECXpeTpQugEPUJheiNrkQtUkP5EMXQohGkIUuhBCNIEEXQohGkKBHxsxuNLMnzeyUmd254PMdM3t0enzbzK4tUc7crGqXmfPeama/MLNbcpavBH3axMxuMLOHzexxM/tm7jLmpsf4eZWZfdnMHpm2yR0lyumWEIKOSAdwEfA08DrgEuAR4NDcOdcDl07/fRPwndLl9tAuM+f9K/AV4JbS5S7dJsCrge8DW9PXv1q63A7a5M+AT0z/fQD4MXBJ6bJ7OWShx+VtwKkQwjMhhJ8D9wHvnT0hhPDtEMJPpi8fBK7IXMYSrGyXKR8G/gn4Uc7CFaJPm/wB8MUQwmmAEELr7dKnTQLwSjMz4BVMBP2lvMX0iwQ9LpcDz828PjN9r4s/BP4laYl8sLJdzOxy4PeBz2QsV0n69JVrgEvN7Btm9pCZvT9b6crQp03uBt4EPA88BnwkhHAuT/H8U/8Ti3xhC95bmBdqZu9iIui/lbREPujTLn8DfCyE8IuJ8dU8fdrkYuA64N3ALwP/bmYPhhCeSl24QvRpk/cADwO/A7we+JqZ/VsI4aeJy1YFEvS4nAGunHl9BRNL4jzM7C3APcBNIYT/yVS2kvRpl8PAfVMxvwy42cxeCiF8KUsJ89OnTc4AL4QQfgb8zMy+BVwLtCrofdrkDuCvwsSJfsrMfgC8EfiPPEX0jVwucfkucLWZXWVmlwDvA+6fPcHMtoAvArc3bGnNs7JdQghXhRC2QwjbwBeADzQs5tCjTYB/Bn7bzC42s/3A24EnMpczJ33a5DSTGQtm9lrgDcAzWUvpGFnoEQkhvGRmHwIeYBKx/2wI4XEzOzr9/DPAnwOvAT49tUZfCo3vItezXUZFnzYJITxhZl8FHgXOAfeEEL5XrtRp6dlPPg7ca2aPMXHRfCyEoG11p2jpvxBCNIJcLkII0QgSdCGEaAQJuhBCNIIEXQghGkGCLoQQjSBBF0KIRpCgCyFEI/wfjK+tFAv51KMAAAAASUVORK5CYII=\n",
"text/plain": [
"