{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "s1",
     "content",
     "l1"
    ]
   },
   "source": [
    "# Spectral Clustering\n",
    "\n",
    "Spectral clustering transforms the data into a lower-dimensional subspace before clustering. This is useful when the data is high dimensional, because it saves us from running PCA or another dimensionality reduction ourselves beforehand. The data points are treated as the nodes of a graph, and an affinity matrix records how similar each pair of points is. For the affinity function, we can use the RBF (Gaussian) kernel or a nearest-neighbors graph.\n",
    "\n",
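    "As a rough illustration of the two affinity choices mentioned above, the sketch below builds both on four toy points. The data, `gamma=1.0` and `n_neighbors=1` are assumptions made for this example, and the symmetrization of the neighbors graph is one common convention rather than the only one.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.metrics.pairwise import rbf_kernel\n",
    "from sklearn.neighbors import kneighbors_graph\n",
    "\n",
    "# Toy data (assumed for illustration): two tight pairs of points, far apart.\n",
    "points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])\n",
    "\n",
    "# Dense RBF affinity: exp(-gamma * ||x_i - x_j||^2); gamma=1.0 is arbitrary.\n",
    "rbf_affinity = rbf_kernel(points, gamma=1.0)\n",
    "\n",
    "# Sparse nearest-neighbors graph, symmetrized so the affinity is undirected.\n",
    "knn = kneighbors_graph(points, n_neighbors=1, include_self=False)\n",
    "knn_affinity = 0.5 * (knn + knn.T)\n",
    "\n",
    "print(np.round(rbf_affinity, 3))\n",
    "print(knn_affinity.toarray())\n",
    "```\n",
    "\n",
    "Points in the same tight pair get an RBF affinity close to 1, while points in different pairs get an affinity close to 0.\n",
    "\n",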
    "Let us consider two concentric circles with noise:\n",
    "\n",
    "<img src='https://s3.amazonaws.com/rfjh/media/CKEditorImages/2017/06/20/make_circles_292Si2F.png'/>\n",
    "\n",
    "<br/>\n",
    "## Exercise:\n",
    "\n",
    " - Apply spectral clustering to the dataset with the number of clusters set to 2.\n",
    " - Assign the cluster labels to a new column, `spectral`, of the dataframe `noisy_circles`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true,
    "tags": [
     "s1",
     "ce",
     "l1"
    ]
   },
   "outputs": [],
   "source": [
    "from sklearn.cluster import SpectralClustering\n",
    "from sklearn import datasets\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "\n",
    "N_Samples = 2000\n",
    "X, y = datasets.make_circles(n_samples=N_Samples, factor=.5,  noise=.2)\n",
    "noisy_circles = pd.DataFrame({'X_0':X[:,0],'X_1':X[:,1], 'y':y})\n",
    "\n",
    "# Fit the spectral clustering to the dataset and plot the results.\n",
    "spectral = SpectralClustering(n_clusters=2, eigen_solver='arpack', affinity=\"nearest_neighbors\")\n",
    "\n",
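    "# Drop the ground-truth labels so that the clustering only sees the coordinates.\n",
    "noisy_circles = noisy_circles.drop(columns='y')\n"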
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "s1",
     "l1",
     "hint"
    ]
   },
   "source": [
    "<p>Use the <code>.fit()</code> method and the <code>.labels_</code> attribute to extract the cluster assignments.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "tags": [
     "s1",
     "l1",
     "ans"
    ]
   },
   "outputs": [],
   "source": [
    "spectral.fit(noisy_circles)\n",
    "noisy_circles['spectral'] = spectral.labels_\n",
    "g=sns.pairplot(x_vars=\"X_0\", y_vars=\"X_1\", hue = \"spectral\", data = noisy_circles)\n",
    "g.fig.set_size_inches(14, 6)\n",
    "sns.despine()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "tags": [
     "s1",
     "hid",
     "l1"
    ]
   },
   "outputs": [],
   "source": [
    "ref_tmp_var = False\n",
    "\n",
    "\n",
    "try:\n",
    "    ref_assert_var = False\n",
    "    import numpy as np\n",
    "    \n",
    "    spectral_ = SpectralClustering(n_clusters=2, eigen_solver='arpack', affinity=\"nearest_neighbors\")\n",
    "    spectral_.fit(noisy_circles)\n",
    "    \n",
    "    if (len(noisy_circles['spectral']) == len(spectral_.labels_)):\n",
    "      ref_assert_var = True\n",
    "      out = g\n",
    "    else:\n",
    "      ref_assert_var = False\n",
    "    \n",
    "except Exception:\n",
    "    print('Please follow the instructions given and use the same variables provided in the instructions.')\n",
    "else:\n",
    "    if ref_assert_var:\n",
    "        ref_tmp_var = True\n",
    "    else:\n",
    "        print('Please follow the instructions given and use the same variables provided in the instructions.')\n",
    "\n",
    "\n",
    "assert ref_tmp_var"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "l2",
     "content",
     "s2"
    ]
   },
   "source": [
    "\n",
    "\n",
    "\n",
    "<br/><br/><br/>\n",
    "## The Algorithm\n",
    "\n",
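    "* Represent each data point as a vector in $\\mathbb{R}^{n}$\n",
    "* Form an affinity matrix $A$, using a Gaussian kernel or the adjacency matrix of a nearest-neighbors graph, for example:  \n",
    "$$A_{i,j}=\\exp\\left(-\\gamma \\lVert x_{i}-x_{j} \\rVert^{2}\\right)$$\n",
    "* Construct the graph Laplacian from $A$, for example $L = D - A$, where $D$ is the diagonal degree matrix with $D_{i,i}=\\sum_{j} A_{i,j}$\n",
    "* Solve an eigenvalue problem, such as $$L v=\\lambda v$$ \n",
    "* Select the $k$ eigenvectors $\\{ v_{i}, i=1, \\dots, k \\}$ corresponding to the $k$ smallest eigenvalues $\\{ \\lambda_{i}, i=1, \\dots, k \\}$; the matrix $P$ whose columns are these eigenvectors defines a $k$-dimensional subspace, in which $L$ is represented by $P^{t}LP$\n",
    "* Compute clusters in this subspace using k-means"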
   ]
  },
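  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "l2",
     "content",
     "s2"
    ]
   },
   "source": [
    "The steps above can be sketched from scratch as follows. This is a minimal illustration under stated assumptions, not how `SpectralClustering` is implemented internally: the dataset parameters, `n_neighbors=10`, the unnormalized Laplacian $L = D - A$ and the dense eigensolver are all choices made for this example.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from scipy.sparse.csgraph import laplacian\n",
    "from sklearn.cluster import KMeans\n",
    "from sklearn.datasets import make_circles\n",
    "from sklearn.neighbors import kneighbors_graph\n",
    "\n",
    "# Two concentric circles with light noise (parameters assumed for illustration).\n",
    "X, y = make_circles(n_samples=300, factor=0.5, noise=0.03, random_state=0)\n",
    "\n",
    "# Step 1: symmetric k-nearest-neighbors affinity matrix A.\n",
    "knn = kneighbors_graph(X, n_neighbors=10, include_self=False)\n",
    "A = 0.5 * (knn + knn.T).toarray()\n",
    "\n",
    "# Step 2: unnormalized graph Laplacian L = D - A.\n",
    "L = laplacian(A)\n",
    "\n",
    "# Step 3: eigenvectors of the k smallest eigenvalues (eigh sorts ascending).\n",
    "k = 2\n",
    "eigenvalues, eigenvectors = np.linalg.eigh(L)\n",
    "embedding = eigenvectors[:, :k]\n",
    "\n",
    "# Step 4: k-means in the k-dimensional spectral embedding.\n",
    "labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)\n",
    "```\n",
    "\n",
    "A dense eigendecomposition is only practical for small datasets; for larger ones a sparse solver such as `scipy.sparse.linalg.eigsh` would be the more usual choice."
   ]
  },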
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true,
    "tags": [
     "l2",
     "ce",
     "s2"
    ]
   },
   "outputs": [],
   "source": [
    "# This section will contain code examples that you can run to generate results as well as quizzes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "l2",
     "s2",
     "hint"
    ]
   },
   "source": [
    "<p>#</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "tags": [
     "l2",
     "s2",
     "ans"
    ]
   },
   "outputs": [],
   "source": [
    "#"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "tags": [
     "l2",
     "hid",
     "s2"
    ]
   },
   "outputs": [],
   "source": [
    "ref_tmp_var = False\n",
    "\n",
    "\n",
    "try:\n",
    "    ref_assert_var = True\n",
    "    \n",
    "    \n",
    "    \n",
    "    \n",
    "except Exception:\n",
    "    print('Please follow the instructions given and use the same variables provided in the instructions.')\n",
    "else:\n",
    "    if ref_assert_var:\n",
    "        ref_tmp_var = True\n",
    "    else:\n",
    "        print('Please follow the instructions given and use the same variables provided in the instructions.')\n",
    "\n",
    "\n",
    "assert ref_tmp_var"
   ]
  }
 ],
 "metadata": {
  "executed_sections": [],
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  },
  "rf_version": 1
 },
 "nbformat": 4,
 "nbformat_minor": 2
}