{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"tags": [
"s1",
"content",
"l1"
]
},
"source": [
"# Spectral Clustering\n",
"\n",
"## Spectral Clustering\n",
"\n",
"Spectral Clustering works by transforming the data into a subspace prior to clustering. This is incredibly useful when the data is high dimensional. This saves the effort of doing a PCA or a dimensionality reduction ourselves prior to clustering. Spectral clustering works by determining an affinity matrix between the datasets. The data is represented as a graph and an affinity matrix is computed. For the affinity function, we can use the rbf kernel function or nearest neighbors.\n",
"\n",
"Let us consider the intertwined circles with noise:\n",
"\n",
"\n",
"\n",
"
\n",
"## Exercise:\n",
"\n",
" - Apply spectral clustering to the dataset with number of clusters as 2.\n",
" - Assign the cluster labels to a dataframe, circles_df with column 'spectral'"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true,
"tags": [
"s1",
"ce",
"l1"
]
},
"outputs": [],
"source": [
"from sklearn.cluster import SpectralClustering\n",
"from sklearn import datasets\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"N_Samples = 2000\n",
"X, y = datasets.make_circles(n_samples=N_Samples, factor=.5, noise=.2)\n",
"noisy_circles = pd.DataFrame({'X_0':X[:,0],'X_1':X[:,1], 'y':y})\n",
"\n",
"# Fit the spectral clustering to the dataset and plot the results.\n",
"spectral = SpectralClustering(n_clusters=2, eigen_solver='arpack', affinity=\"nearest_neighbors\")\n",
"\n",
"noisy_circles.drop('y', 1)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"s1",
"l1",
"hint"
]
},
"source": [
"
use .fit and .labels_ to extract the cluster associations.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "s1", "l1", "ans" ] }, "outputs": [], "source": [ "spectral.fit(noisy_circles)\n", "noisy_circles['spectral'] = spectral.labels_\n", "g=sns.pairplot(x_vars=\"X_0\", y_vars=\"X_1\", hue = \"spectral\", data = noisy_circles)\n", "g.fig.set_size_inches(14, 6)\n", "sns.despine()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "s1", "hid", "l1" ] }, "outputs": [], "source": [ "ref_tmp_var = False\n", "\n", "\n", "try:\n", " ref_assert_var = False\n", " import numpy as np\n", " \n", " spectral_ = SpectralClustering(n_clusters=2, eigen_solver='arpack', affinity=\"nearest_neighbors\")\n", " spectral_.fit(noisy_circles)\n", " \n", " if (len(noisy_circles['spectral']) == len(spectral_.labels_)):\n", " ref_assert_var = True\n", " out = g\n", " else:\n", " ref_assert_var = False\n", " \n", "except Exception:\n", " print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "else:\n", " if ref_assert_var:\n", " ref_tmp_var = True\n", " else:\n", " print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "\n", "\n", "assert ref_tmp_var" ] }, { "cell_type": "markdown", "metadata": { "tags": [ "l2", "content", "s2" ] }, "source": [ "\n", "\n", "\n", "#
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "l2", "s2", "ans" ] }, "outputs": [], "source": [ "#" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "l2", "hid", "s2" ] }, "outputs": [], "source": [ "ref_tmp_var = False\n", "\n", "\n", "try:\n", " ref_assert_var = True\n", " \n", " \n", " \n", " \n", "except Exception:\n", " print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "else:\n", " if ref_assert_var:\n", " ref_tmp_var = True\n", " else:\n", " print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "\n", "\n", "assert ref_tmp_var" ] } ], "metadata": { "executed_sections": [], "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" }, "rf_version": 1 }, "nbformat": 4, "nbformat_minor": 2 }