{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# FastICA on 2D point clouds\n\nThis example illustrates visually in the feature space a comparison by\nresults using two different component analysis techniques.\n\n`ICA` vs `PCA`.\n\nRepresenting ICA in the feature space gives the view of 'geometric ICA':\nICA is an algorithm that finds directions in the feature space\ncorresponding to projections with high non-Gaussianity. These directions\nneed not be orthogonal in the original feature space, but they are\northogonal in the whitened feature space, in which all directions\ncorrespond to the same variance.\n\nPCA, on the other hand, finds orthogonal directions in the raw feature\nspace that correspond to directions accounting for maximum variance.\n\nHere we simulate independent sources using a highly non-Gaussian\nprocess, 2 student T with a low number of degrees of freedom (top left\nfigure). We mix them to create observations (top right figure).\nIn this raw observation space, directions identified by PCA are\nrepresented by orange vectors. We represent the signal in the PCA space,\nafter whitening by the variance corresponding to the PCA vectors (lower\nleft). Running ICA corresponds to finding a rotation in this space to\nidentify the directions of largest non-Gaussianity (lower right).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate sample data\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n\nfrom sklearn.decomposition import PCA, FastICA\n\nrng = np.random.RandomState(42)\nS = rng.standard_t(1.5, size=(20000, 2))\nS[:, 0] *= 2.0\n\n# Mix data\nA = np.array([[1, 1], [0, 2]]) # Mixing matrix\n\nX = np.dot(S, A.T) # Generate observations\n\npca = PCA()\nS_pca_ = pca.fit(X).transform(X)\n\nica = FastICA(random_state=rng, whiten=\"arbitrary-variance\")\nS_ica_ = ica.fit(X).transform(X) # Estimate the sources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot results\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n\n\ndef plot_samples(S, axis_list=None):\n plt.scatter(\n S[:, 0], S[:, 1], s=2, marker=\"o\", zorder=10, color=\"steelblue\", alpha=0.5\n )\n if axis_list is not None:\n for axis, color, label in axis_list:\n x_axis, y_axis = axis / axis.std()\n plt.quiver(\n (0, 0),\n (0, 0),\n x_axis,\n y_axis,\n zorder=11,\n width=0.01,\n scale=6,\n color=color,\n label=label,\n )\n\n plt.hlines(0, -5, 5, color=\"black\", linewidth=0.5)\n plt.vlines(0, -3, 3, color=\"black\", linewidth=0.5)\n plt.xlim(-5, 5)\n plt.ylim(-3, 3)\n plt.gca().set_aspect(\"equal\")\n plt.xlabel(\"x\")\n plt.ylabel(\"y\")\n\n\nplt.figure()\nplt.subplot(2, 2, 1)\nplot_samples(S / S.std())\nplt.title(\"True Independent Sources\")\n\naxis_list = [(pca.components_.T, \"orange\", \"PCA\"), (ica.mixing_, \"red\", \"ICA\")]\nplt.subplot(2, 2, 2)\nplot_samples(X / np.std(X), axis_list=axis_list)\nlegend = plt.legend(loc=\"upper left\")\nlegend.set_zorder(100)\n\nplt.title(\"Observations\")\n\nplt.subplot(2, 2, 3)\nplot_samples(S_pca_ / np.std(S_pca_))\nplt.title(\"PCA recovered signals\")\n\nplt.subplot(2, 2, 4)\nplot_samples(S_ica_ / 
 ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 0 }