{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Factor Analysis (with rotation) to visualize patterns\n\nInvestigating the Iris dataset, we see that sepal length, petal\nlength and petal width are highly correlated. Sepal width is\nless redundant. Matrix decomposition techniques can uncover\nthese latent patterns. Applying rotations to the resulting\ncomponents does not inherently improve the predictive value\nof the derived latent space, but can help visualize their\nstructure; here, for example, the varimax rotation, which\nis found by maximizing the variance of the squared weights,\nfinds a structure where the second component only loads\npositively on sepal width.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause\n\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nfrom sklearn.datasets import load_iris\nfrom sklearn.decomposition import PCA, FactorAnalysis\nfrom sklearn.preprocessing import StandardScaler" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the Iris data and standardize the features\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "data = load_iris()\nX = StandardScaler().fit_transform(data[\"data\"])\nfeature_names = data[\"feature_names\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the correlation matrix of the Iris features\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ax = plt.axes()\n\nim = ax.imshow(np.corrcoef(X.T), cmap=\"RdBu_r\", vmin=-1, vmax=1)\n\nax.set_xticks([0, 1, 2, 3])\nax.set_xticklabels(list(feature_names), rotation=90)\nax.set_yticks([0, 1, 2, 3])\nax.set_yticklabels(list(feature_names))\n\nplt.colorbar(im).ax.set_ylabel(\"$r$\", rotation=0)\nax.set_title(\"Iris feature correlation 
matrix\")\nplt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run PCA and factor analysis, with and without varimax rotation\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "n_comps = 2\n\nmethods = [\n    (\"PCA\", PCA()),\n    (\"Unrotated FA\", FactorAnalysis()),\n    (\"Varimax FA\", FactorAnalysis(rotation=\"varimax\")),\n]\nfig, axes = plt.subplots(ncols=len(methods), figsize=(10, 8), sharey=True)\n\nfor ax, (method, fa) in zip(axes, methods):\n    fa.set_params(n_components=n_comps)\n    fa.fit(X)\n\n    # Transpose so that rows are features and columns are components\n    components = fa.components_.T\n    print(\"\\n\\n %s :\\n\" % method)\n    print(components)\n\n    # Symmetric color limits keep zero at the center of the colormap\n    vmax = np.abs(components).max()\n    ax.imshow(components, cmap=\"RdBu_r\", vmax=vmax, vmin=-vmax)\n    ax.set_yticks(np.arange(len(feature_names)))\n    ax.set_yticklabels(feature_names)\n    ax.set_title(str(method))\n    ax.set_xticks([0, 1])\n    ax.set_xticklabels([\"Comp. 1\", \"Comp. 2\"])\nfig.suptitle(\"Factors\")\nplt.tight_layout()\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 0 }