{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%load_ext watermark"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sebastian Raschka 2016-06-25 \n",
"\n",
"CPython 3.5.1\n",
"IPython 4.2.0\n",
"\n",
"scikit-learn 0.17.1\n",
"matplotlib 1.5.1\n",
"numpy 1.11.0\n",
"pandas 0.18.1\n"
]
}
],
"source": [
"%watermark -v -d -a 'Sebastian Raschka' -p scikit-learn,matplotlib,numpy,pandas"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Principal Component Analysis in 3 Simple Steps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Principal Component Analysis (PCA) is a simple yet popular and useful linear transformation technique that is used in numerous applications, such as stock market predictions, the analysis of gene expression data, and many more. In this tutorial, we will see that PCA is not just a \"black box\", and we are going to unravel its internals in 3 basic steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This article just got a complete overhaul, the original version is still available at [principal_component_analysis_old.ipynb](http://nbviewer.ipython.org/github/rasbt/pattern_classification/blob/master/dimensionality_reduction/projection/principal_component_analysis.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n", " | sepal_len | \n", "sepal_wid | \n", "petal_len | \n", "petal_wid | \n", "class | \n", "
---|---|---|---|---|---|
145 | \n", "6.7 | \n", "3.0 | \n", "5.2 | \n", "2.3 | \n", "Iris-virginica | \n", "
146 | \n", "6.3 | \n", "2.5 | \n", "5.0 | \n", "1.9 | \n", "Iris-virginica | \n", "
147 | \n", "6.5 | \n", "3.0 | \n", "5.2 | \n", "2.0 | \n", "Iris-virginica | \n", "
148 | \n", "6.2 | \n", "3.4 | \n", "5.4 | \n", "2.3 | \n", "Iris-virginica | \n", "
149 | \n", "5.9 | \n", "3.0 | \n", "5.1 | \n", "1.8 | \n", "Iris-virginica | \n", "