{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"### About the Author"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some of Sebastian Raschka's greatest passions are \"Data Science\" and machine learning. Sebastian enjoys everything that involves working with data: The discovery of interesting patterns and coming up with insightful conclusions using techniques from the fields of data mining and machine learning for predictive modeling.\n",
"\n",
"Currently, Sebastian is sharpening his analytical skills as a PhD candidate at Michigan State University where he is working on a highly efficient virtual screening software for computer-aided drug-discovery and a novel approach to protein ligand docking (among other projects). Basically, it is about the screening of a database of millions of 3-dimensional structures of chemical compounds in order to identifiy the ones that could potentially bind to specific protein receptors in order to trigger a biological response.\n",
"\n",
"You can follow Sebastian on Twitter ([@rasbt](https://twitter.com/rasbt)) or read more about his favorite projects on [his blog](http://sebastianraschka.com/articles.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Principal Component Analysis in 3 Simple Steps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Principal Component Analysis (PCA) is a simple yet popular and useful linear transformation technique that is used in numerous applications, such as stock market predictions, the analysis of gene expression data, and many more. In this tutorial, we will see that PCA is not just a \"black box\", and we are going to unravel its internals in 3 basic steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n", " | sepal_len | \n", "sepal_wid | \n", "petal_len | \n", "petal_wid | \n", "class | \n", "
---|---|---|---|---|---|
145 | \n", "6.7 | \n", "3.0 | \n", "5.2 | \n", "2.3 | \n", "Iris-virginica | \n", "
146 | \n", "6.3 | \n", "2.5 | \n", "5.0 | \n", "1.9 | \n", "Iris-virginica | \n", "
147 | \n", "6.5 | \n", "3.0 | \n", "5.2 | \n", "2.0 | \n", "Iris-virginica | \n", "
148 | \n", "6.2 | \n", "3.4 | \n", "5.4 | \n", "2.3 | \n", "Iris-virginica | \n", "
149 | \n", "5.9 | \n", "3.0 | \n", "5.1 | \n", "1.8 | \n", "Iris-virginica | \n", "