"
],
"text/plain": [
""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import addutils.toc ; addutils.toc.js(ipy_notebook=True)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import scipy.io\n",
"import numpy as np\n",
"import pandas as pd\n",
"from addutils import css_notebook\n",
"css_notebook()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1 Principal Component Analysis (PCA)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before trying any ML technique it is always a good idea to visualize the available data from different points of view, but when the dimensionality of the problem is high it becomes difficult to use that kind of plot, and it is necessary to perform **dimensionality reduction**.\n",
"\n",
"We have already seen some simple visualization examples made with scatter plots or scatter matrices in tutorial ml01. In this lesson we go further by working with some of the many data projection techniques available in scikit-learn.\n",
"\n",
"In the first example we will consider the *digits* dataset, which contains many 8x8 grayscale images of handwritten digits. In other words, this dataset is made of samples with 64 features.\n",
"\n",
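"The dataset described above ships with scikit-learn and can be loaded directly (a minimal sketch, assuming `sklearn` is installed in this environment):\n",
"\n",
"```python\n",
"from sklearn.datasets import load_digits\n",
"\n",
"digits = load_digits()\n",
"print(digits.data.shape)    # (1797, 64): one row per image, 64 pixel features\n",
"print(digits.images.shape)  # (1797, 8, 8): the same data kept as 8x8 images\n",
"```\n",
"\n",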
"What we want to do is obtain a descriptive 2D scatter plot."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:bokeh.resources:Getting CDN URL for local dev version will not produce usable URL\n"
]
},
{
"data": {
"text/html": [
" \n",
" \n",
" \n",
" \n",
"