{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Machine Learning Overview"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Preliminaries\n",
"- Goal\n",
" - Top-level overview of machine learning\n",
"- Materials\n",
" - Study Bishop pp. 1-4\n",
" - Study this notebook "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### What is Machine Learning?\n",
"- Machine Learning relates to **building models from data**.\n",
" - Suppose we want to make a model for a complex process about which we have little knowledge (so hand-programming is not possible).\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" - **Solution**: Get the computer to program itself by showing it examples of the behavior that we want.\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" - Practically, we choose a library of models, and write a program that picks a model and tunes it to fit the data.\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" - **Criterion**: a _good_ model generalizes well to unseen data from the same process.\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- This method is known in various scientific communities under different names such as machine learning, statistical inference, system identification, data mining, source coding, data compression, etc."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Machine learning and the scientific inquiry loop.\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Machine Learning is Difficult\n",
"\n",
"- Modeling (Learning) Problems\n",
" - Is there any regularity in the data anyway?\n",
" - What is our prior knowledge and how to express it mathematically?\n",
" - How to pick the model library?\n",
" - How to tune the models to the data?\n",
" - How to measure the generalization performance?\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- Quality of Observed Data\n",
" - Not enough data\n",
" - Too much data?\n",
" - Available data may be messy (measurement noise, missing data points, outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### A Machine Learning Taxonomy\n",
"\n",
"- **Supervised Learning**: Given examples of inputs and corresponding\n",
"desired outputs, predict outputs on future inputs.\n",
" - Examples: classification, regression, time series prediction\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- **Unsupervised Learning**: (a.k.a. **density estimation**). Given only inputs, automatically discover representations, features, structure, etc.\n",
" - Examples: clustering, outlier detection, compression\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- **Reinforcement Learning**: Given sequences of inputs, actions from a\n",
"fixed set, and scalar rewards/punishments, _learn_ to select action\n",
"sequences in a way that maximizes expected reward, e.g. chess and robotics. (This is more akin to learning how to design good experiments and is not covered in this course.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- Other stuff, like **Preference Learning**, **learning to rank**, etc. (also not covered in this course). Note that many machine learning problems can be (re-)formulated as special cases of either a supervised or unsupervised problem, which are both covered in this class. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Supervised Learning\n",
"\n",
"- Given observations $D=\\{(x_1,y_1),\\dots,(x_N,y_N)\\}$, the goal is to estimate the conditional distribution $p(y|x)$.\n",
"\n",
"##### Classification \n",
"\n",
"\n",
"\n",
"- The target variable $y$ is a _discrete-valued_ vector representing class labels "
]
},
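{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"##### Classification example (illustrative)\n",
"\n",
"- A minimal sketch of what estimating $p(y|x)$ could look like for two classes, with a scalar label $y \\in \\{0,1\\}$. The logistic form and the weight vector $w$ are assumptions made for illustration only; the problem statement itself does not prescribe them.\n",
"\n",
"$$p(y=1|x,w) = \\frac{1}{1+\\exp(-w^T x)}, \\qquad p(y=0|x,w) = 1 - p(y=1|x,w)$$\n",
"\n",
"- Under this assumption, and treating the observations as independent, tuning the model to $D$ amounts to choosing $w$, for instance by maximizing the likelihood $\\prod_{n=1}^N p(y_n|x_n,w)$."
]
},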
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"##### Regression \n",
"\n",
"\n",
"\n",
"- Same problem statement as classification but now the target variable is a _real-valued_ vector."
]
},
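{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"##### Regression example (illustrative)\n",
"\n",
"- A minimal sketch of one possible model for $p(y|x)$ with a real-valued scalar target $y$. The linear-Gaussian form, the weight vector $w$ and the noise variance $\\sigma^2$ are illustrative assumptions, not part of the problem statement.\n",
"\n",
"$$p(y|x,w) = \\mathcal{N}(y|w^T x,\\sigma^2)$$\n",
"\n",
"- If the observations in $D$ are treated as independent and $\\sigma^2$ is held fixed, maximizing the likelihood over $w$ is equivalent to minimizing the sum of squared errors:\n",
"\n",
"$$\\hat{w} = \\arg\\min_w \\sum_{n=1}^N (y_n - w^T x_n)^2$$"
]
},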
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Unsupervised Learning\n",
"\n",
"Given data $D=\\{x_1,\\ldots,x_N\\}$, model the (unconditional) probability distribution $p(x)$ (a.k.a. **density estimation**).\n",
"\n",
"\n",
"##### Clustering\n",
"\n",
"\n",
"\n",
"- Group data into clusters such that all data points in a cluster have similar properties."
]
},
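{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"##### Clustering example (illustrative)\n",
"\n",
"- A minimal sketch of how clustering can be phrased as density estimation. The mixture-of-Gaussians form, the number of clusters $K$ and the parameters $\\pi_k$, $\\mu_k$, $\\Sigma_k$ are assumptions chosen for illustration.\n",
"\n",
"$$p(x) = \\sum_{k=1}^K \\pi_k \\, \\mathcal{N}(x|\\mu_k,\\Sigma_k), \\qquad \\pi_k \\geq 0, \\quad \\sum_{k=1}^K \\pi_k = 1$$\n",
"\n",
"- Under this assumption, a data point $x_n$ could be assigned to the cluster $k$ with the largest posterior probability $\\pi_k \\, \\mathcal{N}(x_n|\\mu_k,\\Sigma_k) / p(x_n)$."
]
},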
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"##### Compression / dimensionality reduction\n",
"\n",
"\n",
"\n",
"- Output from coder is much smaller in size than original, but if coded signal if further processed by a decoder, then the result is very close (or exactly equal) to the original."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Some Machine Learning Applications\n",
"\n",
"- computer speech recognition, speaker recognition\n",
"- face recognition, iris identification\n",
"- printed and handwritten text parsing\n",
"- financial prediction, outlier detection (credit-card fraud)\n",
"- user preference modeling (amazon); modeling of human perception\n",
"- modeling of the web (google)\n",
"- machine translation\n",
"- medical expert systems for disease diagnosis (e.g., mammogram)\n",
"- strategic games (chess, go, backgammon)\n",
"- **any 'knowledge-poor' but 'data-rich' problem**\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"open(\"../../styles/aipstyle.html\") do f display(\"text/html\", readstring(f)) end"
]
}
],
"metadata": {
"anaconda-cloud": {},
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Julia 0.6.1",
"language": "julia",
"name": "julia-0.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}