{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Machine Learning Overview" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Preliminaries\n", "- Goal\n", " - Top-level overview of machine learning\n", "- Materials\n", " - Study Bishop pp. 1-4\n", " - Study this notebook " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### What is Machine Learning?\n", "- Machine Learning relates to **building models from data**.\n", " - Suppose we want to make a model for a complex process about which we have little knowledge (so hand-programming is not possible).\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - **Solution**: Get the computer to program itself by showing it examples of the behavior that we want.\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - Practically, we choose a library of models, and write a program that picks a model and tunes it to fit the data.\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " - **Criterion**: a _good_ model generalizes well to unseen data from the same process.\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- This method is known in various scientific communities under different names such as machine learning, statistical inference, system identification, data mining, source coding, data compression, etc." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Machine learning and the scientific inquiry loop.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Machine Learning is Difficult\n", "\n", "- Modeling (Learning) Problems\n", " - Is there any regularity in the data anyway?\n", " - What is our prior knowledge and how to express it mathematically?\n", " - How to pick the model library?\n", " - How to tune the models to the data?\n", " - How to measure the generalization performance?\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Quality of Observed Data\n", " - Not enough data\n", " - Too much data?\n", " - Available data may be messy (measurement noise, missing data points, outliers)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### A Machine Learning Taxonomy\n", "\n", "- **Supervised Learning**: Given examples of inputs and corresponding\n", "desired outputs, predict outputs on future inputs.\n", " - Examples: classification, regression, time series prediction\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- **Unsupervised Learning**: (a.k.a. **density estimation**). Given only inputs, automatically discover representations, features, structure, etc.\n", " - Examples: clustering, outlier detection, compression\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- **Reinforcement Learning**: Given sequences of inputs, actions from a\n", "fixed set, and scalar rewards/punishments, _learn_ to select action\n", "sequences in a way that maximizes expected reward, e.g. chess and robotics. 
(This is more akin to learning how to design good experiments and is not covered in this course.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Other paradigms, such as **preference learning** and **learning to rank** (also not covered in this course). Note that many machine learning problems can be (re-)formulated as special cases of either a supervised or an unsupervised problem, both of which are covered in this course. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Supervised Learning\n", "\n", "- Given observations $D=\\{(x_1,y_1),\\dots,(x_N,y_N)\\}$, the goal is to estimate the conditional distribution $p(y|x)$.\n", "\n", "##### Classification \n", "\n", "\n", "\n", "- The target variable $y$ is a _discrete-valued_ vector representing class labels " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##### Regression \n", "\n", "\n", "\n", "- Same problem statement as classification but now the target variable is a _real-valued_ vector." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Unsupervised Learning\n", "\n", "Given data $D=\\{x_1,\\ldots,x_N\\}$, model the (unconditional) probability distribution $p(x)$ (a.k.a. **density estimation**).\n", "\n", "\n", "##### Clustering\n", "\n", "\n", "\n", "- Group data into clusters such that all data points in a cluster have similar properties." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##### Compression / dimensionality reduction\n", "\n", "\n", "\n", "- The output of the coder is much smaller than the original, but if the coded signal is further processed by a decoder, the result is very close (or exactly equal) to the original." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Some Machine Learning Applications\n", "\n", "- computer speech recognition, speaker recognition\n", "- face recognition, iris identification\n", "- printed and handwritten text parsing\n", "- financial prediction, outlier detection (credit-card fraud)\n", "- user preference modeling (Amazon); modeling of human perception\n", "- modeling of the web (Google)\n", "- machine translation\n", "- medical expert systems for disease diagnosis (e.g., mammogram)\n", "- strategic games (chess, go, backgammon)\n", "- **any 'knowledge-poor' but 'data-rich' problem**\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "open(\"../../styles/aipstyle.html\") do f\n", "    display(\"text/html\", read(f,String))\n", "end" ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" } }, "nbformat": 4, "nbformat_minor": 1 }