{
"cells": [
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"import matplotlib as mpl\n",
"import matplotlib.pyplot as plot\n",
"\n",
"%matplotlib inline\n",
"mpl.rcParams['figure.dpi'] = 125"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Machine Learning 101\n",
"===========\n",
"\n",
"KharkivPy #17\n",
"November 25th, 2017\n",
"\n",
"by Roman Podoliaka, Software Engineer at DataRobot\n",
"\n",
"twitter: @rpodoliaka\n",
"email: roman.podoliaka@gmail.com\n",
"blog: http://podoliaka.org\n",
"slides: http://podoliaka.org/talks/"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Goal\n",
"\n",
"Take a close look at one of the simplest machine learning algorithms - *logistic regression* - to understand how it works internally and how it can be applied to image classification.\n",
"\n",
"... and, hopefully, dispel at least some of the hype around machine learning."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Problem\n",
"\n",
"Create an algorithm (also called a **model**) that classifies whether an image contains a *dog* or a *cat*.\n",
"\n",
"Kaggle competition: https://www.kaggle.com/c/dogs-vs-cats\n",
"\n",
"There are much better algorithms for this task, but we'll stick to logistic regression *for the sake of simplicity*."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Input data\n",
"\n",
"25000 **labeled** examples of JPEG images (of **different sizes**) of cats and dogs, e.g.: *cat.0.jpg*, *cat.1.jpg*, ..., *dog.1.jpg*, *dog.2.jpg*...\n",
"\n",
"