{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# k-Nearest Neighbor Methods and Image Data\n", "\n", "In this lab we will explore MNIST, a classic machine learning data set of images of handwritten digits (0 through 9).\n", "In addition, we will investigate an intuitive yet powerful learning method called k-nearest neighbors (KNN).\n", "Even though the type of data is different from what we've worked with so far, we'll see how to apply familiar tools to it, namely scikit-learn for machine learning and matplotlib for plotting in Python.\n", "\n", "Don't forget to fill out the [response form](https://docs.google.com/a/berkeley.edu/forms/d/188vgp28CXJrJ_qRpyGGBNwG0RjPvAfzirH9zLeSTMKo/viewform).\n", "\n", "If you haven't already, fetch the data by running the following code (it will download into the `DATA_PATH` directory):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "import pylab\n", "from sklearn.datasets import fetch_mldata\n", "\n", "# Download MNIST into DATA_PATH; later runs reuse the cached copy.\n", "DATA_PATH = '~/data'\n", "mnist = fetch_mldata('MNIST original', data_home=DATA_PATH)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## MNIST\n", "\n", "