{ "metadata": { "name": "05A_application_to_face_recognition" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Example from Image Processing" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we'll take a look at a simple facial recognition example.\n", "This uses a dataset available within scikit-learn consisting of a\n", "subset of the [Labeled Faces in the Wild](http://vis-www.cs.umass.edu/lfw/)\n", "data. Note that this is a relatively large download (~200MB) so it may\n", "take a while to execute." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import datasets\n", "lfw_people = datasets.fetch_lfw_people(min_faces_per_person=70, resize=0.4,\n", " data_home='datasets')\n", "lfw_people.data.shape" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you're on a unix-based system such as linux or Mac OSX, these shell commands\n", "can be used to see the downloaded dataset:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!ls datasets" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "!du -sh datasets/lfw_home" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's visualize these faces to see what we're working with:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "fig = plt.figure(figsize=(8, 6))\n", "# plot several images\n", "for i in range(15):\n", " ax = fig.add_subplot(3, 5, i + 1, xticks=[], yticks=[])\n", " ax.imshow(lfw_people.images[i], cmap=plt.cm.bone)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "plt.figure(figsize=(10, 2))\n", "\n", "unique_targets = np.unique(lfw_people.target)\n", "counts = [(lfw_people.target == i).sum() for i in unique_targets]\n", "\n", "plt.xticks(unique_targets, lfw_people.target_names[unique_targets])\n", "locs, labels = plt.xticks()\n", "plt.setp(labels, rotation=45, size=14)\n", "_ = plt.bar(unique_targets, counts)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One thing to note is that these faces have already been localized and scaled\n", "to a common size. This is an important preprocessing piece for facial\n", "recognition, and is a process that can require a large collection of training\n", "data. This can be done in scikit-learn, but the challenge is gathering a\n", "sufficient amount of training data for the algorithm to work\n", "\n", "Fortunately, this piece is common enough that it has been done. One good\n", "resource is [OpenCV](http://opencv.willowgarage.com/wiki/FaceRecognition), the\n", "*Open Computer Vision Library*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll perform a Support Vector classification of the images. We'll\n", "do a typical train-test split on the images to make this happen:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.cross_validation import train_test_split\n", "X_train, X_test, y_train, y_test = train_test_split(lfw_people.data, lfw_people.target, random_state=0)\n", "\n", "print X_train.shape, X_test.shape" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Preprocessing: Principal Component Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1850 dimensions is a lot for SVM. We can use PCA to reduce these 1850 features to a manageable\n", "size, while maintaining most of the information in the dataset. Here it is useful to use a variant\n", "of PCA called ``RandomizedPCA``, which is an approximation of PCA that can be much faster for large\n", "datasets. The interface is the same as the normal PCA we saw earlier:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import decomposition\n", "pca = decomposition.RandomizedPCA(n_components=150, whiten=True)\n", "pca.fit(X_train)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One interesting part of PCA is that it computes the \"mean\" face, which can be\n", "interesting to examine:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "plt.imshow(pca.mean_.reshape((50, 37)), cmap=plt.cm.bone)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The principal components measure deviations about this mean along orthogonal axes.\n", "It is also interesting to visualize these principal components:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print pca.components_.shape" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "fig = plt.figure(figsize=(16, 6))\n", "for i in range(30):\n", " ax = fig.add_subplot(3, 10, i + 1, xticks=[], yticks=[])\n", " ax.imshow(pca.components_[i].reshape((50, 37)), cmap=plt.cm.bone)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The components (\"eigenfaces\") are ordered by their importance from top-left to bottom-right.\n", "We see that the first few components seem to primarily take care of lighting\n", "conditions; the remaining components pull out certain identifying features:\n", "the nose, eyes, eyebrows, etc." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this projection computed, we can now project our original training\n", "and test data onto the PCA basis:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "X_train_pca = pca.transform(X_train)\n", "X_test_pca = pca.transform(X_test)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "print X_train_pca.shape\n", "print X_test_pca.shape" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These projected components correspond to factors in a linear combination of\n", "component images such that the combination approaches the original face." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Doing the Learning: Support Vector Machines" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll perform support-vector-machine classification on this reduced dataset:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import svm\n", "clf = svm.SVC(C=5., gamma=0.001)\n", "clf.fit(X_train_pca, y_train)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can evaluate how well this classification did. First, we might plot a\n", "few of the test-cases with the labels learned from the training set:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "fig = plt.figure(figsize=(8, 6))\n", "for i in range(15):\n", " ax = fig.add_subplot(3, 5, i + 1, xticks=[], yticks=[])\n", " ax.imshow(X_test[i].reshape((50, 37)), cmap=plt.cm.bone)\n", " y_pred = clf.predict(X_test_pca[i])[0]\n", " color = 'black' if y_pred == y_test[i] else 'red'\n", " ax.set_title(lfw_people.target_names[y_pred], fontsize='small', color=color)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The classifier is correct on an impressive number of images given the simplicity\n", "of its learning model! Using a linear classifier on 150 features derived from\n", "the pixel-level data, the algorithm correctly identifies a large number of the\n", "people in the images.\n", "\n", "Again, we can\n", "quantify this effectiveness using one of several measures from the ``sklearn.metrics``\n", "module. First we can do the classification report, which shows the precision,\n", "recall and other measures of the \"goodness\" of the classification:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import metrics\n", "y_pred = clf.predict(X_test_pca)\n", "print(metrics.classification_report(y_test, y_pred, target_names=lfw_people.target_names))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another interesting metric is the *confusion matrix*, which indicates how often\n", "any two items are mixed-up. The confusion matrix of a perfect classifier\n", "would only have nonzero entries on the diagonal, with zeros on the off-diagonal." ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(metrics.confusion_matrix(y_test, y_pred))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "print(metrics.f1_score(y_test, y_pred))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Pipelining" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above we used PCA as a pre-processing step before applying our support vector machine classifier.\n", "Plugging the output of one estimator directly into the input of a second estimator is a commonly\n", "used pattern; for this reason scikit-learn provides a ``Pipeline`` object which automates this\n", "process. The above problem can be re-expressed as a pipeline as follows:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.pipeline import Pipeline" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "clf = Pipeline([('pca', decomposition.RandomizedPCA(n_components=150, whiten=True)),\n", " ('svm', svm.LinearSVC(C=1.0))])" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "clf.fit(X_train, y_train)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "y_pred = clf.predict(X_test)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "print metrics.confusion_matrix(y_pred, y_test)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results are not identical because we used the randomized version of the PCA -- because the\n", "projection varies slightly each time, the results vary slightly as well." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "A Quick Note on Facial Recognition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we have used PCA \"eigenfaces\" as a pre-processing step for facial recognition.\n", "The reason we chose this is because PCA is a broadly-applicable technique, which can\n", "be useful for a wide array of data types. Research in the field of facial recognition\n", "in particular, however, has shown that other more specific feature extraction methods\n", "are can be much more effective." ] } ], "metadata": {} } ] }