{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "code", "collapsed": false, "input": [ "%autosave 10" ], "language": "python", "metadata": {}, "outputs": [ { "javascript": [ "IPython.notebook.set_autosave_interval(10000)" ], "metadata": {}, "output_type": "display_data" }, { "output_type": "stream", "stream": "stdout", "text": [ "Autosaving every 10 seconds\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slides: [https://www.dropbox.com/s/efdlptzl53hj3zi/wideio-actrec-tutorial-pydata-ldn2014.pdf](https://www.dropbox.com/s/efdlptzl53hj3zi/wideio-actrec-tutorial-pydata-ldn2014.pdf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Source code: git clone https://bitbucket.org/wideio/pydata.git\n", "\n", "- `~/Programming/pydata`\n", "\n", "Requires:\n", "\n", "- Python\n", "- NumPy / SciPy\n", "- PIL\n", "- OpenCV2\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why Python\n", "\n", "- People usually come from MATLAB\n", "- Now Python, a real and usable language, is much better\n", "- Multiparadigm, operator overloading, lots of packages, community, ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is action recognition?\n", "\n", "- Classification task.\n", "- What are people doing in a video? Clapping, jogging, waving, boxing.\n", "- KTH, Human Action Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Supervised vs semi-supervised\n", "\n", "- More priors: action specific, holistic approaches\n", "- Less priors: feature-based, appearance-based, deep learning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Very traditional pipeline\n", "\n", " Raw data -> Feature Extraction -> Machine Learning\n", " \n", "- Insert domain-specific knowledge into parsing of raw data; hand-craft features \n", "\n", " Video Frames -> Bags of Keypoints -> SVM" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading videos\n", "\n", "- ffmpeg is the benchmark\n", "- OpenCV ffmpeg wrapper is unmaintained though\n", "- But if you also want sound probably want MLT (MLT Multimedia Framework)\n", " - Made by broadcasters\n", " - Backend of many open-source video editing software\n", " - `brew install mlt`\n", "- Prefer `pyglet` to `PIL`\n", " - `pyglet` uses OpenGL, works on all platforms (not Windows 64)\n", " - or use `QT`.\n", " - (!!AI not a fan of PIL!)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Goal of this tutorial\n", "\n", "Classify kissing and non-kissing pictures in video frames." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Harris Corner Detector\n", "\n", "- Classic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optical flow - wrapper based code\n", "\n", " python keypoints/custom_feature.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Feature extractions\n", "\n", "- Image\n", "- Produce a descriptor verctor using SIFT. Detect gradient.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See `~/Programming/pydata` for more info. There are IPython Notebooks there too." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Process\n", "\n", " Feature detction -> PCA Projections\n", " -> Pick more important eigenvectors\n", " -> Clustering\n", " -> Graphic Model -> centers -> Bag of Words" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Feature detection stack for image\n", "- PCA projection with graphic model\n", "- Direction of projection to BOW cluster centers with metric\n", "- Invert with 'discriminatpr' to turn into weighting\n", "- Normalise (area = 1)\n", "- Mean of stack to create histogram\n", "- Then throw this result into the classifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##\u00a0How to run part3 BOW\n", "\n", " cd ~/pydata/part3\n", " ipython\n", " \n", " >>> run bow.py\n", " >>> demo()\n", " \n", " # prints mean, centres, eigenvectors, but also saves model to file\n", " \n", " >>> run featurecomputer.py\n", " >>> demo()\n", " \n", " # computes histogram for a particular file\n", " \n", " >>> run svm_trainer_tester.py\n", " >>> parambash()\n", " \n", " # output param_bash_big.png. lighter colours are better.\n", " #\u00a0outputs a new direcory, opt3.1.2, with reports on classification\n", " \n", " " ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }