{ "metadata": { "name": "", "signature": "sha256:4cb52e439951ef85798d947a31bb63d6eebe06a17dafed8b3f5c62855fcd98e5" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Feeding LIBSVM format data into CAFFE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook shows how to feed LIBSVM format data into a CAFFE net." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "%cd ..\n", "import os, shutil\n", "dir_o = 'examples/tmp.libsvm_data'\n", "! mkdir -p {dir_o}" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Populating the interactive namespace from numpy and matplotlib\n", "/Users/takuya/Work/caffe\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cell above moves to the CAFFE root directory and creates a temporary directory, `examples/tmp.libsvm_data`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the handwritten digits dataset provided by scikit-learn as an example. Each image has shape (8, 8). There are 10 classes and 1797 samples." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.datasets import load_digits\n", "digits = load_digits()\n", "print digits.images.shape, digits.target.shape" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1797, 8, 8) (1797,)\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first image looks like this."
] }, { "cell_type": "code", "collapsed": false, "input": [ "imshow(digits.images[0], cmap=gray(), interpolation='none')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAPYAAAD7CAYAAABZjGkWAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADCdJREFUeJzt3V+MnXWdx/HPpzNt2oKhRcQVGdIClmi4gKlbGhEXDRrW\nWLkRKonRYOKVfxo3MUov9rbhSkmMNwoEtWKy42JoXNdWJAZjqFBbbTulKS0kLQUsaWEjsAks3704\nT80EpfPM8zy/33S+fb+SyZxzps98vz0zn/k955zf+f0cEQKQy6L5bgDA8Ag2kBDBBhIi2EBCBBtI\niGADCY33/Qa2eb0MmEcR4bfe1jvY2d12222djtu3b5+uvvrqOR+3ZcuWTvXuvvtubdq0ac7H7dix\no1O9bdu2acOGDXM+7s477+xU77XXXtOyZcvmfNypU6c61VsINmzYoG3btv3Dr3EqDiREsIGECHYh\nF198cdV61113XdV6a9asqVpvfJxHjXNBsAupHez169dXrXfVVVdVrbd48eKq9RY6gg0kRLCBhAg2\nkNCswbZ9s+0nbR+y/c0aTQHo54zBtj0m6buSbpb0AUm3235/jcYAdDfbiL1O0lMR8UxEvC7pp5Ju\nKd8WgD5mC/Z7JR2dcf1YcxuAs9hsweYNHsACNFuwn5U0MeP6hEajNoB5dvDgwbf92mzBfkLS+2yv\nsr1E0kZJDw3YG4COzjT774wTcCPiDdtfkfQrSWOS7omIA8O2B2Bos86sj4hfSvplhV4ADISZZ0BC\nBBtIiGADCRFsICGCDSREsIGECDaQEMEGEiLYQEILbk1X++92Mymq684cXV1++eVV61144YVV6734\n4otV623cuLFqvampqar13g4jNpAQwQYSIthAQgQbSIhgAwkRbCAhgg0kRLCBhAg2kFCbvbvutf2C\n7b01GgLQX5sR+z6N9u4CsEDMGuyIeFTSqQq9ABgIj7GBhAg2sED12eIHwFnqTFv8EGwgoTYvdz0g\n6feS1tg+avuO8m0B6KPN3l2312gEwHA4FQcSIthAQgQbSIhgAwkRbCAhgg0kRLCBhAg2kBDBBhJa\ncHt3rV27tmq91atXV6135ZVXVq135MiRqvW2b99etV7t3xf27gJQDMEGEiLYQEIEG0iIYAMJEWwg\nIYINJESwgYQINpBQm8UMJ2w/Ynu/7X22v1ajMQDdtZlS+rqkr0fEHtvnS9ple0dEHCjcG4CO2uzd\n9XxE7Gku/1XSAUmXlG4MQHdzeoxte5WkayXtLNEMgGG0DnZzGj4laVMzcgOYR7337rK9WNLPJP04\nIn4+UF8Aeui1d5dtS7pH0nREfGfAvgAU0mbEvl7S5yR91Pbu5uPmwn0B6KHN3l2/ExNZgAWFwAIJ\nEWwgIYINJESwgYQINpAQwQYSIthAQgQbSIhgAwktuL27VqxYUbXerl27qtY7fPhw1XqjtwLUU/v+\nPFcxYgMJEWwgIYINJESwgYQINpAQwQYSIthAQgQbSIhgAwm1WaV0qe2dtvfYnra9pUZjALprs5jh\n/9r+aES8antc0u9sf7hZ5BDAWajVqXhEvNpcXCJpTNLJYh0B6K3tTiCLbO+R9IKkRyJiumxbAPpo\nO2K/GRHXSLpU0kds31i0KwCz6r1312kR8bKkX0j6YM+eAPTUd+
+ui2yvaC4vk/RxSbsH6w7A4Nos\ntPAeSffbXqTRH4IfRcTDZdsC0Eebl7v2Spqs0AuAgTDzDEiIYAMJEWwgIYINJESwgYQINpAQwQYS\nIthAQgQbSGjB7d21cuXKqvUefpjZs0Oq/fM7efLcXDqAERtIiGADCRFsICGCDSREsIGECDaQEMEG\nEiLYQEIEG0io7YYBY7Z3295WuiEA/bUdsTdJmpYUBXsBMJA264pfKumTkn4gycU7AtBbmxH725K+\nIenNwr0AGMgZg237U5L+EhG7xWgNnFX67N31IUmftv20pAckfcz2DwfsDUBHnffuiojNETEREasl\nfVbSbyLi8wP3B2Bgc30dm2fFgQWg9QoqEfFbSb8t2AuAgTDzDEiIYAMJEWwgIYINJESwgYQINpAQ\nwQYSIthAQgQbSGjB7d310ksvVa03OTlZtV5ttffSqn1/Tk1NVa13tmDEBhIi2EBCBBtIiGADCRFs\nICGCDSREsIGECDaQEMEGEmo188z2M5L+R9L/SXo9ItaVbApAP22nlIakGyPiZMlmAAxjLqfi7AQC\nLBBtgx2Sfm37CdtfKtkQgP7anopfHxHP2X6XpB22n4yIR0s2BuDM+uzdJUmKiOeazyckPSiJJ8+A\nedZ57y5Jsr3c9juay+dJ+oSkvYN1B2BwbU7F3y3pQdun//3WiNhetCsAvcwa7Ih4WtI1FXoBMBBm\nngEJEWwgIYINJESwgYQINpAQwQYSIthAQgQbSIhgAwktuL27jhw5UrXe2rVrq9a79dZbU9er7a67\n7prvFuYFIzaQEMEGEiLYQEIEG0iIYAMJEWwgIYINJESwgYQINpBQm1VKV9iesn3A9rTt9TUaA9Bd\nmymld0v6r4j4jO1xSecV7glAT2cMtu0LJN0QEV+QpIh4Q9LLNRoD0N1sp+KrJZ2wfZ/tP9r+vu3l\nNRoD0N1swR6XNCnpexExKekVSd8q3hWAWfXZu+uYpGMR8XhzfUqjoAOYZ5337oqI5yUdtb2muekm\nSfuHaw1ACW2eFf+qpK22l0g6LOmOsi0B6KvN3l1/kvTPFXoBMBBmngEJEWwgIYINJESwgYQINpAQ\nwQYSIthAQgQbSIhgAwmxd9csNm/eXLXeli1bqtbbtWtX1Xrr1q2rWu9cxYgNJESwgYQINpAQwQYS\nIthAQgQbSIhgAwkRbCChNlv8XGV794yPl21/rUZzALpps+bZQUnXSpLtRZKelfRg4b4A9DDXU/Gb\nJB2OiKMlmgEwjLkG+7OSflKiEQDDaR3sZl3xDZL+o1w7AIYwlxH7XyXtiogTpZoB0F6fvbtmul3S\nA727ATCIznt3nWb7PI2eOPvPgXoCUFCrhRYi4hVJFxXuBcBAmHkGJESwgYQINpAQwQYSIthAQgQb\nSOicCXZEVK13/PjxqvUee+yxqvX2799ftV7tn99Cd84Eu7bawd65c2fVetPT01XrYW4INpDQIFv8\nTE5OzvmY48eP65JLLpnzcbbnfEyfepdddlmneocOHep07JIlSzrVGxsb63TsypUrO9VbunRpp2O7\n/K5I3X9+XU/hu9brqku9K6644m2/5r6PXWzz4AeYRxHxd6Nd72ADOPvwGBtIiGADCc1LsG3fbPtJ\n24dsf7NwrXttv2B7b8k6M+pN2H7E9n7b+0ov1Wx7qe2dtvfYnrZdfINt22PNUtTbStdq6j1j+89N\nzT8UrrXC9pTtA839ub5grXJLe0dE1Q9JY5KekrRK0mJJeyS9v2C9GzRaPnlvpf/fP0m6prl8vqSD\nJf9/TZ3lzedxSY9J+nDhev8maaukhyrdp09LurBSrfslfXHG/XlBpbqLJD0naWKI7zcfI/Y6SU9F\nxDMR8bqkn0q6pVSxiHhU0qlS3/8f1Hs+IvY0l/8q6YCkoq+bRMSrzcUlGv3hPFmqlu1LJX1S0g8k\ndXvtsWPp4gXsCyTdEBH3Sl
JEvBERL5eu2xh0ae/5CPZ7Jc1s/lhzWzq2V2l0tlB0WpjtRbb3SHpB\n0iMRUXJa2LclfUPSmwVrvFVI+rXtJ2x/qWCd1ZJO2L7P9h9tf9/28oL1Zhp0ae/5CPY58fqa7fMl\nTUna1IzcxUTEmxFxjaRLJX3E9o0l6tj+lKS/RMRu1R2tr4+IazVaKffLtm8oVGdc0qSk70XEpKRX\nJH2rUK2/KbG093wE+1lJEzOuT2g0aqdhe7Gkn0n6cUT8vFbd5rTxF5I+WKjEhyR92vbTGq1Y+zHb\nPyxU628i4rnm8wmNtpdaV6jUMUnHIuLx5vqURkEvbfClvecj2E9Iep/tVc1fqo2SHpqHPorwaM7r\nPZKmI+I7FepdZHtFc3mZpI9L2l2iVkRsjoiJiFit0anjbyLi8yVqnWZ7ue13NJfPk/QJSUVe4YiI\n5yUdtb2muekmSTXexjb40t6DzBWfi4h4w/ZXJP1Koyd67omIA6Xq2X5A0r9Ieqfto5L+PSLuK1VP\n0vWSPifpz7ZPB+zOiPjvQvXeI+n+ZsPERZJ+FBEPF6r1VjUeVr1b0oPNewTGJW2NiO0F631V0tZm\n0Dks6Y6CtWYu7T3ocwdMKQUSYuYZkBDBBhIi2EBCBBtIiGADCRFsICGCDSREsIGE/h8o2EHgA9yd\n3QAAAABJRU5ErkJggg==\n", "text": [ "" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dataset is split into 2. We use first one for training and second one for testing." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def split2(arr):\n", " div = arr.shape[0] / 2\n", " return arr[:div], arr[div:]\n", "data2 = split2(digits.data)\n", "target2 = split2(digits.target)\n", "print [d.shape[0] for d in data2]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[898, 899]\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is just an example of training and testing of digits data using Logistic Regression provided by Scikit-Learn. The test accuracy is 91.7%." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.linear_model import LogisticRegression\n", "# Fit on the first half of the data (training set)\n", "lr = LogisticRegression().fit(data2[0], target2[0])\n", "from sklearn.metrics import accuracy_score\n", "# Evaluate on the second half of the data (test set)\n", "print 'Test Accuracy:', accuracy_score(target2[1], lr.predict(data2[1]))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Test Accuracy: 0.916573971079\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Converting digits data to LIBSVM format" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset is saved in LIBSVM format so that it can be used from CAFFE. The format is as follows.\n", "\n", "