{ "metadata": { "name": "", "signature": "sha256:bfb31bb1823019a06be0e4046bd6f095bc9a2ed7fabb0ab516320643107eb1a9" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "R-CNN detection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "R-CNN is an excellent object detection model. Although its accuracy and efficiency fall somewhat short of many of today's state-of-the-art methods, its core idea underlies many later algorithms. This notebook is based on the official example, with changes only for running it locally and with added explanatory notes: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/detection.ipynb\n", "\n", "For details, see the authors' paper: Rich feature hierarchies for accurate object detection and semantic segmentation. Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. CVPR 2014. arXiv 2013.\n", "\n", "
---Last update: June 8, 2015
" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Preparation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Download the pretrained R-CNN model; you can also train your own model with Caffe. The pretrained model is based on the ImageNet dataset and fine-tuned on ILSVRC13, and it outputs 200 detection classes.\n", "\n", "To download: ~/caffe-master$ ./scripts/download_model_binary.py models/bvlc_reference_rcnn_ilsvrc13 \n", "\n", "2. Download Selective Search and run MATLAB to compile the required MEX files.\n", "\n", "(1) Download from https://github.com/sergeyk/selective_search_ijcv_with_python , then unpack it, rename it, and copy it to ~/caffe-master/python/selective_search_ijcv_with_python/ \n", "\n", "(2) To compile: start MATLAB and run ~/caffe-master/python/selective_search_ijcv_with_python/demo.m ; once it runs without errors, close MATLAB.\n", "\n", "3. If running python/detect.py raises an error, the following fixes may help:\n", "\n", "(1) Error message: OSError: [Errno 2] No such file or directory\n", "\n", " File to edit: ~/caffe-master/python/selective_search_ijcv_with_python/selective_search.py\n", " Before: mc = \"matlab -nojvm -r \\\"try; {}; catch; exit; end; exit\\\"\".format(command)\n", " After (use the absolute path of your own MATLAB binary): mc = \"/usr/local/MATLAB/R2014a/bin/matlab -nojvm -r \\\"try; {}; catch; exit; end; exit\\\"\".format(command)\n", " \n", "(2) Error message: ValueError: 'axis' entry 2 is out of bounds (-2, 2)\n", "\n", " 
File to edit: ~/caffe-master/python/caffe/detector.py\n", " Before: predictions = out[self.outputs[0]].squeeze(axis=(2, 3))\n", " After: predictions = out[self.outputs[0]].squeeze()" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Generate Region Proposals and Extract Features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a temporary directory and list the images to run detection on. Several images can be listed at once, but they will be processed together as a single sample, which suits fusing multiple preprocessing variants." ] }, { "cell_type": "code", "collapsed": false, "input": [ "% cd '/home/ouxinyu/caffe-master'\n", "! mkdir -p _temp\n", "! echo examples/images/fish-bike.jpg > _temp/det_input.txt" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "/home/ouxinyu/caffe-master\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Call Selective Search to generate region proposals, then call Caffe to classify them. The command below runs in GPU mode; to run in CPU mode, remove --gpu." ] }, { "cell_type": "code", "collapsed": false, "input": [ "! 
python/detect.py --crop_mode=selective_search --pretrained_model=models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel --model_def=models/bvlc_reference_rcnn_ilsvrc13/deploy.prototxt --gpu --raw_scale=255 _temp/det_input.txt _temp/det_output.h5" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "GPU mode\r\n", "WARNING: Logging before InitGoogleLogging() is written to STDERR\r\n", "I0608 10:32:38.067106 6131 net.cpp:42] Initializing net from parameters: \r\n", "name: \"R-CNN-ilsvrc13\"\r\n", "input: \"data\"\r\n", "input_dim: 10\r\n", "input_dim: 3\r\n", "input_dim: 227\r\n", "input_dim: 227\r\n", "state {\r\n", " phase: TEST\r\n", "}\r\n", "layer {\r\n", " name: \"conv1\"\r\n", " type: \"Convolution\"\r\n", " bottom: \"data\"\r\n", " top: \"conv1\"\r\n", " convolution_param {\r\n", " num_output: 96\r\n", " kernel_size: 11\r\n", " stride: 4\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu1\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"conv1\"\r\n", " top: \"conv1\"\r\n", "}\r\n", "layer {\r\n", " name: \"pool1\"\r\n", " type: \"Pooling\"\r\n", " bottom: \"conv1\"\r\n", " top: \"pool1\"\r\n", " pooling_param {\r\n", " pool: MAX\r\n", " kernel_size: 3\r\n", " stride: 2\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"norm1\"\r\n", " type: \"LRN\"\r\n", " bottom: \"pool1\"\r\n", " top: \"norm1\"\r\n", " lrn_param {\r\n", " local_size: 5\r\n", " alpha: 0.0001\r\n", " beta: 0.75\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"conv2\"\r\n", " type: \"Convolution\"\r\n", " bottom: \"norm1\"\r\n", " top: \"conv2\"\r\n", " convolution_param {\r\n", " num_output: 256\r\n", " pad: 2\r\n", " kernel_size: 5\r\n", " group: 2\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu2\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"conv2\"\r\n", " top: \"conv2\"\r\n", "}\r\n", "layer {\r\n", " name: \"pool2\"\r\n", " type: \"Pooling\"\r\n", " bottom: \"conv2\"\r\n", " top: \"pool2\"\r\n", " 
pooling_param {\r\n", " pool: MAX\r\n", " kernel_size: 3\r\n", " stride: 2\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"norm2\"\r\n", " type: \"LRN\"\r\n", " bottom: \"pool2\"\r\n", " top: \"norm2\"\r\n", " lrn_param {\r\n", " local_size: 5\r\n", " alpha: 0.0001\r\n", " beta: 0.75\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"conv3\"\r\n", " type: \"Convolution\"\r\n", " bottom: \"norm2\"\r\n", " top: \"conv3\"\r\n", " convolution_param {\r\n", " num_output: 384\r\n", " pad: 1\r\n", " kernel_size: 3\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu3\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"conv3\"\r\n", " top: \"conv3\"\r\n", "}\r\n", "layer {\r\n", " name: \"conv4\"\r\n", " type: \"Convolution\"\r\n", " bottom: \"conv3\"\r\n", " top: \"conv4\"\r\n", " convolution_param {\r\n", " num_output: 384\r\n", " pad: 1\r\n", " kernel_size: 3\r\n", " group: 2\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu4\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"conv4\"\r\n", " top: \"conv4\"\r\n", "}\r\n", "layer {\r\n", " name: \"conv5\"\r\n", " type: \"Convolution\"\r\n", " bottom: \"conv4\"\r\n", " top: \"conv5\"\r\n", " convolution_param {\r\n", " num_output: 256\r\n", " pad: 1\r\n", " kernel_size: 3\r\n", " group: 2\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu5\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"conv5\"\r\n", " top: \"conv5\"\r\n", "}\r\n", "layer {\r\n", " name: \"pool5\"\r\n", " type: \"Pooling\"\r\n", " bottom: \"conv5\"\r\n", " top: \"pool5\"\r\n", " pooling_param {\r\n", " pool: MAX\r\n", " kernel_size: 3\r\n", " stride: 2\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"fc6\"\r\n", " type: \"InnerProduct\"\r\n", " bottom: \"pool5\"\r\n", " top: \"fc6\"\r\n", " inner_product_param {\r\n", " num_output: 4096\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu6\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"fc6\"\r\n", " top: \"fc6\"\r\n", "}\r\n", "layer {\r\n", " name: \"drop6\"\r\n", " type: \"Dropout\"\r\n", " bottom: \"fc6\"\r\n", " 
top: \"fc6\"\r\n", " dropout_param {\r\n", " dropout_ratio: 0.5\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"fc7\"\r\n", " type: \"InnerProduct\"\r\n", " bottom: \"fc6\"\r\n", " top: \"fc7\"\r\n", " inner_product_param {\r\n", " num_output: 4096\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"relu7\"\r\n", " type: \"ReLU\"\r\n", " bottom: \"fc7\"\r\n", " top: \"fc7\"\r\n", "}\r\n", "layer {\r\n", " name: \"drop7\"\r\n", " type: \"Dropout\"\r\n", " bottom: \"fc7\"\r\n", " top: \"fc7\"\r\n", " dropout_param {\r\n", " dropout_ratio: 0.5\r\n", " }\r\n", "}\r\n", "layer {\r\n", " name: \"fc-rcnn\"\r\n", " type: \"InnerProduct\"\r\n", " bottom: \"fc7\"\r\n", " top: \"fc-rcnn\"\r\n", " inner_product_param {\r\n", " num_output: 200\r\n", " }\r\n", "}\r\n", "I0608 10:32:38.067556 6131 net.cpp:370] Input 0 -> data\r\n", "I0608 10:32:38.067576 6131 layer_factory.hpp:74] Creating layer conv1\r\n", "I0608 10:32:38.067585 6131 net.cpp:90] Creating Layer conv1\r\n", "I0608 10:32:38.067589 6131 net.cpp:410] conv1 <- data\r\n", "I0608 10:32:38.067595 6131 net.cpp:368] conv1 -> conv1\r\n", "I0608 10:32:38.067603 6131 net.cpp:120] Setting up conv1\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "I0608 10:32:38.108999 6131 net.cpp:127] Top shape: 10 96 55 55 (2904000)\r\n", "I0608 10:32:38.109035 6131 layer_factory.hpp:74] Creating layer relu1\r\n", "I0608 10:32:38.109048 6131 net.cpp:90] Creating Layer relu1\r\n", "I0608 10:32:38.109055 6131 net.cpp:410] relu1 <- conv1\r\n", "I0608 10:32:38.109063 6131 net.cpp:357] relu1 -> conv1 (in-place)\r\n", "I0608 10:32:38.109076 6131 net.cpp:120] Setting up relu1\r\n", "I0608 10:32:38.109233 6131 net.cpp:127] Top shape: 10 96 55 55 (2904000)\r\n", "I0608 10:32:38.109244 6131 layer_factory.hpp:74] Creating layer pool1\r\n", "I0608 10:32:38.109257 6131 net.cpp:90] Creating Layer pool1\r\n", "I0608 10:32:38.109263 6131 net.cpp:410] pool1 <- conv1\r\n", "I0608 10:32:38.109269 6131 net.cpp:368] pool1 -> pool1\r\n", "I0608 
10:32:38.109277 6131 net.cpp:120] Setting up pool1\r\n", "I0608 10:32:38.109311 6131 net.cpp:127] Top shape: 10 96 27 27 (699840)\r\n", "I0608 10:32:38.109318 6131 layer_factory.hpp:74] Creating layer norm1\r\n", "I0608 10:32:38.109325 6131 net.cpp:90] Creating Layer norm1\r\n", "I0608 10:32:38.109329 6131 net.cpp:410] norm1 <- pool1\r\n", "I0608 10:32:38.109335 6131 net.cpp:368] norm1 -> norm1\r\n", "I0608 10:32:38.109341 6131 net.cpp:120] Setting up norm1\r\n", "I0608 10:32:38.109349 6131 net.cpp:127] Top shape: 10 96 27 27 (699840)\r\n", "I0608 10:32:38.109352 6131 layer_factory.hpp:74] Creating layer conv2\r\n", "I0608 10:32:38.109360 6131 net.cpp:90] Creating Layer conv2\r\n", "I0608 10:32:38.109364 6131 net.cpp:410] conv2 <- norm1\r\n", "I0608 10:32:38.109370 6131 net.cpp:368] conv2 -> conv2\r\n", "I0608 10:32:38.109376 6131 net.cpp:120] Setting up conv2\r\n", "I0608 10:32:38.109931 6131 net.cpp:127] Top shape: 10 256 27 27 (1866240)\r\n", "I0608 10:32:38.109947 6131 layer_factory.hpp:74] Creating layer relu2\r\n", "I0608 10:32:38.109954 6131 net.cpp:90] Creating Layer relu2\r\n", "I0608 10:32:38.109959 6131 net.cpp:410] relu2 <- conv2\r\n", "I0608 10:32:38.109966 6131 net.cpp:357] relu2 -> conv2 (in-place)\r\n", "I0608 10:32:38.109972 6131 net.cpp:120] Setting up relu2\r\n", "I0608 10:32:38.110002 6131 net.cpp:127] Top shape: 10 256 27 27 (1866240)\r\n", "I0608 10:32:38.110008 6131 layer_factory.hpp:74] Creating layer pool2\r\n", "I0608 10:32:38.110014 6131 net.cpp:90] Creating Layer pool2\r\n", "I0608 10:32:38.110018 6131 net.cpp:410] pool2 <- conv2\r\n", "I0608 10:32:38.110024 6131 net.cpp:368] pool2 -> pool2\r\n", "I0608 10:32:38.110030 6131 net.cpp:120] Setting up pool2\r\n", "I0608 10:32:38.110136 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)\r\n", "I0608 10:32:38.110144 6131 layer_factory.hpp:74] Creating layer norm2\r\n", "I0608 10:32:38.110152 6131 net.cpp:90] Creating Layer norm2\r\n", "I0608 10:32:38.110157 6131 net.cpp:410] norm2 <- 
pool2\r\n", "I0608 10:32:38.110162 6131 net.cpp:368] norm2 -> norm2\r\n", "I0608 10:32:38.110168 6131 net.cpp:120] Setting up norm2\r\n", "I0608 10:32:38.110175 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)\r\n", "I0608 10:32:38.110179 6131 layer_factory.hpp:74] Creating layer conv3\r\n", "I0608 10:32:38.110187 6131 net.cpp:90] Creating Layer conv3\r\n", "I0608 10:32:38.110191 6131 net.cpp:410] conv3 <- norm2\r\n", "I0608 10:32:38.110198 6131 net.cpp:368] conv3 -> conv3\r\n", "I0608 10:32:38.110203 6131 net.cpp:120] Setting up conv3\r\n", "I0608 10:32:38.111160 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)\r\n", "I0608 10:32:38.111176 6131 layer_factory.hpp:74] Creating layer relu3\r\n", "I0608 10:32:38.111183 6131 net.cpp:90] Creating Layer relu3\r\n", "I0608 10:32:38.111189 6131 net.cpp:410] relu3 <- conv3\r\n", "I0608 10:32:38.111194 6131 net.cpp:357] relu3 -> conv3 (in-place)\r\n", "I0608 10:32:38.111202 6131 net.cpp:120] Setting up relu3\r\n", "I0608 10:32:38.111232 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)\r\n", "I0608 10:32:38.111238 6131 layer_factory.hpp:74] Creating layer conv4\r\n", "I0608 10:32:38.111243 6131 net.cpp:90] Creating Layer conv4\r\n", "I0608 10:32:38.111248 6131 net.cpp:410] conv4 <- conv3\r\n", "I0608 10:32:38.111253 6131 net.cpp:368] conv4 -> conv4\r\n", "I0608 10:32:38.111260 6131 net.cpp:120] Setting up conv4\r\n", "I0608 10:32:38.112344 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)\r\n", "I0608 10:32:38.112357 6131 layer_factory.hpp:74] Creating layer relu4\r\n", "I0608 10:32:38.112365 6131 net.cpp:90] Creating Layer relu4\r\n", "I0608 10:32:38.112370 6131 net.cpp:410] relu4 <- conv4\r\n", "I0608 10:32:38.112375 6131 net.cpp:357] relu4 -> conv4 (in-place)\r\n", "I0608 10:32:38.112381 6131 net.cpp:120] Setting up relu4\r\n", "I0608 10:32:38.112411 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)\r\n", "I0608 10:32:38.112416 6131 layer_factory.hpp:74] Creating layer conv5\r\n", "I0608 10:32:38.112422 6131 
net.cpp:90] Creating Layer conv5\r\n", "I0608 10:32:38.112427 6131 net.cpp:410] conv5 <- conv4\r\n", "I0608 10:32:38.112432 6131 net.cpp:368] conv5 -> conv5\r\n", "I0608 10:32:38.112439 6131 net.cpp:120] Setting up conv5\r\n", "I0608 10:32:38.113263 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)\r\n", "I0608 10:32:38.113279 6131 layer_factory.hpp:74] Creating layer relu5\r\n", "I0608 10:32:38.113286 6131 net.cpp:90] Creating Layer relu5\r\n", "I0608 10:32:38.113291 6131 net.cpp:410] relu5 <- conv5\r\n", "I0608 10:32:38.113297 6131 net.cpp:357] relu5 -> conv5 (in-place)\r\n", "I0608 10:32:38.113303 6131 net.cpp:120] Setting up relu5\r\n", "I0608 10:32:38.113333 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)\r\n", "I0608 10:32:38.113339 6131 layer_factory.hpp:74] Creating layer pool5\r\n", "I0608 10:32:38.113347 6131 net.cpp:90] Creating Layer pool5\r\n", "I0608 10:32:38.113350 6131 net.cpp:410] pool5 <- conv5\r\n", "I0608 10:32:38.113356 6131 net.cpp:368] pool5 -> pool5\r\n", "I0608 10:32:38.113363 6131 net.cpp:120] Setting up pool5\r\n", "I0608 10:32:38.113502 6131 net.cpp:127] Top shape: 10 256 6 6 (92160)\r\n", "I0608 10:32:38.113520 6131 layer_factory.hpp:74] Creating layer fc6\r\n", "I0608 10:32:38.113528 6131 net.cpp:90] Creating Layer fc6\r\n", "I0608 10:32:38.113533 6131 net.cpp:410] fc6 <- pool5\r\n", "I0608 10:32:38.113538 6131 net.cpp:368] fc6 -> fc6\r\n", "I0608 10:32:38.113545 6131 net.cpp:120] Setting up fc6\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "I0608 10:32:38.140440 6131 net.cpp:127] Top shape: 10 4096 (40960)\r\n", "I0608 10:32:38.140478 6131 layer_factory.hpp:74] Creating layer relu6\r\n", "I0608 10:32:38.140492 6131 net.cpp:90] Creating Layer relu6\r\n", "I0608 10:32:38.140498 6131 net.cpp:410] relu6 <- fc6\r\n", "I0608 10:32:38.140506 6131 net.cpp:357] relu6 -> fc6 (in-place)\r\n", "I0608 10:32:38.140516 6131 net.cpp:120] Setting up relu6\r\n", "I0608 10:32:38.140576 6131 net.cpp:127] Top shape: 10 4096 
(40960)\r\n", "I0608 10:32:38.140583 6131 layer_factory.hpp:74] Creating layer drop6\r\n", "I0608 10:32:38.140589 6131 net.cpp:90] Creating Layer drop6\r\n", "I0608 10:32:38.140594 6131 net.cpp:410] drop6 <- fc6\r\n", "I0608 10:32:38.140599 6131 net.cpp:357] drop6 -> fc6 (in-place)\r\n", "I0608 10:32:38.140605 6131 net.cpp:120] Setting up drop6\r\n", "I0608 10:32:38.140611 6131 net.cpp:127] Top shape: 10 4096 (40960)\r\n", "I0608 10:32:38.140616 6131 layer_factory.hpp:74] Creating layer fc7\r\n", "I0608 10:32:38.140622 6131 net.cpp:90] Creating Layer fc7\r\n", "I0608 10:32:38.140630 6131 net.cpp:410] fc7 <- fc6\r\n", "I0608 10:32:38.140636 6131 net.cpp:368] fc7 -> fc7\r\n", "I0608 10:32:38.140643 6131 net.cpp:120] Setting up fc7\r\n", "I0608 10:32:38.153045 6131 net.cpp:127] Top shape: 10 4096 (40960)\r\n", "I0608 10:32:38.153095 6131 layer_factory.hpp:74] Creating layer relu7\r\n", "I0608 10:32:38.153105 6131 net.cpp:90] Creating Layer relu7\r\n", "I0608 10:32:38.153112 6131 net.cpp:410] relu7 <- fc7\r\n", "I0608 10:32:38.153120 6131 net.cpp:357] relu7 -> fc7 (in-place)\r\n", "I0608 10:32:38.153129 6131 net.cpp:120] Setting up relu7\r\n", "I0608 10:32:38.153200 6131 net.cpp:127] Top shape: 10 4096 (40960)\r\n", "I0608 10:32:38.153206 6131 layer_factory.hpp:74] Creating layer drop7\r\n", "I0608 10:32:38.153214 6131 net.cpp:90] Creating Layer drop7\r\n", "I0608 10:32:38.153219 6131 net.cpp:410] drop7 <- fc7\r\n", "I0608 10:32:38.153224 6131 net.cpp:357] drop7 -> fc7 (in-place)\r\n", "I0608 10:32:38.153231 6131 net.cpp:120] Setting up drop7\r\n", "I0608 10:32:38.153237 6131 net.cpp:127] Top shape: 10 4096 (40960)\r\n", "I0608 10:32:38.153242 6131 layer_factory.hpp:74] Creating layer fc-rcnn\r\n", "I0608 10:32:38.153249 6131 net.cpp:90] Creating Layer fc-rcnn\r\n", "I0608 10:32:38.153254 6131 net.cpp:410] fc-rcnn <- fc7\r\n", "I0608 10:32:38.153259 6131 net.cpp:368] fc-rcnn -> fc-rcnn\r\n", "I0608 10:32:38.153267 6131 net.cpp:120] Setting up fc-rcnn\r\n", "I0608 
10:32:38.154058 6131 net.cpp:127] Top shape: 10 200 (2000)\r\n", "I0608 10:32:38.154080 6131 net.cpp:194] fc-rcnn does not need backward computation.\r\n", "I0608 10:32:38.154085 6131 net.cpp:194] drop7 does not need backward computation.\r\n", "I0608 10:32:38.154090 6131 net.cpp:194] relu7 does not need backward computation.\r\n", "I0608 10:32:38.154095 6131 net.cpp:194] fc7 does not need backward computation.\r\n", "I0608 10:32:38.154100 6131 net.cpp:194] drop6 does not need backward computation.\r\n", "I0608 10:32:38.154105 6131 net.cpp:194] relu6 does not need backward computation.\r\n", "I0608 10:32:38.154110 6131 net.cpp:194] fc6 does not need backward computation.\r\n", "I0608 10:32:38.154115 6131 net.cpp:194] pool5 does not need backward computation.\r\n", "I0608 10:32:38.154129 6131 net.cpp:194] relu5 does not need backward computation.\r\n", "I0608 10:32:38.154134 6131 net.cpp:194] conv5 does not need backward computation.\r\n", "I0608 10:32:38.154139 6131 net.cpp:194] relu4 does not need backward computation.\r\n", "I0608 10:32:38.154145 6131 net.cpp:194] conv4 does not need backward computation.\r\n", "I0608 10:32:38.154150 6131 net.cpp:194] relu3 does not need backward computation.\r\n", "I0608 10:32:38.154155 6131 net.cpp:194] conv3 does not need backward computation.\r\n", "I0608 10:32:38.154160 6131 net.cpp:194] norm2 does not need backward computation.\r\n", "I0608 10:32:38.154165 6131 net.cpp:194] pool2 does not need backward computation.\r\n", "I0608 10:32:38.154170 6131 net.cpp:194] relu2 does not need backward computation.\r\n", "I0608 10:32:38.154175 6131 net.cpp:194] conv2 does not need backward computation.\r\n", "I0608 10:32:38.154180 6131 net.cpp:194] norm1 does not need backward computation.\r\n", "I0608 10:32:38.154193 6131 net.cpp:194] pool1 does not need backward computation.\r\n", "I0608 10:32:38.154198 6131 net.cpp:194] relu1 does not need backward computation.\r\n", "I0608 10:32:38.154203 6131 net.cpp:194] conv1 does not need 
backward computation.\r\n", "I0608 10:32:38.154208 6131 net.cpp:235] This network produces output fc-rcnn\r\n", "I0608 10:32:38.154220 6131 net.cpp:482] Collecting Learning Rate and Weight Decay.\r\n", "I0608 10:32:38.154227 6131 net.cpp:247] Network initialization done.\r\n", "I0608 10:32:38.154232 6131 net.cpp:248] Memory required for data: 62425920\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "E0608 10:32:38.221285 6131 upgrade_proto.cpp:618] Attempting to upgrade input file specified using deprecated V1LayerParameter: models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "I0608 10:32:38.324671 6131 upgrade_proto.cpp:626] Successfully upgraded file specified using deprecated V1LayerParameter\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Loading input...\r\n", "selective_search_rcnn({'/home/ouxinyu/caffe-master/examples/images/fish-bike.jpg'}, '/tmp/tmpu85WGa.mat')\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Processed 1570 windows in 17.131 s.\r\n", "/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:2487: PerformanceWarning: \r\n", "your performance may suffer as PyTables will pickle object types that it cannot\r\n", "map directly to c-types [inferred_type->mixed,key->block1_values] [items->['prediction']]\r\n", "\r\n", " warnings.warn(ws, PerformanceWarning)\r\n", "Saved to _temp/det_output.h5 in 0.025 s.\r\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The rest of the notebook runs without problems once the paths are adjusted; the explanations below are copied directly from the original notebook.\n", "\n", "Running this outputs a DataFrame with the filenames, selected windows, and their detection scores to an HDF5 file. 
(We only ran on one image, so the filenames will all be the same.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "df = pd.read_hdf('_temp/det_output.h5', 'df')\n", "print(df.shape)\n", "print(df.iloc[0])" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(1570, 5)\n", "prediction [-2.64134, -2.90464, -2.84325, -3.23465, -1.97...\n", "ymin 79.846\n", "xmin 9.62\n", "ymax 246.31\n", "xmax 339.624\n", "Name: /home/ouxinyu/caffe-master/examples/images/fish-bike.jpg, dtype: object\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "1570 regions were proposed with the R-CNN configuration of selective search. The number of proposals will vary from image to image based on its contents and size -- selective search isn't scale invariant.\n", "\n", "In general, detect.py is most efficient when running on a lot of images: it first extracts window proposals for all of them, batches the windows for efficient GPU processing, and then outputs the results. Simply list an image per line in the images_file, and it will process all of them.\n", "\n", "Although this guide gives an example of R-CNN ImageNet detection, detect.py is clever enough to adapt to different Caffe models\u2019 input dimensions, batch size, and output categories. You can switch the model definition and pretrained model as desired. Refer to python detect.py --help for the parameters to describe your data set. There's no need for hardcoding.\n", "\n", "Anyway, let's now load the ILSVRC13 detection class names and make a DataFrame of the predictions. 
Note you'll need the auxiliary ilsvrc2012 data fetched by data/ilsvrc12/get_ilsvrc12_aux.sh.\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "with open('data/ilsvrc12/det_synset_words.txt') as f:\n", "    labels_df = pd.DataFrame([\n", "        {\n", "            'synset_id': l.strip().split(' ')[0],\n", "            'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]\n", "        }\n", "        for l in f.readlines()\n", "    ])\n", "predictions_df = pd.DataFrame(np.vstack(df.prediction.values), columns=labels_df['name'])\n", "print(predictions_df.iloc[0])" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "name\n", "accordion -2.641338\n", "airplane -2.904639\n", "ant -2.843245\n", "antelope -3.234649\n", "apple -1.976960\n", "armadillo -2.488007\n", "artichoke -2.218568\n", "axe -2.338795\n", "baby bed -2.755479\n", "backpack -2.180768\n", "bagel -2.697270\n", "balance beam -2.780527\n", "banana -2.433329\n", "band aid -1.631823\n", "banjo -2.317316\n", "...\n", "trombone -2.587927\n", "trumpet -2.396858\n", "turtle -2.376043\n", "tv or monitor -2.763605\n", "unicycle -2.254395\n", "vacuum -1.918464\n", "violin -2.746913\n", "volleyball -2.758842\n", "waffle iron -2.421376\n", "washer -2.415665\n", "water bottle -2.175697\n", "watercraft -2.949454\n", "whale -3.157514\n", "wine bottle -2.790261\n", "zebra -2.768192\n", "Name: 0, Length: 200, dtype: float32\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at the activations." ] }, { "cell_type": "code", "collapsed": false, "input": [ "plt.gray()\n", "plt.matshow(predictions_df.values)\n", "plt.xlabel('Classes')\n", "plt.ylabel('Windows')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "