{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Convolutional Neural Networks\n",
    "- CS231n Convolutional Neural Networks for Visual Recognition: https://git.io/vKlww"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting MNIST_data/train-images-idx3-ubyte.gz\n",
      "Extracting MNIST_data/train-labels-idx1-ubyte.gz\n",
      "Extracting MNIST_data/t10k-images-idx3-ubyte.gz\n",
      "Extracting MNIST_data/t10k-labels-idx1-ubyte.gz\n"
     ]
    }
   ],
   "source": [
    "import tensorflow as tf\n",
    "from tensorflow.examples.tutorials.mnist import input_data\n",
    "mnist = input_data.read_data_sets(\"MNIST_data/\", one_hot=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "x = tf.placeholder(\"float\", [None, 784])\n",
    "y = tf.placeholder(\"float\", [None, 10])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Image reshape to 4D [-1, 28, 28, 1]\n",
    "  - The MNIST images have just one gray color value (the number of color channels is one), so that the last value is one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(?, 28, 28, 1)\n"
     ]
    }
   ],
   "source": [
    "x_image = tf.reshape(x, [-1, 28, 28, 1])\n",
    "print x_image.get_shape()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- **stride**\n",
    "  - we must specify the stride with which we slide the filter. \n",
    "  - stride = 1: we slide the filters one pixel at a time. \n",
    "  - stride = 2: we slide the filters two pixel at a time. \n",
    "  - This will produce smaller output volumes spatially.\n",
    "- **padding**\n",
    "  - we need pad the input volume with zeros around the border. \n",
    "  - zero padding"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- The CONV layer’s parameters consist of a set of learnable filters.\n",
    "- The convolution will compute 32 kernels (features) for each 5x5 patch. \n",
    "- Its weight tensor will have a shape of [5, 5, 1, 32]. \n",
    "  - The first two dimensions are the patch size\n",
    "  - The third is the number of input channels (here, it is one channel)\n",
    "  - The last one is the number of kernels (features). \n",
    "- A bias vector has a component per kernel."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#conv2d\n",
    "- https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#max_pool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "W_conv1 = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))\n",
    "b_conv1 = tf.Variable(tf.constant(0.1, shape=[32]))\n",
    "h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)\n",
    "h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "W_conv2 = tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1))\n",
    "b_conv2 = tf.Variable(tf.constant(0.1, shape=[64]))\n",
    "h_conv2 = tf.nn.relu(tf.nn.conv2d(h_pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)\n",
    "h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024], stddev=0.1))\n",
    "b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))\n",
    "h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "keep_prob = tf.placeholder(\"float\")\n",
    "h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob=keep_prob)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))\n",
    "b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))\n",
    "y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "cross_entropy = -tf.reduce_sum(y * tf.log(y_conv))\n",
    "train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "total batch: 550\n",
      "Epoch 0 Finished - Accuracy 0.9337\n",
      "Epoch 1 Finished - Accuracy 0.9567\n",
      "Epoch 2 Finished - Accuracy 0.9673\n",
      "Epoch 3 Finished - Accuracy 0.9714\n",
      "Epoch 4 Finished - Accuracy 0.9735\n",
      "Epoch 5 Finished - Accuracy 0.9767\n",
      "Epoch 6 Finished - Accuracy 0.9802\n",
      "Epoch 7 Finished - Accuracy 0.9801\n",
      "Epoch 8 Finished - Accuracy 0.9832\n",
      "Epoch 9 Finished - Accuracy 0.9821\n",
      "Epoch 10 Finished - Accuracy 0.9854\n",
      "Epoch 11 Finished - Accuracy 0.9867\n",
      "Epoch 12 Finished - Accuracy 0.9868\n",
      "Epoch 13 Finished - Accuracy 0.9851\n",
      "Epoch 14 Finished - Accuracy 0.9892\n",
      "Epoch 15 Finished - Accuracy 0.9886\n",
      "Epoch 16 Finished - Accuracy 0.9859\n",
      "Epoch 17 Finished - Accuracy 0.987\n",
      "Epoch 18 Finished - Accuracy 0.9877\n",
      "Epoch 19 Finished - Accuracy 0.9876\n",
      "Epoch 20 Finished - Accuracy 0.9884\n",
      "Epoch 21 Finished - Accuracy 0.989\n",
      "Epoch 22 Finished - Accuracy 0.988\n",
      "Epoch 23 Finished - Accuracy 0.9893\n",
      "Epoch 24 Finished - Accuracy 0.9872\n",
      "Epoch 25 Finished - Accuracy 0.9881\n",
      "Epoch 26 Finished - Accuracy 0.9873\n",
      "Epoch 27 Finished - Accuracy 0.9889\n",
      "Epoch 28 Finished - Accuracy 0.9877\n",
      "Epoch 29 Finished - Accuracy 0.9893\n",
      "Epoch 30 Finished - Accuracy 0.9896\n",
      "Epoch 31 Finished - Accuracy 0.9896\n",
      "Epoch 32 Finished - Accuracy 0.9901\n",
      "Epoch 33 Finished - Accuracy 0.9878\n",
      "Epoch 34 Finished - Accuracy 0.9895\n",
      "Epoch 35 Finished - Accuracy 0.9884\n",
      "Epoch 36 Finished - Accuracy 0.9895\n",
      "Epoch 37 Finished - Accuracy 0.9891\n",
      "Epoch 38 Finished - Accuracy 0.99\n",
      "Epoch 39 Finished - Accuracy 0.9893\n",
      "Optimization Finished!\n"
     ]
    }
   ],
   "source": [
    "# Initializing the variables\n",
    "init = tf.initialize_all_variables()\n",
    "\n",
    "# Parameters\n",
    "training_epochs = 40\n",
    "learning_rate = 0.001\n",
    "batch_size = 100\n",
    "display_step = 1\n",
    "\n",
    "# Calculate accuracy with a Test model \n",
    "prediction_and_ground_truth = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1))\n",
    "accuracy = tf.reduce_mean(tf.cast(prediction_and_ground_truth, \"float\"))\n",
    "\n",
    "# Launch the tensorflow graph\n",
    "with tf.Session() as sess:\n",
    "    sess.run(init)\n",
    "    total_batch = int(mnist.train.num_examples/batch_size)\n",
    "    \n",
    "    print \"total batch: %d\" % total_batch\n",
    "\n",
    "    # Training cycle\n",
    "    for epoch in range(training_epochs):\n",
    "\n",
    "        # Loop over all batches\n",
    "        for i in range(total_batch):\n",
    "            batch_images, batch_labels = mnist.train.next_batch(batch_size)\n",
    "            sess.run(train_step, feed_dict={x: batch_images, y: batch_labels, keep_prob: 0.5})\n",
    "        print \"Epoch %d Finished - Accuracy %g\" % (epoch, sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 0.5}))\n",
    "\n",
    "    print(\"Optimization Finished!\")"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [Root]",
   "language": "python",
   "name": "Python [Root]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}