{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"\n",
"--------------------\n",
"# Introduction to Convolutional Neural Networks"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"In this section, we will use the famous [MNIST Dataset](http://yann.lecun.com/exdb/mnist/) to build two neural networks capable of performing handwritten digit classification. The first network is a simple Multi-Layer Perceptron (MLP) and the second one is a Convolutional Neural Network (CNN from now on). In other words, our algorithm will predict, with some associated error, which digit the presented input represents.\n",
"\n",
"This lesson is not intended to be a reference for _machine learning, convolutions or TensorFlow_. The intention is to give the user a working notion of these fields and an awareness of Data Scientist Workbench capabilities. We recommend that students seek further references to fully understand the mathematical and theoretical concepts involved.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"## Table of contents\n",
"\n",
" * What is Deep Learning\n",
" * Simple test: Is TensorFlow working?\n",
" * 1st part: classify MNIST using a simple model\n",
" * Evaluating the final result\n",
" * How to improve our model?\n",
" * 2nd part: Deep Learning applied on MNIST\n",
" * Summary of the Deep Convolutional Neural Network\n",
" * Define functions and train the model\n",
" * Evaluate the model"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"\n",
"# What is Deep Learning?"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"**Brief Theory:** Deep learning (also known as deep structured learning, hierarchical learning or deep machine learning) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers, with complex structures or otherwise, composed of multiple non-linear transformations."
]
},
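{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Before we look at the data, we need TensorFlow itself and the MNIST dataset. The cell below is a minimal sketch of the standard TF 1.x setup, assuming the tutorial helper module tensorflow.examples.tutorials.mnist is available; it downloads MNIST into a local \"MNIST_data\" folder and loads the labels in one-hot form (explained next)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"\n",
"# Load the MNIST data, with the labels encoded as one-hot vectors\n",
"from tensorflow.examples.tutorials.mnist import input_data\n",
"mnist = input_data.read_data_sets(\"MNIST_data/\", one_hot=True)"
]
},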
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"source": [
"\n",
"```\n",
"Number representation:    0\n",
"Binary encoding:       [2^5] [2^4] [2^3] [2^2] [2^1] [2^0]\n",
"Array/vector:            0     0     0     0     0     0\n",
"\n",
"Number representation:    5\n",
"Binary encoding:       [2^5] [2^4] [2^3] [2^2] [2^1] [2^0]\n",
"Array/vector:            0     0     0     1     0     1\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Using a different notation, the same digits in one-hot vector representation can be shown as:"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"```\n",
"Number representation:    0\n",
"One-hot encoding:        [5]   [4]   [3]   [2]   [1]   [0]\n",
"Array/vector:             0     0     0     0     0     1\n",
"\n",
"Number representation:    5\n",
"One-hot encoding:        [5]   [4]   [3]   [2]   [1]   [0]\n",
"Array/vector:             1     0     0     0     0     0\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Understanding the imported data"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"The imported data can be divided as follows:\n",
"\n",
"- Training (mnist.train) >> Use the given dataset of inputs and related outputs to train the NN. In our case, if you give an image that you know represents a \"nine\", this set tells the neural network that we expect a \"nine\" as the output. \n",
"    - 55,000 data points\n",
"    - mnist.train.images for inputs\n",
"    - mnist.train.labels for outputs\n",
"\n",
"- Validation (mnist.validation) >> The same as training, but here the data is used to estimate model properties (classification error, for example) and, from this, tune parameters like the optimal number of hidden units or determine a stopping point for the back-propagation algorithm. \n",
"    - 5,000 data points\n",
"    - mnist.validation.images for inputs\n",
"    - mnist.validation.labels for outputs\n",
"\n",
"- Test (mnist.test) >> The model does not have access to this information prior to the test phase. It is used to evaluate the performance and accuracy of the model against \"real life situations\". No further optimization happens beyond this point.\n",
"    - 10,000 data points\n",
"    - mnist.test.images for inputs\n",
"    - mnist.test.labels for outputs\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Creating an interactive session"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"You have two basic options when running your code with TensorFlow:\n",
"\n",
"- [Build graphs and run sessions] Do all the set-up and THEN execute a session to evaluate tensors and run operations (ops). \n",
"- [Interactive session] Write your code and run it on the fly. \n",
"\n",
"For this first part, we will use the interactive session, which is more suitable for environments like Jupyter notebooks."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"sess = tf.InteractiveSession()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Creating placeholders"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"It's a best practice to create placeholders before variable assignments when using TensorFlow. Here we'll create placeholders for inputs (\"Xs\") and outputs (\"Ys\"). \n",
"\n",
"__Placeholder 'X':__ represents the \"space\" allocated for the input images. \n",
" * Each input has 784 pixels distributed in a 28 width x 28 height matrix. \n",
" * The 'shape' argument defines the tensor size by its dimensions. \n",
" * 1st dimension = None. Indicates that the batch size can be of any size. \n",
" * 2nd dimension = 784. Indicates the number of pixels in a single flattened MNIST image. \n",
"\n",
"__Placeholder 'Y':__ represents the final output, or the labels. \n",
" * 10 possible classes (0,1,2,3,4,5,6,7,8,9). \n",
" * The 'shape' argument defines the tensor size by its dimensions. \n",
" * 1st dimension = None. Indicates that the batch size can be of any size. \n",
" * 2nd dimension = 10. Indicates the number of targets/outcomes. \n",
"\n",
"__dtype for both placeholders:__ if you are not sure, use tf.float32. The limitation here is that the softmax function presented later only accepts float32 or float64 dtypes. For more dtypes, check TensorFlow's documentation.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"x = tf.placeholder(tf.float32, shape=[None, 784])\n",
"y_ = tf.placeholder(tf.float32, shape=[None, 10])"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Assigning bias and weights to null tensors"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Now we are going to create the weights and biases; for this first model they will be initialized as arrays filled with zeros. The values we choose here can be critical, but we'll cover a better initialization strategy in the second part."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"# Weight tensor\n",
"W = tf.Variable(tf.zeros([784, 10], tf.float32))\n",
"# Bias tensor\n",
"b = tf.Variable(tf.zeros([10], tf.float32))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Execute the assignment operation"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Above, we defined the weight and bias variables with zero-filled initial values, but TensorFlow does not actually assign those values until the variables are initialized by running an initialization op. \n",
"Please notice that we use the \"sess.run\" notation because we previously started an interactive session."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"# run the op initialize_all_variables using an interactive session\n",
"sess.run(tf.initialize_all_variables())"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Adding Weights and Biases to input"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"The only difference between our next operation and the picture below is that we are using the mathematical convention for what is being executed in the illustration. The tf.matmul operation performs a matrix multiplication between x (inputs) and W (weights), and after that the code adds the biases."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"\n",