{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CIFAR-10: Part 1\n", "In this two-part tutorial, we present an end-to-end example of training and using a convolutional neural network for a classic image recognition problem. We will use the CIFAR-10 benchmark dataset, which is a 10-class dataset consisting of 60,000 color images of size 32x32. We will use a .png version of the dataset to emulate the use of a custom dataset that you might find in the wild. The specific items that this tutorial will cover are as follows:\n", "\n", "**Part 1:**\n", "- Download dataset\n", "- Write images to lmdbs\n", "- Define and train a model with checkpoints\n", "- Save the trained model\n", "\n", "**Part 2:**\n", "- Load pre-trained model from Part 1\n", "- Run inference on testing lmdb\n", "- Continue training to improve test accuracy\n", "- Test the retrained model\n", "\n", "\n", "Let's start with some necessary imports." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:root:This caffe2 python run does not have GPU support. 
Will run in CPU only mode.\n", "WARNING:root:Debug message: No module named caffe2_pybind11_state_gpu\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Necessities imported!\n" ] } ], "source": [ "from __future__ import absolute_import\n", "from __future__ import division\n", "from __future__ import print_function\n", "from __future__ import unicode_literals\n", "\n", "%matplotlib inline\n", "from matplotlib import pyplot as plt\n", "import numpy as np\n", "import os\n", "import lmdb\n", "import shutil\n", "from imageio import imread\n", "import caffe2.python.predictor.predictor_exporter as pe\n", "from caffe2.proto import caffe2_pb2\n", "from caffe2.python.predictor import mobile_exporter\n", "from caffe2.python import (\n", " brew,\n", " core,\n", " model_helper,\n", " net_drawer,\n", " optimizer,\n", " visualize,\n", " workspace,\n", ")\n", "\n", "# If you would like to see some really detailed initializations,\n", "# you can change --caffe2_log_level=0 to --caffe2_log_level=-1\n", "core.GlobalInit(['caffe2', '--caffe2_log_level=0'])\n", "print(\"Necessities imported!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download and unpack dataset if necessary\n", "Now let's download the dataset from [Joseph Redmon](https://pjreddie.com/)'s CIFAR-10 dataset mirror and extract the data from the tarball. Note that this file is fairly large, so it may take a few minutes to download/unpack." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import requests\n", "import tarfile\n", "\n", "# Set paths and variables\n", "# data_folder is where the data is downloaded and unpacked\n", "data_folder = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10')\n", "# root_folder is where checkpoint files and .pb model definition files will be written\n", "root_folder = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_files', 'tutorial_cifar10')\n", "\n", "url = \"http://pjreddie.com/media/files/cifar.tgz\" # url to data\n", "filename = url.split(\"/\")[-1] # download file name\n", "download_path = os.path.join(data_folder, filename) # path to save the tarball to\n", "\n", "# Create data_folder if not already there\n", "if not os.path.isdir(data_folder):\n", "    os.makedirs(data_folder)\n", "\n", "# If data does not already exist, download and extract\n", "# (the tarball unpacks into the 'cifar' directory under data_folder)\n", "if not os.path.exists(os.path.join(data_folder, 'cifar')):\n", "    # Download data\n", "    print(\"Downloading... {} to {}\".format(url, download_path))\n", "    r = requests.get(url, stream=True)\n", "    with open(download_path, 'wb') as f:\n", "        f.write(r.content)\n", "    print(\"Finished downloading...\")\n", "\n", "    # Unpack images from tgz file\n", "    print('Extracting images from tarball...')\n", "    with tarfile.open(download_path, 'r') as tar:\n", "        tar.extractall(data_folder)\n", "    print(\"Completed download and extraction!\")\n", "\n", "else:\n", "    print(\"Image directory already exists. Moving on...\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a peek at a few training images to get an idea of what we're dealing with." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import glob\n", "\n", "# Grab 5 image paths from training set to display\n", "sample_imgs = glob.glob(os.path.join(data_folder, \"cifar\", \"train\") + '/*.png')[:5]\n", "\n", "# Plot images\n", "f, ax = plt.subplots(1, 5, figsize=(10,10))\n", "plt.tight_layout()\n", "for i in range(5):\n", " ax[i].set_title(sample_imgs[i].split(\"_\")[-1].split(\".\")[0])\n", " ax[i].axis('off')\n", " ax[i].imshow(imread(sample_imgs[i]).astype(np.uint8)) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create label files and write LMDBs\n", "Now that we have our data, we need to write LMDBs for training, validation, and testing. To separate what images we want in each category, we will employ a similar technique to what was often used in the original Caffe framework: creating label files.\n", "\n", "Label files are text files that map each .png image to its class.\n", "\n", " /path/to/im1.png 7\n", " /path/to/im2.png 3\n", " /path/to/im3.png 5\n", " /path/to/im4.png 0\n", " ...\n", " \n", "The process of creating these label files will likely be different for every dataset you will encounter. It really depends on how the data is labeled in the original format of the download. In the case of the CIFAR-10 .png download:\n", "\n", "- cifar/labels.txt is a list of the 10 labels in their string form (airplane, automobile, bird, ...)\n", "- cifar/train/ is a directory of 50,000 labeled training images that contain their string label name in the filename (0_frog.png, 1_truck.png, 2_truck.png)\n", "- cifar/test/ is a directory of 10,000 testing images that are labeled the same way as the images in cifar/train/\n", "\n", "Using this information, let's start by creating label files to make life easier before writing to LMDBs.\n", "\n", "The first step to doing this is to declare our path variables, and create a `classes` dictionary to map string labels to integer labels that the LMDBs will take." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "classes: {'horse': 7, 'automobile': 1, 'deer': 4, 'dog': 5, 'frog': 6, 'cat': 3, 'truck': 9, 'ship': 8, 'airplane': 0, 'bird': 2}\n" ] } ], "source": [ "# Paths to train and test directories\n", "training_dir_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'cifar', 'train')\n", "testing_dir_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'cifar', 'test')\n", "\n", "# Paths to label files\n", "training_labels_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'training_dictionary.txt')\n", "validation_labels_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'validation_dictionary.txt')\n", "testing_labels_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'testing_dictionary.txt')\n", "\n", "# Paths to LMDBs\n", "training_lmdb_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'training_lmdb')\n", "validation_lmdb_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'validation_lmdb')\n", "testing_lmdb_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'testing_lmdb')\n", "\n", "# Path to labels.txt\n", "labels_path = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks', 'tutorial_data', 'cifar10', 'cifar', 'labels.txt')\n", "\n", "# Open label file handler\n", "labels_handler = open(labels_path, \"r\")\n", "\n", "\n", "# Create classes dictionary to map string labels to integer labels\n", "classes = {}\n", "i = 0\n", "lines = labels_handler.readlines()\n", "for line in sorted(lines):\n", " line = line.rstrip()\n", " classes[line] = i\n", " i += 1\n", "labels_handler.close()\n", "\n", 
"print(\"classes:\", classes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have our `classes` dictionary to map string labels to integer labels, we can write our label files for training, validation, and testing. We will split the data as follows:\n", "\n", " - training: 44,000 images (73%)\n", " - validation: 6,000 images (10%)\n", " - testing: 10,000 images (17%)\n", " \n", "Note that the validation images are simply a subset of our training images that we withhold and use to evaluate the model periodically during training. We do this so we can see how well our network is doing on unseen images without exposing our testing images to the model during training, a practice that makes machine learning experts cringe.\n", "\n", "To help get a relatively even distribution of each class of image in the training and validation sets, we first read all of the images (full paths) from the training directory into a list called `imgs`, and shuffle this list before iterating over it to write our label files." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from random import shuffle\n", "\n", "# Open file handlers\n", "training_labels_handler = open(training_labels_path, \"w\")\n", "validation_labels_handler = open(validation_labels_path, \"w\")\n", "testing_labels_handler = open(testing_labels_path, \"w\")\n", "\n", "\n", "# Create training, validation, and testing label files\n", "i = 0\n", "validation_count = 6000\n", "imgs = glob.glob(training_dir_path + '/*.png') # read all training images into array\n", "shuffle(imgs) # shuffle array\n", "for img in imgs:\n", "    # Write first 6,000 image paths, followed by their integer label, to the validation label files\n", "    if i < validation_count:\n", "        validation_labels_handler.write(img + ' ' + str(classes[img.split('_')[-1].split('.')[0]]) + '\\n')\n", "    # Write the remaining to the training label files\n", "    else:\n", "        training_labels_handler.write(img + ' ' + str(classes[img.split('_')[-1].split('.')[0]]) + '\\n')\n", "    i += 1\n", "print(\"Finished writing training and validation label files\")\n", "\n", "# Write our testing label files using the testing images\n", "for img in glob.glob(testing_dir_path + '/*.png'):\n", "    testing_labels_handler.write(img + ' ' + str(classes[img.split('_')[-1].split('.')[0]]) + '\\n')\n", "print(\"Finished writing testing label files\")\n", "\n", "# Close file handlers\n", "training_labels_handler.close()\n", "validation_labels_handler.close()\n", "testing_labels_handler.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are now ready to use these label files to write our LMDBs. The following code is adapted from Caffe2's [lmdb_create_example.py](https://github.com/caffe2/caffe2/blob/master/caffe2/python/examples/lmdb_create_example.py) script. 
Note that before feeding the image data to the LMDB, we first reorder the color channels from RGB to BGR, and reorder the dimensions from HWC to CHW.\n", "\n", "If you have gone through the *Image Pre-Processing Pipeline* tutorial, you know that Caffe2 expects inputs in NCHW format, where N stands for the number of images in a batch. Don't worry, we'll add this N dimension when we define the data layer to our model (see `AddInput` below)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def write_lmdb(labels_file_path, lmdb_path):\n", "    labels_handler = open(labels_file_path, \"r\")\n", "    # Write to lmdb\n", "    print(\">>> Write database...\")\n", "    LMDB_MAP_SIZE = 1 << 40\n", "    print(\"LMDB_MAP_SIZE\", LMDB_MAP_SIZE)\n", "    env = lmdb.open(lmdb_path, map_size=LMDB_MAP_SIZE)\n", "\n", "    with env.begin(write=True) as txn:\n", "        count = 0\n", "        for line in labels_handler.readlines():\n", "            line = line.rstrip()\n", "            im_path = line.split()[0]\n", "            im_label = int(line.split()[1])\n", "\n", "            # read in image (as RGB)\n", "            img_data = imread(im_path).astype(np.float32)\n", "\n", "            # convert to BGR\n", "            img_data = img_data[:, :, (2, 1, 0)]\n", "\n", "            # HWC -> CHW (N gets added in AddInput function)\n", "            img_data = np.transpose(img_data, (2,0,1))\n", "\n", "            # Create TensorProtos\n", "            tensor_protos = caffe2_pb2.TensorProtos()\n", "            img_tensor = tensor_protos.protos.add()\n", "            img_tensor.dims.extend(img_data.shape)\n", "            img_tensor.data_type = 1\n", "            flatten_img = img_data.reshape(np.prod(img_data.shape))\n", "            img_tensor.float_data.extend(flatten_img)\n", "            label_tensor = tensor_protos.protos.add()\n", "            label_tensor.data_type = 2\n", "            label_tensor.int32_data.append(im_label)\n", "            txn.put(\n", "                '{}'.format(count).encode('ascii'),\n", "                tensor_protos.SerializeToString()\n", "            )\n", "            if ((count % 1000 == 0)):\n", "                print(\"Inserted {} rows\".format(count))\n", "            count = count + 1\n", "\n", "    print(\"Inserted {} rows\".format(count))\n", "    print(\"\\nLMDB saved at \" + lmdb_path + \"\\n\\n\")\n", "    labels_handler.close()\n", "\n", "\n", "# Call function to write our LMDBs\n", "if not os.path.exists(training_lmdb_path):\n", "    print(\"Writing training LMDB\")\n", "    write_lmdb(training_labels_path, training_lmdb_path)\n", "else:\n", "    print(training_lmdb_path, \"already exists!\")\n", "if not os.path.exists(validation_lmdb_path):\n", "    print(\"Writing validation LMDB\")\n", "    write_lmdb(validation_labels_path, validation_lmdb_path)\n", "else:\n", "    print(validation_lmdb_path, \"already exists!\")\n", "if not os.path.exists(testing_lmdb_path):\n", "    print(\"Writing testing LMDB\")\n", "    write_lmdb(testing_labels_path, testing_lmdb_path)\n", "else:\n", "    print(testing_lmdb_path, \"already exists!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define our CNN model\n", "Now that we have our data formatted in LMDBs, it is time to define our model! \n", "\n", "First let's set some path variables, define dataset-specific parameters, and declare model training parameters. This is where we will set the number of training iterations that we want, as well as the batch sizes and validation interval to use. Feel free to come back and tinker with these parameters to see how they affect training and efficiency." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Paths to the init & predict net output locations\n", "init_net_out = 'cifar10_init_net.pb'\n", "predict_net_out = 'cifar10_predict_net.pb'\n", "\n", "# Dataset specific params\n", "image_width = 32 # input image width\n", "image_height = 32 # input image height\n", "image_channels = 3 # input image channels (3 for RGB)\n", "num_classes = 10 # number of image classes\n", "\n", "# Training params\n", "training_iters = 2000 # total training iterations\n", "training_net_batch_size = 100 # batch size for training\n", "validation_images = 6000 # total number of validation images\n", "validation_interval = 100 # run validation every validation_interval training iterations\n", "checkpoint_iters = 1000 # output checkpoint db every checkpoint_iters iterations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create the root_folder directory if it does not already exist. Also, call `workspace.ResetWorkspace(root_folder)` to set the `root_folder` as the working directory of our workspace." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create root_folder if not already there\n", "if not os.path.isdir(root_folder):\n", " os.makedirs(root_folder)\n", "\n", "# Resetting workspace with root_folder argument sets root_folder as working directory\n", "workspace.ResetWorkspace(root_folder)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next task is to define some helper functions to modularize our code, and ultimately define our model similarly to the MNIST tutorial. We will use the `ModelHelper` class to define and represent our model, as well as to contain the model's parameter information. The `brew` module will be used to add layers to our CNN model. 
For more information about the `ModelHelper`+`brew` model creation paradigm, see the [docs](https://caffe2.ai/docs/brew.html).\n", "\n", ">It is important to note that by calling these functions, we are *NOT* running any computation with our model. Instead, we are constructing the graph of operators that will ultimately dictate the calculations made as our data blobs propagate forward and backward through the network. \n", "\n", "The first helper function is `AddInput`, which adds the input (data) layer to our model. Note that the image data stored in our LMDBs requires some minor preprocessing before it is fed to our computational layers. First, we read in the raw image data and labels from the LMDB, which is of type `uint8` ([0, 255] pixel values). We then cast the data to type `float` and rescale the data to [0, 1] to promote faster convergence. Finally, we will call `model.StopGradient(data, data)` to prevent the gradient from being calculated any further in the backward pass.\n", "\n", "One final point about the blob names in quotes:\n", "- In the case of `\"data_uint8\"` and `\"label\"`, these are the names of the blobs associated with the DB input\n", "- If the name is an *input* blob, this represents the blob name that the operator expects when it is run\n", "- If the name is an *output* blob, e.g. 
`\"data\"`, it represents the name of the output blob that the operator creates" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def AddInput(model, batch_size, db, db_type):\n", " # load the data\n", " data_uint8, label = brew.db_input(\n", " model,\n", " blobs_out=[\"data_uint8\", \"label\"],\n", " batch_size=batch_size,\n", " db=db,\n", " db_type=db_type,\n", " )\n", " # cast the data to float\n", " data = model.Cast(data_uint8, \"data\", to=core.DataType.FLOAT)\n", " # scale data from [0,255] down to [0,1]\n", " data = model.Scale(data, data, scale=float(1./256))\n", " # don't need the gradient for the backward pass\n", " data = model.StopGradient(data, data)\n", " return data, label" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step is to implement our CNN model definition. The network architecture that we will use is based on the \"quick\" model used in the original Caffe's [cifar10 example](https://github.com/BVLC/caffe/tree/master/examples/cifar10). This model has 3 convolutional/pooling layers, and uses Rectified Linear Unit activations (ReLU). Don't be afraid to come back and alter the model by changing hyperparameters and/or adding and removing layers to see how it affects training convergence.\n", "\n", "We will use the `update_dims` function as a helper to keep track of the dimensionality shrinkage that the convolutional and pooling layers cause. The dimensionality changes are as follows:\n", "\n", "$$height_{out}=\\frac{height_{in}-kernel+2*pad}{stride}+1$$\n", "\n", "---\n", "\n", "$$width_{out}=\\frac{width_{in}-kernel+2*pad}{stride}+1$$\n", "\n", "While this function is not necessary, we found that it is an easy strategy to avoid having to hand calculate the dimensionality changes of the data to provide to the penultimate fully connected layer. 
It also allows us to quickly change hyperparameters such as kernel size and not have to worry about the corresponding dimensionality alterations.\n", "\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Helper function for maintaining the correct height and width dimensions after\n", "# convolutional and pooling layers downsample the input data\n", "def update_dims(height, width, kernel, stride, pad):\n", " new_height = ((height - kernel + 2*pad)//stride) + 1\n", " new_width = ((width - kernel + 2*pad)//stride) + 1\n", " return new_height, new_width\n", "\n", "\n", "def Add_Original_CIFAR10_Model(model, data, num_classes, image_height, image_width, image_channels):\n", " # Convolutional layer 1\n", " conv1 = brew.conv(model, data, 'conv1', dim_in=image_channels, dim_out=32, kernel=5, stride=1, pad=2)\n", " h,w = update_dims(height=image_height, width=image_width, kernel=5, stride=1, pad=2)\n", " # Pooling layer 1\n", " pool1 = brew.max_pool(model, conv1, 'pool1', kernel=3, stride=2)\n", " h,w = update_dims(height=h, width=w, kernel=3, stride=2, pad=0)\n", " # ReLU layer 1\n", " relu1 = brew.relu(model, pool1, 'relu1')\n", " \n", " # Convolutional layer 2\n", " conv2 = brew.conv(model, relu1, 'conv2', dim_in=32, dim_out=32, kernel=5, stride=1, pad=2)\n", " h,w = update_dims(height=h, width=w, kernel=5, stride=1, pad=2)\n", " # ReLU layer 2\n", " relu2 = brew.relu(model, conv2, 'relu2')\n", " # Pooling layer 2\n", " pool2 = brew.average_pool(model, relu2, 'pool2', kernel=3, stride=2)\n", " h,w = update_dims(height=h, width=w, kernel=3, stride=2, pad=0)\n", " \n", " # Convolutional layer 3\n", " conv3 = brew.conv(model, pool2, 'conv3', dim_in=32, dim_out=64, kernel=5, stride=1, pad=2)\n", " h,w = update_dims(height=h, width=w, kernel=5, stride=1, pad=2)\n", " # ReLU layer 3\n", " relu3 = brew.relu(model, conv3, 'relu3')\n", " # Pooling layer 3\n", " pool3 = brew.average_pool(model, relu3, 'pool3', kernel=3, 
stride=2)\n", " h,w = update_dims(height=h, width=w, kernel=3, stride=2, pad=0)\n", " \n", " # Fully connected layers\n", " fc1 = brew.fc(model, pool3, 'fc1', dim_in=64*h*w, dim_out=64)\n", " fc2 = brew.fc(model, fc1, 'fc2', dim_in=64, dim_out=num_classes)\n", " \n", " # Softmax layer\n", " softmax = brew.softmax(model, fc2, 'softmax')\n", " return softmax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our next helper function is `AddTrainingOperators`. This function will be called by our train model to add a loss function and an optimization technique for learning. We will use an averaged cross entropy loss function between the model's softmax scores and the ground truth labels. We then add gradient operators to our model with respect to the loss that we previously calculated. Finally, we use the `build_sgd` function from Caffe2's `optimizer` class as our loss minimization function.\n", "\n", "Feel free to tinker with the hyper-parameters of the `build_sgd` function and observe the change in convergence efficiency during training." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def AddTrainingOperators(model, softmax, label):\n", " xent = model.LabelCrossEntropy([softmax, label], 'xent')\n", " # Compute the expected loss\n", " loss = model.AveragedLoss(xent, \"loss\")\n", " # Use the average loss we just computed to add gradient operators to the model\n", " model.AddGradientOperators([loss])\n", " # Use stochastic gradient descent as optimization function\n", " optimizer.build_sgd(\n", " model,\n", " base_learning_rate=0.01,\n", " policy=\"fixed\",\n", " momentum=0.9,\n", " weight_decay=0.004\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`AddAccuracy` adds an accuracy layer to a model using the `brew` module. This calculates the percentage of samples in a given batch whose top-1 softmax class matches the ground truth label class (i.e. 
the percentage of samples in the batch that the model got right)." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def AddAccuracy(model, softmax, label):\n", " accuracy = brew.accuracy(model, [softmax, label], \"accuracy\")\n", " return accuracy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next and final helper function is `AddCheckpoints`, which outputs a checkpoint db at a regular interval of iterations. A checkpoint is essentially a saved state of a model during the training process. Checkpoints are useful for quickly loading a trained or partially trained model in the future, and they are an invaluable insurance policy during very long training processes. Caffe2 checkpoints are akin to Caffe's periodically outputted .caffemodel files. We use `brew`'s `iter` operator to track iterations, and we will save the checkpoints as LMDBs.\n", "\n", "It is important to note that when using checkpoints, you must be careful not to overwrite checkpoints of the same name from a previous training process. If you attempt to overwrite a checkpoint db, the training process will fail with an error. To deal with this, we will save the checkpoints in a uniquely named directory under our `root_folder`. This directory's name is based on the current system timestamp, to avoid duplication." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import datetime\n", "\n", "# Create uniquely named directory under root_folder to output checkpoints to\n", "unique_timestamp = str(datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))\n", "checkpoint_dir = os.path.join(root_folder, unique_timestamp)\n", "os.makedirs(checkpoint_dir)\n", "print(\"Checkpoint output location: \", checkpoint_dir)\n", "\n", "# Add checkpoints to a given model\n", "# (the db path is relative to the workspace root, which we set to root_folder)\n", "def AddCheckpoints(model, checkpoint_iters, db_type):\n", " ITER = brew.iter(model, \"iter\")\n", " model.Checkpoint([ITER] + model.params, [], db=os.path.join(unique_timestamp, \"cifar10_checkpoint_%05d.lmdb\"), db_type=db_type, every=checkpoint_iters)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialize models with ModelHelper\n", "\n", "Now that we have created the necessary helper functions, it is time to actually initialize our training and validation models and use our functions to define the models' operator graphs. *Remember that we are not executing the models yet*.\n", "\n", "First, we define the train model:\n", "\n", " (1) Initialize model with ModelHelper class\n", " (2) Add data layer with AddInput function\n", " (3) Add the Cifar10 model, which returns a softmax blob\n", " (4) Add training operators with AddTrainingOperators function; use softmax blob from (3)\n", " (5) Add periodic checkpoints with AddCheckpoints function\n", "\n", "Next, we define the validation model, which is structurally the same, but is separated because its input data comes from a different LMDB, and uses a different batch size. 
We will build as follows:\n", "\n", " (1) Initialize model with ModelHelper class with init_params=False\n", " (2) Add data layer with AddInput function\n", " (3) Add the Cifar10 model, which returns a softmax blob\n", " (4) Add accuracy layer with AddAccuracy function; use softmax blob from (3)\n", "\n", "Finally, we define the deploy model:\n", "\n", " (1) Initialize model with ModelHelper class with init_params=False\n", " (2) Add the Cifar10 model, which will expect input blob called \"data\"\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Training, Validation, and Deploy models all defined!\n" ] } ], "source": [ "arg_scope = {\"order\": \"NCHW\"}\n", "\n", "# TRAINING MODEL\n", "# Initialize with ModelHelper class\n", "train_model = model_helper.ModelHelper(\n", " name=\"train_net\", arg_scope=arg_scope)\n", "# Add data layer from training_lmdb\n", "data, label = AddInput(\n", " train_model, batch_size=training_net_batch_size,\n", " db=training_lmdb_path,\n", " db_type='lmdb')\n", "# Add model definition, save return value to 'softmax' variable\n", "softmax = Add_Original_CIFAR10_Model(train_model, data, num_classes, image_height, image_width, image_channels)\n", "# Add training operators using the softmax output from the model\n", "AddTrainingOperators(train_model, softmax, label)\n", "# Add periodic checkpoint outputs to the model\n", "AddCheckpoints(train_model, checkpoint_iters, db_type=\"lmdb\")\n", "\n", "\n", "# VALIDATION MODEL\n", "# Initialize with ModelHelper class without re-initializing params\n", "val_model = model_helper.ModelHelper(\n", " name=\"val_net\", arg_scope=arg_scope, init_params=False)\n", "# Add data layer from validation_lmdb\n", "data, label = AddInput(\n", " val_model, batch_size=validation_images,\n", " db=validation_lmdb_path,\n", " db_type='lmdb')\n", "# Add model definition, save return value to 'softmax' variable\n", 
"softmax = Add_Original_CIFAR10_Model(val_model, data, num_classes, image_height, image_width, image_channels)\n", "# Add accuracy operator\n", "AddAccuracy(val_model, softmax, label)\n", "\n", "\n", "# DEPLOY MODEL\n", "# Initialize with ModelHelper class without re-initializing params\n", "deploy_model = model_helper.ModelHelper(\n", " name=\"deploy_net\", arg_scope=arg_scope, init_params=False)\n", "# Add model definition, expect input blob called \"data\"\n", "Add_Original_CIFAR10_Model(deploy_model, \"data\", num_classes, image_height, image_width, image_channels)\n", "\n", "print(\"Training, Validation, and Deploy models all defined!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Run training\n", "\n", "Finally, now that we have our models and their operator graphs defined, it is time to actually run the training process. Under the hood, we have defined our models as operator graphs that are serialized in protobuf format. The final step is to send these protobufs to Caffe2's C++ backend so that model objects can be built and executed. \n", "\n", "Recall that a `ModelHelper` model object has two nets:\n", "\n", "- `param_init_net`: Contains parameters and initialization data\n", "\n", "- `net`: Contains the main network (operator graph) that we just defined\n", "\n", "Both of these nets must be run, and we must start with the `param_init_net`. Because this net only needs to be run once, we run it with the `workspace.RunNetOnce` function, which instantiates, runs, and immediately destructs the network. If we want to run a network multiple times, as we do in the case of our training and validation nets, we first create the net with `workspace.CreateNet`, and we can then run the net using `workspace.RunNet`. \n", "\n", "Note that when we call `workspace.RunNet` on the `train_model`, this runs the forward and backward pass with a batch from our training LMDB. 
Running the `val_model` runs a forward pass with a batch from our validation LMDB (which we set to be all of the images) and computes the accuracy that we will use to track model performance on our quasi-test data as we train." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Training iter: 0\n", "Loss: 2.311319589614868\n", "Validation accuracy: 0.10383333265781403\n", "\n", "Training iter: 100\n", "Loss: 1.9484632015228271\n", "Validation accuracy: 0.2861666679382324\n", "\n", "Training iter: 200\n", "Loss: 1.7397210597991943\n", "Validation accuracy: 0.3641666769981384\n", "\n", "Training iter: 300\n", "Loss: 1.7527788877487183\n", "Validation accuracy: 0.4051666557788849\n", "\n", "Training iter: 400\n", "Loss: 1.3784841299057007\n", "Validation accuracy: 0.45116665959358215\n", "\n", "Training iter: 500\n", "Loss: 1.5721114873886108\n", "Validation accuracy: 0.4581666588783264\n", "\n", "Training iter: 600\n", "Loss: 1.5422420501708984\n", "Validation accuracy: 0.4958333373069763\n", "\n", "Training iter: 700\n", "Loss: 1.3092886209487915\n", "Validation accuracy: 0.5076666474342346\n", "\n", "Training iter: 800\n", "Loss: 1.3119572401046753\n", "Validation accuracy: 0.5444999933242798\n", "\n", "Training iter: 900\n", "Loss: 1.3184524774551392\n", "Validation accuracy: 0.5375000238418579\n", "\n", "Training iter: 1000\n", "Loss: 1.2561535835266113\n", "Validation accuracy: 0.5534999966621399\n", "\n", "Training iter: 1100\n", "Loss: 1.1288306713104248\n", "Validation accuracy: 0.5805000066757202\n", "\n", "Training iter: 1200\n", "Loss: 1.221421480178833\n", "Validation accuracy: 0.5686666369438171\n", "\n", "Training iter: 1300\n", "Loss: 1.1555482149124146\n", "Validation accuracy: 0.5920000076293945\n", "\n", "Training iter: 1400\n", "Loss: 1.281171202659607\n", "Validation accuracy: 0.5929999947547913\n", "\n", "Training iter: 1500\n", "Loss: 1.0986618995666504\n", 
"Validation accuracy: 0.5846666693687439\n", "\n", "Training iter: 1600\n", "Loss: 1.1475869417190552\n", "Validation accuracy: 0.6179999709129333\n", "\n", "Training iter: 1700\n", "Loss: 1.0574977397918701\n", "Validation accuracy: 0.6158333420753479\n", "\n", "Training iter: 1800\n", "Loss: 1.2078982591629028\n", "Validation accuracy: 0.6358333230018616\n", "\n", "Training iter: 1900\n", "Loss: 0.8897716403007507\n", "Validation accuracy: 0.6358333230018616\n", "\n" ] } ], "source": [ "import math\n", "\n", "# Initialize and create the training network\n", "workspace.RunNetOnce(train_model.param_init_net)\n", "workspace.CreateNet(train_model.net, overwrite=True)\n", "# Initialize and create validation network\n", "workspace.RunNetOnce(val_model.param_init_net)\n", "workspace.CreateNet(val_model.net, overwrite=True)\n", "# Placeholder to track loss and validation accuracy\n", "loss = np.zeros(int(math.ceil(training_iters/validation_interval)))\n", "val_accuracy = np.zeros(int(math.ceil(training_iters/validation_interval)))\n", "val_count = 0\n", "iteration_list = np.zeros(int(math.ceil(training_iters/validation_interval)))\n", "\n", "# Now, we run the network (forward & backward pass)\n", "for i in range(training_iters):\n", "    workspace.RunNet(train_model.net)\n", "\n", "    # Validate every validation_interval training iterations\n", "    if (i % validation_interval == 0):\n", "        print(\"Training iter: \", i)\n", "        loss[val_count] = workspace.FetchBlob('loss')\n", "        workspace.RunNet(val_model.net)\n", "        val_accuracy[val_count] = workspace.FetchBlob('accuracy')\n", "        print(\"Loss: \", str(loss[val_count]))\n", "        print(\"Validation accuracy: \", str(val_accuracy[val_count]) + \"\\n\")\n", "        iteration_list[val_count] = i\n", "        val_count += 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot the training loss and validation accuracy over the training iterations." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.title(\"Training Loss vs. Validation Accuracy\")\n", "plt.plot(iteration_list, loss, 'b')\n", "plt.plot(iteration_list, val_accuracy, 'r')\n", "plt.xlabel(\"Training iteration\")\n", "plt.legend(('Loss', 'Validation Accuracy'), loc='upper right')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Save trained model\n", "\n", "Now that we have the parameters of the trained model in the workspace, we will export the deploy model using the `mobile_exporter` module. In Caffe2, pretrained models are commonly saved as two separate protobuf (.pb) files (init_net and predict_net). Models can also be saved in db formats, but we will save our model as protobuf files, as this is how they commonly appear in the Model Zoo.\n", "\n", "For consistency, we'll save these in the same unique directory that the checkpoints are in." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Run init net and create main net\n", "workspace.RunNetOnce(deploy_model.param_init_net)\n", "workspace.CreateNet(deploy_model.net, overwrite=True)\n", "\n", "# Use mobile_exporter's Export function to acquire init_net and predict_net\n", "init_net, predict_net = mobile_exporter.Export(workspace, deploy_model.net, deploy_model.params)\n", "\n", "# Locations of output files\n", "full_init_net_out = os.path.join(checkpoint_dir, init_net_out)\n", "full_predict_net_out = os.path.join(checkpoint_dir, predict_net_out)\n", "\n", "# Simply write the two nets to file\n", "with open(full_init_net_out, 'wb') as f:\n", "    f.write(init_net.SerializeToString())\n", "with open(full_predict_net_out, 'wb') as f:\n", "    f.write(predict_net.SerializeToString())\n", "print(\"Model saved as \" + full_init_net_out + \" and \" + full_predict_net_out)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Congratulations!** You have made it through Part 1 of the tutorial. 
In Part 2, we will load the model that we just trained and do all sorts of fun things like running inference on our testing LMDB, running inference on a given .png, and continuing training for increased performance.\n", "\n", "Thanks, and see you at Part 2!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.14" } }, "nbformat": 4, "nbformat_minor": 2 }