{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.\n", "- Author: Sebastian Raschka\n", "- GitHub Repository: https://github.com/rasbt/deeplearning-models" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka \n", "\n", "CPython 3.7.3\n", "IPython 7.9.0\n", "\n", "torch 1.7.0\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -a 'Sebastian Raschka' -v -p torch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Runs on CPU or GPU (if available)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model Zoo -- Reproducible Results with Deterministic Behavior and Runtime Benchmark" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we are benchmarking the performance impact of setting PyTorch to deterministic behavior. In general, there are two aspects for reproducible resuls in PyTorch, \n", "1. Setting a random seed\n", "2. Setting cuDNN and PyTorch algorithmic behavior to deterministic\n", "\n", "For more details, please see https://pytorch.org/docs/stable/notes/randomness.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Setting a random seed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I recommend using a function like the following one prior to using dataset loaders and initializing a model if you want to ensure the data is shuffled in the same manner if you rerun this notebook and the model gets the same initial random weights:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def set_all_seeds(seed):\n", " os.environ[\"PL_GLOBAL_SEED\"] = str(seed)\n", " random.seed(seed)\n", " np.random.seed(seed)\n", " torch.manual_seed(seed)\n", " torch.cuda.manual_seed_all(seed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Setting cuDNN and PyTorch algorithmic behavior to deterministic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to the `set_all_seeds` function above, I recommend setting the behavior of PyTorch and cuDNN to deterministic (this is particulary relevant when using GPUs). We can also define a function for that:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def set_deterministic():\n", " if torch.cuda.is_available():\n", " torch.backends.cudnn.benchmark = False\n", " torch.backends.cudnn.deterministic = True\n", " torch.set_deterministic(True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1) Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After setting up the general configuration in this section, the following two sections will train a ResNet-101 model without and with deterministic behavior to get a sense how using deterministic options affect the runtime speed." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import torch\n", "import random" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Device: cuda:1\n" ] } ], "source": [ "##########################\n", "### SETTINGS\n", "##########################\n", "\n", "# Device\n", "CUDA_DEVICE_NUM = 1 # change as appropriate\n", "DEVICE = torch.device('cuda:%d' % CUDA_DEVICE_NUM if torch.cuda.is_available() else 'cpu')\n", "print('Device:', DEVICE)\n", "\n", "# Data settings\n", "num_classes = 10\n", "\n", "# Hyperparameters\n", "random_seed = 1\n", "learning_rate = 0.01\n", "batch_size = 128\n", "num_epochs = 50" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import sys\n", "\n", "sys.path.insert(0, \"..\") # to include ../helper_evaluate.py etc.\n", "\n", "from helper_evaluate import compute_accuracy\n", "from helper_data import get_dataloaders_cifar10\n", "from helper_train import train_classifier_simple_v1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2) Run without Deterministic Behavior" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we enable deterministic behavior, we will run a ResNet-101 with otherwise the exact same settings for comparison. Note that setting random seeds doesn't affect the timing results." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "### Set random seed ###\n", "set_all_seeds(random_seed)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Files already downloaded and verified\n" ] } ], "source": [ "##########################\n", "### Dataset\n", "##########################\n", "\n", "train_loader, valid_loader, test_loader = get_dataloaders_cifar10(\n", " batch_size, \n", " num_workers=0, \n", " validation_fraction=0.1)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "##########################\n", "### Model\n", "##########################\n", "\n", "\n", "from deterministic_benchmark_utils import resnet101\n", "\n", "\n", "\n", "\n", "model = resnet101(num_classes, grayscale=False)\n", "\n", "model = model.to(DEVICE)\n", "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 001/050 | Batch 0000/0352 | Loss: 2.6271\n", "Epoch: 001/050 | Batch 0200/0352 | Loss: 2.0193\n", "***Epoch: 001/050 | Train. Acc.: 26.831% | Loss: 2.772\n", "***Epoch: 001/050 | Valid. Acc.: 26.300% | Loss: 2.642\n", "Time elapsed: 1.17 min\n", "Epoch: 002/050 | Batch 0000/0352 | Loss: 1.9047\n", "Epoch: 002/050 | Batch 0200/0352 | Loss: 1.6645\n", "***Epoch: 002/050 | Train. Acc.: 41.080% | Loss: 1.581\n", "***Epoch: 002/050 | Valid. Acc.: 40.780% | Loss: 1.588\n", "Time elapsed: 2.39 min\n", "Epoch: 003/050 | Batch 0000/0352 | Loss: 1.4886\n", "Epoch: 003/050 | Batch 0200/0352 | Loss: 1.4483\n", "***Epoch: 003/050 | Train. Acc.: 47.693% | Loss: 1.428\n", "***Epoch: 003/050 | Valid. Acc.: 47.480% | Loss: 1.439\n", "Time elapsed: 3.63 min\n", "Epoch: 004/050 | Batch 0000/0352 | Loss: 1.3687\n", "Epoch: 004/050 | Batch 0200/0352 | Loss: 1.3750\n", "***Epoch: 004/050 | Train. 
Acc.: 57.487% | Loss: 1.189\n", "***Epoch: 004/050 | Valid. Acc.: 56.020% | Loss: 1.231\n", "Time elapsed: 4.85 min\n", "Epoch: 005/050 | Batch 0000/0352 | Loss: 1.2162\n", "Epoch: 005/050 | Batch 0200/0352 | Loss: 1.3256\n", "***Epoch: 005/050 | Train. Acc.: 58.827% | Loss: 1.151\n", "***Epoch: 005/050 | Valid. Acc.: 57.240% | Loss: 1.192\n", "Time elapsed: 6.08 min\n", "Epoch: 006/050 | Batch 0000/0352 | Loss: 1.2045\n", "Epoch: 006/050 | Batch 0200/0352 | Loss: 1.2144\n", "***Epoch: 006/050 | Train. Acc.: 62.062% | Loss: 1.085\n", "***Epoch: 006/050 | Valid. Acc.: 59.180% | Loss: 1.176\n", "Time elapsed: 7.31 min\n", "Epoch: 007/050 | Batch 0000/0352 | Loss: 1.0280\n", "Epoch: 007/050 | Batch 0200/0352 | Loss: 1.0381\n", "***Epoch: 007/050 | Train. Acc.: 67.880% | Loss: 0.917\n", "***Epoch: 007/050 | Valid. Acc.: 64.660% | Loss: 1.040\n", "Time elapsed: 8.53 min\n", "Epoch: 008/050 | Batch 0000/0352 | Loss: 0.9092\n", "Epoch: 008/050 | Batch 0200/0352 | Loss: 0.9647\n", "***Epoch: 008/050 | Train. Acc.: 69.656% | Loss: 0.873\n", "***Epoch: 008/050 | Valid. Acc.: 64.820% | Loss: 1.034\n", "Time elapsed: 9.75 min\n", "Epoch: 009/050 | Batch 0000/0352 | Loss: 0.7789\n", "Epoch: 009/050 | Batch 0200/0352 | Loss: 0.8018\n", "***Epoch: 009/050 | Train. Acc.: 67.900% | Loss: 0.935\n", "***Epoch: 009/050 | Valid. Acc.: 62.060% | Loss: 1.131\n", "Time elapsed: 10.98 min\n", "Epoch: 010/050 | Batch 0000/0352 | Loss: 0.6950\n", "Epoch: 010/050 | Batch 0200/0352 | Loss: 0.7482\n", "***Epoch: 010/050 | Train. Acc.: 69.613% | Loss: 0.912\n", "***Epoch: 010/050 | Valid. Acc.: 62.900% | Loss: 1.182\n", "Time elapsed: 12.20 min\n", "Epoch: 011/050 | Batch 0000/0352 | Loss: 0.5364\n", "Epoch: 011/050 | Batch 0200/0352 | Loss: 0.7148\n", "***Epoch: 011/050 | Train. Acc.: 70.978% | Loss: 0.890\n", "***Epoch: 011/050 | Valid. Acc.: 62.500% | Loss: 1.269\n", "Time elapsed: 13.42 min\n", "Epoch: 012/050 | Batch 0000/0352 | Loss: 0.5508\n", "Epoch: 012/050 | Batch 0200/0352 | Loss: 0.5948\n", "***Epoch: 012/050 | Train. Acc.: 67.689% | Loss: 1.032\n", "***Epoch: 012/050 | Valid. Acc.: 60.420% | Loss: 1.399\n", "Time elapsed: 14.65 min\n", "Epoch: 013/050 | Batch 0000/0352 | Loss: 0.4297\n", "Epoch: 013/050 | Batch 0200/0352 | Loss: 0.6009\n", "***Epoch: 013/050 | Train. Acc.: 64.773% | Loss: 1.240\n", "***Epoch: 013/050 | Valid. Acc.: 57.120% | Loss: 1.726\n", "Time elapsed: 15.87 min\n", "Epoch: 014/050 | Batch 0000/0352 | Loss: 0.4545\n", "Epoch: 014/050 | Batch 0200/0352 | Loss: 0.4772\n", "***Epoch: 014/050 | Train. Acc.: 65.091% | Loss: 1.284\n", "***Epoch: 014/050 | Valid. Acc.: 56.340% | Loss: 1.817\n", "Time elapsed: 17.10 min\n", "Epoch: 015/050 | Batch 0000/0352 | Loss: 0.2806\n", "Epoch: 015/050 | Batch 0200/0352 | Loss: 0.3789\n", "***Epoch: 015/050 | Train. Acc.: 63.369% | Loss: 1.385\n", "***Epoch: 015/050 | Valid. Acc.: 54.080% | Loss: 2.058\n", "Time elapsed: 18.32 min\n", "Epoch: 016/050 | Batch 0000/0352 | Loss: 0.2412\n", "Epoch: 016/050 | Batch 0200/0352 | Loss: 0.3509\n", "***Epoch: 016/050 | Train. Acc.: 80.284% | Loss: 0.615\n", "***Epoch: 016/050 | Valid. Acc.: 67.080% | Loss: 1.290\n", "Time elapsed: 19.55 min\n", "Epoch: 017/050 | Batch 0000/0352 | Loss: 0.2912\n", "Epoch: 017/050 | Batch 0200/0352 | Loss: 0.3700\n", "***Epoch: 017/050 | Train. Acc.: 75.598% | Loss: 0.832\n", "***Epoch: 017/050 | Valid. 
Acc.: 64.440% | Loss: 1.435\n", "Time elapsed: 20.77 min\n", "Epoch: 018/050 | Batch 0000/0352 | Loss: 0.2656\n", "Epoch: 018/050 | Batch 0200/0352 | Loss: 0.3249\n", "***Epoch: 018/050 | Train. Acc.: 77.631% | Loss: 0.675\n", "***Epoch: 018/050 | Valid. Acc.: 64.800% | Loss: 1.230\n", "Time elapsed: 21.99 min\n", "Epoch: 019/050 | Batch 0000/0352 | Loss: 0.4638\n", "Epoch: 019/050 | Batch 0200/0352 | Loss: 0.4722\n", "***Epoch: 019/050 | Train. Acc.: 83.680% | Loss: 0.510\n", "***Epoch: 019/050 | Valid. Acc.: 68.220% | Loss: 1.236\n", "Time elapsed: 23.21 min\n", "Epoch: 020/050 | Batch 0000/0352 | Loss: 0.1987\n", "Epoch: 020/050 | Batch 0200/0352 | Loss: 0.1785\n", "***Epoch: 020/050 | Train. Acc.: 85.493% | Loss: 0.462\n", "***Epoch: 020/050 | Valid. Acc.: 69.540% | Loss: 1.301\n", "Time elapsed: 24.44 min\n", "Epoch: 021/050 | Batch 0000/0352 | Loss: 0.1715\n", "Epoch: 021/050 | Batch 0200/0352 | Loss: 0.2591\n", "***Epoch: 021/050 | Train. Acc.: 87.880% | Loss: 0.379\n", "***Epoch: 021/050 | Valid. Acc.: 69.440% | Loss: 1.297\n", "Time elapsed: 25.66 min\n", "Epoch: 022/050 | Batch 0000/0352 | Loss: 0.0751\n", "Epoch: 022/050 | Batch 0200/0352 | Loss: 0.1399\n", "***Epoch: 022/050 | Train. Acc.: 64.718% | Loss: 1.335\n", "***Epoch: 022/050 | Valid. Acc.: 54.700% | Loss: 1.915\n", "Time elapsed: 26.89 min\n", "Epoch: 023/050 | Batch 0000/0352 | Loss: 0.4129\n", "Epoch: 023/050 | Batch 0200/0352 | Loss: 0.1313\n", "***Epoch: 023/050 | Train. Acc.: 83.102% | Loss: 0.593\n", "***Epoch: 023/050 | Valid. Acc.: 66.320% | Loss: 1.510\n", "Time elapsed: 28.11 min\n", "Epoch: 024/050 | Batch 0000/0352 | Loss: 0.1770\n", "Epoch: 024/050 | Batch 0200/0352 | Loss: 0.1969\n", "***Epoch: 024/050 | Train. Acc.: 85.444% | Loss: 0.483\n", "***Epoch: 024/050 | Valid. Acc.: 67.820% | Loss: 1.462\n", "Time elapsed: 29.34 min\n", "Epoch: 025/050 | Batch 0000/0352 | Loss: 0.0875\n", "Epoch: 025/050 | Batch 0200/0352 | Loss: 0.0878\n", "***Epoch: 025/050 | Train. Acc.: 83.856% | Loss: 0.572\n", "***Epoch: 025/050 | Valid. Acc.: 66.040% | Loss: 1.633\n", "Time elapsed: 30.56 min\n", "Epoch: 026/050 | Batch 0000/0352 | Loss: 0.1272\n", "Epoch: 026/050 | Batch 0200/0352 | Loss: 0.1095\n", "***Epoch: 026/050 | Train. Acc.: 89.209% | Loss: 0.346\n", "***Epoch: 026/050 | Valid. Acc.: 69.160% | Loss: 1.430\n", "Time elapsed: 31.79 min\n", "Epoch: 027/050 | Batch 0000/0352 | Loss: 0.0530\n", "Epoch: 027/050 | Batch 0200/0352 | Loss: 0.1496\n", "***Epoch: 027/050 | Train. Acc.: 85.542% | Loss: 0.496\n", "***Epoch: 027/050 | Valid. Acc.: 66.340% | Loss: 1.574\n", "Time elapsed: 33.01 min\n", "Epoch: 028/050 | Batch 0000/0352 | Loss: 0.1304\n", "Epoch: 028/050 | Batch 0200/0352 | Loss: 0.1132\n", "***Epoch: 028/050 | Train. Acc.: 81.762% | Loss: 0.664\n", "***Epoch: 028/050 | Valid. Acc.: 65.660% | Loss: 1.656\n", "Time elapsed: 34.24 min\n", "Epoch: 029/050 | Batch 0000/0352 | Loss: 0.1230\n", "Epoch: 029/050 | Batch 0200/0352 | Loss: 0.1004\n", "***Epoch: 029/050 | Train. Acc.: 90.560% | Loss: 0.294\n", "***Epoch: 029/050 | Valid. Acc.: 70.400% | Loss: 1.322\n", "Time elapsed: 35.47 min\n", "Epoch: 030/050 | Batch 0000/0352 | Loss: 0.0776\n", "Epoch: 030/050 | Batch 0200/0352 | Loss: 0.1733\n", "***Epoch: 030/050 | Train. Acc.: 89.807% | Loss: 0.349\n", "***Epoch: 030/050 | Valid. Acc.: 68.740% | Loss: 1.638\n", "Time elapsed: 36.69 min\n", "Epoch: 031/050 | Batch 0000/0352 | Loss: 0.0538\n", "Epoch: 031/050 | Batch 0200/0352 | Loss: 0.1678\n", "***Epoch: 031/050 | Train. 
Acc.: 92.460% | Loss: 0.249\n", "***Epoch: 031/050 | Valid. Acc.: 71.440% | Loss: 1.503\n", "Time elapsed: 37.91 min\n", "Epoch: 032/050 | Batch 0000/0352 | Loss: 0.1050\n", "Epoch: 032/050 | Batch 0200/0352 | Loss: 0.1882\n", "***Epoch: 032/050 | Train. Acc.: 93.773% | Loss: 0.198\n", "***Epoch: 032/050 | Valid. Acc.: 72.300% | Loss: 1.377\n", "Time elapsed: 39.14 min\n", "Epoch: 033/050 | Batch 0000/0352 | Loss: 0.0545\n", "Epoch: 033/050 | Batch 0200/0352 | Loss: 0.1488\n", "***Epoch: 033/050 | Train. Acc.: 90.589% | Loss: 0.328\n", "***Epoch: 033/050 | Valid. Acc.: 69.160% | Loss: 1.553\n", "Time elapsed: 40.36 min\n", "Epoch: 034/050 | Batch 0000/0352 | Loss: 0.0395\n", "Epoch: 034/050 | Batch 0200/0352 | Loss: 0.0678\n", "***Epoch: 034/050 | Train. Acc.: 92.429% | Loss: 0.245\n", "***Epoch: 034/050 | Valid. Acc.: 70.360% | Loss: 1.454\n", "Time elapsed: 41.59 min\n", "Epoch: 035/050 | Batch 0000/0352 | Loss: 0.0534\n", "Epoch: 035/050 | Batch 0200/0352 | Loss: 0.1108\n", "***Epoch: 035/050 | Train. Acc.: 88.993% | Loss: 0.384\n", "***Epoch: 035/050 | Valid. Acc.: 69.060% | Loss: 1.518\n", "Time elapsed: 42.82 min\n", "Epoch: 036/050 | Batch 0000/0352 | Loss: 0.1118\n", "Epoch: 036/050 | Batch 0200/0352 | Loss: 0.0408\n", "***Epoch: 036/050 | Train. Acc.: 93.811% | Loss: 0.195\n", "***Epoch: 036/050 | Valid. Acc.: 72.640% | Loss: 1.404\n", "Time elapsed: 44.04 min\n", "Epoch: 037/050 | Batch 0000/0352 | Loss: 0.0635\n", "Epoch: 037/050 | Batch 0200/0352 | Loss: 0.0642\n", "***Epoch: 037/050 | Train. Acc.: 92.851% | Loss: 0.236\n", "***Epoch: 037/050 | Valid. Acc.: 71.040% | Loss: 1.596\n", "Time elapsed: 45.27 min\n", "Epoch: 038/050 | Batch 0000/0352 | Loss: 0.0662\n", "Epoch: 038/050 | Batch 0200/0352 | Loss: 0.0905\n", "***Epoch: 038/050 | Train. Acc.: 91.587% | Loss: 0.289\n", "***Epoch: 038/050 | Valid. Acc.: 69.600% | Loss: 1.638\n", "Time elapsed: 46.49 min\n", "Epoch: 039/050 | Batch 0000/0352 | Loss: 0.0744\n", "Epoch: 039/050 | Batch 0200/0352 | Loss: 0.0106\n", "***Epoch: 039/050 | Train. Acc.: 94.356% | Loss: 0.182\n", "***Epoch: 039/050 | Valid. Acc.: 71.500% | Loss: 1.577\n", "Time elapsed: 47.71 min\n", "Epoch: 040/050 | Batch 0000/0352 | Loss: 0.0087\n", "Epoch: 040/050 | Batch 0200/0352 | Loss: 0.0522\n", "***Epoch: 040/050 | Train. Acc.: 85.756% | Loss: 0.536\n", "***Epoch: 040/050 | Valid. Acc.: 66.460% | Loss: 1.836\n", "Time elapsed: 48.94 min\n", "Epoch: 041/050 | Batch 0000/0352 | Loss: 0.0759\n", "Epoch: 041/050 | Batch 0200/0352 | Loss: 0.0693\n", "***Epoch: 041/050 | Train. Acc.: 91.647% | Loss: 0.283\n", "***Epoch: 041/050 | Valid. Acc.: 69.760% | Loss: 1.621\n", "Time elapsed: 50.16 min\n", "Epoch: 042/050 | Batch 0000/0352 | Loss: 0.1359\n", "Epoch: 042/050 | Batch 0200/0352 | Loss: 0.0824\n", "***Epoch: 042/050 | Train. Acc.: 72.082% | Loss: 1.133\n", "***Epoch: 042/050 | Valid. Acc.: 59.880% | Loss: 2.117\n", "Time elapsed: 51.37 min\n", "Epoch: 043/050 | Batch 0000/0352 | Loss: 0.3177\n", "Epoch: 043/050 | Batch 0200/0352 | Loss: 0.0631\n", "***Epoch: 043/050 | Train. Acc.: 95.522% | Loss: 0.138\n", "***Epoch: 043/050 | Valid. Acc.: 72.840% | Loss: 1.308\n", "Time elapsed: 52.60 min\n", "Epoch: 044/050 | Batch 0000/0352 | Loss: 0.0383\n", "Epoch: 044/050 | Batch 0200/0352 | Loss: 0.0217\n", "***Epoch: 044/050 | Train. Acc.: 98.031% | Loss: 0.060\n", "***Epoch: 044/050 | Valid. 
Acc.: 74.800% | Loss: 1.464\n", "Time elapsed: 53.82 min\n", "Epoch: 045/050 | Batch 0000/0352 | Loss: 0.0097\n", "Epoch: 045/050 | Batch 0200/0352 | Loss: 1.6721\n", "***Epoch: 045/050 | Train. Acc.: 63.620% | Loss: 1.092\n", "***Epoch: 045/050 | Valid. Acc.: 58.180% | Loss: 1.276\n", "Time elapsed: 55.05 min\n", "Epoch: 046/050 | Batch 0000/0352 | Loss: 1.0475\n", "Epoch: 046/050 | Batch 0200/0352 | Loss: 0.5436\n", "***Epoch: 046/050 | Train. Acc.: 92.420% | Loss: 0.235\n", "***Epoch: 046/050 | Valid. Acc.: 72.740% | Loss: 0.935\n", "Time elapsed: 56.27 min\n", "Epoch: 047/050 | Batch 0000/0352 | Loss: 0.3582\n", "Epoch: 047/050 | Batch 0200/0352 | Loss: 0.1439\n", "***Epoch: 047/050 | Train. Acc.: 97.762% | Loss: 0.068\n", "***Epoch: 047/050 | Valid. Acc.: 73.740% | Loss: 1.242\n", "Time elapsed: 57.49 min\n", "Epoch: 048/050 | Batch 0000/0352 | Loss: 0.0733\n", "Epoch: 048/050 | Batch 0200/0352 | Loss: 0.0193\n", "***Epoch: 048/050 | Train. Acc.: 98.331% | Loss: 0.050\n", "***Epoch: 048/050 | Valid. Acc.: 74.080% | Loss: 1.582\n", "Time elapsed: 58.72 min\n", "Epoch: 049/050 | Batch 0000/0352 | Loss: 0.0125\n", "Epoch: 049/050 | Batch 0200/0352 | Loss: 0.0322\n", "***Epoch: 049/050 | Train. Acc.: 96.682% | Loss: 0.105\n", "***Epoch: 049/050 | Valid. Acc.: 73.420% | Loss: 1.699\n", "Time elapsed: 59.94 min\n", "Epoch: 050/050 | Batch 0000/0352 | Loss: 0.0185\n", "Epoch: 050/050 | Batch 0200/0352 | Loss: 0.1059\n", "***Epoch: 050/050 | Train. Acc.: 97.193% | Loss: 0.085\n", "***Epoch: 050/050 | Valid. Acc.: 73.120% | Loss: 1.708\n", "Time elapsed: 61.16 min\n", "Total Training Time: 61.16 min\n" ] } ], "source": [ "_ = train_classifier_simple_v1(num_epochs=num_epochs, model=model, \n", " optimizer=optimizer, device=DEVICE, \n", " train_loader=train_loader, valid_loader=valid_loader, \n", " logging_interval=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3) Run with Deterministic Behavior" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we set the deterministic behavior via the `set_deterministic()` function defined at the top of this notebook and compare how it affects the runtime speed of the ResNet-101 model. 
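" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are on a newer PyTorch release (1.8+), where `torch.set_deterministic` was renamed, an equivalent of our `set_deterministic()` function would look roughly like the following sketch (based on the PyTorch randomness notes linked above; it is not what was run for this benchmark, and `set_deterministic_new` is just an illustrative name):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "import torch\n", "\n", "\n", "def set_deterministic_new():\n", "    # Deterministic cuBLAS on CUDA >= 10.2 requires this workspace setting\n", "    os.environ[\"CUBLAS_WORKSPACE_CONFIG\"] = \":4096:8\"\n", "    if torch.cuda.is_available():\n", "        torch.backends.cudnn.benchmark = False\n", "        torch.backends.cudnn.deterministic = True\n", "    # Replaces torch.set_deterministic(True) from PyTorch 1.7\n", "    torch.use_deterministic_algorithms(True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "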
(Note that setting random seeds doesn't affect the timing results.)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "set_deterministic()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "### Set random seed ###\n", "set_all_seeds(random_seed)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Files already downloaded and verified\n" ] } ], "source": [ "##########################\n", "### Dataset\n", "##########################\n", "\n", "train_loader, valid_loader, test_loader = get_dataloaders_cifar10(\n", " batch_size, \n", " num_workers=0, \n", " validation_fraction=0.1)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "##########################\n", "### Model\n", "##########################\n", "\n", "\n", "from deterministic_benchmark_utils import resnet101\n", "\n", "\n", "\n", "\n", "model = resnet101(num_classes, grayscale=False)\n", "\n", "model = model.to(DEVICE)\n", "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 001/050 | Batch 0000/0352 | Loss: 2.6271\n", "Epoch: 001/050 | Batch 0200/0352 | Loss: 2.2247\n", "***Epoch: 001/050 | Train. Acc.: 20.160% | Loss: 3.197\n", "***Epoch: 001/050 | Valid. Acc.: 19.660% | Loss: 2.888\n", "Time elapsed: 1.21 min\n", "Epoch: 002/050 | Batch 0000/0352 | Loss: 2.1651\n", "Epoch: 002/050 | Batch 0200/0352 | Loss: 1.8593\n", "***Epoch: 002/050 | Train. Acc.: 34.807% | Loss: 1.741\n", "***Epoch: 002/050 | Valid. Acc.: 34.780% | Loss: 1.740\n", "Time elapsed: 2.44 min\n", "Epoch: 003/050 | Batch 0000/0352 | Loss: 1.6749\n", "Epoch: 003/050 | Batch 0200/0352 | Loss: 1.5481\n", "***Epoch: 003/050 | Train. Acc.: 43.656% | Loss: 1.581\n", "***Epoch: 003/050 | Valid. Acc.: 43.020% | Loss: 1.567\n", "Time elapsed: 3.66 min\n", "Epoch: 004/050 | Batch 0000/0352 | Loss: 1.4751\n", "Epoch: 004/050 | Batch 0200/0352 | Loss: 1.3720\n", "***Epoch: 004/050 | Train. Acc.: 40.811% | Loss: 2.705\n", "***Epoch: 004/050 | Valid. Acc.: 40.340% | Loss: 2.576\n", "Time elapsed: 4.88 min\n", "Epoch: 005/050 | Batch 0000/0352 | Loss: 1.3779\n", "Epoch: 005/050 | Batch 0200/0352 | Loss: 1.3833\n", "***Epoch: 005/050 | Train. Acc.: 49.793% | Loss: 1.380\n", "***Epoch: 005/050 | Valid. Acc.: 50.000% | Loss: 1.402\n", "Time elapsed: 6.11 min\n", "Epoch: 006/050 | Batch 0000/0352 | Loss: 1.2832\n", "Epoch: 006/050 | Batch 0200/0352 | Loss: 1.2519\n", "***Epoch: 006/050 | Train. Acc.: 41.536% | Loss: 1.599\n", "***Epoch: 006/050 | Valid. Acc.: 41.540% | Loss: 1.616\n", "Time elapsed: 7.34 min\n", "Epoch: 007/050 | Batch 0000/0352 | Loss: 1.4472\n", "Epoch: 007/050 | Batch 0200/0352 | Loss: 1.2238\n", "***Epoch: 007/050 | Train. Acc.: 57.633% | Loss: 1.171\n", "***Epoch: 007/050 | Valid. Acc.: 57.100% | Loss: 1.219\n", "Time elapsed: 8.56 min\n", "Epoch: 008/050 | Batch 0000/0352 | Loss: 1.0530\n", "Epoch: 008/050 | Batch 0200/0352 | Loss: 1.0490\n", "***Epoch: 008/050 | Train. Acc.: 62.620% | Loss: 1.032\n", "***Epoch: 008/050 | Valid. Acc.: 60.460% | Loss: 1.109\n", "Time elapsed: 9.79 min\n", "Epoch: 009/050 | Batch 0000/0352 | Loss: 1.0152\n", "Epoch: 009/050 | Batch 0200/0352 | Loss: 0.9678\n", "***Epoch: 009/050 | Train. Acc.: 56.600% | Loss: 1.252\n", "***Epoch: 009/050 | Valid. 
Acc.: 55.620% | Loss: 1.311\n", "Time elapsed: 11.01 min\n", "Epoch: 010/050 | Batch 0000/0352 | Loss: 0.9740\n", "Epoch: 010/050 | Batch 0200/0352 | Loss: 0.9075\n", "***Epoch: 010/050 | Train. Acc.: 60.382% | Loss: 1.130\n", "***Epoch: 010/050 | Valid. Acc.: 57.740% | Loss: 1.254\n", "Time elapsed: 12.24 min\n", "Epoch: 011/050 | Batch 0000/0352 | Loss: 0.7509\n", "Epoch: 011/050 | Batch 0200/0352 | Loss: 0.9028\n", "***Epoch: 011/050 | Train. Acc.: 70.091% | Loss: 0.851\n", "***Epoch: 011/050 | Valid. Acc.: 64.060% | Loss: 1.045\n", "Time elapsed: 13.46 min\n", "Epoch: 012/050 | Batch 0000/0352 | Loss: 0.6871\n", "Epoch: 012/050 | Batch 0200/0352 | Loss: 0.8642\n", "***Epoch: 012/050 | Train. Acc.: 71.362% | Loss: 0.835\n", "***Epoch: 012/050 | Valid. Acc.: 64.200% | Loss: 1.079\n", "Time elapsed: 14.68 min\n", "Epoch: 013/050 | Batch 0000/0352 | Loss: 0.5075\n", "Epoch: 013/050 | Batch 0200/0352 | Loss: 0.7813\n", "***Epoch: 013/050 | Train. Acc.: 68.644% | Loss: 0.916\n", "***Epoch: 013/050 | Valid. Acc.: 62.620% | Loss: 1.169\n", "Time elapsed: 15.92 min\n", "Epoch: 014/050 | Batch 0000/0352 | Loss: 0.5169\n", "Epoch: 014/050 | Batch 0200/0352 | Loss: 0.7422\n", "***Epoch: 014/050 | Train. Acc.: 73.640% | Loss: 0.769\n", "***Epoch: 014/050 | Valid. Acc.: 65.380% | Loss: 1.090\n", "Time elapsed: 17.14 min\n", "Epoch: 015/050 | Batch 0000/0352 | Loss: 0.4203\n", "Epoch: 015/050 | Batch 0200/0352 | Loss: 0.6845\n", "***Epoch: 015/050 | Train. Acc.: 67.880% | Loss: 0.965\n", "***Epoch: 015/050 | Valid. Acc.: 62.080% | Loss: 1.229\n", "Time elapsed: 18.37 min\n", "Epoch: 016/050 | Batch 0000/0352 | Loss: 0.4785\n", "Epoch: 016/050 | Batch 0200/0352 | Loss: 0.6260\n", "***Epoch: 016/050 | Train. Acc.: 68.316% | Loss: 1.007\n", "***Epoch: 016/050 | Valid. Acc.: 60.440% | Loss: 1.380\n", "Time elapsed: 19.59 min\n", "Epoch: 017/050 | Batch 0000/0352 | Loss: 0.4362\n", "Epoch: 017/050 | Batch 0200/0352 | Loss: 0.5386\n", "***Epoch: 017/050 | Train. Acc.: 75.922% | Loss: 0.755\n", "***Epoch: 017/050 | Valid. Acc.: 65.360% | Loss: 1.277\n", "Time elapsed: 20.82 min\n", "Epoch: 018/050 | Batch 0000/0352 | Loss: 0.3375\n", "Epoch: 018/050 | Batch 0200/0352 | Loss: 0.5347\n", "***Epoch: 018/050 | Train. Acc.: 79.782% | Loss: 0.623\n", "***Epoch: 018/050 | Valid. Acc.: 68.000% | Loss: 1.168\n", "Time elapsed: 22.05 min\n", "Epoch: 019/050 | Batch 0000/0352 | Loss: 0.3641\n", "Epoch: 019/050 | Batch 0200/0352 | Loss: 0.5012\n", "***Epoch: 019/050 | Train. Acc.: 60.073% | Loss: 1.186\n", "***Epoch: 019/050 | Valid. Acc.: 57.720% | Loss: 1.249\n", "Time elapsed: 23.27 min\n", "Epoch: 020/050 | Batch 0000/0352 | Loss: 1.0688\n", "Epoch: 020/050 | Batch 0200/0352 | Loss: 0.8022\n", "***Epoch: 020/050 | Train. Acc.: 74.598% | Loss: 0.724\n", "***Epoch: 020/050 | Valid. Acc.: 67.060% | Loss: 0.991\n", "Time elapsed: 24.50 min\n", "Epoch: 021/050 | Batch 0000/0352 | Loss: 0.6721\n", "Epoch: 021/050 | Batch 0200/0352 | Loss: 0.5717\n", "***Epoch: 021/050 | Train. Acc.: 80.036% | Loss: 0.582\n", "***Epoch: 021/050 | Valid. Acc.: 69.000% | Loss: 1.022\n", "Time elapsed: 25.72 min\n", "Epoch: 022/050 | Batch 0000/0352 | Loss: 0.3786\n", "Epoch: 022/050 | Batch 0200/0352 | Loss: 0.4261\n", "***Epoch: 022/050 | Train. Acc.: 71.816% | Loss: 0.984\n", "***Epoch: 022/050 | Valid. Acc.: 61.340% | Loss: 1.499\n", "Time elapsed: 26.95 min\n", "Epoch: 023/050 | Batch 0000/0352 | Loss: 0.3137\n", "Epoch: 023/050 | Batch 0200/0352 | Loss: 0.3047\n", "***Epoch: 023/050 | Train. 
Acc.: 80.947% | Loss: 0.599\n", "***Epoch: 023/050 | Valid. Acc.: 67.320% | Loss: 1.299\n", "Time elapsed: 28.17 min\n", "Epoch: 024/050 | Batch 0000/0352 | Loss: 0.2161\n", "Epoch: 024/050 | Batch 0200/0352 | Loss: 0.2751\n", "***Epoch: 024/050 | Train. Acc.: 85.284% | Loss: 0.453\n", "***Epoch: 024/050 | Valid. Acc.: 69.680% | Loss: 1.193\n", "Time elapsed: 29.40 min\n", "Epoch: 025/050 | Batch 0000/0352 | Loss: 0.1900\n", "Epoch: 025/050 | Batch 0200/0352 | Loss: 0.2022\n", "***Epoch: 025/050 | Train. Acc.: 88.496% | Loss: 0.340\n", "***Epoch: 025/050 | Valid. Acc.: 71.800% | Loss: 1.134\n", "Time elapsed: 30.62 min\n", "Epoch: 026/050 | Batch 0000/0352 | Loss: 0.1252\n", "Epoch: 026/050 | Batch 0200/0352 | Loss: 0.3572\n", "***Epoch: 026/050 | Train. Acc.: 87.156% | Loss: 0.403\n", "***Epoch: 026/050 | Valid. Acc.: 71.300% | Loss: 1.200\n", "Time elapsed: 31.85 min\n", "Epoch: 027/050 | Batch 0000/0352 | Loss: 0.1662\n", "Epoch: 027/050 | Batch 0200/0352 | Loss: 0.3703\n", "***Epoch: 027/050 | Train. Acc.: 89.871% | Loss: 0.308\n", "***Epoch: 027/050 | Valid. Acc.: 71.880% | Loss: 1.199\n", "Time elapsed: 33.07 min\n", "Epoch: 028/050 | Batch 0000/0352 | Loss: 0.2198\n", "Epoch: 028/050 | Batch 0200/0352 | Loss: 0.2505\n", "***Epoch: 028/050 | Train. Acc.: 86.404% | Loss: 0.438\n", "***Epoch: 028/050 | Valid. Acc.: 70.160% | Loss: 1.262\n", "Time elapsed: 34.30 min\n", "Epoch: 029/050 | Batch 0000/0352 | Loss: 0.1404\n", "Epoch: 029/050 | Batch 0200/0352 | Loss: 0.0904\n", "***Epoch: 029/050 | Train. Acc.: 83.700% | Loss: 0.564\n", "***Epoch: 029/050 | Valid. Acc.: 66.820% | Loss: 1.514\n", "Time elapsed: 35.53 min\n", "Epoch: 030/050 | Batch 0000/0352 | Loss: 0.2442\n", "Epoch: 030/050 | Batch 0200/0352 | Loss: 0.2605\n", "***Epoch: 030/050 | Train. Acc.: 76.907% | Loss: 0.882\n", "***Epoch: 030/050 | Valid. Acc.: 64.300% | Loss: 1.780\n", "Time elapsed: 36.76 min\n", "Epoch: 031/050 | Batch 0000/0352 | Loss: 0.2206\n", "Epoch: 031/050 | Batch 0200/0352 | Loss: 0.1347\n", "***Epoch: 031/050 | Train. Acc.: 79.053% | Loss: 0.773\n", "***Epoch: 031/050 | Valid. Acc.: 64.720% | Loss: 1.691\n", "Time elapsed: 37.98 min\n", "Epoch: 032/050 | Batch 0000/0352 | Loss: 0.1039\n", "Epoch: 032/050 | Batch 0200/0352 | Loss: 0.1669\n", "***Epoch: 032/050 | Train. Acc.: 84.833% | Loss: 0.505\n", "***Epoch: 032/050 | Valid. Acc.: 67.180% | Loss: 1.558\n", "Time elapsed: 39.21 min\n", "Epoch: 033/050 | Batch 0000/0352 | Loss: 0.0840\n", "Epoch: 033/050 | Batch 0200/0352 | Loss: 0.1019\n", "***Epoch: 033/050 | Train. Acc.: 89.533% | Loss: 0.328\n", "***Epoch: 033/050 | Valid. Acc.: 70.920% | Loss: 1.309\n", "Time elapsed: 40.44 min\n", "Epoch: 034/050 | Batch 0000/0352 | Loss: 0.1585\n", "Epoch: 034/050 | Batch 0200/0352 | Loss: 0.1356\n", "***Epoch: 034/050 | Train. Acc.: 87.969% | Loss: 0.408\n", "***Epoch: 034/050 | Valid. Acc.: 69.360% | Loss: 1.486\n", "Time elapsed: 41.67 min\n", "Epoch: 035/050 | Batch 0000/0352 | Loss: 0.2327\n", "Epoch: 035/050 | Batch 0200/0352 | Loss: 0.0837\n", "***Epoch: 035/050 | Train. Acc.: 87.382% | Loss: 0.446\n", "***Epoch: 035/050 | Valid. Acc.: 69.560% | Loss: 1.468\n", "Time elapsed: 42.90 min\n", "Epoch: 036/050 | Batch 0000/0352 | Loss: 0.0715\n", "Epoch: 036/050 | Batch 0200/0352 | Loss: 0.0714\n", "***Epoch: 036/050 | Train. Acc.: 92.993% | Loss: 0.221\n", "***Epoch: 036/050 | Valid. 
Acc.: 73.320% | Loss: 1.318\n", "Time elapsed: 44.12 min\n", "Epoch: 037/050 | Batch 0000/0352 | Loss: 0.0546\n", "Epoch: 037/050 | Batch 0200/0352 | Loss: 0.1362\n", "***Epoch: 037/050 | Train. Acc.: 93.987% | Loss: 0.190\n", "***Epoch: 037/050 | Valid. Acc.: 73.420% | Loss: 1.317\n", "Time elapsed: 45.37 min\n", "Epoch: 038/050 | Batch 0000/0352 | Loss: 0.0735\n", "Epoch: 038/050 | Batch 0200/0352 | Loss: 0.0330\n", "***Epoch: 038/050 | Train. Acc.: 94.029% | Loss: 0.190\n", "***Epoch: 038/050 | Valid. Acc.: 72.420% | Loss: 1.421\n", "Time elapsed: 46.59 min\n", "Epoch: 039/050 | Batch 0000/0352 | Loss: 0.1424\n", "Epoch: 039/050 | Batch 0200/0352 | Loss: 0.1515\n", "***Epoch: 039/050 | Train. Acc.: 94.256% | Loss: 0.180\n", "***Epoch: 039/050 | Valid. Acc.: 73.500% | Loss: 1.354\n", "Time elapsed: 47.82 min\n", "Epoch: 040/050 | Batch 0000/0352 | Loss: 0.0455\n", "Epoch: 040/050 | Batch 0200/0352 | Loss: 0.0889\n", "***Epoch: 040/050 | Train. Acc.: 91.420% | Loss: 0.309\n", "***Epoch: 040/050 | Valid. Acc.: 72.240% | Loss: 1.485\n", "Time elapsed: 49.05 min\n", "Epoch: 041/050 | Batch 0000/0352 | Loss: 0.0589\n", "Epoch: 041/050 | Batch 0200/0352 | Loss: 0.0381\n", "***Epoch: 041/050 | Train. Acc.: 87.698% | Loss: 0.499\n", "***Epoch: 041/050 | Valid. Acc.: 68.860% | Loss: 1.768\n", "Time elapsed: 50.28 min\n", "Epoch: 042/050 | Batch 0000/0352 | Loss: 0.1281\n", "Epoch: 042/050 | Batch 0200/0352 | Loss: 0.0269\n", "***Epoch: 042/050 | Train. Acc.: 95.911% | Loss: 0.128\n", "***Epoch: 042/050 | Valid. Acc.: 75.200% | Loss: 1.303\n", "Time elapsed: 51.51 min\n", "Epoch: 043/050 | Batch 0000/0352 | Loss: 0.0151\n", "Epoch: 043/050 | Batch 0200/0352 | Loss: 0.3936\n", "***Epoch: 043/050 | Train. Acc.: 95.720% | Loss: 0.128\n", "***Epoch: 043/050 | Valid. Acc.: 75.320% | Loss: 1.133\n", "Time elapsed: 52.74 min\n", "Epoch: 044/050 | Batch 0000/0352 | Loss: 0.1388\n", "Epoch: 044/050 | Batch 0200/0352 | Loss: 0.0791\n", "***Epoch: 044/050 | Train. Acc.: 93.296% | Loss: 0.235\n", "***Epoch: 044/050 | Valid. Acc.: 73.260% | Loss: 1.451\n", "Time elapsed: 53.97 min\n", "Epoch: 045/050 | Batch 0000/0352 | Loss: 0.0067\n", "Epoch: 045/050 | Batch 0200/0352 | Loss: 0.0268\n", "***Epoch: 045/050 | Train. Acc.: 96.744% | Loss: 0.106\n", "***Epoch: 045/050 | Valid. Acc.: 75.040% | Loss: 1.363\n", "Time elapsed: 55.19 min\n", "Epoch: 046/050 | Batch 0000/0352 | Loss: 0.0690\n", "Epoch: 046/050 | Batch 0200/0352 | Loss: 0.0792\n", "***Epoch: 046/050 | Train. Acc.: 93.733% | Loss: 0.211\n", "***Epoch: 046/050 | Valid. Acc.: 73.500% | Loss: 1.439\n", "Time elapsed: 56.42 min\n", "Epoch: 047/050 | Batch 0000/0352 | Loss: 0.1034\n", "Epoch: 047/050 | Batch 0200/0352 | Loss: 0.0920\n", "***Epoch: 047/050 | Train. Acc.: 92.091% | Loss: 0.276\n", "***Epoch: 047/050 | Valid. Acc.: 71.780% | Loss: 1.538\n", "Time elapsed: 57.65 min\n", "Epoch: 048/050 | Batch 0000/0352 | Loss: 0.0308\n", "Epoch: 048/050 | Batch 0200/0352 | Loss: 0.0766\n", "***Epoch: 048/050 | Train. Acc.: 95.351% | Loss: 0.153\n", "***Epoch: 048/050 | Valid. Acc.: 75.200% | Loss: 1.408\n", "Time elapsed: 58.88 min\n", "Epoch: 049/050 | Batch 0000/0352 | Loss: 0.0672\n", "Epoch: 049/050 | Batch 0200/0352 | Loss: 0.0637\n", "***Epoch: 049/050 | Train. Acc.: 91.213% | Loss: 0.321\n", "***Epoch: 049/050 | Valid. Acc.: 71.400% | Loss: 1.533\n", "Time elapsed: 60.11 min\n", "Epoch: 050/050 | Batch 0000/0352 | Loss: 0.0195\n", "Epoch: 050/050 | Batch 0200/0352 | Loss: 0.0353\n", "***Epoch: 050/050 | Train. 
Acc.: 89.211% | Loss: 0.313\n", "***Epoch: 050/050 | Valid. Acc.: 72.620% | Loss: 0.953\n", "Time elapsed: 61.35 min\n", "Total Training Time: 61.35 min\n" ] } ], "source": [ "_ = train_classifier_simple_v1(num_epochs=num_epochs, model=model, \n", " optimizer=optimizer, device=DEVICE, \n", " train_loader=train_loader, valid_loader=valid_loader, \n", " logging_interval=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 4) Result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this particular case, enabling deterministic behavior had no noticeable impact on the runtime: the total training time was 61.16 min without deterministic behavior versus 61.35 min with it, a difference of roughly 0.3%. Keep in mind that this is a single run per setting with a single architecture; the overhead of deterministic algorithms can vary with the model, the hardware, and the software versions." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }