{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "f11a376bbed2a7c76a5887d18cd70ca0", "grade": false, "grade_id": "cell-9f26189845c414a6", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "# Part 3: Training and evaluating an MLP classifier with Torchbearer" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "bf45994f9585bb2a9b46a7e758a5640d", "grade": false, "grade_id": "cell-6e0f33a0a40b7f84", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "# Execute this code block to install dependencies when running on colab\n", "try:\n", " import torch\n", "except:\n", " from os.path import exists\n", " from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag\n", " platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())\n", " cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\\.\\([0-9]*\\)\\.\\([0-9]*\\)$/cu\\1\\2/'\n", " accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'\n", "\n", " !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision\n", "\n", "try: \n", " import torchbearer\n", "except:\n", " !pip install torchbearer" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "f8e6214fbb52f1b1dae608cc39a8f414", "grade": false, "grade_id": "cell-f40ada026dcaaf3d", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "## Introducing Torchbearer\n", "You've now got to a stage where you've successfully implemented, trained and evaluated a neural network in PyTorch. You will have noticed that whilst defining the model was done in very few lines of code, that you actually had to do quite a lot of work to train and evaluate the model. Whilst the ability to have complete control over training and evaluation is useful (you'll see examples in later labs, and for the coursework you might come across situations where this is an absolute necessity), it can become rather tedious if you just want to perform a standard training and evaluation run. \n", "\n", "The [Torchbearer](https://github.com/ecs-vlc/torchbearer) library, written and maintained by members of the VLC research group in ECS, can help. Torchbearer is a model training and evaluation library that is designed to massively reduce the amount of code you need to write whilst still allowing full control over the process. 
\n", "\n", "The following code just reproduces the baseline MLP model implementation and loads data as we did in part 2:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "fbbc1e14a837d217ba2bdbe61d32d8b8", "grade": false, "grade_id": "cell-99d750680ef07a12", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "import torch\n", "import torch.nn.functional as F\n", "import torchvision.transforms as transforms\n", "from torch import nn\n", "from torch import optim\n", "from torch.utils.data import DataLoader\n", "from torchvision.datasets import MNIST\n", "\n", "# fix random seed for reproducibility\n", "seed = 7\n", "torch.manual_seed(seed)\n", "torch.backends.cudnn.deterministic = True\n", "torch.backends.cudnn.benchmark = False\n", "import numpy as np\n", "np.random.seed(seed)\n", "\n", "# flatten 28*28 images to a 784 vector for each image\n", "transform = transforms.Compose([\n", " transforms.ToTensor(), # convert to tensor\n", " transforms.Lambda(lambda x: x.view(-1)) # flatten into vector\n", "])\n", "\n", "# load data\n", "trainset = MNIST(\".\", train=True, download=True, transform=transform)\n", "testset = MNIST(\".\", train=False, download=True, transform=transform)\n", "\n", "# create data loaders\n", "trainloader = DataLoader(trainset, batch_size=128, shuffle=True)\n", "testloader = DataLoader(testset, batch_size=128, shuffle=True)\n", "\n", "# define baseline model\n", "class BaselineModel(nn.Module):\n", " def __init__(self, input_size, hidden_size, num_classes):\n", " super(BaselineModel, self).__init__()\n", " self.fc1 = nn.Linear(input_size, hidden_size) \n", " self.fc2 = nn.Linear(hidden_size, num_classes) \n", " \n", " def forward(self, x):\n", " out = self.fc1(x)\n", " out = F.relu(out)\n", " out = self.fc2(out)\n", " if not self.training:\n", " out = F.softmax(out, dim=1)\n", " return out\n", "\n", "# build the model\n", "model = BaselineModel(784, 784, 10)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "168cde459e4f896bbb6bfc3f6e37ac75", "grade": false, "grade_id": "cell-dc258ff0a6778d16", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "We can use the torchbearer `Trial` class to do all the hard work in training and evaluating for us:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "595dab583eb228c55b7a53d651db0621", "grade": false, "grade_id": "cell-2f6425bc8d566229", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "import torchbearer\n", "\n", "# define the loss function and the optimiser\n", "loss_function = nn.CrossEntropyLoss()\n", "optimiser = optim.Adam(model.parameters())\n", "\n", "# Construct a trial object with the model, optimiser and loss.\n", "# Also specify metrics we wish to compute.\n", "trial = torchbearer.Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'])\n", "\n", "# Provide the data to the trial\n", "trial.with_generators(trainloader, test_generator=testloader)\n", "\n", "# Run 10 epochs of training\n", "trial.run(epochs=10)\n", "\n", "# test the performance\n", "results = trial.evaluate(data_key=torchbearer.TEST_DATA)\n", "print(results)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "a966b84f98d44de2d220a6d993ada48e", "grade": false, 
"grade_id": "cell-5d3c958d1716301e", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "You can see that training and evaluating the model prints out much more informative information as the process runs - this is particularly useful for training big models as it enables you to keep an eye on progress. You should see the accuracy matches the one from the previous part of the lab.\n", "\n", "Take some time to have a look over the [Torchbearer documentation](https://torchbearer.readthedocs.io). Once you've had a look, __use the code block below to train the model (newly initialised) with a plain Stochastic Gradient Descent optimiser with a learning rate of 0.01 (keep all other SGD parameters at their default values). In addition to computing the loss and accuracy metrics, use Torchbearer to also compute the top-5 accuracy metric which was made popular by the ImageNet challenge.__" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "checksum": "f38d28f2d3fc968ec52ad020e5e384dc", "grade": false, "grade_id": "cell-409fe541b92c1119", "locked": false, "schema_version": 1, "solution": true } }, "outputs": [], "source": [ "model = BaselineModel(784, 784, 10)\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "\n", "trial.run(epochs=10)\n", "results = trial.evaluate(data_key=torchbearer.TEST_DATA)\n", "print(results)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "177ff0787d23ab9a38360eeda8fc5d00", "grade": true, "grade_id": "cell-bbfe8875e4e2c5e3", "locked": true, "points": 4, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "assert 'test_top_5_acc' in results\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }