{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MNIST with SciKit-Learn and skorch\n", "\n", "This notebook shows how to define and train a simple neural network with PyTorch and use it via skorch with SciKit-Learn." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import fetch_mldata\n", "from sklearn.model_selection import train_test_split\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading Data\n", "Using SciKit-Learn's ```fetch_mldata``` to load the MNIST data." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "mnist = fetch_mldata('MNIST original', data_home='../datasets/')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'DESCR': 'mldata.org dataset: mnist-original',\n", " 'COL_NAMES': ['label', 'data'],\n", " 'target': array([0., 0., 0., ..., 9., 9., 9.]),\n", " 'data': array([[0, 0, 0, ..., 0, 0, 0],\n", " [0, 0, 0, ..., 0, 0, 0],\n", " [0, 0, 0, ..., 0, 0, 0],\n", " ...,\n", " [0, 0, 0, ..., 0, 0, 0],\n", " [0, 0, 0, ..., 0, 0, 0],\n", " [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mnist" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(70000, 784)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mnist.data.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preprocessing Data\n", "\n", "Each image of the MNIST dataset is encoded as a 784-dimensional vector, representing a 28 x 28 pixel image. Each pixel has a value between 0 and 255, corresponding to the grey value of that pixel.\n",
"The ```fetch_mldata``` method used above to load MNIST returns ```data``` and ```target``` as ```uint8```, which we convert to ```float32``` and ```int64``` respectively." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "X = mnist.data.astype('float32')\n", "y = mnist.target.astype('int64')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we will use ReLU as the activation in combination with softmax over the output layer, we need to scale `X` down. An often-used range is [0, 1]." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "X /= 255.0" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(0.0, 1.0)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.min(), X.max()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: the data is scaled but not normalized." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "assert(X_train.shape[0] + X_test.shape[0] == mnist.data.shape[0])" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((52500, 784), (52500,))" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train.shape, y_train.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build Neural Network with PyTorch\n", "A simple, fully connected neural network with one hidden layer. The input layer has 784 dimensions (28x28), the hidden layer 98 (= 784 / 8) and the output layer 10 neurons, representing the digits 0 - 9." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "import torch\n", "from torch import nn\n", "import torch.nn.functional as F" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "torch.manual_seed(0);\n", "device = 'cuda' if torch.cuda.is_available() else 'cpu'" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "mnist_dim = X.shape[1]\n", "hidden_dim = int(mnist_dim/8)\n", "output_dim = len(np.unique(mnist.target))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(784, 98, 10)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mnist_dim, hidden_dim, output_dim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A neural network defined as a PyTorch ```nn.Module```." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "class ClassifierModule(nn.Module):\n", "    def __init__(\n", "            self,\n", "            input_dim=mnist_dim,\n", "            hidden_dim=hidden_dim,\n", "            output_dim=output_dim,\n", "            dropout=0.5,\n", "    ):\n", "        super(ClassifierModule, self).__init__()\n", "        self.dropout = nn.Dropout(dropout)\n", "\n", "        self.hidden = nn.Linear(input_dim, hidden_dim)\n", "        self.output = nn.Linear(hidden_dim, output_dim)\n", "\n", "    def forward(self, X, **kwargs):\n", "        X = F.relu(self.hidden(X))\n", "        X = self.dropout(X)\n", "        X = F.softmax(self.output(X), dim=-1)\n", "        return X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "skorch allows us to use PyTorch's networks in the SciKit-Learn setting."
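, "\n", "\n", "For illustration, this means the ```net``` defined in the following cells can be plugged into standard SciKit-Learn utilities such as ```GridSearchCV```. The snippet below is a minimal, non-executed sketch; the searched values for ```lr``` and ```max_epochs``` are arbitrary examples:\n", "\n", "```python\n", "from sklearn.model_selection import GridSearchCV\n", "\n", "# Sketch only: grid-search over skorch hyperparameters, assuming `net`,\n", "# `X_train` and `y_train` are defined as in the surrounding cells.\n", "params = {\n", "    'lr': [0.05, 0.1],\n", "    'max_epochs': [10, 20],\n", "}\n", "gs = GridSearchCV(net, params, cv=3, scoring='accuracy', refit=False)\n", "gs.fit(X_train, y_train)\n", "print(gs.best_params_)\n", "```"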
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "from skorch import NeuralNetClassifier" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "net = NeuralNetClassifier(\n", "    ClassifierModule,\n", "    max_epochs=20,\n", "    lr=0.1,\n", "    device=device,\n", ")" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " epoch train_loss valid_acc valid_loss dur\n", "------- ------------ ----------- ------------ ------\n", " 1 \u001b[36m0.8284\u001b[0m \u001b[32m0.9010\u001b[0m \u001b[35m0.3771\u001b[0m 5.4626\n", " 2 \u001b[36m0.4376\u001b[0m \u001b[32m0.9199\u001b[0m \u001b[35m0.2879\u001b[0m 5.6434\n", " 3 \u001b[36m0.3664\u001b[0m \u001b[32m0.9308\u001b[0m \u001b[35m0.2458\u001b[0m 5.1058\n", " 4 \u001b[36m0.3239\u001b[0m \u001b[32m0.9385\u001b[0m \u001b[35m0.2150\u001b[0m 4.9935\n", " 5 \u001b[36m0.2971\u001b[0m \u001b[32m0.9448\u001b[0m \u001b[35m0.1947\u001b[0m 6.2501\n", " 6 \u001b[36m0.2755\u001b[0m \u001b[32m0.9474\u001b[0m \u001b[35m0.1823\u001b[0m 5.3603\n", " 7 \u001b[36m0.2643\u001b[0m \u001b[32m0.9514\u001b[0m \u001b[35m0.1712\u001b[0m 5.3241\n", " 8 \u001b[36m0.2443\u001b[0m \u001b[32m0.9541\u001b[0m \u001b[35m0.1585\u001b[0m 5.5232\n", " 9 \u001b[36m0.2346\u001b[0m \u001b[32m0.9557\u001b[0m \u001b[35m0.1500\u001b[0m 4.8883\n", " 10 \u001b[36m0.2257\u001b[0m \u001b[32m0.9577\u001b[0m \u001b[35m0.1447\u001b[0m 4.6561\n", " 11 \u001b[36m0.2165\u001b[0m \u001b[32m0.9594\u001b[0m \u001b[35m0.1394\u001b[0m 6.2466\n", " 12 \u001b[36m0.2093\u001b[0m \u001b[32m0.9600\u001b[0m \u001b[35m0.1338\u001b[0m 5.2418\n", " 13 \u001b[36m0.2045\u001b[0m \u001b[32m0.9610\u001b[0m \u001b[35m0.1297\u001b[0m 5.6362\n", " 14 \u001b[36m0.1969\u001b[0m \u001b[32m0.9620\u001b[0m \u001b[35m0.1263\u001b[0m 5.5742\n", " 15 \u001b[36m0.1931\u001b[0m \u001b[32m0.9629\u001b[0m \u001b[35m0.1223\u001b[0m 5.1409\n", " 16 \u001b[36m0.1893\u001b[0m \u001b[32m0.9647\u001b[0m \u001b[35m0.1191\u001b[0m 6.2617\n", " 17 \u001b[36m0.1849\u001b[0m \u001b[32m0.9651\u001b[0m \u001b[35m0.1185\u001b[0m 6.5456\n", " 18 \u001b[36m0.1803\u001b[0m \u001b[32m0.9657\u001b[0m \u001b[35m0.1155\u001b[0m 6.8025\n", " 19 \u001b[36m0.1765\u001b[0m \u001b[32m0.9665\u001b[0m \u001b[35m0.1136\u001b[0m 5.0039\n", " 20 \u001b[36m0.1721\u001b[0m \u001b[32m0.9667\u001b[0m \u001b[35m0.1103\u001b[0m 5.0675\n" ] } ], "source": [ "net.fit(X_train, y_train);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prediction" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "predicted = net.predict(X_test)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.9653142857142857" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean(predicted == y_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An accuracy of nearly 96% for a network with only one hidden layer is not too bad." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Convolutional Network\n", "PyTorch expects a 4-dimensional tensor as input to its 2D convolution layers. The dimensions represent:\n", "* Batch size\n", "* Number of channels\n", "* Height\n", "* Width\n", "\n", "For the first dimension (the batch size), the total number of examples is provided when reshaping the whole dataset. MNIST data has only one channel. As stated above, each MNIST vector represents a 28x28 pixel image. Hence, the resulting shape of the PyTorch tensor needs to be (x, 1, 28, 28)."
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "XCnn = X.reshape(-1, 1, 28, 28)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(70000, 1, 28, 28)" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "XCnn.shape" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "XCnn_train, XCnn_test, y_train, y_test = train_test_split(XCnn, y, test_size=0.25, random_state=42)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((52500, 1, 28, 28), (52500,))" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "XCnn_train.shape, y_train.shape" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "class Cnn(nn.Module):\n", "    def __init__(self):\n", "        super(Cnn, self).__init__()\n", "        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)\n", "        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)\n", "        self.conv2_drop = nn.Dropout2d()\n", "        self.fc1 = nn.Linear(1600, 128)  # 1600 = 64 channels * 5 * 5 spatial size after the two conv/pool blocks\n", "        self.fc2 = nn.Linear(128, 10)\n", "\n", "    def forward(self, x):\n", "        x = F.relu(F.max_pool2d(self.conv1(x), 2))\n", "        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))\n", "        x = x.view(-1, x.size(1) * x.size(2) * x.size(3))  # flatten over channel, height and width = 1600\n", "        x = F.relu(self.fc1(x))\n", "        x = F.dropout(x, training=self.training)\n", "        x = self.fc2(x)\n", "        x = F.softmax(x, dim=-1)\n", "        return x" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "cnn = NeuralNetClassifier(\n", "    Cnn,\n", "    max_epochs=15,\n", "    lr=1,\n", "    optimizer=torch.optim.Adadelta,\n", "    device=device,\n", ")" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " epoch train_loss valid_acc valid_loss dur\n", "------- ------------ ----------- ------------ ------\n", " 1 \u001b[36m0.4692\u001b[0m \u001b[32m0.9730\u001b[0m \u001b[35m0.0873\u001b[0m 7.4336\n", " 2 \u001b[36m0.1503\u001b[0m \u001b[32m0.9818\u001b[0m \u001b[35m0.0601\u001b[0m 6.6657\n", " 3 \u001b[36m0.1177\u001b[0m \u001b[32m0.9834\u001b[0m \u001b[35m0.0525\u001b[0m 6.5910\n", " 4 \u001b[36m0.1037\u001b[0m \u001b[32m0.9846\u001b[0m \u001b[35m0.0476\u001b[0m 7.9510\n", " 5 \u001b[36m0.0889\u001b[0m \u001b[32m0.9847\u001b[0m \u001b[35m0.0446\u001b[0m 6.5556\n", " 6 \u001b[36m0.0808\u001b[0m \u001b[32m0.9873\u001b[0m \u001b[35m0.0407\u001b[0m 6.4084\n", " 7 \u001b[36m0.0724\u001b[0m \u001b[32m0.9878\u001b[0m \u001b[35m0.0384\u001b[0m 6.1549\n", " 8 \u001b[36m0.0680\u001b[0m 0.9875 \u001b[35m0.0379\u001b[0m 5.7811\n", " 9 \u001b[36m0.0646\u001b[0m \u001b[32m0.9885\u001b[0m \u001b[35m0.0376\u001b[0m 6.2944\n", " 10 \u001b[36m0.0582\u001b[0m 0.9883 \u001b[35m0.0370\u001b[0m 5.6687\n", " 11 \u001b[36m0.0578\u001b[0m 0.9879 \u001b[35m0.0350\u001b[0m 5.7188\n", " 12 \u001b[36m0.0542\u001b[0m 0.9879 0.0380 6.4705\n", " 13 \u001b[36m0.0523\u001b[0m \u001b[32m0.9904\u001b[0m \u001b[35m0.0326\u001b[0m 6.1535\n", " 14 \u001b[36m0.0493\u001b[0m 0.9884 0.0343 6.5948\n", " 15 0.0498 0.9900 \u001b[35m0.0316\u001b[0m 5.5662\n" ] } ], "source": [ "cnn.fit(XCnn_train, y_train);" ] }, { "cell_type": "code", "execution_count": 28, "metadata":
{}, "outputs": [], "source": [ "cnn_pred = cnn.predict(XCnn_test)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.9912571428571428" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean(cnn_pred == y_test)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "An accuracy of 99.1% should suffice for this example!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }