{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 12.2\n",
    "## Activation maximization\n",
    "In this task, we use the approach of activation maximization to visualize to which patterns features of a CNN trained using on MNIST are sensitive. This will give us a deeper understanding of the working principle of CNNs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "keras version 2.4.0\n"
     ]
    }
   ],
   "source": [
    "import tensorflow as tf\n",
    "from tensorflow import keras\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "KTF = keras.backend\n",
    "layers = keras.layers\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download and preprocess data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()\n",
    "x_train = x_train.astype(np.float32)[...,np.newaxis] / 255.\n",
    "x_test = x_test.astype(np.float32)[...,np.newaxis] / 255.\n",
    "y_train = keras.utils.to_categorical(y_train, 10)\n",
    "y_test = keras.utils.to_categorical(y_test, 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up a convolutional neural network with at least 4 CNN layers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = keras.models.Sequential()\n",
    "\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### compile and train model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile(\n",
    "    loss='categorical_crossentropy',\n",
    "    optimizer=keras.optimizers.Adam(lr=1e-3),\n",
    "    metrics=['accuracy'])\n",
    "\n",
    "\n",
    "results = model.fit(x_train, y_train,\n",
    "                    batch_size=100,\n",
    "                    epochs=3,\n",
    "                    verbose=1,\n",
    "                    validation_split=0.1\n",
    "                    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Implementation of activation maximization\n",
    "Select a layer you want to visualize and perform activation maximization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "gradient_updates = 50\n",
    "step_size = 1.\n",
    "\n",
    "def normalize(x):\n",
    "    '''Normalize gradients via l2 norm'''\n",
    "    return x / (KTF.sqrt(KTF.mean(KTF.square(x))) + KTF.epsilon())\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the following, implement activation maximization to visualize to which patterns a specific feature map is sensitive:\n",
    "- Start from uniform distributed noise 'images' (note that the shape has to be `(1, 28, 28, 1)`, as we use a batch size of `1`).\n",
    "- Choose one specific feature map using 'filter_index'.\n",
    "- Create a scalar loss as discussed in Chapter 12 (maximize the average feature map activation).\n",
    "- Thereafter, add the calculated gradients to your start image (gradient ascent step) and repeat the procedure using gradient_updates = 50. \n",
    "You can calculate the gradients using the following expressions:  \n",
    "`with tf.GradientTape() as gtape:\n",
    "    grads = gtape.gradient(YOUR_OBJECTIVE, THE_VARIABLE_YOU_WANT_TO_OPTIMIZE)  \n",
    "    grads = normalize(grads)`\n",
    "\n",
    "- Finally, implement the gradient ascent step (you may use `assign_sub` or `assign_add` to adapt the parameters) and perform 50 updates.\n",
    "\n",
    "Remember to construct a Keras variable for the input (we want to find an input that 'maximizes' the output, so we build an input that holds adaptive parameters which we can train using TensorFlow / Keras)\n",
    "The following code snippet may help you to implement the maximization: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "visualized_feature = []\n",
    "layer_dict = layer_dict = dict([(layer.name, layer) for layer in model.layers[:]])\n",
    "layer_name = \"conv2d_3\"\n",
    "\n",
    "layer_output = layer_dict[layer_name].output\n",
    "sub_model = keras.models.Model([model.inputs], [layer_output])\n",
    "\n",
    "for filter_index in range(layer_output.shape[-1]):  # iterate over fiters\n",
    "\n",
    "    print('Processing filter %d' % (filter_index+1))\n",
    "   \n",
    "    input_img = KTF.variable([0]) # instead of '[0]' use noise as the (start) input image with correct shape\n",
    "\n",
    "    for i in range(gradient_updates):\n",
    "\n",
    "        with tf.GradientTape() as gtape:\n",
    "            # define a scalar loss using Keras.\n",
    "            # remember: You would like to maximize the activations in the respective feature map!\n",
    "            loss = 0  # <--: define your loss HERE\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot images to visualize to which patterns the respective feature maps are sensitive."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def deprocess_image(x):\n",
    "    # reprocess visualization to format of \"MNIST images\"\n",
    "    x -= x.mean()\n",
    "    x /= (x.std() + KTF.epsilon())\n",
    "    # x *= 0.1\n",
    "    x += 0.5\n",
    "    x *= 255\n",
    "    x = np.clip(x, 0, 255).astype('uint8')\n",
    "    return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10,10))\n",
    "\n",
    "for i, feature_ in enumerate(visualized_feature):\n",
    "    feature_image = deprocess_image(feature_)\n",
    "    ax = plt.subplot(8,8, 1+i, )\n",
    "    plt.imshow(feature_image.squeeze())\n",
    "    ax.axis('off')\n",
    "    plt.title(\"feature %s\" % i)\n",
    "    \n",
    "plt.tight_layout()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}