{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 3: Compression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow.keras.utils import to_categorical\n",
    "from sklearn.datasets import fetch_openml\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import LabelEncoder, StandardScaler\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib inline\n",
    "seed = 0\n",
    "np.random.seed(seed)\n",
    "import tensorflow as tf\n",
    "\n",
    "tf.random.set_seed(seed)\n",
    "import os\n",
    "\n",
    "os.environ['PATH'] = os.environ['XILINX_VITIS'] + '/bin:' + os.environ['PATH']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fetch the jet tagging dataset from Open ML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_val = np.load('X_train_val.npy')\n",
    "X_test = np.load('X_test.npy')\n",
    "y_train_val = np.load('y_train_val.npy')\n",
    "y_test = np.load('y_test.npy')\n",
    "classes = np.load('classes.npy', allow_pickle=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Now construct a model\n",
    "We'll use the same architecture as in part 1: 3 hidden layers with 64, then 32, then 32 neurons. Each layer will use `relu` activation.\n",
    "Add an output layer with 5 neurons (one for each class), then finish with Softmax activation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow.keras.models import Sequential\n",
    "from tensorflow.keras.layers import Dense, Activation, BatchNormalization\n",
    "from tensorflow.keras.optimizers import Adam\n",
    "from tensorflow.keras.regularizers import l1\n",
    "from callbacks import all_callbacks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = Sequential()\n",
    "model.add(Dense(64, input_shape=(16,), name='fc1', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001)))\n",
    "model.add(Activation(activation='relu', name='relu1'))\n",
    "model.add(Dense(32, name='fc2', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001)))\n",
    "model.add(Activation(activation='relu', name='relu2'))\n",
    "model.add(Dense(32, name='fc3', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001)))\n",
    "model.add(Activation(activation='relu', name='relu3'))\n",
    "model.add(Dense(5, name='output', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001)))\n",
    "model.add(Activation(activation='softmax', name='softmax'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train sparse\n",
    "This time we'll use the Tensorflow model optimization sparsity to train a sparse model (forcing many weights to '0'). In this instance, the target sparsity is 75%"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow_model_optimization.python.core.sparsity.keras import prune, pruning_callbacks, pruning_schedule\n",
    "from tensorflow_model_optimization.sparsity.keras import strip_pruning\n",
    "\n",
    "pruning_params = {\"pruning_schedule\": pruning_schedule.ConstantSparsity(0.75, begin_step=2000, frequency=100)}\n",
    "model = prune.prune_low_magnitude(model, **pruning_params)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train the model\n",
    "We'll use the same settings as the model for part 1: Adam optimizer with categorical crossentropy loss.\n",
    "The callbacks will decay the learning rate and save the model into a directory 'model_2'\n",
    "The model isn't very complex, so this should just take a few minutes even on the CPU.\n",
    "If you've restarted the notebook kernel after training once, set `train = False` to load the trained model rather than training again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train = True\n",
    "if train:\n",
    "    adam = Adam(lr=0.0001)\n",
    "    model.compile(optimizer=adam, loss=['categorical_crossentropy'], metrics=['accuracy'])\n",
    "    callbacks = all_callbacks(\n",
    "        stop_patience=1000,\n",
    "        lr_factor=0.5,\n",
    "        lr_patience=10,\n",
    "        lr_epsilon=0.000001,\n",
    "        lr_cooldown=2,\n",
    "        lr_minimum=0.0000001,\n",
    "        outputDir='model_2',\n",
    "    )\n",
    "    callbacks.callbacks.append(pruning_callbacks.UpdatePruningStep())\n",
    "    model.fit(\n",
    "        X_train_val,\n",
    "        y_train_val,\n",
    "        batch_size=1024,\n",
    "        epochs=10,\n",
    "        validation_split=0.25,\n",
    "        shuffle=True,\n",
    "        callbacks=callbacks.callbacks,\n",
    "    )\n",
    "    # Save the model again but with the pruning 'stripped' to use the regular layer types\n",
    "    model = strip_pruning(model)\n",
    "    model.save('model_2/KERAS_check_best_model.h5')\n",
    "else:\n",
    "    from tensorflow.keras.models import load_model\n",
    "\n",
    "    model = load_model('model_2/KERAS_check_best_model.h5')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Check sparsity\n",
    "Make a quick check that the model was indeed trained sparse. We'll just make a histogram of the weights of the 1st layer, and hopefully observe a large peak in the bin containing '0'. Note logarithmic y axis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "w = model.layers[0].weights[0].numpy()\n",
    "h, b = np.histogram(w, bins=100)\n",
    "plt.figure(figsize=(7, 7))\n",
    "plt.bar(b[:-1], h, width=b[1] - b[0])\n",
    "plt.semilogy()\n",
    "print('% of zeros = {}'.format(np.sum(w == 0) / np.size(w)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Check performance\n",
    "How does this 75% sparse model compare against the unpruned model? Let's report the accuracy and make a ROC curve. The pruned model is shown with solid lines, the unpruned model from part 1 is shown with dashed lines.\n",
    "**Make sure you've trained the model from part 1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import plotting\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.metrics import accuracy_score\n",
    "from tensorflow.keras.models import load_model\n",
    "\n",
    "model_ref = load_model('model_1/KERAS_check_best_model.h5')\n",
    "\n",
    "y_ref = model_ref.predict(X_test)\n",
    "y_prune = model.predict(X_test)\n",
    "\n",
    "print(\"Accuracy unpruned: {}\".format(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_ref, axis=1))))\n",
    "print(\"Accuracy pruned:   {}\".format(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_prune, axis=1))))\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(9, 9))\n",
    "_ = plotting.makeRoc(y_test, y_ref, classes)\n",
    "plt.gca().set_prop_cycle(None)  # reset the colors\n",
    "_ = plotting.makeRoc(y_test, y_prune, classes, linestyle='--')\n",
    "\n",
    "from matplotlib.lines import Line2D\n",
    "\n",
    "lines = [Line2D([0], [0], ls='-'), Line2D([0], [0], ls='--')]\n",
    "from matplotlib.legend import Legend\n",
    "\n",
    "leg = Legend(ax, lines, labels=['unpruned', 'pruned'], loc='lower right', frameon=False)\n",
    "ax.add_artist(leg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Convert the model to FPGA firmware with hls4ml\n",
    "Let's use the default configuration: `ap_fixed<16,6>` precision everywhere and `ReuseFactor=1`, so we can compare with the part 1 model. We need to use `strip_pruning` to change the layer types back to their originals.\n",
    "\n",
    "**The synthesis will take a while**\n",
    "\n",
    "While the C-Synthesis is running, we can monitor the progress looking at the log file by opening a terminal from the notebook home, and executing:\n",
    "\n",
    "`tail -f model_2/hls4ml_prj/vitis_hls.log`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hls4ml\n",
    "\n",
    "config = hls4ml.utils.config_from_keras_model(model, granularity='model', backend='Vitis')\n",
    "print(config)\n",
    "hls_model = hls4ml.converters.convert_from_keras_model(\n",
    "    model, hls_config=config, backend='Vitis', output_dir='model_2/hls4ml_prj', part='xcu250-figd2104-2L-e'\n",
    ")\n",
    "hls_model.compile()\n",
    "hls_model.build(csim=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Check the reports\n",
    "Print out the reports generated by Vitis HLS. Pay attention to the Utilization Estimates' section in particular this time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hls4ml.report.read_vivado_report('model_2/hls4ml_prj/')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print the report for the model trained in part 1. Remember these models have the same architecture, but the model in this section was trained using the sparsity API from tensorflow_model_optimization. Notice how the resource usage had dramatically reduced (particularly the DSPs). When Vitis HLS notices an operation like `y = 0 * x` it can avoid placing a DSP for that operation. The impact of this is biggest when `ReuseFactor = 1`, but still applies at higher reuse as well. **Note you need to have trained and synthesized the model from part 1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hls4ml.report.read_vivado_report('model_1/hls4ml_prj')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}