{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Machine Learning overview - assignment 2"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Digit Recognizer\n",
    "\n",
    "Although this is a computer vision problem, we employ here a simple model using **K-Nearest Neighbors** algorithm in this notebook to be a good starting point. We use the **GridSearchCV** to fine tune the hyperparameters such as *\"n_neighbors\", and \"weights\"*. Furthermore, we use **Data Augmentation** or **Artificial Data Synthesis** technique in this notebook to boost the model's performance on the test set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
    "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5"
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "import numpy as np # Linear algebra\n",
    "import pandas as pd # For data manipulation\n",
    "import json\n",
    "import os\n",
    "import matplotlib.pyplot as plt # For visualization\n",
    "from sklearn.neighbors import KNeighborsClassifier # For modelling\n",
    "from sklearn.model_selection import cross_val_score, GridSearchCV # For evaluation and hyperparameter tuning\n",
    "from sklearn.metrics import confusion_matrix, classification_report # For evaluation\n",
    "from scipy.ndimage import shift, rotate, zoom # For data augmentation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Peeking the data**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Loading the datasets into dataframes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0",
    "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a"
   },
   "outputs": [],
   "source": [
    "train_df = pd.read_csv(\n",
    "    \"https://static-1300131294.cos.ap-shanghai.myqcloud.com/data/mnist_train.csv\"\n",
    ")\n",
    "\n",
    "test_df = pd.read_csv(\n",
    "    \"https://static-1300131294.cos.ap-shanghai.myqcloud.com/data/mnist_test.csv\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Knowing about the features in the datasets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_df.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Converting the train and test dataframes into numpy arrays"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train = train_df.iloc[:6000, 1:].values\n",
    "y_train = train_df.iloc[:6000, 0].values\n",
    "X_test = test_df.iloc[:1000, 1:].values\n",
    "y_test = test_df.iloc[:1000, 0].values\n",
    "\n",
    "print(f\"X_train shape: {X_train.shape}\")\n",
    "print(f\"y_train shape: {y_train.shape}\")\n",
    "print(f\"X_test shape: {X_test.shape}\")\n",
    "print(f\"y_test shape: {y_test.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Visualizing a digit from the training data as a 28 X 28 image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "some_digit = X_train[46]\n",
    "\n",
    "some_digit_image = some_digit.reshape(28, 28)\n",
    "print(f\"Label: {y_train[40]}\")\n",
    "plt.imshow(some_digit_image, cmap=\"binary\")\n",
    "plt.show()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Train Model**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "estimator = KNeighborsClassifier()\n",
    "estimator.fit(X_train, y_train)\n",
    "predictions = estimator.predict(X_test)\n",
    "\n",
    "print(classification_report(y_test, predictions, digits=3), end=\"\\n\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(confusion_matrix(y_test, predictions), end=\"\\n\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fine-tuning the model by finding the best values for the hyperparameters (weights, n_neighbors) using GridSearchCV"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "grid_params = {\n",
    "    \"weights\": ['distance'],\n",
    "    \"n_neighbors\": [3,  5, 7, 9, 11]\n",
    "}\n",
    "\n",
    "estimator = KNeighborsClassifier()\n",
    "grid_estimator = GridSearchCV(estimator, # Base estimator\n",
    "                              grid_params, # Parameters to tune\n",
    "                              verbose=2, # Verbosity of the logs\n",
    "                              n_jobs=-1) # Number of jobs to be run concurrently with -1 meaning all the processors\n",
    "\n",
    "# Fitting the estimator with training data\n",
    "grid_estimator.fit(X_train, y_train)\n",
    "\n",
    "print(f\"Best Score: {grid_estimator.best_score_}\", end=\"\\n\\n\")\n",
    "print(f\"Best Parameters: \\n{json.dumps(grid_estimator.best_params_, indent=4)}\",\n",
    "      end=\"\\n\\n\")\n",
    "print(\"Grid Search CV results:\")\n",
    "results_df = pd.DataFrame(grid_estimator.cv_results_)\n",
    "results_df"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Best parameter values found:** {n_neighbors: 3, weights: 'distance'}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fitting a new model with the found hyperparameter values to the training data and making predictions on the test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "estimator = KNeighborsClassifier(n_neighbors=3, weights='distance')\n",
    "estimator.fit(X_train, y_train)\n",
    "predictions = estimator.predict(X_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "confusion_matrix(y_test, predictions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(classification_report(y_test, predictions, digits=3), end=\"\\n\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Data Augmentation**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each image in the training set is \n",
    "\n",
    "* shifted down, up, left and right by one pixel\n",
    "* rotated clockwise and anti-clockwise \n",
    "* clipped and zoomed at two different ranges\n",
    "\n",
    "generating eight different images. The image is clipped before zooming to preserve the image size."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def shift_in_one_direction(image, direction):\n",
    "    \"\"\"\n",
    "    Shifts an image by one pixel in the specified direction\n",
    "    \"\"\"\n",
    "    if direction == \"DOWN\":\n",
    "        image = shift(image, [1, 0])\n",
    "    elif direction == \"UP\":\n",
    "        image = shift(image, [-1, 0])\n",
    "    elif direction == \"LEFT\":\n",
    "        image = shift(image, [0, -1])\n",
    "    else:\n",
    "        image = shift(image, [0, 1])\n",
    "\n",
    "    return image\n",
    "\n",
    "\n",
    "def shift_in_all_directions(image):\n",
    "    \"\"\"\n",
    "    Shifts an image in all the directions by one pixel\n",
    "    \"\"\"\n",
    "    reshaped_image = image.reshape(28, 28)\n",
    "\n",
    "    down_shifted_image = shift_in_one_direction(reshaped_image, \"DOWN\")\n",
    "    up_shifted_image = shift_in_one_direction(reshaped_image, \"UP\")\n",
    "    left_shifted_image = shift_in_one_direction(reshaped_image, \"LEFT\")\n",
    "    right_shifted_image = shift_in_one_direction(reshaped_image, \"RIGHT\")\n",
    "\n",
    "    return (down_shifted_image, up_shifted_image,\n",
    "            left_shifted_image, right_shifted_image)\n",
    "\n",
    "\n",
    "def rotate_in_all_directions(image, angle):\n",
    "    \"\"\"\n",
    "    Rotates an image clockwise and anti-clockwise\n",
    "    \"\"\"\n",
    "    reshaped_image = image.reshape(28, 28)\n",
    "    \n",
    "    rotated_images = (rotate(reshaped_image, angle, reshape=False),\n",
    "                      rotate(reshaped_image, -angle, reshape=False))\n",
    "    \n",
    "    return rotated_images\n",
    "\n",
    "\n",
    "def clipped_zoom(image, zoom_ranges):\n",
    "    \"\"\"\n",
    "    Clips and zooms an image at the specified zooming ranges\n",
    "    \"\"\"\n",
    "    reshaped_image = image.reshape(28, 28)\n",
    "    \n",
    "    h, w = reshaped_image.shape\n",
    "    \n",
    "    zoomed_images = []\n",
    "    for zoom_range in zoom_ranges:\n",
    "        zh = int(np.round(h / zoom_range))\n",
    "        zw = int(np.round(w / zoom_range))\n",
    "        top = (h - zh) // 2\n",
    "        left = (w - zw) // 2\n",
    "        \n",
    "        zoomed_images.append(zoom(reshaped_image[top:top+zh, left:left+zw],\n",
    "                                  zoom_range))\n",
    "    \n",
    "    return zoomed_images\n",
    "\n",
    "def alter_image(image):\n",
    "    \"\"\"\n",
    "    Alters an image by shifting, rotating, and zooming it\n",
    "    \"\"\"\n",
    "    shifted_images = shift_in_all_directions(image)\n",
    "    rotated_images = rotate_in_all_directions(image, 10)\n",
    "    zoomed_images = clipped_zoom(image, [1.1, 1.2])\n",
    "            \n",
    "    return np.r_[shifted_images, rotated_images, zoomed_images]\n",
    "\n",
    "X_train_add = np.apply_along_axis(alter_image, 1, X_train).reshape(-1, 784)\n",
    "y_train_add = np.repeat(y_train, 8)\n",
    "\n",
    "print(f\"X_train_add shape: {X_train_add.shape}\")\n",
    "print(f\"y_train_add shape: {y_train_add.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combining the synthesized data with the actual training data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_combined = np.r_[X_train, X_train_add]\n",
    "y_train_combined = np.r_[y_train, y_train_add]\n",
    "\n",
    "del X_train\n",
    "del X_train_add\n",
    "del y_train\n",
    "del y_train_add\n",
    "\n",
    "print(f\"X_train_combined shape: {X_train_combined.shape}\")\n",
    "print(f\"y_train_combined shape: {y_train_combined.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fitting a new model with the tuned hyperparameters to the combined dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cdata_estimator = KNeighborsClassifier(n_neighbors=3, weights='distance')\n",
    "cdata_estimator.fit(X_train_combined, y_train_combined)\n",
    "cdata_estimator_predictions = cdata_estimator.predict(X_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "confusion_matrix(y_test, cdata_estimator_predictions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(classification_report(y_test, cdata_estimator_predictions, digits=3), end=\"\\n\\n\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note:** With **Data Augmentation** the accuracy jumped from 91.6% to 95.3% on the test data."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Acknowledgments\n",
    "\n",
    "Thanks to SkalskiP for creating the open-source [Kaggle jupyter notebook](https://www.kaggle.com/code/gauthampughazh/digit-recognition-using-knn), licensed under Apache 2.0. It inspires the majority of the content of this assignment."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}