{
  "nbformat": 4,
  "nbformat_minor": 2,
  "metadata": {
    "colab": {
      "name": "Image-Classifier.ipynb",
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "# Quickstart Guide\n",
        "\n",
        "This guide will give a quick intro to training PyTorch models with HugsVision. We'll start by loading in some data and defining a model, then we'll train it for a few epochs and see how well it does.\n",
        "\n",
        "**Note**: The easiest way to use this tutorial is as a colab notebook, which allows you to dive in with no setup. We recommend you enable a free GPU with\n",
        "\n",
        "> **Runtime**   →   **Change runtime type**   →   **Hardware Accelerator: GPU**\n",
        "\n",
        "**Note**: You need to have at least Python 3.6 to run the scripts.\n",
        "\n",
        "## Install HugsVision\n",
        "\n",
        "First we install HugsVision if needed. "
      ],
      "metadata": {
        "id": "Pzd_l-d4IBap"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "try:\r\n",
        "    import hugsvision\r\n",
        "except:\r\n",
        "    !pip install -q hugsvision\r\n",
        "    import hugsvision\r\n",
        "    \r\n",
        "print(hugsvision.__version__)"
      ],
      "outputs": [],
      "metadata": {
        "id": "ZYAyYoSkISKe"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Downloading Data\n",
        "\n",
        "First, we need to download the dataset called `Pneumothorax` [here](https://www.kaggle.com/volodymyrgavrysh/pneumothorax-binary-classification-task) which weight around ~779 MB.\n",
        "\n",
        "## Converting Data\n",
        "\n",
        "Once it was downloaded, you will need to convert it to a directory based format:"
      ],
      "metadata": {
        "id": "tht3p78fIav4"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "import os\r\n",
        "import os.path\r\n",
        "from shutil import copyfile\r\n",
        "\r\n",
        "from tqdm import tqdm\r\n",
        "\r\n",
        "import pandas as pd\r\n",
        "\r\n",
        "df = pd.read_csv(\"./train_data.csv\")\r\n",
        "\r\n",
        "img_in = \"./small_train_data_set/small_train_data_set/\"\r\n",
        "img_out = \"./data/\"\r\n",
        "\r\n",
        "for index, row in tqdm(df.iterrows())   :\r\n",
        "\r\n",
        "    label = \"pneumothorax\" if row['target'] == 1 else \"normal\"\r\n",
        "\r\n",
        "    path_in = img_in + row['file_name']\r\n",
        "    path_out = img_out + label + \"/\" + row['file_name']\r\n",
        "\r\n",
        "    # Check if the input image exist\r\n",
        "    if not os.path.isfile(path_in):\r\n",
        "        continue\r\n",
        "\r\n",
        "    # Check if the output dir of the label exist\r\n",
        "    if not os.path.isdir(img_out + label):\r\n",
        "        os.mkdir(img_out + label)\r\n",
        "        print(\"Directory for the label \" + label + \" created!\")\r\n",
        "\r\n",
        "    # Copy the image\r\n",
        "    copyfile(path_in, path_out)"
      ],
      "outputs": [],
      "metadata": {
        "id": "0bc_AldVJvb3"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Loading Data\n",
        "\n",
        "Once it has been converted, we can start loading the data."
      ],
      "metadata": {
        "id": "1n_kAX_gJ10T"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "from hugsvision.dataio.VisionDataset import VisionDataset\r\n",
        "\r\n",
        "train, test, id2label, label2id = VisionDataset.fromImageFolder(\r\n",
        "\t\"./data/\",\r\n",
        "\ttest_ratio   = 0.15,\r\n",
        "\tbalanced     = True,\r\n",
        "\taugmentation = True,\r\n",
        ")"
      ],
      "outputs": [],
      "metadata": {
        "id": "3VAjDhVsKSkt"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Choose a image classifier model on HuggingFace\r\n",
        "\r\n",
        "Now we can choose our base model on which we will perform a fine-tuning to make it fit our needs.\r\n",
        "\r\n",
        "Our choices aren't very large since we haven't a lot of model available yet on HuggingFace for this task.\r\n",
        "\r\n",
        "So, to be sure that the model will be compatible with `HugsVision` we need to have a model exported in `PyTorch` and compatible with the `image-classification` task obviously.\r\n",
        "\r\n",
        "Models available with this criterias: [here](https://huggingface.co/models?filter=pytorch&pipeline_tag=image-classification&sort=downloads)\r\n",
        "\r\n",
        "At the time I'am writing this, I recommand to use the following models:\r\n",
        "\r\n",
        "* `google/vit-base-patch16-224-in21k`\r\n",
        "* `google/vit-base-patch16-224`\r\n",
        "* `facebook/deit-base-distilled-patch16-224`\r\n",
        "* `microsoft/beit-base-patch16-224`\r\n",
        "\r\n",
        "**Note:** Please specify `ignore_mismatched_sizes=True` for both `model` and `feature_extractor` if you aren't using the following model.\r\n"
      ],
      "metadata": {
        "id": "2cKwvSqdKeQf"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "huggingface_model = 'google/vit-base-patch16-224-in21k'"
      ],
      "outputs": [],
      "metadata": {
        "id": "TYpe3zYHNWpa"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Train the model\r\n",
        "\r\n",
        "So, once the model choosen, we can start building the `Trainer` and start the fine-tuning.\r\n",
        "\r\n",
        "**Note**: Import the `FeatureExtractor` and `ForImageClassification` according to your previous choice.\r\n"
      ],
      "metadata": {
        "id": "bI1MYIzgNcL5"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "from transformers import ViTFeatureExtractor, ViTForImageClassification\r\n",
        "from hugsvision.nnet.VisionClassifierTrainer import VisionClassifierTrainer\r\n",
        "\r\n",
        "trainer = VisionClassifierTrainer(\r\n",
        "\tmodel_name   = \"MyPneumoModel\",\r\n",
        "\ttrain      \t = train,\r\n",
        "\ttest      \t = test,\r\n",
        "\toutput_dir   = \"./out/\",\r\n",
        "\tmax_epochs   = 1,\r\n",
        "\tbatch_size   = 32, # On RTX 2080 Ti\r\n",
        "\tmodel = ViTForImageClassification.from_pretrained(\r\n",
        "\t    huggingface_model,\r\n",
        "\t    num_labels = len(label2id),\r\n",
        "\t    label2id   = label2id,\r\n",
        "\t    id2label   = id2label\r\n",
        "\t),\r\n",
        "\tfeature_extractor = ViTFeatureExtractor.from_pretrained(\r\n",
        "\t\thuggingface_model,\r\n",
        "\t),\r\n",
        ")"
      ],
      "outputs": [],
      "metadata": {
        "id": "60F0IXvQNc0N"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Evaluate F1-Score\n",
        "\n",
        "Using the F1-Score metrics will allow us to get a better representation of predictions for all the labels and find out if their are any anomalies wit ha specific label."
      ],
      "metadata": {
        "id": "C56x7mkEN6Rx"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "ref, hyp = trainer.evaluate_f1_score()"
      ],
      "outputs": [],
      "metadata": {
        "id": "hC2pA2EzQahn"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "```\n",
        "              precision    recall  f1-score   support\n",
        "\n",
        "      normal       0.76      0.24      0.36       191\n",
        "pneumothorax       0.41      0.88      0.56       114\n",
        "\n",
        "    accuracy                           0.48       305\n",
        "   macro avg       0.58      0.56      0.46       305\n",
        "weighted avg       0.63      0.48      0.43       305\n",
        "\n",
        "```"
      ],
      "metadata": {
        "id": "qPCJviqYQdb8"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Make a prediction\r\n",
        "\r\n",
        "Rename the `./out/MODEL_PATH/config.json` file present in the model output to `./out/MODEL_PATH/preprocessor_config.json`"
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "source": [
        "import os.path\r\n",
        "from transformers import ViTFeatureExtractor, ViTForImageClassification\r\n",
        "from hugsvision.inference.VisionClassifierInference import VisionClassifierInference\r\n",
        "\r\n",
        "path = \"./out/MyPneumoModel/20_2021-08-20-01-46-44/model/\"\r\n",
        "img  = \"../../../samples/pneumothorax/with.jpg\"\r\n",
        "\r\n",
        "classifier = VisionClassifierInference(\r\n",
        "    feature_extractor = ViTFeatureExtractor.from_pretrained(path),\r\n",
        "    model = ViTForImageClassification.from_pretrained(path),\r\n",
        ")\r\n",
        "\r\n",
        "label = classifier.predict(img_path=img)\r\n",
        "print(\"Predicted class:\", label)"
      ],
      "outputs": [],
      "metadata": {}
    }
  ]
}