{ "nbformat": 4, "nbformat_minor": 2, "metadata": { "colab": { "name": "Image-Classifier.ipynb", "provenance": [] }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "source": [ "# Quickstart Guide\n", "\n", "This guide will give a quick intro to training PyTorch models with HugsVision. We'll start by loading in some data and defining a model, then we'll train it for a few epochs and see how well it does.\n", "\n", "**Note**: The easiest way to use this tutorial is as a colab notebook, which allows you to dive in with no setup. We recommend you enable a free GPU with\n", "\n", "> **Runtime**   →   **Change runtime type**   →   **Hardware Accelerator: GPU**\n", "\n", "**Note**: You need to have at least Python 3.6 to run the scripts.\n", "\n", "## Install HugsVision\n", "\n", "First we install HugsVision if needed. " ], "metadata": { "id": "Pzd_l-d4IBap" } }, { "cell_type": "code", "execution_count": null, "source": [ "try:\r\n", " import hugsvision\r\n", "except:\r\n", " !pip install -q hugsvision\r\n", " import hugsvision\r\n", " \r\n", "print(hugsvision.__version__)" ], "outputs": [], "metadata": { "id": "ZYAyYoSkISKe" } }, { "cell_type": "markdown", "source": [ "## Downloading Data\n", "\n", "First, we need to download the dataset called `Pneumothorax` [here](https://www.kaggle.com/volodymyrgavrysh/pneumothorax-binary-classification-task) which weight around ~779 MB.\n", "\n", "## Converting Data\n", "\n", "Once it was downloaded, you will need to convert it to a directory based format:" ], "metadata": { "id": "tht3p78fIav4" } }, { "cell_type": "code", "execution_count": null, "source": [ "import os\r\n", "import os.path\r\n", "from shutil import copyfile\r\n", "\r\n", "from tqdm import tqdm\r\n", "\r\n", "import pandas as pd\r\n", "\r\n", "df = pd.read_csv(\"./train_data.csv\")\r\n", "\r\n", "img_in = \"./small_train_data_set/small_train_data_set/\"\r\n", "img_out = \"./data/\"\r\n", "\r\n", "for index, row in tqdm(df.iterrows()) :\r\n", "\r\n", " label = \"pneumothorax\" if row['target'] == 1 else \"normal\"\r\n", "\r\n", " path_in = img_in + row['file_name']\r\n", " path_out = img_out + label + \"/\" + row['file_name']\r\n", "\r\n", " # Check if the input image exist\r\n", " if not os.path.isfile(path_in):\r\n", " continue\r\n", "\r\n", " # Check if the output dir of the label exist\r\n", " if not os.path.isdir(img_out + label):\r\n", " os.mkdir(img_out + label)\r\n", " print(\"Directory for the label \" + label + \" created!\")\r\n", "\r\n", " # Copy the image\r\n", " copyfile(path_in, path_out)" ], "outputs": [], "metadata": { "id": "0bc_AldVJvb3" } }, { "cell_type": "markdown", "source": [ "## Loading Data\n", "\n", "Once it has been converted, we can start loading the data." ], "metadata": { "id": "1n_kAX_gJ10T" } }, { "cell_type": "code", "execution_count": null, "source": [ "from hugsvision.dataio.VisionDataset import VisionDataset\r\n", "\r\n", "train, test, id2label, label2id = VisionDataset.fromImageFolder(\r\n", "\t\"./data/\",\r\n", "\ttest_ratio = 0.15,\r\n", "\tbalanced = True,\r\n", "\taugmentation = True,\r\n", ")" ], "outputs": [], "metadata": { "id": "3VAjDhVsKSkt" } }, { "cell_type": "markdown", "source": [ "## Choose a image classifier model on HuggingFace\r\n", "\r\n", "Now we can choose our base model on which we will perform a fine-tuning to make it fit our needs.\r\n", "\r\n", "Our choices aren't very large since we haven't a lot of model available yet on HuggingFace for this task.\r\n", "\r\n", "So, to be sure that the model will be compatible with `HugsVision` we need to have a model exported in `PyTorch` and compatible with the `image-classification` task obviously.\r\n", "\r\n", "Models available with this criterias: [here](https://huggingface.co/models?filter=pytorch&pipeline_tag=image-classification&sort=downloads)\r\n", "\r\n", "At the time I'am writing this, I recommand to use the following models:\r\n", "\r\n", "* `google/vit-base-patch16-224-in21k`\r\n", "* `google/vit-base-patch16-224`\r\n", "* `facebook/deit-base-distilled-patch16-224`\r\n", "* `microsoft/beit-base-patch16-224`\r\n", "\r\n", "**Note:** Please specify `ignore_mismatched_sizes=True` for both `model` and `feature_extractor` if you aren't using the following model.\r\n" ], "metadata": { "id": "2cKwvSqdKeQf" } }, { "cell_type": "code", "execution_count": null, "source": [ "huggingface_model = 'google/vit-base-patch16-224-in21k'" ], "outputs": [], "metadata": { "id": "TYpe3zYHNWpa" } }, { "cell_type": "markdown", "source": [ "## Train the model\r\n", "\r\n", "So, once the model choosen, we can start building the `Trainer` and start the fine-tuning.\r\n", "\r\n", "**Note**: Import the `FeatureExtractor` and `ForImageClassification` according to your previous choice.\r\n" ], "metadata": { "id": "bI1MYIzgNcL5" } }, { "cell_type": "code", "execution_count": null, "source": [ "from transformers import ViTFeatureExtractor, ViTForImageClassification\r\n", "from hugsvision.nnet.VisionClassifierTrainer import VisionClassifierTrainer\r\n", "\r\n", "trainer = VisionClassifierTrainer(\r\n", "\tmodel_name = \"MyPneumoModel\",\r\n", "\ttrain \t = train,\r\n", "\ttest \t = test,\r\n", "\toutput_dir = \"./out/\",\r\n", "\tmax_epochs = 1,\r\n", "\tbatch_size = 32, # On RTX 2080 Ti\r\n", "\tmodel = ViTForImageClassification.from_pretrained(\r\n", "\t huggingface_model,\r\n", "\t num_labels = len(label2id),\r\n", "\t label2id = label2id,\r\n", "\t id2label = id2label\r\n", "\t),\r\n", "\tfeature_extractor = ViTFeatureExtractor.from_pretrained(\r\n", "\t\thuggingface_model,\r\n", "\t),\r\n", ")" ], "outputs": [], "metadata": { "id": "60F0IXvQNc0N" } }, { "cell_type": "markdown", "source": [ "## Evaluate F1-Score\n", "\n", "Using the F1-Score metrics will allow us to get a better representation of predictions for all the labels and find out if their are any anomalies wit ha specific label." ], "metadata": { "id": "C56x7mkEN6Rx" } }, { "cell_type": "code", "execution_count": null, "source": [ "ref, hyp = trainer.evaluate_f1_score()" ], "outputs": [], "metadata": { "id": "hC2pA2EzQahn" } }, { "cell_type": "markdown", "source": [ "```\n", " precision recall f1-score support\n", "\n", " normal 0.76 0.24 0.36 191\n", "pneumothorax 0.41 0.88 0.56 114\n", "\n", " accuracy 0.48 305\n", " macro avg 0.58 0.56 0.46 305\n", "weighted avg 0.63 0.48 0.43 305\n", "\n", "```" ], "metadata": { "id": "qPCJviqYQdb8" } }, { "cell_type": "markdown", "source": [ "## Make a prediction\r\n", "\r\n", "Rename the `./out/MODEL_PATH/config.json` file present in the model output to `./out/MODEL_PATH/preprocessor_config.json`" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "import os.path\r\n", "from transformers import ViTFeatureExtractor, ViTForImageClassification\r\n", "from hugsvision.inference.VisionClassifierInference import VisionClassifierInference\r\n", "\r\n", "path = \"./out/MyPneumoModel/20_2021-08-20-01-46-44/model/\"\r\n", "img = \"../../../samples/pneumothorax/with.jpg\"\r\n", "\r\n", "classifier = VisionClassifierInference(\r\n", " feature_extractor = ViTFeatureExtractor.from_pretrained(path),\r\n", " model = ViTForImageClassification.from_pretrained(path),\r\n", ")\r\n", "\r\n", "label = classifier.predict(img_path=img)\r\n", "print(\"Predicted class:\", label)" ], "outputs": [], "metadata": {} } ] }