{ "cells": [ { "cell_type": "markdown", "id": "close-brave", "metadata": {}, "source": [ "# How to Train Your Dragon (Detector)\n", "\n", "Often thought to be the stuff of legends, we aim to shed light on a mythical entity. Dragons? No... An effective and repeatable data-centric machine learning workflow, of course! It seems that many machine learning researchers and engineers these days are focused on developing the optimal model architecture for their tasks. In reality, the most surefire way to achieve a high-performing model is to meticulously understand, track, and improve your datasets and experimental results.\n", "\n", "In this notebook, we will walk through the process of developing a computer vision model and dataset in a repeatable and effective way utilizing [ClearML](https://clear.ml/) and [FiftyOne](https://fiftyone.ai). Specifically, we will be training the object detection model DETR on a dataset of dragon images, though the general workflow presented is extensible to nearly any computer vision and machine learning task.\n", "\n", "[FiftyOne](https://fiftyone.ai) is an open-source tool for building high-quality datasets and computer vision models with a [powerful API](https://voxel51.com/docs/fiftyone/user_guide/using_views.html) and [intuitive App](https://voxel51.com/docs/fiftyone/user_guide/app.html) letting you quickly understand the quality of your dataset, find your model's failure modes, and improve your datasets and models. On the other hand, [ClearML](https://clear.ml/) is an open-source platform that automates and simplifies developing and managing machine learning solutions through an end-to-end MLOps suite allowing you to focus on developing your ML code and automation, while ClearML ensures your work is reproducible and scalable.\n", "\n", "ClearML and FiftyOne go hand-in-hand with one another in your machine learning workflows. 
FiftyOne's flexible, hands-on visualization and analysis of data and model results, combined with ClearML's experiment tracking, produce a system that lets you quickly explore and improve your datasets while persisting every change and all progress made toward a high-performing model." ] }, { "cell_type": "markdown", "id": "incoming-royal", "metadata": {}, "source": [ "## Setup\n", "\n", "To start, we will need to [install FiftyOne](https://voxel51.com/docs/fiftyone/getting_started/install.html) and [ClearML](https://clear.ml/):\n" ] }, { "cell_type": "code", "execution_count": null, "id": "sufficient-variance", "metadata": {}, "outputs": [], "source": [ "!pip install fiftyone" ] }, { "cell_type": "markdown", "id": "informative-japanese", "metadata": {}, "source": [ "To start tracking your experiments, you'll need a free ClearML account to send data to the community server, or you can host your own server." ] }, { "cell_type": "code", "execution_count": null, "id": "overall-encyclopedia", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Preferably run this in a terminal. In a (Colab) notebook the input prompt is\n", "# not recognised properly: press Enter once to give clearml-init an empty\n", "# input, then fill in the fields one at a time.\n", "!pip install clearml\n", "!clearml-init" ] }, { "cell_type": "markdown", "id": "fixed-geometry", "metadata": {}, "source": [ "Download the DETR repository, which has been updated with the ClearML experiment tracking code." 
] }, { "cell_type": "code", "execution_count": 31, "id": "small-bullet", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/content2\n" ] } ], "source": [ "%cd /content" ] }, { "cell_type": "code", "execution_count": null, "id": "elder-zimbabwe", "metadata": {}, "outputs": [], "source": [ "# This is for when running in colab\n", "!git clone https://github.com/thepycoder/detr.git" ] }, { "cell_type": "markdown", "id": "corrected-ticket", "metadata": {}, "source": [ "Also import torch and check whether a GPU is available." ] }, { "cell_type": "code", "execution_count": 8, "id": "ranking-sauce", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.10.0 True\n" ] } ], "source": [ "import torch\n", "import fiftyone\n", "from datetime import datetime\n", "print(torch.__version__, torch.cuda.is_available())" ] }, { "cell_type": "markdown", "id": "hawaiian-brother", "metadata": {}, "source": [ "The rest of the ClearML-specific code lives in the `main.py` file, which records every run launched from here, including all the variables, arguments, metrics, model files and so on." ] }, { "cell_type": "markdown", "id": "second-least", "metadata": {}, "source": [ "## [The Tale of the Data and the Model FAIR](https://awoiaf.westeros.org/index.php/The_Bear_and_the_Maiden_Fair)" ] }, { "cell_type": "markdown", "id": "biological-madness", "metadata": {}, "source": [ "To keep things interesting, we created a whole new dataset especially for the occasion: dragons! We gathered 115 images of dragons and annotated them. Interestingly, the set includes both cartoon-style dragons and more 'realistic' dragons.\n", "\n", "To start things out, let's download the dataset. [This repository](https://github.com/thepycoder/dragon_data) will download the images for you. Later on we will use this dataset directly to train a model and visualize the results." 
] }, { "cell_type": "code", "execution_count": 31, "id": "acknowledged-destination", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/content2\n" ] } ], "source": [ "%cd /content" ] }, { "cell_type": "code", "execution_count": null, "id": "prostate-throw", "metadata": {}, "outputs": [], "source": [ "!git clone https://github.com/thepycoder/dragon_data" ] }, { "cell_type": "markdown", "id": "fixed-baking", "metadata": {}, "source": [ "Let's download the data to our disk by using the `get_dataset.py` script from the repo." ] }, { "cell_type": "code", "execution_count": 7, "id": "diagnostic-international", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/content2/dragon_data\n", "100%|███████████████████████████████████████████| 98/98 [00:14<00:00, 6.75it/s]\n", "100%|███████████████████████████████████████████| 10/10 [00:01<00:00, 6.39it/s]\n" ] } ], "source": [ "%cd dragon_data\n", "!python get_dataset.py" ] }, { "cell_type": "markdown", "id": "noble-fields", "metadata": {}, "source": [ "In the next section, we look at [Meta (Facebook) research's DETR](https://github.com/facebookresearch/detr), an object detection network based on the popular transformers architecture that we used and updated with ClearML experiment tracking capabilities." ] }, { "cell_type": "markdown", "id": "tight-arrow", "metadata": {}, "source": [ "## Integrating ClearML Experiment Tracking into Pytorch DETR" ] }, { "cell_type": "markdown", "id": "vanilla-captain", "metadata": {}, "source": [ "The main entrypoint in the original DETR code is the `main.py` script, which contains the main training loop. 
To start tracking our training runs in ClearML, we simply have to add 2 lines of code to this script and we're off to the races!\n", "\n", "You can check this change in `main.py` starting from line 19.\n", "```python\n", "# Imports ...\n", "\n", "# Initialise a ClearML task\n", "from clearml import Task\n", "task = Task.init(project_name='dragon_detector', task_name=f'DETR')\n", "\n", "# Rest of code ...\n", "```\n", "\n", "These 2 lines already log all hyperparameters used, console output, code changes and much more.\n", "\n", "![parameters](images/training_clearml_detector/parameters.png)\n", "\n", "![artifacts](images/training_clearml_detector/artifacts.png)\n", "\n", "![console](images/training_clearml_detector/console.png)" ] }, { "cell_type": "markdown", "id": "compatible-thing", "metadata": {}, "source": [ "### Manually adding scalars" ] }, { "cell_type": "markdown", "id": "sunrise-salon", "metadata": {}, "source": [ "When using popular frameworks like detectron2, tensorboard and others, scalars (e.g. loss, mAP etc.) will automatically be logged by ClearML. If your model implementation does not log them automatically, it's still very easy to log values manually.\n", "\n", "First, we add a ClearML logger. Ideally, we place it right under the 2 magic lines of code we just put in.\n", "\n", "You can check this change in `main.py` starting from line 22.\n", "```python\n", "# Imports ...\n", "\n", "# Initialise a ClearML task and its corresponding logger\n", "from clearml import Task\n", "task = Task.init(project_name='dragon_detector', task_name=f'DETR')\n", "logger = task.get_logger()\n", "\n", "# Rest of code ...\n", "```\n", "\n", "Next, we can use this logger to add anything we want to the ClearML report. In this case, DETR already keeps track of many important variables such as the epoch number, all the different losses for that epoch, training and testing accuracy etc. 
It keeps track of these metrics in a dict, so all we have to do is add this dict to our logger and BOOM! We're tracking them!\n", "\n", "In this case we did two additional things: we only log these metrics every 10 epochs, and we filter out the `coco_eval_bbox` key from the dictionary because it is not a metric but a bounding box, which is difficult to plot.\n", "\n", "```python\n", "\n", "# Original DETR evaluation step\n", "test_stats, coco_evaluator = evaluate(\n", "    model, criterion, postprocessors, data_loader_val, base_ds, device, args.output_dir\n", ")\n", "\n", "# Original DETR dict containing metrics\n", "log_stats = {**{f'train_{k}': v for k, v in train_stats.items()},\n", "             **{f'test_{k}': v for k, v in test_stats.items()},\n", "             'epoch': epoch,\n", "             'n_parameters': n_parameters}\n", "\n", "# Added ClearML tracking\n", "if epoch % 10 == 0:\n", "    for key, value in log_stats.items():\n", "        if 'coco_eval_bbox' in key:\n", "            continue\n", "        logger.report_scalar(title=key, series=key, value=value, iteration=epoch)\n", "\n", "```\n", "\n", "In the ClearML dashboard, you'll find these metrics under Results -> Scalars and they'll be plotted for you.\n", "\n", "![clearml scalar view](images/training_clearml_detector/scalars.png)\n", "\n" ] }, { "cell_type": "markdown", "id": "prime-november", "metadata": {}, "source": [ "### Manually adding debug samples" ] }, { "cell_type": "markdown", "id": "polyphonic-monday", "metadata": {}, "source": [ "Finally, we would like to add Debug Samples. This is where you can log images and their detections straight to the ClearML dashboard, so we can visually see that our training has worked!\n", "Again, if using something like tensorboard, debug images will be captured automatically, but DETR does not evaluate any images in its training loop, so we'll have to add a little more code to get them.\n", "\n", "\n", "First, we're mainly interested in the detection quality at the end of training. 
So we add a custom function at the very end of the training script that runs evaluation with the trained model on all the validation images and logs them to ClearML.\n", "\n", "In `main.py` we add a function call to run evaluation on validation images: line 263 (last line)\n", "```python\n", "# Main entrypoint of training script\n", "if __name__ == '__main__':\n", "    parser = argparse.ArgumentParser('DETR training and evaluation script', parents=[get_args_parser()])\n", "    args = parser.parse_args()\n", "    if args.output_dir:\n", "        Path(args.output_dir).mkdir(parents=True, exist_ok=True)\n", "    main(args)\n", "    run_visual_validation(args, ['dragon'], logger)\n", "```\n", "\n", "The `run_visual_validation` function loads the model we just trained and calls `run_visual_validation_workflow` for every validation image we have.\n", "```python\n", "def run_visual_validation(args, classes, logger):\n", "    # Get the model ready\n", "    model = torch.hub.load('facebookresearch/detr',\n", "                           'detr_resnet50',\n", "                           pretrained=False,\n", "                           num_classes=args.num_classes)\n", "    checkpoint = torch.load(Path(args.output_dir) / 'checkpoint.pth',\n", "                            map_location='cpu')\n", "\n", "    model.load_state_dict(checkpoint['model'],\n", "                          strict=False)\n", "    model.eval()\n", "\n", "    # Iterate over the validation folder and run the model for visual feedback.\n", "    val_folder = Path(args.coco_path) / 'val'\n", "    for img_name in os.listdir(val_folder):\n", "        im = Image.open(Path(val_folder) / img_name)\n", "        run_visual_validation_workflow(im, model, classes, logger, img_name)\n", "```\n", "\n", "The `run_visual_validation_workflow` function in turn takes the validation image, normalizes it, runs it through the model, filters the predictions by a confidence threshold and sends the result out to be plotted.\n", "```python\n", "def run_visual_validation_workflow(my_image, my_model, classes, logger, img_name):\n", "    # mean-std normalize the input image (batch-size: 1)\n", "    img = transform(my_image).unsqueeze(0)\n", "\n", "    # propagate through the model\n", "    outputs = my_model(img)\n", "\n", "    probas_to_keep, bboxes_scaled = filter_bboxes_from_outputs(outputs,\n", "                                                               img,\n", "                                                               threshold=0.35)\n", "\n", "    plot_image_results(my_image,\n", "                       probas_to_keep,\n", "                       bboxes_scaled,\n", "                       classes,\n", "                       logger,\n", "                       img_name)\n", "```\n", "\n", "Finally, `plot_image_results` will make a matplotlib image from the validation image and the relevant predictions. Then with the single line `logger.report_matplotlib_figure` we send that matplotlib image to ClearML. 
This is the only actual ClearML line of code you need; the rest only creates the images.\n", "```python\n", "def plot_image_results(pil_img, prob, boxes, classes, logger, img_name):\n", "    # colors for visualization\n", "    colors = [[0.000, 0.447, 0.741], [0.850, 0.325, 0.098], [0.929, 0.694, 0.125],\n", "              [0.494, 0.184, 0.556], [0.466, 0.674, 0.188], [0.301, 0.745, 0.933]]\n", "    colors = colors * 100\n", "    figure = plt.figure(figsize=(16,10))\n", "    plt.imshow(pil_img)\n", "    ax = plt.gca()\n", "    for p, (xmin, ymin, xmax, ymax), c in zip(prob, boxes.tolist(), colors):\n", "        ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,\n", "                                   fill=False, color=c, linewidth=3))\n", "        cl = p.argmax()\n", "        text = f'{classes[cl]}: {p[cl]:0.2f}'\n", "        ax.text(xmin, ymin, text, fontsize=15,\n", "                bbox=dict(facecolor='yellow', alpha=0.5))\n", "    plt.axis('off')\n", "    plt.savefig(img_name)\n", "    logger.report_matplotlib_figure('Evaluation Results', img_name, figure,\n", "                                    iteration=None, report_image=True,\n", "                                    report_interactive=False)\n", "```\n", "\n", "This all leads to a very handy overview of the model performance in the ClearML dashboard, as can be seen below.\n", "\n", "![debug samples](images/training_clearml_detector/debug_samples.png)\n" ] }, { "cell_type": "markdown", "id": "graphic-crystal", "metadata": {}, "source": [ "## Train DETR\n", "\n", "Since the data in our dragon dataset is limited, our model should be pretrained. \n", "\n", "This means getting a pretrained model and chopping off the class head, so we can retrain that part only." 
] }, { "cell_type": "code", "execution_count": 32, "id": "super-level", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/content2/detr\n" ] } ], "source": [ "%cd /content/detr" ] }, { "cell_type": "code", "execution_count": null, "id": "special-gamma", "metadata": {}, "outputs": [], "source": [ "!pip install -r requirements.txt\n", "!pip install pandas seaborn" ] }, { "cell_type": "code", "execution_count": 33, "id": "stretch-track", "metadata": {}, "outputs": [], "source": [ "# Get pretrained weights\n", "checkpoint = torch.hub.load_state_dict_from_url(\n", "    url='https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth',\n", "    map_location='cpu',\n", "    check_hash=True)\n", "\n", "# Remove class weights\n", "del checkpoint[\"model\"][\"class_embed.weight\"]\n", "del checkpoint[\"model\"][\"class_embed.bias\"]\n", "\n", "# Save the checkpoint without the class head\n", "torch.save(checkpoint, 'detr-r50_no-class-head.pth')" ] }, { "cell_type": "markdown", "id": "brief-stick", "metadata": {}, "source": [ "Our dataset can be loaded in COCO format, which is what the main Python script of DETR expects." ] }, { "cell_type": "code", "execution_count": 36, "id": "swiss-bronze", "metadata": {}, "outputs": [], "source": [ "dataset_file = \"dragons\" # alternatively, implement your own coco-type dataset loader in datasets and add this \"key\" to datasets/__init__.py\n", "\n", "dataDir = '/content/dragon_data/data' # should lead to a directory with a train and val folder as well as an annotations folder\n", "num_classes = 2 # this int should be the number of classes + 1\n", "\n", "outDir = f'outputs_{datetime.now().strftime(\"%m:%d:%Y_%H:%M:%S\")}'\n", "resume = \"detr-r50_no-class-head.pth\"" ] }, { "cell_type": "markdown", "id": "ethical-knitting", "metadata": {}, "source": [ "Start training!" 
] }, { "cell_type": "code", "execution_count": 47, "id": "equipped-frequency", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ClearML Task: created new task id=e551155d03d0424dafcfa14547b9585c\n", "ClearML results page: https://app.community.clear.ml/projects/8131d2f23983461f8354156be95bc4f1/experiments/e551155d03d0424dafcfa14547b9585c/output/log\n", "Using cache found in /root/.cache/torch/hub/facebookresearch_detr_main\n", "2021-12-16 11:04:49,754 - clearml.model - INFO - Selected model id: 3ce34161504342838a56bf457a7ec2e7\n", "Running visual validation on outputs_12:16:2021_09:40:47\n", "2021-12-16 11:04:53,419 - clearml.model - INFO - Selected model id: 127e4ace5efe45e598f26f86ffb636e5\n", "/content2/detr/models/position_encoding.py:41: UserWarning:\n", "\n", "__floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').\n", "\n", "2021-12-16 11:05:30,662 - clearml.Task - INFO - Waiting to finish uploads\n", "2021-12-16 11:05:33,253 - clearml.Task - INFO - Finished uploading\n" ] } ], "source": [ "# Dragons seem to be a difficult class: we need at least 100 epochs for them to be\n", "# detected somewhat decently, so raise --epochs above the 50 used here for better results.\n", "!python main.py \\\n", " --dataset_file $dataset_file \\\n", " --coco_path $dataDir \\\n", " --output_dir $outDir \\\n", " --resume $resume \\\n", " --num_classes $num_classes \\\n", " --lr 1e-5 \\\n", " --lr_backbone 1e-6 \\\n", " --epochs 50" ] }, { "cell_type": "markdown", "id": "changing-accordance", "metadata": {}, "source": [ "## ClearML Experiment Results\n", "\n", "Here is a quick overview of the tracked training results. An interactive version of these graphs should be visible on your ClearML dashboard!" 
] }, { "cell_type": "code", "execution_count": 20, "id": "official-forum", "metadata": {}, "outputs": [], "source": [ "from util.plot_utils import plot_logs\n", "from pathlib import Path\n", "\n", "log_directory = [Path(outDir)]" ] }, { "cell_type": "code", "execution_count": 21, "id": "ongoing-senegal", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fields_of_interest = (\n", " 'loss',\n", " 'mAP',\n", " )\n", "\n", "plot_logs(log_directory,\n", " fields_of_interest)" ] }, { "cell_type": "code", "execution_count": 22, "id": "sought-bread", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fields_of_interest = (\n", " 'loss_ce',\n", " 'loss_bbox',\n", " 'loss_giou',\n", " )\n", "\n", "plot_logs(log_directory,\n", " fields_of_interest)" ] }, { "cell_type": "code", "execution_count": 23, "id": "stainless-league", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fields_of_interest = (\n", " 'class_error',\n", " 'cardinality_error_unscaled',\n", " )\n", "\n", "plot_logs(log_directory,\n", " fields_of_interest) " ] }, { "cell_type": "markdown", "id": "activated-florida", "metadata": {}, "source": [ "If we want to dig deeper and debug our model's performance meticulously, we can analyze it using FiftyOne!" ] }, { "cell_type": "markdown", "id": "demonstrated-romania", "metadata": {}, "source": [ "## Digging in with FiftyOne" ] }, { "cell_type": "markdown", "id": "paperback-prefix", "metadata": {}, "source": [ "To start, let's [load the dataset into FiftyOne](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/index.html) so we can take a look at it in the [FiftyOne App](https://voxel51.com/docs/fiftyone/user_guide/app.html). Loading any custom dataset into FiftyOne is as simple as writing a Python loop. However, since this dataset is already in COCO format, we can load it with just one line of code. 
" ] }, { "cell_type": "code", "execution_count": 4, "id": "absolute-horse", "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "import os" ] }, { "cell_type": "code", "execution_count": 12, "id": "asian-helena", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/content2\n" ] } ], "source": [ "%cd /content\n", "# Should lead to a directory with a train and val folder as well as an annotations folder\n", "dataset_dir = \"dragon_data/data\"\n", "dataset_name = \"dragons\"\n", "\n", "# Reload the dataset if it was persistent\n", "if dataset_name in fo.list_datasets():\n", " fo.delete_dataset(dataset_name)" ] }, { "cell_type": "code", "execution_count": 13, "id": "introductory-drilling", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 100% |███████████████████| 98/98 [198.9ms elapsed, 0s remaining, 492.7 samples/s] \n", " 100% |███████████████████| 10/10 [15.1ms elapsed, 0s remaining, 662.2 samples/s] \n" ] }, { "data": { "text/plain": [ "['61bb163f336bdf908449101a',\n", " '61bb163f336bdf908449101e',\n", " '61bb163f336bdf908449101f',\n", " '61bb163f336bdf9084491026',\n", " '61bb163f336bdf9084491027',\n", " '61bb163f336bdf9084491028',\n", " '61bb163f336bdf9084491029',\n", " '61bb163f336bdf9084491031',\n", " '61bb163f336bdf9084491032',\n", " '61bb163f336bdf9084491033']" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Load the training dataset into FiftyOne and tag all samples with \"train\"\n", "dataset = fo.Dataset.from_dir(\n", " dataset_type=fo.types.COCODetectionDataset,\n", " data_path= os.path.join(dataset_dir, \"train\"),\n", " labels_path= os.path.join(dataset_dir, \"annotations/train.json\"),\n", " name=dataset_name,\n", " tags=\"train\",\n", ")\n", "\n", "# Add the validation data and tag the samples with \"val\"\n", "dataset.add_dir(\n", " dataset_type=fo.types.COCODetectionDataset,\n", " data_path= 
os.path.join(dataset_dir, \"val\"),\n", " labels_path= os.path.join(dataset_dir, \"annotations/val.json\"),\n", " tags=\"val\",\n", ")" ] }, { "cell_type": "markdown", "id": "vocational-colon", "metadata": {}, "source": [ "The `launch_app()` method launches the App directly in the output of this cell and also returns a Session instance, which you can subsequently use to interact programmatically with the App." ] }, { "cell_type": "code", "execution_count": 22, "id": "other-italic", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(dataset)" ] }, { "cell_type": "code", "execution_count": null, "id": "loaded-luxembourg", "metadata": {}, "outputs": [], "source": [ "session.freeze() # Screenshot the App for this example" ] }, { "cell_type": "markdown", "id": "ideal-relief", "metadata": {}, "source": [ "### Loading predictions into FiftyOne\n", "\n", "Similar to how you load ground truth labels into a FiftyOne Dataset, [loading model predictions](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/index.html#model-predictions) is as easy as writing a Python loop. In general, loading predictions follows this pseudocode:\n", "\n", "```python\n", "import fiftyone as fo\n", "\n", "dataset = fo.load_dataset(\"dragons\")\n", "img_paths = [\"/path/to/img1.png\", ...]\n", "\n", "# Ex. custom prediction format: one list per image of [bbox, label, confidence]\n", "predictions = [[[[0.1,0.2,0.3,0.5], \"car\", 0.921], ...], ...]\n", "\n", "for img_path, img_preds in zip(img_paths, predictions):\n", "    sample = dataset[img_path]\n", "    dets = []\n", "    for bbox, label, conf in img_preds:\n", "        dets.append(\n", "            fo.Detection(\n", "                bounding_box=bbox,\n", "                label=label,\n", "                confidence=conf,\n", "            )\n", "        )\n", "    sample[\"predictions\"] = fo.Detections(detections=dets)\n", "    sample.save()\n", "\n", "# View predictions in the App\n", "session = fo.launch_app(dataset)\n", "```\n", "\n", "Specifically, the following function iterates through a FiftyOne dataset, loads the image data, generates predictions, and stores those predictions on the dataset in FiftyOne format." 
] }, { "cell_type": "code", "execution_count": null, "id": "opposite-sacramento", "metadata": {}, "outputs": [], "source": [ "from PIL import Image\n", "import torchvision.transforms as T\n", "\n", "def run_inference(samples, model, classes=[\"background\", \"dragon\"]):\n", "    # standard PyTorch mean-std input image normalization\n", "    transform = T.Compose([\n", "        T.Resize(800),\n", "        T.ToTensor(),\n", "        T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])\n", "    ])\n", "\n", "    # Iterate over `samples`, a FiftyOne dataset or view\n", "    for sample in samples.iter_samples(progress=True):\n", "        image = Image.open(sample.filepath)\n", "        if image.mode != \"RGB\":\n", "            image = image.convert(\"RGB\")\n", "\n", "        # Mean-std normalize the input image (batch-size: 1)\n", "        image = transform(image).unsqueeze(0)\n", "\n", "        # Perform inference\n", "        preds = model(image)\n", "\n", "        scores = preds[\"pred_logits\"].softmax(-1)[0, :, :-1]\n", "\n", "        keep_thresh = 0.2\n", "        keep = scores.max(-1).values > keep_thresh\n", "\n", "        scores = scores[keep].cpu().detach().numpy()\n", "        boxes = preds[\"pred_boxes\"][0, keep].cpu().detach().numpy()\n", "\n", "        # Convert detections to FiftyOne format\n", "        detections = []\n", "        for score, box in zip(scores, boxes):\n", "            # Output boxes in [center-x, center-y, width, height]\n", "            # Convert to [top-left-x, top-left-y, width, height]\n", "            cx, cy, w, h = box\n", "            formatted_box = [cx - 0.5 * w, cy - 0.5 * h, w, h]\n", "\n", "            label_ind = score.argmax()\n", "            label = classes[label_ind]\n", "            confidence = score[label_ind]\n", "\n", "            detections.append(\n", "                fo.Detection(\n", "                    label=label,\n", "                    bounding_box=formatted_box,\n", "                    confidence=confidence,\n", "                )\n", "            )\n", "\n", "        # Save predictions to dataset\n", "        sample[\"predictions\"] = fo.Detections(detections=detections)\n", "        sample.save()" ] }, { "cell_type": "markdown", "id": "serial-electric", "metadata": {}, "source": [ "Now let's load the model we just trained and add its 
predictions to our FiftyOne dataset." ] }, { "cell_type": "code", "execution_count": null, "id": "rational-source", "metadata": {}, "outputs": [], "source": [ "detr_path = \"detr\"\n", "num_classes = 2\n", "\n", "model = torch.hub.load('facebookresearch/detr',\n", " 'detr_resnet50',\n", " pretrained=False,\n", " num_classes=num_classes)\n", "\n", "checkpoint = torch.load(f'{outDir}/checkpoint.pth', map_location='cpu')\n", "\n", "model.load_state_dict(checkpoint['model'],\n", " strict=False)\n", "\n", "model.eval()" ] }, { "cell_type": "code", "execution_count": 16, "id": "interracial-chicago", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1% |/----------------| 1/108 [942.4ms elapsed, 1.7m remaining, 1.1 samples/s] " ] }, { "name": "stderr", "output_type": "stream", "text": [ "/root/.cache/torch/hub/facebookresearch_detr_main/models/position_encoding.py:41: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').\n", " dim_t = self.temperature ** (2 * (dim_t // 2) / self.num_pos_feats)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " 100% |█████████████████| 108/108 [1.8m elapsed, 0s remaining, 1.3 samples/s] \n" ] } ], "source": [ "run_inference(dataset, model)" ] }, { "cell_type": "markdown", "id": "vocational-belarus", "metadata": {}, "source": [ "We can easily visualize the predictions in the FiftyOne App." ] }, { "cell_type": "code", "execution_count": 7, "id": "treated-guinea", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(dataset)" ] }, { "cell_type": "code", "execution_count": 9, "id": "found-observer", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "simple-hypothesis", "metadata": {}, "source": [ "Once the predictions are in FiftyOne, we can easily export them in COCO format into a JSON file on disk." ] }, { "cell_type": "code", "execution_count": null, "id": "bulgarian-speaking", "metadata": {}, "outputs": [], "source": [ "# Export predictions from FiftyOne dataset to disk in COCO-formatted JSON\n", "dataset.export(\n", " label_field=\"predictions\",\n", " label_path=\"/path/to/coco_predictions.json\",\n", " dataset_type=fo.types.COCODetectionDataset,\n", ")" ] }, { "cell_type": "markdown", "id": "increased-employment", "metadata": {}, "source": [ " We can then upload this JSON file to our ClearML dataset to keep track of the model predictions for this iteration of the data." ] }, { "cell_type": "markdown", "id": "happy-spouse", "metadata": {}, "source": [ "### Evaluating and Filtering Results\n", "\n", "Now that the model predictions are loaded, we can dig in and analyze the results. FiftyOne provides [methods for evaluating classification, detection, and segmentation](https://voxel51.com/docs/fiftyone/user_guide/evaluation.html) models. While these methods can be used to compute dataset-wide metrics like so many other tools, the primary benefit is that this evaluation also populates instance-level results on the dataset like tagging individual true or false positive predictions. This allows you to not only understand how the model performs on the dataset as a whole but also specific instances in which the model performs well or poorly which is the best way to build intuition about the type of data you should use to retrain the model." 
] }, { "cell_type": "code", "execution_count": 4, "id": "motivated-development", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(dataset)" ] }, { "cell_type": "code", "execution_count": null, "id": "sound-bulgarian", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "aging-yeast", "metadata": {}, "source": [ "Visualizing predictions in FiftyOne shows many low-confidence, incorrect predictions. This indicates that we should find an appropriate confidence threshold to limit the predictions in the dataset. " ] }, { "cell_type": "markdown", "id": "formal-ecuador", "metadata": {}, "source": [ "One way to find a threshold value for detection confidence is to count the true and false positives at each confidence level and find the point at which there are an equal number of both." ] }, { "cell_type": "markdown", "id": "finnish-suspension", "metadata": {}, "source": [ "Let's call the [`evaluate_detections()`](https://voxel51.com/docs/fiftyone/user_guide/evaluation.html#detections) method, which uses COCO-style object detection evaluation to determine whether each ground truth object and prediction is a true positive, false positive, or false negative." 
] }, { "cell_type": "code", "execution_count": 30, "id": "hawaiian-penguin", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating detections...\n", " 100% |█████████████████| 108/108 [532.6ms elapsed, 0s remaining, 202.8 samples/s] \n", "Performing IoU sweep...\n", " 100% |█████████████████| 108/108 [898.6ms elapsed, 0s remaining, 120.2 samples/s] \n" ] } ], "source": [ "eval_key = \"full_dataset_eval\"\n", "results = dataset.evaluate_detections(\n", " \"predictions\", \n", " gt_field=\"ground_truth\", \n", " eval_key=eval_key, \n", ")" ] }, { "cell_type": "markdown", "id": "historic-trance", "metadata": {}, "source": [ "The FiftyOne API provides a [powerful query language](https://voxel51.com/docs/fiftyone/user_guide/using_views.html#) that can be used to [filter and slice datasets](https://voxel51.com/docs/fiftyone/user_guide/using_views.html#filtering) letting you look at the specific view in which you are interested. It also provides [dataset-wide aggregation functions](https://voxel51.com/docs/fiftyone/user_guide/using_aggregations.html) that let you easily access content from your datasets such as label values, counts, distributions, and ranges. One of these aggregations is the [`histogram_values()`](https://voxel51.com/docs/fiftyone/user_guide/using_aggregations.html#histogram-values) function that is perfect for our use case of computing the number of true and false positives for each confidence bin." 
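, "\n",
"\n",
"To make the thresholding idea concrete, here is a minimal NumPy sketch on synthetic confidence scores (the arrays are made up for illustration and are not results from this dataset). It bins true and false positive confidences with shared bin edges, accumulates the counts from the highest confidence downward, and picks the lowest bin edge at which false positives do not yet outnumber true positives. The variable names are assumptions for this sketch only.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"# Synthetic confidences: true positives skew high, false positives skew low\n",
"tp_conf = rng.beta(5, 2, size=300)\n",
"fp_conf = rng.beta(2, 5, size=600)\n",
"\n",
"edges = np.linspace(0.0, 1.0, 51)  # 50 shared bins on [0, 1]\n",
"tp_hist, _ = np.histogram(tp_conf, bins=edges)\n",
"fp_hist, _ = np.histogram(fp_conf, bins=edges)\n",
"\n",
"# Number of detections with confidence at or above each bin's left edge\n",
"tp_cum = np.cumsum(tp_hist[::-1])[::-1]\n",
"fp_cum = np.cumsum(fp_hist[::-1])[::-1]\n",
"\n",
"# Bins whose left edge already admits more false than true positives\n",
"too_low = np.nonzero(fp_cum > tp_cum)[0]\n",
"\n",
"# Step one edge above the highest such bin to get the crossover threshold\n",
"threshold = edges[too_low[-1] + 1] if too_low.size else edges[0]\n",
"print(threshold)\n",
"```\n",
"\n",
"With real data the two cumulative curves can cross more than once, so it is still worth plotting them and sanity checking the chosen value visually."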
] }, { "cell_type": "code", "execution_count": null, "id": "aboriginal-perspective", "metadata": {}, "outputs": [], "source": [ "from fiftyone import ViewField as F" ] }, { "cell_type": "code", "execution_count": null, "id": "assigned-forum", "metadata": {}, "outputs": [], "source": [ "tp_view = dataset.filter_labels(\"predictions\", F(eval_key) == \"tp\")\n", "fp_view = dataset.filter_labels(\"predictions\", F(eval_key) == \"fp\")" ] }, { "cell_type": "code", "execution_count": null, "id": "authentic-relevance", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "id": "brown-necessity", "metadata": {}, "outputs": [], "source": [ "def plot_hist(counts, edges):\n", " counts = np.asarray(counts)\n", " edges = np.asarray(edges)\n", " left_edges = edges[:-1]\n", " widths = edges[1:] - edges[:-1]\n", " plt.bar(left_edges, counts, width=widths, align=\"edge\")" ] }, { "cell_type": "code", "execution_count": 71, "id": "sensitive-funds", "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXYAAAD5CAYAAAAzzx7cAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAPNklEQVR4nO3df6xkd13G8fdjW1QK0q17W9fCsoggFEJLvW4QDClBpD9i2iIo1UCDjYsGTEmIaUMiJeGfJQoYRdAFmi4JFolQqVCQZgEbBIp3cWm3bpEKTS1suluKQtFEd/vxjzmbXpe9O2d+3tlv369kcmfOfGfm2TP3PnvmzHfOpKqQJLXjR9Y7gCRpuix2SWqMxS5JjbHYJakxFrskNcZil6TGnDxsQJIfA24FfrQb/zdVdW2S04G/BrYA9wC/XlXfPd59bdy4sbZs2TJhZEl6dNm9e/cDVbXUd3yGzWNPEuDUqnooySnA54GrgJcBD1bV9iTXABuq6urj3dfy8nKtrKz0zSZJApLsrqrlvuOH7oqpgYe6i6d0pwIuAXZ2y3cCl46YVZI0A732sSc5Kcke4ABwS1XdBpxZVfsBup9nzC6mJKmvXsVeVYer6lzgicDWJM/u+wBJtiVZSbJy8ODBcXNKknoaaVZMVf0H8DngAuD+JJsAup8H1rjNjqparqrlpaXe+/4lSWMaWuxJlpKc1p3/ceCXgbuAm4ArumFXAB+bVUhJUn9DpzsCm4CdSU5i8B/Bh6vq40m+CHw4yZXAvcArZphTktTT0GKvqtuB5x5j+XeAF88ilCRpfH7yVJIaY7FLUmP67GOXJK2y5ZpPHHP5PdsvnnOSY3OLXZIaY7FLUmMsdklqjMUuSY2x2CWpMRa7JDXGYpekxljsktQYi12SGmOxS1JjLHZJaozFLkmNsdglqTEWuyQ1xmKXpMZY7JLUGItdkhpjsUtSYyx2SWqMxS5JjbHYJakxFrskNcZil6TGWOyS1JihxZ7kSUk+m2RfkjuTXNUtf0uSbyXZ050umn1cSdIwJ/cYcwh4Y1V9Jcnjgd1Jbumue2dV/fHs4kmSRjW02KtqP7C/O//9JPuAs2YdTJI0npH2sSfZAjwXuK1b9Poktye5LsmGNW6zLclKkpWDBw9OFFaSNFzvYk/yOOAjwBuq6nvAe4CnAucy2KJ/+7FuV1U7qmq5qpaXlpamEFmSdDy9ij3JKQxK/YNV9VGAqrq/qg5X1cPAe4Gts4spSeqrz6yYAO8H9lXVO1Yt37Rq2GXA3unHkySNqs+smBcArwLuSLKnW/Ym4PIk5wIF3AO8diYJJUkj6TMr5vNAjnHVzdOPI0malJ88laTGWOyS1BiLXZIaY7FLUmMsdklqjMUuSY2x2CWpMRa7JDXGYpekxljsktQYi12SGmOxS1JjLHZJaozFLkmNsdglqTEWuyQ1xmKXpMZY7JLUGItdkhpjsUtSYyx2SWqMxS5JjbHYJakxFrskNcZil6TGWOyS1BiLXZIaM7TYkzwpyWeT7EtyZ5KruuWnJ7klyde7nxtmH1eSNEyfLfZDwBur6pnA84DXJTkbuAbYVVVPA3Z1lyVJ62xosVfV/qr6Snf++8A+4CzgEmBnN2wncOmsQkqS+htpH3uSLcBzgduAM6tqPwzKHzhjjdtsS7KSZOXgwYOTpZUkDdW72JM8DvgI8Iaq+l7f21XVjqparqrlpaWlcTJKkkbQq9iTnMKg1D9YVR/tFt+fZFN3/SbgwGwiSpJG0WdWTID3A/uq6h2rrroJuKI7fwXwsenHkySN6uQeY14AvAq4I8mebtmbgO3Ah5NcCdwLvGI2ESVJoxha7FX1eSBrXP3i6caRJE3KT55KUmMsdklqjMUuSY2x2CWpMRa7JDXGYpekxljsktQYi12SGmOxS1JjLHZJaozFLkmNsdglqTEWuyQ1xmKXpMZY7JLUmD5ftCFJTdtyzSeOufye7RfPOcl0uMU
uSY2x2CWpMRa7JDXGYpekxljsktQYi12SGmOxS1JjnMcuaWG0Np98vbjFLkmNsdglqTFDiz3JdUkOJNm7atlbknwryZ7udNFsY0qS+uqzxX49cMExlr+zqs7tTjdPN5YkaVxDi72qbgUenEMWSdIUTLKP/fVJbu921WyYWiJJ0kTGLfb3AE8FzgX2A29fa2CSbUlWkqwcPHhwzIeTJPU1VrFX1f1VdbiqHgbeC2w9ztgdVbVcVctLS0vj5pQk9TRWsSfZtOriZcDetcZKkuZr6CdPk9wAnA9sTHIfcC1wfpJzgQLuAV47w4ySpBEMLfaquvwYi98/gyySpCnwk6eS1BiLXZIa49EdJT1qrHX0yNa4xS5JjbHYJakxFrskNcZil6TGWOyS1BiLXZIa43RHSSesRfvy6+NNp5xnJrfYJakxFrskNcZil6TGWOyS1BiLXZIaY7FLUmOc7iipOdM6iuOJejRIt9glqTEWuyQ1xmKXpMZY7JLUGItdkhpjsUtSYyx2SWqMxS5JjbHYJakxFrskNWZosSe5LsmBJHtXLTs9yS1Jvt793DDbmJKkvvpssV8PXHDUsmuAXVX1NGBXd1mStACGFntV3Qo8eNTiS4Cd3fmdwKVTziVJGtO4R3c8s6r2A1TV/iRnrDUwyTZgG8DmzZvHfDhJJ6Jpfdn0iXqUxfUy8zdPq2pHVS1X1fLS0tKsH06SHvXGLfb7k2wC6H4emF4kSdIkxi32m4AruvNXAB+bThxJ0qT6THe8Afgi8HNJ7ktyJbAdeEmSrwMv6S5LkhbA0DdPq+ryNa568ZSzSJKmwE+eSlJjLHZJaozFLkmNsdglqTEWuyQ1xmKXpMZY7JLUGItdkhpjsUtSY8Y9bK8kAeMdUtfD8M6WW+yS1BiLXZIaY7FLUmMsdklqjMUuSY2x2CWpMRa7JDXGYpekxljsktQYi12SGmOxS1JjLHZJaozFLkmNsdglqTEWuyQ1xmKXpMZY7JLUmIm+QSnJPcD3gcPAoapankYoSdL4pvHVeC+qqgemcD+SpClwV4wkNWbSLfYCPp2kgL+sqh1HD0iyDdgGsHnz5gkfTtJ68QuoTxyTbrG/oKrOAy4EXpfkhUcPqKodVbVcVctLS0sTPpwkaZiJir2qvt39PADcCGydRihJ0vjGLvYkpyZ5/JHzwK8Ae6cVTJI0nkn2sZ8J3JjkyP38VVV9aiqpJEljG7vYq+obwDlTzCJJmgKnO0pSY6bxAaVmrDWd657tF885iTQ6f391hFvsktQYi12SGmOxS1JjLHZJaozFLkmNsdglqTEnzHRHp3Ittmk9P6MeQXARn39/V7Xe3GKXpMZY7JLUGItdkhpjsUtSYyx2SWqMxS5JjbHYJakxJ8w89rW0MO/50WjW33i/XvPq52HUTKOui0X8N2s0brFLUmMsdklqjMUuSY2x2CWpMRa7JDXGYpekxqSq5vZgy8vLtbKyMtZtT6QpWKNOI1uvKXhOd5PmZ5Kp1kl2V9Vy3/FusUtSYyx2SWrMRMWe5IIkX0tyd5JrphVKkjS+sYs9yUnAnwMXAmcDlyc5e1rBJEnjmWSLfStwd1V9o6r+B/gQcMl0YkmSxjVJsZ8F/Puqy/d1yyRJ62iSozvmGMt+aO5kkm3Atu7iQ0m+NuR+NwIPTJBr1obmy9tGu8NRx/fQax3O4HH7WvTnGBY/46Lng8XPONd8Y/y9rc735FFuOEmx3wc8adXlJwLfPnpQVe0AdvS90yQro8zXnLdFzweLn3HR88HiZ1z0fLD4GVvON8mumH8CnpbkKUkeA7wSuGmC+5MkTcHYW+xVdSjJ64G/B04CrquqO6eWTJI0lom+QamqbgZunlKWI3rvtlkni54PFj/joueDxc+46Plg8TM2m2+ux4qRJM2ehxSQpMasW7EPOxxBkt9Kcnt3+kKScxYs3yVdtj1JVpL80iLlWzXuF5IcTvLyeebrHnvYOjw/yX9263BPkjcvUr5VGfckuTP
JP8wzX5+MSf5g1frb2z3Xpy9Qvick+bskX+3W4WvmlW2EjBuS3Nj9PX85ybPnnO+6JAeS7F3j+iT50y7/7UnOG3qnVTX3E4M3W/8N+BngMcBXgbOPGvN8YEN3/kLgtgXL9zge2ZX1HOCuRcq3atxnGLwP8vIFfI7PBz6+wL+DpwH/AmzuLp+xaBmPGv+rwGcWKR/wJuBt3fkl4EHgMQuW8Y+Aa7vzzwB2zfl5fiFwHrB3jesvAj7J4LNDz+vTheu1xT70cARV9YWq+m538UsM5skvUr6HqlvrwKkc48NZ65mv8/vAR4ADc8x2xKIfcqJPvt8EPlpV9wJU1bzX46jr8HLghrkkG+iTr4DHJwmDjaEHgUMLlvFsYBdAVd0FbEly5rwCVtWtDNbLWi4BPlADXwJOS7LpePe5XsU+6uEIrmTwP9a89MqX5LIkdwGfAH57TtmgR74kZwGXAX8xx1yr9X2Of7F7mf7JJM+aTzSgX76nAxuSfC7J7iSvnlu6gd5/J0keC1zA4D/yeemT713AMxl8ePEO4Kqqeng+8YB+Gb8KvAwgyVYGn/Kc54bkMCMfvmW9ir3X4QgAkryIQbFfPdNERz3sMZb9UL6qurGqngFcCrx15qke0SffnwBXV9XhOeQ5lj4ZvwI8uarOAf4M+NuZp3pEn3wnAz8PXAy8FPjDJE+fdbBVev+dMNgN849Vdbwtv2nrk++lwB7gp4FzgXcl+YlZB1ulT8btDP4D38PgVe4/M99XFcOM8nsATDiPfQK9DkeQ5DnA+4ALq+o7c8oGPfMdUVW3Jnlqko1VNY9jT/TJtwx8aPAKmI3ARUkOVdW8ynNoxqr63qrzNyd594Ktw/uAB6rqB8APktwKnAP86xzyHXn8vr+Hr2S+u2GgX77XANu73ZZ3J/kmg/3YX55PxN6/h6+BwRuVwDe706IYqY+AdXvz9GTgG8BTeOQNjWcdNWYzcDfw/AXN97M88ubpecC3jlxehHxHjb+e+b952mcd/tSqdbgVuHeR1iGDXQi7urGPBfYCz16kddiNewKDfbSnLuBz/B7gLd35M7u/k40LlvE0ujd0gd9hsD97buuxe9wtrP3m6cX8/zdPvzzs/tZli73WOBxBkt/trv8L4M3ATwLv7rY6D9WcDtjTM9+vAa9O8r/AfwO/Ud2zsCD51lXPjC8Hfi/JIQbr8JWLtA6ral+STwG3Aw8D76uqY05JW6+M3dDLgE/X4JXF3PTM91bg+iR3MCimq2s+r8hGyfhM4ANJDjOYBXXlvPIBJLmBwQyxjUnuA64FTlmV72YGM2PuBv6L7tXFce9zTn9HkqQ58ZOnktQYi12SGmOxS1JjLHZJaozFLkmNsdglqTEWuyQ1xmKXpMb8HwzeYKCwk4HXAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "tp_counts, tp_edges, other = tp_view.histogram_values(\"predictions.detections.confidence\", bins=50)\n", "plot_hist(tp_counts, tp_edges)" ] }, { "cell_type": "code", "execution_count": 73, "id": "acoustic-shopper", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAOBElEQVR4nO3df6zd9V3H8efbdfwxwK3YS62VuzuROdgyunmtOoxhWeb4EQMoRqoZBIl3mmFYQgwNiRvJ/umi20ycY3aDwJLJYgI4FDZHuimZU2bBAsUyQVaxrKFgF2GLiba8/eN8m15v7+353nvO+fb77n0+kpN7zvd+z/2+7vec++rnfM/nexqZiSSpnh860QEkSStjgUtSURa4JBVlgUtSURa4JBW1psuNrVu3LmdmZrrcpCSV98gjj7yUmVMLl3da4DMzM+zcubPLTUpSeRHx74st9xCKJBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBXV6ZmYo5jZev+iy/duu7TjJJLUD47AJakoC1ySirLAJakoC1ySihpa4BFxVkR8PSL2RMSTEXFDs/yWiHg+InY1l0smH1eSdESbWSiHgBsz89GIOB14JCIebL73ycz8o8nFkyQtZWiBZ+Z+YH9z/ZWI2ANsnHQwSdLxLesYeETMAO8AHm4WXR8Rj0fE7RGxdon7zEXEzojY+eKLL44UVpJ0VOsCj4jTgLuBD2Xmy8CtwNnAJgYj9I8vdr/M3J6Zs5k5OzV1zH/pJklaoVYFHhGvZVDeX8jMewAy84XMPJyZrwKfBTZPLqYkaaE2s1ACuA3Yk5mfmLd8w7zVrgB2jz+eJGkpbWahXAC8H3giInY1y24GtkTEJiCBvcAHJpJQkrSoNrNQvgHEIt96YPxxJElteSamJBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSURa4JBVlgUtSUUMLPCLOioivR8SeiHgyIm5olp8REQ9GxNPN17WTjytJOqLNCPwQcGNmngv8HPDBiDgP2ArsyMxzgB3NbUlSR4YWeGbuz8xHm+uvAHuAjcBlwJ3NancCl08qpCTpWMs6Bh4RM8A7gIeB9Zm5HwYlD5w57nCSpKWtabtiRJwG3A18KDNfjoi295sD5gCmp6dXkvG4Zrbev6z19267dOwZJOlEaDUCj4jXMijvL2TmPc3iFyJiQ/P9DcCBxe6bmdszczYzZ6empsaRWZJEu1koAdwG7MnMT8z71n3ANc31a4AvjT+eJGkpbQ6hXAC8H3giInY1y24GtgF/ERHXAc8BvzaZiJKkxQwt8Mz8BrDUAe/3jDeOJKktz8SUpKIscEkqygKXpKIscEkqygKXpKIscEkqygKXpKIscEkqygKXpKIscEkqqvXHya4GS300rR9BK6mPHIFLUlEWuCQVZYFLUlEWuCQVZYFLUlEWuCQVZYFLUlEWuCQVZYFLUlEWuCQVZYFLU
lEWuCQVZYFLUlEWuCQVZYFLUlEWuCQVZYFLUlEWuCQVZYFLUlEWuCQVNbTAI+L2iDgQEbvnLbslIp6PiF3N5ZLJxpQkLdRmBH4HcNEiyz+ZmZuaywPjjSVJGmZogWfmQ8DBDrJIkpZhzQj3vT4irgZ2Ajdm5vcWWyki5oA5gOnp6RE2V9/M1vsXXb5326UdJ5F0Mljpm5i3AmcDm4D9wMeXWjEzt2fmbGbOTk1NrXBzkqSFVlTgmflCZh7OzFeBzwKbxxtLkjTMigo8IjbMu3kFsHupdSVJkzH0GHhE3AVcCKyLiH3AR4ALI2ITkMBe4AMTzChJWsTQAs/MLYssvm0CWSRJy+CZmJJUlAUuSUWNMg981ejb/O2+5ZF0YjgCl6SiLHBJKsoCl6SiLHBJKsoCl6SiLHBJKsoCl6SiVt088KXmUEvVeX7A6uMIXJKKssAlqSgLXJKKssAlqSgLXJKKssAlqSgLXJKKWnXzwLtQZa6584al2hyBS1JRFrgkFWWBS1JRFrgkFWWBS1JRFrgkFWWBS1JRzgPXSck57loNHIFLUlEWuCQVZYFLUlEWuCQVNbTAI+L2iDgQEbvnLTsjIh6MiKebr2snG1OStFCbEfgdwEULlm0FdmTmOcCO5rYkqUNDCzwzHwIOLlh8GXBnc/1O4PIx55IkDbHSeeDrM3M/QGbuj4gzl1oxIuaAOYDp6ekVbu7kNuk5y1U+n7wLq3F++Gr8nVeLib+JmZnbM3M2M2enpqYmvTlJWjVWWuAvRMQGgObrgfFFkiS1sdICvw+4prl+DfCl8cSRJLXVZhrhXcA/AD8VEfsi4jpgG/DeiHgaeG9zW5LUoaFvYmbmliW+9Z4xZ5EkLYNnYkpSURa4JBXl54FLnNi50svd9sk8r98568vjCFySirLAJakoC1ySirLAJakoC1ySirLAJakoC1ySinIe+EnkZJ4fvJQ+/s7V5zJXzw8nx+/QhiNwSSrKApekoixwSSrKApekoixwSSrKApekoixwSSrKeeA9djLMcR7XfNw+7ovlqv47rOSxnPTjf7LN614uR+CSVJQFLklFWeCSVJQFLklFWeCSVJQFLklFWeCSVJTzwDUW1ec4L+V4v9dqn4PcB5N+3vV9/rkjcEkqygKXpKIscEkqygKXpKJGehMzIvYCrwCHgUOZOTuOUJKk4cYxC+XdmfnSGH6OJGkZPIQiSUWNOgJP4KsRkcCfZeb2hStExBwwBzA9PT3i5vrlZJ373IW+z6/VZPm3Mx6jjsAvyMx3AhcDH4yIX1y4QmZuz8zZzJydmpoacXOSpCNGKvDM/G7z9QBwL7B5HKEkScOtuMAj4tSIOP3IdeCXgN3jCiZJOr5RjoGvB+6NiCM/588z8ytjSSVJGmrFBZ6ZzwLnjzGLJGkZnEYoSUVZ4JJUlJ8HLq1Q9bnM48pffT+sRF/OY3AELklFWeCSVJQFLklFWeCSVJQFLklFWeCSVJQFLklFOQ9cxziR83pX45zipbgv6jneYzaJOeKOwCWpKAtckoqywCWpKAtckoqywCWpKAtckoqywCWpKOeBS1o1+vI53uPiCFySirLAJakoC1ySirLAJakoC1ySirLAJakoC1ySinIeuKRVr+pnrzsCl6SiLHBJKsoCl6SiLHBJKmqkAo+IiyLi2xHxTERsHVcoSdJwKy7wiHgN8KfAxcB5wJaIOG9cwSRJxzfKCHwz8ExmPpuZ/wN8EbhsPLEkScOMMg98I/Af827vA3524UoRMQfMNTe/HxHfXuH21gEvrfC+Xeh7Puh/xr7ng/5nNN/oJpIxPjbS3d+42MJRCjwWWZbHLMjcDmwfYTuDjUXszMzZUX/OpPQ9H/Q/Y9/zQf8zmm90FTIeMcohlH3AWfNu/zjw3dHiSJLaGqXA/wk4JyLeFBGnAFcB940nliRpmBUfQsnMQxFxPfA3wGuA2zPzybElO9bIh2EmrO/5oP8Z+54P+p/RfKOrkBGAyDzmsLUkq
QDPxJSkoixwSSqqdwU+7PT8iPjNiHi8uXwzIs7vWb7Lmmy7ImJnRPxCn/LNW+9nIuJwRFzZZb5m28P24YUR8V/NPtwVER/uU755GXdFxJMR8Xdd5muTMSJ+f97+29081mf0KN/rI+KvIuKxZh9e21W2lvnWRsS9zd/ytyLibV3may0ze3Nh8GbovwE/AZwCPAact2CddwFrm+sXAw/3LN9pHH1v4e3AU33KN2+9rwEPAFf28DG+EPjrHj8H3wD8CzDd3D6zbxkXrP/LwNf6lA+4GfhYc30KOAic0qN8fwh8pLn+FmDHiXg+Drv0bQQ+9PT8zPxmZn6vufmPDOaf9ynf97N51IFTWeTkphOZr/F7wN3AgQ6zHdH3j2Bok+83gHsy8zmAzOx6Py53H24B7uok2UCbfAmcHhHBYNBzEDjUo3znATsAMvMpYCYi1neUr7W+Ffhip+dvPM761wFfnmii/69Vvoi4IiKeAu4HfqujbNAiX0RsBK4APtNhrvnaPsY/37y8/nJEvLWbaEC7fG8G1kbE30bEIxFxdWfpBlr/nUTE64CLGPyD3ZU2+T4FnMvg5L8ngBsy89Vu4rXK9xjwKwARsZnBqexdDhZb6VuBtzo9HyAi3s2gwG+aaKIFm11k2WIfH3BvZr4FuBz46MRTHdUm3x8DN2Xm4Q7yLKZNxkeBN2bm+cCfAH858VRHtcm3Bvhp4FLgfcAfRMSbJx1sntZ/JwwOn/x9Zh6cYJ6F2uR7H7AL+DFgE/CpiPjhSQdrtMm3jcE/0rsYvGL9Z7p7hdBa3/5T41an50fE24HPARdn5n92lA2W+fEBmflQRJwdEesys4sP8GmTbxb44uCVK+uASyLiUGZ2VZJDM2bmy/OuPxARn+7ZPtwHvJSZPwB+EBEPAecD/9pBviPbb/s8vIpuD59Au3zXAtuaw43PRMR3GBxr/lYf8jXPwWsBmsM832ku/XKiD8IveONgDfAs8CaOvrnw1gXrTAPPAO/qab6f5OibmO8Enj9yuw/5Fqx/B92/idlmH/7ovH24GXiuT/uQwUv/Hc26rwN2A2/r0z5s1ns9g2PLp/bwMb4VuKW5vr75O1nXo3xvoHlTFfht4PNd7sO2l16NwHOJ0/Mj4nea738G+DDwI8Cnm1Hkoezok8Na5vtV4OqI+F/gv4Ffz+ZZ0JN8J1TLjFcCvxsRhxjsw6v6tA8zc09EfAV4HHgV+Fxm7u4iX9uMzapXAF/NwSuFzrTM91Hgjoh4gsEhjZuym1dYbfOdC3w+Ig4zmHF0XRfZlstT6SWpqL69iSlJaskCl6SiLHBJKsoCl6SiLHBJKsoCl6SiLHBJKur/AIls58sQeAP1AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fp_counts, fp_edges, other = fp_view.histogram_values(\"predictions.detections.confidence\", bins=50)\n", "plot_hist(fp_counts, fp_edges)" ] }, { "cell_type": "markdown", "id": "plain-sapphire", "metadata": {}, "source": [ "Let's integrate over this histogram starting at the highest confidence to find the confidence at which the number of true positives equals the number of false positives." ] }, { "cell_type": "code", "execution_count": 74, "id": "accessory-weight", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(tp_edges[:-1][::-1], np.cumsum(tp_counts[::-1]))\n", "plt.plot(fp_edges[:-1][::-1], np.cumsum(fp_counts[::-1]))\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "nearby-print", "metadata": {}, "source": [ "Based on this graph, we should set our confidence threshold to around 0.5. Let's visualize the dataset in the FiftyOne App to visually check how well this confidence threshold works for this problem." ] }, { "cell_type": "code", "execution_count": 41, "id": "sustained-bristol", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.dataset = dataset" ] }, { "cell_type": "code", "execution_count": 5, "id": "continued-logan", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "sapphire-convertible", "metadata": {}, "source": [ "After browsing through the samples in the dataset, we see that a threshold of 0.5 provides enough flexibility to detect many of the dragons in the dataset without too many false positives.\n", "\n", "Now, let's create a view programmatically using this confidence threshold and reevaluate the model predictions, this time also computing COCO-style mAP." ] }, { "cell_type": "code", "execution_count": null, "id": "conceptual-salvation", "metadata": {}, "outputs": [], "source": [ "conf_thresh = 0.5\n", "high_conf_view = dataset.filter_labels(\"predictions\", F(\"confidence\") > conf_thresh)" ] }, { "cell_type": "code", "execution_count": 44, "id": "polar-pierce", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = high_conf_view" ] }, { "cell_type": "code", "execution_count": null, "id": "detected-hospital", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "code", "execution_count": null, "id": "structured-guard", "metadata": {}, "outputs": [], "source": [ "eval_key = \"eval\"" ] }, { "cell_type": "code", "execution_count": 76, "id": "detected-baptist", "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating detections...\n", " 100% |█████████████████| 101/101 [1.1s elapsed, 0s remaining, 94.7 samples/s] \n", "Performing IoU sweep...\n", " 100% |█████████████████| 101/101 [660.0ms elapsed, 0s remaining, 153.0 samples/s] \n" ] } ], "source": [ "results = high_conf_view.evaluate_detections(\n", " \"predictions\", \n", " gt_field=\"ground_truth\", \n", " eval_key=eval_key, \n", " compute_mAP=True,\n", ")" ] }, { "cell_type": "code", "execution_count": 78, "id": "purple-raleigh", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.3416714132507098\n" ] } ], "source": [ "print(results.mAP())" ] }, { "cell_type": "markdown", "id": "suspended-professional", "metadata": {}, "source": [ "### What can our failures teach us?\n", "\n", "One of the primary uses of FiftyOne is the ability to easily query and explore your dataset and model predictions for any question that comes to mind. An especially useful workflow is to explore the failure modes of your model to get a sense of how to best improve it going forward.\n", "\n", "For example, let's take a look at all of the predictions that were false positives but with high confidence, indicating that the model was fairly certain about its detection but was incorrect. These types of examples usually indicate either an ingrained issue with the model or an error in the ground truth annotations. 
Either one needs to be addressed promptly." ] }, { "cell_type": "code", "execution_count": null, "id": "dirty-quest", "metadata": {}, "outputs": [], "source": [ "high_conf_fp = high_conf_view.filter_labels(\n", " \"predictions\",\n", " (F(eval_key) == \"fp\") & (F(\"confidence\") > 0.9),\n", ")" ] }, { "cell_type": "code", "execution_count": 80, "id": "forward-reason", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = high_conf_fp" ] }, { "cell_type": "code", "execution_count": null, "id": "tamil-strike", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "initial-sellers", "metadata": {}, "source": [ "From the example above, it seems that one issue with our dataset is that we did not consistently annotate the wings of dragons. The model relatively accurately detected the dragon, but also included the wing which resulted in an IoU below the threshold used for evaluation (IoU=0.5). This detection would not necessarily be incorrect, though, so we may want to take a pass over the dataset to ensure dragon wings are consistently annotated.\n", "\n", "An easy way to reannotate this dataset is to use the integrations between FiftyOne and annotation tools like [CVAT](https://voxel51.com/docs/fiftyone/integrations/cvat.html) or [Labelbox](https://voxel51.com/docs/fiftyone/integrations/labelbox.html).\n", "\n", "Now, let's take a look at the false negatives in the dataset, where the model did not detect a ground truth object." ] }, { "cell_type": "code", "execution_count": null, "id": "binary-working", "metadata": {}, "outputs": [], "source": [ "fn_view = high_conf_view.filter_labels(\n", " \"ground_truth\",\n", " F(eval_key) == \"fn\",\n", ")" ] }, { "cell_type": "code", "execution_count": 83, "id": "confirmed-duplicate", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = fn_view" ] }, { "cell_type": "code", "execution_count": null, "id": "underlying-waste", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "prescription-ambassador", "metadata": {}, "source": [ "From the example above, we see another issue: the model localizes the bounding box incorrectly even though it did detect the presence of a dragon. The earlier point about reannotating the dataset to include dragon wings still holds; however, it would also be useful to add more training data so that the model learns to localize the boxes more accurately." ] }, { "cell_type": "code", "execution_count": 53, "id": "satellite-drive", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.dataset = dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "hearing-collaboration", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "decent-assumption", "metadata": {}, "source": [ "In the example above, we see that the model is frequently detecting non-dragon objects as dragons. The majority of the samples in this dataset contain only one or a few dragons isolated from other objects. Thus, the model seems to be learning to just detect all of the focal objects in the scene.\n", "\n", "The best way to resolve this would be to add more scenes with multiple types of objects to the dataset and to expand the classes to other object types so that the model can learn to better differentiate between dragons and other objects." ] }, { "cell_type": "markdown", "id": "color-scott", "metadata": {}, "source": [ "### Train vs Validation split\n", "\n", "You may have noticed that all of the examples above have included samples from both the training and validation splits. We can also create views to evaluate the training and validation samples separately." 
] }, { "cell_type": "code", "execution_count": null, "id": "aggregate-forum", "metadata": {}, "outputs": [], "source": [ "train_view = high_conf_view.match_tags(\"train\")\n", "val_view = high_conf_view.match_tags(\"val\")" ] }, { "cell_type": "code", "execution_count": 86, "id": "prescribed-control", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating detections...\n", " 100% |███████████████████| 93/93 [615.8ms elapsed, 0s remaining, 151.0 samples/s] \n", "Performing IoU sweep...\n", " 100% |███████████████████| 93/93 [672.7ms elapsed, 0s remaining, 138.3 samples/s] \n", "Evaluating detections...\n", " 100% |█████████████████████| 8/8 [18.8ms elapsed, 0s remaining, 425.0 samples/s] \n", "Performing IoU sweep...\n", " 100% |█████████████████████| 8/8 [23.1ms elapsed, 0s remaining, 346.2 samples/s] \n" ] } ], "source": [ "train_results = train_view.evaluate_detections(\n", " \"predictions\", \n", " gt_field=\"ground_truth\", \n", " compute_mAP=True,\n", ")\n", "val_results = val_view.evaluate_detections(\n", " \"predictions\", \n", " gt_field=\"ground_truth\", \n", " compute_mAP=True,\n", ")" ] }, { "cell_type": "code", "execution_count": 87, "id": "political-distributor", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train mAP: 0.34209299981356495\n", "Val mAP: 0.43994970925664\n" ] } ], "source": [ "print(\"Train mAP: \", train_results.mAP())\n", "print(\"Val mAP: \", val_results.mAP())" ] }, { "cell_type": "markdown", "id": "searching-shipping", "metadata": {}, "source": [ "Given the similarity in performance between the train and validation splits, it appears there may be room for further improvement in model performance by continuing to train the model for a few more epochs.\n", "\n", "This also indicates that the analysis that we have performed on the combined train and validation splits seems to hold for unseen data." 
] }, { "cell_type": "code", "execution_count": 19, "id": "charitable-partner", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(view=val_view)" ] }, { "cell_type": "code", "execution_count": null, "id": "prerequisite-portrait", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "abroad-islam", "metadata": {}, "source": [ "### Visualizing Image Augmentation with FiftyOne\n", "\n", "Building a robust model for your task is the ultimate goal of most machine learning projects. However, creating a dataset that captures all scenarios to properly train and evaluate your model is impossible. To that end, augmenting existing data to increase the number of samples in your dataset can be a cheap and effective way to train more robust models and find evaluation cases in which your model is underperforming. \n", "\n", "In this section, we evaluate the model we trained previously on augmented images of dragons and compare the performance to the unaugmented data. The augmentations are performed with the [Python library imgaug](https://github.com/aleju/imgaug). See the attached Colab notebook for code details. In summary, we regenerate all of the samples in the FiftyOne dataset using imgaug augmentations." 
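, "\n",
"\n",
"Because FiftyOne stores bounding boxes in relative `[top-left-x, top-left-y, width, height]` coordinates while imgaug works with absolute pixel corners, augmenting a sample means converting boxes in both directions. A minimal sketch of that round trip is shown here; the helper names `rel_to_abs` and `abs_to_rel` and the example image size are hypothetical and not part of either library.\n",
"\n",
"```python\n",
"def rel_to_abs(bbox, img_w, img_h):\n",
"    # Relative [tlx, tly, w, h] -> absolute pixel corners (x1, y1, x2, y2)\n",
"    tlx, tly, w, h = bbox\n",
"    return (tlx * img_w, tly * img_h, (tlx + w) * img_w, (tly + h) * img_h)\n",
"\n",
"\n",
"def abs_to_rel(x1, y1, x2, y2, img_w, img_h):\n",
"    # Absolute pixel corners -> relative [tlx, tly, w, h]\n",
"    return [x1 / img_w, y1 / img_h, (x2 - x1) / img_w, (y2 - y1) / img_h]\n",
"\n",
"\n",
"# Round trip for one box on a hypothetical 640x480 image\n",
"corners = rel_to_abs([0.25, 0.5, 0.5, 0.25], img_w=640, img_h=480)\n",
"restored = abs_to_rel(*corners, img_w=640, img_h=480)\n",
"```\n",
"\n",
"After a geometric augmentation the converted boxes may extend past the image border, so clipping them to the `[0, 1]` range before saving is worth considering."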
] }, { "cell_type": "code", "execution_count": null, "id": "bronze-young", "metadata": {}, "outputs": [], "source": [ "!pip install imgaug" ] }, { "cell_type": "code", "execution_count": 8, "id": "headed-marble", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset.first().metadata" ] }, { "cell_type": "code", "execution_count": null, "id": "valued-divorce", "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "from PIL import Image\n", "\n", "import imgaug as ia\n", "import imgaug.augmenters as iaa\n", "\n", "def augment_sample(sample, label_field, augmentations, output_dir, tags=None):\n", " # Load an existing sample,\n", " # apply given augmentations to the image and bounding boxes,\n", " # and return the augmented Sample\n", " \n", " os.makedirs(output_dir, exist_ok=True)\n", " image = Image.open(sample.filepath)\n", " if image.mode != \"RGB\":\n", " image = image.convert(\"RGB\")\n", " image = np.array(image)\n", " \n", " img_w = sample.metadata[\"width\"]\n", " img_h = sample.metadata[\"height\"]\n", " bbs = []\n", " labels = []\n", " for det in sample[label_field].detections:\n", " labels.append(det.label)\n", " tlx, tly, w, h = det.bounding_box\n", " \n", " # Convert from relative [tlx,tly,w,h] to absolute [x1,y1,x2,y2]\n", " ia_bbox = ia.BoundingBox(\n", " x1 = tlx * img_w,\n", " y1 = tly * img_h,\n", " x2 = (tlx + w) * img_w, \n", " y2 = (tly + h) * img_h,\n", " )\n", " bbs.append(ia_bbox)\n", " \n", " img_aug, bbs_aug = augmentations(images=[image], bounding_boxes=[bbs])\n", " img_aug, bbs_aug = img_aug[0], bbs_aug[0]\n", " aug_filepath = os.path.join(output_dir, sample.filename)\n", " output_img = Image.fromarray(img_aug)\n", " output_img.save(aug_filepath)\n", " aug_sample = fo.Sample(filepath=aug_filepath, tags=tags)\n", " \n", " img_h, img_w, _ = img_aug.shape\n", " dets = []\n", " for bb, label in zip(bbs_aug, 
labels):\n", " # Convert from absolute [x1,y1,x2,y2] to relative [tlx,tly,w,h]\n", " bbox = [\n", " bb.x1 / img_w,\n", " bb.y1 / img_h,\n", " (bb.x2 - bb.x1) / img_w,\n", " (bb.y2 - bb.y1) / img_h,\n", " ]\n", " det = fo.Detection(label=label, bounding_box=bbox)\n", " dets.append(det)\n", " \n", " aug_sample[label_field] = fo.Detections(detections=dets)\n", " \n", " return aug_sample" ] }, { "cell_type": "code", "execution_count": null, "id": "circular-language", "metadata": {}, "outputs": [], "source": [ "# Tag all existing samples to easily filter out augmentations in the future\n", "dataset.tag_samples(\"original\")" ] }, { "cell_type": "code", "execution_count": 42, "id": "oriented-partnership", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 100% |█████████████████| 108/108 [143.0ms elapsed, 0s remaining, 768.6 samples/s] \n" ] } ], "source": [ "tags = [\"augmented1\"]\n", "label_field = \"ground_truth\"\n", "output_dir = os.path.join(dataset_dir, \"augmented1\")\n", "augmentations = iaa.Sequential([\n", " iaa.AdditiveGaussianNoise(scale=0.05*255),\n", " iaa.Affine(translate_px={\"x\": (1, 5)})\n", "])\n", "\n", "aug_samples = []\n", "for sample in dataset.match_tags(\"original\"):\n", " aug_sample = augment_sample(\n", " sample,\n", " label_field,\n", " augmentations,\n", " output_dir,\n", " tags=tags,\n", " )\n", " aug_samples.append(aug_sample)\n", " \n", "_ = dataset.add_samples(aug_samples)" ] }, { "cell_type": "code", "execution_count": 43, "id": "dietary-butterfly", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(view=dataset.match_tags(\"augmented1\"))" ] }, { "cell_type": "code", "execution_count": null, "id": "bridal-inflation", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "ecological-browser", "metadata": {}, "source": [ "Let's also add some much more difficult augmentations (these are directly from an Imgaug example).\n" ] }, { "cell_type": "code", "execution_count": null, "id": "logical-advertiser", "metadata": {}, "outputs": [], "source": [ "sometimes = lambda aug: iaa.Sometimes(0.5, aug)\n", "\n", "augmentations2 = iaa.Sequential(\n", " [\n", " # apply the following augmenters to most images\n", " iaa.Fliplr(0.5), # horizontally flip 50% of all images\n", " iaa.Flipud(0.2), # vertically flip 20% of all images\n", " # crop images by -5% to 10% of their height/width\n", " sometimes(iaa.CropAndPad(\n", " percent=(-0.05, 0.1),\n", " pad_mode=ia.ALL,\n", " pad_cval=(0, 255)\n", " )),\n", " sometimes(iaa.Affine(\n", " scale={\"x\": (0.8, 1.2), \"y\": (0.8, 1.2)}, # scale images to 80-120% of their size, individually per axis\n", " translate_percent={\"x\": (-0.2, 0.2), \"y\": (-0.2, 0.2)}, # translate by -20 to +20 percent (per axis)\n", " rotate=(-45, 45), # rotate by -45 to +45 degrees\n", " shear=(-16, 16), # shear by -16 to +16 degrees\n", " order=[0, 1], # use nearest neighbour or bilinear interpolation (fast)\n", " cval=(0, 255), # if mode is constant, use a cval between 0 and 255\n", " mode=ia.ALL # use any of scikit-image's warping modes (see 2nd image from the top for examples)\n", " )),\n", " # execute 0 to 5 of the following (less important) augmenters per image\n", " # don't execute all of them, as that would often be way too strong\n", " iaa.SomeOf((0, 5),\n", " [\n", " sometimes(iaa.Superpixels(p_replace=(0, 1.0), n_segments=(20, 200))), # convert images into their superpixel 
representation\n", " iaa.OneOf([\n", " iaa.GaussianBlur((0, 3.0)), # blur images with a sigma between 0 and 3.0\n", " iaa.AverageBlur(k=(2, 7)), # blur image using local means with kernel sizes between 2 and 7\n", " iaa.MedianBlur(k=(3, 11)), # blur image using local medians with kernel sizes between 3 and 11\n", " ]),\n", " iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)), # sharpen images\n", " iaa.Emboss(alpha=(0, 1.0), strength=(0, 2.0)), # emboss images\n", " # search either for all edges or for directed edges,\n", " # blend the result with the original image using a blobby mask\n", " iaa.SimplexNoiseAlpha(iaa.OneOf([\n", " iaa.EdgeDetect(alpha=(0.5, 1.0)),\n", " iaa.DirectedEdgeDetect(alpha=(0.5, 1.0), direction=(0.0, 1.0)),\n", " ])),\n", " iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5), # add gaussian noise to images\n", " iaa.OneOf([\n", " iaa.Dropout((0.01, 0.1), per_channel=0.5), # randomly remove up to 10% of the pixels\n", " iaa.CoarseDropout((0.03, 0.15), size_percent=(0.02, 0.05), per_channel=0.2),\n", " ]),\n", " iaa.Invert(0.05, per_channel=True), # invert color channels\n", " iaa.Add((-10, 10), per_channel=0.5), # change brightness of images (by -10 to 10 of original value)\n", " iaa.AddToHueAndSaturation((-20, 20)), # change hue and saturation\n", " # either change the brightness of the whole image (sometimes\n", " # per channel) or change the brightness of subareas\n", " iaa.OneOf([\n", " iaa.Multiply((0.5, 1.5), per_channel=0.5),\n", " iaa.FrequencyNoiseAlpha(\n", " exponent=(-4, 0),\n", " first=iaa.Multiply((0.5, 1.5), per_channel=True),\n", " second=iaa.LinearContrast((0.5, 2.0))\n", " )\n", " ]),\n", " iaa.LinearContrast((0.5, 2.0), per_channel=0.5), # improve or worsen the contrast\n", " iaa.Grayscale(alpha=(0.0, 1.0)),\n", " sometimes(iaa.ElasticTransformation(alpha=(0.5, 3.5), sigma=0.25)), # move pixels locally around (with random strengths)\n", " sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.05))), # sometimes 
move parts of the image around\n", " sometimes(iaa.PerspectiveTransform(scale=(0.01, 0.1)))\n", " ],\n", " random_order=True\n", " )\n", " ],\n", " random_order=True\n", ")\n" ] }, { "cell_type": "code", "execution_count": null, "id": "local-sweden", "metadata": {}, "outputs": [], "source": [ "tags = [\"augmented2\"]\n", "label_field = \"ground_truth\"\n", "output_dir = os.path.join(dataset_dir, \"augmented2\")\n", "\n", "aug_samples = []\n", "for sample in dataset.match_tags(\"original\"):\n", " aug_sample = augment_sample(\n", " sample,\n", " label_field,\n", " augmentations2,\n", " output_dir,\n", " tags=tags,\n", " )\n", " aug_samples.append(aug_sample)\n", " \n", "_ = dataset.add_samples(aug_samples)\n" ] }, { "cell_type": "code", "execution_count": 47, "id": "continent-arbor", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = dataset.match_tags(\"augmented2\")" ] }, { "cell_type": "code", "execution_count": null, "id": "defined-welsh", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "floppy-expression", "metadata": {}, "source": [ "Let's rerun evaluation on the two views of augmented samples and see how the model performs." ] }, { "cell_type": "code", "execution_count": null, "id": "arbitrary-vacuum", "metadata": {}, "outputs": [], "source": [ "aug1_view = dataset.match_tags(\"augmented1\")\n", "aug2_view = dataset.match_tags(\"augmented2\")\n", "\n", "run_inference(aug1_view, model)\n", "run_inference(aug2_view, model)" ] }, { "cell_type": "code", "execution_count": null, "id": "mounted-publisher", "metadata": {}, "outputs": [], "source": [ "from fiftyone import ViewField as F\n", "\n", "# Get high confidence predictions\n", "conf_thresh = 0.5\n", "aug1_view = aug1_view.filter_labels(\"predictions\", F(\"confidence\") > conf_thresh)\n", "aug2_view = aug2_view.filter_labels(\"predictions\", F(\"confidence\") > conf_thresh)" ] }, { "cell_type": "code", "execution_count": 62, "id": "recreational-behalf", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating detections...\n", " 100% |███████████████████| 88/88 [1.0s elapsed, 0s remaining, 84.4 samples/s] \n", "Performing IoU sweep...\n", " 100% |███████████████████| 88/88 [641.6ms elapsed, 0s remaining, 137.1 samples/s] \n", "0.2596665720673561\n" ] } ], "source": [ "eval_key_1 = \"aug1_eval\"\n", "results1 = aug1_view.evaluate_detections(\n", " \"predictions\", \n", " gt_field=\"ground_truth\", \n", " eval_key=eval_key_1, \n", " compute_mAP=True,\n", ")\n", "print(results1.mAP())" ] }, { "cell_type": "code", "execution_count": 63, "id": "generic-fireplace", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": 
"stream", "text": [ "Evaluating detections...\n", " 100% |███████████████████| 80/80 [1.2s elapsed, 0s remaining, 65.0 samples/s] \n", "Performing IoU sweep...\n", " 100% |███████████████████| 80/80 [724.1ms elapsed, 0s remaining, 110.5 samples/s] \n", "0.17284998264006374\n" ] } ], "source": [ "eval_key_2 = \"aug2_eval\"\n", "results2 = aug2_view.evaluate_detections(\n", " \"predictions\", \n", " gt_field=\"ground_truth\", \n", " eval_key=eval_key_2, \n", " compute_mAP=True,\n", ")\n", "print(results2.mAP())" ] }, { "cell_type": "code", "execution_count": null, "id": "automatic-logan", "metadata": {}, "outputs": [], "source": [ "aug1_eval = (\n", " aug1_view\n", " .filter_labels(\"predictions\", (F(eval_key_1) == \"fp\") & (F(\"confidence\") > 0.7))\n", " .filter_labels(\"ground_truth\", F(eval_key_1) == \"fn\", only_matches=False) \n", ")\n", "aug2_eval = (\n", " aug2_view\n", " .filter_labels(\"predictions\", (F(eval_key_2) == \"fp\") & (F(\"confidence\") > 0.7))\n", " .filter_labels(\"ground_truth\", F(eval_key_2) == \"fn\", only_matches=False) \n", ")" ] }, { "cell_type": "code", "execution_count": 77, "id": "destroyed-terrorist", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = aug1_eval" ] }, { "cell_type": "code", "execution_count": null, "id": "polish-identification", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "reasonable-permission", "metadata": {}, "source": [ "Even with the basic augmentations, the model has a much harder time detecting dragons and ignoring other objects, especially when the objects are small." ] }, { "cell_type": "code", "execution_count": 80, "id": "upset-shower", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = aug2_eval" ] }, { "cell_type": "code", "execution_count": null, "id": "hawaiian-elder", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "extreme-conditions", "metadata": {}, "source": [ "While the overall performance on these extremely augmented samples is worse than on the unaugmented and lightly augmented samples, the model still performed surprisingly well on certain augmentations, as you can see above." ] }, { "cell_type": "code", "execution_count": 78, "id": "flush-allergy", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", " \n", "
\n", " \n", "
\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session.view = aug2_eval" ] }, { "cell_type": "code", "execution_count": null, "id": "fourth-auckland", "metadata": {}, "outputs": [], "source": [ "session.freeze()" ] }, { "cell_type": "markdown", "id": "wooden-detail", "metadata": {}, "source": [ "One interesting observation is that the model seems to struggle with rotated images, which suggests that we may want to add rotation augmentations to the training dataset. We wouldn't want to miss any dragons once we put this model out in the real world just because they are flying upside down." ] }, { "cell_type": "markdown", "id": "conceptual-flush", "metadata": {}, "source": [ "## Next Steps\n", "\n", "Based on the observations in the previous sections, we have a plan of action for producing a higher-quality dataset and a higher-performing model.\n", "\n", "* Update the dataset annotations taking into account the wing\n", "\n", "* Incorporate augmentations into the training loop\n", "\n", "* Add more difficult samples like crowds of objects\n", "\n", "* Automate and parameterize the training loop for fast retraining iterations whenever we update the data" ] }, { "cell_type": "markdown", "id": "impossible-david", "metadata": {}, "source": [ "## Summary\n", "\n", "Creating a high-performing model requires much more than just some PyTorch code. Iteratively tracking and analyzing model performance, and then using that analysis to inform dataset improvements, is necessary for a high-quality model. The combination of the model analysis capabilities of [FiftyOne](https://fiftyone.ai) with the experiment tracking capabilities of [ClearML](https://clear.ml/) results in a system that will lead to better models, faster."
] }, { "cell_type": "markdown", "id": "registered-strain", "metadata": {}, "source": [ "This walkthrough was made in collaboration between the teams at [ClearML](https://clear.ml/) and [Voxel51](https://voxel51.com)." ] } ], "metadata": { "interpreter": { "hash": "98b0a9b7b4eaaa670588a142fd0a9b87eaafe866f1db4228be72b4211d12040f" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }