{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Improving models by fixing object detection labels\n", "\n", "Label quality is one of the major factors that affects your model's performance. 3LC provides valuable insights into your label quality and singles out problematic labels. By fixing the label errors and retraining on the fixed dataset, you have a great chance of improving your model.\n", "\n", "![](../images/hardhat/figure_22.png)\n", "\n", "\n", "\n", "In this tutorial, we will cover the following topics:\n", "\n", "- Installing 3LC-integrated Ultralytics YOLO\n", "- Running Ultralytics YOLO training and collecting metrics \n", "- Opening a Project/Run/Table in the 3LC Dashboard\n", "- Finding and fixing label errors in the 3LC Dashboard\n", "- Comparing original and retrained models' performance\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installing 3LC-integrated Ultralytics YOLO\n", "\n", "The 3LC integration with Ultralytics YOLO is distributed separately from 3LC. To get started and learn more, check out the [GitHub repository](https://github.com/3lc-ai/3lc-ultralytics).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Running Ultralytics YOLO training and collecting metrics\n", "\n", "In order to run training with the integration, instantiate a model via `TLCYOLO` instead of `YOLO` and call the method `.train()` just like you are used to. Here is a simple example, which shows how to specify 3LC settings.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from tlc_ultralytics import YOLO, Settings\n", "\n", "# Set 3LC specific settings\n", "settings = Settings(\n", " project_name=\"hardhat-project\",\n", " run_name=\"base-run\",\n", " run_description=\"base run for BBs editing\",\n", " image_embeddings_dim=2,\n", " collection_epoch_start=4,\n", " collection_epoch_interval=5,\n", " conf_thres=0.4,\n", ")\n", "\n", "# Initialize and run training\n", "model = YOLO(\"yolov8n.pt\")\n", "model.train(data=\"hardhat.yaml\", epochs=20, settings=settings)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The **Runs** and **Tables** will be stored in the project folder \"hardhat-project\". For your first run, 3LC creates the tables for the training and validation sets provided through the data argument. For later runs, it will use the existing tables. Once you create new data **revisions** (we will cover this in the next section), the code will automatically pick up the latest revisions for training and metrics collection.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Opening a Project/Run/Table in the Dashboard\n", "\n", "In this section, we will get you familiarized with the 3LC **Dashboard**. The project \"hardhat-demo\" in the distributed public examples will be used throughout this tutorial.\n", "\n", "To access the 3LC Dashboard, first start the **3LC Service** in a terminal.\n", "\n", "```sh\n", "3lc service\n", "```\n", "\n", "Then, you can open the Dashboard in a browser at [dashboard.3lc.ai](https://dashboard.3lc.ai). The Dashboard consists of three panels – **Filters** (left), **Charts** (upper right), and **Rows** (lower right) panels - across various pages such as Runs or Tables page.\n", "\n", "In the Dashboard homepage (Projects page), double click on the project \"hardhat-demo\" in the project list. 
"You will see a list of the 3 Runs in the \"hardhat-demo\" project.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 1](../images/hardhat/figure_1.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click any one of them to open it and view the collected metrics.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finding and fixing label errors in the Dashboard\n", "\n", "We will use the Run \"base-run\", which was trained on the original training set, to demonstrate how to find potential label errors, including missing labels and inaccurate bounding boxes (BBs), using the model's metrics.\n", "\n", "Double-click the Run \"base-run\" to open it. Each displayed row represents the metrics collected for a single sample. Each column has a filter widget in the **Filters** panel.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 2](../images/hardhat/figure_2.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's first create a chart showing an image overlaid with both its ground truth and predicted BBs. To make the chart, select the `Image` column, **Ctrl + LeftClick** the `BBS` and `BBS_predicted` columns, and then press **2**. In the chart, solid boxes are ground truth labels and dashed boxes are the model's predictions. We will use this chart to visualize the filtered-in BBs after applying the filters.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 3](../images/hardhat/figure_3.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this demo, we will focus only on the training set. Therefore, filter on the \"hardhat-train\" table in the **Foreign Table** filter to follow along.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Find missing labels\n", "\n", "Adjust the `IOU` filter to the range [0.0, 0.2]. With this filter applied, only predicted BBs that have very little or no overlap with ground truth BBs remain visible. These filtered-in predictions could potentially correspond to missing labels.\n", "\n", "Under the `label_predicted` filter, the first number next to each class name is the count of filtered-in BBs, while the second is the total in the unfiltered dataset. That is, 717 helmet and 538 head BBs (1255 in total) satisfy the applied filters, and they appear in 868 samples (out of 7035 in total), as indicated at the top of the **METRICS** panel.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 4](../images/hardhat/figure_4.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then play with the `confidence` filter together with `IOU`. In general, highly confident false positive (FP) predictions have a higher chance of being missing labels rather than real FPs, while low-confidence FP predictions may be a mixed bag of missing labels and real FPs. Some manual checking shows that this dataset follows this trend, as summarized in the table below.\n", "\n", "| Confidence | % of filtered-in BBs | Missing labels / filtered-in BBs |\n", "|------------|----------------------|----------------------------------|\n", "| >0.6 | 40% | 80% |\n", "| <0.6 | 60% | 60% |\n", "\n", "Therefore, we can add the high-confidence, low-IOU predictions to the ground truth dataset. Filter `confidence` to >0.6, right-click `BBS_PREDICTED` inside the chart, and click \"Add 473 predictions (369 rows)...\".\n",
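"\n", "As an aside, the heuristic we just applied in the Dashboard amounts to keeping predictions that barely overlap any label but that the model is confident about. The plain-Python sketch below illustrates that selection logic with made-up boxes and thresholds; it is not the 3LC API, which does this filtering for you in the Dashboard.\n", "\n", "```python\n", "# Illustrative sketch of the missing-label heuristic (hypothetical data, not the 3LC API)\n", "def iou(box_a, box_b):\n", "    # Intersection over union of two (x1, y1, x2, y2) boxes\n", "    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])\n", "    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])\n", "    inter = max(0, x2 - x1) * max(0, y2 - y1)\n", "    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])\n", "    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])\n", "    return inter / (area_a + area_b - inter) if inter else 0.0\n", "\n", "ground_truth = [(10, 10, 50, 50)]\n", "predictions = [\n", "    {\"box\": (12, 11, 49, 52), \"confidence\": 0.9},   # overlaps an existing label\n", "    {\"box\": (120, 30, 160, 80), \"confidence\": 0.8},  # likely a missing label\n", "    {\"box\": (200, 5, 215, 20), \"confidence\": 0.3},   # low confidence: review manually\n", "]\n", "\n", "candidates = [\n", "    p for p in predictions\n", "    if max((iou(p[\"box\"], gt) for gt in ground_truth), default=0.0) < 0.2\n", "    and p[\"confidence\"] > 0.6\n", "]\n", "print(candidates)  # only the 0.8-confidence box far from any label remains\n", "```\n", "\n",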
"By doing this batch assignment in the Dashboard, we also added roughly 20% real FPs to the labels, so we may want to quickly scan through the newly added labels and remove the unwanted ones (i.e., real FPs).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 5](../images/hardhat/figure_5.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we want to manually go through the BBs with `confidence`<0.6, since a sizeable portion of them are real FPs that we don't want to add to the labels. To do this, filter `confidence` to <0.6, step through each sample, and add missing labels or skip real FPs at your discretion. To add an individual prediction, right-click the predicted BB and then click \"Add prediction\".\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 6](../images/hardhat/figure_6.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Find inaccurate BBs\n", "\n", "Inaccurate BBs are those that differ substantially in size or location from their predicted counterparts (i.e., have lower IOU), assuming that the predictions are more accurate. These inaccurate BBs can not only undermine the model's performance, but also skew TP/FP/FN counts when their IOUs fall near the IOU threshold used in the TP/FP/FN calculations.\n", "\n", "To find the inaccurate BBs, first clear all existing filters (icon at the top of the **Filters** panel) to start fresh, then filter on \"hardhat-train\" in **Foreign Table** (again) and set `IOU` to the range [0.4, 0.8]. A total of 8100 BBs in 3672 images are filtered in.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 7](../images/hardhat/figure_7.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For these inaccurate BBs, we want to replace the existing labels with the model's predictions. To do that, we first need to set \"Max IOU\" under `BBS` in the chart from its default of 1 to 0.4 (the low end of the previous IOU filter range), as shown in the figure below. This parameter sets the IOU threshold: any existing BB with IOU>0.4 against a predicted BB will be replaced by that prediction when it is added as the new label.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 8](../images/hardhat/figure_8.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, we can do the batch assignment by clicking \"Add 8100 predictions (3672 row...\" under `BBS_PREDICTED` and then clicking OK in the popup dialog box. Notice that in the displayed sample the new labels (identical to the predictions) have replaced the old ones.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 9](../images/hardhat/figure_9.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we would like to save all edits as a new revision. Click the pen icon in the upper right and click \"Commit\" to create a new revision, which can be used directly for retraining later.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 10](../images/hardhat/figure_10.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you may need to iterate the label-editing and retraining workflow a few times to reach optimal results. In fact, we made several revisions while preparing this \"hardhat-demo\" project.\n",
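"\n", "Each iteration then amounts to committing a revision in the Dashboard and re-running the training snippet from earlier; as noted above, the integration automatically picks up the latest revision of the tables. A minimal sketch of such a retraining run is shown below (the run name is illustrative):\n", "\n", "```python\n", "from tlc_ultralytics import YOLO, Settings\n", "\n", "# Retrain after committing a new revision; the latest table revisions are used automatically\n", "settings = Settings(\n", "    project_name=\"hardhat-project\",\n", "    run_name=\"retrain-after-edits\",  # illustrative run name\n", "    image_embeddings_dim=2,\n", "    collection_epoch_start=4,\n", "    collection_epoch_interval=5,\n", "    conf_thres=0.4,\n", ")\n", "\n", "model = YOLO(\"yolov8n.pt\")\n", "model.train(data=\"hardhat.yaml\", epochs=20, settings=settings)\n", "```\n", "\n",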
"Some intermediate Runs and data revisions are not included in the project for the sake of simplicity.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 11](../images/hardhat/figure_11.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Comparing original and retrained models' performance\n", "\n", "After a few iterations of label editing and retraining, we end up with a solid revision. Now we can run training on the newest revision (\"retrained-run\") and on the original data (\"original-run\") to see whether data editing improves the model. From the metrics charts below, you can see that \"retrained-run\" is better than \"original-run\" across the board. Note that the metrics for both Runs are collected against the same revision of the val set rather than the original val set.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 12](../images/hardhat/figure_12.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are the F1 scores for the two Runs.\n", "\n", "| Run | F1 |\n", "|---------------|-------|\n", "| Original-run | 0.941 |\n", "| Retrained-run | 0.959 |\n", "\n", "Finally, let's look at some samples to see if and how the model improves with the revised training data. In the image pairs below, the left image is from \"original-run\" and the right is from \"retrained-run\". It is evident that the retrained model is indeed much better than the original model.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 13](../images/hardhat/figure_13.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 14](../images/hardhat/figure_14.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 15](../images/hardhat/figure_15.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 16](../images/hardhat/figure_16.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 17](../images/hardhat/figure_17.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 18](../images/hardhat/figure_18.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 19](../images/hardhat/figure_19.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 20](../images/hardhat/figure_20.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Feel free to further compare the two Runs' results on your own. Click on \"original-run\" and **Ctrl + double-click** on \"retrained-run\" to open both Runs in the same session. Note that comparing these Runs to \"base-run\" will split the metrics into two different tabs, which is not ideal for comparison. This is because \"base-run\" used a different model and different metrics-collection settings.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 21](../images/hardhat/figure_21.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On the Runs page (figure below), \"2 Runs\" is indicated in the upper left and in the `RUN` filter. To show the `RUN` column in the metrics table, toggle `RUN` in the dropdown menu opened by clicking the wrench icon on the toolbar of the `METRICS` panel. The `RUN` column makes it easy to compare metrics for the same sample between Runs. You can also sort the `Example_ID` column so that rows for the same sample sit next to each other, which makes them convenient to compare across Runs.\n",
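"\n", "If you also want to inspect this kind of per-sample comparison outside the Dashboard, the idea is simply a join on the example id. The sketch below uses pandas with hypothetical per-Run metric tables and column names; it is illustrative only and not tied to the 3LC API.\n", "\n", "```python\n", "# Illustrative only: hypothetical per-sample metrics for two Runs, joined on example_id\n", "import pandas as pd\n", "\n", "original = pd.DataFrame({\"example_id\": [0, 1, 2], \"iou\": [0.55, 0.80, 0.62]})\n", "retrained = pd.DataFrame({\"example_id\": [0, 1, 2], \"iou\": [0.71, 0.83, 0.77]})\n", "\n", "comparison = original.merge(retrained, on=\"example_id\", suffixes=(\"_original\", \"_retrained\"))\n", "comparison[\"iou_delta\"] = comparison[\"iou_retrained\"] - comparison[\"iou_original\"]\n", "print(comparison.sort_values(\"iou_delta\", ascending=False))\n", "```\n", "\n",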
Filter on the \"hardhat-val\" to compare the val set only. The `confidence`-`IOU` scatter chart in the figure below shows \"retrained-run\" has more clustered predictions with high confidence and high IOU, which aligns with other observations. To get this plot, select `Confidence`, `IOU`, and `Run` columns in sequence and press **2**.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Figure 22](../images/hardhat/figure_22.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "In this tutorial, we covered several topics including installing 3LC-integrated Ultralytics YOLO, training YOLOv8 models and collecting metrics, exploring 3LC Dashboard, finding and fixing label errors, and analyzing Runs to see model's improvements. We demonstrated that models can be improved by fixing label errors.\n" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }