{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "0", "metadata": { "id": "QHnVupBBn9eR" }, "source": [ "# Collect Bounding Boxes using Detectron2\n", "\n", "This notebook collects object detection metrics using the Detectron2 library and the COCO128 subset.\n", "\n", "No training is performed; instead, we use a pretrained model to evaluate performance. Metrics related to bounding boxes\n", "(true positives, false positives, false negatives, IoU, confidence) are collected using the `BoundingBoxMetricsCollector`\n", "class.\n", "\n", "![](../images/coco128-collect-d2.png)\n", "\n", "\n", "\n", "The notebook illustrates:\n", "\n", "+ Metrics collection on a pretrained Detectron2 model using the COCO128 subset.\n", "+ Using `BoundingBoxMetricsCollector` for collecting object detection metrics.\n", "+ Collecting per-sample embeddings using `EmbeddingsMetricsCollector`." ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## Project Setup" ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "PROJECT_NAME = \"3LC Tutorials - COCO128\"\n", "RUN_NAME = \"COCO128-Metrics-Collection\"\n", "DESCRIPTION = \"Collect bounding box metrics for COCO128\"\n", "TRAIN_DATASET_NAME = \"COCO128\"\n", "DOWNLOAD_PATH = \"../../transient_data\"\n", "DATA_PATH = \"../../data\"\n", "MODEL_CONFIG = \"COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml\"\n", "MAX_DETECTIONS_PER_IMAGE = 30\n", "SCORE_THRESH_TEST = 0.2" ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "# NOTE: There is no single version of detectron2 that is appropriate for all users and all systems.\n", "# This notebook uses a particular prebuilt version of detectron2 that is only available for\n", "# Linux and for specific versions of torch, torchvision, and CUDA. It may not be appropriate\n", "# for your system. 
See https://detectron2.readthedocs.io/en/latest/tutorials/install.html for\n", "# instructions on how to install or build a version of detectron2 for your system.\n", "%pip install --force-reinstall torch==1.10.1+cu111 torchvision==0.11.2+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html\n", "%pip install detectron2 -f \"https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.10/index.html\"\n", "%pip install 3lc[pacmap]\n", "%pip install opencv-python\n", "%pip install matplotlib\n", "%pip install numpy==1.24.4" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": { "id": "ZyAvNCJMmvFF" }, "outputs": [], "source": [ "import random\n", "\n", "import cv2\n", "import matplotlib.pyplot as plt\n", "import tlc\n", "from detectron2 import model_zoo\n", "from detectron2.config import get_cfg\n", "from detectron2.data import DatasetCatalog, MetadataCatalog\n", "from detectron2.utils.logger import setup_logger\n", "from detectron2.utils.visualizer import Visualizer\n", "\n", "logger = setup_logger()\n", "logger.setLevel(\"ERROR\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6", "metadata": { "id": "tjbUIhSxUdm_" }, "source": [ "## Prepare the dataset\n", "\n", "A small subset of the [COCO dataset](https://github.com/3lc-ai/3lc-examples/tree/main/data/coco128) (in the standard COCO format) is available in the `./data/coco128` directory.\n", "\n", "It is included when you clone our [repository](https://github.com/3lc-ai/3lc-examples/)." 
] }, { "cell_type": "code", "execution_count": null, "id": "7", "metadata": {}, "outputs": [], "source": [ "train_json_path = tlc.Url(DATA_PATH + \"/coco128/annotations.json\").to_absolute()\n", "train_image_folder = tlc.Url(DATA_PATH + \"/coco128/images\").to_absolute()\n", "\n", "assert train_json_path.exists(), \"JSON file does not exist!\"\n", "assert train_image_folder.exists(), \"Image folder does not exist!\"" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Register the dataset with 3LC\n", "\n", "Now that we have the dataset in the COCO format, we can register it with 3LC." ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "from tlc.integration.detectron2 import register_coco_instances\n", "\n", "register_coco_instances(\n", " TRAIN_DATASET_NAME,\n", " {},\n", " train_json_path.to_str(),\n", " train_image_folder.to_str(),\n", " project_name=PROJECT_NAME,\n", " keep_crowd_annotations=False,\n", ")" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "The detectron2 dataset dicts and dataset metadata can be read from the `DatasetCatalog` and\n", "`MetadataCatalog`." 
] }, { "cell_type": "code", "execution_count": null, "id": "11", "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset_metadata = MetadataCatalog.get(TRAIN_DATASET_NAME)\n", "dataset_dicts = DatasetCatalog.get(TRAIN_DATASET_NAME)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "12", "metadata": { "id": "6ljbWTX0Wi8E" }, "source": [ "To verify that the dataset is in the correct format, let's visualize the annotations of randomly selected samples in the training set:" ] }, { "cell_type": "code", "execution_count": null, "id": "13", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "UkNbUzUOLYf0", "outputId": "4f5ed932-624a-4ede-9d5b-22371569fe1d" }, "outputs": [], "source": [ "import numpy as np\n", "from detectron2.utils.file_io import PathManager\n", "\n", "for d in random.sample(dataset_dicts, 3):\n", "    filename = tlc.Url(d[\"file_name\"]).to_absolute().to_str()\n", "    if \"s3://\" in filename:\n", "        # Remote images are read as raw bytes and decoded in memory\n", "        with PathManager.open(filename, \"rb\") as f:\n", "            img = np.asarray(bytearray(f.read()), dtype=\"uint8\")\n", "            img = cv2.imdecode(img, cv2.IMREAD_COLOR)\n", "    else:\n", "        img = cv2.imread(filename)\n", "    # OpenCV loads images as BGR; Visualizer expects RGB\n", "    visualizer = Visualizer(img[:, :, ::-1], metadata=dataset_metadata, scale=0.5)\n", "    out = visualizer.draw_dataset_dict(d)\n", "    plt.imshow(out.get_image())  # get_image() already returns RGB\n", "    plt.title(filename.split(\"/\")[-1])\n", "    plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "14", "metadata": { "id": "wlqXIXXhW8dA" }, "source": [ "## Start a 3LC Run and collect bounding box evaluation metrics\n" ] }, { "cell_type": "code", "execution_count": null, "id": "15", "metadata": {}, "outputs": [], "source": [ "run = tlc.init(\n", " PROJECT_NAME,\n", " run_name=RUN_NAME,\n", " description=DESCRIPTION,\n", " if_exists=\"overwrite\",\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": {}, "outputs": [], "source": [ "cfg = 
get_cfg()\n", "\n", "cfg.merge_from_file(model_zoo.get_config_file(MODEL_CONFIG))\n", "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MODEL_CONFIG)\n", "cfg.DATASETS.TRAIN = (TRAIN_DATASET_NAME,)\n", "cfg.OUTPUT_DIR = DOWNLOAD_PATH\n", "cfg.DATALOADER.NUM_WORKERS = 0\n", "cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512\n", "cfg.MODEL.ROI_HEADS.NUM_CLASSES = 80\n", "cfg.TEST.DETECTIONS_PER_IMAGE = MAX_DETECTIONS_PER_IMAGE\n", "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = SCORE_THRESH_TEST\n", "cfg.MODEL.DEVICE = \"cuda\"\n", "cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS = False\n", "\n", "config = {\n", " \"model_config\": MODEL_CONFIG,\n", " \"test.detections_per_image\": MAX_DETECTIONS_PER_IMAGE,\n", " \"model.roi_heads.score_thresh_test\": SCORE_THRESH_TEST,\n", "}\n", "\n", "run.set_parameters(config)" ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7unkuuiqLdqd", "outputId": "ba1716cd-3f3b-401d-bae5-8fbbd2199d9c" }, "outputs": [], "source": [ "from detectron2.engine import DefaultTrainer\n", "from tlc.integration.detectron2 import MetricsCollectionHook\n", "\n", "trainer = DefaultTrainer(cfg)\n", "\n", "# Define Embeddings metrics\n", "layer_index = 138 # Index of the layer to collect embeddings from\n", "embeddings_metrics_collector = tlc.EmbeddingsMetricsCollector(layers=[layer_index])\n", "\n", "predictor = tlc.Predictor(trainer.model, layers=[layer_index])\n", "\n", "bounding_box_metrics_collector = tlc.BoundingBoxMetricsCollector(\n", " classes=dataset_metadata.thing_classes,\n", " label_mapping=dataset_metadata.thing_dataset_id_to_contiguous_id,\n", ")\n", "\n", "metrics_collection_hook = MetricsCollectionHook(\n", " dataset_name=TRAIN_DATASET_NAME,\n", " metrics_collectors=[bounding_box_metrics_collector, embeddings_metrics_collector],\n", " collect_metrics_before_train=True,\n", " predictor=predictor, # Needs to be used for embeddings metrics\n", ")\n", "\n", 
"trainer.register_hooks([metrics_collection_hook])\n", "trainer.resume_or_load(resume=False)\n", "trainer.before_train()" ] }, { "cell_type": "code", "execution_count": null, "id": "18", "metadata": {}, "outputs": [], "source": [ "train_table = tlc.Table.from_url(\n", " dataset_metadata.get(\"latest_tlc_table_url\")\n", ").url # Get the last revision of the val table\n", "\n", "url_mapping = run.reduce_embeddings_by_foreign_table_url(\n", " train_table,\n", " method=\"pacmap\",\n", " n_components=3,\n", " n_neighbors=5,\n", ")" ] } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 5 }