{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "0", "metadata": { "id": "QHnVupBBn9eR" }, "source": [ "# Collect instance segmentation metrics using Detectron2\n", "\n", "This notebook collects instance segmentation metrics using the Detectron2 library and the COCO128 subset.\n", "\n", "No training is performed; instead, we use a pretrained model to evaluate\n", "performance. Metrics related to bounding boxes (true positives, false positives,\n", "false negatives, IoU, confidence) are collected using the\n", "`BoundingBoxMetricsCollector` class. The same class is used to collect instance\n", "segmentation metrics.\n", "\n", "![](../images/coco128-collect-seg-d2.png)\n", "\n", "\n", "\n", "The notebook illustrates:\n", "\n", "+ Metrics collection on a pretrained Detectron2 model using the COCO128 subset.\n", "+ Using `BoundingBoxMetricsCollector` for collecting object detection and segmentation metrics.\n", "+ Collecting per-sample embeddings using `EmbeddingsMetricsCollector`." ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## Project Setup" ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "PROJECT_NAME = \"3LC Tutorials - COCO128\"\n", "RUN_NAME = \"COCO128-Segmentation-Metrics-Collection\"\n", "DESCRIPTION = \"Collect segmentation metrics for COCO128\"\n", "TRAIN_DATASET_NAME = \"COCO128-seg\"\n", "TMP_PATH = \"../../transient_data\"\n", "DATA_PATH = \"../../data\"\n", "DETECTRON2_MODEL_CONFIG = \"COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml\"\n", "MAX_DETECTIONS_PER_IMAGE = 30\n", "SCORE_THRESH_TEST = 0.2\n", "MASK_FORMAT = \"bitmask\"\n", "INSTALL_DEPENDENCIES = True" ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "if INSTALL_DEPENDENCIES:\n", "    # NOTE: detectron2 has no suitable prebuilt wheels for Python 3.10+, so we install from source.\n", "    # This requires a working
C++ compiler and may take a few minutes.\n", "    # See https://detectron2.readthedocs.io/en/latest/tutorials/install.html for details.\n", "    %env CC=gcc-11\n", "    %env CXX=g++-11\n", "    %pip install -q setuptools wheel ninja\n", "    %pip install -q --no-build-isolation git+https://github.com/facebookresearch/detectron2.git\n", "    %pip install -q 3lc[pacmap]\n", "    %pip install -q opencv-python\n", "    %pip install -q matplotlib" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": { "id": "ZyAvNCJMmvFF" }, "outputs": [], "source": [ "import random\n", "\n", "import cv2\n", "import matplotlib.pyplot as plt\n", "import tlc\n", "from detectron2 import model_zoo\n", "from detectron2.config import get_cfg\n", "from detectron2.data import DatasetCatalog, MetadataCatalog\n", "from detectron2.utils.logger import setup_logger\n", "from detectron2.utils.visualizer import Visualizer\n", "\n", "logger = setup_logger()\n", "logger.setLevel(\"ERROR\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6", "metadata": { "id": "tjbUIhSxUdm_" }, "source": [ "## Prepare the dataset\n", "\n", "A small subset of the [COCO dataset](https://github.com/3lc-ai/3lc-examples/tree/main/data/coco128), in the standard COCO format, is available in the `./data/coco128` directory.\n", "\n", "It is included when you clone our [repository](https://github.com/3lc-ai/3lc-examples/)."
] }, { "cell_type": "code", "execution_count": null, "id": "7", "metadata": {}, "outputs": [], "source": [ "train_json_path = tlc.Url(DATA_PATH + \"/coco128/annotations.json\").to_absolute()\n", "train_image_folder = tlc.Url(DATA_PATH + \"/coco128/images\").to_absolute()\n", "\n", "assert train_json_path.exists(), \"JSON file does not exist!\"\n", "assert train_image_folder.exists(), \"Image folder does not exist!\"" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Register the dataset with 3LC\n", "\n", "Now that we have the dataset in the COCO format, we can register it with 3LC." ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "from tlc.integration.detectron2 import register_coco_instances\n", "\n", "register_coco_instances(\n", "    TRAIN_DATASET_NAME,\n", "    {},\n", "    train_json_path.to_str(),\n", "    train_image_folder.to_str(),\n", "    project_name=PROJECT_NAME,\n", "    task=\"segment\",\n", "    mask_format=MASK_FORMAT,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "10", "metadata": { "tags": [] }, "outputs": [], "source": [ "# The detectron2 dataset dicts and dataset metadata can be read from the DatasetCatalog and\n", "# MetadataCatalog, respectively.\n", "dataset_metadata = MetadataCatalog.get(TRAIN_DATASET_NAME)\n", "dataset_dicts = DatasetCatalog.get(TRAIN_DATASET_NAME)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "11", "metadata": { "id": "6ljbWTX0Wi8E" }, "source": [ "To verify that the dataset is in the correct format, let's visualize the annotations of three randomly selected samples from the training set:" ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "UkNbUzUOLYf0", "outputId": "4f5ed932-624a-4ede-9d5b-22371569fe1d" }, "outputs": [], "source": [ "import numpy as np\n", "from detectron2.utils.file_io import PathManager\n", "\n", "for d
in random.sample(dataset_dicts, 3):\n", "    filename = tlc.Url(d[\"file_name\"]).to_absolute().to_str()\n", "    if \"s3://\" in filename:\n", "        # Remote images are read as raw bytes and decoded with OpenCV.\n", "        with PathManager.open(filename, \"rb\") as f:\n", "            img = np.asarray(bytearray(f.read()), dtype=\"uint8\")\n", "            img = cv2.imdecode(img, cv2.IMREAD_COLOR)\n", "    else:\n", "        img = cv2.imread(filename)\n", "    # OpenCV loads images as BGR; the Visualizer expects RGB.\n", "    visualizer = Visualizer(img[:, :, ::-1], metadata=dataset_metadata, scale=0.5)\n", "    out = visualizer.draw_dataset_dict(d)\n", "    # get_image() returns an RGB array, which is what matplotlib expects.\n", "    plt.imshow(out.get_image())\n", "    plt.title(filename.split(\"/\")[-1])\n", "    plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "13", "metadata": { "id": "wlqXIXXhW8dA" }, "source": [ "## Start a 3LC Run and collect bounding box evaluation metrics\n" ] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "run = tlc.init(\n", "    PROJECT_NAME,\n", "    run_name=RUN_NAME,\n", "    description=DESCRIPTION,\n", "    if_exists=\"overwrite\",\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "15", "metadata": {}, "outputs": [], "source": [ "cfg = get_cfg()\n", "\n", "cfg.merge_from_file(model_zoo.get_config_file(DETECTRON2_MODEL_CONFIG))\n", "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(DETECTRON2_MODEL_CONFIG)\n", "cfg.DATASETS.TRAIN = (TRAIN_DATASET_NAME,)\n", "cfg.OUTPUT_DIR = TMP_PATH\n", "cfg.DATALOADER.NUM_WORKERS = 0\n", "cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512\n", "cfg.MODEL.ROI_HEADS.NUM_CLASSES = 80\n", "cfg.TEST.DETECTIONS_PER_IMAGE = MAX_DETECTIONS_PER_IMAGE\n", "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = SCORE_THRESH_TEST\n", "cfg.MODEL.DEVICE = \"cuda\"\n", "cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS = False\n", "cfg.INPUT.MASK_FORMAT = MASK_FORMAT\n", "\n", "config = {\n", "    \"model_config\": DETECTRON2_MODEL_CONFIG,\n", "    \"test.detections_per_image\": MAX_DETECTIONS_PER_IMAGE,\n", "    \"model.roi_heads.score_thresh_test\":
SCORE_THRESH_TEST,\n", "    \"input.mask_format\": MASK_FORMAT,\n", "}\n", "\n", "run.set_parameters(config)" ] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7unkuuiqLdqd", "outputId": "ba1716cd-3f3b-401d-bae5-8fbbd2199d9c" }, "outputs": [], "source": [ "from detectron2.engine import DefaultTrainer\n", "from tlc.integration.detectron2 import MetricsCollectionHook\n", "\n", "trainer = DefaultTrainer(cfg)\n", "\n", "# Collect embeddings from a single layer of the model\n", "layer_index = 138  # Index of the layer to collect embeddings from\n", "embeddings_metrics_collector = tlc.EmbeddingsMetricsCollector(layers=[layer_index])\n", "\n", "predictor = tlc.Predictor(trainer.model, layers=[layer_index])\n", "\n", "bounding_box_metrics_collector = tlc.BoundingBoxMetricsCollector(\n", "    classes=dataset_metadata.thing_classes,\n", "    label_mapping=dataset_metadata.thing_dataset_id_to_contiguous_id,\n", "    save_segmentations=True,\n", ")\n", "\n", "metrics_collection_hook = MetricsCollectionHook(\n", "    dataset_name=TRAIN_DATASET_NAME,\n", "    metrics_collectors=[bounding_box_metrics_collector, embeddings_metrics_collector],\n", "    collect_metrics_before_train=True,\n", "    predictor=predictor,  # Required for collecting embeddings\n", ")\n", "\n", "trainer.register_hooks([metrics_collection_hook])\n", "trainer.resume_or_load(resume=False)\n", "trainer.before_train()" ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": {}, "outputs": [], "source": [ "# URL of the latest revision of the table registered for this dataset\n", "table_url = tlc.Table.from_url(dataset_metadata.get(\"latest_tlc_table_url\")).url\n", "\n", "url_mapping = run.reduce_embeddings_by_foreign_table_url(\n", "    table_url,\n", "    method=\"pacmap\",\n", "    n_components=3,\n", "    n_neighbors=5,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "18", "metadata": {}, "outputs": [], "source": [ "run.url" ] } ], "metadata": { "accelerator":
"GPU", "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 5 }