{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "0",
   "metadata": {
    "id": "QHnVupBBn9eR"
   },
   "source": [
    "# Collect Bounding Boxes using Detectron2\n",
    "\n",
    "This notebook collects object detection metrics using the Detectron2 library and the COCO128 subset.\n",
    "\n",
    "No training is performed; instead, we use a pretrained model to evaluate performance. Metrics related to bounding boxes\n",
    "(true positives, false positives, false negatives, iou, confidence) are collected using the BoundingBoxMetricsCollector\n",
    "class.\n",
    "\n",
    "![](../images/coco128-collect-d2.png)\n",
    "\n",
    "<!-- Tags: [\"detectron2\", \"object-detection\", \"coco\"] -->\n",
    "\n",
    "The notebook illustrates:\n",
    "\n",
    "+ Metrics collection on a pretrained Detectron2 model using the COCO128 subset.\n",
    "+ Using `BoundingBoxMetricsCollector` for collecting object detection metrics.\n",
    "+ Collection per-sample embeddings using `EmbeddingsMetricsCollector`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1",
   "metadata": {},
   "source": [
    "## Project Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2",
   "metadata": {
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "PROJECT_NAME = \"3LC Tutorials - COCO128\"\n",
    "RUN_NAME = \"COCO128-Metrics-Collection\"\n",
    "DESCRIPTION = \"Collect bounding box metrics for COCO128\"\n",
    "TRAIN_DATASET_NAME = \"COCO128\"\n",
    "DOWNLOAD_PATH = \"../../transient_data\"\n",
    "DATA_PATH = \"../../data\"\n",
    "MODEL_CONFIG = \"COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml\"\n",
    "MAX_DETECTIONS_PER_IMAGE = 30\n",
    "SCORE_THRESH_TEST = 0.2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# NOTE: There is no single version of detectron2 that is appropriate for all users and all systems.\n",
    "#       This notebook uses a particular prebuilt version of detectron2 that is only available for\n",
    "#       Linux and for specific versions of torch, torchvision, and CUDA. It may not be appropriate\n",
    "#       for your system. See https://detectron2.readthedocs.io/en/latest/tutorials/install.html for\n",
    "#       instructions on how to install or build a version of detectron2 for your system.\n",
    "%pip install --force-reinstall torch==1.10.1+cu111 torchvision==0.11.2+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html\n",
    "%pip install detectron2 -f \"https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.10/index.html\"\n",
    "%pip install 3lc[pacmap]\n",
    "%pip install opencv-python\n",
    "%pip install matplotlib\n",
    "%pip install numpy==1.24.4"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {
    "id": "ZyAvNCJMmvFF"
   },
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "import cv2\n",
    "import matplotlib.pyplot as plt\n",
    "import tlc\n",
    "from detectron2 import model_zoo\n",
    "from detectron2.config import get_cfg\n",
    "from detectron2.data import DatasetCatalog, MetadataCatalog\n",
    "from detectron2.utils.logger import setup_logger\n",
    "from detectron2.utils.visualizer import Visualizer\n",
    "\n",
    "logger = setup_logger()\n",
    "logger.setLevel(\"ERROR\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6",
   "metadata": {
    "id": "tjbUIhSxUdm_"
   },
   "source": [
    "## Prepare the dataset\n",
    "\n",
    "A small subset of the [COCO dataset](https://github.com/3lc-ai/3lc-examples/tree/main/data/coco128) (in the COCO standard format) is available in the `./data/coco128` directory.\n",
    "\n",
    "It is provided while cloning our [repository](https://github.com/3lc-ai/3lc-examples/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7",
   "metadata": {},
   "outputs": [],
   "source": [
    "train_json_path = tlc.Url(DATA_PATH + \"/coco128/annotations.json\").to_absolute()\n",
    "train_image_folder = tlc.Url(DATA_PATH + \"/coco128/images\").to_absolute()\n",
    "\n",
    "assert train_json_path.exists(), \"JSON file does not exist!\"\n",
    "assert train_image_folder.exists(), \"Image folder does not exist!\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "## Register the dataset with 3LC\n",
    "\n",
    "Now that we have the dataset in the COCO format, we can register it with 3LC."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tlc.integration.detectron2 import register_coco_instances\n",
    "\n",
    "register_coco_instances(\n",
    "    TRAIN_DATASET_NAME,\n",
    "    {},\n",
    "    train_json_path.to_str(),\n",
    "    train_image_folder.to_str(),\n",
    "    project_name=PROJECT_NAME,\n",
    "    keep_crowd_annotations=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "The detectron2 dataset dicts and dataset metadata can be read from the `DatasetCatalog` and\n",
    "`MetadataCatalog`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "dataset_metadata = MetadataCatalog.get(TRAIN_DATASET_NAME)\n",
    "dataset_dicts = DatasetCatalog.get(TRAIN_DATASET_NAME)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "12",
   "metadata": {
    "id": "6ljbWTX0Wi8E"
   },
   "source": [
    "To verify the dataset is in correct format, let's visualize the annotations of randomly selected samples in the training set:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 1000
    },
    "id": "UkNbUzUOLYf0",
    "outputId": "4f5ed932-624a-4ede-9d5b-22371569fe1d"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from detectron2.utils.file_io import PathManager\n",
    "\n",
    "for d in random.sample(dataset_dicts, 3):\n",
    "    filename = tlc.Url(d[\"file_name\"]).to_absolute().to_str()\n",
    "    if \"s3://\" in filename:\n",
    "        with PathManager.open(filename, \"rb\") as f:\n",
    "            img = np.asarray(bytearray(f.read()), dtype=\"uint8\")\n",
    "            img = cv2.imdecode(img, cv2.IMREAD_COLOR)\n",
    "    else:\n",
    "        img = cv2.imread(filename)\n",
    "    visualizer = Visualizer(img[:, :, ::-1], metadata=dataset_metadata, scale=0.5)\n",
    "    out = visualizer.draw_dataset_dict(d)\n",
    "    out_rgb = cv2.cvtColor(out.get_image(), cv2.COLOR_BGR2RGB)\n",
    "    plt.imshow(out_rgb[:, :, ::-1])\n",
    "    plt.title(filename.split(\"/\")[-1])\n",
    "    plt.show()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "14",
   "metadata": {
    "id": "wlqXIXXhW8dA"
   },
   "source": [
    "## Start a 3LC Run and collect bounding box evaluation metrics\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15",
   "metadata": {},
   "outputs": [],
   "source": [
    "run = tlc.init(\n",
    "    PROJECT_NAME,\n",
    "    run_name=RUN_NAME,\n",
    "    description=DESCRIPTION,\n",
    "    if_exists=\"overwrite\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16",
   "metadata": {},
   "outputs": [],
   "source": [
    "cfg = get_cfg()\n",
    "\n",
    "cfg.merge_from_file(model_zoo.get_config_file(MODEL_CONFIG))\n",
    "cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MODEL_CONFIG)\n",
    "cfg.DATASETS.TRAIN = (TRAIN_DATASET_NAME,)\n",
    "cfg.OUTPUT_DIR = DOWNLOAD_PATH\n",
    "cfg.DATALOADER.NUM_WORKERS = 0\n",
    "cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512\n",
    "cfg.MODEL.ROI_HEADS.NUM_CLASSES = 80\n",
    "cfg.TEST.DETECTIONS_PER_IMAGE = MAX_DETECTIONS_PER_IMAGE\n",
    "cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = SCORE_THRESH_TEST\n",
    "cfg.MODEL.DEVICE = \"cuda\"\n",
    "cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS = False\n",
    "\n",
    "config = {\n",
    "    \"model_config\": MODEL_CONFIG,\n",
    "    \"test.detections_per_image\": MAX_DETECTIONS_PER_IMAGE,\n",
    "    \"model.roi_heads.score_thresh_test\": SCORE_THRESH_TEST,\n",
    "}\n",
    "\n",
    "run.set_parameters(config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "7unkuuiqLdqd",
    "outputId": "ba1716cd-3f3b-401d-bae5-8fbbd2199d9c"
   },
   "outputs": [],
   "source": [
    "from detectron2.engine import DefaultTrainer\n",
    "from tlc.integration.detectron2 import MetricsCollectionHook\n",
    "\n",
    "trainer = DefaultTrainer(cfg)\n",
    "\n",
    "# Define Embeddings metrics\n",
    "layer_index = 138  # Index of the layer to collect embeddings from\n",
    "embeddings_metrics_collector = tlc.EmbeddingsMetricsCollector(layers=[layer_index])\n",
    "\n",
    "predictor = tlc.Predictor(trainer.model, layers=[layer_index])\n",
    "\n",
    "bounding_box_metrics_collector = tlc.BoundingBoxMetricsCollector(\n",
    "    classes=dataset_metadata.thing_classes,\n",
    "    label_mapping=dataset_metadata.thing_dataset_id_to_contiguous_id,\n",
    ")\n",
    "\n",
    "metrics_collection_hook = MetricsCollectionHook(\n",
    "    dataset_name=TRAIN_DATASET_NAME,\n",
    "    metrics_collectors=[bounding_box_metrics_collector, embeddings_metrics_collector],\n",
    "    collect_metrics_before_train=True,\n",
    "    predictor=predictor,  # Needs to be used for embeddings metrics\n",
    ")\n",
    "\n",
    "trainer.register_hooks([metrics_collection_hook])\n",
    "trainer.resume_or_load(resume=False)\n",
    "trainer.before_train()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "18",
   "metadata": {},
   "outputs": [],
   "source": [
    "train_table = tlc.Table.from_url(\n",
    "    dataset_metadata.get(\"latest_tlc_table_url\")\n",
    ").url  # Get the last revision of the val table\n",
    "\n",
    "url_mapping = run.reduce_embeddings_by_foreign_table_url(\n",
    "    train_table,\n",
    "    method=\"pacmap\",\n",
    "    n_components=3,\n",
    "    n_neighbors=5,\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}