{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Create Custom Oriented Bounding Box Table\n", "\n", "Create a 3LC Table with oriented bounding boxes using the HRSC2016-MS maritime ship detection dataset for rotated object detection.\n", "\n", "![img](../../images/hrsc2016-ms.png)\n", "\n", "Axis-aligned bounding boxes fit rotated objects poorly, because the box must also enclose the background around the object. Oriented bounding boxes localize rotated objects such as ships, text, and aerial vehicles more tightly, which reduces background noise and can improve detection accuracy.\n", "\n", "This notebook processes the HRSC2016-MS dataset, which contains 1070 images with 4406 annotated ships. It shows how to create custom oriented bounding box annotations with rotation angles, including how to handle non-standard coordinate systems and rotation representations. The dataset comes from remote sensing research and provides challenging examples of rotated ship detection in optical satellite imagery, which makes it well suited to testing oriented detection algorithms." 
] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## Project setup" ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "PROJECT_NAME = \"3LC Tutorials - OBBs\"\n", "DATASET_NAME = \"HRSC2016-MS\"\n", "DOWNLOAD_PATH = \"../../../transient_data\"" ] }, { "cell_type": "markdown", "id": "3", "metadata": {}, "source": [ "## Install dependencies" ] }, { "cell_type": "code", "execution_count": null, "id": "4", "metadata": {}, "outputs": [], "source": [ "%pip install gdown\n", "%pip install 3lc" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "import gdown\n", "\n", "dst = DOWNLOAD_PATH + \"/\" + \"hrsc2016-ms.zip\"\n", "gdown.download(\"https://drive.google.com/uc?id=1UslulCCx8GoTflm1gpfIGZeXIsCAdMG5\", dst, quiet=False)" ] }, { "cell_type": "code", "execution_count": null, "id": "6", "metadata": {}, "outputs": [], "source": [ "import zipfile\n", "from pathlib import Path\n", "\n", "with zipfile.ZipFile(dst, \"r\") as zip_ref:\n", "    zip_ref.extractall(DOWNLOAD_PATH + \"/\" + \"HRSC2016-MS\")\n", "\n", "# Remove the zipfile after extracting\n", "if Path(dst).exists():\n", "    Path(dst).unlink()" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "id": "8", "metadata": {}, "outputs": [], "source": [ "import xml.etree.ElementTree as ET\n", "from collections import defaultdict\n", "from pathlib import Path\n", "\n", "import tlc" ] }, { "cell_type": "markdown", "id": "9", "metadata": {}, "source": [ "## Prepare the data\n", "\n", "The data can be downloaded as a 2.3 GB zip file from the dataset's [GitHub repository](https://github.com/wmchen/HRSC2016-MS).\n", "When unzipped, it has the folder structure:\n", "\n", "```\n", "HRSC2016-MS/\n", "├── Annotations/\n", "├── AllImages/\n", "└── ImageSets/\n", "```" ] }, { "cell_type": "code", 
"execution_count": null, "id": "10", "metadata": {}, "outputs": [], "source": [ "DATASET_ROOT = DOWNLOAD_PATH + \"/\" + \"HRSC2016-MS\"\n", "\n", "# Register the dataset root as a URL alias to make it easier to share the tables and move the source data.\n", "# tlc.register_project_url_alias(\"HRSC2016_MS_DATA\", DATASET_ROOT, project=PROJECT_NAME)" ] }, { "cell_type": "code", "execution_count": null, "id": "11", "metadata": {}, "outputs": [], "source": [ "row_data = defaultdict(dict)\n", "\n", "for split in [\"train\", \"val\"]:\n", "    image_splits = Path(DATASET_ROOT) / \"ImageSets\" / f\"{split}.txt\"\n", "    image_ids = image_splits.read_text().splitlines()\n", "    row_data[split] = defaultdict(list)\n", "\n", "    for image_id in image_ids:\n", "        image_path = Path(DATASET_ROOT) / \"AllImages\" / f\"{image_id}.bmp\"\n", "        annotation_path = Path(DATASET_ROOT) / \"Annotations\" / f\"{image_id}.xml\"\n", "        if not image_path.exists():\n", "            print(f\"Image {image_id} does not exist\")\n", "        if not annotation_path.exists():\n", "            print(f\"Annotation {image_id} does not exist\")\n", "        row_data[split][\"image\"].append(tlc.Url(image_path).to_relative().to_str())\n", "        row_data[split][\"obb\"].append(annotation_path)" ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": {}, "outputs": [], "source": [ "def load_obb_annotation(annotation_path):\n", "    \"\"\"Load annotations for a single image from XML format.\"\"\"\n", "    tree = ET.parse(annotation_path)\n", "    root = tree.getroot()\n", "    width = int(root.find(\"size\").find(\"width\").text)\n", "    height = int(root.find(\"size\").find(\"height\").text)\n", "\n", "    obbs = tlc.OBB2DInstances.create_empty(\n", "        image_height=height, image_width=width, instance_extras_keys={\"difficult\", \"truncated\"}\n", "    )\n", "\n", "    for obj in root.findall(\"object\"):\n", "        difficult = int(obj.find(\"difficult\").text)\n", "        truncated = int(obj.find(\"truncated\").text)\n", "        bbox = obj.find(\"robndbox\")\n", "        cx, cy, w, h, angle = (float(bbox.find(tag).text) for tag in [\"cx\", \"cy\", \"w\", \"h\", \"angle\"])\n", "        obbs.add_instance(\n", "            obb=[cx, cy, w, h, angle],\n", "            label=0,  # single-class dataset: all instances are ships\n", "            instance_extras={\n", "                \"difficult\": difficult,\n", "                \"truncated\": truncated,\n", "            },\n", "        )\n", "\n", "    return obbs.to_row()" ] }, { "cell_type": "code", "execution_count": null, "id": "13", "metadata": {}, "outputs": [], "source": [ "# Transform annotation files to Table-ready data structure\n", "for split in [\"train\", \"val\"]:\n", "    row_data[split][\"obb\"] = [load_obb_annotation(path) for path in row_data[split][\"obb\"]]" ] }, { "cell_type": "markdown", "id": "14", "metadata": {}, "source": [ "## Create the tables\n", "\n", "We use an `OrientedBoundingBoxes2DSchema` to describe the structure of the oriented bounding boxes, and `tlc.Table.from_dict` to create the Tables." ] }, { "cell_type": "code", "execution_count": null, "id": "15", "metadata": {}, "outputs": [], "source": [ "schemas = {\n", "    \"image\": tlc.ImageUrlSchema(),\n", "    \"obb\": tlc.OrientedBoundingBoxes2DSchema(\n", "        classes=[\"ship\"],\n", "        per_instance_schemas={\n", "            \"difficult\": tlc.BoolListSchema(),\n", "            \"truncated\": tlc.BoolListSchema(),\n", "        },\n", "    ),\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": {}, "outputs": [], "source": [ "for split in [\"train\", \"val\"]:\n", "    table = tlc.Table.from_dict(\n", "        data=row_data[split],\n", "        structure=schemas,\n", "        table_name=f\"{split}\",\n", "        dataset_name=DATASET_NAME,\n", "        project_name=PROJECT_NAME,\n", "        if_exists=\"rename\",\n", "    )" ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": {}, "outputs": [], "source": [ "table" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", 
"name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" }, "test_marks": [ "dependent" ] }, "nbformat": 4, "nbformat_minor": 5 }