{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 5: 3D Annotation\n", "\n", "Now we annotate 3D cuboids on the point cloud slice. This step covers:\n", "1. Setting up a 3D annotation schema\n", "2. Using the 3D annotation tools (cuboids, transform controls)\n", "3. Understanding the annotation plane concept\n", "4. Viewing 3D labels projected onto 2D camera images\n", "\n", "> **Tip:** Complete Step 4 (2D annotation) first. Having 2D labels as reference helps with 3D annotation consistency." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "from fiftyone import ViewField as F\n", "\n", "dataset = fo.load_dataset(\"annotation_tutorial\")\n", "batch_v0 = dataset.load_saved_view(\"batch_v0\")\n", "\n", "# Get point cloud slice from batch\n", "batch_v0_pcd = batch_v0.select_group_slices([\"pcd\"])\n", "\n", "print(f\"Batch v0: {len(batch_v0.distinct('group.id'))} groups (scenes)\")\n", "print(f\"Point cloud samples to annotate: {len(batch_v0_pcd)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Your 3D Schema\n", "\n", "For 3D cuboids, we use a subset of KITTI classes - focusing on objects that have clear 3D extent in point clouds." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define annotation schema for 3D cuboids\n", "LABEL_FIELD_3D = \"human_cuboids\"\n", "\n", "SCHEMA_3D = {\n", "    \"field_name\": LABEL_FIELD_3D,\n", "    \"classes\": [\n", "        \"Car\",\n", "        \"Van\",\n", "        \"Truck\",\n", "        \"Pedestrian\",\n", "        \"Cyclist\"\n", "    ]\n", "}\n", "\n", "SCHEMA_CLASSES_3D = set(SCHEMA_3D[\"classes\"])\n", "\n", "# Store in dataset\n", "dataset.info[\"annotation_schema_3d\"] = SCHEMA_3D\n", "dataset.save()\n", "\n", "print(f\"3D Schema defined: {len(SCHEMA_3D['classes'])} classes\")\n", "print(f\"Target field: {LABEL_FIELD_3D}\")\n", "print(f\"\\nClasses: {SCHEMA_3D['classes']}\")\n", "print(f\"\\nWhen you create a field in the App, name it exactly: {LABEL_FIELD_3D}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3D Annotation in the App\n", "\n", "### Getting to the 3D View\n", "1. Launch the App with your batch\n", "2. Click a sample to open the modal\n", "3. **Select the `pcd` slice** from the slice dropdown\n", "4. The 3D visualizer will load the point cloud\n", "\n", "### 3D Navigation\n", "- **Rotate**: Left-click and drag\n", "- **Pan**: Right-click and drag (or Shift + left-click)\n", "- **Zoom**: Scroll wheel\n", "- **Preset views**: Press `1`, `2`, `3`, `4` for top/right/front/annotation-plane views" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Launch App with point cloud view\n", "session = fo.launch_app(batch_v0_pcd)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating 3D Cuboids\n", "\n", "### Enter Annotate Mode\n", "1. Click the **Annotate** tab (pencil icon)\n", "2. Click **Schema** -> **New Field** -> name it `human_cuboids`\n", "3. 
Set the type to **Detections** and add the classes from the schema above\n", "\n", "### Understanding the Annotation Plane\n", "\n", "The **annotation plane** is a virtual surface that determines where your clicks place vertices. By default, it's the XY plane (ground level).\n", "\n", "- **Moving the plane**: Reposition it to place vertices at different heights\n", "- **Why it matters**: Cuboid corners snap to this plane when you click\n", "\n", "### Drawing a Cuboid\n", "1. Click the **Cuboid** tool in the left toolbar\n", "2. Click to place the **first corner** on the annotation plane\n", "3. Click to place the **opposite corner** (defines the base rectangle)\n", "4. The cuboid is created with a default height\n", "5. Select a class from the dropdown\n", "\n", "### Transform Controls\n", "After creating a cuboid, use transform controls to refine it:\n", "\n", "| Control | What it does |\n", "|---------|-------------|\n", "| **Translation** | Move along X/Y/Z axes or XY/XZ/YZ planes |\n", "| **Rotation** | Rotate around X/Y/Z axes |\n", "| **Scaling** | Resize along X/Y/Z axes |\n", "\n", "Click on a cuboid to select it, then use the transform handles." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Camera Projections\n", "\n", "**Camera projections** are one of FiftyOne's key 3D features. There are two kinds:\n", "\n", "### Point Cloud Projections\n", "- Flatten the 3D view to 2D planes (top-down, side views)\n", "- Useful for accurate positioning\n", "\n", "### 2D Image Projections\n", "- Select a camera image from the dropdown in the 3D viewer\n", "- Your 3D cuboids are **projected onto the 2D images in real time**\n", "- This helps verify that your 3D labels align with the 2D scene\n", "\n", "To use camera projections:\n", "1. Look for the **projection dropdown** in the 3D viewer\n", "2. Select a camera (e.g., `left`)\n", "3. See your cuboids rendered on the 2D image\n", "\n", "> **Note:** Camera projections require camera intrinsics/extrinsics to be defined in the dataset. 
The KITTI data in quickstart-groups should have these." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Annotation Guidelines for 3D\n", "\n", "### Positioning\n", "- Center the cuboid on the point cloud cluster representing the object\n", "- The base should touch the ground plane\n", "- Include all points belonging to the object\n", "\n", "### Orientation\n", "- Align the cuboid's longest axis with the object's heading direction\n", "- For vehicles, the front should point in the driving direction\n", "\n", "### Sizing\n", "- Tightly fit the cuboid to the point cloud extent\n", "- Don't include points from other objects or ground\n", "\n", "### Consistency with 2D\n", "- Objects labeled in 2D should also be labeled in 3D (if visible in point cloud)\n", "- Use the same class for the same object across both modalities" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## Fast-Forward Option\n", "\n", "If you want to skip manual 3D labeling, set `FAST_FORWARD = True` below." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set to True ONLY if you want to skip manual 3D annotation\n", "FAST_FORWARD = False\n", "\n", "if FAST_FORWARD:\n", "    print(\"Fast-forwarding: copying 3D ground_truth to human_cuboids...\")\n", "    print(f\"Filtering to schema classes: {SCHEMA_CLASSES_3D}\")\n", "\n", "    copied = 0\n", "    skipped = 0\n", "\n", "    for sample in batch_v0_pcd:\n", "        if sample.ground_truth:\n", "            human_cuboids = []\n", "            for det in sample.ground_truth.detections:\n", "                if det.label in SCHEMA_CLASSES_3D:\n", "                    # Copy the 3D detection\n", "                    human_cuboids.append(fo.Detection(\n", "                        label=det.label,\n", "                        location=det.location if hasattr(det, 'location') else None,\n", "                        dimensions=det.dimensions if hasattr(det, 'dimensions') else None,\n", "                        rotation=det.rotation if hasattr(det, 'rotation') else None,\n", "                    ))\n", "                    copied += 1\n", "                else:\n", "                    skipped += 1\n", "            sample[LABEL_FIELD_3D] = fo.Detections(detections=human_cuboids)\n", "        else:\n", "            sample[LABEL_FIELD_3D] = fo.Detections(detections=[])\n", "        sample.save()\n", "\n", "    print(f\"Copied {copied} cuboids, skipped {skipped} (not in schema)\")\n", "else:\n", "    print(\"Using your manual 3D annotations.\")\n", "    print(f\"Make sure you created '{LABEL_FIELD_3D}' and labeled on the PCD slice!\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Reload to see changes\n", "dataset.reload()\n", "\n", "# Check point cloud samples in batch\n", "batch_pcd = dataset.match_tags(\"batch:v0\").select_group_slices([\"pcd\"])\n", "\n", "if LABEL_FIELD_3D in dataset.get_field_schema():\n", "    has_labels = batch_pcd.match(F(f\"{LABEL_FIELD_3D}.detections\").length() > 0)\n", "    no_labels = batch_pcd.match(\n", "        (F(LABEL_FIELD_3D) == None) | (F(f\"{LABEL_FIELD_3D}.detections\").length() == 0)\n", "    )\n", "\n", "    print(f\"Batch v0 (point cloud) status:\")\n", "    print(f\" With 3D labels: 
{len(has_labels)}\")\n", "    print(f\" Without labels: {len(no_labels)}\")\n", "\n", "    if len(has_labels) > 0:\n", "        has_labels.tag_samples(\"annotated_3d:v0\")\n", "        print(f\"\\nTagged {len(has_labels)} point cloud samples as 'annotated_3d:v0'\")\n", "else:\n", "    print(f\"Field '{LABEL_FIELD_3D}' not found. Create it in the App first.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## QA Checks for 3D" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get annotated point cloud samples. The tag was applied to pcd-slice samples,\n", "# and views on a grouped dataset only see the active slice, so select pcd first.\n", "annotated_3d = dataset.select_group_slices([\"pcd\"]).match_tags(\"annotated_3d:v0\")\n", "\n", "if len(annotated_3d) == 0:\n", "    print(\"No 3D annotated samples yet.\")\n", "else:\n", "    print(\"QA Check: 3D label coverage\")\n", "    print(f\" Annotated samples (point cloud): {len(annotated_3d)}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Class distribution for 3D\n", "from collections import Counter\n", "\n", "if len(annotated_3d) > 0:\n", "    all_labels_3d = []\n", "    for sample in annotated_3d:\n", "        if sample[LABEL_FIELD_3D]:\n", "            all_labels_3d.extend([d.label for d in sample[LABEL_FIELD_3D].detections])\n", "\n", "    print(f\"\\n3D Class distribution ({len(all_labels_3d)} total cuboids)\")\n", "    for label, count in Counter(all_labels_3d).most_common():\n", "        print(f\" {label}: {count}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Cross-check: scenes with 2D labels should have 3D labels\n", "LABEL_FIELD_2D = \"human_detections\"\n", "\n", "if LABEL_FIELD_2D in dataset.get_field_schema() and LABEL_FIELD_3D in dataset.get_field_schema():\n", "    batch_left = dataset.match_tags(\"batch:v0\").select_group_slices([\"left\"])\n", "    batch_pcd = dataset.match_tags(\"batch:v0\").select_group_slices([\"pcd\"])\n", "\n", "    # Groups with 2D labels\n", "    groups_2d = set(\n", "        s.group.id for s in batch_left\n", "        if s[LABEL_FIELD_2D] and 
len(s[LABEL_FIELD_2D].detections) > 0\n", "    )\n", "\n", "    # Groups with 3D labels\n", "    groups_3d = set(\n", "        s.group.id for s in batch_pcd\n", "        if s[LABEL_FIELD_3D] and len(s[LABEL_FIELD_3D].detections) > 0\n", "    )\n", "\n", "    print(f\"\\nCross-modality check:\")\n", "    print(f\" Groups with 2D labels: {len(groups_2d)}\")\n", "    print(f\" Groups with 3D labels: {len(groups_3d)}\")\n", "    print(f\" Groups with both: {len(groups_2d & groups_3d)}\")\n", "\n", "    missing_3d = groups_2d - groups_3d\n", "    if missing_3d:\n", "        print(f\" >>> {len(missing_3d)} groups have 2D but not 3D labels\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "You annotated 3D cuboids on the point cloud slice:\n", "- Defined a 3D schema (subset of KITTI classes)\n", "- Used the annotation plane and transform controls\n", "- Verified alignment using camera projections\n", "- Ran QA checks for coverage and cross-modality consistency\n", "\n", "**Artifacts:**\n", "- `human_cuboids` field with 3D cuboid annotations\n", "- `annotated_3d:v0` tag on point cloud samples with labels\n", "\n", "**Key Concept:** The 3D→2D camera projections let you verify that your 3D labels align with the 2D scene. This cross-modal validation is a key differentiator for multimodal annotation workflows.\n", "\n", "**Next:** Step 6 - Train + Evaluate" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }