{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 5: 3D Annotation\n",
    "\n",
    "Now we annotate 3D cuboids on the point cloud slice. This step covers:\n",
    "1. Setting up a 3D annotation schema\n",
    "2. Using the 3D annotation tools (cuboids, transform controls)\n",
    "3. Understanding the annotation plane concept\n",
    "4. Viewing 3D labels projected onto 2D camera images\n",
    "\n",
    "> **Tip:** Complete Step 4 (2D annotation) first. Having 2D labels as reference helps with 3D annotation consistency."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import fiftyone as fo\n",
    "from fiftyone import ViewField as F\n",
    "\n",
    "dataset = fo.load_dataset(\"annotation_tutorial\")\n",
    "batch_v0 = dataset.load_saved_view(\"batch_v0\")\n",
    "\n",
    "# Get point cloud slice from batch\n",
    "batch_v0_pcd = batch_v0.select_group_slices([\"pcd\"])\n",
    "\n",
    "print(f\"Batch v0: {len(batch_v0.distinct('group.id'))} groups (scenes)\")\n",
    "print(f\"Point cloud samples to annotate: {len(batch_v0_pcd)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define Your 3D Schema\n",
    "\n",
    "For 3D cuboids, we use a subset of the KITTI classes, focusing on objects that have a clear 3D extent in point clouds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define annotation schema for 3D cuboids\n",
    "LABEL_FIELD_3D = \"human_cuboids\"\n",
    "\n",
    "SCHEMA_3D = {\n",
    "    \"field_name\": LABEL_FIELD_3D,\n",
    "    \"classes\": [\n",
    "        \"Car\",\n",
    "        \"Van\",\n",
    "        \"Truck\",\n",
    "        \"Pedestrian\",\n",
    "        \"Cyclist\"\n",
    "    ]\n",
    "}\n",
    "\n",
    "SCHEMA_CLASSES_3D = set(SCHEMA_3D[\"classes\"])\n",
    "\n",
    "# Store in dataset\n",
    "dataset.info[\"annotation_schema_3d\"] = SCHEMA_3D\n",
    "dataset.save()\n",
    "\n",
    "print(f\"3D Schema defined: {len(SCHEMA_3D['classes'])} classes\")\n",
    "print(f\"Target field: {LABEL_FIELD_3D}\")\n",
    "print(f\"\\nClasses: {SCHEMA_3D['classes']}\")\n",
    "print(f\"\\nWhen you create a field in the App, name it exactly: {LABEL_FIELD_3D}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3D Annotation in the App\n",
    "\n",
    "### Getting to the 3D View\n",
    "1. Launch the App with your batch\n",
    "2. Click a sample to open the modal\n",
    "3. **Select the `pcd` slice** from the slice dropdown\n",
    "4. The 3D visualizer will load the point cloud\n",
    "\n",
    "### 3D Navigation\n",
    "- **Rotate**: Left-click and drag\n",
    "- **Pan**: Right-click and drag (or Shift + left-click)\n",
    "- **Zoom**: Scroll wheel\n",
    "- **Preset views**: Press `1`, `2`, `3`, `4` for top/right/front/annotation-plane views"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Launch App with point cloud view\n",
    "session = fo.launch_app(batch_v0_pcd)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating 3D Cuboids\n",
    "\n",
    "### Enter Annotate Mode\n",
    "1. Click the **Annotate** tab (pencil icon)\n",
    "2. Click **Schema** -> **New Field** -> name it `human_cuboids`\n",
    "3. Set type to **Detections** and add the classes above\n",
    "\n",
    "### Understanding the Annotation Plane\n",
    "\n",
    "The **annotation plane** is a virtual surface that determines where your clicks place vertices. By default, it's the XY plane (ground level).\n",
    "\n",
    "- **Moving the plane**: Reposition to place vertices at different heights\n",
    "- **Why it matters**: Cuboid corners snap to this plane when you click\n",
    "\n",
    "### Drawing a Cuboid\n",
    "1. Click the **Cuboid** tool in the left toolbar\n",
    "2. Click to place the **first corner** on the annotation plane\n",
    "3. Click to place the **opposite corner** (defines the base rectangle)\n",
    "4. The cuboid is created with a default height\n",
    "5. Select a class from the dropdown\n",
    "\n",
    "### Transform Controls\n",
    "After creating a cuboid, use transform controls to refine it:\n",
    "\n",
    "| Control | What it does |\n",
    "|---------|-------------|\n",
    "| **Translation** | Move along X/Y/Z axes or XY/XZ/YZ planes |\n",
    "| **Rotation** | Rotate around X/Y/Z axes |\n",
    "| **Scaling** | Resize along X/Y/Z axes |\n",
    "\n",
    "Click on a cuboid to select it, then use the transform handles."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Camera Projections\n",
    "\n",
    "**Camera projections** are one of FiftyOne's key 3D features. There are two kinds:\n",
    "\n",
    "### Point Cloud Projections\n",
    "- Flatten the 3D view to 2D planes (top-down, side views)\n",
    "- Useful for accurate positioning\n",
    "\n",
    "### 2D Image Projections\n",
    "- Select a camera image from the dropdown in the 3D viewer\n",
    "- Your 3D cuboids are **projected onto the 2D images in real time**\n",
    "- This helps verify that your 3D labels align with the 2D scene\n",
    "\n",
    "To use camera projections:\n",
    "1. Look for the **projection dropdown** in the 3D viewer\n",
    "2. Select a camera (e.g., `left`)\n",
    "3. See your cuboids rendered on the 2D image\n",
    "\n",
    "> **Note:** Camera projections require camera intrinsics/extrinsics to be defined in the dataset. The KITTI data in quickstart-groups should have these."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Annotation Guidelines for 3D\n",
    "\n",
    "### Positioning\n",
    "- Center the cuboid on the point cloud cluster representing the object\n",
    "- The base should touch the ground plane\n",
    "- Include all points belonging to the object\n",
    "\n",
    "### Orientation\n",
    "- Align the cuboid's longest axis with the object's heading direction\n",
    "- For vehicles, the front should point in the driving direction\n",
    "\n",
    "### Sizing\n",
    "- Tightly fit the cuboid to the point cloud extent\n",
    "- Don't include points from other objects or the ground\n",
    "\n",
    "### Consistency with 2D\n",
    "- Objects labeled in 2D should also be labeled in 3D (if visible in the point cloud)\n",
    "- Use the same class for the same object across both modalities"
   ]
  },
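  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Fast-Forward Option\n",
    "\n",
    "If you want to skip manual 3D labeling, set `FAST_FORWARD = True` below. The cell then copies the existing `ground_truth` cuboids whose class is in the 3D schema into `human_cuboids`."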
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set to True ONLY if you want to skip manual 3D annotation\n",
    "FAST_FORWARD = False\n",
    "\n",
    "if FAST_FORWARD:\n",
    "    print(f\"Fast-forwarding: copying 3D ground_truth to {LABEL_FIELD_3D}...\")\n",
    "    print(f\"Filtering to schema classes: {SCHEMA_CLASSES_3D}\")\n",
    "    \n",
    "    copied = 0\n",
    "    skipped = 0\n",
    "    \n",
    "    # autosave=True writes the edits back in batches, so no explicit save() is needed\n",
    "    for sample in batch_v0_pcd.iter_samples(autosave=True):\n",
    "        human_cuboids = []\n",
    "        if sample.ground_truth:\n",
    "            for det in sample.ground_truth.detections:\n",
    "                if det.label in SCHEMA_CLASSES_3D:\n",
    "                    # Copy the 3D geometry; attributes missing on the source become None\n",
    "                    human_cuboids.append(fo.Detection(\n",
    "                        label=det.label,\n",
    "                        location=getattr(det, 'location', None),\n",
    "                        dimensions=getattr(det, 'dimensions', None),\n",
    "                        rotation=getattr(det, 'rotation', None),\n",
    "                    ))\n",
    "                    copied += 1\n",
    "                else:\n",
    "                    skipped += 1\n",
    "        sample[LABEL_FIELD_3D] = fo.Detections(detections=human_cuboids)\n",
    "    \n",
    "    print(f\"Copied {copied} cuboids, skipped {skipped} (not in schema)\")\n",
    "else:\n",
    "    print(\"Using your manual 3D annotations.\")\n",
    "    print(f\"Make sure you created '{LABEL_FIELD_3D}' and labeled on the PCD slice!\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reload to see changes\n",
    "dataset.reload()\n",
    "\n",
    "# Check point cloud samples in batch\n",
    "batch_pcd = dataset.match_tags(\"batch:v0\").select_group_slices([\"pcd\"])\n",
    "\n",
    "if LABEL_FIELD_3D in dataset.get_field_schema():\n",
    "    has_labels = batch_pcd.match(F(f\"{LABEL_FIELD_3D}.detections\").length() > 0)\n",
    "    no_labels = batch_pcd.match(\n",
    "        (F(LABEL_FIELD_3D) == None) | (F(f\"{LABEL_FIELD_3D}.detections\").length() == 0)\n",
    "    )\n",
    "    \n",
    "    print(\"Batch v0 (point cloud) status:\")\n",
    "    print(f\"  With 3D labels: {len(has_labels)}\")\n",
    "    print(f\"  Without labels: {len(no_labels)}\")\n",
    "    \n",
    "    if len(has_labels) > 0:\n",
    "        has_labels.tag_samples(\"annotated_3d:v0\")\n",
    "        print(f\"\\nTagged {len(has_labels)} point cloud samples as 'annotated_3d:v0'\")\n",
    "else:\n",
    "    print(f\"Field '{LABEL_FIELD_3D}' not found. Create it in the App first.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## QA Checks for 3D"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get annotated point cloud samples. The tag lives on pcd samples, so select\n",
    "# that slice first; otherwise match_tags() only checks the active group slice\n",
    "annotated_3d = dataset.select_group_slices([\"pcd\"]).match_tags(\"annotated_3d:v0\")\n",
    "\n",
    "if len(annotated_3d) == 0:\n",
    "    print(\"No 3D annotated samples yet.\")\n",
    "else:\n",
    "    print(\"QA Check: 3D Label coverage\")\n",
    "    print(f\"  Annotated samples (point cloud): {len(annotated_3d)}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Class distribution for 3D\n",
    "if len(annotated_3d) > 0:\n",
    "    # count_values() aggregates in the database instead of loading every sample\n",
    "    label_counts = annotated_3d.count_values(f\"{LABEL_FIELD_3D}.detections.label\")\n",
    "\n",
    "    print(f\"\\n3D Class distribution ({sum(label_counts.values())} total cuboids)\")\n",
    "    # Most frequent class first\n",
    "    for label, count in sorted(label_counts.items(), key=lambda kv: -kv[1]):\n",
    "        print(f\"  {label}: {count}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cross-check: scenes with 2D labels should have 3D labels\n",
    "LABEL_FIELD_2D = \"human_detections\"\n",
    "\n",
    "if LABEL_FIELD_2D in dataset.get_field_schema() and LABEL_FIELD_3D in dataset.get_field_schema():\n",
    "    batch_left = dataset.match_tags(\"batch:v0\").select_group_slices([\"left\"])\n",
    "    batch_pcd = dataset.match_tags(\"batch:v0\").select_group_slices([\"pcd\"])\n",
    "    \n",
    "    # Groups with 2D labels\n",
    "    groups_2d = set(\n",
    "        s.group.id for s in batch_left \n",
    "        if s[LABEL_FIELD_2D] and len(s[LABEL_FIELD_2D].detections) > 0\n",
    "    )\n",
    "    \n",
    "    # Groups with 3D labels\n",
    "    groups_3d = set(\n",
    "        s.group.id for s in batch_pcd \n",
    "        if s[LABEL_FIELD_3D] and len(s[LABEL_FIELD_3D].detections) > 0\n",
    "    )\n",
    "    \n",
    "    print(\"\\nCross-modality check:\")\n",
    "    print(f\"  Groups with 2D labels: {len(groups_2d)}\")\n",
    "    print(f\"  Groups with 3D labels: {len(groups_3d)}\")\n",
    "    print(f\"  Groups with both: {len(groups_2d & groups_3d)}\")\n",
    "    \n",
    "    missing_3d = groups_2d - groups_3d\n",
    "    if missing_3d:\n",
    "        print(f\"  >>> {len(missing_3d)} groups have 2D but not 3D labels\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "You annotated 3D cuboids on the point cloud slice:\n",
    "- Defined a 3D schema (subset of KITTI classes)\n",
    "- Used the annotation plane and transform controls\n",
    "- Verified alignment using camera projections\n",
    "- Ran QA checks for coverage and cross-modality consistency\n",
    "\n",
    "**Artifacts:**\n",
    "- `human_cuboids` field with 3D cuboid annotations\n",
    "- `annotated_3d:v0` tag on point cloud samples with labels\n",
    "\n",
    "**Key Concept:** The 3D→2D camera projections let you verify that your 3D labels align with the 2D scene. This cross-modal validation is a key differentiator for multimodal annotation workflows.\n",
    "\n",
    "**Next:** Step 6 - Train + Evaluate"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}