{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 4: 2D Annotation + QA\n", "\n", "Now we annotate the selected batch. This step covers:\n", "1. Setting up a consistent annotation schema (KITTI classes)\n", "2. Annotating 2D detections on the **left camera slice**\n", "3. QA checks before moving to 3D annotation\n", "\n", "> **Time commitment:** Plan 1-2 minutes per scene for careful annotation. Start with 10-20 scenes to get the workflow, then continue or use the fast-forward option." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "from fiftyone import ViewField as F\n", "\n", "dataset = fo.load_dataset(\"annotation_tutorial\")\n", "batch_v0 = dataset.load_saved_view(\"batch_v0\")\n", "\n", "# Get left camera slice from batch\n", "batch_v0_left = batch_v0.select_group_slices([\"left\"])\n", "\n", "print(f\"Batch v0: {len(batch_v0.distinct('group.id'))} groups (scenes)\")\n", "print(f\"Left camera samples to annotate: {len(batch_v0_left)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Your Schema (KITTI Classes)\n", "\n", "Before labeling, define the rules. This prevents class drift and maintains consistency.\n", "\n", "We use the standard KITTI classes for autonomous driving." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define annotation schema for 2D detections\n", "LABEL_FIELD_2D = \"human_detections\"\n", "\n", "SCHEMA_2D = {\n", "    \"field_name\": LABEL_FIELD_2D,\n", "    \"classes\": [\n", "        \"Car\",\n", "        \"Van\",\n", "        \"Truck\",\n", "        \"Pedestrian\",\n", "        \"Person_sitting\",\n", "        \"Cyclist\",\n", "        \"Tram\",\n", "        \"Misc\"  # catch-all for edge cases\n", "    ]\n", "}\n", "\n", "SCHEMA_CLASSES_2D = set(SCHEMA_2D[\"classes\"])\n", "\n", "# Store in dataset for reference\n", "dataset.info[\"annotation_schema_2d\"] = SCHEMA_2D\n", "dataset.save()\n", "\n", "print(f\"2D Schema defined: {len(SCHEMA_2D['classes'])} classes\")\n", "print(f\"Target field: {LABEL_FIELD_2D}\")\n", "print(f\"\\nClasses: {SCHEMA_2D['classes']}\")\n", "print(f\"\\nWhen you create a field in the App, name it exactly: {LABEL_FIELD_2D}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Annotate 2D Detections in the App\n", "\n", "**This is the real labeling step.** Open the App and annotate the left camera images.\n", "\n", "### Setup (one time)\n", "1. Launch the App with your batch\n", "2. Click a sample to open the modal\n", "3. **Select the `left` slice** from the slice dropdown\n", "4. Click the **Annotate** tab (pencil icon)\n", "5. Click **Schema** -> **New Field** -> name it `human_detections`\n", "6. Set type to **Detections** and add the KITTI classes above\n", "\n", "### For each scene\n", "1. Ensure you're on the **left** slice\n", "2. Review the image\n", "3. Click the **Detection** button (square icon)\n", "4. Draw boxes around all vehicles, pedestrians, cyclists\n", "5. Assign the correct KITTI class\n", "6. 
Move to the next scene\n", "\n", "### Labeling Guidelines\n", "- **Car**: Sedans, SUVs, hatchbacks\n", "- **Van**: Minivans, cargo vans\n", "- **Truck**: Pickup trucks, semi-trucks\n", "- **Pedestrian**: Standing or walking people\n", "- **Person_sitting**: Seated people (benches, ground)\n", "- **Cyclist**: Person on bicycle\n", "- **Tram**: Streetcars, light rail\n", "- **Misc**: Ambiguous or other vehicles" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Launch App with batch view (left camera slice)\n", "session = fo.launch_app(batch_v0_left)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Stop here and annotate samples\n", "\n", "Take 15-30 minutes to label some scenes. This is the core skill.\n", "\n", "When you're done (or want to fast-forward), continue below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## Fast-Forward Option\n", "\n", "If you want to proceed without labeling everything manually, set `FAST_FORWARD = True` below. This copies `ground_truth` labels to `human_detections` to simulate completed annotation.\n", "\n", "> **Note:** In real projects, there's no shortcut. Label quality determines model quality." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set to True ONLY if you want to skip manual annotation\n", "FAST_FORWARD = False\n", "\n", "if FAST_FORWARD:\n", "    print(\"Fast-forwarding: copying ground_truth to human_detections...\")\n", "    print(f\"Filtering to schema classes: {SCHEMA_CLASSES_2D}\")\n", "\n", "    copied = 0\n", "    skipped = 0\n", "\n", "    # Only copy to left camera samples\n", "    for sample in batch_v0_left:\n", "        if sample.ground_truth:\n", "            human_dets = []\n", "            for det in sample.ground_truth.detections:\n", "                if det.label in SCHEMA_CLASSES_2D:\n", "                    human_dets.append(fo.Detection(\n", "                        label=det.label,\n", "                        bounding_box=det.bounding_box,\n", "                    ))\n", "                    copied += 1\n", "                else:\n", "                    skipped += 1\n", "            sample[LABEL_FIELD_2D] = fo.Detections(detections=human_dets)\n", "        else:\n", "            sample[LABEL_FIELD_2D] = fo.Detections(detections=[])\n", "        sample.save()\n", "\n", "    print(f\"Copied {copied} detections, skipped {skipped} (not in schema)\")\n", "else:\n", "    print(\"Using your manual annotations.\")\n", "    print(f\"Make sure you created '{LABEL_FIELD_2D}' and labeled on the LEFT slice!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mark Annotated Samples\n", "\n", "**Important:** We only mark samples as \"annotated\" if they actually have labels." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "# Reload to see changes\ndataset.reload()\n\n# Check left camera samples in batch\nbatch_left = dataset.match_tags(\"batch:v0\").select_group_slices([\"left\"])\n\nif LABEL_FIELD_2D in dataset.get_field_schema():\n    has_labels = batch_left.match(F(f\"{LABEL_FIELD_2D}.detections\").length() > 0)\n    no_labels = batch_left.match(\n        (F(LABEL_FIELD_2D) == None) | (F(f\"{LABEL_FIELD_2D}.detections\").length() == 0)\n    )\n\n    print(\"Batch v0 (left camera) status:\")\n    print(f\" With 2D labels: {len(has_labels)}\")\n    print(f\" Without labels: {len(no_labels)}\")\n\n    if len(has_labels) == 0:\n        print(f\"\\n>>> No samples have labels in {LABEL_FIELD_2D}.\")\n        print(\">>> Either label some samples in the App, or set FAST_FORWARD = True.\")\n    else:\n        # Tag samples with labels as annotated_2d\n        has_labels.tag_samples(\"annotated_2d:v0\")\n\n        # Tag all slices of annotated groups\n        # Must iterate explicitly since F(\"group.id\").is_in() does not work on grouped datasets\n        annotated_group_ids = set(has_labels.distinct(\"group.id\"))\n\n        for slice_name in dataset.group_slices:\n            view = dataset.select_group_slices([slice_name])\n            for sample in view.iter_samples(autosave=True):\n                if sample.group.id in annotated_group_ids:\n                    if \"annotated:v0\" not in sample.tags:\n                        sample.tags.append(\"annotated:v0\")\n\n        print(f\"\\nTagged {len(has_labels)} left camera samples as annotated_2d:v0\")\n        print(f\"Tagged all slices of {len(annotated_group_ids)} groups as annotated:v0\")\nelse:\n    print(f\"Field {LABEL_FIELD_2D} not found. Create it in the App first.\")" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## QA Checks\n", "\n", "Before moving to 3D annotation, verify 2D label quality." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get annotated left camera samples\n", "annotated_2d = dataset.match_tags(\"annotated_2d:v0\")\n", "\n", "if len(annotated_2d) == 0:\n", "    print(\"No 2D annotated samples yet. Complete the annotation step above.\")\n", "else:\n", "    print(\"QA Check 1: Label coverage\")\n", "    print(f\" Annotated samples (left camera): {len(annotated_2d)}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check 2: Class distribution\n", "from collections import Counter\n", "\n", "if len(annotated_2d) > 0:\n", "    all_labels = []\n", "    for sample in annotated_2d:\n", "        if sample[LABEL_FIELD_2D]:\n", "            all_labels.extend([d.label for d in sample[LABEL_FIELD_2D].detections])\n", "\n", "    print(f\"\\nQA Check 2: Class distribution ({len(all_labels)} total detections)\")\n", "    for label, count in Counter(all_labels).most_common():\n", "        print(f\" {label}: {count}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check 3: Unexpected classes\n", "if len(annotated_2d) > 0 and len(all_labels) > 0:\n", "    actual = set(all_labels)\n", "    unexpected = actual - SCHEMA_CLASSES_2D\n", "\n", "    if unexpected:\n", "        print(f\"\\nQA Check 3: Unexpected classes found: {unexpected}\")\n", "        print(\" These don't match your schema. Review before training.\")\n", "    else:\n", "        print(\"\\nQA Check 3: All classes match schema\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check 4: Detection count per scene\n", "if len(annotated_2d) > 0:\n", "    det_counts = [\n", "        len(s[LABEL_FIELD_2D].detections)\n", "        for s in annotated_2d\n", "        if s[LABEL_FIELD_2D]\n", "    ]\n", "\n", "    print(\"\\nQA Check 4: Detections per scene\")\n", "    print(f\" Min: {min(det_counts)}\")\n", "    print(f\" Max: {max(det_counts)}\")\n", "    print(f\" Mean: {sum(det_counts)/len(det_counts):.1f}\")\n", "\n", "    # Flag scenes with very few detections (possible missed objects)\n", "    low_det = [s for s in annotated_2d if s[LABEL_FIELD_2D] and len(s[LABEL_FIELD_2D].detections) < 2]\n", "    if low_det:\n", "        print(f\" \\n >>> {len(low_det)} scenes have <2 detections. Review for missed objects.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "You annotated 2D detections on the left camera slice:\n", "- Defined KITTI schema for consistency\n", "- Labeled samples in the App (or fast-forwarded)\n", "- **Only samples with actual labels** were marked as annotated\n", "- Ran QA checks: coverage, class distribution, schema compliance, detections per scene\n", "\n", "**Artifacts:**\n", "- `human_detections` field with 2D bounding boxes\n", "- `annotated_2d:v0` tag on left camera samples with labels\n", "- `annotated:v0` tag on all slices of annotated groups\n", "\n", "**Next:** Step 5 - 3D Annotation (cuboids on point clouds)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }