{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 7: Iteration Loop\n", "\n", "Now you have a trained model and know where it fails. This step shows how to:\n", "1. Run a **Golden QA check** to detect annotation drift\n", "2. Select the **next batch** using a hybrid strategy\n", "\n", "The hybrid strategy balances:\n", "- **30% Coverage** - Diversity sampling to avoid tunnel vision\n", "- **70% Targeted** - Samples similar to failures\n", "\n", "This balance is critical. Only chasing failures creates a model that's great at edge cases and terrible at normal cases." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -q scikit-learn" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "import fiftyone.brain as fob\n", "from fiftyone import ViewField as F\n", "import numpy as np\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "from collections import Counter\n", "\n", "LABEL_FIELD_2D = \"human_detections\"\n", "\n", "dataset = fo.load_dataset(\"annotation_tutorial\")\n", "\n", "# Get schema classes\n", "if \"annotation_schema_2d\" in dataset.info:\n", " SCHEMA_CLASSES = set(dataset.info[\"annotation_schema_2d\"][\"classes\"])\n", "else:\n", " SCHEMA_CLASSES = {\"Car\", \"Van\", \"Truck\", \"Pedestrian\", \"Person_sitting\", \"Cyclist\", \"Tram\", \"Misc\"}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Golden QA Check\n", "\n", "Before selecting the next batch, verify annotation quality hasn't drifted. 
The golden set is a small, carefully reviewed sample we check each iteration.\n", "\n", "**What to look for:**\n", "- Label count distribution staying stable\n", "- No unexpected empty samples\n", "- Class distribution roughly matching earlier rounds" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load golden QA set (left camera slice)\n", "golden = dataset.load_saved_view(\"golden_qa\").select_group_slices([\"left\"])\n", "\n", "# For tutorial, copy ground_truth to human_detections if not present\n", "for sample in golden:\n", " if sample.ground_truth and not sample[LABEL_FIELD_2D]:\n", " filtered_dets = [\n", " fo.Detection(label=d.label, bounding_box=d.bounding_box)\n", " for d in sample.ground_truth.detections\n", " if d.label in SCHEMA_CLASSES\n", " ]\n", " sample[LABEL_FIELD_2D] = fo.Detections(detections=filtered_dets)\n", " sample.save()\n", "\n", "print(f\"Golden QA set (left camera): {len(golden)} samples\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Golden QA Check: Compute baseline stats\n", "golden_stats = {\n", " \"total_samples\": len(golden),\n", " \"samples_with_labels\": 0,\n", " \"total_detections\": 0,\n", " \"class_counts\": Counter()\n", "}\n", "\n", "for sample in golden:\n", " if sample[LABEL_FIELD_2D] and len(sample[LABEL_FIELD_2D].detections) > 0:\n", " golden_stats[\"samples_with_labels\"] += 1\n", " golden_stats[\"total_detections\"] += len(sample[LABEL_FIELD_2D].detections)\n", " for det in sample[LABEL_FIELD_2D].detections:\n", " golden_stats[\"class_counts\"][det.label] += 1\n", "\n", "print(\"=\" * 40)\n", "print(\"GOLDEN QA CHECK\")\n", "print(\"=\" * 40)\n", "print(f\"Samples with labels: {golden_stats['samples_with_labels']}/{golden_stats['total_samples']}\")\n", "print(f\"Total detections: {golden_stats['total_detections']}\")\n", "print(f\"Avg detections/sample: 
{golden_stats['total_detections']/max(1,golden_stats['samples_with_labels']):.1f}\")\n", "print(f\"\\nTop classes:\")\n", "for cls, count in golden_stats[\"class_counts\"].most_common(5):\n", " print(f\" {cls}: {count}\")\n", "print(\"=\" * 40)\n", "print(\"\\nIf these numbers change unexpectedly between iterations,\")\n", "print(\"investigate annotation consistency before continuing.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Store golden stats for comparison in future iterations\n", "if \"golden_qa_history\" not in dataset.info:\n", " dataset.info[\"golden_qa_history\"] = []\n", "\n", "dataset.info[\"golden_qa_history\"].append({\n", " \"iteration\": len(dataset.info[\"golden_qa_history\"]),\n", " \"samples_with_labels\": golden_stats[\"samples_with_labels\"],\n", " \"total_detections\": golden_stats[\"total_detections\"],\n", " \"top_classes\": dict(golden_stats[\"class_counts\"].most_common(5))\n", "})\n", "dataset.save()\n", "\n", "print(f\"Saved golden QA stats (iteration {len(dataset.info['golden_qa_history'])-1})\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare for Next Batch Selection" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get unlabeled groups from pool\n", "pool = dataset.load_saved_view(\"pool\")\n", "pool_left = pool.select_group_slices([\"left\"])\n", "\n", "# Find samples still unlabeled\n", "remaining = pool_left.match(F(\"annotation_status\") == \"unlabeled\")\n", "remaining_groups = remaining.distinct(\"group.id\")\n", "\n", "print(f\"Pool (left camera): {len(pool_left)} samples\")\n", "print(f\"Remaining unlabeled: {len(remaining)} samples ({len(remaining_groups)} groups)\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get failure samples from evaluation\n", "try:\n", " failures = dataset.load_saved_view(\"eval_v0_failures\")\n", " print(f\"Failure 
samples: {len(failures)}\")\n", "except Exception:\n", "    failures = dataset.limit(0)  # Empty view\n", "    print(\"No failure view found. Run Step 6 first, or continue with coverage-only selection.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Acquisition Budget\n", "\n", "**Batch sizing guidance:**\n", "- Size batches to your labeling capacity\n", "- For this tutorial, we'll select ~20% of the remaining groups" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select batch size based on remaining pool\n", "batch_size = max(10, int(0.20 * len(remaining_groups)))\n", "\n", "# Split: 30% coverage (ZCore), 70% targeted\n", "coverage_budget = int(0.30 * batch_size)\n", "targeted_budget = batch_size - coverage_budget\n", "\n", "print(f\"Batch v1 budget: {batch_size} groups\")\n", "print(f\"  Coverage (diversity): {coverage_budget} (30%)\")\n", "print(f\"  Targeted (failures): {targeted_budget} (70%)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part 1: Coverage Selection (30%)\n", "\n", "Use the ZCore scores computed in Step 3 to select diverse groups from the remaining pool." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get samples with ZCore scores from remaining pool\n", "remaining_with_scores = remaining.match(F(\"zcore\") != None)\n", "\n", "if len(remaining_with_scores) == 0:\n", "    print(\"No ZCore scores found in remaining pool. 
Falling back to an arbitrary slice of the remaining groups.\")\n", "    # Fallback: take the first groups in pool order (arbitrary, not random)\n", "    coverage_groups = list(remaining_groups)[:coverage_budget]\n", "else:\n", "    # Build group -> score mapping\n", "    group_scores = {}\n", "    for sample in remaining_with_scores:\n", "        group_scores[sample.group.id] = sample.zcore\n", "\n", "    # Sort and select the top-scoring groups\n", "    sorted_groups = sorted(group_scores.items(), key=lambda x: x[1], reverse=True)\n", "    coverage_groups = [gid for gid, _ in sorted_groups[:coverage_budget]]\n", "\n", "print(f\"Coverage selection: {len(coverage_groups)} groups\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part 2: Targeted Selection (70%)\n", "\n", "Find groups similar to the failure samples using embedding-based neighbor search." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def find_neighbor_groups(query_embs, query_group_ids, pool_embs, pool_group_ids, n_per_query=3):\n", "    \"\"\"Find the nearest-neighbor groups in embedding space.\n", "\n", "    Note: query_group_ids is accepted for call-site symmetry but is not used.\n", "    \"\"\"\n", "    if len(query_embs) == 0 or len(pool_embs) == 0:\n", "        return []\n", "\n", "    sims = cosine_similarity(query_embs, pool_embs)\n", "    neighbor_groups = set()\n", "\n", "    for sim_row in sims:\n", "        top_idx = np.argsort(sim_row)[-n_per_query:]\n", "        for idx in top_idx:\n", "            neighbor_groups.add(pool_group_ids[idx])\n", "\n", "    return list(neighbor_groups)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get embeddings for remaining samples\n", "remaining_samples = list(remaining)\n", "remaining_embs = np.array([s.embeddings for s in remaining_samples if s.embeddings is not None])\n", "remaining_group_ids = [s.group.id for s in remaining_samples if s.embeddings is not None]\n", "\n", "if len(failures) > 0 and len(remaining_embs) > 0:\n", "    failure_embs = np.array([s.embeddings for s in failures if s.embeddings is not None])\n", "    failure_group_ids = [s.group.id for s in failures if s.embeddings is not None]\n", "\n", "    
print(f\"Finding neighbors of {len(failure_embs)} failure samples...\")\n", " \n", " # Find neighbor groups (excluding already-selected coverage groups)\n", " targeted_groups = find_neighbor_groups(\n", " failure_embs, failure_group_ids,\n", " remaining_embs, remaining_group_ids,\n", " n_per_query=5\n", " )\n", " targeted_groups = [gid for gid in targeted_groups if gid not in coverage_groups][:targeted_budget]\n", " print(f\"Targeted selection: {len(targeted_groups)} groups\")\n", "else:\n", " print(\"No failures to target or no embeddings. Using coverage-only selection.\")\n", " # Fall back to more coverage\n", " if len(remaining_with_scores) > coverage_budget:\n", " extra_groups = [gid for gid, _ in sorted_groups[coverage_budget:coverage_budget + targeted_budget]]\n", " targeted_groups = [gid for gid in extra_groups if gid not in coverage_groups]\n", " else:\n", " targeted_groups = []\n", " print(f\"Additional coverage selection: {len(targeted_groups)} groups\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combine and Tag Batch v1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Combine selections\n", "batch_v1_groups = list(set(coverage_groups + targeted_groups))\n", "\n", "if len(batch_v1_groups) == 0:\n", " print(\"No groups selected. 
Check that Steps 3 and 6 completed successfully.\")\n", "else:\n", " # Select ALL samples in these groups (all slices)\n", " batch_v1 = dataset.match(F(\"group.id\").is_in(batch_v1_groups))\n", "\n", " # Tag\n", " batch_v1.tag_samples(\"batch:v1\")\n", " batch_v1.tag_samples(\"to_annotate\")\n", " batch_v1.set_values(\"annotation_status\", [\"selected\"] * len(batch_v1))\n", "\n", " # Track source for analysis\n", " dataset.match(F(\"group.id\").is_in(coverage_groups)).tag_samples(\"source:coverage\")\n", " dataset.match(F(\"group.id\").is_in(targeted_groups)).tag_samples(\"source:targeted\")\n", "\n", " # Save view\n", " dataset.save_view(\"batch_v1\", dataset.match_tags(\"batch:v1\"))\n", "\n", " print(f\"\\nBatch v1: {len(batch_v1_groups)} groups\")\n", " print(f\" Coverage: {len(coverage_groups)}\")\n", " print(f\" Targeted: {len(targeted_groups)}\")\n", " print(f\" Total samples (all slices): {len(batch_v1)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Complete Loop\n", "\n", "You now have the full iteration recipe:\n", "\n", "```\n", "1. Run Golden QA check (detect drift)\n", "2. Annotate the current batch:\n", " - Step 4: 2D detections on left camera\n", " - Step 5: 3D cuboids on point cloud\n", "3. Train on all annotated data (Step 6)\n", "4. Evaluate on val set, tag failures\n", "5. Select next batch: 30% coverage + 70% targeted\n", "6. Repeat until stopping criteria\n", "```\n", "\n", "### Stopping Criteria\n", "\n", "Stop when:\n", "- Gains per labeled sample flatten (diminishing returns)\n", "- Remaining failures are mostly label ambiguity\n", "- Val metrics hit your target threshold\n", "\n", "### The 30% Coverage Rule\n", "\n", "**Don't skip the coverage budget.** Only chasing failures leads to:\n", "- Overfitting to edge cases\n", "- Distorted class priors\n", "- Models that fail on \"normal\" inputs\n", "\n", "Coverage keeps you honest." 
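, "\n", "The step-1 Golden QA check can be automated with a small helper. A minimal sketch, assuming a 10% tolerance (`detect_drift` and its threshold are illustrative, not tutorial code; it compares consecutive entries of `golden_qa_history`):\n", "\n", "```python\n", "def detect_drift(prev, curr, tol=0.10):\n", "    \"\"\"Flag drift when the golden set's total detections shift by more than tol.\"\"\"\n", "    if prev[\"total_detections\"] == 0:\n", "        return curr[\"total_detections\"] > 0\n", "    change = abs(curr[\"total_detections\"] - prev[\"total_detections\"]) / prev[\"total_detections\"]\n", "    return change > tol\n", "\n", "# Example: a 20% drop between iterations trips the check\n", "detect_drift({\"total_detections\": 100}, {\"total_detections\": 80})  # -> True\n", "```\n", "\n", "With two or more entries in `dataset.info['golden_qa_history']`, call `detect_drift(history[-2], history[-1])` before selecting the next batch."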
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Progress summary\n", "pool = dataset.load_saved_view(\"pool\")\n", "pool_groups = pool.distinct(\"group.id\")\n", "total_pool_groups = len(pool_groups)\n", "\n", "annotated_groups = len(dataset.match_tags(\"annotated:v0\").distinct(\"group.id\"))\n", "selected_v1_groups = len(dataset.match_tags(\"batch:v1\").distinct(\"group.id\"))\n", "\n", "pool_left = pool.select_group_slices([\"left\"])\n", "still_unlabeled = len(pool_left.match(F(\"annotation_status\") == \"unlabeled\").distinct(\"group.id\"))\n", "\n", "print(\"=\" * 40)\n", "print(\"ANNOTATION PROGRESS (by group/scene)\")\n", "print(\"=\" * 40)\n", "print(f\"Pool total: {total_pool_groups} groups\")\n", "print(f\"Annotated (v0): {annotated_groups} groups ({100*annotated_groups/total_pool_groups:.0f}%)\")\n", "print(f\"Selected (v1): {selected_v1_groups} groups ({100*selected_v1_groups/total_pool_groups:.0f}%)\")\n", "print(f\"Still unlabeled: {still_unlabeled} groups ({100*still_unlabeled/total_pool_groups:.0f}%)\")\n", "print(\"=\" * 40)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "You implemented the iteration loop:\n", "- **Golden QA check** to detect annotation drift\n", "- **Hybrid acquisition**: 30% coverage + 70% targeted\n", "- Tagged `batch:v1` ready for annotation (all slices: left, right, pcd)\n", "\n", "**Why this works:** \n", "- Coverage prevents overfitting to edge cases\n", "- Targeting fixes known failures\n", "- Golden QA catches annotation drift early\n", "- The combination improves faster than either strategy alone\n", "\n", "**Your turn:** Repeat Steps 4-7 with batch_v1, then batch_v2, etc.\n", "\n", "---\n", "\n", "## Congratulations!\n", "\n", "You've completed the Full Loop annotation tutorial. You now know how to:\n", "\n", "1. **Setup** - Create group-level splits for multimodal data\n", "2. 
**Select** - Use ZCore for diversity-based sample selection\n", "3. **Annotate 2D** - Label detections on camera images\n", "4. **Annotate 3D** - Label cuboids on point clouds\n", "5. **Train + Evaluate** - Train a model and analyze failures\n", "6. **Iterate** - Use hybrid acquisition to select the next batch\n", "\n", "This workflow scales from small experiments to production annotation pipelines." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }