{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 7: Iteration Loop\n", "\n", "Now you have a trained model and know where it fails. This step shows how to:\n", "1. Run a **Golden QA check** to detect annotation drift\n", "2. Select the **next batch** using a hybrid strategy\n", "\n", "The hybrid strategy balances:\n", "- **30% Coverage** - Diversity sampling to avoid tunnel vision\n", "- **70% Targeted** - Samples similar to failures\n", "\n", "This balance is critical. Only chasing failures creates a model that's great at edge cases and terrible at normal cases." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -q scikit-learn" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "import fiftyone.brain as fob\n", "from fiftyone import ViewField as F\n", "import numpy as np\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "from collections import Counter\n", "\n", "LABEL_FIELD_2D = \"human_detections\"\n", "\n", "dataset = fo.load_dataset(\"annotation_tutorial\")\n", "\n", "# Get schema classes\n", "if \"annotation_schema_2d\" in dataset.info:\n", " SCHEMA_CLASSES = set(dataset.info[\"annotation_schema_2d\"][\"classes\"])\n", "else:\n", " SCHEMA_CLASSES = {\"Car\", \"Van\", \"Truck\", \"Pedestrian\", \"Person_sitting\", \"Cyclist\", \"Tram\", \"Misc\"}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Golden QA Check\n", "\n", "Before selecting the next batch, verify annotation quality hasn't drifted. 
The golden set is a small, carefully reviewed sample we check each iteration.\n", "\n", "**What to look for:**\n", "- Label count distribution staying stable\n", "- No unexpected empty samples\n", "- Class distribution roughly matching earlier rounds" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load golden QA set (left camera slice)\n", "golden = dataset.load_saved_view(\"golden_qa\").select_group_slices([\"left\"])\n", "\n", "# For tutorial, copy ground_truth to human_detections if not present\n", "for sample in golden:\n", " if sample.ground_truth and not sample[LABEL_FIELD_2D]:\n", " filtered_dets = [\n", " fo.Detection(label=d.label, bounding_box=d.bounding_box)\n", " for d in sample.ground_truth.detections\n", " if d.label in SCHEMA_CLASSES\n", " ]\n", " sample[LABEL_FIELD_2D] = fo.Detections(detections=filtered_dets)\n", " sample.save()\n", "\n", "print(f\"Golden QA set (left camera): {len(golden)} samples\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Golden QA Check: Compute baseline stats\n", "golden_stats = {\n", " \"total_samples\": len(golden),\n", " \"samples_with_labels\": 0,\n", " \"total_detections\": 0,\n", " \"class_counts\": Counter()\n", "}\n", "\n", "for sample in golden:\n", " if sample[LABEL_FIELD_2D] and len(sample[LABEL_FIELD_2D].detections) > 0:\n", " golden_stats[\"samples_with_labels\"] += 1\n", " golden_stats[\"total_detections\"] += len(sample[LABEL_FIELD_2D].detections)\n", " for det in sample[LABEL_FIELD_2D].detections:\n", " golden_stats[\"class_counts\"][det.label] += 1\n", "\n", "print(\"=\" * 40)\n", "print(\"GOLDEN QA CHECK\")\n", "print(\"=\" * 40)\n", "print(f\"Samples with labels: {golden_stats['samples_with_labels']}/{golden_stats['total_samples']}\")\n", "print(f\"Total detections: {golden_stats['total_detections']}\")\n", "print(f\"Avg detections/sample: 
{golden_stats['total_detections']/max(1,golden_stats['samples_with_labels']):.1f}\")\n", "print(f\"\\nTop classes:\")\n", "for cls, count in golden_stats[\"class_counts\"].most_common(5):\n", " print(f\" {cls}: {count}\")\n", "print(\"=\" * 40)\n", "print(\"\\nIf these numbers change unexpectedly between iterations,\")\n", "print(\"investigate annotation consistency before continuing.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Store golden stats for comparison in future iterations\n", "if \"golden_qa_history\" not in dataset.info:\n", " dataset.info[\"golden_qa_history\"] = []\n", "\n", "dataset.info[\"golden_qa_history\"].append({\n", " \"iteration\": len(dataset.info[\"golden_qa_history\"]),\n", " \"samples_with_labels\": golden_stats[\"samples_with_labels\"],\n", " \"total_detections\": golden_stats[\"total_detections\"],\n", " \"top_classes\": dict(golden_stats[\"class_counts\"].most_common(5))\n", "})\n", "dataset.save()\n", "\n", "print(f\"Saved golden QA stats (iteration {len(dataset.info['golden_qa_history'])-1})\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare for Next Batch Selection" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get unlabeled groups from pool\n", "pool = dataset.load_saved_view(\"pool\")\n", "pool_left = pool.select_group_slices([\"left\"])\n", "\n", "# Find samples still unlabeled\n", "remaining = pool_left.match(F(\"annotation_status\") == \"unlabeled\")\n", "remaining_groups = remaining.distinct(\"group.id\")\n", "\n", "print(f\"Pool (left camera): {len(pool_left)} samples\")\n", "print(f\"Remaining unlabeled: {len(remaining)} samples ({len(remaining_groups)} groups)\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get failure samples from evaluation\n", "try:\n", " failures = dataset.load_saved_view(\"eval_v0_failures\")\n", " print(f\"Failure 
samples: {len(failures)}\")\n", "except Exception:\n", "    failures = dataset.limit(0)  # Empty view\n", "    print(\"No failure view found. Run Step 6 first, or continue with coverage-only selection.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Acquisition Budget\n", "\n", "**Batch sizing guidance:**\n", "- Size batches to your labeling capacity\n", "- For this tutorial, we'll select ~20% of the remaining groups" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select batch size based on remaining pool\n", "batch_size = max(10, int(0.20 * len(remaining_groups)))\n", "\n", "# Split: 30% coverage (ZCore), 70% targeted\n", "coverage_budget = int(0.30 * batch_size)\n", "targeted_budget = batch_size - coverage_budget\n", "\n", "print(f\"Batch v1 budget: {batch_size} groups\")\n", "print(f\"  Coverage (diversity): {coverage_budget} (30%)\")\n", "print(f\"  Targeted (failures): {targeted_budget} (70%)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part 1: Coverage Selection (30%)\n", "\n", "Use the ZCore scores computed in Step 3 to select diverse groups from the remaining pool." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get samples with ZCore scores from remaining pool\n", "remaining_with_scores = remaining.match(F(\"zcore\") != None)\n", "\n", "if len(remaining_with_scores) == 0:\n", "    print(\"No ZCore scores found in remaining pool. 
Falling back to an arbitrary slice of the remaining groups.\")\n", "    # Fallback: take the first groups in pool order (arbitrary, not random)\n", "    coverage_groups = list(remaining_groups)[:coverage_budget]\n", "else:\n", "    # Build group -> score mapping\n", "    group_scores = {}\n", "    for sample in remaining_with_scores:\n", "        group_scores[sample.group.id] = sample.zcore\n", "\n", "    # Sort and select the top-scoring groups\n", "    sorted_groups = sorted(group_scores.items(), key=lambda x: x[1], reverse=True)\n", "    coverage_groups = [gid for gid, _ in sorted_groups[:coverage_budget]]\n", "\n", "print(f\"Coverage selection: {len(coverage_groups)} groups\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part 2: Targeted Selection (70%)\n", "\n", "Find groups similar to the failure samples using embedding-based neighbor search." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def find_neighbor_groups(query_embs, query_group_ids, pool_embs, pool_group_ids, n_per_query=3):\n", "    \"\"\"Find the nearest-neighbor groups in embedding space.\n", "\n", "    Note: query_group_ids is accepted for call-site symmetry but is not used.\n", "    \"\"\"\n", "    if len(query_embs) == 0 or len(pool_embs) == 0:\n", "        return []\n", "\n", "    sims = cosine_similarity(query_embs, pool_embs)\n", "    neighbor_groups = set()\n", "\n", "    for sim_row in sims:\n", "        top_idx = np.argsort(sim_row)[-n_per_query:]\n", "        for idx in top_idx:\n", "            neighbor_groups.add(pool_group_ids[idx])\n", "\n", "    return list(neighbor_groups)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get embeddings for remaining samples\n", "remaining_samples = list(remaining)\n", "remaining_embs = np.array([s.embeddings for s in remaining_samples if s.embeddings is not None])\n", "remaining_group_ids = [s.group.id for s in remaining_samples if s.embeddings is not None]\n", "\n", "if len(failures) > 0 and len(remaining_embs) > 0:\n", "    failure_embs = np.array([s.embeddings for s in failures if s.embeddings is not None])\n", "    failure_group_ids = [s.group.id for s in failures if s.embeddings is not None]\n", "\n", "    
print(f\"Finding neighbors of {len(failure_embs)} failure samples...\")\n", " \n", " # Find neighbor groups (excluding already-selected coverage groups)\n", " targeted_groups = find_neighbor_groups(\n", " failure_embs, failure_group_ids,\n", " remaining_embs, remaining_group_ids,\n", " n_per_query=5\n", " )\n", " targeted_groups = [gid for gid in targeted_groups if gid not in coverage_groups][:targeted_budget]\n", " print(f\"Targeted selection: {len(targeted_groups)} groups\")\n", "else:\n", " print(\"No failures to target or no embeddings. Using coverage-only selection.\")\n", " # Fall back to more coverage\n", " if len(remaining_with_scores) > coverage_budget:\n", " extra_groups = [gid for gid, _ in sorted_groups[coverage_budget:coverage_budget + targeted_budget]]\n", " targeted_groups = [gid for gid in extra_groups if gid not in coverage_groups]\n", " else:\n", " targeted_groups = []\n", " print(f\"Additional coverage selection: {len(targeted_groups)} groups\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combine and Tag Batch v1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Combine selections\n", "batch_v1_groups = list(set(coverage_groups + targeted_groups))\n", "\n", "if len(batch_v1_groups) == 0:\n", " print(\"No groups selected. 
Check that Steps 3 and 6 completed successfully.\")\n", "else:\n", " # Select ALL samples in these groups (all slices)\n", " batch_v1 = dataset.match(F(\"group.id\").is_in(batch_v1_groups))\n", "\n", " # Tag\n", " batch_v1.tag_samples(\"batch:v1\")\n", " batch_v1.tag_samples(\"to_annotate\")\n", " batch_v1.set_values(\"annotation_status\", [\"selected\"] * len(batch_v1))\n", "\n", " # Track source for analysis\n", " dataset.match(F(\"group.id\").is_in(coverage_groups)).tag_samples(\"source:coverage\")\n", " dataset.match(F(\"group.id\").is_in(targeted_groups)).tag_samples(\"source:targeted\")\n", "\n", " # Save view\n", " dataset.save_view(\"batch_v1\", dataset.match_tags(\"batch:v1\"))\n", "\n", " print(f\"\\nBatch v1: {len(batch_v1_groups)} groups\")\n", " print(f\" Coverage: {len(coverage_groups)}\")\n", " print(f\" Targeted: {len(targeted_groups)}\")\n", " print(f\" Total samples (all slices): {len(batch_v1)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Complete Loop\n", "\n", "You now have the full iteration recipe:\n", "\n", "```\n", "1. Run Golden QA check (detect drift)\n", "2. Annotate the current batch:\n", " - Step 4: 2D detections on left camera\n", " - Step 5: 3D cuboids on point cloud\n", "3. Train on all annotated data (Step 6)\n", "4. Evaluate on val set, tag failures\n", "5. Select next batch: 30% coverage + 70% targeted\n", "6. Repeat until stopping criteria\n", "```\n", "\n", "### Stopping Criteria\n", "\n", "Stop when:\n", "- Gains per labeled sample flatten (diminishing returns)\n", "- Remaining failures are mostly label ambiguity\n", "- Val metrics hit your target threshold\n", "\n", "### The 30% Coverage Rule\n", "\n", "**Don't skip the coverage budget.** Only chasing failures leads to:\n", "- Overfitting to edge cases\n", "- Distorted class priors\n", "- Models that fail on \"normal\" inputs\n", "\n", "Coverage keeps you honest." 
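, "\n", "The step-1 Golden QA check can be automated with a small helper. A minimal sketch, assuming a 10% tolerance (`detect_drift` and its threshold are illustrative, not tutorial code; it compares consecutive entries of `golden_qa_history`):\n", "\n", "```python\n", "def detect_drift(prev, curr, tol=0.10):\n", "    \"\"\"Flag drift when the golden set's total detections shift by more than tol.\"\"\"\n", "    if prev[\"total_detections\"] == 0:\n", "        return curr[\"total_detections\"] > 0\n", "    change = abs(curr[\"total_detections\"] - prev[\"total_detections\"]) / prev[\"total_detections\"]\n", "    return change > tol\n", "\n", "# Example: a 20% drop between iterations trips the check\n", "detect_drift({\"total_detections\": 100}, {\"total_detections\": 80})  # -> True\n", "```\n", "\n", "With two or more entries in `dataset.info['golden_qa_history']`, call `detect_drift(history[-2], history[-1])` before selecting the next batch."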
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Progress summary\n", "pool = dataset.load_saved_view(\"pool\")\n", "pool_groups = pool.distinct(\"group.id\")\n", "total_pool_groups = len(pool_groups)\n", "\n", "annotated_groups = len(dataset.match_tags(\"annotated:v0\").distinct(\"group.id\"))\n", "selected_v1_groups = len(dataset.match_tags(\"batch:v1\").distinct(\"group.id\"))\n", "\n", "pool_left = pool.select_group_slices([\"left\"])\n", "still_unlabeled = len(pool_left.match(F(\"annotation_status\") == \"unlabeled\").distinct(\"group.id\"))\n", "\n", "print(\"=\" * 40)\n", "print(\"ANNOTATION PROGRESS (by group/scene)\")\n", "print(\"=\" * 40)\n", "print(f\"Pool total: {total_pool_groups} groups\")\n", "print(f\"Annotated (v0): {annotated_groups} groups ({100*annotated_groups/total_pool_groups:.0f}%)\")\n", "print(f\"Selected (v1): {selected_v1_groups} groups ({100*selected_v1_groups/total_pool_groups:.0f}%)\")\n", "print(f\"Still unlabeled: {still_unlabeled} groups ({100*still_unlabeled/total_pool_groups:.0f}%)\")\n", "print(\"=\" * 40)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "You implemented the iteration loop:\n", "- **Golden QA check** to detect annotation drift\n", "- **Hybrid acquisition**: 30% coverage + 70% targeted\n", "- Tagged `batch:v1` ready for annotation (all slices: left, right, pcd)\n", "\n", "**Why this works:** \n", "- Coverage prevents overfitting to edge cases\n", "- Targeting fixes known failures\n", "- Golden QA catches annotation drift early\n", "- The combination improves faster than either strategy alone\n", "\n", "**Your turn:** Repeat Steps 4-7 with batch_v1, then batch_v2, etc.\n", "\n", "---\n", "\n", "## Congratulations!\n", "\n", "You've completed the Full Loop annotation tutorial. You now know how to:\n", "\n", "1. **Setup** - Create group-level splits for multimodal data\n", "2. 
**Select** - Use ZCore for diversity-based sample selection\n", "3. **Annotate 2D** - Label detections on camera images\n", "4. **Annotate 3D** - Label cuboids on point clouds\n", "5. **Train + Evaluate** - Train a model and analyze failures\n", "6. **Iterate** - Use hybrid acquisition to select the next batch\n", "\n", "This workflow scales from small experiments to production annotation pipelines." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }