{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 6: Train + Evaluate\n",
    "\n",
    "Train a 2D detector on your labeled data, then evaluate it on a held-out validation set. Knowing **where** the model fails tells you what to label next.\n",
    "\n",
    "> **Note:** We train on `human_detections` (the 2D labels from Step 4) using the **left camera slice**. Evaluation needs labels in the same field on the val set, so the val set must be labeled too."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -q ultralytics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import fiftyone as fo\n",
    "from fiftyone import ViewField as F\n",
    "import os\n",
    "\n",
    "LABEL_FIELD_2D = \"human_detections\"\n",
    "\n",
    "dataset = fo.load_dataset(\"annotation_tutorial\")\n",
    "\n",
    "# Get schema classes\n",
    "if \"annotation_schema_2d\" in dataset.info:\n",
    "    SCHEMA_CLASSES = set(dataset.info[\"annotation_schema_2d\"][\"classes\"])\n",
    "else:\n",
    "    SCHEMA_CLASSES = {\"Car\", \"Van\", \"Truck\", \"Pedestrian\", \"Person_sitting\", \"Cyclist\", \"Tram\", \"Misc\"}\n",
    "\n",
    "print(f\"Schema classes: {len(SCHEMA_CLASSES)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get Training Data\n",
    "\n",
    "**Important:** We train on the **left camera slice** of annotated groups. Only samples with actual labels are included."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Training data: left camera slice with labels from annotated groups\n",
    "annotated_groups = dataset.match_tags(\"annotated:v0\")\n",
    "annotated_left = annotated_groups.select_group_slices([\"left\"])\n",
    "\n",
    "# Filter to samples with actual detections\n",
    "train_view = annotated_left.match(F(f\"{LABEL_FIELD_2D}.detections\").length() > 0)\n",
    "\n",
    "print(f\"Annotated groups: {len(annotated_groups.distinct('group.id'))}\")\n",
    "print(f\"Training samples (left camera with labels): {len(train_view)}\")\n",
    "\n",
    "if len(train_view) == 0:\n",
    "    print(\"\\n>>> No training samples with labels. Complete Step 4 first.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Validation data: left camera slice from val set\n",
    "val_set = dataset.load_saved_view(\"val_set\")\n",
    "val_left = val_set.select_group_slices([\"left\"])\n",
    "\n",
    "print(f\"Validation groups: {len(val_set.distinct('group.id'))}\")\n",
    "print(f\"Validation samples (left camera): {len(val_left)}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Evaluation needs human_detections on the val set as well.\n",
    "# In production you would label these samples yourself. For this tutorial we copy\n",
    "# ground_truth instead, keeping only schema classes so train and val use the same labels.\n",
    "\n",
    "copied_count = 0\n",
    "skipped_count = 0\n",
    "\n",
    "for sample in val_left:\n",
    "    # Only fill samples that have ground truth and no human labels yet\n",
    "    if sample.ground_truth and not sample[LABEL_FIELD_2D]:\n",
    "        filtered_dets = []\n",
    "        for d in sample.ground_truth.detections:\n",
    "            if d.label in SCHEMA_CLASSES:\n",
    "                filtered_dets.append(fo.Detection(\n",
    "                    label=d.label,\n",
    "                    bounding_box=d.bounding_box\n",
    "                ))\n",
    "                copied_count += 1\n",
    "            else:\n",
    "                skipped_count += 1\n",
    "        sample[LABEL_FIELD_2D] = fo.Detections(detections=filtered_dets)\n",
    "        sample.save()\n",
    "\n",
    "print(f\"Val set prepared: {copied_count} detections copied, {skipped_count} skipped (not in schema)\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Export for Training\n",
    "\n",
    "Export the left camera images and labels in YOLO format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if len(train_view) == 0:\n",
    "    raise ValueError(\"No training samples. Complete Step 4 first.\")\n",
    "\n",
    "# Get classes from training data\n",
    "classes = train_view.distinct(f\"{LABEL_FIELD_2D}.detections.label\")\n",
    "print(f\"Classes in training data: {classes}\")\n",
    "\n",
    "export_dir = \"/tmp/annotation_tutorial_yolo\"\n",
    "os.makedirs(export_dir, exist_ok=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Export training data\n",
    "train_view.export(\n",
    "    export_dir=os.path.join(export_dir, \"train\"),\n",
    "    dataset_type=fo.types.YOLOv5Dataset,\n",
    "    label_field=LABEL_FIELD_2D,\n",
    "    classes=classes,\n",
    ")\n",
    "\n",
    "# Export validation data (also needs labels)\n",
    "val_with_labels = val_left.match(F(f\"{LABEL_FIELD_2D}.detections\").length() > 0)\n",
    "val_with_labels.export(\n",
    "    export_dir=os.path.join(export_dir, \"val\"),\n",
    "    dataset_type=fo.types.YOLOv5Dataset,\n",
    "    label_field=LABEL_FIELD_2D,\n",
    "    classes=classes,\n",
    ")\n",
    "\n",
    "print(f\"Exported {len(train_view)} train, {len(val_with_labels)} val samples to {export_dir}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create YAML config\n",
    "yaml_content = f\"\"\"path: {export_dir}\n",
    "train: train/images\n",
    "val: val/images\n",
    "\n",
    "names:\n",
    "\"\"\"\n",
    "for i, cls in enumerate(classes):\n",
    "    yaml_content += f\"  {i}: {cls}\\n\"\n",
    "\n",
    "yaml_path = os.path.join(export_dir, \"dataset.yaml\")\n",
    "with open(yaml_path, \"w\") as f:\n",
    "    f.write(yaml_content)\n",
    "\n",
    "print(f\"Created {yaml_path}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train YOLOv8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ultralytics import YOLO\n",
    "\n",
    "# Train (small model, few epochs for demo)\n",
    "model = YOLO('yolov8n.pt')\n",
    "results = model.train(\n",
    "    data=yaml_path,\n",
    "    epochs=10,\n",
    "    imgsz=640,\n",
    "    batch=8,\n",
    "    name='tutorial_v0',\n",
    "    project='/tmp/yolo_tutorial',\n",
    "    exist_ok=True,  # reuse the run directory on reruns so the path below stays valid\n",
    ")\n",
    "\n",
    "# Best checkpoint written by the run above\n",
    "model_path = '/tmp/yolo_tutorial/tutorial_v0/weights/best.pt'\n",
    "print(f\"Model saved: {model_path}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run Inference on Validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load trained model\n",
    "model = YOLO(model_path)\n",
    "\n",
    "# Run inference on val set (left camera slice)\n",
    "for sample in val_left:\n",
    "    results = model(sample.filepath, verbose=False)[0]\n",
    "    \n",
    "    detections = []\n",
    "    if results.boxes is not None:\n",
    "        for box in results.boxes:\n",
    "            x1, y1, x2, y2 = box.xyxyn[0].tolist()\n",
    "            conf = box.conf[0].item()\n",
    "            cls_idx = int(box.cls[0].item())\n",
    "            label = classes[cls_idx] if cls_idx < len(classes) else f\"class_{cls_idx}\"\n",
    "            \n",
    "            detections.append(fo.Detection(\n",
    "                label=label,\n",
    "                bounding_box=[x1, y1, x2-x1, y2-y1],\n",
    "                confidence=conf\n",
    "            ))\n",
    "    \n",
    "    sample[\"predictions\"] = fo.Detections(detections=detections)\n",
    "    sample.save()\n",
    "\n",
    "print(f\"Added predictions to {len(val_left)} val samples\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run evaluation\n",
    "results = val_left.evaluate_detections(\n",
    "    \"predictions\",\n",
    "    gt_field=LABEL_FIELD_2D,\n",
    "    eval_key=\"eval_v0\",\n",
    "    compute_mAP=True\n",
    ")\n",
    "\n",
    "print(f\"mAP: {results.mAP():.3f}\")\n",
    "results.print_report()"
   ]
  },
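  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analyze Failures\n",
    "\n",
    "The mAP number summarizes performance; the failure cases show what to fix. Evaluation stored per-sample counts in `eval_v0_fp` and `eval_v0_fn`, which we use to find the worst samples."
   ]
  },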
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reload to see eval fields\n",
    "val_left = dataset.load_saved_view(\"val_set\").select_group_slices([\"left\"])\n",
    "\n",
    "# Clear tags from earlier runs so reruns don't accumulate stale failures\n",
    "val_left.untag_samples([\"failure:high_fn\", \"failure:high_fp\"])\n",
    "\n",
    "# Find high-FN samples (model missed objects), skipping samples with no misses\n",
    "high_fn = val_left.match(F(\"eval_v0_fn\") > 0).sort_by(\"eval_v0_fn\", reverse=True).limit(10)\n",
    "high_fn.tag_samples(\"failure:high_fn\")\n",
    "\n",
    "# Find high-FP samples (model predicted objects that are not there)\n",
    "high_fp = val_left.match(F(\"eval_v0_fp\") > 0).sort_by(\"eval_v0_fp\", reverse=True).limit(10)\n",
    "high_fp.tag_samples(\"failure:high_fp\")\n",
    "\n",
    "print(f\"Tagged {len(high_fn)} high-FN samples\")\n",
    "print(f\"Tagged {len(high_fp)} high-FP samples\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# View failures in App\n",
    "session = fo.launch_app(val_left)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the App:\n",
    "1. Filter by `failure:high_fn` to see where the model missed objects\n",
    "2. Filter by `failure:high_fp` to see where the model predicted objects that are not there\n",
    "3. Look for patterns: specific classes? distances? occlusions?\n",
    "\n",
    "These patterns tell you what to label next."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save evaluation info\n",
    "dataset.info[\"eval_v0\"] = {\n",
    "    \"mAP\": results.mAP(),\n",
    "    \"train_samples\": len(train_view),\n",
    "    \"val_samples\": len(val_left),\n",
    "    \"model_path\": model_path\n",
    "}\n",
    "dataset.save()\n",
    "\n",
    "# Save failure view (overwrite so this cell can be rerun)\n",
    "failures = val_left.match_tags([\"failure:high_fn\", \"failure:high_fp\"])\n",
    "dataset.save_view(\"eval_v0_failures\", failures, overwrite=True)\n",
    "\n",
    "print(f\"Saved {len(failures)} failure samples to view 'eval_v0_failures'\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "You trained and evaluated a 2D detector:\n",
    "- Used **left camera slice** from annotated groups\n",
    "- Exported in YOLO format\n",
    "- Trained YOLOv8n for 10 epochs\n",
    "- Evaluated with FiftyOne: mAP + per-sample FP/FN\n",
    "- Tagged failure cases for next iteration\n",
    "\n",
    "**Key insight:** The failure tags tell you what to label next.\n",
    "\n",
    "**Artifacts:**\n",
    "- `predictions` field with model outputs\n",
    "- `eval_v0` evaluation results\n",
    "- Failure tags: `failure:high_fn`, `failure:high_fp`\n",
    "- `eval_v0_failures` saved view\n",
    "\n",
    "**Next:** Step 7 - Iteration Loop"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}