{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 6: Train + Evaluate\n", "\n", "Train a 2D detector on your labeled data and evaluate it properly. Understanding **where** the model fails tells you what to label next.\n", "\n", "> **Note:** We train on `human_detections` (2D labels from Step 4) using the **left camera slice**. For evaluation, we need labels on the val set too." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -q ultralytics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "from fiftyone import ViewField as F\n", "import os\n", "\n", "LABEL_FIELD_2D = \"human_detections\"\n", "\n", "dataset = fo.load_dataset(\"annotation_tutorial\")\n", "\n", "# Get schema classes\n", "if \"annotation_schema_2d\" in dataset.info:\n", " SCHEMA_CLASSES = set(dataset.info[\"annotation_schema_2d\"][\"classes\"])\n", "else:\n", " SCHEMA_CLASSES = {\"Car\", \"Van\", \"Truck\", \"Pedestrian\", \"Person_sitting\", \"Cyclist\", \"Tram\", \"Misc\"}\n", "\n", "print(f\"Schema classes: {len(SCHEMA_CLASSES)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get Training Data\n", "\n", "**Important:** We train on the **left camera slice** of annotated groups. Only samples with actual labels are included." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Training data: left camera slice with labels from annotated groups\n", "annotated_groups = dataset.match_tags(\"annotated:v0\")\n", "annotated_left = annotated_groups.select_group_slices([\"left\"])\n", "\n", "# Filter to samples with actual detections\n", "train_view = annotated_left.match(F(f\"{LABEL_FIELD_2D}.detections\").length() > 0)\n", "\n", "print(f\"Annotated groups: {len(annotated_groups.distinct('group.id'))}\")\n", "print(f\"Training samples (left camera with labels): {len(train_view)}\")\n", "\n", "if len(train_view) == 0:\n", " print(\"\\n>>> No training samples with labels. Complete Step 4 first.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Validation data: left camera slice from val set\n", "val_set = dataset.load_saved_view(\"val_set\")\n", "val_left = val_set.select_group_slices([\"left\"])\n", "\n", "print(f\"Validation groups: {len(val_set.distinct('group.id'))}\")\n", "print(f\"Validation samples (left camera): {len(val_left)}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# For evaluation, we need human_detections on val set\n", "# In production, you'd label these. 
For this tutorial, copy ground_truth\n", "# FILTERED to schema classes for consistency\n", "\n", "copied_count = 0\n", "skipped_count = 0\n", "\n", "for sample in val_left:\n", "    if sample.ground_truth and not sample[LABEL_FIELD_2D]:\n", "        filtered_dets = []\n", "        for d in sample.ground_truth.detections:\n", "            if d.label in SCHEMA_CLASSES:\n", "                filtered_dets.append(fo.Detection(\n", "                    label=d.label,\n", "                    bounding_box=d.bounding_box\n", "                ))\n", "                copied_count += 1\n", "            else:\n", "                skipped_count += 1\n", "        sample[LABEL_FIELD_2D] = fo.Detections(detections=filtered_dets)\n", "        sample.save()\n", "\n", "print(f\"Val set prepared: {copied_count} detections copied, {skipped_count} skipped (not in schema)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Export for Training\n", "\n", "Export the left camera images and labels in YOLO format." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if len(train_view) == 0:\n", "    raise ValueError(\"No training samples. 
Complete Step 4 first.\")\n", "\n", "# Get classes from training data\n", "classes = train_view.distinct(f\"{LABEL_FIELD_2D}.detections.label\")\n", "print(f\"Classes in training data: {classes}\")\n", "\n", "export_dir = \"/tmp/annotation_tutorial_yolo\"\n", "os.makedirs(export_dir, exist_ok=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export training data\n", "train_view.export(\n", " export_dir=os.path.join(export_dir, \"train\"),\n", " dataset_type=fo.types.YOLOv5Dataset,\n", " label_field=LABEL_FIELD_2D,\n", " classes=classes,\n", ")\n", "\n", "# Export validation data (also needs labels)\n", "val_with_labels = val_left.match(F(f\"{LABEL_FIELD_2D}.detections\").length() > 0)\n", "val_with_labels.export(\n", " export_dir=os.path.join(export_dir, \"val\"),\n", " dataset_type=fo.types.YOLOv5Dataset,\n", " label_field=LABEL_FIELD_2D,\n", " classes=classes,\n", ")\n", "\n", "print(f\"Exported {len(train_view)} train, {len(val_with_labels)} val samples to {export_dir}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create YAML config\n", "yaml_content = f\"\"\"path: {export_dir}\n", "train: train/images\n", "val: val/images\n", "\n", "names:\n", "\"\"\"\n", "for i, cls in enumerate(classes):\n", " yaml_content += f\" {i}: {cls}\\n\"\n", "\n", "yaml_path = os.path.join(export_dir, \"dataset.yaml\")\n", "with open(yaml_path, \"w\") as f:\n", " f.write(yaml_content)\n", "\n", "print(f\"Created {yaml_path}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train YOLOv8" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from ultralytics import YOLO\n", "\n", "# Train (small model, few epochs for demo)\n", "model = YOLO('yolov8n.pt')\n", "results = model.train(\n", " data=yaml_path,\n", " epochs=10,\n", " imgsz=640,\n", " batch=8,\n", " name='tutorial_v0',\n", " project='/tmp/yolo_tutorial'\n", 
")\n", "\n", "model_path = '/tmp/yolo_tutorial/tutorial_v0/weights/best.pt'\n", "print(f\"Model saved: {model_path}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run Inference on Validation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load trained model\n", "model = YOLO(model_path)\n", "\n", "# Run inference on val set (left camera slice)\n", "for sample in val_left:\n", " results = model(sample.filepath, verbose=False)[0]\n", " \n", " detections = []\n", " if results.boxes is not None:\n", " for box in results.boxes:\n", " x1, y1, x2, y2 = box.xyxyn[0].tolist()\n", " conf = box.conf[0].item()\n", " cls_idx = int(box.cls[0].item())\n", " label = classes[cls_idx] if cls_idx < len(classes) else f\"class_{cls_idx}\"\n", " \n", " detections.append(fo.Detection(\n", " label=label,\n", " bounding_box=[x1, y1, x2-x1, y2-y1],\n", " confidence=conf\n", " ))\n", " \n", " sample[\"predictions\"] = fo.Detections(detections=detections)\n", " sample.save()\n", "\n", "print(f\"Added predictions to {len(val_left)} val samples\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Run evaluation\n", "results = val_left.evaluate_detections(\n", " \"predictions\",\n", " gt_field=LABEL_FIELD_2D,\n", " eval_key=\"eval_v0\",\n", " compute_mAP=True\n", ")\n", "\n", "print(f\"mAP: {results.mAP():.3f}\")\n", "results.print_report()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analyze Failures\n", "\n", "Understanding failures is more important than the mAP number." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Reload to see eval fields\n", "val_left = dataset.load_saved_view(\"val_set\").select_group_slices([\"left\"])\n", "\n", "# Find high-FN samples (model missed objects); skip samples with no misses\n", "high_fn = val_left.match(F(\"eval_v0_fn\") > 0).sort_by(\"eval_v0_fn\", reverse=True).limit(10)\n", "high_fn.tag_samples(\"failure:high_fn\")\n", "\n", "# Find high-FP samples (model hallucinated); skip samples with no false positives\n", "high_fp = val_left.match(F(\"eval_v0_fp\") > 0).sort_by(\"eval_v0_fp\", reverse=True).limit(10)\n", "high_fp.tag_samples(\"failure:high_fp\")\n", "\n", "print(f\"Tagged {len(high_fn)} high-FN samples\")\n", "print(f\"Tagged {len(high_fp)} high-FP samples\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# View failures in App\n", "session = fo.launch_app(val_left)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the App:\n", "1. Filter by `failure:high_fn` to see where the model missed objects\n", "2. Filter by `failure:high_fp` to see where the model hallucinated objects\n", "3. Look for patterns: do failures cluster by class, distance, or occlusion?\n", "\n", "These patterns tell you what to label next." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save evaluation info\n", "dataset.info[\"eval_v0\"] = {\n", "    \"mAP\": results.mAP(),\n", "    \"train_samples\": len(train_view),\n", "    \"val_samples\": len(val_left),\n", "    \"model_path\": model_path\n", "}\n", "dataset.save()\n", "\n", "# Save failure view (overwrite so the cell can be re-run)\n", "failures = val_left.match_tags([\"failure:high_fn\", \"failure:high_fp\"])\n", "dataset.save_view(\"eval_v0_failures\", failures, overwrite=True)\n", "\n", "print(f\"Saved {len(failures)} failure samples to view 'eval_v0_failures'\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "You trained and evaluated a 2D detector:\n", "- Used the **left camera slice** from annotated groups\n", "- Exported in YOLO format\n", "- Trained YOLOv8n for 10 epochs\n", "- Evaluated with FiftyOne: mAP + per-sample FP/FN\n", "- Tagged failure cases for the next iteration\n", "\n", "**Key insight:** The failure tags tell you what to label next.\n", "\n", "**Artifacts:**\n", "- `predictions` field with model outputs\n", "- `eval_v0` evaluation results\n", "- Failure tags: `failure:high_fn`, `failure:high_fp`\n", "- `eval_v0_failures` saved view\n", "\n", "**Next:** Step 7 - Iteration Loop" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }