{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 3: Smart Sample Selection\n", "\n", "Random sampling wastes labels on redundant near-duplicates. This step uses **diversity-based selection** to pick high-value scenes that cover your data distribution efficiently.\n", "\n", "We'll use **ZCore (Zero-Shot Coreset Selection)** to score samples based on:\n", "- **Coverage**: How much of the embedding space does this sample represent?\n", "- **Redundancy**: How many near-duplicates exist?\n", "\n", "High ZCore score = valuable for labeling. Low score = redundant, skip it.\n", "\n", "> **Note:** For grouped datasets, we compute embeddings on the **left camera slice** and select at the **group level** (scene)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "import fiftyone as fo\nimport fiftyone.brain as fob\nimport numpy as np\nfrom fiftyone import ViewField as F\n\ndataset = fo.load_dataset(\"annotation_tutorial\")\npool = dataset.load_saved_view(\"pool\")\n\n# Get pool groups (scenes)\npool_groups = pool.distinct(\"group.id\")\nprint(f\"Pool: {len(pool_groups)} groups (scenes)\")\ntotal_samples = sum(len(pool.select_group_slices([s])) for s in dataset.group_slices)\nprint(f\"Pool: {total_samples} total samples (all slices)\")" }, { "cell_type": "markdown", "metadata": {}, "source": "## Compute Embeddings on Left Camera Slice\n\nFor diversity selection, we need embeddings. We compute them on the **left camera images** since that is our primary 2D annotation target.\n\n> **Dependencies:** This step requires `torch` and `umap-learn`. 
Install them if needed:\n> ```bash\n> pip install torch torchvision umap-learn\n> ```" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get left camera slice from pool\n", "pool_left = pool.select_group_slices([\"left\"])\n", "\n", "print(f\"Left camera samples in pool: {len(pool_left)}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute embeddings (takes a few minutes)\n", "fob.compute_visualization(\n", " pool_left,\n", " embeddings=\"embeddings\",\n", " brain_key=\"img_viz\",\n", " verbose=True\n", ")\n", "\n", "print(\"Embeddings computed on left camera slice.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ZCore: Zero-Shot Coreset Selection\n", "\n", "ZCore scores each sample by iteratively:\n", "1. Sampling random points in embedding space\n", "2. Finding the nearest data point (coverage bonus)\n", "3. Penalizing nearby neighbors (redundancy penalty)\n", "\n", "The result: samples covering unique regions score high; redundant samples score low." 
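, "\n", "\n", "As a toy illustration of the redundancy penalty (a sketch, separate from the implementation below): penalty weights are proportional to `1 / d**redund_exp` over the neighbor distances `d`, normalized to sum to 1, so the closest near-duplicates absorb most of the penalty:\n", "\n", "```python\n", "import numpy as np\n", "\n", "# Hypothetical neighbor distances from a covered point\n", "nn_dist = np.array([0.1, 0.2, 0.4])\n", "\n", "# Inverse-distance penalty with redund_exp=4 (the value used below)\n", "dist_penalty = 1 / (nn_dist**4 + 1e-8)\n", "dist_penalty /= dist_penalty.sum()\n", "print(dist_penalty.round(3))  # [0.938 0.059 0.004]\n", "```"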
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def zcore_score(embeddings, n_sample=10000, sample_dim=2, redund_nn=100, redund_exp=4, seed=42):\n", " \"\"\"\n", " Compute ZCore scores for coverage-based sample selection.\n", " \n", " Reference implementation from https://github.com/voxel51/zcore\n", " \n", " Args:\n", " embeddings: np.array of shape (n_samples, embedding_dim)\n", " n_sample: Number of random samples to draw\n", " sample_dim: Number of dimensions to sample at a time\n", " redund_nn: Number of nearest neighbors for redundancy penalty\n", " redund_exp: Exponent for distance-based redundancy penalty\n", " seed: Random seed for reproducibility\n", " \n", " Returns:\n", " Normalized scores (0-1) where higher = more valuable for labeling\n", " \"\"\"\n", " np.random.seed(seed)\n", " \n", " n = len(embeddings)\n", " n_dim = embeddings.shape[1]\n", " \n", " emb_min = np.min(embeddings, axis=0)\n", " emb_max = np.max(embeddings, axis=0)\n", " emb_med = np.median(embeddings, axis=0)\n", " \n", " scores = np.random.uniform(0, 1, n)\n", " \n", " for i in range(n_sample):\n", " if i % 2000 == 0:\n", " print(f\" ZCore progress: {i}/{n_sample}\")\n", " \n", " dim = np.random.choice(n_dim, min(sample_dim, n_dim), replace=False)\n", " sample = np.random.triangular(emb_min[dim], emb_med[dim], emb_max[dim])\n", " \n", " embed_dist = np.sum(np.abs(embeddings[:, dim] - sample), axis=1)\n", " idx = np.argmin(embed_dist)\n", " scores[idx] += 1\n", " \n", " cover_sample = embeddings[idx, dim]\n", " nn_dist = np.sum(np.abs(embeddings[:, dim] - cover_sample), axis=1)\n", " nn = np.argsort(nn_dist)[1:]\n", " \n", " if nn_dist[nn[0]] == 0:\n", " scores[nn[0]] -= 1\n", " else:\n", " nn = nn[:redund_nn]\n", " dist_penalty = 1 / (nn_dist[nn] ** redund_exp + 1e-8)\n", " dist_penalty /= np.sum(dist_penalty)\n", " scores[nn] -= dist_penalty\n", " \n", " scores = (scores - np.min(scores)) / (np.max(scores) - np.min(scores) + 1e-8)\n", 
" return scores.astype(np.float32)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get embeddings from left camera samples\n", "pool_left_samples = list(pool_left)\n", "embeddings = np.array([s.embeddings for s in pool_left_samples if s.embeddings is not None])\n", "valid_samples = [s for s in pool_left_samples if s.embeddings is not None]\n", "\n", "print(f\"Computing ZCore for {len(embeddings)} samples...\")\n", "print(f\"Embedding dimension: {embeddings.shape[1]}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute ZCore scores\n", "scores = zcore_score(\n", " embeddings,\n", " n_sample=5000,\n", " sample_dim=2,\n", " redund_nn=50,\n", " redund_exp=4,\n", " seed=42\n", ")\n", "\n", "print(f\"\\nZCore scores computed!\")\n", "print(f\"Score range: {scores.min():.3f} - {scores.max():.3f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Add ZCore scores to the left camera samples\n", "for sample, score in zip(valid_samples, scores):\n", " sample[\"zcore\"] = float(score)\n", " sample.save()\n", "\n", "print(f\"Added 'zcore' field to {len(valid_samples)} left camera samples\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select at the Group Level\n", "\n", "We computed scores on individual samples (left camera), but we need to select **groups** (scenes). Each group includes all slices (left, right, pcd).\n", "\n", "Selection strategy: Use the ZCore score from the left camera sample to rank groups." 
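, "\n", "\n", "Since each scene contributes exactly one left-camera sample, this is a one-to-one mapping. If a slice ever contributed multiple scored samples per group, a natural (hypothetical) aggregation is the max:\n", "\n", "```python\n", "# Sketch: max-aggregate per-sample scores into group scores\n", "# (assumes samples carry the zcore field computed above)\n", "from collections import defaultdict\n", "\n", "agg_scores = defaultdict(float)\n", "for sample in valid_samples:\n", "    agg_scores[sample.group.id] = max(agg_scores[sample.group.id], sample.zcore)\n", "```"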
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Build group_id -> zcore mapping\n", "group_scores = {}\n", "for sample in valid_samples:\n", " group_id = sample.group.id\n", " group_scores[group_id] = sample.zcore\n", "\n", "# Sort groups by ZCore score\n", "sorted_groups = sorted(group_scores.items(), key=lambda x: x[1], reverse=True)\n", "\n", "print(f\"Groups with ZCore scores: {len(sorted_groups)}\")\n", "print(f\"Top 5 groups by ZCore:\")\n", "for gid, score in sorted_groups[:5]:\n", " print(f\" {gid[:8]}...: {score:.3f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select top groups for batch v0\n", "# ~25% of pool groups, minimum 20\n", "batch_size = max(20, int(0.25 * len(sorted_groups)))\n", "selected_group_ids = [gid for gid, _ in sorted_groups[:batch_size]]\n", "\n", "print(f\"Selected {len(selected_group_ids)} groups for Batch v0\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "# Tag ALL samples in selected groups (all slices)\n# Must iterate all slices since F(\"group.id\").is_in() does not work on grouped datasets\nfor slice_name in dataset.group_slices:\n view = dataset.select_group_slices([slice_name])\n for sample in view.iter_samples(autosave=True):\n if sample.group.id in selected_group_ids:\n sample.tags.append(\"batch:v0\")\n sample.tags.append(\"to_annotate\")\n sample[\"annotation_status\"] = \"selected\"\n\n# Save as view\ndataset.save_view(\"batch_v0\", dataset.match_tags(\"batch:v0\"))\n\n# Count results\nbatch_v0_view = dataset.load_saved_view(\"batch_v0\")\nn_groups = len(batch_v0_view.distinct(\"group.id\"))\nn_samples = sum(len(batch_v0_view.select_group_slices([s])) for s in dataset.group_slices)\n\nprint(f\"\\nBatch v0:\")\nprint(f\" Groups: {n_groups}\")\nprint(f\" Total samples (all slices): {n_samples}\")" }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "# Verify: check sample counts per slice\n", "batch_view = dataset.load_saved_view(\"batch_v0\")\n", "for slice_name in dataset.group_slices:\n", " slice_count = len(batch_view.select_group_slices([slice_name]))\n", " print(f\" {slice_name}: {slice_count} samples\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Visualize in the App\n", "session = fo.launch_app(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": "In the App:\n1. Open the **Embeddings** panel to see the 2D projection\n2. Color by `zcore` to see score distribution\n3. Filter by `batch:v0` tag to see selected groups\n4. Verify high-ZCore samples are spread across clusters (good coverage)\n\n![Embeddings panel with ZCore scores](https://cdn.voxel51.com/getting_started_annotation/notebook3/embeddings_zcore.webp)" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why Diversity Sampling Beats Random\n", "\n", "| Method | What it does | Result |\n", "|--------|-------------|--------|\n", "| **Random** | Picks samples uniformly | Over-samples dense regions, misses rare cases |\n", "| **ZCore** | Balances coverage vs redundancy | Maximizes diversity, fewer wasted labels |\n", "\n", "Research shows diversity-based selection can significantly reduce labeling requirements while maintaining model performance." 
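, "\n", "\n", "A self-contained toy comparison (using greedy farthest-point selection as a simple stand-in for a diversity criterion; this is an illustration, not the ZCore pipeline):\n", "\n", "```python\n", "import numpy as np\n", "\n", "rng = np.random.default_rng(0)\n", "# Toy embeddings: 95 near-duplicates in one cluster plus 5 rare outliers\n", "X = np.vstack([rng.normal(0, 0.1, (95, 2)), rng.uniform(-3, 3, (5, 2))])\n", "\n", "def mean_dist_to_selection(idx):\n", "    # Mean distance from each point to its nearest selected point (lower = better coverage)\n", "    d = np.linalg.norm(X[:, None] - X[idx][None], axis=-1)\n", "    return d.min(axis=1).mean()\n", "\n", "# Random pick of 6 points vs. greedy farthest-point pick of 6 points\n", "rand_idx = rng.choice(len(X), 6, replace=False)\n", "fp_idx = [0]\n", "for _ in range(5):\n", "    d = np.linalg.norm(X[:, None] - X[fp_idx][None], axis=-1).min(axis=1)\n", "    fp_idx.append(int(d.argmax()))\n", "\n", "# The diversity pick typically yields a lower mean distance (better coverage)\n", "print(f\"random:    {mean_dist_to_selection(rand_idx):.3f}\")\n", "print(f\"diversity: {mean_dist_to_selection(np.array(fp_idx)):.3f}\")\n", "```"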
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "You selected a diverse batch using ZCore:\n", "- Computed embeddings on **left camera slice**\n", "- Ran ZCore to score coverage vs redundancy\n", "- Selected top-scoring **groups** (scenes)\n", "- Tagged all slices (left, right, pcd) for annotation\n", "\n", "**Artifacts:**\n", "- `embeddings` field on left camera samples\n", "- `zcore` field with selection scores\n", "- `batch_v0` saved view (all slices for selected groups)\n", "- Tags: `batch:v0`, `to_annotate`\n", "\n", "**Next:** Step 4 - 2D Annotation + QA" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }