{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Collect and reduce classifier embeddings\n",
    "\n",
    "In this tutorial, we use a previously trained classifier to generate\n",
    "per-instance embeddings for a COCO-style object detection dataset, then\n",
    "reduce these embeddings to 3D using PaCMAP.\n",
    "\n",
    "![](../../images/instance-embeddings.png)\n",
    "\n",
    "<!-- Tags: [\"classification\", \"object detection\", \"embeddings\"] -->\n",
    "\n",
    "Before running this notebook, first run:\n",
    "* [1-train-crop-model.ipynb](1-train-crop-model.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -q 3lc[pacmap]\n",
    "%pip install -q git+https://github.com/3lc-ai/3lc-examples.git\n",
    "%pip install -q timm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tlc\n",
    "\n",
    "from tlc_tools.augment_bbs.extend_table_with_metrics import extend_table_with_metrics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Project setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "PROJECT_NAME = \"3LC Tutorials - COCO128\"\n",
    "TMP_PATH = \"../../../transient_data\"\n",
    "EFFICIENTNET_MODEL_NAME = \"efficientnet_b0\"\n",
    "BATCH_SIZE = 32\n",
    "NUM_COMPONENTS = 3  # Target dimensionality of the reduced embeddings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Classifier weights; expected to exist after running 1-train-crop-model.ipynb\n",
    "MODEL_CHECKPOINT = TMP_PATH + \"/instance_classifier.pth\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get input Table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open the Table used in the previous notebook\n",
    "input_table = tlc.Table.from_names(\n",
    "    table_name=\"initial-segmentation\",\n",
    "    dataset_name=\"COCO128\",\n",
    "    project_name=PROJECT_NAME,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Collect embeddings and metrics from the trained classifier"
   ]
  },
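  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Add per-instance embeddings and image metrics to a new Table named \"extended\",\n",
    "# reducing the embeddings to NUM_COMPONENTS dimensions\n",
    "output_table_url, pacmap_reducer, fit_embeddings = extend_table_with_metrics(\n",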
    "    input_table=input_table,\n",
    "    output_table_name=\"extended\",\n",
    "    add_embeddings=True,\n",
    "    add_image_metrics=True,\n",
    "    model_name=EFFICIENTNET_MODEL_NAME,\n",
    "    model_checkpoint=MODEL_CHECKPOINT,\n",
    "    batch_size=BATCH_SIZE,\n",
    "    num_components=NUM_COMPONENTS,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Display the URL of the output Table\n",
    "output_table_url"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}