{ "cells": [ { "cell_type": "markdown", "id": "6888a2ad", "metadata": {}, "source": [ "\n", "# Understanding and Using Embeddings\n", "\n", "Welcome to this hands-on workshop where we will explore **embeddings** and their importance in Visual AI. \n", "Embeddings play a crucial role in **image search, clustering, anomaly detection, and representation learning**.\n", "In this notebook, we will learn how to generate, visualize, and explore embeddings using **FiftyOne**.\n", "\n", "![using_embeddings](https://cdn.voxel51.com/getting_started_manufacturing/notebook2/using_embeddings.webp)\n", "\n", "## Learning Objectives:\n", "- Understand what embeddings are and why they matter in Visual AI.\n", "- Learn how to compute and store embeddings in FiftyOne.\n", "- Use embeddings for similarity search and visualization.\n", "- Leverage FiftyOne's interactive tools to explore embeddings.\n", "\n" ] }, { "cell_type": "markdown", "id": "cdd4c2d9", "metadata": {}, "source": [ "\n", "## What Are Embeddings?\n", "\n", "Embeddings are **vector representations** of data (images, videos, text, etc.) that capture meaningful characteristics. \n", "For images, embeddings store compressed feature representations learned by deep learning models. 
These features enable tasks such as:\n", "- **Similarity Search**: Find images that are visually similar.\n", "- **Clustering**: Group images with shared characteristics.\n", "- **Anomaly Detection**: Identify outliers in datasets.\n", "- **Transfer Learning**: Use learned embeddings to improve other AI tasks.\n", "\n", "### Further Reading:\n", "- [Introduction to Embeddings](https://www.tensorflow.org/text/guide/word_embeddings)\n", "- [Feature Representations in Deep Learning](https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html)\n" ] }, { "cell_type": "markdown", "id": "34dbec8d", "metadata": {}, "source": [ "\n", "## Generating Embeddings in FiftyOne\n", "\n", "FiftyOne provides seamless integration for embedding computation. \n", "You can extract embeddings using pre-trained deep learning models (such as CLIP, ResNet, or custom models) and store them in FiftyOne datasets.\n", "\n", "### How It Works:\n", "1. Load a dataset in FiftyOne.\n", "2. Extract embeddings from a model.\n", "3. Store and visualize embeddings.\n", "\n", "**Relevant Documentation:** [Computing and Storing Embeddings](https://voxel51.com/docs/fiftyone/user_guide/brain.html#computing-embeddings)\n", "\n", "
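\n", "The three steps above can be sketched directly in code. Below is a minimal sketch; the `quickstart` zoo dataset and the `embeddings` field name are illustrative choices, not requirements:\n", "\n", "```python\n", "import fiftyone.zoo as foz\n", "\n", "# 1. Load a small sample dataset from the FiftyOne zoo\n", "dataset = foz.load_zoo_dataset(\"quickstart\", max_samples=50)\n", "\n", "# 2. Extract embeddings with a pre-trained model\n", "model = foz.load_zoo_model(\"clip-vit-base32-torch\")\n", "\n", "# 3. Store one embedding vector per sample in an `embeddings` field\n", "dataset.compute_embeddings(model, embeddings_field=\"embeddings\")\n", "\n", "print(len(dataset.first().embeddings))  # 512 for CLIP ViT-B/32\n", "```\n", "\n", "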
\n", "Note: You must install the `umap-learn>=0.5` package in order to use UMAP-based visualization. This is recommended, as UMAP is awesome! If you do not wish to install UMAP, try `method='tsne'` instead.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install fiftyone huggingface_hub gdown umap-learn torch torchvision" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo  # base library and app\n", "import fiftyone.utils.huggingface as fouh  # Hugging Face integration\n", "\n", "dataset_name = \"MVTec_AD\"\n", "\n", "# Check if the dataset exists\n", "if dataset_name in fo.list_datasets():\n", "    print(f\"Dataset '{dataset_name}' exists. Loading...\")\n", "    dataset = fo.load_dataset(dataset_name)\n", "else:\n", "    print(f\"Dataset '{dataset_name}' does not exist. Creating a new one...\")\n", "    # Clone the dataset with a new name and make it persistent\n", "    dataset_ = fouh.load_from_hub(\"Voxel51/mvtec-ad\", persistent=True, overwrite=True)\n", "    dataset = dataset_.clone(\"MVTec_AD\")\n", "    dataset.persistent = True" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a working copy of the dataset to hold the embeddings\n", "if \"MVTec_AD_emb\" in fo.list_datasets():\n", "    dataset_emb = fo.load_dataset(\"MVTec_AD_emb\")\n", "else:\n", "    dataset_emb = dataset.clone(\"MVTec_AD_emb\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dataset_emb)" ] }, { "cell_type": "markdown", "id": "5596cbed", "metadata": {}, "source": [ "\n", "## Exploring and Visualizing Embeddings\n", "\n", "Once embeddings are generated, we can visualize them using **dimensionality reduction techniques** like:\n", "- **t-SNE (t-Distributed Stochastic Neighbor Embedding)**\n", "- **UMAP (Uniform Manifold Approximation and Projection)**\n", "\n", "These methods reduce the high-dimensional feature space into 2D/3D representations for interactive visualization.\n", "\n", "**Relevant Documentation:** [Visualizing Embeddings in 
FiftyOne](https://docs.voxel51.com/brain.html#visualizing-embeddings)\n", "\n", "
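\n", "Under the hood, both methods map each high-dimensional vector to a low-dimensional point. As a standalone sketch of that reduction step with `umap-learn`, using random placeholder vectors in place of real image embeddings:\n", "\n", "```python\n", "import numpy as np\n", "import umap\n", "\n", "# Placeholder for an (N, D) array of image embeddings\n", "embeddings = np.random.rand(100, 512)\n", "\n", "# Project to 2D for interactive visualization\n", "points_2d = umap.UMAP(n_components=2, random_state=51).fit_transform(embeddings)\n", "print(points_2d.shape)  # (100, 2)\n", "```\n", "\n", "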
\n", "Note: Be patient; computing the embeddings takes about 5-10 minutes.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute embeddings for MVTec AD using CLIP\n", "\n", "import fiftyone.brain as fob\n", "import fiftyone.zoo.models as fozm\n", "\n", "# Load a pre-trained model (e.g., CLIP)\n", "model = fozm.load_zoo_model(\"clip-vit-base32-torch\")\n", "\n", "fob.compute_visualization(\n", "    dataset_emb,\n", "    model=model,\n", "    embeddings=\"mvtec_emb\",\n", "    brain_key=\"mvtec_embeddings\",\n", "    method=\"umap\",  # Change to \"tsne\" for t-SNE\n", "    num_dims=2,  # Reduce to 2D\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset_emb.reload()\n", "print(dataset_emb)\n", "print(dataset_emb.last())" ] }, { "cell_type": "markdown", "id": "d486ee56", "metadata": {}, "source": [ "\n", "## Performing Similarity Search with Embeddings\n", "\n", "With embeddings, we can search for visually similar images by computing the nearest neighbors in the embedding space.\n", "FiftyOne provides built-in tools to perform **similarity search** efficiently.\n", "Note that sorting by similarity in the App requires a similarity index, which you can build with `fob.compute_similarity()`.\n", "\n", "**Relevant Documentation:** [Performing Similarity Search](https://voxel51.com/docs/fiftyone/user_guide/brain.html#similarity-search)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Launch the App on the dataset that holds the embeddings and brain runs\n", "session = fo.launch_app(dataset_emb, port=5152, auto=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![similarity](https://cdn.voxel51.com/getting_started_manufacturing/notebook2/similarity.webp)\n" ] }, { "cell_type": "markdown", "id": "186f702f", "metadata": {}, "source": [ "\n", "### Next Steps:\n", "Try using different models for embedding extraction, explore clustering techniques, and test similarity search with your own datasets! 
🚀\n" ] } ], "metadata": { "kernelspec": { "display_name": "manu_env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.17" } }, "nbformat": 4, "nbformat_minor": 2 }