{ "cells": [ { "cell_type": "markdown", "id": "6888a2ad", "metadata": {}, "source": [ "\n", "# Understanding and Using Embeddings\n", "\n", "Welcome to this hands-on workshop where we will explore **embeddings** and their importance in Visual AI. \n", "Embeddings play a crucial role in **image search, clustering, anomaly detection, and representation learning**.\n", "In this notebook, we will learn how to generate, visualize, and explore embeddings using **FiftyOne**.\n", "\n", "![using_embeddings](https://cdn.voxel51.com/getting_started_manufacturing/notebook2/using_embeddings.webp)\n", "\n", "## Learning Objectives:\n", "- Understand what embeddings are and why they matter in Visual AI.\n", "- Learn how to compute and store embeddings in FiftyOne.\n", "- Use embeddings for similarity search and visualization.\n", "- Leverage FiftyOne's interactive tools to explore embeddings.\n", "\n" ] }, { "cell_type": "markdown", "id": "cdd4c2d9", "metadata": {}, "source": [ "\n", "## What Are Embeddings?\n", "\n", "Embeddings are **vector representations** of data (images, videos, text, etc.) that capture meaningful characteristics. \n", "For images, embeddings store compressed feature representations learned by deep learning models. 
These features enable tasks such as:\n", "- **Similarity Search**: Find images that are visually similar.\n", "- **Clustering**: Group images with shared characteristics.\n", "- **Anomaly Detection**: Identify outliers in datasets.\n", "- **Transfer Learning**: Use learned embeddings to improve other AI tasks.\n", "\n", "### Further Reading:\n", "- [Introduction to Embeddings](https://www.tensorflow.org/text/guide/word_embeddings)\n", "- [Feature Representations in Deep Learning](https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html)\n" ] }, { "cell_type": "markdown", "id": "34dbec8d", "metadata": {}, "source": [ "\n", "## Generating Embeddings in FiftyOne\n", "\n", "FiftyOne provides seamless integration for embedding computation. \n", "You can extract embeddings using pre-trained deep learning models (such as CLIP, ResNet, or custom models) and store them in FiftyOne datasets.\n", "\n", "### How It Works:\n", "1. Load a dataset in FiftyOne.\n", "2. Extract embeddings from a model.\n", "3. Store and visualize embeddings.\n", "\n", "**Relevant Documentation:** [Computing and Storing Embeddings](https://voxel51.com/docs/fiftyone/user_guide/brain.html#computing-embeddings)\n", "\n", "
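\n", "The three steps above can be sketched directly in code. Below is a minimal sketch; the `quickstart` zoo dataset and the `embeddings` field name are illustrative choices, not requirements:\n", "\n", "```python\n", "import fiftyone.zoo as foz\n", "\n", "# 1. Load a small sample dataset from the FiftyOne zoo\n", "dataset = foz.load_zoo_dataset(\"quickstart\", max_samples=50)\n", "\n", "# 2. Extract embeddings with a pre-trained model\n", "model = foz.load_zoo_model(\"clip-vit-base32-torch\")\n", "\n", "# 3. Store one embedding vector per sample in an `embeddings` field\n", "dataset.compute_embeddings(model, embeddings_field=\"embeddings\")\n", "\n", "print(len(dataset.first().embeddings))  # 512 for CLIP ViT-B/32\n", "```\n", "\n", "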
\n", "Note: You must install the `umap-learn>=0.5` package in order to use UMAP-based visualization. This is recommended, as UMAP is awesome! If you do not wish to install UMAP, try `method='tsne'` instead.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install fiftyone huggingface_hub gdown umap-learn torch torchvision" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo  # base library and app\n", "import fiftyone.utils.huggingface as fouh  # Hugging Face integration\n", "\n", "dataset_name = \"MVTec_AD\"\n", "\n", "# Check if the dataset exists\n", "if dataset_name in fo.list_datasets():\n", "    print(f\"Dataset '{dataset_name}' exists. Loading...\")\n", "    dataset = fo.load_dataset(dataset_name)\n", "else:\n", "    print(f\"Dataset '{dataset_name}' does not exist. Creating a new one...\")\n", "    # Clone the dataset with a new name and make it persistent\n", "    dataset_ = fouh.load_from_hub(\"Voxel51/mvtec-ad\", persistent=True, overwrite=True)\n", "    dataset = dataset_.clone(\"MVTec_AD\")\n", "    dataset.persistent = True" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a working copy of the dataset to hold the embeddings\n", "if \"MVTec_AD_emb\" in fo.list_datasets():\n", "    dataset_emb = fo.load_dataset(\"MVTec_AD_emb\")\n", "else:\n", "    dataset_emb = dataset.clone(\"MVTec_AD_emb\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dataset_emb)" ] }, { "cell_type": "markdown", "id": "5596cbed", "metadata": {}, "source": [ "\n", "## Exploring and Visualizing Embeddings\n", "\n", "Once embeddings are generated, we can visualize them using **dimensionality reduction techniques** like:\n", "- **t-SNE (t-Distributed Stochastic Neighbor Embedding)**\n", "- **UMAP (Uniform Manifold Approximation and Projection)**\n", "\n", "These methods reduce the high-dimensional feature space into 2D/3D representations for interactive visualization.\n", "\n", "**Relevant Documentation:** [Visualizing Embeddings in 
FiftyOne](https://docs.voxel51.com/brain.html#visualizing-embeddings)\n", "\n", "
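\n", "Under the hood, both methods map each high-dimensional vector to a low-dimensional point. As a standalone sketch of that reduction step with `umap-learn`, using random placeholder vectors in place of real image embeddings:\n", "\n", "```python\n", "import numpy as np\n", "import umap\n", "\n", "# Placeholder for an (N, D) array of image embeddings\n", "embeddings = np.random.rand(100, 512)\n", "\n", "# Project to 2D for interactive visualization\n", "points_2d = umap.UMAP(n_components=2, random_state=51).fit_transform(embeddings)\n", "print(points_2d.shape)  # (100, 2)\n", "```\n", "\n", "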
\n", "Note: Be patient; computing the embeddings takes about 5-10 minutes.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute embeddings for MVTec AD using CLIP\n", "\n", "import fiftyone.brain as fob\n", "import fiftyone.zoo.models as fozm\n", "\n", "# Load a pre-trained model (e.g., CLIP)\n", "model = fozm.load_zoo_model(\"clip-vit-base32-torch\")\n", "\n", "fob.compute_visualization(\n", "    dataset_emb,\n", "    model=model,\n", "    embeddings=\"mvtec_emb\",\n", "    brain_key=\"mvtec_embeddings\",\n", "    method=\"umap\",  # Change to \"tsne\" for t-SNE\n", "    num_dims=2,  # Reduce to 2D\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset_emb.reload()\n", "print(dataset_emb)\n", "print(dataset_emb.last())" ] }, { "cell_type": "markdown", "id": "d486ee56", "metadata": {}, "source": [ "\n", "## Performing Similarity Search with Embeddings\n", "\n", "With embeddings, we can search for visually similar images by computing the nearest neighbors in the embedding space.\n", "FiftyOne provides built-in tools to perform **similarity search** efficiently.\n", "Note that sorting by similarity in the App requires a similarity index, which you can build with `fob.compute_similarity()`.\n", "\n", "**Relevant Documentation:** [Performing Similarity Search](https://voxel51.com/docs/fiftyone/user_guide/brain.html#similarity-search)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Launch the App on the dataset that holds the embeddings and brain runs\n", "session = fo.launch_app(dataset_emb, port=5152, auto=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![similarity](https://cdn.voxel51.com/getting_started_manufacturing/notebook2/similarity.webp)\n" ] }, { "cell_type": "markdown", "id": "186f702f", "metadata": {}, "source": [ "\n", "### Next Steps:\n", "Try using different models for embedding extraction, explore clustering techniques, and test similarity search with your own datasets! 
🚀\n" ] } ], "metadata": { "kernelspec": { "display_name": "manu_env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.17" } }, "nbformat": 4, "nbformat_minor": 2 }