{ "cells": [ { "cell_type": "markdown", "id": "a029f92a", "metadata": {}, "source": [ "# Custom Embeddings for Anomaly Detection\n", "\n", "In this notebook, we will explore how to generate **custom embeddings** for **anomaly detection** using the **Padim model** from Anomalib. \n", "Unlike general-purpose embeddings from models like CLIP or ResNet, anomaly detection requires **task-specific embeddings** that can distinguish between normal and abnormal samples.\n", "\n", "![anomaly_mvtec](https://cdn.voxel51.com/getting_started_manufacturing/notebook4/anomaly_mvtec.webp)\n", "\n", "\n", "## Learning Objectives:\n", "- Understand the difference between standard embeddings and anomaly-specific embeddings.\n", "- Explore how to compute embeddings using **Padim from Anomalib**.\n", "- Integrate these embeddings into a FiftyOne dataset.\n", "- Leverage FiftyOne for visualization and analysis.\n" ] }, { "cell_type": "markdown", "id": "196000c7", "metadata": {}, "source": [ "\n", "## Why Use Custom Embeddings for Anomaly Detection?\n", "\n", "Pre-trained models like **CLIP or ResNet** generate **general-purpose embeddings** that focus on visual similarity. However, detecting **abnormalities** requires learning **subtle deviations** from normal patterns, which these models cannot capture effectively.\n", "\n", "Instead, we use a dedicated anomaly detection model like **Padim from Anomalib**, which:\n", "- Learns representations specific to normal and anomalous samples.\n", "- Extracts feature maps from an encoder (e.g., ResNet).\n", "- Compares new samples against normal feature distributions.\n", "\n", "### Further Reading:\n", "- [Anomalib Documentation](https://github.com/openvinotoolkit/anomalib)\n", "- [Understanding Memory-Based Anomaly Detection](https://arxiv.org/pdf/2011.08785)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the MVTec Dataset as usual" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "import fiftyone.utils.huggingface as fouh # Hugging Face integration\n", "\n", "# Define the new dataset name\n", "dataset_name = \"MVTec_AD_cEmb\"\n", "\n", "# Check if the dataset exists\n", "if dataset_name in fo.list_datasets():\n", " print(f\"Dataset '{dataset_name}' exists. Loading...\")\n", " dataset = fo.load_dataset(\"MVTec_AD_cEmb\")\n", "else:\n", " print(f\"Dataset '{dataset_name}' does not exist. Creating a new one...\")\n", " # Clone the dataset with a new name and make it persistent\n", " dataset_ = fo.load_dataset(\"MVTec_AD\")\n", " dataset = dataset_.clone(dataset_name, persistent=True)" ] }, { "cell_type": "markdown", "id": "af37356c", "metadata": {}, "source": [ "\n", "## Extracting Custom Embeddings from Padim (Anomalib)\n", "\n", "Instead of using a general embedding model, we will:\n", "1. **Load a Padim anomaly detection model** using Anomalib.\n", "2. **Run inference on a dataset** to extract anomaly embeddings.\n", "3. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Load the MVTec AD Dataset as usual" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo\n", "import fiftyone.utils.huggingface as fouh  # Hugging Face integration\n", "\n", "# Define the new dataset name\n", "dataset_name = \"MVTec_AD_cEmb\"\n", "\n", "# Check if the dataset already exists\n", "if dataset_name in fo.list_datasets():\n", "    print(f\"Dataset '{dataset_name}' exists. Loading...\")\n", "    dataset = fo.load_dataset(dataset_name)\n", "else:\n", "    print(f\"Dataset '{dataset_name}' does not exist. Creating a new one...\")\n", "    # Clone the base dataset with a new name and make it persistent\n", "    dataset_ = fo.load_dataset(\"MVTec_AD\")\n", "    dataset = dataset_.clone(dataset_name, persistent=True)" ] },
{ "cell_type": "markdown", "id": "af37356c", "metadata": {}, "source": [ "\n", "## Extracting Custom Embeddings from PaDiM (Anomalib)\n", "\n", "Instead of using a general embedding model, we will:\n", "1. **Load a PaDiM anomaly detection model** using Anomalib.\n", "2. **Run inference on a dataset** to extract anomaly embeddings.\n", "3. **Store the embeddings in FiftyOne** for further visualization.\n", "\n", "**Relevant Documentation:**\n", "- [Anomalib Models](https://anomalib.readthedocs.io/en/latest/markdown/guides/reference/models/image/index.html)\n", "- [Remotely-sourced Zoo Models](https://docs.voxel51.com/model_zoo/remote.html)\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "from anomalib.models.image.padim.torch_model import PadimModel\n", "\n", "# Create a PaDiM model\n", "model = PadimModel(\n", "    backbone=\"resnet18\",  # or \"wide_resnet50_2\", etc.\n", "    layers=[\"layer1\", \"layer2\"],  # backbone layers to extract features from\n", "    pre_trained=True,\n", "    n_features=100  # optionally reduce the embedding dimension\n", ")\n", "model.eval()  # set to eval mode" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(model)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from anomalib.models.image.padim.lightning_model import Padim\n", "import torch\n", "from PIL import Image\n", "import torchvision.transforms as T\n", "\n", "# 1) Create the Lightning-based PaDiM\n", "padim = Padim(\n", "    backbone=\"resnet18\",\n", "    layers=[\"layer1\", \"layer2\"],\n", "    pre_trained=True\n", ")\n", "padim.train()  # in train mode, forward(...) returns patch embeddings\n", "\n", "# 2) Load and preprocess an image\n", "transform = T.Compose([T.Resize(224), T.ToTensor()])\n", "\n", "# Replace this with the path to your image\n", "image_path = \"path/to/your/image.png\"\n", "pil_image = Image.open(image_path).convert(\"RGB\")\n", "tensor = transform(pil_image).unsqueeze(0)  # (1, C, H, W)\n", "\n", "# 3) Pass it through the model in train mode\n", "with torch.no_grad():\n", "    embeddings = padim.model(tensor)  # shape (1, embed_dim, H', W')\n", "print(embeddings.shape)" ] },
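{ "cell_type": "markdown", "metadata": {}, "source": [ "If you want to embed many images, you can also batch them into a single forward pass instead of looping one image at a time. The sketch below assumes a fixed-size resize so that images with different aspect ratios can be stacked, and `image_paths` is a hypothetical list to replace with your own files.\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: embed several images in one forward pass.\n", "# A fixed-size resize is used so tensors of different aspect ratios\n", "# can be stacked into a single batch.\n", "batch_transform = T.Compose([T.Resize((224, 224)), T.ToTensor()])\n", "\n", "# Hypothetical list of image paths; replace with your own files\n", "image_paths = [image_path]\n", "\n", "batch = torch.stack([batch_transform(Image.open(p).convert(\"RGB\")) for p in image_paths])\n", "\n", "with torch.no_grad():\n", "    batch_embeddings = padim.model(batch)  # (N, embed_dim, H', W')\n", "print(batch_embeddings.shape)" ] },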
{ "cell_type": "markdown", "id": "081dd2f4", "metadata": {}, "source": [ "\n", "## Integrating Anomaly Embeddings into FiftyOne\n", "\n", "Once we obtain embeddings from PaDiM, we will add them to our FiftyOne dataset.\n", "This allows us to:\n", "- Perform **similarity searches** over the embedding space.\n", "- Compare the distributions of normal vs. abnormal samples.\n", "- Leverage the **FiftyOne App** to inspect anomalies.\n", "\n", "```python\n", "import fiftyone as fo\n", "\n", "dataset = fo.load_dataset(\"object_from_mvtec_ad\")\n", "\n", "# Add embeddings to each sample\n", "for sample in dataset:\n", "    ...\n", "    # Convert to CPU NumPy for storage\n", "    embedding_1d = patch_embedding.squeeze(0).cpu().numpy()  # shape (D,)\n", "\n", "    # Store as a list in a new field\n", "    sample[\"embedding\"] = embedding_1d.tolist()\n", "    sample.save()\n", "    ...\n", "```\n", "**Relevant Documentation:** [Adding Custom Fields to FiftyOne Datasets](https://docs.voxel51.com/user_guide/using_datasets.html)\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Selecting an Object Category from the MVTec AD Dataset" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from fiftyone import ViewField as F  # helper for defining views\n", "\n", "# Filter the dataset to the \"bottle\" category\n", "bottle_view = dataset.match(F(\"category.label\") == \"bottle\")\n", "\n", "# Clone the view into a new persistent dataset called \"mvtec-bottle\"\n", "mvtec_bottle = bottle_view.clone(\"mvtec-bottle\", persistent=True)\n", "\n", "print(mvtec_bottle)\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dataset)\n", "print(mvtec_bottle)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating Embeddings via Inference with the PaDiM Model" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from PIL import Image\n", "\n", "for sample in mvtec_bottle:\n", "    # Load the image via PIL\n", "    pil_image = Image.open(sample.filepath).convert(\"RGB\")\n", "\n", "    # Apply the same transform as above\n", "    input_tensor = transform(pil_image).unsqueeze(0)  # shape (1, C, H, W)\n", "\n", "    # Compute patch embeddings in train mode\n", "    with torch.no_grad():\n", "        patch_embedding = padim.model(input_tensor)  # shape (1, D, H', W')\n", "\n", "    # Optional: flatten or pool across spatial dims\n", "    # Here we use mean pooling to get a (1, D) vector\n", "    patch_embedding = patch_embedding.mean(dim=[2, 3])  # shape (1, D)\n", "\n", "    # Convert to CPU NumPy for storage\n", "    embedding_1d = patch_embedding.squeeze(0).cpu().numpy()  # shape (D,)\n", "\n", "    # Store as a list in a new field\n", "    sample[\"embedding\"] = embedding_1d.tolist()\n", "    sample.save()\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing Embeddings in FiftyOne" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from fiftyone.brain import compute_visualization\n", "\n", "# This will perform PCA on the \"embedding\" field we populated above\n", "compute_visualization(\n", "    mvtec_bottle,\n", "    embeddings=\"embedding\",\n", "    brain_key=\"embedding_pca\",\n", "    method=\"pca\",\n", ")" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mvtec_bottle.reload()\n", "print(mvtec_bottle)\n", "print(mvtec_bottle.last())" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "session = fo.launch_app(mvtec_bottle, port=5154, auto=False)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "![embedding_anomaly](https://cdn.voxel51.com/getting_started_manufacturing/notebook4/embedding_annomaly.webp)\n" ] },
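{ "cell_type": "markdown", "metadata": {}, "source": [ "With the embeddings stored on each sample, we can also index them for **similarity search**, as promised above. The cell below is a minimal sketch using the FiftyOne Brain similarity API; the `brain_key` name and the choice of query sample are just examples.\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone.brain as fob\n", "\n", "# Index the stored \"embedding\" field for similarity search\n", "fob.compute_similarity(\n", "    mvtec_bottle,\n", "    embeddings=\"embedding\",\n", "    brain_key=\"embedding_sim\",\n", ")\n", "\n", "# Sort the dataset by similarity to an example query sample\n", "query_id = mvtec_bottle.first().id\n", "similar_view = mvtec_bottle.sort_by_similarity(query_id, k=10, brain_key=\"embedding_sim\")\n", "print(similar_view)" ] },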
{ "cell_type": "markdown", "id": "428fe1d7", "metadata": {}, "source": [ "### Next Steps:\n", "Try using different anomaly detection models from Anomalib and compare their embeddings with FiftyOne's visualization tools! 🚀\n" ] }
], "metadata": { "kernelspec": { "display_name": "py311_anomalib200b3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 2 }