{ "cells": [ { "cell_type": "markdown", "id": "dc5feaed", "metadata": {}, "source": [ "# Loading and Exploring Datasets\n", "\n", "This notebook is a modification of the original Getting Started Notebook for the **Visual Anomaly and Novelty Detection (VAND) 2025 Challenge** at CVPR!\n", "\n", "This challenge, sponsored by **Voxel51**, **Anomalib**, and **MVTec**, focuses on advancing **Visual Anomaly Detection** in real-world industrial scenarios. Here you’ll work with the newly released **MVTec AD 2** dataset, featuring challenging real-world conditions, distribution shifts, and unexpected defect types.\n", "\n", "![csm_mad2_objects_overview](https://cdn.voxel51.com/getting_started_manufacturing/notebook8/csm_mad2_objects_overview.webp)\n", "\n", "**Track 1 Dataset (MVTec AD 2)**: [Explore here](https://www.mvtec.com/company/research/datasets/mvtec-ad-2) \n", "**Challenge Registration & Info**: [Join the challenge](https://voxel51.com/computer-vision-events/vand-3-0-challenge-at-cvpr-2025/) \n", "**Join the Community**: [FiftyOne Discord](https://discord.com/invite/fiftyone-community) → `#cvpr-challenge-vand3-0`\n", "\n", "**Related Documentation**\n", "- Anomalib: [GitHub Repo](https://github.com/open-edge-platform/anomalib)\n", "- MVTec AD2: [Download Link](https://www.mvtec.com/company/research/datasets/mvtec-ad-2)\n", "- FiftyOne: [https://docs.voxel51.com/](https://docs.voxel51.com/)\n", "\n", "Let’s dive into anomaly detection with top-tier tools and datasets!" ] }, { "cell_type": "markdown", "id": "d8a871e5", "metadata": {}, "source": [ "### Load MVTec AD 2 Using Anomalib\n", "This section performs an essential step in preparing or visualizing the dataset.\n", "\n", "Note: You will need to use the Anomalib library for this notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Using Anomalib you can also import MVTec AD 2 \n", "from anomalib.data import MVTecAD2\n", "\n", "OBJECTS_LIST = [\n", " 'can',\n", " 'fruit_jelly',\n", " 'sheet_metal',\n", " 'wallplugs',\n", " 'fabric',\n", " 'woriceod',\n", " 'vial',\n", " 'walnuts',\n", "]\n", "OBJECT = \"sheet_metal\" ## object to select\n", "\n", "# Create datamodule with public test set\n", "datamodule = MVTecAD2(\n", " root=\"path/to/mvtec_ad_2\", # <-- replace with your dataset path\n", " category=OBJECT, # <-- set the object/category as needed\n", " train_batch_size=32,\n", " eval_batch_size=32,\n", ")\n", "\n", "# Access different test sets\n", "datamodule.setup()\n", "public_loader = datamodule.test_dataloader() # returns loader based on test_type\n", "#private_loader = datamodule.test_dataloader(test_type=\"private\")\n", "#mixed_loader = datamodule.test_dataloader(test_type=\"private_mixed\")" ] }, { "cell_type": "markdown", "id": "1a50cd6e", "metadata": {}, "source": [ "### Dataset Loading and Preprocessing Functions\n", "\n", "This block defines several utility functions and logic to structure and load the **MVTec AD 2** dataset into a FiftyOne dataset, enriching it with custom fields like defect category, shift type, and segmentation masks.\n", "\n", "**Key Components:**\n", "\n", "- `explore_dataset_structure`: Recursively walks through the dataset directory and prints its structure as a DataFrame.\n", "- `extract_shifting_info`: Determines the type of distribution shift (e.g., overexposed, underexposed) from the image filename.\n", "- `load_segmentation_mask`: Loads a mask image as a NumPy array in grayscale.\n", "- `find_segmentation_mask`: Finds the path to the ground 
truth segmentation mask corresponding to a bad image.\n", "- `get_fiftyone_dataset`: Creates a new FiftyOne dataset named `\"mvtecad2\"` and populates it with:\n", " - Image samples and their metadata\n", " - Classification labels for category and defect\n", " - Folder type and shift type information\n", " - Segmentation masks for defective samples if available\n", " \n", "The resulting dataset can be explored using FiftyOne’s App for deeper analysis and visualization of anomalies and their spatial locations.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import cv2\n", "import numpy as np\n", "import pandas as pd\n", "import fiftyone as fo\n", "import fiftyone.core.dataset as fod\n", "import fiftyone.core.metadata as fom\n", "from fiftyone.core.labels import Segmentation, Classification\n", "from glob import glob\n", "\n", "def explore_dataset_structure(base_path):\n", " \"\"\"Returns dataset folder structure as a dictionary and prints it.\"\"\"\n", " dataset_metadata = []\n", " for root, dirs, files in os.walk(base_path):\n", " level = root.replace(base_path, '').count(os.sep)\n", " folder_name = os.path.basename(root)\n", " dataset_metadata.append({'Level': level, 'Folder': folder_name, 'Path': root, 'Num_Files': len(files)})\n", " df = pd.DataFrame(dataset_metadata)\n", " print(df)\n", " return df\n", "\n", "def extract_shifting_info(image_name):\n", " \"\"\"Extracts shifting type from image filename.\"\"\"\n", " if 'regular' in image_name:\n", " return 'regular'\n", " elif 'overexposed' in image_name:\n", " return 'overexposed'\n", " elif 'underexposed' in image_name:\n", " return 'underexposed'\n", " elif 'shift' in image_name:\n", " return f'shift_{image_name.split(\"_\")[-1].split(\".\")[0]}'\n", " return 'unknown'\n", "\n", "def load_segmentation_mask(mask_path):\n", " \"\"\"Loads the segmentation mask as a NumPy array.\"\"\"\n", " if os.path.exists(mask_path):\n", " mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)\n", " if mask is not None:\n", " return mask\n", " return None\n", "\n", "def find_segmentation_mask(image_path):\n", " \"\"\"Finds the corresponding segmentation mask for a given bad image.\"\"\"\n", " mask_path = image_path.replace(\"/bad/\", \"/ground_truth/bad/\")\n", " if os.path.exists(mask_path):\n", " return mask_path\n", " return None\n", "\n", "def get_fiftyone_dataset(base_path):\n", " \"\"\"Creates and loads dataset into FiftyOne.\"\"\"\n", " dataset_name = \"mvtecad2\"\n", " if dataset_name in fo.list_datasets():\n", " fo.delete_dataset(dataset_name)\n", " \n", " dataset = fod.Dataset(name=dataset_name, persistent=True)\n", " dataset.add_sample_field(\"category_label\", fo.EmbeddedDocumentField, embedded_doc_type=Classification)\n", " dataset.add_sample_field(\"folder_type\", fo.StringField)\n", " dataset.add_sample_field(\"defect_label\", fo.EmbeddedDocumentField, embedded_doc_type=Classification)\n", " dataset.add_sample_field(\"shift_type\", fo.StringField)\n", " dataset.add_sample_field(\"segmentation\", fo.EmbeddedDocumentField, embedded_doc_type=Segmentation)\n", " \n", " for category_label in os.listdir(base_path):\n", " category_path = os.path.join(base_path, category_label)\n", " print(\"Category\", category_label, category_path)\n", " if os.path.isdir(category_path):\n", " for folder_type in os.listdir(category_path):\n", " folder_path = os.path.join(category_path, folder_type)\n", " print(\"Folder_Type\", folder_type, folder_path)\n", " if os.path.isdir(folder_path):\n", " for 
defect_label in os.listdir(folder_path):\n", " defect_path = os.path.join(folder_path, defect_label)\n", " print(\"Defect\", defect_label, defect_path)\n", " if os.path.isdir(defect_path):\n", " for img in glob(os.path.join(defect_path, '*.png')):\n", " img_name = os.path.basename(img)\n", " shift_type = extract_shifting_info(img_name)\n", " print(\"Image_Sample\", img_name, shift_type)\n", " metadata = fom.ImageMetadata()\n", " sample = fo.Sample(filepath=os.path.abspath(img), \n", " tags=[defect_label], \n", " metadata=metadata,\n", " category_label=Classification(label=str(category_label)),\n", " folder_type=folder_type,\n", " defect_label=Classification(label=str(defect_label)),\n", " shift_type=shift_type)\n", " \n", " # Add segmentation mask if applicable\n", " if defect_label == 'bad' and 'test_public' in folder_type:\n", " mask_path = find_segmentation_mask(img)\n", " if mask_path:\n", " mask_array = load_segmentation_mask(mask_path)\n", " if mask_array is not None:\n", " sample[\"segmentation\"] = Segmentation(mask=mask_array)\n", " \n", " dataset.add_sample(sample)\n", "\n", " # Ensure `ground_truth/bad` folder is processed correctly\n", " ground_truth_path = os.path.join(folder_path, \"ground_truth\", \"bad\")\n", " if os.path.exists(ground_truth_path):\n", " for mask_file in glob(os.path.join(ground_truth_path, '*.png')):\n", " mask_name = os.path.basename(mask_file).replace(\"_mask\", \"\") # Remove _mask suffix if present\n", " corresponding_img = os.path.abspath(os.path.join(folder_path, \"bad\", mask_name))\n", " # Debugging output\n", " print(f\"GT Processing: Mask {mask_name} → Image {corresponding_img}\")\n", "\n", " if os.path.exists(corresponding_img):\n", " mask_array = load_segmentation_mask(mask_file)\n", " if mask_array is not None:\n", " # Match the sample by filepath using a view expression\n", " matched_samples = dataset.match(fo.ViewField(\"filepath\") == corresponding_img)\n", " if len(matched_samples) > 0:\n", " sample = matched_samples.first()\n", " sample[\"segmentation\"] = Segmentation(mask=mask_array)\n", " sample.save()\n", " print(f\"✓ Segmentation mask added for {corresponding_img}\")\n", " else:\n", " print(f\"⚠ Warning: No matching sample found for {corresponding_img}\")\n", " else:\n", " print(f\"⚠ Warning: Corresponding image not found for mask {mask_file}\")\n", " return dataset" ] }, { "cell_type": "markdown", "id": "c88bc2f0", "metadata": {}, "source": [ "### Load and Explore the MVTec AD 2 Dataset\n", "\n", "This section defines the path to the **MVTec AD 2** dataset, explores its structure, and loads it into a FiftyOne dataset using the previously defined helper functions.\n", "\n", "**What it does:**\n", "\n", "- Sets the base path to the dataset (update the path if needed).\n", "- Calls `explore_dataset_structure()` to print a high-level view of the folder hierarchy and file counts.\n", "- Uses `get_fiftyone_dataset()` to construct and populate a FiftyOne dataset with images, metadata, labels, and segmentation masks.\n", "- Reloads and prints summary information about the loaded dataset to confirm successful ingestion.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define base dataset path\n", "dataset_base_path = \"/path/to/mvtec_ad_2\" # Change this to your actual dataset path\n", "\n", "#dataset_base_path = \"./dataset\" # Change this to your actual dataset path\n", "\n", "# Explore dataset structure\n", "dataset_structure_df = explore_dataset_structure(dataset_base_path)\n", "\n", "# Load dataset into FiftyOne\n", "dataset = get_fiftyone_dataset(dataset_base_path)\n", "dataset.reload()\n", "print(dataset)\n", "print(\"Dataset loaded into FiftyOne\")" ] },
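{ "cell_type": "markdown", "id": "e7c91d02", "metadata": {}, "source": [ "As a quick sanity check (not part of the original workflow), you can summarize the fields populated by `get_fiftyone_dataset()` with FiftyOne aggregations. The field names below are the ones defined above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Quick distribution check using FiftyOne aggregations\n", "print(\"Samples per category:\", dataset.count_values(\"category_label.label\"))\n", "print(\"Samples per defect label:\", dataset.count_values(\"defect_label.label\"))\n", "print(\"Samples per shift type:\", dataset.count_values(\"shift_type\"))" ] },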
{ "cell_type": "markdown", "id": "195da034", "metadata": {}, "source": [ "### Launching FiftyOne App\n", "Launch the FiftyOne App to explore the dataset in an interactive interface." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 📌 Launch the FiftyOne App session to interactively explore the dataset\n", "session = fo.launch_app(dataset, port=5149, auto=False)" ] }, { "cell_type": "markdown", "id": "554b53a9", "metadata": {}, "source": [ "### Explore image quality in MVTec AD 2\n", "Install the `image-quality-issues` plugin in your FiftyOne environment." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!fiftyone plugins download https://github.com/jacobmarks/image-quality-issues/" ] }, { "cell_type": "markdown", "id": "01ae279a", "metadata": {}, "source": [ "### Check dataset schema" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the schema\n", "schema = dataset.get_field_schema()\n", "\n", "# Print the schema\n", "for field, field_type in schema.items():\n", " print(f\"{field}: {field_type}\")" ] },
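{ "cell_type": "markdown", "id": "9d4f7a21", "metadata": {}, "source": [ "The custom fields also make it easy to slice the dataset with FiftyOne view expressions. The snippet below is a small illustrative example (the view name is arbitrary): it selects the defective (`bad`) samples and shows them in the App session launched above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from fiftyone import ViewField as F\n", "\n", "# Example view: all samples labeled as defective\n", "bad_view = dataset.match(F(\"defect_label.label\") == \"bad\")\n", "print(len(bad_view))\n", "\n", "# Show the filtered view in the App\n", "session.view = bad_view" ] },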
{ "cell_type": "markdown", "id": "d52b1849", "metadata": {}, "source": [ "### Creating a Grouped Dataset with Segmentation Masks\n", "\n", "This section defines a function to create a **grouped dataset** in FiftyOne, where each sample has two views:\n", "- The **original image**\n", "- Its corresponding **segmentation mask**\n", "\n", "**What’s included:**\n", "\n", "- Imports essential libraries: `os`, `numpy`, `OpenCV (cv2)`, and `FiftyOne`.\n", "- `create_grouped_dataset()`:\n", " - Deletes an existing grouped dataset (if it exists).\n", " - Loads the original dataset and iterates through samples.\n", " - Copies metadata fields such as category, defect type, and shift type.\n", " - Saves segmentation masks (real or blank) to a `masks/` subfolder next to each image.\n", " - Uses FiftyOne’s native grouping feature to associate each image with its mask under a shared `Group`.\n", "\n", "This setup is especially useful for workflows involving **paired data**, like image and segmentation mask combinations, enabling visual comparison and structured exploration within the FiftyOne App. It is ideal for exploring the segmentation ground truth in MVTec AD 2.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import fiftyone as fo\n", "import cv2\n", "\n", "def create_grouped_dataset(original_dataset_name, grouped_dataset_name):\n", " \"\"\"\n", " Creates a grouped dataset using FiftyOne's native grouping feature.\n", " Each group contains:\n", " - The original image\n", " - Its corresponding segmentation mask (real or blank)\n", " Metadata fields are mirrored if present. Masks are saved in a 'masks/' subfolder.\n", " \"\"\"\n", " # Remove existing dataset if it exists\n", " if grouped_dataset_name in fo.list_datasets():\n", " fo.delete_dataset(grouped_dataset_name)\n", "\n", " grouped_dataset = fo.Dataset(name=grouped_dataset_name, persistent=True, overwrite=True)\n", " grouped_dataset.add_group_field(\"group\", default=\"image\")\n", "\n", " original_dataset = fo.load_dataset(original_dataset_name)\n", " grouped_samples = []\n", "\n", " for sample in original_dataset.iter_samples(progress=True):\n", " group = fo.Group()\n", "\n", " # Read the metadata fields to mirror onto both group slices, if they exist\n", " category_label = getattr(sample, \"category_label\", None)\n", " folder_type = getattr(sample, \"folder_type\", None)\n", " defect_label = getattr(sample, \"defect_label\", None)\n", " shift_type = getattr(sample, \"shift_type\", None)\n", "\n", " # Create image sample\n", " image_sample = fo.Sample(\n", " filepath=sample.filepath,\n", " group=group.element(\"image\"),\n", " category_label=category_label,\n", " folder_type=folder_type,\n", " defect_label=defect_label,\n", " shift_type=shift_type,\n", " )\n", " if sample.metadata is not None:\n", " image_sample.metadata = sample.metadata\n", "\n", " # Get the segmentation mask if present\n", " segmentation_mask = None\n", " if hasattr(sample, \"segmentation\") and sample.segmentation is not None:\n", " segmentation_mask = getattr(sample.segmentation, \"mask\", None)\n", "\n", " # Prepare mask path\n", " image_dir = os.path.dirname(sample.filepath)\n", " mask_dir = os.path.join(image_dir, \"masks\")\n", " os.makedirs(mask_dir, exist_ok=True)\n", " base_filename = os.path.splitext(os.path.basename(sample.filepath))[0]\n", " mask_filename = f\"{base_filename}_mask.png\"\n", " segmentation_filepath = os.path.join(mask_dir, mask_filename)\n", "\n", " if isinstance(segmentation_mask, np.ndarray):\n", " # Save the provided segmentation mask\n", " cv2.imwrite(segmentation_filepath, segmentation_mask)\n", " print(f\"Saved real mask: {segmentation_filepath}\")\n", " mask_to_use = segmentation_mask\n", " else:\n", " # Create a blank mask for images without a mask\n", " img = cv2.imread(sample.filepath, cv2.IMREAD_UNCHANGED)\n", " if img is None:\n", " raise RuntimeError(f\"Could not read image: {sample.filepath}\")\n", " height, width = img.shape[:2]\n", " mask_to_use = np.zeros((height, width), dtype=np.uint8)\n", " cv2.imwrite(segmentation_filepath, mask_to_use)\n", " print(f\"Saved zero mask: {segmentation_filepath}\")\n", "\n", " segmentation_sample = fo.Sample(\n", " filepath=segmentation_filepath,\n", " group=group.element(\"segmentation\"),\n", " segmentation=fo.Segmentation(mask=mask_to_use),\n", " category_label=category_label,\n", " folder_type=folder_type,\n", " defect_label=defect_label,\n", " shift_type=shift_type,\n", " )\n", " if sample.metadata is not None:\n", " segmentation_sample.metadata = sample.metadata\n", "\n", " grouped_samples.extend([image_sample, segmentation_sample])\n", "\n", " grouped_dataset.add_samples(grouped_samples)\n", " return grouped_dataset" ] }, { "cell_type": "markdown", "id": "65ef20cb", "metadata": {}, "source": [ "### Generate and Load the Grouped Dataset\n", "\n", "In this step, we create and load a grouped version of the MVTec AD 2 dataset using the `create_grouped_dataset()` function defined earlier.\n", "\n", "**What it does:**\n", "\n", "- Calls the function with the original dataset name (`\"mvtecad2\"`) and the desired grouped dataset name (`\"mvtecad2_grouped\"`).\n", "- Reloads the dataset to ensure it's updated in memory.\n", "- Prints the dataset summary to confirm successful creation.\n", "\n", "This grouped dataset allows us to explore each image alongside its corresponding segmentation mask using FiftyOne’s powerful group-based visualization.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "grouped_dataset = create_grouped_dataset(\"mvtecad2\", \"mvtecad2_grouped\")\n", "grouped_dataset.reload()\n", "print(grouped_dataset)\n", "print(\"Grouped dataset created in FiftyOne\")" ] },
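{ "cell_type": "markdown", "id": "c2e8b4f6", "metadata": {}, "source": [ "Each group created above has two slices, `image` and `segmentation`. As an optional check (a sketch, not part of the original pipeline), you can list the slices and count the samples in each one." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Inspect the group slices of the grouped dataset\n", "print(grouped_dataset.group_slices)  # expected: ['image', 'segmentation']\n", "\n", "# Count the samples in each slice\n", "for slice_name in grouped_dataset.group_slices:\n", "    slice_view = grouped_dataset.select_group_slices(slice_name)\n", "    print(slice_name, len(slice_view))" ] },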
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Required by some plugin workflows that rely on delegated operations\n", "os.environ[\"FIFTYONE_ALLOW_LEGACY_ORCHESTRATORS\"] = \"true\"\n", "session = fo.launch_app(grouped_dataset, port=5160, auto=False)" ] }, { "cell_type": "markdown", "id": "7345a9fb", "metadata": {}, "source": [ "### Dynamic Grouping of Samples by Category\n", "\n", "This section demonstrates how to dynamically group a subset of the dataset based on ground truth labels using FiftyOne.\n", "\n", "**What it does:**\n", "\n", "- Randomly selects 100 samples from the dataset using a fixed seed for reproducibility.\n", "- Groups the selected samples by the `category_label.label` field (e.g., can, fabric, etc.).\n", "- Prints the media type of the resulting grouped view and the number of groups.\n", "- Updates the FiftyOne App session to display this grouped view, making it easier to explore anomalies by category.\n", "\n", "Grouping by labels enables intuitive visual inspection and comparative analysis across different object categories.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Take 100 samples and group by ground truth label\n", "view_group_by = dataset.take(100, seed=51).group_by(\"category_label.label\")\n", "\n", "print(view_group_by.media_type) # group\n", "print(len(view_group_by)) # 8\n", "\n", "session.view = view_group_by" ] }, { "cell_type": "markdown", "id": "576be2ef", "metadata": {}, "source": [ "### Compute Visual Embeddings with CLIP and UMAP\n", "\n", "In this step, we leverage FiftyOne Brain and a pre-trained CLIP model to compute a visual embedding space for the dataset.\n", "\n", "**What it does:**\n", "\n", "- Imports the `fiftyone.brain` module for computing embeddings and visual insights.\n", "- Loads a pre-trained CLIP model (`ViT-B/32`) from the FiftyOne Model Zoo.\n", "- Computes visual embeddings for the dataset using CLIP and projects them into 2D using **UMAP**.\n", "- Stores the results under the brain key `\"clip_vis\"` and embedding field `\"clip_embeddings\"`.\n", "\n", "This enables powerful visual exploration and clustering of semantically similar images in the FiftyOne App using the Brain panel.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fiftyone.brain as fob\n", "import fiftyone.zoo.models as fozm\n", "\n", "# Load a pre-trained model (e.g., CLIP)\n", "model = fozm.load_zoo_model(\"clip-vit-base32-torch\")\n", "\n", "# Compute CLIP embeddings and project them to 2D with UMAP, stored under the \"clip_vis\" brain key\n", "fob.compute_visualization(\n", " dataset, model=model, embeddings=\"clip_embeddings\", method=\"umap\", brain_key=\"clip_vis\"\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dataset)" ] },
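{ "cell_type": "markdown", "id": "f0a3d5b8", "metadata": {}, "source": [ "Optionally, you can confirm the Brain run was stored before exporting. This is a small sketch using standard FiftyOne Brain accessors; `points` holds the 2D UMAP coordinates computed above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: verify the embedding visualization was stored on the dataset\n", "print(dataset.list_brain_runs())  # should include \"clip_vis\"\n", "\n", "results = dataset.load_brain_results(\"clip_vis\")\n", "print(results.points.shape)  # (num_samples, 2) UMAP projection" ] }, { "cell_type": "markdown", "id": "0e65379d", "metadata": {}, "source": [ "### Exporting the Dataset in 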
FiftyOne Format\n", "\n", "This section exports the annotated dataset to disk in the native **FiftyOneDataset** format.\n", "\n", "**What it does:**\n", "\n", "- Defines a target export directory named `\"MVTec_AD2_FO\"`.\n", "- Uses `dataset.export()` to write all samples, labels, and metadata to that directory.\n", "\n", "This export can be reused later to reload the dataset easily or to share it with collaborators in a structured format supported by FiftyOne.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "export_dir = \"MVTec_AD2_FO\"\n", "dataset.export(\n", " export_dir=export_dir,\n", " dataset_type=fo.types.FiftyOneDataset,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from fiftyone.utils.huggingface import push_to_hub\n", "\n", "push_to_hub(dataset, \"MVTecAD2_FO\")" ] } ], "metadata": { "kernelspec": { "display_name": "manu_env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.17" } }, "nbformat": 4, "nbformat_minor": 2 }