{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Create Semantic Segmentation Table\n", "\n", "Create a 3LC Table for semantic segmentation tasks using paired images and grayscale mask files from the ADE20k dataset.\n", "\n", "![img](../images/ade-20-semseg.jpg)\n", "\n", "Semantic segmentation assigns a class to every pixel in an image. This pixel-level labeling is essential for tasks such as scene parsing, medical imaging, and autonomous driving, where precise spatial understanding is needed.\n", "\n", "This notebook demonstrates creating a table from paired image and mask files. We download a toy subset of the ADE20k dataset, fetch its label map from the Hugging Face Hub, and structure the data into a 3LC Table whose segmentation annotations are stored as grayscale PNG masks." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Project setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "PROJECT_NAME = \"3LC Tutorials - Semantic Segmentation ADE20k\"\n", "DATASET_NAME = \"ADE20k_toy_dataset\"\n", "TABLE_NAME = \"ADE20K-semantic-segmentation\"\n", "DOWNLOAD_PATH = \"../../transient_data\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Install dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install 3lc\n", "%pip install git+https://github.com/3lc-ai/3lc-examples.git\n", "%pip install huggingface-hub" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "from pathlib import Path\n", "\n", "import tlc\n", "from huggingface_hub import hf_hub_download\n", "\n", "from tlc_tools.common import download_and_extract_zipfile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download the dataset" ] }, { "cell_type": "code", "execution_count": 
null, "metadata": {}, "outputs": [], "source": [ "DATASET_ROOT = (Path(DOWNLOAD_PATH) / \"ADE20k_toy_dataset\").resolve()\n", "\n", "if not DATASET_ROOT.exists():\n", "    print(\"Downloading data...\")\n", "    download_and_extract_zipfile(\n", "        url=\"https://www.dropbox.com/s/l1e45oht447053f/ADE20k_toy_dataset.zip?dl=1\",\n", "        location=DOWNLOAD_PATH,\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fetch the label map from the Hugging Face Hub" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load the id2label mapping from a JSON file on the Hugging Face Hub\n", "with open(\n", "    hf_hub_download(\n", "        repo_id=\"huggingface/label-files\",\n", "        filename=\"ade20k-id2label.json\",\n", "        repo_type=\"dataset\",\n", "    )\n", ") as f:\n", "    id2label = json.load(f)\n", "\n", "categories = list(id2label.values())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Map mask pixel values to class names: categories are shifted by one so that 0 can represent the background\n", "value_map = {i + 1: tlc.MapElement(category) for i, category in enumerate(categories)}\n", "value_map[0] = tlc.MapElement(\"background\", display_color=\"#00000000\")  # Set transparent background" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the images and segmentation maps" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sort both lists so that images and masks pair up by filename\n", "image_paths = sorted(DATASET_ROOT.glob(\"**/images/training/*.jpg\"))\n", "segmentation_map_paths = sorted(DATASET_ROOT.glob(\"**/annotations/training/*.png\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Call .to_relative() to ensure aliases are applied\n", "image_paths = [tlc.Url(p).to_relative().to_str() for p in image_paths]\n", "mask_paths = [tlc.Url(p).to_relative().to_str() for p in segmentation_map_paths]\n", "print(image_paths[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Write the semantic segmentation masks to 
a table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "table = tlc.Table.from_dict(\n", "    data={\n", "        \"image\": image_paths,\n", "        \"mask\": mask_paths,\n", "    },\n", "    structure=(tlc.PILImage(\"image\"), tlc.SegmentationPILImage(\"mask\", classes=value_map)),\n", "    table_name=TABLE_NAME,\n", "    dataset_name=DATASET_NAME,\n", "    project_name=PROJECT_NAME,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image, mask = table[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" }, "test_marks": [ "dependent" ] }, "nbformat": 4, "nbformat_minor": 2 }