{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Per Bounding Box Luminosity Calculation\n", "\n", "This notebook demonstrates how to calculate the luminosity of images and their bounding boxes and add them as columns to a Table.\n", "\n", "![](../images/compute-per-bb-metrics.png)\n", "\n", "\n", "\n", "We will write a new Table combining the columns of the input table with the calculated luminosity properties.\n" ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## Project Setup" ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "PROJECT_NAME = \"3LC Tutorials - COCO128\"\n", "DATASET_NAME = \"COCO128\"\n", "DATA_PATH = \"../../data\"" ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "%pip install 3lc" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "from io import BytesIO\n", "\n", "import numpy as np\n", "import tlc\n", "import tqdm\n", "from PIL import Image" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "## Set Up Input Table\n", "\n", "We will use a `TableFromCoco` to load the input dataset from an annotations file and a folder of images." 
] }, { "cell_type": "code", "execution_count": null, "id": "7", "metadata": {}, "outputs": [], "source": [ "annotations_file = tlc.Url(DATA_PATH + \"/coco128/annotations.json\").to_absolute()\n", "images_dir = tlc.Url(DATA_PATH + \"/coco128/images\").to_absolute()\n", "\n", "input_table = tlc.Table.from_coco(\n", " project_name=PROJECT_NAME,\n", " dataset_name=DATASET_NAME,\n", " table_name=\"initial-bbs\",\n", " annotations_file=annotations_file,\n", " image_folder=images_dir,\n", " description=\"COCO 128 dataset\",\n", ")" ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Calculate the Luminosity of Images and Bounding Boxes\n", "\n", "In this section, we will calculate the luminosity property for each image as well as for each bounding box within the images.\n", "\n", "We build the variables `per_image_luminosity` and `per_bb_luminosity` to store the luminosity properties for each image and bounding box, respectively." ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "def calculate_luminosity(image: Image.Image) -> float:\n", " \"\"\"Return the mean pixel intensity of an 8-bit image, scaled to [0, 1].\"\"\"\n", " np_image = np.array(image)\n", " axes_to_reduce = tuple(range(np_image.ndim - 1))\n", " avg_luminosity = np.mean(np_image, axis=axes_to_reduce) / 255.0\n", " return float(np.mean(avg_luminosity))" ] }, { "cell_type": "code", "execution_count": null, "id": "10", "metadata": {}, "outputs": [], "source": [ "per_bb_luminosity: list[list[float]] = []\n", "per_image_luminosity: list[float] = []\n", "\n", "bb_schema = input_table.row_schema.values[\"bbs\"].values[\"bb_list\"]\n", "\n", "for row in tqdm.tqdm(input_table, total=len(input_table), desc=\"Calculating luminosity\"):\n", " image_filename = row[\"image\"]\n", " image_bbs = row[\"bbs\"][\"bb_list\"]\n", "\n", " image_bytes = tlc.Url(image_filename).read()\n", " image = Image.open(BytesIO(image_bytes))\n", "\n", " image_luminosity = calculate_luminosity(image)\n", " 
per_image_luminosity.append(image_luminosity)\n", "\n", " bb_luminosity_list: list[float] = []\n", " w, h = image.size  # PIL reports size as (width, height)\n", "\n", " for bb in image_bbs:\n", "     bb_crop = tlc.BBCropInterface.crop(image, bb, bb_schema)\n", "     bb_luminosity = calculate_luminosity(bb_crop)\n", "     bb_luminosity_list.append(bb_luminosity)\n", "\n", " per_bb_luminosity.append(bb_luminosity_list)" ] }, { "cell_type": "markdown", "id": "11", "metadata": {}, "source": [ "## Create new Table containing luminosity properties\n", "\n", "After calculating the luminosity, we will create a new table using a `TableWriter`.\n", "\n", "### Set up the Schema of the output Table" ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": {}, "outputs": [], "source": [ "# Each entry in the list is a list of luminosity values for each bounding box in the image\n", "per_bb_luminosity_schema = tlc.Schema(\n", " value=tlc.Float32Value(\n", " value_min=0,\n", " value_max=1,\n", " number_role=tlc.NUMBER_ROLE_FRACTION,\n", " ),\n", " size0=tlc.DimensionNumericValue(value_min=0, value_max=1000), # Max 1000 bounding boxes\n", " sample_type=\"hidden\", # Hide this column when iterating over the \"sample view\" of the table\n", " writable=False,\n", ")\n", "\n", "per_image_luminosity_schema = tlc.Schema(\n", " value=tlc.Float32Value(\n", " value_min=0,\n", " value_max=1,\n", " number_role=tlc.NUMBER_ROLE_FRACTION,\n", " ),\n", " sample_type=\"hidden\", # Hide this column when iterating over the \"sample view\" of the table\n", " writable=False,\n", ")\n", "\n", "schemas = {\n", " \"per_bb_luminosity\": per_bb_luminosity_schema,\n", " \"per_image_luminosity\": per_image_luminosity_schema,\n", "}\n", "schemas.update(input_table.row_schema.values) # Copy over the schema from the input table" ] }, { "cell_type": "markdown", "id": "13", "metadata": {}, "source": [ "### Write the output Table\n", "\n", "We will use a `TableWriter` to write the output table as a `TableFromParquet`." 
] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "from collections import defaultdict\n", "\n", "table_writer = tlc.TableWriter(\n", " project_name=PROJECT_NAME,\n", " dataset_name=DATASET_NAME,\n", " description=\"Table with added per-bb luminosity metrics\",\n", " table_name=\"added_luminosity_metrics\",\n", " column_schemas=schemas,\n", " if_exists=\"overwrite\",\n", " input_tables=[input_table.url],\n", ")\n", "\n", "# TableWriter accepts data as a dictionary of column names to lists\n", "data = defaultdict(list)\n", "\n", "# Copy over all rows from the input table\n", "for row in input_table.table_rows:\n", " for column_name, column_value in row.items():\n", "     data[column_name].append(column_value)\n", "\n", "# Add the luminosity metrics\n", "data[\"per_image_luminosity\"] = per_image_luminosity\n", "data[\"per_bb_luminosity\"] = per_bb_luminosity\n", "\n", "table_writer.add_batch(data)\n", "new_table = table_writer.finalize()" ] }, { "cell_type": "markdown", "id": "15", "metadata": {}, "source": [ "### Inspect the properties of the output Table" ] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": {}, "outputs": [], "source": [ "print(len(new_table))\n", "print(new_table.columns)\n", "print(new_table.url.to_relative(input_table.url))" ] }, { "cell_type": "markdown", "id": "17", "metadata": {}, "source": [ "Let's check which columns are present in the sample view / table view of the input and output tables:" ] }, { "cell_type": "code", "execution_count": null, "id": "18", "metadata": {}, "outputs": [], "source": [ "# Sample view of input table\n", "input_table[0].keys()" ] }, { "cell_type": "code", "execution_count": null, "id": "19", "metadata": {}, "outputs": [], "source": [ "# Table view of input table\n", "input_table.table_rows[0].keys()" ] }, { "cell_type": "code", "execution_count": null, "id": "20", "metadata": {}, "outputs": [], "source": [ "# Sample view of output 
table (does not contain the luminosity columns due to the sample_type=\"hidden\" flag)\n", "new_table[0].keys()" ] }, { "cell_type": "code", "execution_count": null, "id": "21", "metadata": {}, "outputs": [], "source": [ "# Table view of output table (contains the luminosity columns)\n", "new_table.table_rows[0].keys()" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 5 }