{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Detect objects in images\n", "\n", "Automatically identify and locate objects in images using YOLOX object detection models." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem\n", "\n", "You have images that need object detection—identifying what objects are present and where they're located. Manual labeling is slow and expensive.\n", "\n", "| Use case | Images | Need |\n", "|----------|--------|------|\n", "| Inventory counting | 5K product photos | Count items per image |\n", "| Security monitoring | 10K frames | Detect people, vehicles |\n", "| Quality control | 20K inspection images | Find defects |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n", "\n", "**What's in this recipe:**\n", "\n", "- Detect objects using YOLOX models (runs locally, no API needed)\n", "- Get bounding boxes and class labels\n", "- Filter detections by confidence threshold\n", "\n", "You add a computed column that runs YOLOX on each image. Detection happens automatically when you insert new images." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:12.764565Z", "iopub.status.busy": "2025-12-12T02:39:12.764474Z", "iopub.status.idle": "2025-12-12T02:39:15.461235Z", "shell.execute_reply": "2025-12-12T02:39:15.460679Z" } }, "outputs": [], "source": [ "%pip install -qU pixeltable pixeltable-yolox" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:15.478535Z", "iopub.status.busy": "2025-12-12T02:39:15.478324Z", "iopub.status.idle": "2025-12-12T02:39:17.125356Z", "shell.execute_reply": "2025-12-12T02:39:17.125059Z" } }, "outputs": [], "source": [ "import pixeltable as pxt\n", "from pixeltable.functions.yolox import yolox" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load images" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:17.127880Z", "iopub.status.busy": "2025-12-12T02:39:17.127302Z", "iopub.status.idle": "2025-12-12T02:39:17.329841Z", "shell.execute_reply": "2025-12-12T02:39:17.329471Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata\n", "Created directory 'detection_demo'.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a fresh directory\n", "pxt.drop_dir('detection_demo', force=True)\n", "pxt.create_dir('detection_demo')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:17.331568Z", "iopub.status.busy": "2025-12-12T02:39:17.331456Z", "iopub.status.idle": "2025-12-12T02:39:17.392488Z", "shell.execute_reply": "2025-12-12T02:39:17.392064Z" } }, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "Created table 'images'.\n" ] } ], "source": [ "# Create table for images\n", "images = pxt.create_table('detection_demo/images', {'image': pxt.Image})" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:17.394564Z", "iopub.status.busy": "2025-12-12T02:39:17.394338Z", "iopub.status.idle": "2025-12-12T02:39:18.346975Z", "shell.execute_reply": "2025-12-12T02:39:18.346544Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "Inserting rows into `images`: 0 rows [00:00, ? rows/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "Inserting rows into `images`: 3 rows [00:00, 523.85 rows/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Inserted 3 rows with 0 errors.\n" ] }, { "data": { "text/plain": [ "3 rows inserted, 6 values computed." ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Insert sample images (COCO dataset samples with common objects)\n", "image_urls = [\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000036.jpg',\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000090.jpg',\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000106.jpg',\n", "]\n", "\n", "images.insert([{'image': url} for url in image_urls])" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:18.349297Z", "iopub.status.busy": "2025-12-12T02:39:18.348872Z", "iopub.status.idle": "2025-12-12T02:39:18.466369Z", "shell.execute_reply": "2025-12-12T02:39:18.465912Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
image
\n", " \n", "
\n", " \n", "
\n", " \n", "
" ], "text/plain": [ " image\n", "0 \n", " \n", " \n", " image\n", " detections\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " {"bboxes": [[2.323, 55.638, 462.701, 485.695], [173.273, 161.867, 478.283, 635.745]], "labels": [25, 0], "scores": [0.959, 0.947]}\n", " \n", " \n", "
\n", " \n", "
\n", " {"bboxes": [[234.995, 353.385, 307.581, 414.797]], "labels": [19], "scores": [0.937]}\n", " \n", " \n", "
\n", " \n", "
\n", " {"bboxes": [[45.552, 93.668, 489.63, 363.2], [17.867, 54.537, 301.197, 117.939]], "labels": [14, 29], "scores": [0.855, 0.782]}\n", " \n", " \n", "" ], "text/plain": [ " image \\\n", "0 " ] } } ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Count detected objects per image\n", "@pxt.udf\n", "def count_objects(detections: dict) -> int:\n", "    \"\"\"Count the number of detected objects.\"\"\"\n", "    return len(detections.get('labels', []))\n", "\n", "\n", "images.add_computed_column(object_count=count_objects(images.detections))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:20.553029Z", "iopub.status.busy": "2025-12-12T02:39:20.552912Z", "iopub.status.idle": "2025-12-12T02:39:20.584353Z", "shell.execute_reply": "2025-12-12T02:39:20.583786Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added 3 column values with 0 errors.\n" ] }, { "data": { "text/plain": [ "3 rows updated, 3 values computed." ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Extract unique object classes\n", "@pxt.udf\n", "def get_classes(detections: dict) -> list:\n", "    \"\"\"Get list of detected object classes.\"\"\"\n", "    return list(set(detections.get('labels', [])))\n", "\n", "\n", "images.add_computed_column(object_classes=get_classes(images.detections))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2025-12-12T02:39:20.586644Z", "iopub.status.busy": "2025-12-12T02:39:20.586480Z", "iopub.status.idle": "2025-12-12T02:39:20.672155Z", "shell.execute_reply": "2025-12-12T02:39:20.671596Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
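<table><thead><tr><th>image</th><th>object_count</th><th>object_classes</th></tr></thead><tbody><tr><td></td><td>2</td><td>[0, 25]</td></tr><tr><td></td><td>1</td><td>[19]</td></tr><tr><td></td><td>2</td><td>[29, 14]</td></tr></tbody></table>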
" ], "text/plain": [ " image object_count \\\n", "0