{ "cells": [ { "cell_type": "markdown", "id": "c7f0f404-3929-4c42-a70a-34dc32d92640", "metadata": {}, "source": [ "# 10-Minute Tour\n", "\n", "Welcome to Pixeltable! In this tutorial, we'll survey how to create tables, populate them with data, and enhance them with built-in and user-defined transformations and AI operations.\n", "\n", "## Install Python packages\n", "\n", "First run the following command to install Pixeltable and related libraries needed for this tutorial." ] }, { "cell_type": "code", "execution_count": null, "id": "5115179a-2c88-48ef-b46c-dcebe82ed2de", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%pip install -qU torch transformers openai pixeltable" ] }, { "cell_type": "markdown", "id": "3c265fc3-1a33-4d4d-be35-c49c668514b3", "metadata": {}, "source": [ "## Creating a table\n", "\n", "Let's begin by creating a `demo` directory (if it doesn't already exist) and a table that can hold image data, `demo/first`. The table will initially have just a single column to hold our input images, which we'll call `input_image`. We also need to specify a type for the column: `pxt.Image`." 
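, "\n", "\n", "(Pixeltable supports other column types as well, such as `pxt.String`, `pxt.Int`, `pxt.Float`, and `pxt.Json`. As a hypothetical sketch, a richer multi-column table could be declared as follows; the table and column names here are illustrative only and aren't used in this tutorial.)\n", "\n", "```python\n", "import pixeltable as pxt\n", "\n", "# Hypothetical multi-column schema (illustration only)\n", "captions = pxt.create_table(\n", "    'demo/captions',\n", "    {'img': pxt.Image, 'caption': pxt.String, 'rating': pxt.Int},\n", ")\n", "```\n", "\n", "For now, though, a single image column is all we need."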
] }, { "cell_type": "code", "execution_count": 1, "id": "e926b5c8-16bc-4d4d-8384-32f62fedac28", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata\n", "Created directory 'demo'.\n", "Created table 'first'.\n" ] } ], "source": [ "import pixeltable as pxt\n", "\n", "# Create the directory `demo`, dropping it first (if it exists)\n", "# to ensure a clean environment.\n", "pxt.drop_dir('demo', force=True)\n", "pxt.create_dir('demo')\n", "\n", "# Create the table `demo/first` with a single column `input_image`\n", "t = pxt.create_table('demo/first', {'input_image': pxt.Image})" ] }, { "cell_type": "markdown", "id": "2e5ec663-261a-4bb9-a8b8-dfb1ec2633f0", "metadata": {}, "source": [ "We can use `t.describe()` to examine the table schema. We see that it now contains a single column, as expected." ] }, { "cell_type": "code", "execution_count": 2, "id": "00e1bdde-8f5b-438d-b2b2-0fd6193c9831", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
table 'demo/first'
\n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Column NameTypeComputed With
input_imageImage
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "t.describe()" ] }, { "cell_type": "markdown", "id": "abe5f4b5-86d0-4ff6-bb74-2c427fe9d656", "metadata": {}, "source": [ "The new table is initially empty, with no rows:" ] }, { "cell_type": "code", "execution_count": 3, "id": "f2b23cd1-d8ea-401f-8f7d-bc1f41654609", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.count()" ] }, { "cell_type": "markdown", "id": "3521e9b4-9ddf-4554-98a5-3c26b550caaf", "metadata": {}, "source": [ "Now let's put an image into it! We can add images simply by giving Pixeltable their URLs. The example images in this demo come from the [COCO dataset](https://cocodataset.org/), and we'll be referencing copies of them in the Pixeltable github repo. But in practice, the images can come from anywhere: an S3 bucket, say, or the local file system.\n", "\n", "When we add the image, we see that Pixeltable gives us some useful status updates indicating that the operation was successful." ] }, { "cell_type": "code", "execution_count": 4, "id": "0afd6da0-e48a-4dff-8df9-e82d5e1457a7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inserted 1 row with 0 errors in 0.21 s (4.86 rows/s)\n" ] }, { "data": { "text/plain": [ "1 row inserted." ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.insert(\n", " [\n", " {\n", " 'input_image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000025.jpg'\n", " }\n", " ]\n", ")" ] }, { "cell_type": "markdown", "id": "f214bbdd-6934-4c7a-b64d-7c3d1ad89625", "metadata": {}, "source": [ "We can use `t.head()` to examine the contents of the table." 
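, "\n", "\n", "(An aside: as noted above, images don't have to come from a URL. Inserting from the local file system uses exactly the same call shape; the path in this sketch is purely hypothetical.)\n", "\n", "```python\n", "# Hypothetical: insert an image from the local file system\n", "# (the path below is illustrative, not a real file)\n", "t.insert([{'input_image': '/path/to/local/image.jpg'}])\n", "```"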
] }, { "cell_type": "code", "execution_count": 5, "id": "b9030f28-6de0-4fec-8e28-c3d54d56157f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
input_image
\n", " \n", "
" ], "text/plain": [ " input_image\n", "0 \n", " \n", " \n", " input_image\n", " detections\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " {"boxes": [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]}\n", " \n", " \n", "" ], "text/plain": [ " input_image \\\n", "0 \n", " \n", " \n", " input_image\n", " detections\n", " detections_text\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " {"boxes": [[51.942, 356.174, 181.481, 413.975], [383.225, 58.66, 605.64, 361.346]], "labels": [25, 25], "scores": [0.99, 0.999], "label_text": ["giraffe", "giraffe"]}\n", " ["giraffe", "giraffe"]\n", " \n", " \n", "" ], "text/plain": [ " input_image \\\n", "0 \n", "#T_d5455_row0_col0 {\n", " white-space: pre-wrap;\n", " text-align: left;\n", " font-weight: bold;\n", "}\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
table 'demo/first'
\n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Column NameTypeComputed With
input_imageImage
detectionsJsondetr_for_object_detection(input_image, model_id='facebook/detr-resnet-50')
detections_textJsondetections.label_text
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "t.describe()" ] }, { "cell_type": "markdown", "id": "0c05012d-fc91-4b77-a521-00e35c9c432b", "metadata": {}, "source": [ "Now let's add some more images to our table. This demonstrates another important feature of computed columns: by default, they update incrementally any time new data shows up on their inputs. In this case, Pixeltable will run the ResNet-50 model against each new image that is added, then extract the labels into the `detect_text` column. Pixeltable will orchestrate the execution of any sequence (or DAG) of computed columns.\n", "\n", "Note how we can pass multiple rows to `t.insert` with a single statement, which will insert them more efficiently." ] }, { "cell_type": "code", "execution_count": 10, "id": "495d5b39-3795-45db-a174-e19e7c2f1eb6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inserted 4 rows with 0 errors in 1.51 s (2.65 rows/s)\n" ] }, { "data": { "text/plain": [ "4 rows inserted." ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "more_images = [\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000030.jpg',\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000034.jpg',\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000042.jpg',\n", " 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources/images/000000000061.jpg',\n", "]\n", "t.insert({'input_image': image} for image in more_images)" ] }, { "cell_type": "markdown", "id": "c512cd03-8f0e-4eca-aa7f-888776079666", "metadata": {}, "source": [ "Let's see what the model came up with. We'll use `t.select` to suppress the display of the `detect` column, since right now we're only interested in the text labels." 
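, "\n", "\n", "(To make the JSON path behind `detections_text` concrete: `detections.label_text` simply pulls the `label_text` field out of each detection result. The equivalent lookup in plain Python, applied to a made-up detection value, is shown below; the numbers are illustrative, not actual model output.)\n", "\n", "```python\n", "# Plain-Python equivalent of the `detections.label_text` JSON path,\n", "# applied to a made-up detection result\n", "detect = {\n", "    'boxes': [[52.0, 356.2, 181.5, 414.0]],\n", "    'labels': [25],\n", "    'scores': [0.99],\n", "    'label_text': ['giraffe'],\n", "}\n", "print(detect['label_text'])  # -> ['giraffe']\n", "```"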
] }, { "cell_type": "code", "execution_count": 11, "id": "2d9422a5-f205-4c36-9101-714c5c3b7b8f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
input_imagedetections_text
\n", " \n", "
["giraffe", "giraffe"]
\n", " \n", "
["vase", "potted plant"]
\n", " \n", "
["zebra"]
\n", " \n", "
["dog", "dog"]
\n", " \n", "
["person", "person", "bench", "person", "elephant", "elephant", "person"]
" ], "text/plain": [ " input_image \\\n", "0 \n", " \n", " \n", " input_image\n", " detections_text\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " ["giraffe", "giraffe"]\n", " \n", " \n", "
\n", " \n", "
\n", " ["vase", "potted plant"]\n", " \n", " \n", "" ], "text/plain": [ " input_image detections_text\n", "0 \n", " \n", " \n", " input_image\n", " detections_text\n", " vision\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " ["giraffe", "giraffe"]\n", " {"id": "chatcmpl-DCYw7EsCy0gWSmikqyiT89Z7iABX4", "model": "gpt-4o-mini-2024-07-18", "usage": {"total_tokens": 14238, "prompt_tokens": 14179, "completion_tokens": 59, "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 0}, "completion_tokens_details": {"audio_tokens": 0, "reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0}}, "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "The image shows two giraffes in a natural setting. One giraffe is closer to the camera, standing upright and likely reaching for leaves on a tree, ...... ther giraffe is further back and partially obscured. There are trees and greenery surrounding them, suggesting an open or safari-like environment.", "refusal": null, "annotations": []}, "logprobs": null, "finish_reason": "stop"}], "created": 1771886603, "service_tier": "default", "system_fingerprint": "fp_0a8a757e2a"}\n", " \n", " \n", "
\n", " \n", "
\n", " ["vase", "potted plant"]\n", " {"id": "chatcmpl-DCYwA0flHt6LKtKnUgWfQSsHFnRmO", "model": "gpt-4o-mini-2024-07-18", "usage": {"total_tokens": 14248, "prompt_tokens": 14179, "completion_tokens": 69, "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 0}, "completion_tokens_details": {"audio_tokens": 0, "reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0}}, "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "The image features a decorative white vase or urn that contains a vibrant arrangement of flowers. The bouquet includes various types of flowers, p ...... railing in a setting that appears to be outdoors, surrounded by soft, blurred greenery in the background, suggesting a garden or lush environment.", "refusal": null, "annotations": []}, "logprobs": null, "finish_reason": "stop"}], "created": 1771886606, "service_tier": "default", "system_fingerprint": "fp_373a14eb6f"}\n", " \n", " \n", "
\n", " \n", "
\n", " ["zebra"]\n", " {"id": "chatcmpl-DCYwAq4XiJOFLgafcdlTeoQ4FExaT", "model": "gpt-4o-mini-2024-07-18", "usage": {"total_tokens": 14217, "prompt_tokens": 14179, "completion_tokens": 38, "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 0}, "completion_tokens_details": {"audio_tokens": 0, "reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0}}, "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "The image shows a zebra grazing on green grass. The zebra has distinctive black and white stripes and is depicted in a natural setting. The background is mostly green, indicating a grassland environment.", "refusal": null, "annotations": []}, "logprobs": null, "finish_reason": "stop"}], "created": 1771886606, "service_tier": "default", "system_fingerprint": "fp_0a8a757e2a"}\n", " \n", " \n", "
\n", " \n", "
\n", " ["dog", "dog"]\n", " {"id": "chatcmpl-DCYwAfDVom1lhA4mRuZxIFulvEfrv", "model": "gpt-4o-mini-2024-07-18", "usage": {"total_tokens": 14231, "prompt_tokens": 14179, "completion_tokens": 52, "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 0}, "completion_tokens_details": {"audio_tokens": 0, "reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0}}, "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "The image shows a collection of shoes and flip-flops on a shoe rack, with a dog resting on top of them. The dog appears to have a curly coat and is nestled among the footwear. The background features a wall and part of the shoe rack.", "refusal": null, "annotations": []}, "logprobs": null, "finish_reason": "stop"}], "created": 1771886606, "service_tier": "default", "system_fingerprint": "fp_373a14eb6f"}\n", " \n", " \n", "
\n", " \n", "
\n", " ["person", "person", "bench", "person", "elephant", "elephant", "person"]\n", " {"id": "chatcmpl-DCYwAzHazyDJBylIIFQogDU4AGJAq", "model": "gpt-4o-mini-2024-07-18", "usage": {"total_tokens": 14234, "prompt_tokens": 14179, "completion_tokens": 55, "prompt_tokens_details": {"audio_tokens": 0, "cached_tokens": 0}, "completion_tokens_details": {"audio_tokens": 0, "reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0}}, "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "The image depicts a dense jungle scene with two elephants carrying riders, who appear to be exploring the lush greenery. The surrounding vegetatio ...... l trees, creating a vibrant natural setting. The atmosphere suggests a tropical or subtropical environment, typical of areas rich in biodiversity.", "refusal": null, "annotations": []}, "logprobs": null, "finish_reason": "stop"}], "created": 1771886606, "service_tier": "default", "system_fingerprint": "fp_0a8a757e2a"}\n", " \n", " \n", "" ], "text/plain": [ " input_image \\\n", "0 \n", " \n", " \n", " input_image\n", " detections_text\n", " vision_choices0_message_content\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " ["giraffe", "giraffe"]\n", " The image shows two giraffes in a natural setting. One giraffe is closer to the camera, standing upright and likely reaching for leaves on a tree, while another giraffe is further back and partially obscured. There are trees and greenery surrounding them, suggesting an open or safari-like environment.\n", " \n", " \n", "
\n", " \n", "
\n", " ["vase", "potted plant"]\n", " The image features a decorative white vase or urn that contains a vibrant arrangement of flowers. The bouquet includes various types of flowers, predominantly shades of pink and white, with some greenery. The vase is placed on a railing in a setting that appears to be outdoors, surrounded by soft, blurred greenery in the background, suggesting a garden or lush environment.\n", " \n", " \n", "
\n", " \n", "
\n", " ["zebra"]\n", " The image shows a zebra grazing on green grass. The zebra has distinctive black and white stripes and is depicted in a natural setting. The background is mostly green, indicating a grassland environment.\n", " \n", " \n", "
\n", " \n", "
\n", " ["dog", "dog"]\n", " The image shows a collection of shoes and flip-flops on a shoe rack, with a dog resting on top of them. The dog appears to have a curly coat and is nestled among the footwear. The background features a wall and part of the shoe rack.\n", " \n", " \n", "
\n", " \n", "
\n", " ["person", "person", "bench", "person", "elephant", "elephant", "person"]\n", " The image depicts a dense jungle scene with two elephants carrying riders, who appear to be exploring the lush greenery. The surrounding vegetation includes thick foliage and tall trees, creating a vibrant natural setting. The atmosphere suggests a tropical or subtropical environment, typical of areas rich in biodiversity.\n", " \n", " \n", "" ], "text/plain": [ " input_image \\\n", "0 \n", " \n", " \n", " rot_image\n", " rotvision_choices0_message_content\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " The image features two giraffes in a natural setting. One giraffe is prominently displayed in the foreground, with its distinctive long neck and spotted coat. The background consists of greenery and trees, creating a serene and picturesque scene.\n", " \n", " \n", "
\n", " \n", "
\n", " The image shows a white decorative vase hanging upside down, adorned with a bouquet of flowers. The bouquet features various types of flowers, including pink and white blooms, adding a vibrant and decorative touch. The background appears to be a lush, green area, suggesting a natural setting.\n", " \n", " \n", "
\n", " \n", "
\n", " The image contains a zebra lying on grass. The zebra is characterized by its distinctive black and white stripes, and it appears to be in a natural outdoor setting.\n", " \n", " \n", "
\n", " \n", "
\n", " The image appears to show a small, fluffy dog lying among various pairs of shoes on a shelf. The shoes include sneakers and sandals. The setting looks like an indoor space, possibly near an entrance.\n", " \n", " \n", "
\n", " \n", "
\n", " The image appears to depict a lush, green landscape, likely in a forest or jungle setting. There are various plants and foliage, creating a dense environment. Some objects or individuals are visible, but it's difficult to make out specific details due to the overwhelming greenery. The overall scene conveys a sense of nature and wilderness.\n", " \n", " \n", "" ], "text/plain": [ " rot_image \\\n", "0 str:\n", " scores = detect['scores']\n", " label_text = detect['label_text']\n", " # Get the index of the object with the highest confidence\n", " i = scores.index(max(scores))\n", " # Return the corresponding label\n", " return label_text[i]" ] }, { "cell_type": "code", "execution_count": 19, "id": "3228c211-5bd6-41e4-8da0-643b0dda27a4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added 5 column values with 0 errors in 0.11 s (45.52 rows/s)\n" ] }, { "data": { "text/plain": [ "5 rows updated." ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.add_computed_column(top=top_detection(t.detections))" ] }, { "cell_type": "code", "execution_count": 20, "id": "228a2b9e-d8be-4e18-b408-872651fd8928", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
detections_texttop
["person", "person", "bench", "person", "elephant", "elephant", "person"]elephant
["zebra"]zebra
["giraffe", "giraffe"]giraffe
["dog", "dog"]dog
["vase", "potted plant"]vase
" ], "text/plain": [ " detections_text top\n", "0 [person, person, bench, person, elephant, elep... elephant\n", "1 [zebra] zebra\n", "2 [giraffe, giraffe] giraffe\n", "3 [dog, dog] dog\n", "4 [vase, potted plant] vase" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.select(t.detections_text, t.top).show()" ] }, { "cell_type": "markdown", "id": "0f189b1e-708c-488f-856c-a819b6c73825", "metadata": {}, "source": [ "Congratulations! You've reached the end of the tutorial. Hopefully, this gives a good overview of the capabilities of Pixeltable, but there's much more to explore. As a next step, you might check out one of the other tutorials, depending on your interests:\n", "\n", "- [Object Detection in Videos](https://docs.pixeltable.com/howto/use-cases/object-detection-in-videos)\n", "- [RAG Operations in Pixeltable](https://docs.pixeltable.com/howto/use-cases/rag-operations)\n", "- [Working with OpenAI in Pixeltable](https://docs.pixeltable.com/howto/providers/working-with-openai)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.19" } }, "nbformat": 4, "nbformat_minor": 5 }