{ "cells": [ { "cell_type": "markdown", "id": "ba79d33d-7ae2-4f0a-94c3-783f799e1ff4", "metadata": {}, "source": [ "# Queries and Expressions\n", "\n", "Expressions are the basic building blocks of Pixeltable. This guide explores how to use queries and expressions, including:\n", "\n", "- Different types of Pixeltable expressions\n", "- Column references and arithmetic operations\n", "- Function calls and media operations\n", "- The Pixeltable type system\n", "\n", "## Prerequisites\n", "\n", "This guide assumes you're familiar with:\n", "\n", "- Creating and managing tables\n", "- Basic table operations and queries\n", "- Computed columns\n", "\n", "If you're new to these concepts, start with:\n", "\n", "- [Tables and Data Operations](https://docs.pixeltable.com/tutorials/tables-and-data-operations)\n", "- [Computed Columns](https://docs.pixeltable.com/tutorials/computed-columns)\n", "\n", "## Understanding Expressions\n", "\n", "You can use Pixeltable expressions in queries:\n", "\n", "```python\n", "pop_t.select(yoy_change=(pop_t.pop_2023 - pop_t.pop_2022)).collect()\n", "```\n", "\n", "Or as computed columns that update automatically:\n", "\n", "```python\n", "pop_t.add_column(yoy_change=(pop_t.pop_2023 - pop_t.pop_2022))\n", "```\n", "\n", "Both examples use the expression `pop_t.pop_2023 - pop_t.pop_2022`. You can also chain operations:\n", "\n", "```python\n", "t.source.convert('RGBA').rotate(10)\n", "```\n", "\n", "Or invoke models:\n", "\n", "```python\n", "detr_for_object_detection(\n", " t.source,\n", " model_id='facebook/detr-resnet-50',\n", " threshold=0.8\n", ")\n", "```\n", "\n", "You can include an expression in a `select()` statement to evaluate it dynamically, or in an `add_column()` statement to add it to the table schema as a computed column.\n", "\n", "To get started, let's import the necessary libraries and set up a demo directory." 
] }, { "cell_type": "code", "execution_count": null, "id": "af95c4aa-7905-44e9-b387-af512c347b68", "metadata": {}, "outputs": [], "source": [ "%pip install -qU pixeltable datasets torch transformers" ] }, { "cell_type": "code", "execution_count": 1, "id": "7c14c1fe-83fb-48dd-a766-a2b22c9d214a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata\n", "Created directory 'demo'.\n" ] } ], "source": [ "import pixeltable as pxt\n", "\n", "pxt.drop_dir('demo', force=True)\n", "pxt.create_dir('demo')" ] }, { "cell_type": "markdown", "id": "ba3ff2d8-aafd-45af-b3e0-a86716ccb63a", "metadata": {}, "source": [ "In this guide we'll work with a subset of the MNIST dataset, a classic reference database of hand-drawn digits. A copy of the MNIST dataset is hosted on the Hugging Face datasets repository, so we can use `create_table()` with the `source` parameter to load it into a Pixeltable table." ] }, { "cell_type": "code", "execution_count": 2, "id": "c515c6ba-959f-4da0-9bba-872acf2a845c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'mnist'.\n", "Inserting rows into `mnist`: 50 rows [00:00, 7516.67 rows/s]\n", "Inserted 50 rows with 0 errors.\n" ] } ], "source": [ "import datasets\n", "\n", "# Download the first 50 images of the MNIST dataset\n", "ds = datasets.load_dataset('ylecun/mnist', split='train[:50]')\n", "\n", "# Import them into a Pixeltable table\n", "t = pxt.create_table('demo/mnist', source=ds)" ] }, { "cell_type": "code", "execution_count": 3, "id": "57119aee-d513-4489-a3b5-44f5e2988636", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| image | \n", "label | \n", "
|---|---|
\n",
" | \n",
" 5 | \n", "
\n",
" | \n",
" 0 | \n", "
\n",
" | \n",
" 4 | \n", "
\n",
" | \n",
" 1 | \n", "
\n",
" | \n",
" 9 | \n", "
| Column\n", "'image'\n", "(of table 'demo/mnist') | \n", "
| Column Name | \n", "Type | \n", "Computed With | \n", "
|---|---|---|
| image | \n", "Image | \n", "\n", " |
| image | \n", "classification | \n", "
|---|---|
\n",
" | \n",
" {"labels": [5, 3, 2, 8, 7], "scores": [0.981, 0.013, 0.002, 0.001, 0.001], "label_text": ["5", "3", "2", "8", "7"]} | \n", "
\n",
" | \n",
" {"labels": [0, 6, 9, 8, 1], "scores": [0.997, 0., 0., 0., 0.], "label_text": ["0", "6", "9", "8", "1"]} | \n", "
\n",
" | \n",
" {"labels": [4, 1, 9, 7, 0], "scores": [0.997, 0.001, 0., 0., 0.], "label_text": ["4", "1", "9", "7", "0"]} | \n", "
| table 'demo/mnist' | \n", "
| Column Name | \n", "Type | \n", "Computed With | \n", "
|---|---|---|
| image | \n", "Image | \n", "\n", " |
| label | \n", "String | \n", "\n", " |
| classification | \n", "Json | \n", "vit_for_image_classification(image, model_id='farleyknight-org-username/vit-base-mnist') | \n", "
| classification_labels | \n", "
|---|
| [5, 3, 2, 8, 7] | \n", "
| [0, 6, 9, 8, 1] | \n", "
| [4, 1, 9, 7, 0] | \n", "
| classification_labels | \n", "
|---|
| [5, 3, 2, 8, 7] | \n", "
| [0, 6, 9, 8, 1] | \n", "
| [4, 1, 9, 7, 0] | \n", "
| classification_labels0 | \n", "
|---|
| 5 | \n", "
| 0 | \n", "
| 4 | \n", "
| classification_labels2 | \n", "
|---|
| [5, 3] | \n", "
| [0, 6] | \n", "
| [4, 1] | \n", "
| classification_notakey | \n", "
|---|
| None | \n", "
| None | \n", "
| None | \n", "
| table 'demo/mnist' | \n", "
| Column Name | \n", "Type | \n", "Computed With | \n", "
|---|---|---|
| image | \n", "Image | \n", "\n", " |
| label | \n", "String | \n", "\n", " |
| classification | \n", "Json | \n", "vit_for_image_classification(image, model_id='farleyknight-org-username/vit-base-mnist') | \n", "
| pred_label | \n", "Json | \n", "classification.label_text[0] | \n", "
| image | \n", "col_1 | \n", "
|---|---|
\n",
" | \n",
" {"ground_truth": "5", "prediction": "5", "is_correct": true, "engine": "pixeltable"} | \n", "
\n",
" | \n",
" {"ground_truth": "0", "prediction": "0", "is_correct": true, "engine": "pixeltable"} | \n", "
\n",
" | \n",
" {"ground_truth": "4", "prediction": "4", "is_correct": true, "engine": "pixeltable"} | \n", "
\n",
" | \n",
" {"ground_truth": "1", "prediction": "1", "is_correct": true, "engine": "pixeltable"} | \n", "
\n",
" | \n",
" {"ground_truth": "9", "prediction": "9", "is_correct": true, "engine": "pixeltable"} | \n", "
rot_model_result for later reuse. Every Pixeltable expression is a Python object, so you can freely assign expressions to variables, reuse them, and compose them. Remember that nothing actually happens until an expression is used in a query: in this example, setting the variable `rot_model_result` doesn't itself retrieve any data; that only happens later, when we actually use it in the `select()` query.\n",
"\n",
"\n",
"Pixeltable ships with a large number of built-in UDFs; you can always refer to the [SDK Documentation](https://docs.pixeltable.com/sdk/latest/) for details."
]
},
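  {
   "cell_type": "markdown",
   "id": "b2f4c8e1-3a5d-4b6f-9c7e-1d2a3b4c5d6e",
   "metadata": {},
   "source": [
    "As a minimal sketch of reusing an expression (this uses the `image` column from this guide and assumes the built-in `resize` UDF, which mirrors `PIL.Image.resize`; as always, nothing is evaluated until `collect()`):\n",
    "\n",
    "```python\n",
    "# Build an expression once...\n",
    "thumb = t.image.resize((14, 14))\n",
    "\n",
    "# ...then reuse and compose it in multiple queries;\n",
    "# evaluation only happens at collect()\n",
    "t.select(thumb).collect()\n",
    "t.select(thumb.rotate(45)).collect()\n",
    "```"
   ]
  },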
{
"cell_type": "markdown",
"id": "a9bdce5c-f774-4e87-9143-5383081060ff",
"metadata": {},
"source": [
"### Method Calls\n",
"\n",
"Many built-in UDFs allow a convenient alternate syntax. The following two expressions are exactly equivalent:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "e4d9189b-39f8-47ff-807b-c4f7393a474c",
"metadata": {},
"outputs": [],
"source": [
"a = t.image.rotate(90)\n",
"b = pxt.functions.image.rotate(t.image, 90)"
]
},
{
"cell_type": "markdown",
"id": "827f0033-7b39-4d09-8db6-09326d1764d0",
"metadata": {},
"source": [
"`a` and `b` can always be used interchangeably in queries, with identical results. Just like in standard Python classes, whenever Pixeltable sees the __method call__ `t.image.rotate(90)`, it interprets it as a __function call__ `pxt.functions.image.rotate(self, 90)`, with (in this case) `self` equal to `t.image`.\n",
"\n",
"Any method call can also be written as a function call, but (just like in standard Python) not every function call can be written as a method call. For example, the following won't work:\n",
"\n",
"```python\n",
"t.image.vit_for_image_classification(\n",
" model_id='farleyknight-org-username/vit-base-mnist'\n",
")\n",
"```\n",
"\n",
"That's because `vit_for_image_classification` is part of the `pxt.functions.huggingface` module, not the core module `pxt.functions.image`. Most Pixeltable types have a corresponding __core module__ of UDFs that can be used as method calls (`pxt.functions.image` for `Image`; `pxt.functions.string` for `String`; and so on), described fully in the [SDK Documentation](https://docs.pixeltable.com/sdk/latest/)."
]
},
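  {
   "cell_type": "markdown",
   "id": "d7e8f9a0-1b2c-4d3e-8f4a-5b6c7d8e9f0a",
   "metadata": {},
   "source": [
    "The same pattern applies to the other core modules. For instance, here is a sketch using the `label` column (a `String`), assuming the built-in `upper` UDF in `pxt.functions.string`, which mirrors Python's `str.upper`:\n",
    "\n",
    "```python\n",
    "# Method-call syntax on a String column...\n",
    "a = t.label.upper()\n",
    "\n",
    "# ...is interpreted as the equivalent function call\n",
    "b = pxt.functions.string.upper(t.label)\n",
    "```"
   ]
  },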
{
"cell_type": "markdown",
"id": "f5f4c17a-61fb-4b89-b05a-20b79edebc91",
"metadata": {},
"source": [
"### Arithmetic and Boolean Operations"
]
},
{
"cell_type": "markdown",
"id": "7bd67623-5031-4eae-990f-18577ba8dc19",
"metadata": {},
"source": [
"Expressions can also be combined using standard arithmetic and boolean operators. Like all Pixeltable expressions, these describe operations on columns: when used in a query, they are evaluated for every row."
]
},
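  {
   "cell_type": "markdown",
   "id": "e1f2a3b4-5c6d-4e7f-8a9b-0c1d2e3f4a5b",
   "metadata": {},
   "source": [
    "Boolean expressions are also what you pass to `where()` to filter rows. A quick sketch: Pixeltable combines boolean expressions with the operators `&`, `|`, and `~` (rather than Python's `and`/`or`/`not`), and each comparison must be parenthesized:\n",
    "\n",
    "```python\n",
    "# Select only the rows labeled '0' or '5'\n",
    "t.where((t.label == '0') | (t.label == '5')).collect()\n",
    "```"
   ]
  },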
{
"cell_type": "code",
"execution_count": 18,
"id": "b603d361-eddf-4d53-87df-ab2cc30eb42b",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"| image | \n", "label | \n", "col_2 | \n", "col_3 | \n", "
|---|---|---|---|
\n",
" | \n",
" 5 | \n", "False | \n", "False | \n", "
\n",
" | \n",
" 0 | \n", "False | \n", "True | \n", "
\n",
" | \n",
" 4 | \n", "True | \n", "True | \n", "
\n",
" | \n",
" 1 | \n", "False | \n", "True | \n", "
\n",
" | \n",
" 9 | \n", "False | \n", "False | \n", "
| image | \n", "image_localpath | \n", "
|---|---|
\n",
" | \n",
" /Users/asiegel/.pixeltable/media/99266a70fa5141bc9b5d1c9fddcc7a16/92/9276/99266a70fa5141bc9b5d1c9fddcc7a16_0_1_927616cf87bf45ea9a3ebabd1da9bc01.jpeg | \n", "
\n",
" | \n",
" /Users/asiegel/.pixeltable/media/99266a70fa5141bc9b5d1c9fddcc7a16/d3/d360/99266a70fa5141bc9b5d1c9fddcc7a16_0_1_d360c94ae7624066a723e7e1ccd4cf37.jpeg | \n", "
\n",
" | \n",
" /Users/asiegel/.pixeltable/media/99266a70fa5141bc9b5d1c9fddcc7a16/4f/4f38/99266a70fa5141bc9b5d1c9fddcc7a16_0_1_4f38ed14b91445b2b03f1d4772e7f2e7.jpeg | \n", "
\n",
" | \n",
" /Users/asiegel/.pixeltable/media/99266a70fa5141bc9b5d1c9fddcc7a16/0c/0ca1/99266a70fa5141bc9b5d1c9fddcc7a16_0_1_0ca1cf9c60cc4c90abfd0d3d4e6ef4c3.jpeg | \n", "
\n",
" | \n",
" /Users/asiegel/.pixeltable/media/99266a70fa5141bc9b5d1c9fddcc7a16/65/657d/99266a70fa5141bc9b5d1c9fddcc7a16_0_1_657dcf92103f409a9203817b2565d05a.jpeg | \n", "
| image | \n", "channel | \n", "channel_errortype | \n", "channel_errormsg | \n", "
|---|---|---|---|
\n",
" | \n",
" None | \n", "ValueError | \n", "band index out of range | \n", "
\n",
" | \n",
" None | \n", "ValueError | \n", "band index out of range | \n", "
\n",
" | \n",
" None | \n", "ValueError | \n", "band index out of range | \n", "
\n",
" | \n",
" None | \n", "ValueError | \n", "band index out of range | \n", "
\n",
" | \n",
" None | \n", "ValueError | \n", "band index out of range | \n", "
pxt.Audio, pxt.Video, and pxt.Document all correspond to the Python type str. This is because those types are represented by file paths that reference the media in question. When you query for, say, t.select(t.video_col), you're guaranteed to get a file path on the local filesystem (Pixeltable will download and cache a local copy of the video if necessary to ensure this). If you want the original URL, use t.video_col.fileurl instead.\n",
""
]
},
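  {
   "cell_type": "markdown",
   "id": "f6a7b8c9-0d1e-4f2a-8b3c-4d5e6f7a8b9c",
   "metadata": {},
   "source": [
    "For example, with the `image` column in this guide (a sketch; media columns expose both the `localpath` and `fileurl` properties):\n",
    "\n",
    "```python\n",
    "# localpath: the locally cached file; fileurl: the original source URL\n",
    "t.select(t.image.localpath, t.image.fileurl).collect()\n",
    "```"
   ]
  },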
{
"cell_type": "markdown",
"id": "ff20bf54-f4a0-435b-9874-13a4e8471dd3",
"metadata": {},
"source": [
"Several types can be __specialized__ to constrain the allowable data in a column.\n",
"\n",
"- `pxt.Image` can be specialized with a resolution and/or an image mode:\n",
" - `pxt.Image[(300,200)]` - images with width 300 and height 200\n",
" - `pxt.Image['RGB']` - images with mode `'RGB'`; see the [PIL Documentation](https://pillow.readthedocs.io/en/stable/handbook/concepts.html) for the full list\n",
" - `pxt.Image[(300,200), 'RGB']` - combines the above constraints\n",
"- `pxt.Array` can be specialized with a shape and/or a dtype:\n",
" - `pxt.Array[pxt.Float]` - arrays with dtype `pxt.Float`\n",
" - `pxt.Array[(64,64,3), pxt.Float]` - 3-dimensional arrays with dtype `pxt.Float` and 64x64x3 shape\n",
"\n",
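    "For example, specialized types can be used in a schema when creating a table. A sketch (assuming a hypothetical `demo/frames` table; `create_table()` accepts a schema dict mapping column names to types):\n",
    "\n",
    "```python\n",
    "frames = pxt.create_table('demo/frames', {\n",
    "    'frame': pxt.Image[(300,200), 'RGB'],      # 300x200 RGB images only\n",
    "    'embedding': pxt.Array[(512,), pxt.Float]  # 1-D float arrays of length 512\n",
    "})\n",
    "```\n",
    "\n",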
"If we look at the structure of our table now, we see examples of specialized image and array types."
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "0014d0c7-680c-4096-a599-50a788b1d840",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"| table 'demo/mnist' | \n", "
| Column Name | \n", "Type | \n", "Computed With | \n", "
|---|---|---|
| image | \n", "Image | \n", "\n", " |
| label | \n", "String | \n", "\n", " |
| classification | \n", "Json | \n", "vit_for_image_classification(image, model_id='farleyknight-org-username/vit-base-mnist') | \n", "
| pred_label | \n", "Json | \n", "classification.label_text[0] | \n", "
| clip | \n", "Array[(512,), Float] | \n", "clip(image, model_id='openai/clip-vit-base-patch32') | \n", "
| channel | \n", "Image['L'] | \n", "image.getchannel(1) | \n", "
pxt.Image by itself to mean \"any image, without constraints\", but numerical arrays must always specify a shape and a dtype; pxt.Array by itself will raise an error.\n",
"The shape of an array is given as a tuple of dimensions, such as (512,) or (64,64,3). A None may be used in place of an integer to indicate an unconstrained size for that dimension, as in (None,None,3) (a 3-dimensional array with two unconstrained dimensions), or simply (None,) (an unconstrained 1-dimensional array).\n",
"