{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Look up structured data with retrieval UDFs\n", "\n", "Create lookup functions that query tables by key—for customer records, product catalogs, or financial data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem\n", "\n", "You have structured data—customer records, product catalogs, financial data—and need to look up rows by key values. Common scenarios:\n", "\n", "| Use case | Lookup key | Returns |\n", "|----------|------------|---------|\n", "| Customer support | `customer_id` | Contact info, order history |\n", "| Product search | `sku` or `product_name` | Price, inventory, specs |\n", "| Financial analysis | `ticker` or `date` | Stock prices, transactions |\n", "| Inventory check | `warehouse_id`, `product_id` | Stock levels |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n", "\n", "**What's in this recipe:**\n", "\n", "- Create lookup functions from tables with `retrieval_udf`\n", "- Query by single or multiple keys\n", "- Use lookups in computed columns for data enrichment\n", "\n", "Use `pxt.retrieval_udf(table)` to automatically create a function that queries the table by its columns." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install -qU pixeltable" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata\n", "Created directory 'lookup_demo'.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pixeltable as pxt\n", "\n", "# Create a fresh directory\n", "pxt.drop_dir('lookup_demo', force=True)\n", "pxt.create_dir('lookup_demo')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a product catalog table" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'products'.\n", "Inserting rows into `products`: 5 rows [00:00, 502.31 rows/s]\n", "Inserted 5 rows with 0 errors.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
skunamepricecategory
LAPTOP-001MacBook Pro 14"1999.electronics
LAPTOP-002ThinkPad X11499.electronics
PHONE-001iPhone 15 Pro999.electronics
CHAIR-001Ergonomic Office Chair449.furniture
DESK-001Standing Desk699.furniture
" ], "text/plain": [ " sku name price category\n", "0 LAPTOP-001 MacBook Pro 14\" 1999.0 electronics\n", "1 LAPTOP-002 ThinkPad X1 1499.0 electronics\n", "2 PHONE-001 iPhone 15 Pro 999.0 electronics\n", "3 CHAIR-001 Ergonomic Office Chair 449.0 furniture\n", "4 DESK-001 Standing Desk 699.0 furniture" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a product catalog\n", "products = pxt.create_table(\n", " 'lookup_demo/products',\n", " {\n", " 'sku': pxt.String,\n", " 'name': pxt.String,\n", " 'price': pxt.Float,\n", " 'category': pxt.String,\n", " },\n", ")\n", "\n", "products.insert(\n", " [\n", " {\n", " 'sku': 'LAPTOP-001',\n", " 'name': 'MacBook Pro 14\"',\n", " 'price': 1999.00,\n", " 'category': 'electronics',\n", " },\n", " {\n", " 'sku': 'LAPTOP-002',\n", " 'name': 'ThinkPad X1',\n", " 'price': 1499.00,\n", " 'category': 'electronics',\n", " },\n", " {\n", " 'sku': 'PHONE-001',\n", " 'name': 'iPhone 15 Pro',\n", " 'price': 999.00,\n", " 'category': 'electronics',\n", " },\n", " {\n", " 'sku': 'CHAIR-001',\n", " 'name': 'Ergonomic Office Chair',\n", " 'price': 449.00,\n", " 'category': 'furniture',\n", " },\n", " {\n", " 'sku': 'DESK-001',\n", " 'name': 'Standing Desk',\n", " 'price': 699.00,\n", " 'category': 'furniture',\n", " },\n", " ]\n", ")\n", "\n", "products.collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a lookup function with retrieval_udf" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Create a lookup function that searches by SKU\n", "get_product = pxt.retrieval_udf(\n", " products,\n", " name='get_product',\n", " description='Look up a product by its SKU code',\n", " parameters=['sku'], # Only use SKU as the lookup key\n", " limit=1, # Return at most 1 result\n", ")\n", "# Check the function signature" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Look up a product by SKU\n", "result = products.select(get_product(sku='LAPTOP-001')).limit(1).collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Look up by category (multiple results)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
get_by_category
[{"sku": "LAPTOP-001", "name": "MacBook Pro 14\\"", "price": 1999., "category": "electronics"}, {"sku": "LAPTOP-002", "name": "ThinkPad X1", "price": 1499., "category": "electronics"}, {"sku": "PHONE-001", "name": "iPhone 15 Pro", "price": 999., "category": "electronics"}]
" ], "text/plain": [ " get_by_category\n", "0 [{'sku': 'LAPTOP-001', 'name': 'MacBook Pro 14..." ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a category lookup (returns multiple products)\n", "get_by_category = pxt.retrieval_udf(\n", " products,\n", " name='get_by_category',\n", " description='Get all products in a category',\n", " parameters=['category'],\n", " limit=10, # Return up to 10 products\n", ")\n", "\n", "# Find all electronics\n", "products.select(get_by_category(category='electronics')).limit(\n", " 1\n", ").collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use lookups for data enrichment" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'orders'.\n", "Inserting rows into `orders`: 3 rows [00:00, 1186.28 rows/s]\n", "Inserted 3 rows with 0 errors.\n" ] }, { "data": { "text/plain": [ "3 rows inserted, 6 values computed." ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create an orders table\n", "orders = pxt.create_table(\n", " 'lookup_demo/orders',\n", " {\n", " 'order_id': pxt.String,\n", " 'product_sku': pxt.String,\n", " 'quantity': pxt.Int,\n", " },\n", ")\n", "\n", "orders.insert(\n", " [\n", " {\n", " 'order_id': 'ORD-001',\n", " 'product_sku': 'LAPTOP-001',\n", " 'quantity': 2,\n", " },\n", " {\n", " 'order_id': 'ORD-002',\n", " 'product_sku': 'PHONE-001',\n", " 'quantity': 1,\n", " },\n", " {\n", " 'order_id': 'ORD-003',\n", " 'product_sku': 'CHAIR-001',\n", " 'quantity': 4,\n", " },\n", " ]\n", ")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added 3 column values with 0 errors.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
order_idproduct_skuquantityproduct_info
ORD-001LAPTOP-0012[{"sku": "LAPTOP-001", "name": "MacBook Pro 14\\"", "price": 1999., "category": "electronics"}]
ORD-002PHONE-0011[{"sku": "PHONE-001", "name": "iPhone 15 Pro", "price": 999., "category": "electronics"}]
ORD-003CHAIR-0014[{"sku": "CHAIR-001", "name": "Ergonomic Office Chair", "price": 449., "category": "furniture"}]
" ], "text/plain": [ " order_id product_sku quantity \\\n", "0 ORD-001 LAPTOP-001 2 \n", "1 ORD-002 PHONE-001 1 \n", "2 ORD-003 CHAIR-001 4 \n", "\n", " product_info \n", "0 [{'sku': 'LAPTOP-001', 'name': 'MacBook Pro 14... \n", "1 [{'sku': 'PHONE-001', 'name': 'iPhone 15 Pro',... \n", "2 [{'sku': 'CHAIR-001', 'name': 'Ergonomic Offic... " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Add a computed column that enriches orders with product details\n", "orders.add_computed_column(\n", " product_info=get_product(sku=orders.product_sku)\n", ")\n", "\n", "# View enriched orders\n", "orders.select(\n", " orders.order_id,\n", " orders.product_sku,\n", " orders.quantity,\n", " orders.product_info,\n", ").collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explanation\n", "\n", "**`retrieval_udf` parameters:**\n", "\n", "| Parameter | Description |\n", "|-----------|-------------|\n", "| `table` | The table to query |\n", "| `name` | Function name (defaults to table name) |\n", "| `description` | Description for LLM tool use |\n", "| `parameters` | Columns to use as lookup keys |\n", "| `limit` | Max rows to return (None = all) |\n", "\n", "**Use cases:**\n", "\n", "| Pattern | Example |\n", "|---------|---------|\n", "| Data enrichment | Join order SKUs to product details |\n", "| LLM tool calling | Let agents query databases |\n", "| Reference lookups | Convert codes to descriptions |\n", "| Cross-table joins | Link related records |\n", "\n", "**Tips:**\n", "\n", "- Use `limit=1` for unique key lookups\n", "- Specify only needed columns in `parameters` for cleaner APIs\n", "- Add descriptions for LLM tool integration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## See also\n", "\n", "- [Use tool calling with LLMs](https://docs.pixeltable.com/howto/cookbooks/agents/llm-tool-calling) - Use retrieval UDFs as LLM tools\n", "- [Build a RAG pipeline](https://docs.pixeltable.com/howto/cookbooks/agents/pattern-rag-pipeline) - Semantic search with `@pxt.query`" ] } ], "metadata": { "kernelspec": { "display_name": "pixeltable", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 2 }