{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Use tool calling and MCP servers with LLMs\n", "\n", "Enable LLMs to call functions and tools, then execute the results automatically." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem\n", "\n", "You want an LLM to decide which functions to call based on user queries—for agents, chatbots, or automated workflows.\n", "\n", "| Use case | Tools needed |\n", "|----------|--------------|\n", "| Data assistant | `get_data`, `run_query` |\n", "| Customer support | `lookup_order`, `check_status` |\n", "| Research agent | `search_web`, `fetch_article` |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n", "\n", "**What's in this recipe:**\n", "\n", "- Define tools as Python functions\n", "- Let LLMs decide which tool to call\n", "- Automatically execute tool calls with `invoke_tools`\n", "- Use MCP servers to load external tools\n", "\n", "You define tools with JSON schemas, pass them to the LLM, and use `invoke_tools` to execute the function calls." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install -qU pixeltable openai mcp" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import getpass\n", "import os\n", "\n", "if 'OPENAI_API_KEY' not in os.environ:\n", " os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key: ')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/pierre/pixeltable/pixeltable/env.py:501: UserWarning: Progress reporting is disabled because ipywidgets is not installed. To fix this, run: `pip install ipywidgets`\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata\n", "Created directory 'tools_demo'.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pixeltable as pxt\n", "from pixeltable.functions import openai\n", "\n", "# Create a fresh directory\n", "pxt.drop_dir('tools_demo', force=True)\n", "pxt.create_dir('tools_demo')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define tools as UDFs" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Define tool functions as Pixeltable UDFs\n", "@pxt.udf\n", "def get_weather(city: str) -> str:\n", " \"\"\"Get the current weather for a city.\"\"\"\n", " # In production, call a real weather API\n", " weather_data = {\n", " 'new york': 'Sunny, 72°F',\n", " 'london': 'Cloudy, 58°F',\n", " 'tokyo': 'Rainy, 65°F',\n", " 'paris': 'Partly cloudy, 68°F',\n", " }\n", " return weather_data.get(\n", " city.lower(), f'Weather data not available for {city}'\n", " )\n", "\n", "\n", "@pxt.udf\n", "def get_stock_price(symbol: str) -> str:\n", " \"\"\"Get the current stock price for a symbol.\"\"\"\n", " # In production, call a real stock API\n", " prices = {\n", " 'AAPL': '$178.50',\n", " 'GOOGL': '$141.25',\n", " 'MSFT': '$378.90',\n", " 'AMZN': '$185.30',\n", " }\n", " return prices.get(symbol.upper(), f'Price not available for {symbol}')" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Create a Tools object with our functions\n", "tools = pxt.tools(get_weather, get_stock_price)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create tool-calling pipeline" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'queries'.\n" ] } ], "source": [ "# Create table for queries\n", "queries = pxt.create_table('tools_demo/queries', {'query': pxt.String})" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added 0 column values with 0 errors in 0.00 s\n" ] }, { "data": { "text/plain": [ "No rows affected." ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Add LLM call with tools\n", "queries.add_computed_column(\n", " response=openai.chat_completions(\n", " messages=[{'role': 'user', 'content': queries.query}],\n", " model='gpt-4o-mini',\n", " tools=tools, # Pass tools to the LLM\n", " )\n", ")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added 0 column values with 0 errors in 0.01 s\n" ] }, { "data": { "text/plain": [ "No rows affected." ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Automatically execute tool calls and get results\n", "queries.add_computed_column(\n", " tool_results=openai.invoke_tools(tools, queries.response)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Run tool-enabled queries" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inserted 3 rows with 0 errors in 4.16 s (0.72 rows/s)\n" ] }, { "data": { "text/plain": [ "3 rows inserted." ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Insert queries that require tool calls\n", "sample_queries = [\n", " {'query': \"What's the weather in Tokyo?\"},\n", " {'query': \"What's the stock price of Apple?\"},\n", " {\n", " 'query': \"What's the weather in Paris and the price of Microsoft stock?\"\n", " },\n", "]\n", "\n", "queries.insert(sample_queries)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
querytool_results
What's the weather in Tokyo?{"get_weather": ["Rainy, 65\\u00b0F"], "get_stock_price": null}
What's the stock price of Apple?{"get_weather": null, "get_stock_price": ["\\$178.50"]}
What's the weather in Paris and the price of Microsoft stock?{"get_weather": ["Partly cloudy, 68\\u00b0F"], "get_stock_price": ["\\$378.90"]}
" ], "text/plain": [ " query \\\n", "0 What's the weather in Tokyo? \n", "1 What's the stock price of Apple? \n", "2 What's the weather in Paris and the price of M... \n", "\n", " tool_results \n", "0 {'get_weather': ['Rainy, 65°F'], 'get_stock_pr... \n", "1 {'get_weather': None, 'get_stock_price': ['$17... \n", "2 {'get_weather': ['Partly cloudy, 68°F'], 'get_... " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# View results\n", "queries.select(queries.query, queries.tool_results).collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using MCP Servers as Tools\n", "\n", "The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) is an open protocol that standardizes how applications provide context to LLMs. Pixeltable can connect to MCP servers and use their exposed tools as UDFs.\n", "\n", "### Why MCP?\n", "\n", "| Benefit | Description |\n", "|---------|-------------|\n", "| **Standardized interface** | Connect to any MCP-compliant server |\n", "| **External services** | Integrate APIs, databases, and custom logic |\n", "| **Reusable tools** | Share tools across different applications |\n", "| **Dynamic discovery** | Automatically discover available tools |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an MCP Server\n", "\n", "First, create an MCP server with tools you want to expose. Save this as `mcp_server.py`:\n", "\n", "```python\n", "from mcp.server.fastmcp import FastMCP\n", "\n", "mcp = FastMCP('PixeltableDemo', stateless_http=True)\n", "\n", "@mcp.tool()\n", "def calculate_discount(price: float, discount_percent: float) -> float:\n", " \"\"\"Calculate the discounted price.\"\"\"\n", " return price * (1 - discount_percent / 100)\n", "\n", "@mcp.tool()\n", "def check_inventory(product_id: str) -> str:\n", " \"\"\"Check inventory status for a product.\"\"\"\n", " # In production, query your inventory database\n", " inventory = {\n", " 'SKU001': 'In stock (42 units)',\n", " 'SKU002': 'Low stock (3 units)',\n", " 'SKU003': 'Out of stock',\n", " }\n", " return inventory.get(product_id, f'Unknown product: {product_id}')\n", "\n", "if __name__ == '__main__':\n", " mcp.run(transport='streamable-http')\n", "```\n", "\n", "Run the server: `python mcp_server.py` (it will listen on `http://localhost:8000/mcp`)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Connect to MCP Server and Use Tools" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "- SearchPixeltableDocumentation: Search across the Pixeltable Documentation knowledge base to find relevant information, code examples, API references, and guides. Use this tool when you need to answer questions about Pixeltable Documentation, find specific documentation, understand how features work, or locate implementation details. The search returns contextual content with titles and direct links to the documentation pages.\n" ] } ], "source": [ "# Connect to the MCP server using pxt.mcp_udfs()\n", "# This creates a Pixeltable UDF for each tool exposed by the server\n", "# See: https://docs.pixeltable.com/platform/custom-functions#5-mcp-udfs\n", "mcp_tools = pxt.mcp_udfs('https://docs.pixeltable.com/mcp')\n", "\n", "# View available tools - each is now a callable Pixeltable function\n", "for tool in mcp_tools:\n", " print(f'- {tool.name}: {tool.comment()}')" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'mcp_queries'.\n", "Added 0 column values with 0 errors in 0.00 s\n", "Added 0 column values with 0 errors in 0.01 s\n" ] }, { "data": { "text/markdown": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
table 'tools_demo/mcp_queries'
\n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Column NameTypeComputed With
queryString
responseRequired[Json]chat_completions(messages=[{'role': 'user', 'content': query}], model='gpt-4o-mini', tools=[{'name': 'SearchPixeltableDocumentation', 'description': 'Search across the Pixeltable Documentation knowledge base to find relevant information, code examples, API references, and guides. Use this tool when you need to answer questions about Pixeltable Documentation, find specific documentation, understand how features work, or locate implementation details. The search returns contextual content with titles and direct links to the documentation pages.', 'parameters': {'type': 'object', 'properties': {'query': {'type': 'string'}}}, 'required': ['query'], 'additionalProperties': False}])
tool_resultsRequired[Json]{'SearchPixeltableDocumentation': map(_openai_response_to_pxt_tool_calls(response).SearchPixeltableDocumentation[*], lambda R: SearchPixeltableDocumentation(query=_extract_str_tool_arg(ObjectRef(JsonMapperDispatch(), 5263294452192566501, 88144469474745989).args, param_name='query')))}
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Bundle MCP tools for LLM use\n", "mcp_toolset = pxt.tools(*mcp_tools)\n", "\n", "# Create a table with MCP tool-calling pipeline\n", "mcp_queries = pxt.create_table(\n", " 'tools_demo/mcp_queries', {'query': pxt.String}\n", ")\n", "\n", "# Add LLM call with MCP tools\n", "mcp_queries.add_computed_column(\n", " response=openai.chat_completions(\n", " messages=[{'role': 'user', 'content': mcp_queries.query}],\n", " model='gpt-4o-mini',\n", " tools=mcp_toolset,\n", " )\n", ")\n", "\n", "# Execute MCP tool calls\n", "mcp_queries.add_computed_column(\n", " tool_results=openai.invoke_tools(mcp_toolset, mcp_queries.response)\n", ")\n", "\n", "# View the schema - note that mcp_toolset is stored as persistent metadata\n", "# Every subsequent insert will use these same tools automatically\n", "mcp_queries.describe()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
querytool_results
What is Pixeltable?{"SearchPixeltableDocumentation": ["Title: pixeltable.DataFrame\\nLink: https://docs.pixeltable.com/sdk/v0.4.22/dataframe\\nContent: pixeltable.DataFrame\\nRepresents a query for retrievin ...... fault: 10 ): Number of rows to select. Default is 10. Returns: DataFrameResultSet : A DataFrameResultSet with the first n rows of the DataFrame.\\n\\n"]}
How to use OpenAI in Pixeltable?{"SearchPixeltableDocumentation": ["Title: Working with OpenAI in Pixeltable\\nLink: https://docs.pixeltable.com/howto/providers/working-with-openai\\nContent: Working with OpenAI in Pix ...... s pxt\\n\\n# Remove the 'openai_demo' directory and its contents, if it exists\\npxt.drop_dir('openai_demo', force=True)\\npxt.create_dir('openai_demo')\\n\\n"]}
" ], "text/plain": [ " query \\\n", "0 What is Pixeltable? \n", "1 How to use OpenAI in Pixeltable? \n", "\n", " tool_results \n", "0 {'SearchPixeltableDocumentation': ['Title: pix... \n", "1 {'SearchPixeltableDocumentation': ['Title: Wor... " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Test with e-commerce queries\n", "mcp_queries.insert(\n", " [\n", " {'query': 'What is Pixeltable?'},\n", " {'query': 'How to use OpenAI in Pixeltable?'},\n", " ]\n", ")\n", "\n", "mcp_queries.select(mcp_queries.query, mcp_queries.tool_results).collect()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
search_result
Title: pixeltable.DataFrame\n", "Link: https://docs.pixeltable.com/sdk/v0.4.22/dataframe\n", "Content: pixeltable.DataFrame\n", "Represents a query for retrieving and transforming data from Pixeltable tables. View source on GitHub\n", "\n", "Represents a query for retrieving and transforming data from Pixeltable tables. View source on GitHub\n", "\n", "method head()\n", "Return the first n rows of the DataFrame, in insertion order of the underlying Table. head() is not supported for joins. Parameters: n ( int , default: 10 ): Number of rows to select. Default is 10. Returns: DataFrameResultSet : A DataFrameResultSet with the first n rows of the DataFrame.
Title: Working with OpenAI in Pixeltable\n", "Link: https://docs.pixeltable.com/howto/providers/working-with-openai\n", "Content: Working with OpenAI in Pixeltable\n", "This documentation page is also available as an interactive notebook. You can launch the notebook in\n", "Kaggle or Colab, or download it for use with an IDE or local Jupyter installation, by clicking one of the\n", "above links. Pixeltable’s OpenAI integration enables you to access OpenAI models via\n", "the OpenAI API.\n", "\n", "This documentation page is also a ...... API.\n", "\n", "Important notes\n", "OpenAI usage may incur costs based on your OpenAI plan. Be mindful of sensitive data and consider security measures when\n", "integrating with external services. First you’ll need to install required libraries and enter your OpenAI\n", "API key. Now let’s create a Pixeltable directory to hold the tables for our demo. import pixeltable as pxt\n", "\n", "# Remove the 'openai_demo' directory and its contents, if it exists\n", "pxt.drop_dir('openai_demo', force=True)\n", "pxt.create_dir('openai_demo')
" ], "text/plain": [ " search_result\n", "0 Title: pixeltable.DataFrame\\nLink: https://doc...\n", "1 Title: Working with OpenAI in Pixeltable\\nLink..." ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Extract the search result with a named column\n", "mcp_queries.select(\n", " search_result=mcp_queries.tool_results[\n", " 'SearchPixeltableDocumentation'\n", " ][0]\n", ").collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explanation\n", "\n", "**Tool calling flow:**\n", "\n", "```\n", "Query → LLM decides tool → invoke_tools executes → Results\n", "```\n", "\n", "**Key components:**\n", "\n", "| Component | Purpose |\n", "|-----------|---------|\n", "| `@pxt.udf` | Define tool functions |\n", "| `pxt.tools()` | Bundle functions into Tools object |\n", "| `tools=` parameter | Pass tools to LLM |\n", "| `invoke_tools()` | Execute tool calls from LLM response |\n", "| `pxt.mcp_udfs()` | Load tools from an MCP server |\n", "\n", "**MCP integration:**\n", "\n", "```\n", "MCP Server → pxt.mcp_udfs() → pxt.tools() → LLM tool calling\n", "```\n", "\n", "MCP servers expose tools via a standardized protocol. Pixeltable's `mcp_udfs()` connects to any MCP server and returns the tools as callable UDFs that can be bundled with `pxt.tools()` for LLM use.\n", "\n", "**Supported providers:**\n", "\n", "| Provider | Function |\n", "|----------|----------|\n", "| OpenAI | `openai.invoke_tools()` |\n", "| Anthropic | `anthropic.invoke_tools()` |\n", "| Groq | `groq.invoke_tools()` |\n", "| Gemini | `gemini.invoke_tools()` |\n", "| Bedrock | `bedrock.invoke_tools()` |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## See also\n", "\n", "- [Build a RAG pipeline](https://docs.pixeltable.com/howto/cookbooks/agents/pattern-rag-pipeline) - Retrieval-augmented generation\n", "- [Run local LLMs](https://docs.pixeltable.com/howto/providers/working-with-ollama) - Local model inference\n", "- [Multimodal MCP Servers](https://docs.pixeltable.com/libraries/mcp) - Pixeltable's MCP server collection\n", "- [Custom Functions](https://docs.pixeltable.com/platform/custom-functions) - More about UDFs and MCP integration" ] } ], "metadata": { "kernelspec": { "display_name": "pixeltable", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 2 }