{ "cells": [ { "cell_type": "markdown", "id": "3e777354", "metadata": {}, "source": [ "# Working with OpenAI in Pixeltable\n", "\n", "Pixeltable's OpenAI integration lets you call OpenAI models through the OpenAI API.\n", "\n", "### Prerequisites\n", "\n", "- An OpenAI account with an API key (https://openai.com/index/openai-api/)\n", "\n", "### Important notes\n", "\n", "- OpenAI usage may incur costs based on your OpenAI plan.\n", "- Be mindful of sensitive data, and consider appropriate security measures when sending data to external services." ] }, { "cell_type": "markdown", "id": "8b2e6912-e936-4c3a-84a2-ba99950c9493", "metadata": {}, "source": [ "First, install the required libraries and enter your OpenAI API key." ] }, { "cell_type": "code", "execution_count": null, "id": "d5288926-c278-4cbc-815c-cbc0433bbf49", "metadata": {}, "outputs": [], "source": [ "%pip install -qU pixeltable openai" ] }, { "cell_type": "code", "execution_count": null, "id": "385f6831-f029-42bb-99f1-652a809ffc6e", "metadata": {}, "outputs": [], "source": [ "import getpass\n", "import os\n", "\n", "# Prompt for the API key only if it is not already set in the environment\n", "if 'OPENAI_API_KEY' not in os.environ:\n", "    os.environ['OPENAI_API_KEY'] = getpass.getpass(\n", "        'Enter your OpenAI API key:'\n", "    )" ] }, { "cell_type": "markdown", "id": "8d3dd131-22de-496c-9f02-ffd4515c20d3", "metadata": {}, "source": [ "Now let's create a Pixeltable directory to hold the tables for our demo." 
] }, { "cell_type": "code", "execution_count": 1, "id": "9bdc613f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata\n", "Created directory 'openai_demo'.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pixeltable as pxt\n", "\n", "# Remove the 'openai_demo' directory and its contents, if it exists\n", "pxt.drop_dir('openai_demo', force=True)\n", "pxt.create_dir('openai_demo')" ] }, { "cell_type": "markdown", "id": "02f8595f-fb03-419f-9440-ee2ae784fd20", "metadata": {}, "source": [ "## Chat completions\n", "\n", "Start by creating a table with a column for your input data, then add computed columns to store the results from OpenAI." ] }, { "cell_type": "code", "execution_count": 2, "id": "342407c1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'chat'.\n", "Added 0 column values with 0 errors in 0.01 s\n" ] }, { "data": { "text/plain": [ "No rows affected." 
] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pixeltable.functions import openai\n", "\n", "# Create a table in Pixeltable and add a computed column that calls OpenAI\n", "\n", "t = pxt.create_table('openai_demo/chat', {'input': pxt.String})\n", "\n", "messages = [{'role': 'user', 'content': t.input}]\n", "t.add_computed_column(\n", " output=openai.chat_completions(\n", " messages=messages,\n", " model='gpt-4o-mini',\n", " model_kwargs={\n", " # Optional dict with parameters for the OpenAI API\n", " 'max_tokens': 300,\n", " 'top_p': 0.9,\n", " 'temperature': 0.7,\n", " },\n", " )\n", ")" ] }, { "cell_type": "code", "execution_count": 3, "id": "c5f0b862", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Added 0 column values with 0 errors in 0.01 s\n" ] }, { "data": { "text/plain": [ "No rows affected." ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Parse the response into a new column\n", "t.add_computed_column(response=t.output.choices[0].message.content)" ] }, { "cell_type": "code", "execution_count": 4, "id": "15c9bc76-1b28-4d17-9a2d-339968f90786", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inserted 1 row with 0 errors in 3.39 s (0.29 rows/s)\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<table><thead><tr><th>input</th><th>response</th></tr></thead><tbody><tr><td>How many islands are in the Aleutian island chain?</td><td>The Aleutian Islands, which stretch from Alaska toward Russia, comprise approximately 300 islands. This chain is part of the larger Aleutian Arc, which includes both large and small islands, some of which are uninhabited. The exact number can vary depending on how one defines an island, including whether smaller islets are counted.</td></tr></tbody></table>
" ], "text/plain": [ " input \\\n", "0 How many islands are in the Aleutian island ch... \n", "\n", " response \n", "0 The Aleutian Islands, which stretch from Alask... " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Insert a prompt, then view the input alongside the model's response\n", "t.insert(\n", "    [{'input': 'How many islands are in the Aleutian island chain?'}]\n", ")\n", "t.select(t.input, t.response).head()" ] }, { "cell_type": "markdown", "id": "f285bcef-eba6-4d0c-9f7d-950fb467eb21", "metadata": {}, "source": [ "## Embeddings\n", "\n", "Note: The OpenAI Embeddings API is not available with free-tier API keys." ] }, { "cell_type": "code", "execution_count": 5, "id": "edf544a9-35cc-4e40-bd56-1b76226d046e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'embeddings'.\n", "Added 0 column values with 0 errors in 0.01 s\n" ] }, { "data": { "text/plain": [ "No rows affected." ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emb_t = pxt.create_table('openai_demo/embeddings', {'input': pxt.String})\n", "emb_t.add_computed_column(\n", "    embedding=openai.embeddings(\n", "        input=emb_t.input, model='text-embedding-3-small'\n", "    )\n", ")" ] }, { "cell_type": "code", "execution_count": 6, "id": "16e38191-0a08-457b-a202-6b2d0bcab892", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inserted 1 row with 0 errors in 1.03 s (0.97 rows/s)\n" ] }, { "data": { "text/plain": [ "1 row inserted." ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emb_t.insert(\n", "    [{'input': 'OpenAI provides a variety of embeddings models.'}]\n", ")" ] }, { "cell_type": "code", "execution_count": 7, "id": "5bf1cc4d-fd13-497e-a7fb-2ce78e3ee2e8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<table><thead><tr><th>input</th><th>embedding</th></tr></thead><tbody><tr><td>OpenAI provides a variety of embeddings models.</td><td>[-0.023 -0.045 0.069 -0.017 -0.008 -0.027 ... 0.009 0.005 0.021 0.018 -0.012 -0.008]</td></tr></tbody></table>
" ], "text/plain": [ " input \\\n", "0 OpenAI provides a variety of embeddings models. \n", "\n", " embedding \n", "0 [-0.02293767, -0.044737745, 0.06876658, -0.016... " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emb_t.head()" ] }, { "cell_type": "markdown", "id": "069cb1eb-2f5c-4843-9645-a79d410d8bf4", "metadata": {}, "source": [ "## Image generations" ] }, { "cell_type": "code", "execution_count": 8, "id": "24970bff-013f-4b5b-844d-764f1b5465d6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created table 'images'.\n", "Added 0 column values with 0 errors in 0.01 s\n" ] }, { "data": { "text/plain": [ "No rows affected." ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_t = pxt.create_table('openai_demo/images', {'input': pxt.String})\n", "image_t.add_computed_column(\n", " img=openai.image_generations(image_t.input, model='dall-e-2')\n", ")" ] }, { "cell_type": "code", "execution_count": 9, "id": "5f7f1b29-2963-4d73-b1b6-f7168f5d0a73", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inserted 1 row with 0 errors in 11.59 s (0.09 rows/s)\n" ] }, { "data": { "text/plain": [ "1 row inserted." ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_t.insert(\n", " [\n", " {\n", " 'input': 'A giant Pixel floating in the open ocean in a sea of data'\n", " }\n", " ]\n", ")" ] }, { "cell_type": "code", "execution_count": 10, "id": "60c6e39b-b0b2-4087-a82a-3d8033f29e90", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
table 'openai_demo/images'
\n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
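<table><thead><tr><th>Column Name</th><th>Type</th><th>Computed With</th></tr></thead><tbody><tr><td>input</td><td>String</td><td></td></tr><tr><td>img</td><td>Image[(1024, 1024)]</td><td>image_generations(input, model='dall-e-2')</td></tr></tbody></table>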
\n" ], "text/plain": [ "table 'openai_demo/images'\n", "\n", " Column Name Type Computed With\n", " input String \n", " img Image[(1024, 1024)] image_generations(input, model='dall-e-2')" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_t" ] }, { "cell_type": "code", "execution_count": 11, "id": "c1b86b7b-2a95-447f-b847-63226278de93", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
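<table><thead><tr><th>input</th><th>img</th></tr></thead><tbody><tr><td>A giant Pixel floating in the open ocean in a sea of data</td><td>[generated image not shown]</td></tr></tbody></table>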
\n", " \n", "
" ], "text/plain": [ " input \\\n", "0 A giant Pixel floating in the open ocean in a ... \n", "\n", " img \n", "0 \n", " \n", " \n", " input\n", " result\n", " \n", " \n", " \n", " \n", "
\n", " \n", "
\n", " {\"text\": \"Allow me to illustrate. During the last 60 days, I have been at the task of constructing an administration. It has been a long and deliberate proc ...... a hill. The eyes of all peoples are upon us. Today the eyes of all people are truly upon us. And our governments, in every branch, at every level,\", \"usage\": {\"type\": \"duration\", \"seconds\": 60}, \"logprobs\": null}\n", " \n", " \n", "" ], "text/plain": [ " input \\\n", "0 /Users/asiegel/.pixeltable/file_cache/5f2b916a... \n", "\n", " result \n", "0 {'text': 'Allow me to illustrate. During the l... " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "audio_t.head()" ] }, { "cell_type": "code", "execution_count": 15, "id": "c0472fe0-b415-4a78-9431-2ecd27f62372", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Allow me to illustrate. During the last 60 days, I have been at the task of constructing an administration. It has been a long and deliberate process. Some have counseled greater speed. Others have counseled more expedient tests. But I have been guided by the standard John Winthrop set before his shipmates on the flagship Arabella 331 years ago, as they too faced the task of building a new government on a perilous frontier. We must always consider, he said, that we shall be as a city upon a hill. The eyes of all peoples are upon us. Today the eyes of all people are truly upon us. 
And our governments, in every branch, at every level,'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "audio_t.head()[0]['result']['text']" ] }, { "cell_type": "markdown", "id": "622c2abd-8709-452a-b773-18fb28d180ce", "metadata": {}, "source": [ "### Learn more\n", "\n", "To learn more about advanced techniques like RAG operations in Pixeltable, check out the [RAG Operations in Pixeltable](https://docs.pixeltable.com/howto/use-cases/rag-operations) tutorial.\n", "\n", "If you have any questions, don't hesitate to reach out." ] } ], "metadata": { "kernelspec": { "display_name": "pxt", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.19" } }, "nbformat": 4, "nbformat_minor": 5 }