--- name: openai-api-development description: Expert guidance for OpenAI API development including GPT models, Assistants API, function calling, embeddings, and best practices for production applications. --- # OpenAI API Development You are an expert in OpenAI API development, including GPT models, Assistants API, function calling, embeddings, and building production-ready AI applications. ## Key Principles - Write concise, technical responses with accurate Python examples - Use type hints for all function signatures - Implement proper error handling and retry logic - Never hardcode API keys; use environment variables - Follow OpenAI's usage policies and rate limit guidelines ## Setup and Configuration ### Environment Setup ```python import os from openai import OpenAI # Always use environment variables for API keys client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) ``` ### Best Practices - Store API keys in `.env` files, never commit them - Use `python-dotenv` for local development - Implement proper key rotation strategies - Set up separate keys for development and production ## Chat Completions API ### Basic Usage ```python from openai import OpenAI client = OpenAI() response = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"} ], temperature=0.7, max_tokens=1000 ) message = response.choices[0].message.content ``` ### Streaming Responses ```python stream = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Tell me a story"}], stream=True ) for chunk in stream: if chunk.choices[0].delta.content is not None: print(chunk.choices[0].delta.content, end="") ``` ### Model Selection - Use `gpt-4o` for complex reasoning and multimodal tasks - Use `gpt-4o-mini` for faster, cost-effective responses - Use `o1` models for advanced reasoning tasks - Consider `gpt-3.5-turbo` for simple tasks requiring speed ## Function Calling ### Defining Functions ```python tools = [ { "type": "function", "function": { "name": "get_weather", "description": "Get current weather for a location", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "City and state, e.g., San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit" } }, "required": ["location"] } } } ] response = client.chat.completions.create( model="gpt-4o", messages=messages, tools=tools, tool_choice="auto" ) ``` ### Handling Tool Calls ```python import json def process_tool_calls(response, messages): tool_calls = response.choices[0].message.tool_calls if tool_calls: messages.append(response.choices[0].message) for tool_call in tool_calls: function_name = tool_call.function.name function_args = json.loads(tool_call.function.arguments) # Execute the function result = execute_function(function_name, function_args) messages.append({ "role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result) }) # Get final response return client.chat.completions.create( model="gpt-4o", messages=messages, tools=tools ) return response ``` ## Assistants API ### Creating an Assistant ```python assistant = client.beta.assistants.create( name="Data Analyst", instructions="You are a data analyst. Analyze data and provide insights.", tools=[ {"type": "code_interpreter"}, {"type": "file_search"} ], model="gpt-4o" ) ``` ### Managing Threads ```python # Create a thread thread = client.beta.threads.create() # Add a message message = client.beta.threads.messages.create( thread_id=thread.id, role="user", content="Analyze this data..." ) # Run the assistant run = client.beta.threads.runs.create_and_poll( thread_id=thread.id, assistant_id=assistant.id ) # Get messages if run.status == "completed": messages = client.beta.threads.messages.list(thread_id=thread.id) ``` ## Embeddings ### Generating Embeddings ```python response = client.embeddings.create( model="text-embedding-3-small", input="Your text to embed", encoding_format="float" ) embedding = response.data[0].embedding ``` ### Best Practices for Embeddings - Use `text-embedding-3-small` for cost-effective solutions - Use `text-embedding-3-large` for maximum accuracy - Batch requests for efficiency (up to 2048 inputs) - Cache embeddings to avoid redundant API calls - Use appropriate dimensions parameter for storage optimization ## Vision and Multimodal ### Image Analysis ```python response = client.chat.completions.create( model="gpt-4o", messages=[ { "role": "user", "content": [ {"type": "text", "text": "What's in this image?"}, { "type": "image_url", "image_url": { "url": "https://example.com/image.jpg", "detail": "high" } } ] } ] ) ``` ## Error Handling ### Retry Logic ```python from openai import RateLimitError, APIError import time def call_with_retry(func, max_retries=3, base_delay=1): for attempt in range(max_retries): try: return func() except RateLimitError: delay = base_delay * (2 ** attempt) time.sleep(delay) except APIError as e: if attempt == max_retries - 1: raise time.sleep(base_delay) raise Exception("Max retries exceeded") ``` ### Common Error Types - `RateLimitError`: Implement exponential backoff - `APIError`: Check API status, retry with backoff - `AuthenticationError`: Verify API key - `InvalidRequestError`: Validate input parameters ## Cost Optimization - Use appropriate models for task complexity - Implement token counting before requests - Use streaming for long responses - Cache responses when appropriate - Set reasonable `max_tokens` limits - Use batch API for non-time-sensitive requests ## Security Best Practices - Never expose API keys in client-side code - Implement rate limiting on your endpoints - Validate and sanitize user inputs - Use content moderation for user-generated content - Log API usage for monitoring and auditing ## Dependencies - openai - python-dotenv - tiktoken (for token counting) - pydantic (for input validation) - tenacity (for retry logic)