---
name: openai-agents-sdk
description: Build AI agents using OpenAI Agents SDK with MCP server integration. Supports Gemini directly via AsyncOpenAI or OpenRouter for non-OpenAI models. Covers agent creation, function tools, handoffs, MCP server connections, and conversation management.
---

# OpenAI Agents SDK Skill

Build AI agents using OpenAI Agents SDK with support for Gemini and other LLMs via direct integration or OpenRouter.

## Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                        OpenAI Agents SDK                                │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                         Agent                                     │   │
│  │  model: OpenAIChatCompletionsModel                               │   │
│  │    (via AsyncOpenAI with Gemini base_url)                        │   │
│  │  tools: [function_tool, ...]                                      │   │
│  │  mcp_servers: [MCPServerStreamableHttp(...)]                      │   │
│  │  handoffs: [specialized_agent, ...]                               │   │
│  └──────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────┘
                              │
                              │ MCP Protocol
                              ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                    MCP Server (FastMCP)                                 │
│              @mcp.tool() for task operations                            │
└─────────────────────────────────────────────────────────────────────────┘
```

## Quick Start

### Installation

```bash
# Base installation
pip install openai-agents

# Or with uv
uv add openai-agents
```

### Environment Variables

```env
# For Direct Gemini Integration (Recommended)
GOOGLE_API_KEY=your-gemini-api-key

# OR for OpenRouter (Alternative)
OPENROUTER_API_KEY=your-openrouter-api-key

# For OpenAI (optional, for tracing)
OPENAI_API_KEY=your-openai-api-key
```

## Using Gemini via Direct Integration (Recommended)

The recommended approach is to use AsyncOpenAI with Gemini's OpenAI-compatible endpoint. This avoids quota issues with LiteLLM.

```python
import os
from agents import AsyncOpenAI, OpenAIChatCompletionsModel, Agent, Runner
from agents.run import RunConfig
from dotenv import load_dotenv

load_dotenv()

# Create custom OpenAI client pointing to Gemini
gemini_api_key = os.getenv("GOOGLE_API_KEY")
external_provider = AsyncOpenAI(
    api_key=gemini_api_key,
    base_url="https://generativelanguage.googleapis.com/v1beta/openai",
)

# Create model using the custom client
model = OpenAIChatCompletionsModel(
    openai_client=external_provider,
    model="gemini-2.0-flash-exp",
)

# Configure agent with model
config = RunConfig(
    model=model,
    model_provider=external_provider,
    tracing_disabled=True
)

# Create agent
agent = Agent(
    name="Todo Assistant",
    instructions="You are a helpful task management assistant.",
)

# Run the agent with config
result = await Runner.run(agent, "Help me organize my tasks", config=config)
print(result.final_output)
```

## Alternative: Using OpenRouter

OpenRouter provides access to multiple models through a single API, including free options.

```python
import os
from agents import AsyncOpenAI, OpenAIChatCompletionsModel, Agent, Runner
from agents.run import RunConfig
from dotenv import load_dotenv

load_dotenv()

# Create OpenRouter client
openrouter_key = os.getenv("OPENROUTER_API_KEY")
external_provider = AsyncOpenAI(
    api_key=openrouter_key,
    base_url="https://openrouter.ai/api/v1",
)

# Use a free model (powered by Gemini or other providers)
model = OpenAIChatCompletionsModel(
    openai_client=external_provider,
    model="openai/gpt-oss-20b:free",
)

config = RunConfig(
    model=model,
    model_provider=external_provider,
    tracing_disabled=True
)

# Create and run agent
agent = Agent(
    name="Todo Assistant",
    instructions="You are a helpful task management assistant.",
)

result = await Runner.run(agent, "Help me organize my tasks", config=config)
print(result.final_output)
```

## Reference

| Pattern | Guide |
|---------|-------|
| **Agent Creation** | [reference/agents.md](reference/agents.md) |
| **Function Tools** | [reference/function-tools.md](reference/function-tools.md) |
| **MCP Integration** | [reference/mcp-integration.md](reference/mcp-integration.md) |
| **Handoffs** | [reference/handoffs.md](reference/handoffs.md) |

## Examples

| Example | Description |
|---------|-------------|
| [examples/todo-agent.md](examples/todo-agent.md) | Complete todo agent with MCP tools |

## Templates

| Template | Purpose |
|----------|---------|
| [templates/agent_gemini.py](templates/agent_gemini.py) | Basic Gemini agent template with direct integration |
| [templates/agent_mcp.py](templates/agent_mcp.py) | Agent with MCP server integration |

## Basic Agent with Function Tools

```python
import asyncio
from agents import AsyncOpenAI, OpenAIChatCompletionsModel, Agent, Runner, function_tool
from agents.run import RunConfig
import os

@function_tool
def get_weather(city: str) -> str:
    """Get the weather for a city."""
    return f"The weather in {city} is sunny."

async def main():
    # Setup Gemini client
    external_provider = AsyncOpenAI(
        api_key=os.getenv("GOOGLE_API_KEY"),
        base_url="https://generativelanguage.googleapis.com/v1beta/openai",
    )

    model = OpenAIChatCompletionsModel(
        openai_client=external_provider,
        model="gemini-2.0-flash-exp",
    )

    config = RunConfig(
        model=model,
        model_provider=external_provider,
        tracing_disabled=True
    )

    agent = Agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
        tools=[get_weather],
    )

    result = await Runner.run(agent, "What's the weather in Tokyo?", config=config)
    print(result.final_output)

asyncio.run(main())
```

## Agent with MCP Server

Connect your agent to an MCP server to access tools, resources, and prompts.

```python
import asyncio
import os
from agents import AsyncOpenAI, OpenAIChatCompletionsModel, Agent, Runner
from agents.mcp import MCPServerStreamableHttp
from agents.run import RunConfig

async def main():
    # Setup Gemini client
    external_provider = AsyncOpenAI(
        api_key=os.getenv("GOOGLE_API_KEY"),
        base_url="https://generativelanguage.googleapis.com/v1beta/openai",
    )

    model = OpenAIChatCompletionsModel(
        openai_client=external_provider,
        model="gemini-2.0-flash-exp",
    )

    config = RunConfig(
        model=model,
        model_provider=external_provider,
        tracing_disabled=True
    )

    async with MCPServerStreamableHttp(
        name="Todo MCP Server",
        params={
            "url": "http://localhost:8000/api/mcp",
            "timeout": 30,
        },
        cache_tools_list=True,
    ) as mcp_server:
        agent = Agent(
            name="Todo Assistant",
            instructions="""You are a task management assistant.
Use the MCP tools to help users manage their tasks:
- add_task: Create new tasks
- list_tasks: View existing tasks
- complete_task: Mark tasks as done
- delete_task: Remove tasks
- update_task: Modify tasks""",
            mcp_servers=[mcp_server],
        )

        result = await Runner.run(
            agent,
            "Show me my pending tasks",
            config=config
        )
        print(result.final_output)

asyncio.run(main())
```

## Agent Handoffs

Create specialized agents that hand off conversations. Note: When using handoffs, all agents share the same RunConfig.

```python
from agents import Agent, handoff, AsyncOpenAI, OpenAIChatCompletionsModel, Runner
from agents.run import RunConfig
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions
import os

# Setup shared configuration
external_provider = AsyncOpenAI(
    api_key=os.getenv("GOOGLE_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai",
)

model = OpenAIChatCompletionsModel(
    openai_client=external_provider,
    model="gemini-2.0-flash-exp",
)

config = RunConfig(
    model=model,
    model_provider=external_provider,
    tracing_disabled=True
)

# Specialized agents (no model specified - uses config)
task_agent = Agent(
    name="Task Agent",
    instructions=prompt_with_handoff_instructions(
        "You specialize in task management. Help users create, update, and complete tasks."
    ),
)

help_agent = Agent(
    name="Help Agent",
    instructions=prompt_with_handoff_instructions(
        "You provide help and instructions about using the todo app."
    ),
)

# Triage agent
triage_agent = Agent(
    name="Triage Agent",
    instructions=prompt_with_handoff_instructions(
        """Route users to the appropriate agent:
- Task Agent: for creating, viewing, or managing tasks
- Help Agent: for questions about how to use the app"""
    ),
    handoffs=[task_agent, help_agent],
)

# Run with config
result = await Runner.run(triage_agent, "How do I add a task?", config=config)
```

## Streaming Responses

```python
from agents import Runner

result = Runner.run_streamed(agent, "List my tasks", config=config)

async for event in result.stream_events():
    if event.type == "run_item_stream_event":
        print(event.item, end="", flush=True)

print(result.final_output)
```

## Model Settings

```python
from agents import Agent, ModelSettings, Runner

agent = Agent(
    name="Assistant",
    model_settings=ModelSettings(
        include_usage=True,  # Track token usage
        tool_choice="auto",  # or "required", "none"
    ),
)

result = await Runner.run(agent, "Hello!", config=config)
print(f"Tokens used: {result.context_wrapper.usage.total_tokens}")
```

## Tracing Control

Disable tracing when not using OpenAI:

```python
from agents.run import RunConfig

# Tracing disabled in config
config = RunConfig(
    model=model,
    model_provider=external_provider,
    tracing_disabled=True  # Disable tracing
)
```

## Supported Model Providers

| Provider | Base URL | Model Examples |
|----------|----------|----------------|
| **Gemini** | `https://generativelanguage.googleapis.com/v1beta/openai` | `gemini-2.0-flash-exp`, `gemini-1.5-pro` |
| **OpenRouter** | `https://openrouter.ai/api/v1` | `openai/gpt-oss-20b:free`, `google/gemini-2.0-flash-exp:free` |

## MCP Connection Types

| Type | Use Case | Class |
|------|----------|-------|
| **Streamable HTTP** | Production, low-latency | `MCPServerStreamableHttp` |
| **SSE** | Web clients, real-time | `MCPServerSse` |
| **Stdio** | Local processes | `MCPServerStdio` |
| **Hosted** | OpenAI-hosted MCP | `HostedMCPTool` |

## Error Handling

```python
from agents import Runner, AgentError

try:
    result = await Runner.run(agent, "Hello")
    print(result.final_output)
except AgentError as e:
    print(f"Agent error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```

## Best Practices

1. **Use Direct Integration** - Use AsyncOpenAI with custom base_url instead of LiteLLM to avoid quota issues
2. **Pass RunConfig** - Always pass RunConfig to Runner.run() for non-OpenAI providers
3. **Cache MCP tools** - Use `cache_tools_list=True` for performance
4. **Use handoffs** - Create specialized agents for different functionality
5. **Enable usage tracking** - Set `include_usage=True` in ModelSettings to monitor costs
6. **Disable tracing** - Set `tracing_disabled=True` in RunConfig when not using OpenAI
7. **Handle errors gracefully** - Use try/except for agent execution
8. **Use streaming** - Implement streaming for better user experience
9. **Share configuration** - When using handoffs, all agents share the same RunConfig

## Troubleshooting

### Quota Exceeded Error on First Request
**Problem**: Getting quota exceeded error even on first request when using LiteLLM.

**Solution**: Switch to direct integration using AsyncOpenAI:
```python
from agents import AsyncOpenAI, OpenAIChatCompletionsModel
from agents.run import RunConfig

external_provider = AsyncOpenAI(
    api_key=os.getenv("GOOGLE_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai",
)

model = OpenAIChatCompletionsModel(
    openai_client=external_provider,
    model="gemini-2.0-flash-exp",
)

config = RunConfig(
    model=model,
    model_provider=external_provider,
    tracing_disabled=True
)

result = await Runner.run(agent, "Hello", config=config)
```

### MCP connection fails
- Check MCP server is running
- Verify URL is correct
- Check timeout settings
- Ensure cache_tools_list=True for performance

### Gemini API errors
- Verify GOOGLE_API_KEY is set correctly
- Check model name: `gemini-2.0-flash-exp` or `gemini-1.5-pro`
- Verify base_url is correct: `https://generativelanguage.googleapis.com/v1beta/openai`
- Ensure API quota is not exceeded

### Agent not using provided model
**Problem**: Agent ignores model configuration.

**Solution**: Always pass RunConfig to Runner.run():
```python
result = await Runner.run(agent, message, config=config)
```