> **Development Setup**: For setting up the development environment, see the [Development Guide](../Development.md).


## Agent

This notebook demonstrates how to use Cua's Agent to run workflows in virtual sandboxes: a Cua Cloud Sandbox, a local Docker container, or a local VM on an Apple Silicon Mac.

### Installation

In [None]:
# If running outside of the monorepo:
# %pip install "cua-agent[all]"

## Initialize a Computer Agent

`ComputerAgent` runs agentic workflows inside virtual sandbox instances. You can choose between a Cua Cloud Sandbox, a local Docker container, or a local VM.

In [None]:
from computer import Computer, VMProviderType
from agent import ComputerAgent

In [None]:
import os

# Get API keys from environment or prompt user
anthropic_key = os.getenv("ANTHROPIC_API_KEY") or input("Enter your Anthropic API key: ")
openai_key = os.getenv("OPENAI_API_KEY") or input("Enter your OpenAI API key: ")

os.environ["ANTHROPIC_API_KEY"] = anthropic_key
os.environ["OPENAI_API_KEY"] = openai_key
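
Because `input()` echoes whatever is typed, a key entered this way can stay visible in the notebook output. A minimal sketch of an alternative is shown below, assuming a hypothetical helper named `require_key` (it is not part of Cua). It reads the key from the environment first and otherwise prompts with the standard library's `getpass`, which does not echo.

```python
import os
from getpass import getpass


def require_key(name: str, prompt: str) -> str:
    """Return the key from the environment, or prompt for it without echoing.

    `require_key` is a hypothetical helper for this notebook, not a Cua API.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        # Only reached when the variable is missing or blank.
        value = getpass(prompt).strip()
    if not value:
        raise ValueError(f"{name} is required but was not provided")
    # Store the cleaned value so libraries that read the environment can find it.
    os.environ[name] = value
    return value
```

It could replace the two `input()` calls above, for example `anthropic_key = require_key("ANTHROPIC_API_KEY", "Enter your Anthropic API key: ")`.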

## Option 1: Agent with Cua Cloud Sandbox

Use Cloud Sandbox for running agents from any system without local setup.

### Prerequisites for Cloud Sandbox

To use Cua Cloud Sandbox, you need to:
1. Sign up at https://cua.ai
2. Create a Cloud Sandbox
3. Generate an API Key

Once you have these, you can connect to your Cloud Sandbox and run agents on it.

Get Cua API credentials and sandbox details

In [None]:
cua_api_key = os.getenv("CUA_API_KEY") or input("Enter your Cua API Key: ")
container_name = os.getenv("CONTAINER_NAME") or input("Enter your Cloud Container name: ")

Choose the OS type for your sandbox (linux or macos)

In [None]:
os_type = (
 input("Enter the OS type of your sandbox (linux/macos) [default: linux]: ").lower() or "linux"
)
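
The prompt above accepts any string, so a typo such as `mac` would only surface later when the sandbox connection is made. A small sketch of validating the answer up front is shown below; `normalize_os_type` and `SUPPORTED_OS_TYPES` are hypothetical names introduced here, and the allowed values are simply the two named in the prompt.

```python
SUPPORTED_OS_TYPES = ("linux", "macos")  # the two options named in the prompt above


def normalize_os_type(raw: str, default: str = "linux") -> str:
    """Trim and lowercase the answer, fall back to the default, reject anything else."""
    value = raw.strip().lower() or default
    if value not in SUPPORTED_OS_TYPES:
        raise ValueError(
            f"Unsupported OS type {value!r}; expected one of {SUPPORTED_OS_TYPES}"
        )
    return value
```

Wrapping the `input(...)` call in `normalize_os_type(...)` keeps the same default while rejecting unexpected values early.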

### Create an agent with Cloud Sandbox

In [None]:
import logging
from pathlib import Path

# Connect to your existing Cloud Sandbox
computer = Computer(
 os_type=os_type,
 api_key=cua_api_key,
 name=container_name,
 provider_type=VMProviderType.CLOUD,
 verbosity=logging.INFO,
)

# Create agent
agent = ComputerAgent(
 model="openai/computer-use-preview",
 tools=[computer],
 trajectory_dir=str(Path("trajectories")),
 only_n_most_recent_images=3,
 verbosity=logging.INFO,
)

Run tasks on Cloud Sandbox

In [None]:
tasks = [
 "Open a web browser and navigate to GitHub",
 "Search for the trycua/cua repository",
 "Take a screenshot of the repository page",
]

for i, task in enumerate(tasks):
    print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
    async for result in agent.run(task):
        # print(result)
        pass
    print(f"✅ Task {i+1}/{len(tasks)} completed: {task}")
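
The `async for` loop above runs at the top level of a notebook cell because Jupyter supports top-level `await`. In a plain Python script the same loop has to live inside a coroutine that is started with `asyncio.run`. The sketch below shows that shape. `FakeAgent` is a hypothetical stand-in for `ComputerAgent`, used only so the example runs without a sandbox, and the dictionaries it yields are placeholders rather than the real result format.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in for ComputerAgent: yields two dummy results per task."""

    async def run(self, task: str):
        for step in (1, 2):
            await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
            yield {"task": task, "step": step}


async def run_tasks(agent, tasks):
    """Run tasks in order and return how many results each one produced."""
    counts = {}
    for i, task in enumerate(tasks, start=1):
        print(f"Executing task {i}/{len(tasks)}: {task}")
        counts[task] = 0
        async for _result in agent.run(task):
            counts[task] += 1
    return counts


if __name__ == "__main__":
    summary = asyncio.run(
        run_tasks(FakeAgent(), ["Open a web browser", "Take a screenshot"])
    )
    print(summary)
```

Passing a real agent instead of `FakeAgent()` should work the same way, as long as its `run` method is an async generator as in the cells above.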

## Option 2: KASM Local Docker Containers (cross-platform)

Before we can create an agent, we need to initialize a local computer with the Docker provider.

In [None]:
import logging
from pathlib import Path

computer = Computer(
 os_type="linux",
 provider_type="docker",
 image="trycua/cua-ubuntu:latest",
 name="my-cua-container",
)
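
The Docker provider presumably needs a working Docker installation on the host. The notebook does not state this, so treat it as an assumption. A quick preflight check using only the standard library might look like the sketch below; `docker_available` is a hypothetical helper, not part of Cua.

```python
import shutil
import subprocess


def docker_available(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI is on PATH and the daemon answers `docker info`."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # A non-zero exit code usually means the daemon is not reachable.
    return result.returncode == 0
```

Calling `docker_available()` before constructing the `Computer` gives a clearer failure message than a provider error later on.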

## Option 3: Agent with Local VMs (Lume daemon)

For Apple Silicon Macs, run agents on local VMs with near-native performance.

Before we can create an agent, we need to initialize a local computer with Lume.

In [None]:
import logging
from pathlib import Path


computer = Computer(
 verbosity=logging.INFO,
 provider_type=VMProviderType.LUME,
 display="1024x768",
 memory="8GB",
 cpu="4",
 os_type="macos",
)
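
Since this option targets Apple Silicon Macs, it can help to confirm the host before starting a VM. The sketch below uses the standard `platform` module; `is_apple_silicon` is a hypothetical helper. Note that a Python interpreter running under Rosetta typically reports `x86_64`, so it would fail this check even on Apple Silicon hardware.

```python
import platform
from typing import Optional


def is_apple_silicon(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Return True when running natively on macOS with an arm64 CPU.

    The optional arguments exist only to make the check easy to test.
    """
    system = system if system is not None else platform.system()
    machine = machine if machine is not None else platform.machine()
    return system == "Darwin" and machine == "arm64"
```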

## Create an agent

Let's start by creating an agent that relies on the OpenAI API computer-use-preview model.

In [None]:
# Create an agent that uses the OpenAI computer-use-preview model
agent = ComputerAgent(
 model="openai/computer-use-preview",
 tools=[computer],
 trajectory_dir=str(Path("trajectories")),
 only_n_most_recent_images=3,
 verbosity=logging.INFO,
)

Run tasks on a computer:

In [None]:
tasks = [
 "Look for a repository named trycua/cua on GitHub.",
 "Check the open issues, open the most recent one and read it.",
 "Clone the repository in users/lume/projects if it doesn't exist yet.",
 "Open the repository with an app named Cursor (on the dock, black background and white cube icon).",
 "From Cursor, open Composer if not already open.",
 "Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.",
]

for i, task in enumerate(tasks):
    print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
    async for result in agent.run(task):
        # print(result)
        pass

    print(f"\n✅ Task {i+1}/{len(tasks)} completed: {task}")

Or using the Omni Agent Loop:

In [None]:
import logging
from pathlib import Path
from agent import ComputerAgent

computer = Computer(verbosity=logging.INFO)

# Create an agent with the Omni loop; alternative providers are commented out below
agent = ComputerAgent(
 model="omniparser+ollama_chat/gemma3:12b-it-q4_K_M",
 # model="omniparser+openai/gpt-4o-mini",
 # model="omniparser+anthropic/claude-3-7-sonnet-20250219",
 tools=[computer],
 trajectory_dir=str(Path("trajectories")),
 only_n_most_recent_images=3,
 verbosity=logging.INFO,
)

tasks = [
 "Look for a repository named trycua/cua on GitHub.",
 "Check the open issues, open the most recent one and read it.",
 "Clone the repository in users/lume/projects if it doesn't exist yet.",
 "Open the repository with an app named Cursor (on the dock, black background and white cube icon).",
 "From Cursor, open Composer if not already open.",
 "Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.",
]

for i, task in enumerate(tasks):
    print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
    async for result in agent.run(task):
        # print(result)
        pass

    print(f"\n✅ Task {i+1}/{len(tasks)} completed: {task}")

## Using the Gradio UI

The agent includes a Gradio-based user interface for easy interaction. To use it:

In [None]:
from agent.ui.gradio.ui_components import create_gradio_ui

app = create_gradio_ui()
app.launch(share=False)

## Advanced Agent Configurations

### Using different agent loops

You can use different agent loops depending on your needs:

1. OpenAI Agent Loop

In [None]:
openai_agent = ComputerAgent(
 tools=[computer], # Can be cloud or local
 model="openai/computer-use-preview",
 trajectory_dir=str(Path("trajectories")),
 verbosity=logging.INFO,
)

2. Anthropic Agent Loop

In [None]:
anthropic_agent = ComputerAgent(
 tools=[computer],
 model="anthropic/claude-sonnet-4-5-20250929",
 trajectory_dir=str(Path("trajectories")),
 verbosity=logging.INFO,
)

3. Omni Agent Loop (supports multiple providers)

In [None]:
omni_agent = ComputerAgent(
 tools=[computer],
 model="omniparser+anthropic/claude-3-7-sonnet-20250219",
 # model="omniparser+openai/gpt-4o-mini",
 trajectory_dir=str(Path("trajectories")),
 only_n_most_recent_images=3,
 verbosity=logging.INFO,
)

4. UITARS Agent Loop (for local inference on Apple Silicon)

In [None]:
uitars_agent = ComputerAgent(
 tools=[computer],
 model="mlx/mlx-community/UI-TARS-1.5-7B-6bit", # local MLX
 # model="huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B", # local Huggingface (transformers)
 # model="huggingface/ByteDance-Seed/UI-TARS-1.5-7B", # remote Huggingface (TGI)
 trajectory_dir=str(Path("trajectories")),
 verbosity=logging.INFO,
)
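
When the same notebook is shared across machines, one way to pick among the loops above is to look at which credentials are present. The sketch below is purely illustrative: `choose_model`, `CANDIDATES`, and `LOCAL_FALLBACK` are hypothetical names, the model strings are copied from the cells above, and the pairing of each string with an environment variable is an assumption rather than documented Cua behavior.

```python
import os
from typing import Mapping, Optional

# Model strings copied from the cells above; pairing them with these
# environment variables is an assumption made for this sketch.
CANDIDATES = (
    ("OPENAI_API_KEY", "openai/computer-use-preview"),
    ("ANTHROPIC_API_KEY", "anthropic/claude-sonnet-4-5-20250929"),
)
LOCAL_FALLBACK = "mlx/mlx-community/UI-TARS-1.5-7B-6bit"  # local MLX model from loop 4


def choose_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first model whose API key is set, else the local fallback."""
    env = os.environ if env is None else env
    for key_name, model in CANDIDATES:
        if env.get(key_name):
            return model
    return LOCAL_FALLBACK
```

The result can then be passed as `model=choose_model()` when constructing a `ComputerAgent`.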

### Trajectory viewing

All agent runs save trajectories that can be viewed at https://cua.ai/trajectory-viewer

In [None]:
print(f"Trajectories saved to: {Path('trajectories').absolute()}")
print("Upload trajectory files to https://cua.ai/trajectory-viewer to visualize agent actions")
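
Before uploading, it can be useful to see what a run actually wrote. The notebook does not describe the layout of the trajectory directory, so the sketch below makes no assumptions about file names and only counts files under each top-level entry. `summarize_trajectories` is a hypothetical helper.

```python
from pathlib import Path


def summarize_trajectories(root: Path) -> dict:
    """Map each top-level entry under root to the number of files it contains."""
    if not root.is_dir():
        return {}
    summary = {}
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            # Count files at any depth below this entry.
            summary[entry.name] = sum(1 for p in entry.rglob("*") if p.is_file())
        else:
            summary[entry.name] = 1
    return summary
```

For example, `summarize_trajectories(Path("trajectories"))` returns an empty dictionary if no run has been saved yet.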