# OpenSandbox Architecture

OpenSandbox is a universal sandbox platform designed for AI application scenarios, providing a complete solution with multi-language SDKs, standardized sandbox protocols, and flexible runtime implementations. This document describes the overall architecture and design philosophy of OpenSandbox.

## Architecture Overview

![OpenSandbox Architecture](assets/architecture.svg)

The OpenSandbox architecture consists of four main layers:

1. **SDKs Layer** - Client libraries for interacting with sandboxes
2. **Specs Layer** - OpenAPI specifications defining the protocols
3. **Runtime Layer** - Server implementations managing sandbox lifecycle
4. **Sandbox Instances Layer** - Running sandbox containers with injected execution daemons

## 1. OpenSandbox SDKs

The SDK layer provides high-level abstractions for developers to interact with sandboxes. It handles communication with both the Sandbox Lifecycle API and the Sandbox Execution API.

### Core SDK Components

#### 1.1 Sandbox

The `Sandbox` class is the primary entry point for managing sandbox lifecycle:

- **Create**: Provision new sandbox instances from container images
- **Manage**: Monitor sandbox state, renew expiration, retrieve endpoints
- **Destroy**: Terminate sandbox instances when no longer needed

**Key Features:**
- Async/await support for non-blocking operations
- Automatic state polling for provisioning progress
- Resource quota management (CPU, memory, GPU)
- Metadata and environment variable injection
- TTL-based automatic expiration with renewal

#### 1.2 Filesystem

The `Filesystem` component provides comprehensive file operations within sandboxes:

- **CRUD Operations**: Create, read, update, and delete files and directories
- **Bulk Operations**: Upload/download multiple files efficiently
- **Search**: Glob-based file searching with pattern matching
- **Permissions**: Manage file ownership, group, and mode (chmod)
- **Metadata**: Retrieve file info including size, timestamps, permissions

**Use Cases:**
- Uploading code files and dependencies
- Downloading execution results and artifacts
- Managing workspace directories
- Searching for files by pattern

#### 1.3 Commands

The `Commands` component enables shell command execution within sandboxes:

- **Foreground Execution**: Run commands synchronously with real-time output streaming
- **Background Execution**: Launch long-running processes in detached mode
- **Stream Support**: Capture stdout/stderr via Server-Sent Events (SSE)
- **Process Control**: Interrupt running commands via context cancellation
- **Working Directory**: Specify custom working directory for command execution

**Use Cases:**
- Running build commands (e.g., `npm install`, `pip install`)
- Executing system utilities (e.g., `git`, `docker`)
- Starting web servers or services
- Running test suites

#### 1.4 CodeInterpreter

The `CodeInterpreter` component provides stateful code execution across multiple programming languages:

- **Multi-Language Support**: Python, Java, JavaScript, TypeScript, Go, Bash
- **Session Management**: Maintain execution state across multiple code blocks
- **Jupyter Integration**: Built on Jupyter kernel protocol for robust execution
- **Result Streaming**: Real-time output via SSE with execution counts
- **Error Handling**: Structured error responses with tracebacks

**Key Features:**
- Variable persistence across executions within same session
- Display data in multiple MIME types (text, HTML, images)
- Execution interruption support
- Execution timing and performance metrics

**Use Cases:**
- Interactive coding environments (e.g., Jupyter notebooks)
- AI code generation and execution
- Data analysis and visualization
- Educational coding platforms

### SDK Language Support

OpenSandbox provides SDKs in multiple languages:

- **Python SDK** (`sdks/sandbox/python`, `sdks/code-interpreter/python`)
- **Java/Kotlin SDK** (`sdks/sandbox/kotlin`, `sdks/code-interpreter/kotlin`)
- **TypeScript SDK** (Roadmap)

All SDKs follow the same design patterns and provide consistent APIs across languages.

## 2. OpenSandbox Specs

The Specs layer defines two core OpenAPI specifications that establish the contract between SDKs and runtime implementations.

### 2.1 Sandbox Lifecycle Spec

**File**: `specs/sandbox-lifecycle.yml`

The Lifecycle Spec defines the API for managing sandbox instances throughout their lifecycle.

#### Core Operations

| Operation | Endpoint | Description |
|-----------|----------|-------------|
| **Create** | `POST /sandboxes` | Create a new sandbox from a container image |
| **List** | `GET /sandboxes` | List sandboxes with filtering and pagination |
| **Get** | `GET /sandboxes/{id}` | Retrieve sandbox details and status |
| **Delete** | `DELETE /sandboxes/{id}` | Terminate a sandbox |
| **Pause** | `POST /sandboxes/{id}/pause` | Pause a running sandbox |
| **Resume** | `POST /sandboxes/{id}/resume` | Resume a paused sandbox |
| **Renew** | `POST /sandboxes/{id}/renew-expiration` | Extend sandbox TTL |
| **Endpoint** | `GET /sandboxes/{id}/endpoints/{port}` | Get public URL for a port |

### 2.2 Sandbox Execution Spec

**File**: `specs/execd-api.yaml`

The Execution Spec defines the API for interacting with running sandbox instances. This API is implemented by the `execd` daemon injected into each sandbox.

#### API Categories

**Health**
- `GET /ping` - Health check

**Code Interpreting**
- `POST /code/context` - Create execution context
- `POST /code` - Execute code with streaming output
- `DELETE /code` - Interrupt code execution

**Command Execution**
- `POST /command` - Execute shell command
- `DELETE /command` - Interrupt command

**Filesystem**
- `GET /files/info` - Get file metadata
- `DELETE /files` - Remove files
- `POST /files/permissions` - Change permissions
- `POST /files/mv` - Rename/move files
- `GET /files/search` - Search files by glob pattern
- `POST /files/replace` - Replace file content
- `POST /files/upload` - Upload files
- `GET /files/download` - Download files
- `POST /directories` - Create directories
- `DELETE /directories` - Remove directories

**Metrics**
- `GET /metrics` - Get system metrics snapshot
- `GET /metrics/watch` - Stream metrics via SSE

## 3. OpenSandbox Runtime

The Runtime layer implements the Sandbox Lifecycle Spec and manages the orchestration of sandbox containers.

### 3.1 Server Architecture

**Location**: `server/`

The OpenSandbox server is a FastAPI-based service providing:

- **Lifecycle Management**: Create, monitor, pause, resume, and terminate sandboxes
- **Pluggable Runtimes**: Docker (production-ready), Kubernetes (production-ready)
- **Async Provisioning**: Background creation to reduce latency
- **Automatic Expiration**: Configurable TTL with renewal support
- **Access Control**: API key authentication
- **Observability**: Unified status tracking with transition logging

### 3.2 Runtime Implementations

#### Docker Runtime (Ready)

**Features:**
- Direct Docker API integration
- Two networking modes:
  - **Host Mode**: Containers share host network (single instance)
  - **Bridge Mode**: Isolated networking with HTTP routing
- Container lifecycle management
- Resource quota enforcement
- Private registry authentication
- Volume mounting for execd injection
- Automatic cleanup on expiration

**Key Responsibilities:**
1. Pull container images (with auth support)
2. Create containers with resource limits
3. Inject execd binary and start script
4. Monitor container state
5. Handle pause/resume operations
6. Clean up terminated containers

#### Kubernetes Runtime (Ready)

**Features:**
- Built-in **[BatchSandbox](https://github.com/alibaba/OpenSandbox/tree/main/kubernetes)** runtime with sandbox pooling, high-throughput batch creation, and heterogeneous task orchestration; also compatible with **[SIG agent-sandbox](https://github.com/kubernetes-sigs/agent-sandbox)** as an alternative runtime
- Support for different secure container runtimes (e.g., kata-containers, gVisor)
- Helm-based deployment for controller and server, see [documentation](https://github.com/alibaba/OpenSandbox/blob/main/kubernetes/charts/opensandbox/README.md)

**Planned Features:**
- Unified network storage mounting (ossfs, NAS, custom PVC) in both pooled and non-pooled modes
- Pause/resume support

#### Custom Runtime

The pluggable architecture allows implementing custom runtimes by:
1. Implementing the Lifecycle Spec APIs
2. Managing sandbox provisioning and cleanup
3. Injecting execd into sandbox instances
4. Reporting sandbox state transitions

### 3.3 Networking and Routing

#### Sandbox Router

**Purpose**: Provides HTTP/HTTPS load balancing to sandbox instance ports.

**Features:**
- Dynamic endpoint generation based on sandbox ID and port
- Supports both domain-based and wildcard routing
- Reverse proxy to sandbox container ports
- Automatic cleanup when sandbox terminates

**Endpoint Format**: `{domain}/sandboxes/{sandboxId}/port/{port}`

**Use Cases:**
- Accessing web applications running in sandboxes
- Connecting to development servers (e.g., VS Code Server)
- Exposing APIs and services
- VNC and remote desktop access

## 4. Sandbox Instances

Sandbox instances are running containers that host user workloads with an injected execution daemon.

### 4.1 Container Structure

Each sandbox instance consists of:

1. **Base Container**: User-specified image (e.g., `ubuntu:22.04`, `python:3.11`)
2. **execd Daemon**: Injected execution agent implementing the Execution Spec
3. **Entrypoint Process**: User-defined main process

### 4.2 execd - Execution Daemon

**Location**: `components/execd/`

execd is a Go-based HTTP daemon built on the Beego framework.

#### Core Responsibilities

1. **Code Execution**: Manage Jupyter kernel sessions for multi-language code execution
2. **Command Execution**: Run shell commands with output streaming
3. **File Operations**: Provide filesystem API for remote file management
4. **Metrics Collection**: Monitor and report CPU, memory usage

#### Architecture

**Technology Stack:**
- **Language**: Go 1.24+
- **Web Framework**: Beego
- **Jupyter Integration**: WebSocket-based Jupyter protocol client
- **Streaming**: Server-Sent Events (SSE)

**Package Structure:**
- `pkg/flag/` - Configuration and CLI flags
- `pkg/web/` - HTTP layer (controllers, models, router)
- `pkg/runtime/` - Execution dispatcher
- `pkg/jupyter/` - Jupyter kernel client
- `pkg/util/` - Utilities and helpers

#### Jupyter Integration

execd integrates with Jupyter Server running inside the container:

1. **Session Management**: Create and maintain kernel sessions
2. **WebSocket Communication**: Real-time bidirectional communication
3. **Message Protocol**: Jupyter message spec implementation
4. **Stream Parsing**: Parse execution results, outputs, errors

**Supported Kernels:**
- Python (IPython)
- Java (IJava)
- JavaScript (IJavaScript)
- TypeScript (ITypeScript)
- Go (gophernotes)
- Bash

### 4.3 Injection Mechanism

The execd daemon is injected into sandbox containers during creation:

**Docker Runtime Injection Process:**

1. **Pull execd Image**: Retrieve the execd container image
2. **Extract Binary**: Copy execd binary from image to temporary location
3. **Volume Mount**: Mount execd binary and startup script into target container
4. **Entrypoint Override**: Modify container entrypoint to start execd first
5. **User Process Launch**: execd forks and executes the user's entrypoint

**Startup Sequence:**

```bash
# Container starts with modified entrypoint
/opt/opensandbox/start.sh
  ↓
# Start Jupyter Server
jupyter notebook --port=54321 --no-browser --ip=0.0.0.0
  ↓
# Start execd daemon
/opt/opensandbox/execd --jupyter-host=http://127.0.0.1:54321 --port=44772
  ↓
# Execute user entrypoint
exec "${USER_ENTRYPOINT[@]}"
```

**Benefits:**
- Transparent to user code
- No image modification required
- Dynamic injection at runtime
- Works with any base image

## 5. Communication Flow

### 5.1 Sandbox Creation Flow

```
User/SDK
   │
   │ 1. POST /sandboxes (image, entrypoint, resources)
   ▼
Server (Lifecycle API)
   │
   │ 2. Pull container image
   │ 3. Inject execd binary
   │ 4. Create container with entrypoint override
   │ 5. Start container
   ▼
Sandbox Instance
   │
   │ 6. Start execd daemon
   │ 7. Start Jupyter Server
   │ 8. Execute user entrypoint
   ▼
Running (State)
```

### 5.2 Code Execution Flow

```
User/SDK
   │
   │ 1. Create sandbox
   │ 2. Get execd endpoint
   ▼
CodeInterpreter SDK
   │
   │ 3. POST /code/context (create session)
   │ 4. POST /code (execute code)
   ▼
execd (Execution API)
   │
   │ 5. Route to Jupyter runtime
   ▼
Jupyter Runtime
   │
   │ 6. WebSocket to Jupyter Server
   │ 7. Send execute_request
   ▼
Jupyter Kernel (Python/Java/etc.)
   │
   │ 8. Execute code
   │ 9. Stream output events
   ▼
execd
   │
   │ 10. Convert to SSE events
   │ 11. Stream to client
   ▼
CodeInterpreter SDK
   │
   │ 12. Parse events
   │ 13. Return result to user
   ▼
User/Application
```

### 5.3 File Operations Flow

```
User/SDK
   │
   │ 1. Upload files
   ▼
Filesystem SDK
   │
   │ 2. POST /files/upload (multipart)
   ▼
execd (Execution API)
   │
   │ 3. Write to filesystem
   │ 4. Set permissions
   ▼
Sandbox Container Filesystem
```

## 6. Design Principles

### 6.1 Protocol-First Design

- All interactions defined by OpenAPI specifications
- Clear contracts between components
- Enables polyglot implementations
- Supports custom runtime implementations

### 6.2 Separation of Concerns

- **SDK**: Client-side abstraction and convenience
- **Specs**: Protocol definition and documentation
- **Runtime**: Sandbox orchestration and lifecycle
- **execd**: In-sandbox execution and operations

### 6.3 Extensibility

- Pluggable runtime implementations
- Custom sandbox images
- Multiple SDK languages
- Additional Jupyter kernels

### 6.4 Security

- API key authentication for lifecycle operations
- Token-based authentication for execution operations
- Isolated sandbox environments
- Resource quota enforcement
- Network isolation options

### 6.5 Observability

- Structured state transitions
- Real-time metrics streaming
- Comprehensive logging
- Health check endpoints

## 7. Use Cases

### 7.1 AI Code Generation and Execution

AI models (like Claude, GPT-4, Gemini) generate code that needs to be executed safely:

- **Isolation**: Run untrusted AI-generated code in sandboxes
- **Multi-Language**: Support various programming languages
- **Iteration**: Maintain state across multiple code generations
- **Feedback**: Capture execution results and errors for AI refinement

**Examples**: [claude-code](../examples/claude-code/), [gemini-cli](../examples/gemini-cli/), [codex-cli](../examples/codex-cli/)

### 7.2 Interactive Coding Environments

Build web-based coding platforms and notebooks:

- **Code Execution**: Run code in isolated environments
- **File Management**: Upload/download project files
- **Terminal Access**: Execute shell commands
- **Collaboration**: Share sandbox instances

**Examples**: [code-interpreter](../examples/code-interpreter/)

### 7.3 Browser Automation and Testing

Automate web browsers for testing and scraping:

- **Headless Browsers**: Chrome, Playwright
- **Remote Debugging**: DevTools protocol
- **VNC Access**: Visual debugging
- **Network Isolation**: Controlled environment

**Examples**: [chrome](../examples/chrome/), [playwright](../examples/playwright/)

### 7.4 Remote Development Environments

Provide cloud-based development workspaces:

- **VS Code Server**: Full IDE in browser
- **Desktop Environments**: VNC-based desktops
- **Tool Pre-installation**: Language runtimes, build tools
- **Port Forwarding**: Access development servers

**Examples**: [vscode](../examples/vscode/), [desktop](../examples/desktop/)

### 7.5 Continuous Integration and Testing

Run build and test pipelines in isolated environments:

- **Reproducible Builds**: Consistent container images
- **Parallel Execution**: Multiple sandbox instances
- **Artifact Collection**: Download build outputs
- **Resource Limits**: Prevent resource exhaustion

## 8. Conclusion

OpenSandbox provides a complete, production-ready platform for building AI-powered applications that require safe code execution, file management, and command execution in isolated environments. The architecture is designed to be:

- **Universal**: Works with any container image
- **Extensible**: Pluggable runtimes and custom implementations
- **Developer-Friendly**: Multi-language SDKs with consistent APIs
- **Production-Ready**: Robust lifecycle management and observability
- **Secure**: Isolated environments with access control

The protocol-first design ensures that all components can evolve independently while maintaining compatibility. Whether you're building AI coding assistants, interactive notebooks, or remote development environments, OpenSandbox provides the foundation you need.

## 9. References

- [Contributing Guide](contributing.md)
- [Sandbox Lifecycle Spec](../specs/sandbox-lifecycle.yml)
- [Sandbox Execution Spec](../specs/execd-api.yaml)
- [Server Documentation](../server/README.md)
- [execd Documentation](../components/execd/README.md)
- [Python SDK](../sdks/sandbox/python/README.md)
- [Java/Kotlin SDK](../sdks/sandbox/kotlin/README.md)
- [Examples](../examples/README.md)