# ggcode — Architecture

> This document is for maintainers, contributors, and anyone who wants the internal technical layout.
> If you want to install and use ggcode as a product, start with the main [README](../README.md) first.

> Module: `github.com/topcheer/ggcode`
> Last updated: 2026-05-23

## Overview

ggcode is a terminal-based AI coding agent written in Go. It provides an interactive REPL where users describe coding tasks in natural language; the agent iteratively plans, calls tools, and refines its work in an agentic loop.
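
The loop described above can be sketched in a few lines. This is a minimal illustration, not the code in `internal/agent`: the `Reply`, `ToolCall`, `Model`, and `Tool` types and the `runLoop` function are hypothetical names, and the model is a fake that requests one tool call before answering.

```go
package main

import "fmt"

// ToolCall and Reply are hypothetical shapes used only for this sketch.
type ToolCall struct {
	Name string
	Args string
}

type Reply struct {
	Text      string
	ToolCalls []ToolCall
}

// Model stands in for an LLM provider; Tool stands in for a tool executor.
type Model func(history []string) Reply
type Tool func(args string) string

// runLoop calls the model, executes any requested tools, feeds the results
// back, and repeats until the model answers without tool calls or the
// iteration budget is exhausted.
func runLoop(m Model, tools map[string]Tool, prompt string, maxIter int) (string, error) {
	history := []string{"user: " + prompt}
	for i := 0; i < maxIter; i++ {
		reply := m(history)
		if len(reply.ToolCalls) == 0 {
			return reply.Text, nil
		}
		for _, call := range reply.ToolCalls {
			result := "unknown tool: " + call.Name
			if tool, ok := tools[call.Name]; ok {
				result = tool(call.Args)
			}
			history = append(history, "tool "+call.Name+": "+result)
		}
	}
	return "", fmt.Errorf("no final answer after %d iterations", maxIter)
}

func main() {
	// Fake model: asks for one tool call, then answers using the result.
	model := func(history []string) Reply {
		if len(history) == 1 {
			return Reply{ToolCalls: []ToolCall{{Name: "list_dir", Args: "."}}}
		}
		return Reply{Text: "done after " + history[len(history)-1]}
	}
	tools := map[string]Tool{"list_dir": func(string) string { return "main.go" }}
	out, err := runLoop(model, tools, "what files exist?", 5)
	fmt.Println(out, err)
}
```

The iteration cap is an assumption of the sketch; it shows why a loop guard is needed when a model keeps requesting tools.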

The core agent loop is now complemented by a lightweight **harness control plane** and an **A2A (Agent-to-Agent) mesh** for multi-instance collaboration. Harness mode is intentionally implemented around the existing runtime rather than inside it, so it adds the capabilities below without forking a second agent architecture. `ggcode harness ...`:

- scaffolds repo guidance and generates nested subsystem `AGENTS.md` files
- runs invariant checks
- tracks orchestrated work items, queues multi-step work, and supports dependency-gated backlogs
- binds tasks to bounded contexts when useful, carries lightweight context ownership metadata, and summarizes queue health per context
- exposes owner-centric actionable inboxes and owner-filtered batch actions
- creates isolated git worktrees when available
- can explicitly retry failed backlog items or resume interrupted runs
- uses pollable sub-agent-backed workers for queued execution
- persists post-run delivery evidence
- exposes review/approval, promotion, and owner/context-scoped release-batch loops on verified tasks
- can further split release-ready work into owner- or context-grouped rollout waves with persisted staged rollout state, explicit environment tags, gate approval/rejection, and advance/pause/resume/abort controls

The A2A subsystem enables multiple ggcode instances to discover each other, authenticate via multiple schemes (API key, OAuth2+PKCE, Device Flow, OIDC, mTLS), and call tools across instances transparently via MCP bridge.

### Core Principles

- **Agentic loop**: user prompt → LLM → tool calls → execute → feed results back → repeat
- **Extensible tools**: built-in tools + MCP servers + Go plugin interface
- **Safe by default**: permission policy with path sandbox and dangerous-command detection
- **Streaming UX**: Bubble Tea TUI with live markdown, diff preview, spinners
- **Portable config**: YAML with `${ENV_VAR}` expansion
- **Multi-instance collaboration**: A2A protocol with multi-auth, auto-discovery, MCP bridge
- **IM gateway**: Remote coding via Telegram, QQ, Discord, Slack, DingTalk, Feishu with slash commands
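
As an illustration of the `${ENV_VAR}` expansion principle above, the standard library's `os.Expand` can substitute such references. This is a sketch under assumptions: `expandConfig` and `EXAMPLE_API_KEY` are hypothetical names, and the real `internal/config/env.go` may handle missing variables or the bare `$NAME` form differently.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces ${NAME} references in raw using lookup.
// Note that os.Expand also accepts the bare $NAME form.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// EXAMPLE_API_KEY is a made-up variable for the demo.
	os.Setenv("EXAMPLE_API_KEY", "sk-test")
	fmt.Println(expandConfig("api_key: ${EXAMPLE_API_KEY}", os.Getenv))
}
```

Passing the lookup function explicitly keeps the expansion testable without touching the process environment.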

## Directory Structure

```
cmd/ggcode/              # CLI entrypoint (main.go, root.go, pipe.go, daemon.go)
ggcode-relay/            # Standalone relay server for mobile tunnel (main.go, store.go)
desktop/                 # Desktop GUI application (Fyne, separate Go module)
  ggcode-desktop/        # Main desktop app — visual chat, IM, tool approval, tunnel relay
  markdownx/             # Extended Markdown widget (Fyne, Mermaid diagram support)
internal/
  agent/                 # Agent: agentic loop, provider abstraction
    agent.go             # Agent struct, Run/RunStream, core orchestration
    agent_autopilot.go   # Autopilot continuation logic
    agent_compact.go     # Auto-compaction of conversation history
    agent_memory.go      # Memory management helpers
    agent_tool.go        # Tool execution, diff confirm, hooks
  a2a/                   # Agent-to-Agent protocol
    server.go            # HTTP server with multi-auth middleware
    client.go            # Auto-negotiating A2A client
    handler.go           # JSON-RPC handler (SendMessage, SendMessageStreaming)
    registry.go          # Local instance registry (PID-based discovery)
    mcp_bridge.go        # MCP bridge: exposes remote agents as MCP tools
    remote_tool.go       # Remote tool executor
    types.go             # Shared types (Task, Artifact, AgentCard, auth config)
  auth/                  # Authentication subsystem
    store.go             # Credential store
    copilot.go           # GitHub Copilot token management
    pkce.go              # PKCE helpers (code verifier/challenge generation)
    claude_oauth.go      # Claude OAuth flow
    a2a_oauth.go         # A2A OAuth2 PKCE/Device Flow providers
    a2a_presets.go       # Provider presets (GitHub, Google, Auth0, Azure)
    a2a_token_cache.go   # Token cache with per-client isolation
  checkpoint/            # File edit checkpoints for undo
    checkpoint.go
  commands/              # Markdown-backed skills and legacy custom slash commands
    command.go           # Command/skill metadata and expansion helpers
    loader.go            # Load skills and legacy commands from ~/.agents, ~/.ggcode, and project .ggcode
  config/                # Configuration loading and env expansion
    config.go            # Config struct, LoadFromFile, A2A auth config
    env.go               # ${ENV_VAR} expansion
    a2a_override.go      # Instance-level A2A config merge from .ggcode/a2a.yaml
  context/               # Conversation context management
    manager.go           # ContextManager: message history, compression
    tokenizer.go         # CJK-aware token estimation
  cost/                  # Token usage and cost tracking
    manager.go           # CostManager: per-session and total cost
    pricing.go           # Model pricing data
    types.go             # SessionCost type
    tracker.go           # In-flight token counting
  daemon/                # Daemon mode: headless agent with follow display
    daemon.go            # Daemon struct, keyboard shortcuts, i18n labels
  debug/                 # Debug logging
    debug.go
  diff/                  # Diff formatting utilities
    diff.go              # FormatDiff, IsDiffContent
  hooks/                 # Pre/post execution hooks
    hook.go              # Hook struct
    runner.go            # HookRunner
  harness/               # Harness control plane: scaffold, checks, queue/run tracking, review/promotion/release, worktrees, gc
    config.go            # Harness config model and defaults
    project.go           # Project discovery and scaffold creation
    check.go             # Structural checks and validation commands
    run.go               # Tracked harness runs / queued execution
    release.go           # Release-batch planning and persisted release reports
    worktree.go          # Git worktree lifecycle for isolated task workspaces
    gc.go                # Archive/prune stale harness state
  image/                 # Image file handling for multimodal input
    image.go             # ReadFile, Placeholder
  im/                    # IM gateway runtime
    runtime.go           # Manager: multi-adapter routing, bindings, mute/unmute
    daemon_bridge.go     # DaemonBridge: agent loop + IM slash commands + ChatBridge impl
    emitter.go           # IMEmitter: outbound event routing
    fanout.go            # Multi-adapter fan-out with echo suppression
    adapter_*.go         # Platform adapters (Telegram, QQ, Discord, Slack, DingTalk, Feishu)
    tool_format.go       # Unified tool result formatting for IM
  webui/                 # WebUI HTTP server + WebSocket chat
    server.go            # Server: REST API (config, sessions, MCP, IM, A2A, general) + WS chat handler
    tui_bridge.go        # TUIChatBridge: routes webchat → TUI event loop via program.Send()
    server_test.go       # Unit tests for config serialization, sessions, WS
    server_e2e_test.go   # E2E tests (48 tests): REST API, WS lifecycle, broadcast, attachments, errors
  knight/                # Knight: background autonomous agent
    knight.go            # Daily token budget, activity-driven code monitoring
  mcp/                   # Model Context Protocol client
    client.go            # MCPClient: spawn and communicate with MCP servers
    adapter.go           # Tool adapter (MCP tool → ggcode tool interface)
    jsonrpc.go           # JSON-RPC protocol
    oauth.go             # OAuth 2.1 handler: metadata discovery, DCR, device flow, token refresh
  memory/                # Project and auto memory
    auto.go              # AutoMemory: automatic memory extraction
    project.go           # ProjectMemory: load memory files
  permission/            # Permission and sandbox policy
    policy.go            # PermissionPolicy interface
    config_policy.go     # Config-backed policy
    mode.go              # PermissionMode enum (Supervised/Plan/Auto/Bypass)
    dangerous.go         # Dangerous command detection
    sandbox.go           # Path sandbox enforcement
  plugin/                # Go plugin system
    loader.go            # Plugin loader
    mcp_loader.go        # MCP server plugin loader
    plugin.go            # Plugin interface
  provider/              # LLM provider implementations
    provider.go          # Provider interface
    openai.go            # OpenAI-compatible provider
    anthropic.go         # Anthropic provider
    gemini.go            # Google Gemini provider
    copilot.go           # GitHub Copilot provider
    registry.go          # Provider registry
    retry.go             # Retry logic with backoff
  session/               # Session persistence
    store.go             # Store: save/load sessions as JSONL
  subagent/              # Sub-agent spawning and management
    manager.go           # Manager: spawn, list, cancel, and snapshot sub-agents
    runner.go            # Runner: execute sub-agent tasks and wait/poll their state
  tool/                  # Built-in tools
    tool.go              # Tool interface
    builtin.go           # RegisterBuiltinTools
    read_file.go         # File reading
    write_file.go        # File writing
    edit_file.go         # File editing with checkpoint support
    run_command.go       # Synchronous shell command execution
    command_jobs.go      # Background command job manager and buffered output
    command_job_tools.go # Async command start/read/wait/write/stop/list tools
    search_files.go      # Code search
    list_dir.go          # Directory listing
    glob.go              # Glob pattern matching
    web_fetch.go         # HTTP fetch with SSRF protection
    web_search.go        # Web search
    git_diff.go / git_log.go / git_status.go  # Git tools
    save_memory.go       # Save memory entries
    todo_write.go        # Write todo lists
    spawn_agent.go       # Spawn sub-agents
    list_agents.go / wait_agent.go  # Sub-agent polling and wait tools
  tui/                   # Terminal UI (Bubble Tea)
    model.go             # Model struct, Init, configuration methods
    model_update.go      # Update loop: message routing, key handling, agent lifecycle
    model_messages.go    # Message types (streamMsg, doneMsg, errMsg, etc.)
    model_approval.go    # Approval/diff confirmation selection lists
    model_pending.go     # Pending state helpers (device codes, questionnaires)
    model_clipboard.go   # Clipboard image loading
    model_terminal.go    # Terminal utility helpers (open URL, resize)
    view.go              # View rendering, status bar, autocomplete
    commands.go          # Slash command handlers
    submit.go            # Message submission and agent startup
    resize.go            # Window resize handling
    repl.go              # REPL: wires Model to Agent, session, cost
    completion.go        # Slash command autocomplete logic
    viewport.go          # Scrollable viewport with auto-follow
    spinner.go           # Tool execution spinner
    diff.go              # Diff display formatting
    markdown.go          # Markdown rendering with glamour
    preview_panel.go     # File preview, markdown rendering, syntax highlighting
    app.go               # Minimal package marker
  tunnel/                # Tunnel broker for mobile relay
    broker.go            # Broker: client management, event recording, replay, active session tracking
    relay_client.go      # WebSocket client to relay server with backpressure
    session.go           # Tunnel session helpers (SendText, SendSnapshot, etc.)
    protocol.go          # Event types and gateway message structs
  swarm/                 # Team-based multi-agent coordination
    team.go              # Team creation, deletion, teammate management
    task_board.go        # Shared task board with assignee-based delivery
  cron/                  # Scheduled jobs
    cron.go              # Cron expression parsing, recurring and one-shot jobs
  lsp/                   # LSP client integration
    client.go            # Generic LSP client for gopls, rust-analyzer, etc.
  markdown/              # Markdown rendering
    render.go            # Glamour-based markdown rendering helpers
  chat/                  # Chat utilities
    types.go             # Shared chat types and helpers
  extract/               # Content extraction
    extract.go           # File content extraction utilities
  stream/                # Stream processing
    stream.go            # Stream utilities
  task/                  # Task tracking
    task.go              # Task primitives
  safego/                # Safe goroutine helpers
    safego.go            # Panic recovery wrappers for goroutines
  restart/               # Process restart
    restart.go           # Restart support
  acp/                   # Agent Client Protocol
    acp.go               # ACP support for JetBrains, Zed, etc.
  util/                  # Shared utilities
    truncate.go          # String truncation
docs/                    # Documentation
  ARCHITECTURE.md        # This file
  a2a-auth.md            # A2A authentication guide (5 schemes, config examples, decision matrix)
```

## Key Patterns

- **Bubble Tea streaming**: Agent runs in a goroutine; events flow into the TUI via `tea.Program.Send()`
- **Permission policy**: Two layers — tool-level `ShouldAsk` + dangerous-command detection
- **Import cycle avoidance**: Shared types defined in downstream packages; factory functions injected
- **MCP client**: Spawns fresh process per tool call (`callToolStandalone`); HTTP transport supports OAuth 2.1 with automatic metadata discovery, dynamic client registration, device flow, and token refresh
- **Provider SDKs**: OpenAI (go-openai), Anthropic (anthropic-sdk-go), Gemini (genai), Copilot (custom transport on top of the provider abstraction)
- **IM routing**: IM events are fanned out to all bound adapters; per-channel echo suppression skips the originating adapter for user mirror messages
- **Session format**: JSONL with index.json metadata
- **A2A multi-auth**: Server advertises enabled auth schemes in agent card; client auto-negotiates the strongest available. Auth middleware validates each scheme independently. Multiple schemes can coexist.
- **Token cache**: OAuth2/OIDC tokens cached at `~/.ggcode/oauth-tokens/{provider}-{clientID[:12]}.json` with per-client isolation. Same client_id = shared token; different client_id = isolated.
- **IM mute**: Mute state is in-memory only (not persisted to the binding store). `MuteAllExcept(adapter)` skips the sender's own adapter, which prevents a self-mute race. A daemon restart clears every mute.
- **WebUI ChatBridge**: Decouples WebSocket chat from agent implementation via `ChatBridge` interface (`SendUserMessage`, `Messages`, `Subscribe`). Two implementations:
  - `DaemonBridge` (daemon mode): Injects webchat messages through `pendingInterruptions` into agent's `SetInterruptionHandler`. Broadcasts events from agent stream callback.
  - `TUIChatBridge` (TUI mode): Routes webchat messages through `program.Send(webchatUserMsg)` into bubbletea event loop. No direct agent access — TUI handles queuing/interruption identically to keyboard input. Events broadcast via `BroadcastEvent()` called from TUI's stream callback.
- **WebUI WebSocket**: Per-connection write goroutine with buffered channel (cap 256). Read and write goroutines are fully separated to satisfy gorilla/websocket concurrency requirements (at most one concurrent reader and one concurrent writer per connection). Slow subscribers drop events (non-blocking send) instead of blocking the broadcast path.
- **Tunnel event persistence**: Tunnel events are appended to session JSONL via `AppendTunnelEventToDisk()` without rewriting the whole file. On reconnect, `replayCanonicalEvents()` replays recorded events. `TunnelEventsComplete` flag ensures only fully-recorded event sets are used for replay; incomplete sets fall back to snapshot-based recovery.
- **Relay backpressure**: Peer writes in `ggcode-relay` use blocking sends with a 30s write deadline instead of buffered channel drops, preventing silent data loss during slow connections.
- **Relay event dedup**: `room.upsertHistoryEvent()` deduplicates by sessionID+eventID so replayed events don't accumulate. `snapshot_reset` (empty eventID) is not persisted to SQLite.
- **Swarm task board**: Tasks are assigned to specific teammates via `swarm_task_create` with `assignee`, which pushes directly to the assignee's inbox. Unassigned tasks can be claimed by any idle teammate. Task completion is tracked on a shared board visible to all teammates.
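
The relay event dedup rule in the list above (upsert keyed by sessionID+eventID, with empty event IDs not stored) can be sketched with an in-memory map. The `event` and `history` types here are hypothetical; the actual `room.upsertHistoryEvent()` persists to SQLite and may differ in detail.

```go
package main

import "fmt"

// event is a hypothetical, simplified relay event.
type event struct {
	SessionID string
	EventID   string
	Payload   string
}

// history keeps one entry per (SessionID, EventID) pair.
type history struct {
	order []string // keys in first-seen order
	byKey map[string]event
}

func newHistory() *history { return &history{byKey: map[string]event{}} }

// upsert stores e, replacing any earlier event with the same key.
// Events with an empty EventID are skipped, and upsert reports false.
func (h *history) upsert(e event) bool {
	if e.EventID == "" {
		return false
	}
	key := e.SessionID + "\x00" + e.EventID
	if _, seen := h.byKey[key]; !seen {
		h.order = append(h.order, key)
	}
	h.byKey[key] = e
	return true
}

func main() {
	h := newHistory()
	h.upsert(event{SessionID: "s1", EventID: "e1", Payload: "first"})
	h.upsert(event{SessionID: "s1", EventID: "e1", Payload: "replayed"})
	h.upsert(event{SessionID: "s1", EventID: "", Payload: "snapshot_reset"})
	fmt.Println(len(h.order), h.byKey[h.order[0]].Payload)
}
```

Because a replayed event overwrites its earlier copy instead of appending, the history length stays bounded by the number of distinct events.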

## A2A Authentication Architecture

```
┌──────────────────────────────────────────────────────────┐
│                    A2A Server                             │
│  ┌─────────────────────────────────────────────────────┐ │
│  │ Auth Middleware                                      │ │
│  │  ┌─────────┐ ┌──────────┐ ┌─────┐ ┌──────┐ ┌────┐ │ │
│  │  │ API Key │ │ OAuth2   │ │OIDC │ │ mTLS │ │No  │ │ │
│  │  │ X-Header│ │ Bearer   │ │JWT  │ │ Cert │ │Auth│ │ │
│  │  └────┬────┘ └────┬─────┘ └──┬──┘ └───┬──┘ └─┬──┘ │ │
│  │       └───────┬───┴──────────┴────────┴───────┘    │ │
│  │               ▼ pass / fail                         │ │
│  │         [Request Context: identity]                 │ │
│  └─────────────────────────────────────────────────────┘ │
│  Agent Card: { schemes: [apiKey, oauth2, oidc, mtls] }   │
└──────────────────────────────────────────────────────────┘
         ▲                    ▲
    HTTP with             TLS with
    X-API-Key             client cert
         │                    │
┌────────┴──────┐    ┌───────┴─────┐
│  ggcode #1    │    │  ggcode #2  │
│  (client)     │    │  (client)   │
│               │    │             │
│ Token Cache:  │    │ Token Cache:│
│ ~/.ggcode/    │    │ ~/.ggcode/  │
│  oauth-tokens/│    │  oauth-     │
│   github-xxx  │    │   tokens/   │
└───────────────┘    └─────────────┘
```

The server rebuilds its auth state from config on restart, so it needs no persistence of its own. Client tokens survive restarts through the token cache.
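
The cache file naming documented under Key Patterns (`{provider}-{clientID[:12]}.json`) can be sketched as follows. `tokenCacheFile` is a hypothetical helper, and the guard for client IDs shorter than 12 bytes is an assumption of this sketch rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// tokenCacheFile builds the cache file name from the provider and the
// first 12 bytes of the client ID. Shorter IDs are used whole (assumption).
func tokenCacheFile(provider, clientID string) string {
	short := clientID
	if len(short) > 12 {
		short = short[:12]
	}
	return provider + "-" + short + ".json"
}

func main() {
	// The base directory is written out literally for the demo.
	dir := filepath.Join("~", ".ggcode", "oauth-tokens")
	fmt.Println(filepath.Join(dir, tokenCacheFile("github", "0123456789abcdef")))
}
```

Two instances configured with the same client ID resolve to the same file and therefore share a token, while different client IDs map to separate files.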

## Conversation Context Management

### Auto-Compact Strategy

The agent monitors conversation token usage and triggers compaction when it approaches the model's context window limit. Compaction has two levels:

1. **Microcompact** — Truncates large tool results by preserving the first 10 and last 5 lines, with individual lines truncated to 200 characters. This is fast and cheap but only saves tokens from tool output.

2. **Summarize** — When microcompact is insufficient, older messages are replaced with an LLM-generated summary. The summary prompt is optimized for coding contexts: it preserves key decisions, file paths, code structure, error resolutions, and pending work. Tool results in the summary payload are pre-truncated to 500 characters to prevent the summary request itself from triggering context overflow.
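
A minimal sketch of the microcompact rule (first 10 lines, last 5 lines, 200 characters per line) is shown below. The function names and the omission marker are assumptions; the real implementation in `agent_compact.go` may count characters differently or format the gap in another way.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	headLines  = 10
	tailLines  = 5
	maxLineLen = 200
)

// truncateLine cuts a line to maxLineLen characters (runes).
func truncateLine(s string) string {
	r := []rune(s)
	if len(r) <= maxLineLen {
		return s
	}
	return string(r[:maxLineLen])
}

// microcompact keeps the first headLines and last tailLines of a tool
// result and truncates every kept line. The input slice is not modified.
func microcompact(lines []string) []string {
	kept := make([]string, 0, headLines+tailLines+1)
	if len(lines) <= headLines+tailLines {
		kept = append(kept, lines...)
	} else {
		omitted := len(lines) - headLines - tailLines
		kept = append(kept, lines[:headLines]...)
		// The marker text is an assumption of this sketch.
		kept = append(kept, fmt.Sprintf("[... %d lines omitted ...]", omitted))
		kept = append(kept, lines[len(lines)-tailLines:]...)
	}
	for i, line := range kept {
		kept[i] = truncateLine(line)
	}
	return kept
}

func main() {
	var lines []string
	for i := 1; i <= 40; i++ {
		lines = append(lines, fmt.Sprintf("line %d", i))
	}
	fmt.Println(strings.Join(microcompact(lines), "\n"))
}
```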

Thresholds:
- With usage baseline: 75% of context window triggers compaction
- Without baseline: 65% triggers compaction
- Target after compaction: 55% of context window
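
The thresholds above translate into a simple check. This sketch uses integer percentages to avoid floating-point edge cases; the function names are hypothetical, and whether the real comparison is inclusive at the boundary is an assumption.

```go
package main

import "fmt"

const (
	triggerPctWithBaseline    = 75 // usage baseline available
	triggerPctWithoutBaseline = 65 // no baseline available
	targetPct                 = 55 // size to aim for after compaction
)

// shouldCompact reports whether used tokens have reached the trigger
// percentage of the context window.
func shouldCompact(used, window int, hasBaseline bool) bool {
	pct := triggerPctWithoutBaseline
	if hasBaseline {
		pct = triggerPctWithBaseline
	}
	return used*100 >= window*pct
}

// targetTokens is the token count compaction tries to get down to.
func targetTokens(window int) int {
	return window * targetPct / 100
}

func main() {
	window := 200000
	fmt.Println(shouldCompact(140000, window, true))  // 70% with baseline
	fmt.Println(shouldCompact(140000, window, false)) // 70% without baseline
	fmt.Println(targetTokens(window))
}
```

With a 200,000-token window, 140,000 used tokens (70%) triggers compaction only when no usage baseline is available, and the post-compaction target is 110,000 tokens.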

### Session Checkpoints

After summarize compaction, the compacted message state is persisted as a **checkpoint** record in the session JSONL file. This enables efficient session recovery:

```
Session JSONL:
  msg1 → msg2 → ... → msg50
  [checkpoint: 3 compacted messages, 500 tokens]
  msg51 → msg52 → ...
```

On `--resume`, the loader finds the latest checkpoint and only loads:
1. Messages from the checkpoint snapshot
2. Messages recorded after the checkpoint
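
The resume rule can be sketched as a single pass over the JSONL in which each checkpoint replaces everything collected before it. The `record` schema below (`type`, `text`, `snapshot`) is hypothetical; the real session format is not specified here and may differ.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// record is a hypothetical JSONL line shape used only for this sketch.
type record struct {
	Type     string   `json:"type"` // "message" or "checkpoint"
	Text     string   `json:"text,omitempty"`
	Snapshot []string `json:"snapshot,omitempty"`
}

// loadForResume returns the latest checkpoint snapshot followed by the
// messages recorded after it. Without a checkpoint, all messages are kept.
// bufio.Scanner's default line limit (64 KiB) applies in this sketch.
func loadForResume(jsonl string) ([]string, error) {
	var msgs []string
	sc := bufio.NewScanner(strings.NewReader(jsonl))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var r record
		if err := json.Unmarshal([]byte(line), &r); err != nil {
			return nil, err
		}
		if r.Type == "checkpoint" {
			// A checkpoint supersedes everything recorded before it.
			msgs = append([]string(nil), r.Snapshot...)
			continue
		}
		msgs = append(msgs, r.Text)
	}
	return msgs, sc.Err()
}

func main() {
	data := `{"type":"message","text":"msg1"}
{"type":"message","text":"msg2"}
{"type":"checkpoint","snapshot":["summary of msg1..msg2"]}
{"type":"message","text":"msg3"}`
	msgs, err := loadForResume(data)
	fmt.Println(msgs, err)
}
```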

This avoids re-loading and re-compacting the entire conversation history. Checkpoints are triggered by all compaction paths:
- `maybeAutoCompact` (periodic check at loop start)
- `tryReactiveCompact` (after prompt-too-long errors)
- `forceCompactAndPause` (autopilot loop guard)

### IM Tool Call Display

All IM adapters (Telegram, QQ, Discord, Slack, DingTalk, Feishu) share a unified tool result formatter (`internal/im/tool_format.go`). Each built-in tool has a dedicated format with emoji icon + code block:

| Category | Format |
|----------|--------|
| Commands | `✓` + bash code block for command + plain code block for output (no truncation) |
| File read | `✓ 📖 {path}` (status only, no content) |
| File edit/write | `✓ ✏️/📝 {path}` (status only) |
| Directory/glob/search | Icon + pattern + full results in code block |
| Git | `✓ 🔧 Git Status/Log/Diff` + output in code block |
| Web | `✓ 🌐` + output in code block |
| MCP tools | `✓ 🔧 PrettyName(args)` + output in code block |
| Errors (any category) | `✗` + error in code block |

All absolute paths are relativized against the project working directory before sending to IM.
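
Path relativization of this kind can be built on `filepath.Rel`. The sketch below is illustrative: `relativize` is a hypothetical helper, and leaving paths outside the working directory unchanged is an assumption about the desired behavior.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// relativize rewrites an absolute path under workDir as a relative path.
// Relative inputs and paths outside workDir are returned unchanged.
func relativize(workDir, p string) string {
	if !filepath.IsAbs(p) {
		return p
	}
	rel, err := filepath.Rel(workDir, p)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return p
	}
	return rel
}

func main() {
	// Unix-style paths are used for the demo.
	fmt.Println(relativize("/work/proj", "/work/proj/internal/im/tool_format.go"))
	fmt.Println(relativize("/work/proj", "/etc/hosts"))
}
```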

### IM Slash Commands (Daemon Mode)

The daemon bridge (`internal/im/daemon_bridge.go`) processes slash commands from any IM channel:

| Command | Handler | Notes |
|---------|---------|-------|
| `/listim` | `handleListIM()` | Lists adapters from `Manager.Snapshot()` — shows name, platform, health, mute status |
| `/muteim <name>` | `handleMuteIM()` | Calls `Manager.MuteBinding(name)`. Refuses to mute self. |
| `/muteall` | `handleMuteAll()` | Calls `Manager.MuteAllExcept(selfAdapter)` — sender's adapter is never muted |
| `/muteself` | `handleMuteSelf()` | Emits warning first (500ms delay), then `Manager.MuteBinding(self)` |
| `/restart` | `onRestart()` hook | Triggers daemon restart, recovers all muted adapters |
| `/help` | Static text | Lists all commands |

Mute is in-memory only — `persistBinding()` strips the Muted flag before saving.
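
The two rules above (never mute the sender's adapter, never persist the mute flag) can be sketched as follows. The `binding` and `manager` types and the `forPersist` helper are hypothetical simplifications of what `internal/im` does.

```go
package main

import (
	"fmt"
	"sync"
)

// binding is a hypothetical, reduced adapter binding.
type binding struct {
	Name  string
	Muted bool
}

type manager struct {
	mu       sync.Mutex
	bindings map[string]*binding
}

// muteAllExcept mutes every binding other than keep under one lock and
// returns how many bindings changed state.
func (m *manager) muteAllExcept(keep string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	changed := 0
	for name, b := range m.bindings {
		if name == keep || b.Muted {
			continue
		}
		b.Muted = true
		changed++
	}
	return changed
}

// forPersist returns a copy with the in-memory mute flag cleared, so a
// restart always starts with every adapter unmuted.
func forPersist(b binding) binding {
	b.Muted = false
	return b
}

func main() {
	m := &manager{bindings: map[string]*binding{
		"telegram": {Name: "telegram"},
		"slack":    {Name: "slack"},
	}}
	fmt.Println(m.muteAllExcept("telegram"))
	fmt.Println(m.bindings["slack"].Muted, forPersist(*m.bindings["slack"]).Muted)
}
```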

### Provider Error Detection

All provider adapters detect output truncation and policy errors:
- **OpenAI**: `finish_reason=length` returns an error
- **Anthropic**: `stop_reason=max_tokens` or `stop_reason=refusal` returns an error
- **Gemini**: `FinishReason=MAX_TOKENS`, `SAFETY`, `RECITATION`, etc. return an error

Default `max_output_tokens` is 16384 (configurable per endpoint).
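
A table-driven check captures the mapping listed above. This is only a sketch: `checkFinish` and the lowercase provider keys are hypothetical, and the table contains just the reasons named in this document (the Gemini list is noted as incomplete).

```go
package main

import "fmt"

// errorReasons lists finish/stop reasons that are treated as errors.
// Only the values named in this document are included.
var errorReasons = map[string]map[string]bool{
	"openai":    {"length": true},
	"anthropic": {"max_tokens": true, "refusal": true},
	"gemini":    {"MAX_TOKENS": true, "SAFETY": true, "RECITATION": true},
}

// checkFinish returns an error when the provider reports a reason that
// indicates truncated output or a policy stop.
func checkFinish(provider, reason string) error {
	if errorReasons[provider][reason] {
		return fmt.Errorf("%s: response ended with %q", provider, reason)
	}
	return nil
}

func main() {
	fmt.Println(checkFinish("openai", "length"))
	fmt.Println(checkFinish("openai", "stop"))
}
```

Surfacing truncation as an error, rather than returning partial output silently, lets the caller retry or raise `max_output_tokens`.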

## WebUI Architecture

The WebUI subsystem provides an HTTP+WebSocket interface for browser-based interaction with ggcode. It starts in both TUI and daemon modes.

### ChatBridge Interface

```
                    ┌──────────────────────────┐
                    │     webui.Server          │
                    │  ┌──────────────────────┐ │
                    │  │ REST API              │ │
                    │  │ /api/config, sessions │ │
                    │  │ /api/mcp, im, a2a...  │ │
                    │  └──────────────────────┘ │
                    │  ┌──────────────────────┐ │
                    │  │ WebSocket Chat        │ │
                    │  │  ↓ SendUserMessage()  │ │
                    │  │  ↑ Subscribe() events │ │
                    │  └──────┬───────────────┘ │
                    └─────────┼─────────────────┘
                              │ ChatBridge interface
                    ┌─────────┼─────────────────┐
                    │         ▼                  │
          ┌─────────┴──────────┐  ┌─────────────┴──────────┐
          │   DaemonBridge     │  │    TUIChatBridge        │
          │   (daemon mode)    │  │    (TUI mode)           │
          │                    │  │                         │
          │ pendingInterrupts  │  │ program.Send()          │
          │ → agent interrupt  │  │ → bubbletea event loop  │
          │                    │  │ → startAgent /           │
          │ broadcastEvent()   │  │   queuePendingSubmission │
          │ from agent stream  │  │                         │
          │ callback           │  │ BroadcastEvent()        │
          │                    │  │ from TUI stream callback │
          └────────────────────┘  └─────────────────────────┘
```

### REST API Endpoints

| Path | Methods | Description |
|------|---------|-------------|
| `/api/config` | GET | Full configuration (vendors, endpoints, MCP, IM, A2A, general) |
| `/api/config/active` | GET/PUT | Active vendor/endpoint/model selection |
| `/api/vendors` | GET/POST | Vendor list / add vendor |
| `/api/vendors/{id}` | GET/PUT/DELETE | Vendor CRUD |
| `/api/vendors/{id}/endpoints` | GET/POST | Endpoint list / add |
| `/api/vendors/{id}/endpoints/{ep}` | GET/PUT/DELETE | Endpoint CRUD |
| `/api/vendors/{id}/endpoints/{ep}/apikey` | PUT | Set API key |
| `/api/mcp` | GET/POST | MCP servers config / add |
| `/api/mcp/status` | GET | Runtime MCP status |
| `/api/mcp/{name}` | DELETE | Remove MCP server |
| `/api/im` | GET | IM config |
| `/api/im/status` | GET | Runtime IM adapter status |
| `/api/im/adapters/{name}` | POST | IM adapter action (enable/disable/mute) |
| `/api/general` | GET/PUT | General settings (language, mode, iterations) |
| `/api/impersonate` | GET/PUT | Impersonation preset selection |
| `/api/a2a` | GET/PUT | A2A config |
| `/api/a2a/discover` | GET | Discover remote agents |
| `/api/sessions` | GET | List sessions grouped by workspace |
| `/api/sessions/{id}` | GET/DELETE | Session detail / delete |
| `/api/chat/history` | GET | Current chat history (from agent) |
| `/api/chat/ws` | WS | WebSocket chat (send messages, receive streaming events) |
| `/api/restart` | POST | Trigger restart |

### WebSocket Chat Protocol

**Client → Server** (JSON):
```json
{"type": "user_message", "text": "explain goroutines", "images": [...], "files": [...]}
```
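
On the server side, a frame like the one above could be decoded as sketched below. The `clientMessage` struct and `parseClientMessage` function are hypothetical; because the element shapes of `images` and `files` are elided in the example, the sketch keeps them as raw JSON instead of guessing their fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clientMessage mirrors the documented client frame. Attachment element
// shapes are not specified here, so they stay as raw JSON.
type clientMessage struct {
	Type   string            `json:"type"`
	Text   string            `json:"text"`
	Images []json.RawMessage `json:"images,omitempty"`
	Files  []json.RawMessage `json:"files,omitempty"`
}

// parseClientMessage decodes one frame and rejects unknown types.
func parseClientMessage(data []byte) (clientMessage, error) {
	var m clientMessage
	if err := json.Unmarshal(data, &m); err != nil {
		return m, err
	}
	if m.Type != "user_message" {
		return m, fmt.Errorf("unsupported message type %q", m.Type)
	}
	return m, nil
}

func main() {
	m, err := parseClientMessage([]byte(`{"type":"user_message","text":"explain goroutines"}`))
	fmt.Println(m.Text, err)
}
```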

**Server → Client** (JSON, event types):

| Event | Fields | Description |
|-------|--------|-------------|
| `user_ack` | `text`, `image_count`, `file_names` | Confirms message received |
| `text_delta` | `text` | Streaming text chunk |
| `tool_call_chunk` | `id`, `name`, `arguments_delta` | Partial tool call |
| `tool_call` | `id`, `name`, `arguments` | Complete tool call |
| `tool_result` | `name`, `result` | Tool execution result |
| `error` | `error` | Agent error |
| `done` | `usage` | Stream complete with token usage |

### Concurrency Safety

1. **WebSocket**: Per-connection write goroutine with buffered channel (256). Read/write fully separated.
2. **DaemonBridge.SendUserMessage**: TOCTOU-safe — cancelFunc check and run-slot claim happen under a single mutex lock.
3. **TUIChatBridge**: No direct agent access. Messages route through bubbletea event loop (`program.Send`), identical to keyboard input.
4. **Broadcast**: Non-blocking sends to subscriber channels. Slow subscribers drop events instead of blocking.
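
The non-blocking broadcast in item 4 comes down to a `select` with a `default` branch. The `hub` type below is a hypothetical sketch, not the `webui.Server` implementation, and the demo uses tiny buffers instead of the documented capacity of 256.

```go
package main

import (
	"fmt"
	"sync"
)

// hub fans events out to buffered subscriber channels.
type hub struct {
	mu   sync.Mutex
	subs []chan string
}

// subscribe registers a new subscriber with the given buffer size.
func (h *hub) subscribe(buffer int) <-chan string {
	ch := make(chan string, buffer)
	h.mu.Lock()
	h.subs = append(h.subs, ch)
	h.mu.Unlock()
	return ch
}

// broadcast sends ev to every subscriber without blocking. A subscriber
// whose buffer is full misses the event; the drop count is returned.
func (h *hub) broadcast(ev string) (dropped int) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, ch := range h.subs {
		select {
		case ch <- ev:
		default:
			dropped++
		}
	}
	return dropped
}

func main() {
	h := &hub{}
	slow := h.subscribe(1) // never drained during the broadcasts
	fast := h.subscribe(4)
	fmt.Println(h.broadcast("text_delta 1")) // both buffers have room
	fmt.Println(h.broadcast("text_delta 2")) // slow subscriber is full
	fmt.Println(<-slow, len(fast))
}
```

The trade-off is that a slow browser tab can miss events, but it can never stall the agent's stream callback or other subscribers.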