# Operating Open Autonomy > **The operator/maintainer how-to doc.** > > 1. [Install & operate](#install--operate) — the model: **runner ⟂ code host**, and the three setups. > 2. [Local-runner quickstart](#local-runner-quickstart) — run the agents on your own machine (against > *either* code host: a local-git board, or auto-merging PRs on GitHub). > 3. [GitHub production rollout](#github-production-rollout) — the checklist before enabling OA on a repo > with the **GitHub Actions** runner. > 4. [Release process](#release-process) — cutting a versioned Open Autonomy release. > > Related: how OA *proves itself* lives in `docs/LIVE_TESTING_STRATEGY.md` + `docs/PROOF_LEDGER.md`; > the field runbook is `docs/OSS_AGENT_RUNBOOK.md`; the model + substrate design is `docs/SPEC.md`. --- ## Install & operate Installing OA is **two independent choices** — the runner is orthogonal to the code host (see `docs/SPEC.md` → *"Runner vs code host"*): - **Runner** — *where the agents execute*: **GitHub Actions** (hosted jobs) or **local** (your machine, via the termfleet SDK, using your own logged-in coding CLI). - **Code host** — *where the code lives and how a change lands*: **GitHub** (the agent opens a PR; `ci` + an `agent-review` status gate **native auto-merge** — that reviewer is independently *enforced* only on the hosted/scoped-token runner; see the local safety note in step 5) or **local-git** (a tracker board on disk, PR-free — no GitHub at all). A profile declares which combinations it supports (`targets` × `codeHost`). The three setups people use: | Setup | Runner | Code host | Profile | Read | |---|---|---|---|---| | **Hosted** | GitHub Actions | GitHub | `self-driving` · `simple-gh-sdlc` | [GitHub production rollout](#github-production-rollout) | | **Local agents → GitHub PRs** | local | GitHub | `simple-gh-sdlc` | [Local-runner quickstart](#local-runner-quickstart) → *GitHub code host* | | **Fully local** | local | local-git | `simple-sdlc` (or `hello`) | [Local-runner quickstart](#local-runner-quickstart) → *local-git code host* | The **runner steps are identical** for both local setups (prereqs → termfleet → compile → run the loop); only **how you feed it work and how a change lands** differ by code host. That split is exactly steps 1–4 (shared) vs step 5 (per code host) below. > Installing onto an **existing** repo is an *overlay*: `simple-gh-sdlc` / `simple-sdlc` ship only > OA-specific files (`scripts/`, `.claude/skills/`, `scheduler/`, `.open-autonomy/`, `standards/`, > `.github/workflows/merge.yml`), so `compile … .` is purely additive — it does **not** generate a > `package.json`, `README`, or `.gitignore` over yours. You still merge the runner's deps into your repo > (`npm install termfleet`, `npm install -D ztrack`) — step 1 below. The OA files are **committed** to the > repo (the agents run in git worktrees, which only see committed files — it's how OA maintains itself). > > **Letting an agent do the install?** Point it at [`docs/INSTALL-AGENT.md`](./INSTALL-AGENT.md) — a > guided detect → ask → execute → verify playbook addressed to the installing agent. --- ## Local-runner quickstart Run the **agents on your own machine** — as local terminal sessions via [termfleet](https://github.com/volter-ai/termfleet), using *your own* logged-in coding CLI (Claude Code or Codex) for model access (no bounded-token proxy, no sponsor budget — **you pay your model provider directly**). This is the **local runner**. It works against **either code host**: - **local-git** (`simple-sdlc`) — fully closed-source: work comes from a local [ztrack](https://github.com/volter-ai/ztrack) board on disk, a change is `done` PR-free, **no GitHub at all**. The path for a private project you won't push. - **GitHub** (`simple-gh-sdlc`) — agents run on your machine, but a change lands as an **auto-merging PR on GitHub** gated by `ci` + an `agent-review` status (on local the agents share your token, so that status is *not* an independent reviewer — your CI is the real gate; see step 5's safety note). The path for a trusted repo whose agents you want on your own machine and model subscription. **Steps 1–4 (prereqs → termfleet → compile → run) are identical** for both; only **step 5 — how you feed work and how a change lands** — differs by code host. If you just want to *see the loop fire* with zero tracker setup, use the `hello` profile (a single cron agent). ### 1. Prerequisites Install these once, on the machine that will run the loop: | What | Why | Install | |---|---|---| | **Node.js 22.18+** | runs the CLI, the loop driver, and termfleet; the installed ztrack `.mts` preset needs TS type-stripping (Node ≥ 22.18) | nodejs.org / your version manager | | **tmux** | termfleet's local provider runs sessions in tmux | `brew install tmux` (macOS) / your package manager | | **termfleet** | the local runner drives it through its **SDK** (a `node_modules` dependency, not a PATH binary) | in your repo: `npm install termfleet` (then `npx termfleet …` runs its console/provider CLI) | | **A coding agent CLI, logged in** | the agent's model access | Claude Code (default) **or** Codex — see next step | | **bun** (for `simple-sdlc`) | the orchestrator dispatches workers via `bun scripts/runner.ts launch …` | `curl -fsSL https://bun.sh/install \| bash` — not needed for the `hello` demo (Node-only) | #### Sign in to your coding agent The loop launches **Claude Code** by default. termfleet drives whatever CLI you point it at, and that CLI must already be **installed on PATH and signed in** — the launch fails after ~45s against a missing or logged-out CLI. So sign in *first*: - **Claude Code (default):** run `claude`, then `/login` (or set `ANTHROPIC_API_KEY`). Verify with `claude --version`. - **Codex (alternative):** run `codex login`. To use Codex instead of Claude, set `TERMFLEET_AGENT=codex` when you start the loop (step 4). There is no separate "open-autonomy login" — the agent uses your coding CLI's own session, so your local subscription/key is what's billed. ### 2. Start termfleet (console + a local provider) The local runner (the termfleet SDK's `ProviderClient`) talks to a termfleet **console** + **provider** running on your machine. Start both once (they stay up in the background); the open-autonomy runner auto-discovers them via the SDK's `resolveDefaultProvider` — no URL config needed (set `TERMFLEET_PROVIDER_URL` only to pin a specific one). ```bash npx termfleet console serve --name dev --port 7373 & npx termfleet provider serve --kind virtual-tmux --prefix dev --count 1 --port 7402 & ``` Sanity-check that a session can launch (this is the same call the loop makes): ```bash npx termfleet claude new --prompt "say hello" npx termfleet sessions recent --live ``` If that prints a session, termfleet + your agent CLI are wired correctly. Open for the optional visual console. > termfleet is loopback-only by default and makes no outbound requests unless you configure a > registry — fine for a single closed-source machine. See termfleet's `SECURITY.md` before exposing it. ### 3. Compile a profile into your repo From inside the repo you want the loop to maintain — pick the profile for your **code host**: ```bash cd my-repo # Minimal demo: one cron "greeter" agent, no tracker, no code host. npx open-autonomy compile hello local . # local-git code host — fully local, PR-free, no GitHub (the four-agent PM/draft/develop/review loop). npx open-autonomy compile simple-sdlc local . # GitHub code host — agents run locally, changes land as auto-merging PRs on GitHub. npx open-autonomy compile simple-gh-sdlc local . ``` This is an **overlay**: it lays down `scheduler/` (the loop driver + schedule), `scripts/` (the local runner), the agent skills under `.claude/skills/` + `.codex/skills/`, `standards/`, and `.open-autonomy/` — and, for the GitHub code host, `.github/workflows/merge.yml` (the auto-merge reconcile). It generates **no** `package.json` / `README` / `.gitignore`, so it's safe to run over an existing repo. No clone of this repo is required — `npx open-autonomy …` runs the published CLI. (`self-driving` also compiles to `local`; `simple-sdlc` is local-git only.) ### 4. Run the loop ```bash node scheduler/run.mjs --once # fire one tick, then exit — use this to verify end-to-end node scheduler/run.mjs # run continuously, sleeping between ticks ``` Each tick fires the cron agents in `scheduler/schedule.json` (for `simple-sdlc` that's the PM; the PM then launches the workers). To use Codex instead of Claude Code: ```bash TERMFLEET_AGENT=codex node scheduler/run.mjs ``` With the `hello` profile, the first `--once` tick launches a `greeter` session — you'll see it in `termfleet sessions recent --live` and the console. That confirms the whole local path works. ### 5. Give the loop work — by code host The `hello` greeter self-fires on cron and needs no input. A real loop needs a backlog. **How you feed work and how a change lands depends on the code host** you compiled in step 3. #### local-git code host (`simple-sdlc`) — PR-free, no GitHub The agents read work from a local **ztrack** board on disk: ```bash npm install -D ztrack # a PROJECT dep, not -g: the installed validation preset # `import`s `ztrack/preset-kit`, so it must resolve from the # repo — a global/npx install fails `ztrack check`. npx ztrack init --preset simple-sdlc # the PR-free dev preset (the `default`); no remote needed npx ztrack issue create # add a work item (repeat for each task) ``` The `simple-sdlc` preset is **PR-free**: an issue is `done` once every AC is passed with commit-evidence and the reviewer approves — no pull request, so it works on a private repo with no remote. On its next tick the PM sweeps the board, enforces WIP, and **launches** the matching worker (draft/develop/review) for the next eligible issue, moving it `draft → ready → in-progress → in-review → done`. The work item reaches each worker as `$ZTRACK_ISSUE`. You add and inspect work entirely through `ztrack` — never GitHub. > Using a different tracker? `simple-sdlc`'s agents are just skills that call `ztrack`. Fork the profile > (`profiles/simple-sdlc/skills/*`), point the agents at your CLI, then compile your profile dir. #### GitHub code host (`simple-gh-sdlc`) — auto-merging PRs, agents on your machine Here the board is **GitHub issues** and a change lands as an **auto-merging PR**. Wire the gate **once** (use your repo's package manager — `npm`/`bun`/`pnpm` — for the installs; examples show `npm`): ```bash # a) the tracker, linked to GitHub Issues (GitHub is the source of truth) npm install -D ztrack # or: bun add -d ztrack npx ztrack init --preset simple-gh-sdlc --sync github --repo / # b) require the gate in branch protection (NOT auto-merge yet — that comes after a supervised first merge, # step d). The contexts are the CI check-run NAMES that run on PULL REQUESTS — read them from an OPEN PR (a # MERGED PR's head also carries push-run checks), excluding release-only/push-only/path-filtered jobs. gh api -X PUT "repos///branches//protection" --input - <<'JSON' { "required_status_checks": { "strict": false, "contexts": ["", "agent-review"] }, "enforce_admins": true, "required_pull_request_reviews": null, "restrictions": null } JSON # verify protection took (errors if it didn't — e.g. free private plan): gh api "repos///branches//protection/required_status_checks/contexts" --jq '.' # c) add a Ready issue (open + `ready` label + assignee + ACs in the body), then sync npx ztrack issue create # ... ; then: gh issue edit --add-label ready ``` Then run the loop and **watch the first PR merge under supervision** — once its gate is green, merge it yourself (`gh pr merge --squash`). **Only after that supervised first merge**, arm native auto-merge so later PRs land on their own: ```bash # d) ongoing operation — arm auto-merge ONLY after you watched the first PR merge cleanly: gh repo edit / --enable-auto-merge ``` > **You must have a CI check that runs on PRs.** If your repo has none, `contexts` would be just > `["agent-review"]` — and on a local runner the agents share *your* token (no scoped split), so the > reviewer's `agent-review` is **not independent** of the proposer. With no real CI in the gate you would > be auto-merging agent-written code with no independent check at all. Add a real CI check first, or don't > enable auto-merge. `enforce_admins:true` is deliberate: it binds the gate even to the admin identity the > local agents run as (with it `false`, an agent could `gh pr merge --admin` / push to the branch and > bypass the gate). See the safety section in [`INSTALL-AGENT.md`](./INSTALL-AGENT.md#the-merge-boundary-and-what-it-does-not-give-you-on-local). On its next tick the PM sweeps GitHub, and for a `ready` issue **launches the developer in an isolated worktree** (`runner.ts launch developer --ref --branch agent/issue-`). The developer commits, the runner opens the PR, the **reviewer** posts `agent-review`, and once **your CI (`ci`/`build`/…) + `agent-review`** are green the PR is mergeable — **merge the first one yourself** to prove the gate, then arm auto-merge (above) so later PRs land via **native auto-merge**. With `enforce_admins:true` no agent bypasses the gate — but **your CI is the real boundary**: on a local runner the agents share your token, so the reviewer's `agent-review` is not independent of the proposer. **Require your real CI**; with only `agent-review` required you'd be auto-merging on the agents' own (same-token) say-so. > Identity: for a *technically enforced* independent reviewer, run on the hosted (scoped-token) substrate, > or give the reviewer a **separate bot identity** (a GitHub App / dedicated account) — on local under one > token, propose and review share the same credential, so `agent-review` self-blesses. > **Installing with an agent?** [`docs/INSTALL-AGENT.md`](./INSTALL-AGENT.md) is a guided playbook > addressed to the installing agent — detect the repo, ask the human only the judgment calls (the gate, > identity, the first issue), run this overlay, and **verify the loop merges before declaring done**. ### What depends on the code host vs the runner The agent loop is the same everywhere; a few controls vary by **axis** (runner ⟂ code host): - **Native auto-merge** is a **GitHub code-host** feature — available on a **local runner** too (`simple-gh-sdlc local`), *not* a GitHub-Actions-only thing. A **local-git** code host (`simple-sdlc`) has no PRs: the `review` agent gates the change on the tracker board and you take the agent's branch. - **`/agent` operator commands** (pause/resume/status/retry via issue comments) are a **GitHub code-host** control plane. On a local-git board you steer with termfleet directly (`termfleet sessions recent`, `termfleet get|wait`, kill from the console) and the board. - **The bounded model-token proxy + sponsor budget** is a **GitHub-Actions runner** feature. On the **local runner** there is no spend cap from open-autonomy (either code host) — your model provider bills you; watch `termfleet sessions recent --live` and stop the loop to bound spend. ### Troubleshooting - **`createAgentWindow returned no terminalId …`** — the console/provider (step 2) aren't running, or your agent CLI isn't installed/logged in (step 1). Re-run the `npx termfleet claude new --prompt "hi"` sanity check in isolation. - **The loop does nothing each tick (`simple-sdlc`)** — there's no eligible work. Confirm `ztrack issue view` shows items and that they're in a state the PM can advance. - **Wrong agent launches** — set `TERMFLEET_AGENT=claude` or `=codex` explicitly; the default is `claude`. - **Pin a specific provider** — if auto-discovery picks the wrong one, set `TERMFLEET_PROVIDER_URL` to your provider's URL before running the loop. --- ## GitHub production rollout Use this checklist before enabling open-autonomy on a repository. ### Required Configuration Repository variables: - `MODEL_PROXY_URL` - `MODEL_PROXY_OIDC_AUDIENCE` - `PUBLIC_AGENT_MODELS` - `PUBLIC_AGENT_MODEL` - `PUBLIC_AGENT_TRIAGE_MODEL` - `PUBLIC_AGENT_PM_MODEL` - `PUBLIC_AGENT_REVIEW_MODEL` - `PUBLIC_AGENT_MAX_USD_CENTS` - `PUBLIC_AGENT_TRIAGE_MAX_USD_CENTS` - `PUBLIC_AGENT_PM_MAX_USD_CENTS` - `PUBLIC_AGENT_REVIEW_MAX_USD_CENTS` - `PUBLIC_AGENT_MAX_REQUESTS` - `PUBLIC_AGENT_MAX_DEVELOP_ATTEMPTS` - `PUBLIC_AGENT_MAX_OPEN_AGENT_PRS` - `PUBLIC_AGENT_STALE_NEEDS_INFO_MINUTES` - `PUBLIC_AGENT_PM_LIMIT` - `PUBLIC_AGENT_ALLOWED_PATHS` - `PUBLIC_AGENT_REPO_PAUSED` Repository secrets: - none required for model access: in-cell agents mint and exchange bounded per-run model tokens via GitHub OIDC; no admin token lives in the repo. - `PUBLIC_AGENT_TRIGGER_TOKEN` if PM-triggered comments must use a token with enough permissions to trigger follow-on workflows. Model proxy deployment: - Set provider API keys and model names. - Set `MODEL_PRICES_JSON`. - Choose production limits for global active runs, per-repo active runs, per-actor active runs, per-run spend, per-run request count, and daily spend. - Verify `GET /admin/limits/status` responds (operator-run, with the admin token from the operator's local `.env`; there is no in-repo admin workflow). GitHub repository: - Branch protection requires ci + agent-review; native auto-merge lands reviewed PRs (no agent merges). - Required CI check name matches `ci`. - Actions artifact retention is long enough for operator audits. - Workflow permissions stay capability-separated; do not use `write-all`. - Workflows set `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true`. ### First Public Rollout Policy Start with a narrow allowed surface: - trusted maintainers only for manual `/agent developer` - PM sweep limit of 1-3 issues - conservative `PUBLIC_AGENT_ALLOWED_PATHS` - low per-run spend caps - `PUBLIC_AGENT_REPO_PAUSED=false` only during supervised windows Escalate to humans for security issues, broad architecture changes, unclear requirements, repeated failures, merge conflicts, missing CI, stale CI, and reviewer high-risk verdicts. ### Operator Drills Before opening broader access, verify these in the target repo: - `/agent pause` applies `agent-paused`. - `/agent developer` on a paused issue stops before model minting. - `/agent status` reports labels, open PR, active workflow runs, and active proxy runs. - `/agent resume` clears `agent-paused`. - `/agent retry` reports no infrastructure retry when no failed run exists, or reruns failed jobs without posting a fresh `/agent developer`. - `/agent cancel` cancels active public-agent workflow runs and revokes active proxy runs for the issue. - `Model Proxy Admin` `status` shows active-run saturation and daily counters. ### Private Trial Evidence These live trial runs are the baseline acceptance evidence as of 2026-06-16: - Phase 5 review/merge hardening: run `27632534829` merged PR #67 for issue #66. - Phase 6 evidence quality: run `27632884925` merged PR #69 for issue #68, with `run-receipt.json` and `transcript.md` promoted into `agent-sessions/run_966fe8ea-2e22-4752-89dd-25db8fcd0e82/`. - Phase 7 operator controls: issue #70 live-tested `/agent pause`, a paused `/agent developer` policy block before model minting, `/agent status`, and `/agent resume`. - Push CI for operator controls: run `27633520672`. - Push CI for production rollout checks: run `27633852289`. ### Go/No-Go Go only when all of these are true: - `bun run check` passes locally and in GitHub Actions. - A fresh low-risk issue completes end to end. - A paused issue does not dispatch new work. - PM sweep on stale backlog launches no duplicate work. - Proxy saturation causes skip/backpressure, not workflow failure. - Risky or unclear issues produce human-required escalation instead of a PR. --- ## Release process Open Autonomy releases are versioned by `VERSION` and `.open-autonomy/version.json`. Release checklist: 1. Update `VERSION`, `.open-autonomy/version.json`, and `CHANGELOG.md`. 2. Run `bun run check`. 3. Run planner and preflight workflows on `main`. 4. Compile into a clean directory (`bun bin/open-autonomy.ts compile profiles/self-driving gh-actions `) and run its `bun run check`. 5. Verify the committed release evidence in [`docs/PROOF_LEDGER.md`](./PROOF_LEDGER.md). 6. Tag the release as `vX.Y.Z`. 7. Record migration notes for template changes in the changelog. Generated or upgraded repositories should keep their local `.open-autonomy/version.json` so runs can record the Open Autonomy version and profile used for each session.