--- name: loki-mode description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product with minimal human intervention. Requires --dangerously-skip-permissions flag. --- # Loki Mode v7.5.17 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.** **Spec in, product out.** A "spec" is whatever describes the work: a Markdown PRD, a GitHub issue, an OpenAPI doc, a Jira ticket -- a PRD is one form of spec. **Multi-provider (stable since v5.0.0):** Claude/Codex/Gemini/Cline/Aider with abstract model tiers and degraded mode for non-Claude providers. See `skills/providers.md`. **Current track (v7.5.x):** Phase 1 RARV-C closure -- real provider judges, gate-failure flock, synthetic PRD e2e, status `--json`, dead code cleanup. **Runtime migration in progress:** A bash-to-Bun migration is underway on the `feat/bun-migration` branch. The first phase (shipped in v7.3.0) routes a small set of read-only commands -- `version`, `status`, `stats`, `doctor`, `provider show/list`, `memory list/index` -- through a Bun runtime via `bin/loki`. Every other command remains on the Bash runtime (`autonomy/loki`). Rollback is available with `LOKI_LEGACY_BASH=1`. See `UPGRADING.md` and `docs/architecture/ADR-001-runtime-migration.md` for the full plan. --- ## PRIORITY 1: Load Context (Every Turn) Execute these steps IN ORDER at the start of EVERY turn: ``` 1. IF first turn of session: - Read skills/00-index.md - Load 1-2 modules matching your current phase - Register session: Write .loki/session.json with: {"pid": null, "startedAt": "", "provider": "", "invokedVia": "skill", "status": "running", "updatedAt": ""} 2. Read .loki/state/orchestrator.json - Extract: currentPhase, tasksCompleted, tasksFailed 3. Read .loki/queue/pending.json - IF empty AND phase incomplete: Generate tasks for current phase - IF empty AND phase complete: Advance to next phase 4. Check .loki/PAUSE - IF exists: Stop work, wait for removal. Check .loki/STOP - IF exists: End session, update session.json status to "stopped". 5. EVERY TURN: Update .loki/session.json "updatedAt" field to current ISO timestamp. This keeps the dashboard aware the skill session is alive. Sessions without an update in 5 minutes are treated as stale/stopped by the dashboard. ``` --- ## PRIORITY 2: Execute (RARV Cycle) Every action follows this cycle. No exceptions. ``` REASON: What is the highest priority unblocked task? | v ACT: Execute it. Write code. Run commands. Commit atomically. | v REFLECT: Did it work? Log outcome. | v VERIFY: Run tests. Check build. Validate against spec. | +--[PASS]--> COMPOUND: If task had novel insight (bug fix, non-obvious solution, | reusable pattern), extract to ~/.loki/solutions/{category}/{slug}.md | with YAML frontmatter (title, tags, symptoms, root_cause, prevention). | See skills/compound-learning.md for format. | Then mark task complete. Return to REASON. | +--[FAIL]--> Capture error in "Mistakes & Learnings". Rollback if needed. Retry with new approach. After 3 failures: Try simpler approach. After 5 failures: Log to dead-letter queue, move to next task. ``` --- ## PRIORITY 3: Autonomy Rules These rules guide autonomous operation. Test results and code quality always take precedence. | Rule | Meaning | |------|---------| | **Decide and act** | Make decisions autonomously. Do not ask the user questions. | | **Keep momentum** | Do not pause for confirmation. Move to the next task. | | **Iterate continuously** | There is always another improvement. Find it. | | **ALWAYS verify** | Code without tests is incomplete. Run tests. **Never ignore or delete failing tests.** | | **ALWAYS commit** | Atomic commits after each task. Checkpoint progress. | | **Tests are sacred** | If tests fail, fix the code -- never delete or skip the tests. A passing test suite is a hard requirement. | --- ## Model Selection **Default since v5.3.0 (reaffirmed in v7.5.13):** Haiku disabled for quality. Use `--allow-haiku` or `LOKI_ALLOW_HAIKU=true` to enable. | Task Type | Tier | Claude (default) | Claude (--allow-haiku) | Codex (GPT-5.3) | Gemini | |-----------|------|------------------|------------------------|------------------|--------| | Spec analysis, architecture, system design | **planning** | opus | opus | effort=xhigh | thinking=high | | Feature implementation, complex bugs | **development** | opus | sonnet | effort=high | thinking=medium | | Code review (planned: 3 parallel reviewers) | **development** | opus | sonnet | effort=high | thinking=medium | | Integration tests, E2E, deployment | **development** | opus | sonnet | effort=high | thinking=medium | | Unit tests, linting, docs, simple fixes | **fast** | sonnet | haiku | effort=low | thinking=low | **Parallelization rule (Claude only):** Launch up to 10 agents simultaneously for independent tasks. **Degraded mode (Codex/Gemini/Cline/Aider):** No parallel agents or Task tool. Codex has MCP support. Runs RARV cycle sequentially. See `skills/model-selection.md`. **Git worktree parallelism:** For true parallel feature development, use `--parallel` flag with run.sh. See `skills/parallel-workflows.md`. **Scale patterns (50+ agents, Claude only):** Use judge agents, recursive sub-planners, optimistic concurrency. See `references/cursor-learnings.md`. --- ## Phase Transitions ``` BOOTSTRAP ──[project initialized]──> DISCOVERY DISCOVERY ──[spec analyzed, requirements clear]──> ARCHITECTURE ARCHITECTURE ──[design approved, specs written]──> DEEPEN_PLAN (standard/complex only) DEEPEN_PLAN ──[plan enhanced by 4 research agents]──> INFRASTRUCTURE INFRASTRUCTURE ──[cloud/DB ready]──> DEVELOPMENT DEVELOPMENT ──[features complete, unit tests pass]──> QA QA ──[all tests pass, security clean]──> DEPLOYMENT DEPLOYMENT ──[production live, monitoring active]──> GROWTH GROWTH ──[continuous improvement loop]──> GROWTH ``` **Transition requires:** All phase quality gates passed. No Critical/High/Medium issues. --- ## Context Management **Your context window is finite. Preserve it.** - Load only 1-2 skill modules at a time (from skills/00-index.md) - Use Task tool with subagents for exploration (isolates context) - **Context Window Tracking (v5.40.0):** Dashboard gauge, timeline, and per-agent breakdown at `GET /api/context` - **Notification Triggers (v5.40.0):** Configurable alerts when context exceeds thresholds, tasks fail, or budget limits hit. Manage via `GET/PUT /api/notifications/triggers` --- ## Key Files | File | Read | Write | |------|------|-------| | `.loki/session.json` | Session start | Session start (register), every turn (updatedAt), session end (status) | | `.loki/state/orchestrator.json` | Every turn | On phase change | | `.loki/queue/pending.json` | Every turn | When claiming/completing tasks | | `.loki/queue/current-task.json` | Before each ACT | When claiming task | | `.loki/specs/openapi.yaml` | Before API work | After API changes | | `skills/00-index.md` | Session start | Never | | `.loki/memory/index.json` | Session start | On topic change | | `.loki/memory/timeline.json` | On context need | After task completion | | `.loki/memory/token_economics.json` | Never (metrics only) | Every turn | | `.loki/memory/episodic/*.json` | On task-aware retrieval | After task completion | | `.loki/memory/semantic/patterns.json` | Before implementation tasks | On consolidation | | `.loki/memory/semantic/anti-patterns.json` | Before debugging tasks | On error learning | | `.loki/queue/dead-letter.json` | Session start | On task failure (5+ attempts) | | `.loki/signals/HUMAN_REVIEW_NEEDED` | Never | When human decision required | | `.loki/state/checkpoints/` | After task completion | Automatic + manual via `loki checkpoint` | --- ## Module Loading Protocol (Skills) This protocol governs **skill module** loading -- task-scoped instruction files in `skills/`. It is distinct from the Memory System Progressive Disclosure (see below), which governs persistent **memory layers** in `.loki/memory/`. ``` 1. Read skills/00-index.md (once per session) 2. Match current task to module: - Writing code? Load model-selection.md - Running tests? Load testing.md - Code review? Load quality-gates.md - Debugging? Load troubleshooting.md - Legacy healing? Load healing.md - Deploying? Load production.md - Parallel features? Load parallel-workflows.md - Architecture planning? Load compound-learning.md (deepen-plan) - Post-verification? Load compound-learning.md (knowledge extraction) 3. Read the selected module(s) 4. Execute with that context 5. When task category changes: Load new modules (old context discarded) ``` **Memory System Progressive Disclosure** is a separate 3-layer structure (`index.json` -> `timeline.json` -> `episodic/*.json`) for retrieving past episodes/patterns. See `skills/memory.md` and `references/memory-system.md`. --- ## Invocation **Unified entry point (v6.84.0):** `loki start [SPEC|ISSUE-REF]` auto-detects whether the input is a PRD file, an issue URL, an issue number, or another spec format (e.g. OpenAPI). No need to pick between `loki start` and `loki run` -- the single command handles all cases. ```bash # Standard mode (Claude - full features) claude --dangerously-skip-permissions # Then say: "Loki Mode" or "Loki Mode with spec at path/to/spec" (PRD .md/.json, OpenAPI .yaml, etc.) # Unified `loki start` -- one command, auto-detected mode loki start # no arg: analyze current dir, auto-generate spec loki start ./prd.md # PRD mode (.md/.json/.txt/.yaml) -- a PRD is one form of spec loki start ./openapi.yaml # SPEC mode (OpenAPI doc treated as the spec) loki start owner/repo#123 # ISSUE mode (GitHub specific repo) loki start https://github.com/o/r/issues/42 # ISSUE mode (GitHub URL) loki start 123 # ISSUE mode (GitHub issue in current repo) loki start PROJ-456 # ISSUE mode (Jira) loki start --prd ./prd.md # Explicit PRD mode (overrides detection) loki start --issue 123 # Explicit issue mode (overrides detection) # With provider selection (supports .md and .json PRDs) loki start --provider claude ./prd.md # Default, full features loki start --provider codex ./prd.json # GPT-5.3 Codex, degraded mode loki start --provider gemini ./prd.md # Gemini 3 Pro, degraded mode loki start --provider cline ./prd.md # Cline CLI, degraded mode loki start --provider aider ./prd.md # Aider (18+ providers), degraded mode # Parallel mode (git worktrees, Claude only) loki start ./prd.md --parallel loki start 123 --ship # Issue -> PR -> auto-merge # Legacy: `loki run ` still works but prints a deprecation notice. # It is an alias for `loki start ` and will be removed in a future major. ``` **Provider capabilities:** - **Claude**: Opus 4.6, 1M context (beta), 128K output, adaptive thinking, agent teams, full features (Task tool, parallel agents, MCP) - **Codex**: GPT-5.3, 400K context, 128K output, MCP support, --full-auto mode, degraded (sequential only, no Task tool) - **Gemini**: Degraded mode (sequential only, no Task tool, 1M context) - **Cline**: Multi-provider CLI, degraded mode (sequential only, no Task tool) - **Aider**: 18+ provider backends, degraded mode (sequential only, no Task tool) --- ## Human Intervention (v3.4.0) When running with `autonomy/run.sh`, you can intervene: | Method | Effect | |--------|--------| | `touch .loki/PAUSE` | Pauses after current session | | `echo "instructions" > .loki/HUMAN_INPUT.md` | Injects directive (requires `LOKI_PROMPT_INJECTION=true`) | | `touch .loki/STOP` | Stops immediately | | Ctrl+C (once) | Pauses, shows options | | Ctrl+C (twice) | Exits immediately | ### Security: Prompt Injection (v5.6.1) **DISABLED by default** for enterprise security. Prompt injection via `HUMAN_INPUT.md` is blocked unless explicitly enabled. ```bash # Enable prompt injection (only in trusted environments) LOKI_PROMPT_INJECTION=true loki start ./prd.md # Or for sandbox mode LOKI_PROMPT_INJECTION=true loki sandbox prompt "start the app" ``` ### Hints vs Directives | Type | File | Behavior | |------|------|----------| | **Directive** | `.loki/HUMAN_INPUT.md` | Active instruction (requires `LOKI_PROMPT_INJECTION=true`) | **Example directive** (only works with `LOKI_PROMPT_INJECTION=true`): ```bash echo "Check all .astro files for missing BaseLayout imports." > .loki/HUMAN_INPUT.md ``` --- ## Complexity Tiers (v3.4.0) Auto-detected or force with `LOKI_COMPLEXITY`: | Tier | Phases | When Used | |------|--------|-----------| | **simple** | 3 | 1-2 files, UI fixes, text changes | | **standard** | 6 | 3-10 files, features, bug fixes | | **complex** | 8 | 10+ files, microservices, external integrations | --- ## Managed Agents Integration (v7.2.0) Opt-in integration with Claude Managed Agents (released Apr 2026). Gives Loki cross-project audited memory and real multiagent councils. Features are BAKED INTO existing RARV-C and council flows -- no new commands to learn. **All flags default false.** Default behavior is identical to v7.2.0. | Flag | Purpose | Status | |------|---------|--------| | `LOKI_MANAGED_AGENTS` | Parent gate; required for every managed path | stable | | `LOKI_MANAGED_MEMORY` | REASON augment + REFLECT shadow-write from `.loki/memory/` to Managed Agents store | stable (tested with fakes) | | `LOKI_MANAGED_MEMORY_HYDRATE` | Session-boot pull of semantic patterns + skills from store | stable (tested with fakes) | | `LOKI_EXPERIMENTAL_MANAGED_AGENTS` | Umbrella for multiagent session path | RESEARCH PREVIEW | | `LOKI_EXPERIMENTAL_MANAGED_REVIEW` | Managed code-review council via `callable_agents` | RESEARCH PREVIEW | | `LOKI_EXPERIMENTAL_MANAGED_COUNCIL` | Managed completion council via `callable_agents` | RESEARCH PREVIEW | Fail-fast: child-on + parent-off exits 2 with clear error. API unreachable falls back to local path with a `managed_agents_fallback` event to `.loki/managed/events.ndjson`. No retry storm. **Flip-on order (recommended):** 1. `LOKI_MANAGED_AGENTS=true LOKI_MANAGED_MEMORY=true` (memory mirror). 2. Add `LOKI_MANAGED_MEMORY_HYDRATE=true` after one-week soak. 3. Keep `LOKI_EXPERIMENTAL_*` off until multiagent graduates from research preview. **NOT TESTED against live Anthropic API.** Automated CI uses `memory/managed_memory/fakes.py`. Beta header pinned to `managed-agents-2026-04-01`. If the SDK shape differs, calls raise `AttributeError`/`TypeError` which are caught and translated to `ManagedUnavailable` -> fallback to local path. See `skills/memory.md` for the full integration guide. --- ## Phase 1 RARV-C Closure (v7.5.x) The current track wires real evidence into RARV-C feedback. Documented here and in `loki internal --help`: | Env Var | Effect | |---------|--------| | `LOKI_INJECT_FINDINGS=true` | Injects council findings + gate failures into the next REASON prompt | | `LOKI_OVERRIDE_COUNCIL=true` | Promotes real provider judges over fakes when available | | `LOKI_AUTO_LEARNINGS=true` | Auto-extracts learnings into semantic memory after VERIFY | | `LOKI_HANDOFF_MD=true` | Emits a `handoff.md` continuity doc at session boundaries | See `references/core-workflow.md` for the full RARV-C contract. --- ## Concurrency and Security Hardening (v7.5.7 - v7.5.13) Three back-to-back patches closed cross-process and security gaps. No user-facing behavior change on the default flow; verify via the cited paths. - **Cross-process file locks** on append-or-rewrite state, so parallel runs / dashboard / MCP do not corrupt shared files: gate counter (`autonomy/run.sh` gate-counter writes), task queues (`autonomy/run.sh` queue read-modify-write), checkpoint index (`autonomy/run.sh` checkpoint index updates), `events.jsonl` append (event emission paths in `events/emit.sh` and `autonomy/run.sh`), human intervention signal files (`autonomy/run.sh:check_human_intervention()` at line ~8059 / 7897 per state-machine doc). - **MCP path validation** -- file/path arguments to `mcp/server.py` tools are normalized and rejected if they escape the project root (path-traversal fix from v7.5.8). - **Dashboard auth** now required on `/api/memory/*`, `/api/learning/*`, and `/api/status` in `dashboard/server.py` (previously unauthenticated read paths). - **Bash quoting hardening** across `autonomy/run.sh` and `autonomy/loki` -- variable expansions inside command substitution and `[ ]` tests quoted to prevent word-splitting on paths with spaces. See `CHANGELOG.md` entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and reviewer sign-off. --- ## Implemented Features | Feature | Added | Notes | |---------|-------|-------| | Multi-provider support (5 providers) | v5.0.0 | claude, codex, gemini, cline, aider -- see `providers/` | | CONTINUITY.md working memory | v5.35.0 | Auto-managed by run.sh, updated each iteration | | Quality gates 3-reviewer system | v5.35.0 | 5 specialist reviewers in `skills/quality-gates.md`; execution in run.sh | | Memory System (episodic/semantic/procedural) | v5.15.0 | Full implementation in `memory/` | | Context Window Tracking | v5.40.0 | Dashboard gauge, per-agent breakdown at `GET /api/context` | | Notification Triggers | v5.40.0 | `GET/PUT /api/notifications/triggers` | | GitHub integration | v5.42.2 | Import, sync-back, PR creation, export. CLI: `loki github`, API: `/api/github/*` | | Legacy System Healing | v6.67.0 | `loki heal ` -- friction-as-semantics, characterization tests | | Unified `loki start` | v6.84.0 | Auto-detects spec (PRD, OpenAPI, etc.) vs issue input | | Managed Agents (memory mirror) | v7.2.0 | Opt-in via `LOKI_MANAGED_AGENTS` -- see Managed Agents section | | Bun runtime (Phase 1) | v7.3.0 | Read-only commands routed through `bin/loki`; `LOKI_LEGACY_BASH=1` to revert | | Phase 1 RARV-C closure | v7.5.x | Findings injection, real judges, auto-learnings, handoff.md | ## Planned / In-Progress Features | Feature | Target | Notes | |---------|--------|-------| | Bun runtime (Phase 2+) | TBD | Migrate write-path commands; tracked on `feat/bun-migration` | | Managed Agents multiagent path | TBD | `LOKI_EXPERIMENTAL_MANAGED_*` flags -- RESEARCH PREVIEW, not on live API | | Benchmarks (HumanEval, SWE-bench) | TBD | Runner scripts and datasets exist in `benchmarks/`; no published results | | `loki run` removal | next major | Currently a deprecated alias for `loki start` | ## Deprecated | Item | Deprecated In | Notes | |------|---------------|-------| | `loki run ` | v6.84.0 | Alias for `loki start`. Will be removed in next major. | | VSCode extension (`vscode-extension/`) | v7.2.0 | No longer actively maintained; dashboard web UI is the supported front-end. | --- **v7.5.17 | [Autonomi](https://www.autonomi.dev/) flagship product | ~260 lines core**