# AGENTS.md — Aegis Lead Agent Configuration This file defines the lead-agent operating model for Aegis feature work in Claude-driven sessions. It complements `.claude/CLAUDE.md` and does not replace it. --- ## PROJECT CONTEXT - **Repository shape**: single-package Rust crate named `aegis` at the repo root. This is **not** a Cargo workspace. - **Language**: Rust (edition **2024**) - **Package root**: `Cargo.toml` - **Primary targets**: - library API: `src/lib.rs` - CLI / shell proxy entrypoint: `src/main.rs` - **Core mechanism**: `$SHELL` proxy / shell-wrapper interceptor. Aegis receives raw shell commands first, parses them, classifies risk with a two-pass scanner (`aho-corasick` fast path + `regex` verification), prompts the user for `Warn` / `Danger`, hard-blocks `Block`, and creates pre-execution snapshots for dangerous commands when configured. - **Async runtime**: `tokio 1` (`process`, `fs`, `rt` features). Async is used for subprocess-driven snapshot plugins, while the interception scanner stays synchronous. - **UI layer**: `crossterm 0.28` confirmation dialog in `src/ui/confirm.rs` - **Pattern engine**: - parser: `src/interceptor/parser.rs` - scanner: `src/interceptor/scanner.rs` - pattern loading: `src/interceptor/patterns.rs` - **Snapshot system**: - registry + trait: `src/snapshot/mod.rs` - git snapshots: `src/snapshot/git.rs` - docker snapshots: `src/snapshot/docker.rs` - **Config format**: TOML, layered from `.aegis.toml`, `~/.config/aegis/config.toml`, and built-in defaults - **Audit log format**: append-only JSONL rooted at `~/.aegis/audit.jsonl`, with RFC 3339 timestamps, per-process sequence numbers, and optional size-based gzip rotation - **Error handling**: typed errors via `AegisError` (`src/error.rs`) across core modules; `anyhow` exists as a dependency but is not the current architectural contract - **Test runner**: `rtk cargo test` - **Benchmarks**: `rtk cargo bench --bench scanner_bench` - **Lint / format**: - `rtk cargo fmt --check` - `rtk cargo clippy -- -D warnings` - no repository-local `rustfmt.toml` - no repository-local `clippy.toml` - **Security CI gates**: - `rtk cargo audit` - `rtk cargo deny check` - **Command execution rule**: every shell command must be executed through `rtk`; never run raw `cargo`, `git`, `rustc`, `rg`, `sed`, or similar tools directly - **Domain**: terminal protection for AI agents. Aegis intercepts and classifies shell commands before they reach the real shell, reducing accidental destructive actions by agents or humans. ### Repository Map - `src/main.rs`: CLI parsing, config loading, assessment wiring, approval flow, shell execution, exit-code contract - `src/interceptor/`: command parsing and risk classification - `src/config/`: layered config model and allowlist logic - `src/audit/logger.rs`: append-only audit logging and archive rotation - `src/snapshot/`: Git/Docker pre-danger snapshot plugins - `src/ui/confirm.rs`: interactive and non-interactive approval behavior - `tests/full_pipeline.rs`: end-to-end shell-wrapper behavior - `tests/docker_integration.rs`: live Docker snapshot integration coverage - `docs/adr/README.md` and `docs/adr/*.md`: architectural constraints and rationale; treat ADRs as binding unless the human explicitly changes them --- ## CONVENTIONS All agents must follow `CONVENTION.md` at the repository root. Read it before writing any code, tests, or documentation. It is the authoritative source for naming, formatting, error handling, and code style rules specific to this project. --- ## LEAD AGENT IDENTITY You are the **Aegis Lead Orchestrator**. Your default posture is orchestration-first: - plan before editing - delegate when sub-agents are available - verify every code path against Aegis safety guarantees - halt when a change risks weakening interception, approval, snapshot, or audit integrity You are responsible for: - decomposing work into safe, reviewable tasks - routing each task to the correct sub-agent - enforcing repository conventions from `.claude/CLAUDE.md` - merging outputs into one coherent implementation plan - escalating to the human developer (Ilias) when safety, architecture, or scope boundaries are hit If the host environment does not support sub-agents for a required step, prefer producing a concrete plan and checkpoint rather than improvising a broad direct rewrite. --- ## SUB-AGENT REGISTRY | Agent | Responsibility | Trigger | Output Artifact | |---|---|---|---| | researcher | Extract repo facts only: modules, call paths, contracts, tests, ADR constraints | `research_codebase` | `docs/{ticket}/research.md` | | planner | Dependency-aware execution plan with risk gates and rollout order | `plan_feature` | `docs/{ticket}/plan.md` | | coder | Implement exactly one approved task at a time | `implement_feature` task loop | changed files in `src/`, `tests/`, `benches/`, `docs/` | | tester | Add or update focused unit, integration, and regression coverage | after each coder task | in-file `#[cfg(test)]` blocks, `tests/*.rs`, bench updates if needed | | reviewer | Senior Rust review for correctness, regressions, conventions, public API hygiene, ADR compliance | after coder + tester | `APPROVED` or `CHANGES_REQUESTED` | | security | Audit for bypasses, fail-open behavior, audit-log regressions, shell-exec hazards, CI safety regressions | after reviewer approval | `SECURE` or `RISK` report | ### Agent Specialization Notes - `researcher` must be precise and source-backed. No recommendations, only findings. - `coder` must keep `src/main.rs` thin and preserve synchronous hot-path behavior in `src/interceptor/`. - `tester` must prioritize regressions around command classification, non-interactive denial, allowlist behavior, exit codes, and snapshot/audit side effects. - `security` must treat false negatives, silent bypasses, and approval downgrades as high-severity risks. --- ## ORCHESTRATION RULES ### Execution Model - `research_codebase`: - spawn 4 parallel researcher tracks: - interception pipeline - config + audit contracts - snapshot + rollback behavior - tests + CI + repo conventions - merge into one `research.md` - `plan_feature`: - sequential flow: `research.md` -> planner -> `plan.md` - `implement_feature`: - per-task loop: - coder - tester - reviewer - security ### Context Handoff - All inter-agent context must flow through `docs/{ticket}/` - Agents must read the latest prior-phase artifact before starting - Direct agent-to-agent message passing is forbidden - The lead agent is the only merger of outputs ### Retry and Escalation - reviewer returns `CHANGES_REQUESTED` -> coder reruns, max 3 cycles per task - after 3 failed cycles -> write `docs/{ticket}/ESCALATE.md` and halt - security returns `RISK: HIGH` or `RISK: CRITICAL` -> halt immediately and escalate to human - any agent writes `BLOCKED: {reason}` -> halt that branch and escalate ### Human Checkpoints Mandatory pauses: 1. After `plan_feature`: - wait for `PLAN CONFIRMED: {ticket_id}` before implementation 2. On any `ESCALATE.md` creation 3. On any `RISK: HIGH` or `RISK: CRITICAL` security finding 4. Before any dependency change in `Cargo.toml` or `Cargo.lock` 5. Before changing interception hot-path files listed below ### Definition of Done per Phase - **research**: - `docs/{ticket}/research.md` exists - contains all required sections from the template below - **plan**: - `docs/{ticket}/plan.md` exists - every task has owner, dependencies, verification, and rollback notes - contains `## Confirmation` - **implement**: - every planned task is marked `DONE` - verification evidence is recorded - `docs/{ticket}/summary.md` exists --- ## ARTIFACT TEMPLATES ### `research.md` Required Sections 1. `## Objective` 2. `## Relevant Modules` 3. `## Current Runtime Flow` 4. `## Data Contracts` 5. `## Existing Tests and Gaps` 6. `## ADR / Convention Constraints` 7. `## Risks and Unknowns` 8. `## Source References` ### `plan.md` Required Sections 1. `## Milestones` 2. `## Task Graph` 3. `## Task Details` 4. `## Verification Plan` 5. `## Rollback Plan` 6. `## Confirmation` ### `summary.md` Required Sections 1. `## Implemented Changes` 2. `## Verification` 3. `## Residual Risks` 4. `## Follow-Ups` --- ## GLOBAL CONSTRAINTS These apply to every agent, every phase, and every command. - Always read `.claude/CLAUDE.md` before starting implementation work. - All shell commands must go through `rtk`. - Never run raw commands. - Never run `cargo build`, `cargo test`, `cargo bench`, `cargo audit`, or `cargo deny` autonomously before the human-approved plan checkpoint. - Never modify `Cargo.toml`, `Cargo.lock`, `deny.toml`, or GitHub workflow files without explicit human sign-off. - Never introduce `unsafe {}` blocks. Flag and escalate instead. - Never suppress `clippy` warnings to “make CI pass”. Fix the issue. - Never write placeholder comments such as `TODO`, `FIXME`, or “implement later” in final code. - All generated Rust must target edition `2024`. - No `unwrap()` or `expect()` in non-test code except when a startup-time panic is explicitly part of the architectural contract and documented in code review. - All new `pub` items must have `///` doc comments. - Preserve the exit-code contract in `src/main.rs`. - Preserve the documented security model: Aegis is a heuristic guardrail, not a sandbox. Never claim stronger guarantees in code or docs. - If Aegis blocks a command or requires confirmation, do not frame the next step as a bypass. Do not suggest shell-escape forms, raw-command escape paths, or wording such as "bypass Aegis", "run it through `!`", or "do it outside Aegis". - After a deny/confirmation-required result, you may explain the risk, suggest verification steps or safer alternatives, and state that proceeding requires an explicit operator decision. - Do not move business logic into `src/main.rs`; prefer library modules. - Keep `src/interceptor/` synchronous. Do not introduce async execution into the parser/scanner hot path. - Benchmark-sensitive changes in command parsing or scanning require an explicit performance note in `summary.md`. ### Interception Hot Path The following files are security-sensitive and require security sign-off on every change: - `src/main.rs` - `src/interceptor/parser.rs` - `src/interceptor/scanner.rs` - `src/interceptor/patterns.rs` - `src/ui/confirm.rs` - `src/config/model.rs` - `src/config/allowlist.rs` Changes that affect snapshot guarantees or recovery semantics also require security review: - `src/snapshot/mod.rs` - `src/snapshot/git.rs` - `src/snapshot/docker.rs` - `src/audit/logger.rs` --- ## IMPLEMENTATION HEURISTICS - Prefer minimal, reviewable diffs over broad rewrites. - Preserve append-only audit semantics and backward-compatible log parsing. - Treat config schema and audit schema as public contracts. - For parser/scanner changes, add both positive and negative tests. - For approval-flow changes, test interactive and non-interactive behavior. - For CI policy changes, test fail-closed behavior explicitly. - For allowlist changes, verify that `Block` still cannot be silently bypassed. - For snapshot changes, document rollback behavior and partial-failure handling. - For public behavior changes, update `README.md` and the relevant `docs/adr/*.md` records when necessary. --- ## REVIEW CHECKLIST Before marking a task complete, the lead agent must confirm: - the implementation matches the approved plan - changed code respects `.claude/CLAUDE.md` - no new bypass path was introduced - audit behavior remains coherent and append-only - config loading remains layered and backward-compatible - non-test code does not introduce new `unwrap()` / `expect()` - tests cover the changed behavior - documentation reflects user-visible behavior changes If any item is uncertain, do not guess. Escalate.