# Architecture

The directory tree and the runtime story behind a single eeco invocation.


---

A one-paragraph orientation, the directory tree, and the runtime story behind a single `eeco` invocation. Companion to [`USAGE.md`](USAGE.md) (the user-facing reference). The full specification lives in the build runbook that ships with the source tree; this document is the short overview for the operator and curious users.

## Overview

eeco is two complementary halves: a self-maintaining workflow ecosystem (the heart) and a deterministic, no-AI-spend knowledge layer any AI assistant can plug into. Architecturally, both are served by one process: eeco is a single static Go binary that runs inside a target repository while a developer is coding. It maintains a private, gitignored workspace inside the repo (`.eeco/` by default) that holds the engine's state, the memory store, scaffolded workflows, queue items, and hook ledger. The binary detects the project profile, runs built-in or user-scaffolded workflows under a strict exit-code contract, and gates every AI call behind explicit consent and a budget cap. It never commits, pushes, or activates a generated workflow on its own.
## Directory layout

```
cmd/eeco/
  main.go           CLI entry; subcommand dispatch (stdlib flag)
  init.go           eeco init (workspace bootstrap + .gitignore)
  initgit.go        init-only HOST .gitignore commit (sanctioned git write)
  historygit.go     private workspace-history repo ops (sanctioned git write)
  history.go        eeco history (log / snapshot)
  status.go         no-args path: digest or TUI
  run.go            eeco run [--ai]
  new.go            eeco new (template scaffolder)
  gc.go             eeco gc (memory garbage collection)
  hooks.go          eeco hooks (status / on|off / session-emit)
  update.go         eeco update (read-only remote tag check)
  doctor.go         eeco doctor (14 diagnostic probes)
internal/
  config/           repo-root + profile detection; config.local loader; WriteLocalKeys upsert; SessionSettingsPath
  memory/           one-fact files, frontmatter, store, MEMORY.md index, word-overlap Select, GC table
  workflow/         registry, exit-code contract (0/1/2/3), Run + ScriptRun, embed.FS templates, attribution Detector, builtin workflows
  ai/               Provider interface; cliProvider (shells ai_command); notConfigured stub; shared Gate (consent, budget, parking); AI-call ledger; background ProjectDigest/Understand
  queue/            append-only state/queue.md; portable presence lock (.queue.lock); ErrLocked sentinel
  hooks/            pre-commit (SHA/marker exact-match reversible); session-start + commit-guard (JSON edit + workspace backup + validate + restore); state/hooks.json ledger
  gitx/             read-only git helpers: Available, TrackedFiles, HeadSHA, ChangesSince, RemoteTags
  tui/              Bubble Tea control center; OneScreen digest; dispatch, completion, styles
workflows/          embedded builtin workflow definitions (embed.FS)
scripts/            build.sh, gen-packaging.sh, release.sh, regen-demo.sh
Makefile            build / release / verify / gates / bench / packaging
.github/workflows/  ci.yml (matrix verify + Windows smoke); release.yml (cross-build + sign + attest + upload)
docs/               USAGE.md (user guide), ARCHITECTURE.md (this file), UPGRADING.md (post-v0.1.0 upgrade notes)
```

## Component map

The runtime divides into eight concerns.
Each is a leaf package under `internal/`; nothing in `internal/` is part of the frozen public surface.

| Concern | Package | Responsibility |
| --- | --- | --- |
| Config | `internal/config` | Resolve repo root, detect profile, load `config.local`, write upserts. |
| Memory | `internal/memory` | Fact files, frontmatter parse/serialise, GC, `MEMORY.md` index, relevance. |
| Workflow | `internal/workflow` | Registry, runner, contract enforcement, attribution detector, scaffolder. |
| AI provider | `internal/ai` | Provider interface, the CLI provider (shells `ai_command`), gating (consent / budget / park), AI-call ledger. |
| Queue | `internal/queue` | Single decision channel; append, count, list/resolve under file lock. |
| Hooks | `internal/hooks` | Reversible pre-commit + session-start wiring; ledger of every install. |
| Read-only git | `internal/gitx` | Tracked-set, HEAD SHA, change-since-SHA, remote tags. No write surface. |
| TUI | `internal/tui` | Bubble Tea control center; non-TTY → `OneScreen` and exit 0. |

The CLI under `cmd/eeco/` is a thin dispatch layer over these packages. A no-argument invocation prints the status digest in a piped or CI environment, or launches the TUI on a real terminal.

**Git-write boundary.** Engine packages (`internal/*`, including `gitx`) are read-only with respect to git; every git *write* lives in package `main` so engine code cannot mutate history. There are exactly two such sites: `initgit.go` (the one-shot `eeco init` `.gitignore` commit on the HOST repo) and `historygit.go` (the optional private workspace-history repo: a separate, local, no-remote git repo inside the gitignored `/`, which records only what eeco already writes there and never touches the host's tracked tree).
`historygit.go` guards the nested-repo hazard (git searching upward to the host repo) with a `.git` stat-check plus a `--show-toplevel` assert before every write. `eeco history compact` rewrites the private repo's log (squash-all to one parentless commit via `commit-tree` + `reset --soft`) within those same guards, so the host tree is never touched and the kept tree is unchanged.

## Runtime story (typical `eeco run leak-guard`)

1. `cmd/eeco/main.go` parses the subcommand and flags (stdlib `flag`).
2. `internal/config` walks upward to the repo root, detects the profile (`go`, `python`, etc.), then applies configuration in three layers, each overriding the previous: built-in defaults, the user-global `config.local` (under `EECO_CONFIG_HOME` / `$XDG_CONFIG_HOME/eeco` / `~/.config/eeco`), and the workspace `/config.local` if the workspace exists. A single `applyConfigFile` parses each file layer.
3. `cmd/eeco/run.go` constructs an `Env` (repo root, workspace path, profile, automation level, queue handle, memory handle, optional `Gate`) and looks up the named workflow in the registry.
4. The workflow runs with the repo root as its working directory. A builtin runs natively in Go; a user workflow runs through `ScriptRun`, which enforces the same contract.
5. Any AI call routes through `internal/ai.Gate`. The gate selects the provider (the CLI provider or the not-configured stub), enforces consent (`--ai` or `automation=auto`) and the per-invocation budget cap, and on any skip or failure parks the prompt under `/state/parked/` and appends an `ai-parked` queue item. Every attempt (ran, parked, or gated-out) is recorded to `/state/ai-calls.json`. eeco runs no in-binary model client and no agentic tool loop; it configures the harness that runs the AI (see `docs/COCKPIT.md`), and each gated pass is a single provider call.
6. Decision-bearing output (a finding, a proposal, a draft handover) becomes a queue item under a presence lock.
   When two writers race, the loser sees `queue.ErrLocked` and exits cleanly without corruption.
7. The runner returns one of four exit codes (see below). The process exits with that code.

The TUI follows the same contract: every slash command dispatches to an engine operation that already exists, and every free-text input is one turn of a multi-turn conversation that passes through the same `Gate` (built per turn, so each turn gets the configured budget).

## Workflow contract

Every workflow returns one of four exit codes:

| Code | Meaning |
| ---- | ------- |
| `0` | clean |
| `1` | finding or failure |
| `2` | blocked (a required tool is missing) |
| `3` | AI pass deferred (no consent) |

A workflow writes only inside the workspace and routes any decision through the queue. Gates report-and-fail; they do not queue. `bug-sweep` and `handover-refresh` queue; `comment-hygiene` and `leak-guard` do not.

## Trust boundaries

eeco has three concentric zones:

```
+-----------------------------------------------------------+
| outside the repo                                          |
|  +-----------------------------------------------------+  |
|  | tracked tree (eeco never writes here)               |  |
|  |  +-----------------------------------------------+  |  |
|  |  | workspace (gitignored, eeco writes only here) |  |  |
|  |  | engine/ memory/ workflows/ state/ docs/       |  |  |
|  |  +-----------------------------------------------+  |  |
|  +-----------------------------------------------------+  |
+-----------------------------------------------------------+
```

Two opt-in, reversible touches escape the workspace:

- `.git/hooks/pre-commit`: local, repo-scoped, untracked. Installed only if no pre-commit hook exists; removed only when the on-disk script is byte-identical to what eeco wrote.
- Namespaced entries in the AI CLI's user-global JSON settings file: the session-start emitter and the opt-in commit-guard PreToolUse hook (which denies a `git commit` carrying AI attribution, in any repo).
  The exact path is supplied by `session_settings_path` in `config.local` or the `EECO_SESSION_SETTINGS` environment variable; unset means both are a no-op.

Both touches are recorded in `/state/hooks.json`. Removal restores the file to its prior state.

The path guard in `internal/config` refuses `..` traversal and rejects any write target outside the workspace.

The control-center chat lets the model invoke tools, but the tool registry is **read-only by construction**: only read-only capabilities (search the knowledge layer, read the project brief, list open decisions, list memory) are ever registered; no write, git, or otherwise mutating verb is. The boundary therefore holds by absence of capability, not by a runtime check, so the workspace-write and tracked-tree-never-written guarantees survive even if the model is adversarial or hijacked. The serialized arguments of every tool call also pass the pre-write attribution scanner before the tool runs; a flagged call aborts the whole pass before any tool executes.

## Public surface

From v0.1.0 eeco follows semver over an explicitly frozen public surface (pre-stability; see [`VERSIONING.md`](../VERSIONING.md) §2.1): the CLI commands and flags, the workflow exit-code contract and `Env` shape, the read-only `gitx` helpers, the `config.local` keys, the memory frontmatter, the queue and hook-ledger formats, the builtin workflow names, and the first line of `eeco version`. The exhaustive enumeration is the freeze contract in [`PUBLIC_API.md`](PUBLIC_API.md). Internal package APIs under `internal/` remain unfrozen.
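The exit-code contract named in the frozen surface is small enough to sketch as Go constants. The type and constant names below are hypothetical; only the four numeric values and their meanings are taken from the Workflow contract table.

```go
package main

import "fmt"

// ExitCode is an illustrative type for the four-value workflow contract.
type ExitCode int

// Names are assumptions; values and meanings follow the contract table.
const (
	ExitClean    ExitCode = 0
	ExitFinding  ExitCode = 1
	ExitBlocked  ExitCode = 2
	ExitDeferred ExitCode = 3
)

// String returns the meaning of a code as worded in the contract table.
func (c ExitCode) String() string {
	switch c {
	case ExitClean:
		return "clean"
	case ExitFinding:
		return "finding or failure"
	case ExitBlocked:
		return "blocked (a required tool is missing)"
	case ExitDeferred:
		return "AI pass deferred (no consent)"
	default:
		return fmt.Sprintf("unknown(%d)", int(c))
	}
}

func main() {
	// Print each code next to its meaning.
	for _, c := range []ExitCode{ExitClean, ExitFinding, ExitBlocked, ExitDeferred} {
		fmt.Printf("%d = %s\n", int(c), c)
	}
}
```

A caller such as a CI step could branch on these values, for example treating `3` as "re-run with consent" instead of a hard failure; that policy is a suggestion here, not something the contract prescribes.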
## Build and release pipeline

`Makefile` orchestrates everything: `make build` produces the local binary with version metadata injected via `-ldflags`; `make verify` runs `go build/vet/test`; `make gates` runs `comment-hygiene` and `leak-guard` against the working tree; `make bench` runs the build-tagged perf gate against a generated 50k-file fixture; `make release` cross-builds the six-platform matrix into `dist/`; `make packaging` emits `eeco.rb` and `eeco.json` from the checksums.

CI runs `make verify` and `make gates` on every PR and `main` push, across a Linux/Windows matrix with a dedicated Windows smoke step. On a `v*` tag push, the release workflow cross-builds the matrix, signs `SHA256SUMS` with keyless cosign, attests every archive with GitHub build provenance, generates the Homebrew formula and Scoop manifest, and uploads eleven assets to the GitHub Release page.

## Known scaling limits

eeco is built for a single developer's repository and a small, curated knowledge store; its costs are bounded by that scale, not by sub-linear algorithms. The places where a cost grows with input are listed here with their bound, the trigger that would justify revisiting them, and, where a tempting optimization was considered and declined, why. None is a hot path today; the entries exist so a maintainer inherits the reasoning instead of rediscovering it.

**Filesystem walks.** Two non-test tree walks scale with input. The `manifest-refresh` knowledge-tree walk (`internal/manifest`) only *enumerates directories* (no file reads), so it is O(directories) and cheap. The gate workflows `comment-hygiene` and `leak-guard` (`internal/workflow/scan.go`) walk the whole working tree *and read every text file*, which is O(files × size), skipping `.git` and the workspace. *Bound:* a `make bench` CI gate fails if a 50,000-file fixture scan exceeds a 5-second wall (`internal/workflow/bench_test.go`), so the cost is measured, not assumed.
*Revisit when:* a real repo approaches that file count, or the bench wall creeps toward the budget. *Not optimized:* no incremental or cached scan; the bench shows comfortable headroom at 50k files and a cache would add a staleness surface for no current gain.

**AI-call ledger.** Every gated AI attempt appends one record by reading, re-marshalling, and rewriting the whole `{"records":[…]}` file (`internal/ai/ledger.go`, `appendAICall`); `eeco stats` reads the whole file once (`internal/ai/summary.go`). The cost is O(n) per append for n lifetime attempts, with no cap; `state/evolve-history.json` shares the same shape and discipline. *Bound:* realistic solo use is hundreds to low thousands of records at roughly 300 bytes each, that is, a sub-megabyte file rewritten in well under a millisecond. *Revisit when:* the ledger reaches multiple megabytes, or append / `eeco stats` latency becomes perceptible. *Not optimized:* the on-disk shape is frozen (`docs/PUBLIC_API.md`), so an append-only migration is a breaking change (see the register below); a format-preserving record cap was considered and deferred: at sub-megabyte sizes there is no real cost, and because the ledger is an audit trail a cap must first decide count-versus-age and delete-versus-archive, which is its own design and versioning pass when real scale demands it.

**Memory relevance.** Relevance ranking tokenises the query and scores every fact by keyword overlap (O(n·m) for n facts and m query terms), then, in `memory.Select`, bumps `last_used` and re-saves each matched fact as one atomic whole-file write per match (`internal/memory/select.go`). *Bound:* the store is a small, GC-curated set (dozens of facts in practice), so O(n·m) is negligible. The live query path (`eeco go` and `eeco ask`, via `internal/ask`) runs the same overlap scan but deliberately does **not** bump-and-save, so the per-match write cost is not paid today; `memory.Select` itself is currently unused in production.
*Revisit when:* `Select` is wired into a frequent path, or the fact count grows past a few hundred. *Not optimized:* no inverted index or ranking rewrite: unjustified at a curated store of this size, where recall matters more than speed (see the register).

**Brief budget ladder.** When `context_budget` is set, `eeco go --write` collects the brief once and then re-renders it down a fixed ladder (full, then progressively capped tiers) until it fits (`internal/brief`, `RenderWithinBudget`). The expensive step (collecting memory, git state, and workflows) runs **once**; only the in-memory Markdown render repeats, at most about seven times, over already-trimmed slices. *Bound:* the ladder is a fixed ≤7 rungs of cheap string building, and the feature is opt-in: with no `context_budget` (the default) the ladder never runs. *Revisit when:* effectively never at current brief sizes. *Not optimized:* no incremental trimming or binary search over the ladder; at most seven string builds do not justify the complexity.

**Read-only git helpers.** Every `internal/gitx` helper forks one `git` subprocess per call, with no caching. Most are called once per invocation, but two paths repeat the same call in one process: `evolve` calls `ChangesSince` twice per run, and `memory-drift` calls `LastCommitDate` once per fact carrying a `ref` (O(facts)). *Bound:* a subprocess fork costs single-digit milliseconds, and both repeating paths are on-demand maintenance workflows, not interactive hot paths. *Revisit when:* a future interactive or per-keystroke path calls the same helper repeatedly in one process. *Not optimized:* no gitx memoization: caching a HEAD SHA or tracked set inside a process that may itself drive git writes (the private workspace-history repo) adds a staleness surface on a trust-boundary read, to save roughly one fork in a non-interactive workflow; the trade is not worth it.
**Explicitly not optimized.** Recorded so they are not re-litigated:

- **gitx result caching**: declined; staleness on a trust-boundary read outweighs roughly one saved subprocess fork in a non-interactive workflow.
- **AI-ledger append-only / JSONL migration**: would end the full-rewrite-per-append, but changes the frozen `{"records":[…]}` shape (`docs/PUBLIC_API.md`) and is therefore a major-version, separate planning pass; not done now.
- **Memory ranking rewrite (inverted index or scoring overhaul)**: unjustified at a curated dozens-of-facts store; recall matters more than speed at this scale.

## Extension seams

Adding a new capability means registering it at one known point and, for the seams that are irreducibly multi-file, touching a short fixed set of sites. This table is the maintainer's map: where each kind of extension is registered, what else it touches, and how much friction that is. It records the one HIGH-friction seam (a new hook type) as intentional rather than as debt. The companion how-to for a newcomer is [`EXTENDING.md`](../EXTENDING.md), which expands these rows into worked examples; this section is the authoritative source of the registration points.

| Extension | Register at | Also touch | Friction |
| --- | --- | --- | --- |
| CLI verb | `cmd/eeco/main.go` (the `run` dispatch switch) | the `usage` const in the same file; a new `cmd/eeco/.go` runner, reusing the `loadInitedConfig` / `loadRepoConfig` / `newFlagSet` guards in `helpers.go` | low |
| Builtin workflow | `internal/workflow/registry.go` (the `DefaultRegistry` slice) | a new `internal/workflow/.go` implementing the workflow interface (`Name` / `Summary` / `Run`) | low |
| `config.local` key | `internal/config` (the `Config` struct) | the config parse loop that assigns it; a `Default…` const if it carries a default | low |
| Memory frontmatter field | `internal/memory/fact.go` (the `Fact` struct) | `frontmatter.go`: the `setField` parse switch and the `Serialize` writer (and `Validate` if the field is constrained) | medium |
| AI provider | `internal/ai` (the `Select` chooser) | the new provider type (`Name` / `Run`); the `config.local` key that selects it, plus its parse | medium |
| TUI slash-command | `internal/tui/commands.go` (the `commandIndex`) | the `dispatch` switch in the same file; tab-completion is derived automatically | low |
| Hook type | `internal/hooks/hooks.go` (the name const + the `Names` list) | the `ledger` struct field; the enable / disable / refresh trio; the `Status` line; the `cmd/eeco/hooks.go` dispatch; and the session-emit path if the hook emits | HIGH |

The hook-type seam is the one irreducible multi-touch point, and the friction is the price of the trust boundary. A hook is an opt-in, reversible escape from the workspace (see Trust boundaries), so every hook type carries a ledger field that records its install for exact, byte-identical removal, and that ledger, the enable/disable/refresh trio, the status read-out, and the CLI dispatch must stay in lock-step with it. The coupling is deliberate: it is what makes a hook removable without guesswork.
New seams should prefer the single-registration shape of the rows above; add a hook type only when the capability genuinely needs to escape the workspace.

---

[← Prev: Usage](USAGE.md) · [Next: Public API →](PUBLIC_API.md)