---
id: ADR-023
title: Secret store UX layer — local vault, daemon, native UI, agent protocol, rotation, pattern catalogue, onboarding skill
status: proposed
date: 2026-05-09
deciders: ["Andrei Mazniak"]
tags: ["security", "secrets", "ui", "agent", "rotation", "patterns", "skills"]
supersedes: null
superseded_by: null
---

# ADR-023: Secret store UX layer

## Status

**proposed**

This ADR is intentionally an **umbrella**. The conventions in
[`INDEX.md`](./INDEX.md) prefer one decision per ADR, and the eight
sub-decisions here could be split into eight separate files. They are
not, because they describe a **single coherent layer** sitting above
[ADR-021](./ADR-021-external-secret-sources.md): the components are
designed against each other (the daemon serves the UI, the UI is what
the agent triggers when it cannot get a value, the rotation flow uses
both, the pattern catalogue feeds all of them, the onboarding skill
walks a user through the whole stack at once). Splitting them would
multiply cross-references without adding clarity. If review surfaces
that a sub-decision should evolve independently — for example, the
pattern catalogue grows beyond the secret store and is used by other
subsystems — that section is a candidate for promotion to its own ADR.

## Context

[ADR-019](./ADR-019-secret-string-discipline.md), [ADR-020](./ADR-020-secret-manifest-and-alias-resolution.md),
and [ADR-021](./ADR-021-external-secret-sources.md) together form the
**transport layer** of the secret framework: how secrets are typed in
memory, how they are named and discovered, and where their values live.
None of those layers say what the user **sees**, what the **agent** is
allowed to do, how a token gets **rotated** in practice, or how a new
contributor **gets set up** for the first time.

In practice this leaves four concrete gaps that the transport layer
cannot close on its own:

1. **No fallback for fully offline / no-keychain machines.** ADR-021
   ships an env-store, but env-store is a CI shape, not a developer
   shape. A developer working off-network on a Linux box without
   Secret Service has no passphrase-protected store to put secrets in.
2. **No way for the user to enter a value without the agent seeing
   it.** The agent that is helping the user set up the project is
   the same channel through which the user would type a secret. A
   second, agent-bypassing UI is required.
3. **No rotation flow.** The transport layer can detect "this token
   expires in three days" but cannot guide the user through the
   rotation: open the right URL, accept the new value, validate, and
   record the rotation. Without a flow, rotation reminders become
   notification fatigue.
4. **No agent-mediated provisioning.** When the agent finds that a
   project needs a credential that is not yet provisioned, it can
   neither read existing values (by design) nor accept new ones
   (it is the leak surface). It needs a typed protocol for asking
   the user to provision a path through the UI.

This ADR introduces the **UX layer**: an encrypted local vault for
the offline-fallback case, a daemon to manage its unlocked state, a
native UI (TUI + GUI) that the user interacts with directly, a
manual-assisted rotation flow built on top, a pattern catalogue
shared with the OTLP sanitizer (#240/#242), an MCP protocol for
agent-mediated provisioning, and an onboarding skill that walks a
user through the stack on first run.

### Threat model alignment

This ADR inherits the threat model of [ADR-020](./ADR-020-secret-manifest-and-alias-resolution.md):
the framework protects against accidental leakage by humans, by
agents acting in good faith, and by routine tooling. It does not
claim isolation against a malicious agent that can spawn shells.
The local vault and the daemon raise the bar (encrypted at rest,
zeroized in memory, segmented unlock), but they do not change the
boundary — a process running as the user can still read the
keychain and `/proc/self/environ`. The UX layer's stronger claim
is operational, not cryptographic: the user types secrets into a
UI that the agent does not mediate, the agent never receives
values, and rotation happens on a cadence rather than on demand.

## Decision

> **Decision:** A user-facing UX layer sits above the router from
> [ADR-021](./ADR-021-external-secret-sources.md), built around an
> encrypted local vault, a single-purpose daemon, a native TUI/GUI,
> an agent provisioning protocol over MCP, a manual-assisted
> rotation flow, a shared pattern catalogue, and an onboarding skill.
> The agent never sees secret values through the MCP surface; the user
> never types secrets through the agent. (See §3.7 for the precise
> scope of this property — it is an enforced invariant on the agent
> tool surface, not a claim about a sandboxed agent.)

The decision has eight parts.

### 3.1 Encrypted local vault

The local vault is one of the built-in sources from
[ADR-021](./ADR-021-external-secret-sources.md) section 8. It is the
fallback for environments without an OS keychain and for users who
explicitly want a portion of their namespace stored locally,
encrypted, behind a passphrase.

#### File layout

The vault lives at `~/.devboy/secrets/local-vault.dvb`. The file
consists of a plaintext header, the unlock envelopes, a plaintext
entry index, and the per-entry AEAD ciphertexts:

```
HEADER (plaintext, fixed-width fields):
  MAGIC          [4]   = b"DVB1"
  VERSION        [1]   = 0x01
  KDF_PARAMS     [16]  = Argon2id (m_cost, t_cost, p_cost, salt-len)
  SALT           [32]  = random per-vault, used by passphrase envelope

UNLOCK_ENVELOPES (TOML-serialised, each is an AEAD-wrap of the vault-key):
  [[envelope]]
  kind = "passphrase"
  argon2_salt    = "<32B base64>"
  argon2_params  = { m = 65536, t = 3, p = 1 }
  wrapped_key    = "<AEAD(vault_key, key=Argon2id(passphrase, salt))>"

  [[envelope]]
  kind = "keychain"           # macOS-only; optional
  keychain_account = "dev.devboy.secrets.vault.macos-touchid"
  wrapped_key      = "<AEAD(vault_key, key=Keychain-protected key)>"

  [[envelope]]
  kind = "recovery"
  bip39_salt    = "<32B base64>"
  wrapped_key   = "<AEAD(vault_key, key=HKDF(bip39_seed))>"

ENTRIES_INDEX (plaintext TOML, metadata only):
  [[entry]]
  path           = "team/gitlab/token-deploy"
  nonce          = "<24B base64>"
  ct_offset      = 0
  ct_length      = 312
  description    = "..."
  retrieval_url  = "..."
  expires_at     = "2026-08-01"
  last_rotated_at = "2026-05-02"
  pattern_id     = "gitlab-pat"

  [[entry]]
  ...

AEAD_BLOBS:
  contiguous concatenation of per-entry ciphertexts; each entry is
    XChaCha20-Poly1305(
      plaintext     = value (utf-8 bytes),
      key           = vault_key,
      nonce         = entry.nonce,
      associated_data = entry.path utf-8 bytes
    )
```
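
As an illustration of the fixed-width header, the sketch below parses
the first 53 bytes of a vault file. It assumes the four fields are
packed back to back in the order listed, with no padding; the internal
packing of `KDF_PARAMS` is not specified here, so it is kept as opaque
bytes. The names `Header`, `HeaderError`, and `parse_header` are
hypothetical and not part of the crate API.

```rust
/// Fixed-width plaintext header, in the order given by the layout above.
#[derive(Debug, PartialEq)]
pub struct Header {
    pub version: u8,
    /// Kept opaque: the packing of (m_cost, t_cost, p_cost, salt-len) is an open detail.
    pub kdf_params: [u8; 16],
    pub salt: [u8; 32],
}

#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u8),
}

pub const MAGIC: &[u8; 4] = b"DVB1";
/// MAGIC + VERSION + KDF_PARAMS + SALT (assumed contiguous).
pub const HEADER_LEN: usize = 4 + 1 + 16 + 32;

pub fn parse_header(bytes: &[u8]) -> Result<Header, HeaderError> {
    if bytes.len() < HEADER_LEN {
        return Err(HeaderError::TooShort);
    }
    if !bytes.starts_with(MAGIC) {
        return Err(HeaderError::BadMagic);
    }
    let version = bytes[4];
    if version != 0x01 {
        return Err(HeaderError::UnsupportedVersion(version));
    }
    let mut kdf_params = [0u8; 16];
    kdf_params.copy_from_slice(&bytes[5..21]);
    let mut salt = [0u8; 32];
    salt.copy_from_slice(&bytes[21..53]);
    Ok(Header { version, kdf_params, salt })
}
```

Rejecting an unknown `VERSION` before touching anything else keeps
later format changes from being misread as version 1.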

Critical invariants:

- **Envelope encryption.** A single random `vault_key` (32 bytes)
  encrypts every entry. The header carries one or more *envelopes*,
  each of which independently wraps `vault_key` under a different
  unlock method. Adding a new unlock path (Touch ID, recovery
  phrase) is a write to the header only and never touches the
  per-entry ciphertexts.
- **AAD includes path.** Per-entry AEAD uses the path as
  associated data. A tampering attempt that swaps the ciphertext
  blob from `team/gitlab/...` under the index entry for
  `personal/github/...` fails decryption. This closes a class of
  swap attacks that pure encryption-without-AAD does not.
- **Plaintext metadata is intentional.** `description`,
  `retrieval_url`, `expires_at` are all readable without an unlock
  step. The discovery and rotation-reminder flows need this; the
  threat model already grants any local process read access to
  metadata, and encrypting it would gate every `secrets list` on a
  PIN prompt for no real benefit.
- **No file-level integrity over the whole vault.** Each entry is
  independently authenticated; there is no Merkle tree or
  whole-file MAC. A truncation attack on the entry table is
  detected at parse time (TOML parse error or entry-count
  mismatch), not cryptographically. This is acceptable because the
  vault file is single-writer (the daemon, see section 3.3) and
  protected by filesystem permissions.
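
The non-cryptographic part of that last invariant can be sketched as
a bounds check when locating an entry's ciphertext. The sketch assumes
`ct_offset` is relative to the start of the `AEAD_BLOBS` region (the
layout shows `ct_offset = 0` for an entry but does not say so
explicitly); `EntryLocation` and `ciphertext_for` are hypothetical
names. AEAD tag verification would follow and is not shown.

```rust
/// The two index fields needed to locate one ciphertext.
pub struct EntryLocation {
    pub ct_offset: u64,
    pub ct_length: u64,
}

/// Returns the ciphertext slice for an entry, or `None` when the index points
/// outside the blob region (for example after the file was truncated).
pub fn ciphertext_for<'a>(blobs: &'a [u8], entry: &EntryLocation) -> Option<&'a [u8]> {
    let start = usize::try_from(entry.ct_offset).ok()?;
    let len = usize::try_from(entry.ct_length).ok()?;
    // checked_add guards against an index crafted to overflow the range.
    let end = start.checked_add(len)?;
    blobs.get(start..end)
}
```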

#### Algorithms

- **AEAD:** XChaCha20-Poly1305 (192-bit nonce; ChaCha20 from RFC
  8439 extended with the XSalsa20-style HChaCha20 nonce derivation).
  Picked over AES-GCM because it does not require AES-NI and runs
  constant-time on all targets (including ARM Linux). Picked over
  the IETF 12-byte-nonce ChaCha20-Poly1305 variant because each
  per-entry nonce is generated randomly: with 12-byte random nonces
  the birthday-collision probability reaches roughly 2⁻³³ after ~2³²
  writes per key, the usual safety limit, while 24-byte random
  nonces keep collisions cryptographically negligible.
  RustCrypto provides a vetted pure-Rust implementation
  (`chacha20poly1305::XChaCha20Poly1305`).
- **KDF:** Argon2id with `m_cost = 65536` (64 MiB), `t_cost = 3`,
  `p_cost = 1` for the passphrase envelope. Tuned to ≈250 ms on a
  2024-class laptop; tuneable per-vault if a host is too slow.
  The recovery envelope uses HKDF over the BIP39 seed (no
  brute-force resistance needed; the seed has 256 bits of entropy
  by construction).
- **CSPRNG:** the `getrandom` crate, the same source used by the
  rest of the workspace.
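
The nonce-size argument can be checked with the birthday
approximation p ≈ n² / 2^(b+1) for n random b-bit nonces. The helper
below is a back-of-envelope aid with a hypothetical name, not part of
the crate; the approximation only holds while p is small.

```rust
/// log2 of the approximate probability that 2^log2_writes random nonces of
/// `nonce_bits` bits contain a collision: p ~= n^2 / 2^(bits + 1).
pub fn log2_collision_probability(log2_writes: u32, nonce_bits: u32) -> i64 {
    2 * i64::from(log2_writes) - (i64::from(nonce_bits) + 1)
}
```

For 2³² writes this gives 2⁻³³ with a 96-bit nonce and 2⁻¹²⁹ with a
192-bit nonce.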

#### Crate

A new workspace member, `crates/devboy-vault-crypto/`, exposes the
algorithms behind a small API:

```rust
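use std::path::Path;

use secrecy::SecretString;

// Signatures only; bodies are elided. `Result`, `UnlockMethod`, `InitialUnlock`,
// `EnvelopeKind`, and `EntryMetadata` are defined elsewhere in the crate.
pub struct Vault { /* unlocked state */ }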
impl Vault {
    pub fn open(path: &Path, unlock: UnlockMethod) -> Result<Self>;
    pub fn create(path: &Path, methods: &[InitialUnlock]) -> Result<Self>;
    pub fn add_envelope(&mut self, kind: EnvelopeKind, secret: SecretString) -> Result<()>;
    pub fn get(&self, path: &str) -> Result<Option<SecretString>>;
    pub fn put(&mut self, path: &str, value: SecretString, meta: EntryMetadata) -> Result<()>;
    pub fn rotate(&mut self, path: &str, new: SecretString) -> Result<()>;
    pub fn delete(&mut self, path: &str) -> Result<()>;
    pub fn list(&self) -> impl Iterator<Item = &EntryMetadata>;
}
```

The vault unlocked state holds `vault_key` in a `secrecy::SecretBox<[u8; 32]>`
that zeroizes on drop (per ADR-019). The `Vault` instance is owned by
the daemon (section 3.3); CLI commands and the UI talk to the daemon,
not to this crate directly.

### 3.2 Authentication and recovery

The vault accepts three independent unlock methods:

- **Passphrase** — minimum 12 characters. Required at vault
  creation; cannot be removed.
- **Keychain (Touch ID)** — macOS only. The user runs
  `devboy secrets vault add-keychain-unlock`, the CLI generates a
  random 32-byte key and stores it in the macOS keychain with a
  `SecAccessControl` object created via `SecAccessControlCreateWithFlags`
  using flags `kSecAccessControlBiometryAny | kSecAccessControlUserPresence`
  (Touch ID, falling back to the user's password), and adds a
  `keychain` envelope. Future unlocks via Touch ID don't require
  the passphrase.
- **Recovery (BIP39)** — generated at vault creation. The CLI
  derives a recovery key from a 24-word BIP39 phrase, wraps
  `vault_key` under it, and shows the phrase **once** with explicit
  acknowledgement (`Type 'I have written it down' to continue`).
  The phrase is never stored.

A user who loses both the passphrase and the keychain unlock can
recover with `devboy secrets vault recover`, which prompts for the
24-word phrase, decrypts `vault_key`, prompts for a new passphrase,
and rewrites the passphrase envelope. A user who loses the
passphrase, the keychain unlock, **and** the recovery phrase has no
recovery path; the documentation says so explicitly.
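
The two hard rules on the passphrase method can be sketched as plain
predicates. This assumes "characters" means Unicode scalar values
rather than bytes; `UnlockKind` is a hypothetical stand-in for the
crate's envelope type, and whether other envelopes may be removed is
not decided here, so the sketch encodes only the stated rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnlockKind {
    Passphrase,
    Keychain,
    Recovery,
}

pub const MIN_PASSPHRASE_CHARS: usize = 12;

/// Length policy only; counts Unicode scalar values (an assumption).
pub fn passphrase_is_acceptable(passphrase: &str) -> bool {
    passphrase.chars().count() >= MIN_PASSPHRASE_CHARS
}

/// The passphrase envelope is required at creation and cannot be removed.
pub fn can_remove_envelope(kind: UnlockKind) -> bool {
    kind != UnlockKind::Passphrase
}
```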

#### Why not just keychain?

On macOS the keychain alone is sufficient. The reason the local
vault exists at all is for the cases that the keychain does not
cover (headless Linux, containers without `loginctl`, a user who
explicitly wants some paths off-keychain). The keychain-as-unlock
path is an ergonomic add-on for macOS users who want both stores
under a single biometric flow.

### 3.3 Daemon: `devboy-secrets-agent`

The daemon owns the unlocked `Vault` instance. It speaks JSON-RPC
2.0 over a UNIX domain socket at `~/.devboy/secrets/agent.sock`,
mode `0600`, owned by the running user. Windows is out of scope for
the first release; on Windows the local-vault source falls back to
a per-call unlock until a named-pipe transport ships.

#### Lifecycle

The daemon supports two startup modes:

- **On-demand (default).** The CLI checks for a live socket; if
  none is found, it spawns the daemon as a detached child and
  hands the user the unlock UI before issuing the first
  `secret.get`. After **15 minutes** of idle time, the daemon
  zeroizes `vault_key` and exits. This mode requires no install
  step; the only configuration is the vault file itself.
- **Persistent (opt-in).** `devboy secrets agent install` writes a
  launchd `LaunchAgent` plist on macOS or a `systemd --user`
  service on Linux. The daemon starts on user login and stays
  running, but the *unlocked state* still expires on idle and on
  SIGTERM. The user trades a process always running for not
  having to re-enter the unlock method between CLI sessions.

The lifecycle code is identical in both modes; only the supervision
differs. In both modes the daemon enforces:

- **Idle timeout:** the unlocked vault key is zeroized 15 minutes
  after the last successful `secret.get` or `metadata.update`.
  Configurable per user in `~/.devboy/config.toml`.
- **Eager re-lock:** `vault.lock` zeroizes immediately. Used by
  the UI's "lock now" button and by the rotation flow after a
  successful rotation.
- **SIGTERM zeroization:** the daemon traps SIGTERM, zeroizes,
  flushes pending writes, and exits within 10 seconds.
- **Authenticated operations.** `secret.put`, `secret.rotate`, and
  `vault.add-envelope` always require a fresh PIN/passphrase
  prompt independent of the daemon's current unlocked state. This
  is the "hybrid" model from design review: reads benefit from
  daemon caching, writes do not.
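
A minimal sketch of the idle-timeout and eager re-lock rules, with the
clock passed in so the behaviour is testable. `UnlockedState` and its
methods are hypothetical names. Overwriting the key with `fill(0)` is
only illustrative: the real daemon would rely on a zeroizing type as
described in section 3.1, since a plain overwrite may be optimised
away.

```rust
use std::time::{Duration, Instant};

pub const DEFAULT_IDLE_TIMEOUT: Duration = Duration::from_secs(15 * 60);

pub struct UnlockedState {
    key: Option<[u8; 32]>,
    last_activity: Instant,
    idle_timeout: Duration,
}

impl UnlockedState {
    pub fn unlock(key: [u8; 32], now: Instant) -> Self {
        Self { key: Some(key), last_activity: now, idle_timeout: DEFAULT_IDLE_TIMEOUT }
    }

    /// Eager re-lock: overwrite the key bytes, then forget them.
    pub fn lock(&mut self) {
        if let Some(key) = self.key.as_mut() {
            key.fill(0); // illustrative only, see the note above
        }
        self.key = None;
    }

    /// Called for each read; locks first if the idle window has elapsed,
    /// otherwise refreshes the activity timestamp.
    pub fn key_for_read(&mut self, now: Instant) -> Option<&[u8; 32]> {
        if now.saturating_duration_since(self.last_activity) >= self.idle_timeout {
            self.lock();
        }
        if self.key.is_some() {
            self.last_activity = now;
        }
        self.key.as_ref()
    }

    pub fn is_unlocked(&self) -> bool {
        self.key.is_some()
    }
}
```

Writes are deliberately absent from this sketch: under the hybrid
model they carry their own fresh unlock and never consult the cached
state.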

#### Wire protocol

JSON-RPC 2.0, the same shape as MCP. Methods:

```
vault.unlock(method)      params: { kind: "passphrase" | "keychain" | "recovery", secret: string }
                          result: { unlocked_at: timestamp, expires_at: timestamp }

vault.lock                params: {}
                          result: { locked: true }

vault.status              params: {}
                          result: { state: "locked" | "unlocked",
                                    unlocked_at?: timestamp,
                                    expires_at?: timestamp,
                                    available_methods: [...] }

secret.get(path)          params: { path: string }
                          result: { value: string }      // SecretString on the wire
                                | { error: "Locked" | "NotFound" }

secret.list(filter?)      params: { filter?: { scope?, provider?, status? } }
                          result: { entries: [{ path, status, expires_at?, ... }] }
                          // Never returns values.

secret.put(path,          params: { path, value, meta, fresh_unlock: { kind, secret } }
           value, meta)   result: { ok: true } | { error: "BadUnlock" }

secret.rotate(path, new)  params: { path, new_value, fresh_unlock: { ... } }
                          result: { ok: true, last_rotated_at }

metadata.update(path,     params: { path, fields }
                fields)   result: { ok: true, applied_fields: [...] }
                          // Does not require unlock; metadata is plaintext.
```
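
For orientation, a JSON-RPC 2.0 request carries `jsonrpc`, `id`,
`method`, and `params` members. The helper below builds such an
envelope with the standard library only; it is a sketch with a
hypothetical name, it expects `params_json` to already be valid JSON
and `method` to contain no characters that need escaping, and it says
nothing about message framing on the socket, which is not specified
here.

```rust
/// Builds a JSON-RPC 2.0 request object as a string.
/// Precondition: `method` needs no JSON escaping and `params_json` is valid JSON.
pub fn jsonrpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params_json
    )
}
```

A real client would more likely use a JSON library so that values are
escaped correctly.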

#### Security boundary

The socket file is created with `umask 077` and verified at connect
time: the daemon checks the connecting peer's UID through
`SO_PEERCRED` (Linux) or `LOCAL_PEERCRED` (macOS). A connection
from a different UID is rejected. This is defence in depth on top
of the filesystem permissions; it costs nothing and catches a
subset of misconfigurations.
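
The filesystem half of this boundary can be sketched with the standard
library alone. Note the swapped-in technique: instead of setting
`umask 077` (which needs a libc call), the sketch creates the parent
directory with mode `0700` and then tightens the socket to `0600`
after binding. The peer-UID check is omitted because it needs
platform-specific socket options. `bind_private_socket` is a
hypothetical name.

```rust
use std::fs::{self, DirBuilder, Permissions};
use std::io;
use std::os::unix::fs::{DirBuilderExt, PermissionsExt};
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Binds a listener at `dir/agent.sock` with owner-only permissions.
pub fn bind_private_socket(dir: &Path) -> io::Result<UnixListener> {
    // An owner-only parent directory keeps the socket unreachable by other
    // users even before its own mode is tightened. An already existing
    // directory is left as it is.
    DirBuilder::new().recursive(true).mode(0o700).create(dir)?;
    let sock = dir.join("agent.sock");
    // Remove a stale socket left behind by a previous run, if any.
    match fs::remove_file(&sock) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    let listener = UnixListener::bind(&sock)?;
    fs::set_permissions(&sock, Permissions::from_mode(0o600))?;
    Ok(listener)
}
```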

The daemon **does not** hand out `vault_key`. Every read goes
through `secret.get(path)` which returns one decrypted value at a
time. An attacker who intercepts traffic between the CLI and the
daemon learns only the values that were requested while it was
listening, not the entire vault.

### 3.4 Native UI

The UI is a `devboy secrets ui` subcommand of the main `devboy`
binary. There is one binary; no separate `.app` bundle for the
first release.

The subcommand has two backends:

- **TUI** — `ratatui`, runs in any terminal. Used by default in
  contexts where a full GUI is impractical (SSH sessions,
  containers, headless screens).
- **GUI** — `egui`, runs in a native window. Used by default
  when `$DISPLAY` / `$WAYLAND_DISPLAY` is set on Linux or when
  running on macOS / Windows. `egui` was picked over `iced`
  because its immediate-mode API matches the `ratatui` mental
  model (one render function per frame), so the two backends
  can share the same view code.

Backend selection: the `--tui` and `--gui` flags force a backend;
without either flag, `devboy secrets ui` auto-detects using the
defaults above.
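
The selection rule can be written as a small pure function. This is a
sketch that follows the defaults stated above literally; `Backend`,
`Os`, `select_backend`, and `has_display_env` are hypothetical names,
and treating an empty `$DISPLAY` as unset is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Tui,
    Gui,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Os {
    Linux,
    MacOs,
    Windows,
}

/// `flag` is the explicit `--tui` / `--gui` choice, if the user gave one.
pub fn select_backend(flag: Option<Backend>, os: Os, has_display: bool) -> Backend {
    if let Some(choice) = flag {
        return choice;
    }
    match os {
        Os::MacOs | Os::Windows => Backend::Gui,
        Os::Linux if has_display => Backend::Gui,
        Os::Linux => Backend::Tui,
    }
}

/// True when `$DISPLAY` or `$WAYLAND_DISPLAY` holds a non-empty value.
pub fn has_display_env() -> bool {
    ["DISPLAY", "WAYLAND_DISPLAY"]
        .iter()
        .any(|name| std::env::var_os(name).map_or(false, |v| !v.is_empty()))
}
```

Taken literally, an SSH session into a macOS host would auto-detect
the GUI; users in that situation would pass `--tui`.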

#### MVP views

Four views ship in the first release:

- **Inventory** — a sortable, filterable table of every secret
  the active context can see. Columns: path, status
  (`provisioned` / `expiring` / `missing` / `format-invalid`),
  routed source, `expires_at`, provider, scope. Filters by scope,
  provider, status. **No values are ever shown.**
- **Provision / rotation dialog** — opened by an agent
  `secrets.request_provision` call (section 3.7) or by `devboy secrets
  provision <path>` directly. Shows the path's metadata, a
  `[Open retrieval URL]` button that opens the user's browser at
  `retrieval_url`, a hidden input for the new value, and a
  `[Validate & save]` button that runs the format check, runs the
  liveness check, writes through the daemon (with a fresh PIN
  prompt for the local-vault), and updates `last_rotated_at`. The
  same dialog handles initial provision and rotation; a `mode`
  parameter changes the title and the destructive-confirm
  behaviour.
- **Edit metadata** — editable fields: `description`,
  `retrieval_url`, `rotate_every_days`, `expires_at`, `pattern_id`.
  Dirty fields are highlighted; `[Save]` writes through the
  daemon's `metadata.update`. A subview `[Apply agent suggestion]`
  shows the diff from a `secrets.propose_metadata` call (section
  3.7) and lets the user accept all / reject all / accept per
  field.
- **Discovery import** — connects to a configured 1Password vault
  or HashiCorp Vault mount, lists item titles (no values) through
  the source's `list()` method, suggests `<scope>/<provider>/<purpose>`
  paths via the algorithm in section 3.6, and asks the user to
  confirm or edit each mapping. **Values are never copied out
  of the upstream**; the import only registers paths in the
  global index with `source` and `reference` set so future reads
  resolve through the source.

The UI talks to the daemon, never to the keychain or the local
vault directly. This keeps the security boundary uniform: the
daemon is the only process that holds `vault_key`.

### 3.5 Manual-assisted rotation

Rotation is **assisted**, not automatic. The decision to call
provider rotate-APIs is deferred to a future ADR; this release
covers the case where the user changes a token in the upstream UI
and `devboy-tools` records the change.

#### Flow

`devboy secrets rotate <path>` (or the agent's
`secrets.request_rotation(path)` call):

1. The CLI / MCP server calls `secrets.describe(path)` to load
   metadata. If `retrieval_url` is set, the CLI opens it in the
   user's default browser.
2. The UI's provision/rotation dialog opens in `mode = "rotation"`
   with the existing path metadata.
3. The user pastes the new value into the hidden input.
4. The UI runs format validation against `format_regex` (or the
   pattern's regex if `pattern_id` is set). On format failure the
   dialog shows the mismatch and does not save.
5. The UI runs liveness validation through the provider plugin.
   On liveness failure the dialog asks for confirmation
   ("upstream rejected this token; save anyway?") with a default
   of *no*. If the user confirms anyway, the rotation proceeds
   but a warning is recorded.
6. The UI writes the new value through the daemon's
   `secret.rotate` (which requires a fresh PIN if the route
   targets the local-vault) and updates `last_rotated_at` in the
   index.
7. On success, the agent (if it was the caller) receives
   `{ ok: true, last_rotated_at }`. The agent never sees the
   value.
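
Steps 4 to 6 reduce to a three-input decision, sketched below. The
enum and function names are hypothetical; the inputs stand for the
results of the format check, the liveness check, and the user's answer
to the "save anyway?" prompt (which defaults to *no*).

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum RotationOutcome {
    /// Step 4 failed: the dialog shows the mismatch and does not save.
    RejectedFormat,
    /// Step 5 failed and the user kept the default answer.
    CancelledAfterLivenessFailure,
    /// Step 6 ran; a warning is recorded if liveness had failed.
    Saved { warning_recorded: bool },
}

pub fn decide_rotation(
    format_ok: bool,
    liveness_ok: bool,
    user_confirms_anyway: bool,
) -> RotationOutcome {
    if !format_ok {
        return RotationOutcome::RejectedFormat;
    }
    if liveness_ok {
        return RotationOutcome::Saved { warning_recorded: false };
    }
    if user_confirms_anyway {
        RotationOutcome::Saved { warning_recorded: true }
    } else {
        RotationOutcome::CancelledAfterLivenessFailure
    }
}
```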

#### Rotation cadence and reminders

`devboy doctor` warns at `< 7` days before `expires_at` (default,
configurable). If the `notify` skill is wired up, the warning is
also delivered there. Rotation is otherwise a quiet background
fact; the system does not page the user.
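
The reminder threshold can be sketched with standard-library date
arithmetic. The sketch assumes `expires_at` is a plain `YYYY-MM-DD`
date as in the vault index example and ignores time zones; the
function names are hypothetical, and the day-count routine is the
usual civil-date formula rather than anything specific to this
project.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let shifted = if y >= 0 { y } else { y - 399 };
    let era = shifted / 400;
    let year_of_era = y - era * 400;
    let month_from_march = (m + 9) % 12;
    let day_of_year = (153 * month_from_march + 2) / 5 + d - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Parses `YYYY-MM-DD` into a day number; only basic range checks are done.
pub fn parse_date(s: &str) -> Option<i64> {
    let mut parts = s.split('-');
    let y: i64 = parts.next()?.parse().ok()?;
    let m: i64 = parts.next()?.parse().ok()?;
    let d: i64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() || !(1..=12).contains(&m) || !(1..=31).contains(&d) {
        return None;
    }
    Some(days_from_civil(y, m, d))
}

/// `Some(true)` when fewer than `threshold_days` remain (or the date has passed).
pub fn should_warn(today: &str, expires_at: &str, threshold_days: i64) -> Option<bool> {
    let days_left = parse_date(expires_at)? - parse_date(today)?;
    Some(days_left < threshold_days)
}
```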

#### What this ADR does **not** ship

Provider-driven rotation (calling `gitlab.users.create_personal_access_token`,
`aws iam create-access-key`, etc.) is deferred. The reasoning:
provider rotation APIs are heterogeneous (some require existing
credentials, some require browser interaction, some don't exist),
whereas the assisted flow above works for every provider because it
only needs the user to paste a new value. A future ADR
("ADR-2N: Provider-driven secret rotation") will add per-provider
rotators behind the same `secrets.request_rotation` MCP call,
falling through to manual rotation when the provider has no
auto-rotate path.

### 3.6 Pattern catalogue (`devboy-secret-patterns`)

A new workspace crate, `crates/devboy-secret-patterns/`, holds the
canonical descriptions of secret types: GitHub PATs, GitLab tokens,
AWS access keys, OpenAI API keys, JWTs, Slack tokens, Vault tokens,
and so on. The crate is shared with the OTLP sanitizer (#240) and
the `otel scan` auditor (#242) so all three subsystems see the
same regex and severity, but the patterns are written from the
secret-store side first.

#### Layered trait

```rust
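use regex::Regex; // assumed: `Regex` is the `regex` crate's type

// `Severity`, `RotationMethod`, and `LivenessKind` are defined elsewhere in the crate.
pub trait SecretPattern: Send + Sync {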
    fn id(&self) -> &str;
    fn display_name(&self) -> &str;
    fn format_regex(&self) -> &Regex;
    fn severity(&self) -> Severity;

    // Optional layers — present if the pattern supplies them
    fn metadata(&self) -> Option<&PatternMetadata>;
    fn rotation(&self) -> Option<&RotationSpec>;
    fn liveness(&self) -> Option<&LivenessSpec>;
}

pub struct PatternMetadata {
    pub provider_id: &'static str,
    pub retrieval_url_template: &'static str,
    pub default_expiry_days: Option<u32>,
    pub scopes_hint: Vec<&'static str>,
}

pub struct RotationSpec {
    pub method: RotationMethod,            // Manual | ProviderUi { url } | ProviderApi { reserved }
}

pub struct LivenessSpec {
    pub kind: LivenessKind,                // Http { url, method, auth, expect_status }
}
```

The OTLP sanitizer and `otel scan` consume only `format_regex` +
`severity`. The secret store consumes everything available; missing
optional fields are user-fillable in the UI's edit-metadata view.

#### Built-in catalogue + user extension

The crate ships a catalogue of ~30 well-known patterns hard-coded
behind `SecretPattern` impls. Users may extend the catalogue by
dropping TOML files into `~/.devboy/secrets/patterns.d/`. Each TOML
file declares one or more patterns:

```toml
[[pattern]]
id              = "internal-mfa-token"
display_name    = "Internal MFA Service Token"
format_regex    = "^mfa_[A-Z0-9]{40}$"
severity        = "high"
provider_id     = "internal"
retrieval_url_template = "https://mfa.example.internal/tokens"
default_expiry_days = 180
```

User-supplied patterns are loaded at process start and merged with
the built-in catalogue. A user-supplied pattern with the same `id`
as a built-in shadows the built-in (with a `doctor` warning so the
user knows shadowing happened).
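
The merge-with-shadowing rule might look like the sketch below.
`PatternDef` is a reduced, hypothetical stand-in for a loaded pattern,
and `merge_catalogue` is not the crate's actual loader; the returned
list of shadowed ids is what a `doctor` warning could be built from.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq)]
pub struct PatternDef {
    pub id: String,
    pub display_name: String,
}

/// Merges user patterns over built-ins. Returns the merged catalogue keyed by
/// id, plus the ids of built-ins that a user pattern shadowed.
pub fn merge_catalogue(
    built_in: Vec<PatternDef>,
    user: Vec<PatternDef>,
) -> (BTreeMap<String, PatternDef>, Vec<String>) {
    let built_in_ids: BTreeSet<String> = built_in.iter().map(|p| p.id.clone()).collect();
    let mut merged: BTreeMap<String, PatternDef> =
        built_in.into_iter().map(|p| (p.id.clone(), p)).collect();
    let mut shadowed = Vec::new();
    for p in user {
        if built_in_ids.contains(&p.id) && !shadowed.contains(&p.id) {
            shadowed.push(p.id.clone());
        }
        // A later user definition replaces any earlier one with the same id.
        merged.insert(p.id.clone(), p);
    }
    (merged, shadowed)
}
```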

#### Why a new crate

The catalogue is small but central, and three consumers want to
import it. Putting it in `devboy-storage` would force the OTLP
sanitizer to depend on the storage layer just to read regexes;
putting it in `devboy-otel-sanitizer` would force the storage layer
to depend on OTLP. A small dedicated crate is the cleanest split.

### 3.7 MCP agent provisioning protocol

The MCP server (from [ADR-021](./ADR-021-external-secret-sources.md)
section 6 / [ADR-005](./ADR-005-credential-storage.md)) exposes the
following typed tools to the agent. **There is no `secrets.get`
tool on the agent surface.** The only legitimate way for the agent
to make use of a value is through a high-level provider tool
(e.g. `gitlab.create_merge_request`), where resolution happens
server-side and the agent receives only the operation's outcome.

```
secrets.list(filter?)
  → [ { path, status, expires_at?, source_name, capabilities_hint? } ]
  Reads the active context's manifest. Never returns values.

secrets.describe(path)
  → { path, metadata: PatternMetadata?, status, expires_at?, last_rotated_at? }
  Reads the global index + per-project overrides. Never returns the value.

secrets.request_provision(path)
  → { request_id, status: "pending" }
  Opens the provision dialog (section 3.4) on the user's UI.
  The agent polls poll_status to track the result.

secrets.request_rotation(path)
  → { request_id, status: "pending" }
  Same flow with mode=rotation.

secrets.poll_status(request_id)
  → { status: "pending" | "ok" | "cancelled" | "expired",
      expires_at?, last_rotated_at? }
  Default timeout for a pending request is 5 minutes; after
  that the request_id resolves to "expired".

secrets.propose_metadata(path, fields)
  → { request_id, status: "pending" }
  Opens the edit-metadata view with a diff of fields to apply.
  The user accepts/rejects per field; the agent receives the
  applied subset through poll_status.

secrets.propose_new_path(suggested_path, metadata)
  → { request_id, status: "pending" }
  Opens a registration dialog. The user can accept the suggested
  path, edit it, or reject. Used by the agent when it detects a
  project consuming a token that has no manifest entry.

secrets.request_use_approval(path, reason, ttl_seconds?)            (P25)
  → { request_id, status: "pending" }
  Opens the use-approval dialog. The agent supplies a short
  human-facing reason rendered verbatim. poll_status settles to
  one of "once" / "session" / "denied". Wired only when the
  manifest's `approve_on_use` field is "session" or "per-call";
  paths default to "never" and resolve silently. ttl_seconds
  may NARROW the registry-wide TTL (capped at 5 min) — agents
  cannot enlarge the window.
```
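
The request/poll lifecycle with its 5-minute expiry can be sketched as
follows. `PendingRequest` and `Status` are hypothetical names covering
only the four base statuses listed for `secrets.poll_status`; the
clock is injected so the expiry rule is testable.

```rust
use std::time::{Duration, Instant};

pub const PENDING_TIMEOUT: Duration = Duration::from_secs(5 * 60);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pending,
    Ok,
    Cancelled,
    Expired,
}

pub struct PendingRequest {
    created_at: Instant,
    /// Set by the UI when the user finishes or cancels the dialog.
    settled: Option<Status>,
}

impl PendingRequest {
    pub fn new(now: Instant) -> Self {
        Self { created_at: now, settled: None }
    }

    /// What `secrets.poll_status` would report at time `now`.
    pub fn poll(&self, now: Instant) -> Status {
        match self.settled {
            Some(status) => status,
            None if now.saturating_duration_since(self.created_at) >= PENDING_TIMEOUT => {
                Status::Expired
            }
            None => Status::Pending,
        }
    }

    /// Records the user's outcome; ignored once the request has expired.
    pub fn settle(&mut self, now: Instant, outcome: Status) {
        if self.poll(now) == Status::Pending {
            self.settled = Some(outcome);
        }
    }
}
```

Nothing in the polled result carries a secret value; the agent learns
only that the dialog finished and how.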

#### Approve-on-use protocol (P25 phase)

`approve_on_use` is a per-path policy on the `IndexEntry` and
`OverrideEntry`:

- `never` (default) — alias resolves silently. Preserves the
  existing zero-prompt resolve path so most paths stay
  frictionless.
- `session` — first resolve in the running process opens the
  dialog; once the user clicks "Allow always (this session)"
  the in-process `SessionApprovalCache`
  (`devboy-core::secret_approval`) caches the approval and
  further resolves of the same path skip the dialog until the
  TTL expires.
- `per-call` — every resolve opens the dialog, regardless of
  any cached entry. Right for high-stakes paths (production
  database password, signing keys).

Override precedence per [ADR-020](./ADR-020-secret-manifest-and-alias-resolution.md)
§4: a project's `[overrides."<path>"].approve_on_use` wins
over the global index's value. The project owner can therefore
TIGHTEN (`never` → `per-call`) or RELAX (`per-call` → `never`)
the policy without rewriting the global index.
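
The precedence rule is small enough to state as code. The sketch uses
hypothetical names (`ApproveOnUse`, `parse_policy`,
`effective_policy`) and models an absent field as `None`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApproveOnUse {
    Never,
    Session,
    PerCall,
}

/// Parses the manifest spelling of the policy.
pub fn parse_policy(s: &str) -> Option<ApproveOnUse> {
    match s {
        "never" => Some(ApproveOnUse::Never),
        "session" => Some(ApproveOnUse::Session),
        "per-call" => Some(ApproveOnUse::PerCall),
        _ => None,
    }
}

/// Project override wins over the global index; an unset policy means `never`.
pub fn effective_policy(
    global: Option<ApproveOnUse>,
    project_override: Option<ApproveOnUse>,
) -> ApproveOnUse {
    project_override.or(global).unwrap_or(ApproveOnUse::Never)
}
```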

Decision contract:

- The dialog renders three buttons: `Allow always (this
  session)`, `Allow once`, `Deny`. They map to
  `ProvisionStatus::{Session, Once, Denied}` on the wire.
- `once` and `denied` are NOT cached. Only `session` populates
  the cache.
- The cache is process-scoped — closing the agent session
  clears every approval. Persisting the cache to disk is
  out-of-scope for v1 (a deliberate restraint: the user's
  trust window is the running session, not "for ever").
- The agent has no way to ESCALATE a `denied`. There is no
  MCP tool to override, extend, or forge an approval; the
  user must re-issue from the UI.
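
The caching rules in this contract can be sketched as below.
`ApprovalCacheSketch` is a hypothetical stand-in and not the real
`SessionApprovalCache`; the sketch also assumes that a cached session
approval lives for the same registry-wide 5-minute window that
`ttl_seconds` may narrow.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

pub const REGISTRY_TTL: Duration = Duration::from_secs(5 * 60);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Session,
    Once,
    Denied,
}

/// A requested TTL may only narrow the registry-wide window, never enlarge it.
pub fn effective_ttl(requested_secs: Option<u64>) -> Duration {
    match requested_secs {
        Some(secs) => REGISTRY_TTL.min(Duration::from_secs(secs)),
        None => REGISTRY_TTL,
    }
}

/// In-memory only, so dropping it (process exit) clears every approval.
#[derive(Default)]
pub struct ApprovalCacheSketch {
    approved_until: HashMap<String, Instant>,
}

impl ApprovalCacheSketch {
    /// Records a user decision; only `Session` is cached.
    pub fn record(&mut self, path: &str, decision: Decision, ttl: Duration, now: Instant) {
        if decision == Decision::Session {
            self.approved_until.insert(path.to_string(), now + ttl);
        }
    }

    /// True when a `session`-policy resolve of `path` may skip the dialog.
    pub fn is_approved(&self, path: &str, now: Instant) -> bool {
        self.approved_until.get(path).map_or(false, |until| now < *until)
    }
}
```

A `per-call` path would simply never consult this cache.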

Threat model alignment:

- The agent cannot bypass approve-on-use because the gate
  fires inside the alias resolver before the value leaves the
  daemon — same trust boundary as "agent never sees value".
- The reason string is rendered as a label, never as an edit
  field, mirroring the prompt-injection mitigation already in
  place for `propose_metadata` (§3.4 of this ADR).
- `ttl_seconds` is one-way — clients can shrink the window but
  not enlarge it. The 5-minute registry TTL is the upper bound.

#### "Agent never sees value" as an enforced invariant

The MCP server's tool-registration code asserts at startup that no
registered tool returns a `SecretString` directly to the agent.
This is enforced by:

- A trait bound on the tool result type: tools may return
  `serde_json::Value` plus a typed wrapper, but the wrapper does
  not implement `Serialize` if it contains a `SecretString`
  (per ADR-019).
- A pre-commit grep gate in `crates/devboy-mcp/` that flags any
  use of `.expose_secret()` outside of the high-level provider
  tools' internal HTTP-call construction.
- A negative test in `crates/devboy-mcp/tests/` that constructs
  every registered tool, drives it with mock inputs, and asserts
  no tool's result serialises any `SecretString`-derived value.

These checks are meant to act as hard guarantees: a contributor who
adds a tool that *would* return a value receives a compile error
or a CI failure, not a code review comment they might miss.
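
The compile-time half of this idea can be illustrated with a marker
trait. This is a simplified analogue, not the `Serialize`-based bound
itself: `AgentSafe`, `ToolRegistry`, `OperationOutcome`, and
`SecretValue` are all hypothetical names.

```rust
/// Marker for result types that are allowed to cross the agent boundary.
pub trait AgentSafe {}

pub struct OperationOutcome {
    pub ok: bool,
    pub summary: String,
}
impl AgentSafe for OperationOutcome {}

/// Deliberately NOT `AgentSafe`, so no registered tool can return it.
#[allow(dead_code)]
pub struct SecretValue(String);

pub struct ToolRegistry {
    names: Vec<&'static str>,
}

impl ToolRegistry {
    pub fn new() -> Self {
        Self { names: Vec::new() }
    }

    /// Only tools whose result type is `AgentSafe` can be registered.
    pub fn register<R: AgentSafe>(&mut self, name: &'static str, _tool: fn() -> R) {
        self.names.push(name);
    }

    pub fn names(&self) -> &[&'static str] {
        &self.names
    }
}

pub fn create_merge_request() -> OperationOutcome {
    OperationOutcome { ok: true, summary: "merge request opened".to_string() }
}

// A tool returning the raw value does not compile:
//
//     fn leak() -> SecretValue { SecretValue(String::new()) }
//     registry.register("secrets.get", leak);
//     // error[E0277]: the trait bound `SecretValue: AgentSafe` is not satisfied
```

The grep gate and the negative test remain necessary because a marker
trait alone cannot stop someone from copying an exposed value into an
otherwise permitted type.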

### 3.8 Onboarding skill `setup-secrets`

A new skill at `skills/setup-secrets/skill.md` walks the user
through the framework on first run. The existing `setup` skill
gains a `secrets bootstrap` step that delegates to `setup-secrets`
when a project has a `.devboy/secrets.toml` manifest with at least
one required path; otherwise the step is a no-op so projects that
don't use secrets pay no cost.

#### Flow

The skill is **idempotent**: a re-run resumes from the first
incomplete step. State is recorded in
`~/.devboy/secrets/setup-state.toml`.

1. **Vault state.** Check `local-vault.dvb`. If absent and the
   keychain is available, skip to step 4 (the user does not need
   a local vault). If absent and the keychain is not available,
   proceed to step 2.
2. **Create vault.** Prompt for a passphrase (min 12 chars,
   confirmed twice). Generate a 24-word recovery phrase, display
   it once with explicit acknowledgement, do not store it. Write
   `local-vault.dvb` with passphrase + recovery envelopes.
3. **Optional Touch ID.** On macOS, ask the user whether to add
   a Touch-ID unlock. If yes, add a keychain envelope.
4. **Configure routing.** Walk through the candidate sources.
   For each available source, register it in `sources.toml`;
   for each unavailable source (e.g. `op` not installed),
   record a skipped status with a one-line "install with `brew
   install 1password-cli`" hint. The keychain (or local-vault)
   is configured as `[default]` automatically.
5. **Walk required secrets.** For each path in
   `.devboy/secrets.toml`'s `required` list:
   - Run `secrets.describe(path)` to fetch metadata.
   - If already provisioned and not expiring, skip.
   - Otherwise, open the provision dialog (section 3.4) and
     record the outcome.
6. **Walk optional secrets.** Same flow as step 5, except that a
   missing value is reported informationally and can be skipped.
7. **Run validation.** Format + liveness for every required
   path. Failures are surfaced inline; the user can re-enter or
   skip.
8. **Run `doctor`.** Expected to pass. If `doctor` fails, the
   skill loops back to the failing step with the specific path
   and reason.

Each step emits one structured message to the agent
(`{step, status, summary, next_options}`), and the agent presents
the user with explicit `next` / `skip` / `abort` choices. The
skill does **not** continue to the next step on its own; it
always waits for the user.
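
Idempotent resume can be sketched as "find the first step not yet
recorded as complete". The names below are hypothetical, and the
on-disk shape of `setup-state.toml` is not assumed; the sketch takes
the completed steps as an in-memory list. The branch for an existing
vault in step 1 is not spelled out above, so the sketch returns `None`
for it rather than guessing.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    VaultState,
    CreateVault,
    OptionalTouchId,
    ConfigureRouting,
    RequiredSecrets,
    OptionalSecrets,
    Validation,
    Doctor,
}

/// The eight steps in flow order.
pub const STEPS: [Step; 8] = [
    Step::VaultState,
    Step::CreateVault,
    Step::OptionalTouchId,
    Step::ConfigureRouting,
    Step::RequiredSecrets,
    Step::OptionalSecrets,
    Step::Validation,
    Step::Doctor,
];

/// First step not yet completed, in flow order; `None` when all are done.
pub fn resume_from(completed: &[Step]) -> Option<Step> {
    STEPS.iter().copied().find(|step| !completed.contains(step))
}

/// Step 1 branching for an absent vault; `None` when the vault already exists
/// (that case is left unspecified here).
pub fn after_vault_check(vault_exists: bool, keychain_available: bool) -> Option<Step> {
    match (vault_exists, keychain_available) {
        (false, true) => Some(Step::ConfigureRouting),
        (false, false) => Some(Step::CreateVault),
        (true, _) => None,
    }
}
```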

#### Documentation

`docs/guide/secrets/onboarding.md` (planned) describes the same
flow for users running `devboy secrets bootstrap` manually without
the agent. The skill and the manual flow share state through
`setup-state.toml`; mixing them is supported.

## Consequences

### Positive

- ✅ **Headless Linux works without external infrastructure.**
  The local vault is a real, encrypted store; "no keychain, no
  Vault, no 1Password" is now a supported configuration.
- ✅ **The agent never sees secret values through the MCP surface.**
  The agent tool surface is designed around request/poll, the
  local-vault never hands out `vault_key`, and the high-level
  provider tools resolve credentials server-side. The `secrets.get`
  tool simply does not exist on the agent surface. This is a
  property of the surface, not isolation against a shell-capable
  agent (see §3.7 and the threat-model alignment above).
- ✅ **Rotation is a guided action, not a notification.** The
  rotation dialog opens the right URL, validates the new value,
  and records the rotation in one flow — the user does not have
  to remember to update `last_rotated_at`.
- ✅ **Pattern catalogue is shared infrastructure.** The same
  regex and severity drive secret-store validation, OTLP
  sanitization (#240), and OTEL artifact scanning (#242). One
  source of truth, three consumers.
- ✅ **Onboarding is data, not lore (continued from ADR-020).**
  The `setup-secrets` skill walks every required path through a
  consistent UI, with idempotent resume on interruption.
- ✅ **One binary, two UIs.** TUI on `ratatui` and GUI on `egui`
  share view code; the user gets the right interface for their
  context without an extra install step.
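
One way the shared view code in the last bullet could be structured is sketched below. The `Frontend` trait and the `Recording` backend are hypothetical: they stand in for adapters over `ratatui` and `egui` and deliberately call neither library.

```rust
/// Backend-neutral drawing operations (hypothetical trait; real
/// adapters would implement it on top of `ratatui` and `egui`).
pub trait Frontend {
    fn label(&mut self, text: &str);
    fn masked_input(&mut self, field: &str);
}

/// View code written once against `Frontend`, independent of backend.
pub fn provision_view(ui: &mut dyn Frontend, path: &str) {
    ui.label(&format!("Provision {}", path));
    ui.masked_input("value");
}

/// Stand-in backend that records draw calls instead of rendering to
/// a terminal or a window.
#[derive(Default)]
pub struct Recording {
    pub calls: Vec<String>,
}

impl Frontend for Recording {
    fn label(&mut self, text: &str) {
        self.calls.push(format!("label:{}", text));
    }

    fn masked_input(&mut self, field: &str) {
        self.calls.push(format!("input:{}", field));
    }
}
```

A recording backend of this kind would also let the view logic be tested without a terminal or a display server.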

### Negative

- ❌ **One more long-running process.** The on-demand daemon is
  the smallest version of this; the persistent install adds a
  launchd / systemd service to manage. Both are testable but the
  surface is larger than "stateless CLI".
- ❌ **An umbrella ADR.** Putting eight sub-decisions in one file
  trades enforceability of the "single decision per ADR" rule
  for cohesion. Future evolution may need to split sections out
  (the pattern catalogue is the most likely candidate).
- ❌ **GUI dependencies in the main binary.** `egui` and the
  `winit` stack add ~3 MB to the release binary. Users who only
  use the TUI pay this cost. Mitigation: feature-gate the GUI
  behind a default-on `gui` feature (`default = ["gui"]` under
  `[features]` in `devboy-cli`'s `Cargo.toml`); CI binaries can
  opt out with `--no-default-features`.
- ❌ **Two unlock paths means two places for unlock bugs to
  hide.** A failure in either the passphrase or the keychain
  envelope must surface clearly; the daemon's `vault.unlock`
  result enumerates the failure cause.
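
The enumerated failure cause in the last bullet might look like the sketch below. The variant names and messages are assumptions; the ADR only states that the `vault.unlock` result distinguishes causes.

```rust
use std::fmt;

/// Which envelope an unlock attempt went through.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnlockPath {
    Passphrase,
    KeychainEnvelope,
}

/// Failure causes a `vault.unlock` result could carry
/// (hypothetical variants, chosen for illustration).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum UnlockError {
    WrongPassphrase,
    KeychainUnavailable,
    EnvelopeMissing(UnlockPath),
    CorruptVault,
}

impl fmt::Display for UnlockError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UnlockError::WrongPassphrase => {
                write!(f, "passphrase did not decrypt the vault key")
            }
            UnlockError::KeychainUnavailable => {
                write!(f, "OS keychain is not reachable")
            }
            UnlockError::EnvelopeMissing(path) => {
                write!(f, "vault has no {:?} envelope", path)
            }
            UnlockError::CorruptVault => {
                write!(f, "vault file failed integrity checks")
            }
        }
    }
}
```

Keeping the unlock path inside the error lets the UI say which of the two mechanisms failed instead of reporting a generic "unlock failed".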

### Risks

- ⚠️ **Lost recovery phrase = lost vault.** The user who loses
  passphrase + keychain unlock + recovery phrase has no recourse.
  **Mitigation:** the bootstrap flow shows the recovery phrase
  with explicit acknowledgement; the documentation calls this
  out; users on teams with external sources should route the
  primary tree through the external source and use the local
  vault only for paths they can afford to recreate.
- ⚠️ **Daemon survives a session it shouldn't.** A user logs
  out without the daemon receiving SIGTERM (a hard kill of the
  parent shell, for example) and the unlocked key persists in
  RAM until the OS reclaims the page. **Mitigation:** the idle
  timeout (15 min default) bounds the window and `vault.lock` is
  exposed for explicit re-lock; the Argon2id KDF cost keeps
  brute force on the wrapped key expensive, although it protects
  only the wrapped copy, not a key already unlocked in RAM.
- ⚠️ **A compromised egui dependency could log keystrokes.**
  GUI input goes through `winit` and `egui`'s text-edit widget.
  **Mitigation:** the dependency is pinned to a vetted
  version, audited at release; the TUI fallback is available
  for users who prefer to limit the GUI surface.
- ⚠️ **Pattern catalogue becomes a maintenance burden.** New
  providers, new token formats, new rotation patterns
  accumulate. **Mitigation:** the user-extension mechanism
  (`patterns.d/`) lets new patterns ship out-of-band; the core
  catalogue ships ~30 patterns covering the most common token
  shapes (GitHub / GitLab / AWS / GCP / OpenAI / Anthropic /
  Slack / Stripe / Vault / common JWT shapes), leaving the long
  tail to user extensions.
- ⚠️ **Agent abuses `request_provision` for prompt injection.**
  A malicious prompt could ask the agent to call
  `request_provision` for a path that opens a dialog asking the
  user to enter another credential under a misleading
  `description`. **Mitigation:** the dialog renders metadata
  from the global index, **not** strings supplied by the agent;
  `propose_metadata` shows the proposed change as a diff, not
  as an applied state.
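
The mitigation in the last bullet can be illustrated with a minimal sketch. All types here are hypothetical, and refusing unknown paths is an assumption of the sketch rather than something this ADR states. The point is only that the dialog text is looked up by path in the trusted index and that any description sent by the agent never reaches the renderer.

```rust
use std::collections::HashMap;

/// Trusted metadata from the global index (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Metadata {
    pub description: String,
}

/// What the agent sends. The `description` field is modelled only to
/// show that it is ignored when the dialog is built.
pub struct ProvisionRequest {
    pub path: String,
    pub description: Option<String>,
}

/// Text shown to the user.
#[derive(Debug, PartialEq, Eq)]
pub struct Dialog {
    pub title: String,
    pub body: String,
}

/// Builds the dialog from index metadata only. Paths that are not in
/// the index yield `None`, so no dialog opens for them.
pub fn build_dialog(
    index: &HashMap<String, Metadata>,
    req: &ProvisionRequest,
) -> Option<Dialog> {
    let meta = index.get(&req.path)?;
    Some(Dialog {
        title: format!("Provision {}", req.path),
        // Deliberately not `req.description`: agent text is untrusted.
        body: meta.description.clone(),
    })
}
```

In this sketch the only agent-controlled input that influences the dialog is the path, and it acts purely as a lookup key.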

## Alternatives Considered

### Alternative 1: No local vault — keychain or external sources only

**Description:** Drop section 3.1 / 3.2 / 3.3 entirely; users on
headless machines without a keychain must use an env-store or set
up an external source.

**Why rejected:** The "I want to work off-network on a Linux box
without provisioning Vault" case is real for short-lived
projects, demos, and contractor laptops. Forcing those users to
either set up a full Vault or paste tokens into env vars is
exactly the friction this ADR is designed to remove. The local
vault is also useful as a deliberate choice for users who want
some paths off-keychain — the routing layer makes that a per-path
decision.

### Alternative 2: Keychain-only with no daemon

**Description:** Skip the daemon entirely; rely on the OS
keychain's own session management.

**Why rejected:** The keychain is one source among five
(ADR-021); the local-vault still needs to manage unlocked state,
and that state lives in the daemon. The same daemon also batches
biometric prompts, so an agent loop does not trigger N keychain
unlocks for N reads.
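
The batching argument can be made concrete with a small sketch. The `Session` type is hypothetical and models only the prompt count, not the daemon or any keychain API.

```rust
/// Tracks unlocked state for one daemon session and counts how often
/// the user is prompted (a stand-in for a keychain or biometric prompt).
pub struct Session {
    unlocked: bool,
    pub prompts: u32,
}

impl Session {
    pub fn new() -> Self {
        Session { unlocked: false, prompts: 0 }
    }

    /// Called before every read. Returns `true` only when this call
    /// had to prompt the user; later calls reuse the unlocked state.
    pub fn ensure_unlocked(&mut self) -> bool {
        if self.unlocked {
            return false;
        }
        self.prompts += 1; // the single user-visible prompt
        self.unlocked = true;
        true
    }

    /// Explicit re-lock; the next read prompts again.
    pub fn lock(&mut self) {
        self.unlocked = false;
    }
}
```

Five consecutive reads against one `Session` produce one prompt; without the shared state each read would prompt on its own.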

### Alternative 3: Sealed file with sops/age and no daemon

**Description:** Use sops or age over a TOML file and unlock
on each read.

**Why rejected:** Sops pulls in a backend (age, GPG, KMS) and
turns the unlock step into a per-read prompt. The daemon's
single unlock per session is the ergonomic difference; without
it the local vault is unusable for any flow that touches more
than one secret.

### Alternative 4: Web UI on localhost

**Description:** Replace the `egui` GUI with a localhost-bound
HTTP server and a browser frontend.

**Why rejected:** Browser frontends pull in CORS, certificate,
and clipboard concerns that a native widget set does not have.
The TUI fallback covers the "I am SSH'd into a server and want
to provision a secret" case; the browser would not help there.
A future companion web UI for team admin views is not ruled out
but is out of scope here.

### Alternative 5: Provider-driven auto-rotation in v1

**Description:** Ship per-provider rotators alongside the manual
flow.

**Why rejected:** Provider rotation APIs are heterogeneous and
each one is a separate research project (which credential rotates
which, what scopes are needed for the rotate call itself, what
happens to in-flight tokens during rotation). Manual-assisted
rotation covers 100% of providers today; provider-driven
rotation is best handled per-provider once we have rotation
telemetry from the manual flow.

### Alternative 6: Eight separate ADRs

**Description:** Split this ADR into ADR-023 through ADR-030
along the eight section boundaries.

**Why rejected:** The components are designed against each
other; review would have to thread through all eight to see
the whole. The "single decision per ADR" convention is a
default; this ADR is an explicit exception, marked as such in
the Status section, with the candidate-for-promotion path
named.

## Implementation

- **Issues:**
  - [#247](https://github.com/meteora-pro/devboy-tools/issues/247) — implementation, phased; the Phase 6 cluster splits along this ADR's section boundaries
  - [#240](https://github.com/meteora-pro/devboy-tools/issues/240) — OTLP sanitizer; consumes `devboy-secret-patterns`
  - [#242](https://github.com/meteora-pro/devboy-tools/issues/242) — OTEL scan; also consumes `devboy-secret-patterns`
  - To be filed — design refresh covering rewritten ADR-020/021 and this new ADR-023
- **Code (planned):**
  - `crates/devboy-vault-crypto/` — file format + AEAD + KDF
  - `crates/devboy-secrets-agent/` — daemon, JSON-RPC server,
    socket handling
  - `crates/plugins/secrets/local-vault/` — `SecretSource`
    implementation talking to the daemon
  - `crates/devboy-secret-patterns/` — pattern catalogue
  - `crates/devboy-secrets-ui/` — `ratatui` + `egui` UIs (one
    crate, two backends behind `cfg`)
  - `crates/devboy-cli/` — `devboy secrets {ui, vault, agent,
    rotate, provision, bootstrap, recover, migrate}` subcommands
  - `crates/devboy-mcp/` — `secrets.*` MCP tool surface, the
    "agent never sees value" enforcement, the manifest-gated
    resolver in provider tools
  - `skills/setup-secrets/` — onboarding skill
- **Documentation (planned):**
  - `docs/guide/secrets/onboarding.md` — manual bootstrap flow
  - `docs/guide/secrets/local-vault.md` — file format, recovery,
    backup recommendations
  - `docs/guide/secrets/agent-protocol.md` — the MCP surface,
    aimed at agent authors
  - `docs/guide/secrets/source-plugin-protocol.md` — from
    ADR-021 section 6, cross-referenced

## References

- [ADR-005: Credential storage](./ADR-005-credential-storage.md)
- [ADR-019: Secrets carry SecretString end-to-end](./ADR-019-secret-string-discipline.md)
- [ADR-020: Secret manifest, path convention, and alias resolution](./ADR-020-secret-manifest-and-alias-resolution.md)
- [ADR-021: External secret sources and backend routing](./ADR-021-external-secret-sources.md)
- [`secrecy` crate documentation](https://docs.rs/secrecy/)
- [`chacha20poly1305` crate (RustCrypto)](https://docs.rs/chacha20poly1305/)
- [`argon2` crate (RustCrypto)](https://docs.rs/argon2/)
- [BIP-39: Mnemonic code for generating deterministic keys](https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki)
- [`ratatui` book](https://ratatui.rs/)
- [`egui` documentation](https://docs.rs/egui/)
- [Model Context Protocol](https://modelcontextprotocol.io/) — wire-format reference for the daemon and agent protocols
- [RFC 8439: ChaCha20 and Poly1305 for IETF Protocols](https://datatracker.ietf.org/doc/html/rfc8439)

---

## Changelog

| Date | Author | Change |
|------|--------|--------|
| 2026-05-09 | Andrei Mazniak | Initial draft as the umbrella UX layer above ADR-021 routing — local vault crypto, daemon, native UI, manual-assisted rotation, pattern catalogue, agent provisioning protocol, `setup-secrets` skill |
| 2026-05-10 | Andrei Mazniak | §3.7 P25 phase — added `secrets_request_use_approval` MCP tool + `approve_on_use` policy field with `never / session / per-call` semantics, decision contract (`once / session / denied`) and process-scoped `SessionApprovalCache` |