--- name: printing-press description: Generate a ship-ready CLI for an API with a lean research -> generate -> build -> shipcheck loop. version: 2.0.0 min-binary-version: "4.0.0" allowed-tools: - Bash - Read - Write - Edit - Glob - Grep - WebFetch - WebSearch - AskUserQuestion - Agent --- # /printing-press Generate the best useful CLI for an API without burning an hour on phase theater. ```bash /printing-press Notion /printing-press Discord codex /printing-press --spec ./openapi.yaml /printing-press --har ./capture.har --name MyAPI /printing-press https://postman.com/explore /printing-press https://postman.com ``` ## What Changed In v2 The old skill inflated the path to ship: - too many mandatory research documents before code existed - too many separate late-stage validation phases after code existed - too many chances to discover obvious failures late This version uses one lean loop: 1. Resolve the spec and write one research brief 2. Generate 3. Build the highest-value gaps 4. Run one shipcheck block 5. Optionally run live API smoke tests Artifacts are still written, but only the ones that materially help the next step. ## Modes ### Default Normal mode. Claude does research, generation orchestration, implementation, and verification. ### Codex Mode If the arguments include `codex` or `--codex`, offload pure code-writing tasks to Codex CLI. Use Codex for: - writing store/data-layer code - writing workflow commands - fixing dead flags / dead code / path issues - README cookbook edits Keep on Claude: - research and product positioning - choosing which gaps matter - verification results and ship decisions If Codex fails 3 times in a row, stop delegating and finish locally. ### Polish Mode (Standalone Skill) For second-pass improvements to an existing CLI, use the standalone polish skill: ```bash /printing-press-polish redfin ``` See the `printing-press-polish` skill for details. It runs diagnostics, fixes verify failures, removes dead code, cleans up descriptions and README, and offers to publish. ## Rules - **Do not ship a CLI that hasn't been behaviorally tested against real targets.** `go build` and `verify` pass-rate are structural signals, not correctness signals. Phase 5's mechanical test matrix runs every subcommand + `--json` + error paths; if that matrix was not executed, the CLI is not shippable. Quick Check is the floor; Full Dogfood is required when the user asks for thoroughness. - **Bugs found during dogfood are fix-before-ship, not "file for v0.2".** If a 1-3 file edit resolves it, do it now. `ship-with-gaps` is deprecated as a default verdict (see Phase 4). Context is freshest in-session; a v0.2 backlog that may never be revisited ships known-broken CLIs. - **Features approved in Phase 1.5 are shipping scope.** Do not downgrade a shipping-scope feature to a stub mid-build. If implementation becomes infeasible, return to Phase 1.5 with a revised manifest and get explicit re-approval. - **Do not quote human-time estimates for sub-tasks** ("~15-30 min", "~1 hour", "quick fix") in `AskUserQuestion` options, phase descriptions, or reference docs. The agent does the work, not the user; agent-fabricated estimates are notoriously bad and train users to distrust the prompt. Describe scope instead (lines of code, files touched, relative size). The carve-outs are wall-clock estimates for genuinely time-bound things: the whole-CLI run (set the user's expectation up front — most CLIs take 30+ minutes), tool installs (`go install` takes ~10 seconds), and printing-press subcommands that do network-bound work (crowd-sniff scans npm + GitHub, ~5-10 minutes). Anything bounded by agent reasoning time is not time-bound — describe scope. - **Use raw captures for contract research.** When reading official docs, auth/error/rate-limit pages, endpoint references, OpenAPI/Postman links, or source pages whose exact identifiers affect the generated CLI, read [references/fetch-docs.md](references/fetch-docs.md) and use its `fetch-docs.sh` helper. Reserve `WebFetch` for quick TL;DR reads where losing field-level details is acceptable. - Optimize for time-to-ship, not time-to-document. - Reuse prior research whenever it is already good enough. - Do not split one idea across multiple mandatory artifacts. - Durable files produced by this skill go under `$PRESS_RUNSTATE/` (working state) or `$PRESS_MANUSCRIPTS/` (archived). Short-lived command captures may use `/tmp/printing-press/` and must be removed after use. - Do not create a separate narrative phase for dogfood, dead-code audit, runtime verification, and final score. Treat them as one shipcheck block. - Run cheap, high-signal checks early. - Fix blockers and high-leverage failures first. - Reuse the same spec path across `generate`, `dogfood`, `verify`, and `scorecard`. - YAML, JSON, local paths, and URLs are all valid spec inputs for the verification tools. - Maximum 2 verification fix loops unless the user explicitly asks for more. ## Secret & PII Protection (Cardinal Rules) **These rules are non-negotiable. They apply at ALL times during a run.** API key **values**, token **values**, passwords, and session cookies must NEVER appear in any artifact: source code, manuscripts, proofs, READMEs, HARs, or anything committed to git. Env var **names** (e.g., `STEAM_API_KEY`) and placeholders (e.g., `"your-key-here"`) are safe. During Phase 5.6 (archiving) and before publishing, read and apply [references/secret-protection.md](references/secret-protection.md) for: - Exact-value scanning and auto-redaction of artifacts - HAR auth stripping (headers, query strings, cookies) - API key handling rules during the run - Session state cleanup ordering ## Preflight **This section MUST run before any user-facing prompt — including the Orientation and Briefing flow below.** A missing binary or available upgrade is information the user needs *before* they commit to an API. Do not invoke `AskUserQuestion`, print the orientation prose, or otherwise engage the user until preflight has completed and any signals from `references/setup-checks.md` have been handled. ```bash # min-binary-version: 4.0.0 # Derive scope first — needed for local build detection _scope_dir="$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")" _scope_dir="$(cd "$_scope_dir" && pwd -P)" _press_repo=false if [ -d "$_scope_dir/cmd/cli-printing-press" ] && [ -f "$_scope_dir/go.mod" ]; then _press_repo=true fi _resolve_press_bin() { if command -v cli-printing-press >/dev/null 2>&1; then command -v cli-printing-press return 0 fi if command -v printing-press >/dev/null 2>&1 && printing-press version --json >/dev/null 2>&1; then command -v printing-press return 0 fi return 1 } # Prefer local build when running from inside the printing-press repo. # The lefthook build hook keeps ./cli-printing-press current after every commit/pull, # so it's always newer than the go-install version. if [ "$_press_repo" = "true" ] && [ -x "$_scope_dir/cli-printing-press" ]; then export PATH="$_scope_dir:$PATH" echo "Using local build: $_scope_dir/cli-printing-press" elif ! _resolve_press_bin >/dev/null; then # Augment PATH if the binary is in ~/go/bin but not on the user's interactive PATH. if [ -x "$HOME/go/bin/cli-printing-press" ]; then export PATH="$HOME/go/bin:$PATH" elif [ -x "$HOME/go/bin/printing-press" ] && "$HOME/go/bin/printing-press" version --json >/dev/null 2>&1; then export PATH="$HOME/go/bin:$PATH" else # Refuse: the cli-printing-press binary is required and we will not auto-install # it. The README's install flow is the source of truth; # silent auto-install hides failure modes (network, wrong GOPATH) inside an # opaque skill invocation. echo "" echo "[setup-error] cli-printing-press binary not found." echo "" if command -v go >/dev/null 2>&1; then echo "Install it in your terminal:" echo " go install github.com/mvanhorn/cli-printing-press/v4/cmd/cli-printing-press@latest" else echo "Go 1.26.3 or newer is also not installed. Install Go from https://go.dev/dl/, then:" echo " go install github.com/mvanhorn/cli-printing-press/v4/cmd/cli-printing-press@latest" fi echo "" echo "Verify with: cli-printing-press --version" echo "Then re-run /printing-press." return 1 2>/dev/null || exit 1 fi fi # Verify the Go toolchain is on PATH. Generation runs Go-based quality gates # (go mod tidy, go vet, etc.) after writing thousands of lines of scaffolding, # so a missing `go` only surfaces 5+ minutes in. Fail-fast costs one command -v # call when Go is present and converts a late, opaque failure into a 30-second # actionable abort. if ! command -v go >/dev/null 2>&1; then echo "" echo "[setup-error] Go toolchain not found." echo "" echo "The Printing Press generator runs Go-based quality gates after generation." echo "Install Go 1.26.3 or newer from https://go.dev/dl/, then verify with:" echo " go version" echo "Then re-run /printing-press." echo "" return 1 2>/dev/null || exit 1 fi # Verify the installed Go tree can compile and run common standard library # imports. A truncated Go extraction can leave the binary working enough for # `go version` while missing packages under $GOROOT/src, which otherwise fails # deep into generation during later Go quality gates. _go_smoke_root="${PRINTING_PRESS_GO_SMOKE_DIR:-$HOME/.printing-press-smoke}" if ! mkdir -p "$_go_smoke_root"; then echo "" echo "[setup-error] Unable to create Go smoke-test workspace at $_go_smoke_root." echo "Set PRINTING_PRESS_GO_SMOKE_DIR to a writable non-temp directory and retry." echo "" return 1 2>/dev/null || exit 1 fi _go_smoke_dir="$(mktemp -d "$_go_smoke_root/stdlib.XXXXXX" 2>/dev/null || true)" if [ -z "$_go_smoke_dir" ]; then echo "" echo "[setup-error] Unable to create Go smoke-test workspace under $_go_smoke_root." echo "Set PRINTING_PRESS_GO_SMOKE_DIR to a writable non-temp directory and retry." echo "" return 1 2>/dev/null || exit 1 fi cat > "$_go_smoke_dir/go.mod" <<'__PP_GO_SMOKE_MOD__' module pp-go-stdlib-smoke go 1.20 __PP_GO_SMOKE_MOD__ cat > "$_go_smoke_dir/main.go" <<'__PP_GO_SMOKE_MAIN__' package main import ( "context" "encoding/json" "fmt" "io" "net/http" "regexp" ) func main() { ctx := context.Background() payload, err := json.Marshal(map[string]string{"status": "ok"}) if err != nil { panic(err) } if !regexp.MustCompile(`ok`).Match(payload) { panic("regexp mismatch") } req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://example.com", nil) if err != nil { panic(err) } _, _ = fmt.Fprint(io.Discard, req.Method) } __PP_GO_SMOKE_MAIN__ if ! (cd "$_go_smoke_dir" && GOFLAGS= GOWORK=off go run . >/dev/null 2>"$_go_smoke_dir/error.log"); then _go_smoke_output="$(sed -n '1,12p' "$_go_smoke_dir/error.log" 2>/dev/null || true)" rm -rf "$_go_smoke_dir" echo "" echo "[setup-error] Go std library is incomplete (truncated or corrupted install)." echo "Reinstall Go from https://go.dev/dl/ and verify with the smoke test before retrying." if [ -n "$_go_smoke_output" ]; then echo "" echo "Go smoke test output:" printf '%s\n' "$_go_smoke_output" fi echo "" return 1 2>/dev/null || exit 1 fi rm -rf "$_go_smoke_dir" # Resolve and emit the absolute path the agent must use for every later # `cli-printing-press` invocation. `export PATH` above only affects this one # Bash tool call; subsequent calls open a fresh shell and resolve bare # `cli-printing-press` against the user's default PATH. When a global is # installed at a stale version, that silently shadows the local build the # preflight just chose. Handing the agent an absolute path eliminates the # shadow. if [ "$_press_repo" = "true" ] && [ -x "$_scope_dir/cli-printing-press" ]; then PRINTING_PRESS_BIN="$_scope_dir/cli-printing-press" else PRINTING_PRESS_BIN="$(_resolve_press_bin 2>/dev/null || true)" fi echo "PRINTING_PRESS_BIN=$PRINTING_PRESS_BIN" echo "PRESS_REPO_MODE=$_press_repo" # Shadow detector (advisory). When a local build is in use, surface any # differing global so the user can see at a glance that the two binaries # disagree. Detect-only: the absolute path emitted above is the one the # agent will actually invoke; this warning does not change selection. if [ "$_press_repo" = "true" ] && [ -x "$_scope_dir/cli-printing-press" ]; then _global_bin="" for _candidate in "$HOME/go/bin/cli-printing-press" "/usr/local/bin/cli-printing-press" "/opt/homebrew/bin/cli-printing-press" "$HOME/go/bin/printing-press" "/usr/local/bin/printing-press" "/opt/homebrew/bin/printing-press"; do if [ -x "$_candidate" ] && [ "$_candidate" != "$_scope_dir/cli-printing-press" ] && "$_candidate" version --json >/dev/null 2>&1; then _global_bin="$_candidate" break fi done if [ -n "$_global_bin" ]; then _local_v="$("$_scope_dir/cli-printing-press" version --json 2>/dev/null | sed -nE 's/.*"version"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p')" _global_v="$("$_global_bin" version --json 2>/dev/null | sed -nE 's/.*"version"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p')" if [ -n "$_local_v" ] && [ -n "$_global_v" ] && [ "$_local_v" != "$_global_v" ]; then echo "" echo "[binary-shadow] local build v$_local_v differs from global v$_global_v at $_global_bin" echo "PRESS_BIN_LOCAL_VERSION=$_local_v" echo "PRESS_BIN_GLOBAL_VERSION=$_global_v" echo "PRESS_BIN_GLOBAL_PATH=$_global_bin" echo "" fi fi fi PRESS_BASE="$(basename "$_scope_dir" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9_-]/-/g; s/^-+//; s/-+$//')" if [ -z "$PRESS_BASE" ]; then PRESS_BASE="workspace" fi PRESS_SCOPE="$PRESS_BASE-$(printf '%s' "$_scope_dir" | shasum -a 256 | cut -c1-8)" PRESS_HOME="${PRINTING_PRESS_HOME:-$HOME/printing-press}" PRESS_RUNSTATE="$PRESS_HOME/.runstate/$PRESS_SCOPE" PRESS_LIBRARY="$PRESS_HOME/library" PRESS_MANUSCRIPTS="$PRESS_HOME/manuscripts" PRESS_CURRENT="$PRESS_RUNSTATE/current" mkdir -p "$PRESS_RUNSTATE" "$PRESS_LIBRARY" "$PRESS_MANUSCRIPTS" "$PRESS_CURRENT" # --- Latest-version advisory (fail-open) --- # Repo checkouts track origin/main because their skills and local binary come # from the checkout. Standalone installs track the latest released Go module. PRESS_VERCHECK_FILE="$PRESS_HOME/.version-check" PRESS_VERCHECK_TTL=86400 _now_ts=$(date +%s) # Strict-older semver compare on the first three components. Pre-release # suffixes collapse to their GA counterpart (acceptable: we ship no pre-release # tags). _semver_lt() { awk -v a="$1" -v b="$2" 'BEGIN { split(a, x, ".") split(b, y, ".") for (i = 1; i <= 3; i++) { if ((x[i] + 0) < (y[i] + 0)) exit 0 if ((x[i] + 0) > (y[i] + 0)) exit 1 } exit 1 }' } _should_check=true if [ -f "$PRESS_VERCHECK_FILE" ] && [ -z "$PRESS_VERCHECK_FORCE" ]; then _last_ts=$(awk -F= '/^last_check=/{print $2}' "$PRESS_VERCHECK_FILE" 2>/dev/null) if [ -n "$_last_ts" ] && [ "$((_now_ts - _last_ts))" -lt "$PRESS_VERCHECK_TTL" ]; then _should_check=false fi fi if [ "$_press_repo" = "true" ]; then # Repo mode checks origin/main every run because the checkout and local build # move quickly; skipped_repo_main suppresses repeated prompts for one SHA. if git -C "$_scope_dir" remote get-url origin >/dev/null 2>&1 && git -C "$_scope_dir" fetch --quiet origin main >/dev/null 2>&1; then _head_rev=$(git -C "$_scope_dir" rev-parse HEAD 2>/dev/null || true) _main_rev=$(git -C "$_scope_dir" rev-parse origin/main 2>/dev/null || true) _skipped_repo_main="" if [ -f "$PRESS_VERCHECK_FILE" ] && [ -z "$PRESS_VERCHECK_FORCE" ]; then _skipped_repo_main=$(awk -F= '/^skipped_repo_main=/{value=$2} END{print value}' "$PRESS_VERCHECK_FILE" 2>/dev/null) fi if [ -n "$_head_rev" ] && [ -n "$_main_rev" ] && [ "$_head_rev" != "$_main_rev" ] && [ "$_skipped_repo_main" != "$_main_rev" ] && git -C "$_scope_dir" merge-base --is-ancestor "$_head_rev" "$_main_rev" 2>/dev/null; then echo "" echo "[repo-upgrade-available] origin/main has newer Printing Press changes" echo "PRESS_REPO_DIR=$_scope_dir" echo "PRESS_REPO_HEAD=$_head_rev" echo "PRESS_REPO_MAIN=$_main_rev" echo "" fi printf "last_check=%s\nlatest=%s\nmode=repo\nskipped_repo_main=%s\n" "$_now_ts" "${_main_rev:-unknown}" "$_skipped_repo_main" > "$PRESS_VERCHECK_FILE" 2>/dev/null || true fi elif [ "$_should_check" = "true" ] && command -v go >/dev/null 2>&1; then _installed=$("$PRINTING_PRESS_BIN" version --json 2>/dev/null | sed -nE 's/.*"version"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p') _latest="" if [ -n "$_installed" ]; then _latest=$(go list -m -json github.com/mvanhorn/cli-printing-press/v4@latest 2>/dev/null | awk ' /"Version":/ { version=$2 gsub(/[",]/, "", version) sub(/^v/, "", version) print version exit } ') fi # Currency floor: the lowest release still considered safe to generate with, # published out-of-band so maintainers can raise it without a binary or skill # release. Fetched here (throttled by the TTL above) and cached for the # always-run enforcement gate below. _min_supported="" _min_reason="" if command -v curl >/dev/null 2>&1; then _floor_doc=$(curl -fsSL --max-time 5 \ https://raw.githubusercontent.com/mvanhorn/cli-printing-press/main/supported-versions.txt 2>/dev/null || true) if [ -n "$_floor_doc" ]; then _min_supported=$(printf '%s\n' "$_floor_doc" | awk -F= '/^min_supported=/{print $2; exit}') _min_reason=$(printf '%s\n' "$_floor_doc" | sed -nE 's/^reason=//p' | head -n 1) fi fi if [ -n "$_installed" ] && [ -n "$_latest" ] && _semver_lt "$_installed" "$_latest"; then # Marker for the skill prose below to detect and offer an interactive upgrade. # The skill reads PRESS_UPGRADE_AVAILABLE / PRESS_UPGRADE_INSTALLED from this output. echo "" echo "[upgrade-available] printing-press v$_latest is available (you have v$_installed)" echo "PRESS_UPGRADE_AVAILABLE=$_latest" echo "PRESS_UPGRADE_INSTALLED=$_installed" echo "" fi printf "last_check=%s\nlatest=%s\nmode=standalone\nmin_supported=%s\nreason=%s\n" \ "$_now_ts" "${_latest:-$_installed}" "$_min_supported" "$_min_reason" > "$PRESS_VERCHECK_FILE" 2>/dev/null || true fi # --- Currency-floor enforcement (standalone, every run, fail-open) --- # The floor *fetch* above is throttled to once per TTL, but enforcement must run # every invocation: a fresh cache must never let a stale binary keep generating # CLIs with since-fixed bugs. Compare the always-fresh installed version against # the cached floor; the network is never touched here. Only enforce a floor that # is itself <= latest, so a typo'd or tampered floor above the newest release # cannot brick every install. if [ "$_press_repo" != "true" ] && [ -f "$PRESS_VERCHECK_FILE" ]; then _floor_min=$(awk -F= '/^min_supported=/{print $2; exit}' "$PRESS_VERCHECK_FILE" 2>/dev/null) _floor_latest=$(awk -F= '/^latest=/{print $2; exit}' "$PRESS_VERCHECK_FILE" 2>/dev/null) _floor_reason=$(sed -nE 's/^reason=//p' "$PRESS_VERCHECK_FILE" 2>/dev/null | head -n 1) _floor_installed=$("$PRINTING_PRESS_BIN" version --json 2>/dev/null | sed -nE 's/.*"version"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p') if [ -n "$_floor_min" ] && [ -n "$_floor_installed" ] && [ -n "$_floor_latest" ] && _semver_lt "$_floor_installed" "$_floor_min" && ! _semver_lt "$_floor_latest" "$_floor_min"; then echo "" echo "[upgrade-required] printing-press v$_floor_min is the minimum supported version (you have v$_floor_installed)" echo "PRESS_REQUIRED_MIN=$_floor_min" echo "PRESS_REQUIRED_INSTALLED=$_floor_installed" echo "PRESS_REQUIRED_REASON=$_floor_reason" echo "" fi fi # --- Browser-sniff backend advisory (fail-open, every-run) --- # browser-use and agent-browser are the preferred Phase 1.7 browser-sniff # backends. They are not hard requirements — vendor-spec, --spec, and --har # runs never invoke them — but when discovery does need them, mid-flight # install prompts are disruptive. Emit a marker every run so setup-checks.md # can strongly offer install. No decline caching: a run that didn't need them # yesterday may need them today, and the prompt cost is small. _browser_use_missing=true _agent_browser_missing=true # Use `command -v` only. Do NOT use `uvx browser-use --help` as a fallback # probe: when uvx exists but browser-use doesn't, that command silently # downloads and caches the package, which would be an unconsented install. # Downstream capture commands also invoke `browser-use` directly (not via # uvx), so a uvx-cache-only state would lie to the detection. if command -v browser-use >/dev/null 2>&1; then _browser_use_missing=false fi if command -v agent-browser >/dev/null 2>&1; then _agent_browser_missing=false fi if [ "$_browser_use_missing" = "true" ] || [ "$_agent_browser_missing" = "true" ]; then echo "" echo "[browser-tools-missing] one or more browser-sniff backends not installed" echo "PRESS_BROWSER_USE_MISSING=$_browser_use_missing" echo "PRESS_AGENT_BROWSER_MISSING=$_agent_browser_missing" echo "" fi # --- Codex mode detection (must run as part of setup, not a separate step) --- # Codex mode: opt-in only. User must pass "codex" or "--codex" to enable. if echo "$ARGUMENTS" | grep -qiE '(^| )(--?codex|codex)( |$)'; then CODEX_MODE=true else CODEX_MODE=false fi # Environment guard: don't delegate if already inside a Codex sandbox if [ "$CODEX_MODE" = "true" ]; then if [ -n "$CODEX_SANDBOX" ] || [ -n "$CODEX_SESSION_ID" ]; then CODEX_MODE=false fi fi # Health check: verify codex binary exists if [ "$CODEX_MODE" = "true" ]; then if command -v codex >/dev/null 2>&1; then # Model and reasoning effort inherit from ~/.codex/config.toml. Do not pin -m / -c here. CODEX_MODEL=$(grep -E '^model[[:space:]]*=' ~/.codex/config.toml 2>/dev/null | head -1 | sed -E 's/^model[[:space:]]*=[[:space:]]*"?([^"]+)"?.*$/\1/') [ -z "$CODEX_MODEL" ] && CODEX_MODEL="codex default" echo "Codex mode enabled (model: $CODEX_MODEL). Code-writing tasks will be delegated to Codex." else echo "Codex CLI not found - running in standard mode." CODEX_MODE=false fi fi # Circuit breaker state CODEX_CONSECUTIVE_FAILURES=0 ``` **MANDATORY: Read and apply [references/setup-checks.md](references/setup-checks.md) immediately after the setup contract bash block runs, before any other action.** It handles the contract output signals: `[setup-error]` (refuse to run, surface the install instructions), `[repo-upgrade-available]` (interactive `AskUserQuestion` prompt + optional repo pull), `PRESS_REPO_MODE=` plus the targeted global open-agent-skills freshness check, the min-binary-version compatibility check (hard stop if binary is too old), `[upgrade-required]` (hard gate below the published currency floor — interactive upgrade-or-abort, no skip), `[upgrade-available]` (interactive `AskUserQuestion` prompt + optional standalone binary upgrade), `[browser-tools-missing]` (interactive `AskUserQuestion` prompt + optional install of browser-use and/or agent-browser), and the `PRINTING_PRESS_BIN=` marker plus optional `[binary-shadow]` warning (capture the path; use it for every subsequent generator invocation). Skipping the reference will cause the skill to proceed with a missing or out-of-date binary, run with stale global skill text when the session is managed by open-agent-skills, hit a mid-flight install prompt if browser-sniff is later needed, or invoke the wrong binary because a stale global or the public catalog installer on `PATH` shadowed the local build. Do not skip. **Absolute-path rule.** The preflight contract always emits `PRINTING_PRESS_BIN=` to stdout. Capture this value and substitute it (the resolved absolute path, not the literal `$PRINTING_PRESS_BIN` token) for every subsequent `cli-printing-press ...` invocation in this skill, references, and any sub-skill you delegate to. The `export PATH=...` line inside the contract only affects the single Bash tool call it runs in; later Bash tool calls open fresh shells and resolve bare `cli-printing-press` against the user's default `PATH`, where a stale globally-installed binary (`$HOME/go/bin/cli-printing-press`, Homebrew copy, etc.) will silently shadow the local build the preflight just chose. Bash code examples below are written `cli-printing-press generate ...` for readability — replace `cli-printing-press` with the captured absolute path each time you actually run one. Only after preflight completes successfully (no `[setup-error]`; no `[upgrade-required]` left unresolved — the user either upgraded or the run was aborted; no global skill update that requires restart; any `[repo-upgrade-available]`, `[upgrade-available]`, or `[browser-tools-missing]` was offered to the user; `PRINTING_PRESS_BIN` is captured) should you proceed to the Orientation & Briefing section below. ## Orientation & Briefing After preflight has completed, check whether the user provided arguments. Handle two cases: ### No Arguments: Orientation If the user typed `/printing-press` with no arguments (no API name, no `--spec`, no `--har`, no URL), print an orientation and ask what they'd like to build: > The Printing Press generates a fully functional CLI for any API. You give it an API name, a spec file, or a URL. It researches the landscape, catalogs every feature that exists in any competing tool, invents novel features of its own, then generates a Go CLI that matches and beats everything out there — with offline search, agent-native output, and a local SQLite data layer. > > By the end, you'll have a working CLI in `$PRESS_LIBRARY/` that you can use for yourself, ship on your own, or apply to add to the printing-press library. > > The process takes 30-60 minutes depending on API complexity. Simple APIs with official specs (Stripe, GitHub) are faster. Undocumented APIs that need discovery (ESPN, Domino's) take longer. Print these example invocations as plain text BEFORE the `AskUserQuestion` call (so they appear as context above the question, not as competing menu options): ``` /printing-press Notion /printing-press Discord codex /printing-press --spec ./openapi.yaml /printing-press --har ./capture.har --name MyAPI /printing-press https://postman.com ``` Then ask via `AskUserQuestion`: - **question:** `"What API would you like to build a CLI for?"` - **header:** `"API target"` - **multiSelect:** `false` - **options:** 1. **label:** `"Type it (recommended)"` — **description:** `"Provide an API name, URL, spec path, or HAR file via the 'Other' option below."` 2. **label:** `"Browse existing CLIs first"` — **description:** `"Visit the public library to see what's already been printed before deciding what to build."` **Do not add additional options** — no "Show me popular options", no pre-populated buttons for Notion / Stripe / GitHub / Linear / Discord. The example invocations above already cover the common shapes, and most popular APIs are already in the public library (offering to re-print them is noise). The two options above plus the automatic "Other" field is the entire interface. If the user picks **"Type it (recommended)"**, they will provide their answer via the auto "Other" field. Set their input as the argument and proceed to the briefing below. If the user picks **"Browse existing CLIs first"**, print the public library URL prominently and try to open it in the browser, then end the skill so the user can browse before deciding: ```bash echo "" echo "Public library: https://github.com/mvanhorn/printing-press-library" echo "(If you have the Printing Press Library plugin, you can also run /ppl in Claude Code.)" echo "" command -v open >/dev/null 2>&1 && open https://github.com/mvanhorn/printing-press-library ``` After printing, end the skill cleanly. Do not proceed to briefing or research — the user is exploring, not building yet. They can re-invoke `/printing-press ` once they've decided. ### With Arguments: Briefing When the user provided an argument (API name, `--spec`, `--har`, or URL), print a brief process overview. This sets expectations and collects any upfront context. (Preflight has already run at this point.) Print as prose, matching the style of the example below: > Very well. Setting the type for ``. > > **Here is how this will proceed:** > 1. I shall research `` across the internet: official docs, community wrappers, competing CLIs, MCP servers, and npm/PyPI packages > 2. I shall catalog every feature that exists in any tool, then devise novel features of my own that no existing tool offers > 3. I shall present what I found and what I invented — you will have a chance to add your own ideas or adjust the plan before I build > 4. I shall generate a Go CLI, build every feature from the plan, then verify quality through dogfood, runtime verification, and scoring > > **What you will have at the end:** A fully functional CLI at `$PRESS_LIBRARY/` that you can use yourself, ship on your own, or apply to add to the printing-press library. > > **Time:** 30-60 minutes depending on API complexity. > > **Things that help if you have them:** > - An API key (for live smoke testing at the end) > - A logged-in browser session (for discovering authenticated endpoints) > - A spec file or HAR capture (skips discovery) If the user provided `--spec`, adapt: "You have provided a spec, so I shall skip discovery and proceed directly to analysis and generation. Should be faster." If the user provided `--har`, adapt: "You have provided a HAR capture, so I shall generate a spec from your traffic and skip browser browser-sniffing." Then ask via `AskUserQuestion`: - **question:** `"Anything you want me to know before I begin? A vision for what this CLI should do, specific features you care about, or auth context I should have?"` - **header:** `"Briefing"` - **multiSelect:** `false` - **options:** 1. **label:** `"Let's go (recommended)"` — **description:** `"Start research now. I'll ask about API keys, browser auth, or other context when I need them."` 2. **label:** `"I have context to share"` — **description:** `"Tell me your vision, specific features, or auth context (API key, logged-in browser session) before research starts."` **Do not add additional options** — auth is already handled by Phase 0.5 (API Key Gate) and Phase 1.6 (Pre-Browser-Sniff Auth Intelligence) downstream. A user who wants to volunteer auth context can do so via option 2's free-text response. The two options above plus the automatic "Other" field is the entire interface. If the user picks **"Let's go (recommended)"**, proceed to the Multi-Source Priority Gate (or, for single-source runs, directly to Phase 0). If the user picks **"I have context to share"**, capture their free-text response as `USER_BRIEFING_CONTEXT`. The response may include: - **Vision / specific features** — captured as-is. This context will be: - Added to the Phase 1 Research Brief under a `## User Vision` section - Used as a 4th self-brainstorm question in Phase 1.5c.5: "Based on the user's stated vision, what features directly serve their stated goals that the absorbed features don't cover?" - Referenced at the Phase Gate 1.5 absorb gate: "You mentioned [summary] at the start. Want to add more, or does the manifest already cover it?" - **Auth context** — if the user mentions an API key, env var, or logged-in browser session, set the corresponding `AUTH_CONTEXT` fields so the API Key Gate (Phase 0.5) and Pre-Browser-Sniff Auth Intelligence (Phase 1.6) do not re-ask. ### Multi-Source Priority Gate After the briefing question resolves, inspect the user's original argument AND any `USER_BRIEFING_CONTEXT` they provided. If together they name **two or more distinct services, sites, or APIs** (e.g., "Google Flights and Kayak", "Notion + Linear combo CLI", "flightgoat: Google Flights, Kayak.com/direct, and FlightAware"), this is a combo CLI and priority ordering MUST be confirmed before Phase 1 research. **Why this gate exists:** Phase 1 research defaults to the first resolvable spec as the primary source. When the user listed services in a specific order, that order is their intent — but the generator's spec-first bias will silently invert it (picking a well-documented paid API over a free reverse-engineered one the user actually wanted as the headline feature). This has caused real user-visible failures where the CLI shipped with the wrong primary and required a paid API key for what the user intended as the free primary command. **Parse the order from the prose.** Use the user's wording verbatim. Commas, "then", "and", explicit "primary/secondary", or numbered lists all signal ordering. If the user wrote "Google Flights, Kayak, FlightAware" — that is the order. Do not reorder by spec availability, tier, or ease of generation. **Confirm via `AskUserQuestion`:** > "You mentioned ****, ****, and ****. I'll treat **** as the primary — it gets the headline commands, the top of the README, and the first-run experience. Is that the right order?" Options: 1. **Yes, that order is correct** — Proceed with `SOURCE_PRIORITY=[A, B, C]` captured to run state. 2. **Different order** — User provides the correct ordering; capture it. 3. **They're peers, no primary** — Rare; capture as equal weighting but warn the user that one will still lead the README. Write the confirmed ordering to `$API_RUN_DIR/source-priority.json`: ```json { "sources": ["google-flights", "kayak-direct", "flightaware"], "confirmed_at": "", "raw_user_phrasing": "" } ``` **Phase 1 MUST consult this file.** When selecting a spec source, the primary source wins even if it has no spec and a later source has a clean OpenAPI. When the primary has no official spec, flag that openly in the brief under `## Source Priority` (see template below) and route to the browser-sniff/docs path for the primary — do not promote a secondary source just because its spec is cleaner. **Economics check.** If the confirmed primary source is free (no API key required) AND the generator's default path would make the primary CLI commands require a paid key (because the auth applies broadly or because a paid secondary source is bleeding into the primary path), surface the tradeoff explicitly before generating: > "The primary source (****) is free, but the default path would require a **** for the headline commands because . Options: (1) keep primary free, gate only the secondary commands on the paid key; (2) require the paid key for everything; (3) drop the paid source." Default to option 1 unless the user overrides. Record the decision in `source-priority.json` under `auth_scoping`. **Single-source runs:** If only one service is named, skip this gate entirely — no ordering to confirm. --- ## Run Initialization After you know `` (from the Orientation & Briefing flow above; preflight already ran at the top), initialize the run-scoped artifact paths: ```bash RUN_ID="$(date +%Y%m%d-%H%M%S)" API_RUN_DIR="$PRESS_RUNSTATE/runs/$RUN_ID" RESEARCH_DIR="$API_RUN_DIR/research" PROOFS_DIR="$API_RUN_DIR/proofs" PIPELINE_DIR="$API_RUN_DIR/pipeline" DISCOVERY_DIR="$API_RUN_DIR/discovery" CLI_WORK_DIR="$API_RUN_DIR/working/-pp-cli" STAMP="$(date +%Y-%m-%d-%H%M%S)" # Session state (live cookies, CSRF tokens captured during authenticated # browser-sniff) lives OUTSIDE $API_RUN_DIR so the Phase 5.5 archive # `cp -r "$DISCOVERY_DIR"` cannot pick it up. Containment by location, not by # manual rm-before-archive. # # Base prefix is user-scoped (`printing-press-$(id -u)`) so that on a Linux # host with a shared /tmp, the umask-077 subshell below does not lock the # top-level `printing-press` directory to a single user. macOS already gives # us a per-user $TMPDIR; the $(id -u) suffix keeps semantics identical there. SESSION_BASE="${TMPDIR:-/tmp}/printing-press-$(id -u)" SESSION_DIR="$SESSION_BASE/session/$RUN_ID" SESSION_STATE_FILE="$SESSION_DIR/session-state.json" mkdir -p "$RESEARCH_DIR" "$PROOFS_DIR" "$PIPELINE_DIR" "$CLI_WORK_DIR" # Create $SESSION_DIR inside a subshell with a tight umask so it lands at 0700 # at creation, not after a follow-up chmod. The two-step `mkdir; chmod` form # leaves a TOCTOU window where a concurrent process could open the directory # (and any session-state.json written into it) while perms are still # umask-derived (typically 0755 on Linux). The umask propagates to every # directory `mkdir -p` creates; the user-scoped $SESSION_BASE above is what # keeps that from blocking other users on the same host. (umask 077 && mkdir -p "$SESSION_DIR") STATE_FILE="$API_RUN_DIR/state.json" ``` Maintain a lightweight state file at `$STATE_FILE` so `/printing-press-score` can rediscover the current run. It should always contain: ```json { "api_name": "", "run_id": "$RUN_ID", "working_dir": "$CLI_WORK_DIR", "output_dir": "$CLI_WORK_DIR", "spec_path": "" } ``` `run_id` is the same `YYYYMMDD-HHMMSS` value computed earlier as `RUN_ID="$(date +%Y%m%d-%H%M%S)"`. The generator's manifest writer derives the same value from the `--research-dir` basename when generate is invoked through the canonical `$API_RUN_DIR` (whose basename equals `$RUN_ID`); persisting it in `state.json` here keeps `/printing-press-score` and any future state-loading consumer in sync. Without `run_id` in either path, `cli-printing-press dogfood --live --write-acceptance` refuses to write the gate marker. Do not create a `go.work` file in `$CLI_WORK_DIR`. Generated modules must build and test as standalone modules; a mismatched workspace `go` directive can break Go 1.25+ toolchains and lefthook checks. Editor/gopls workspace noise is cosmetic and must not be traded for broken `go build` or `go test`. There are exactly three durable writable locations. Every generated artifact this skill preserves goes to one of them: - **`$PRESS_RUNSTATE/`** — mutable working state for the current run (research, proofs, pipeline artifacts, plans, intermediate docs) - **`$PRESS_LIBRARY/`** — published CLIs (`/` subdirectories) - **`$PRESS_MANUSCRIPTS/`** — archived run evidence (research, proofs, discovery) Short-lived command captures may use `/tmp/printing-press/` with unique `mktemp` paths and must be deleted after use. Examples of the current naming/layout: - `$PRESS_LIBRARY/notion/` — published CLI directory (keyed by API slug) - `notion-pp-cli` — the binary name inside the directory - `/printing-press emboss notion` — emboss accepts both slug and CLI name - `discord-pp-cli/internal/store/store.go` — internal source paths still use CLI name - `linear-pp-cli stale --days 30 --team ENG` — binary invocations use CLI name - `github.com/mvanhorn/discord-pp-cli` — Go module paths use CLI name ## Outputs Every run writes up to 5 concise artifacts under the current managed run and archives them to `$PRESS_MANUSCRIPTS///`: 1. `research/-feat--pp-cli-brief.md` 2. `research/-feat--pp-cli-absorb-manifest.md` 3. `proofs/-fix--pp-cli-build-log.md` 4. `proofs/-fix--pp-cli-shipcheck.md` 5. `proofs/-fix--pp-cli-live-smoke.md` (only if live testing runs) These do not need to be 200+ lines. Keep them dense, evidence-backed, and directly useful. ## Phase 0: Resolve And Reuse Before new research: 1. Resolve the spec source. **URL Detection** — If the argument contains `://`, it's a URL. Determine whether it's a spec or a website before proceeding. **Step 1: Content probe.** Fetch the URL with the raw docs helper from [references/fetch-docs.md](references/fetch-docs.md) and inspect the response status, `Content-Type`, and first few lines of the returned file: - Check the `Content-Type` header and the first few lines of the body. - If the fetch fails (timeout, 404, DNS error), record the exact status/error, then skip to Step 2 — treat it as a website. If the content starts with `openapi:`, `swagger:`, or is valid JSON containing an `"openapi"` or `"swagger"` key → it's a spec. Treat as `--spec` and proceed directly. No disambiguation needed. If the content is a HAR file (JSON with `"log"` and `"entries"` keys) → treat as `--har` and proceed directly. **Step 2: Disambiguation.** If the content is HTML or the probe failed, ask the user what they want. Extract the site name from the hostname (e.g., `postman.com` → "Postman", `app.linear.app` → "Linear"). Derive `` from the site name using the same `cleanSpecName` normalization the generator uses. Use `AskUserQuestion` with: - **question:** `"What kind of CLI do you want for ?"` - **header:** `"CLI target"` - **multiSelect:** `false` - **options:** 1. **label:** `"'s official API"` — **description:** `"Build a CLI for 's documented API (e.g. REST endpoints, webhooks, OAuth)"` 2. **label:** `"The website itself"` — **description:** `"Build from the website itself — I may open or attach to Chrome during generation to capture site traffic, then generate a lightweight CLI from replayable HTTP/HTML surfaces"` The user can also pick the automatic "Other" option to describe what they're after in free text. **Routing after disambiguation:** - "'s official API" → use `` as the argument, proceed with normal discovery (Phase 1 research, then Phase 1.7 browser-sniff gate evaluates independently as usual) - "The website itself" → use `` as the argument, set `BROWSER_SNIFF_TARGET_URL=`. Proceed to Phase 1 research. When Phase 1.7 is reached, skip the browser-sniff gate decision and go directly to "If user approves browser-sniff" (the user already approved temporary browser discovery in Phase 0 — do not re-ask). Use `BROWSER_SNIFF_TARGET_URL` as the starting URL for browser capture. The printed CLI must still use a replayable runtime surface; do not ship a resident browser transport. - "Other" → read the user's free-form response and adapt **End of URL detection.** The remaining spec resolution rules apply when the argument is NOT a URL: - If the user passed `--har `, this is a HAR-first run. Run `cli-printing-press browser-sniff --har --name --output "$RESEARCH_DIR/-browser-sniff-spec.yaml" --analysis-output "$DISCOVERY_DIR/traffic-analysis.json"` to generate a spec and traffic analysis from captured traffic. If `$API_RUN_DIR/source-priority.json` exists with two or more sources, add `--preserve-hosts` so combo-CLI captures retain peer API hosts with per-endpoint `base_url` overrides instead of collapsing them into secondary evidence. Use the generated spec as the primary spec source for the rest of the pipeline. Skip the browser-sniff gate in Phase 1.7 (browser-sniff already ran). - If the user passed `--spec`, use it directly (existing behavior). - Otherwise, proceed with normal discovery (catalog, KnownSpecs, apis-guru, web search). #### Directory spec-source guard If any resolved spec source is a local directory, do not pass the directory itself to `cli-printing-press generate` and do not silently pick the first file. Enumerate candidate specs first: ```bash find "$SPEC_SOURCE_DIR" -type f \( -iname '*.json' -o -iname '*.yaml' -o -iname '*.yml' \) | sort ``` Keep only files whose head looks like an OpenAPI or Swagger root document (`openapi:`, `swagger:`, or JSON with a top-level `"openapi"` or `"swagger"` key). Ignore unrelated JSON/YAML config files. When the filtered candidate list is empty, abort with: `No OpenAPI/Swagger spec found under . Pass --spec directly.` Do not continue with the raw directory as the spec source. When the directory contains exactly one candidate, use that file as the spec source and write it to `state.json` as `spec_path`. When the directory contains more than one candidate: - Print a prominent warning before generation: `N OpenAPI/Swagger specs found under ; no single file represents the whole API surface.` - List every candidate when `N <= 20`; otherwise list the first 20 sorted paths and print `...and N-20 more`. - Record the directory and candidates in `$STATE_FILE` before continuing: `spec_path` is the directory and `spec_candidates` is the sorted list. - Ask the user to choose one spec, several specs, or all specs. If this runtime cannot ask a blocking question, stop after printing the warning and tell the user to re-run with explicit `--spec ` arguments. This is the minimum safe floor: never let a directory run finish while hiding that additional specs were ignored. - After the user confirms the selection, update `$STATE_FILE` with `selected_spec_paths` set to the list that will be generated. - For multiple selected specs, default to one independent printed CLI per spec using a derived `-` name and a distinct working directory under `$API_RUN_DIR/working/`. Do not merge all selected specs into one CLI unless the user explicitly asks for a combined surface and provides the umbrella name for `--name`. 2. Check for prior research in: - `$PRESS_MANUSCRIPTS//*/research/*` 3. Reuse good prior work instead of redoing it. 4. **Library Check** — Check if a CLI for this API already exists in the library or is actively being built, and present the user with context and options. First, check lock status to detect active builds: ```bash LOCK_STATUS=$(cli-printing-press lock status --cli -pp-cli --json 2>/dev/null) LOCK_HELD=$(echo "$LOCK_STATUS" | grep -o '"held"[[:space:]]*:[[:space:]]*[a-z]*' | head -1 | sed 's/.*: *//') LOCK_STALE=$(echo "$LOCK_STATUS" | grep -o '"stale"[[:space:]]*:[[:space:]]*[a-z]*' | head -1 | sed 's/.*: *//') LOCK_PHASE=$(echo "$LOCK_STATUS" | grep -o '"phase"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | sed 's/.*"phase"[[:space:]]*:[[:space:]]*"//;s/"//') LOCK_AGE=$(echo "$LOCK_STATUS" | grep -o '"age_seconds"[[:space:]]*:[[:space:]]*[0-9]*' | head -1 | sed 's/.*: *//') ``` Then check the library directory: ```bash CLI_DIR="$PRESS_LIBRARY/" HAS_LIBRARY=false HAS_GOMOD=false PRIOR_STEINBERGER_SCORE="" PRIOR_SUB60_REPRINT=false if [ -d "$CLI_DIR" ]; then HAS_LIBRARY=true if [ -f "$CLI_DIR/go.mod" ]; then HAS_GOMOD=true fi # Read manifest if available MANIFEST="$CLI_DIR/.printing-press.json" if [ -f "$MANIFEST" ]; then PRESS_VERSION=$(cat "$MANIFEST" | grep -o '"printing_press_version"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | sed 's/.*"printing_press_version"[[:space:]]*:[[:space:]]*"//;s/"//') GENERATED_AT=$(cat "$MANIFEST" | grep -o '"generated_at"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | sed 's/.*"generated_at"[[:space:]]*:[[:space:]]*"//;s/"//') PRIOR_STEINBERGER_SCORE=$(jq -r '.scorecard.steinberger.percentage // empty' "$MANIFEST" 2>/dev/null || true) if [ -n "$PRIOR_STEINBERGER_SCORE" ] && awk "BEGIN { exit !($PRIOR_STEINBERGER_SCORE < 60) }"; then PRIOR_SUB60_REPRINT=true fi fi # Get directory modification time as fallback CLI_MTIME=$(stat -f "%Sm" -t "%Y-%m-%d" "$CLI_DIR" 2>/dev/null || stat -c "%y" "$CLI_DIR" 2>/dev/null | cut -d' ' -f1) fi ``` **Decision matrix:** | Library dir? | Lock? | Stale? | Has go.mod? | Action | |-------------|-------|--------|-------------|--------| | No | No | N/A | N/A | Proceed normally | | No | Yes | No | N/A | Warn: "Actively being built (phase: ``, `` seconds ago). Wait, use a different name, or pick a different API." | | No | Yes | Yes | N/A | Offer reclaim: "Interrupted build detected (stale since ``s ago). Reclaim and start fresh?" | | Yes | No | N/A | Yes | Existing "Found existing" flow (see below) | | Yes | No | N/A | No | Debris: "Found `` directory in library but it appears incomplete (no go.mod). Clean up and start fresh?" If user approves, `rm -rf "$CLI_DIR"` and proceed normally. | | Yes | Yes | No | Any | Warn: "Actively being rebuilt (phase: ``, `` seconds ago). Wait, use a different name, or pick a different API." | | Yes | Yes | Yes | Any | Offer reclaim: "Interrupted rebuild detected (stale since ``s ago). Reclaim and start fresh?" | **If actively locked (not stale):** Present via `AskUserQuestion` with options to wait, pick a different API, or force-reclaim (`cli-printing-press lock acquire --cli -pp-cli --scope "$PRESS_SCOPE" --force`). **If stale lock:** Reclaiming is automatic on `lock acquire` in Phase 2. If user approves, proceed normally — the lock acquire in Phase 2 will auto-reclaim the stale lock. **If library exists with go.mod and no lock (completed CLI):** Display context and present options using `AskUserQuestion`: > Found existing `` in library (last modified ``). If `PRESS_VERSION` is available, append: `Built with printing-press v.` If `PRIOR_SUB60_REPRINT=true`, append: `Prior Steinberger score: %. Reprint will require all approved transcendence rows to ship unless you explicitly accept partial coverage.` If prior research was also found (step 2), include the research summary alongside the library info. Then ask: 1. **"Generate a fresh CLI"** — Re-runs the Printing Press into a working directory, overwrites generated code, then rebuilds transcendence features. Prior research is reused if recent. 2. **"Improve existing CLI"** — Keeps all current code, audits for quality gaps, implements top improvements. The Printing Press is not re-run. 3. **"Review prior research first"** — Show the full research brief and absorb manifest before deciding. If the user picks option 1, proceed to Phase 1 (research) and then Phase 2 (generate) as normal. If the user picks option 2, invoke `/printing-press-polish ` to improve the existing CLI. If the user picks option 3, display the prior research, then re-present options 1 and 2. **MANDATORY when re-using prior research after a binary upgrade.** If the user picks "Generate a fresh CLI" (option 1) AND `PRESS_VERSION` from the manifest differs from the current binary's version (parse both via semver and compare; only fire when the leading minor or major segment changed — patch-level deltas don't trigger this), prompt the user once before kicking off Phase 1 research. Construct the prompt's "what changed" list from these category buckets — the categories are stable across versions; the specific machine deltas inside each category are not. Read `docs/CHANGELOG.md` (or run `git log --oneline v..v -- internal/`) and tag each notable change to one of these buckets: | Category | Affects prior-brief assumption about... | |---|---| | **Transport / reachability** | Which sources are reachable, what auth/clearance is needed, which clients (stdlib, Surf, browser-clearance) the brief assumed | | **Scoring rubrics** | What Phase 1.5/scorecard dimensions the brief targets, whether prior "high-priority" features still rank as such | | **Auth modes** | Whether brief's auth choice (api-key, cookie, composed, oauth) is still the right pick, whether new modes unlock new endpoints | | **MCP surface** | Whether brief's MCP shape (endpoint-mirror vs intent vs code-orchestration) matches the latest emit defaults | | **Discovery** | Whether browser-sniff / crowd-sniff workflows changed, whether prior gate decisions are still valid | For the prompt itself, list only the buckets that have at least one notable change between the two versions. If the CHANGELOG / git log is unavailable, list all five buckets generically and let the user decide. > "The prior `` was generated with printing-press v``. The current binary is v``. Categories where the machine has changed since then: ``. Each can invalidate prior research assumptions. Re-validate the prior brief against the current machine before reusing it?" Options: 1. **Yes, re-validate the prior research** — fold the validation into Phase 1 (briefly re-probe reachability for previously-blocked sources, confirm scoring still classifies the prior CLI's pattern correctly, etc.) before reusing the brief. 2. **No, reuse the prior research as-is** — proceed with the brief verbatim, even if the underlying machine assumptions are stale. The prompt forces the user to acknowledge the version delta and explicitly accept (or refuse) re-validation. Skip it entirely on first generation, on same-version regenerations, or when no prior manifest exists. If no CLI exists in the local library and no lock is active, run the **Public-library check** below before proceeding to Phase 1. #### Public-library check (registry.json) The local library check above only sees CLIs this machine has already printed. A user on a fresh checkout — or one who typed a slightly different name than the published slug (`Slack` vs `slack-bot`, `Cal` vs `cal-com`), or who described what they wanted in their own words (`Hacker News reader`, `Notion clone`, `prediction market`) — will miss CLIs that already exist in the public library. Scan `mvanhorn/printing-press-library/registry.json` to catch those cases before Phase 1 research begins (the expensive 30-60-minute portion of the pipeline). **Skip this check entirely when:** - The local-library check above already prompted (mutual exclusion — do not double-ask). - `BROWSER_SNIFF_TARGET_URL` is set (the user is building a from-website CLI; the registry indexes API CLIs and naming collisions are unlikely and intentional). - The user passed `--har ` with an explicit `--name ` for a private capture. **Fetch the registry.** Match the pattern `/printing-press-import` and `/printing-press-reprint` already use: ```bash REGISTRY=$(mktemp) if ! gh api -H "Accept: application/vnd.github.v3.raw" \ repos/mvanhorn/printing-press-library/contents/registry.json \ > "$REGISTRY" 2>/dev/null; then echo "Public-library check skipped: registry.json unreachable. Proceeding to Phase 1." rm -f "$REGISTRY" REGISTRY="" fi ``` Do not block on a network failure. After step 4 finishes, clean up the tempfile only if the fetch succeeded: `[ -n "$REGISTRY" ] && rm -f "$REGISTRY"`. The failure branch above already removed it and set `REGISTRY=""`, so an unconditional `rm -f "$REGISTRY"` would run `rm -f ""`. **Read the registry and reason about matches** — do not gate on string equality alone. The file is small (~88 KB, ~135 entries today); read it directly and use judgment. Each entry has fields `name` (slug), `category`, `api` (brand display), `description`, `path`, `printer`. The user's argument may arrive in many shapes, and only some are catchable by deterministic match: - **Slug or near-slug** — `Notion`, `notion-cli`, `notion-pp-cli` - **Brand with punctuation** — `Cal.com`, `Customer.io`, `Archive.today`, `Trigger.dev` - **Concept or category** — `Hacker News reader`, `Notion clone`, `prediction market`, `prediction-market CLI` - **Adjacent product** — `Polymarket` when the registry has `kalshi` (peer prediction market) - **Genuinely novel** — no useful overlap Classify the best match at three confidence levels and act only on the top two: - **High** — same product under a different name (slug variant, brand vs slug form, `-cli`/`-pp-cli` suffix variant, well-known alias). Examples: `Cal.com` ↔ `cal-com`, `Notion` ↔ `notion-cli`, `slack` ↔ `slack-bot`. - **Medium** — same category and overlapping function; a reasonable user would want to know before building. Examples: `prediction market` finds `kalshi`, `Hacker News reader` finds `hackernews`, `Polymarket` surfaces `kalshi` as a peer. - **Low** — vaguely adjacent (e.g. "payment gateway" finding every payment-related CLI). Skip silently — false-positive prompts get dismissed reflexively at this gate. Resist over-matching on `description` keywords. Most descriptions mention several adjacent concepts; matching liberally on description text produces noise. Use the description to *confirm* a name-or-category candidate, not to *discover* candidates from scratch. **Combo CLIs.** When `SOURCE_PRIORITY` is set (from the Multi-Source Priority Gate above), skip the single-source High/Medium/No-match branches below. Classify matches per source, then present a single combined prompt rather than asking N times. For combo runs the existing single-source CLIs are usually *informational* — the user came here to build a combo, so the recommended default is to continue with the combo rather than reprint a component standalone. **Cap displayed reprint options at 2 across all sources combined** so the prompt fits the 4-option `AskUserQuestion` limit (2 reprints + continue + abort). Pick the 2 best candidates by judgment in this order: (1) High over Medium, (2) primary-source over secondary-source (the first entry in `SOURCE_PRIORITY` wins ties), (3) canonical slug over variant. If additional matches exist beyond the displayed 2, append "(plus N other source matches)" to the prompt body so the user knows the list is truncated. Omit sources with no match rather than listing them as empty rows. If no source has any match at High or Medium, print nothing and proceed to Phase 1. > Found matches across the sources you listed: > > - **``**: `` (``) [High] — same product as `` > - **``**: `` (``) [Medium] — similar/adjacent > > This is informational — these components already exist as single-source CLIs. Continue building the combo, switch to reprinting one standalone, or abort? Options: 1. **Continue with the combo as planned (recommended)** — the combo itself is the value-add; proceed to Phase 1 with all sources. 2. **Reprint `` standalone instead** — invoke `/printing-press-reprint ` (abandons the combo for now). 3. **Reprint `` standalone instead** — same, for the second candidate. 4. **Abort** — stop here. The `[High]` / `[Medium]` tags surface the confidence so the user can distinguish "this is literally the thing you named" from "this is adjacent." Tag in the bullet, not the option label, to keep options scannable. **Single-source CLIs.** When `SOURCE_PRIORITY` is not set, use the branches below. **High match — prompt strongly.** Under Claude Code, use `AskUserQuestion`; under another harness, use the equivalent native prompt primitive. The option set is the same either way. > Found **``** in the public library (printed by **@``**, path ``). > > `` > > URL: `https://github.com/mvanhorn/printing-press-library/tree/main/` > > This CLI already exists. What would you like to do? Options: 1. **Reprint with the current Printing Press (recommended)** — end this run and invoke `/printing-press-reprint `. That skill pulls the existing CLI, carries prior research and post-publish patches into reconciliation, and regenerates under the current binary. Almost always the right choice when a user discovers the CLI exists. 2. **Continue and build a fresh one anyway** — proceed with the current run from scratch. Rare; appropriate only for a deliberate fork or variant. 3. **Abort** — stop here. **Multiple High matches — present each candidate, do not use the Medium-match phrasing.** Rare — typically only happens when the user's argument is ambiguous between siblings like `slack` and `slack-bot`. Cap displayed candidates at 2 to stay within the 4-option prompt limit alongside continue/abort. If 3+ High candidates somehow qualify, pick the 2 best by judgment (typically the canonical slug match plus the next-most-likely alternative) and note "(plus N other close matches)" in the prompt body so the user knows the list is truncated. > Found multiple matches for **``** in the public library — each appears to be the same product under a different name: > > - **``** (``) — `` > - **``** (``) — `` > > Pick one to reprint, or continue/abort. Options: 1. **Reprint ``** — invoke `/printing-press-reprint `. 2. **Reprint ``** — same, for the second candidate. 3. **Continue and build a fresh one anyway** — rare; appropriate only for a deliberate fork or variant. 4. **Abort** — stop here. **Medium match — present alternatives.** Cap candidates at 2 to stay within the 4-option prompt limit alongside continue/abort. > Found similar entries in the public library that don't exactly match `` but may overlap: > > - **``** (``) — `` > - **``** (``) — `` > > Continue with `` as planned, or reprint one of these instead? Options: 1. **Continue with `` as planned** — proceed to Phase 1. 2. **Reprint `` instead** — invoke `/printing-press-reprint `. 3. **Reprint `` instead** — same, for the second candidate. 4. **Abort** — stop here. **No High or Medium match:** print nothing, proceed to Phase 1. 5. **API Key Gate** — Check whether this API requires authentication, then handle accordingly. **First, determine if the API needs auth.** Use these signals: - The spec has no `security` or `securityDefinitions` section → likely no auth needed - The API's endpoints are accessible without authentication (e.g., ESPN's undocumented endpoints, weather APIs, public data feeds) — note: "no auth required" does NOT mean the service has an official public API - No env var matching the API name exists AND no known token pattern applies - Community docs or npm/PyPI wrappers describe the API as "no auth required" **If no auth is required**, skip the key gate entirely. Proceed with: "No authentication required for `` — skipping API key gate." Do NOT call it "a public API" unless the service officially publishes one. Many services (ESPN, etc.) have unauthenticated endpoints without having an official API. Live smoke testing in Phase 5 will work without a key. **If the API DOES require auth**, run the key gate: Token detection order: - GitHub: `GITHUB_TOKEN`, `GH_TOKEN`, or `gh auth token` - Discord: `DISCORD_TOKEN`, `DISCORD_BOT_TOKEN` - Linear: `LINEAR_API_KEY` - Notion: `NOTION_TOKEN` - Stripe: `STRIPE_SECRET_KEY` - Generic: `API_KEY`, `API_TOKEN` **If a token IS found**, stop and explain: > Found `` in your environment. This key will be used **only** for read-only live smoke testing in Phase 5 — listing, fetching, and health checks. It will never be used for write operations (create, update, delete). OK to use it? - If the user approves → proceed with the key available for Phase 5. - If the user declines → proceed without the key and display: "Live smoke testing (Phase 5) will be skipped. The CLI will still be generated and verified against mock responses." **If no token is found**, stop and ask: > No API key detected for ``. You can provide one now for read-only live smoke testing in Phase 5, or continue without it. > > Set it with `export =` or paste the key here. - If the user provides a key → proceed with the key available for Phase 5. - If the user declines → proceed without the key and display: "Live smoke testing (Phase 5) will be skipped. The CLI will still be generated and verified against mock responses." Resolve the API key gate (or skip it for public APIs) before moving to Phase 1. ## Phase 1: Research Brief **When `BROWSER_SNIFF_TARGET_URL` is set:** Skip the catalog check, spec/docs search, and SDK wrapper search — none of these exist for an undocumented website feature. Focus research on understanding what the site/feature does, who uses it, what workflows it supports, and what competitors offer similar functionality. The spec will come from browser-sniffing in Phase 1.7. Before reading documentation, read [references/fetch-docs.md](references/fetch-docs.md). Use `fetch-docs.sh` for the API's primary docs, OpenAPI/Postman links, auth guides, error handling, rate limits, pagination, webhooks, and any per-endpoint reference page. Preserve exact status codes and inspect the returned local file directly so enum values, field constraints, casing, examples, and nav/link variants are not lost through summarization. Before starting research, check if the API has a built-in catalog entry: ```bash cli-printing-press catalog show --json 2>/dev/null ``` If the catalog has an entry for this API, branch on the entry type: **Spec-based entry** (`spec_url` populated) — present the user with a choice: > " is in the built-in catalog (spec: ). Use the catalog config to skip discovery, or run full discovery?" - If catalog config: use the spec_url from the catalog entry, skip the research/discovery phase - If full discovery: proceed with the normal research workflow **Wrapper-only entry** (no `spec_url`, `wrapper_libraries` populated) — this is a reverse-engineered API that has no official spec but has known community libraries. The catalog entry is a **discovery aid only**: `cli-printing-press generate` requires `--spec` and does not consume wrapper-library metadata, so there is no direct generation path from a wrapper-only entry today. Tell the user this up front via `AskUserQuestion`: > " has no official spec. The catalog knows about these community-maintained wrappers, but the Printing Press cannot generate a CLI directly from a wrapper. The next step has to be browser-sniffing the upstream to author an internal YAML spec, browser-sniffing or HAR-capturing the dominant source first and then using the multi-source aggregator pattern for secondary hand-authored sources, or hand-writing a Go module that imports the wrapper. Which path do you want?" Present each `wrapper_libraries` entry alongside the question with language, integration mode, and notes so the user can see what implementation backing exists. Example for `google-flights`: - **krisukox/google-flights-api** (Go, native, MIT) — Pure Go, importable; single-binary CLI with no runtime deps. Record the user's choice (and the selected wrapper, when relevant) in `$API_RUN_DIR/state.json` under an `implementation` field so later phases can read it. For wrapper or hand-written-module paths, use `{ "kind": "wrapper", "library": "", "url": "", "integration_mode": "native|subprocess|html-scrape", "next_step": "browser-sniff|hand-written-module" }`. For the aggregator path, use `{ "kind": "aggregator-pattern", "dominant_source": "", "spec_source": "browser-sniff|har|provided-spec", "spec_path": "", "secondary_sources": [""], "next_step": "aggregator-pattern" }`; do not populate `library` or `integration_mode` unless a specific secondary source is backed by a wrapper. This field is for skill bookkeeping; the generator does not currently read it. If the user picks browser-sniff, route into the Phase 1.7 browser-sniff path to produce a spec, then run `generate --spec` against it. If the user picks the aggregator path, first route the dominant source through Phase 1.7 browser-sniff or HAR capture to produce the primary spec, then read and apply [references/aggregator-pattern.md](references/aggregator-pattern.md): generate from that spec, then hand-author the secondary source clients and `sources` command tree. If the user picks a hand-written module, stop the press here and hand off — there is no generator path to drop them into. **No catalog hit** — proceed normally without mentioning the catalog. **Adding new wrapper-only APIs:** drop a YAML file in `catalog/` with `wrapper_libraries` populated and rebuild the binary. No skill changes needed. Write one build-driving brief, not a stack of phase essays. The brief must answer: 1. What is this API actually used for? 2. What are the top 3-5 power-user workflows? 3. What are the top table-stakes competitor features? 4. What data deserves a local store? 5. Why would someone install this CLI instead of the incumbent? 6. What is the product name and thesis? Research checklist: - Find the spec or docs source. For docs pages whose details affect generation, fetch the raw page with `fetch-docs.sh`, then read/grep the returned path directly. - Find the top 1-2 competitors - **Check GitHub issues on the top wrapper/SDK repo for "403", "blocked", "broken", "deprecated", "rate limit".** If multiple issues report the API is inaccessible or broken, flag this in the research brief as a reachability risk. This is critical for unofficial/reverse-engineered APIs. - Find official and popular SDK wrappers on npm (`site:npmjs.com`) and PyPI (`site:pypi.org`) - Find 2-3 concrete user pain points - Identify the highest-gravity entities - Pick the top 3-5 commands that matter most Do not produce separate mandatory documents for: - workflow ideation - parity audit - data-layer prediction - product thesis Put them in the one brief. Write: `$RESEARCH_DIR/-feat--pp-cli-brief.md` Suggested shape: ```markdown # CLI Brief ## API Identity - Domain: - Users: - Data profile: ## Reachability Risk - [None / Low / High] [evidence: e.g., "6 open issues on reteps/redfin about 403 errors since 2025"] - Tier/permission hints from 4xx body: [omit when absent; otherwise quote the matched bounded line(s) from Phase 1.9] - Probe-safe endpoint used: [omit when absent; otherwise " " from `x-pp-safe-probe`] ## Top Workflows 1. ... ## Table Stakes - ... ## Data Layer - Primary entities: - Sync cursor: - FTS/search: ## Codebase Intelligence - [DeepWiki findings if available, otherwise omit this section] - Source: DeepWiki analysis of {owner}/{repo} - Auth: [token type, header, env var pattern] - Data model: [primary entities and relationships] - Rate limiting: [limits and behavior] - Architecture: [key insight about internal design] ## User Vision - [USER_BRIEFING_CONTEXT if provided, otherwise omit this section] ## Source Priority - [Only present for combo CLIs. Copy the confirmed ordering from `source-priority.json`.] - Primary: — [spec state: official / community-wrapper / no-spec-browser-sniff-required] — [auth: free / paid] - Secondary: — [...] - Tertiary: — [...] - **Economics:** [e.g., "Primary is free; paid key for is scoped to its own commands only."] - **Inversion risk:** [e.g., "Primary has no OpenAPI; secondary has 53-endpoint spec. Do NOT let spec completeness invert the ordering."] ## Product Thesis - Name: - Why it should exist: ## Build Priorities 1. ... 2. ... 3. ... ``` **MANDATORY: Before proceeding to Phase 1.5 (Absorb Gate), you MUST evaluate Phase 1.6 (Pre-Browser-Sniff Auth Intelligence), Phase 1.7 (Browser-Sniff Gate), and Phase 1.8 (Crowd-Sniff Gate) below.** If no spec source has been resolved yet (no `--spec`, no `--har`, no catalog spec URL), the browser-sniff gate decision matrix MUST be evaluated. Do not skip to Phase 1.5. **Phase 1.5 will refuse to proceed without a `browser-browser-sniff-gate.json` marker file.** Phase 1.7 writes this file with one entry per source (one entry for single-source CLIs, one entry per named source for combo CLIs). Missing marker = HARD STOP back to Phase 1.7. See Phase 1.7 "Enforcement" below for the contract. ## Phase 1.6: Pre-Browser-Sniff Auth Intelligence After Phase 1 research completes, analyze findings to proactively assess what auth context the user could provide. This step uses research intelligence to ask the right question before browser-sniffing starts, rather than waiting for the user to volunteer "I logged in." **Skip this step if:** The briefing (Orientation & Briefing section) already captured auth context (`AUTH_CONTEXT` is set from the user selecting "I have an API key or I'm logged in"). **Classify the API's auth profile from research findings:** | Signal from research | Auth profile | What to ask | |---------------------|-------------|-------------| | Community wrappers use API keys (e.g., `STRIPE_SECRET_KEY`), MCP source shows `Authorization: Bearer` headers, spec has `security` section | **API key auth** | "Do you have an API key for ``?" | | Site has user accounts, research found auth-only features (order history, saved items, rewards, account settings), login pages exist | **Browser session auth** | "This API has authenticated endpoints ([list specific features from research, e.g., order history, saved addresses, rewards]). Are you logged in to `` in your browser? The browser-sniff will find more endpoints if you are." | | Endpoints accessible without auth, no login-gated features found, community wrappers describe API as "no auth required" | **No auth needed** | Skip this step silently | | Both API key AND browser session features found | **Dual auth** | Ask about both: API key for smoke testing, browser session for browser-sniff | **Name the specific features the user would unlock.** Do not say "auth would help." Say "This API has order history, saved addresses, and rewards that require a logged-in session." **Where signals come from:** - Phase 1 brief's "Data profile" and "Top Workflows" sections - Phase 1.5a MCP source code analysis (auth patterns, token formats) - Community wrapper README "auth" or "authentication" sections - The API Key Gate's token detection (Phase 0.5) — if it already found a key, don't re-ask **For API key auth:** Present via `AskUserQuestion`: > "Do you have an API key for ``? It will be used for read-only live smoke testing in Phase 5." > > 1. **Yes** — user provides the key or confirms it's in the environment > 2. **No, continue without it** — skip live smoke testing If the user provides a key, set it in `AUTH_CONTEXT` so the API Key Gate (Phase 0.5) does not re-ask. **For browser session auth:** Present via `AskUserQuestion`: > "`` has authenticated endpoints ([list features]). Are you logged in to `` in your browser? If so, the generated CLI will support `auth login --chrome` — you'll be able to authenticate just by being logged into the site in Chrome. No API key needed." > > 1. **Yes, I'm logged in** — I'll use your session during browser-sniff and enable browser auth in the CLI > 2. **No, but I can log in** — I'll help you log in before browser-sniffing > 3. **No, skip authenticated endpoints** — browser-sniff only public endpoints Set `AUTH_SESSION_AVAILABLE=true` if the user selects option 1 or 2. The Browser-Sniff Gate (Phase 1.7) will use this flag. After traffic capture, Step 2d in [references/browser-sniff-capture.md](references/browser-sniff-capture.md) validates that cookie replay works before enabling browser auth in the generated CLI. **For dual auth:** Ask about both in sequence — API key first (simple env var check), then browser session. --- ## Phase 1.7: Browser-Sniff Gate After Phase 1 research, evaluate whether browser-sniffing the live site would improve the spec. This phase MUST produce a decision marker file for every source named in the briefing before Phase 1.5 can proceed. **Browser discovery is temporary discovery, not a printed-CLI runtime.** Use browser-use, agent-browser, the Claude chrome-MCP (`mcp__claude-in-chrome__*`, when the runtime exposes it), or a manual HAR (optionally augmented with computer-use screenshots for visual guidance, when `mcp__computer-use__*` is exposed) to learn the hidden web contract: URLs, methods, persisted GraphQL hashes, BFF envelopes, response shapes, cookies, CSRF/header construction, HTML/SSR/RSS/JSON-LD surfaces, and whether replay is viable. The final printed CLI must use replayable HTTP, Surf/browser-compatible HTTP, browser-clearance cookie import plus replay, or structured HTML/SSR/RSS extraction. If the only working path requires live page-context execution, HOLD or pivot scope — do not generate a resident browser sidecar transport. **Automatic offer, explicit consent.** The Printing Press decides when browser discovery should be offered, but opening Chrome, attaching to a browser session, installing browser-use/agent-browser, asking the user to solve a challenge, or driving the user's logged-in Chrome via the chrome-MCP requires explicit user approval through the Phase 0 website choice or the Phase 1.7 `AskUserQuestion` prompt. **Approval at Phase 1.7 covers the full fallback set** including chrome-MCP and computer-use when Step 2c.5's recovery menu later offers them — picking chrome-MCP at the recovery menu is a refinement of the Phase 1.7 consent, not a new consent surface. The disclosure language used at the Phase 1.7 prompt MUST enumerate these possibilities so the user understands what they are approving: > "Approving browser-sniff means the agent may run browser-use, agent-browser, ask you for a manual DevTools HAR export, or — if the default backends get blocked by an anti-bot gate and your runtime exposes them — drive your already-running Chrome via the chrome-MCP browser extension, or take read-only screenshots of your DevTools window via computer-use to guide you through the HAR export. Capture artifacts are written to `$DISCOVERY_DIR/` and credential headers are stripped at write time. The chrome-MCP option uses your real logged-in Chrome session in a fresh capture tab; the agent never navigates your existing tabs." If chrome-MCP picks up later in Step 2c.5's recovery menu, do NOT re-fire a per-invocation consent prompt — Phase 1.7's pre-approval covers it. The recovery menu lists chrome-MCP as one of the fallback options the user already pre-approved; the user's selection in the menu is a backend choice, not a new consent step. ### Enforcement: the browser-browser-sniff-gate.json marker file Phase 1.7 is a hard gate. Phase 1.5 reads a marker file and refuses to proceed without it. The model cannot skip this phase by reasoning around it. **Marker file location:** `$PRESS_RUNSTATE/runs/$RUN_ID/browser-browser-sniff-gate.json` **Marker file shape:** ```json { "run_id": "20260411-000903", "sources": [ { "source_name": "", "decision": "approved | declined | skip-silent | pre-approved", "reason": "", "asked_at": "2026-04-11T00:10:00Z" } ] } ``` **Decision values:** - `approved` — user selected a browser-sniff option via `AskUserQuestion`. Proceed to "If user approves browser-sniff". - `declined` — user explicitly declined browser-sniff via `AskUserQuestion`. Proceed to "If user declines browser-sniff". - `skip-silent` — gate was silently skipped per the decision matrix (spec complete, `--har` provided, `--spec` provided, or login required with `AUTH_SESSION_AVAILABLE=false`). The `reason` field names which. - `pre-approved` — user already chose "The website itself" in Phase 0, where the prompt disclosed temporary Chrome/browser capture during generation, so `BROWSER_SNIFF_TARGET_URL` was set and the question was answered there. **Every path through Phase 1.7 MUST write a marker entry** — approve, decline, and every silent-skip case. There is no code path that proceeds to Phase 1.5 without writing the marker. **`asked_at` is mandatory.** It must reflect the actual time `AskUserQuestion` was invoked (or the time the silent-skip decision was made). Fabricated timestamps are a plan violation. ### Banned skip reasons The following rationales are NOT valid reasons to skip the browser-sniff gate. If any of these apply, you MUST still ask the user via `AskUserQuestion` and record their answer in the marker file: - **"The target is client-rendered and needs Playwright"** — browser capture tools (browser-use, agent-browser) exist specifically to handle client-rendered sites. A hard-to-browser-sniff target is not the same as an impossible one. Ask. - **"Direct HTTP/curl got 403, 429, Cloudflare, Vercel, WAF, DataDome, or bot-detection HTML"** — direct HTTP reachability failure is exactly when browser capture is valuable. Do not pivot to RSS, docs-only, official API, or a smaller product shape before attempting the approved browser-sniff. Route to cleared-browser capture instead. - **"Direct HTTP/curl got HTTP `200` but only a content-less shell, interstitial, or deterministic-size truncation"** — a 200-served shell is a clearance or JavaScript challenge, not a clean response. Do not conclude `IP-blocked`, `rate-limited`, or `wait it out` from this shape. Before declaring the target unreachable, climb the ladder: probe-reachability body-check, curl-impersonate/TLS check, real-browser cookie-warm via the cleared-browser path or chrome-MCP when available, then ask the user. Use chrome-MCP to understand the wall even when it cannot export cookie values. - **"The 3-minute time budget looks tight"** — the time budget applies AFTER the user approves browser-sniff, not before. You do not pre-judge whether a browser-sniff will fit the budget. Ask. If the budget blows after the user approves, fall back per the Time Budget rules below. - **"We have a substitute data source from another API"** — substituting one source for another is the user's call, not yours. If the user named a specific site or feature (e.g., Kayak /direct), they chose it deliberately. Ask about that exact source. Offering a different data source is a separate conversation AFTER the gate, not a reason to skip it. - **"Installing browser-use or agent-browser is friction"** — the browser-sniff capture reference already documents the install path. Tooling friction is not a valid skip reason. Ask. - **"The documentation looks thorough enough"** — the decision matrix already handles this case explicitly. If research found that competitors or community projects reference more endpoints than the spec covers, that IS a gap and you MUST ask. - **"The user said 'let's go' earlier and implicitly approved everything"** — "let's go" at the briefing stage is consent to proceed with research, not standing approval for every future decision. Ask each gate individually. - **"The default browser-use / agent-browser path got hard-blocked by a WAF, so the only remaining option is to pivot scope or fall back to RSS/docs"** — this is exactly when the chrome-MCP and computer-use fallback options enter, when the runtime exposes them. Step 1 of `references/browser-sniff-capture.md` detects which fallback MCPs are available; Step 2c.5 composes the recovery menu including those fallbacks; the gate is "ask before giving up," not "auto-pivot when blocked." Do NOT skip the Step 2c.5 menu. Do NOT pivot scope or substitute an alternate target without first asking the user via that menu. These banned reasons all fired at once in a past combo-CLI run and caused a user-critical source to be silently swapped out. The marker file exists so this cannot happen again. If you find yourself writing a phrase like "skipping browser-sniff because X" where X is one of the above, stop and call `AskUserQuestion`. ### Combo CLIs: per-source enforcement When the briefing names multiple sources (e.g., "Google Flights + Kayak + FlightAware"), each named source is evaluated independently. The marker file has one entry per source. All entries must be present before Phase 1.5 can proceed. **Source identification rule:** source names come from the briefing, verbatim. Use the user's exact wording as the `source_name` (normalized to kebab-case is fine: "Kayak /direct" → `kayak-direct`, "Google Flights" → `google-flights`, "FlightAware" → `flightaware`). Do not merge sources. Do not drop one in favor of another. **Per-source decision flow:** For each named source, run the "When to offer browser-sniff" decision matrix independently, using the research findings for THAT source. Each source produces its own `AskUserQuestion` call or its own silent-skip marker entry. **Combo CLI example** (flightgoat pattern — directional guidance, not prescription): | Source | Spec state | Expected decision | |--------|------------|-------------------| | `flightaware` | Documented OpenAPI spec found (53 endpoints, appears complete) | `skip-silent` with reason `spec-complete` | | `google-flights` | No official spec, but community wrapper exists (`krisukox/google-flights-api`) | Ask via `AskUserQuestion` → record user's answer | | `kayak-direct` | No spec, no wrapper, user named this as a key feature | Ask via `AskUserQuestion` → record user's answer | The marker file for this run would contain three entries. Phase 1.5 would HALT if any were missing. **When the user cares about only one source:** you still ask for all sources that trigger the gate. The user can decline the others. Asking is cheap. Skipping silently breaks the contract. ### Skip this gate entirely when These are the only cases where Phase 1.7 is bypassed as a whole (not just skipped for one source). Even in these cases, a marker file with a single `skip-silent` entry is written to satisfy Phase 1.5's check: - User passed `--spec` and the spec is the canonical source for every named source → marker: `{ "source_name": "", "decision": "skip-silent", "reason": "user-provided-spec" }` - User passed `--har` → marker: `{ "source_name": "", "decision": "skip-silent", "reason": "user-provided-har" }` - `BROWSER_SNIFF_TARGET_URL` is set from Phase 0 (user chose "The website itself") → marker: `{ "source_name": "", "decision": "pre-approved", "reason": "phase-0-website-choice" }`, then go directly to "If user approves browser-sniff" ### Direct HTTP challenge rule If a reachability probe during Phase 1 research returns bot-protection evidence (`403`, `429`, `cf-mitigated: challenge`, `x-vercel-mitigated: challenge`, `x-vercel-challenge-token`, AWS WAF, DataDome, PerimeterX, CAPTCHA, "Just a moment", "access denied"), **run the no-browser reachability probe before announcing any browser escalation**: ```bash cli-printing-press probe-reachability "" --json ``` This is non-negotiable. **Do not present transport tiers as a peer menu for the user to choose between.** Phrases like "Browser-sniff + clearance cookie", "Browser-sniff with Surf-only", "Try without browser at all", or "Browser-sniff, prefer Surf" route the user through implementation choices (Surf vs cookie vs full browser) they don't have context to make. The classifier is `probe-reachability`; the agent runs it and decides. Intent-level menus are fine — "Browser-sniff or HOLD?", "Browser-sniff or pick a different API?", or the standard yes/no browser-sniff offers below all ask about goals, not transport, and remain available. Escalate consent in the order the agent actually needs it, not bundled up-front: 1. **Runtime probe (silent)** — `probe-reachability` runs without prompting. The user already opted into "the website itself" or equivalent in Phase 0; running an HTTP request needs no further consent. 2. **Browser-sniff offer (intent prompt)** — Phase 1.7's normal "Browser-Sniff as enrichment" / "Browser-Sniff as primary" prompts ask whether to do browser-sniff at all. These are intent-level. Show them when the discovery matrix says to. 3. **Chrome attach (separate consent if escalation happens)** — when the agent actually needs to open or attach to Chrome (because the discovery flow requires a real browser, or because `mode: browser_clearance_http` means the runtime needs cookie capture), surface that as its own moment so the user knows they may need to solve a challenge or sign in. The user-facing prompts at lines below already disclose Chrome attach as a possibility; that is the right place to confirm. Do not pre-announce Chrome attach when the probe has already settled the runtime as `browser_http` and the spec is complete enough to skip discovery — there is no Chrome attach to announce in that path. Two concerns are decided here, separately: - **Runtime** (does the printed CLI need browser-compatible HTTP, a clearance cookie, or live page-context execution?) — settled entirely by `probe-reachability`. - **Discovery** (does Phase 1.7 need to capture XHR traffic via a real browser to learn endpoints?) — settled by Phase 1.7's normal "When to offer browser-sniff" decision matrix above. Independent of runtime. The probe runs stdlib HTTP, then Surf with a Chrome TLS fingerprint, and emits one of `standard_http | browser_http | browser_clearance_http | unknown`. Apply `mode` to the **runtime** decision: - **`mode: standard_http`** — runtime is plain HTTP (the original probe was transient). Continue Phase 1.7 normally; the discovery decision is unchanged. - **`mode: browser_http`** — **runtime is settled: ship Surf transport** (`UsesBrowserHTTPTransport` will be set in the generator's traffic-analysis hints). The printed CLI will not include `auth login --chrome` for clearance cookies — Surf alone clears the challenge. Continue Phase 1.7's discovery decision normally; the existing "Browser-Sniff as enrichment" / "Browser-Sniff as primary" prompts (above) are framed around endpoint discovery and are correct as-written. Do **not** add clearance-cookie language to those prompts. - **`mode: browser_clearance_http`** — both probes hit protection signals. The runtime needs more than Surf (clearance cookie or live page-context execution; the probe cannot distinguish), so a real browser capture is required to find out. Proceed through Phase 1.7's normal browser-sniff offer (intent-level yes/no). The consent for Chrome attach happens at the moment the agent actually opens/attaches, where the user-facing prompts in `references/browser-sniff-capture.md` already disclose what's about to happen and may ask the user to solve a challenge. Note in the brief that runtime is provisionally `browser_clearance_http` pending capture results. - **`mode: unknown`** — probes failed at the transport layer (DNS/timeout/5xx). Fall through to the existing browser-sniff offer; the user decides whether to retry or pivot. When browser-sniff is approved or pre-approved AND the probe says `browser_clearance_http` or `unknown`: - Do **not** offer alternate CLI shapes (RSS-first, official API, docs-only, narrower scope, "try anyway") before a real browser capture has been attempted. - Do **not** write the brief as if browser-sniff is complete after only curl/direct HTTP probes. - If browser automation tooling is unavailable, offer the user a manual HAR path before offering any scope pivot. Only after the browser capture attempt fails by the criteria in `references/browser-sniff-capture.md` may you ask whether to pivot to RSS, official API, docs-only, or a smaller CLI scope. ### Time budget The browser-sniff gate should complete within 3 minutes of the user approving browser-sniff. If browser automation tooling fails to produce results after 3 minutes of attempts, fall back immediately: - If a spec already exists (enrichment mode): "Browser-Sniff failed after 3 minutes — proceeding with existing spec." - If no spec exists (primary mode): "Browser-Sniff failed after 3 minutes — falling back to --docs generation." - If browser-sniff was approved or pre-approved and direct HTTP showed challenge/bot-protection evidence, do **not** auto-fall back to docs/official API, even when `BROWSER_SNIFF_TARGET_URL` is unset. Ask whether the user wants to provide a HAR manually, retry cleared-browser capture, or discuss alternate CLI scope. Do NOT spend time debugging tool integration issues. Browser-sniff is a temporary discovery aid, not the product runtime. If the first approach fails, fall back to the next option — do not retry the same broken approach. **The time budget applies AFTER the user approves.** Do not use it as a reason to skip the gate before asking. ### When to offer browser-sniff | Spec found? | Research shows gaps? | Auth required? | Action | |-------------|---------------------|----------------|--------| | Yes | Yes — docs or competitors show significantly more endpoints than the spec | No | **MUST offer browser-sniff as enrichment** | | Yes | No — spec appears complete | Any | Skip silently (write marker with `decision: skip-silent`) | | No | Community docs exist (e.g., Public-ESPN-API) | No | **MUST offer browser-sniff OR --docs** — present both options so the user decides | | No | No docs found either | No | **MUST offer browser-sniff as primary discovery** | | No | N/A | Yes (login) + `AUTH_SESSION_AVAILABLE=true` | **Offer authenticated browser-sniff** — the user confirmed a session in Phase 1.6 | | No | N/A | Yes (login) + `AUTH_SESSION_AVAILABLE=false` | Skip — fall back to `--docs` (write marker with `decision: skip-silent`, `reason: login-required-no-session`) | **Gap detection heuristic:** If Phase 1 research found documentation, competitor tools, or community projects that reference significantly more endpoints or features than the resolved spec covers, that's a gap signal. Example: "The Zuplo OpenAPI spec has 42 endpoints, but the Public-ESPN-API docs describe 370+." **When the decision matrix says "Offer browser-sniff", you MUST ask the user via `AskUserQuestion`.** Skipping the question and writing a `skip-silent` marker is a contract violation — `skip-silent` is only valid when the matrix says "Skip silently" or one of the Banned Skip Reasons is the only thing holding you back (in which case, you should be asking anyway). Every browser-sniff approval prompt must make the consent boundary explicit: - browser discovery may open or attach to Chrome during generation, - it may ask the user to log in or solve a challenge, - it may request permission to install or upgrade browser-use/agent-browser if missing, - the printed CLI will only ship if discovery finds a replayable surface and will not keep a browser running as normal command transport. ### Browser-Sniff as enrichment (spec exists but has gaps) Present to the user via `AskUserQuestion`: > "Found a spec with **N endpoints**, but research shows the live API likely has more (competitors reference M+ features). Want me to use temporary browser discovery on `` to find replayable endpoints the spec missed? I may open or attach to Chrome during generation, and I will ask before installing or upgrading browser-use/agent-browser." > > Options: > 1. **Yes — browser-sniff and merge** (temporarily open or attach to Chrome during generation, capture traffic, then merge only replayable discovered endpoints with the existing spec. Ask before installing capture tools.) > 2. **No — use existing spec** (proceed with what we have) ### Browser-Sniff as primary (no spec found) Present to the user via `AskUserQuestion`. **If `AUTH_SESSION_AVAILABLE=true`**, include an authenticated browser-sniff option: > "No OpenAPI spec found for ``. Want me to browser-sniff `` to discover the API from live traffic?" > > Options: > 1. **Yes — authenticated browser-sniff** (temporarily open or attach to Chrome during generation, use your browser session to discover public and authenticated traffic, and generate only replayable CLI surfaces. Recommended since you confirmed a session.) *(Only show when `AUTH_SESSION_AVAILABLE=true`)* > 2. **Yes — browser-sniff the live site** (temporarily browse `` anonymously, capture API/HTML traffic, and generate a spec only from replayable surfaces. Ask before installing capture tools.) > 3. **No — use docs instead** (attempt `--docs` generation from documentation pages) > 4. **No — I'll provide a spec or HAR** (user will supply input manually) When `AUTH_SESSION_AVAILABLE=false`, show only options 2-4 (the existing 3-option prompt). ### If user approves browser-sniff **Before doing anything else, write the marker entry** for this source: ```json { "source_name": "", "decision": "approved", "reason": "", "asked_at": "" } ``` Append it to `$PRESS_RUNSTATE/runs/$RUN_ID/browser-browser-sniff-gate.json` (create the file if it doesn't exist). #### Step 0: Identify the User Goal Before building the capture plan, answer one question: **What does the end user of this CLI actually want to do?** Read the research brief's Top Workflows. The #1 workflow IS the primary browser-sniff goal. State it in one sentence: - Domino's: "Order a pizza for delivery" - Linear: "Create an issue and assign it to a sprint" - Stripe: "Create a payment intent and confirm it" - ESPN: "Check today's scores and standings" - Notion: "Create a page and organize it in a database" If the API is read-only (news, weather, data feeds), the primary goal is "fetch and filter data" and the flow is search/filter/paginate rather than a multi-step transaction. The browser-sniff will walk through this goal as an interactive user flow. Secondary workflows become secondary browser-sniff passes if time permits. State the goal explicitly before proceeding: "Primary browser-sniff goal: [goal]. I will walk through this as a user flow." Then read and follow [references/browser-sniff-capture.md](references/browser-sniff-capture.md) for the complete browser-sniff implementation: tool detection, installation, session transfer, browser-use/agent-browser/manual HAR capture, replayability analysis, and discovery report writing. ### If user declines browser-sniff **Write the marker entry** for this source before proceeding: ```json { "source_name": "", "decision": "declined", "reason": "", "asked_at": "" } ``` Append it to `$PRESS_RUNSTATE/runs/$RUN_ID/browser-browser-sniff-gate.json`. Proceed with whatever spec source exists. If no spec was found, fall back to `--docs` or ask the user to provide a spec/HAR manually. ### Before leaving Phase 1.7 Every source named in the briefing must have exactly one entry in `browser-browser-sniff-gate.json`. Before proceeding to Phase 1.8, re-read the marker file and verify the count matches the number of named sources from the briefing. If a source is missing, return to the decision matrix for that source. Phase 1.5 will HALT if this check fails. --- ## Phase 1.8: Crowd-Sniff Gate After Phase 1.7 (Browser-Sniff Gate), evaluate whether mining community signals (npm SDKs and GitHub code search) would improve the spec. Skip this gate entirely if the user already passed `--spec` (spec source is already resolved and appears complete). **Time budget:** The crowd-sniff gate should complete within 10 minutes. If `cli-printing-press crowd-sniff` fails or times out, fall back immediately: - If a spec already exists: "Crowd-sniff failed — proceeding with existing spec." - If no spec exists: "Crowd-sniff failed — falling back to --docs generation." ### When to offer crowd-sniff | Spec found? | Research shows gaps? | Action | |-------------|---------------------|--------| | Yes | Yes — competitors or community projects reference more endpoints | **Offer crowd-sniff as enrichment** | | Yes | No — spec appears complete | Skip silently | | No | Community SDKs exist on npm | **Offer crowd-sniff as primary discovery** | | No | No SDKs or code found | Skip — fall back to `--docs` | ### Crowd-sniff as enrichment (spec exists but has gaps) Present to the user via `AskUserQuestion`: > "Found a spec with **N endpoints**, but research shows the live API likely has more. Want me to search npm packages and GitHub code for `` to discover additional endpoints? This typically takes 5-10 minutes." > > Options: > 1. **Yes — crowd-sniff and merge** (search npm SDKs and GitHub code, merge discovered endpoints with the existing spec) > 2. **No — use existing spec** (proceed with what we have) ### Crowd-sniff as primary (no spec found) Present to the user via `AskUserQuestion`: > "No OpenAPI spec found for ``. Want me to search npm packages and GitHub code to discover the API from community usage? This typically takes 5-10 minutes." > > Options: > 1. **Yes — crowd-sniff the community** (search npm SDKs and GitHub code, generate a spec from discovered endpoints) > 2. **No — use docs instead** (attempt `--docs` generation from documentation pages) > 3. **No — I'll provide a spec or HAR** (user will supply input manually) ### If user approves crowd-sniff Read and follow [references/crowd-sniff.md](references/crowd-sniff.md) for the crowd-sniff command, provenance capture, and discovery report writing. ### If user declines crowd-sniff Proceed with whatever spec source exists. If no spec was found, fall back to `--docs` or ask the user to provide a spec/HAR manually. --- ## Phase 1.5: Ecosystem Absorb Gate THIS IS A MANDATORY STOP GATE. Do not generate until this is complete and approved. ### Pre-flight check: browser-sniff-gate marker Before any absorb work, verify `$PRESS_RUNSTATE/runs/$RUN_ID/browser-browser-sniff-gate.json` exists and contains an entry for every source named in the briefing. **If the file is missing:** HARD STOP. Print: > Phase 1.7 Browser-Sniff Gate did not record a decision. Return to Phase 1.7 and evaluate the browser-sniff gate for every source named in the briefing. Do not proceed to Step 1.5a until the file exists. **If the file exists but is missing an entry for a named source:** HARD STOP. Print: > Browser-Sniff Gate missing decision for source ``. Return to Phase 1.7 and evaluate the decision matrix for that source. Do not proceed until every briefing source has a marker entry. **Resume leniency:** If the run was started by an older version of the skill that didn't write markers, warn and continue — do not hard-fail on legacy resumes. Distinguish by checking whether `state.json` predates the marker contract (the marker file didn't exist before 2026-04-11). New runs always hard-fail on a missing marker. **Pre-check (existing):** If no spec or HAR file has been resolved by this point and Phase 1.7 (Browser-Sniff Gate) was not evaluated, STOP. Go back and run the browser-sniff gate decision matrix. The absorb manifest depends on knowing the API surface, which requires a spec. The GOAT CLI doesn't "find gaps." It absorbs EVERY feature from EVERY tool and then transcends with compound use cases nobody thought of. This phase builds the absorb manifest. ### Step 1.5a: Search for every tool that touches this API Run these searches in parallel: 1. **WebSearch**: `"" Claude Code plugin site:github.com` 2. **WebSearch**: `"" MCP server model context protocol` 3. **WebSearch**: `"" Claude skill SKILL.md site:github.com` 4. **WebSearch**: `"" CLI tool site:github.com` (competing CLIs) 5. **WebSearch**: `"" CLI site:npmjs.com` (npm packages) 6. **Raw fetch**: Check `github.com/anthropics/claude-plugins-official/tree/main/external_plugins` for official plugins with the helper from [references/fetch-docs.md](references/fetch-docs.md), or with `gh api` when it can return the file/listing directly. 7. **WebSearch**: `"" MCP site:lobehub.com OR site:mcpmarket.com OR site:fastmcp.me` 8. **WebSearch**: `"" automation script workflow site:github.com` 9. **WebSearch**: `"" SDK wrapper site:npmjs.com` 10. **WebSearch**: `"" client library site:pypi.org` ### Step 1.5a.5: Read MCP source code (if found) If step 1.5a discovered MCP server repos with public source code on GitHub, read the actual source to extract ground-truth API usage — not just README feature descriptions. **Time budget:** Max 3 minutes total. If extraction is unproductive, fall back to README-only research. **For the top 1-2 MCP repos found:** 1. **Identify the main source file.** Use `gh api`, raw GitHub URLs, or the helper from [references/fetch-docs.md](references/fetch-docs.md) to inspect the repo tree and source files without a summarization layer. Find the entry point — typically `src/index.ts`, `server.ts`, `server.py`, `main.go`, or a `tools/` directory. MCP servers are usually small (one main file + tool definitions). 2. **Extract three things:** - **API endpoint paths**: Look for HTTP client calls (`fetch(`, `axios.`, `requests.`, `http.Get`, `client.`) and extract the URL paths (e.g., `GET /v1/issues`, `POST /graphql`). These are the endpoints the MCP maintainer proved work. - **Auth patterns**: Look for how the MCP constructs auth headers — token format (`Bearer`, `Bot`, `Basic`), header name (`Authorization`, `X-API-Key`), environment variable names. This informs our auth setup guidance. - **Response field selections**: Look for which fields are extracted from API responses — these are the high-gravity fields that power users actually need. 3. **Feed into absorb manifest.** In step 1.5b, endpoints extracted from source get attributed as ` (source)` in the "Best Source" column, distinguishing them from README-derived features. Source-extracted endpoints are high-confidence signals — the maintainer verified they work. 4. **Feed auth patterns into research brief.** If the MCP source reveals token format (e.g., `xoxp-` for Slack, `sk_live_` for Stripe), credential setup steps, or required scopes, note them in the Phase 1 brief's auth section. These hints improve the generated CLI's auth onboarding. **Skip this step when:** - No MCP repos were found in 1.5a - MCP repos are private or archived - The MCP is a monorepo where the relevant server is hard to locate within 3 minutes ### Step 1.5a.6: DeepWiki Codebase Analysis (if GitHub repos found) If Phase 1 or Step 1.5a discovered GitHub repos for the API (SDK repos, server repos, MCP server repos), query DeepWiki for a semantic understanding of how the API works - architecture, auth flows, data models, error handling. This complements crowd-sniff (endpoints) and MCP source reading (auth headers) with "how things actually work" context. **Time budget:** 2 minutes max. If DeepWiki is slow or unavailable, skip silently. **Run in parallel** with Steps 1.5a through 1.5a.5 when possible. DeepWiki queries do not depend on MCP source reading results. Read and follow [references/deepwiki-research.md](references/deepwiki-research.md) for the query procedure: wiki structure fetch, targeted section extraction (auth, data model, architecture), and synthesis into the research brief and absorb manifest. **Skip this step when:** - No GitHub repos were discovered during Phase 1 or Step 1.5a - The API is trivially simple (1-2 endpoints, no auth) ### Step 1.5b: Catalog every feature into the absorb manifest For EACH tool found, list EVERY feature/tool/command it provides. Then define how our CLI matches AND beats it: ```markdown ## Absorb Manifest ### Absorbed (match or beat everything that exists) | # | Feature | Best Source | Our Implementation | Added Value | |---|---------|-----------|-------------------|-------------| | 1 | Search issues by text | Linear MCP search_issues | -pp-cli search | Works offline, regex, SQL composable | | 2 | Create issue | Linear MCP create_issue | -pp-cli issue create --stdin --dry-run | Agent-native, scriptable, idempotent | | 3 | Sprint board view | jira-cli sprint view | -pp-cli sprint view | Historical velocity, offline | ``` Every row = a feature we MUST build. No exceptions. If someone else has it, we have it AND it works offline, with --json, --dry-run, typed exit codes, and SQLite persistence. SDK wrapper methods should be treated as features to absorb — each public method/function is a feature the CLI should match. **Our Implementation must start with a parseable disposition.** Use one of these prefixes so Phase 3 can verify the row mechanically: - `-pp-cli ` for a promoted or hand-built Cobra command path that must resolve via ` --help`. - `(generated endpoint) ` for generator-emitted typed endpoint commands that retain the upstream resource shape and are covered by the generated endpoint surface. - `(behavior in -pp-cli ) ...` for features implemented as flags, modes, output shapes, or store behavior inside another command. The named command path still must resolve; the prose after the closing parenthesis explains the behavior to verify later. - `(stub) ...` only for explicitly approved stubs per the rule below. Do not leave `Our Implementation` as freeform prose like `FTS5 offline search` or `SQLite-backed sprint query`. If the row maps to a clean user-facing command, put that command path first. If it does not, choose the explicit disposition that explains why Phase 3 should not treat the whole cell as a new command path. **Stubs must be explicit.** If any row in the manifest will ship as a stub (placeholder implementation that emits "not yet wired" / "wip" messaging), start `Our Implementation` with `(stub)` plus a one-line reason why the full implementation is deferred (e.g., "(stub - requires paid API)", "(stub - requires headless Chrome)"). If the manifest also has a `Status` column, set that value to `(stub)` too, but the `Our Implementation` prefix is the Phase 3 gate's source of truth. Do NOT quietly ship stubs for features the user approved as shipping scope. The Phase Gate 1.5 prose showcase (below) MUST read out stub items separately so the user explicitly approves the stub list. After approval, Phase 3 builds shipping-scope features fully and stubs with honest messaging; no mid-build downgrade from shipping-scope to stub is permitted. If an agent discovers during Phase 3 that a shipping-scope feature cannot be implemented in-session, they must return to Phase 1.5 with a revised manifest — not unilaterally downgrade to a stub. ### Step 1.5c: Identify transcendence features Start with the users, not the technology. The best features come from understanding who uses this service, what their rituals are, and what questions they can't answer today. "What can SQLite do?" is the wrong question. "What would make a power user say 'I need this'?" is the right one. The actual brainstorming runs as a Task subagent in Step 1.5c.5 below — customer model → 2× candidates → adversarial cut. Step 1.5c is the motivation; do not generate transcendence features inline here. The transcendence table in the manifest (Step 1.5d) renders rows in this shape, which mirrors the subagent's `### Survivors` output. The `Buildability` column tags each row `spec-emits` or `hand-code` per [references/novel-features-subagent.md](references/novel-features-subagent.md) so the Phase Gate 1.5 hand-code count has a source of truth in the manifest. The optional `Long Description` column carries agent-facing disambiguation text for Phase 3 Cobra `Long` fields; use `none` when no sibling redirect is needed: ```markdown ### Transcendence (only possible with our approach) | # | Feature | Command | Buildability | Why Only We Can Do This | Long Description | |---|---------|---------|--------------|------------------------|------------------| | 1 | Bottleneck detection | bottleneck | hand-code | Requires local join across issues + assignees + cycle data | Use this command to find cross-team work blockage. Do NOT use it for personal recency checks; use 'since' instead. | | 2 | Velocity trends | velocity --weeks 4 | hand-code | Requires historical cycle snapshots in SQLite | none | | 3 | What did I miss | since 2h | hand-code | Requires time-windowed aggregation no single API call provides | Use this command for recent personal changes. Do NOT use it for backlog bottlenecks; use 'bottleneck' instead. | ``` Minimum 5 transcendence features. These are the commands that differentiate the CLI. ### Step 1.5c.5: Auto-Suggest Novel Features (subagent) **Always spawn the subagent — first prints and reprints alike.** The subagent is the only path that produces this step's outputs (customer model, candidate list, adversarial cut, killed-candidate audit trail). There is no manual fallback. Specifically, do not: - hand-curate the transcendence list from a prior manifest, even when the prior looks complete. Prior `research.json` is INPUT to Pass 2(d), never a substitute for the spawn. - fall back to inline brainstorming inside the SKILL. - skip on cost grounds. With a strong prior the subagent confirms or reframes; with no prior it generates from scratch. Run it either way. - treat disclosure as authorization. Announcing a skip in the gate showcase does not make the skip legal. Read [references/novel-features-subagent.md](references/novel-features-subagent.md) for the prior-research discovery snippet, input bundle, prompt template, and output contract. Run the discovery snippet as written — do not substitute an `ls` of the manuscripts directory. The snippet's `none` branch (no prior research) is a first print, not a skip signal. The only legitimate non-spawn outcome is the pre-flight HALT (brief lacks user research) defined in the reference file. ### Step 1.5d: Write the manifest artifact Write to `$RESEARCH_DIR/-feat--pp-cli-absorb-manifest.md` The manifest now includes compound use cases (Step 1.5c) and auto-suggested + auto-brainstormed features (Step 1.5c.5) in the transcendence table. ### Step 1.5e: Write research.json for README credits After writing the absorb manifest, also write `$API_RUN_DIR/research.json` so the generator can credit community projects in the README. This file MUST match the `ResearchResult` JSON schema that `loadResearchSources()` expects. Populate the `alternatives` array from the absorb manifest's source tools list. Include only tools that: 1. Have a GitHub URL (not npm/PyPI landing pages) 2. Actually contributed features to the absorb manifest 3. Are capped at 8 entries, ordered by number of absorbed features (then by stars) ```bash cat > "$API_RUN_DIR/research.json" <", "novelty_score": 0, "alternatives": [ {"name": "", "url": "", "language": "", "stars": , "command_count": }, ... ], "auth": { "canonical_env_var": "" }, "novel_features": [ { "name": "", "command": "", "description": "", "rationale": "", "example": "", "why_it_matters": "", "group": "" }, ... ], "narrative": { "display_name": "", "headline": "", "value_prop": "<2-3 sentence expansion rendered beneath the title>", "auth_narrative": "", "quickstart": [ {"command": " ", "comment": ""}, ... ], "troubleshoots": [ {"symptom": "", "fix": ""}, ... ], "when_to_use": "<2-4 sentences describing ideal use cases; rendered in SKILL.md only>", "anti_triggers": ["", ...], "recipes": [ {"title": "", "command": " ", "explanation": ""}, ... ], "trigger_phrases": ["", ...] }, "gaps": [], "patterns": [], "recommendation": "proceed", "researched_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)" } REOF ``` For each tool, fill in what you know from the research. Stars and command_count are optional (use 0 if unknown). The `language` field should match the primary implementation language. Skip tools that were found during search but contributed zero features to the manifest. **Novel features rules** (the `novel_features` array populates the README's "Unique Features" section and SKILL.md's "Unique Capabilities" block; MCP exposure comes from the runtime Cobra-tree mirror, not this list): 1. Include all transcendence features from the manifest that scored >= 5/10. Order by score descending. 2. `description` should be user-benefit language, not implementation detail. Good: "See which team members are overloaded before sprint planning." Bad: "Requires local join across issues + assignees + cycle data." 3. `rationale` should explain why this is only possible with our approach. Good: "Requires correlating bookings, schedules, and staff data that only exists together in the local store." Bad: "Cal.com Insights is paid-tier only." 4. `command` must match the actual CLI subcommand that will be built in Phase 3. For subcommands of a resource (e.g., `issues stale`), use the full command path. 5. `example` is a ready-to-run invocation an agent can copy-paste. Use realistic arguments from the API's domain (e.g. `AAPL`, `customer_42`), not ``. Include the `--agent` flag when the feature benefits from structured output. 6. `why_it_matters` is a single agent-facing sentence answering "when should I pick this over a generic API call?" 7. `group` clusters related features under a theme name. Pick 2–5 themes total (e.g. "Local state that compounds", "Agent-native plumbing", "Reachability mitigation"). Use the same `group` string verbatim across features that belong together — exact matches drive README grouping. Leave `group` empty if the CLI has too few novel features to warrant clustering. 8. If the manifest row has a non-`none` `Long Description`, keep that text with the feature implementation notes and use it as the Cobra `Long` field during Phase 3 hand-code. Do not squeeze redirect prose into `description`; `description` stays one-line user-benefit text. 9. If no transcendence features scored >= 5/10, omit the `novel_features` field entirely. 10. Do not add a feature to `novel_features` merely to expose it through MCP. Any user-facing Cobra command becomes an MCP tool automatically unless it sets `cmd.Annotations["mcp:hidden"] = "true"`. **Auth research rule**: 1. `auth.canonical_env_var` is the single-token credential env var discovered from vendor docs, MCP/source analysis, or dominant SDK/CLI convention (for example `APIFY_TOKEN`, `GITHUB_TOKEN`, `STRIPE_SECRET_KEY`). Omit it when no canonical name is known, when auth is HTTP Basic or another credential pair, or when the auth flow needs richer metadata. Fresh generation reads this env var first and keeps the parser-derived name as a fallback automatically. **Narrative rules** (the `narrative` object drives README headline, Quick Start, Auth, Troubleshooting, and the entire SKILL.md): 1. `display_name` is the canonical prose name, discovered during research, with exact brand casing and spacing. This is agentic/research-owned, not slug-inferred by Go code. Good: "Product Hunt", "GitHub", "YouTube", "Cal.com". Bad: "Producthunt", "Github", "Youtube", "Cal Com". Use the slug only for binary names, directories, module paths, config paths, and env-var prefixes. 2. `headline` is the bold one-liner rendered beneath the CLI title. Should name the differentiator, not restate the API. Good: "Every Notion feature, plus sync, search, and a local database no other Notion tool has." Bad: "A CLI for the Notion API." 3. `value_prop` expands the headline to 2–3 sentences. Name specific novel features by command where helpful. 4. `auth_narrative` tells the real auth story for this API (crumb handshake, cookie session, OAuth device flow). Omit for standard API-key auth where the generic branch is fine. 5. `quickstart` is a 3–6 step flow using REAL arguments (symbols, IDs, resource names an agent can actually pass). Each step's `comment` explains *why* it runs. This replaces the generic "resource list" first-command fallback. - Step 1 of `quickstart` should usually be verify-safe: it should exit 0 when `validate-narrative --full-examples` appends `--dry-run` in a no-credentials environment. - Use ` doctor --dry-run` as step 1 (health check, works without auth). Do not use ` auth set-token ` as step 1 because it requires a positional token and is not a verify-safe runnable first step. Auth setup instructions belong in `auth_narrative` prose only, not as an executable quickstart command. 6. `troubleshoots` captures API-specific failure modes (rate-limit mitigation, cookie expiry, paginated quirks). Each `fix` must be actionable — a command or a concrete setting change. 7. `when_to_use` is SKILL-only narrative. 2–4 sentences describing the kinds of agent tasks this CLI is the right choice for. Not rendered in README. 8. `anti_triggers` is SKILL-only narrative. List common task boundaries that should make an agent choose another tool, official SDK, web UI, or human workflow instead of this CLI. Write concrete "do not use this CLI for X" cases, not vague limitations. Omit the field only when no honest boundary is known. 9. `recipes` are 3–5 worked examples rendered in SKILL.md. Each has a title, a real command, and a one-line explanation. Prefer recipes that exercise novel features. **At least one recipe must pair `--agent` with `--select`** — using dotted paths (e.g. `--select events.shortName,events.competitions.competitors.team.displayName`) when the response is deeply nested. APIs like ESPN, HubSpot, and Linear return tens of KB per call; without a `--select` recipe, agents burn context parsing verbose payloads. Pick a command known to return a large or deeply nested response and show the narrowing pattern. **Regex literals must double-escape backslashes** — write `\\b` not `\b` (and `\\t`, `\\f`, etc.) inside any `command`, `fix`, or other JSON string field. JSON parses `\b` as backspace (0x08), `\f` as form feed (0x0C), and so on, which then leak into the rendered SKILL.md as control bytes that render as nothing in most viewers. The generator's render-time scanner rejects these with a clear offset; double-escape from the start to avoid the error. 10. `trigger_phrases` are natural-language phrases a user might say that should invoke this CLI's skill. Include 3–5 domain-specific phrases (e.g. for a finance CLI: "quote AAPL", "check my portfolio", "options for TSLA") and 2 generic phrases ("use ", "run "). Domain verbs vary — don't just template "use X" variants. 11. All `narrative` fields are optional. Omit fields you can't populate honestly rather than emit filler. The generator falls back to generic content gracefully. 12. **Avoid hardcoded counts in narrative copy when the count tracks a runtime list.** A number embedded in `headline` or `value_prop` ("across N trusted sources", "from N retailers", "queries N vendors") propagates into root.go's Short/Long, the README, the SKILL, the MCP tools description, and `which.go` — every output surface that reads the narrative. When the underlying registry grows or shrinks, the count goes stale across all of those surfaces simultaneously, and a single-line edit to add a source requires hunting down ~10 hardcoded copies. Prefer plural-without-count phrasing ("across the major sources", "from a curated set of retailers") or describe the breadth qualitatively ("dozens of vendors") rather than committing to a specific integer. If a count is load-bearing for the value prop, keep the brief's narrative count-free and have the printed-CLI's README/SKILL author write the count once into a single hand-edited paragraph after generation — accepting that it will need a manual update whenever the registry changes. 13. **Use side-effectful examples only when they are the truthful workflow.** `validate-narrative --strict --full-examples` classifies `auth login`, `auth set-token`, `auth logout`, `auth setup`, `--launch`, and mutating `--apply` examples as side-effectful (see `isSideEffectfulNarrativeExample` in `internal/narrativecheck/narrativecheck.go`) and reports each as an `UNSUPPORTED` warning instead of executing it. These warnings do not fail strict aggregation, so it is valid to show an auth or apply command when that is the honest onboarding or bulk-operation shape. Prefer `doctor` or another read-only invocation as `quickstart[0]` when it teaches the same workflow, but do not strip a real auth or apply step just to appease shipcheck. Non-side-effect unsupported examples still fail strict mode when they cannot dry-run, and missing commands, empty command paths, and failed full examples remain failures. **Pre-render framework-command check.** Before running `generate --research-dir`, validate the framework command examples already present in `research.json`. This catches stable template vocabulary mistakes while the fix is still a single-file `research.json` edit, before README.md, SKILL.md, `.printing-press.json`, root help, and other generated surfaces consume the bad narrative. ```bash cli-printing-press validate-narrative --strict --framework-only \ --research "$API_RUN_DIR/research.json" ``` If this reports `sync --entities`, `search --entities`, `search --types`, an absolute date for `sync --since`, or another framework-command flag mismatch, fix `research.json` now and rerun this check before generation. This is a cheap floor, not a replacement for Phase 4 shipcheck: after the CLI exists, `shipcheck` still runs `validate-narrative --strict --full-examples` against the built binary to catch API-specific command paths, generated endpoint flags, and runtime dry-run failures. Also write discovery pages if browser-sniff was used. The generator reads these from `$API_RUN_DIR/discovery/browser-sniff-report.md` (which the browser-sniff gate already writes there). No additional action needed for discovery pages -- they are already in the right location. ### Priority inversion check (combo CLIs only) **Only runs when `source-priority.json` exists from the Multi-Source Priority Gate.** Before Phase Gate 1.5, tally the commands/features the manifest attributes to each named source. Compare against the confirmed priority ordering: - If the primary source has **fewer** commands than any secondary source, this is a **priority inversion** — the free/primary-intent source got demoted because the secondary had more spec coverage. - If the primary source has **zero** commands (all its features were dropped because it lacked a spec), this is a **hard inversion** — the primary was silently replaced. When an inversion is detected, HALT before Phase Gate 1.5 and print: > ⚠ **Priority inversion detected.** > > The confirmed primary is **** but the manifest gives it commands vs **** (secondary) with commands. This usually means the primary's discovery path (browser-sniff, community wrapper, HTML parser) didn't land, and the secondary's clean spec took over. > > The user said is the headline. Shipping this manifest would invert their stated priority. Then ask via `AskUserQuestion`: 1. **Re-run discovery for ** — loop back to Phase 1.7 browser-sniff or Phase 1.8 crowd-sniff for the primary source specifically. 2. **Accept the inversion** — the user explicitly confirms they're fine with the secondary leading. Record this in `source-priority.json` as `inversion_accepted: true`. 3. **Drop ** — remove the secondary from the manifest so it can't overshadow the primary. Do not proceed to the prose showcase until this is resolved. ### Phase Gate 1.5 **STOP.** Present the absorb manifest to the user in two parts: a prose showcase, then a question. The prose showcase and the `AskUserQuestion` are two separate turns. Print the showcase as a plain text reply with every novel feature spelled out, then call `AskUserQuestion` with four short options whose descriptions fit on one line each. The question text is one sentence; the user reads the showcase to decide and the options to act. Cramming the feature list into an option description collapses both turns into one and is the failure mode this gate exists to prevent. **Part 1: Prose showcase (print before the AskUserQuestion)** The showcase exists so the user can decide approve / trim / add ideas without asking a follow-up. Cover four things: 1. **Scope** — how many features absorbed across which tools, how many novel on top, how that stacks up against the best existing tool. 2. **Per-novel-feature readout** — one line each: feature name, what the user gets, and the specific evidence or persona that makes it worth building. 3. **Hand-code commitment** — of the M novel features, K will require hand-written Go after generate (each ~50-150 LoC plus `root.go` wiring). State the hand-code count and the auto-emitted count, then list the names of the hand-code features. The manifest transcendence table's `Buildability` column (populated from the subagent per [references/novel-features-subagent.md](references/novel-features-subagent.md) "Output contract") is the source of truth: count rows tagged `hand-code`; `spec-emits` rows are excluded from the hand-code total. Approving commits the agent to that scope, so the user must see it explicitly before the AskUserQuestion. 4. **Anything else the user should worry about before approving** — stubs, risky dependencies, expensive endpoints, low-confidence ideas. Show every novel feature that scored ≥5/10. Group by theme if there are more than ~12; never hide features behind "Plus N more" or "see full manifest." If zero qualified, say so plainly: "No novel features scored high enough to recommend. The absorbed features cover the landscape well." Format is otherwise yours — markdown headings, prose, a numbered list, whatever reads cleanly. The must-haves are the four things above and the ≥5/10 coverage rule. **Part 2: AskUserQuestion** > "Ready to generate with the full [N+M]-feature manifest? Or do you have ideas to add?" Options (each description must be one short line — the showcase already did the explaining): 1. **Approve — generate now** — Start CLI generation with the full manifest 2. **I have ideas to add** — Tell me features from your experience, then we'll generate 3. **Review full manifest** — Show me every absorbed and novel feature before deciding 4. **Trim scope** — The feature count is too ambitious, let's focus on a subset If user selects **"I have ideas to add"**, ask 3 structured questions targeting personal knowledge the research couldn't surface: 1. "Beyond the [M] ideas above, what workflows do YOU use `` for that we might have missed?" 2. "What frustrates YOU about this API that the research didn't surface?" 3. "What's YOUR killer feature — something only you'd think of?" If `USER_BRIEFING_CONTEXT` is non-empty, acknowledge it: "You mentioned [summary of their vision] at the start. Want to add more, or does the manifest already cover it?" Each answer that produces a concrete feature → score and add to the transcendence table. After the brainstorm, return to this gate with the updated manifest. WAIT for approval. Do NOT generate until approved. --- ## Phase 1.9: API Reachability Gate **MANDATORY. Do NOT skip this phase. Do NOT proceed to Phase 2 without running this check.** Before spending tokens on generation, verify the API actually responds to programmatic requests. One real HTTP call. If it fails, STOP. **Exception for browser-clearance/browser-sniffed website CLIs:** If Phase 1.7 produced a successful browser capture and `$DISCOVERY_DIR/traffic-analysis.json` reports `reachability.mode` as `browser_clearance_http` or `browser_http`, a plain `curl` 403/429 is expected evidence, not a hard stop. In that case the reachability gate passes only if: - the browser-sniff capture contains useful non-challenge traffic (real API, SSR data, structured HTML, RSS/feed data, or page-context fetch evidence), and - Phase 2 will pass `--traffic-analysis "$DISCOVERY_DIR/traffic-analysis.json"` so the generator can emit browser-compatible HTTP transport and, for `browser_clearance_http`, Chrome cookie import. Do not treat a persistent browser sidecar as a shippable CLI runtime. Browsers are allowed for Printing Press discovery and reusable auth/clearance capture; ordinary printed CLI commands must replay through direct HTTP, Surf/browser-compatible HTTP, or stored reusable auth state. If traffic analysis reports `browser_required`, return to discovery to find a replayable HTTP/HTML/RSS/SSR surface or HOLD the run. Useful same-site HTML document pages count as a replayable surface when they return real content, not challenge/login pages. Browser-sniff can promote these into `response_format: html` endpoints so generated commands extract page metadata and filtered links through Surf/direct HTTP instead of keeping a browser sidecar alive. When hand-authoring a `response_format: html` spec with `html_extract.mode: links`, document and choose `link_prefixes` as path-segment prefixes. A prefix `/items` matches `/items` and `/items/...`, but not `/items123.html`; use the parent directory prefix when the leaf segment has embedded IDs or suffixes. See `skills/printing-press/references/spec-format.md` for the exact contract. If the browser capture contained only challenge/login/error pages, this exception does not apply. **Exception for LAN-only / mDNS-discovered APIs:** If the resolved spec's `base_url` is a localhost or loopback placeholder (`http://localhost:`, `http://127.0.0.1:`, or `http://[::1]:`), or Phase 1 research explicitly identifies the API as LAN-only / SSDP / mDNS-discovered with no stable global origin, do not run the generic curl/WebFetch reachability probe. A probe from the generation host would test the agent's loopback or current network, not the user's appliance, speaker, bridge, or local service. For this case, record a Phase 1.9 PASS carve-out in the research brief: ```markdown ## Reachability Gate - Decision: PASS (carve-out) - Reason: lan-only-no-global-url - Evidence: ``` Then proceed to Phase 2. Do not write a freeform manual proof for this case, do not call it a missing-API-key skip, and do not use this carve-out for normal public/cloud origins such as `https://api.example.com`; those still run the reachability probe and decision matrix below. ### The Check Prefer the spec's `auth.verify_path` when it is set; otherwise pick the simplest GET endpoint from the resolved spec (no required params, no auth if possible). If no such endpoint exists, use the spec's base URL. Run one HTTP request and preserve the response body when the server returns a 4xx: ```bash body_file="$(mktemp "${TMPDIR:-/tmp}/pp-reachability-body.XXXXXX")" trap 'rm -f "$body_file"' EXIT status="$(curl -s --max-filesize 65536 -o "$body_file" -w "%{http_code}" -m 10 "/" 2>/dev/null || true)" case "$status" in [0-9][0-9][0-9]) ;; *) status="000" ;; esac printf '%s\n' "$status" ``` Or use `WebFetch` if curl is unavailable. Record the response status and, for any 4xx response body, run the same tier/permission keyword scan against the captured WebFetch body text before deciding. The goal is one real response code plus any 4xx body evidence the API chose to return. If `status` is any 4xx, inspect the body before deciding. Search it case-insensitively for tier or permission terms: ```bash grep -Ei 'tier|allowed|permitted|subscription|quota|plan|scope|limit|permission|forbidden|unauthorized|upgrade|trial' "$body_file" | head -20 ``` When matched lines are present, add them to the Phase 1 research brief under: ```markdown ## Reachability Risk - Tier/permission hints from 4xx body: "" ``` Keep the evidence bounded: include only the lines that explain the access model, trim each line to a readable length, and do not paste bearer tokens, API keys, cookies, or unrelated full response dumps. If the GET returns 2xx/3xx, omit this tier-hint subsection. Do not probe arbitrary mutation endpoints to discover tier limits. A generic "try a PUT/POST/PATCH/DELETE" rule can create accounts, send messages, capture payments, or mutate user data. Mutation probing is allowed only when the resolved spec or OpenAPI operation explicitly marks that endpoint as probe-safe with `x-pp-safe-probe: true`; the endpoint must be idempotent or otherwise harmless for the real account being used. If no endpoint has that explicit marker, stop after the GET body capture above. If one or more probe-safe endpoints are declared and the user provided credentials, run exactly one declared probe-safe endpoint as a second reachability probe and apply the same 4xx body capture and tier-keyword extraction. When more than one exists, choose the lowest-risk declared endpoint by preferring methods in this order: HEAD/OPTIONS/GET, then PUT/PATCH, then POST, then DELETE only if it is the only declared safe option. Break ties by choosing the endpoint with the fewest required parameters and avoiding paths with account, billing, payment, deletion, or notification terms when any safer declared option exists. Record which endpoint was probe-safe in the brief so later phases know the evidence came from an opt-in safe probe. ### OAuth2 Grant Probe If the resolved spec declares `auth.type: oauth2` and has an interactive authorization URL (`authorizationCode` or `implicit` flow in OpenAPI, or an equivalent internal YAML auth field), the generic reachability check is not enough. After the base URL check would otherwise pass, verify the OAuth grant entry point with the user's real public OAuth input before Phase 2. This probe is read-only: it stops at the provider's consent, login, or error page and does not exchange a code, request a token, or ask the user to approve consent. Do not run this grant probe for OAuth2 `client_credentials` flows that only have a token URL. Those are server-to-server credentials, not browser grant flows, and probing the token endpoint would require secret material or a write-like auth attempt. The base reachability check plus later mock/live auth verification cover that shape. **Required inputs:** Use the `client_id` env var or public auth-flow input already resolved during Phase 0.5 and Pre-Generation Auth Enrichment. If the spec exposes `x-auth-vars`, prefer the entry with `kind: auth_flow_input`, `sensitive: false`, and a name or description identifying it as the OAuth `client_id`. If the real client id is missing, HOLD before generation and tell the user exactly which env var to set. Do not substitute a fake client id; fake ids can produce provider-specific errors that look like transport quirks. Build the authorize URL from the resolved spec, not from a guessed provider default: - `client_id`: the real public client id from the env var above. - `redirect_uri`: the redirect URI declared in the spec or auth metadata. - `response_type=code` for authorization-code grants, or the spec's documented response type for implicit grants. - For authorization-code grants, include a safe probe PKCE pair using `S256`. Use `probe_reachability_check_pkce_probe_literal` as the code verifier and compute the URL-safe SHA-256 challenge from it. The verifier is 43 unreserved characters, satisfying the RFC 7636 minimum; providers that do not require PKCE ignore these params, and providers that enforce PKCE should advance to the login or consent page instead of returning a false `invalid_request`. - `scope`, `audience`, `tenant`, `state`, `prompt`, or other provider-required params when the spec or vendor docs require them. Use a benign probe value for `state` if required. Use a redirect-limited GET and inspect the final URL, response body, and response class: ```bash PKCE_VERIFIER="probe_reachability_check_pkce_probe_literal" PKCE_CHALLENGE=$(printf "%s" "$PKCE_VERIFIER" | openssl dgst -sha256 -binary | openssl base64 -A | tr '+/' '-_' | tr -d '=') AUTH_URL="" # Add code_challenge_method=S256 and code_challenge=$PKCE_CHALLENGE to AUTH_URL. PROBE_BODY_AND_META=$(curl -sS -L --max-redirs 10 -m 15 -w "\n%{http_code} %{url_effective}" -o - "$AUTH_URL" 2>/dev/null) PROBE_META=$(printf "%s\n" "$PROBE_BODY_AND_META" | tail -n 1) PROBE_BODY=$(printf "%s\n" "$PROBE_BODY_AND_META" | sed '$d') printf "%s\n" "$PROBE_META" printf "%s\n" "$PROBE_BODY" | head -c 8000 printf "\n" ``` Interpret the result before Phase 2: | OAuth probe result | Action | |--------------------|--------| | HTTP status is `2xx` or `3xx`, final URL stays on the provider's authorization/login/consent host, does not include `error=`, and the response body does not contain an OAuth error code (`invalid_request`, `invalid_client`, `unauthorized_client`, etc.) | **PASS** - the grant entry point is reachable; proceed to Phase 2 | | Final URL or response body reports `invalid_request`, `invalid_client`, `redirect_uri_mismatch`, `unauthorized_client`, `unsupported_response_type`, or equivalent | **HARD STOP** - OAuth config is misconfigured; surface the provider error and point the user to the mismatched client id, redirect URI, app type, tenant, or required scope | | HTTP status is `4xx` or `5xx` without a recognizable OAuth error code | **WARN** - flag provider-specific routing or login-shell behavior for manual review before generation | | Final URL lands on a generic non-OAuth error page, marketing page, or unrelated login landing page | **WARN** - flag endpoint ambiguity or provider-specific routing for manual review before generation | | Timeout/DNS/connection refused or HTTP status `000` | **WARN** - same handling as the generic reachability WARN | On HARD STOP, do not generate. Present a specific, provider-neutral message: > "WARNING: ``'s OAuth authorize probe failed before generation. The > provider returned ``. Check that the spec's > `authorization_url`, `redirect_uri`, `response_type`, client id env var, app > type, tenant, and required scopes match the registered OAuth application." This OAuth probe is additive to the base reachability gate. Non-OAuth APIs (`api_key`, `bearer_token`, `cookie`, `composed`, `session_handshake`, `none`) skip it entirely. **If the check returns 403/429 with bot-protection evidence and `probe-reachability` has not already run for this URL during Phase 1.7's Direct HTTP challenge rule, run it now before consulting the decision matrix:** ```bash cli-printing-press probe-reachability "" --json ``` The matrix below references `probe-reachability` `mode` for the bot-detection rows. If the probe already ran in Phase 1.7, reuse that result; do not re-probe. ### Decision Matrix | Result | Browser capture result | Traffic-analysis reachability | Action | |--------|------------------------|-------------------------------|--------| | 2xx/3xx | Any | Any | **PASS** - proceed to Phase 2 | | 401 (no key provided) | Any | Any | **PASS** - expected when API needs auth and user declined key gate | | 403/429 with HTML/bot detection | `probe-reachability` returned `browser_http` | runtime is `browser_http` (Surf) | **PASS** - the printed CLI will ship Surf transport which clears the protection. No clearance cookie capture in the printed CLI, regardless of whether browser-sniff also ran for endpoint discovery | | 403/429 with HTML/bot detection | Successful useful capture | `browser_http` or `browser_clearance_http` | **PASS** - proceed with browser-compatible HTTP / clearance strategy | | Any | Capture only works through a live page context | `browser_required` | **HOLD** - find a lighter replayable surface before Phase 2 | | 403/429 with HTML/bot detection | No browser capture attempted but browser-sniff approved/pre-approved AND `probe-reachability` returned `browser_clearance_http` or `unknown` | Any | **RETURN TO PHASE 1.7** - attempt cleared-browser capture before pivoting scope | | 403/429 with HTML/bot detection | Capture contains only challenge/error pages | Any | **HARD STOP** | | 403 | No successful useful capture | Research found 403 issues | **HARD STOP** | | 403 | No successful useful capture | No 403 research issues | **WARN** - ask user | | Timeout/DNS/connection refused | Any | Any | **WARN** - ask user | ### On HARD STOP Present via `AskUserQuestion`: > "WARNING: `` appears to block programmatic access. [what failed: e.g., 'HTTP 403 with HTML error page', 'browser-sniff gate failed with bot detection', 'reteps/redfin has 6+ issues about 403 errors']. Building a CLI against an unreachable API wastes time and tokens." > > 1. **Try anyway** - proceed knowing the CLI may not work against the live API > 2. **Pick a different API** - start over > 3. **Done** - stop here ### On WARN Present via `AskUserQuestion`: > "The API returned [error]. This might be temporary, or it might mean programmatic access is blocked. Want to proceed?" > > 1. **Yes - proceed** - generate the CLI anyway > 2. **No - stop** - pick a different API or provide a spec manually ### On PASS Proceed silently to Phase 2. --- ## Phase 2: Generate ### Pre-Generation Category Enrichment Before generating a non-catalog CLI, set the spec's top-level `category` before running `generate`. The category must come from the Phase 1 research brief's domain judgment, mapped to the public catalog enum documented in `docs/CATALOG.md`. Non-catalog means the run is based on browser-sniffed traffic, HAR capture, docs-derived specs, or a hand-authored internal spec rather than `cli-printing-press generate ` using a built-in catalog entry. For internal YAML specs, add: ```yaml category: ``` If the source is an OpenAPI file and the workflow has an editable overlay or derived internal spec, carry the same top-level category into that generated spec artifact before the final `generate` invocation. If there is no editable spec artifact, such as direct `--docs` generation, pass `--category ` on the final `generate` invocation. Do not add the category after generation just to satisfy publish; the generated manifest, README, and SKILL install section must all come from the same category-aware spec, or `verify-skill canonical-sections` can drift. Catalog-mode runs skip this step: keep the built-in catalog entry's category unchanged, even if Phase 1 research would classify the API differently. ### Pre-Generation Cache Enrichment Before generating, decide whether the spec should opt into generator-owned cache freshness. The generator already has the freshness helpers and auto-refresh hook, but it emits them only when the spec declares `cache.enabled: true` and the CLI has a real sync path. Stateful catalog-shaped CLIs otherwise serve local data exactly as it was last synced, which caps the cache freshness score and can leave agents reading stale SQLite rows without a warning. Enable cache freshness only when the resolved spec, profiler output, or absorb manifest shows at least one covered read path backed by a syncable resource that `sync` can refresh from the upstream API before serving. Do not enable it from Phase 1 research notes or scorecard goals alone. Leave it disabled for stateless read-through wrappers and for local stores that are primarily per-user working state, such as carts, drafts, or other session-owned data where a pre-read refresh could replace the user's local state with a different snapshot. Also leave it disabled for quota-metered, paid, rate-limited, or expensive bulk refresh APIs unless the refresh path is cheap, bounded, and clearly valuable; those CLIs should rely on manual `sync` plus the generated `doctor` cache report instead of surprising users with pre-read upstream calls. Catalog-mode runs skip this step: keep the built-in catalog entry's cache settings unchanged. Do not pass a flag or patch generated files after the fact; cache freshness must come from the spec that drives generation. For internal YAML specs, add the cache block before the final `generate` invocation only when at least one generated syncable resource read command will be covered automatically, or `cache.commands` will register a real hand-authored store-reading command: ```yaml cache: enabled: true stale_after: 168h # choose a domain-appropriate default refresh_timeout: 30s # optional; blank uses the generated runtime default ``` Generated resource list/get/search commands are covered automatically from the syncable resources profile. Use `cache.commands` only for hand-authored novel commands that read the local store and are not generated resource commands. The command `name` is the Cobra path without the binary name, and every listed resource must be declared in `resources:` and classified as syncable. ```yaml cache: enabled: true stale_after: 168h # choose a domain-appropriate default commands: - name: resources: [] ``` Pick `stale_after` from the domain's update cadence: shorter for live feeds or rapidly changing inventory, longer for reference catalogs and archival data. Do not enable cache just to satisfy the scorecard if there is no upstream refresh path or no user value in pre-read freshness; the generator intentionally skips the helpers when they would be dead code. ### Pre-Generation Auth Enrichment Before generating, check whether the resolved spec has auth. This matters most for browser-sniffed and crowd-sniffed specs where the mechanical auth detection may have failed (e.g., session expired during browser-sniff, SDK didn't expose auth patterns). **Check the spec:** - For internal YAML specs: look for `auth:` section with `type:` not equal to `"none"` - For OpenAPI specs: look for `components.securitySchemes` or `security` sections **If auth is missing** (`type: none` or no auth section) AND Phase 1 research found auth signals, enrich the spec before generation: 1. Check the research brief for auth mentions (Bearer, API key, token, cookie, OAuth) 2. Check Phase 1.5a MCP source code analysis for auth patterns (header names, token formats) 3. Check Phase 1.6 Pre-Browser-Sniff Auth Intelligence results (if the user confirmed auth) If any source identified auth, **edit the spec YAML** to add the auth section before running generate. Catalog-mode runs (`cli-printing-press generate ` where `` is in `catalog/`) can skip the spec edit when the catalog entry declares `auth_env_vars` — those canonical names are applied automatically and the parser's name-derived default name is retained as a trailing fallback so operators on existing setups don't need a rename. For internal YAML specs: ```yaml auth: type: bearer_token # or api_key, depending on what research found header: Authorization # or the specific header from MCP source in: header env_vars: - _TOKEN # bearer_token → _TOKEN, api_key → _API_KEY ``` When research or source metadata names a real single-token env var, record it in `research.json` as `auth.canonical_env_var`; fresh generation reads that name first and keeps the parser-derived env var as a trailing fallback. When you are editing an internal YAML spec directly, use only the canonical name in `env_vars`; do not add guessed slug-based aliases. For OpenAPI specs, choose the security scheme by wire format, not by whether the token feels like an API key. Use `type: http` with `scheme: bearer` when the upstream API sends `Authorization: Bearer `, including PAT-shaped tokens such as Slack `xoxp`, Notion integration tokens, Linear API keys, and GitHub PATs. Use `type: apiKey` only when the API sends the configured value as the raw header or query value, such as `X-API-Key: ` or `Authorization: ` with no scheme prefix. The generator adds the `Bearer ` prefix for `http` bearer schemes; `apiKey` sends exactly the configured value and will not add a prefix. Quick test: if upstream docs or live traffic show `Authorization: Bearer `, model it as `http` bearer. If they show `X-API-Key: `, `?api_key=`, or `Authorization: ` with no scheme prefix, model it as `apiKey`. For OpenAPI specs, prefer `x-auth-env-vars` on the selected security scheme when the wrapper slug differs from the underlying API brand. **If auth IS present** in the spec but Phase 1 evidence shows the slug-derived env var will differ from the canonical name users have already set for this API, enrich the spec with the canonical name before generation. The slug-derivation rule (security-scheme slug uppercased plus `_TOKEN` / `_API_KEY` / `_OAUTH2` per type) rarely matches the canonical name for established APIs. Common shapes: - Stripe (bearer): canonical `STRIPE_SECRET_KEY`, not slug-derived `STRIPE_OAUTH2` - HubSpot (bearer): canonical `HUBSPOT_PRIVATE_APP_TOKEN`, not slug-derived `HUBSPOT_API_KEY` - Twilio (HTTP Basic, two-var pair): canonical `TWILIO_ACCOUNT_SID` + `TWILIO_AUTH_TOKEN`, not slug-derived `TWILIO_USERNAME` + `TWILIO_PASSWORD` - Keap (OAuth2 authorization-code): canonical `KEAP_SERVICE_ACCOUNT_KEY`, not slug-derived `KEAP_OAUTH2` Walk through: 1. Compute the slug-derived env var the generator will pick (security-scheme slug, uppercased, plus the type-suffix above; HTTP Basic produces a `_USERNAME` + `_PASSWORD` pair; OAuth2 `client_credentials` produces a `_CLIENT_ID` + `_CLIENT_SECRET` pair). 2. Check Phase 1 research, Phase 1.5a MCP source code analysis, and community wrapper READMEs for a canonical env var name documented by the vendor or in widespread use. 3. If they differ and the canonical name is a single-token credential, record it in `research.json` as `auth.canonical_env_var`. The generator will read the canonical name first and retain the slug-derived form as a fallback. If you are editing the source spec directly instead, add `x-auth-env-vars` on the selected security scheme (OpenAPI) or set `auth.env_vars` to the canonical name (internal YAML). Use only the canonical name in the spec edit; do not retain guessed aliases there. For HTTP Basic, supply the full two-entry canonical pair (username position first, password position second) via `x-auth-env-vars`. For OAuth2 `client_credentials`, the parser silently re-applies the `CLIENT_ID`/`CLIENT_SECRET` default when `x-auth-env-vars` has fewer than two entries (see `applyAuthEnvVarDefaults` in `internal/openapi/parser.go`); if the canonical secret is a single service-account token for a `client_credentials` flow, use `x-auth-vars` instead (next section) so the override is preserved. 4. If research surfaces no canonical name distinct from the slug-derived form, do nothing. The slug-derived name is fine, and a spurious `x-auth-env-vars` would just shadow it with the same value. ```yaml # Bearer / API-key single-token case (Stripe, HubSpot, Keap on # authorization-code grant, most apiKey schemes). components: securitySchemes: keapOAuth2: type: oauth2 flows: authorizationCode: { ... } x-auth-env-vars: - KEAP_SERVICE_ACCOUNT_KEY ``` ```yaml # HTTP Basic two-var canonical pair (Twilio). components: securitySchemes: basicAuth: type: http scheme: basic x-auth-env-vars: - TWILIO_ACCOUNT_SID - TWILIO_AUTH_TOKEN ``` Skipping this step pushes the agent into hand-patching `internal/config/config.go` Load and `internal/cli/doctor.go` env-var checks after a `doctor` FAIL against the operator's real environment. Enriching the spec avoids that round-trip. For OpenAPI bearer-token specs that need richer env-var metadata (kind classification, optional credentials, OR-group relationships), keep the security scheme as `http` bearer and put `x-auth-vars` on that scheme. Do not switch to `apiKey` just to attach the richer metadata. ```yaml components: securitySchemes: bearerAuth: type: http scheme: bearer bearerFormat: xoxp x-auth-vars: - name: SLACK_BOT_TOKEN kind: per_call required: false sensitive: true description: Set this OR `SLACK_USER_TOKEN` for workspace API calls. - name: SLACK_USER_TOKEN kind: per_call required: false sensitive: true description: Set this OR `SLACK_BOT_TOKEN` for user-scoped API calls. ``` For OpenAPI raw-key schemes that need richer env-var metadata, keep `apiKey` and place `x-auth-vars` on the raw-key scheme. ```yaml components: securitySchemes: rawHeaderKey: type: apiKey in: header name: X-API-Key x-auth-vars: - name: _API_KEY kind: per_call required: true sensitive: true description: Raw API key header value. ``` See `docs/SPEC-EXTENSIONS.md` for the canonical `x-auth-vars` schema. `kind` controls who supplies the value: - `per_call` is the default user-supplied credential used by normal commands. - `auth_flow_input` is only needed during `auth login`. - `harvested` is populated by the auth login flow into local config. `sensitive: true` means credential material that must be redacted in logs and agent context. Use `sensitive: false` for public configuration, such as an OAuth `client_id`. Encode AND/OR relationships with each var's `required` flag plus `description` text. There is no first-class group syntax. For OR cases, mark each alternative `required: false` and name the other option in the description. For AND cases, mark each required member `required: true`. The parser auto-classifies cookie schemes as `harvested` and OAuth2 `client_credentials` inputs as `auth_flow_input`. Add `x-auth-vars` only when overriding those defaults or resolving multi-scheme ambiguity. For OpenAPI specs, add an `info.description` mention if one doesn't exist — the parser's `inferDescriptionAuth` will detect it automatically. **Why enrich before generation, not after:** The generator's templates (config, client, doctor, auth, README) all read `Auth.*` fields from the spec. Patching config.go after generation only fixes env var support — it misses the doctor auth check, client auth header, README auth section, and auth command setup. Enriching the spec means every template produces correct auth from the start. **When to skip:** If the API genuinely doesn't need auth (ESPN public endpoints, weather APIs, public data feeds), don't invent auth. The signal must come from research — not from guessing. No research mention of auth = no enrichment. #### Public Parameter Name Enrichment Before generating, inspect endpoint params and body fields whose API names would make poor CLI or MCP inputs, especially one-letter keys, punctuation-heavy keys, or names that only make sense inside the upstream protocol. When clear evidence shows the user-facing meaning, author `flag_name` in the internal spec or overlay and add `aliases` only for compatibility spellings. This is an agent judgment step, not a generator inference step. The skill uses research, docs, SDK/source names, browser-sniff form labels, traffic-analysis request context, and endpoint workflow evidence to decide the semantic name. The Printing Press CLI validates and propagates that authored data through Cobra flags, generated examples, typed MCP schemas, and `tools-manifest.json`; it must not guess that a cryptic key always means the same thing across APIs. Run the deterministic inventory before generation when a spec may contain cryptic wire names: ```bash cli-printing-press public-param-audit --spec --ledger /public-param-audit.json --strict ``` The command does not decide that every finding needs a public flag. It identifies parameters that need an agent decision. For each pending finding, either: - Author `flag_name` and compatibility `aliases` in the spec or overlay, then rerun the audit so the finding becomes resolved from the spec itself. - Record `decision: "skip"` in the ledger with `source_evidence` and `skip_reason` when source material shows no public rename is warranted. A ledger entry with `decision: "flag_name"` or `proposed_flag_name` is only a note to the agent; it is not complete until the spec or overlay actually contains the public name. Strict mode fails on unreviewed findings, not on evidence-backed skips. A one-letter wire name alone is enough to enter the inventory, but not enough to author a rename. Good evidence: - Explicit parameter descriptions, vendor docs, or SDK argument names. - Browser-sniff form/input labels and the interaction that produced the request. - Traffic-analysis context tying the parameter to a specific endpoint workflow. - Existing manuscript notes or reviewed examples that use the same task-level term. Bad evidence: - A one-letter name alone. - Generic descriptions such as "query parameter" or "string value". - Ambiguous sample values with no endpoint context. - Global assumptions such as "`s` means search" or "`c` means city". Prefer concise task-level names an agent would naturally use on that command. For a store-locator endpoint with `s` described as `Street address` and `c` described as `City, state, zip`, author: ```yaml params: - name: s flag_name: address aliases: [s] description: Street address - name: c flag_name: city aliases: [c] description: City, state, zip ``` The upstream wire keys remain `s` and `c`; generated users and agents see `address` and `city`. If the evidence is unclear after research, leave `flag_name` unset, record the reviewed source material and evidence gap in the audit ledger, and preserve the gap in the manuscript rather than inventing a friendly name. #### Free/Paid Tier Routing Enrichment If Phase 1 finds that the headline commands should stay free but secondary enrichment needs a paid key, declare tier routing in the spec before generation. Do this only when research identifies a real split; do not invent tiers for a single-auth API. Detection signals: - Research or source-priority notes say the primary source is free but a secondary source needs a paid/API key. - Some endpoints are documented as public while adjacent enrichment endpoints are documented as paid, partner, premium, or quota-gated. - Browser/crowd sniffing found public endpoints and SDK/MCP research found a separate credential for expanded coverage. - Sniffed specs carry per-endpoint `observed_auth` (a list of lowercased request header names observed during capture, e.g. `[authorization]` or `[x-api-key]`). An empty or missing `observed_auth` on an endpoint is evidence that the request went anonymously; a populated list is evidence the endpoint required auth. The same per-endpoint signal is mirrored on `TrafficAnalysis.endpoint_clusters[].observed_auth` in the traffic-analysis sidecar. Treat both as observation-only — not a security scheme declaration — and corroborate with documentation before declaring a tier. Action: - Internal YAML: add `tier_routing` plus `tier` on the affected resource or endpoint. - OpenAPI: add `x-tier-routing` at the root or under `info`, and add `x-tier` to the path item or operation. - Use `auth.type: none` for the free tier. - Use only `api_key` or `bearer_token` for credential tiers in v1. - If a credential tier uses a different `base_url`, it must be HTTPS and same host-family unless `allow_cross_host_auth: true` records explicit review. - Do not combine `no_auth: true` or OpenAPI `security: []` with a credential tier. Skip when: - All useful commands require the same credential. - The paid source would become the primary headline surface instead of enrichment. - The auth split requires OAuth, cookie, composed, or session-handshake tier auth; handle that as a normal auth-mode decision for now. Example: ```yaml tier_routing: default_tier: free tiers: free: auth: {type: none} paid: auth: type: api_key in: query header: api_key env_vars: [EXAMPLE_PAID_KEY] resources: search: tier: free endpoints: list: method: GET path: /search enrich: method: GET path: /paid/search tier: paid ``` #### Tagging endpoints `no_auth: true` (composed/cookie auth APIs) For APIs whose `auth.type` is `cookie`, `composed`, or `session_handshake` — i.e., auth that requires interactive setup (browser cookie capture, multi-step token exchange) — audit each endpoint individually for whether it actually needs authentication. The default of `no_auth: false` means "auth required"; flip it to `no_auth: true` for endpoints that work without credentials. Typical unauthenticated endpoints worth tagging: - **Auth-flow primitives:** login, registration, password-reset, email-confirm, refresh-token, OAuth callback. The user isn't authenticated when calling these — they ARE the auth flow. - **Public discovery:** store/location finder, menu browse, public catalog, category listing, public search, public product detail. - **Health/metadata:** health checks, version probes, capability flags, sitemap. Why this matters: the `no_auth` count drives downstream decisions. Specifically, a composed-auth API with zero `no_auth: true` tags previously got labeled `mcp_ready: cli-only` and was suppressed from MCPB manifest emission, which broke the Claude Desktop install path entirely. The current generator (post-2.5) ships a manifest regardless, but the readiness label, scorecard breadth dimension, and SKILL.md prose all read better when the count reflects reality. If unsure whether an endpoint requires auth, the safe default is `no_auth: false` (auth required) — over-tagging can mislead users to expect tools that won't work. **Example for a composed-auth pizza-ordering API:** ```yaml resources: account: endpoints: register: method: POST path: /account/register no_auth: true # registering means you're not yet authenticated login: method: POST path: /account/login no_auth: true # the auth flow itself profile: method: GET path: /account/profile # no_auth defaults to false — needs auth to view your own profile stores: endpoints: find: method: GET path: /stores/near no_auth: true # public store finder cart: endpoints: checkout: method: POST path: /cart/checkout # no_auth defaults to false — placing an order needs auth ``` #### Cookie/composed HTML transport For specs with `auth.type: cookie` or `auth.type: composed` and any `response_format: html` endpoint, treat browser fingerprint compatibility as the safe default. The generator emits Surf-backed Chrome transport for that shape unless the spec explicitly says `http_transport: standard`. Before setting an explicit standard opt-out, run `cli-printing-press probe-reachability` against a representative HTML GET endpoint. If the probe returns `standard_http`, record `http_transport: standard` in the spec. If it returns `browser_http`, leave the default or set `http_transport: browser-chrome`. If it returns `browser_clearance_http`, return to the browser-clearance flow above so the generated CLI has both browser-compatible HTTP and reusable browser auth proof. For cookie/composed-auth CLIs, recommend the `press-auth` companion binary — it captures cookies once via a controlled Chrome window and serves them to generated CLIs on demand, avoiding the on-disk session-cookie blind spot that breaks `auth login --chrome` against a daily Chrome profile. See [references/auth-companion.md](references/auth-companion.md) for the recommendation flow, install command, and debug playbook. ### Pre-Generation MCP Enrichment Before generating, count the spec's MCP tool surface and decide whether to opt into the spec's `mcp:` enrichment fields. This matters most for medium-to-large APIs (>30 tools) where the default endpoint-mirror surface scores poorly on the scorecard's MCP architectural dimensions and burns agent context at runtime. **Why before generation, not after:** the generator emits the MCP server's `main.go`, `tools.go`, `intents.go`, `code_orch.go`, `tools-manifest.json`, and README MCP section from the spec at generate-time. Patching after generation fragments across 4+ files, won't be byte-identical, and the polish skill cannot fix it (polish doesn't re-run generation). Enriching the spec means every template emits the right surface from the start. **Count the tool surface.** Two parts: 1. **Typed endpoints** — count `endpoints` across all `resources` (and `sub_resources`) in the spec. These become per-endpoint MCP tools at generate-time. 2. **Cobratree-walked tools** — the runtime walker registers user-facing Cobra commands as MCP tools. Estimate as: `extra_commands` count + ~13 framework tools that ship by default (sql, search, context, sync, stale, doctor, reconcile, etc., minus framework-skipped). When novel features are planned, add their estimated command count. The total is what an agent loads at MCP server start. **Decision table:** | Total tools | Action | |-------------|--------| | <30 | Skip — default endpoint-mirror surface is fine. | | 30–50 | Ask the user. Suggest `mcp.transport: [stdio, http]` for remote reach; suggest `mcp.intents` if there are clear multi-step workflows. | | >50 | The generator auto-applies the Cloudflare pattern (transport + code orchestration + hidden endpoint tools) unless `mcp.orchestration` / `x-mcp.orchestration` is explicitly set. | **Mandatory >50 endpoint-tools confirmation.** If the pre-generation count predicts more than 50 endpoint tools, expect `generate` to print an informational line beginning `info: applied Cloudflare MCP pattern`. This is the intended default and does not require a blocking question. Before verification, polish, dogfood, or publish, confirm the generated MCP surface is the thin `_search` + `_execute` pair. If the user explicitly wants raw endpoint tools past the threshold, set `mcp.orchestration: endpoint-mirror` (internal YAML) or `x-mcp.orchestration: endpoint-mirror` (OpenAPI) before regenerating. **The Cloudflare pattern** (default for large surfaces without explicit orchestration) — the generator applies this shape automatically. Add the spec block only when you need to make the choice explicit or preserve it across older generator versions: ```yaml mcp: transport: [stdio, http] # remote-capable; reaches hosted agents orchestration: code # thin _search + _execute pair endpoint_tools: hidden # suppress raw per-endpoint mirrors intents: # optional; named multi-step intents - name: fetch_and_summarize description: Fetch an item then summarize it params: - name: item_id type: string required: true description: item identifier steps: - endpoint: items.get bind: id: ${input.item_id} capture: item - endpoint: items.summarize bind: body: ${item.body} capture: summary returns: summary ``` `mcp.transport: [stdio, http]` adds HTTP streamable transport so cloud-hosted agents (Managed Agents, web clients) can connect. `mcp.orchestration: code` emits the thin search+execute pair that covers the full surface in ~1K tokens. `mcp.endpoint_tools: hidden` removes the raw per-endpoint tools that would otherwise still show up alongside the orchestration pair. For OpenAPI input specs, declare these fields under `x-mcp:` at the document root (OpenAPI 3.0 `x-*` vendor extensions). The shape is identical to the internal-YAML `mcp:` block above — same field names, just nested under a vendor-extension key. See [`docs/SPEC-EXTENSIONS.md`](../../docs/SPEC-EXTENSIONS.md) for the canonical schema and `info`-level placement option. **Smaller-surface variants:** - Just want remote reach? Small APIs (at or under `spec.DefaultRemoteTransportEndpointThreshold` typed endpoints) get `[stdio, http]` by default — no spec edit needed. Set `mcp.transport: [stdio, http]` explicitly only when the API is above the threshold and still wants remote reach. - Have 3–5 obvious multi-step workflows but <50 endpoints? Add `mcp.intents` without code orchestration; leave `endpoint_tools` at default (visible). **When to skip entirely:** small APIs (<30 tools), one-shot specs that won't be installed as MCP servers, or APIs where the user explicitly opts out of MCP enrichment. **Verifying after generation:** the scorecard's `mcp_remote_transport`, `mcp_tool_design`, and `mcp_surface_strategy` dimensions reflect the choices above. A correctly enriched spec for a >50 tool API should score 10/10 on all three. If polish later reports these dims weak, that's a sign this enrichment step was skipped — re-run generation with the enriched spec rather than trying to fix it in polish. ### Lock and Generate Before running any generate command, acquire the build lock: ```bash cli-printing-press lock acquire --cli -pp-cli --scope "$PRESS_SCOPE" ``` If acquire fails (another session holds a fresh lock), present the lock status to the user and let them decide: wait, use a different CLI name, force-reclaim, or pick a different API. The `--category ` flag shown below is for non-catalog runs whose category was not already authored into an editable spec. Omit it for catalog-config runs; the built-in catalog category is authoritative there. `--lenient` stubs missing local `#/components/schemas/` refs as permissive object schemas with warnings so converted OpenAPI specs can still generate. Add `--strict-refs` only when a run must fail instead of accepting those local schema stubs; it does not change the rest of lenient cleanup. OpenAPI / internal YAML: ```bash cli-printing-press generate \ --spec \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --force --lenient --validate ``` Browser-browser-sniff-enriched (original spec + browser-sniff-discovered spec): ```bash cli-printing-press generate \ --spec \ --spec "$RESEARCH_DIR/-browser-sniff-spec.yaml" \ --name \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --spec-source browser-sniffed \ --traffic-analysis "$DISCOVERY_DIR/traffic-analysis.json" \ --force --lenient --validate # If proxy pattern was detected during browser-sniff, add: # --client-pattern proxy-envelope ``` Sniff-only (no original spec, browser-sniff was the primary source): ```bash cli-printing-press generate \ --spec "$RESEARCH_DIR/-browser-sniff-spec.yaml" \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --spec-source browser-sniffed \ --traffic-analysis "$DISCOVERY_DIR/traffic-analysis.json" \ --force --lenient --validate # If proxy pattern was detected during browser-sniff, add: # --client-pattern proxy-envelope ``` Crowd-browser-sniff-enriched (original spec + crowd-discovered spec): ```bash cli-printing-press generate \ --spec \ --spec "$RESEARCH_DIR/-crowd-spec.yaml" \ --name \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --force --lenient --validate ``` Crowd-sniff-only (no original spec, crowd-sniff was the primary source): ```bash cli-printing-press generate \ --spec "$RESEARCH_DIR/-crowd-spec.yaml" \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --force --lenient --validate ``` Both browser-sniff + crowd-sniff (merged with original): ```bash cli-printing-press generate \ --spec \ --spec "$RESEARCH_DIR/-browser-sniff-spec.yaml" \ --spec "$RESEARCH_DIR/-crowd-spec.yaml" \ --name \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --traffic-analysis "$DISCOVERY_DIR/traffic-analysis.json" \ --force --lenient --validate ``` Docs-only: ```bash cli-printing-press generate \ --docs \ --name \ --output "$CLI_WORK_DIR" \ --research-dir "$API_RUN_DIR" \ --category \ --force --validate ``` GraphQL-only APIs: - Generate scaffolding only in Phase 2 - Build real commands in Phase 3 using a GraphQL client wrapper After generation: **Verify the CLI description across every surface.** A single curated one-liner is rendered into five files: `internal/cli/root.go` (`Short:`), `SKILL.md` frontmatter (`description:`), `.goreleaser.yaml` (`brews:` description), `internal/cli/agent_context.go` (`Description:`), and `internal/mcp/tools.go` (the `handleContext` response's `"description"` key). Each resolves from the authored sources (`narrative.headline` in `research.json`, or `cli_description:` in the spec) when set. `root.go`'s `Short:` has a safe generic fallback (`"Manage resources via the API"`); the other four fall through to the spec's raw `info.description` — which is often the upstream OpenAPI blob leading with a Markdown heading like `# Introduction` followed by API-shaped paragraphs. Eyeballing only `root.go` will miss the failure mode because `root.go` is the only surface that's structurally immune. Open at least the `SKILL.md` frontmatter `description:` and the `.goreleaser.yaml` `brews:` block in addition to `root.go`'s `Short:`. If any reads as API documentation rather than user-facing CLI purpose ("AeroAPI is a simple, query-based API…"), or contains a bare Markdown heading, the authored sources are missing. Fix at the source: set `narrative.headline` in `research.json` to a single-sentence differentiator (name what makes this CLI worth using, don't restate the API), or add a `cli_description:` line to the spec. Then regenerate. Do not hand-edit the printed files — they revert on the next regen. **REQUIRED: Preserve README sections.** The generated README contains 5 standard sections that the scorecard checks for: Quick Start, Agent Usage, Health Check, Troubleshooting, and Cookbook. When rewriting the README for this API during Phase 3, **preserve all 5 sections**. You may add additional sections that help users of this specific API (e.g., "Rate Limits", "Pagination", "Authentication Setup"), but never remove the standard ones. **REQUIRED: Verify auth was generated.** Check if the generated `config.go` has auth env var support (look for `os.Getenv` calls for API key variables). If the pre-generation auth enrichment ran correctly, this should already be present. If not (enrichment was missed or the spec was ambiguous), this is the safety net: check the Phase 1 research brief for auth requirements and manually add env var support to `config.go` using the pattern: add `APIKey`/`APIKeySource` fields to the Config struct, and `os.Getenv("_API_KEY")` in the Load function. **Validate narrative `command` strings before publishing examples.** The LLM (or human) authoring `research.json` can name commands that don't actually exist in the generated CLI — ` stats` when the real shape is ` reports stats`, or a command that was dropped because its endpoint had a complex body. It can also write a real command path with a bogus flag or positional shape. Without a check, the broken commands ship to the README's Quick Start (`narrative.quickstart`) and the SKILL's recipes (`narrative.recipes`); users copy-paste them and hit failures on the very first invocation. `cli-printing-press shipcheck` now runs `validate-narrative --strict --full-examples` automatically after `verify` builds the CLI binary. The standalone command is still useful immediately after editing `research.json`: it walks every `narrative.quickstart[].command` and `narrative.recipes[].command`, strips the binary name and trailing arguments, and runs ` --help` for each. With `--full-examples`, it also runs the complete example under `PRINTING_PRESS_VERIFY=1`, appending `--dry-run` when the command advertises it. This catches bad flags and argument shapes without making live API calls. ```bash QUICKSTART_BINARY="$CLI_WORK_DIR/-pp-cli" go build -o "$QUICKSTART_BINARY" "$CLI_WORK_DIR/cmd/-pp-cli" cli-printing-press validate-narrative --strict --full-examples \ --research "$API_RUN_DIR/research.json" \ --binary "$QUICKSTART_BINARY" ``` `--strict` exits non-zero on any missing command, empty subcommand-words entry, or empty narrative (both sections omitted). With `--full-examples`, it also fails on full examples that cannot dry-run or whose full invocation fails. Side-effectful auth, launch, and mutating apply examples are reported as `UNSUPPORTED` warnings and do not fail strict aggregation. Drop `--strict` to get a warn-only report, omit `--full-examples` only when you intentionally want the old offline path check, or add `--json` for machine-readable output. If any commands are reported missing, fix them in `research.json` before continuing. Common causes: - Resource was renamed during generation (typically the spec uses `users` but the LLM wrote `user` in research.json). - The endpoint exists but is hidden (had a complex body and was dropped from the promoted-command surface; reach it via the typed ` ` form). - The command name is a placeholder (` example`) that should have been replaced with a real path. - The path exists but the example uses a flag/argument shape the command does not accept; fix the concrete example in `research.json` before it renders into README and SKILL prose. `narrative.quickstart` drives the README Quick Start and `narrative.recipes` drives the SKILL.md recipes; getting either wrong silently ships copy-paste-broken examples to users. The `--help`-walk check is the cheapest catch and runs offline against the just-built binary — no live API access needed. After the description rewrite, update the lock heartbeat: ```bash cli-printing-press lock update --cli -pp-cli --phase generate ``` Then: - note skipped complex body fields - fix only blocking generation failures here - do not start broad polish work yet If generation fails: - fix the specific blocker - retry at most 2 times - prefer generator fixes over manual generated-code surgery when the failure is systemic - if retries are exhausted, release the lock and stop: ```bash cli-printing-press lock release --cli -pp-cli ``` ## Phase 3: Build The GOAT When `CODEX_MODE` is true, read [references/codex-delegation.md](references/codex-delegation.md) for the delegation pattern, task type templates, and circuit breaker logic. When `CODEX_MODE` is false, skip this section. Build comprehensively. The absorb manifest from Phase 1.5 IS the feature list. **First Phase 3 build-log line:** Before writing code, count the shipping-scope transcendence rows in the Phase 1.5 absorb manifest and write this as the first line of `$PROOFS_DIR/-fix--pp-cli-build-log.md`: ```text Manifest transcendence rows: planned, 0 built. Phase 3 will not pass until all ship. ``` Use only rows that Phase 3 is expected to build: include approved transcendence rows with concrete `Command` values, exclude rows whose implementation starts with `(stub)`, and keep `spec-emits` rows out of the hand-code count while still tracking whether their approved command path exists. Update the build log's built count as rows are completed. If `PRIOR_SUB60_REPRINT=true`, this line is also the strict-gate budget: partial transcendence coverage is a hold by default. **macOS framework access:** When the plan or manifest specifies macOS framework APIs (ScreenCaptureKit, CoreGraphics, CoreAudio, Vision, Shortcuts, etc.), use the Swift subprocess bridge pattern - Go shells out to `swift -e ''`. Swift is always available with Xcode CLT. Do NOT attempt Python+PyObjC - it requires separate installation and is unreliable across Python distributions. Reference `agent-capture-pp-cli/internal/capture/cgwindow.go` as the canonical example of this pattern. Priority 0 (foundation): - data layer for ALL primary entities from the manifest - sync/search/SQL path - this is what makes transcendence possible After completing Priority 0, update the lock heartbeat: ```bash cli-printing-press lock update --cli -pp-cli --phase build-p0 ``` Priority 1 (absorb - match everything): - ALL absorbed features from the Phase 1.5 manifest - Every feature from every competing tool, matched and beaten with agent-native output - This is NOT "top 3-5" - it is the FULL manifest **Lock heartbeat rule for long priority levels:** If Priority 1 has more than 5 features, update the lock heartbeat after every 3-5 features to prevent the 30-minute staleness threshold from triggering mid-build: ```bash cli-printing-press lock update --cli -pp-cli --phase build-p1-progress ``` Priority 2 (transcend - build what nobody else has): - ALL transcendence features from Phase 1.5 - The NOI commands that only work because everything is in SQLite - These are the commands that make someone say "I need this" **Lock heartbeat rule for Priority 2:** Same rule as Priority 1 — if Priority 2 has more than 3 transcendence features, update the heartbeat after every 2-3 features: ```bash cli-printing-press lock update --cli -pp-cli --phase build-p2-progress ``` After completing Priority 2, update the lock heartbeat: ```bash cli-printing-press lock update --cli -pp-cli --phase build-p2 ``` Priority 3 (polish): - skipped complex request bodies that block important commands - naming cleanup for ugly operationId-derived commands - tests for non-trivial store/workflow logic - enrich terse flag descriptions: review generated command flags. If any description is under 5 words or is generic spec-derived text (e.g., "access key", "The player"), improve it using the research brief. For example, change "access key" to "Steam API key (get one at steamcommunity.com/dev/apikey)". Focus on auth keys, IDs, and filter parameters. ### Agent Build Checklist (per command) After building each command in Priority 1 and Priority 2, verify these 12 principles are met. These map 1:1 to what Phase 4.9's agent readiness reviewer will check - apply them now so the review becomes a confirmation, not a catch-all. 1. **Non-interactive**: No TTY prompts, no `bufio.Scanner(os.Stdin)`, works in CI without a terminal 2. **Structured output**: `--json` produces valid JSON, `--select` filters fields correctly. Hand-written novel commands that build a Go-typed slice/struct and emit JSON should use the generated receiver-style helper, `flags.printJSON(cmd, v)`, or call `printJSONFiltered(cmd.OutOrStdout(), v, flags)` directly. Both route through `printOutputWithFlags`, picking up `--select`, `--compact`, `--csv`, and `--quiet` for free. Verify with ` --json --select | jq 'keys'` returning only the requested fields. 3. **Progressive help**: `--help` shows realistic examples with domain-specific values (not "abc123"). **Use `Example: strings.Trim(\`...\`, "\n")` (preserves leading 2-space indent) NOT `strings.TrimSpace(\`...\`)` (strips it).** TrimSpace makes the first example line unindented; dogfood's example-detection parser is tolerant of this in current versions, but the indented form renders correctly across every Cobra version and is the convention used by every generated command. 4. **Actionable errors**: Error messages name the specific flag/arg that's wrong and the correct usage 5. **Safe retries**: Mutation commands support `--dry-run`, idempotent where possible 6. **Composability**: Exit codes are typed (0/2/3/4/5/7/10 as applicable), output pipes to `jq` cleanly 7. **Bounded responses**: `--compact` returns only high-gravity fields, list commands have `--limit` 8. **Verify-friendly RunE**: Hand-written commands MUST NOT use `Args: cobra.MinimumNArgs(N)` or `MarkFlagRequired(...)`. Cobra evaluates both before RunE runs, so a `--dry-run` guard inside RunE cannot reach if those gates fail. Verify probes commands with `--dry-run` and expects exit 0; commands with hard arg/flag gates fail those probes. Instead: validate inside RunE, fall through to `cmd.Help()` only for unambiguous help-only invocations (no args and no flags), short-circuit on `dryRunOK(flags)` before any IO, and return `usageErr(...)` with exit 2 when required input is missing in real mode. - **Use string for "positional OR flag" commands**: when a command accepts a positional `` OR a flag `--y` as alternatives (e.g., `snapshot ` or `snapshot --domain example.com`), declare `Use: " [x]"` with **square brackets** (optional), not `` (required). Validate "exactly one of x or --y" inside RunE. Required positionals declared with angle brackets break verify-skill recipes that use the flag-only form. - **Declare verifier fixture inputs when generic values are not enough**: if the command needs realistic positional values or required flags to pass the verifier's happy path, add `Annotations: map[string]string{"pp:happy-args": "=example-id;--query=example"}` or assign a whole initialized `cmd.Annotations` map after construction. The verifier consumes semicolon-separated tokens in order: `