# Project Changelog All notable changes to the autoresearch project are documented here. ## v2.1.2 — Product Improvement Engine (2026-05-23) **Theme:** Outward-looking product strategy — "what should we build next?" ### Added - `/autoresearch:improve` — research ICP challenges, discover improvements, generate per-feature PRDs - 5 research categories: ICP challenges, competitor gaps, market trends, UX & experience, revenue & growth - Saturation-based termination (net-new < 2 for 3 consecutive non-reserved iterations) - Tiered ranking: ICP binary gate → Must-have / Nice-to-have / Moonshot → pairwise within Must-have - Conditional auto-discover when product context is zero (OR gate) - WebSearch triangulation with HIGH/MEDIUM/LOW confidence tags - Per-feature PRD generation with evidence chains, DECISION NEEDED markers, open questions - Terminal emitter — outputs PRDs for external tools, not autoresearch re-entry - `CONTEXT.md` domain glossary with output types, loop shapes (+ Notes column), scoring systems, key concepts - Upstream chain integration: `--improve` flag on probe, predict, debug, security ### Changed - Subcommand count: 12 → 13 - SKILL.md updated with improve row - `scripts/transform.sh` updated with improve sed rules for OpenCode + Codex - `docs/system-architecture.md` updated with improve in directory listing - `docs/project-overview-pdr.md` updated with improve in subcommands table - `docs/development-roadmap.md` phases renumbered ### Design Decisions (11 locked via adversarial reasoning) - D1: Opportunistic product context with conditional auto-discover - D2: Structured insight schema with canonical normalization - D3: 5 research categories with coverage guarantee - D5: WebSearch as hypothesis generation with triangulation safeguards - D8: Tiered ranking (not numeric scores) - D10: 2 AskUserQuestion rounds (setup + feature selection) - D11: Single-pass PRD with 5 guardrails ## v2.1.1 — Hook System (2026-05-22) **Theme:** Safety, context injection, and session notifications. ### Added - **9-hook safety system** — fires on every Claude Code session - `scout-block`: blocks node_modules/, .git/, __pycache__/ and other context-wasting directories (PreToolUse) - `privacy-block`: blocks .env, SSH keys, credentials with APPROVED: prefix override (PreToolUse) - `dangerous-cmd-block`: blocks force-push, rm -rf, git reset --hard (PreToolUse, regular git push allowed) - `iteration-context`: injects recent TSV data every 5th prompt after compaction (UserPromptSubmit) - `subagent-context`: gives spawned subagents ~150 tokens of loop awareness (SubagentStart) - `dev-rules-reminder`: re-injects plan path and code standards after compaction (UserPromptSubmit) - `simplify-gate`: warns at 400 LOC, blocks at 800 LOC when shipping verbs detected (UserPromptSubmit) - `session-init`: computes project root, branch, paths; persists session state (SessionStart) - `stop-notify`: terminal notification + optional webhook on session end (SessionEnd) - **hooks.json** — auto-registers all hooks on plugin install - **node-hook-runner.sh** — shell wrapper that silences profile noise for clean JSON output - **lib/ar-hook-utils.cjs** — shared utilities (state management, TSV reading, logging) - **lib/ignore.cjs** — vendored gitignore-spec pattern matcher (15KB, zero deps) - **.ckignore** — baseline blocked patterns (gitignore syntax, customizable per project) - **guide/hooks.md** — complete hook reference guide ### Changed - `plugin.json` version bumped to 2.1.1 - `scripts/transform.sh` now copies hooks to `claude-plugin/hooks/` - `.gitignore` updated with `!.claude/hooks/autoresearch/` exclusion - `docs/system-architecture.md` updated with hook system architecture - `CONTRIBUTING.md` updated with hook development guide ### Design Decisions - SessionEnd event (not Stop) for notifications — Stop fires per-turn, SessionEnd fires once - Force-push only blocking — regular `git push` allowed for `/autoresearch:ship` compatibility - Smart Bash argument parsing — prevents false positives on string literals - Session state via `/tmp/ar-session-{hash}.json` — hooks are subprocesses, can't share env vars - Iteration-based throttling (every 5th) — matches loop cadence, not wall-clock time ## v2.1.0 — 2026-05-22 ### Summary Modular rebuild. Thin SKILL.md routing table replaces the 813-line monolith. Twelve self-contained command files replace the old minimal-registration + 13-reference-file pattern. Net result: ~95% token reduction per invocation (~5–8K tokens vs ~100K). New `/autoresearch:evals` subcommand added. ### Added - `/autoresearch:evals` — one-shot analysis of any `*-results.tsv`: trends, plateaus, regressions, file hotspots, technique effectiveness, recommendations - `--evals` flag on all looping commands — adaptive mid-loop checkpoints + final evals-summary.md - `--evals-interval N` — override checkpoint frequency - `# metric_direction: higher_is_better|lower_is_better` comment on TSV line 1 — enables evals auto-detection - 3 new TSV status values: `keep (reworked)`, `hook-blocked`, `metric-error` (total: 8) - `scripts/transform.sh` — single script generates OpenCode and Codex distributions from `.claude/` source - `handoff.json` written by all subcommands for chain integration; `evals` reads `*-results.tsv` directly ### Changed - `SKILL.md` reduced from 813 lines to 41 lines — routing table only, no protocol - All 12 command files are now self-contained (94–120 lines each) — full protocol embedded, no reference file loading required for standard use - Reference files reduced from 13 to 3: `predict-personas.md`, `reason-judge-protocol.md`, `security-checklist.md` - Subcommand count: 11 → 12 (added evals) ### Removed - `plugins/autoresearch/resources/autoresearch-command-spec.json` — command contracts now live in individual command files - `scripts/sync-opencode.sh` and `scripts/sync-codex.sh` — replaced by `scripts/transform.sh` - `plugins/autoresearch/scripts/autoresearch_cli.py` — Python wrapper CLI no longer needed - 10 per-command workflow reference files (plan, debug, fix, security, ship, scenario, predict, learn, reason, probe workflows) ### Technical Details - Per-invocation token cost: ~5–8K (down from ~100K in v2.0.x) - All platform distributions generated from `.claude/` canonical source - Codex plugin version: `2.1.0-codex.0` - Backward compat: evals reads v2.0.03 TSV files (handles missing `timestamp` column) ## v2.0.0 — 2026-04-28 ### Summary Multi-platform GA release. Claude Code, OpenCode, and Codex all fully supported with strict YAML compliance, security-hardened scripts, and complete command metadata. ### Breaking - Version jump from 1.10.0 to 2.0.0 — reflects multi-platform support, 11 subcommands, and maturity since v1.x ### Fixed - SKILL.md YAML frontmatter uses folded block scalars — strict parsers (Codex CLI, PyYAML) no longer reject the description field (#69, #71) - `scripts/sync-codex.sh` and `scripts/sync-opencode.sh` pass paths via `sys.argv` instead of shell string interpolation — eliminates script corruption when paths contain quotes (#77) - `scripts/install.sh` `sync_dir()` validates destination path depth before `rm -rf` — rejects empty, root, or shallow paths (#78) ### Added - `name` field to `.opencode/agents/docs-manager.md` for explicit agent registration (#74) - `name` field to all 11 `.opencode/commands/*.md` files for schema compliance (#75) - `allowed-tools` declaration to all Claude Code and claude-plugin command files (#76) ### Changed - All version references unified to 2.0.0 across SKILL.md ×4, plugin.json, marketplace.json, README badge ### Contributors - @xiaolai — NLPM audit (#79): 3 bugs + 2 security findings with PRs #74–#78 - @georgelichen — initial YAML frontmatter issue report (#69) - @ch0udry — Codex loading error report (#71) - @rexplx — YAML colon escape fix (#80) - @haosenwang1018 — comprehensive YAML + sync script fix (#81) ## v1.10.0 — 2026-04-16 ### Added - `/autoresearch:probe` — adversarial multi-persona requirement interrogation engine - probe-workflow.md reference (449 lines) — 10-phase protocol - 8 personas: Skeptic, Edge-Case Hunter, Scope Sentinel, Ambiguity Detective, Contradiction Finder, Prior-Art Investigator, Success-Criteria Auditor, Constraint Excavator - Mechanical saturation termination: net-new constraints below threshold for K consecutive rounds - Output bundle: probe-spec.md, constraints.tsv, questions-asked.tsv, contradictions.md, hidden-assumptions.md, autoresearch-config.yml, summary.md, handoff.json ## v1.8.0 — 2026-03-21 ### Added - `/autoresearch:learn` — autonomous codebase documentation engine - 4 modes: init, update, check, summarize - Diff-based targeting for update mode: maps git changes to affected docs - Validation-fix loop with mechanical verification (max 3 retries) - Composite metric: `learn_score = validation%*0.5 + coverage%*0.3 + size_compliance%*0.2` ## v1.7.0 — 2026-03-18 ### Added - `/autoresearch:predict` — multi-persona swarm prediction - 5 default personas: Architecture Reviewer, Security Analyst, Performance Engineer, Reliability Engineer, Devil's Advocate - Adversarial debate mode (Red/Blue teams), anti-herd detection, budget enforcement - Zero external dependencies — file-based knowledge representation ## v1.6.2 — 2026-03-10 ### Added - Expanded EXAMPLES.md, GUIDE.md, and CONTRIBUTING.md ## Previous Versions See `docs/changelog.md` and git history for v1.3.0–v1.6.1 details. See also: [Development Roadmap](development-roadmap.md) | [Project Overview](project-overview-pdr.md)