# Changelog All notable changes to the Claude Night Market plugin ecosystem are documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## [1.9.12] - 2026-06-18 ### Added - **conserve: reversible compression (CCR) for large tool outputs.** The `tool_output_summarizer` PostToolUse hook now archives any single oversized Bash/Read/Grep output (>=25 000 chars by default) to `.claude/context-archive/ccr-.txt` and surfaces a digest plus a retrieval handle. A new `scripts/context_retrieve.py` CLI fetches the original or a slice (`--grep`/`--head`/`--tail`/`--lines`) on demand. The archive is content-addressed (dedup) and survives `/clear`, so a continuation agent can read it by handle instead of re-running the command. Backed by 29 new tests (TDD). - **pensive/rust-review: coercion and conversion audit modules.** New `coercion_params` and `conversion_traits` audit modules with data tables, skill module docs, agent/command wiring, and unit tests. - **pensive/rust-review: expanded audit suite.** Nine additional audit modules: float-equality, transmute, mem-forget, mutable-statics, numeric-cast, repr-packed, match-wildcard, native-modeling, and idiomatic-elision. The skill is split into per-concern modules for surgical invocation. - **conserve: elegant-code skill and review command.** - **scribe: 2025-2026 AI slop detection patterns.** Four new Tier 5 categories derived from nine cross-source references: `performative_honesty` ("honestly,", "An Honest Review"), `sophistication_marker` ("prior art" collocations), `participial_tail` (", highlighting"), and `emphasis_crutch` ("Full stop."). Extended `throat_clearing` openers, new `rlhf_hedge` phrases, and nine Tier 1 vocabulary words. Patterns are scoped to collocations and framing nouns so legitimate uses pass (patent "prior art", dialogue "honestly"). - **scribe: prose semicolon splice detection.** New `semicolon_splice` Tier 5 pattern surfaces every prose semicolon joining two independent clauses for human judgment (confidence: low). ### Fixed - **leyline: Python 3.9 compatibility restored.** `datetime.UTC` (Python 3.11+) in `decision_journal` and `deferred_capture` broke the entire hook import chain under system Python 3.9. Switched to `timezone.utc`; an AST-scanning guard test now prevents re-introduction in CI. - **pensive: file reads memoized in review passes.** `_ReviewContext.get_file_content` now memoizes by filename. A review pass previously re-read each file from disk roughly twenty times across `analyze_*` mixins; one read per filename is now enforced by contract test. - **CI: strict mypy gating across all plugins.** `run-plugin-typecheck` pre-commit hook runs `--all` instead of `--changed`; the neutered global mirrors-mypy hook (which silently disabled 13 error codes) is removed; a new `typecheck.yml` workflow gates every PR. - Fixed `pytest-asyncio` loop scope warnings across plugins. - Resolved `leyline` mypy errors in `abstract` and `conserve`. - Addressed pre-existing test warnings surfaced by full-review pass. ## [1.9.11] - 2026-06-05 ### Added - **tome/ideate skill.** Generates diverse solution candidates using category-spanning ideation methods with rotation, so design work escapes repetitive LLM output. The catalog selects across method categories and rotates picks to avoid mode collapse. - **Grounded-evidence review contract + `citation_verifier.py`.** Review findings now carry a verbatim `Anchor` (the exact source text at the cited `file:line`) alongside the location. A new validator, `plugins/imbue/scripts/citation_verifier.py`, re-reads each citation and fails the review if a path does not resolve, a line is out of range, or the anchor does not match the source, so a finding cannot reference code that was never there. Backed by 20 unit tests. The schema lives in `imbue:structured-output`; the verification gate is `imbue:review-core` Step 5; both are wired into 17 review skills (`math-review`, `bug-review`, `rust-review`, and the rest) and 12 review/audit agents. ### Changed - **tome/triz channel grounded in the canonical method.** The TRIZ channel previously advertised "full matrix" depths while shipping a six-entry hand-authored catalog. The channel now reflects the researched canonical TRIZ method, closing the gap between the advertised and implemented behavior. - **scribe slop workflow normalizes British spellings to American.** The slop workflow now converts British spellings (colour, organise, catalogue) to American by default, with case preservation and opt-out support, skipping code, inline code, and URLs. ### Tests - Added guards for workflow wiring and graceful degradation paths. ## [1.9.10] - 2026-05-29 ### Added - **imbue/assisted-mastery skill.** Addresses the assistance dilemma in agent-assisted coding: makes the agent's reasoning a first-class deliverable (assumptions, alternatives considered, ramifications), surfaces tradeoffs before a design is chosen, and fades assistance so the human builds judgment instead of dependence (explain mode versus produce mode). Grounded in the expertise reversal effect, productive failure, and cognitive offloading research. - **imbue/graduated-implementation skill plus `guard_scope_ramp.py` hook.** The other direction of the assisted-mastery axis: assisted-mastery fades scaffolding, this ramps the ambition of the next increment. Start at the smallest intentional slice (~40 added lines) and widen the rung a notch only when the prior increment's understanding is demonstrated and recorded, low-stakes on an evidence gate, high-stakes on the human explaining the diff unaided (the aviation hand-fly check). The PreToolUse Write/Edit/MultiEdit hook holds each increment to the current rung, halves it for high-stakes paths, and defaults to shadow mode. Research basis in the graduated-implementation skill's `modules/research-basis.md` (Wilson 2019 85% rule; Platanios 2019 competence gate; GDL clean-record gating). The rung is driven by the full `leyline:risk-classification` tier scale (GREEN/YELLOW keep the rung, RED halves it, CRITICAL quarters it) via an `IMBUE_STAKES` signal, not path-regex alone. A `ramp-ledger` module records each notch climbed for later audit (pairing with `leyline:decision-journal`), and `attune`'s project-execution loop gains a ramp gate that demonstrates understanding of each increment before ramping to the next. - **imbue/dependency-verification skill plus `guard_package_hallucination.py` hook.** Confirms a suggested package exists before install, defending against package hallucination (5.2-21.7% of LLM-recommended packages do not exist) and slopsquatting. The PreToolUse Bash hook flags typosquats of popular packages offline and looks up unknown names in the registry, defaulting to shadow mode and never blocking on a network failure. - **imbue/proof-of-work: independent-verification module.** For high-stakes changes, the producing agent may no longer be its own sole verifier; a second independent check (fresh agent, untuned gate, or human) is required, per the four-eyes and nuclear independent-verification principles. - **leyline/risk-classification: automation-tiers module.** Pairs each risk tier with a default automation tier (A3 autonomous to A0 manual) and a pre-licensed downgrade trigger that drops the agent a tier on repeated failure, confidence loss, or a stakes spike (the aviation "children of the magenta" fix), rather than re-prompting at the same level. Research basis recorded in the assisted-mastery skill's `modules/research-basis.md`. - **conserve, memory-palace: MMPO anchor-clarity gates (#550, #551, #552).** Skill descriptions and memory summaries are now scored for anchor clarity and fail fast when an anchor is too vague to act on, applying the Minimal Meaningful Probe Output method from the MMPO paper. - **leyline/decision-journal: project decision journal contract.** A new shared contract creates and maintains `docs/tradeoffs.md` and `docs/lessons-learned.md` as append-only logs with stable IDs and a supersede-don't-delete discipline (MADR plus Y-statement for tradeoffs, PMI register plus blameless postmortem for lessons). `attune:project-init` scaffolds both files, and the decision and lesson workflows across attune, sanctum, spec-kit, and imbue record entries at their natural endpoints with a draft-and-confirm step. Captures the decisions, tradeoffs, and rework that AI-assisted workflows tend to drop. - **memory-palace/memory-clarity-probe skill.** A dual-anchor probe (task progress, information gaps) for verifying session state or a summary before handoff or compression, derived from the MMPO paper. - **pensive/safety-critical-patterns via `/full-review` (#538).** The NASA Power of 10 ruleset is now reachable from the unified review command for financial, medical, and high-reliability code paths. - **sanctum/validate-pr skill (#548).** Generates and self-executes a diff-derived test plan: groups changed files by area, runs targeted verifications, and proves revert-tests are genuine guards rather than dead assertions. - **sanctum/output-hygiene shared module.** A single source of truth for the text every sanctum workflow emits (commit messages, PR comments, thread replies, summaries). Contract A strips character-level slop markers (a `+` used as a conjunction, em-dashes, double-dashes, arrow connectors, and smart quotes) that the word-level checks let through; Contract B requires commit messages to state a change's reader-facing effect rather than the cleanup process behind it. Consumed by `commit-msg`, `commit-messages`, `pr-review` Phase 1.7, and `fix-pr`, each with an inline fallback for partial installs. Covered by `test_output_hygiene.py`. - **scribe: Tier 5 slop patterns wired into the runtime.** The 2026 cross-source slop patterns (spatial copula, negative parallelism, throat-clearing) now run in the detector path, not just the markdown skill. - **herald/double-shot-latte Stop-hook continuation judge.** A cross-platform Stop hook that reads the last assistant message and continues only on explicit intent to keep working, defaulting to stop so finished work is not nagged. It is pure standard library, so it needs no `jq`, `claude` CLI, or `/tmp`. A throttle allows up to 10 auto-continue cycles within a 5-minute window, then pauses and invites the user to resume with a fresh budget; the cap is configurable via `DOUBLE_SHOT_LATTE_MAX_CONTINUATIONS`. An optional LLM second shot (`DOUBLE_SHOT_LATTE_LLM=1`) breaks ties on ambiguous turns and falls back to the deterministic verdict when `claude` is unavailable. ### Changed - **README simplified to an index (435 to 200 lines).** The root README now leads with everyday workflows and links depth out to `book/` instead of inlining a 95-line "What's New" history and a 23-row plugin catalog. - **sanctum/update-readme teaches the index-not-manual shape.** The skill now favors concision with a soft line budget and a concrete, falsifiable Exit Criteria so future README passes default to linking over inlining. - **sanctum: renamed `validate-mr` to `validate-pr` for naming consistency** with `pr-review`, `fix-pr`, and `prepare-pr`. MR references became PR throughout; the `leyline:git-platform` dependency stays so GitLab CLI mapping still works. ### Fixed - **memory-palace/knowledge-intake rejects non-2xx fetches (#547).** A fetch that returns an error page no longer gets stored as if it were the requested resource. - **conserve: pinned frozen benchmark fixture for the log-hygiene test (#540).** The test no longer drifts with the live corpus. ## [1.9.9] - 2026-05-26 ### Added - **`plugins/abstract/scripts/finding_verifier.py`**: shared verifier that checks whether an auto-promoted finding's referenced locations still exist at HEAD. Parses a finding's `Where:` string or a `plugin:skill` id into file references, then reports `present`, `missing`, `stale_lines`, or `no_refs`. Scope boundary: it verifies location existence, not concern resolution, so a missing file is strong staleness evidence while a present file proves nothing. Covered by `test_finding_verifier.py`. - **abstract/auto_promote_learnings: verify-before-promote gate.** `run_auto_promote` now skips a finding whose every referenced location has been removed, recording it as `stale-skipped` instead of minting a phantom issue. Motivated by a 2026-05 sweep that found 8 of 11 auto-promoted issues already resolved by later work. Covered by the promotion-gate tests in `test_auto_promote_learnings.py`. - **sanctum/do-issue: Step 1.4 verify-against-HEAD triage.** The interactive workflow now classifies each auto-promoted finding as outstanding, already resolved, or uncertain before implementing, backed by the same `finding_verifier.py` module so the gate and the command agree on what "stale" means. ### Changed - **CHANGELOG/skill-graph-audit: PR #470 finding traceability.** Label the AR-30 (git_platform wrapper) and AR-15 (bootstrap helper) entries in the 1.9.4 section for grep-traceability (#502), surface the C-series IDs (C-13, C-20, C-21, C-46) in the aggregate refinement entry (#503), and replace 10 ASCII `--` prose connectors in `skill-graph-audit/SKILL.md` with commas and colons per the project slop rules (#507). ### Fixed - **sanctum/create-tag: release tags now trigger the pipeline.** `create-tag` pushed bare version tags (for example `1.9.9`), which never matched the cross-framework-publish workflow's `v*` trigger, so no release run started. Tags are normalized to a `v` prefix, then the command verifies the run appeared via `gh run list --branch "$TAG"` and surfaces its URL. - **`scripts/clawhub_export.py`: emit valid YAML scalars.** Unquoted scalars containing `": "` or a leading indicator character re-parsed to an empty mapping and failed `clawhub-export --validate`, breaking the v1.9.4 through v1.9.7 release runs. New `_yaml_scalar` and `_needs_yaml_quoting` helpers quote scalars correctly, with tests covering the edge cases. ## [1.9.8] - 2026-05-21 ### Added - **`scripts/fix_coverage_threshold.py`**: one-shot migration script that removes `--cov-fail-under` from `addopts` and `fail_under` from `[tool.coverage.report]`, then writes the threshold into a `[tool.nightmarket]` section so `run-plugin-tests.sh` can enforce it explicitly for full-suite runs only. Supports `--dry-run` and processes all 20 plugin `pyproject.toml` files. - **pensive/unified-review: sub-agent isolation guidance.** Step 3 now requires all review agents to be dispatched in a single parallel call and all results collected before synthesis begins. Prevents first-result anchoring, where the first returned agent report skews evaluation of the remaining results. - **sanctum/pr-review: understanding-probe anti-pattern and quality gate.** New anti-pattern "Merge Code You Cannot Explain" documents the risk of approving AI-generated changes the author cannot trace through. The quality gate now requires the author to explain how each changed section works and how it could fail, not just that tests pass. - **`docs/skill-description-guide.md`**: authoring guide for SKILL.md `description:` fields. Covers hard constraints (160 char max, third person), the format template, four skill categories with patterns and examples, a before/after table, common mistakes, and how to run the budget validator (`plugins/abstract/scripts/validate_budget.py`). - **hookify: `destructive-command-guard` security rule (#488).** New rule in `plugins/hookify/skills/rule-catalog/rules/security/` warns when a destructive command (`rm -rf`, `git push --force`, `kubectl delete`, `terraform destroy`, `DROP TABLE`, `DELETE FROM`) is paired with a production-shaped path or environment variable (`prod`, `production`, `live`, `PROD_*`). Motivated by the Replit/SaaStr incident (July 2025) where an agent reached a self-destruct button without friction. Rule defaults to `warn` (not `block`) so legitimate ops can proceed; disable temporarily via the local override file. Covered by `test_destructive_command_guard_rule.py`. ### Changed - **All 188 SKILL.md `description:` fields rewritten to action-oriented format.** Every skill now starts with an action-oriented verb phrase and includes a trigger condition (Use when/before/after/for). Anthropic recommends this format for accurate Claude-driven skill activation. Previously 173/188 descriptions were noun phrases that told Claude what a skill does but gave no signal for when to invoke it. The `description:` field is the only signal in the system-reminder Claude uses for skill selection; trigger phrasing makes activation accurate. Raw description budget grew from ~16k to ~35k chars, a deliberate trade-off within the 1M context window's ~80k available for descriptions. See `docs/skill-description-guide.md` and ADR-0004. - **pensive: `shell-review` skill hardened with POSIX structure patterns.** New module `structure-patterns.md` in `plugins/pensive/skills/shell-review/modules/` covers the library/executable distinction, the required preamble, `main "${@}"` as the last line, top-down execution prohibition, `depcheck()`, `usage()`, xtrace support, `readonly` globals, platform detection via `uname -s`, `case` over `[ ]`, and `shfmt -p -i 2 -ci` formatting. `safety-patterns.md` rewritten to match CLAUDE.md shell rules: no `echo`, braced variable references, `:?` expansion, `cd` in subshells, `${0%/*}` for script-relative paths, no `basename`/`dirname`, library-guard `case` form, and no `set -e`/`set -u` in libraries. Exit criteria now uses the required checkbox format per `.claude/rules/skill-exit-criteria.md`. - **pensive: six review modules refactored into mixin packages (#486).** `math_review`, `performance_review`, `bug_review`, `api_review`, `test_review`, and `makefile_review` each moved from a single `.py` file into a package directory with focused submodules (`_analysis.py`, `_constants.py`, `_helpers.py`, `_quality.py`, `_reporting.py`, etc.). Public API unchanged; callers import from the same package name. Each refactor is guarded by a `test_*_package_structure.py` test that verifies importability and re-exports. ### Fixed - **Coverage threshold enforcement decoupled from pytest `addopts` across all 20 plugins.** Previously `--cov-fail-under=N` in `addopts` and `fail_under` in `[tool.coverage.report]` caused subset test runs (content-assertion tests that import no Python source) to fail with low total coverage. Both keys are now removed from every plugin's `pyproject.toml`. The threshold is stored in a `[tool.nightmarket]` section (unknown to coverage.py) and read by `run-plugin-tests.sh` via awk; the `--cov-fail-under` flag is passed explicitly only for full-suite runs. - **abstract/skills: compress 37 skill descriptions to ≤100 chars and fix stale `shared-patterns` module refs.** Skill frontmatter `description` fields exceeding 100 characters were trimmed across `abstract`, `pensive`, `minister`, and other plugins to stay within the token budget enforced by the system prompt. Stale `@include shared-patterns.md` references in `abstract` skills were replaced with the correct module path. ### Tests - `test(scripts)`: 18 BDD unit tests for `fix_coverage_threshold.py` covering threshold extraction, addopts cleanup, report-section removal, `[tool.nightmarket]` insertion, and end-to-end migration in dry-run and write modes. - `test(scripts)`: 7 unit tests for `run-plugin-tests.sh` awk threshold extraction covering the 90%/85% standard cases, absent section, multiple tool sections, first-match semantics, and both `pensive` and `gauntlet` real `pyproject.toml` files. - `test(scribe)`: removed stale `version:` frontmatter assertions from `test_voice_extract.py` and `test_voice_generate.py`; fixed `_check_line_lengths` in `test_tech_tutorial.py` to skip YAML frontmatter blocks so long `description:` values no longer trigger false 80-char violations. - `test(pensive)`: cover `analyze()` edge cases in `bug_review` and `performance_review` for empty input, single-file input, and paths that raise `OSError` during analysis. ## [1.9.7] - 2026-05-18 ### Added - **conserve: `log-debugging-hygiene` module + `/filter-log` command.** New progressive-loading module under `compression-strategy` documents a three-tier workflow for handling pasted log output. Tier 1 (filter at source) beats tier 3 (compression) by 25 percentage points on the committed `intake_queue.jsonl` fixture; the asymmetry is guarded by `test_log_debugging_hygiene.py::test_filter_first_claim_is_reproducible`. The companion `/conserve:filter-log` command routes users to tier-1 filters (`tail`, `head`, `rg`, `jq`, `awk`) without auto-invoking compression. Honest framing of LLM accuracy claims cites LLMLingua, LogFiT, LogLLM, and CompressionAttack rather than asserting the unsupported claim that LLMs work better on compressed logs. - **conserve: no-bundled-compressor invariant test.** New `test_conserve_does_not_bundle_runtime_compressor` asserts that `plugins/conserve/pyproject.toml` `[project].dependencies` contains no log compressor (llmlingua, drain3, lognplus, logparser, loghub). Bundling one would contradict the module doctrine and the `shared-utility-consumer-rule`; the failure surfaces three deliberate options (preserve, layer, revise). - **conserve: hypothesis-based tier-1 subset property tests.** New `test_log_filter_properties.py` fuzzes log content shapes and verifies that `tail`/`head` outputs remain literal suffixes/prefixes of the input and never exceed the requested line budget. The subset property is what distinguishes tier 1 from tier 3; a "smart filter" that paraphrased would break these tests. Hypothesis added to conserve dev-deps only; runtime deps stay empty, preserving the no-bundle invariant the test guards. ## [1.9.6] - 2026-05-10 ### Added - **`minister:dora-metrics` skill (#487).** Computes the four DORA delivery-performance metrics (deployment frequency, lead time for changes, change failure rate, time to restore service, median) from GitHub PR and deployment data, and classifies the result into the Elite / High / Medium / Low tier from the Accelerate research. Intended for quarterly delivery retros. See `plugins/minister/skills/dora-metrics/` and the entry in `book/src/reference/capabilities-reference.md`. - **In-loop variation provider for gauntlet (#464).** When gauntlet runs inside Claude Code, the new `gauntlet.providers.in_loop` module replaces the default Anthropic-SDK variation provider with a deterministic local one that perturbs integers and small numeric tokens in the seed prompt. This avoids spawning a sibling Claude through the API just to vary a challenge wording. The default provider is unchanged outside Claude Code. The challenge skill registers the in-loop provider via `register_in_loop_provider_if_inside_claude_code()`. See `plugins/gauntlet/src/gauntlet/providers/in_loop.py` and the "In-Loop Provider Setup" section of `plugins/gauntlet/skills/challenge/SKILL.md`. ### Changed - **`memory-palace` deduplication index validates `importance_score` bounds.** `update_index()` in `plugins/memory-palace/hooks/shared/deduplication.py` now raises `ValueError` when `importance_score` falls outside the documented closed range `[0, 100]`. Validation runs before any cache or on-disk mutation, so a bad score cannot poison the index. The `knowledge-intake` SKILL.md now states the `[0, 100]` bound explicitly alongside the field. ### Fixed - **`fix(tome)`: register `tome:research` as dispatchable agent (#465).** The research agent was discoverable as a skill but missing from the agent registry, so other plugins could not invoke it through the Agent tool. Adding it to the registry restores the documented dispatch path. - **`fix(scripts)`: remove false-positive `deprecated-pre-decision` rule.** The hook-modernization scanner was flagging a pattern that the deprecation policy no longer treats as deprecated. The rule and its tests were removed; remaining rules are unchanged. ### Tests - `test(minister)`: cover `project_tracker` CLI surface. - `test(tome)`: backfill coverage for channels and output modules. - `test(sanctum)`: regression test for stale module references (#468). - `test(imbue)`: scope `os.open` patch to hook module to avoid masking init calls (#496). ### Documentation - `docs(abstract)`: refresh effort levels and add 2.1.97-2.1.138 compat notes. - `docs(pensive)`: add Verification and Testing sections to `performance-review` (#469). - `docs(archetypes)`: restore paradigm-component vocabulary in 13 SKILL.md files (#463). ## [1.9.5] - 2026-05-07 ### Added - **`/fix-pr` Step 5: HTTP, browser, and Playwright verification strategies.** The agent-verify manual test plan flow (Section 5.4) previously documented only shell, validator, and grep strategies. Reviewer test plan items like "verify the dashboard renders X" or "POST /api returns 200" had no documented strategy and fell back to LOW-confidence prose. Section 5.4.2 now covers HTTP probes (`curl -sf` + `jq -e`), DOM checks via the `mcp__plugin_superpowers-chrome_chrome__use_browser` MCP tool, multi-step Playwright specs, visual regression, and `phantom:computer-control` as last resort. New Section 5.4.5 ranks the three browser tiers (CDP MCP, Playwright, Computer Use) by cost and pins Tier 1 as the default to avoid spawning a Playwright instance per DOM assertion. - **Invariant-encoding regression tests for `/fix-pr` Section 5.4.** New `test_fix_pr_validate_strategies.py` guards three load-bearing decisions: 5.4.x sub-section ordering (regression guard for a prior reorder bug), presence of all five verification surfaces (shell, HTTP, CDP MCP, Playwright, computer-control), and the browser tier ordering. Mutation-tested: breaking the MCP tool name fails the two relevant tests. ### Changed - **Removed `version:` field from skill frontmatter** (170 SKILL.md files + dispatch tooling). The field was a duplicate of the per-plugin version in `plugin.json` / `openpackage.yml`, kept in sync by the release script per issue #484. Issue #484 explicitly listed removal as one of two acceptable resolutions; the auto-bump path was chosen at the time, but the field had no runtime consumers. Removing it eliminates 170 file edits per release. The plugin version remains the single source of truth. - Removed: `update_skill_md_version()` and the `**/SKILL.md` entry from `update_versions.py`. - Removed: `version` from `skill_validator.py` required fields and the `_validate_version` semver check. - Removed: tests asserting `version` presence in skill frontmatter (5 plugins, ~10 tests). - Unrelated: per-skill *adaptation* versioning (`adaptation.current_version` / `version_history` consumed by `abstract.skill_versioning`) is unchanged; that's a different feature. ### Fixed - **PR #511 review batch (23 threads resolved)**: - Python 3.9 compatibility: replaced `from datetime import UTC` (3.11+ alias) with `from datetime import timezone` plus `timezone.utc` across 11 files in leyline, minister, and sanctum tests. Most consequential fix was `plugins/leyline/src/leyline/deferred_capture.py`, which is library code documented as 3.9-compatible. - Removed `auto_improvement_gate` frozenset from `auto_promote_learnings.py` and its three supporting tests. The gate was suppressing recurring `abstract:skill-auditor` failures (issue #461) instead of fixing them; with the gate removed, normal auto-promote will surface the failures so root cause can be addressed. - Merged `docs/observability-hook-audit.md` into a new module `plugins/abstract/skills/hook-authoring/modules/ observability-warnings.md` (registered in the hook-authoring SKILL.md module references) and deleted the source. The pattern, three-category classification (binary-actionable, observe-only, needs-triage), counter-examples, and authoring checklist are preserved. - Replied to and resolved 19 "Why is this being removed?" threads. All flagged the same change: removing `#!/usr/bin/env python3` shebangs from non-executable Python modules and test files per PEP 394 and ruff's EXE001 rule. Shebangs belong only on entry-point scripts. - **PR #417 review batch (24 issues closed)**: bugfix-only release rolling up the imbue hook hardening, imbue test reliability, docs slop fixes, cross-plugin refactors, abstract plugin work, and policy / research / features threads tracked under issues #420--#462. Resolves #420, #421, #422, #423, #438, #439, #440, #441, #442, #443, #444, #450, #451, #452, #453, #454, #455, #456, #457, #458, #459, #460, #461, and #462. Test suites pass: imbue 635, leyline 656 (89.58% coverage), abstract 2186 (3 xfailed). See commit `237e4819` for the file map. ## [1.9.4] - 2026-05-03 ### Session summaries (PR #470) - **PR #470 deferred items closed on this branch**: the eight S-series suggestions (#484) and both C-series architectural follow-ups (#485) are now resolved on `ai-slop-1.9.4` instead of being deferred to 1.10.x. See "PR #470 deferred-items closure" under the Internal heading below for the per-finding map. - **PR #470 fix-pr session (50 review threads closed)**: 37 placeholder modules implemented across 6 skills (~6,931 new lines), 12 ephemeral docs and reports consolidated into the CHANGELOG and removed, deferred follow-ups tracked in issue #486. See "PR #470 fix-pr session" under the Internal heading below. ### Added - **AI slop playbook (2025-26) integrated into scribe**: `slop-detector` gains seven new modules covering identity-and-voice leaks, hallucination detection, stub-and-deferral handling, document economy, empirical baseline calibration, evidence-backed claim auditing, and an explicit anti-goals list to prevent over-correction. A `cleanup-workflow` module walks editors through Layer 0 to Layer 3 remediation. `doc-generator`, `style-learner`, `tech-tutorial`, `voice-generate`, and `voice-review` were updated to consume the new modules. Backed by 718 new test cases across `test_slop_detector.py`, `test_slop_detector_new_modules.py`, `test_doc_generator.py`, `test_style_learner.py`, and `test_tech_tutorial.py`. - **AI slop playbook extended to other plugins**: pensive `rust-review` adds four modules (`async-slop`, `iterator-and-allocation-slop`, `model-specific-tells`, `test-slop`); conserve `ai-hygiene-auditor` agent integrates the playbook; scribe slop-detector adds `structured-finding-output` for cross-tool consistency. New top-level `CONSTITUTION.md` codifies AI hygiene guardrails. - **leyline `git_platform` Python wrapper (AR-30)** (`plugins/leyline/src/leyline/git_platform.py`): cross-platform helpers for `gh api` and `gh api graphql` calls. Adopted by `post_learnings_to_discussions.py`, `promote_discussion_to_issue.py`, and `verify_plugin.py`. 7 unit tests in `test_git_platform.py` (128-line file). - **leyline `bootstrap` cross-plugin sys.path helper (AR-15)** (`plugins/leyline/src/leyline/bootstrap.py`): eliminates ad-hoc `sys.path` manipulation in plugin scripts. First adopter: `plugins/imbue/scripts/imbue_validator.py`. 4 unit tests in `test_bootstrap.py` (109-line file). - **abstract `hook-scope-guide` skill**: decision framework for choosing hook scope (plugin / project / global). Promoted from a single-file skill into a proper directory structure with frontmatter and registration in `plugin.json`. Indexed in `book/src/reference/capabilities-reference.md`. - **slop-scan-for-docs Layer 0**: `.claude/rules/slop-scan-for-docs.md` extended with three categorical critical patterns (identity leaks, hallucinations, bare stubs in production paths) that fail a doc outright before any structural or sentence-level checks run. - **imbue `karpathy-principles` skill + `/karpathy-check` command**: four-principle pre-implementation gate (think first, simplicity, surgical edits, verify) shipped as a 246-line SKILL.md plus `references/` and `modules/`, bound to a 79-line `/karpathy-check` command. Backed by 11 unit tests in `test_karpathy_principles.py`. Source design brief retained at `docs/karpathy-derivation/project-brief.md`. See commit `bdf77595`. - **attune `mission-orchestrator` skill**: lifecycle router across the attune brainstorm, specify, blueprint, and execute phases, with adaptive constraints in `modules/adaptive-constraints.md`. 29 unit tests in `test_mission_orchestrator.py`. See commit `bdf77595`. ### Changed - **Quality refinement waves 1, 2, and 3 complete**: 24 review findings closed across abstract, conserve, imbue, leyline, memory-palace, pensive, sanctum, and scribe, including the C-series architectural follow-ups C-13 (conserve CLI `main` decomposition), C-20 (memory-palace CLI typed-error narrowing), C-21 (vow hook exception surfacing), and C-46 (`memory_palace_cli` `build_parser` decomposition). Deferred follow-ups tracked in issue #486 (per-finding detail recoverable from git history at commit `a50a9352`). - **Shared-helper extraction (D-series)**: `json_utils.sh` vendored per-plugin (D-01); `parse_frontmatter` unified on the leyline canonical implementation (D-02); `extract_section` shared across abstract scripts (D-03); `dir_utils` facade shared by abstract hooks (D-04); duplicated hook helpers consolidated (D-05 / D-06 / D-12); `get_plugin_version` shared between root scripts (D-11); CLI JSON envelope helpers shared via leyline (D-13). - **God-function decompositions (F06 / F07 / F10 / F16)**: large multi-responsibility functions split into focused helpers across multiple plugins. - **Architecture-review split (AR-01)**: pensive `architecture_review` now lives in a mixin package rather than a single 1000+ line module. - **memory-palace `project_palace` split into a package** (AR-05); sanctum `validators.py` split into per-class package (AR-06); scribe `slop-detector` modules consolidated (P-14); sanctum `_extract_skill_refs` promoted to `_shared` module (AR-F3). - **Performance wins (Tier 3 A-series)**: pensive `rust_review` regex patterns pre-compiled at module load (A-01); pensive language detection now single-walks (A-02 / A-03); conserve `safe_replacer` walks the `SKILL.md` tree once (A-04); additional algorithmic wins applied (A-07, A-11, A-12). ### Fixed - **sanctum: path-prefixed frontmatter `modules:` entries**: registration audit no longer false-flags entries written as `./modules/foo.md` or `../skills/.../modules/foo.md`. - **sanctum `scan_disk_files` audit bug** (B-08, B-10): vacuous tests deleted, scan logic corrected. - **pensive code-refinement workflow**: Step 6 and a completion gate added to `fix-workflow` (closes a silent-success edge case). - **eight quality-review findings (F1-F5, AR-F1 / F2)**: abstract `utils.py`, conserve `safe_replacer.py`, imbue `session-start.sh`, leyline `cli_envelope.py` and `frontmatter.py`, pensive `architecture_review/__init__.py`, sanctum `test_generator.py` and `update_plugin_registrations.py`, memory-palace README: all corrected with regression coverage. - **slop-scan-for-docs Layer 0 P0 hits**: corrected identity-leak / hallucination patterns in `book/src/getting-started/first-plugin.md`, `docs/guides/README.md`, and sanctum `tutorial-updates` module. - **PR #446 review threads resolved (42 threads)**: obsolete docs removed (`audit-2026-04-25-comprehensive-skill-audit.md`, `karpathy-derivation/{implementation-plan,specification}.md` superseded by the karpathy-principles skill itself, `quality-gate-orchestration.md`, `skill-taxonomy.md`); replaced by `quality-gates.md` and `skill-integration-guide.md`. The `karpathy-derivation/project-brief.md` source brief was kept as the design record. Skill role taxonomy findings applied across abstract, archetypes, and other plugin SKILL.md files. ### Internal - **sanctum `scan_disk_files` decomposition**: monolithic disk-scan logic split into four typed helpers (`_scan_commands_dir`, `_scan_skills_dir`, `_scan_agents_dir`, `_scan_hooks_dir`). - **sanctum pytest coverage scope**: `plugins/sanctum/pyproject.toml` adds `scripts` to both `--cov` addopts and `coverage.run.source` so code under `plugins/sanctum/scripts/` is instrumented during the test run. Previously only `src/sanctum` and hooks were measured. - **Unbloat phases 1-6**: archived stale tracked artifacts (F14, F17); deleted five orphaned plugin scripts; deleted residual `scripts/test-lsp-manually.sh`; cleared 91 deferred docstring-restate hits; structural cleanup (Phase 3) and refactor passes (Phase 5 R1-R7, Phase 6). - **Hygiene fixes (Tier 3)**: F-01, F-04, F-06 through F-13, F-17 applied across the codebase. - **Em-dash polish and "actionable" purge** (slop Phase 4): prose cleanup pass across documentation. - **Truthful docstrings for stub safety / dependency methods** (B-03, B-15): docstrings now match the stub return contract. - **conserve `substantive-commits` filter**: documented as the canonical churn-detection predicate (excludes whitespace, format, and generated-file commits). - **memory-palace**: indexed 23 web-fetch captures. - **Refinement records consolidated**: dimension reports (synthesis, architecture, clean code, algorithm efficiency, additive bias, plugin patterns) merged into this CHANGELOG entry and the deferred-items tracker (issue #486). Per-finding detail recoverable from git history at commit `a50a9352`. - **PR #470 deferred-items closure** (#484, #485): closed on this branch instead of carrying to 1.10.x. - **S2**: `gh_graphql` widens ``variables`` to ``dict[str, Any]``; ``int`` / ``float`` / ``bool`` / ``None`` are routed via ``gh -F`` (typed-field) so GraphQL ``Int!`` / ``Boolean!`` / nullable variables match their schema types. - **S3**: `bootstrap.add_plugin_src_to_path` swaps CPython-private ``sys._getframe(1)`` for ``inspect.stack()[1].filename`` and falls back to ``Path.cwd()`` when the caller frame is synthetic. - **S6**: the residual "future GitLab/Bitbucket" sentence is removed from the `git_platform` docstring's user story in the test file. - **S8**: `PROJECT_PALACE_ROOMS` / `REVIEW_CHAMBER_ROOMS` are typed via ``dict[RoomType, RoomMetadata]`` / ``dict[ReviewSubroom, RoomMetadata]`` with a frozen ``RoomMetadata`` dataclass; the manager preserves string-keyed palace JSON via ``RoomType.value``. - **S9**: `cli_envelope` exports ``SuccessEnvelope`` / ``ErrorEnvelope`` ``TypedDict``s plus a discriminated ``Envelope`` union; ``success_envelope`` / ``error_envelope`` return the typed shapes. - **S10**: ``GhCommandError`` now carries ``cmd: list[str]``, ``returncode: int | None``, and ``stderr: str | None`` for programmatic error handling; ``str(exc)`` stays human-readable. - **S11**: `test_dir_utils` adds three negative-path scenarios: ``CLAUDE_HOME`` unset (falls back to ``HOME``), ``not-a-dir`` collision (raises ``FileExistsError``), unwritable parent (raises ``PermissionError``, skipped under root). - **gh_api `=` field key guard**: ``gh_api`` raises ``ValueError`` if a field key contains ``=`` so the argv pair cannot be split to smuggle extra fields. - **C7**: `ReviewEntry` is now ``@dataclass(frozen=True, slots=True)``; ``from_dict`` is a regular constructor call (no post-hoc attribute mutation), and read-tracking flows through a new ``with_access()`` method that returns a new instance. - **C9**: `TestQualityMixin` and `TestRecommendationMixin` are demoted to module-level functions in ``_quality`` / ``_recommendations``; ``TestingGuideSkill`` becomes a thin facade that delegates to them. Behavioural surface unchanged for downstream consumers. - **SKILL.md `version:` frontmatter drift**: 170 SKILL.md files were stuck at ``version: 1.9.3`` while ``plugin.json`` was at 1.9.4. ``update_versions`` learns to bump the frontmatter ``version:`` (only inside the leading YAML block; body code-block examples are preserved). All in-tree SKILL.md frontmatters synced to 1.9.4. - **PR #470 fix-pr session** (50 review threads closed): - **Cluster A: 37 placeholder modules implemented** (~6,931 lines) across six skills: - ``leyline:progressive-loading`` (20 modules): advanced-patterns, api-patterns, api-review, cargo-patterns, document-analysis-patterns, git-catchup-patterns, git-patterns, large-reference, legacy-python, linux-patterns, log-analysis-patterns, macos-patterns, modern-python, performance, python-packaging, python-patterns, python-testing, rust-review, troubleshooting, windows-patterns - ``abstract:skill-authoring`` (7 modules): advanced-patterns, authentication, error-handling, examples, testing-with-subagents, troubleshooting, validation - ``abstract:shared-patterns`` (4 modules): advanced, creation, editing, troubleshooting - ``leyline:evaluation-framework`` (3 modules): evaluation-rubric, multi-metric-evaluation-methodology, quality-metrics - ``imbue:feature-review`` (1 module): multi-metric-evaluation-methodology - ``leyline:storage-templates`` (1 module): storage-patterns - ``abstract:skills-eval`` (1 module): skill-authoring-best-practices - **Cluster B + C: 12 ephemeral docs and reports consolidated into this CHANGELOG entry and removed**: ``docs/refinement/2026-05-02/`` (7 dimension reports) and ``reports/`` (3 cleanup scans plus 2 unbloat session logs). Per-finding detail recoverable from git history at commit ``a50a9352``. - **Cluster D: CHANGELOG entries for the fix-pr session** captured under this heading. - All 37 implemented modules passed independent layer-0 + layer-2 slop scan: zero em dashes outside code spans, zero tier-1 banned words, zero banned phrases, zero identity leaks, zero hallucinated file/function references (every cited path verified via ``ls`` / ``head`` before being written into the module). - Plugin tests post-implementation: leyline 655 passing, imbue 624 passing, abstract skill-structure 61 passing. ## [1.9.3] - 2026-04-26 ### Added - **gauntlet ImportError observability**: opt-in stderr diagnostic in `precommit_gate.generate_challenge_for_files` gated on `GAUNTLET_DEBUG=1`. The function still returns `None` silently by default (unchanged graceful-degradation contract), but a developer hitting an unexpected import failure can now see the cause without instrumenting the hook. Two new tests in `test_precommit.py` guard both paths (silent without env, diagnostic with env). - **release-consistency tests** (`plugins/sanctum/tests/test_release_consistency.py`): filesystem-walking parity tests that catch real ecosystem drift instead of asserting hardcoded fixtures. Four invariants: all `plugin.json` versions agree; each `pyproject.toml` matches its sibling `plugin.json`; every Makefile `find commands/ ... *.md` invocation uses `-maxdepth 1`; top-level `commands/*.md` count equals `plugin.json.commands.length` for every plugin. RED-proof: reverting `-maxdepth 1` in any Makefile fails the static check immediately. - **inclusive-defaults policy**: `docs/inclusive-defaults.md` codifies the project rule that features default ON and expose `--no-X` opt-out, with nine TRUE-exception categories (heavy install, irreversible state, no-default value, OS-level state, user-project overrides, per-rule consent, full-codebase scans, block-mode enforcement, prescriptive modes). Tracks the audit at #445. - **hookify safe-defaults bundle**: new `--bundle ` install mode with built-in `safe-defaults` bundle shipping six warn-only rules (slop scan, large commits, risky git, large file ops, print statements, plan-before- large-dispatch). Tests guard the bundle composition so any future block-action rule added to safe-defaults will fail CI. Run with `python3 plugins/hookify/scripts/install_rule.py --bundle safe-defaults`. - **pensive performance-review skill + command**: new `/performance-review` slash command and matching skill detect time- and space-complexity hotspots via Python AST. Six time-complexity detectors (T1 nested loops over the same iterable; T2 list `in` lookup in a loop; T3 `re.compile()` in a loop; T4 string `+=` accumulator; T5 unmemoized recursion; T6 list comprehension passed to a reducer) and three space-complexity detectors (S1 unbounded `.append()` in nested loops; S2 list-wrapping a generator inside a reducer; S3 per-iteration `.copy()` / `dict()` / `list()`). Tier 2 (`gauntlet.treesitter_parser`) and Tier 3 (`gauntlet.graph.GraphStore`) extend coverage to non-Python languages and transitive call-chain severity upgrades when gauntlet is installed; sentinel-pattern fallback keeps Tier 1 working when gauntlet is missing. Implementation lives at `plugins/pensive/src/pensive/skills/performance_review.py` with 15 BDD-style tests at `plugins/pensive/tests/skills/test_performance_review.py`. `make demo-performance-review` dogfoods on pensive's own source (98 real findings on first run). Full pensive suite: 368 passed. - **gauntlet Makefile demo + test targets**: 12 new per-command targets (`demo-gauntlet`, `demo-gauntlet-extract`, `demo-gauntlet-curate`, `demo-gauntlet-graph`, `demo-gauntlet-progress`, `demo-gauntlet-onboard`, plus `demo-all`, and matching `test-*` targets). Demos are LIVE (e.g. `demo-gauntlet-extract` runs `scripts/extractor.py src/gauntlet` and prints real knowledge entries dogfooded from gauntlet's own source). Per-command test targets use `--no-cov` since they exercise a slice rather than the full plugin surface; use `make test` for the 85% coverage gate. `makefile_dogfooder.py` coverage: 0% → 85%. ### Changed - **spec-kit speckit-tasks now generates test tasks by default**. Previously the wording said "Tests are OPTIONAL: only generate test tasks if explicitly requested", defaulting away from the project's iron-law TDD posture. New default is ON; opt-out via `--no-tdd` or explicit spec opt-out for spikes/throwaways. - **sanctum update-docs slop-scan default-ON wording**: the scan was already non-blocking by default; the new wording says "default ON" not "non-blocking", since the latter implied opt-in. - **per-plugin TRUE-exception callouts**: hookify, oracle, egregore, conjure, phantom, imbue, leyline, and pensive READMEs (or relevant skill modules) now reference `docs/inclusive-defaults.md` and explain why each feature stays opt-in. - **pensive performance-review detector hardening**: iteratively suppressed false-positive classes that surfaced during dogfooding. T2 (`x in ` membership): string-literal LHS, dict/set/string RHS Names tracked module-wide, string- returning methods (`.lower`, `.upper`, `.strip`, `.read`, etc.), `str`/`bytes`-annotated function params, `.split()`/`.splitlines()` iter vars, comprehension generators, Name-to-Name assignment propagation. S1 (`.append()` in nested loops): suppresses when the target Name is initialized to an empty list in the same function (Assign + List, AnnAssign + List, `list()` call). Net result on pensive's own source: 98 to 10 total findings (-90 percent), HIGH 64 to 2 (-97 percent), MEDIUM 32 to 6 (-81 percent), zero regressions in 376/376 pensive tests. Iron Law followed for every classifier extension: failing test first, minimum implementation, all-green verification (23 BDD-style tests covering each detector and each suppression class). - **pensive `common_phony` list to frozenset**: `plugins/pensive/src/pensive/skills/makefile_review.py` was a 7-element list scanned inside the per-target loop. Converted to `frozenset` for O(1) membership. - **pensive `unified_review.detect_languages` repo-file scan**: `plugins/pensive/src/pensive/skills/unified_review.py` built a `set(files)` once outside the per-language loop so the per-config-file membership probe is O(1) regardless of repository size. - **pensive `makefile_review` regex hoist**: `plugins/pensive/src/pensive/skills/makefile_review.py` had `re.compile(...)` inside the per-makefile loop. Hoisted above the loop so the static pattern compiles once. - **pensive `repository_analyzer` materialization removal**: `plugins/pensive/src/pensive/analysis/repository_analyzer.py` used `len(list(repo_path.rglob(...)))` in two places. Replaced with `sum(1 for _ in repo_path.rglob(...))` so the iterator is consumed without an intermediate list allocation. ### Fixed - **sanctum brainstorm SessionStart hook permission**: `plugins/sanctum/hooks/brainstorm_session_warn.py` was committed with mode 100644 and registered as a direct `"type": "command"` hook in `hooks.json`. Without the exec bit the kernel returned `EACCES` on every SessionStart. Restored mode to 100755 via `git update-index --chmod=+x` to match the six sibling hooks in the same file. - **gauntlet precommit_gate Bash PreToolUse traceback**: the hook imported `gauntlet.challenges` at module scope, which transitively imports `anthropic`. Claude Code invokes the hook via system `python3` (no `anthropic`), so every `Bash` PreToolUse raised `ModuleNotFoundError` before the hook's own early-return for non-`git commit` commands. Moved `from gauntlet.challenges import ...` inside `generate_challenge_for_files()` with `try/except ImportError: return None` for graceful degradation. All 15 unit tests in `tests/unit/test_precommit.py` still pass. - **gauntlet challenges module-scope anthropic import**: closed the same import trap at the source. `plugins/gauntlet/src/gauntlet/challenges.py` now imports `anthropic` and `TextBlock` lazily inside `_generate_problem_variation()` so any caller (not just the precommit hook) can import `gauntlet.challenges` from a Python interpreter lacking `anthropic`. The pre-existing `except Exception` already catches `ImportError` and falls back to the verbatim YAML problem. - **command-counting bug across 14 Makefiles + 1 doc module**: every per-plugin `make status` target and the `sanctum:doc-updates` accuracy-scanning module used `find commands/ -name '*.md'` to count slash commands. In plugins with modular commands (sanctum's `fix-pr-modules/`, `update-plugins/modules/`, `pr-review/modules/`) this counted helper sub-files as separate commands, reporting `Commands: 46` for sanctum (canonical: 19) and `155` for the project (canonical: 128). Added `-maxdepth 1` to all 15 invocations so the count matches `plugin.json.commands.length` exactly. Verified: top-level `commands/*.md` count equals registered command count for all 23 plugins. - **README and pensive book doc stale counts** (consequence of the command-counting bug above): corrected stale skill/command counts and per-plugin catalog rows (`abstract` 13→14, `sanctum` 14→18, `imbue` 12→13/4→5, `pensive` 13→14/12→13, `gauntlet` 6→7). Headline command count fixed from `155` to `128`. Added missing `performance-review`, `blast-radius`, `/performance-review`, and `/refine-code` rows to `book/src/plugins/pensive.md`. - **`update_versions.py` CACHE_EXCLUDES missing project-local tool/cache dirs**: the script's exclusion list omitted `.typecheck-venv`, `.uv-tools`, `.xdg-cache`, and `.tools`, so dry-run reported 137 candidate version files including dozens inside installed Python packages. Added the four missing entries to `plugins/sanctum/scripts/update_plugins_modules/constants.py` with a regression test (`test_cache_excludes_contains_project_local_tool_dirs`). Dry-run noise dropped from 137 to 97 files (95 actually changed by the 1.9.2 to 1.9.3 bump). ### Notes - **Correction to release commit `22423bca` message**: the commit body says the new ImportError test "guards the precommit hook surface when `anthropic` is unavailable in the system python3". The test actually patches `gauntlet.challenges` (not `anthropic`); a missing `anthropic` would surface inside `gauntlet.challenges._generate_problem_variation` and is caught by a different `except Exception` listed under the `Fixed` section above. The test file's own docstring is correct; only the commit message overstates the trigger. ## [1.9.2] - 2026-04-23 ### Added - **abstract review insight publishing**: `post_review_insights.py` parses `/pr-review` and `/fix-pr` review markdown and posts each blocker as a high-severity `[PR Finding]` Insight discussion (medium severity for non-blocking findings). Uses the existing InsightRegistry for content-hash dedup with 30-day staleness, so re-running on the same review is idempotent. - **abstract discussion enrichment**: `discussion_enrichment.py` enriches Phase 6a `[Learning]` posts with lens analysis, trend deltas, and named failure-mode clusters by running the existing analysis lenses against `LEARNINGS.md` and embedding the results in the discussion body. Prior Learning posts surfaced raw metrics without the corresponding recommendations; this closes that loop. - **sanctum review insights wiring**: `/pr-review` and `/fix-pr` now post their findings to GitHub Discussions via `post_review_insights.py` after the review/validate step succeeds. Default behavior; opt out with `--no-insights` for trivial reviews (typo fixes, version bumps). ### Changed - **imbue bounded-reads hook split**: `vow_bounded_reads.py` now matches only Read, Grep, and Glob calls. A new companion script `vow_bounded_reads_reset.py` handles Write, Edit, and MultiEdit: it resets the per-session counter with no budget logic, eliminating a Python startup cost on every write during implementation. - **imbue bounded-reads blocking support**: `vow_bounded_reads.py` now respects `VOW_SHADOW_MODE`, matching the pattern of `vow_no_ai_attribution.py` and `vow_no_emoji_commits.py`. Default (shadow mode on) continues to warn only; set `VOW_SHADOW_MODE=0` to block when the read budget is exceeded. ### Fixed - **abstract synthetic session filter**: `aggregate_skill_logs.py` now filters out synthetic test sessions when aggregating skill logs, preventing fixture data from contaminating Learning post metrics. - **abstract `_safe_pct` whitespace handling**: order- sensitive `.rstrip("%").strip()` returned `0.0` for percentage strings with surrounding whitespace (`" 40.0% "`). Swapped to `.strip().rstrip("%")` so outer whitespace is removed first. Surfaced by the invariant-encoding test pass. - **imbue vow hook executable bits**: added executable permissions to `vow_bounded_reads.py`, `vow_bounded_reads_reset.py`, `vow_no_ai_attribution.py`, and `vow_no_emoji_commits.py` so PreToolUse hooks invoke correctly on systems where umask strips `+x`. ## [1.9.1] - 2026-04-20 ### Added - **imbue hard vow hooks**: Three PreToolUse hooks that enforce constraints mechanically rather than through willpower. `vow_no_emoji_commits.py` blocks emoji in git commit messages. `vow_no_ai_attribution.py` blocks AI attribution strings ("Co-authored-by: Claude", "Generated by Claude", etc.). `vow_bounded_reads.py` warns when discovery read budget (15 reads per session) is exceeded. All three default to shadow mode (warn only); set `VOW_SHADOW_MODE=0` for full blocking. Includes 116 unit tests covering all code paths and the modern `hookSpecificOutput.permissionDecision` protocol. ### Fixed - **imbue hook modernization**: Replaced deprecated top-level `{"decision": ..., "reason": ...}` output format in three new vow hooks with the current `hookSpecificOutput.permissionDecision` / `permissionDecisionReason` structure required by Claude Code. - **imbue test lint**: Added `check=False` to `subprocess.run` call in `test_scope_guard_hook.py` to fix `PLW1510` ruff warning. ## [1.9.0] - 2026-04-14 ### Added - **invariant-judgment framework**: Cross-cutting detection and escalation for design invariant conflicts across 6 skills in 4 plugins (imbue, pensive, sanctum). When a change conflicts with an existing invariant (architecture pattern, data structure choice, API contract), the agent surfaces three options (preserve, layer, revise) and escalates to human judgment rather than silently resolving. Includes invariant-encoding test guidance and 30 content assertion tests protecting the framework itself. - **abstract friction-detector**: Detects friction signals during execution and graduates recurring patterns into declarative rules for session retrospectives and learning pipeline automation. - **imbue vow-enforcement**: Three-layer constraint enforcement (soft vows, hard vows, Nen Court external validators) that classifies constraints by reliability and graduates them when violation rates climb. - **attune mission-orchestrator modules**: Added adaptive-constraints, reflexion-buffer, and trust-tier modules for dynamic constraint adjustment and execution feedback loops. - **egregore model-routing module**: Model selection routing for the summon skill. - **attune interactive plan review**: Review loop between planning and execution phases. Users approve, request revisions, or reject plans before implementation begins. Includes feedback collection, plan versioning with diffs, context injection for revision prompts, and a 3-round iteration governor to prevent unbounded revision cycles. - **attune war-room**: Prosecution Counsel role that stress-tests proposals by probing failure modes, hidden costs, and unstated assumptions. - **leyline additive-bias-defense**: Cross-cutting contract that inverts the burden of proof for code additions. Every proposed addition must justify its presence against scrutiny questions. ### Changed - **abstract skills-eval**: Consolidated evaluation modules. Removed multi-metric-evaluation-methodology, quality-metrics, and skill-authoring-best-practices in favor of streamlined evaluation-criteria and authoring-checklist modules. - **conserve context_scanner**: Refactored monolith script into a package with separate modules for detection, ecosystems, caching, rendering, import graph analysis, and CLI. Public API preserved via `__init__.py` re-exports. - **imbue/leyline/abstract skill modules**: Token optimization pass reducing module sizes across red-flags, interactive-auth, configuration, and sdk-hook-types modules. ### Fixed - **sanctum validators**: Restored defensive `try/except ImportError` fallback for leyline frontmatter parser. Direct import broke tests in environments without leyline installed. - **leyline test_sqlite_graph_base**: Replaced blind `pytest.raises(Exception)` with specific exception types (`sqlite3.ProgrammingError`, `sqlite3.OperationalError`). ## [1.8.4] - 2026-04-10 ### Fixed - **sanctum hooks**: Replaced broad `except Exception:` in `deferred_item_sweep` and `deferred_item_watcher` __main__ guards with specific exception tuples. - **cartograph hook**: Added stderr logging to silent `except Exception` in `graph_community_refresh`. - **leyline sem-integration**: Clarified unverified brew formula in detection docs. Fixed fallback regex to match `async def` functions. ### Added - **gauntlet tests**: Entry-type filtering and flow sorting tests in `test_flows.py`. - **cartograph tests**: Exception path test for community refresh hook. ## [1.8.3] - 2026-04-08 ### Added - **imbue justify**: `/justify` command and skill that audits changes for AI additive bias. Scores changes on line ratio, new files, test tampering, and new abstractions. Enforces Iron Law compliance and weights minimal-intervention approaches. ### Changed - **gauntlet models**: Narrowed difficulty range from 1-5 to 1-4 across KnowledgeEntry, Challenge, and AnswerRecord. Aligns scoring with the four onboarding stages. - **conserve context-map**: Removed `--no-schema` flag (schema detection is now always included). - **pensive blast-radius**: Shell snippet skips dotfiles and documents file-type filtering for reduced false positives. ### Fixed - **conserve context_scanner**: Prevented mutation of ScanResult during JSON rendering via `_copy_result`. Fixed duplicate filename collisions in import graph with `setdefault`. Graceful handling of read-only filesystems when saving cache. - **gauntlet incremental**: `_get_staged_files` returns empty list on non-zero git exit code instead of parsing empty output. - **gauntlet graph_auto_update**: Moved `logging` import to module level instead of inside exception handler. - **sanctum security_pattern_check**: Early exit after first highest-severity match avoids redundant pattern evaluation. Broader exception tuple (TypeError, AttributeError) in crash handler. - **leyline supply_chain_check**: Broader exception tuple (TypeError, AttributeError) in crash handler. - **conserve tool_output_summarizer**: Downgraded "no session file" log from warning to debug. ## [1.8.2] - 2026-04-05 ### Added - **leyline:sem-integration**: Foundation skill for [sem](https://github.com/Ataraxy-Labs/sem) semantic diff CLI. Provides detection, install-on-first-use, and output normalization with graceful fallback to git diff when sem is unavailable. - **imbue:diff-analysis**: Entity-level diffs via sem as primary categorization path. - **imbue:catchup**: Semantic delta capture with sem diff and sem log for entity history. - **pensive:blast-radius**: Three-tier impact analysis (gauntlet graph, sem impact, rg/grep fallback). - **sanctum:git-workspace-review**: Entity-level summary alongside git diff --stat when sem is available. - **sanctum:commit-messages**: Entity names from sem for more precise commit subjects. ## [1.8.1] - 2026-04-03 ### Added - **cartograph plugin**: Codebase visualization through Mermaid Chart MCP. Five diagram skills (architecture, data-flow, dependency-graph, workflow, class-diagram), one command (`/visualize`), and a codebase-explorer agent. ### Fixed - **hook SDK output format**: Modernized 6 hooks across abstract, conserve, gauntlet, and leyline to use the `hookSpecificOutput` nested format instead of the deprecated flat `{"decision": ...}` pattern. Hooks now return empty output for no-op cases instead of explicit allow responses. - **scope-guard noise**: The UserPromptSubmit scope-guard hook no longer fires RED ZONE warnings on maintenance commands (`/reinstall-all-plugins`, `/update-plugins`, `/fix-workflow`, `/catchup`, etc.). - **supply-chain-check**: Graceful handling of malformed blocklist JSON instead of crashing. - **post-learnings-stop**: Properly parses JSON stdin payload instead of discarding it. ### Changed - **hooks-eval documentation**: Updated return value examples to show current `hookSpecificOutput` format. - **superpowers upstream sync** (v5.0.7): Updated integration docs and dependency matrix to reflect superpowers v3.4.1 to v5.0.7 evolution (20+ releases). Key changes integrated: - Added `dispatching-parallel-agents`, `using-git-worktrees`, `finishing-a-development-branch` to dependency matrix - Documented skill consolidation (testing-anti-patterns, root-cause-tracing, defense-in-depth, condition-based-waiting merged into parent skills) - Added inline self-review pattern and SUBAGENT-STOP gate awareness to integration docs - Fixed deprecated skill references across imbue, memory-palace, conserve plugins - Updated sanctum integration guide to v1.1.0 - **spec-kit upstream sync** (v0.5.0): Updated spec-kit plugin to align with upstream github/spec-kit changes (34 releases since v0.0.90). Key changes: - Renamed "Non-Functional Requirements" to "Success Criteria" across commands and skill modules - Added `argument-hint` frontmatter to all 10 spec-kit commands for Claude Code skill format compliance - Added Assumptions section to spec template structure - Updated checklist dimensions terminology ### References - [Superpowers v5.0.7](https://github.com/obra/superpowers/releases/tag/v5.0.7) - [Spec-Kit v0.5.0](https://github.com/github/spec-kit/releases/tag/v0.5.0) ## [1.8.0] - 2026-04-01 ### Added - **oracle plugin** (v1.0.0): ONNX Runtime inference daemon serving ML models over localhost HTTP. Opt-in activation via `/oracle:setup`, isolated Python 3.11+ venv managed by `uv`, SessionStart/Stop lifecycle hooks. See ADR-0009. - **gauntlet ML scoring**: `score_answer_quality()` API with pluggable Scorer protocol. YamlScorer (default) uses heuristic YAML rules; OnnxSidecarScorer upgrades automatically when oracle is running. Configurable blend weights fall back gracefully when the sidecar is unavailable. - **herald plugin** (v1.0.0): shared notification library extracted from egregore. Provides GitHub issue alerts and webhook support (Slack, Discord, generic). Pure library plugin with no skills or commands. - **tome integration in feature-review**: `--research` flag triggers Phase 4.5 (Research Enrichment) which dispatches tome's code-search, discourse, papers, and TRIZ channel agents in parallel, synthesizes findings, and adjusts scoring factors with evidence-backed deltas clamped to the Fibonacci scale. Tome is an optional dependency; feature-review degrades gracefully when tome is not installed. Adds 9 new tests covering delta calculation, Fibonacci clamping, max-delta constraint, evidence thresholds, graceful degradation, and channel-factor mapping. - ADR-0009: Sidecar Service Discovery via Port Files. Formalizes port-file discovery, PID tracking, localhost binding, health endpoints, and opt-in activation for sidecar daemons. - ADR-0001 updated with herald as an approved cross-plugin dependency exception. ### Changed - Claude Code 2.1.78-2.1.85 compatibility: hook conditional `if` field (2.1.85), MCP headersHelper server env vars (2.1.85), PreToolUse AskUserQuestion support (2.1.85), MCP OAuth RFC 9728 (2.1.85), org plugin blocking (2.1.85), OTEL tool_parameters gated (2.1.85), `/compact` context exceeded fix (2.1.85), PowerShell tool preview (2.1.84), model capability env vars (2.1.84), TaskCreated hook (2.1.84), WorktreeCreate HTTP hooks (2.1.84), idle-return prompt (2.1.84), MCP tool desc 2KB cap (2.1.84), YAML glob paths for rules/skills (2.1.84), system-prompt caching fix (2.1.84), managed-settings.d/ drop-in (2.1.83), CwdChanged/FileChanged hooks (2.1.83), sandbox failIfUnavailable (2.1.83), subprocess env scrub (2.1.83), agent initialPrompt (2.1.83), plugin userConfig (2.1.83), MEMORY.md 25KB limit (2.1.83), TaskOutput deprecated (2.1.83), --bare flag (2.1.81), --channels preview (2.1.80), rate_limits statusline (2.1.80), effort frontmatter for skills (2.1.80), source:settings plugins (2.1.80), --console auth (2.1.79), StopFailure hook (2.1.78), CLAUDE_PLUGIN_DATA (2.1.78), agent effort/maxTurns/disallowedTools (2.1.78) - Added hook `if` field, StopFailure, TaskCreated, CwdChanged/FileChanged, WorktreeCreate HTTP, PreToolUse AskUserQuestion to all 4 hook reference files - Updated abstract model-optimization-guide with PowerShell tool, model capability env vars, effort frontmatter, agent frontmatter fields, --bare flag - Updated conserve mecw-principles with /compact fix, MCP tool cap, idle-return prompt, system-prompt caching, MEMORY.md 25KB limit - Updated conjure agent-teams spawning-patterns with --bare flag, agent initialPrompt, agent frontmatter fields, worktree non-git fix - Updated conjure agent-teams health-monitoring (no new entries for 2.1.78-2.1.85 beyond 2.1.77) - Updated egregore budget with rate_limits statusline - Updated leyline update-all-plugins with CLAUDE_PLUGIN_DATA, multi-dir seed dir, source:settings, managed-settings.d/, plugin userConfig, org plugin blocking - Updated compatibility-features index and recent file with 2.1.78-2.1.85 feature timelines (8 versions) - Claude Code 2.1.77 compatibility: Opus 4.6 output tokens default 64k (upper bound 128k), `allowRead` sandbox setting, `/copy N`, PreToolUse "allow" bypass deny fix (security), compound bash "Always Allow" fix, auto-updater memory leak fix, `--resume` truncation race fix, Write tool CRLF fix, progress message memory growth fix, Agent `resume` parameter removed (use SendMessage), SendMessage auto-resumes stopped agents, `/fork` renamed to `/branch`, background bash 5GB limit, `plugin validate` frontmatter checking, macOS startup and `--resume` performance improvements, 30+ bug fixes - Updated abstract model-optimization-guide with output token limits (64k default, 128k upper bound) - Updated conserve mecw-principles with progress message memory fix and output token impact on compaction - Added PreToolUse "allow" bypass fix (2.1.77+) to all 4 hook reference files - Updated conjure agent-teams spawning-patterns with Agent resume removal, stale worktree race fix - Updated conjure agent-teams health-monitoring with SendMessage auto-resume, background bash 5GB limit, teammate pane close fix - Updated leyline update-all-plugins with git-subdir cache collision fix, plugin validate improvements, headless plugin seed dir fix - Updated compatibility-features index and recent file with full 2.1.77 feature timeline - Claude Code 2.1.76 compatibility: MCP elicitation support (form fields and URL mode), Elicitation and ElicitationResult hooks (blockable), PostCompact hook (non-blockable), `-n`/`--name` CLI flag, `worktree.sparsePaths` for monorepos, `/effort` slash command, `feedbackSurveyRate` setting, deferred tools schema fix (post-compaction), auto-compaction circuit breaker (3 attempts), spurious "Context limit reached" fix with `model:` frontmatter on 1M sessions, adaptive thinking error fix for non-standard models, Bash `#` permission fix, worktree startup performance, background agent partial results preserved, model fallback always visible, stale worktree cleanup, `--plugin-dir` single path change, Remote Control fixes, 20+ additional bug fixes - Added MCP Elicitation/ElicitationResult hooks and PostCompact hook to all 4 hook reference files (hook-types, sdk-hook-types, capabilities-hooks, hook-types-comprehensive) - Updated abstract model-optimization-guide with `/effort` command, model fallback visibility, adaptive thinking fix, and context limit fix with `model:` frontmatter - Updated conserve mecw-principles with deferred tools schema fix, auto-compaction circuit breaker, and context limit fix on 1M sessions - Updated conjure agent-teams spawning-patterns with `-n`/`--name`, `worktree.sparsePaths`, worktree startup performance, stale worktree cleanup, `--plugin-dir` single path change - Updated conjure agent-teams health-monitoring with background agent partial results preservation - Updated leyline update-all-plugins with LSP registration fix and `--plugin-dir` single path change - Updated compatibility-features index and recent file with full 2.1.76 feature timeline - Claude Code 2.1.75 compatibility: 1M context window default for Opus 4.6 on Max/Team/Enterprise plans, `/color` command, session name on prompt bar, last- modified timestamps on memory files, hook source display in permission prompts, token estimation over-counting fix (premature compaction), Bash `!` pipe fix, managed-disabled plugins hidden, async hook messages suppressed, macOS startup performance, Windows managed settings path removal (breaking change) - Updated abstract model-optimization-guide with 1M context default and `CLAUDE_CODE_DISABLE_1M_CONTEXT` opt-out env var - Updated conserve mecw-principles with 1M context default for Max/Team/Enterprise and token estimation over-counting fix - Added hook source display (2.1.75+) and async hook message suppression (2.1.75+) to all 4 hook reference files (hook-types, sdk-hook-types, capabilities-hooks, hook-types-comprehensive) - Updated conjure agent-teams spawning-patterns with `/color` command and session name on prompt bar - Updated conjure agent-teams health-monitoring with agent teams footer hint fix - Updated egregore summon budget module with token estimation fix impact on orchestration sessions - Updated leyline update-all-plugins with managed- disabled plugins fix, corrupted marketplace config fix, and Windows managed settings breaking change - Updated compatibility-features index and recent file with full 2.1.75 feature timeline - Claude Code 2.1.71 compatibility: corrected CronCreate auto-expiry from 3 days to 7 days across all plugin docs (hook-authoring, hooks-eval, capabilities-hooks, hook-types-comprehensive, egregore orchestrator/summon) - Added `durable` parameter documentation for CronCreate (persists tasks to `.claude/scheduled_tasks.json` across restarts) in all hook reference files - Corrected CronCreate parameter name from `cron_expression` to `cron` in egregore orchestrator and summon skill/budget module - Expanded CronCreate schema details (jitter behavior, REPL-idle firing, off-minute scheduling guidance) in hook-authoring hook-types module - Updated compatibility-features index and recent file with full 2.1.71 CronCreate schema and durable persistence details - Claude Code 2.1.72 compatibility: ExitWorktree tool, effort levels simplified (max removed), CLAUDE.md HTML comments hidden, skill hook double-fire fix, parallel tool call cascade fix, prompt cache 12x savings, team agent model inheritance, 6 plugin marketplace fixes, bash allowlist expansion (lsof, pgrep, tput, ss, fd, fdfind), permission rule matching fixes, /clear foreground-only behavior, CLAUDE_CODE_DISABLE_CRON mid-session stop - Updated abstract escalation-governance and model-optimization-guide with 3-level effort system - Added ExitWorktree tool docs to all 4 hook reference files (hook-types, sdk-hook-types, capabilities-hooks, hook-types-comprehensive) - Added 2.1.72 hooks fixes (double-fire, transcript_path, agent prompt, PostToolUse block, async stdin, CLAUDE.md comments, parallel cascade) to all 4 hook reference files - Updated conserve clear-context with /clear foreground-only behavior note - Updated conjure agent-teams health-monitoring with team agent model inheritance - Updated conserve mecw-principles with prompt cache 12x fix - Updated leyline update-all-plugins with 6 marketplace fixes (Azure DevOps URLs, Windows EEXIST, user-scope installs, tilde expansion, clone diagnostics, marketplace-only fields) ### Removed - Tier 3 unbloat: deleted 182 dead files (66K lines) across 9 plugins (abstract, attune, conjure, conserve, egregore, leyline, memory-palace, sanctum, tome) per issues #345-#351 ### Fixed - Restored scripts referenced by pre-commit hooks and skill definitions that were incorrectly flagged as dead code during unbloat (validate_plugin.py, abstract_validator.py, hook_validator.py, and others) - Replaced inline lint suppressions (type: ignore, noqa) with proper fixes: `cast()` for type overrides, `importlib.util.find_spec()` for import detection, `__all__` re-exports for F401, constants for magic values, per-file-ignores for structural patterns - Fixed script permissions: 0755 for files with shebangs, removed executable bit from non-executable modules ## [1.7.2] - 2026-03-27 ### Added - Session-replay skill and command in scribe: converts Claude Code and Codex session JSONL files into GIF/MP4/WebM replays via VHS tape generation - Supply chain security in leyline: `supply-chain-advisory` skill, `supply_chain_check` SessionStart hook, and `make supply-chain-scan` target for detecting known compromised package versions in lockfiles - Known-bad-versions blocklist (`known-bad-versions.json`) with LiteLLM 1.82.7/1.82.8 credential stealer entry - Phantom plugin registered in marketplace.json and added to api-overview.md, domain-specialists.md book page ### Fixed - Removed dead `ConservationValidator` import and mock fixture from conserve test conftest - Converted abstract `meta-architect` and `skill-auditor` agent definitions from JSON to required markdown format - Closed unclosed command substitution in release CI heredoc ## [1.7.1] - 2026-03-25 ### Changed - Skill frontmatter descriptions expanded across attune, tome, leyline, egregore, pensive, and phantom plugins for better Claude Code matching and discoverability - Added "When To Use" and "When NOT To Use" sections to 15 skills that lacked usage guidance (content-sanitization, deferred-capture, markdown-formatting, stewardship, quality-gate, computer-control, metacognitive-self-mod, and all six tome research skills) - Normalized heading case from "When to Use" to "When To Use" across damage-control, utility, tiered-audit, and computer-control skills ### Fixed - pytest `norecursedirs` added to root `pyproject.toml` to prevent plugin conftest.py import conflicts when running top-level tests - `scripts/fix_descriptions.py` now detects embedded `Triggers:` patterns in skill descriptions alongside existing custom description and truncated "DO NOT." checks ## [1.7.0] - 2026-03-21 ### Added - ADR-0001 amendment: `abstract` declared as a second intentional cross-plugin dependency alongside leyline, with the same try/except ImportError guard requirements - Shell metacharacter rejection in conserve `permission_request.py`: commands containing `;`, `|`, `&&`, backticks, or `$()` are never auto-approved, preventing bypass via `ls; dangerous_command` - Inlined-copy sync notice in `scripts/shared/json_utils.sh` listing the five plugin hooks that inline these functions - New tests for conserve `growth_analyzer.py` and `permission_request.py` hooks ### Changed - Egregore `learning.py`: replaced floating-point `success_rate` field with integer `success_count` and a computed property to prevent cumulative drift; includes backward-compatible deserialization from the old format - Conserve `growth_analyzer.py`: all return paths in `estimate_turns_to_violation` now return `float` for type consistency - All six `deferred_capture.py` wrappers (abstract, attune, egregore, imbue, pensive, sanctum) now guard their leyline import with try/except ImportError and exit with a message when leyline is absent - Leyline `auto-star-repo.sh` emits SessionStart JSON with `hookSpecificOutput` fields instead of raw text, including opt-out and fallback paths - Shell hooks in conserve and imbue updated to return SessionStart JSON with `additionalContext` field - `json_utils.sh` `escape_for_json` uses `jq -Rs 'rtrimstr'` instead of the fragile `.[:-1] // ""` slice ### Fixed - Growth analyzer `estimate_turns_to_violation` returned `int` on some code paths and `float` on others; all paths now return `float` ## [1.6.7] - 2026-03-19 ### Added - Invisible text injection detection in leyline: identifies hidden Unicode characters and zero-width sequences in external content - Dorodango polishing workflow in attune: iterative refinement cycle for project artifacts - Spec review loop and unit decomposition teaching in attune: structured feedback for specifications with unit-level breakdown guidance - Cheapest-capable model selection heuristic in conjure: routes delegation tasks to the most cost-effective model that meets capability requirements - Mandatory file structure section in attune planning: implementation plans now require explicit directory and file layout - Deferred-item capture system in sanctum: PostToolUse watcher detects deferred items from skill output, Stop hook sweeps the session ledger and files GitHub issues via `deferred_capture.py` - Deferred-capture contract skill in leyline: defines the CLI interface, issue template, and label taxonomy that all plugin capture wrappers implement - Per-plugin `deferred_capture.py` wrappers for attune, conserve, egregore, imbue, and pensive, each routing to the sanctum reference implementation - Makefile `deferred-capture-test` target for running the deferred-capture test suite ### Fixed - PR review findings from #283 and #292: formatting fixes in test files, corrected Brooks's Law threshold assertions, removed duplicate verdict calculation code - Tiered audit skill: added table of contents and verification steps for each tier ## [1.6.6] - 2026-03-16 ### Added - Parallel work item execution via git worktrees in egregore, with independence detection and configurable concurrency (`max_concurrent_worktrees`) - Parallel pipeline stages in egregore quality gate: dependency-graph scheduling splits steps into waves that run concurrently - Agent specialization in egregore: reviewer, documenter, and tester specialists accumulate context across sessions - Cross-item learning in egregore: decision log analysis extracts reusable patterns (tech stack, failure modes, architecture) with success rate tracking - Multi-repository support in egregore via `RepoRegistry` for routing work items across repos - GitHub Discussions publishing in egregore for discoveries, insights, and retrospectives with rate limiting - Output contracts for agent dispatch: bloat-auditor and code-reviewer agents now declare required output sections, minimum evidence counts, and strictness levels validated by proof-of-work - Tiered audit skill in pensive: three-tier escalation model (git history, targeted, full codebase) that defaults to Tier 1 and requires approval for Tier 3 - UserPromptSubmit hook in egregore: lets users interact with running orchestrator sessions without breaking the pipeline loop - Self-healing heartbeat cron in egregore: detects stalled pipelines every 5 minutes and re-enters the orchestration loop automatically - New modules in conserve context-optimization: findings-format, memory-tiers, session-routing - New modules in imbue proof-of-work: output-contracts, retry-protocol - Integration tests for egregore hook `main()` entry points - Pytest markers (`unit`, `integration`) in egregore and scribe `pyproject.toml` - Missing YAML dependency test in scribe `test_pattern_loader.py` ### Fixed - Webhook URL scheme validation in egregore to prevent SSRF (only `https://` allowed) - YAML import guard and encoding parameter in hookify `pattern_loader` - Merkle sealed-node verification strengthened in conjure and sanctum - Manifest collision handling, forward-compatibility, and history field fixes in egregore ### Changed - Egregore manifest utilities: extract `get_items()` helper and `ACTIVE_STATUSES` constant to support both `work_items` and `items` keys - Egregore orchestrator checks pending items (not just active) before allowing exit - Plan-before-large-dispatch rule reformatted with 80-char wrapping and tiered audit preference - Egregore README: replace roadmap with documentation of implemented features (parallel execution, agent specialization, cross-item learning, multi-repo, Discussions publishing) - 80-char prose wrapping applied across book content, plugin READMEs, and reference documentation ### Removed - Duplicate tests in hookify `test_rule_suggester.py` ## [1.6.5] - 2026-03-16 ### Added - Six Rust review checks in pensive rust-review skill: silent-returns, collection-types, sql-injection, cfg-test-misuse, error-messages, duplicate-validators - Shared CLI scaffolding module (`cli_scaffold.py`) in abstract, extracted from five wrapper scripts - Tasks API mode test coverage in abstract ### Changed - O(n) session chain traversal in memory-palace session history (was O(n^2) linked-list walk) ### Fixed - PYTHONPATH in pensive pytest config so tests resolve plugin imports correctly - `tool_result` content counting in conserve context warning hook and `json.loads` guards in conjure war room persistence ## [1.6.4] - 2026-03-15 ### Added - ERC-8004 behavioral contract verification command (`/verify-plugin`) in leyline for querying on-chain plugin trust scores - Star prompt SessionStart hook in leyline that asks users if they want to star anthropics/claude-code (opt-out via CLAUDE_NIGHT_MARKET_NO_STAR_PROMPT=1) - Remote-control/headless subagent hang warnings in sanctum do-issue parallel-execution and troubleshooting modules with upstream bug references ### Changed - Alphabetize leyline skill listing in openpackage.yml and register newly added skills (content-sanitization, damage-control, markdown-formatting, stewardship) ## [1.6.3] - 2026-03-15 ### Added - Tooling reflection step (Step 6.7) in sanctum fix-pr completion workflow for posting night-market observations to GitHub Discussions - Tooling reflection section in sanctum do-issue completion module with Discussions routing - Star History chart and "Powered by Night Market" badge in README - BDD tests for conserve context_warning session discovery and content counting helpers ### Changed - Update context window defaults from 200K to 1M for Opus 4.6 GA across conserve plugin - Marketplace schema: promote version and description to top-level fields with `$schema` URL - Workflow-improvement skill: route tooling learnings to GitHub Discussions instead of local memory palace - Refactor conserve context_warning: extract `_find_current_session`, `_resolve_session_file`, and `_count_content` helpers from monolithic estimator function - Replace `@lru_cache` global with instance-scoped caching in sanctum security pattern check hook - Line-wrap fixes in sanctum post-implementation policy and security pattern check hooks ### Fixed - Use `datetime.timezone.utc` instead of `datetime.UTC` for Python 3.9 compatibility - Dynamic repo detection for Discussions posting in abstract plugin (no longer hardcoded) - Add cache directory exclusions to 26 `find` commands across 6 plugins to prevent scanning `.venv`, `__pycache__`, `node_modules`, `.git` ### Removed - Dead code cleanup across 6 plugins: 6K+ lines removed in three refactoring waves, including unused dependencies and stale test files ## [1.6.2] - 2026-03-11 ### Added - Five stewardship virtues (Care, Curiosity, Humility, Diligence, Foresight) connecting Claude's trained character dispositions to engineering practices in leyline stewardship skill with one module per virtue plus reflection template - Virtue-grounded rule messages in hookify: reason-first openings that explain WHY before WHAT, each tagged with the virtue it expresses - Disposition preambles in imbue proof-of-work and scope-guard skills linking enforcement to underlying virtues - Virtue practice as sixth health dimension in `/stewardship-health` command - Stewardship tracker `virtue` parameter for virtue-aligned action recording ## [1.6.1] - 2026-03-10 ### Changed - Bump all plugin versions to 1.6.1 ## [1.6.0] - 2026-03-08 ### Added - Tiered plugin review workflow in abstract plugin: `/plugin-review --tier branch|pr|release` with dependency-aware scoping via `docs/plugin-dependencies.json`. Branch tier runs quick quality gates; PR tier adds skills-eval, hooks-eval, test-review, bloat-scan scoring; release tier audits all 17 plugins with plan mode dispatch. `/update-plugins` triggers it as Phase 2. - Plugin dependency map generator script (`scripts/generate_dependency_map.py`) scanning Makefile includes, pyproject.toml deps, and Python imports to produce `docs/plugin-dependencies.json` - Egregore plugin: autonomous agent orchestrator for full development lifecycles with session budget management, crash recovery, and watchdog monitoring - `CLAUDE_CODE_PLUGIN_GIT_TIMEOUT_MS` documentation in leyline reinstall/update plugin commands for troubleshooting slow git operations (2.1.51) - npm registry plugin installation notes in leyline reinstall command (2.1.51) - Workspace trust security note for `statusLine` and `fileSuggestion` hook commands across 4 hook reference files (hook-types skill, sdk-hook-types skill, capabilities-hooks reference, hook-types-comprehensive example) (2.1.51) - Tool result disk persistence threshold (50K chars) documentation in conserve context-optimization mecw-principles module (2.1.51) - Managed settings documentation (macOS plist, Windows Registry) with full precedence hierarchy and managed-only settings reference (2.1.51) - Account environment variables documentation (`CLAUDE_CODE_ACCOUNT_UUID`, `CLAUDE_CODE_USER_EMAIL`, `CLAUDE_CODE_ORGANIZATION_UUID`) for SDK callers in conjure delegation docs (2.1.51) - Bulk agent kill (ctrl+f) aggregate notification note in conjure agent-teams spawning-patterns module (2.1.53) - Subagent task state release note in conserve subagent-coordination module (2.1.59) - Auto-memory layer comparison update in memory-palace README (2.1.59, replaces "Native Memory 2.1.32+" with "Auto-Memory 2.1.59+") - HTTP hooks reference in hook-types skill, sdk-hook-types skill, and capabilities-hooks book reference (2.1.63) - Memory leak fixes (12+ sites) note in conserve subagent-coordination module (2.1.63) - `ENABLE_CLAUDEAI_MCP_SERVERS=false` opt-out in conserve mcp-subagents module (2.1.63) - Teammate memory management note in conjure agent-teams health-monitoring module (2.1.63) - Opus 4.6 medium effort default, "ultrathink" keyword, and Opus 4/4.1 removal in escalation-governance skill and model-optimization-guide (2.1.68) - `${CLAUDE_SKILL_DIR}` variable and description colon fix documentation in skill-authoring SKILL.md (2.1.69) - Hook event field enrichment (`agent_id`, `agent_type`, `worktree`) across 4 hook reference files: hook-types skill, sdk-hook-types skill, capabilities-hooks reference, hook-types-comprehensive example (2.1.69) - TeammateIdle/TaskCompleted `continue: false` teammate shutdown in conjure health-monitoring module and all 4 hook reference files (2.1.69) - WorktreeCreate/WorktreeRemove plugin hook fix note across 4 hook reference files (2.1.69) - `/reload-plugins` command reference in leyline update-all-plugins command (2.1.69) - Sonnet 4.5 → 4.6 migration note in escalation-governance skill and model-optimization-guide (2.1.69) - Explicit `model:` field added to 10 agent definitions across 6 plugins (pensive, spec-kit, imbue, memory-palace, egregore, conserve) to prevent silent model inheritance from parent session - Fixed non-standard `model_preference: default` field in egregore sentinel and conserve continuation-agent to use proper `model: haiku` / `model: inherit` - Cron scheduling tools (`CronCreate`, `CronList`, `CronDelete`) and bash auto-approval expansion (`fmt`, `comm`, `cmp`, `numfmt`, `expr`, `test`, `printf`, `getconf`, `seq`, `tsort`, `pr`) across 4 hook reference files: hook-types skill, sdk-hook-types skill, capabilities-hooks reference, hook-types-comprehensive example (2.1.71) - Heredoc permission fix note across 4 hook reference files (2.1.71) - Background agent notification fix and `--print` team agent hang fix in conjure health-monitoring module (2.1.71) - Read tool image overflow fix in conserve context-optimization mecw-principles module (2.1.71) - `/plugin uninstall` team safety note in leyline update-all-plugins command (2.1.71) - Egregore `/loop` integration: in-session rate limit recovery via CronCreate (replaces exit-and-watchdog pattern), progress pulse via `/loop 5m /egregore:status`, CronCreate/CronList/CronDelete added to orchestrator tools (2.1.71) - Moved `/loop`, voice:pushToTalk, stdin freeze, plugin installation fix, MCP server dedup, and startup performance fixes from 2.1.69 to 2.1.71 in compatibility tracker (version correction) - Compaction image preservation and resume token savings (~600 tokens) in conserve context-optimization mecw-principles module (2.1.70) - Effort parameter Bedrock fix note in escalation-governance skill (2.1.70) - Plugin installation status accuracy note in leyline update-all-plugins command (2.1.70) ### Changed - Updated escalation-governance effort control table to reflect medium as default effort (2.1.68) - Updated model-optimization-guide effort controls section with medium default and ultrathink (2.1.68) - Updated hook event version annotations for WorktreeCreate/WorktreeRemove (2.1.49+ to 2.1.50+) across 7 files (hook validator, frontmatter parser, hook-types skill, sdk-hook-types skill, capabilities-hooks reference, hook-types-comprehensive example, test suite) - Added WorktreeCreate/WorktreeRemove input data schemas and behavioral documentation (command-only, stdout path output, no matchers, no blocking for Remove) ## [1.5.7] - 2026-03-06 ### Added - `STEWARDSHIP.md` manifesto with five principles drawn from biblical stewardship, Boy Scout Rule, Peter Block, Kaizen, and Seventh Generation thinking - Per-plugin stewardship sections in all 16 plugin READMEs with plugin-specific improvement opportunities - `leyline:stewardship` cross-cutting skill with principles, layer-specific guidance, and decision heuristics - Stewardship action tracker (`plugins/leyline/scripts/stewardship_tracker.py`) with JSONL append-only logging - Plugin health dimensions script (`plugins/leyline/scripts/plugin_health.py`) measuring documentation freshness, test coverage, code quality, contributor friendliness, and improvement velocity - `/stewardship-health` command (imbue) for per-plugin and ecosystem-wide health display - Campsite check hookify rule (`plugins/hookify/skills/rule-catalog/rules/stewardship/`) surfacing 1-2 improvement suggestions on session completion - Stewardship context in sanctum `verify_workflow_complete` hook and abstract `homeostatic_monitor` - Tests for stewardship tracker and plugin health scripts ### Changed - Updated imbue and leyline `plugin.json` with stewardship skill and command registrations - Python 3.9 compatibility fixes in abstract hooks, scripts, and memory-palace source modules ## [1.5.6] - 2026-03-05 ### Added - Shared test plan injection module for sanctum (`commands/shared/test-plan-injection.md`) with detection, generation, and injection logic - Test plan injection step in `/fix-pr` (step 6.5b) and `/pr-review` (step 17.5) workflows - 627-line BDD test suite for test plan injection covering detection patterns, generation templates, and cross-reference validation - Educational insights in PR review findings (sanctum, pensive) - GitHub Discussions publishing wired into war-room (Phase 8), scope-guard (deferral linking), and knowledge-intake (Step 7 promotion for score 80+) - `tech-tutorial` skill for scribe with 3 modules (planning, drafting, refinement) - `damage-control` skill for leyline (crash recovery, context overflow, merge conflict resolution) - 93 placeholder Makefile targets across 9 plugins - Python 3.9 compatibility CI workflow - `--dry-run` flag for abstract wrapper_generator - Stop-word filtering for abstract find_similar ### Changed - Modularized `makefile_dogfooder.py` into dogfooder package (abstract) - Raised abstract test coverage target to 90% ## [1.5.5] - 2026-03-04 ### Added - Code execution risk detection for memory-palace safety checks (YAML deserialization, eval/exec injection, os.system, subprocess shell=True, dunder traversal) with backtick-aware filtering - Importance-weighted decay floors for knowledge entries (constitutional 0.5, architectural 0.4, significant 0.3, standard 0.1, ephemeral 0.0) - Content sanitization hook for leyline (two-tier threat detection with prompt injection prevention) - Configurable complexity/confidence thresholds for hookify rule suggestions - Tool-type detection in hookify event mapping (Bash subcommand classification, WebFetch/WebSearch identification) - Importance score field on ReviewEntry with room-type defaults (decisions: 70, others: 40) - Constitutional entry exclusion from stale entry lists in decay model - Pinned learnings preservation across LEARNINGS.md regeneration in abstract aggregate script - Expanded GitHub Discussions fetch from 5 to 10 results in leyline session-start hook - Shared `leyline:markdown-formatting` skill with hybrid wrapping rules (80 chars, sentence-boundary preference) and structural conventions (ATX headings, blank lines around headings/lists, reference-style links) - New `.claude/rules/markdown-formatting.md` rule for enforcing formatting during documentation generation - Prune-check tests for memory-palace palace manager (stale entry, duplicate, low-quality detection) - Leyline Makefile `demo-formatting` target for skill structure validation - 56 integration tests for sanctum session_complete_notify hook (platform mocking, state management, deduplication layers) ### Fixed - Replaced inline `__import__("datetime").timedelta` with proper import in `palace_manager.py` - D413 and PLR2004 lint violations in session_complete_notify.py (missing blank line after docstring section, magic number constants) ### Changed - Updated scribe doc-generator generation-guidelines with line wrapping section - Updated sanctum directory-style-rules shared rules with formatting conventions - Updated sanctum update-readme, tutorial-updates, and doc-consolidation skills with formatting references - Updated slop-scan-for-docs rule to include line length verification as first post-write check ## [1.5.4] - 2026-03-03 ### Added - Session lifecycle hooks for memory-palace (session history tracking) - Audit trail module for conjure war room persistence - Italian (it) and Portuguese (pt) language patterns for scribe slop detection - New test suites: CLI framework, skill tools, utils extended, wrapper base (abstract) ### Changed - Improved tasks manager base with updated unit tests (abstract) - Enhanced rule suggester with updated tests (hookify) - Refined review analyst agent and proof enforcement hook (imbue) - Updated code reviewer, architecture reviewer, and code refiner agents (pensive) - Improved PR agent and meta evaluation script (sanctum) - Enhanced pattern loader with i18n test coverage (scribe) - Updated speckit orchestrator skill (spec-kit) - Expanded book reference documentation for capabilities, agents, and skills ### Fixed - Web research handler test updates (memory-palace) - Conftest improvements for conserve and imbue test suites - Memory-palace hooks.json configuration update ## [1.5.3] - 2026-03-02 ### Changed - **Context warning hook delegates instead of wrapping up** (conserve) - EMERGENCY alerts now direct Claude to invoke `Skill(conserve:clear-context)` and chain to a continuation agent rather than stopping work. Recommendations changed from "summarize remaining tasks" to "delegate via continuation." - **Tail-based context estimation** (conserve) - `estimate_context_from_session()` reads only the last 800KB of a JSONL session file and counts actual content text. The previous file-size approach over-reported context usage on long sessions with compressed history. - **Dash-convention project directory resolution** (conserve) - Replaced MD5-hash-based directory naming with Claude Code's actual convention (`/home/user/project` becomes `-home-user-project`). The old hash approach never matched real session directories. - **Continuation agent chaining protocol** (conserve) - Agent instructions now state: ignore "wrap up" signals from the context warning hook; chain to another continuation agent at 80% context instead of stopping. ## [1.5.1] - 2026-02-27 ### Added - **Content assertion levels framework** (L1/L2/L3) for testing execution markdown - `leyline:testing-quality-standards/modules/content-assertion-levels.md` - canonical taxonomy - `sanctum:test-updates/modules/content-test-discovery.md` - execution markdown detection - `sanctum:test-updates/modules/generation/content-test-templates.md` - BDD scaffolding - `pensive:test-review/modules/content-assertion-quality.md` - review scoring dimension - Claude Code 2.1.49-2.1.63 compatibility entries in abstract docs - 35 content assertion tests across leyline (8), conserve (included in 143), abstract (6), sanctum (15), pensive (6) - 14 tests for enriched update-plugins output (TestReadModuleDescription, TestPrintModuleIssuesEnriched) ### Changed - Enrich update-plugins orphaned module output with inline descriptions - Split update-plugins command doc from 397 to 91 lines + 3 on-demand modules - Wire ci-integration, config-file, progress-indicators into scribe slop-detector - Iron Law enforcement now recognizes execution markdown as code (imbue) - Test discovery reclassifies execution markdown from Low to High/Medium priority (sanctum) - Scribe agents updated to claude-sonnet-4-6 model ID ### Fixed - Invalid JSON code block in hook-authoring skill (removed // comments) - Conjure bridge truncation limit corrected (100k to 50k chars) ## [1.5.0] - 2026-02-26 ### Changed - **Cross-plugin skill consolidation** - Merged 7 redundant skills into parent skills as modules, eliminated 6 commands, and consolidated 2 hooks. Net reduction: ~7,200 lines removed across 90 files. - `fpf-review` → `pensive:architecture-review/fpf-methodology` module - `evidence-logging` → `imbue:proof-of-work/evidence-logging` module - `damage-control` → `leyline:error-patterns/agent-damage-control` module - `optimizing-large-skills` → `abstract:modular-skills/optimization-techniques` module - `makefile-dogfooder` → `pensive:makefile-review/plugin-dogfood-checks` module - `performance-optimization` → `leyline:progressive-loading/performance-budgeting` module - `mecw-patterns` → `conserve:context-optimization/mecw-theory` module - **Command consolidation** - Removed 6 duplicate or overlapping commands: - `/estimate-tokens` merged into `/context-report` - `/analyze-growth` merged into `/bloat-scan` - `/cleanup` merged into `/unbloat` - `/update-readme` merged into `/update-docs` - `/analyze-hook` merged into `/validate-hook` - `/skill-logs` removed (namespace collision with memory-palace) - `/doc-verify` and `/slop-scan` removed (agent-only, no command wrapper needed) - `/feature-review` removed (merged into `imbue:scope-guard`) - **Hook consolidation** - Merged 4 hooks into 1: - `web_content_processor.py` + `research_storage_prompt.py` → `web_research_handler.py` (memory-palace) - `skill_tracker_pre.py` + `skill_tracker_post.py` removed (duplicate of abstract's canonical tracking) - **Cross-plugin delegation** - attune skills now delegate to spec-kit rather than duplicating: - `attune:project-specification` delegates to `spec-kit:spec-writing` - `attune:project-planning` delegates to `spec-kit:task-planning` - **Plan-before-dispatch rule** - Added enforcement requiring `EnterPlanMode` before dispatching 4+ parallel agents. Documented in `parallel-execution.md` and enforced via hookify rule. ## [1.4.5] - 2026-02-23 ### Added - **Rules evaluation skill** (abstract): `rules-eval` skill and `/rules-eval` command validate Claude Code rules in `.claude/rules/` directories for YAML frontmatter, glob pattern quality, content metrics, and organization patterns - **Daily learning aggregation hook** (abstract): `aggregate_learnings_daily.py` UserPromptSubmit hook runs on a 24h cadence to generate LEARNINGS.md, then chains to `auto_promote_learnings.py` for severity-based GitHub Issue creation (score > 5.0) or Discussion posting (score 2.0–5.0) - **Research storage prompt** (memory-palace): PostToolUse hook detects WebSearch usage and prompts user to store findings via `memory-palace:knowledge-intake` skill ### Fixed - **Mermaid rendering in mdbook**: CDN-based mermaid.js initialization so diagrams render in the published book - **Sanctum 6-complete modularization**: split 26KB hub into slim hub + 6 sub-modules for on-demand loading - **Shared CACHE_EXCLUDES**: extract duplicate exclude lists into `update_plugins_modules/constants.py` - **Skill frontmatter cleanup**: improved pytest-config description and verification steps ## [1.4.4] - 2026-02-19 ### Added - **GitHub Discussions as agent collective memory**: Cross-session decision retrieval and knowledge sharing via GitHub Discussions GraphQL API - **Discussion CRUD operations** (leyline): `command-mapping.md` extended with create, comment, search, mark-as-answer, get, update, and list-by-category GraphQL templates; GitHub-only with graceful degradation for GitLab/Bitbucket - **Discussion category templates**: `.github/DISCUSSION_TEMPLATE/` with 4 structured forms: decisions (announcement), deliberations (open), learnings (retrospective), knowledge (Q&A) - **War room discussion publishing** (attune): `discussion-publishing` module publishes completed deliberations to a "Decisions" Discussion after user approval; checks for prior decisions to avoid duplicates - **Session-start discussion retrieval** (leyline): `fetch-recent-discussions.sh` SessionStart hook queries the 5 most recent Decisions discussions via a bounded GraphQL query (3s timeout, <600 tokens) - **Knowledge promotion to Discussions** (memory-palace): `discussion-promotion` module promotes evergreen corpus entries to a "Knowledge" Discussion category; supports both create and update flows - **Scope-guard discussion linking** (imbue): `github-integration.md` extended with optional Step 4 that creates a companion Discussion with full scoring breakdown when deferring features ### Fixed - **Minister playbook broken CLI commands** (minister): Replaced non-functional `gh discussion create/comment/list` CLI calls with working GraphQL mutations in `github-program-rituals.md` and `release-train-health.md` ### Changed - **TDD gate relaxed for markdown modules** (imbue): `tdd_bdd_gate.py` no longer gates `.md` files in `modules/` and `commands/` directories; these are agent instruction documents tested by `abstract:skills-eval`, not pytest. `SKILL.md` files remain gated. - **Claude Code compatibility updates 2.1.41–2.1.47** (abstract, conserve, sanctum, conjure, leyline, scribe): Documented 7 new Claude Code releases: - 2.1.47: `last_assistant_message` hook field, background agent transcript fix, parallel file write resilience, plan mode compaction fix, bash permission validation, concurrent agent streaming fix, memory improvements, Edit tool unicode fix, LSP gitignore filter - 2.1.46: Claude.ai MCP connectors, macOS orphan process fix - 2.1.45: Sonnet 4.6, plugin hot-loading, subagent skill compaction fix, background agent crash fix, SDK rate limit types - 2.1.44: ENAMETOOLONG fix, auth refresh fix - 2.1.43: AWS auth timeout, agents dir warning fix, structured-outputs header fix - 2.1.42: Deferred schema init, prompt cache improvement, /resume interrupt title fix - 2.1.41: `claude auth` CLI, `/rename` auto-name, streaming notifications, plan mode tick fix, permission rule refresh - **Agent worktree isolation** (abstract, attune, conserve, pensive, sanctum, spec-kit): Added `isolation: worktree` frontmatter to 6 agents: skill-improver, project-implementer, unbloat-remediator, code-refiner, dependency-updater, workflow-improvement-implementer. Documented worktree isolation patterns in conjure agent-teams, delegation-core, conserve subagent-coordination, and sanctum parallel-execution. - **Agent background execution** (conserve, scribe): Added `background: true` to 4 agents: ai-hygiene-auditor, bloat-auditor, doc-verifier, slop-hunter. Documented background agent MCP restriction (background agents cannot use MCP tools). - **Learnings-to-Discussions pipeline** (abstract): `aggregate-logs` command extended with Phase 6a (post learnings to Discussions) and linked to new `/promote-discussions` command (Phase 6c). - **Configuration change audit hook** (sanctum): `config_change_audit.py` ConfigChange hook logs all settings mutations (user, project, local, policy, skills) to stderr for security audit trail; observe-only, never blocks changes. ## [1.4.3] - 2026-02-15 ### Added - **Self-adapting skill system** (abstract) - Closed-loop homeostatic monitoring with 6 components: - `homeostatic_monitor.py` PostToolUse hook reads stability gap metrics from memory-palace and queues degrading skills for improvement - `improvement_queue.py` module with flag/trigger logic (3+ flags auto-triggers skill-improver agent) - `skill_versioning.py` manages YAML frontmatter versions on skill modifications - `experience_library.py` stores successful execution trajectories for future skill context - `rollback_reviewer.py` creates human-gated GitHub issues when skill regressions are detected - Evaluation window tracks next 10 executions after improvement to detect regressions ### Fixed - **Continuation agent task duplication** (conserve) - Agents spawned via `Task` tool no longer create duplicate task list entries when resuming from session state checkpoints - **Benchmark pytest marker** (sanctum) - Registered `benchmark` marker in `pyproject.toml` to eliminate `PytestUnknownMarkWarning` ## [1.4.2] - 2026-02-11 ### Added - **Mission orchestrator** (attune) - `/attune:mission` command and `mission-orchestrator` skill wrap the full brainstorm→specify→plan→execute lifecycle into a single resumable command with artifact-based state detection, phase routing, and session recovery via `leyline:damage-control` - **Damage control** (leyline) - `damage-control` skill provides agent-level error recovery for multi-agent coordination, covering crash recovery, context overflow, and merge conflict resolution - **Risk classification** (leyline) - `risk-classification` skill implements 4-tier inline risk routing (GREEN/YELLOW/RED/CRITICAL) with heuristic matching for low-risk tasks and war-room-checkpoint escalation for high-risk decisions - **Agent teams coordination** (conjure) - New `crew-roles` module defines agent role taxonomy and `health-monitoring` module adds health checks for multi-agent workflows ### Fixed - **Python 3.9 compatibility** (memory-palace) - Lowered hooks `requires-python` from `>=3.10` to `>=3.9` and updated mypy/ruff targets to match macOS system Python ### Changed - Claude Code Compatibility Updates (abstract, conserve, sanctum, conjure) - **Compatibility tracker update** - Documented Claude Code 2.1.38 and 2.1.39 features and fixes - 2.1.39: Nested session guard, hook exit code 2 stderr fix, agent teams model fix for Bedrock/Vertex/Foundry, MCP image streaming crash fix, OTel speed attribute, terminal rendering improvements, fatal error display, process hang fix - 2.1.38: Heredoc delimiter parsing hardened against command smuggling (security), sandbox blocks writes to `.claude/skills` directory (security), bash permission matching for env variable wrapper commands (`KEY=val cmd`), VS Code terminal scroll-to-top regression fix, tab key autocomplete restored, text between tool uses preserved in non-streaming mode, VS Code duplicate sessions on resume fixed - **Hook authoring updates** (abstract) - Added hook exit codes reference table, exit code 2 blocking documentation with examples and troubleshooting; bash permission matching notes for env var wrappers; heredoc delimiter security guidance; sandbox `.claude/skills` write restriction awareness - **Agent teams updates** (conjure) - Added nested session guard explanation to spawning patterns, Bedrock/Vertex/Foundry model fix note, provider compatibility troubleshooting - **Security patterns update** (docs) - Added heredoc smuggling and skills directory injection to threat model table - **Session management** (sanctum) - Added VS Code duplicate sessions troubleshooting section - **Sanctum conventions** - Updated HEREDOC guidance to recommend single-quoted delimiters with 2.1.38 security context - **Skill authoring** (abstract) - Added sandbox compatibility checklist item for skills that create/modify other skills at runtime ## [1.4.1] - 2026-02-07 ### Changed - Claude Code Compatibility Updates (abstract, conserve, sanctum) - **Compatibility tracker update** - Documented Claude Code 2.1.20 through 2.1.34 features and fixes - Background agent permissions, session resume compaction, Task tool metrics - Agent teams experimental feature (2.1.32+), sub-agent spawning restrictions (2.1.33+) - Sandbox permission bypass fix (2.1.34), auto-compact threshold fix (2.1.21+) - Variable context windows for Opus 4.6 (200K standard, 1M beta) - **Session management enhancements** (sanctum) - Documented PR-linked sessions, resume hints on exit, automatic memory, agent persistence on resume, and multiple resume reliability fixes - **Token conservation updates** (conserve) - Added PDF reading with `pages` parameter, "Summarize from here" partial compaction, automatic memory awareness, native tool preference over bash equivalents - **Subagent coordination** (conserve) - Documented Task tool metrics, TaskStop display improvements, agent teams vs Task tool comparison, and sub-agent spawning restrictions ### Changed - Attune Command Rename - **`/attune:plan` renamed to `/attune:blueprint`** - All cross-references in brainstorm, execute, specify, do-issue, and fix-pr commands updated ### Changed - Description Optimization (hookify) - **Shorter command descriptions** - Removed redundant plugin name prefixes from hookify command descriptions to stay within token budgets ### Changed - Hook Reliability (conserve, imbue) - **Inlined JSON utilities** in conserve session-start hook - Eliminates broken relative path when plugin runs from Claude Code cache directory - **Native tool preference notes** added to bloat-detector and ai-hygiene-auditor agents ### Changed - TodoWrite Patterns (sanctum) - **Task deletion documentation** (Claude Code 2.1.20+) - When and how to delete completed workflow items, including the 2.1.21 ID reuse fix ### Removed - **Shared SKILL.md wrappers** deleted from imbue, leyline, pensive, sanctum, scribe, and spec-kit - These infrastructure skill files added context overhead without providing value; shared modules remain available via direct module references ## [1.4.0] - 2026-02-05 ### Added - Attune Discoverability Enhancement (attune) - **Discoverability pattern** - All 20 attune components (9 skills, 9 commands, 2 agents) now feature enhanced automatic discovery through optimized descriptions - Pattern: `[WHAT]. Use when: [triggers]. Do not use when: [boundaries].` - "When To Use" and "When NOT To Use" sections guide users to correct components - Trigger keyword optimization (5-10 keywords per component) - Explicit boundaries prevent false positive matches - **Contributor templates** - Reusable templates in `plugins/attune/templates/` - `skill-discoverability-template.md` - For skills - `command-discoverability-template.md` - For commands - `agent-discoverability-template.md` - For agents - `TEMPLATE-GUIDE.md` - Quick reference with YAML quoting rules and token budgets - **Pre-commit validation** - Hook validates discoverability metadata on commit ### Changed - Token Budget Management - **Description optimization** - Attune descriptions total 3,920 chars (196 avg per component) - Skills: 100-150 chars target - Commands: 50-100 chars target - Agents: 75-125 chars target - Well under 20% of ecosystem's 15,000 char budget ## [1.3.8] - 2026-02-01 ### Changed - Conserve Plugin Improvements - **Test fixture consolidation** - Unified test fixtures and synced emergency workflow - **Context measurement documentation** - Documented precise vs fast context measurement methods - **Headless context reading** - Added documentation for headless context reading feature - **Clear-context cleanup** - Removed Task tool references from clear-context skill - **Context warnings cleanup** - Removed impossible Task tool references from context warnings ## [1.3.7] - 2026-01-28 ### Changed - File & Code Cleanup - **Script filename normalization** - Renamed script files from kebab-case to snake_case for Python module consistency across the ecosystem - **Dead wrapper detection** - `conserve:bloat-scan` and `unbloat` now detect dead wrapper scripts that simply delegate to other commands without adding value ### Removed - **Unused conserve services directory** - Cleaned up `conserve/services/` directory that contained no active code ## [1.3.6] - 2026-01-26 ### Added - Hookify Enhanced Tooling (hookify) - **Context-aware rule suggestions** - `rule_suggester.py` analyzes project patterns - Scans `.claude/` files, existing rules, and common tools - Suggests rules matching detected patterns (git workflows, linters, test runners) - Confidence scoring based on frequency and importance - Supports custom rule templates for project-specific needs - **Hook-to-hookify conversion tool** - `hook_to_hookify.py` migrates legacy hooks - Converts Claude Code hook scripts to hookify rule format - Extracts patterns, conditions, and actions from existing hooks - Preserves hook metadata (event type, matcher) in rule config - Generates equivalent hookify YAML rules ### Changed - Makefile Dogfooding - **Live demo requirement** - `/abstract:make-dogfood` now requires actual command execution - Replaced static `echo` statements with real command invocations - Validates that Makefile targets produce meaningful output - Better testing of actual tool integration ### Improved - Attune Task Integration - **Project execution skill** - Enhanced task workflow documentation - Improved TasksManager integration guidance - Clearer phase transitions and checkpoint handling ## [1.3.5] - 2026-01-25 ### Fixed - Context Handoff Execution Mode - **Execution mode propagation** - Context handoffs now preserve `--dangerous` mode - Session state captures execution mode (interactive/unattended/dangerous) - Continuation agents inherit `auto_continue` flag from parent - Batch operations (e.g., `/do-issue 42 43 44 --dangerous`) complete fully - Fixes workflow stopping unexpectedly at 80% context handoffs - **Updated components**: - `conserve:clear-context` skill - Added execution mode detection and propagation - `conserve:continuation-agent` - Respects execution mode, continues without prompts - `sanctum:do-issue` command - Added `--dangerous` flag documentation - Session state module - Added execution_mode metadata (v1.1) ### Changed - Attune Command Naming - **Command rename** - `/attune:init` renamed to `/attune:project-init` for consistency with skill naming - Matches the underlying `project-init` skill name - Clearer distinction from `/attune:arch-init` - **New command** - `/attune:arch-init` now available as documented in README - Previously referenced in docs but command file was missing - Invokes `architecture-aware-init` skill for research-based project setup ### Added - Conserve Context Optimization - **Fallback context estimation** - `conserve` now gracefully handles missing `CLAUDE_CONTEXT_USAGE` environment variable - Provides alternative estimation methods when native context tracking unavailable - Maintains functionality across different Claude Code versions - Improves robustness of context optimization workflows - **Batch implementation** - Implemented issues #55-68 for conserve plugin - Multiple enhancements to bloat detection and context optimization - Updated token estimation scripts with better accuracy - Enhanced MCP code execution skills ### Added - Attune War Room Checkpoint - **War-room-checkpoint skill** - New embedded escalation support for attune - Enables checkpoint creation during war-room sessions - Supports resuming complex troubleshooting workflows - Improves session management for long-running debugging tasks ### Improved - Documentation Quality - **Token efficiency improvements** - Reduced verbosity across guides and READMEs - Condensed documentation to save tokens while maintaining clarity - Added Table of Contents to 19 skills over 100 lines - Removed AI slop from commands and agents - Improved clarity in all plugin guides - **Skills infrastructure** - Enhanced skill evaluation and testing - Fixed script paths in skills-eval - Updated PATH to TARGET naming to avoid environment conflicts - Restored frontmatter and version fields across all skills ### Fixed - Dependency Management - **Optional dependencies** - Moved tiktoken and leyline to optional dependencies - Reduces base installation footprint - Allows selective installation of resource-heavy packages - Improves startup performance for core functionality - **Tasks manager consolidation** - Differentiated tasks_manager.py per plugin - Each plugin now has its own specialized tasks manager - Better encapsulation of plugin-specific workflow logic - Maintained consistency across implementations ### Improved - Sanctum Git Workflows - **Module auditing** - Enhanced update-plugins command with module auditing - Validates plugin structure before updates - Detects and reports configuration issues - Improves reliability of bulk plugin operations ## [1.3.4] - 2026-01-23 ### Analyzed - Claude Code 2.1.16 Compatibility - **Subagent memory fixes** - Critical out-of-memory crashes when resuming sessions with heavy subagent usage are now fixed - Addresses JavaScript heap OOM errors (Issue #19100) - Fixes subagent process cleanup (Issue #19045) - Resolves memory leaks during initialization (Issue #7020) - **Impact**: All our subagent-heavy workflows (attune, spec-kit, sanctum) are now more stable - **New Tasks system** - Claude Code 2.1.16 introduces a proper task management system - **Task tools**: `TaskCreate`, `TaskList`, `TaskGet`, `TaskUpdate`, `TaskOutput` - **Shared state**: `CLAUDE_CODE_TASK_LIST_ID` env var shares tasks across sessions - **Dependency tracking**: Tasks can define prerequisites - **Background execution**: Tasks run independently with `TaskOutput` to retrieve results - **Impact**: Potential replacement for custom state files (`.attune/execution-state.json`) - **Exploration document**: `docs/exploration/tasks-integration-exploration-2026-01.md` - **Context management improvements** - "/compact" warning now hides properly, more predictable cleanup - **Native plugin management** - VSCode now has native plugin support (we're already compatible) **Assessment**: No breaking changes required. Our plugins work better with Claude Code 2.1.16. ### Added - Tasks Integration POC (attune) - **TasksManager** (`plugins/attune/scripts/tasks_manager.py`) - Core implementation for Claude Code Tasks integration - Lazy task creation (tasks created when execution reaches them, not upfront) - Ambiguity detection with user prompts (multiple components, cross-cutting concerns, large scope, circular dependencies) - Dual-mode operation: native Tasks API (2.1.16+) with file-based fallback - Dependency tracking and enforcement - Resume detection for interrupted sessions - **BDD Test Suite** (`plugins/attune/tests/test_tasks_integration.py`) - 24 behavior-driven tests covering all TasksManager features - Tests for availability detection, lazy creation, ambiguity, dual-mode, resume, and dependencies - **Integration with `/attune:execute`** - Updated `commands/execute.md` with Tasks integration documentation - Updated `agents/project-implementer.md` with TasksManager workflow - **Integration Test Results**: - Claude Code 2.1.17 detected, Tasks available - Lazy creation verified (tasks created on-demand) - Dependency enforcement working (blocked until prerequisites complete) - Ambiguity detection functional (cross-cutting concerns flagged) **Branch**: `attune-tasks-poc` ### Added - Tasks Integration (spec-kit, sanctum) - **spec-kit integration** (`plugins/spec-kit/scripts/tasks_manager.py`) - Updated `speckit-implement.md` with Tasks integration - Progress tracking for multi-phase implementation workflows - Cross-session state via `CLAUDE_CODE_TASK_LIST_ID="speckit-{project}"` - **sanctum integration** (`plugins/sanctum/scripts/tasks_manager.py`) - Updated `fix-pr.md` with Tasks integration - Workflow step tracking (analyze → triage → plan → fix → validate → complete) - Resume interrupted PR fix workflows across sessions - Cross-session state via `CLAUDE_CODE_TASK_LIST_ID="sanctum-fix-pr-{pr_number}"` **Next Steps**: - [ ] Real-world testing with actual Claude Code Tasks tools - [ ] Create PR for review **Documentation**: - `docs/claude-code-2.1.16-impact-analysis.md` - Comprehensive analysis - `docs/claude-code-2.1.16-quick-reference.md` - Quick reference guide **Testing Recommendations** (see `docs/testing/subagent-stability-testing-2026-01.md`): - [ ] Test attune `/attune:execute` with heavy subagent usage - [ ] Test spec-kit multi-phase workflows - [ ] Test sanctum `/fix-pr` with complex PRs - [ ] Verify subagents terminate correctly **Testing Infrastructure Created**: - `docs/testing/subagent-stability-testing-2026-01.md` - Comprehensive test plan with checklists **Sources**: - [Claude Code 2.1.16 Changelog](https://code.claude.com/docs/en/changelog) - [Subagent Memory Issues](https://github.com/anthropics/claude-code/issues/19100) ### Updated - Superpowers & Spec-Kit Integration Documentation - **Documentation alignment** - Updated all references to match superpowers v4.1.0 and spec-kit v0.0.90 - Replaced obsolete skill references with consolidated skill names - Added comprehensive update notes to integration guides - Documented new superpowers capabilities (two-stage review, DOT flowcharts, etc.) - No breaking changes for claude-night-market plugins - **Skill reference updates** - `superpowers:defense-in-depth` → bundled in `superpowers:systematic-debugging` - `superpowers:condition-based-waiting` → bundled in `superpowers:systematic-debugging` - `superpowers:testing-anti-patterns` → bundled in `superpowers:test-driven-development` - **New documentation** - `docs/superpowers-spec-kit-update-analysis.md` - Comprehensive analysis of changes - `docs/superpowers-spec-kit-update-summary.md` - Summary of implemented updates - Updated integration guides with version information and breaking changes **Files modified:** - `plugins/sanctum/docs/superpowers-integration-guide.md` - `plugins/abstract/skills/skills-eval/modules/integration-testing.md` - `docs/superpowers-integration.md` - `docs/guides/superpowers-integration.md` **Impact:** Documentation only - no code changes required. All plugins remain fully compatible with latest superpowers and spec-kit versions. **Sources:** - [Superpowers v4.1.0 Release](https://github.com/obra/superpowers/releases/tag/v4.1.0) - [Spec-Kit v0.0.90 Release](https://github.com/github/spec-kit/releases/tag/v0.0.90) ## [1.3.3] - 2026-01-23 ### Added - Session Complete Notifications (sanctum) - **`session_complete_notify.py` hook** - Cross-platform toast notification when Claude awaits input - **Linux**: Uses `notify-send` (libnotify) - **macOS**: Uses `osascript` display notification with Glass sound - **Windows**: Uses PowerShell toast API with BurntToast fallback - **WSL**: Detects WSL and routes to Windows PowerShell - **Terminal context**: Shows Zellij/tmux session, terminal program, or project name - **Non-blocking**: Spawns background process to avoid hook timeout - **Disable**: Set `CLAUDE_NO_NOTIFICATIONS=1` to disable - **Test coverage**: 35 BDD tests covering all platforms and edge cases ### Added - Testing Infrastructure Improvements - **Conditional test output verbosity** - Test script now shows full error details on failure - Success: Brief `✓ Tests passed` message with quiet output - Failure: Automatic re-run with verbose output for debugging - No need to manually re-run failed tests to see errors - File: `scripts/run-plugin-tests.sh` - **Mutation testing targets** - Added to 7 plugins for test quality validation - Plugins: abstract, conserve, imbue, leyline, memory-palace, minister, conjure - Usage: `make mutation-test` (requires `pip install mutmut`) - Graceful fallback when mutmut not installed - **Performance benchmarking targets** - Added to 4 performance-sensitive plugins - Plugins: conjure (API latency), memory-palace (memory ops), leyline (token estimation), conserve (bloat detection) - Usage: `make benchmark` (requires `pip install pytest-benchmark`) - Includes `memory-profile` target placeholder for memory profiling ### Fixed - Testing Configuration - **pytest-cov path mapping** (abstract) - Coverage now reports accurately - Changed `--cov=src/abstract` to `--cov=abstract` to match import structure - Coverage report now shows 68% instead of 0% - File: `plugins/abstract/pyproject.toml` ### Verified - Existing Test Coverage - **sanctum doc-updates workflow** - 25 integration tests confirmed passing - Phase 2.5: Consolidation detection (bloat, untracked reports) - Phase 5: Accuracy verification (versions, counts) - Directory-specific rules (docs/ strict, book/ lenient) - **scry content validation** - 37 tests confirmed passing - Skill frontmatter schema validation - Agent capabilities validation - Command parameter documentation validation - **scry VHS→GIF workflow** - 11 integration tests confirmed passing - VHS recording with `@pytest.mark.requires_vhs` - FFmpeg conversion with `@pytest.mark.requires_ffmpeg` - Full pipeline with graceful dependency skipping ### Documentation - Updated `docs/testing-guide.md` with: - New Advanced Testing section (mutation, benchmarking, memory profiling, integration) - Updated test coverage table (938+ tests across 12 plugins) - Conditional verbosity documentation - Integration testing patterns - Updated `plugins/sanctum/README.md` with: - New Hooks section documenting all 4 sanctum hooks - Session Complete Notifications subsection - Added hooks/ directory to Plugin Structure ## [1.3.2] - 2026-01-22 ### Enhanced - Claude Code 2.1.14+ Compatibility (conserve) - **Parallel subagents**: Added version compatibility note to `mcp-coordination.md` documenting improved stability with Claude Code 2.1.14+ memory fixes - **Continuation-agent**: Benefits from upstream memory leak fix for long-running sessions (stream resources now properly cleaned up after shell commands) - **MECW guidance unchanged**: Our 50% quality threshold is independent of the 65%→98% blocking fix (quality vs operational limits) ## [1.3.1] - 2026-01-21 ### Added - Hookify v1.1.0 (hookify) - **block-destructive-git rule**: Blocks dangerous git commands that cause irreversible data loss - `git reset --hard` - Destroys all uncommitted changes - `git checkout -- .` - Discards all unstaged changes - `git clean -fd` - Permanently deletes untracked files - `git stash drop` - Permanently deletes stashed changes - `git branch -D` - Force-deletes branches (even unmerged) - `git reflog expire` / `git gc --prune` - Destroys recovery points - **warn-risky-git rule**: Warns about git operations that modify history - `git reset` (soft/mixed) - Moves HEAD, may unstage files - `git checkout -- ` - Replaces file from another branch - `git rebase -i` / `git rebase --onto` - Rewrites commit history - `git cherry-pick/merge/am --abort` - Discards in-progress operations - **Recovery-first guidance**: Each blocked command shows diagnostic commands to review changes before discarding - **Safer alternatives**: Comprehensive alternative workflows (stash, backup branches, selective operations) ### Removed - Command Deduplication (sanctum) - **Removed `sanctum:skill-review`** - Duplicate of `pensive:skill-review` and `abstract:skill-auditor` - `pensive:skill-review` handles runtime metrics (execution counts, stability gaps) - `abstract:skill-auditor` handles static quality analysis - Updated cross-references in `update-plugins.md` and `skill-logs.md` ### Added - Attune v1.2.0 War Room (attune) - **`/attune:war-room` command**: Convene expert panel for strategic decisions - **`Skill(attune:war-room)`**: Full deliberation skill with 7 phases - **Expert panel**: 7 specialized AI roles (Supreme Commander, Chief Strategist, etc.) - **Deliberation phases**: Intel, Assessment, COA Development, Red Team, Voting, Premortem, Synthesis - **Merkle-DAG anonymization**: Contributions anonymized during deliberation, unsealed after decision - **Borda count voting**: Rank-based aggregation for fair expert voting - **Escalation logic**: Automatic escalation from lightweight to full council on complexity - **War Room Modules**: - `modules/expert-roles.md` - Expert panel configuration and invocation patterns - `modules/deliberation-protocol.md` - Phase definitions and flow control - `modules/merkle-dag.md` - Anonymization and integrity verification - **Strategeion**: Dedicated Memory Palace chamber for war council sessions - **Conjure delegation**: Expert dispatch via conjure delegation framework ### Added - War Room Multi-LLM Deliberation (attune/conjure) - **War Room framework** - Multi-LLM expert council for strategic decisions - 7 deliberation phases: Intel, Assessment, COA Development, Red Team, Voting, Premortem, Synthesis - Expert panel: Supreme Commander (Opus), Chief Strategist (Sonnet), Intelligence Officer (Gemini Pro), Field Tactician (GLM-4.7), Scout (Qwen Turbo), Red Team Commander (Gemini Flash), Logistics Officer (Qwen Max) - Lightweight mode (3 experts) with auto-escalation to full council (7 experts) - Merkle-DAG anonymization: contributions anonymized during deliberation, unsealed after decision - Borda count voting for fair expert aggregation - **`war_room_orchestrator.py`** (conjure) - Async orchestration engine - Parallel expert dispatch with timeout handling - Session persistence to Strategeion (Memory Palace war chamber) - Graceful degradation on expert failures - Full test coverage (17 tests) - **War Room skill and command** (attune) - `Skill(attune:war-room)` - Full deliberation skill - `/attune:war-room` - Command interface with options: `--full-council`, `--delphi`, `--resume` - Modular design: expert-roles, deliberation-protocol, merkle-dag modules - **Phase 3: Strategeion persistence** - Enhanced session persistence with organized subdirectories - `intelligence/` - Scout and Intel Officer reports - `battle-plans/` - COA documents from all experts - `wargames/` - Red Team challenges and premortem analyses - `orders/` - Final Supreme Commander decision - Session archiving to `campaign-archive/{project}/{date}/` - MerkleDAG reconstruction on session load - **Phase 4: Delphi mode and hook triggers** - `convene_delphi()` - Iterative convergence until expert agreement - Convergence scoring based on Borda count spread - Hook auto-trigger detection with keyword analysis - Configurable complexity threshold (default 0.7) ## [1.3.0] - 2026-01-19 ### Added - Scribe Plugin - **New scribe plugin** - Documentation quality and AI-generated content detection - `doc-verifier` agent - Multi-pass documentation QA with AI slop detection - AI slop pattern detection: marketing speak, buzzwords, excessive enthusiasm - Quality checks: completeness, accuracy, consistency, examples, clarity - Multi-tier review: quick scan → detailed analysis → slop detection - Integration with `/sanctum:pr-review` and `/sanctum:update-docs` workflows ### Enhanced - Sanctum Workflows - **Scribe integration** - Documentation review in PR and doc update workflows - `/pr-review` now includes scribe's doc-verifier for documentation changes - `/update-docs` invokes scribe for AI-generated content detection - Hooks.json indirect reference resolution in plugin auditor - Zellij session detection for smarter notifications - GitHub API PATCH method for PR description updates ### Enhanced - Plugin Quality Tools - **Conserve plugin** - AI hygiene auditing and duplication detection - AI-generated bloat patterns module - Duplicate detection script for codebase hygiene - AI hygiene auditor agent for proactive quality checks - Agent psychosis and codebase hygiene documentation - **Memory Palace** - Knowledge corpus validation and semantic memory - ACE Playbook semantic memory integration - Knowledge corpus validation script with automated testing - Cargo cult programming prevention documentation ### Documentation - Added scribe plugin documentation to book - Updated capability references with scribe agents, commands, and skills - Enhanced plugin domain specialists section - Updated sanctum PR workflow and README update documentation ## [1.2.9] - 2026-01-16 ### Added - PermissionRequest Hook (conserve) - **`permission_request.py` hook** - Workflow automation via auto-approve/deny (Claude Code 2.0.54+) - Auto-approve safe patterns: `ls`, `cat`, `head`, `tail`, `grep`, `rg`, `find`, `git status/log/diff` - Auto-deny dangerous patterns: `rm -rf /`, `sudo`, `curl | bash`, `git push --force main` - Security model: denylist checked first, then allowlist, unknown shows dialog - Issue #55: PermissionRequest hooks for workflow automation - **`test_permission_request.py`** - Test coverage for hook patterns - Dangerous pattern detection tests - Safe pattern approval tests - Unknown command dialog tests ### Added - Session Management Skill (sanctum) - **`session-management` skill** - Named session workflows (Issue #57) - `/rename` - Name current session for later resumption - `/resume` - Resume previous sessions from REPL or terminal - Patterns: debugging sessions, feature checkpoints, PR reviews, investigations - Best practices: naming conventions, session cleanup ### Documentation - Updated conserve README with Hooks section documenting PermissionRequest - Updated sanctum README with session-management skill - Added "Permission Automation" to main README Key Features ## [1.2.8] - 2026-01-15 ### Added - Interactive Authentication System (leyline) - **`interactive-auth.sh` module** - Centralized OAuth authentication for external services - Multi-service support: GitHub, GitLab, AWS, GCP, Azure - Token caching with 5-minute TTL, session persistence with 24-hour TTL - Interactive OAuth prompts with CI/CD detection and automatic fallback - Retry logic with exponential backoff (max 3 attempts) - Wrapper functions: `gh_with_auth`, `glab_with_auth`, `aws_with_auth` - **`authentication-patterns` skill enhancements** - New `interactive-auth.md` module (634 lines) - comprehensive implementation guide - New `workflow-integration.md` examples - real-world integration patterns - New `README.md` quick-start guide (457 lines) - Shell test suite: 14 tests covering syntax, functions, caching, sessions - **`docs/guides/authentication-integration.md`** - Implementation summary and usage guide ### Added - Anti-Cargo-Cult Framework (imbue) - **`anti-cargo-cult.md` shared module** - Understanding verification protocols - The Five Whys of Understanding framework - Understanding Checklist for code review - Recovery Protocol for cargo cult code - Integration with proof-of-work and rigorous-reasoning skills - **Fourth Iron Law: NO CODE WITHOUT UNDERSTANDING** - Added to `iron-law-enforcement.md` module - Integration with proof-of-work TodoWrite items - Expanded red flags for cargo cult patterns - **Enhanced `red-flags.md`** (+184 lines) - New "Cargo Cult" Family section - AI suggestion patterns, best practice patterns - Copy-paste without understanding patterns - **`rigorous-reasoning` cargo cult patterns** (+21 lines) - Cargo cult reasoning detection in conflict analysis - Pattern recognition for "best practice" and "everyone does it" justifications - **Knowledge corpus**: `cargo-cult-programming-prevention.md` (memory-palace) ### Added - Test Coverage Improvements (leyline) - **`test_tokens.py`** (+10 tests) - JSON/default ratio handling, OSError handling, directory walking - tiktoken encoder paths, file encoding success/error cases - Coverage: 60% → 94% - **`test_quota_tracker.py`** (+15 tests) - Status levels (healthy/warning/critical), RPM/daily/token warnings - All `can_handle_task` edge cases, storage persistence, CLI modes - Coverage: 74% → 97% - **`test_anti_cargo_cult.py`** (imbue) - 16 BDD tests - Module structure and content validation - Cross-skill integration verification - Red flags content validation ## [1.2.7] - 2026-01-14 ### Added - New Skills (Issues #39, #40) - **`imbue:workflow-monitor` skill (skeleton)** - Monitor workflow executions for errors and inefficiencies - Detection patterns for command failures, timeouts, retry loops, context exhaustion - Efficiency detection for verbose output, redundant reads, sequential vs parallel - Issue templates for automatic GitHub issue creation - Configuration: severity thresholds, auto-create toggle, efficiency metrics - **`pensive:fpf-review` skill (skeleton)** - FPF (Functional Programming Framework) architecture review - Three-perspective analysis: Functional, Practical, Foundation - Feature inventory and capability gap identification - Performance assessment and usability evaluation - Pattern recognition and technical debt assessment - Structured report generation - **`imbue:rigorous-reasoning` skill** - Prevent sycophantic reasoning through checklist-based analysis - Priority signals: no courtesy agreement, checklist over intuition, categorical integrity - Conflict analysis protocol with harm/rights checklist - Red flag self-monitoring for sycophantic patterns - Debate methodology for truth claims in contested territory - Modules: priority-signals, conflict-analysis, engagement-principles, debate-methodology, correction-protocol, incremental-reasoning, pattern-completion ### Added - Feature Review Tests (Issue #41) - **`test_feature_review.py`** - 25 comprehensive tests for feature-review skill - Scoring framework tests: value/cost calculation, weighted scores, priority thresholds - Classification system tests: proactive/reactive, static/dynamic, 2x2 matrix - Kano classification tests: basic, performance, delighter categories - Tradeoff dimension tests: minimum dimensions, scale validation - Integration tests: issue title format, suggestion labels, backlog limits ### Fixed - Defensive .get() Usage (Issue #44) - **`compliance.py`** - Consistent defensive dict access - Changed `self.rules["max_tokens"]` to `self.rules.get("max_tokens", 4000)` - Changed `self.rules["required_fields"]` to `self.rules.get("required_fields", [...])` - Added 3 tests: partial rules, malformed rules, empty rules file handling - Prevents KeyError when rules file is incomplete or malformed ### Enhanced - `/fix-pr` Workflow (Issues #46, #47, #48, #49) - **Context budget tracking** (Issue #46) - Added context_budget configuration with warn/checkpoint/mandatory thresholds - Context usage warnings at 50%, 70%, 90% - Mandatory phases (3.5, 4, 6) that must not be skipped - Checkpoint/resume pattern documentation - **Thread vs comment distinction** (Issue #47) - Added section 1.5: Review Feedback Type Detection - Decision tree for PRRT_* threads vs general comments vs aggregated reviews - Guidance for handling reviews without line-specific thread IDs - **Triage output grouping** (Issue #48) - Added section 2.0: Triage Output Format - Four categories: Fix Now, This PR, Backlog, Skip - Actionable table format with IDs, issues, files, rationale - **`--continue` flag** (Issue #49) - Resume from last incomplete phase - Phase completion markers for detection - Example resume scenarios ### Enhanced - `/update-docs` Output Examples (Issue #51) - **Added Output Examples section** to update-docs.md - Consolidation Detection Output: untracked reports, bloated files tables - Accuracy Verification Output: version/count mismatch warnings - Style Violation Output: directory-specific rules, filler phrase detection ## [1.2.6] - 2026-01-13 ### Added - Self-Improvement Patterns and Research (Issues #7, #37) - **`/fix-workflow` enhanced with self-improvement patterns** - Added Phase 1.5: Reflexion (self-critique loop before implementation) - Added Plan-Do-Check-Act (PDCA) cycle to implementation phase - Added difficulty-aware orchestration (simple/medium/complex) - Added Phase 2: Outcome feedback loop for self-evolution - New `--difficulty auto|simple|complex` flag - Complexity scoring based on files affected, cross-plugin changes, prior failures - **ACE Playbook research for memory-palace** (Issue #7) - Documented semantic deduplication patterns via FAISS (0.8 cosine threshold) - Captured Generator-Reflector-Curator triad architecture - Identified counter-based reinforcement pattern (helpful/harmful/neutral) - Recommended per-room semantic indices for domain isolation - Research document: `memory-palace/docs/knowledge-corpus/queue/ace-playbook-research.md` ### Added - Plugin Root Validation (Issue #34) - **imbue_validator now warns about missing/invalid plugin roots** - Logs warning when plugin root directory doesn't exist - Logs warning when plugin root is empty - Logs warning when plugin lacks expected structure (no skills/ or plugin.json) - New properties: `root_exists`, `root_empty`, `has_valid_structure` - 4 new tests covering all notification scenarios ### Added - MECW Optimization Implemented (Issues #28, #29) - **Module bloat consolidation executed** - `testing-hooks.md` reduced from 627 to 118 lines (81% reduction) - Examples archived to `docs/examples/hook-testing/comprehensive-examples.md` - Module now references archived examples via links - Token savings: ~500 lines per skill load ### Added - Counter-Based Reinforcement (ACE Enhancement) - **`counter_reinforcement.py` module** - ACE Playbook semantic memory pattern - `ReinforcementCounter` class with helpful/harmful/neutral counters - `CounterReinforcementTracker` for entry-level tracking - `helpfulness_ratio` and `confidence_score` computed properties - `needs_review` detection for problematic entries - `should_deduplicate()` using 0.8 cosine similarity threshold - Export/import for persistence - 22 new tests with full coverage ### Added - Minister Label Management - **`/update-labels` command** - Professional GitHub label taxonomy management - Creates type labels: `feature`, `bugfix`, `test`, `docs`, `refactor`, `performance`, `ci-cd`, `research` - Creates priority labels: `high-priority`, `medium-priority`, `low-priority` - Creates effort labels: `small-effort`, `medium-effort`, `large-effort` - Replaces catch-all `enhancement` label with specific types - Auto-classifies issues based on title patterns - Supports `--dry-run`, `--preserve`, `--repo` flags - Integrates with minister's issue lifecycle commands ### Added - Iron Law Interlock Enforcement - **`iron-law-interlock.md` shared module** - Hard gate enforcement for creation workflows - Transforms Iron Law from advisory to structural enforcement - Provides mandatory checklist before Write tool invocation - Requires test file creation BEFORE implementation - Captures RED state evidence as precondition - Defines exemption categories (documentation, configuration, user-directed) - Located at `abstract/shared-modules/iron-law-interlock.md` - **Updated `/create-command`** - Phase 0 now enforces Iron Law - Must create test file before command file - Must capture failing test evidence - Phase 6 added: GREEN state verification - TodoWrite items: `proof:iron-law-red`, `proof:iron-law-interlock-satisfied`, `proof:iron-law-green` - **Updated `/create-skill`** - Phase -1 Iron Law interlock added - Blocking gate before methodology curation and brainstorming - Quick reference for test-first workflow - Links to full interlock documentation - **Updated `/create-hook`** - Phase -1 Iron Law interlock added - Blocking gate before brainstorming - Quick reference for test-first workflow - Links to full interlock documentation - **Updated `imbue:proof-of-work`** - Cross-skill module reference - Added link to iron-law-interlock shared module - Connects TDD enforcement with creation workflows ## [1.2.5] - 2026-01-11 ### Added - Continuous Improvement Integration - **/update-plugins Phase 2: Automatic improvement analysis** - Plugin maintenance now includes performance review - Invokes `/skill-review` to identify unstable skills (stability_gap > 0.3) - Queries `/skill-logs` for recent failures and patterns - Checks git history for recurring fixes (instability signals) - Generates prioritized improvement recommendations (Critical/Moderate/Low) - Creates TodoWrite items for actionable improvements - **No flags required** - improvement analysis runs by default after registration audit - **/fix-workflow Phase 0: Improvement context gathering** - Retrospectives now leverage historical data - Queries skill execution metrics before starting analysis - Searches memory-palace review-chamber for related lessons - Analyzes git history for recurring patterns - Cross-references current friction with known failure modes - Prioritizes fixes for high stability_gap components - **Automatic by default** - no flags required - **sanctum:workflow-improvement skill enhancements** - Step 0 context gathering - New TodoWrite item: `fix-workflow:context-gathered` - New TodoWrite item: `fix-workflow:lesson-stored` - Checks `/skill-logs` for recent failures in workflow components - Queries memory-palace for workflow-related lessons - Analyzes git commit patterns for recurring issues - Step 7: Close the loop by storing lessons for future reference - Metrics comparison template for before/after validation - **Continuous improvement feedback loop** - Self-improving plugin ecosystem - `/update-plugins` identifies improvement opportunities - `/fix-workflow` implements improvements with historical context - Lessons stored in git history and memory-palace - Future runs reference past improvements - Reduces recurring issues through pattern learning - **imbue:proof-of-work integration with improvement workflows** - Validation for continuous improvement - New section: "With Improvement Workflows (`/update-plugins`, `/fix-workflow`)" - `/update-plugins` Phase 2 validation examples with evidence format - `/fix-workflow` Phase 0 validation examples for data source verification - `/fix-workflow` Step 7 validation for measuring improvement impact - Updated triggers: "improvement validated", "workflow optimized", "performance improved" - Ensures improvement claims are backed by quantitative metrics - **Test coverage for continuous improvement integration** - Comprehensive test suite - New test file: `plugins/sanctum/tests/test_continuous_improvement.py` - 8 test cases covering all integration points - Tests Phase 2 and Phase 0 documentation - Tests workflow-improvement skill enhancements - Tests proof-of-work integration - Tests CHANGELOG and documentation completeness - Tests infrastructure accessibility - All tests passing with 100% success rate ### Added - Claude Code 2.1.4 Compatibility (2026-01-11) - **`CLAUDE_CODE_DISABLE_BACKGROUND_TASKS` environment variable** - Documented for CI/CD use cases - Disables auto-backgrounding and `Ctrl+B` shortcut - Useful for CI/CD pipelines, debugging, deterministic test environments - Does not affect Python subprocess spawning or asyncio tasks in hooks ### Added - Claude Code 2.1.3 Compatibility (2026-01-11) - **Compatibility documentation for Claude Code 2.1.3** - Full documentation of new features and fixes - **Skills/Commands Merge**: Skills now appear in `/` menu alongside commands (no behavior change) - **Subagent Model Fix**: Model specified in agent frontmatter now respected during context compaction - **Web Search Fix**: Subagent web search now uses correct model - **Hook Timeout**: Extended from 60 seconds to 10 minutes (enables CI/CD and complex validation) - **Permission Diagnostics**: `/doctor` now detects unreachable permission rules - **Plan File Fix**: Fresh plan files after `/clear` commands - **ExFAT Compatibility**: Fixed skill duplicate detection on large inode filesystems - **Updated hook-authoring skill** - Timeout guidance updated for 10-minute limit - Best practice: Aim for < 30s for typical hooks - Extended time available for CI/CD integration, complex validation, external APIs - **Updated compatibility reference** - Version matrix includes 2.1.3+ as recommended - All 29 ecosystem agents verified to have `model:` specification (benefits from subagent fix) - No breaking changes - existing plugin.json structure remains valid ### Added - Claude Code 2.1.2 Compatibility (2026-01-11) - **Agent-aware SessionStart hooks** - Hooks now leverage `agent_type` input field - `sanctum/hooks/post_implementation_policy.py` - Skips governance for review agents - `conserve/hooks/session-start.sh` - Abbreviated context for lightweight agents - `imbue/hooks/session-start.sh` - Minimal scope-guard for review/optimization agents - Pattern: Read JSON from stdin, check `agent_type`, customize context injection - Reduces context overhead by ~200-800 tokens for non-implementation agents - **SessionStart input schema documentation** - Updated skill documentation - `abstract:hook-authoring` - Documents `agent_type`, `source`, `session_id` fields - `abstract:hook-scope-guide` - Explains agent-aware hook patterns - Includes Python and Bash examples for reading hook input - **Large output persistence documentation** - Notes on behavior change - `conserve:context-optimization` - Documents disk-based output storage - Best practices for leveraging full output access without context bloat - **`FORCE_AUTOUPDATE_PLUGINS` environment variable** - Documented for developers - Forces plugin auto-update even when main auto-updater disabled - Useful for CI/CD pipelines and controlled update rollouts ### Changed - Hook input reading uses non-blocking patterns (`read -t 0.1` in bash) - Backward compatible: gracefully handles missing stdin from older Claude Code versions ### Added - Iron Law TDD Enforcement (2026-01-11) - **New `iron-law-enforcement.md` module** - Comprehensive TDD enforcement patterns - Defines the Iron Law: "NO IMPLEMENTATION WITHOUT A FAILING TEST FIRST" - Prevents "Cargo Cult TDD" where tests validate pre-conceived implementations - Five enforcement levels: self-enforcement, adversarial verification, git history analysis, pre-commit hooks, coverage gates - Self-check protocol with red flags table for TDD violations - RED/GREEN/REFACTOR subagent pattern for adversarial verification - Git history audit commands to detect TDD compliance - Pre-commit hook template to block implementation-only commits - Three-pillar coverage requirements: line, branch, and mutation testing - Recovery protocols for Iron Law violations - Self-improvement loop: learn from violations, strengthen rules - **Updated proof-of-work skill** - Integrated Iron Law enforcement - Added Iron Law section with self-check table - New TodoWrite items: `proof:iron-law-red`, `proof:iron-law-green`, `proof:iron-law-refactor`, `proof:iron-law-coverage` - Cross-referenced iron-law-enforcement.md module - **Updated skill-authoring skill** - Extended Iron Law to all implementation work - Skills: No skill without documented Claude failure - Code: No implementation without failing test - Claims: No completion claim without evidence - Cross-referenced proof-of-work Iron Law module - **Updated proof-enforcement.md** - Added Rule 4: Iron Law TDD Compliance - Blocks completion claims lacking TDD evidence - Checks for failing test evidence, design emergence, commit patterns - Includes recovery protocol for violations - **Updated post_implementation_policy.py** - Strengthened governance injection - Added Iron Law self-check table to session start - Extended red flags with TDD-specific patterns - Added iron-law TodoWrite items to required protocol - **Updated imbue session-start.sh** - Added Iron Law quick reference - Iron Law statement and self-check table - TDD TodoWrite items reminder ### Fixed - Proof-of-Work Enforcement Gap (2026-01-11) - **Integrated proof-of-work into governance protocol** - `post_implementation_policy.py` - Proof-of-work is now STEP 1 (before doc updates) - Added red flag table to catch rationalization patterns - Requires TodoWrite items: `proof:solution-tested`, `proof:evidence-captured` - **Added proof-of-work reminder to Stop hook** - `verify_workflow_complete.py` - End-of-session checklist now includes proof-of-work items - Warning if proof-of-work was skipped - **Added proof-of-work to imbue session start** - `session-start.sh` - Quick reference table alongside scope-guard - Red flags table for common rationalization patterns **Root Cause**: `proof-enforcement.md` was a design document referencing non-existent `PreMessageSend` hook type. Implementation now uses available hooks (SessionStart, Stop) to enforce proof-of-work discipline through governance injection and checklists - **Cleaned up unsupported hook type reference** - `imbue/hooks/proof-enforcement.md` - Updated frontmatter to reference actual triggers (SessionStart, Stop) - Added Implementation Status section explaining actual enforcement mechanism - Updated Configuration section to reflect automatic enforcement - Preserved detection patterns as self-enforcement guidance ## [1.2.4] - 2026-01-10 ### Added - Shell Review Skill and Security Guardrails (2026-01-10) - **pensive:shell-review** - New skill for auditing shell scripts - Exit code analysis and error handling validation - POSIX portability checks (bash-specific vs portable constructs) - Safety pattern verification (quoting, word splitting, globbing) - Modular design with `exit-codes.md`, `portability.md`, `safety-patterns.md` modules - Integration with `imbue:evidence-logging` for structured findings - **sanctum:security-pattern-check hook** - Context-aware security checking - Distinguishes code files from documentation (reduces false positives) - Detects patterns in context (ignores examples showing what NOT to do) - Checks for dynamic code execution, shell injection, SQL injection, hardcoded secrets - Configurable via `hooks.json` with environment variable overrides - **Commit workflow guardrails** - Prevention of quality gate bypass - Added guardrails against `--no-verify` in commit-messages skill - Updated git-workspace-review to run `make format && make lint` proactively - Skills now block if code quality checks fail (no bypass allowed) ### Changed - Bloat Reduction Phases 6-9 (2026-01-10) #### Phase 6: Pensive Code Refactoring (~2,400 tokens saved) - Created shared utilities module (`pensive/utils/`) - `content_parser.py`: File parsing and snippet extraction utilities - `severity_mapper.py`: Centralized severity categorization - `report_generator.py`: Reusable markdown report formatting - Enhanced `BaseReviewSkill` with shared helper methods - Reduced code duplication across 4 review skills (rust, architecture, bug, makefile) #### Phase 8: Examples Repository (~5,540 tokens saved) - Created centralized `/examples/attune/` directory - Moved large example files from plugin to examples directory - `microservices-example.md` (726 lines → 20 line stub) - `library-example.md` (699 lines → 18 line stub) - Replaced with lightweight stub files that reference full content #### Phase 9: Script Data Extraction - Complete (~10,192 tokens saved) Applied systematic data extraction to 4 large Python scripts: 1. **seed_corpus.py** (1,117 → ~285 lines) - Extracted: `data/seed_topics.yaml` (topic catalog) - Savings: ~832 lines 2. **makefile_dogfooder.py** (793 → ~200 lines) - Extracted: `data/makefile_target_catalog.yaml` (target definitions) - Savings: ~593 lines 3. **template_customizer.py** (792 → ~130 lines) - Extracted: `data/architecture_templates.yaml` (480 lines of templates) - Savings: ~662 lines 4. **architecture_researcher.py** (641 → ~180 lines) - Extracted: `data/paradigm_decision_matrix.yaml` (decision logic) - Savings: ~461 lines **Pattern**: Identify embedded data → Extract to YAML → Add load functions → Update scripts **Result**: 3,343 → ~795 lines (76% code reduction) **Total token savings (Phases 6-9)**: ~18,132 tokens **Combined total (all phases)**: ~70,772 tokens (28-33% context reduction) ### Added - **Skills Separation Guide** - Comprehensive guide for separating development skills from runtime agent skills - **Problem**: Namespace collision when using Claude Code to build AI agents (development skills vs runtime skills) - **New Guides**: 4 complementary resources (~16,300 words total) - `docs/guides/development-vs-runtime-skills-separation.md` - Full technical guide (11K words) - `docs/guides/skills-separation-quickref.md` - One-page quick reference - `docs/guides/skills-separation-diagram.md` - Visual diagrams (Mermaid + ASCII) - `docs/reddit-response-skills-separation.md` - Conversational response format - **4 Separation Patterns**: Physical directory, namespace prefixing, context forking, scoped loading - **SDK Integration**: Complete examples for composing system prompts from skill files - **Example Project**: TodoAgent with separated development (.claude/skills/) and runtime (src/agent/prompts/) namespaces - **Workflow Coverage**: 3-phase workflow (build, test, deploy) with context isolation - **Troubleshooting**: Common issues and solutions for namespace bleeding - **Integration**: References abstract, conserve, pensive, spec-kit plugins - **Updated**: README.md with link to Advanced Guides, docs/guides/README.md with new section - **Use Case**: Essential for anyone building AI agent applications with Claude Code assistance - **Documentation Standards** - NEW guide codifying documentation debloating methodology - **New Guide**: `docs/guides/documentation-standards.md` enforces directory-specific line limits - **Directory Limits**: docs/=500 lines (strict reference), book/=1000 lines (lenient tutorial) - **Debloating Techniques**: Progressive disclosure, consolidation, cross-referencing, deletion - **Anti-Patterns**: Complete-guide files, verbose examples, redundant code, monolithic files - **Enforcement**: Pre-commit checks, monthly reviews, PR checklist - **Phase 5 Results**: Applied to 8 files, 3,421 lines saved (55% reduction, ~3,200 tokens) - **Data Extraction Pattern Guide** - Comprehensive guide for separating data from code - **New Guide**: `docs/guides/data-extraction-pattern.md` documents the data-to-YAML refactoring pattern - **5-Step Process**: Identify → Extract → Deserialize → Update → Validate - **Real Examples**: 4 production refactorings from claude-night-market (seed_corpus, makefile_dogfooder, template_customizer, architecture_researcher) - **Results**: 75% average code reduction (3,343 → ~795 lines across 4 scripts) - **Benefits**: Non-programmer editable configs, cleaner diffs, runtime flexibility - **Best Practices**: YAML schema documentation, error handling, defaults, version migration - **Code Templates**: Production-ready examples with comprehensive error handling - **Integration**: References optimization-patterns.md and documentation-standards.md - **Optimization Patterns** - Battle-tested methodology for context reduction - **New Guide**: `docs/optimization-patterns.md` captures systematic optimization approach - **8 Patterns**: Archive cleanup, hub-and-spoke docs, data extraction, shared utilities, examples repo, progressive disclosure, TODO audit, anti-pattern removal - **Proven Results**: 9 phases achieving 28-33% context reduction (~70,772 tokens saved) - **5 Principles**: Separation of concerns, DRY, progressive disclosure, maintainability, backwards compatibility - **Phase-Based Workflow**: Discovery → Analysis → Planning → Execution → Validation - **Metrics**: Token estimation formulas, success criteria, tracking templates - **Real-World Data**: Complete phase-by-phase breakdown with measurable impact - **Future Opportunities**: Automation, configuration management, pattern library - **Conjure: GeminiQuotaTracker Inheritance Refactoring** - Reduced code duplication through leyline.QuotaTracker base class - **Code Reduction**: 287 → 255 lines (-32 lines, -11.1% reduction) - **Inherited Methods**: 11 methods now inherited from `leyline.QuotaTracker` base class - `record_request()`, `get_quota_status()`, `can_handle_task()`, `get_current_usage()` - `_load_usage()`, `_save_usage()`, `_cleanup_old_data()`, `estimate_file_tokens()` - **Preserved Features**: Gemini-specific token estimation (tiktoken + heuristic fallback) - **Backward Compatibility**: 100% compatible with existing dict-based configuration - **Documentation**: Complete technical guide at `plugins/conjure/docs/quota-tracking.md` - **Testing**: All functionality verified (instantiation, quota checking, token estimation, file handling) - **Dependencies**: Added `leyline>=1.0.0` for base class inheritance - **See Also**: ADR-0002 for architecture decision rationale - **Methodology Curator Skill** - Surface expert frameworks before creating OR evaluating skills/hooks/agents - **New Skill**: `methodology-curator` - Curates proven methodologies from domain masters - **Concept**: Based on skill-from-masters approach - ground work in battle-tested frameworks - **Creation Workflow**: Surfaces experts → select/blend methodologies → create methodology brief → handoff to create-skill - **Evaluation Workflow**: Identify domain → gap analysis vs masters → targeted improvements - **Pure Markdown**: No external dependencies, installs with plugin - **Domain Modules** (6 domains, 1500+ lines of curated expertise): - `instruction-design.md` - Mager (behavioral objectives), Bloom (taxonomy), Gagné (9 events), Sweller (cognitive load), Clark (evidence-based) - `code-review.md` - Google Engineering, Fowler (refactoring), Feathers (legacy code), Wiegers (peer reviews), Gee (practical review) - `debugging.md` - Zeller (scientific debugging), Agans (9 rules), Spinellis (effective debugging), Miller (debugging questions), Regehr (systems debugging) - `testing.md` - Kent Beck (TDD), Freeman/Pryce (GOOS), Meszaros (xUnit patterns), Feathers (legacy testing), Bach/Bolton (exploratory) - `knowledge-management.md` - Luhmann (Zettelkasten), Ahrens (smart notes), Bush (memex), Allen (GTD), Forte (PARA/Second Brain) - `decision-making.md` - Munger (mental models), Kahneman (System 1/2), Klein (RPD), Duke (thinking in bets), Dalio (principles) - **Integration**: Works with `/create-skill`, `/create-hook`, brainstorming workflows - **Extensible**: Template provided for adding new domain modules - **Subagent Auto-Compaction Documentation** - Claude Code 2.1.1+ discovery - **New Section**: `conserve:context-optimization/modules/subagent-coordination` - Auto-compaction at ~160k tokens - **Model Optimization Guide**: Added context management section for long-running subagents - **Log Signature**: Documented `compact_boundary` system message format for debugging - **Design Patterns**: State preservation strategies for compaction-aware agent design - **Context Thresholds**: Documented warning zones (80%, 90%) and trigger point (~160k) ### Fixed - **hooks.json validation errors**: Fixed hooks.json matcher format in abstract and memory-palace plugins - changed from object `{"toolName": "Skill"}` to string `"Skill"` per Claude Code SDK requirements - **#25**: Optimized architecture-paradigms skill to index/router pattern (28.5% reduction) - **#26**: Modularized optimizing-large-skills skill (38% reduction) - **#27**: Split large command files with 72% average reduction (bulletproof-skill, validate-hook, pr-review) - **#28**: Consolidated abstract module bloat to docs/examples/ directory - **#29**: Optimized agent files with 62% reduction in mcp-subagents - **#30**: Enhanced JSON escaping with complete control character handling in imbue hooks - **#31**: Added logging to PyYAML warnings for better CI/CD visibility - **#32**: Added comprehensive delegation error path tests (12 new test methods) - **#33**: Added wrapper base validation tests and implemented detect_breaking_changes() - **#93**: Merged README-HOOKS.md into docs/guides/skill-observability-guide.md - **#94**: Consolidated conjure CHANGELOG to reference main CHANGELOG - **#95**: Renamed /pr to /prepare-pr with expanded Mandatory Post-Implementation Protocol workflow - **Bloat Report Updated**: bloat-scan-report-20260109.md now includes Phase 5 (~52,640 tokens total saved) ### Changed - **Documentation Debloating (Phase 5)** - Enforced strict line limits across documentation files - **hook-types-comprehensive.md**: 748 → 147 lines (table-of-contents pattern) - **security-patterns.md**: 904 → 534 lines (consolidated redundant examples) - **authoring-best-practices.md**: 652 → 427 lines (removed verbosity) - **evaluation-methodology.md**: 653 → 377 lines (extracted implementations) - **error-handling-guide.md**: 580 → 316 lines (cross-referenced tutorial) - **Deleted outdated plans**: 2 historical files (1,685 lines) - **Total impact**: 6,222 → 1,801 lines (3,421 saved, ~3,200 tokens) - **Quality preserved**: All detail maintained via progressive disclosure - **MECW Optimization**: Commands now use modular architecture with progressive loading pattern - Large skill modules moved to docs/examples/ for comprehensive guides - Stub files provide quick reference with links to detailed documentation - Total reduction: 70-79% in command files, 28-62% in skill files - **Documentation Structure**: Comprehensive guides now centralized in docs/examples/ with categorization: - `docs/examples/hook-development/` - Hook types, security patterns - `docs/examples/skill-development/` - Authoring best practices, evaluation methodology - **Testing Coverage**: Added 12+ new test methods for error paths and validation edge cases ## [1.2.3] - 2026-01-08 ### Added - **Error Handling Documentation** - Comprehensive guide for error handling across the ecosystem - **New Guide**: `docs/guides/error-handling-guide.md` with implementation patterns - **Error Classification**: Standardized system (E001-E099: Critical, E010-E099: Recoverable, E020-E099: Warnings) - **Plugin-Specific Patterns**: Detailed scenarios for abstract, conserve, memory-palace, parseltongue, sanctum - **Integration**: Connected to leyline:error-patterns for shared infrastructure - **Best Practices**: Recovery strategies, testing patterns, monitoring and alerting - **Conjure Quota Tracking Documentation** - Technical implementation details - **New Guide**: `plugins/conjure/docs/quota-tracking.md` for Gemini API quota management - **Architecture Details**: Inheritance structure from leyline.QuotaTracker - **Token Estimation**: Multi-tier strategy (tiktoken + heuristic fallback) - **Usage Patterns**: CLI interface, hook integration, backward compatibility - **Performance**: Computational complexity, memory usage, testing patterns - **Documentation Consolidation** - Merged ephemeral reports into permanent docs - **Error Handling Enhancement Report**: Consolidated into `docs/guides/error-handling-guide.md` - **GeminiQuotaTracker Refactoring Report**: Consolidated into `plugins/conjure/docs/quota-tracking.md` - **ADR-0002**: Already exists for quota tracker refactoring architecture decision - **Removed**: Temporary report files from worktrees - **Branch Name Version Validation** - `/pr-review` now enforces branch name version consistency - **BLOCKING check**: If branch contains version (e.g., `skills-improvements-1.2.2`), it MUST match marketplace/project version - **Prevents incomplete version bumps**: Catches cases where branch naming indicates 1.2.2 but files still show 1.2.1 - **Clear error messages**: Suggests either updating version files OR renaming branch to match - Supports all project types: Claude marketplace, Python, Node, Rust - **Claude Code 2.1.0 Compatibility** - Full support for new frontmatter features and behaviors - **Skill Hot-Reload**: Skills now auto-reload without session restart - **Context Forking**: `context: fork` runs skills in isolated sub-agent context - **Agent Field**: `agent: agent-name` specifies agent type for skill execution - **Frontmatter Hooks**: Define PreToolUse/PostToolUse/Stop hooks in skill/agent/command frontmatter - **Once Hooks**: `once: true` executes hook only once per session - **YAML-Style allowed-tools**: Cleaner list syntax instead of comma-separated strings - **Wildcard Permissions**: `Bash(npm *)`, `Bash(* install)`, `Bash(git * main)` - **Skill Visibility**: `user-invocable: false` hides skills from slash command menu - **Agent Disabling**: `Task(AgentName)` syntax for disabling specific agents - **Documentation Updates** - Comprehensive 2.1.0 feature documentation - **Plugin Development Guide**: New section covering all 2.1.0 frontmatter fields - **Common Workflows Guide**: Added 2.1.0 features section with examples (moved to `book/src/getting-started/common-workflows.md`) - **Skill Authoring Skill**: Updated frontmatter examples with 2.1.0 fields - **Hook Authoring Skill**: Added frontmatter hooks, `once: true`, and event types - **Documentation Consolidation**: Moved large guides to book/ for better organization - Common Workflows (722 lines), Function Extraction Guidelines (571 lines), Migration Guide (507 lines) - Archived temporary research and implementation reports to `docs/archive/2026-01-*/` - All documentation now complies with size guidelines (docs/ ≤ 500 lines, book/ ≤ 1000 lines) - **Skill Observability System** (Issue #69) - Phases 4-5: Continual learning metrics and PreToolUse integration - **PreToolUse Hook**: `skill_tracker_pre.py` captures skill start time and invocation context - **PostToolUse Enhancement**: `skill_tracker_post.py` now calculates accurate duration and continual metrics - **Continual Learning Metrics**: Avalanche-style evaluation per skill execution - **Stability Gap Detection**: Automatic identification of performance inconsistency (avg - worst accuracy) - **Per-iteration metrics**: worst_case_accuracy, average_accuracy, avg_duration_ms - **Execution History**: Persistent tracking in `~/.claude/skills/logs/.history.json` - **Real-time Alerts**: stderr warnings when stability_gap > 0.3 - **Skill Memory Storage**: memory-palace now handles all skill execution memory - Automatic tracking of every skill invocation across all plugins - JSONL log storage per plugin/skill/day with searchable history - Command: `/skill-logs` - view and manage skill execution memories - **Skill Review**: pensive now handles skill performance analysis - Commands: `/skill-review` - analyze skill metrics and stability gaps - Commands: `/skill-history` - view recent skill executions with context - Integration reference: `docs/guides/skill-observability-guide.md` - **Parseltongue: Python Linter Agent** - Strict linting enforcement without bypassing checks - **New Agent**: `python-linter` - Expert agent for fixing lint errors properly - **Core Principle**: FIX, DON'T IGNORE - never add per-file-ignores to bypass lint checks - **Supported Rules**: E (pycodestyle), W (warnings), F (pyflakes), I (isort), B (bugbear), C4 (comprehensions), UP (pyupgrade), S (bandit), PL (pylint), D (pydocstyle) - **Common Fixes**: D205 (docstring format), E501 (line length), PLR2004 (magic values), PLC0415 (import location), PLR0915 (function length) - **Workflow**: Understand rule → fix actual code → refactor if needed → verify passes - **Only Acceptable Ignores**: `__init__.py` F401 (re-exports), `tests/**` S101/PLR2004/D103 ### Changed - **Agents Updated** - Added lifecycle hooks to key agents - `pensive:code-reviewer`: PreToolUse and Stop hooks for audit logging - Escalation configuration added to agents - **Skill Patterns** - Updated skill patterns in documentation - YAML-style allowed-tools shown as preferred syntax - Wildcard permission patterns documented - Lifecycle hooks demonstrated in skill frontmatter - **Agents Updated with Hooks** - Added lifecycle hooks to more agents - `sanctum:pr-agent`: PreToolUse, PostToolUse, Stop hooks for quality gate audit - `sanctum:git-workspace-agent`: PreToolUse hook for read-only validation - `conserve:context-optimizer`: PreToolUse, PostToolUse, Stop hooks for audit logging - `sanctum:commit-agent`: PreToolUse (Bash, Read), PostToolUse (Bash), Stop hooks for commit audit logging - `sanctum:dependency-updater`: PreToolUse (Bash, Write|Edit), PostToolUse (Bash), Stop hooks with security warnings - `pensive:architecture-reviewer`: PreToolUse (Read|Grep|Glob), PostToolUse (Bash), Stop hooks for review tracking - `pensive:rust-auditor`: PreToolUse (Bash, Grep), PostToolUse (Bash), Stop hooks for unsafe code audit trail - **Commands Updated with Hooks** - Added lifecycle hooks to high-frequency user-facing commands - `/update-dependencies` (sanctum): PreToolUse (Task), Stop hooks with security-critical operation tracking and dry-run mode detection - `/pr` (sanctum): PreToolUse (Skill|Task), PostToolUse (Bash), Stop hooks tracking code review options and quality gates - `/bloat-scan` (conserve): PreToolUse (Task), Stop hooks tracking scan level and focus area for technical debt metrics - **Skills Updated with Hooks** - Added lifecycle hooks to critical workflow skills - `pr-prep` (sanctum): PreToolUse (Bash), PostToolUse (Write), Stop hooks tracking quality gates and PR template generation - `git-workspace-review` (sanctum): PreToolUse (Bash), Stop hooks for git analysis tracking - `context-optimization` (conserve): PreToolUse (Read), PostToolUse (Bash), Stop hooks for context pressure monitoring ### Added - **Frontmatter Validation Tests** - 33 new tests for Claude Code 2.1.0 validation - `TestValidate210Fields`: 19 tests for context, hooks, permissions, allowed-tools - `TestHas210Features`: 9 tests for feature detection - `TestAllHookEventTypes`: 2 tests for hook event validation - `TestConstantDefinitions`: 3 tests for constant verification - All 62 frontmatter tests pass (was 29, now 62) - **Base Module Tests** - 25 new tests for abstract.base module - `TestSetupImports`: Backwards compatibility verification - `TestHasFrontmatterFile`: File reading with error handling - `TestFindMarkdownFiles`: Directory traversal and recursion - `TestAbstractScript`: Class initialization and lazy loading ### Removed - **pensieve plugin** - Consolidated into memory-palace and pensive for better integration - **Memory storage** (hooks, logging) moved to memory-palace - uses existing observability infrastructure - **Review capabilities** (metrics, history) moved to pensive - extends code review toolkit - **No functionality lost** - all features preserved with better integration - **Migration path**: `/pensieve:metrics` → `/pensive:skill-review`, `/pensieve:history` → `/pensive:skill-history` ## [1.2.1] - 2026-01-05 ### Added - **Tutorials: Skills Showcase** - Interactive demo of skill discovery and usage - **New Tutorial**: Visual GIF demonstration of claude-night-market skill capabilities - **Skill Discovery**: Browse and count 105+ skills across all plugins - **Skill Anatomy**: Examine frontmatter, metadata, and structure - **Skill Validation**: Use `abstract:plugin-validator` to check quality - **Workflow Composition**: See how skills chain into complex workflows - **Dual Documentation**: Concise docs and detailed book chapter - **Assets**: VHS tape (`assets/tapes/skills-showcase.tape`) and generated GIF - **Integration**: Added to README demos section, book SUMMARY, and tutorials overview - **Target Audience**: Beginners learning the skill system, plugin developers understanding architecture - **Minister: Issue Creation Command** - Complete GitHub issue lifecycle management - **New Command**: `/create-issue` - Create GitHub issues with formatting and reference links - **Interactive Template Mode**: Guided prompts for bug reports, feature requests, documentation - **Smart References**: Auto-fetch and format related issue/PR/doc links - **Cross-Repository**: Create issues in any accessible repository - **Label Management**: Apply multiple labels with validation - **Project Integration**: Auto-add to GitHub Projects v2 - **Minister Tracker**: Optional auto-capture to project tracker - **Complements**: `/close-issue` command for full issue lifecycle coverage - **Integration**: Works with minister's initiative tracking and status dashboards - **Quality Infrastructure** - Three-layer quality system for code standards - **New Scripts**: - `scripts/run-plugin-lint.sh` - Plugin-specific linting (all or changed plugins) - `scripts/run-plugin-typecheck.sh` - Plugin-specific type checking - `scripts/run-plugin-tests.sh` - Plugin-specific test execution - `scripts/check-all-quality.sh` - Comprehensive quality check with optional report - **Documentation**: - `docs/quality-gates.md` - Three-layer quality system documentation - `docs/testing-guide.md` - Testing patterns and troubleshooting - **Pre-Commit Integration**: Hooks run plugin-specific checks on changed files only - **Coverage**: ~400+ tests across 10 plugins with automated enforcement ### Changed - **Pre-commit hooks enhanced** - Plugin-specific lint/typecheck/test hooks for changed plugins - **README updated** - Added Quality Gates and Testing Guide to documentation table ### Fixed - **Spec-Kit: Test Fixture Completion** - Fixed 3 skipped parametrized tests in task planning - **Fixture Update**: Completed `valid_task_list` fixture with phases 2-4 (Core Implementation, Integration, Polish) - **Test Coverage**: All 184 spec-kit tests now passing (previously 181 passed, 3 skipped) - **Phase Structure**: Added 8 new tasks across 3 phases with proper dependencies and time estimates - **Format Compliance**: Ensured time estimates match regex pattern and action verb requirements - **Documentation**: Updated testing guide to reflect spec-kit's 184 tests (up from ~60) ## [1.2.0] - 2026-01-02 ### Added - **Conserve: Static Analysis Integration** - Complete bloat-detector skill with 5 modules - **New Module**: `static-analysis-integration.md` - Bridges Tier 1 heuristics to Tier 2 tool-based detection - **Python Tools**: Vulture (programmatic API), deadcode (auto-fix), autoflake (import cleanup) - **JavaScript/TypeScript**: Knip CLI integration with JSON parsing, tree-shaking detection - **Multi-Language**: SonarQube Web API for cyclomatic complexity and duplication metrics - **Features**: Auto-detection of available tools, graceful fallback to heuristics, confidence boosting - **Total Bloat Detection Coverage**: 2,423 lines across 5 modules - `quick-scan.md` (237 lines) - Heuristic detection - `git-history-analysis.md` (276 lines) - Staleness and churn metrics - `code-bloat-patterns.md` (638 lines) - 5 duplication types, language-specific patterns - `documentation-bloat.md` (634 lines) - Readability metrics, 4 similarity algorithms - `static-analysis-integration.md` (638 lines) - Tool integration **← NEW** - **Validation**: All tests pass, hub-spoke pattern maintained, proper frontmatter - **Memory Palace: Knowledge Corpus Integration** - Research queue processing and storage - **Queue System**: Staging area for research sessions with evaluation rubric - **Knowledge Entry**: `codebase-bloat-detection.md` - Comprehensive bloat detection research - 43 sources across static analysis, git history, documentation metrics - Memory palace format with spatial encoding and cross-references - Score: 87/100 (Evergreen threshold: 80+) - **Queue Processing**: Automated vetting, approval, and archival workflow - **Integration**: Research findings directly applied to conserve plugin implementation - **Attune Plugin (v1.2.0)** - Full-cycle project development from ideation to implementation - **New Commands**: - `/attune:brainstorm` - Brainstorm project ideas using Socratic questioning (integrates superpowers) - `/attune:specify` - Create detailed specifications from brainstorm session (integrates spec-kit) - `/attune:blueprint` - Plan architecture and break down into tasks (integrates superpowers) - `/attune:init` - Initialize new project with complete development infrastructure - `/attune:execute` - Execute implementation tasks systematically (integrates superpowers) - `/attune:upgrade-project` - Add or update configurations in existing projects - `/attune:validate` - Validate project structure against best practices - **New Agents**: - `project-architect` - Guides full-cycle workflow from brainstorming through planning - `project-implementer` - Executes implementation with TDD discipline and progress tracking - **New Skills**: - `project-brainstorming` - Socratic ideation and problem space exploration - `project-specification` - Specification creation workflow - `project-planning` - Architecture design and task breakdown - `project-execution` - Systematic implementation with tracking - `project-init` - Interactive project initialization workflow with language detection - `makefile-generation` - Generate language-specific Makefiles with standard targets - `workflow-setup` - Configure GitHub Actions workflows for CI/CD - `precommit-setup` - Set up pre-commit hooks for code quality enforcement - **Supported Languages**: - **Python**: uv-based dependency management, pytest, ruff, mypy, pre-commit hooks - **Rust**: cargo builds, clippy, rustfmt, CI workflows - **TypeScript/React**: npm/pnpm/yarn, vite, jest, eslint, prettier - **Features**: - Hybrid template + dynamic generation approach - Non-destructive initialization (prompts before overwriting) - Git initialization with comprehensive .gitignore - GitHub Actions workflows (test, lint, typecheck/build) - Pre-commit hooks configuration - Makefile with development targets (help, test, lint, format, build, clean) - Project structure creation (src/, tests/, README.md) - Version fetching for GitHub Actions and dependencies - Project validation with detailed scoring - **Phase 1 (Complete)**: Python-only MVP - **Phase 2 (Complete)**: Multi-language support (Rust, TypeScript) - **Phase 3 (Complete)**: Advanced features (version fetching, validation) - **Phase 4 (Complete)**: Integration features - `/attune:upgrade-project` - Add missing configs to existing projects with status reporting - Custom template locations (~/.claude/attune/templates/, .attune/templates/, $ATTUNE_TEMPLATES_PATH) - Plugin project initialization (attune + abstract integration) - Reference project template synchronization (auto-update from simple-resume, skrills) - Template composition and override system - **Templates Based On**: Reference projects (simple-resume, skrills, importobot) ## [1.1.2] - 2026-01-01 ### Added - **Conserve: Bloat Detection & Remediation** - Full codebase cleanup workflow - **Codebase-wide AI Slop Cleanup** - Systematic removal of AI-generated language patterns - Removed ~628 instances of vague AI slop words across 356 files - Replaced "comprehensive" with context-specific terms (detailed/deep/thorough/full) - Replaced "ensures/ensure" with precise verbs (validates/verify/maintains) - Replaced "robust" with "production-grade" or "solid" - Replaced marketing language (powerful/seamless/leverage/utilize) with plain terms - Affected all file types: Markdown (487 instances), Python (91), Shell/YAML/Makefiles (50) - No functional changes, purely stylistic improvements for clarity - **New Commands**: - `/bloat-scan` - Progressive bloat detection (3 tiers: quick scan, targeted analysis, deep audit) - `/unbloat` - Safe bloat remediation with interactive approval and automatic backups - **New Agents**: - `bloat-auditor` - Orchestrates bloat detection scans and generates prioritized reports - `unbloat-remediator` - Executes safe deletions, refactorings, and consolidations with rollback support - **New Skill**: - `bloat-detector` - Detection algorithms for dead code, God classes, documentation duplication, and dependency bloat - **Detection Capabilities**: - Tier 1 (2-5 min): Heuristic-based detection using git history (no external tools required) - Tier 2 (10-20 min): Static analysis integration (Vulture, Knip) with anti-pattern detection - Tier 3 (30-60 min): Deep audit with cyclomatic complexity and cross-file redundancy - **Safety Features**: - Automatic backup branches before any changes - Interactive approval workflow with dry-run mode - Test verification after each change with auto-rollback on failure - Reversible git operations (git rm/git mv) - **Benefits**: 10-20% context reduction on average, identifies technical debt hotspots ## [1.1.1] - 2025-12-30 ### Added - **Project-level agent configuration**: Added `.claude/agents/` with main-thread agent definitions - `plugin-developer.md` - Default agent for night-market plugin development (set in `.claude/settings.json`) - `code-review-mode.md` - Evidence-based code review sessions with imbue/pensive integration - `documentation-mode.md` - Documentation-focused workflows with sanctum integration - Enables consistent multi-session agent configuration across all project work - Automatic agent loading when starting sessions in project directory - **LSP setup guidance in main README**: Quick start guide for Language Server Protocol integration - Added "Recommended Setup: LSP Integration" section prominently in README - Step-by-step setup instructions (enable environment variable, install language servers, verify) - Comparison table showing LSP advantages (900x performance, 90% token reduction) - List of plugins benefiting from LSP (pensive, sanctum, conserve) - Graceful degradation explanation (grep fallback when LSP unavailable) - Reference to detailed compatibility documentation - **Claude Code compatibility reference**: New documentation tracking version-specific features and fixes - `plugins/abstract/docs/claude-code-compatibility.md` - Detailed compatibility matrix - Version support matrix for ecosystem compatibility - Migration guides for Claude Code version upgrades - Testing checklist for compatibility verification ### Documentation - **Claude Code 2.0.74 compatibility**: Updated documentation and agent capabilities for latest release - **LSP (Language Server Protocol) Tool**: **Now the preferred default** for code navigation and analysis - Added detailed LSP integration patterns section to compatibility documentation - Updated `plugins/pensive/agents/code-reviewer.md` with LSP-enhanced review capabilities - Updated `plugins/pensive/agents/architecture-reviewer.md` with semantic architecture analysis - Updated `plugins/pensive/agents/rust-auditor.md` with rust-analyzer integration - Updated `plugins/sanctum/commands/update-docs.md` with LSP documentation verification - Updated `plugins/sanctum/skills/doc-updates/SKILL.md` with LSP accuracy checking - Updated `plugins/conserve/skills/context-optimization/modules/mecw-principles.md` with LSP token optimization - Documented 900x performance improvement for reference finding (50ms vs 45s) - Language support: TypeScript, Rust, Python, Go, Java, Kotlin, C/C++, PHP, Ruby, C#, PowerShell, HTML/CSS - **Strategic Positioning**: LSP is now the **default/preferred** approach across all plugins - Pensive agents default to LSP for code review, architecture analysis, Rust audits - Sanctum commands default to LSP for documentation verification - Conserve skills recommend LSP for token optimization - Grep positioned as fallback when LSP unavailable - Recommendation: Enable `ENABLE_LSP_TOOL=1` permanently in environment - **Security Fix - allowed-tools enforcement**: Documented critical security bug fix - Created "Tool Restriction Patterns" section in compatibility documentation - Documented previous bug: `allowed-tools` restrictions were NOT being enforced (< 2.0.74) - Documented fix: Tool restrictions now properly applied (2.0.74+) - Added security patterns for read-only analysis, safe execution, and untrusted input processing - Audit results: No current plugins use `allowed-tools` (verified via ecosystem scan) - Added best practices for tool restriction design - **Improved /context visualization**: Documented enhanced context monitoring - Updated `plugins/conserve/skills/context-optimization/modules/mecw-principles.md` - Skills/agents now grouped by source plugin - Better visibility into plugin-level context consumption - Sorted token counts for optimization - Improved MECW compliance monitoring - **Terminal setup improvements**: Documented extended terminal support - Added support for Kitty, Alacritty, Zed, Warp terminals - Enhanced cross-platform compatibility - **User Experience**: Documented UX improvements - ctrl+t shortcut in /theme to toggle syntax highlighting - Syntax highlighting info in theme picker - macOS keyboard shortcut improvements (opt vs alt display) - **Bug Fixes**: Documented resolved issues - Fixed visual bug in /plugins discover selection indicator - Fixed syntax highlighting crash - Fixed Opus 4.5 tip incorrectly showing - **Claude Code 2.0.73 compatibility**: Updated documentation for latest Claude Code release - **Session Forking**: Documented `--session-id` + `--fork-session` workflow patterns (2.0.73+) - Added detailed session forking patterns section to compatibility documentation - Documented use cases for sanctum (git workflows), imbue (parallel analysis), pensive (code reviews), memory-palace (knowledge intake) - Added best practices, naming conventions, and lifecycle management guidance - Created advanced patterns: decision tree exploration, experiment-driven development, parallel testing - Integration guidance with subagent delegation patterns - **Plugin Discovery**: Enhanced metadata best practices for search filtering (2.0.73+) - Plugin discover screen now supports search by name, description, marketplace - Recommendations for search-friendly plugin descriptions - Guidelines for descriptive keywords in plugin.json metadata - **Image Viewing**: Documented clickable [Image #N] links for media workflows (2.0.73+) - Updated scry plugin compatibility (media generation preview) - Updated imbue plugin compatibility (visual evidence artifacts) - Quick preview support for attached and generated images - **User Experience**: Documented UX improvements - alt-y yank-pop for kill ring history cycling - Improved /theme command and theme picker UI - Unified SearchBox across resume, permissions, and plugins screens - **Performance**: Fixed slow input history cycling and message submission race condition - **VSCode Extension**: Tab icon badges for pending permissions (blue) and unread completions (orange) - **Claude Code 2.0.72 compatibility**: Updated documentation for latest Claude Code release - **Claude in Chrome (Beta)**: Documented native browser control integration - Complements scry plugin's Playwright-based browser recording - Added compatibility guidance to `plugins/scry/README.md` - Updated `plugins/scry/skills/browser-recording/SKILL.md` with Chrome integration notes - Updated `plugins/scry/commands/record-browser.md` with usage recommendations - Positioning: Native Chrome for interactive debugging, Playwright for automation/CI/CD - **User experience improvements**: Thinking toggle changed from Tab to Alt+T - **Performance improvements**: 3x faster @ mention file suggestions in git repositories - **Bug fixes**: `/context` command now respects custom system prompts in non-interactive mode - **Claude Code 2.0.71 compatibility**: Updated documentation for Claude Code release - **Bash glob pattern support**: Documented fix for shell glob patterns (`*.txt`, `*.png`) in permission system - Updated `plugins/abstract/skills/hook-authoring/modules/hook-types.md` - Updated `plugins/abstract/skills/hook-authoring/modules/security-patterns.md` - Added migration guidance for removing glob pattern workarounds - **MCP server loading fix**: Documented fix for `.mcp.json` loading with `--dangerously-skip-permissions` - Enables fully automated CI/CD workflows with MCP servers - Updated hook authoring documentation with CI/CD examples - **New commands**: Documented `/config toggle` and `/settings` alias for configuration management ### Compatibility - **Verified**: Tested with Claude Code 2.0.74 - **Recommended**: Claude Code 2.0.74+ for LSP integration, tool restrictions, and improved /context visualization - **Notable Features**: - 2.0.74+: LSP tool, allowed-tools security fix, improved /context grouping - 2.0.73+: Session forking, plugin discovery, image viewing - 2.0.72+: Claude in Chrome integration - 2.0.71+: Bash glob patterns, MCP loading fix - 2.0.70+: Wildcard permissions, improved context accuracy - **Minimum**: Claude Code 2.0.65+ for full feature support ## [1.1.0] - 2025-12-27 ### Changed - **Version alignment**: All plugins and the root workspace now report version 1.1.0 - **Spec-kit command naming**: Standardized to `/speckit-*` for naming consistency ### Documentation - **API references**: Updated spec-kit command names and version tables to match 1.1.0 ## [1.0.4] - 2025-12-22 ### Added - **Abstract skill assurance framework**: Reliable skill discovery and validation infrastructure - **Compliance tests**: Trigger isolation and enforcement level tests for skill assurance ### Changed - **Sanctum fix-pr**: Out-of-scope issue creation now mandatory for PR workflows - **Commands restructured**: Eliminated duplicate command names across plugins (do-issue, reinstall-all-plugins) - **Capabilities documentation**: Added feature-review and standardized hook types ### Fixed - **Compliance**: Addressed PR #42 review feedback for skill assurance ## [1.0.3] - 2025-12-19 ### Added - **Imbue feature-review skill**: Evidence-based prioritization for feature requests and bug triage - **Memory-palace PreToolUse hook**: Persist intake queue directly from hook for reliable queue management ### Changed - **Sanctum do-issue command**: Modularized for better token efficiency - **Imbue tests**: Test updates across review analyst, catchup, and skill modules ### Fixed - **Sanctum fix-pr**: Removed emojis from example outputs for cleaner formatting - **Lock files**: Updated across imbue, memory-palace, pensive, and spec-kit plugins ## [1.0.2] - 2025-12-18 ### Added - **Conservation hooks**: Session-start integration that automatically loads optimization guidance - **Conservation bypass modes**: `CONSERVATION_MODE` environment variable (quick/normal/deep) - **Sanctum do-issue command**: New workflow for addressing GitHub issues - **Sanctum session notifications**: `session_complete_notify.py` hook for session completion alerts - **Minister plugin**: Officially added to marketplace (project management with GitHub integration) ### Changed - **Documentation overhaul**: Expanded `book/src/reference/capabilities-reference.md` with complete skill, command, agent, and hook listings for all plugins - **Conserve README**: Added session start integration docs, bypass modes table, and threshold guidelines - **Plugin versions**: All plugins bumped to 1.0.2 for consistency - **Skill modules**: Refined content across abstract, conserve, imbue, leyline, pensive, sanctum, and spec-kit ### Improved - **Context optimization**: Updates to MECW patterns and monitoring across conserve and leyline - **Skill authoring docs**: Enhanced persuasion principles, TDD methodology, and anti-rationalization modules - **Hook authoring**: Updated security patterns and performance guidelines ## [1.0.1] - 2025-12-18 ### Fixed - **Security scanning**: Exclude `.venv` and non-code directories from Bandit and other security scans - **Hook portability**: Improved cross-platform hook execution ### Changed - **Scope-guard modularization**: Broke down scope-guard into smaller, focused modules - **Workflow-improvement**: Added workflow-improvement capability to sanctum - **Plugin versions**: Updated versions across all plugins ### Technical - Merged from PR #24 ## [1.0.0] - 2025-12-15 ### Added - **Initial release** of the Claude Night Market plugin ecosystem - **11 plugins**: abstract, archetypes, conjure, conserve, imbue, leyline, memory-palace, parseltongue, pensive, sanctum, spec-kit - **Core infrastructure**: Skills, commands, agents, and hooks framework - **Documentation**: README, capabilities reference, and per-plugin documentation ### Plugins Overview | Plugin | Purpose | |--------|---------| | abstract | Meta-skills infrastructure for plugin development | | archetypes | Architecture paradigm selection (14 paradigms) | | conjure | Intelligent delegation to external LLMs | | conserve | Resource optimization and context management | | imbue | Workflow methodologies and review scaffolding | | leyline | Storage patterns and data persistence | | memory-palace | Spatial memory techniques (method of loci) | | parseltongue | Language detection and multi-language support | | pensive | Code review toolkit (7 specialized skills) | | sanctum | Git operations and workspace management | | spec-kit | Specification-driven development | ### Technical - Merged from PR #8 - Commit: bd7d2ce [1.3.6]: https://github.com/athola/claude-night-market/compare/v1.3.5...v1.3.6 [1.3.5]: https://github.com/athola/claude-night-market/compare/v1.3.4...v1.3.5 [1.3.4]: https://github.com/athola/claude-night-market/compare/v1.3.3...v1.3.4 [1.3.3]: https://github.com/athola/claude-night-market/compare/v1.3.2...v1.3.3 [1.3.2]: https://github.com/athola/claude-night-market/compare/v1.3.1...v1.3.2 [1.3.1]: https://github.com/athola/claude-night-market/compare/v1.3.0...v1.3.1 [1.3.0]: https://github.com/athola/claude-night-market/compare/v1.2.9...v1.3.0 [1.2.5]: https://github.com/athola/claude-night-market/compare/v1.2.4...v1.2.5 [1.2.4]: https://github.com/athola/claude-night-market/compare/v1.2.3...v1.2.4 [1.2.3]: https://github.com/athola/claude-night-market/compare/v1.2.1...v1.2.3 [1.2.1]: https://github.com/athola/claude-night-market/compare/v1.2.0...v1.2.1 [1.2.0]: https://github.com/athola/claude-night-market/compare/v1.1.2...v1.2.0 [1.1.2]: https://github.com/athola/claude-night-market/compare/v1.1.1...v1.1.2 [1.1.1]: https://github.com/athola/claude-night-market/compare/v1.1.0...v1.1.1 [1.1.0]: https://github.com/athola/claude-night-market/compare/v1.0.4...v1.1.0 [1.0.4]: https://github.com/athola/claude-night-market/compare/v1.0.3...v1.0.4 [1.0.3]: https://github.com/athola/claude-night-market/compare/v1.0.2...v1.0.3 [1.0.2]: https://github.com/athola/claude-night-market/compare/v1.0.1...v1.0.2 [1.0.1]: https://github.com/athola/claude-night-market/compare/v1.0.0...v1.0.1 [1.0.0]: https://github.com/athola/claude-night-market/releases/tag/v1.0.0