--- doc_format_version: 2 research_buddy_version: "2.2.0" agent_state: needs_session_zero # → "ready" once session zero completes version: null # bumped to "1.0" at end of session zero date: null # filled in session zero file_name: null # base name for outputs, e.g. "my-research" title: null # filled in session zero subtitle: null language: code: en label: English project: domain: null # one-line description of the project domain deliverable_type: null # theory | software | physical_product | document | plan | other final_goal: null # one sentence — what does completed research look like? timing: null # deadline, milestones, or "none" validation_gate: null # what does "validated" mean for this project? source_tiers: tier_1: null # primary evidence venues (e.g. arXiv, NeurIPS for ML; PubMed for medical) tier_2: null # official secondary (vendor docs, textbooks, institutional reports) discovery: null # leads only — claims start as PROPOSED until promoted to T1/T2 domain_rules: null # methodology rules specific to this domain ui_strings: status_open: OPEN status_done: "✦ Researched" status_wip: "IN PROGRESS" --- # {{title}} — Research Document **Format:** Research Buddy v2 (Markdown) · **Version:** {{version}} · **Updated:** {{date}} This file is the source-of-truth artifact for the project. The agent edits this file; the clean view (without the framework) and the HTML rendering are derived on demand. **Filename convention:** - `{{file_name}}_v{{version}}-source.md` — this file. Agent-edited. Full framework included. Upload this each session. - `{{file_name}}_v{{version}}.md` — clean view. Generated on demand. Research content only, no framework. - `{{file_name}}_v{{version}}.html` — HTML rendering. Generated on demand. > **Agent: read [Framework (Core)](#framework-core) and [Framework (Reference)](#framework-reference) before any other action — including any tool call (web search, extended research, code execution, etc.). Both are short. Both are required reading. The framework tells you to (a) compose and emit the second-opinion brief at the top of Turn 1 *before* doing the research itself, and (b) deliver a downloadable, validated source file at the end of Turn 2. Skipping the read order regresses to ad-hoc behavior.** --- ## Framework (Core) Research Buddy is a structured AI research collaborator that uses this single Markdown file as shared memory and a living reference. Read the entire file before acting. The Framework sections are generic across all Research Buddy projects. Do not modify them during a session. Improvements to the framework belong upstream in the [research-buddy](https://github.com/nuncaeslupus/research-buddy) project. ### Detect session state Read the YAML frontmatter at the top. The authoritative signal is the top-level field `agent_state`: - `agent_state: needs_session_zero` → run **session zero** (project initialization). See [Session zero](#session-zero) for the full flow. Turn 2 of session zero overwrites this field to `ready`. - `agent_state: ready` → run **standard session** (research one queue topic at a time). - `agent_state: complete` → research is declared complete. **Do not run a new session automatically.** Greet the user, confirm the project is done, and offer: (1) add new queue topics and resume research (reset `agent_state: ready` on the Turn 2 atomic write); (2) write or refresh the `## Deliverable Synthesis` section; (3) leave as-is with no changes. Do not emit a second-opinion brief or call any research tool unless the user picks option 1 and you are starting a new session. - Field absent (pre-1.9 documents only) → fall back to the legacy signal: `project.domain` null ⇒ session zero, non-null ⇒ standard session. Set `agent_state` explicitly on the next atomic write. **Version compatibility check (run before any work).** Compare the document's `research_buddy_version` against the framework version you have loaded (the one shown in the [research-buddy](https://github.com/nuncaeslupus/research-buddy) project at the time of the session). Branch on the difference: - **MAJOR differs** → surface a one-line warning at the top of Turn 1 and pause for explicit user confirmation before continuing. MAJOR mismatch means framework semantics may have changed in incompatible ways. - **Same MAJOR, MINOR doc older than tool** → pause at the top of Turn 1 *before composing the brief or calling any research tool*. Ask the user one question, verbatim shape: "research-buddy is at ``; this document is at ``. An upgrade is available. Want to (a) pause so you can run `research-buddy upgrade .md --apply` locally (or have me run it if I have shell access — see [Self-validation](#self-validation)) and re-upload the refreshed file before continuing, or (b) proceed with the older framework version this session?" Wait for the answer. If the user picks (a), end the session with no research delivery — the next session resumes from the refreshed file. If the user picks (b), continue Turn 1 normally and note "proceeded with older framework ``" in the change summary. - **Same MAJOR newer doc, or PATCH-only differences** → proceed silently. Both session flows use exactly **2 turns** (Turn 1: brief + research; Turn 2: vet + validate + write) with no confirmation gate between them. The MINOR version-compat check above is a pre-session gate: if it fires, the user answers one question before Turn 1 begins; the session's 2-turn clock starts only once the user picks (a) or (b). ### Turn 1: brief + research The order below is **not negotiable**. The second-opinion brief is composed and emitted **before** any research is performed in this turn, so that the brief contains only project-given context and is provably uncontaminated by findings the agent has yet to produce. Research follows; output assembles both. 1. **Silent preflight.** Scan [Discarded Alternatives](#discarded-alternatives) for any approach that matches the queue topic; scan [Research Tracker](#research-tracker) and [Session Notes](#session-notes) for prior overlap; identify the [Adopted Rules](#adopted-rules) that constrain the answer space (rules whose decisions cannot be overturned without explicit re-opening). Speak only if blocked. 2. **Take the first row of the [Open Research Queue](#open-research-queue) as the active topic.** Row order = priority. The first row is always the most important pending topic. 3. **Confirm the queue item has an Objective / Key Question.** If missing, define it from context. Do not ask. 4. **Pre-register hypotheses** in the draft session notes. List each hypothesis, the PASS metric (e.g. "≥2 independent Tier-1 sources support, no Tier-1 contradiction"), and the FAIL/REJECT metric (e.g. "Tier-1 silence or contradiction → tag PROPOSED instead of VALIDATED"). Write these *before* consulting sources. Pre-registered hypotheses MAY be referenced by name in the brief composed in step 5 (e.g. "the agent has pre-registered three hypotheses, listed at the end of this brief"), but the agent's *current belief about each hypothesis* MUST NOT be stated in the brief. 5. **Write the second-opinion brief into the outgoing response, wrapped in `` and `` markers (each on its own line, immediately around the brief), BEFORE invoking any research tool in this turn.** Use the [template](#second-opinion-brief-template), filling its placeholders from: project specification, queue-item objective, source tiers, the relevant DAs surfaced in step 1, the related Research Tracker rows surfaced in step 1, the active constraining rules surfaced in step 1, and the hypotheses pre-registered in step 4. **The brief MUST NOT contain any finding produced by the agent in this turn** — it is project-given context plus the question, nothing more. Writing the brief into the response buffer at this step (before any tool call) is the structural guarantee against both contamination AND omission: contamination because the placeholders cannot be filled with findings that don't yet exist; omission because the brief is already in the outgoing message before research starts, so it cannot be forgotten when assembling the response around the tool's output. **Pre-tool-call self-check.** Before any research tool call in this turn, the response being assembled MUST already contain the brief between `@brief-start` and `@brief-end` markers. If it does not, stop and write the brief first; do not call the tool. Do not edit the brief after research — the rest of the turn appends *beneath* it. 6. **Research** per the project's [Source tiers](#source-tiers). For time-sensitive domains, include year ranges in queries. Build the [synthesis matrix](#synthesis-matrix) any time more than three sources are consulted or sources appear to disagree. 7. **Append the rest of the Turn 1 output beneath the already-emitted brief, in this exact order:** - Findings with inline citations `[Title, Author, Year, Venue, DOI/URL]`. - Proposed decisions with rationale. - Rejected alternatives with reason. - Cross-section impact: which sections will be touched on the Turn 2 atomic write. The brief from step 5 is already at the top of the message; do not re-emit it, do not re-edit it, and do not move it. 8. **End with the Turn 1 marker** (see [Turn markers](#turn-markers)). Stop and wait. ### Turn 2: vet + validate + atomic write 1. **Read all submitted second opinions.** Label each consistently: `{{Source}}-{{N}}` (Gemini-3, ChatGPT-1, Grok-2, Human-1, PDF-2, Paper-1, …). A second opinion is **read-only**: research the user explicitly submitted. Never generate one. Never role-play as an external source. 2. **Vet each source.** Verify ≥3 cited claims end-to-end — title, author, URL, and that the attributed claim actually appears in the cited source. Report agreements, disagreements, and unverifiable claims. When multiple second opinions share the same error, treat as **one** data point (likely a shared training artifact), not independent confirmation. 3. **Resolve pre-registered hypotheses.** For each hypothesis pre-registered in Turn 1 step 4, compare the research findings and vetted second opinions against the pre-registered PASS and FAIL/REJECT metrics. Assign a status: `VALIDATED` (PASS metric met — ≥2 independent Tier-1 sources support, no Tier-1 contradiction), `PROPOSED` (only Tier-2 or Discovery support, or ambiguous Tier-1 picture), or `REJECTED` (Tier-1 contradiction, or silence where the pre-registered metric counted silence as falsification). Fill in the hypothesis-resolution table in the session notes. Every hypothesis must be resolved — `VALIDATED` with a supporting Tier-1 citation, `PROPOSED` with the best available Tier-2 or Discovery source, and `REJECTED` with the contradicting or falsifying Tier-1 evidence are all complete entries. 4. **Run the cross-section contradiction check and the [compliance validation](#self-validation) pass.** Cross-section check (semantic): (a) does any rule contradict a new decision? (b) does any DA already ban the chosen approach? (c) does any prior session note settle the question with different evidence? Compliance validation (mechanical): anchors intact, version bumped, filename matches, append-only sections preserved, cross-links resolve, queue/tracker IDs unique. Full checklist in [Self-validation](#self-validation). **Compliance validation is a precondition for file delivery.** The agent MUST invoke `research-buddy validate {{file_name}}_v{{version}}-source.md` (or paste a verbatim mental-simulation pass that lists every mechanical check from [Self-validation](#self-validation) and its outcome) and treat the exit code as authoritative. The agent MUST paste the validator's full output (or the simulated checklist) inside the message *before* delivering the file artifact, so the user has visible evidence the gate ran. Any mechanical failure is blocking. 5. **Write atomically — no confirmation gate.** Shell-access shortcut: `research-buddy bump

--apply` performs every mechanical edit (queue row removed, tracker row added, frontmatter `version`/`date` bumped, session/changelog/references stubs with `{{placeholders}}`) and writes `{{file_name}}_v{{next}}-source.md` — fill the placeholders and deliver. Use `research-buddy locate ` to find exact `@end` insertion-point lines before any `str_replace`. Without shell access, perform all edits manually in **this single message**: [Open Research Queue](#open-research-queue) (remove the completed row, reorder if priorities shifted), [Research Tracker](#research-tracker) (add the new row), [Adopted Rules](#adopted-rules), [Discarded Alternatives](#discarded-alternatives), [Session Notes](#session-notes), [Reasoning Journey](#reasoning-journey), [References](#references), [Changelog](#changelog), and the YAML frontmatter `version`/`date`. Output as `{{file_name}}_v{{version}}-source.md`. Bump version per [Versioning](#versioning). 6. **Pause only on a blocking issue.** Concrete examples: an unresolvable contradiction with two equally-weighted Tier-1 sources; a missing input the user must supply (e.g. a domain expert decision on scope); a queue item with no defined Objective the agent cannot infer from context; **a compliance-validation failure**. Compliance failures are always blocking — the agent MUST NOT deliver the file artifact when validation fails. In any other case, write — do not ask. If blocked, emit the `status=blocked` marker and describe the issue concisely. 7. **Print a concise change summary** wrapped in `` and `` extraction markers (each on its own line, immediately around the summary). Plain language, ~3–8 lines. Shell-access shortcut: `research-buddy diff-summary

` generates the mechanical part (version bump, queue/tracker moves, rules/DAs/sessions added, append-only verdict) leaving the narrative as a `{{placeholder}}` — fill it and wrap in the extraction markers. When the output includes a `` block (appears when new Adopted Rules were added), paste it into the source file's [Session Notes](#session-notes) section immediately after ``, then fill each `{{downstream files or specs to update}}` with the relevant implementation project files or specs. Example: *"v1.4 written. Q-016 closed. R-CHUNK-4 revised (markdown-link-depth semantics). 6 DAs added (DA-Q016-1…6). Queue: Q-016 removed; Q-013 is now the top row. Cross-section contradictions: 0 unresolved. Compliance validation: PASS."* 8. **Offer derived files.** End with: "Want the clean view (no framework) or HTML rendering? Default returns the source only." Do not generate them by default. 9. **End with the Turn 2 marker** (see [Turn markers](#turn-markers)). ### Source tiers (categories) The project defines specific venues per tier in [Project Specification > Source tiers](#source-tiers). The category meanings are framework-fixed: - **Tier 1** = primary evidence; the only tier that supports `VALIDATED` claims and quantitative thresholds. - **Tier 2** = official secondary (vendor docs, textbooks, institutional reports). - **Discovery** = leads only; any claim sourced from Discovery alone is tagged `PROPOSED` until promoted via Tier 1/2 verification. - **Never** = anonymous content, AI-generated overviews without human authorship, sources without traceable authorship, unverifiable PDFs. ### Editing this file This Markdown document uses two independent anchor systems: - **HTML-comment anchors** for agent `str_replace` surgery: `` and `` for sections; ``, ``, `` for entries. - **Auto-generated heading slugs + inline `` tags** for human/renderer navigation: `[Queue](#open-research-queue)` for sections (auto-slug), `[R-CHUNK-4](#r-chunk-4)` for rules, `[DA-Q016-1](#da-q016-1)` for DAs, `[Q-001](#q-001)` for sessions (each entry has an `` tag inside its block template). Both systems coexist; neither modifies the other. Full protocol in [File editing](#file-editing). **Cross-references in body prose.** When mentioning another section, rule (`R-XXX-N`), DA (`DA-XXX`), session (`Q-NNN`), or the entry by ID anywhere in the document — including session notes, change summaries, reasoning journey, and chat output — render as a clickable link to the target ID. Don't write plain-text references when a stable link target exists. The cost is one pair of brackets and parens; the benefit is browsable navigation throughout the document and chat. **Turn-marker placement.** Every turn-ending message ends with **exactly two lines**, on their own, in this order, with nothing after them: ``` --- {{banner_text}} --- ``` Detection regex: ``. The full state list is in [Turn markers](#turn-markers). --- ## Framework (Reference) ### File editing The agent edits this file surgically with `str_replace` calls keyed off anchor strings. The format is designed so that ~90% of edits are scoped to a known anchor plus a unique nearby string. **Conventions:** 1. **Anchors are sacred.** Never rename, reorder, or delete an HTML-comment anchor (``, ``, ``, ``, ``). Add new ones freely. 2. **Every major section has both an `@anchor` opener and an `@end` closer.** Insert new entries immediately before the `@end` marker. 3. **Append-only sections:** Discarded Alternatives, References, Changelog. Never delete entries; mark superseded items with status, not by deletion. **Exception — Deliverable Synthesis is a living section:** it is rewritten wholesale when the user picks empty-queue option (5). Every claim must be traceable to a Session Notes entry, Research Tracker finding, or Reference (cite-or-cut). The validator does not enforce preservation of its content between versions. 4. **YAML frontmatter** between `---` delimiters at the very top is the structured metadata. Add fields if needed; do not reorder existing ones. 5. **Structured data uses fenced code blocks** with language hints `yaml rule`, `yaml da`, `yaml ref`. The Markdown body for the entry follows the code block. 6. **Tables** for tabular data. Append rows immediately before the section's `@end` marker. Never delete rows from append-only sections; mark superseded by changing the Status column. The Open Research Queue is the exception: completed rows are removed (their finding lives in the Research Tracker). 7. **Raw HTML is allowed for the following purposes only:** anchor comments (``), inline link-target tags (``), inline SVG illustrations, and inline status chips (`…`). The full set of allowed presentation primitives lives in [Element catalog](#element-catalog). Use Markdown for prose, tables, headings, lists, quotes, and code; the catalog is a closed list — anything outside it should be expressed as plain Markdown rather than invented locally. The HTML build **enforces** this: active content (`