--- name: evidence-draft description: | Create per-subsection evidence packs (NO PROSE): claim candidates, concrete comparisons, evaluation protocol, limitations, plus citation-backed evidence snippets with provenance. **Trigger**: evidence draft, evidence pack, claim candidates, concrete comparisons, evidence snippets, provenance, 证据草稿, 证据包, 可引用事实. **Use when**: `outline/subsection_briefs.jsonl` exists and you want evidence-first section drafting where every paragraph can be backed by traceable citations/snippets. **Skip if**: `outline/evidence_drafts.jsonl` already exists and is refined (no placeholders; >=8 comparisons per subsection; `blocking_missing` empty). **Network**: none (richer evidence improves with abstracts/fulltext). **Guardrail**: NO PROSE; do not invent facts; only use citation keys that exist in `citations/ref.bib`. --- # Evidence Draft (NO PROSE) Purpose: turn `papers/paper_notes.jsonl` + subsection mapping into **writeable evidence packs** so the writer never has to guess (and never copies outline placeholders). Key design: every pack should contain **evidence snippets** (1–2 sentences) with **provenance** (abstract/fulltext/notes pointer). Even abstract-level snippets are better than template prose. Why this matters for writing quality: - Packs are the writer substrate; if packs are thin, C5 will either pad prose or guess. - Treat `blocking_missing` as a stop signal: route upstream (notes/bindings) instead of writing around gaps. ## Role cards (prompt-level guidance) - **Evidence Curator** - Mission: turn paper notes into contrastable, citeable evidence (not summaries). - Do: extract snippet-backed comparisons; surface protocol details and failure modes. - Avoid: axis-driven "hypotheses" that are not supported by snippets. - **Provenance Accountant** - Mission: keep every snippet auditable. - Do: attach provenance pointers (abstract/fulltext/notes location) and keep excerpts sentence-level. - Avoid: untraceable paraphrases that cannot be verified. - **Skeptic** - Mission: prevent evidence inflation. - Do: downgrade claims when evidence is abstract/title-only; convert unknowns into `verify_fields` (not repeated boilerplate in prose). - Avoid: strong conclusions without protocol/metric context. ## Non-negotiables - NO PROSE: packs are bullets-only evidence, not narrative paragraphs. - No fabrication: do not invent datasets/metrics/numbers. - Citation hygiene: every cite key must exist in `citations/ref.bib`. - Claim candidates must be snippet-derived (no axis-driven “Hypothesis: …” items); put questions into `verify_fields` instead. - Avoid silent truncation: keep `claim_candidates[].claim` long enough to carry concrete detail (<= ~400 chars) and keep highlight `excerpt` sentence-level (<= ~280 chars). - Numeric-claim hygiene (evidence substrate): - If a snippet/claim includes digits or `%`, also include minimal context in the same bullet (at least 2 of: task/setting, metric definition, constraint/budget/tool access). - If the context is not present in notes/fulltext, do not keep the number; downgrade to qualitative wording and add a `verify_fields` item instead. - Evidence-aware language: - fulltext-backed → can summarize comparisons - abstract-only/title-only → must be provisional + list verify-fields (no strong “dominant trade-offs” language) ## Inputs Required: - `outline/subsection_briefs.jsonl` - `papers/paper_notes.jsonl` - `citations/ref.bib` Optional (recommended for addressable evidence): - `papers/evidence_bank.jsonl` - `outline/evidence_bindings.jsonl` ## Outputs - `outline/evidence_drafts.jsonl` Optional (human-readable): - `outline/evidence_drafts/` (folder of per-subsection Markdown packs) ## Output format (`outline/evidence_drafts.jsonl`) JSONL (one JSON object per line). Required fields per record: - `sub_id`, `title` - `evidence_ids` (list[str]; if `outline/evidence_bindings.jsonl` exists, this should match the binding for the subsection) - `evidence_level_summary` (counts: `fulltext|abstract|title`) - `evidence_snippets` (list; each has `text`, `paper_id`, `citations`, `provenance`) - `definitions_setup` (list of cited bullets) - `claim_candidates` (3–5 items; each has `claim`, `citations`, `evidence_field`) - `concrete_comparisons` (>=8 items; each has `axis`, `A_papers`, `B_papers`, `citations`, `evidence_field`; may also include `A_highlights`/`B_highlights` with snippet-backed contrast anchors) - `evaluation_protocol` (list of concrete protocol bullets + citations) - `failures_limitations` (>=5 cited bullets) - `blocking_missing` (list[str]; if non-empty, drafting must stop) - `verify_fields` (list[str]; non-blocking: fields to verify before making strong claims) ### Provenance schema (per snippet) Example: ```json {"source":"abstract","pointer":"papers/paper_notes.jsonl:paper_id=P0012#abstract","evidence_level":"abstract"} ``` Allowed `source`: `fulltext|abstract|paper_notes|title`. ## Workflow 1. Load `outline/subsection_briefs.jsonl` and read each subsection’s `rq`, `axes`, `clusters`, and evidence-level policy. 2. Load `papers/paper_notes.jsonl` and build a per-paper evidence index (`bibkey`, `evidence_level`, `abstract`, `fulltext_path`, `limitations`). 3. For each subsection: - Build **evidence_snippets** from mapped papers (prefer fulltext, else abstract), and record provenance (A150++: target >=12 snippets per H3). - Definitions/setup: 1–2 bullets that define setup + scope boundary (with citations). - Claim candidates: 3–5 **checkable** candidates (prefer snippet-derived; tag with `evidence_field`). - Concrete comparisons: >=8 A-vs-B comparisons (cluster-vs-cluster) along explicit axes. - Evaluation protocol: list concrete benchmark/metric names if extractable; otherwise treat as a blocking gap. - Failures/limitations: >=5 concrete limitations/failure modes with citations. - Set `blocking_missing` for hard blockers (e.g., no usable citations; title-only evidence; no eval tokens for an eval-heavy subsection). 4. Write `outline/evidence_drafts.jsonl` and per-subsection Markdown copies. ## Quality checklist - [ ] Every subsection has >=8 concrete comparisons. - [ ] `evidence_snippets` has >=12 items and includes provenance. - [ ] Any bullet containing numbers/% also carries minimal protocol context (task/metric/constraint), or the number is removed and moved to `verify_fields`. - [ ] `claim_candidates` has >=3 snippet-derived items (no axis-driven hypotheses). - [ ] `blocking_missing` is empty. - [ ] No `TODO` / `(placeholder)` / `` / unicode ellipsis (`…`) remains. ## Script ### Refinement marker (recommended; prevents churn) When you are satisfied with evidence packs (and `blocking_missing` is empty), create: - `outline/evidence_drafts.refined.ok` This is an explicit "I reviewed/refined this" signal: - prevents scripts from regenerating and undoing your work - (in strict runs) can be used as a completion signal before writing ### Quick Start - `python .codex/skills/evidence-draft/scripts/run.py --help` - `python .codex/skills/evidence-draft/scripts/run.py --workspace ` ### All Options - See `--help`. - Inputs (required): `outline/subsection_briefs.jsonl`, `papers/paper_notes.jsonl`, `citations/ref.bib`. - Inputs (optional): `papers/evidence_bank.jsonl`, `outline/evidence_bindings.jsonl`. ### Examples - Generate evidence packs after citations: - Ensure `citations/ref.bib` exists. - Ensure `outline/subsection_briefs.jsonl` exists (axes/clusters/plan filled). - Ensure `papers/paper_notes.jsonl` has usable evidence (abstract/fulltext/limitations). - Run: `python .codex/skills/evidence-draft/scripts/run.py --workspace workspaces/` ## Troubleshooting ### Issue: evidence packs have `blocking_missing` entries **Fix**: - Treat `blocking_missing` as a stop signal: enrich `papers/paper_notes.jsonl` / `papers/evidence_bank.jsonl` (or adjust scope) before drafting prose. ### Issue: evidence snippets are empty or untraceable **Fix**: - Ensure each snippet includes provenance (paper id + location) and that `outline/evidence_bindings.jsonl` is non-empty for the subsection.