---
name: protocol-writer
description: |
  Write a systematic review protocol into `output/PROTOCOL.md` (databases, queries, inclusion/exclusion, time window, extraction fields).
  **Trigger**: protocol, PRISMA, systematic review, inclusion/exclusion, 检索式, 纳入排除.
  **Use when**: systematic review pipeline 的起点（C1），需要先锁定 protocol 再开始 screening/extraction。
  **Skip if**: 不是做 systematic review（或 protocol 已经锁定且不允许修改）。
  **Network**: none.
  **Guardrail**: protocol 必须包含可执行的检索与筛选规则；需要 HUMAN 签字后才能进入 screening。
---

# Protocol Writer (systematic review, PRISMA-style)

Goal: produce an executable protocol that a different reviewer could follow and reproduce.

## Role cards (use explicitly)

### Methodologist (protocol author)

Mission: make every rule operational so another person can reproduce the review.

Do:
- Define scope and RQs in testable language (what counts as in/out).
- Write copy/paste executable queries per source, including time window and search date.
- Specify screening labels and tie-break policy.
- Define an extraction schema with allowed values/units and how to record unknowns.

Avoid:
- Vague criteria ("relevant", "state-of-the-art", "high quality").
- Hidden degrees of freedom (unstated language limits, unstated time window).

### Auditor (reproducibility checker)

Mission: remove ambiguity that would cause silent drift during screening/extraction.

Do:
- Add a short "decision log" section (what to record, where).
- Include a HUMAN approval gate statement before screening starts.

Avoid:
- Protocol prose that cannot be executed.

## Role prompt: Systematic Review Protocol Author

```text
You are writing a systematic review protocol that must be executable and auditable.

Your job is to define: scope, sources, queries, inclusion/exclusion, screening plan, extraction schema, and bias plan.

Constraints:
- rules must be operational (observable, testable)
- the protocol requires HUMAN approval before screening

Style:
- structured and concise
- avoid narrative filler; every paragraph should enable an action
```

## Inputs

Required:
- `STATUS.md` (context + scope notes)

Optional:
- `GOAL.md` (topic phrasing)
- `DECISIONS.md` (any pre-agreed constraints)

## Outputs

- `output/PROTOCOL.md`

## Workflow

1. Scope + research questions
   - Translate the goal in `GOAL.md` (if present) into 1–3 review questions.
   - State what is in-scope / out-of-scope (keep consistent with `STATUS.md`).
   - If `DECISIONS.md` exists, treat it as authoritative for any pre-agreed constraints.

2. Sources
   - List databases/sources you will search (e.g., arXiv, ACL Anthology, IEEE Xplore, ACM DL, PubMed).
   - Specify any manual routes (snowballing: references/cited-by).

3. Search strategy (copy/paste executable)
   - For each source, write a concrete query string.
   - Define the time window (from/to year) and language constraints.
   - Record “search date” so the run is auditable.

4. Inclusion / exclusion criteria (operational, not vague)
   - Write MUST-HAVE criteria (study type, domain, outcomes).
   - Write MUST-NOT criteria (wrong population/task; non-peer-reviewed if excluded; etc.).
   - Define how you handle duplicates and near-duplicates.

5. Screening plan
   - Define the screening stages (title/abstract → full text if applicable).
   - Define decision labels (at minimum include/exclude) and the tie-break policy.
   - Specify what gets recorded into `papers/screening_log.csv`.

6. Extraction schema (downstream contract)
   - Define the columns that will appear in `papers/extraction_table.csv`.
   - Ensure every column has: definition, allowed values/units, and what counts as “unknown”.

7. Bias / risk-of-bias plan
   - Define the bias domains you will use (simple scales are OK).
   - Keep the rating scale consistent (recommended: `low|unclear|high`) and auditable.

8. Write `output/PROTOCOL.md`
   - Use clear headings; avoid prose that cannot be operationalized.
   - End with an explicit “HUMAN approval required before screening” note.

## Mini examples (operational vs vague)

Inclusion criteria:
- Bad: `Include papers that are relevant to LLM agents.`
- Better: `Include studies that evaluate an LLM-based agent in an interactive environment (tool use or embodied/web/OS), reporting at least one task success metric under a described protocol.`

Exclusion criteria:
- Bad: `Exclude low-quality papers.`
- Better: `Exclude non-empirical position papers; exclude studies without an evaluation protocol or without any quantitative/qualitative outcome reporting.`

Query spec:
- Bad: "Search arXiv for agent papers"
- Better: provide an executable query string + fields (title/abstract) + time window + search date.

## Definition of Done

- [ ] `output/PROTOCOL.md` includes: RQs, sources, executable queries, time window, inclusion/exclusion, screening plan, extraction schema, bias plan.
- [ ] A human can read `output/PROTOCOL.md` and run screening without asking “what do you mean by X?”.

## Troubleshooting

### Issue: queries are too broad / too narrow

**Fix**:
- Add exclusions for common false positives; add missing synonyms/acronyms; restrict fields (title/abstract) where supported.

### Issue: screening/extraction criteria are vague (“relevant”, “state-of-the-art”)

**Fix**:
- Replace with observable rules (task/domain, metrics, dataset requirements, intervention/controls).