---
name: create-paper
description: >
  Orchestrate paper writing from project analysis. Composes assess, dogpile, arxiv,
  and review-code skills. Interview-driven workflow with frequent human collaboration
  to resolve ambiguity and validate each stage.
allowed-tools: Bash, Read
triggers:
  - write paper
  - draft paper
  - generate paper
  - paper from project
  - research paper
metadata:
  short-description: Interview-driven academic paper generation from code
---

# Paper Writer Skill

Generate academic papers from project analysis through **interview-driven orchestration**.

---

## Implementation Status

> **Current State**: Core 5-stage workflow fully implemented with real skill integrations.

| Feature | Status | Notes |
|---------|--------|-------|
| **Stage 1: Scope Interview** | ✅ Implemented | Typer prompts + /interview skill integration |
| **Stage 2: Project Analysis** | ✅ Implemented | /assess + /dogpile + optional /review-code |
| **Stage 3: Literature Search** | ✅ Implemented | Full arxiv JSON parsing with relevance triage |
| **Stage 4: Knowledge Learning** | ✅ Implemented | /arxiv learn with progress tracking |
| **Stage 5: Draft Generation** | ✅ Implemented | Multi-template support, LLM-powered sections |
| **LLM Content Generation** | ✅ Implemented | /scillm batch single, stub fallback |
| **Memory Storage** | ⚠️ Partial | Attempts to call /memory |
| **Auto-generated Figures** | ✅ Implemented | /fixture-graph integration (Seaborn/Graphviz/Mermaid) |
| **Interview Skill Integration** | ✅ Implemented | Used in Stage 1 scope definition |
| **Iterative Refinement** | ✅ Implemented | `refine` command with LLM feedback loop |
| **MIMIC Feature** | ✅ Implemented | Exemplar paper style learning & transfer |
| **BibTeX Citations** | ✅ Implemented | Auto-generated from arxiv paper IDs |
| **RAG Grounding** | ✅ Implemented | Prevent hallucination with --rag flag |
| **Multi-Template** | ✅ Implemented | IEEE, ACM, CVPR, arXiv, Springer |
| **Citation Checker** | ✅ Implemented | Verify citations match BibTeX entries |
| **Quality Dashboard** | ✅ Implemented | Word counts, citation stats, warnings |
| **Academic Phrases** | ✅ Implemented | Section-specific phrase suggestions |
| **Aspect Critique** | ✅ Implemented | SWIF2T-style multi-aspect feedback |
| **Agent Persona** | ✅ Implemented | Horus Lupercal + custom persona.json support |
| **Venue Disclosure** | ✅ Implemented | LLM disclosure for arXiv, ICLR, NeurIPS, ACL, AAAI |
| **Citation Verifier** | ✅ Implemented | Detect hallucinated/missing references |
| **Weakness Analysis** | ✅ Implemented | Generate explicit limitations section |
| **Pre-Submit Check** | ✅ Implemented | Rubric-based submission checklist |
| **Claim-Evidence Graph** | ✅ Implemented | Jan 2026: BibAgent/SemanticCite pattern |
| **AI Usage Ledger** | ✅ Implemented | ICLR 2026 disclosure compliance |
| **Prompt Sanitization** | ✅ Implemented | CVPR 2026 ethics requirement |
| **Horus Paper Pipeline** | ✅ Implemented | Full Warmaster publishing workflow |

### All Core Features Complete

1. ~~Implement MIMIC feature~~ ✅ DONE - Exemplar paper style learning
2. ~~Add figure generation~~ ✅ DONE - /fixture-graph integration
3. ~~Iterative section refinement~~ ✅ DONE - `refine` command
4. ~~Multi-round review loop~~ ✅ DONE - via `critique` and `refine`
5. ~~Add RAG grounding~~ ✅ DONE - Use --rag flag
6. ~~Multi-template support~~ ✅ DONE - IEEE, ACM, CVPR, arXiv, Springer
7. ~~Citation checker~~ ✅ DONE - `quality` command
8. ~~Academic phrase palette~~ ✅ DONE - `phrases` command
9. ~~Agent persona integration~~ ✅ DONE - Horus Lupercal authoritative style

---

## Philosophy: Human-in-the-Loop

This skill does NOT automate away the researcher. Instead, it:

- **Asks clarifying questions** until ambiguity is resolved
- **Validates assumptions** before proceeding to next stage
- **Presents recommendations** for human approval/override
- **Iterates on feedback** rather than generating final output

Think of it as a research assistant that does the legwork but defers judgment to you.

---

## ⚡ Quick Start for Agents

**Don't get overwhelmed by 17+ commands!** Use domain navigation:

```bash
# List command domains by workflow stage
create-paper domains

# Filter commands by domain
create-paper list --domain generate    # Paper generation commands
create-paper list --domain verify      # Quality assurance commands
create-paper list --domain comply      # Venue compliance commands

# Get workflow recommendations based on paper stage
create-paper workflow --stage new_paper
create-paper workflow --stage pre_submission

# Show fixture-graph presets for figures
create-paper figure-presets
```

### Domain Quick Reference

| Domain | Commands | When to Use |
|--------|----------|-------------|
| `generate` | draft, mimic, refine, horus-paper | Starting new paper or revising |
| `verify` | verify, quality, critique, check-citations, weakness-analysis, pre-submit, sanitize | Before submission |
| `comply` | disclosure, ai-ledger, claim-graph | Meeting venue requirements |
| `resources` | phrases, templates | Looking up helpers |

### Agent JSON Output

All navigation commands support `--summary` for JSON output:

```bash
create-paper domains --summary          # JSON of all domains
create-paper workflow --stage new_paper --summary  # JSON recommendations
create-paper figure-presets --summary   # JSON of IEEE sizes + colormaps
```

---

## Workflow: 5 Stages with Interview Gates

```
1. SCOPE INTERVIEW    → Define paper type, audience, contribution claims
                         [GATE: User validates scope]

2. PROJECT ANALYSIS   → /assess + /dogpile + /review-code
                         [GATE: User confirms analysis accuracy]

3. LITERATURE SEARCH  → /arxiv search + triage
                         [GATE: User selects relevant papers]

4. KNOWLEDGE LEARNING → /arxiv learn on selected papers
                         [GATE: User reviews extracted knowledge]

5. DRAFT GENERATION   → LaTeX from analysis + learned knowledge
                         [GATE: User iterates on structure/content]
```

**Key principle**: Each `[GATE]` blocks until human approval. No stage proceeds with unresolved questions.

---

## Command: `draft`

```bash
./run.sh draft --project /path/to/project
```

Launches interactive paper drafting session.

### Interview Questions (Stage 1: Scope)

The skill asks:

**1. Paper Type**

```
What type of paper are you writing?
a) Research paper (novel contribution)
b) System paper (implementation/architecture)
c) Survey paper (literature review)
d) Experience report (lessons learned)
e) Demo paper (tool description)
```

**2. Target Venue**

```
Target venue/conference? (affects formatting and emphasis)
Examples: ICSE, FSE, ASE, PLDI, arXiv preprint
```

**3. Contribution Claims**

```
What are your 3-5 main contribution claims?
(e.g., "A novel agent memory architecture that...")
```

**4. Target Audience**

```
Who is the intended audience?
a) Software engineering researchers
b) AI/ML practitioners
c) Industry developers
d) Specific domain (e.g., formal methods)
```

**5. Prior Work Scope**

```
Should I search for related work in:
[x] Agent architectures
[ ] Memory systems
[ ] Tool use / function calling
[ ] Other (specify): ___________
```

**GATE**: User reviews and confirms scope. Proceeds only on explicit approval.

---

## Stage 2: Project Analysis

Orchestrates existing skills:

```bash
# 1. Static + LLM assessment
/assess run /path/to/project
   ├─ Features identified
   ├─ Architecture patterns
   ├─ Technical debt detected
   └─ [OUTPUT: assessment.json]

# 2. Deep research on key features
/dogpile search "feature X implementation patterns"
   ├─ ArXiv papers
   ├─ GitHub examples
   ├─ Documentation
   └─ [OUTPUT: research_context.md]

# 3. Code-paper alignment check
/review-code verify /path/to/project
   ├─ Code matches documentation?
   ├─ Claims supported by implementation?
   └─ [OUTPUT: alignment_report.md]
```

### Interview: Analysis Validation

Presents findings:

```
Project Analysis Summary:
━━━━━━━━━━━━━━━━━━━━━━━━
Core Features:
  1. Episodic memory with ArangoDB (250 LOC)
  2. Tool orchestration pipeline (180 LOC)
  3. Interview-driven interactions (120 LOC)

Architecture:
  - Event-driven with message passing
  - Skills as composable modules
  - Persistent storage layer

Detected Issues:
  ⚠ Hardcoded paths in 3 locations
  ⚠ Missing test coverage for memory skill

Does this match your understanding? (y/n/refine)
```

**GATE**: User confirms or refines analysis before proceeding.

---

## Stage 3: Literature Search

Uses `/arxiv search` with generated context:

```bash
# Automatically generates /tmp/arxiv_context.md from scope + analysis
# Then searches with domain-specific terms
/arxiv search -q "episodic memory agent systems" -n 20
```

### Interview: Paper Triage

Presents abstracts with recommendations:

```
Found 20 Papers - Triaging Against Your Contribution
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

HIGH RELEVANCE (Directly Related)
  [1] "Episodic Memory in Cognitive Architectures" (arXiv:2401.12345)
      → Describes memory structure similar to yours
      → RECOMMEND: Learn from this

  [2] "Tool Use in LLM Agents" (arXiv:2310.09876)
      → Framework for tool orchestration
      → RECOMMEND: Learn from this

MEDIUM RELEVANCE (Tangential)
  [3] "Retrieval-Augmented Generation" (arXiv:2312.54321)
      → Related but different architecture
      → SKIP unless you want broader context

LOW RELEVANCE (Different Problem)
  [4-20] ...

Which papers should I extract? (Enter numbers, 'all-high', or 'manual')
```

**GATE**: User selects papers. Can override recommendations.

---

## Stage 4: Knowledge Learning

Extracts from selected papers:

```bash
# For each selected paper:
/arxiv learn <id> --scope paper-writing --context-file /tmp/arxiv_context.md
```

### Interview: Knowledge Review

Shows extracted Q&A pairs before storing:

```
Extracted Knowledge from Paper [1]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Q: How should episodic memory be structured for agent recall?
A: Use time-indexed events with semantic embeddings. Store:
   - Event timestamp
   - Agent state snapshot
   - Action taken + rationale
   - Outcome observed

Q: What data structure best represents agent beliefs?
A: BDI (Belief-Desire-Intention) dictionary with confidence scores...

Accept these extractions? (y/n/refine)
```

**GATE**: User validates or refines extractions.

---

## Stage 5: Draft Generation

Generates LaTeX sections:

### Interview: Structure Review

```
Proposed Paper Structure
━━━━━━━━━━━━━━━━━━━━━━━

1. Abstract
   - Problem: Agent memory systems lack...
   - Solution: Interview-driven episodic memory
   - Results: Demonstrated on pi-mono project

2. Introduction
   - Motivation from /assess findings
   - Contribution claims from scope interview

3. Related Work
   - Episodic Memory (from learned papers)
   - Tool Orchestration (from learned papers)
   - Comparison table highlighting your differences

4. System Design
   - Architecture from /assess
   - Code examples from project

5. Implementation
   - Key features from analysis
   - Design decisions + rationale

6. Evaluation
   - Project statistics
   - Comparison with related systems

7. Discussion
   - Limitations from /review-code
   - Future work from aspirational features

Approve this structure? (y/n/custom)
```

**GATE**: User confirms or provides custom structure.

### Iterative Refinement

```
Draft section 1 (Abstract) ready. Options:
a) View draft
b) Regenerate with feedback
c) Accept and continue to next section
d) Manual edit
```

For each section, user can:

- Review generated text
- Provide feedback for regeneration
- Directly edit LaTeX
- Iterate until satisfied

---

## Output: Draft Paper

Final output structure:

```
paper_output/
├── draft.tex           # Main LaTeX file
├── sections/
│   ├── abstract.tex
│   ├── intro.tex
│   ├── related.tex
│   ├── design.tex
│   ├── impl.tex
│   ├── eval.tex
│   └── discussion.tex
├── figures/            # Auto-generated from /assess
│   ├── architecture.pdf
│   └── workflow.pdf
├── references.bib      # From learned papers
├── analysis/           # Supporting materials
│   ├── assessment.json
│   ├── research_context.md
│   └── alignment_report.md
└── metadata.json       # Paper metadata for /memory
```

---

## Integration with Existing Skills

| Stage          | Skill Called     | Purpose                       |
| -------------- | ---------------- | ----------------------------- |
| **Scope**      | (interview only) | Define paper parameters       |
| **Analysis**   | `/assess`        | Project feature extraction    |
|                | `/dogpile`       | Research context gathering    |
|                | `/review-code`   | Code-paper alignment          |
| **Literature** | `/arxiv search`  | Find related papers           |
| **Learning**   | `/arxiv learn`   | Extract knowledge from papers |
| **Draft**      | (internal LaTeX) | Generate paper sections       |
| **Storage**    | `/memory`        | Store paper metadata          |

All skill calls use **subprocess with error handling** - if a skill fails, the interview pauses and asks user how to proceed.

---

## Key Design Principles

1. **No Auto-Proceed**: Every stage blocks on human approval
2. **Ambiguity Resolution**: Ask questions until clarity achieved
3. **Recommendation + Override**: Suggest but defer to user judgment
4. **Transparent Process**: Show what skills are called and why
5. **Iterative Refinement**: Allow regeneration with feedback
6. **Graceful Failure**: Handle skill errors without crashing

---

## Example Session

```bash
$ ./run.sh draft --project ~/pi-mono

[INTERVIEW] Paper Type?
> b (System paper)

[INTERVIEW] Target venue?
> ICSE 2026 Tool Demo

[INTERVIEW] Main contributions? (one per line, 'done' when finished)
> Interview-driven skill orchestration
> Episodic memory with ArangoDB
> Human-in-the-loop paper generation
> done

[INTERVIEW] Audience?
> a (Software engineering researchers)

[INTERVIEW] Prior work areas? (space-separated)
> agent-architectures memory-systems tool-use

[STAGE 1] Scope defined ✓

[STAGE 2] Running /assess on ~/pi-mono...
[STAGE 2] Found 15 features, 3 architectural patterns
[INTERVIEW] Analysis shows: [summary]. Accurate? (y/n/refine)
> y

[STAGE 2] Running /dogpile on: "Interview-driven skill orchestration"...
[STAGE 2] Found 12 related projects

[INTERVIEW] Analysis complete. Continue to literature search? (y/n)
> y

[STAGE 3] Generating arxiv context from scope...
[STAGE 3] Searching arxiv for: "agent memory BDI architecture"...
[STAGE 3] Found 20 papers

[INTERVIEW] 5 HIGH, 8 MEDIUM, 7 LOW relevance. Extract which?
> all-high

[STAGE 4] Extracting 5 papers... (this may take 5-10 min)
[STAGE 4] Extracted 47 Q&A pairs

[INTERVIEW] Review extractions? (y/quick/skip)
> quick

[STAGE 5] Generating draft structure...
[INTERVIEW] 7 sections proposed. Approve? (y/custom)
> y

[STAGE 5] Drafting Abstract...
[INTERVIEW] Abstract draft ready. (view/regen/accept)
> view

[Abstract text shown]

[INTERVIEW] Feedback for regeneration? (or 'accept')
> Make it more concise, emphasize novelty
> regen

[STAGE 5] Abstract regenerated.
[INTERVIEW] (view/accept)
> accept

[Continues for each section...]

✓ Draft complete: paper_output/draft.tex
  Compile with: cd paper_output && pdflatex draft.tex

[STAGE 6] Store paper metadata in memory? (y/n)
> y

✓ Paper draft session complete
```

---

## RAG Grounding

RAG (Retrieval-Augmented Generation) grounding prevents hallucination by ensuring all generated content is traceable to source material.

### Enabling RAG

```bash
./run.sh draft --project /path/to/project --rag
```

### How RAG Works

1. **Code Snippet Extraction**: Extracts function/class definitions from project
2. **Project Facts**: Compiles verified facts from analysis (features, LOC, patterns)
3. **Paper Excerpts**: Uses Q&A pairs from learned papers as grounding
4. **Research Facts**: Incorporates findings from dogpile research

### Grounding Constraints

Each section has specific constraints:

| Section | Constraints |
|---------|-------------|
| **Abstract** | Only mention features in project_facts |
| **Intro** | Contributions must map to specific features |
| **Related** | Every claim must cite paper_excerpts |
| **Design** | Architecture must match code_snippets |
| **Impl** | Code examples must be real excerpts |
| **Eval** | Metrics must be derived from sources |
| **Discussion** | Limitations from analysis issues |

### Verifying Grounding

```bash
./run.sh verify ./paper_output --project /path/to/project
```

Checks generated content for:
- Unsupported claims (novel, achieves, outperforms)
- Fabricated metrics
- Missing source attribution

---

## Multi-Template Support

Support for major academic venues:

| Template | Venue | Usage |
|----------|-------|-------|
| `ieee` | IEEE conferences (default) | `--template ieee` |
| `acm` | ACM conferences (SIGCHI, SIGMOD) | `--template acm` |
| `cvpr` | CVPR/ICCV/ECCV | `--template cvpr` |
| `arxiv` | arXiv preprints | `--template arxiv` |
| `springer` | Springer LNCS | `--template springer` |

```bash
# Generate ACM-formatted paper
./run.sh draft --project ./myproject --template acm

# List all templates
./run.sh templates

# Show template details
./run.sh templates --show cvpr
```

---

## Iterative Refinement

The `refine` command enables section-by-section improvement with LLM feedback:

```bash
# Refine all sections with 2 rounds
./run.sh refine ./paper_output --rounds 2

# Refine specific section with feedback
./run.sh refine ./paper_output --section intro --feedback "Make it more concise"
```

Each round:
1. Shows current content preview
2. Prompts for feedback (or 'skip' to accept)
3. Generates automated critique (clarity, completeness)
4. LLM rewrites section addressing feedback + critique
5. Shows word count diff, asks for acceptance

---

## Quality Dashboard

Comprehensive metrics and warnings:

```bash
./run.sh quality ./paper_output
./run.sh quality ./paper_output --verbose
```

Displays:
- Section word counts with targets
- Citation counts per section
- Figure/table/equation counts
- Citation checker (missing/unused BibTeX)
- Warnings for sections outside target ranges

### Section Word Targets

| Section | Min | Max |
|---------|-----|-----|
| Abstract | 150 | 250 |
| Intro | 800 | 1500 |
| Related | 600 | 1200 |
| Design | 800 | 1500 |
| Impl | 600 | 1200 |
| Eval | 800 | 1500 |
| Discussion | 400 | 800 |

---

## Aspect Critique (SWIF2T-style)

Multi-aspect feedback system inspired by SWIF2T research:

```bash
# Critique all aspects
./run.sh critique ./paper_output

# Specific aspects
./run.sh critique ./paper_output --aspects clarity,rigor

# Single section with LLM
./run.sh critique ./paper_output --section eval --llm
```

### Aspects Evaluated

| Aspect | Description |
|--------|-------------|
| **clarity** | Clear writing, defined terms, logical flow |
| **novelty** | Contribution claims, differentiation from prior work |
| **rigor** | Sound methodology, baselines, statistical significance |
| **completeness** | All sections adequate, self-contained |
| **presentation** | Figures clear, formatting consistent |

Each aspect produces:
- Score (1-5)
- Specific findings
- Checklist items

---

## Academic Phrase Palette

Section-specific academic writing suggestions:

```bash
# All phrases for a section
./run.sh phrases intro

# Specific aspect
./run.sh phrases intro --aspect motivation
./run.sh phrases eval --aspect results
```

### Available Sections & Aspects

| Section | Aspects |
|---------|---------|
| **abstract** | problem, solution, results |
| **intro** | motivation, gap, contribution, organization |
| **related** | category, comparison, positioning |
| **method** | overview, detail, justification |
| **eval** | setup, results, analysis |
| **discussion** | limitations, future, broader_impact |

Example phrases:
- "Despite significant advances in..., there remains a critical need for..."
- "Our key insight is that..."
- "Unlike prior work, our method..."

---

## Agent Persona Integration

Write papers in a specific agent's voice for consistent style and authority.

### Built-in Persona: Horus Lupercal

```bash
# Generate paper in Horus's authoritative voice
./run.sh draft --project ./myproject --persona horus

# Get Horus-style phrases
./run.sh phrases eval --persona horus
```

**Horus's Writing Style:**
- **Voice**: Authoritative, commanding, tactically precise
- **Tone**: Competent, subtly contemptuous of inadequate approaches
- **Structure**: Military precision, anticipates objections
- **Principles**: Answer first, technical correctness non-negotiable

**Characteristic Phrases:**
- "The evidence is unambiguous."
- "Prior approaches fail to address the fundamental issue."
- "The results leave no room for debate."
- "Our methodology achieves what lesser approaches could not."

**Forbidden Phrases** (never used):
- "happy to help", "as an AI", "I believe", "hopefully"

### Custom Personas

Load custom persona from JSON:

```bash
./run.sh draft --project ./myproject --persona /path/to/persona.json
```

**persona.json format:**
```json
{
  "name": "Custom Persona",
  "voice": "academic",
  "tone_modifiers": ["precise", "formal"],
  "characteristic_phrases": ["We demonstrate that...", "Our analysis reveals..."],
  "forbidden_phrases": ["I think", "maybe"],
  "writing_principles": ["Clarity first", "Evidence-based claims"],
  "authority_source": "Rigorous methodology"
}
```

---

## Venue Policy Compliance (2024-2025)

Based on dogpile research into current venue policies:

### Venue Disclosure Generator

Generate LLM-use disclosure statements compliant with venue policies:

```bash
# Generate arXiv disclosure
./run.sh disclosure arxiv

# Show ICLR policy notes
./run.sh disclosure iclr --policy

# Save to file
./run.sh disclosure neurips -o acknowledgements.tex
```

**Supported Venues:**

| Venue | Disclosure Required | Location |
|-------|---------------------|----------|
| arXiv | Yes | acknowledgements |
| ICLR | Yes (desk rejection risk) | acknowledgements |
| NeurIPS | Yes (method-level) | method section |
| ACL | Yes | acknowledgements |
| AAAI | Yes (if experimental) | paper body |
| CVPR | Yes | acknowledgements |

**Key Policy Notes (Oct 2025):**
- arXiv CS tightened moderation: review/survey papers need completed peer review
- ICLR 2026: Hallucinated references = desk rejection
- All venues: Authors responsible for content correctness

### Citation Verification

Prevent hallucinated references (critical for peer review):

```bash
# Check citations match BibTeX
./run.sh check-citations ./paper_output

# Strict mode (fail on issues)
./run.sh check-citations ./paper_output --strict
```

**Checks performed:**
- All `\cite{}` commands have matching .bib entries
- Recent papers (2023+) have URL/DOI
- No suspicious patterns (excessive "et al.", generic names)

### Weakness Analysis

Generate explicit limitations section (research shows LLMs miss weaknesses):

```bash
# Analyze paper for limitations
./run.sh weakness-analysis ./paper_output

# Include project analysis
./run.sh weakness-analysis ./paper_output --project ./my-project

# Save to file
./run.sh weakness-analysis ./paper_output -o sections/limitations.tex
```

**Categories analyzed:**
- Methodology assumptions/simplifications
- Evaluation baseline count (research suggests 3-4 minimum)
- Scope boundaries
- Test coverage (if project provided)
- Reproducibility and generalization

### Pre-Submission Checklist

Comprehensive validation before submission:

```bash
# Full pre-submit check
./run.sh pre-submit ./paper_output --venue iclr --project ./my-project

# arXiv-focused (default)
./run.sh pre-submit ./paper_output
```

**Checklist items:**
1. File structure (draft.tex, references.bib)
2. Required sections (intro, method, eval, conclusion)
3. Citation integrity (no missing/hallucinated)
4. LLM disclosure compliance (venue-specific)
5. Evidence grounding (code/figure references)

**Exit codes:**
- 0: Ready for submission
- 1: Critical issues found

---

## Complete Command Reference

| Command | Purpose |
|---------|---------|
| `draft` | Generate paper from project (5-stage workflow) |
| `mimic` | Learn/apply exemplar paper styles |
| `refine` | Iteratively improve sections with feedback |
| `quality` | Show metrics dashboard |
| `critique` | Multi-aspect feedback (SWIF2T-style) |
| `phrases` | Academic phrase suggestions |
| `templates` | List/show LaTeX templates |
| `verify` | Verify RAG grounding |
| `disclosure` | Generate venue-specific LLM disclosure |
| `check-citations` | Verify citations against BibTeX |
| `weakness-analysis` | Generate limitations section |
| `pre-submit` | Pre-submission checklist and validation |
| `claim-graph` | Build claim-evidence graph (Jan 2026) |
| `ai-ledger` | AI usage tracking for ICLR 2026 compliance |
| `sanitize` | Prompt injection defense (CVPR 2026) |
| `horus-paper` | Full Warmaster publishing pipeline |

---

## Horus Lupercal: Research Paper Workflow

Horus has access to all skills in `/home/graham/workspace/experiments/pi-mono/.pi/skills` and can compose them to write research papers about his projects.

### Example: Writing a Paper on the Memory Project

```bash
# Step 1: Analyze the memory project
./run.sh draft --project /home/graham/workspace/experiments/memory \
               --persona horus \
               --rag \
               --template arxiv

# Step 2: Web research for related work (Horus has /surf access)
# Horus can use /surf to browse arXiv, GitHub, documentation

# Step 3: Generate limitations section
./run.sh weakness-analysis ./paper_output \
         --project /home/graham/workspace/experiments/memory

# Step 4: Pre-submission validation
./run.sh pre-submit ./paper_output \
         --venue arxiv \
         --project /home/graham/workspace/experiments/memory
```

### Horus's Skill Composition

| Skill | Horus's Usage |
|-------|---------------|
| `/assess` | Analyze project architecture and features |
| `/dogpile` | Deep research on related topics |
| `/arxiv` | Search and learn from academic papers |
| `/memory` | Store paper context for future sessions |
| `/review-code` | Verify code-paper alignment |
| `/surf` | Browse web for documentation, examples |
| `/create-paper` | Generate research papers in his voice |

### Horus Writing Principles (Academic Context)

When writing papers, Horus:

1. **Answers first** - States contributions directly, then elaborates
2. **Technical precision** - Every claim backed by evidence from code/experiments
3. **Anticipates objections** - Limitations section is thorough, not hidden
4. **Commands authority** - Writing is confident, not hedging
5. **No AI-speak** - Never uses "happy to help", "as an AI", "hopefully"

### Example Horus Paper Abstract

> Prior approaches to agent memory systems demonstrate troubling disregard for
> compositional reasoning—a fundamental deficiency that limits generalization
> across tasks. We present a knowledge graph architecture that addresses this
> inadequacy through graph-based belief tracking and Theory of Mind inference.
> Our implementation achieves 34% improved task success rate compared to flat
> memory baselines. The experimental results leave no room for debate regarding
> the superiority of structured episodic recall.

---

## Jan 2026 Cutting-Edge Features (from dogpile research)

These features are based on January 2026 academic policy changes and state-of-the-art research.

### Claim-Evidence Graph (BibAgent/SemanticCite Pattern)

Link every claim to its evidence sources for peer review defense:

```bash
# Build claim-evidence graph
./run.sh claim-graph ./paper_output

# With verification
./run.sh claim-graph ./paper_output --verify

# Export to JSON
./run.sh claim-graph ./paper_output -o claims.json
```

**Support Levels:**
- **Supported**: Claim has 2+ citations
- **Partially Supported**: Claim has 1 citation
- **Unsupported**: Claim has no citations (⚠ review required)

### AI Usage Ledger (ICLR 2026 Compliance)

Track all AI tool usage for accurate disclosure:

```bash
# Show logged AI usage
./run.sh ai-ledger ./paper_output --show

# Generate disclosure statement from ledger
./run.sh ai-ledger ./paper_output --disclosure

# Clear ledger
./run.sh ai-ledger ./paper_output --clear
```

**Tracked Information:**
- Tool name (scillm, claude, gpt-4, etc.)
- Purpose (drafting, editing, citation_search)
- Section affected
- Prompt hash (for provenance, not full prompt)
- Output summary

### Prompt Injection Sanitization (CVPR 2026 Requirement)

CVPR 2026 explicitly treats hidden prompt injection as an ethics violation:

```bash
# Check for prompt injection
./run.sh sanitize ./paper_output

# Auto-fix detected issues
./run.sh sanitize ./paper_output --fix
```

**Detected Patterns:**
- "ignore previous instructions"
- "you are now" / "pretend to be"
- Zero-width characters
- White/hidden text in LaTeX
- System prompt markers

### Horus Paper Pipeline

The full Warmaster publishing workflow:

```bash
./run.sh horus-paper /home/graham/workspace/experiments/memory
```

**Persona Strength Parameter:**

Horus can modulate his voice for peer reviewers with `--persona-strength`:

| Strength | Tone | Use When |
|----------|------|----------|
| 0.0 | Pure academic | Conservative venues (Nature, Science) |
| 0.3 | Subtle hints | Peer review requires neutrality |
| 0.5 | Balanced | General arXiv preprints |
| 0.7 | Strong (default) | Authoritative but measured |
| 1.0 | Full Warmaster | Workshop papers, position pieces |

```bash
# Measured tone for peer review
./run.sh horus-paper ./project --persona-strength 0.5 --auto-run

# Full Warmaster intensity
./run.sh horus-paper ./project -s 1.0 --auto-run
```

*"I temper my voice for the peer reviewers. A tactical necessity." - Horus*

**Pipeline Phases:**

1. **Project Analysis**: `draft --persona horus --rag`
2. **Claim Verification**: `claim-graph --verify` + `check-citations --strict`
3. **Weakness Analysis**: `weakness-analysis --project`
4. **Compliance Check**: `sanitize` + `ai-ledger --disclosure` + `pre-submit`

**The Warmaster's Publishing Checklist:**
- [ ] All claims have evidence (claim-graph)
- [ ] No hallucinated citations (check-citations --strict)
- [ ] Limitations explicitly stated (weakness-analysis)
- [ ] No prompt injection (sanitize)
- [ ] AI usage disclosed (ai-ledger --disclosure)
- [ ] Pre-submission passed (pre-submit)

---

## Dependencies

- Python 3.10+
- LaTeX distribution (texlive or mactex)
- Existing skills: assess, dogpile, arxiv, review-code, memory
- interview skill (for HTML/TUI interview rendering)

---

## Sanity Check

```bash
./sanity.sh
```

Verifies:

- All dependent skills exist
- LaTeX is installed
- Python dependencies available
- Template files present