---
name: paper-write
description: "Draft LaTeX paper section by section from an outline. Use when user says \"写论文\", \"write paper\", \"draft LaTeX\", \"开始写\", or wants to generate LaTeX content from a paper plan."
argument-hint: [venue-or-section]
allowed-tools: Bash(*), Read, Write, Edit, Grep, Glob, Agent, WebSearch, WebFetch, mcp__codex__codex, mcp__codex__codex-reply
---

# Paper Write: Section-by-Section LaTeX Generation

Draft a LaTeX paper based on: **$ARGUMENTS**

## Constants

- **REVIEWER_MODEL = `gpt-5.4`** — Model used via Codex MCP for section review. Must be an OpenAI model.
- **TARGET_VENUE = `ICLR`** — Default venue. Supported: `ICLR`, `NeurIPS`, `ICML`, `CVPR` (also ICCV/ECCV), `ACL` (also EMNLP/NAACL), `AAAI`, `ACM` (ACM MM, SIGIR, KDD, CHI, etc.), `IEEE_JOURNAL` (IEEE Transactions / Letters, e.g., T-PAMI, JSAC, TWC, TCOM, TSP, TIP), `IEEE_CONF` (IEEE conferences, e.g., ICC, GLOBECOM, INFOCOM, ICASSP). Determines style file and formatting.
- **ANONYMOUS = true** — If true, use anonymous author block. Set `false` for camera-ready. Note: most IEEE venues do NOT use anonymous submission — set `false` for IEEE.
- **MAX_PAGES = 9** — Main body page limit. For ML conferences: counts from first page to end of Conclusion section, references and appendix NOT counted. **For IEEE venues: references ARE counted toward the page limit.** Typical limits: IEEE journal = no strict limit (but 12-14 pages typical for Transactions, 4-5 for Letters), IEEE conference = 5-8 pages including references.
- **DBLP_BIBTEX = true** — Fetch real BibTeX from DBLP/CrossRef instead of LLM-generated entries. Eliminates hallucinated citations. Zero install required. Set `false` to use legacy behavior (LLM search + `[VERIFY]` markers).

## Inputs

1. **PAPER_PLAN.md** — outline with claims-evidence matrix, section plan, figure plan (from `/paper-plan`)
2. **NARRATIVE_REPORT.md** — the research narrative (primary source of content)
3. **Generated figures** — PDF/PNG files in `figures/` (from `/paper-figure`)
4. **LaTeX includes** — `figures/latex_includes.tex` (from `/paper-figure`)
5. **Bibliography** — existing `.bib` file, or will create one

If no PAPER_PLAN.md exists, ask the user to run `/paper-plan` first or provide a brief outline.

## Orchestra-Guided Writing Overlay

Keep the existing `insleep` workflow, file layout, and defaults. Use the shared references below only when they improve writing quality:

- Read `../shared-references/writing-principles.md` before drafting the Abstract, Introduction, Related Work, or when prose feels generic.
- Read `../shared-references/venue-checklists.md` during the final write-up and submission-readiness pass.
- Read `../shared-references/citation-discipline.md` only when the built-in DBLP/CrossRef workflow is insufficient.

These references are support material, not extra workflow phases.

## Templates

### Venue-Specific Setup

The skill includes conference templates in `templates/`.
Select based on TARGET_VENUE:

**ICLR:**

```latex
\documentclass{article}
\usepackage{iclr2026_conference,times}
% \iclrfinalcopy  % Uncomment for camera-ready
```

**NeurIPS:**

```latex
\documentclass{article}
\usepackage[preprint]{neurips_2025}  % Preprint (non-anonymous); drop the option for anonymous submission
% \usepackage[final]{neurips_2025}   % Camera-ready
```

**ICML:**

```latex
\documentclass{article}
\usepackage{icml2025}              % Anonymous submission
% \usepackage[accepted]{icml2025}  % Use [accepted] for camera-ready
```

**IEEE Journal** (Transactions, Letters):

```latex
\documentclass[journal]{IEEEtran}
\usepackage{cite}  % IEEE uses \cite{}, NOT natbib
% Author block uses \author{Name~\IEEEmembership{Member,~IEEE}}
```

**IEEE Conference** (ICC, GLOBECOM, INFOCOM, ICASSP, etc.):

```latex
\documentclass[conference]{IEEEtran}
\usepackage{cite}  % IEEE uses \cite{}, NOT natbib
% Author block uses \IEEEauthorblockN / \IEEEauthorblockA
```

### Project Structure

Generate this file structure:

```
paper/
├── main.tex                   # master file (includes sections)
├── iclr2026_conference.sty    # or neurips_2025.sty / icml2025.sty / IEEEtran.cls + IEEEtran.bst
├── math_commands.tex          # shared math macros
├── references.bib             # bibliography (filtered — only cited entries)
├── sections/
│   ├── 0_abstract.tex
│   ├── 1_introduction.tex
│   ├── 2_related_work.tex
│   ├── 3_method.tex           # or preliminaries, setup, etc.
│   ├── 4_experiments.tex
│   ├── 5_conclusion.tex
│   └── A_appendix.tex         # proof details, extra experiments
└── figures/                   # symlink or copy from project figures/
```

**Section files are FLEXIBLE**: If the paper plan has 6-8 sections, create corresponding files (e.g., `4_theory.tex`, `5_experiments.tex`, `6_analysis.tex`, `7_conclusion.tex`).

## Workflow

### Step 0: Backup and Clean

If `paper/` already exists, back up to `paper-backup-{timestamp}/` before overwriting. Never silently destroy existing work.

**CRITICAL: Clean stale files.** When changing section structure (e.g., 5 sections → 7 sections), delete section files that are no longer referenced by `main.tex`.
Stale files (e.g., old `5_conclusion.tex` left behind when conclusion moved to `7_conclusion.tex`) cause confusion and waste space.

### Step 1: Initialize Project

1. Create `paper/` directory
2. Copy venue template from `templates/` — the template already includes:
   - All standard packages (amsmath, hyperref, cleveref, booktabs, etc.)
   - Theorem environments with `\crefname{assumption}` fix
   - Anonymous author block
3. Generate `math_commands.tex` with paper-specific notation
4. Create section files matching PAPER_PLAN structure

**Author block (anonymous mode):**

```latex
\author{Anonymous Authors}
```

### Step 2: Generate math_commands.tex

Create shared math macros based on the paper's notation:

```latex
% math_commands.tex — shared notation
\newcommand{\R}{\mathbb{R}}
\newcommand{\E}{\mathbb{E}}
\DeclareMathOperator*{\argmin}{arg\,min}
\DeclareMathOperator*{\argmax}{arg\,max}
% Add paper-specific notation here
```

### Step 3: Write Each Section

Process sections in order. For each section:

1. **Read the plan** — what claims, evidence, citations belong here
2. **Read NARRATIVE_REPORT.md** — extract relevant content, findings, and quantitative results
3. **Draft content** — write complete LaTeX (not placeholders)
4. **Insert figures/tables** — use snippets from `figures/latex_includes.tex`
5. **Add citations** — for ML conferences (ICLR/NeurIPS/ICML/CVPR/ACL/AAAI): use `\citep{}` / `\citet{}` (natbib). **For IEEE venues**: use `\cite{}` (numeric style via `cite` package). Never mix natbib and cite commands.

Before drafting the front matter, re-read the one-sentence contribution from `PAPER_PLAN.md`. The Abstract and Introduction should make that takeaway obvious before the reader reaches the full method.
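
The citation rule in item 5 above (natbib commands for ML venues, `\cite{}` for IEEE, never both) can be checked mechanically once sections are drafted. The following is a minimal Python sketch and not part of the skill itself: the names `citation_style_report` and `is_mixed` are hypothetical, and the regexes assume that only `\cite`, `\citep`, and `\citet` (with optional `*` and bracketed arguments) appear in the draft.

```python
import re

# Hypothetical helpers for illustration; they cover \cite, \citep, \citet only.
NATBIB_RE = re.compile(r"\\cite[pt]\*?(?:\[[^\]]*\])*\{")
PLAIN_RE = re.compile(r"\\cite(?:\[[^\]]*\])?\{")
COMMENT_RE = re.compile(r"(?<!\\)%.*")  # unescaped % up to end of line


def citation_style_report(tex: str) -> dict:
    """Count natbib-style and plain cite commands in a LaTeX string."""
    # Drop LaTeX comments so commented-out citations are not counted.
    body = COMMENT_RE.sub("", tex)
    return {
        "natbib": len(NATBIB_RE.findall(body)),
        "plain": len(PLAIN_RE.findall(body)),
    }


def is_mixed(tex: str) -> bool:
    """Return True if the text uses both natbib and plain cite commands."""
    report = citation_style_report(tex)
    return report["natbib"] > 0 and report["plain"] > 0
```

Running `is_mixed` on the concatenated contents of `sections/*.tex` would flag a draft whose citation commands need normalizing before the bibliography is built in Step 4.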
#### Section-Specific Guidelines

**§0 Abstract:**

- Use the 5-part flow from `../shared-references/writing-principles.md`: what, why hard, how, evidence, strongest result
- Must be self-contained (understandable without reading the paper)
- Start with the paper's specific contribution, not generic field-level background
- Include one concrete quantitative result
- 150-250 words (check venue limit)
- No citations, no undefined acronyms
- No `\begin{abstract}` — that's in main.tex

**§1 Introduction:**

- Open with a compelling hook (1-2 sentences, problem motivation)
- State the gap clearly ("However, ...")
- Give a brief approach overview before the reader gets lost in details
- List 2-4 specific, falsifiable contributions as a numbered or bulleted list
- Preview the strongest result early instead of saving it for the experiments section
- End with a brief roadmap ("The rest of this paper is organized as...")
- Include the main result figure if space allows
- Target: 1-1.5 pages
- Methods should begin by page 2-3 at the latest

**§2 Related Work:**

- **MINIMUM 1 full page** (3-4 substantive paragraphs). Short related work sections are a common reviewer complaint.
- Organize by category using `\paragraph{Category Name.}`
- Organize methodologically, by assumption class, or by research question; do not write paper-by-paper mini-summaries
- Each category: 1 paragraph summarizing the line of work + 1-2 sentences positioning this paper
- Do NOT just list papers — synthesize and compare
- End each paragraph with how this paper relates/differs

**§3 Method / Preliminaries / Setup:**

- Define notation early (reference math_commands.tex)
- Use `\begin{definition}`, `\begin{theorem}` environments for formal statements
- For theory papers: include proof sketches of key results in main body, full proofs in appendix
- For theory papers: include a **comparison table** of prior bounds vs.
  this paper
- Include algorithm pseudocode if applicable (`algorithm2e` or `algorithmic`)
- Target: 1.5-2 pages

**§4 Experiments:**

- Start with experimental setup (datasets, baselines, metrics, implementation details)
- Main results table/figure first
- Then ablations and analysis
- Every claim from the introduction must have supporting evidence here
- For each major experiment, make explicit what claim it supports and what the reader should notice
- Target: 2.5-3 pages

**§5 Conclusion:**

- Summarize contributions (NOT copy-paste from intro — rephrase)
- Limitations (be honest — reviewers appreciate this)
- Future work (1-2 concrete directions)
- Ethics statement and reproducibility statement (if venue requires)
- Target: 0.5 pages

**Appendix:**

- Proof details (full proofs of main-body theorems)
- Additional experiments, ablations
- Implementation details, hyperparameter tables
- Additional visualizations

### Step 4: Build Bibliography

**CRITICAL: Only include entries that are actually cited in the paper.**

1. Scan all citation references in the drafted sections (`\citep{}`/`\citet{}` for ML conferences, `\cite{}` for IEEE venues)
2. Build a citation key list
3. For each citation key:
   - Check existing `.bib` files in the project/narrative docs
   - If not found and **DBLP_BIBTEX = true**, use the verified fetch chain below
   - If not found and **DBLP_BIBTEX = false**, search arXiv/Scholar for correct BibTeX
   - **NEVER fabricate BibTeX entries** — mark unknown ones with `[VERIFY]` comment
4. Write `references.bib` containing ONLY cited entries (no bloat)

#### Verified BibTeX Fetch (when DBLP_BIBTEX = true)

Three-step fallback chain — zero install, zero auth, all real BibTeX:

**Step A: DBLP (best quality — full venue, pages, editors)**

```bash
# 1. Search by title + first author
curl -s "https://dblp.org/search/publ/api?q=TITLE+AUTHOR&format=json&h=3"

# 2. Extract DBLP key from result (e.g., conf/nips/VaswaniSPUJGKP17)

# 3. Fetch real BibTeX
curl -s "https://dblp.org/rec/{key}.bib"
```

**Step B: CrossRef DOI (fallback — works for arXiv preprints)**

```bash
# If paper has a DOI or arXiv ID (arXiv DOI = 10.48550/arXiv.{id})
curl -sLH "Accept: application/x-bibtex" "https://doi.org/{doi}"
```

**Step C: Mark `[VERIFY]` (last resort)**

If both DBLP and CrossRef return nothing, mark the entry with `% [VERIFY]` comment. Do NOT fabricate.

**Why this matters:** LLM-generated BibTeX frequently hallucinates venue names, page numbers, or even co-authors. DBLP and CrossRef return publisher-verified metadata. Upstream skills (`/research-lit`, `/novelty-check`) may mention papers from LLM memory — this fetch chain is the gate that prevents hallucinated citations from entering the final `.bib`.

If the DBLP/CrossRef flow is not enough, load `../shared-references/citation-discipline.md` for stricter fallback rules before adding placeholders.

**Automated bib cleaning** — use this Python pattern to extract only cited entries:

```python
import re
# 1. Grep all \citep{...}, \citet{...}, and \cite{...} from all .tex files
# 2. Extract unique keys (handle multi-cite like \citep{a,b,c} or \cite{a,b,c})
# 3. Parse the full .bib file, keep only entries whose key is in the cited set
# 4. Write the filtered bib
```

This prevents bib bloat (e.g., 948 lines → 215 lines in testing).

**Citation verification rules (from claude-scholar + Imbad0202):**

1. Every BibTeX entry must have: author, title, year, venue/journal
2. Prefer published venue versions over arXiv preprints (if published)
3. Use consistent key format: `{firstauthor}{year}{keyword}` (e.g., `ho2020denoising`)
4. Double-check year and venue for every entry
5. Remove duplicate entries (same paper with different keys)

### Step 5: De-AI Polish and Clarity Pass

After drafting all sections, scan for common AI writing patterns and fix them.

First, apply the sentence-level clarity rules from `../shared-references/writing-principles.md`:

- keep subject and verb close together,
- put familiar context first and new information later,
- place the most important information near the end of the sentence,
- let each paragraph do one job,
- use verbs for actions instead of nominalized nouns.

Then fix the common content patterns below:

- Significance inflation ("groundbreaking", "revolutionary" → use measured language)
- Formulaic transitions ("In this section, we..." → remove or vary)
- Generic conclusions ("This work opens exciting new avenues" → be specific)

**Language patterns to fix (watch words):**

- Replace: delve, pivotal, landscape, tapestry, underscore, noteworthy, intriguingly
- Remove filler: "It is worth noting that", "Importantly,", "Notably,"
- Avoid rule-of-three lists ("X, Y, and Z" appearing repeatedly)
- Don't start consecutive sentences with "This" or "We"
- Replace vague nouns with concrete ones when ambiguity is possible ("this result", "this ablation", "this theorem")

### Step 6: Cross-Review with REVIEWER_MODEL

Send the complete draft to REVIEWER_MODEL (`gpt-5.4`, reasoning effort `xhigh`):

```
mcp__codex__codex:
  model: gpt-5.4
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    Review this [VENUE] paper draft (main body, excluding appendix).

    Focus on:
    1. Does each claim from the intro have supporting evidence?
    2. Is the writing clear, concise, and free of AI-isms?
    3. Any logical gaps or unclear explanations?
    4. Does it fit within [MAX_PAGES] pages (to end of Conclusion)?
    5. Is related work sufficiently comprehensive (≥1 page)?
    6. For theory papers: are proof sketches adequate?
    7. Are figures/tables clearly described and properly referenced?
    8. Would a skim reader understand the contribution from the title, abstract, introduction, and Figure 1?

    For each issue, specify: severity (CRITICAL/MAJOR/MINOR), location, and fix.

    [paste full draft text]
```

Apply CRITICAL and MAJOR fixes. Document MINOR issues for the user.

### Step 7: Reverse Outline Test (from Research-Paper-Writing-Skills)

After drafting all sections:

1. **Extract topic sentences** — pull the first sentence of every paragraph
2. **Read them in sequence** — they should form a coherent narrative on their own
3. **Check claim coverage** — every claim from the Claims-Evidence Matrix must appear
4. **Check evidence mapping** — every experiment/figure must support a stated claim
5. **Fix gaps** — if a topic sentence doesn't advance the story, rewrite the paragraph

### Step 8: Final Checks

Before declaring done:

- [ ] All `\ref{}` and `\label{}` match (no undefined references)
- [ ] All citation commands (`\citep{}`/`\citet{}` for ML conferences, `\cite{}` for IEEE) have corresponding BibTeX entries
- [ ] No author information in anonymous mode
- [ ] Figure/table numbering is correct
- [ ] Page count within MAX_PAGES (main body to Conclusion end)
- [ ] No TODO/FIXME/XXX markers left in the text
- [ ] No `[VERIFY]` markers left unchecked
- [ ] Abstract is self-contained (understandable without reading the paper)
- [ ] Title is specific and informative (not generic)
- [ ] Related work is ≥1 full page
- [ ] references.bib contains ONLY cited entries (no bloat)
- [ ] **No stale section files** — every .tex in `sections/` is `\input`ed by `main.tex`
- [ ] **Section files match main.tex** — file numbering and `\input` paths are consistent
- [ ] Venue-specific required sections/checklists satisfied (read `../shared-references/venue-checklists.md` if needed)
- [ ] A skim reader can recover the main claim from the title, abstract, introduction, and Figure 1/captions

## Key Rules

- **Large file handling**: If the Write tool fails due to file size, immediately retry using Bash (`cat << 'EOF' > file`) to write in chunks.
  Do NOT ask the user for permission — just do it silently.
- **Do NOT generate author names, emails, or affiliations** — use anonymous block or placeholder
- **Write complete sections, not outlines** — the output should be compilable LaTeX
- **One file per section** — modular structure for easy editing
- **Every claim must cite evidence** — cross-reference the Claims-Evidence Matrix
- **Compile-ready** — the output should compile with `latexmk` without errors (modulo missing figures)
- **No over-claiming** — use hedging language ("suggests", "indicates") for weak evidence
- **Venue style matters** — ML conferences (ICLR/NeurIPS/ICML) use `natbib` (`\citep`/`\citet`); **IEEE venues use `cite` package (`\cite{}`, numeric)**. Never mix.
- **Page limit rules differ by venue** — ML conferences: main body to Conclusion, references/appendix NOT counted. **IEEE: references ARE counted toward the page limit.**
- **Clean bib** — references.bib must only contain entries that are actually `\cite`d
- **Section count is flexible** — match PAPER_PLAN structure, don't force into 5 sections
- **Backup before overwrite** — never destroy existing `paper/` directory without backing up
- **Front-load the contribution** — do not hide the payoff until the experiments or appendix

## Writing Quality Reference

- `../shared-references/writing-principles.md` — story framing, abstract/introduction patterns, sentence-level clarity, reviewer reading order
- `../shared-references/venue-checklists.md` — ICLR/NeurIPS/ICML/IEEE submission requirements to check before declaring done
- `../shared-references/citation-discipline.md` — stricter fallback for ambiguous citations

Keep using the reverse-outline test and anti-inflation polish from the main workflow above; the shared references are there to improve quality without adding a new phase.
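
The automated bib cleaning pattern from Step 4 (and the "Clean bib" rule above) could be realized roughly as follows. This is a sketch under stated assumptions, not the skill's actual implementation: the function names `cited_keys` and `filter_bib` are hypothetical, every BibTeX entry is assumed to start with `@type{key,` at the beginning of a line, and `@string` macros or `@comment` blocks are assumed to be absent (they would be carried along with whichever entry precedes them).

```python
import re

# Matches \cite, \citep, \citet with optional * and [..] arguments; captures the key list.
CITE_RE = re.compile(r"\\cite[pt]?\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the start of an entry such as "@article{key," and captures the key.
ENTRY_START_RE = re.compile(r"^@\w+\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)
COMMENT_RE = re.compile(r"(?<!\\)%.*")  # unescaped % up to end of line


def cited_keys(tex_sources):
    """Collect the unique citation keys from an iterable of LaTeX strings."""
    keys = set()
    for tex in tex_sources:
        body = COMMENT_RE.sub("", tex)  # ignore commented-out citations
        for group in CITE_RE.findall(body):
            # Handle multi-cite such as \citep{a,b,c}.
            keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def filter_bib(bib_text, keep):
    """Return only the BibTeX entries whose key is in `keep`."""
    starts = list(ENTRY_START_RE.finditer(bib_text))
    kept = []
    for i, match in enumerate(starts):
        # An entry runs from its own start to the start of the next entry.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib_text)
        if match.group(1) in keep:
            kept.append(bib_text[match.start():end].strip())
    return "\n\n".join(kept) + ("\n" if kept else "")
```

In practice the LaTeX strings would come from reading every `.tex` file under `paper/`, and the result of `filter_bib` would be written to `references.bib`. A regex-based splitter is chosen here to stay dependency-free; a full BibTeX parser would be more robust for hand-edited `.bib` files.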

## Acknowledgements

Writing methodology adapted from [Research-Paper-Writing-Skills](https://github.com/Master-cai/Research-Paper-Writing-Skills) (CCF award-winning methodology). Citation verification from [claude-scholar](https://github.com/Galaxy-Dawn/claude-scholar) and [Imbad0202/academic-research-skills](https://github.com/Imbad0202/academic-research-skills).

This hybrid pack's writing-guidance overlay is adapted from Orchestra Research's paper-writing materials.