---
name: refining-ml-papers
description: Refines ML/scientific LaTeX papers based on reviewer or advisor feedback. Handles structural reorganization (moving problem statements, merging sections), concrete instantiations of abstract tables, cross-file deduplication, and compilation verification. Use when the user requests paper revisions, addresses reviewer comments, restructures sections, or improves exposition clarity.
allowed-tools: Read, Grep, Glob, Edit, Write, Bash, Task, TaskCreate, TaskUpdate, TaskList, EnterPlanMode
---

# Refining ML Papers: From Feedback to Camera-Ready

This skill captures battle-tested patterns for revising scientific LaTeX papers in response to reviewer/advisor feedback. Built from extensive revision work on multi-file ICML-style papers with Overleaf git workflows.

## When to Use This Skill

Invoke when the user:
- Has reviewer or advisor feedback to address
- Wants to restructure paper sections (move content, merge sections)
- Needs to explain tables or figures with concrete examples
- Asks to improve exposition clarity or reduce redundancy
- Wants to fix LaTeX compilation issues after restructuring
- Needs to prepare a camera-ready or arXiv version

## Core Methodology

### Phase 1: Understand the Paper Architecture

Before making ANY changes:

1. **Map the file structure**: `Glob` for `**/*.tex` to find all LaTeX files
2. **Identify the main file** and all `\input{}` dependencies
3. **Read the target sections** completely before editing
4. **Check for shared macros**: Look for `\providecommand` / `\newcommand` patterns that indicate cross-file dependencies
5. **Note existing labels**: `Grep` for `\label{` and `\ref{` to understand cross-reference graph

```
# Typical modular structure:
main_arxiv.tex          # Preamble + abstract + introduction + \input{} calls
methods.tex             # Section 3
experiments.tex         # Section 4
appendix.tex            # Appendix
appendix_casimir.tex    # Specialized appendix
references.bib          # Bibliography
```

### Phase 2: Plan Changes with Feedback Mapping

Map each piece of feedback to a **concrete file + line range + action**:

| Feedback | Action | File(s) | Risk |
|----------|--------|---------|------|
| "Problem statement too late" | Move to Sec 1 + slim Sec 3 | main.tex + methods.tex | Cross-ref breakage |
| "Table entries unclear" | Add concrete instantiations paragraph | main.tex | None |
| "Sections redundant" | Merge + back-reference | methods.tex | Label conflicts |
| "Abstract doesn't state problem" | Restructure abstract | main.tex | None |

**Always use `EnterPlanMode`** for multi-file restructuring. Present the plan before editing.

### Phase 3: Implementation Patterns

#### Pattern 1: Moving Content Earlier (e.g., Problem Statement to Introduction)

This is the most common revision request. The three-step protocol:

1. **Add** the content in its new location (e.g., new `\paragraph{Problem setting.}` in Introduction)
2. **Slim** the original location to a back-reference: "As described in Section~\ref{sec:intro}, ..."
3. **Preserve** all `\label{}` tags or update `\ref{}` calls

**Key principle**: The moved content should be a **self-contained summary**, not a copy-paste. The original section keeps the technical details.

Example structure for a problem-setting paragraph in the Introduction:

```latex
\paragraph{Problem setting.}
We consider the [setting name]: we observe only [what's observed] at
discrete times; [what's latent] are latent and never measured.
The goal is to [goal (a)] and [goal (b)].
[Method] addresses this via [solution sketch: 2-3 stages].
Given [input], [method] produces [output];
Section~\ref{sec:methods} details the components.
```

#### Pattern 2: Explaining Abstract Tables with Concrete Instantiations

When a table maps systems to mathematical components, add a `\paragraph{Concrete instantiations.}` that walks through 2-3 representative rows:

```latex
\paragraph{Concrete instantiations.}
To make the mapping from physical system to $(V, M, D)$ template explicit,
we walk through three representative entries from Table~\ref{tab:...}:
\begin{itemize}
\item \textbf{System A} (REGIME). State $q = ...$, momentum $p = ...$.
  [Component 1] encodes [physical meaning]; [Component 2] is
  [given/learned because...]; [Component 3] is [the unknown that method learns].
\item \textbf{System B} (DIFFERENT REGIME). [Same structure, different system,
  showing contrast]
\item \textbf{System C} (CROSS-DOMAIN). [Non-obvious domain to show generality]
\end{itemize}
```

**Selection heuristic**: Pick one fully-specified (KNOWN), one partially-specified (PARTIAL), and one from a surprising domain (non-mechanical, ecological, etc.).

#### Pattern 3: Abstract Restructuring (Problem-First)

Reviewers often complain the abstract leads with the framework instead of the problem. The canonical structure:

1. **Context** (1 sentence): Why this domain matters
2. **Problem** (1-2 sentences): What we're solving, what's observed vs. latent
3. **Framework** (1 sentence): The structural insight that enables the solution
4. **Method** (1-2 sentences): What the method does concretely
5. **Results** (1 sentence): Key empirical finding
6. **Insight** (1 sentence): What we learned (e.g., identifiability vs. forecasting)

**Anti-pattern**: Starting with "$\dot{x} = (J-R)\nabla H(x)$" before the reader knows what problem is being solved.

#### Pattern 4: Section Merging / Deduplication

When two sections overlap (e.g., Sec 3.1 and Sec 3.2 both introduce PH dynamics):

1. **Identify the canonical location** (usually the first occurrence)
2. **Move unique content** from the later section into the earlier one
3. **Replace the later section** with a back-reference paragraph
4. **Update the section roadmap** (the "This section describes..." paragraph)
5. **Check all `\ref{}`** calls to the removed/merged subsection labels

#### Pattern 5: Cross-File Consistency After Restructuring

After moving content between files, verify:
- Labels defined in one file aren't duplicated in another
- `\ref{}` calls still resolve (compile and check log)
- Notation is consistent (same macro names for the same symbols)
- Commented-out content in the original location is cleaned up or marked

#### Pattern 6: Cross-Reference Connectivity Audit

For a thorough review (not just compilation), audit the full cross-reference graph:

1. **Grep all `\label{}`** and **`\ref{}`** across every `.tex` file
2. **Check label naming matches location**: A label `sec:methods:X` should be defined in the methods section, not the introduction. Rename mislocated labels.
3. **Find orphan labels**: Defined but never referenced; typically fine for internal appendix structure, but flag any in the main paper
4. **Find dangling refs**: Referenced but not defined; even in commented-out code, these signal stale content
5. **Verify back-references**: If methods.tex opens with "As described in Sec 1...", confirm the label resolves to the right paragraph

**Real example**: `sec:methods:regimes` was defined in the Introduction (promoted for arXiv) but the name implied Methods. Every `\ref` resolved to Sec 1 when readers expected Sec 3. Renamed to `sec:intro:regimes`.

#### Pattern 7: Table Claim Verification

When a table summarizes properties (identifiability, convergence, complexity):

1. **Check each cell against its detailed description** in the appendix or body text
2. **Flag qualified claims presented as unqualified** (e.g., "Yes" when the appendix says "up to overall scale")
3. **Check notation consistency**: Symbols in table columns shouldn't clash with established notation (e.g., $D_0$ for mass base diagonal when $D$ is reserved for damping → rename to $m_0$)
4. **Add footnotes for caveats**: Use `$^\dagger$` with caption text rather than silent simplification

#### Pattern 8: Appendix Cleanup & Prose Quality

Appendix sections often accumulate dead code and robotic prose:

1. **Delete empty placeholders**: Subsections that say only "Left to future work" should be removed entirely
2. **Clean commented-out blocks**: >50 commented lines with no active content nearby → delete
3. **Add regime/group headers**: When listing many items back-to-back (e.g., 9 systems), group them with `\paragraph{}` headers and 1-2 connecting sentences
4. **Fix float drift**: Appendix figures with `[h]` or `[t]` often drift into the bibliography → use `[H]` (requires `\usepackage{float}`)

### Phase 4: Compilation & Verification

**Always compile after changes.** The full verification workflow:

```bash
# Full build (from the paper directory)
pdflatex -interaction=nonstopmode main.tex
bibtex main
pdflatex -interaction=nonstopmode main.tex
pdflatex -interaction=nonstopmode main.tex
```

**Post-compilation checks:**

```bash
# Zero undefined references
grep -c "undefined" main.log

# Zero multiply-defined labels
grep "multiply" main.log

# Check for other warnings (filter noise)
grep -i "warning" main.log | grep -v "Font\|pdf\|Unused\|size\|rerun\|float\|empty\|draft"
```

**Visual verification**: Read the PDF pages where changes were made to confirm rendering.

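
The post-compilation log checks can also be scripted when a structured report is more useful than raw `grep` output. The following is a minimal Python sketch, assuming the usual LaTeX and natbib warning wording (`Reference ... undefined`, `Citation ... undefined`, `Label ... multiply defined`). The names `summarize_log` and `SAMPLE_LOG` are hypothetical and not part of this skill's tooling, and the keys in the sample log are invented for illustration.

```python
import re

# Assumed warning formats (standard LaTeX / natbib wording). Long warnings can be
# wrapped at the log's line-width limit, in which case these patterns may miss them.
UNDEFINED_RE = re.compile(r"Warning: (Reference|Citation) `([^']+)' on page \d+ undefined")
MULTIPLY_RE = re.compile(r"LaTeX Warning: Label `([^']+)' multiply defined")


def summarize_log(log_text):
    """Collect undefined references/citations and multiply-defined labels from a LaTeX log."""
    undefined = sorted({(kind.lower(), key) for kind, key in UNDEFINED_RE.findall(log_text)})
    multiply_defined = sorted(set(MULTIPLY_RE.findall(log_text)))
    return {"undefined": undefined, "multiply_defined": multiply_defined}


# Hypothetical log excerpt; the label and citation keys are made up for this demo.
SAMPLE_LOG = """\
LaTeX Warning: Reference `sec:methods:regimes' on page 3 undefined on input line 42.
Package natbib Warning: Citation `smith2020' on page 1 undefined on input line 10.
LaTeX Warning: Label `sec:ph' multiply defined.
LaTeX Warning: There were undefined references.
"""

report = summarize_log(SAMPLE_LOG)
for kind, key in report["undefined"]:
    print(f"undefined {kind}: {key}")
for label in report["multiply_defined"]:
    print(f"multiply-defined label: {label}")
```

To run it on a real build, read `main.log` with `open("main.log", encoding="utf-8", errors="replace")` and pass the text to `summarize_log`. Both lists should be empty before the revision is considered clean.
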
### Phase 5: Git Workflow (Overleaf)

For Overleaf-backed papers:
- Commit with descriptive messages referencing the feedback addressed
- Push directly to master (Overleaf syncs from master)
- Only commit `.tex` and `.bib` files; exclude build artifacts (`.aux`, `.log`, `.pdf`, `.bbl`, `.blg`, `.out`)

## Common Revision Patterns by Feedback Type

### "The paper doesn't state the problem clearly"
→ Pattern 1 (move to intro) + Pattern 3 (rewrite abstract)

### "I don't understand what Table X means"
→ Pattern 2 (concrete instantiations)

### "Sections 3.1 and 3.2 seem redundant"
→ Pattern 4 (merge + back-reference)

### "The abstract is too technical / leads with equations"
→ Pattern 3 (problem-first abstract)

### "The notation is introduced too late"
→ Move notation paragraph earlier; keep formal definition in Methods

## LaTeX Pitfalls in Paper Revision

See `PITFALLS.md` for the complete reference. Critical ones:

| Issue | Cause | Fix |
|-------|-------|-----|
| xcolor option clash | Style file loads xcolor; you also load it | `\PassOptionsToPackage{table}{xcolor}` BEFORE `\usepackage{icml2026}` |
| Undefined `\ref` after merge | Label moved to different file | Check `\label{}` exists in the new location |
| `table*` in single-column | Switched from two-column to one-column | Replace `table*` → `table`, `figure*` → `figure` |
| Multiply-defined labels | Merged sections both had `\label{sec:ph}` | Rename one; update all `\ref{}` calls |
| `\providecommand` doesn't override | Macro already defined in preamble | Use `\renewcommand` or define only in preamble |

## Output Format

After implementing all revisions:

```markdown
## Revision Summary

| Part | File | What Changed |
|------|------|-------------|
| A | main.tex | Abstract rewritten: problem before framework |
| B | main.tex | New "Problem setting" paragraph in Sec 1 |
| C | methods.tex | Sec 3.1 slimmed to back-reference |
| D | main.tex | Concrete instantiations for Table 1 |

Build: clean (N pages, 0 undefined refs, 0 warnings)
```

## Progressive Disclosure

For detailed examples and pitfalls:
- `PATTERNS.md`: Complete before/after examples from real revisions
- `PITFALLS.md`: Comprehensive LaTeX pitfalls with fixes

## Validation Checklist

Before marking revision complete:

- [ ] All feedback items addressed with specific file + line changes
- [ ] Full compilation passes (pdflatex + bibtex + pdflatex × 2)
- [ ] Zero undefined references in log
- [ ] Zero multiply-defined labels
- [ ] PDF visually verified on affected pages
- [ ] No redundant content between sections (no reader déjà vu)
- [ ] Abstract states the problem before the framework
- [ ] All tables with abstract entries have concrete explanations nearby
- [ ] Label names match the section they're defined in (no `sec:methods:X` in intro)
- [ ] Table claims match detailed text (footnote any qualified claims)
- [ ] No notation collisions across columns/sections (e.g., $D_0$ for mass vs. $D$ for damping)
- [ ] Appendix: no empty placeholder sections, no large commented-out blocks
- [ ] Appendix: figures use `[H]` placement (not `[h]`/`[t]` which drift)
- [ ] Git committed with descriptive message mapping to feedback

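
The label-related checklist items, together with the cross-reference audit in Pattern 6, can be partly automated. Below is a minimal Python sketch under stated assumptions: it recognizes only `\label` and the common `\ref` variants (`\ref`, `\eqref`, `\autoref`, `\pageref`, `\cref`, `\Cref`), and it treats any unescaped `%` as starting a comment. The function names `load_sources` and `audit`, and the `SAMPLE` sources, are hypothetical illustrations rather than part of this skill.

```python
import re
from pathlib import Path

LABEL_RE = re.compile(r"\\label\{([^}]+)\}")
# Common reference commands; comma-separated keys (as in \cref{a,b}) are split below.
REF_RE = re.compile(r"\\(?:eq|auto|page|c|C)?ref\*?\{([^}]+)\}")
# Simplification: an unescaped % starts a comment that runs to the end of the line.
COMMENT_RE = re.compile(r"(?<!\\)%.*")


def load_sources(root):
    """Read every .tex file under root into a {path: text} dict."""
    return {str(p): p.read_text(encoding="utf-8", errors="replace")
            for p in Path(root).rglob("*.tex")}


def audit(sources, keep_commented_refs=False):
    """Report orphan labels, dangling refs, and duplicate labels across sources."""
    defined = {}        # label -> list of files that define it
    referenced = set()
    for name, text in sources.items():
        active = COMMENT_RE.sub("", text)  # commented-out labels are not defined
        for label in LABEL_RE.findall(active):
            defined.setdefault(label, []).append(name)
        ref_text = text if keep_commented_refs else active
        for group in REF_RE.findall(ref_text):
            referenced.update(key.strip() for key in group.split(","))
    return {
        "orphans": sorted(set(defined) - referenced),
        "dangling": sorted(referenced - set(defined)),
        "duplicates": sorted(l for l, files in defined.items() if len(files) > 1),
    }


# Hypothetical two-file paper used only to demonstrate the audit.
SAMPLE = {
    "main.tex": r"""
\section{Introduction}\label{sec:intro}
\paragraph{Regimes.}\label{sec:methods:regimes}
See Section~\ref{sec:methods} and Table~\ref{tab:systems}.
""",
    "methods.tex": r"""
\section{Methods}\label{sec:methods}
As described in Section~\ref{sec:intro}, ...
% Old text: see \ref{sec:old}
\subsection{PH dynamics}\label{sec:ph}
""",
}

print(audit(SAMPLE))
print(audit(SAMPLE, keep_commented_refs=True))
```

On a real paper, `audit(load_sources("."))` would be called from the paper directory. Setting `keep_commented_refs=True` also surfaces references inside commented-out text, which Pattern 6 treats as a sign of stale content. Checking that a label name matches its location still needs manual review.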