πŸ† Generated Paper Showcase

From a one-line idea to a conference-ready paper β€” fully autonomous, zero human intervention.

**23 Stages · 8 Papers · 54k LOC · ~27 h Runtime**

**1,547+ papers searched · 50 figures · 121 pages · 291 refs**

---

Below are **eight papers** generated **entirely by AutoResearchClaw** — each starting from nothing more than a topic sentence. The pipeline autonomously searched the literature, designed experiments, wrote and executed code, generated figures, and produced NeurIPS-formatted LaTeX papers with verified references.

> 📌 **Two batches, eight domains** — Batch A covers mathematics, statistics, biology, and numerical computing; Batch B covers NLP, reinforcement learning, computer vision, and knowledge distillation — demonstrating the pipeline's cross-domain generality.

---

## 🔄 How It Works
**πŸ’‘**
**Idea**
➜ **πŸ“š**
**Literature**
300–470 papers
➜ **πŸ§ͺ**
**Hypothesis**
experiment design
➜ **πŸ’»**
**Code**
2K–15K lines
➜ **πŸ”¬**
**Execute**
sandbox + refine
➜ **πŸ“**
**Write**
review & audit
➜ **πŸ“„**
**Paper**
NeurIPS PDF

Each run traverses 23 autonomous stages with iterative self-healing, multi-agent peer review, and citation verification β€” no human in the loop.
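The self-healing retry pattern described above can be sketched in a few lines of Python. Everything here is illustrative: the class and method names (`FlakyStage`, `run`, `refine`) are invented for this sketch and are not the actual AutoResearchClaw API.

```python
from dataclasses import dataclass, field

@dataclass
class Result:
    ok: bool
    output: object = None
    errors: list = field(default_factory=list)

class FlakyStage:
    """Toy stage that fails once, then succeeds after refine() is called."""
    def __init__(self, name):
        self.name, self.healed = name, False

    def run(self, artifacts):
        if self.healed:
            return Result(True, f"{self.name}-done")
        return Result(False, errors=["transient failure"])

    def refine(self, errors):
        self.healed = True   # in the real pipeline: feed errors back to the agent

def run_pipeline(stages, max_retries=2):
    """Run stages in order; on failure, refine and retry up to max_retries times."""
    artifacts = {}
    for stage in stages:
        for _ in range(1 + max_retries):
            result = stage.run(artifacts)
            if result.ok:
                artifacts[stage.name] = result.output
                break
            stage.refine(result.errors)      # self-healing: errors drive the retry
        else:
            raise RuntimeError(f"stage {stage.name} exhausted retries")
    return artifacts

out = run_pipeline([FlakyStage("literature"), FlakyStage("code")])
```

The for/else makes a stage that never recovers abort the whole run, which is the behavior a fully autonomous pipeline needs.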

---

πŸ“˜ Batch A  Β·  Mathematics, Statistics & Sciences

Generated on Machine A  Β·  4 papers across 4 non-ML domains

--- ### πŸ“„ Paper I  Β·  Random Matrix Theory   math > **Finite-Dimensional Corrections to the Marchenko–Pastur Distribution in Random Wishart Matrices**
Paper I First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Systematically quantify pre-asymptotic, finite-*N* deviations of empirical eigenvalue densities from the Marchenko–Pastur law across *N* = 50 to 5,000, decomposing error into bulk vs. edge components and testing lightweight correction models. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 473 papers collected β†’ 26 cited | | πŸ’» **Code** | 10,290 lines of Python | | ⏱️ **Runtime** | ~2 h 25 min | | πŸ“Š **Figures** | 5 auto-generated charts | | πŸ“‘ **Pages** | 16 pages (NeurIPS format) | #### 🎯 Key Result Produced a finite-*N* correction atlas showing convergence rates of spectral densities, with edge deviations persisting significantly longer than bulk errors β€” providing practical guidance for when the MP law is "close enough." Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” MPCX Architecture

MPCX Framework Diagram

Finite-dimensional correction pipeline: Wishart matrix generation β†’ empirical spectral density estimation β†’ MP baseline comparison β†’ bulk/edge error decomposition β†’ correction model fitting. Entirely auto-generated by the FigureAgent subsystem.
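To give a flavor of the experiment this paper automates, here is a minimal, self-contained sketch (not the paper's 10k-line codebase) comparing the empirical spectrum of a Wishart matrix against the asymptotic Marchenko–Pastur density:

```python
import numpy as np

def mp_density(x, c):
    """Marchenko–Pastur density for W = X X^T / M, aspect ratio c = N/M <= 1."""
    lo, hi = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    out = np.zeros_like(x)
    inside = (x > lo) & (x < hi)
    out[inside] = np.sqrt((hi - x[inside]) * (x[inside] - lo)) / (2 * np.pi * c * x[inside])
    return out

rng = np.random.default_rng(0)
N, M = 400, 800                                  # aspect ratio c = 0.5
X = rng.standard_normal((N, M))
eigs = np.linalg.eigvalsh(X @ X.T / M)           # empirical Wishart spectrum

# Compare the empirical histogram to the asymptotic MP prediction
hist, edges = np.histogram(eigs, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2
max_dev = np.abs(hist - mp_density(centers, N / M)).max()   # worst pointwise deviation
```

Repeating this across *N* and splitting `max_dev` into bulk and edge bins is the essence of the paper's correction atlas.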

--- ### πŸ“„ Paper II  Β·  Econometrics   stats > **Monte Carlo Evaluation of Instrumental Variable Estimators Under Weak Instruments**
Paper II First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Reframe the classical 2SLS / LIML / Fuller-*k* / JIVE comparison around decision-relevant *risk surfaces*, mapping finite-sample phase diagrams that show where each estimator is preferred under realistic weak-IV conditions. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 366 papers collected β†’ 41 cited | | πŸ’» **Code** | 10,062 lines of Python | | ⏱️ **Runtime** | ~2 h 56 min | | πŸ“Š **Figures** | 6 auto-generated charts | | πŸ“‘ **Pages** | 14 pages (NeurIPS format) | #### 🎯 Key Result Generated estimator-switching phase diagrams revealing that Fuller-*k* dominates in specific small-*n*, many-instrument regions, while JIVE's bias reduction is systematically offset by variance inflation β€” providing actionable guidance for empirical researchers. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” IVX Architecture

IVX Framework Diagram

Monte Carlo IV evaluation pipeline: DGP specification β†’ estimator suite (2SLS, LIML, Fuller-k, JIVE) β†’ finite-sample risk surfaces β†’ phase diagram construction. Entirely auto-generated by the FigureAgent subsystem.
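For readers unfamiliar with the setup, a toy Monte Carlo for the simplest estimator in the suite (2SLS with a single instrument) looks like this. It is an illustrative sketch, not the paper's code:

```python
import numpy as np

def tsls(y, x, z):
    """Two-stage least squares with one endogenous regressor and one instrument."""
    Z = np.column_stack([np.ones_like(z), z])
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]     # first stage: fitted x
    X_hat = np.column_stack([np.ones_like(x_hat), x_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][1]   # second stage: slope on x_hat

rng = np.random.default_rng(1)
n, beta_true, pi = 500, 1.0, 0.8          # pi sets instrument strength (first-stage slope)
estimates = []
for _ in range(200):
    z = rng.standard_normal(n)
    u = rng.standard_normal(n)            # unobserved confounder: enters both x and y
    x = pi * z + u + 0.5 * rng.standard_normal(n)
    y = beta_true * x + u + 0.5 * rng.standard_normal(n)
    estimates.append(tsls(y, x, z))
bias = np.mean(estimates) - beta_true     # small here: the instrument purges confounding
```

The paper's phase diagrams come from sweeping `pi` toward zero (weak instruments) and the instrument count upward, where the estimators' risk profiles diverge sharply.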

--- ### πŸ“„ Paper III  Β·  Epidemiological Modeling   bio > **Structural Identifiability and Parameter Estimation in Compartmental Epidemic Models (SIR / SEIR)**
Paper III First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Map the boundary between structural and practical identifiability in SIR vs. SEIR models across realistic observation regimes, and quantify when Fisher Information Matrix gives false confidence relative to profile likelihood. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 388 papers collected β†’ 29 cited | | πŸ’» **Code** | 9,374 lines of Python | | ⏱️ **Runtime** | ~2 h 23 min | | πŸ“Š **Figures** | 6 auto-generated charts | | πŸ“‘ **Pages** | 18 pages (NeurIPS format) | #### 🎯 Key Result Demonstrated that parameterization and observer design choices affect identifiability diagnostics more strongly than the choice between SIR and SEIR structure β€” with FIM producing overconfident estimates in specific observation-limited regimes where profile likelihood correctly flags non-identifiability. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” PRIM Architecture

PRIM Framework Diagram

PRIM benchmark workflow: synthetic outbreak generation (SIR/SEIR) β†’ parameter estimation β†’ profile likelihood vs. FIM diagnostics β†’ identifiability regime mapping. Entirely auto-generated by the FigureAgent subsystem.
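The underlying simulation is the textbook SIR system. A minimal forward-Euler sketch (illustrative only; the paper's estimation and diagnostics layers are far richer) looks like:

```python
import numpy as np

def simulate_sir(beta, gamma, days=160, dt=0.05, i0=1e-3):
    """Forward-Euler integration of the classic SIR ODEs (population fractions)."""
    S, I, R = 1.0 - i0, i0, 0.0
    traj = []
    for _ in range(int(days / dt)):
        new_inf = beta * S * I * dt      # S -> I transitions this step
        new_rec = gamma * I * dt         # I -> R transitions this step
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        traj.append((S, I, R))
    return np.array(traj)

traj = simulate_sir(beta=0.5, gamma=0.2)     # basic reproduction number R0 = 2.5
peak_prevalence = traj[:, 1].max()
final_size = traj[-1, 2]                     # fraction ever infected
```

Identifiability questions arise when only a noisy, partial observation of `traj` (e.g., incidence counts) is available and one asks which of `beta`, `gamma`, and the initial conditions can be recovered.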

--- ### πŸ“„ Paper IV  Β·  Numerical Linear Algebra   computing > **Comparative Analysis of Preconditioning Strategies for Krylov Subspace Methods on Sparse Linear Systems**
Paper IV First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Go beyond "which preconditioner wins" β€” build a feature-conditioned decision map for ILU / Jacobi / SSOR / AMG with CG / GMRES / BiCGSTAB, stratified by sparsity-graph structure and matrix pathology under realistic setup-vs-solve cost budgets. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 320 papers collected β†’ 33 cited | | πŸ’» **Code** | 14,557 lines of Python | | ⏱️ **Runtime** | ~2 h 30 min | | πŸ“Š **Figures** | 4 auto-generated charts | | πŸ“‘ **Pages** | 16 pages (NeurIPS format) | #### 🎯 Key Result Produced a setup-vs-solve tradeoff analysis showing that methods considered "best" under solve-time alone are often suboptimal under realistic memory and setup budgets β€” with AMG dominance limited to specific elliptic SPD matrix families. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” Krylov Preconditioner Architecture

Krylov Preconditioner Framework Diagram

Feature-conditioned preconditioner evaluation: sparse matrix collection β†’ structural descriptor extraction β†’ solver–preconditioner grid (CG/GMRES/BiCGSTAB Γ— ILU/Jacobi/SSOR/AMG) β†’ setup-vs-solve tradeoff analysis β†’ decision map. Entirely auto-generated by the FigureAgent subsystem.
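The core measurement — iteration counts with and without a preconditioner — can be reproduced in miniature with SciPy. This sketch uses a small 2-D Poisson matrix and ILU-preconditioned GMRES; it is illustrative, not the paper's benchmark harness:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Standard SPD test problem: 2-D Poisson matrix on a 16x16 grid
n = 16
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))).tocsc()
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-4)                # incomplete LU factorization (setup cost)
M_ilu = spla.LinearOperator(A.shape, ilu.solve)   # apply it as a preconditioner

counts = {}
for name, M in [("none", None), ("ilu", M_ilu)]:
    iters = []
    x, info = spla.gmres(A, b, M=M, callback=lambda rk: iters.append(rk))
    counts[name] = (len(iters), info)             # (iteration count, 0 = converged)
```

The paper's contribution is to weigh the iteration savings (`counts["ilu"]` vs. `counts["none"]`) against the factorization's setup time and memory, across many matrix families rather than one.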

---

πŸ“™ Batch B  Β·  Machine Learning & AI

Generated on Machine B  Β·  NVIDIA RTX 6000 Ada (48 GB)  Β·  4 papers across 4 ML sub-fields

--- ### πŸ“„ Paper V  Β·  Parameter-Efficient Fine-Tuning   NLP > **GARD: Gradient-Spectral Rank Allocation for LoRA Fine-Tuning**
Paper V First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Most LoRA configurations use a fixed, uniform rank across all layers. GARD proposes using the *spectrum of layer-wise gradients* β€” eigenvalues of gradient covariance β€” to dynamically allocate rank where it matters most, under a strict parameter budget. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 60 references cited (100% verified) | | πŸ’» **Code** | 2,894 lines of Python (5 files) | | ⏱️ **Runtime** | ~50 min | | πŸ“Š **Figures** | 7 auto-generated charts | | πŸ“‘ **Pages** | 17 pages (NeurIPS format) | #### 🎯 Key Contribution A principled alternative to uniform rank allocation: GARD links intrinsic gradient dimensionality to low-rank adapter capacity, periodically updating ranks during training using smoothed spectra. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” GARD Architecture

GARD Framework Diagram

Gradient spectral analysis β†’ layer-wise rank scoring β†’ dynamic rank allocation under budget constraint. Entirely auto-generated by the FigureAgent subsystem.
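The central idea can be sketched as follows. This is a hypothetical reconstruction from the abstract: the function name `allocate_ranks` and the entropy-based effective-rank score are illustrative choices, not necessarily GARD's exact criterion:

```python
import numpy as np

def allocate_ranks(layer_grads, total_rank, min_rank=1):
    """Score each layer by the effective rank (spectral entropy) of its gradient
    covariance, then split a global rank budget proportionally to the scores."""
    scores = []
    for G in layer_grads:                                 # G: (steps, params) gradient samples
        eig = np.clip(np.linalg.eigvalsh(G.T @ G / len(G)), 0.0, None)
        p = eig / eig.sum()
        entropy = -(p * np.log(p + 1e-12)).sum()
        scores.append(np.exp(entropy))                    # effective rank of the spectrum
    scores = np.array(scores)
    return np.maximum(min_rank, np.floor(scores / scores.sum() * total_rank).astype(int))

rng = np.random.default_rng(0)
low_dim = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 16))   # ~rank-2 gradients
high_dim = rng.standard_normal((64, 16))                                # full-rank gradients
ranks = allocate_ranks([low_dim, high_dim], total_rank=16)              # favors high_dim
```

Layers whose gradients live in a low-dimensional subspace receive a small adapter rank, freeing budget for layers with genuinely high-dimensional gradient structure.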

--- ### πŸ“„ Paper VI  Β·  Reinforcement Learning   RL > **LACE: Learned Abstractions for Count-Based Exploration in Sparse-Reward RL**
Paper VI First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Count-based exploration in RL relies on state visitation counts, but raw state spaces are too large for effective counting. LACE designs *online-learned, task-aware state abstractions* optimized specifically for count-based exploration in sparse-reward environments. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 25 references cited (100% verified) | | πŸ’» **Code** | 2,067 lines of Python (4 files) | | 🐳 **Experiment** | 32 min GPU sandbox execution | | ⏱️ **Runtime** | ~6.8 hrs total | | πŸ“Š **Figures** | 6 auto-generated charts | | πŸ“‘ **Pages** | 11 pages (NeurIPS format) | #### 🎯 Key Result DQN baseline achieves **356.7 mean reward** in sparse-reward gridworld tasks. The paper analyzes the trade-off between abstraction compactness for counting and information sufficiency for downstream control. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” LACE Architecture

LACE Framework Diagram

Learned state abstraction module integrated with count-based exploration in the DQN agent loop. Entirely auto-generated by the FigureAgent subsystem.
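The classic count-based bonus that LACE builds on is easy to sketch. Here the abstraction is a fixed grid discretization purely for illustration; LACE's contribution is learning the abstraction online instead:

```python
import numpy as np
from collections import defaultdict

class CountBonus:
    """Count-based exploration bonus over an abstracted state space.
    The grid discretization below is a stand-in for a learned abstraction."""
    def __init__(self, scale=0.1, bins=8):
        self.counts = defaultdict(int)
        self.scale, self.bins = scale, bins

    def __call__(self, state):
        # Abstraction: map the continuous state to a discrete cell, then count it
        key = tuple(np.floor(np.asarray(state) * self.bins).astype(int))
        self.counts[key] += 1
        return self.scale / np.sqrt(self.counts[key])   # bonus decays with visits

bonus = CountBonus()
first = bonus([0.51, 0.52])
second = bonus([0.53, 0.51])     # lands in the same abstract cell -> smaller bonus
novel = bonus([0.11, 0.93])      # new cell -> full bonus again
```

The tension the paper analyzes is visible even here: a coarser `bins` makes counts fill up faster (better for exploration bonuses) but merges states the policy may need to distinguish.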

--- ### πŸ“„ Paper VII  Β·  Efficient Vision Transformers   CV > **FAME: Frequency-Aware Progressive Token Merging for Efficient ViT Inference**
Paper VII First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Existing ViT token pruning methods reduce tokens based on attention or saliency without considering *frequency content*. FAME uses DCT/FFT-based spectral filters to distinguish high-frequency detail tokens from low-frequency background tokens, merging progressively across layers. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 40 references cited (100% verified) | | πŸ’» **Code** | 2,873 lines of Python (5 files) | | 🐳 **Experiment** | 32 min GPU sandbox execution | | ⏱️ **Runtime** | ~3.3 hrs total | | πŸ“Š **Figures** | 7 auto-generated charts | | πŸ“‘ **Pages** | 10 pages (NeurIPS format) | #### 🎯 Key Result ViT-B/16 baseline: **56.54% accuracy** (3 seeds). Detailed analysis of the accuracy-efficiency tradeoff and per-layer metric breakdowns for frequency-aware vs. similarity-based merging. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” FAME Architecture

FAME Framework Diagram

Frequency-aware token merging applied progressively across ViT layers with DCT-based spectral filtering. Entirely auto-generated by the FigureAgent subsystem.
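The frequency-scoring step can be illustrated with a DCT energy ratio. This is a hypothetical sketch of the general idea, not FAME's exact filter design:

```python
import numpy as np
from scipy.fft import dct

def high_freq_energy(patches):
    """Score each image patch by the fraction of its 2-D DCT energy outside the
    lowest-frequency coefficients (higher score = more fine detail)."""
    scores = []
    for p in patches:                                    # p: (H, W) patch
        c = dct(dct(p, axis=0, norm='ortho'), axis=1, norm='ortho')
        total = np.square(c).sum()
        low = np.square(c[:2, :2]).sum()                 # DC + lowest AC terms
        scores.append(1.0 - low / total)
    return np.array(scores)

rng = np.random.default_rng(0)
flat = np.full((16, 16), 0.5) + 0.01 * rng.standard_normal((16, 16))  # background-like
textured = rng.standard_normal((16, 16))                              # detail-rich
scores = high_freq_energy([flat, textured])
```

Tokens scoring low on such a measure are background candidates that can be merged aggressively, while high-scoring detail tokens are preserved deeper into the network.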

--- ### πŸ“„ Paper VIII  Β·  Knowledge Distillation   KD > **CRAFT: Contrastive Feature Alignment for Robust Distillation Under Distribution Shift**
Paper VIII First Page

πŸ‘† Click to read the full paper

#### πŸ’‘ Idea Standard knowledge distillation transfers teacher knowledge assuming train/test distributions match. CRAFT introduces *reliability-aware contrastive feature alignment* that aligns teacher-student features across clean and corrupted views, while suppressing fragile teacher directions via a de-alignment loss. #### βš™οΈ Pipeline Journey | | | |:---|:---| | πŸ”— **Stages** | 23 stages + 2 refinement iterations | | πŸ“š **Literature** | 37 references cited (97% verified) | | πŸ’» **Code** | 2,231 lines of Python (4 files) | | 🐳 **Experiment** | 33 min GPU sandbox execution | | ⏱️ **Runtime** | ~5.8 hrs total | | πŸ“Š **Figures** | 9 auto-generated charts | | πŸ“‘ **Pages** | 19 pages (NeurIPS format) | #### 🎯 Key Result | Method | Clean Acc | Robust Acc | |:---|:---:|:---:| | ERM (baseline) | 81.22% | 62.96% | | LogitKD | 82.33% | 64.68% | | **AttentionKD** | **82.08%** | **65.95%** | | CRD | 68.03% | 50.57% | Attention-based feature KD improves robustness by **+3 pts** over ERM, while naive CRD degrades it by **-12 pts** β€” motivating CRAFT's reliability-aware design. Read PDF
πŸ–ΌοΈ Auto-Generated Framework Diagram β€” CRAFT Architecture

CRAFT Framework Diagram

Reliability-aware contrastive feature alignment between teacher and student across clean and corrupted views, with de-alignment on fragile teacher directions. Entirely auto-generated by the FigureAgent subsystem.
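A reliability-weighted alignment term of the kind described can be sketched in NumPy. The function below is a hypothetical simplification (cosine alignment with per-sample reliability weights), not CRAFT's full objective with the de-alignment loss:

```python
import numpy as np

def alignment_loss(f_student, f_teacher, reliability):
    """Cosine distance between student and teacher features, down-weighted on
    samples where the teacher is deemed unreliable (e.g., unstable under corruption)."""
    def unit(f):
        return f / np.linalg.norm(f, axis=1, keepdims=True)
    cos = (unit(f_student) * unit(f_teacher)).sum(axis=1)   # per-sample cosine similarity
    return np.mean(reliability * (1.0 - cos))

rng = np.random.default_rng(0)
teacher = rng.standard_normal((32, 64))
student = teacher + 0.1 * rng.standard_normal((32, 64))     # nearly aligned features
reliability = np.ones(32)                                   # trust every teacher sample
loss_close = alignment_loss(student, teacher, reliability)
loss_far = alignment_loss(rng.standard_normal((32, 64)), teacher, reliability)
```

Setting `reliability` below 1 on samples where the teacher's features shift under corruption is what turns plain feature alignment into the reliability-aware variant the paper argues for.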

--- ## πŸ“Š Aggregate Statistics
πŸ“‹ Metric I II III IV V VI VII VIII πŸ† Total
🏷️ Domain Math Stats Bio NumLA NLP RL CV KD 8 fields
πŸ’» Code (LOC) 10,290 10,062 9,374 14,557 2,894 2,067 2,873 2,231 54,348
⏱️ Pipeline Time 2h25m 2h56m 2h23m 2h30m 50m 6h48m 3h18m 5h48m ~27 hrs
πŸ”— References 26 41 29 33 60 25 40 37 291 cited
πŸ“Š Figures 5 6 6 4 7 6 7 9 50 figs
πŸ“‘ Pages 16 14 18 16 17 11 10 19 121 pages
---

πŸš€ Try It Yourself

Every paper above was generated by a single command:

```bash
researchclaw run --topic "Your research idea here" --auto-approve
```
