# Generated Paper Showcase
From a one-line idea to a conference-ready paper: fully autonomous, zero human intervention.
---
Below are **eight papers** generated **entirely by AutoResearchClaw**, each starting from nothing more than a topic sentence. The pipeline autonomously searched literature, designed experiments, wrote and executed code, generated figures, and produced NeurIPS-formatted LaTeX papers with verified references.
> **Two batches, eight domains.** Batch A covers mathematics, statistics, biology, and numerical computing; Batch B covers NLP, reinforcement learning, computer vision, and knowledge distillation, demonstrating the pipeline's cross-domain generality.
---
## How It Works
1. **Idea**
2. **Literature**: 300–470 papers
3. **Hypothesis**: experiment design
4. **Code**: 2K–15K lines
5. **Execute**: sandbox + refine
6. **Write**: review & audit
7. **Paper**: NeurIPS PDF

Each run traverses 23 autonomous stages with iterative self-healing, multi-agent peer review, and citation verification; no human in the loop.
---
## Batch A · Mathematics, Statistics & Sciences
Generated on Machine A · 4 papers across 4 non-ML domains
---
### Paper I · Random Matrix Theory
> **Finite-Dimensional Corrections to the Marchenko–Pastur Distribution in Random Wishart Matrices**

#### Idea
Systematically quantify pre-asymptotic, finite-*N* deviations of empirical eigenvalue densities from the Marchenko–Pastur law across *N* = 50 to 5,000, decomposing error into bulk vs. edge components and testing lightweight correction models.
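The core measurement behind this idea can be sketched in a few lines of NumPy: sample a Wishart spectrum, evaluate the Marchenko–Pastur density, and compute a bulk discrepancy. This is a minimal illustration under our own naming, not the paper's 10,290-line pipeline:

```python
import numpy as np

def mp_density(x, ratio, sigma2=1.0):
    """Marchenko-Pastur density for aspect ratio = p/n (0 < ratio <= 1)."""
    lam_minus = sigma2 * (1 - np.sqrt(ratio)) ** 2
    lam_plus = sigma2 * (1 + np.sqrt(ratio)) ** 2
    dens = np.zeros_like(x, dtype=float)
    inside = (x > lam_minus) & (x < lam_plus)
    xi = x[inside]
    dens[inside] = np.sqrt((lam_plus - xi) * (xi - lam_minus)) / (
        2 * np.pi * sigma2 * ratio * xi
    )
    return dens

def wishart_spectrum(n, p, rng):
    """Eigenvalues of the sample covariance of n standard-normal draws in R^p."""
    X = rng.standard_normal((n, p))
    return np.linalg.eigvalsh(X.T @ X / n)

rng = np.random.default_rng(0)
n, p = 1000, 200
eigs = wishart_spectrum(n, p, rng)
# Compare the empirical histogram to the MP density on a shared grid.
hist, edges = np.histogram(eigs, bins=40, range=(0.0, 3.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
bulk_l1 = float(np.mean(np.abs(hist - mp_density(centers, p / n))))
```

A finite-*N* study repeats this across *N* = 50 to 5,000 and separates bins near the spectral edges from the bulk before aggregating the error.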
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 473 papers collected → 26 cited |
| **Code** | 10,290 lines of Python |
| **Runtime** | ~2 h 25 min |
| **Figures** | 5 auto-generated charts |
| **Pages** | 16 pages (NeurIPS format) |
#### Key Result
Produced a finite-*N* correction atlas showing convergence rates of spectral densities, with edge deviations persisting significantly longer than bulk errors, providing practical guidance for when the MP law is "close enough."

**Auto-Generated Framework Diagram: MPCX Architecture.** Finite-dimensional correction pipeline: Wishart matrix generation → empirical spectral density estimation → MP baseline comparison → bulk/edge error decomposition → correction model fitting. Entirely auto-generated by the FigureAgent subsystem.
---
### Paper II · Econometrics
> **Monte Carlo Evaluation of Instrumental Variable Estimators Under Weak Instruments**

#### Idea
Reframe the classical 2SLS / LIML / Fuller-*k* / JIVE comparison around decision-relevant *risk surfaces*, mapping finite-sample phase diagrams that show where each estimator is preferred under realistic weak-IV conditions.
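One cell of such a risk surface can be sketched in NumPy: simulate a just-identified IV design with correlated errors and measure the bias of 2SLS across Monte Carlo draws. The function names and design parameters below are ours, not the paper's:

```python
import numpy as np

def simulate_iv(n, beta, pi, rho, rng):
    """One draw from a just-identified IV design: z instruments x, and the
    structural and first-stage errors are correlated with coefficient rho."""
    z = rng.standard_normal((n, 1))
    u = rng.standard_normal(n)
    v = rho * u + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    x = pi * z[:, 0] + v      # first stage; pi controls instrument strength
    y = beta * x + u          # structural equation
    return y, x, z

def tsls(y, x, z):
    """Two-stage least squares: project x onto z, then regress y on the fit."""
    x_hat = z @ np.linalg.lstsq(z, x, rcond=None)[0]
    return float(x_hat @ y / (x_hat @ x))

rng = np.random.default_rng(1)
draws = [tsls(*simulate_iv(500, beta=1.0, pi=0.5, rho=0.8, rng=rng))
         for _ in range(200)]
bias = float(np.mean(draws)) - 1.0
```

A phase diagram then sweeps `pi` toward zero (weak instruments), varies the instrument count, and repeats the same loop for LIML, Fuller-*k*, and JIVE.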
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 366 papers collected → 41 cited |
| **Code** | 10,062 lines of Python |
| **Runtime** | ~2 h 56 min |
| **Figures** | 6 auto-generated charts |
| **Pages** | 14 pages (NeurIPS format) |
#### Key Result
Generated estimator-switching phase diagrams revealing that Fuller-*k* dominates in specific small-*n*, many-instrument regions, while JIVE's bias reduction is systematically offset by variance inflation, providing actionable guidance for empirical researchers.

**Auto-Generated Framework Diagram: IVX Architecture.** Monte Carlo IV evaluation pipeline: DGP specification → estimator suite (2SLS, LIML, Fuller-k, JIVE) → finite-sample risk surfaces → phase diagram construction. Entirely auto-generated by the FigureAgent subsystem.
---
### Paper III · Epidemiological Modeling
> **Structural Identifiability and Parameter Estimation in Compartmental Epidemic Models (SIR / SEIR)**

#### Idea
Map the boundary between structural and practical identifiability in SIR vs. SEIR models across realistic observation regimes, and quantify when the Fisher Information Matrix (FIM) gives false confidence relative to profile likelihood.
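The profile-likelihood idea at the heart of this comparison can be illustrated with a toy least-squares version: fix one parameter on a grid, optimise over the other, and inspect the resulting profile. This is a sketch under simplifying assumptions (Euler integration, noise-free data, sum of squares in place of a full likelihood), with names of our choosing:

```python
import numpy as np

def sir_incidence(beta, gamma, n_days, s0=0.99, i0=0.01, dt=0.1):
    """Euler-integrated SIR model; returns daily new-infection fractions."""
    s, i = s0, i0
    steps_per_day = int(1 / dt)
    daily = []
    for _ in range(n_days):
        new = 0.0
        for _ in range(steps_per_day):
            inf = beta * s * i * dt
            s -= inf
            i += inf - gamma * i * dt
            new += inf
        daily.append(new)
    return np.array(daily)

def sse(params, data):
    """Sum-of-squares misfit of a (beta, gamma) pair against observed incidence."""
    beta, gamma = params
    return float(np.sum((sir_incidence(beta, gamma, len(data)) - data) ** 2))

truth = sir_incidence(beta=0.4, gamma=0.2, n_days=60)
# Profile over beta: for each fixed beta, minimise the misfit over gamma.
betas = np.linspace(0.3, 0.5, 21)
gammas = np.linspace(0.1, 0.3, 21)
profile = [min(sse((b, g), truth) for g in gammas) for b in betas]
best_beta = float(betas[int(np.argmin(profile))])
```

A flat profile over a wide `beta` range would flag practical non-identifiability, which is exactly the signal a local FIM-based standard error can miss.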
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 388 papers collected → 29 cited |
| **Code** | 9,374 lines of Python |
| **Runtime** | ~2 h 23 min |
| **Figures** | 6 auto-generated charts |
| **Pages** | 18 pages (NeurIPS format) |
#### Key Result
Demonstrated that parameterization and observer design choices affect identifiability diagnostics more strongly than the choice between SIR and SEIR structure, with the FIM producing overconfident estimates in specific observation-limited regimes where profile likelihood correctly flags non-identifiability.

**Auto-Generated Framework Diagram: PRIM Architecture.** PRIM benchmark workflow: synthetic outbreak generation (SIR/SEIR) → parameter estimation → profile likelihood vs. FIM diagnostics → identifiability regime mapping. Entirely auto-generated by the FigureAgent subsystem.
---
### Paper IV · Numerical Linear Algebra
> **Comparative Analysis of Preconditioning Strategies for Krylov Subspace Methods on Sparse Linear Systems**

#### Idea
Go beyond asking which preconditioner wins: build a feature-conditioned decision map for ILU / Jacobi / SSOR / AMG with CG / GMRES / BiCGSTAB, stratified by sparsity-graph structure and matrix pathology under realistic setup-vs-solve cost budgets.
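One cell of such a solver–preconditioner grid looks like the following Jacobi-preconditioned CG harness. This is a self-contained NumPy sketch on a dense 1-D Laplacian; the paper's benchmark uses sparse formats and the full method grid:

```python
import numpy as np

def pcg(A, b, m_inv_diag, tol=1e-8, max_iter=1000):
    """Jacobi-preconditioned conjugate gradient for a symmetric positive
    definite matrix A; m_inv_diag is the inverse of diag(A)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = m_inv_diag * r          # apply the diagonal preconditioner
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            return x, k + 1
        z = m_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Classic SPD test problem: the 1-D discrete Laplacian.
n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, iters = pcg(A, b, m_inv_diag=1.0 / np.diag(A))
residual = float(np.linalg.norm(A @ x - b))
```

A decision map records, for each matrix family, both `iters` and the preconditioner's setup cost; Jacobi's setup is trivial, while ILU or AMG trade heavier setup for fewer iterations, which is exactly the tradeoff the paper budgets.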
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 320 papers collected → 33 cited |
| **Code** | 14,557 lines of Python |
| **Runtime** | ~2 h 30 min |
| **Figures** | 4 auto-generated charts |
| **Pages** | 16 pages (NeurIPS format) |
#### Key Result
Produced a setup-vs-solve tradeoff analysis showing that methods considered "best" under solve-time alone are often suboptimal under realistic memory and setup budgets, with AMG dominance limited to specific elliptic SPD matrix families.

**Auto-Generated Framework Diagram: Krylov Preconditioner Architecture.** Feature-conditioned preconditioner evaluation: sparse matrix collection → structural descriptor extraction → solver–preconditioner grid (CG/GMRES/BiCGSTAB × ILU/Jacobi/SSOR/AMG) → setup-vs-solve tradeoff analysis → decision map. Entirely auto-generated by the FigureAgent subsystem.
---
## Batch B · Machine Learning & AI
Generated on Machine B · NVIDIA RTX 6000 Ada (48 GB) · 4 papers across 4 ML sub-fields
---
### Paper V · Parameter-Efficient Fine-Tuning
> **GARD: Gradient-Spectral Rank Allocation for LoRA Fine-Tuning**

#### Idea
Most LoRA configurations use a fixed, uniform rank across all layers. GARD proposes using the *spectrum of layer-wise gradients* (the eigenvalues of the gradient covariance) to dynamically allocate rank where it matters most, under a strict parameter budget.
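The two ingredients, a spectral score per layer and a budgeted allocation, can be sketched as follows. This is a hypothetical NumPy rendering of the idea with an entropy-based effective rank as the score; GARD's actual scoring and smoothing are defined in the paper, not here:

```python
import numpy as np

def effective_rank(grads):
    """Entropy-based effective rank of the covariance of flattened
    gradient samples (rows = samples, columns = parameters)."""
    cov = grads.T @ grads / len(grads)
    eig = np.clip(np.linalg.eigvalsh(cov), 1e-12, None)
    p = eig / eig.sum()
    return float(np.exp(-(p * np.log(p)).sum()))

def allocate_ranks(scores, budget):
    """Split a total rank budget across layers in proportion to their scores."""
    raw = np.asarray(scores, dtype=float) / np.sum(scores) * budget
    ranks = np.floor(raw).astype(int)
    # hand the leftover budget to the layers with the largest fractional parts
    for i in np.argsort(ranks - raw)[: budget - ranks.sum()]:
        ranks[i] += 1
    return ranks

rng = np.random.default_rng(0)
layer_grads = [
    rng.standard_normal((64, 16)),                                # broad spectrum
    rng.standard_normal((64, 2)) @ rng.standard_normal((2, 16)),  # ~rank-2 gradients
]
scores = [effective_rank(g) for g in layer_grads]
ranks = allocate_ranks(scores, budget=16)
```

Run periodically during training on smoothed spectra, this concentrates adapter capacity on layers whose gradients explore more directions.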
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 60 references cited (100% verified) |
| **Code** | 2,894 lines of Python (5 files) |
| **Runtime** | ~50 min |
| **Figures** | 7 auto-generated charts |
| **Pages** | 17 pages (NeurIPS format) |
#### Key Contribution
A principled alternative to uniform rank allocation: GARD links intrinsic gradient dimensionality to low-rank adapter capacity, periodically updating ranks during training using smoothed spectra.

**Auto-Generated Framework Diagram: GARD Architecture.** Gradient spectral analysis → layer-wise rank scoring → dynamic rank allocation under budget constraint. Entirely auto-generated by the FigureAgent subsystem.
---
### Paper VI · Reinforcement Learning
> **LACE: Learned Abstractions for Count-Based Exploration in Sparse-Reward RL**

#### Idea
Count-based exploration in RL relies on state visitation counts, but raw state spaces are too large for effective counting. LACE designs *online-learned, task-aware state abstractions* optimized specifically for count-based exploration in sparse-reward environments.
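The count-based half of the recipe is standard and easy to sketch; the novelty lies in learning the abstraction online. Below, a fixed grid discretisation stands in for LACE's learned abstraction, and the class and function names are ours:

```python
import numpy as np
from collections import defaultdict

class CountBonus:
    """Count-based exploration bonus computed over an abstraction of the state."""

    def __init__(self, abstraction, scale=1.0):
        self.abstraction = abstraction   # maps a raw state to a hashable code
        self.counts = defaultdict(int)
        self.scale = scale

    def bonus(self, state):
        """Visit the state's abstract cell and return scale / sqrt(count)."""
        code = self.abstraction(state)
        self.counts[code] += 1
        return self.scale / np.sqrt(self.counts[code])

# Toy abstraction: discretise a continuous 2-D state onto a coarse grid.
def coarse_grid(state):
    return (int(state[0] * 4), int(state[1] * 4))

cb = CountBonus(coarse_grid)
b1 = cb.bonus((0.1, 0.1))    # first visit to this abstract cell
b2 = cb.bonus((0.12, 0.11))  # nearby state maps to the same cell: bonus decays
```

The bonus is added to the environment reward inside the DQN loop; LACE's contribution is replacing `coarse_grid` with an abstraction trained online to be compact for counting yet informative for control.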
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 25 references cited (100% verified) |
| **Code** | 2,067 lines of Python (4 files) |
| **Experiment** | 32 min GPU sandbox execution |
| **Runtime** | ~6.8 hrs total |
| **Figures** | 6 auto-generated charts |
| **Pages** | 11 pages (NeurIPS format) |
#### Key Result
The DQN baseline achieves **356.7 mean reward** in sparse-reward gridworld tasks. The paper analyzes the trade-off between abstraction compactness for counting and information sufficiency for downstream control.

**Auto-Generated Framework Diagram: LACE Architecture.** Learned state abstraction module integrated with count-based exploration in the DQN agent loop. Entirely auto-generated by the FigureAgent subsystem.
---
### Paper VII · Efficient Vision Transformers
> **FAME: Frequency-Aware Progressive Token Merging for Efficient ViT Inference**

#### Idea
Existing ViT token pruning methods reduce tokens based on attention or saliency without considering *frequency content*. FAME uses DCT/FFT-based spectral filters to distinguish high-frequency detail tokens from low-frequency background tokens, merging progressively across layers.
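A single merging step of this kind can be sketched with an FFT-based energy score: rank tokens by how much of their patch's spectral energy sits outside the DC bin, keep the high-frequency ones, and merge the rest. This is a minimal NumPy illustration with our own function names; FAME's per-layer schedule and DCT filters are specified in the paper:

```python
import numpy as np

def high_freq_energy(patches):
    """Fraction of each patch's spectral energy outside the DC bin."""
    spec = np.abs(np.fft.fft2(patches)) ** 2          # shape (n, h, w)
    total = spec.sum(axis=(1, 2))
    return (total - spec[:, 0, 0]) / np.maximum(total, 1e-12)

def merge_tokens(tokens, patches, keep):
    """Keep the high-frequency tokens; merge the rest into one mean token."""
    order = np.argsort(-high_freq_energy(patches))
    kept = tokens[order[:keep]]
    merged = tokens[order[keep:]].mean(axis=0, keepdims=True)
    return np.concatenate([kept, merged], axis=0)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 64))     # 16 tokens, embedding dim 64
patches = rng.standard_normal((16, 8, 8))  # the image patch behind each token
out = merge_tokens(tokens, patches, keep=8)
```

Applied progressively, later layers see fewer tokens, trading a small accuracy loss for inference speed, which is the tradeoff the paper quantifies.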
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 40 references cited (100% verified) |
| **Code** | 2,873 lines of Python (5 files) |
| **Experiment** | 32 min GPU sandbox execution |
| **Runtime** | ~3.3 hrs total |
| **Figures** | 7 auto-generated charts |
| **Pages** | 10 pages (NeurIPS format) |
#### Key Result
ViT-B/16 baseline: **56.54% accuracy** (3 seeds), with a detailed analysis of the accuracy-efficiency tradeoff and per-layer metric breakdowns for frequency-aware vs. similarity-based merging.

**Auto-Generated Framework Diagram: FAME Architecture.** Frequency-aware token merging applied progressively across ViT layers with DCT-based spectral filtering. Entirely auto-generated by the FigureAgent subsystem.
---
### Paper VIII · Knowledge Distillation
> **CRAFT: Contrastive Feature Alignment for Robust Distillation Under Distribution Shift**

#### Idea
Standard knowledge distillation transfers teacher knowledge assuming train/test distributions match. CRAFT introduces *reliability-aware contrastive feature alignment* that aligns teacher-student features across clean and corrupted views, while suppressing fragile teacher directions via a de-alignment loss.
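The contrastive-alignment building block can be sketched with an InfoNCE-style loss over teacher and student features: each student feature should match its own teacher feature against all others in the batch. This is a generic NumPy sketch of contrastive alignment, not CRAFT's reliability-aware loss, and the function name is ours:

```python
import numpy as np

def info_nce(student, teacher, temp=0.1):
    """Contrastive alignment loss: positives on the diagonal of the
    cosine-similarity matrix, all other batch entries as negatives."""
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    logits = s @ t.T / temp                          # (B, B) similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
teacher = rng.standard_normal((32, 128))
student = teacher + 0.01 * rng.standard_normal((32, 128))  # well-aligned student
aligned = info_nce(student, teacher)
shuffled = info_nce(rng.permutation(teacher), teacher)     # misaligned control
```

CRAFT builds on this by computing the loss across clean and corrupted views and down-weighting (de-aligning) teacher feature directions that prove fragile under corruption.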
#### Pipeline Journey
| | |
|:---|:---|
| **Stages** | 23 stages + 2 refinement iterations |
| **Literature** | 37 references cited (97% verified) |
| **Code** | 2,231 lines of Python (4 files) |
| **Experiment** | 33 min GPU sandbox execution |
| **Runtime** | ~5.8 hrs total |
| **Figures** | 9 auto-generated charts |
| **Pages** | 19 pages (NeurIPS format) |
#### Key Result
| Method | Clean Acc | Robust Acc |
|:---|:---:|:---:|
| ERM (baseline) | 81.22% | 62.96% |
| LogitKD | **82.33%** | 64.68% |
| AttentionKD | 82.08% | **65.95%** |
| CRD | 68.03% | 50.57% |

Attention-based feature KD improves robustness by **+3 pts** over ERM, while naive CRD degrades it by **-12 pts**, motivating CRAFT's reliability-aware design.

**Auto-Generated Framework Diagram: CRAFT Architecture.** Reliability-aware contrastive feature alignment between teacher and student across clean and corrupted views, with de-alignment on fragile teacher directions. Entirely auto-generated by the FigureAgent subsystem.
---
## Aggregate Statistics
| Metric | I | II | III | IV | V | VI | VII | VIII | Total |
|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Domain | Math | Stats | Bio | NumLA | NLP | RL | CV | KD | 8 fields |
| Code (LOC) | 10,290 | 10,062 | 9,374 | 14,557 | 2,894 | 2,067 | 2,873 | 2,231 | 54,348 |
| Pipeline Time | 2h25m | 2h56m | 2h23m | 2h30m | 50m | 6h48m | 3h18m | 5h48m | ~27 hrs |
| References | 26 | 41 | 29 | 33 | 60 | 25 | 40 | 37 | 291 cited |
| Figures | 5 | 6 | 6 | 4 | 7 | 6 | 7 | 9 | 50 figs |
| Pages | 16 | 14 | 18 | 16 | 17 | 11 | 10 | 19 | 121 pages |
---
## Try It Yourself
Every paper above was generated by a single command:
```bash
researchclaw run --topic "Your research idea here" --auto-approve
```