Contribute

Interested in contributing or formalizing your own paper?

To get started in formalizing your own paper, clone the repository and open an LLM agent tool (I use Codex with GPT 5.5 in xhigh thinking mode). Give the agent the paper link, and ask it to formalize the paper using the skill and workflow in the repository. (And please let me know what your experience is like).

New paper formalizations should start in a private workflow and be proposed to enter the library through a pull request when ready. For questions or contributions, contact ngarg@cornell.edu. See CONTRIBUTING.md for more detail.

Library: reusable components

The shared library contains paper-independent Lean infrastructure that current paper formalizations reuse.

Component Examples Lines of code
Foundations: finite math and graph tools Finite-sum rewrites, order/rank lemmas, threshold and interval characterizations, asymptotic/exponential estimates, and cycle extraction in finite directed graphs. 9,263
Foundations: probability and stochastic processes Finite distributions, conditional expectations, kernels, atom and variance lemmas, Gaussian and exponential calculations, stochastic dominance, Markov chains, MDPs, CTMCs, and renewal-reward identities. 63,440
Foundations: optimization and certificates Argmax and endpoint principles, finite-search certificates, approximation guarantees, linear-program certificates, binary-choice optimality, move-graph descent, and choice-equilibrium existence. 2,923
Rating-system applications Binary and ordinal signal models, posterior-mean ratings, bias/variance decompositions, prior-weighted updates, monotonicity/correction lemmas, and minimal bandit-regret interfaces. 502
Matching markets Preference profiles, blocking pairs and stable matchings, deferred-acceptance invariants, proposer incentives, and many-to-one admissions models. 3,418
Auctions and mechanisms Allocation and payment rules, dominant-strategy truthfulness, benchmark-competitive digital-goods auctions, VCG-style welfare maximization, and single-minded set-packing mechanisms. 12,345
Online algorithms and regret Online matching/allocation state machines, primal-dual accounting, competitive-ratio certificates, regret interfaces, and platform-learning abstractions. 4,372
Complexity abstractions Decision/search problem interfaces, reductions, NP/ZPP-style class consequences, Yao-style minimax wrappers, and explicit external-solver assumptions. 463
Applications: recommender systems Exposure and allocation policies, classwise fairness constraints, top-k recommendation surfaces, policy averaging, and accuracy/diversity trade-off statements. 2,714
Social choice, rankings, and fair division Ranking profiles, Kendall distance and Mallows-style sequential ranking models, score/payoff interfaces, envy graphs, bounded-envy allocations, and indivisible-goods fairness statements. 9,546

Current status

The public repository currently contains formalized papers and selected public partial formalizations whose remaining assumptions are explicit. Human review and LLM-as-judge statement translation refer to validation that the Lean statement corresponds to the corresponding mathematical statement in the paper.

Paper Status/Artifacts Full proof LOC Note Human review LLM-as-judge statement translation
College Admissions and the Stability of Marriage by D. Gale and L. S. Shapley; American Mathematical Monthly, 1962. 388 This only uses a few lines of code as its infrastructure has largely been elevated to the shared matching library. 0/7 reviewed 7/7 match
The Economics of Matching: Stability and Incentives by Alvin E. Roth; Mathematics of Operations Research, 1982. 8,931 0/27 reviewed 27/27 match
Competitive Auctions and Digital Goods by Andrew V. Goldberg, Jason D. Hartline, and Andrew Wright; SODA, 2001. 14,624 Formalizes the SODA paper; Theorem 8.2 uses the refined monotone-auction wording from the journal version Goldberg-Hartline-Karlin-Saks-Wright 2006. 0/18 reviewed 18/18 match
AdWords and Generalized Online Matching by Aranyak Mehta, Amin Saberi, Umesh Vazirani, and Vijay V. Vazirani; Journal of the ACM, 2007. 13,598 0/26 reviewed 26/26 match
Who is in Your Top Three? Optimizing Learning in Elections with Many Candidates by Nikhil Garg, Lodewijk Gelauff, Sukolsak Sakshuwong, Ashish Goel; HCOMP, 2019. 35,032 0/10 reviewed 10/10 match
Designing Informative Rating Systems: Evidence from an Online Labor Market by Nikhil Garg, Ramesh Johari; Manufacturing & Service Operations Management 23(3), 2020. 7,027 0/7 reviewed 7/7 match
Test-optional Policies: Overcoming Strategic Behavior and Informational Gaps by Zhi Liu and Nikhil Garg; EAAMO, 2021. 125,744 0/16 reviewed 16/16 match
Driver Surge Pricing by Nikhil Garg and Hamid Nazerzadeh; Management Science, 2022. 142,927 0/34 reviewed 24/24 match
Reconciling the Accuracy-Diversity Trade-off in Recommendations by Kenny Peng, Manish Raghavan, Emma Pierson, Jon Kleinberg, and Nikhil Garg; The ACM Web Conference, 2024. 52,018 Proposition 2's printed finite bound appears to miss a factor of 2; Lean proves the corrected finite bound, which is sufficient for the asymptotic 1/2-homogeneity result. 0/42 reviewed 27/27 match
User-item fairness tradeoffs in recommendations by Sophie Greenwood, Sudalakshmee Chiniah, and Nikhil Garg; NeurIPS, 2024. 46,073 0/18 reviewed 18/18 match
A No Free Lunch Theorem for Human-AI Collaboration by Kenny Peng, Nikhil Garg, Jon Kleinberg; AAAI, 2025. 1,938 0/6 reviewed 6/6 match
Addressing Discretization-Induced Bias in Demographic Prediction by Evan Dong, Aaron Schein, Yixin Wang, and Nikhil Garg; PNAS Nexus, 2025. 26,133 0/32 reviewed 32/32 match
Balancing Producer Fairness and Efficiency via Prior-Weighted Rating System Design by Thomas Ma, Michael S. Bernstein, Ramesh Johari, and Nikhil Garg; ICWSM, 2025. 680 Formalization required an additional interior-quality assumption (0 < q_v < 1) for the strict variance-decrease statement. 10/27 reviewed; 2 uncertain 17/17 match
Truth Revelation in Approximately Efficient Combinatorial Auctions by Daniel Lehmann, Liadan Ita O'Callaghan, and Yoav Shoham; Journal of the ACM, 2002. 7,582 Greedy approximation, truthfulness, and Theorem 6.1 reductions are formalized. Full formalization requires computational complexity results that are out of scope. 0/30 reviewed 30/30 match
On Approximately Fair Allocations of Indivisible Goods by Richard J. Lipton, Evangelos Markakis, Elchanan Mossel, and Amin Saberi; ACM EC, 2004. 80,496 Sections 2 and 4 are fully formalized. Section 3 has query/descent/rounded-search support. The PTAS/FPTAS runtime layer needs reusable fixed-dimension IP complexity infrastructure. 0/35 reviewed 31/35 match; 4 additional assumptions
Iterative Local Voting for Collective Decision-making in Continuous Spaces by Nikhil Garg; Vijay Kamble; Ashish Goel; David Marn; Kamesh Munagala; JAIR 64, 2019. 23,379 Full formalization requires proving stochastic subgradient descent convergence. Theorem 3 is proved as a constrained alternative in general and as the original statement under the explicit full-space condition. 0/42 reviewed 35/40 match; 5 additional assumptions
Quantifying Spatial Under-reporting Disparities in Resident Crowdsourcing by Zhi Liu, Uma Bhandaram, Nikhil Garg; Nature Computational Science 4, 57-65 (2024); version of record published 2023-12-05; DOI 10.1038/s43588-023-00572-6. 18,540 Full formalization requires a homogeneous Poisson process and stopping time derivation 0/87 reviewed 28/84 match; 56 additional assumptions