> [!NOTE]
> **📌 Early release (2026)**
>
> MLSys·im shipped with the **2026** MLSysBook refresh. The analytical modeling framework, APIs, and lab integrations are **actively iterated** as we harden the package and teaching workflows.
>
> **Feedback:** [GitHub issues](https://github.com/harvard-edge/cs249r_book/issues) or pull requests.
>
> [![dev branch](https://img.shields.io/badge/branch-dev-orange?logo=git&logoColor=white)](https://github.com/harvard-edge/cs249r_book/tree/dev) [![live site](https://img.shields.io/badge/live_site-mlsysbook.ai-blue?logo=safari&logoColor=white)](https://mlsysbook.ai)

# MLSys·im: The Modeling Platform

A first-principles analytical modeling framework for ML systems.
Designed for education and early design-space reasoning before empirical benchmarking.
---

## 🏗 The 5-Layer Analytical Stack

`mlsysim` implements a "Progressive Lowering" architecture, separating high-level workloads from the physical infrastructure that executes them.

| Layer | Domain | Key Components |
|---|---|---|
| **Layer A** | Workload Representation (`mlsysim.models`) | FLOPs, parameters, and intensity (e.g., Llama3_70B, ResNet50) |
| **Layer B** | Hardware Registry (`mlsysim.hardware`) | Concrete specs for real-world silicon (e.g., H100, TPUv5p, Jetson) |
| **Layer C** | Infrastructure (`mlsysim.infra`) | Grid profiles and datacenter sustainability (e.g., PUE, Carbon Intensity, WUE) |
| **Layer D** | Systems & Topology (`mlsysim.systems`) | Fleet configurations and network fabrics (e.g., Doorbell, AutoDrive Scenarios) |
| **Layer E** | Execution & Resolvers (`mlsysim.core.solver`) | The 3-tier math engine: Models, Solvers, and Optimizers (design-space search) |
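
To make the layering concrete, here is a minimal sketch of how a workload description might be lowered through a hardware spec and a site profile into time, energy, and carbon. The names `Workload`, `Accelerator`, `Site`, and `lower`, along with every number, are hypothetical and chosen for illustration; they are not the `mlsysim` API or its registry values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Layer A (hypothetical): what must be computed."""
    flops: float        # total floating-point operations
    bytes_moved: float  # total bytes moved to/from device memory


@dataclass(frozen=True)
class Accelerator:
    """Layer B (hypothetical): what one device can sustain at peak."""
    peak_flops: float   # FLOP/s
    mem_bw: float       # bytes/s
    power_w: float      # watts drawn while busy


@dataclass(frozen=True)
class Site:
    """Layer C (hypothetical): facility overhead and grid mix."""
    pue: float               # power usage effectiveness, >= 1.0
    carbon_g_per_kwh: float  # grid carbon intensity


def lower(w: Workload, hw: Accelerator, site: Site) -> dict:
    """Lower a workload through hardware and site into time, energy, carbon."""
    compute_s = w.flops / hw.peak_flops
    memory_s = w.bytes_moved / hw.mem_bw
    runtime_s = max(compute_s, memory_s)          # idealized roofline bound
    device_kwh = hw.power_w * runtime_s / 3.6e6   # joules -> kWh
    facility_kwh = device_kwh * site.pue          # add facility overhead
    return {
        "bound": "compute" if compute_s >= memory_s else "memory",
        "runtime_s": runtime_s,
        "energy_kwh": facility_kwh,
        "carbon_g": facility_kwh * site.carbon_g_per_kwh,
    }


# Illustrative round numbers only, not registry values.
result = lower(
    Workload(flops=2e15, bytes_moved=1e12),
    Accelerator(peak_flops=1e15, mem_bw=2e12, power_w=500.0),
    Site(pue=1.2, carbon_g_per_kwh=100.0),
)
print(result["bound"], result["runtime_s"])
```

In this sketch, swapping only the `Site` changes the carbon estimate without touching the workload or hardware description, which is the kind of separation the layered stack is meant to provide.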
---

## Quick Usage: Automation-Friendly CLI

`mlsysim` is a **first-principles analytical modeling framework** for ML systems. It provides a terminal UI for humans and strict JSON output for scripts, CI/CD pipelines, and validation tooling.

> **Accuracy note:** Trust mlsysim for bottleneck classification and relative comparisons. Absolute latency is workload-dependent; well-calibrated cases are often within ±15–30%, while production serving can be 1.5–2× slower than idealized roofline bounds. For production capacity planning, validate with benchmarks.

### 1. Explore the Registry (The Zoo)

Discover built-in hardware, models, and infrastructure without reading source code:

```bash
mlsysim zoo hardware
mlsysim zoo models
```

### 2. Quick Evaluation (CLI Flags)

Evaluate the physics of a workload on a specific hardware node instantly:

```bash
mlsysim eval Llama3_8B H100 --batch-size 32
```

### 3. Full-Stack Analytical Run (Infrastructure as Code)

Define your entire cluster and SLA constraints in a declarative `mlsys.yaml` file:

```yaml
# example_cluster.yaml
version: "1.0"
name: "Llama-3 70B training audit"
workload:
  name: "Llama3_70B"
  batch_size: 4096
hardware:
  name: "H100"
  nodes: 64
ops:
  region: "Quebec"
  duration_days: 14.0
constraints:
  assert:
    - metric: "performance.latency"
      max: 50.0
```

Then compile and evaluate the 3-lens scorecard (Feasibility, Performance, Macro):

```bash
mlsysim eval example_cluster.yaml
```

### 4. CI/CD & Automation

Every command supports strict, schema-validated JSON output. If an `assert` constraint is violated, the CLI returns a semantic `Exit Code 3`.

```bash
# Export the JSON Schema for your IDE or validation tooling
mlsysim schema > schema.json

# Run an evaluation in a CI pipeline
tco=$(mlsysim --output json eval example_cluster.yaml | jq .m_tco_usd)
```

### 5. Design Space Search (Optimizers)

Use the Tier 3 Engineering Engine to automatically find the optimal configuration:

```bash
mlsysim optimize parallelism example_cluster.yaml
mlsysim optimize placement example_cluster.yaml --carbon-tax 150
```

---

## Stability & Integrity

Because this core powers a printed textbook, we enforce strict **Invariant Verification**. Registry constants are traceable to primary sources where available, and dimensional integrity is enforced via `pint`.

## Release-Facing Modeling Workflows

- `TrainingMemoryModel`: weights, gradients, optimizer state, activations, and communication buffers per accelerator.
- `ServingCapacityModel`: first-pass replica sizing from QPS, target P99 latency, generated length, batching capacity, and queueing.
- `MoERoutingModel`: MoE active-parameter and expert-parallel traffic sensitivity under hot-expert imbalance.

## What This Tool Does Not Model

MLSys·im is an **analytical modeling framework** for first-pass reasoning, not a production serving or orchestration system. The 22 walls model physical and economic constraints that bound ML system performance. Several critical production concerns are deliberately **out of scope**:

| Concern | Why it matters | Where to learn more |
|---|---|---|
| Data drift / distribution shift | The #1 cause of production ML failures: model accuracy degrades silently as input distributions change | Sculley et al. (2015), "Hidden Technical Debt in ML Systems" |
| Model versioning & rollback | Production requires running multiple versions, A/B testing, and safe rollback | Huyen (2022), *Designing Machine Learning Systems* |
| Monitoring & observability | You cannot manage what you cannot measure: prediction distributions, latency percentiles, error rates | Google SRE Book (2016); Huyen (2022) |
| Feature store freshness | Stale features silently degrade real-time models (recommendations, fraud detection) | Uber Michelangelo (2017) |
| Software bugs & misconfigurations | Most outages are caused by software, not hardware | Barroso et al. (2018) |
| Human factors | Team velocity, on-call burden, and organizational alignment often dominate outcomes | Brooks (1975), *The Mythical Man-Month* |

**Passing all 22 walls is necessary but not sufficient for a successful production deployment.** Students using this tool should understand that infrastructure physics (what mlsysim models) is one dimension of a multi-dimensional engineering challenge.

## How to Cite

If you use mlsysim in your research or teaching, please cite:

```bibtex
@software{mlsysim2026,
  author      = {Janapa Reddi, Vijay},
  title       = {{MLSys$\cdot$im}: First-Principles Infrastructure Modeling for Machine Learning Systems},
  year        = {2026},
  url         = {https://mlsysbook.ai/mlsysim},
  version     = {0.1.2},
  institution = {Harvard University}
}
```

## Installation

MLSys·im is designed to be highly modular. Install only what you need:

```bash
# Core physics engine only (fastest, smallest footprint)
pip install mlsysim

# The CLI and YAML support are included in the base package.
# The [cli] extra is retained as a backward-compatible no-op.
pip install "mlsysim[cli]"

# Install plotting dependencies
pip install "mlsysim[viz]"
```

## Python API Usage

The framework is just as useful inside a Python script or Jupyter Notebook. The `SystemEvaluator` provides a clean, unified entry point for full-stack analysis:

```python
import mlsysim

# 1. Define the scenario
model = mlsysim.Models.Language.Llama3_8B
hardware = mlsysim.Hardware.Cloud.H100

# 2. Run the evaluation
evaluation = mlsysim.SystemEvaluator.evaluate(
    scenario_name="Llama-3 8B on H100",
    model_obj=model,
    hardware_obj=hardware,
    batch_size=32,
    precision="fp16",
    efficiency=0.45
)

# 3. View the formatted scorecard
print(evaluation.scorecard())
```

### Efficiency Parameter Guide

The `efficiency` parameter (0.0–1.0) captures the gap between peak hardware performance and what your software stack actually achieves. Use these guidelines:

| Scenario | Efficiency | Rationale |
|---|---|---|
| Training (Megatron-LM, large Transformer) | 0.40–0.55 | Well-optimized GEMM + FlashAttention |
| Training (PyTorch eager, small model) | 0.08–0.15 | Kernel launch overhead dominates |
| Inference decode, batch=1 | 0.01–0.05 | Memory-bound; compute nearly idle |
| Inference decode, batch=32+ | 0.15–0.35 | Batch amortizes weight loading |
| Inference prefill, long context | 0.30–0.50 | Compute-bound GEMM + attention |
| TinyML (TFLite Micro on ESP32) | 0.05–0.15 | Interpreter overhead, no tensor cores |
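
As a rough illustration of why the choice within a range matters, the sketch below treats `efficiency` as a plain multiplier on peak FLOP/s and reports the resulting spread in compute time. That multiplier interpretation is an assumption made for this example, not a confirmed description of how the `mlsysim` solver applies the parameter; the helper names `achieved_time_s` and `time_range_s` and the workload numbers are likewise hypothetical.

```python
def achieved_time_s(flops: float, peak_flops: float, efficiency: float) -> float:
    """Compute-bound time when only `efficiency` of peak FLOP/s is achieved.

    Assumes efficiency is a simple multiplier on peak throughput.
    Zero is rejected because it would imply infinite runtime.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0.0, 1.0]")
    return flops / (peak_flops * efficiency)


def time_range_s(flops: float, peak_flops: float,
                 eff_low: float, eff_high: float) -> tuple[float, float]:
    """Return (best_case, worst_case) time across an efficiency range."""
    best = achieved_time_s(flops, peak_flops, eff_high)
    worst = achieved_time_s(flops, peak_flops, eff_low)
    return best, worst


# Illustrative numbers: 1e15 FLOPs of work on a device with 1e15 FLOP/s peak,
# using the 0.40-0.55 range from the first table row.
best, worst = time_range_s(1e15, 1e15, eff_low=0.40, eff_high=0.55)
print(f"best case:  {best:.2f} s")
print(f"worst case: {worst:.2f} s")
```

Reporting both ends of the range, rather than a single point estimate, keeps the uncertainty in the efficiency guess visible in downstream latency or cost figures.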

---

## Contributors

Thanks to these wonderful people for helping improve MLSys·im!

**Legend:** Bug Hunter · Code Contributor · Documentation Contributor · Design Contributor · Idea Contributor · Code Reviewer · Test Engineer · Tool Builder

| Contributor | Contributions |
|---|---|
| Vijay Janapa Reddi | 🧑‍💻 🎨 ✍️ 🧠 maintenance |
| Peter Koellner | 🪲 ✍️ |
| Rocky | 🪲 🧑‍💻 |
| Zeljko Hrcek | 🧑‍💻 |

**Recognize a contributor:** Comment on any issue or PR:

```text
@all-contributors please add @username for code, doc, ideas, or bug
```

---

## License

**Code:** [Apache License 2.0](LICENSE.md), free for commercial and non-commercial use, with patent grant and attribution requirement.

**Documentation and textbook prose:** [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC-BY-NC-SA-4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). The tutorials and prose on [mlsysbook.ai/mlsysim](https://mlsysbook.ai/mlsysim) are part of the *Machine Learning Systems* textbook and carry its license.

The two licenses are intentionally separate: the Python package is permissively licensed so engineers and researchers can use it anywhere (including commercially), while the textbook prose retains its non-commercial protection to prevent republication as a derivative textbook.

Copyright © 2026 Vijay Janapa Reddi and MLSys·im contributors.