# AMO-Lean v1.0.1 -- Benchmark Report **Date**: 2026-02-12 **Platform**: macOS Darwin 24.6.0, Apple Silicon (ARM64) **Lean**: v4.16.0, Mathlib v4.16.0 **C Compiler**: Apple clang 17, -O2/-O3 --- ## 1. Build Health | Target | Modules | Result | |--------|---------|--------| | `lake build AmoLean` | 2647 | **PASS** (0 errors) | | `lake build Tests` | 10 targets | **PASS** (10/10) | | `lake build Benchmarks` | 3 targets | **PASS** | **Codebase**: 32,650 LOC across 81 Lean files. --- ## 2. Formal Verification Audit ### Sorry Inventory (27 total) | Classification | Count | Components | |----------------|-------|------------| | **Active** (genuine gaps) | 14 | Poseidon (12 match splitter), FRI/Merkle (2) | | **Deprecated** (superseded) | 7 | Verification/Theorems (Float-based, obsolete) | | **Commented out** (inactive code) | 6 | Matrix/Perm (4), NTT/Spec (1), NTT/Properties (1) | ### Axiom Inventory (12 total) | Component | Axioms | Risk Level | |-----------|--------|------------| | Goldilocks Field | 0 | **CLEAN** (5 axioms eliminated 2026-02-11) | | BabyBear Field | 0 | **CLEAN** (4 axioms eliminated 2026-02-12) | | NTT Radix-4 | 8 | Medium (roundtrip correctness, specification axioms) | | NTT ListFinsetBridge | 3 | Low (bridge axioms avoiding import cycles) | | Matrix/Perm | 1 | Low (match splitter workaround) | ### Component Verification Status | Component | Sorry | Axioms | Verdict | |-----------|-------|--------|---------| | NTT Core (Butterfly, CooleyTukey, Correctness, LazyButterfly) | 0 | 0 | **CLEAN** | | NTT Radix-4 | 0 | 8 | Axiomatized | | FRI Folding (FRI_Properties) | 0 | 0 | **CLEAN** | | Matrix/Perm | 0 active | 1 | **CLEAN** | | EGraph/VerifiedRules | 0 | 0 | **CLEAN** (19/20 rules) | | AlgebraicSemantics | 0 | 0 | **CLEAN** (19/19 cases proven) | | Poseidon_Semantics | 12 | 0 | Computationally verified | | Goldilocks Field | 0 | 0 | **CLEAN** (5 axioms eliminated) | | BabyBear Field | 0 | 0 | **CLEAN** (4 axioms + 4 sorry eliminated) | --- ## 3. NTT Correctness ### Oracle Tests (C implementation vs Lean specification) | Test | Size | Result | |------|------|--------| | Output matches Lean | N=4 | PASS | | Roundtrip INTT(NTT(x)) = x | N=4 | PASS | | Output matches Lean | N=8 | PASS | | Roundtrip INTT(NTT(x)) = x | N=8 | PASS | | Roundtrip | N=16 | PASS | | Roundtrip | N=32 | PASS | | **Total** | | **6/6 PASS** | ### Plonky3 Equivalence (bit-exact comparison) | Size | Tests per size | Result | |------|---------------|--------| | N=8 to N=1024 | 6 per size (sequential, zeros, ones, impulse, max, random) | PASS | | Roundtrip (Plonky3 + AMO-Lean) | 8 sizes x 2 | PASS | | **Total** | | **64/64 PASS** | ### Bit Reversal | Test | Sizes | Result | |------|-------|--------| | Known values | N=8, N=16 | PASS | | Involution (bit_reverse(bit_reverse(x)) = x) | N=8 to N=1024 | PASS | | Bijection (one-to-one) | N=8 to N=1024 | PASS | | Large N | N=4096, N=65536 | PASS | | Permutation consistency | N=8 | PASS | | **Total** | | **35/35 PASS** | ### C Kernel (Lazy Reduction) | Test | Iterations | Result | |------|-----------|--------| | Edge case max bounds (lazy_add, lazy_sub, lazy_mul_reduce) | 5 cases | PASS | | Modular equivalence (random) | 1000 each | PASS | | Butterfly correctness (random) | 1000 | PASS | | Butterfly sum property (a'+b' = 2a mod p) | 1000 | PASS | | Reduction idempotence | N/A | PASS | | **Total** | | **16/16 PASS** | ### Sanitizer (UBSan) | Test | Size | Result | |------|------|--------| | NTT roundtrip | N=1024 | PASS (0 UB) | | NTT roundtrip | N=4096 | PASS (0 UB) | | Edge size | N=2 | PASS | | Edge size | N=1 | PASS | | **Total** | | **4/4 PASS, 0 UBSan violations** | --- ## 4. NTT Performance ### AMO-Lean vs Plonky3 (ARM64, Feb 2026) | Size | Plonky3 (us) | AMO-Lean (us) | Ratio | AMO Throughput | |------|-------------|---------------|-------|----------------| | N=256 | 3.4 +/- 0.5 | 4.8 +/- 0.4 | 1.39x | 53.7 Melem/s | | N=512 | 7.1 +/- 1.0 | 10.6 +/- 0.5 | 1.49x | 48.5 Melem/s | | N=1024 | 14.6 +/- 0.5 | 23.3 +/- 2.5 | 1.59x | 44.0 Melem/s | | N=2048 | 31.0 +/- 1.0 | 50.5 +/- 1.7 | 1.63x | 40.6 Melem/s | | N=4096 | 65.9 +/- 2.2 | 109.5 +/- 0.9 | 1.66x | 37.4 Melem/s | | N=8192 | 138.4 +/- 1.9 | 255.1 +/- 13.7 | 1.84x | 32.1 Melem/s | | N=16384 | 294.3 +/- 9.4 | 535.8 +/- 12.2 | 1.82x | 30.6 Melem/s | | N=32768 | 665.8 +/- 9.7 | 1176.1 +/- 92.0 | 1.77x | 27.9 Melem/s | | N=65536 | 1477.4 +/- 29.2 | 2500.1 +/- 75.5 | 1.69x | 26.2 Melem/s | **Average ratio**: 1.65x (AMO-Lean achieves ~60% of Plonky3 throughput) ### Twiddle Caching Impact | Size | No cache (us) | Cached (us) | Speedup | |------|--------------|-------------|---------| | N=256 | 8.6 | 5.6 | 1.53x | | N=1024 | 29.0 | 26.6 | 1.09x | | N=4096 | 134.8 | 126.0 | 1.07x | | N=16384 | 635.8 | 596.9 | 1.07x | | N=65536 | 2896.0 | 2724.9 | 1.06x | --- ## 5. FRI Protocol | Test | Size | Result | |------|------|--------| | Basic fri_fold | N=4 | PASS | | fri_fold_layer | N=8 | PASS | | fri_fold_16 | N=16 | PASS | | fri_fold_256 | N=256 | PASS | | **Fold tests total** | | **4/4 PASS** | | Validation (Lean -> C) | Size | Result | |------------------------|------|--------| | small_n4 | N=4 | PASS | | alpha_zero | N=4 | PASS | | alpha_one | N=4 | PASS | | medium_n16 | N=16 | PASS | | larger_n64 | N=64 | PASS | | power2_n256 | N=256 | PASS | | **Validation total** | | **6/6 PASS** | ### FRI Fold Performance | Size | Latency (ns/elem) | |------|--------------------| | N=64 | 24.7 | | N=256 | 99.2 | | N=1024 | 693.0 | | N=4096 | 2452.0 | --- ## 6. Goldilocks Field All tests compiled with `-fsanitize=undefined -fno-sanitize-recover=all`. | Category | Tests | Result | |----------|-------|--------| | Addition (incl. wrap-around) | 6 | PASS | | Subtraction (incl. underflow) | 4 | PASS | | Multiplication (incl. max product) | 5 | PASS | | 128-bit Reduction | 5 | PASS | | Negation + Inverse | 6 | PASS | | Field Laws (commutativity, associativity, distributivity) | 5 | PASS | | S-Box (x^7, invertibility) | 6 | PASS | | **Total** | **37/37** | **PASS, 0 UBSan violations** | --- ## 7. Poseidon2 Hash ### Correctness | Test | Detail | Result | |------|--------|--------| | S-Box isolation (0^5, 1^5, small values) | 3 cases | PASS | | Square chain vs naive multiplication | 1000 random | PASS | | Algebraic identity x^5 = x*(x^2)^2 | 1000 random | PASS | | Partial round isolation | 1000 random states | PASS | | Full round modifies all elements | 100 random states | PASS | | **S-box total** | | **7/7 PASS** | | Test vector (HorizenLabs BN254 t=3) | Bit-exact | **PASS** | | Known test vectors | 8 vectors | PASS | | Self-consistency (C vs C) | 10000 random | PASS | | **Differential total** | | **2/2 PASS** | ### Performance (BN254 scalar, ARM64) | Operation | Latency | Throughput | |-----------|---------|------------| | Field multiplication | 34.0 ns | 29.4M mul/s | | S-box (x^5) | 128.3 ns | 7.8M/s | | Partial round (t=3) | 113.5 ns | 8.8M rounds/s | | Full round (t=3) | 279.5 ns | 3.6M rounds/s | | **Full hash (RF=8, RP=56)** | **~7.5 us** | **134K H/s** | --- ## 8. E-Graph Optimization Engine | Test | Result | |------|--------| | Effectiveness (optimization reduces nodes) | PASS | | Semantic equivalence (optimized = original) | PASS | | Rule audit (19/20 verified, no sorry in module) | PASS | | Compilation time (stress depth=10, 83ms < 10s limit) | PASS | | **Total** | **4/4 PASS** | --- ## 9. Security and Hardening ### Current Run (Feb 2026) | Test | Result | |------|--------| | UBSan: Goldilocks field (37 tests) | 0 violations | | UBSan: NTT roundtrip (N=1 to N=4096) | 0 violations | ### Historical (verified in Phase 6A, Jan 2026) | Test | Result | Detail | |------|--------|--------| | FFI Torture (1M iterations) | PASS | 3M+ calls, 0 errors | | Deep Fuzzing | PASS | 120/120 pathological vectors, bit-exact | | ABI Layout (C/Rust) | PASS | All offsets identical | | Panic Safety | PASS | Triple protection enabled | | FFI Overhead | PASS | 0.03% (< 5% threshold) | ### Platform Notes AVX2 tests (300+ tests) and Phase 3 QA suite require x86-64 and run in CI (GitHub Actions). Not executable on ARM64/Apple Silicon. --- ## 10. Summary | Category | Tests | Failures | |----------|-------|----------| | Goldilocks Field | 37 | 0 | | NTT Correctness | 141 | 0 | | Plonky3 Equivalence | 64 | 0 | | FRI Protocol | 10 | 0 | | Poseidon2 Hash | 10 | 0 | | E-Graph Optimizer | 4 | 0 | | Hardening | 120 | 0 | | Lean Build | 2641 | 0 | | Lean Tests | 10 | 0 | | **Total** | **~2850** | **0** |