--- name: complexity-cuts description: "Lower Big-O on existing code via a one-transformation-at-a-time playbook with verify-revert-stop. For new code use lemmaly; for math-level wins escalate to mathguard." risk: safe source: community source_repo: morsechimwai/lemmaly source_type: community date_added: "2026-05-26" author: morsechimwai tags: [algorithms, big-o, refactoring, optimization, performance, n-plus-one] tools: [claude-code, antigravity, cursor, gemini-cli, codex-cli] license: "Apache-2.0" license_source: "https://github.com/morsechimwai/lemmaly/blob/main/LICENSE" --- # complexity-cuts — Lower Big-O on Existing Code `lemmaly` prevents bad complexity before code is written. **complexity-cuts** fixes it after the fact: code already exists, it works, but its time or space complexity is worse than necessary. **Violating the letter of these rules is violating the spirit of the skill.** Adapting "just a little" is how a faster-but-wrong rewrite ships. ## When to Use This Skill Use **complexity-cuts** when refactoring existing code that has poor Big-O: - Nested loops, `O(n²)` or worse scans, repeated work, redundant allocations, blown memory. - Stated symptoms: "this is slow on large inputs", "times out", "OOM", "too much memory", "reduce complexity", "optimize this algorithm". - N+1 query patterns in ORMs (Prisma, Drizzle, SQLAlchemy, Django, ActiveRecord). - `await` inside `for` over independent items causing serial latency. For *preventing* bad complexity before code is written, use **`lemmaly`**. For math-level optimizations (Bloom, HLL, FFT, JL projection), escalate to **`mathguard`**. ## The Iron Law ```text NO TRANSFORMATION WITHOUT EXISTING TESTS GREEN BEFORE AND AFTER ``` If the code has no tests, you write a characterization test first (golden input → current output). Then transform. Then verify the test still passes. If you skip this, the optimization can silently break callers — and faster-but-wrong is worse than slow-and-right. ## Non-negotiable rules 1. **State current and target Big-O before touching code.** In one line: - Current: `time = O(?)`, `space = O(?)` - Target: `time = O(?)`, `space = O(?)` - Dominant input dimension (n = what, how large in practice) If you cannot state current Big-O, you do not yet understand the code. Read more. 2. **Identify the bottleneck, do not guess.** Point to the exact line(s) responsible for the dominant term. Nested loop? Repeated linear scan? Recomputation? Allocation inside a hot loop? The fix lives there, not elsewhere. 3. **One transformation at a time, with a verify-revert-stop loop.** The loop is: 1. Apply exactly one transformation from the playbook. 2. Run the existing test suite (or the characterization test you wrote per the Iron Law). 3. If any test breaks: **revert immediately.** Do not patch the test. Do not patch around the failure. Revert. 4. Count reverts on this piece of code. If **3 reverts in a row**, STOP optimizing. The bottleneck is wrong, the transformation is wrong, or the code has invariants you have not modeled. Escalate to `invariant-guard` and write the missing contract — do not try a fourth transformation. 5. Only after a transformation lands green: pick the next one. Stacked changes hide regressions. Patched tests hide regressions louder. 4. **Preserve semantics exactly.** Lower complexity must not change outputs, ordering guarantees, stability, or error behavior. If the optimization requires a semantic change (e.g. unordered output), call it out explicitly and confirm it is acceptable. 5. **No invented numbers.** Never write "10x faster" or "saves 200MB" without measuring. Write `` and move on, or actually measure with a representative input. 6. **Always report the measured speedup ratio after a transformation lands.** Once the new code is green, run a representative benchmark (same input, same machine, warm cache) and report `before → after` plus the ratio as `N× faster` (or `N× less memory`). One line, attached to the diff: ```text p50: 186 ms → 1.1 ms (169× faster, n=20,000, 200 samples) ``` If you cannot measure (e.g. the win is purely asymptotic on inputs you don't have), say so explicitly: `asymptotic only, no measurement — O(n²) → O(n)`. Never silently skip this step. ## The transformation playbook The vast majority of real-world Big-O wins come from a small set of moves. Try them in this order: ### Time-complexity reductions | Smell | Fix | Typical win | |---|---|---| | `for x in A: if x in B` where B is list/array | Convert B to `Set`/`Map` once | O(n·m) → O(n+m) | | Nested loop computing pairs/joins | Hash-join on the key; index by lookup field | O(n·m) → O(n+m) | | Repeated `.find` / `.indexOf` / `.includes` inside a loop | Precompute index `Map` outside loop | O(n^2) → O(n) | | Repeated recomputation of same value | Memoize / cache by input key | O(n·f(n)) → O(n + f(n)) | | Sort inside a loop | Sort once outside | O(n^2 log n) → O(n log n) | | Linear scan for min/max/median repeatedly | Heap / sorted structure | O(n·k) → O(n log k) | | Recursive recomputation (naive Fibonacci shape) | Memoize, or convert to iterative DP | exponential → O(n) | | String concatenation in a loop (some langs) | Use builder / `join` / `array.push` then join | O(n^2) → O(n) | | Repeated regex compile in loop | Compile once outside | constant-factor, large | | Counting / grouping via nested loop | Single pass with `Counter` / `Map` | O(n^2) → O(n) | | Sliding-window written as nested loop | Two-pointer / windowed sum | O(n^2) → O(n) | | Repeated prefix sums | Precompute prefix array, O(1) range queries | O(n·q) → O(n+q) | | Pairwise distance / containment checks on intervals | Sort + sweep line | O(n^2) → O(n log n) | | Top-K via full sort | Heap of size K | O(n log n) → O(n log k) | | Repeated set membership in loop body | `Set` once, reuse | O(n·m) → O(n) | | `await` inside a `for` over independent items | `Promise.all` / batched concurrency | wall-clock O(n·latency) → O(latency) | | ORM query inside a loop (N+1) | `IN (...)` / `select_related` / bulk fetch | O(n) round-trips → O(1) | ### Space-complexity reductions | Smell | Fix | Typical win | |---|---|---| | Materializing whole list/array just to iterate | Generator / iterator / stream | O(n) → O(1) | | Building intermediate arrays via chained `.map().filter().map()` on huge data | Single-pass loop or lazy pipeline | k·O(n) → O(n) (often O(1) extra) | | Caching every intermediate result of a recursion | Rolling window (keep last k states) | O(n) → O(k) | | Storing parents/visited for graph traversal when only count needed | Bitset / counter only | O(n) → O(1) | | Copying input to mutate | In-place mutation when caller allows | O(n) → O(1) | | Reading entire file before processing | Stream line-by-line / chunked | O(file) → O(chunk) | | Deep-clone for safety in a loop | Clone once, or use structural sharing / immutables | O(n·m) → O(n+m) | | Holding references that prevent GC (closures, listeners, caches) | Bound the cache (LRU), remove listeners, scope closures tightly | unbounded → bounded | | Loading full result set from DB | Cursor / pagination / streaming query | O(rows) → O(page) | | `JSON.parse(JSON.stringify(x))` for cloning | `structuredClone` or targeted copy | O(n) work and allocation removed | ### When you cannot lower asymptotic Big-O Sometimes O(n log n) really is the floor. Then move to constant-factor wins: - Replace pointer-chasing structures with contiguous arrays (cache locality). - Hoist invariants out of loops. - Avoid allocation in the hot loop (reuse buffers). - Prefer typed arrays / native containers over boxed objects for numeric work. - Batch syscalls / I/O. State explicitly: "Asymptotic floor is O(n log n); applying constant-factor optimizations only." ## Required workflow For each piece of code you optimize: 1. **Measure or estimate current Big-O.** Write it down. 2. **Identify the bottleneck line(s).** Point at them. 3. **Pick one transformation from the playbook.** Name it. 4. **Apply it.** One change. 5. **Verify behavior.** Tests pass, or outputs match on a representative input. 6. **State new Big-O.** Time and space. 7. **Repeat if more wins exist and are worth the complexity cost.** ## Canonical example — workflow vs no-workflow The same optimization with and without the verify-revert-stop loop. **Bottleneck.** `getOrdersWithUsers()` runs 10s on 10k orders. Cause: `users.find(u => u.id === o.userId)` inside the map → O(n·m). ### Without the workflow — changes semantics AND patches the test ```ts // No workflow: change semantics + the optimization in one go export function getOrdersWithUsers(orders, users) { const userById = Object.fromEntries(users.map(u => [u.id, u])); return orders .map(o => ({ ...o, user: userById[o.userId] })) .filter(o => o.user); // silently drops orders whose user was deleted } ``` Faster, *and* changes the result set. Existing tests catch it — but the diff also "fixes" a flaky test by removing the assertion that checked the old behavior. Ships green. Breaks the billing report two weeks later. ### With the workflow — one transformation, semantics preserved ```ts // Workflow applied: // Bottleneck: orders.map → users.find (line 14) // Current: time = O(n·m), space = O(1) // Target: time = O(n+m), space = O(m) // Transformation: precompute index Map outside the loop // Semantic risk: None — orders with missing users still emit `user: undefined` exactly as before // Reverts so far: 0 export function getOrdersWithUsers(orders, users) { const userById = new Map(users.map(u => [u.id, u])); return orders.map(o => ({ ...o, user: userById.get(o.userId) })); } ``` One transformation. Existing tests stay untouched. Run them. If green, ship. If red, revert (don't patch). After 3 reverts, stop and load `invariant-guard` — the bottleneck is wrong, or the function has a contract no one wrote down. ## Output discipline When proposing or applying an optimization, your message must contain — in this order: 1. **Bottleneck** — file:line and one-sentence reason. 2. **Current complexity** — `time = O(?)`, `space = O(?)`. 3. **Transformation** — name from the playbook (or describe it if novel). 4. **New complexity** — `time = O(?)`, `space = O(?)`. 5. **Semantic risk** — anything callers might notice (ordering, stability, error timing). "None" is a valid answer if true. 6. **Measured speedup** — `before → after` with the ratio as `N× faster` (or `asymptotic only` if not measured). One line, honest numbers. 7. **The diff.** If any of 1–6 is missing, the optimization is not ready to apply. ## Stop conditions — do not optimize further when - Asymptotic Big-O already matches a known lower bound for the problem. - The input is provably small and bounded (n < ~100 and not on a hot path). - The optimization would obscure correctness or harm readability without a measured win. - The bottleneck is I/O or external service latency, not CPU/memory — go fix that instead. Premature optimization past these points adds risk without payoff. ## Rationalizations to watch for | Excuse | Reality | | --- | --- | | "I already solved this in my head — just paste the diff and add labels after." | Retrofitted labels lie about the reasoning order. Write bottleneck → complexity → transformation → diff in that order, or you are writing fiction. | | "Stating the current Big-O is busywork — everyone can see the nested loop." | If everyone can see it, writing one line costs nothing. If only you can see it, you just saved the reviewer's time. | | "Semantic risk is None, skip that step." | "None" is a valid answer — but write it. The next reader does not know which guarantees you considered. | | "I'll do all three transformations in one diff." | Stacked transformations hide regressions. One transformation, verify, repeat. | | "It's just a small refactor, the workflow is overkill." | Then it takes 30 seconds. The cases where you skip the workflow are the ones where you miss the optimization next to the obvious one. | | "I'll measure later." | Later is `` forever. Either measure now or accept the asymptotic argument as the only claim. | ## Red flags — STOP - Optimizing without stating current Big-O. - "This should be faster" without identifying a specific bottleneck line. - Stacking multiple transformations before verifying any one of them. - Claiming a speedup without measuring or without an asymptotic argument. - Lowering complexity by silently changing output semantics. - Rewriting code that runs once at startup with n = 12. ## Verification checklist Before claiming an optimization is complete: - [ ] Existing tests (or a written characterization test) were green BEFORE the transformation. - [ ] Exactly one transformation was applied. - [ ] Tests are green AFTER the transformation. - [ ] No test was modified, weakened, or skipped to make it pass. - [ ] Current Big-O and target Big-O are stated in the diff or PR description. - [ ] Semantic risk is written down ("None" is valid if true). - [ ] Measured speedup ratio is reported as `before → after · N× faster` (or explicitly marked `asymptotic only` if no measurement was possible). - [ ] If a measured claim was made (e.g. "3x faster"), the measurement command is included. - [ ] Revert count on this code is < 3. Cannot check every box? The optimization is not done. Either revert or finish the gap — do not ship a half-verified speedup. ## Limitations - **Requires existing tests or a written characterization test.** Without one, you cannot detect silent semantic regressions; the Iron Law refuses to skip this. - **Asymptotic wins only; constant-factor work is a separate mode** (clearly labeled). The playbook will not improve cache locality or SIMD utilization on its own. - **Single-process scope.** Distributed-system bottlenecks (consensus latency, replication lag, queue backpressure) are out of scope. - **3-revert rule is firm.** If three transformations failed, the skill explicitly forces escalation to `invariant-guard`; it does not let you try a fourth. - **Measurement is on the author.** complexity-cuts requires the ratio to be reported but does not run the benchmark for you — you must produce a representative input. - **Won't help I/O-bound code.** If the dominant term is network latency or disk, the playbook will not move the needle — fix the I/O pattern instead. ## The thesis, in one line > **Existing code earned its slowness one shortcut at a time. complexity-cuts removes them one transformation at a time — and refuses to ship the optimization without a green test.** ## Related Skills - `lemmaly` — prevention gateway; use when writing new code instead of refactoring existing. - `invariant-guard` — escalation target when 3+ transformations have failed tests — the missing piece is a contract, not an optimization. - `mathguard` — escalation when the classical floor is reached and an approximate or math-heavy structure could win.