---
name: computer-science-algorithms
description: Choosing or implementing an algorithm or data structure — asymptotic complexity, data-structure selection, sorting & searching, dynamic programming, graph algorithms, divide & conquer, greedy algorithms, string/sequence algorithms, and the at-scale toolbox (Bloom filters, HyperLogLog, Count-Min Sketch, reservoir sampling, consistent hashing, external merge sort, Aho-Corasick, MinHash/LSH). Trigger on tasks involving "what's the right algorithm for…", performance-critical code, code with nested loops over the same input, recursive solutions, shortest-path / scheduling / matching / DP problems, code review for accidental O(n²) blowup, and any "how do I do X at scale / on a stream / without enough RAM" question — even if the user doesn't explicitly mention "algorithm" or "complexity."
---
# Community Classical Computer Science Algorithms Best Practices

A practitioner-oriented reference for choosing and implementing classical algorithms and data structures correctly. Organized by execution-lifecycle impact: the earliest decisions (asymptotic class, data-structure choice) cascade through everything else, so the rules near the top of the table matter most.

**Scope:** the patterns that show up in everyday production code review, reasonable interview / contest problems, **and the at-scale toolbox** (sketches, streaming, distributed primitives) — not an exhaustive cover of CLRS. Topics intentionally outside the current version: network flow, modular arithmetic, Bellman-Ford and Floyd-Warshall as standalone rules, SCC (Tarjan/Kosaraju), computational geometry, FFT, Manacher / Z-function as standalone rules. They're flagged inline in the relevant rules.

Distilled from CLRS (*Introduction to Algorithms*, 4th ed.), Sedgewick & Wayne (*Algorithms*, 4th ed., Princeton), Skiena's *Algorithm Design Manual*, Laaksonen's *Competitive Programmer's Handbook*, [cp-algorithms.com](https://cp-algorithms.com/), and the [USACO Guide](https://usaco.guide).

## When to Apply

Use these rules when:

- Choosing an algorithm or data structure for a new problem ("what's the right way to do X?")
- Reviewing code for hidden O(n²) blowup — repeated `in`-checks on lists, `pop(0)` on lists, string concatenation in loops, naive substring search
- Picking a DP state or recurrence, before writing the memoization
- Modeling a problem as a graph (BFS vs Dijkstra vs topological sort)
- Refactoring brute force / naive solutions that work on toy inputs but time out at scale
- Deciding whether greedy applies, or whether DP / branch-and-bound is required

## Rule Categories By Priority

| # | Category | Prefix | Impact | Why it cascades |
|---|----------|--------|--------|-----------------|
| 1 | Asymptotic Complexity & Algorithm Selection | `comp-` | CRITICAL | Wrong O() class makes every other optimization irrelevant |
| 2 | Data Structure Selection | `ds-` | CRITICAL | The container determines which operations are cheap |
| 3 | Sorting & Searching | `srch-` | HIGH | Foundation for greedy, two-pointer, sweep-line, binary-search-on-the-answer |
| 4 | Dynamic Programming | `dp-` | HIGH | Exponential → polynomial transformations |
| 5 | Graph Algorithms | `graph-` | HIGH | Networks, dependencies, routing, scheduling all reduce to graphs |
| 6 | Divide & Conquer / Recursion | `divide-` | MEDIUM-HIGH | Logarithmic-factor speedups; stack-depth and recurrence traps |
| 7 | Greedy Algorithms | `greedy-` | MEDIUM | Fast when correct, silently wrong when not |
| 8 | String & Sequence Algorithms | `str-` | MEDIUM | Pattern matching, parsing, substring queries |
| 9 | Scale & Probabilistic Algorithms | `scale-` | MEDIUM | Sketches, streaming, distributed primitives — situational, decisive when they apply |

## Quick Reference

### 1. Asymptotic Complexity & Algorithm Selection (CRITICAL)

- [`comp-pick-algorithm-class-from-input-bound`](references/comp-pick-algorithm-class-from-input-bound.md) — Match O() to n before writing code
- [`comp-amortize-instead-of-worst-casing`](references/comp-amortize-instead-of-worst-casing.md) — Total cost, not per-op worst case
- [`comp-watch-for-quadratic-blowup-from-membership-in-list`](references/comp-watch-for-quadratic-blowup-from-membership-in-list.md) — Linear `in` checks in loops are O(n²)
- [`comp-prefer-iterative-builders-over-string-concatenation`](references/comp-prefer-iterative-builders-over-string-concatenation.md) — Join / buffers, not `+=`
- [`comp-derive-recurrences-via-master-theorem`](references/comp-derive-recurrences-via-master-theorem.md) — Write the recurrence before coding recursion
- [`comp-treat-space-complexity-as-first-class`](references/comp-treat-space-complexity-as-first-class.md) — Memory kills services before time does

### 2. Data Structure Selection (CRITICAL)

- [`ds-hash-map-for-keyed-lookup`](references/ds-hash-map-for-keyed-lookup.md) — Build the index once, then O(1) lookups
- [`ds-set-for-uniqueness-and-membership`](references/ds-set-for-uniqueness-and-membership.md) — Dedup and "have I seen this?" in O(1)
- [`ds-heap-for-top-k-and-priority-queues`](references/ds-heap-for-top-k-and-priority-queues.md) — O(n log k), priority-queue idioms
- [`ds-deque-for-both-end-operations`](references/ds-deque-for-both-end-operations.md) — O(1) pop-front for BFS queues and sliding windows
- [`ds-balanced-bst-or-sorted-container-for-range-queries`](references/ds-balanced-bst-or-sorted-container-for-range-queries.md) — Predecessor / successor / range scan
- [`ds-union-find-for-dynamic-connectivity`](references/ds-union-find-for-dynamic-connectivity.md) — Near-O(1) grouping and merging
- [`ds-prefix-sums-for-repeated-range-sums`](references/ds-prefix-sums-for-repeated-range-sums.md) — O(1) range sum after O(n) preprocessing
- [`ds-fenwick-or-segment-tree-for-mutable-range-queries`](references/ds-fenwick-or-segment-tree-for-mutable-range-queries.md) — O(log n) updates + queries

### 3. Sorting & Searching (HIGH)

- [`srch-use-builtin-sort-not-hand-rolled`](references/srch-use-builtin-sort-not-hand-rolled.md) — Timsort / introsort beat any hand-roll
- [`srch-binary-search-on-sorted-data`](references/srch-binary-search-on-sorted-data.md) — `bisect`, plus binary search on the answer
- [`srch-quickselect-for-k-th-element`](references/srch-quickselect-for-k-th-element.md) — O(n) average for k-th / median
- [`srch-counting-and-radix-sort-for-bounded-integer-keys`](references/srch-counting-and-radix-sort-for-bounded-integer-keys.md) — Beat O(n log n) for integer keys
- [`srch-two-pointers-on-sorted-data`](references/srch-two-pointers-on-sorted-data.md) — O(n) on sorted arrays, sliding window

### 4. Dynamic Programming (HIGH)

- [`dp-memoize-overlapping-subproblems`](references/dp-memoize-overlapping-subproblems.md) — `@cache` collapses exponentials
- [`dp-tabulate-when-recursion-depth-or-order-matters`](references/dp-tabulate-when-recursion-depth-or-order-matters.md) — Bottom-up + rolling arrays
- [`dp-define-state-precisely`](references/dp-define-state-precisely.md) — Underspecified state = silent wrong answers
- [`dp-knapsack-pattern`](references/dp-knapsack-pattern.md) — 0/1 vs unbounded; loop direction is correctness
- [`dp-bitmask-for-small-set-states`](references/dp-bitmask-for-small-set-states.md) — n! → 2ⁿ · poly for n ≤ ~20
- [`dp-prove-optimal-substructure-before-coding`](references/dp-prove-optimal-substructure-before-coding.md) — DP requires substructure; verify before coding

### 5. Graph Algorithms (HIGH)

- [`graph-bfs-for-unweighted-shortest-path`](references/graph-bfs-for-unweighted-shortest-path.md) — O(V+E), no heap needed
- [`graph-dijkstra-for-non-negative-weights`](references/graph-dijkstra-for-non-negative-weights.md) — Lazy-deletion heap variant
- [`graph-topological-sort-for-dependency-order`](references/graph-topological-sort-for-dependency-order.md) — Kahn's algorithm + DAG DP
- [`graph-represent-as-adjacency-list-not-matrix`](references/graph-represent-as-adjacency-list-not-matrix.md) — Sparse graphs need lists
- [`graph-detect-cycles-during-dfs`](references/graph-detect-cycles-during-dfs.md) — Three-colour scheme for directed graphs
- [`graph-kruskal-or-prim-for-mst`](references/graph-kruskal-or-prim-for-mst.md) — MST with Union-Find or heap

### 6. Divide & Conquer / Recursion (MEDIUM-HIGH)

- [`divide-merge-sort-pattern-for-counting-inversions`](references/divide-merge-sort-pattern-for-counting-inversions.md) — Piggy-back counting onto the merge step
- [`divide-watch-recursion-depth-and-stack`](references/divide-watch-recursion-depth-and-stack.md) — Iterate, or raise the stack
- [`divide-meet-in-the-middle-for-subset-problems`](references/divide-meet-in-the-middle-for-subset-problems.md) — 2ⁿ → 2^(n/2)
- [`divide-quickselect-vs-quicksort-partitioning`](references/divide-quickselect-vs-quicksort-partitioning.md) — Random pivots; 3-way Dutch flag

### 7. Greedy Algorithms (MEDIUM)

- [`greedy-prove-exchange-argument-before-using`](references/greedy-prove-exchange-argument-before-using.md) — Greedy needs a correctness proof
- [`greedy-sort-by-the-right-key-for-scheduling`](references/greedy-sort-by-the-right-key-for-scheduling.md) — Finish time, deadline, value/weight
- [`greedy-interval-merge-and-sweep-line`](references/greedy-interval-merge-and-sweep-line.md) — Events + sort + linear sweep
- [`greedy-huffman-and-priority-queue-greedies`](references/greedy-huffman-and-priority-queue-greedies.md) — Heap-based "pick smallest repeatedly"

### 8. String & Sequence Algorithms (MEDIUM)

- [`str-kmp-or-builtin-find-not-naive-search`](references/str-kmp-or-builtin-find-not-naive-search.md) — Linear worst-case substring search
- [`str-trie-for-prefix-queries`](references/str-trie-for-prefix-queries.md) — Autocomplete in O(|query|)
- [`str-rolling-hash-for-multiple-substring-comparisons`](references/str-rolling-hash-for-multiple-substring-comparisons.md) — Two independent hashes, please
- [`str-suffix-array-or-automaton-for-substring-queries`](references/str-suffix-array-or-automaton-for-substring-queries.md) — Heavy-duty substring tooling

### 9. Scale & Probabilistic Algorithms (MEDIUM)

The "unusual but valuable at scale" toolbox — sketches that trade tiny accuracy loss for orders-of-magnitude memory wins, streaming primitives for inputs that don't fit in RAM, and distributed structures that survive sharding changes.

- [`scale-bloom-filter-for-probabilistic-membership`](references/scale-bloom-filter-for-probabilistic-membership.md) — 1 bit/element vs 8 bytes, 1% false-positive rate
- [`scale-hyperloglog-for-cardinality-estimation`](references/scale-hyperloglog-for-cardinality-estimation.md) — Count distinct over billions in ~12 KB
- [`scale-count-min-sketch-for-frequency-estimation`](references/scale-count-min-sketch-for-frequency-estimation.md) — Heavy hitters and frequency queries in fixed memory
- [`scale-reservoir-sampling-for-streams`](references/scale-reservoir-sampling-for-streams.md) — Uniform k-sample from a stream of unknown length
- [`scale-consistent-hashing-for-distributed-sharding`](references/scale-consistent-hashing-for-distributed-sharding.md) — Remap k/n keys (not all keys) on node changes
- [`scale-external-merge-sort-for-out-of-memory-data`](references/scale-external-merge-sort-for-out-of-memory-data.md) — Sort 1 TB on 8 GB of RAM
- [`scale-aho-corasick-for-multi-pattern-search`](references/scale-aho-corasick-for-multi-pattern-search.md) — Find all of m patterns in one pass over text
- [`scale-minhash-lsh-for-near-duplicate-detection`](references/scale-minhash-lsh-for-near-duplicate-detection.md) — Near-duplicate pairs in O(n), not O(n²)

## How to Use

Start with the category that matches the question:

- **"What's the right algorithm for n = 10⁶?"** → `comp-` (input-bound)
- **"I'm looking things up in a list inside a loop"** → `ds-hash-map-for-keyed-lookup` or `comp-watch-for-quadratic-blowup-from-membership-in-list`
- **"My recursion is slow"** → `dp-memoize-overlapping-subproblems` and `comp-derive-recurrences-via-master-theorem`
- **"Shortest path / connectivity / ordering tasks"** → `graph-`
- **"Choose items to maximize value"** → start with `greedy-prove-exchange-argument-before-using`; fall back to `dp-knapsack-pattern`
- **"Find / match strings"** → `str-`
- **"Memory is the constraint, not time"** / **"Sample / count / deduplicate at scale"** / **"Sharding"** → `scale-`

Code examples are in Python (most readable across audiences). The reasoning generalizes to any language — equivalent stdlib primitives are listed where they differ.

## Reference Files

| File | Description |
|------|-------------|
| [references/_sections.md](references/_sections.md) | Category definitions and ordering |
| [assets/templates/_template.md](assets/templates/_template.md) | Template for new rules |
| [metadata.json](metadata.json) | Version and reference information |
| [AGENTS.md](AGENTS.md) | Auto-built TOC navigation |

## Related Skills

- `complexity-optimizer` — Static analysis that finds the patterns these rules diagnose