# Tutorial: Round-Based Algorithms in Leslie ## What this document covers This tutorial explains how to model and verify distributed algorithms using Leslie's round-based framework. It covers: 1. The Heard-Of model and communication closure 2. How to specify a round-based algorithm 3. How to state and prove safety properties 4. The cutoff theorem for automatic verification 5. A worked example: OneThirdRule consensus No Lean syntax is required to follow the tutorial — we focus on the concepts, proof strategies, and intuitions. The Lean formalization is in the `Leslie/Round.lean`, `Leslie/Cutoff.lean`, and `Leslie/Examples/` files. --- ## 1. The Heard-Of Model ### The problem with asynchronous reasoning Distributed protocols are hard to verify because of asynchrony: messages can be delayed, reordered, or lost. In a fully asynchronous model, a process might receive a message from 100 rounds ago mixed with fresh messages. Reasoning about what a process "knows" requires tracking the entire message history — an unbounded state space. ### Communication-closed rounds The **Heard-Of (HO) model** (Charron-Bost & Schiper, 2009) simplifies this by structuring computation into **rounds**. Each round has three phases: 1. **Send**: every process broadcasts a message (determined by its current state). 2. **Receive**: every process receives a subset of the messages sent in this round. 3. **Update**: every process updates its state based on the received messages. The key property is **communication closure**: messages from round r are either delivered in round r or lost forever. A process never receives stale messages from earlier rounds. ### Heard-Of sets For each round r and process p, the **Heard-Of set** HO(p, r) is the set of processes whose round-r messages were delivered to p. Different processes may have different HO sets (modeling unreliable communication), but the communication closure guarantee holds for all. ### Communication predicates The fault model is captured by a **communication predicate** — a constraint on which HO sets are valid. Examples: - **Reliable**: HO(p, r) = all processes, for every p and r. (All messages delivered — synchronous, fault-free.) - **Two-thirds**: |HO(p, r)| > 2n/3 for every p and r. (Each process receives "enough" messages, but possibly different subsets. Models partial synchrony with f < n/3 faults.) - **Lossy**: No constraint. (Any HO set is valid — models fully unreliable communication.) ### Why communication closure helps Communication closure gives us a crucial property: > **Received messages reflect the current state.** > > If process p receives value v from process q in round r, then q > held value v at the start of round r. This means received vote counts are bounded by actual current holder counts. If 10 processes hold value 0 at the start of the round, then no process can receive more than 10 votes for value 0 — regardless of its HO set. There are no stale messages inflating the count. Without communication closure (e.g., in asynchronous protocols), a process might receive old messages carrying value 0 from processes that have since switched to value 1. The received count for value 0 would overestimate the current holder count, breaking the reasoning. --- ## 2. Specifying a Round-Based Algorithm ### The RoundAlg structure A round-based algorithm is specified by three components: - **init(p, s)**: the initial state predicate for process p. - **send(p, s)**: the message that process p broadcasts given its current state s. (In the basic HO model, every process sends the same message to everyone.) - **update(p, s, msgs)**: how process p updates its state given its current state s and the received messages msgs. The messages are a function `msgs : Process → Option Message` — for each potential sender q, either `some m` (the message from q was delivered) or `none` (not delivered, i.e., q ∉ HO(p)). ### The global state The global state tracks: - **round**: the current round number (0 = initial, before any round). - **locals**: a function mapping each process to its local state. ### The transition relation One round step from state s to state s': - s'.round = s.round + 1 - For each process p: s'.locals(p) = update(p, s.locals(p), delivered(s, ho, p)) where `delivered(s, ho, p)(q) = if ho(p, q) then some(send(q, s.locals(q))) else none`. The HO collection `ho` is existentially quantified — the environment chooses it, subject to the communication predicate. ### Example: OneThirdRule The OneThirdRule is the simplest consensus algorithm in the HO model: - **State**: each process holds a natural number value and optionally a decision. - **Send**: broadcast your current value. - **Update**: collect received values. If more than 2n/3 carry the same value v, adopt v and decide v. Otherwise, keep your current value. - **Communication predicate**: every process hears from > 2n/3 senders. --- ## 3. Proving Safety Properties ### The round invariant rule The main proof rule for round-based algorithms is the **round invariant rule**: to prove that a property holds at all times, show: 1. **Init**: the property holds in the initial state. 2. **Step**: if the property holds before a round, it holds after the round — for ANY valid HO collection. This reduces temporal reasoning ("the property holds forever") to single-round reasoning ("the property is preserved by one step"). ### Example: OneThirdRule agreement **Property**: if process p decided v and process q decided w, then v = w. **Proof strategy**: we don't prove agreement directly. Instead, we prove a stronger **lock invariant** and derive agreement from it. #### The lock invariant > If any process has decided value v, then more than 2n/3 of all n > processes currently hold value v. This says: a decision "locks in" a super-majority. The lock invariant implies agreement by **pigeonhole**: if two values v ≠ w are both decided, then each needs > 2n/3 holders. But the holder sets are disjoint (a process holds exactly one value), and two disjoint sets of size > 2n/3 would need > 4n/3 > n elements total. Contradiction. #### Proving the lock is inductive The lock invariant holds initially (no decisions → vacuously true). The inductive step has three parts: **(a) Establishing the lock in the pre-state.** If some process p has `decided = some v` in the post-state, either p already had the decision (use the inductive hypothesis) or p newly decided in this round. If p newly decided v, then `findSuperMajority` returned v for p's received messages, meaning > 2n/3 of p's received messages carried v. Each such message came from a distinct sender who held v. So > 2n/3 processes held v at the start of the round. Here, **communication closure** is essential: we conclude that the senders held v at the START of this round (not at some earlier time), because messages from past rounds don't exist. **(b) The lock persists: no competing value can emerge.** We show that every process q that held v before the round still holds v after. The argument: `findSuperMajority` on q's received messages can't return w ≠ v, because: - Votes for w in q's received messages ≤ total w-holders (by communication closure: received votes reflect current state). - Total w-holders ≤ n - (> 2n/3) < n/3 (since v has > 2n/3 holders and v-holders and w-holders are disjoint). - So votes for w < n/3 < 2n/3 — NOT a super-majority. Therefore `findSuperMajority` returns either `some v` (q adopts v) or `none` (q keeps its current value v). Either way, q still holds v. Again, **communication closure** is the key: the bound "votes ≤ holders" holds because there are no stale messages from past rounds. **(c) Counting.** Since every v-holder in the pre-state is still a v-holder in the post-state, the count of v-holders doesn't decrease: countHolding(v, s') ≥ countHolding(v, s) > 2n/3. ✓ #### Validity: decided values are initial values A separate property: any decided value was the initial value of some process. The proof: in each round, the update function either keeps the current value or adopts a value from received messages (which came from senders' current values). So values only propagate — no new values are created. By induction on time, every value traces back to an initial value. --- ## 4. The Cutoff Theorem ### Motivation The lock invariant proof above works for **all n** simultaneously. But for more complex protocols, the inductive step might be too tedious to prove by hand. Can we instead verify safety for a few small values of n and conclude it holds for all n? ### When cutoffs work: three conditions A cutoff reduction requires three conditions on the protocol: **1. Process symmetry.** The algorithm treats all processes identically. The send and update functions don't depend on the process ID. This means the global state can be abstracted to a **configuration**: a count of how many processes hold each value, rather than tracking which specific process holds what. **2. Threshold-based decisions.** The update function depends on received messages only through **threshold comparisons**: "did value v appear more than α·n times?" It does NOT depend on exact counts or on which processes sent which messages. This means the update factors through a finite **threshold view** — a Boolean vector indicating which values crossed the threshold. **3. Communication closure.** Messages from round r are delivered in round r or lost forever. This ensures that threshold views computed from received messages are consistent with the global configuration. Without communication closure, received vote counts could be inflated by stale messages, and the threshold view would not reflect reality. ### The threshold view abstraction Under these conditions, the protocol's behavior is determined by the **threshold view** — which values have counts exceeding α·n. For binary values (k=2), there are only 3 realizable threshold views: - **(T, F)**: value 0 has super-majority, value 1 doesn't. - **(F, T)**: value 1 has super-majority, value 0 doesn't. - **(F, F)**: neither has super-majority. - **(T, T)**: impossible by pigeonhole (would need > 2α·n > n total). The update function maps each (current_value, threshold_view) pair to a new value. This is a finite table — independent of n. ### The cutoff bound The **cutoff bound** K depends on the number of distinct values k and the threshold fraction α = α_num/α_den: K = ⌈k · α_den / (α_den - α_num)⌉ + 1 For OneThirdRule (k=2, α=2/3): K = 7. ### The theorem > If a threshold-determined invariant holds for all configurations > at all n ≤ K, then it holds for all configurations at all n. **Proof sketch**: For any configuration c at any n, its threshold view is some pattern tv. By the **scaling lemma**, there exists a configuration c' at some n' ≤ K with the same threshold view. Since the invariant is threshold-determined (depends only on the threshold view), the invariant agrees on c and c'. Since n' ≤ K, the small-n verification covers c'. Therefore the invariant holds for c. The scaling lemma is where **communication closure** is used: the threshold view captures all relevant information about the round transition because received messages reflect the current configuration (no stale messages). Without this, the transition would depend on message history, the threshold view wouldn't suffice, and the scaling argument would fail. ### Unreliable communication The cutoff works even under unreliable communication (different processes see different HO sets), as long as the communication predicate constrains the HO sets enough. The key property: > No process can see a competing value above threshold. Under the 2/3-communication predicate, if value v has > 2n/3 holders, then the non-v holders number < n/3. Since votes ≤ holders (communication closure!), no process can receive > 2n/3 votes for any non-v value, regardless of its HO set. This means the nondeterminism from HO sets affects WHICH processes adopt the dominant value, but NOT whether the invariant is preserved. ### Using the cutoff for verification The cutoff theorem reduces infinite verification to finite checking: 1. **Define** the protocol as a symmetric threshold algorithm. 2. **Define** the invariant as a configuration-level property. 3. **Verify** the invariant for all configurations at n ≤ K. (This is a finite check — can be done by `native_decide` in Lean, or by an external model checker like TLC.) 4. **Apply** the cutoff theorem to get the result for all n. Step 3 is the "model checking" step. For OneThirdRule with k=2 and K=7, there are about 330 configurations to check. For larger protocols, the cutoff may be larger but still finite. --- ## 5. Worked Example: OneThirdRule End-to-End ### Step 1: Specify the algorithm ``` Algorithm OneThirdRule: State: (val : Nat, decided : Option Nat) Send: broadcast val Update: let received = [messages from HO set] if some value v appears > 2n/3 times in received: val := v; decided := some v else: keep current state CommPred: every process hears from > 2n/3 senders ``` ### Step 2: Identify the invariant The **lock invariant**: ∀ v, (∃ p, p.decided = some v) → countHolding(v) > 2n/3 "If anyone decided v, then more than 2n/3 of all processes hold v." ### Step 3: Prove the invariant is inductive **Init**: No decisions → antecedent is false → vacuously true. ✓ **Step**: Given the lock holds in state s, show it holds in s'. The proof decomposes into: - **Establishing the lock in s**: trace a new decision back to > 2n/3 holders via the counting bound (votes ≤ holders, by communication closure). - **Lock persists**: no competing value can reach a super-majority (pigeonhole: w-holders < n/3 < 2n/3, so w-votes < 2n/3 in any HO set). - **Count doesn't decrease**: v-holders keep v (can't be lured to a competing value), so the count is monotone. ### Step 4: Derive agreement From the lock invariant, agreement follows by pigeonhole: - If v decided and w decided, both have > 2n/3 holders. - Disjoint sets of size > 2n/3 would need > 4n/3 > n elements. - Contradiction. So v = w. ✓ ### Step 5 (optional): Apply the cutoff For a model-checking approach: 1. Encode the state as Fin 4 (val × decided for binary values). 2. Define the lock invariant at the configuration level. 3. Verify for n ≤ 7 by exhaustive enumeration (330 configs). 4. Apply the cutoff theorem: lock holds for all n. The cutoff argument uses the same three-condition check: - Process symmetry ✓ (update doesn't depend on process ID) - Threshold-based ✓ (only checks > 2n/3 counts) - Communication closure ✓ (by the HO model) ### Step 6: Derive validity Any decided value was an initial value. Proof by induction on time: the update function either keeps the current value or adopts a value from received messages. Received values are current sender values (communication closure!). So values only propagate, never appear from nothing. By induction, every value traces back to some process's initial value. --- ## 6. Summary: Why Round-Based Reasoning Helps ### What the round structure gives you 1. **No message buffers.** Communication closure eliminates the need to track pending messages. The state is just the local states of the processes — no message queues, no delivery ordering. 2. **Received = current.** The `countOcc_le_countHolding` bound — votes for v in received messages ≤ total current v-holders — is the workhorse of every safety proof. It holds because of communication closure. 3. **Round induction.** The round invariant rule reduces temporal reasoning to single-step reasoning. You prove one round step, and the framework gives you "always." 4. **Finite threshold views.** For symmetric threshold protocols, the behavior is determined by a finite Boolean vector (which values cross the threshold). This enables cutoff arguments: check finitely many cases, conclude for all n. ### What communication closure gives you specifically Without communication closure, NONE of the following work: - **Counting bounds** (votes ≤ current holders): stale messages could inflate vote counts beyond the current holder count. - **Pigeonhole arguments** (two super-majorities can't coexist): the pigeonhole uses the counting bound, which depends on communication closure. - **Cutoff scaling** (same threshold view at different n implies same behavior): the threshold view must reflect the current configuration, not message history. - **Lock persistence** (v-holders can't be lured away): a process holding v keeps v because no competing value can reach a super-majority in its received messages. This uses the counting bound, which uses communication closure. ### The proof architecture ``` Safety Property (e.g., agreement) │ ├── Derived from an inductive invariant (e.g., lock invariant) │ │ │ ├── Init: trivial (no decisions) │ │ │ └── Step: preserved by one round │ │ │ ├── Counting bound (votes ≤ holders) │ │ └── Uses: communication closure │ │ │ ├── Pigeonhole (no competing super-majority) │ │ └── Uses: counting bound + disjoint sets │ │ │ └── Monotonicity (dominant value count doesn't decrease) │ └── Uses: pigeonhole (no one switches away) │ └── (Optional) Cutoff: reduce to finite checking │ ├── Threshold view abstraction (finite Boolean vector) │ └── Uses: communication closure (views reflect current state) │ ├── Scaling lemma (same view realizable at small n) │ └── Model checking at n ≤ K ``` --- ## Files in the project | File | What it contains | |------|-----------------| | `Leslie/Round.lean` | RoundAlg, RoundSpec, round_invariant, utility lemmas | | `Leslie/Cutoff.lean` | Config, ThreshView, SymThreshAlg, cutoff_reliable | | `Leslie/Examples/OneThirdRule.lean` | Full direct proof: agreement + validity | | `Leslie/Examples/OneThirdRuleCutoff.lean` | Cutoff-based proof: lock invariant at config level | | `Leslie/Examples/BallotLeader.lean` | Leader election with majority pigeonhole | | `Leslie/Examples/FloodMin.lean` | Flood-min consensus with refinement | | `Leslie/Examples/VRViewChange.lean` | VR view change safety (quorum intersection) | | `docs/mc-tactic-plan.md` | Plan for model-checking tactic | | `docs/zero-one-rule.md` | Plan for zero-one reduction |