# ChonkyBFT

This folder contains the specification of the ChonkyBFT, a new consensus protocol created by Matter Labs. It has both the pseudo-code specification that was used as the basis for the Rust implementation in the rest of this repo and the Quint specification that was used to formally verify the protocol.
Chonky BFT is a consensus protocol inspired by [FaB Paxos](https://www.cs.cornell.edu/lorenzo/papers/Martin06Fast.pdf), [Fast-HotStuff](https://arxiv.org/abs/2010.11454) and [HotStuff-2](https://eprint.iacr.org/2023/397).
It is committee-based and has only one round of voting, single slot finality, quadratic communication and _n=5f+1_ fault tolerance. Let's discuss what were our objectives when designing ChonkyBFT.

## Design goals in practice vs. theory

In recent years, research on consensus algorithms has often focused on theoretical optimizations that may not align with the practical challenges of implementing these algorithms. This has led to researchers prioritizing metrics that are less impactful in real-world deployments. For instance, tables comparing algorithms frequently highlight metrics that, while theoretically appealing, offer limited practical significance. Below, we outline commonly emphasized metrics that may not adequately reflect real-world performance considerations.

- Authenticator complexity. Optimizing to have fewer signatures was a  priority decades ago when crypto operations were expensive. Today, digital signatures are fast and small. However, many papers still report this measure and even go as far as suggesting threshold signatures over multisignatures, which introduces a much more complex step of distributed key generation instead of spending some more milliseconds on verifying the signatures.
- Message complexity. In theory, reducing the number of messages exchanged across the network should improve the algorithm's performance. However, in practice, the actual performance gain depends on where the system bottleneck lies. For instance, even with linear communication, if the leader must still send and receive _N_ messages, the improvement may be negligible. Moreover, this approach often oversimplifies the cost of messages, treating all messages equally. In reality, a block proposal can be several megabytes in size, whereas a block commit may only be a few kilobytes.
- Block latency. Block times of, for example, 0.1 seconds are not necessarily a reliable performance indicator if finality requires waiting for 100 blocks. The key metric that matters is the time it takes for a user to see their transaction finalized. Some algorithms claim to achieve only one round of voting; however, they introduce an additional round within the block broadcast mechanism. Or pipeline the voting rounds, thus requiring several blocks to finalize. These approaches can result in higher user-perceived latency, despite achieving shorter block times.

While the above metrics are often highlighted, other aspects of consensus protocol design have a far greater impact in practical settings. Below, we discuss the metrics that are most relevant for real-world consensus algorithm performance.

- Systemic complexity. This aligns with the discussion on [systemic vs. encapsulated complexity](https://vitalik.eth.limo/general/2022/02/28/complexity.html). Consensus algorithms do not operate in isolation; they are designed to support a wide range of applications. A clear example of this issue is the distinction between probabilistic and provable finality. Algorithms that provide probabilistic finality introduce additional complexity for applications, as exchanges, multi-chain dApps, hybrid dApps, block explorers, wallets, and similar systems must determine an appropriate number of confirmations for each chain. In contrast, algorithms with provable finality deliver a clear and deterministic signal that applications can rely on. This distinction is significant enough that even Ethereum is planning to adopt [single-slot finality](https://ethereum.org/en/roadmap/single-slot-finality/#why-aim-for-quicker-finality).
- Simplicity. When designing and implementing an algorithm, it is essential to consider the trade-off between complexity and practicality. While saving one round-trip in an optimistic scenario may seem advantageous, it may not be worthwhile if the algorithm is too complex to formalize and model effectively. Additionally, a complex implementation that requires significant resources—such as multiple engineers and extensive audits—can introduce further challenges. Simple algorithms that are straightforward to implement and can be formally proven offer greater security and reliability. A bug that results in downtime, or worse, safety violations, has a far more detrimental impact on user experience than marginally slower block times.
- Transaction latency. As previously discussed, the only latency that truly matters is the one experienced by the user.

## Lessons learned

For our specific use case, we have gained several insights from researching and implementing previous consensus algorithms:

1. Chained consensus is not beneficial for our use case. It does not improve throughput or latency but adds unnecessary systemic complexity. Instead, we finalize every block.
2. Reducing fault tolerance can minimize the number of voting rounds, as demonstrated by FaB Paxos. By decreasing our fault tolerance requirement from _3f+1_ to _5f+1_, we are able to finalize consensus in a [single voting round](https://decentralizedthoughts.github.io/2021-03-03-2-round-bft-smr-with-n-equals-4-f-equals-1/).
3. Linear communication is not always advantageous. Quadratic communication among replicas simplifies security, as there are fewer cases where the impact of a malicious leader needs to be considered. It also simplifies implementation by allowing a clear separation of the leader component, as well as view changes, where constant timeouts suffice. For example, [Jolteon/Ditto](https://arxiv.org/abs/2106.10362) adopted this approach after encountering challenges while implementing HotStuff. Additionally, the performance trade-off is likely minimal, as demonstrated by [ParBFT](https://eprint.iacr.org/2023/679.pdf).
4. Re-proposals can be used to ensure that no "rogue" blocks exist—a challenge that has received little attention so far (to the best of our knowledge) and is particularly relevant to public blockchains. In committee-based consensus algorithms, it is possible for a commit quorum certificate (to use HotStuff's terminology) to be formed without being received by enough replicas. This can result in a timeout and the proposal of a new block. Most algorithms address this by declaring the old block invalid. While all honest replicas will agree on which block is canonical, a participant who only sees the single block without knowledge of the timeout may mistakenly believe that it was finalized. This undermines the desirable property of verifying a block's inclusion in the chain by inspecting only the block itself, without needing the entire chain. To address this, we require that block proposals following a timeout (where a commit quorum certificate might have been formed) re-propose the previous block. This approach guarantees that any block with a valid commit quorum certificate is part of the chain—it may not have been finalized in the current view, but it was certainly finalized
5. Always justify messages to remove time dependencies, a lesson we learned from Fast-HotStuff. Messages should carry sufficient information to allow any replica to independently verify their validity without relying on additional data (except for previous blocks, which are external to the consensus algorithm). Failing to do so can introduce subtle timing dependencies. For example, Tendermint contained a bug—discovered years later—where the leader had to wait for the maximum network delay at the end of each round to avoid a deadlock. If this wait did not occur, the system could lock. Interestingly, HotStuff-2 reintroduces this timing dependency to eliminate one round-trip, which significantly increases the complexity of modeling and implementing the system.
6. Make garbage collection and reconfiguration integral parts of the algorithm. These components will inevitably need to be implemented, and failing to specify and model them upfront may result in awkward and inefficient implementations later.

FaB Paxos satisfies the first 4 points, and Fast-HotStuff satisfies the 5th. ChonkyBFT is basically FaB Paxos with some ideas from Fast-HotStuff/HotStuff-2.