Author: Vitalik Buterin
Translation: Tia, Techub News
One of the biggest risks facing Ethereum L1 is the centralization of Proof of Stake (PoS). If PoS exhibits economies of scale, large stakers will come to dominate while small stakers exit to join large staking pools, raising the risk of events such as 51% attacks and transaction censorship. Beyond centralization, there is also the risk of value extraction: a small group of actors capturing value that rightly belongs to Ethereum's users.
Over the past year, our understanding of these risks has deepened. It is now well understood that they have two dimensions: (i) block construction and (ii) the supply of staking capital. Larger participants can run more sophisticated algorithms to build blocks ("MEV extraction") and thereby earn higher block rewards. Large participants can also handle the inconvenience of locked capital more efficiently, releasing it to third parties as liquid staking tokens (LSTs). Beyond stakers' economies of scale, Ethereum must also consider whether too much ETH is staked (or could be in the future).
The Scourge, 2023 Roadmap
This year, significant progress has been made on block construction, most notably convergence on committee-based inclusion lists as a solution. Extensive research has also been conducted on proof-of-stake economics, including ideas such as two-tiered staking models and reducing issuance to cap the percentage of ETH staked.
The Scourge: Key Objectives
Minimize the centralization risk in Ethereum’s staking layer, particularly in block construction and capital supply (also known as MEV and staking pools).
Minimize the risk of extracting excessive value from users.
Fixing the block construction pipeline
What problem are we solving?
Currently, Ethereum block construction happens mainly through MEVBoost, which implements an out-of-protocol form of proposer-builder separation (PBS). When validators propose blocks, they auction off the job of selecting the block's contents to builders. Selecting the contents that maximize revenue requires specialized algorithms to extract as much value from the chain as possible (known as "MEV extraction"). Validators are left with the relatively simple tasks of listening for bids, accepting the highest one, and attesting.
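The validator's side of this flow can be sketched as a simple highest-bid auction. This is an illustrative sketch only; the names and structures below are assumptions for clarity, not the actual mev-boost API:

```python
from dataclasses import dataclass

@dataclass
class BuilderBid:
    builder: str       # builder identifier (illustrative)
    value_wei: int     # payment offered to the proposer
    payload_hash: str  # commitment to the (still hidden) block contents

def select_bid(bids: list[BuilderBid]) -> BuilderBid:
    """The proposer's job is simple: take the highest bid. The builder's
    sealed payload is only revealed after the proposer commits to it."""
    if not bids:
        raise ValueError("no builder bids received for this slot")
    return max(bids, key=lambda b: b.value_wei)

bids = [
    BuilderBid("builder-a", 31_000_000_000_000_000, "0xaaaa"),
    BuilderBid("builder-b", 54_000_000_000_000_000, "0xbbbb"),
]
best = select_bid(bids)
print(best.builder)  # builder-b wins the auction
```

The point of the sketch is the asymmetry: the builder's side (assembling the payload that backs each bid) is where all the algorithmic complexity lives.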
MEVBoost Task Chart: Professional builders handle red tasks, while stakers handle blue tasks.
There are several variants of this idea, such as Proposer-Builder Separation (PBS) and Attester-Proposer Separation (APS), which differ only in how responsibilities are divided. In PBS, validators still propose blocks but receive the payload from builders, whereas in APS, builders are responsible for the entire slot. Recently, APS has gained favor over PBS because it further reduces the incentive for proposers and builders to collude. Note that APS covers only execution blocks; consensus blocks, which contain proof-of-stake data such as attestations, are still randomly assigned to validators.
Further dividing power helps maintain decentralization among validators, but it comes with an important cost: participants executing “specialized” tasks can easily become centralized. Here is the current state of block construction in Ethereum:
As can be seen, just two builders determine 88% of Ethereum block content. This could allow those participants to censor transactions. However, the situation may be better than it appears: censoring a transaction 100% of the time requires controlling well over 51% of block production; at 88%, a user must wait an average of about 9 slots to be included. For some use cases, waiting two or even five minutes is acceptable. But for others, such as DeFi liquidations, the ability to delay a transaction even a few blocks creates a market-manipulation risk.
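The waiting time quoted above follows from a simple model: if a fraction p of block builders censor a transaction and each slot's builder is an independent draw, each slot includes the transaction with probability 1 - p, so the expected wait is 1/(1 - p) slots:

```python
def expected_wait_slots(censoring_fraction: float) -> float:
    """Expected number of slots until a non-censoring builder is drawn,
    assuming each slot's builder is an independent random draw."""
    return 1.0 / (1.0 - censoring_fraction)

SECONDS_PER_SLOT = 12  # Ethereum's slot time

wait = expected_wait_slots(0.88)
print(round(wait, 1))                  # ~8.3 slots, i.e. roughly 9
print(round(wait * SECONDS_PER_SLOT)) # ~100 seconds
```

This also shows why censorship resistance degrades gracefully below 51%: the delay grows as 1/(1 - p), but inclusion is never fully blocked.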
The strategies block builders adopt to maximize profit can harm users. "Sandwich attacks" can inflict significant slippage losses on traders, and the transactions used to mount such attacks congest the chain, raising gas prices for everyone else.
What is the solution? How does it work?
One solution is to subdivide the block production task further: return the job of selecting transactions to the proposers (i.e., the stakers), while builders may only order the transactions and insert some of their own. This is what inclusion lists do.
At time T, a randomly selected staker creates an inclusion list of transactions that are valid against the current state. At time T+1, a block builder (possibly chosen in advance through a protocol-level auction) creates a block. The block must include every transaction on the inclusion list, but the builder chooses the ordering and can add its own transactions.
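The core validity rule can be sketched in a few lines. This is a simplification: real proposals (e.g., EIP-7547) add conditions such as only forcing inclusion when the block has gas left, which are omitted here:

```python
def block_satisfies_inclusion_list(block_txs: list[str],
                                   inclusion_list: list[str]) -> bool:
    """A block is valid only if every transaction on the inclusion list
    appears somewhere in it. Order is entirely up to the builder, and
    the builder is free to add its own transactions."""
    included = set(block_txs)
    return all(tx in included for tx in inclusion_list)

il = ["tx1", "tx2"]
# Builder reordered the list and added its own transaction: still valid.
print(block_satisfies_inclusion_list(["builder_tx", "tx2", "tx1"], il))  # True
# Builder dropped tx2: the block is invalid.
print(block_satisfies_inclusion_list(["builder_tx", "tx1"], il))         # False
```

The split of powers is visible in the check itself: the staker controls membership, the builder controls ordering.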
In the Fork-Choice-Enforced Inclusion Lists (FOCIL) proposal, each block has a committee of multiple inclusion-list creators. For a transaction to be delayed by even one block, all k inclusion-list creators (e.g., k = 16) would have to censor it. The combination of FOCIL with an auction-selected final proposer, who must include the inclusion lists but can reorder them and add new transactions, is often referred to as "FOCIL + APS."
Another approach is a Multiple Concurrent Proposers (MCP) scheme such as BRAID. BRAID avoids splitting the block proposer role into a low-economies-of-scale part and a high-economies-of-scale part; instead, it distributes block production across many participants, so that each proposer needs only a moderate level of sophistication to maximize its income. MCP works by having k proposers generate transaction lists in parallel and then using a deterministic algorithm (for example, ordering by fee from high to low) to fix the final order.
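The deterministic merge step can be sketched as follows. This is a toy model of the idea, not the actual BRAID specification; the tie-breaking rule is an assumption added to make the order fully deterministic:

```python
def mcp_merge_order(proposer_lists: list[list[tuple[str, int]]]) -> list[str]:
    """Merge k parallel proposers' transaction lists into one block:
    deduplicate transactions (keeping the highest fee seen for each),
    then order by fee descending, with the tx id breaking ties so the
    result is deterministic regardless of which proposer sent what."""
    best_fee: dict[str, int] = {}
    for txs in proposer_lists:
        for tx_id, fee in txs:
            best_fee[tx_id] = max(fee, best_fee.get(tx_id, 0))
    return [tx for tx, _ in sorted(best_fee.items(),
                                   key=lambda kv: (-kv[1], kv[0]))]

order = mcp_merge_order([
    [("a", 5), ("b", 9)],   # proposer 1's list: (tx id, fee)
    [("b", 9), ("c", 7)],   # proposer 2's list; "b" appears in both
])
print(order)  # ['b', 'c', 'a']
```

Because the ordering rule is fixed by the protocol, no single proposer controls the final sequence; each one only influences which transactions are available to be merged.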
BRAID does not, however, achieve the ideal of block proposers who can simply run default software. Two simple reasons why it falls short of this goal:
Post-facto arbitrage attacks: suppose the average time at which proposers submit is T, and the latest a transaction can be submitted and still be included is around T+1. Now suppose the ETH/USDC price on a centralized exchange rises from $2500 to $2502 between T and T+1. A proposer can wait the extra second and append an arbitrage transaction against an on-chain decentralized exchange, earning up to $2 per ETH in profit. Sophisticated proposers with strong network connectivity are far better positioned to do this.
Exclusive order flow: users have an incentive to send transactions directly to a single proposer to minimize the risk of frontrunning and other attacks. Sophisticated proposers have an advantage here: they can build infrastructure to accept transactions directly from users, and they have stronger reputations, so users sending transactions to them can trust that they will not betray or frontrun them (trusted hardware can mitigate this, but trusted hardware comes with its own trust assumptions).
Between these two extremes lies a range of intermediate designs. For example, one could auction off a role that may only append transactions to the block, with no power to reorder them.
Encrypted Mempools
Encrypted mempools are a key technology for implementing these designs, especially BRAID or versions of APS that strictly limit what is auctioned. In an encrypted mempool, users broadcast their transactions in encrypted form, together with some proof of validity; the transactions enter blocks still encrypted, so block builders do not know their contents (which are revealed later).
The main challenge in implementing an encrypted mempool is guaranteeing that transactions are actually disclosed after confirmation: a simple "commit and reveal" scheme does not work, because if disclosure is voluntary, the choice of whether to reveal becomes itself a way to influence the block after the fact, and can be exploited. The two main techniques for guaranteed disclosure are (i) threshold decryption and (ii) delay encryption, which is closely related to verifiable delay functions (VDFs).
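The threshold-decryption flow can be sketched end to end. Everything here is a toy: the XOR "cipher" stands in for real threshold encryption purely to make the flow runnable, and all names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class EncryptedTx:
    ciphertext: bytes    # transaction bytes, encrypted to the committee's key
    validity_proof: str  # stand-in for a proof that the hidden tx is well-formed

def toy_encrypt(data: bytes, key: int) -> bytes:
    # NOT secure -- a single-byte XOR standing in for threshold encryption,
    # used only so the example runs. XOR is its own inverse, so the same
    # function decrypts.
    return bytes(b ^ key for b in data)

def build_block(pool: list[EncryptedTx]) -> list[EncryptedTx]:
    """The builder orders ciphertexts without ever seeing their contents."""
    return sorted(pool, key=lambda tx: tx.ciphertext)  # blind, arbitrary order

def committee_decrypt(block: list[EncryptedTx], key: int) -> list[bytes]:
    """After the block is finalized, the committee's key shares reconstruct
    the key and every included transaction is decrypted. Disclosure is
    guaranteed by the protocol, not left to the sender's discretion."""
    return [toy_encrypt(tx.ciphertext, key) for tx in block]

KEY = 0x5A
pool = [EncryptedTx(toy_encrypt(b"swap 10 ETH", KEY), "proof1"),
        EncryptedTx(toy_encrypt(b"transfer 5 ETH", KEY), "proof2")]
block = build_block(pool)
revealed = committee_decrypt(block, KEY)
print([tx.decode() for tx in revealed])
```

The structural point is that decryption happens after ordering is fixed: the builder commits to a sequence of ciphertexts, so knowledge of the contents can no longer influence the block.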
What is the connection to existing research?
Explanation of MEV and builder centralization: https://vitalik.eth.limo/general/2024/05/17/decentralization.html#mev-and-builder-dependence
MEVBoost: https://github.com/flashbots/mev-boost
Enshrined PBS (early solutions to these problems): https://ethresear.ch/t/why-enshrine-proposer-builder-separation-a-viable-path-to-epbs/15710
Mike Neuder’s reading list on inclusion lists: https://gist.github.com/michaelneuder/dfe5699cb245bc99fbc718031c773008
Inclusion list EIP: https://eips.ethereum.org/EIPS/eip-7547
FOCIL: https://ethresear.ch/t/fork-choice-enforced-inclusion-lists-focil-a-simple-committee-based-inclusion-list-proposal/19870
Max Resnick’s presentation on BRAID: https://www.youtube.com/watch?v=mJLERWmQ2uw
Dan Robinson’s “Priority is All You Need”: https://www.paradigm.xyz/2024/06/priority-is-all-you-need
Tools and protocols for multiple proposers: https://hackmd.io/xz1UyksETR-pCsazePMAjw
VDFResearch.org: https://vdfresearch.org/
Verifiable delay functions and attacks (focused on RANDAO setups, but also applicable to encrypted mempools): https://ethresear.ch/t/verifiable-delay-functions-and-attacks/2365
What remains to be done? What trade-offs need to be made?
We can view all of these proposals as different ways of dividing up the staker's authority, arranged along a spectrum from lower economies of scale ("friendly to generalists") to higher economies of scale ("friendly to specialists"). Prior to 2021, all of these powers were concentrated in a single actor.
The core challenge is this: any meaningful power left in the hands of stakers can end up being "MEV-relevant." We want a highly decentralized set of actors to hold as much power as possible, which means (i) putting a lot of power in stakers' hands and (ii) keeping stakers decentralized, i.e., giving them little incentive to consolidate through economies of scale. This is a difficult tension to navigate.
We can think of FOCIL + APS in these terms: stakers retain the authority on the left end of the spectrum, while the right end is auctioned off to the highest bidder.
BRAID, on the other hand, is quite different. The "staker" portion is larger, but it is split into lightweight stakers and heavyweight stakers. Meanwhile, because transactions are ordered by fee in descending order, the top-of-block position is effectively auctioned through the fee market, making the scheme resemble an enshrined PBS.
Note that the security of BRAID depends heavily on encrypted mempools; otherwise, the top-of-block auction mechanism is vulnerable to strategy-stealing attacks (essentially: copy someone else's transaction, swap the recipient address, and pay a 0.01% higher fee). This need for pre-inclusion privacy is also the reason enshrining PBS is so difficult.
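The strategy-stealing attack described above takes only a few lines, which is exactly why a transparent fee-ordered mempool is fragile. A minimal sketch, with illustrative transaction fields:

```python
def steal_strategy(victim_tx: dict, attacker_addr: str) -> dict:
    """In a transparent top-of-block fee auction, an attacker can copy a
    profitable transaction it sees in the mempool, redirect the proceeds
    to itself, and outbid the original by a tiny margin. Integer math is
    used for the 0.01% bump to avoid float rounding."""
    stolen = dict(victim_tx)
    stolen["recipient"] = attacker_addr                      # redirect the profit
    stolen["fee"] = victim_tx["fee"] + victim_tx["fee"] // 10_000  # +0.01%
    return stolen

victim = {"calldata": "arb: buy on DEX, sell on CEX",
          "recipient": "0xVictim",
          "fee": 1_000_000}
attack = steal_strategy(victim, "0xAttacker")
print(attack["recipient"])  # 0xAttacker
print(attack["fee"] > victim["fee"])  # True -- the copy wins the fee auction
```

With an encrypted mempool, the victim's calldata is never visible before ordering is fixed, so there is nothing to copy.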
Finally, there are more "radical" versions of FOCIL + APS, for example one in which APS only determines the end of the block:
The remaining major tasks are: (i) consolidating the various proposals and analyzing their consequences, and (ii) combining that analysis with an understanding of the Ethereum community's goals, i.e., what forms of centralization it is willing to tolerate. Each individual proposal also needs further work, for example:
Continuing work on encrypted mempool designs, aiming for one that is robust, reasonably simple, and viable for inclusion in the protocol.
Optimizing the design of multiple inclusion lists to ensure that (i) it does not waste data, especially when the inclusion list covers blobs, and (ii) it is friendly to stateless validators.
More research on the best auction design for APS.
Additionally, it's worth noting that these proposals are not necessarily mutually incompatible forks in the road. For example, implementing FOCIL + APS could serve as a stepping stone to implementing BRAID. A viable conservative strategy is "wait and see": first implement a solution that limits stakers' authority and auctions off most of it, then, as we learn more from the live network, decide on further steps.