Blockchain scalability has been a long-standing problem in the blockchain industry. As more and more users and transactions join the network, the blockchain system struggles to keep up with the demand, leading to slow transaction times and high fees. This is where blockchain scalability solutions come in, which aim to increase the capacity and performance of blockchain networks. This comprehensive course offers a detailed overview of blockchain scalability solutions, including Rollups and Layer 2 scaling. Discover how these innovative solutions increase the capacity and performance of blockchain networks, while reducing transaction times and fees.
Why are blockchains so slow?
At the database level, blockchains such as Ethereum and Bitcoin consist of blocks, each of which contains transactions. Transactions are propagated between blockchain nodes via a gossip network, and mining nodes search for new blocks that include them. Whenever a miner announces that a new block has been discovered, this block is propagated across the network. This process takes some time, but eventually all nodes participating in the network receive the new block (ignoring synchronisation issues and the possibility of two miners discovering a block at the same height at around the same time). Each block contains a number of transactions that alter the state of the blockchain. A blockchain is, at its core, a state machine: it computes the post-state from a pre-state and a set of transactions. The state of Ethereum is approximately 900 GB in size, so rather than storing the entire state in each block, we compress it into a single hash using a cryptographic commitment scheme and store this state commitment within each block.
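As a minimal sketch of the state-machine view described above, the following Python models the state as a dictionary of balances and the commitment as a SHA-256 hash of its serialised form. The function names, the transaction fields, and the flat hash are assumptions made for illustration; Ethereum itself commits to its state with a Merkle Patricia trie rather than a single hash over the whole state.

```python
import hashlib
import json


def apply_transaction(state, tx):
    """Return a new state with `tx` applied; raise ValueError if it is invalid."""
    sender, recipient, amount = tx["from"], tx["to"], tx["amount"]
    if amount <= 0 or state.get(sender, 0) < amount:
        raise ValueError(f"invalid transaction: {tx}")
    new_state = dict(state)  # copy, so the pre-state is left untouched
    new_state[sender] -= amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def state_transition(pre_state, transactions):
    """Compute the post-state from a pre-state and an ordered list of transactions."""
    state = pre_state
    for tx in transactions:
        state = apply_transaction(state, tx)
    return state


def state_commitment(state):
    """Compress the whole state into one hash (illustrative stand-in for a state root)."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()
```

Because the transition function is deterministic, any node that starts from the same pre-state and applies the same transactions arrives at the same commitment, which is what makes independent validation possible.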
From a user’s perspective, once the miner mines a new block, users want to know the current state. One option is for a user to ask the miner what the current state is. This is easy for the user, but if the miner is a malicious actor, the user is in trouble. The easiest way to determine whether a miner is truthful is to validate the state ourselves. In this way, we do not need to obtain the state from the miner; instead, we can examine the block, examine its list of transactions, and compute the state ourselves. This is how Bitcoin and Ethereum function, and this is how people running full nodes participate in the network. This is a crucial component because the ability to validate the blockchain enables us to have a relationship with the miners that is not based on trust. On the Ethereum network, a new block is added to the blockchain roughly every 15 seconds. As long as a user is able to verify the new block within this 15-second window, there is no need to trust the miner. If validation is slower than the rate at which miners add new blocks, then users must place their trust in miners. The underlying bottleneck of scalability in blockchains is the speed of validation.
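The check a full node performs can be sketched as follows. This is an illustrative sketch only: the block layout, the `claimed_commitment` field, and the SHA-256 commitment are assumptions, and the 15-second interval is the figure quoted in the text. The function re-executes the block and reports two things: whether the miner's claimed commitment matches, and whether validation finished inside the block interval.

```python
import hashlib
import json
import time

BLOCK_INTERVAL_SECONDS = 15.0  # block time quoted in the text


def commit(state):
    """Illustrative state commitment: SHA-256 over the serialised balances."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def validate_block(pre_state, block, block_interval=BLOCK_INTERVAL_SECONDS):
    """Re-execute `block` and return (commitment_matches, finished_in_time)."""
    started = time.perf_counter()
    state = dict(pre_state)
    valid = True
    for sender, recipient, amount in block["transactions"]:
        if amount <= 0 or state.get(sender, 0) < amount:
            valid = False  # the block contains an invalid transfer
            break
        state[sender] -= amount
        state[recipient] = state.get(recipient, 0) + amount
    matches = valid and commit(state) == block["claimed_commitment"]
    elapsed = time.perf_counter() - started
    return matches, elapsed <= block_interval
```

If the second value is ever false on typical hardware, users can no longer keep up with the chain and are pushed back towards trusting the miner, which is the bottleneck the paragraph above describes.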
There is nothing preventing us from increasing the block size and decreasing the time between blocks on Ethereum, but in this case we would not be able to independently validate the network. Using sharding, we can distribute the work of building and verifying the chain across multiple nodes to address this scenario. Sharding is a technique for distributing a single database across multiple databases, which are then stored across multiple data nodes.
This is even more true for Bitcoin, where the trust assumptions are even lower, resulting in a longer time between blocks and a smaller block size, which reduces the time required to validate blocks and makes it simple for anyone to run a full node.
One Honest Node Security Assumption: Assume a scenario in which all nodes on a network collude except for a single honest node. If the honest node can independently verify the network, it can raise an alarm, causing some users to become suspicious of the network’s state and decide to run validator nodes, allowing us to reject the dishonest network. Validating the blockchain requires a single full node, provided it can do so within the given time window.
Full node users are responsible for validating the miners, and their ability to validate a block is the reason public, decentralised blockchains cannot scale beyond the speed at which end users can validate blocks on full nodes.
Brief Overview of Existing Scaling Solutions
Blockchain-based systems are currently hampered in their scalability by low transaction rates and high transaction processing latencies. Visa and PayPal process thousands of transactions per second (TPS), whereas Bitcoin and Ethereum process between five and fifteen transactions per second. For blockchains to be used effectively as a payment method, we require scalability solutions that enable a significantly higher throughput.
Scalability solutions address this transaction-capacity issue by easing the scaling limitations of blockchains. The following is a brief overview of existing scaling solutions:
Layer 1 Solutions
To increase the scalability of blockchains, we can use Layer 1 solutions such as:
- Alternate Consensus Algorithms: The Proof of Stake (PoS) consensus mechanism does not require miners to solve cryptographic puzzles and expend excessive computing power; by removing the computational barrier of proof-of-work and moving towards probabilistic election of block proposers, PoS is expected to provide throughput gains. It is anticipated that ETH 2.0 (PoS) TPS will be 500-1000 times that of ETH 1.0 (PoW).
- Modifying the block data: The block data can be modified via mechanisms such as Compact Block Relay, Segwit, or Txilm to minimize both the bandwidth and the latency required to transfer a block, since a block largely confirms transactions that peers have already received.
- Directed Acyclic Graphs (DAGs): Lewenberg et al. replace the blockchain structure with a Directed Acyclic Graph; there is still a main chain, but its blocks may refer to pruned branches in order to incorporate their transactions.
- Sharding: Sharding divides large chains into smaller, faster chains by splitting the state and history stored on the main chain into shards, thereby increasing the system’s scalability.
Layer 1 Solutions involve structurally altering the fundamental design elements of blockchains, thereby diminishing backward compatibility. Using an alternative consensus mechanism, for instance, necessitates the forking of the protocol. Similarly, sharding networks significantly alters the network’s layout.
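To make the sharding idea above concrete, here is a small sketch of one possible partitioning rule: each account is assigned to a shard by hashing its identifier. The function names and the hash-modulo rule are assumptions chosen for illustration and do not describe how any particular network assigns state to shards.

```python
import hashlib


def shard_for(account, num_shards):
    """Map an account identifier to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition_state(state, num_shards):
    """Split one balance map into `num_shards` smaller maps, one per shard."""
    shards = [dict() for _ in range(num_shards)]
    for account, balance in state.items():
        shards[shard_for(account, num_shards)][account] = balance
    return shards
```

Each shard then only has to store and validate its own slice of the state, which is where the scalability gain comes from; transactions that touch accounts on different shards need extra coordination that this sketch does not model.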
Layer 2 Solutions
A second class of scaling solutions has emerged to allow blockchains to scale without modifying the underlying consensus mechanism or trust assumptions. These solutions, referred to collectively as Layer 2 solutions, execute transactions off-chain via an authenticated communication medium, thereby significantly reducing the transaction load on the main chain. For a comprehensive overview on the existing Layer 2 Solutions, you are encouraged to read A Survey of Layer-Two Blockchain Protocols.
L1 = Layer 1; L2 = Layer 2.
A Layer 2 scaling solution is a method for encapsulating one blockchain within another by providing an L2 block as a transaction (which includes a state commitment) within an L1 blockchain. We can include an entire block with its own post-state within a single state commitment. This allows us to pack a large number of transactions into a single L1 block, and we can do so recursively. Imagine packing four blockchains recursively: in theory, capacity multiplies with each layer of nesting. This solution is called a rollup. It is referred to as a “rollup” because not only is the state posted to L1, but all the transactions from L2 are also rolled up into L1 when they are included, so we can reconstruct the state of L2 by examining L1. Conceptually, rollups are non-custodial side chain solutions that aim to reduce the load on the main chain; they employ data compression techniques and a smart contract for scaling L1s.
Rollups retain minimal state update information on-chain; this information is used for on-chain verification and faster withdrawals. Transactions are executed in batches off-chain and bundled for on-chain verification. The smart contract keeps the state root, or Merkle root, of the current rollup state on-chain. The same root can be verified using the blockchain’s data. The Merkle tree itself is not stored on the blockchain in order to conserve space. A new state root is calculated whenever a batch of transactions alters account balances. Anyone can publish a batch by including the transactions in compressed form, the previous state root, and the newly computed state root.
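The batch structure described above can be sketched as follows. This is a simplified illustration under stated assumptions: `Batch`, `RollupContract`, `build_batch`, the SHA-256 Merkle tree, and the zlib-compressed JSON encoding are all hypothetical names and choices, not the format of any real rollup.

```python
import hashlib
import json
import zlib
from dataclasses import dataclass


def h(data):
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Merkle root over byte-string leaves; an odd level duplicates its last node."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def balances_root(balances):
    """State root of a balance map, with accounts sorted for determinism."""
    leaves = [f"{account}:{balance}".encode() for account, balance in sorted(balances.items())]
    return merkle_root(leaves)


@dataclass
class Batch:
    prev_root: bytes       # state root before the batch
    new_root: bytes        # state root after the batch
    compressed_txs: bytes  # transactions in compressed form


def build_batch(balances, transactions):
    """Execute transfers off-chain and return (new_balances, batch)."""
    new_balances = dict(balances)
    for sender, recipient, amount in transactions:
        if amount <= 0 or new_balances.get(sender, 0) < amount:
            raise ValueError(f"invalid transfer from {sender}")
        new_balances[sender] -= amount
        new_balances[recipient] = new_balances.get(recipient, 0) + amount
    compressed = zlib.compress(json.dumps(transactions).encode())
    return new_balances, Batch(balances_root(balances), balances_root(new_balances), compressed)


class RollupContract:
    """Stand-in for the L1 contract: it stores only the current state root."""

    def __init__(self, genesis_root):
        self.state_root = genesis_root

    def publish(self, batch):
        # This sketch only checks that the batch builds on the current root.
        # Whether new_root is actually correct is what fraud proofs or
        # validity proofs establish; that check is not modelled here.
        if batch.prev_root != self.state_root:
            raise ValueError("batch does not build on the current state root")
        self.state_root = batch.new_root
```

Note that the contract never sees the Merkle tree, only its root, while the compressed transactions remain available so that anyone can replay them and recompute the same root.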
Rollup Fee Mechanism
To comprehend how rollups generate fees, it is necessary to comprehend how they execute transactions:
- A sequencer receives and orders transactions; users are notified when their transaction has been recorded on L1. The sequencer is only utilised for transaction completion and ordering.
- A deterministic state transition function updates the L2 state for each transaction, thereby producing an L2 block. These blocks are produced more rapidly than L1 blocks.
- A transaction group is compressed and sent to L1. Currently, these transactions are stored as calldata, but in the future, rollups will utilise data blobs.
Given the steps involved in successfully establishing, executing, and terminating rollup transactions, both L2 and L1 steps incur costs. When the state transition is applied to execute their transactions, users pay L2 gas. L1 gas is paid for at the time the batch is posted.
Sequencers commit to a transaction and collect L2 fees prior to knowing the batch’s complete contents or the L1 base fee. Consequently, L2s estimate the L1 fees.
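As a rough sketch of the two fee components described above, the function below adds the L2 execution fee to an estimated L1 data fee. The calldata pricing of 4 gas per zero byte and 16 gas per non-zero byte follows Ethereum's calldata gas schedule; the function names, the parameters, and the omission of any rollup-specific overhead or scaling factor are simplifying assumptions.

```python
ZERO_BYTE_GAS = 4      # L1 gas per zero byte of calldata
NONZERO_BYTE_GAS = 16  # L1 gas per non-zero byte of calldata


def calldata_gas(data):
    """L1 gas consumed by posting `data` as calldata."""
    zero_bytes = data.count(0)
    return ZERO_BYTE_GAS * zero_bytes + NONZERO_BYTE_GAS * (len(data) - zero_bytes)


def estimate_fee_wei(tx_data, l2_gas_used, l2_gas_price_wei, estimated_l1_base_fee_wei):
    """Total fee = L2 execution fee + estimated L1 data fee, both in wei."""
    l2_fee = l2_gas_used * l2_gas_price_wei
    # The sequencer does not yet know the real L1 base fee at posting time,
    # so this term is necessarily an estimate.
    l1_fee = calldata_gas(tx_data) * estimated_l1_base_fee_wei
    return l2_fee + l1_fee
```

For example, three bytes of calldata containing one zero byte cost 4 + 16 + 16 = 36 L1 gas; the better a batch compresses, the smaller this term becomes for each user.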
Advantages of Rollups
The advantages of rollup scaling solutions include:
- Compression: Rollups use compression to reduce transaction footprint on-chain, reducing space and scaling the L1 chain.
- Data Availability: Data availability refers to the condition in which all transaction-related data is accessible to network nodes. Data availability enables nodes to independently validate transactions and compute the blockchain’s state without requiring trust. Rollups circumvent the issue of data availability by utilising zk proofs and fraud proofs (detailed below).
- Capital Efficiency
- No Mass Exiting
- Flexible Smart Contract Support
- Existing EVM-bytecode Compatibility
There are two kinds of rollups, depending on the type of underlying miner: Optimistic Rollups and Zero-Knowledge Rollups.
Types of Miners
Optimistic Miner: An optimistic miner will not provide the proof that the new state commitment is the correct state. Since new blocks must be validated, there are constraints on the size of the block and the number of blocks proposed by miners.
Zero-Knowledge Miner: The miner provides both the state and a zero-knowledge proof that the state is accurate. The proof demonstrates that the miner is not lying and that the post-state is valid. If there is a supercomputer miner that submits a large number of transactions, then we only need to verify the ZK proof in this scenario. On the surface, it would appear that ZK-proofs solve the scalability problem, as all we need to do is include ZK-proofs and we can create blocks of any size. Despite the fact that we know the post-state is accurate, all we can see on the blockchain is the state commitment, and we do not know what the state is. Therefore, we can inquire of the miner, who will provide us with the state, and we can use an inclusion proof to determine whether a state is included in the commitment. We can independently determine whether a miner is lying.
Assume we are searching for a set of transactions between Bob and Alice, as well as the balances held by Bob and Alice following the transactions. If the miner provides us with a state containing an inclusion proof that provides us with the balance for Bob but not for Alice, then we do not know Alice’s balance. If I do not receive the answer from the miner, I must recompute the state myself, which will require a significant amount of computational power if the state is large.
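The inclusion proof mentioned above can be illustrated with a plain binary Merkle tree. This is a sketch under assumptions: SHA-256 as the hash, leaves encoded as `account:balance` byte strings, and duplication of the last node on odd levels are illustrative choices, not the structure any specific chain uses.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from hashed leaves up to the root.

    `leaves` must be non-empty; an odd level is padded by repeating its last node.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the neighbour that shares a parent with `index`
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root):
    """Recompute the root from one leaf and its proof; True if it matches."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

A miner who supplies Bob's leaf with a valid proof has shown that Bob's balance is part of the committed state, but the proof says nothing about Alice's leaf, which is exactly the gap the paragraph above describes.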
Optimistic Rollups — Mechanism
In this construction, validators must examine all L2 transactions and independently verify the post-state. It only requires the validator assumption on the L1 to function. In this scenario, miners are able to cheat, and their state is accepted on faith unless someone challenges it. If a validator discovers an incorrect state commitment, it can submit a fraud proof to a smart contract on L1. Once the fraud proof has been validated, the L1 can reject the new state. There is typically an incentive mechanism in place to punish malicious miners. There is also a fraud-proof window during which honest validators must submit a fraud proof, typically between one and two weeks, after which the state is presumed to be correct.
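The challenge mechanism can be sketched as a small state machine. Everything here is a hypothetical simplification: the class and method names are invented, the one-week window is an assumed value within the range given above, and the fraud proof is modelled as the contract re-executing the batch through a `recompute_root` callable rather than as a real interactive or single-step proof.

```python
from dataclasses import dataclass

FRAUD_PROOF_WINDOW_SECONDS = 7 * 24 * 60 * 60  # assumption: one week


@dataclass
class Proposal:
    batch: tuple
    claimed_root: str
    proposed_at: int
    rejected: bool = False


class OptimisticRollupContract:
    """Accepts state roots without proof and lets validators challenge them."""

    def __init__(self, recompute_root, window=FRAUD_PROOF_WINDOW_SECONDS):
        self.recompute_root = recompute_root  # stand-in for on-chain re-execution
        self.window = window
        self.proposals = []

    def propose(self, batch, claimed_root, now):
        """Record a miner's claimed post-state root; returns the proposal index."""
        self.proposals.append(Proposal(tuple(batch), claimed_root, now))
        return len(self.proposals) - 1

    def submit_fraud_proof(self, index, now):
        """Return True if the proposal is rejected as fraudulent."""
        proposal = self.proposals[index]
        if now >= proposal.proposed_at + self.window:
            return False  # too late: the window has closed
        if self.recompute_root(proposal.batch) != proposal.claimed_root:
            proposal.rejected = True
        return proposal.rejected

    def is_final(self, index, now):
        """A state is presumed correct once the window passes unchallenged."""
        proposal = self.proposals[index]
        return not proposal.rejected and now >= proposal.proposed_at + self.window
```

The sketch makes the security assumption visible: an incorrect root that nobody challenges inside the window becomes final, so the scheme depends on at least one honest validator getting a fraud proof included in time.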
Zero-Knowledge (ZK) Rollups — Mechanism
A ZK-miner provides the ZK-proof, and since they cannot cheat, we are certain that the new block on L2 is correct. These proofs are constructed using zk-SNARK proof systems such as PLONK. Validity proofs are expensive to compute, but their on-chain verification is quick. If we wish to determine Bob’s balance in L2 (the actual state), we are faced with the same dilemma: we must either run a full node or rely on someone else running a full node.
Withdrawing from Layer 2 to Layer 1
Imagine the L2 blockchain is a casino where you purchase all of the chips and where transactions are taking place. Let’s assume that after a few rounds of Poker, you wish to transfer your winnings back into the L1 blockchain and exchange your chips for fiat currency. Consequently, you can now ask the L2 miner if you can withdraw tokens from L2, and ideally, the miner will validate this transaction.
Optimistic Rollup: In this case, the miner can cheat, so you must wait until the fraud-proof window has elapsed before you can withdraw your funds. This is one of the most significant disadvantages of the optimistic rollup construction, as we must wait to verify whether the miner’s state is accurate.
ZK Rollup: This is not an issue given that the miner cannot cheat in this construction.
Now suppose that the L2 miner is offline: In this scenario, no new blocks are generated, but users must still be able to withdraw funds. This is possible for any rollup construction; however, you must be able to produce a state and validate the state for this to be possible.
The third circumstance is when the miner is censoring. Suppose both Bob and Alice wish to withdraw funds, and the miner responds to Alice’s request but ignores Bob’s. An outside observer cannot determine whether the miner ignored the request or Bob lied about having made it. There should therefore be a mechanism that protects users by letting them bypass the miner.
Differences between Optimistic and Zero-Knowledge Rollups
The Rollup Thesis
Fundamentally, I believe that zk-rollups will be the ultimate Layer 2 scaling solution in the long run due to their properties that make them as secure as the underlying L1. As stated previously, zk-rollups also avoid the withdrawal delay imposed by the fraud-proof window. Consequently, it is clear why zk-based NFT exchanges such as Immutable X gained so much traction in 2021.
The second reason why I am so enthusiastic about zk-rollups is composability: on Ethereum, Decentralised Finance (DeFi) is so successful because multiple protocols interact to increase the value of one another’s offerings. Both optimistic rollups and zk-rollups can be used to achieve composability; however, there is a significant drawback with optimistic rollups: the more value you lock into a single instance of an optimistic rollup, the more risk your funds are exposed to. If you have multiple protocols with a combined value of billions of dollars, a terrifying situation may occur: the optimistic rollup could be attacked by the miners and not the validators. The miners could conduct a censorship attack involving fraudulent transactions, in which case all validator nodes would rush to submit a fraud proof; however, the miners would censor these validators.
Dangers of Optimistic Rollups
- 51% Attacks: Assume an attacker rents 51% of the network’s hashing power in the cloud and censors all blocks containing fraud-proof transactions for this particular optimistic rollup. The attacker does not affect transactions pertaining to any other account, so all other transactions are included; the attacker builds on top of blocks that do not contain the fraud-proof transaction and reverts any block in which miners include it. In this situation, as a miner, you may attempt to coordinate with other miners to obtain 51% of the hashing power, but if that is not possible, you simply comply. As a miner on a Proof of Work system, it is possible to comply with an attacker completely anonymously by leaving the legitimate mining pool and joining the attacker pool. Miners are rational, profit-seeking actors, and therefore there is a likelihood that miners comply. Once miners comply, the attacker no longer needs to supply the hashpower, and the soft fork can be maintained indefinitely. The cost of this attack is low. All that is required is for the 51% attack on an optimistic rollup to be sustained for one week (or until miners comply), allowing an attacker to steal billions of dollars in value. zk-rollups are resistant to this type of attack.
- Non-attributable censorship attack on fraud-proof-based L2s (in a POS world)
- Time-travelling attacks by an anonymous user: Submitting a fraud proof against a legitimate state transition enables the chain to be reverted, effectively travelling back in time.
Optimizing Withdrawal Time in Optimistic Rollups
The 1 to 2 week withdrawal time for optimistic rollups occurs when transferring tokens from L2 to L1 via the native bridge. There are now numerous ways to circumvent this issue, including:
- Hop Protocol: The hop protocol provides a scalable rollup-to-rollup general token bridge through a combination of creating a cross-network token bridge token that can be economically moved from rollup to rollup or claimed on the L1 for the underlying asset, and using AMMs to swap between each bridge token and its corresponding canonical tokens on each rollup to dynamically price liquidity and incentivise the rebalancing of liquidity across the network (Winfrey, 2021, https://hop.exchange/whitepaper.pdf). Hop protocol reduces the withdrawal time from weeks to minutes.
- Celer Network: Celer Network announced the Celer cBridge, a multi-chain network that enables instant, low-cost and any-to-any value transfers within and across L2 chains, main chain, other L1s and L2s built on top of L1s. The bridge functions by extending the functionality of Celer Network’s state channel network through enabling multi-homing in the off-chain communication protocol. Multi-homing is the practice of connecting a host or a computer network to more than one network.
Advantages of using a bridging solution: There exist several other third-party bridges between the L2 and L1, and L2 and other L2s. (https://help.optimism.io/hc/en-us/articles/4411903283227-Withdrawals-from-Optimism)
- Shorter Withdrawal Time: Using an external bridging solution removes the limitation of waiting for the verification period.
- Lower Cost: The cost of the Merkle proof required for a withdrawal transaction is also spread over a large pool of tokens, reducing the relative cost of withdrawal.
Disadvantages of using a bridging solution:
- Security Risks: Bridges often hold multiple assets in liquidity pools and are therefore a weak point in the transaction flow that is often systematically targeted by malicious actors. Cross-chain bridges can be considered a top security risk. (Chainalysis, 2022, https://blog.chainalysis.com/reports/cross-chain-bridge-hacks-2022/)
- Ecosystem Lock-In: Most bridges are restricted to a single crypto-ecosystem
- Slow and Inefficient: Several transactions are required to bond, mint, claim, and swap assets
Alternatives to Bridging:
- Cross-chain Interoperability Protocols (such as Axelar), built using threshold signature cryptography (TSS) and PoS. These protocols can verify a transaction on one chain, and trustlessly generate a signature to submit the transaction on another chain.
- Cross-chain Swaps (such as Magpie Protocol) use bridges to communicate swap signals between chains, creating a chain-agnostic, efficient, and fully non-custodial trading experience. Example Mechanism:
- The swap is initiated by a user on source chain with target asset and chain information to calculate the order route
- User assets are converted to stablecoins
- Stablecoins are deposited to chain-specific Magpie stable pools
- User swap and deposit information is sent and recorded by Wormhole State Guardian Network
- Upon verification, the relayer executes the swap-out function on the target chain, converting stablecoins to the user’s target asset
- Multi-chain Liquidity Aggregation (such as Li.Fi) combines bridge aggregation and smart routing with DEX and protocol connectivity. Example Mechanism (Source: https://gitcoin.co/grants/3133/lifinance-the-cross-chain-defi-mesh-widget):
- Aggregates cross-chain liquidity pools such as Connext, Hop, RouterProtocol, Thorchain, Chainflip, and Anyswap, and keeps track of all available liquidity pools
- Connects to DEXes and DEX Aggregators akin to 1Inch on all chains to facilitate any-to-any swaps
- Arbitrageurs and impatient users can connect to lending protocols
State of Layer 2 Rollups
Due to their EVM compatibility, optimistic rollups are currently in the lead in terms of Total Value Bridged (TVB). Arbitrum and Optimism combined hold nearly 70% of the Layer 2 market share. ZK-rollups have fallen behind in the race owing to their lack of EVM compatibility and the complexity of implementing them. ZK-rollups are currently too computationally intensive for EVM compatibility.
As Ethereum progresses toward a Proof of Stake (PoS) consensus, danksharding will soon be implemented, providing mass data availability and a low-cost location for rollups to post their data. Currently, the combined rollup and Ethereum architecture limits transaction throughput scaling to between 1,000 and 4,000 TPS; however, the introduction of shards will increase this metric to more than 100,000 TPS.
Arbitrum currently leads the Layer 2 race in offering the lowest transaction costs. We are still awaiting the outcome of proto-danksharding (EIP-4844). In a world where EIP-4844 has been implemented, the cost of sending ETH will fall to below $0.01 and cost of swapping tokens to $0.02.
I anticipate that decentralised applications that seek the lowest gas fees for general-purpose computations and EVM compatibility will opt to build on top of Arbitrum. In contrast, applications that are computationally intensive, such as games, or that seek to provide fully private on-chain usage will opt to build on ZK-Rollups.