• Making Sense of Ethereum’s Layer 2 Scaling Solutions: State Channels, Plasma, and Truebit
  • By Josh Stark
  • Translation from: The Gold Project
  • This article is permalink: github.com/xitu/gold-m…
  • Translator: JohnJiangLA
  • Proofreader: Foxxnuaa Zheaoli

Railroad Viaduct (CC), Tunkhannock, Pa. The use of ancient Roman architectural ideas in the new age.

2018 is a year of infrastructure for Ethereum. It is the year in which early adopters will test the limits of the network and attention will return to the technologies built to scale Ethereum.

Ethereum is still in its infancy. Right now, it is neither secure nor scalable, and anyone who works closely with the technology is well aware of this. But over the last year, the hype around a spate of ICOs has begun to overstate the current capabilities of the network. The promise of Ethereum and Web3 is a secure, easy-to-use, decentralized internet, bound by a common set of economic norms and used by countless people. That promise can be kept only if the critical infrastructure is in place.

Projects that work to build this infrastructure and extend Ethereum’s capacity are often referred to as scaling solutions. These projects take different forms and are often compatible with or complementary to each other.

In this long post, I want to dive into one category of scaling solution: “off-chain” or “Layer 2” solutions.

  • First, we’ll discuss the scaling challenges of Ethereum (and all public blockchains) in general.
  • Second, we’ll look at different approaches to the scaling challenge, distinguishing between “Layer 1” and “Layer 2” solutions.
  • Finally, we’ll take a closer look at Layer 2 solutions and explain in detail how they work. We’ll cover state channels, Plasma, and Truebit.

The focus of this article is to give the reader a thorough, detailed explanation of the concepts and working principles behind Layer 2 solutions. We won’t delve into code or specific implementations. Instead, we focus on understanding the economic mechanisms used to build these systems and the insight shared by all Layer 2 technologies.


1. The scaling challenges of public blockchains

First, you need to understand that “scaling” is not a single, specific problem. It is a set of challenges that must all be overcome to make Ethereum usable by millions of users around the world.

The most commonly discussed scaling challenge is transaction throughput. Currently, Ethereum can process about 15 transactions per second, while Visa handles about 45,000 transactions per second (TPS). In the last year, some applications (such as CryptoKitties or the occasional ICO) have become popular enough to “slow down” the network and drive up the price of transaction fees (gas).

The core limitation of public blockchains such as Ethereum is that every transaction must be processed by every node in the network. A payment, the birth of a CryptoKitty, the deployment of a new ERC20 contract: every operation that takes place on the Ethereum blockchain must be performed in parallel by every node in the network. This is by design, and it is part of what makes a public blockchain authoritative. Nodes don’t need to rely on other nodes to tell them the current state of the blockchain; they work it out for themselves.

This puts a fundamental limit on Ethereum’s transaction throughput: it cannot be higher than what we are willing to require of a single node.

We could ask each node to do more work. If we doubled the block size (that is, the block gas limit), each node would handle roughly twice as much work per block as before. But this comes at the expense of decentralization: more work per node means that less powerful computers (such as consumer devices) are likely to drop out of the network, and mining becomes more concentrated among powerful node operators.

Instead, we need a way to make the blockchain do more useful things without increasing the work of individual nodes.

Conceptually, there are two possible approaches to this problem:

1. What if every node didn’t have to process every operation in parallel?

The first approach is to drop our premise. What if we could build a blockchain where every node didn’t have to handle every operation? What if the network were split into two parts, each of which could operate more or less independently?

Part A could process one batch of transactions while Part B processed another. This would effectively double the transaction throughput of the blockchain, because our limit is now what two nodes can process at the same time. If we can divide the blockchain into many different parts, then we can increase its throughput many times over.
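
To make the idea concrete, here is a minimal Python sketch of splitting a batch of transactions across several parts of a network. It is an illustration only: the shard count, the `shard_for` and `partition` helpers, and the rule of assigning a transaction by hashing its sender are all assumptions made for this example, and it ignores the hard part of any real design, namely transactions that touch more than one part.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical number of parts, chosen only for illustration


def shard_for(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account address to one part of the network by hashing it."""
    digest = hashlib.sha256(address.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards: int = NUM_SHARDS):
    """Group transactions by the part responsible for their sender."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return shards


# 1000 made-up transactions from 1000 made-up accounts.
txs = [{"sender": f"account-{i}", "amount": 1} for i in range(1000)]
shards = partition(txs)

# A node now only processes the batch of its own part, not all 1000.
busiest = max(len(batch) for batch in shards.values())
```

In this toy model, the work required of any single node is bounded by `busiest` rather than by the total number of transactions.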

This is the idea behind sharding, a scaling solution that Vitalik Buterin’s Ethereum Research group and other communities are working on. A blockchain is split into different parts called shards, each of which can process transactions independently. Because sharding is implemented in Ethereum’s base-level protocol, it is often referred to as a Layer 1 scaling solution. If you want to learn more about sharding, check out the Extensive FAQ and this blog post.

2. What if we could squeeze more useful operations out of Ethereum’s existing capacity?

The second option goes in the opposite direction: instead of increasing the capacity of the Ethereum blockchain itself, what if we could do more with the capacity we already have? The throughput of the base Ethereum blockchain stays the same, but in practice we can perform many more operations that are useful to people and applications, such as transactions, state updates in games, or simple computations.

This is the thinking behind “off-chain” technologies such as state channels, Plasma, and Truebit. While each of these solves a different problem, they all work by performing operations “off-chain,” outside the Ethereum blockchain, while still guaranteeing a sufficient level of security and finality.

These are also referred to as Layer 2 solutions because they are built “on top of” the main Ethereum chain. They don’t require changes to the base-level protocol. Instead, they exist as smart contracts on Ethereum that interact with off-chain software.

2. Layer 2 solutions are cryptoeconomic solutions

Before diving into the details of Layer 2 solutions, it is important to understand the underlying insight that makes them feasible.

The power of a public blockchain lies in cryptoeconomic consensus. By carefully aligning incentives and securing them with software and cryptography, we can create a stable network of computers that agree on an internal state. This is the key insight of Satoshi Nakamoto’s white paper, and it has been applied in the design of many different public blockchains, including Bitcoin and Ethereum.

Except in some extreme cases (such as 51% attacks), cryptoeconomic consensus gives us a solid core. We know that on-chain operations (such as payments and smart contracts) will execute as written.

The key insight behind Layer 2 solutions is that we can use this solid core as an anchor, a fixed point to which other economic mechanisms can be attached. This second layer of economic mechanisms can extend the usefulness of public blockchains, letting us interact off the blockchain while still being able to refer back to the core chain reliably if we need to.

These layers built “on top of” Ethereum do not always carry the same guarantees as on-chain operations. However, they still offer enough finality, security, and availability to be useful, especially where accepting slightly weaker finality lets us perform operations much faster or with lower overhead costs.

Cryptoeconomics didn’t begin and end with Satoshi Nakamoto’s white paper. It is a body of techniques that we are still learning to apply, not only in the design of core protocols, but also in the design of Layer 2 systems that extend the functionality of the underlying blockchain.

I. State Channels

A state channel is an “off-chain” technique for performing transactions and other state updates. Even so, what happens “inside” a state channel remains highly secure and final. If anything goes wrong, we still have the option of falling back on the “solid core,” whose authority rests on on-chain transactions.

Most readers will be familiar with a concept that has been around for years: the payment channel, which was recently implemented on Bitcoin via the Lightning Network. A state channel is the more general form of a payment channel. It can be used not only for payments but for any “state update” on the blockchain, such as a change inside a smart contract. State channels were first described in detail by Jeff Coleman in 2015.

The best way to explain how state channels work is to look at an example. Keep in mind that this is a conceptual explanation, which means we won’t get into the technical details of the implementation.

Now imagine that Alice and Bob want to play a game of tic-tac-toe, with the winner receiving 1 ETH. The simplest way to do this is to create a smart contract on Ethereum that implements the rules of tic-tac-toe and tracks each player’s moves. Each time a player wants to move, they send a transaction to the contract. When a player wins, the contract pays the winner 1 ETH, according to the rules.

This works, but it is inefficient and slow. Alice and Bob are making the entire Ethereum network process their game, which is overkill for what they need. They have to pay a transaction fee (gas) for every move, and they have to wait for each transaction to be mined before making the next move.

Instead, we can design a system that lets Alice and Bob play tic-tac-toe with as few on-chain operations as possible. Alice and Bob update the state of the game off-chain, while remaining able to fall back on the main Ethereum chain when needed. We call such a system a “state channel.”

First, we create a smart contract on the Ethereum main chain called “Judge” that understands the rules of tic-tac-toe and identifies Alice and Bob as the two players in our game. The contract holds the 1 ETH prize.

Then Alice and Bob start to play. Alice creates and signs a transaction that describes her first move, then sends it to Bob, who also signs it, sends the signed version back, and keeps a copy. Bob then creates and signs a transaction describing his first move and sends it to Alice, who signs it, sends it back, and keeps a copy. With each exchange they update the current state of the game between themselves. Each transaction contains a “nonce,” a counter that lets us tell later in what order the moves were made.

So far, nothing has happened on-chain. Alice and Bob have simply sent transactions to each other over the internet, and nothing has touched the blockchain. However, every one of those transactions could be sent to the Judge contract; in other words, they are valid Ethereum transactions. Think of it as two people writing a series of blockchain-enforceable checks back and forth. No money is actually deposited in or withdrawn from the bank, but each of them holds a pile of checks that could be deposited at any time.

When Alice and Bob finish the game (perhaps because Alice has won), they can close the channel by submitting the final state (for example, the list of transactions) to the Judge contract, paying only a single transaction fee. The Judge checks that both sides signed the “final state,” waits for a while to make sure no one can legitimately challenge the result, and then pays the 1 ETH prize to Alice.

Why does the Judge contract need to wait for a “challenge period”?

Suppose that, instead of sending the Judge the real final state, Bob sends an earlier state in which he was winning against Alice. The Judge is just a contract, and on its own it has no way of knowing whether this state is the most recent one.

The challenge period gives Alice a chance to prove that Bob submitted a stale final state. If a more recent state exists, she holds a copy of the signed transaction and can provide it to the Judge. The Judge can compare the nonces to see that Alice’s version is more recent, and Bob’s attempt to steal the victory is rejected.
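
The closing and challenge logic can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how any real state channel contract is written: the `Judge` class, its method names, and the ten-block `CHALLENGE_PERIOD` are invented for this example, and an HMAC with a shared secret stands in for the digital signatures a real contract would verify against the players’ public keys.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


def sign(secret: bytes, state: dict) -> str:
    """Stand-in for a real digital signature: an HMAC over the encoded state."""
    payload = json.dumps(state, sort_keys=True).encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


@dataclass
class SignedState:
    state: dict  # e.g. {"nonce": 7, "winner": "alice"}
    sigs: dict   # player name -> signature over the state


class Judge:
    """Toy model of the on-chain Judge contract."""

    CHALLENGE_PERIOD = 10  # hypothetical length, measured in blocks

    def __init__(self, keys: dict, prize: int):
        self.keys = keys  # toy only: a real contract would hold public keys
        self.prize = prize
        self.pending = None
        self.deadline = None

    def _signed_by_all(self, s: SignedState) -> bool:
        return all(
            hmac.compare_digest(s.sigs.get(player, ""), sign(key, s.state))
            for player, key in self.keys.items()
        )

    def submit(self, s: SignedState, block: int) -> bool:
        """Accept a state if everyone signed it and it beats the pending one."""
        if not self._signed_by_all(s):
            return False
        if self.deadline is not None and block > self.deadline:
            return False  # the challenge period is over
        if self.pending and s.state["nonce"] <= self.pending.state["nonce"]:
            return False  # not more recent than what we already have
        self.pending = s
        if self.deadline is None:
            self.deadline = block + self.CHALLENGE_PERIOD
        return True

    def finalize(self, block: int):
        """Pay out once the challenge period has passed."""
        if self.pending is None or block <= self.deadline:
            return None
        return self.pending.state["winner"], self.prize


keys = {"alice": b"alice-secret", "bob": b"bob-secret"}


def both_sign(state: dict) -> SignedState:
    return SignedState(state, {p: sign(k, state) for p, k in keys.items()})


stale = both_sign({"nonce": 4, "winner": "bob"})    # earlier state Bob prefers
final = both_sign({"nonce": 7, "winner": "alice"})  # the real final state

judge = Judge(keys, prize=1)
judge.submit(stale, block=100)      # Bob tries to close with a stale state
judge.submit(final, block=105)      # Alice challenges within the period
result = judge.finalize(block=111)  # ("alice", 1)
```

The important design point is that the Judge never replays the game. It only checks signatures and compares nonces, which is why closing the channel costs so little on-chain work.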

Characteristics and Limitations

State channels are useful in many applications, where they are a strict improvement over performing the same operations on-chain. But when deciding whether an application is suited to being channelized, pay particular attention to the specific trade-offs involved:

  • State channels rely on availability. If Alice went offline during the challenge period (perhaps because Bob, desperate to win the prize, had sabotaged her internet connection), she might not be able to respond in time. However, Alice can pay someone else to keep a copy of her state and act on her behalf, so that she remains available.
  • State channels are most useful where a large number of state updates will be exchanged over time. This is because deploying the Judge contract imposes an initial cost on creating a channel. But once it is deployed, the cost of each state update inside the channel is very low.
  • State channels are best suited for applications with a well-defined set of participants. This is because the Judge contract must always know all of the entities (that is, addresses) participating in a given channel. We can add or remove participants, but each time we need to change the contract.
  • State channels have strong privacy properties, because everything happens “inside” the channel between the participants instead of being publicly broadcast and recorded on-chain. Only the opening and closing transactions must be public.
  • State channels have instant finality. As soon as both parties sign a state update, it can be considered final. Both parties have an explicit guarantee that they can “enforce” the state on-chain if necessary.

At L4, we are working on Counterfactual, a framework for implementing state channels on Ethereum. Our goal is to let developers use state channels in a modular way in their projects without having to become state channel experts. You can learn more about the project here. We will release the technical details in the first quarter of 2018.

Another notable state channel project for Ethereum is Raiden, which is currently focused on building a network of payment channels using a paradigm similar to the Lightning Network. This means you don’t have to open a channel with each specific person you want to transact with. You can open a single channel with an entity that is connected to a larger network of channels, and then pay anyone connected to the same network at no extra cost.

In addition to Counterfactual and Raiden, there are several application-specific channel implementations on Ethereum. For example, FunFair has built what they call Fate Channels for their decentralized gambling platform, SpankChain has built one-way payment channels for adult performers (they also used a state channel in their ICO), and Horizon Games is using state channels in their first Ethereum-based game.

II. Plasma

On August 11, 2017, Vitalik Buterin and Joseph Poon published a paper entitled Plasma: Autonomous Smart Contracts. This document describes a new technology that could allow Ethereum to process far more transactions per second than it currently does.

Like state channels, Plasma is a technique for managing off-chain transactions that relies on the underlying Ethereum blockchain for its security. But Plasma takes a new approach by creating “child” blockchains, or subchains, that are attached to the “main” Ethereum blockchain. These subchains can in turn spawn their own subchains, and so on.

The result is that we can perform many complex operations at the subchain level, running full applications with thousands of users with minimal interaction with the main Ethereum chain. A Plasma subchain runs faster and carries lower transaction costs because operations on it do not need to be replicated across the entire Ethereum blockchain.

plasma.io/plasma.pdf

To see how Plasma works, let’s look at an example.

Imagine you’re building an Ethereum-based trading card game. The cards are ERC721 non-fungible tokens (like CryptoKitties), but they have features and attributes that let players play against each other, a bit like Hearthstone or Magic: The Gathering. These kinds of complex operations are expensive to perform on-chain, so you decide to use Plasma for your application instead.

First, we create a set of smart contracts on the main Ethereum chain that act as the “root” of the Plasma subchain. The Plasma root contains some basic “state transition rules” for the subchain (such as “transactions cannot spend assets that have already been spent”), records hashes of the subchain’s state, and serves as a “bridge” that lets users move assets between the main Ethereum chain and the subchain.

Then we create our subchain. The subchain can have its own consensus algorithm. In this case, let’s assume it uses Proof of Authority (PoA), a simple consensus mechanism that relies on trusted block producers (that is, validators). Block producers are analogous to miners in a proof-of-work system: they receive transactions, form blocks, and collect transaction fees. To keep the example simple, let’s assume that you (the company that created the game) are the only entity creating blocks, and that your company runs a few nodes that serve as the block producers of the subchain.

Once the subchain is created and running, the block producers periodically submit a commitment to the root contract. What they are essentially saying is “I swear that the most recent block in the subchain is X.” These commitments are recorded on-chain in the Plasma root as proof of the transactions that occurred in the subchain.
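
One common way to commit to a whole block with a single small value is a Merkle root, and the Python sketch below uses that approach. Treat it as an assumption made for illustration: the `CommitmentLog` class, its `commit` method, and the sample transactions are invented here, and the text above only requires that the root records some hash of the subchain’s state.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Fold a list of serialized transactions into one 32-byte Merkle root."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node when a level is odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


class CommitmentLog:
    """Toy model of the part of the root contract that records commitments."""

    def __init__(self, operator: str):
        self.operator = operator
        self.commitments = []  # one 32-byte root per subchain block

    def commit(self, sender: str, block_root: bytes) -> int:
        if sender != self.operator:
            raise PermissionError("only the block producer may commit")
        self.commitments.append(block_root)
        return len(self.commitments) - 1  # the subchain block number


root = CommitmentLog(operator="game-company")
block_0 = [b"alice sends card #7 to bob", b"bob sends 1 ETH to alice"]
number = root.commit("game-company", merkle_root(block_0))
```

However many transactions a subchain block contains, the main chain stores only 32 bytes for it, which is where most of the savings come from.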

Now that the subchain is ready, we can create the basic components of our trading card game. The cards are ERC721 tokens that are first created on the main Ethereum chain and then transferred to the subchain through the Plasma root. This introduces a key point: Plasma can scale our interactions with blockchain-based digital assets, but those assets should first be created on the Ethereum main chain. We then deploy the actual game application on the subchain as a smart contract, so that the subchain contains all the logic and rules of the game.

When users want to play the game, they only need to interact with the subchain. They can hold assets (ERC721 cards), buy and trade them for ether, play against other users, and do whatever else the game allows, all without interacting with the main chain. Because only a few nodes (the block producers) need to process transactions, costs are much lower and operations are faster.

But is this model safe?

By moving operations from the main chain to the subchain, we can obviously perform more operations. But is it safe? Are transactions that occur on the subchain final? After all, the system we have just described has a single central entity controlling block production on the subchain. Isn’t that centralized? Couldn’t the company steal your assets or take your favorite cards whenever it liked?

In simple terms, Plasma makes one basic promise: you can always withdraw your assets back to the main chain, even if the subchain is completely controlled by a single entity. If a block producer starts acting maliciously, the worst that can happen is that you are forced to leave the subchain.

Let’s look at some of the ways in which block producers behave badly and see how Plasma handles these scenarios.

First, let’s say a block producer tries to cheat you by lying. They can do this by creating a fake new block and claiming that your assets have been taken over by them. Since they are the only block producer, they are free to introduce a new block that does not follow the rules of the blockchain. As with any other block, they will have to submit evidence of its existence to the Plasma root contract.

As mentioned above, users have a basic guarantee that they can always withdraw their assets back to the main chain. In this scenario, users (or applications acting on their behalf) would detect the attempted theft and withdraw to the main chain before the block producer could try to use the “stolen” assets.

Plasma also provides a mechanism for preventing fraud. Anyone (including you) can publish a fraud proof to the root contract to show that a block producer is cheating. The fraud proof contains information about the previous block and lets us show that, according to the subchain’s state transition rules, the fraudulent block does not follow from the previous state. If fraud is proven, the subchain is rolled back to the previous block. Better still, we can build the system so that block producers are penalized for committing fraudulent blocks and lose a deposit held on-chain.
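
The following Python sketch shows the shape of such a check under simplifying assumptions. The `FraudCheckingRoot` class, the single state transition rule in `apply_block`, and the way states and blocks are hashed are all invented for this example, and a real design would verify a compact proof instead of re-executing an entire block on the main chain.

```python
import hashlib
import json


def digest(obj) -> str:
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def apply_block(state: dict, block: list) -> dict:
    """State transition rule: only the current owner may transfer an asset."""
    new_state = dict(state)
    for tx in block:
        if new_state.get(tx["asset"]) != tx["sender"]:
            raise ValueError(f"{tx['sender']} does not own {tx['asset']}")
        new_state[tx["asset"]] = tx["to"]
    return new_state


class FraudCheckingRoot:
    """Toy root contract: stores hashes only and verifies fraud proofs."""

    def __init__(self, genesis_state: dict, deposit: int):
        self.state_hashes = [digest(genesis_state)]
        self.block_hashes = []
        self.operator_deposit = deposit

    def commit(self, block: list, claimed_state: dict) -> None:
        # Commitments are accepted unchecked; that is what makes them cheap.
        self.block_hashes.append(digest(block))
        self.state_hashes.append(digest(claimed_state))

    def prove_fraud(self, number: int, prev_state: dict, block: list) -> bool:
        """Show that block `number` does not follow from the state before it."""
        if digest(prev_state) != self.state_hashes[number]:
            return False  # the proof does not match what was committed
        if digest(block) != self.block_hashes[number]:
            return False
        try:
            result = digest(apply_block(prev_state, block))
            honest = result == self.state_hashes[number + 1]
        except ValueError:
            honest = False
        if honest:
            return False
        del self.block_hashes[number:]       # roll back to the previous block
        del self.state_hashes[number + 1:]
        self.operator_deposit = 0            # the block producer is penalized
        return True


genesis = {"card-7": "alice"}
root = FraudCheckingRoot(genesis, deposit=100)

# The operator commits a block that hands Alice's card to itself.
bad_block = [{"asset": "card-7", "sender": "operator", "to": "operator"}]
root.commit(bad_block, {"card-7": "operator"})

proved = root.prove_fraud(0, genesis, bad_block)  # True in this scenario
```

Note that the proof needs both the previous state and the disputed block, which is exactly why withheld data is a problem for this mechanism.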

But submitting a fraud proof requires access to the underlying data, that is, the actual blocks of history needed to prove the fraud. What if the block producer withholds information about previous blocks precisely to stop Alice from submitting a fraud proof to the root contract?

In that case, the solution is for Alice to withdraw her assets and leave the subchain. Essentially, Alice submits a “proof of funds” to the root contract. After a delay during which anyone can challenge her proof (for example, by showing a later valid block in which she actually spent those assets), Alice’s assets are moved back to the main Ethereum chain.
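
The withdrawal flow can be modeled as a small queue with a challenge period. This Python sketch is only an illustration under stated assumptions: the `ExitQueue` class, its method names, and the 100-block period are hypothetical, and the arguments are taken on trust where a real contract would check them against the committed subchain blocks.

```python
from dataclasses import dataclass


@dataclass
class Exit:
    owner: str
    asset: str
    proof_block: int  # subchain block showing the owner holds the asset
    deadline: int     # main-chain block after which the exit can finalize
    challenged: bool = False


class ExitQueue:
    """Toy model of withdrawals from a subchain back to the main chain."""

    CHALLENGE_PERIOD = 100  # hypothetical length, in main-chain blocks

    def __init__(self):
        self.exits = {}

    def start_exit(self, owner: str, asset: str, proof_block: int, now: int):
        deadline = now + self.CHALLENGE_PERIOD
        self.exits[asset] = Exit(owner, asset, proof_block, deadline)

    def challenge(self, asset: str, spend_block: int, spender: str, now: int) -> bool:
        """Cancel an exit by showing the owner spent the asset in a later block."""
        e = self.exits.get(asset)
        if e is None or now > e.deadline:
            return False
        if spender == e.owner and spend_block > e.proof_block:
            e.challenged = True
            return True
        return False

    def finalize(self, asset: str, now: int):
        """Release the asset on the main chain once the period has passed."""
        e = self.exits.get(asset)
        if e is None or e.challenged or now <= e.deadline:
            return None
        del self.exits[asset]
        return e.owner


queue = ExitQueue()
queue.start_exit("alice", "card-7", proof_block=12, now=1000)
too_early = queue.finalize("card-7", now=1050)  # None: still challengeable
owner = queue.finalize("card-7", now=1101)      # "alice"
```

The delay is the price of not trusting the block producer: nothing leaves the subchain until everyone has had a chance to object.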

Finally, block producers can censor users of the subchain. If a block producer wishes, they can simply never include a user’s transactions in their blocks, effectively preventing that user from doing anything on the subchain. As above, the solution is once again to withdraw all of our assets back to the Ethereum main chain.

But withdrawing assets carries risks of its own. One concern is what would happen if all the users of a subchain tried to withdraw at the same time. In such a mass withdrawal, there may not be enough capacity on the Ethereum main chain to process everyone’s transactions within the challenge period, meaning users could lose funds. There are many possible techniques for preventing this, for example, extending the challenge period in response to the demand for withdrawals.

It is important to note that block producers do not all have to be controlled by one entity; that is simply the extreme case we used in our example. We can create subchains whose block producers are spread across different entities, genuinely decentralized in the way public blockchains are. In these cases, there is less risk that block producers will interfere in the ways described above, and less risk that users will have to move their assets back to the Ethereum main chain.

Now that we’ve introduced state channels and Plasma, a few points of comparison are worth making.

One difference is that a state channel can be closed immediately when all of its participants agree to withdraw. If Alice and Bob agree to close the channel and withdraw their funds, then as long as they agree on the final state, they can take out their assets immediately. This is not possible on Plasma, where, as mentioned above, every withdrawal must include a challenge period.

State channels are also cheaper per transaction and faster than Plasma. This means we may well see state channels set up on top of Plasma subchains. For example, two users of an application might be making a series of small transactions. Opening a state channel on the subchain should be cheaper and faster than executing each transaction directly on the subchain.

Finally, it’s important to note that this explanation leaves out a great deal of detail. Plasma itself is at a very early stage. If you’re interested in seeing where Plasma stands right now, check out Vitalik’s recent proposal for a “Minimal Viable Plasma”. A team in Taiwan is working on an implementation, which can be viewed in this branch. OmiseGO is working on an implementation for their decentralized exchange, and they post recent updates here.

III. Truebit

Truebit is a technology that helps Ethereum perform heavy or complex computation off-chain. This makes it different from state channels and Plasma, which are more useful for increasing the total transaction throughput of the Ethereum blockchain. As we discussed in the opening section, scaling is a multifaceted challenge that requires more than just higher transaction throughput. Truebit won’t let us do more transactions, but it will allow Ethereum-based applications to carry out more complex computations that can still be verified by the main chain.

This lets us perform operations that are useful to Ethereum applications but too computationally expensive to run on-chain. One example is verifying Simplified Payment Verification (SPV) proofs from other blockchains, by which an Ethereum smart contract can “check” whether a transaction occurred on another chain (such as Bitcoin or Dogecoin).

Let’s look at an example. Imagine that you have some expensive computation (such as verifying an SPV proof) that needs to be performed as part of an Ethereum application. Because the computation is so expensive, you can’t simply make it part of a smart contract on the main Ethereum chain. Keep in mind that performing any computation on Ethereum is costly because every node must perform it in parallel. Each block in Ethereum has a gas limit, which caps the total amount of computation that all the transactions in the block can perform. Verifying an SPV proof requires so much computation that, even if it were the only transaction in a block, it would still need many times the gas limit of a single block.

Instead, you can pay someone a small fee to do the computation off-chain. The person you pay to perform the computation is called a solver.

First, the solver pays a deposit into a smart contract. Then you give the solver a description of the computation, they run it, and they return the result. If the result is correct (more on how we determine that in a moment), their deposit is returned. If it turns out that the solver did not perform the computation correctly (for example, they cheated or made a mistake), they lose the deposit.

But how can we tell whether the result is correct? Truebit uses an economic mechanism called a “verification game.” Essentially, we create an incentive for other parties, called challengers, to check the solver’s work. If a challenger can prove through the verification game that the solver submitted a wrong result, the challenger collects a reward and the solver loses their deposit.

Since the verification game is played on-chain, it cannot simply recompute the result (that would defeat the purpose of the whole system, since if we could perform the computation on-chain, we wouldn’t need Truebit). Instead, we require the solver and the challenger to identify the specific operation on which they disagree. In effect, we back both sides into a corner until they find the exact line of code that causes them to disagree about the result.

Truebit’s simplified concept map.

Once the specific operation is determined, it is small enough to be performed by the main ethereum chain. We then perform this action through ethereum smart contracts, which resolve once and for all which side is telling the truth and which is lying or wrong.
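
The narrowing-down process described above can be sketched as a binary search over the two parties’ execution traces. The Python below is a simplified illustration, not Truebit’s actual protocol: the `step` function stands in for one small operation of a much larger computation, and `run`, `find_disputed_step`, and `judge` are hypothetical names. In the real system the parties would exchange hashes of intermediate states over several rounds instead of handing over full traces.

```python
def step(x: int) -> int:
    """One small operation, cheap enough to execute on-chain."""
    return (3 * x + 1) % 1_000_003


def run(x: int, n: int, faulty_at=None) -> list:
    """Return the trace of n steps, optionally corrupting one of them."""
    trace = [x]
    for i in range(n):
        x = step(x)
        if i == faulty_at:
            x += 1  # a simulated mistake (or deliberate cheat)
        trace.append(x)
    return trace


def find_disputed_step(solver: list, challenger: list) -> int:
    """Binary search for i where the parties agree at i but not at i + 1."""
    lo, hi = 0, len(solver) - 1  # invariant: agree at lo, disagree at hi
    assert solver[lo] == challenger[lo] and solver[hi] != challenger[hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if solver[mid] == challenger[mid]:
            lo = mid
        else:
            hi = mid
    return lo


def judge(solver: list, challenger: list) -> str:
    """Settle the dispute by executing only the single disputed step."""
    i = find_disputed_step(solver, challenger)
    correct = step(solver[i])  # the only computation done "on-chain"
    if solver[i + 1] == correct:
        return "solver"
    if challenger[i + 1] == correct:
        return "challenger"
    return "neither"


honest = run(7, 1000)
cheating = run(7, 1000, faulty_at=613)

# A cheating solver is challenged by an honest challenger.
winner = judge(cheating, honest)  # "challenger"
```

For a computation of 1000 steps, the search takes about ten rounds, and the main chain executes a single step regardless of how long the original computation was.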

If you want to learn more about Truebit, you can check out this document, or this blog post by Simon de la Rouviere.

Conclusion

Layer 2 solutions share a common insight. Once we have the solid core provided by a public blockchain, we can use it as an anchor for cryptoeconomic systems that extend the usefulness of blockchain applications.

Now that we’ve looked at some examples, we can be more specific about how Layer 2 solutions apply this insight. The economic mechanisms used in Layer 2 solutions are usually interactive games: they create incentives for parties to compete with each other or to “check up” on each other’s work. Because another party has a strong incentive to prove a claim false, a blockchain application can assume that a given claim is probably true.

In state channels, this is how we establish the final state of the channel, by giving each party an opportunity to “refute” the other. In Plasma, it is how we manage fraud proofs and withdrawals. In Truebit, the challenger is incentivized to prove that the solver is wrong, which ensures that the solver gives the correct result.

These systems will help address several of the challenges involved in scaling Ethereum to a global user base. Some, such as state channels and Plasma, will increase the transaction throughput of the platform. Others, like Truebit, will make it possible to perform more complex computations as part of smart contracts, opening up new use cases.

These three examples represent only a small part of the design space for cryptoeconomic scaling solutions. We haven’t even talked about “inter-blockchain protocols” like Cosmos or Polkadot (though whether those count as “Layer 2” solutions is a question for another blog post). We should also expect to see new and unexpected Layer 2 systems that improve on existing models or make new trade-offs between speed, finality, and overhead.

More important than any particular Layer 2 solution is the further development of the underlying techniques and mechanisms that make these cryptoeconomic designs possible.

These Layer 2 scaling solutions are strong evidence of the long-term value of programmable blockchains like Ethereum. The economic mechanisms behind Layer 2 solutions can only be built on a programmable blockchain: you need a scripting language in which to write the programs that adjudicate the interactive games. Because blockchains like Bitcoin offer only limited scripting capabilities, this is much harder for them (or in some cases, like Plasma, impossible).

Layer 2 solutions on Ethereum let us make new trade-offs between speed, finality, and overhead. This makes the underlying blockchain useful for a much wider variety of applications, because different types of applications facing different threat models will naturally prefer different trade-offs. For very high-value transactions that must remain secure even against nation-state-level adversaries, we use the main chain. For trading digital assets where speed matters more, we can use Plasma. Layer 2 lets us make these trade-offs without compromising the underlying blockchain, which keeps its decentralization and finality.

Moreover, it is difficult to predict in advance what scripting capabilities a given scaling solution will require. When Ethereum was designed, Plasma and Truebit had not yet been invented. But because Ethereum is fully programmable, it can implement virtually any economic mechanism we can invent.

The value of blockchain technology rests on the solid core of cryptoeconomic consensus, and programmable blockchains like Ethereum are the only way to take full advantage of that value.

Thanks to Vitalik Buterin, Jon Choi, Matt Condon, Chris Dixon, Hudson Jameson, Denis Nazarov and Jesse Walden for their comments on the first draft of this article.

