Zhu Chongwen

Technology Manager of Blockchain Group of Kaiying Network

Go version of Ethereum

Why DPOS

Practice of extending the consensus mechanism

Practice of smart contracts

Problems exposed under pressure tests

 

1. Go version of Ethereum

1.1 Ethereum client

The first topic is the client implementations in the Ethereum community. Ethereum itself is a protocol: it defines interfaces, specifications, parameters, and the internal logic and processing flow. Based on this protocol, an Ethereum node can be implemented in any language. The official Ethereum team implemented its client in Go. Parity's implementation is the second largest client.

1.2 Ethereum toolset

Core components include Solidity and web3.js; for most developers these are the two that matter most. Swarm is an external storage implementation intended to solve Ethereum's own difficulty with storing large data, and IPFS is a similar implementation.

1.3 Ethereum Corporate network topology



Figure 3

Figure 3 is a topology of the Ethereum network. It consists of a variety of clients that form a network among themselves. As you can see from Figure 3, the entire Ethereum public chain is very open.

2. Why DPOS mechanism

2.1 Comparison of consensus mechanisms

DPOS is a consensus mechanism that originated in Graphene technology. Here is a quick comparison with the POW mechanism used by the official Ethereum community. POW spends computing power to produce blocks, so both block production and confirmation are slow. DPOS is a delegate model. Our current DPOS implementation reaches about 1000 TPS, with an average confirmation time of 1-3 seconds.

2.2 Advantages of DPOS mechanism

The DPOS mechanism we implemented has two main advantages. The first is system reliability: in commercial scenarios network performance has to be controllable, exceptions have to be handled and recovered from quickly, and there are firm requirements on TPS/QPS and confirmation time. The second is that the consensus mechanism is also a governance idea. Built on a public chain, it can be open to the outside world so that anyone can participate. It defines the roles of council and witness: the council manages the blockchain network, witnesses produce and verify blocks, and together they form a healthy ecosystem.

2.3 Concept of DPOS mechanism

Figure 4.

Figure 4 shows that we model a company and a community as three roles. Ordinary community members elect the council with their voting rights. The council appoints witnesses according to its own judgment. Witnesses maintain the whole chain according to actual conditions. This is the concept of the DPOS mechanism.

3. Practice of extending the consensus mechanism

3.1 Consensus framework engine

3.1.1 Transform the logic of consensus layer

Figure 5

Figure 5 shows where the DPOS mechanism is implemented. Ethereum's interface services handle client calls, and network communication provides the P2P services. In the middle is the consensus layer, which can be extended whether the consensus is mining, DPOS, or something else. At the bottom is the storage layer, which stores blocks and state.

3.1.2 Official implementation engine: Ethash/Clique

Figure 6.

Figure 6 shows the consensus layer through which we implement DPOS. The code in the figure is taken directly from the Ethereum source: the Engine interface, which provides a number of methods, many of them with names that begin with Verify. Verification is important for Ethereum as a whole, because a blockchain is immutable and all kinds of anomalies have to be checked before data is accepted. As the text at the top of the figure says, there are two official engines. The first, Ethash, is the engine used on the public chain and implements the POW mechanism. The second is Clique, and our DPOS modification is largely inspired by the Clique engine.

3.1.3 Seal core method call



Figure 7.

Figure 7 is a bit complicated, but it helps explain the internal logic of Ethereum. First, the agent and the worker each listen on their own channels. Look at the work channel first: something happens on it whenever new work appears, where a work item is an instruction to do work. On receiving from the work channel, the node checks whether it is able to mine and package a block. It then calls the engine's Seal method, which solves the puzzle. Seal produces a Result object, which triggers the following steps: once a block is generated, a new work item is placed on the channel. As long as the node keeps producing blocks, this process repeats, and it forms the core mining loop inside an Ethereum miner node. The work channel also has a second trigger, which depends on external nodes: when another node generates a block, it broadcasts the block to our node, which processes it and notifies the channel that there is new work to do. Seal is therefore one of the most important methods in the Engine interface, and by modifying it we can replace puzzle solving with block production that needs no puzzle.

3.2 Reference Clique (POA) implementation



Figure 8.

Another consensus mechanism is Clique. The Ethereum public chain uses POW and relies on its large number of nodes to resist malicious attacks. Ethereum also provides test chains, but running a test chain on POW exposed a problem: few people help maintain a test chain, so an attacker with a handful of machines and a modest amount of computing power can easily break it. To make the test chain more stable, another consensus mechanism was developed to replace POW, in which only authorized machines can produce blocks. The maintenance cost is very low: you find some machines and authorize them to maintain the network. Across the network, authorized signers issue blocks, and nodes vote to grant or revoke authorization. This additional authorization information is recorded in the extra data field of the block header.

The diagram on the right of Figure 8 shows the block generation logic of the test chain. With three nodes A, B, and C, the normal order runs from A to B and from B to C. The figure also shows the concept of competing blocks: when it is C's turn, not only C but also A can produce a block. Why? If node C fails to produce its block, the block from node A can join the chain instead. To tell the two apart, a block produced by C in its own turn carries a difficulty field of 2, whereas A can produce a block but is out of turn, so its difficulty is 1. As far as possible, the block generated by node C is the one most nodes accept. This concept is very common in Ethereum.

Figure 9.

Two nodes can therefore compete to produce a block. We want the block from node C to be accepted whenever possible, so node A, which is out of turn, delays its block slightly: even though it could produce a block right away, it is held back to avoid creating too many forks. The difficulty value can decide which chain wins, but too many forks still make the chain state unstable.
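The in-turn and out-of-turn rules can be sketched in a few lines of Go. This is a simplified illustration, not the Clique source: the function names are hypothetical, the turn is assumed to be the block number modulo the number of signers, and the out-of-turn delay is a fixed value passed in by the caller.

```go
package main

import (
	"fmt"
	"time"
)

const (
	diffInTurn = 2 // difficulty of a block sealed by the node whose turn it is
	diffNoTurn = 1 // difficulty of a block sealed by any other authorized node
)

// inTurn reports whether the signer at position idx in the signer list
// is the expected producer of the block with the given number.
// Assumption: turns rotate by block number modulo the signer count.
func inTurn(number uint64, signerCount, idx int) bool {
	return number%uint64(signerCount) == uint64(idx)
}

// difficulty returns the value a signer writes into the difficulty field.
func difficulty(number uint64, signerCount, idx int) int {
	if inTurn(number, signerCount, idx) {
		return diffInTurn
	}
	return diffNoTurn
}

// sealDelay returns how long a signer waits before releasing its block.
// The in-turn signer does not wait; every other signer is held back by wiggle.
func sealDelay(number uint64, signerCount, idx int, wiggle time.Duration) time.Duration {
	if inTurn(number, signerCount, idx) {
		return 0
	}
	return wiggle
}

func main() {
	signers := []string{"A", "B", "C"}
	const number = 2 // with three signers, block 2 is C's turn
	for idx, name := range signers {
		fmt.Printf("%s: difficulty=%d delay=%v\n", name,
			difficulty(number, len(signers), idx),
			sealDelay(number, len(signers), idx, 500*time.Millisecond))
	}
}
```

Because the in-turn block is both heavier and earlier, most nodes see and prefer it, and the out-of-turn block only matters when the in-turn signer is offline.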

3.3 Extend the block header structure

3.3.1 Adding a witness list

Figure 10.

After borrowing Clique's ideas, we considered how to extend them into our DPOS mechanism. WitnessVotes is a field we defined ourselves to carry node authorization for DPOS. It lists the addresses of the witnesses, sorted lexicographically, which avoids problems when the block production rotation changes. Extending fields in Ethereum is easy: within the existing framework we only need to add a little code to extend the Header.
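A minimal sketch of such a header extension follows. The Header type here is a trimmed-down stand-in rather than the go-ethereum struct, and the Address type, the Witnesses field name, and the sortedWitnesses helper are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Address is a stand-in for an account address (a hex string for brevity).
type Address string

// Header is a trimmed-down block header. Witnesses is the extension field.
type Header struct {
	Number    uint64
	Time      uint64
	Coinbase  Address   // block producer
	Witnesses []Address // authorized witnesses in lexicographic order
}

// sortedWitnesses returns a lexicographically sorted copy of the input,
// so every node serializes the same list in the same order.
func sortedWitnesses(in []Address) []Address {
	out := append([]Address(nil), in...)
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	h := Header{
		Number:    1,
		Coinbase:  "0xb2",
		Witnesses: sortedWitnesses([]Address{"0xc3", "0xa1", "0xb2"}),
	}
	fmt.Println(h.Witnesses)
}
```

A fixed ordering matters because the position of a witness in the list is what the rotation is computed from.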

3.3.2 Rules for witness list generation

Figure 11.

Figure 11 shows the witness list. Under Ethereum's verification rules, two blocks are not interchangeable if their producers differ, even when the block data is the same. We therefore need to agree on a scheme in which the witness list depends on the parent block. If the currently accepted height on the chain is block N and we want to generate block N+1, how do we populate its witness list? We take it from block N: whoever is on block N's witness list is added to the next block's witness list. For the genesis block, we write the witness list directly into the block through the genesis configuration file.
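The parent-dependence rule can be sketched as a pair of helpers: one that fills in the list for block N+1 and one that verifies it. The types and function names are hypothetical, and the sketch leaves out how votes eventually change the list.

```go
package main

import (
	"errors"
	"fmt"
)

// Header carries only the fields this sketch needs.
type Header struct {
	Number    uint64
	Witnesses []string
}

var errWitnessMismatch = errors.New("witness list does not match parent")

// childWitnesses returns the list for block N+1: a copy of block N's list.
func childWitnesses(parent Header) []string {
	return append([]string(nil), parent.Witnesses...)
}

// verifyWitnesses rejects a header whose list differs from its parent's.
func verifyWitnesses(parent, header Header) error {
	if len(parent.Witnesses) != len(header.Witnesses) {
		return errWitnessMismatch
	}
	for i := range parent.Witnesses {
		if parent.Witnesses[i] != header.Witnesses[i] {
			return errWitnessMismatch
		}
	}
	return nil
}

func main() {
	// The genesis list would come from the genesis configuration file.
	genesis := Header{Number: 0, Witnesses: []string{"A", "B", "C"}}
	next := Header{Number: 1, Witnesses: childWitnesses(genesis)}
	fmt.Println(verifyWitnesses(genesis, next) == nil)
}
```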

3.4 Implementation of rotating producers

3.4.1 Determine whether the current round needs to produce blocks

Given these inputs, how does each node determine whether it may produce a block? Under POW this is decided by solving the puzzle. Here it is decided from the following conditions: the current timestamp, the producer of the current block, the witness list of the parent block, the producer of the parent block, the timestamp of the parent block, and the block period. Together they determine whether the current miner is qualified to produce a block. A node that does not qualify must not produce one.

3.4.2 Implementation of rotating producers — analysis

  • Scenario analysis

Suppose we have three nodes A, B and C, and the figure shows the data in their parent block: the parent's witness list is ABC, its timestamp is T, the block period is 3 seconds, and the current time is T+3. Each of the three nodes can then judge whether it is qualified to produce a block. The witness list stays ABC and the producer of the parent block is A, so the next producer is B. B may produce a block at T+3, while A and C may not. C has to wait until T+6 and A until T+9. In this way the whole network keeps moving forward.

  • Determine the current round – code implementation



In code, we compute a round value from these conditions and perform the round check: the parent's timestamp, the block-producing nodes, and the local timestamp together decide whether this node may produce the block.
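The A/B/C scenario can be reproduced with a small round check. This is a sketch under stated assumptions rather than the code from the talk: the function name expectedProducer is hypothetical, timestamps are plain seconds, and each elapsed period is assumed to advance the rotation by exactly one witness.

```go
package main

import "fmt"

// expectedProducer returns the witness whose slot covers time now, given the
// parent's witness list, producer and timestamp. ok is false when no full
// period has elapsed yet or the parent producer is not in the list.
func expectedProducer(witnesses []string, parentProducer string, parentTime, period, now uint64) (string, bool) {
	if len(witnesses) == 0 || period == 0 || now < parentTime+period {
		return "", false
	}
	parentIdx := -1
	for i, w := range witnesses {
		if w == parentProducer {
			parentIdx = i
			break
		}
	}
	if parentIdx < 0 {
		return "", false
	}
	slots := (now - parentTime) / period // whole periods since the parent block
	idx := (uint64(parentIdx) + slots) % uint64(len(witnesses))
	return witnesses[idx], true
}

func main() {
	witnesses := []string{"A", "B", "C"}
	const t, period = uint64(1000), uint64(3)
	for _, now := range []uint64{t + 3, t + 6, t + 9} {
		who, ok := expectedProducer(witnesses, "A", t, period, now)
		fmt.Println(now-t, who, ok)
	}
}
```

A node compares the returned witness with its own address: if they match it may seal, otherwise it waits.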

  • Register the next call

  • Why



Another problem arises when a node has not reached its own round, or should not produce a block yet. No Result object is generated, so the submission of new work is not triggered, and nothing else will trigger the channel later. The node then receives no messages and does nothing. This problem exists in Clique's test network, and we do not want it to happen. We want the network to always move forward: even if part of the network is faulty or abnormal, the remaining healthy nodes should keep producing blocks. To solve this we made some changes. We removed the notification mechanism driven by work items, so Seal is no longer triggered from those two places but by Seal itself. Each Seal registers the next Seal, which ensures that a node keeps checking its own turn indefinitely.

  • Scenario analysis

As described above, a node that checks its own turn has to register a timer for itself. Take node A again: it cannot produce a block now because the conditions are not met, so what delay should it give the timer? One option is to wake up exactly when its turn comes, at T+9, that is, to wait 9 or 6 seconds and then check whether it is qualified. In actual tests this scheme did not work particularly well, for two reasons.

First, the witness list can change. If it changes before the node wakes up, the rotation changes too, so the node may miss its opportunity to produce a block, which makes the block period of the whole network less stable. Second, a long wait lowers how often the node checks. Once the network becomes unstable, the node may end up on a side fork, where every node is producing blocks that the others do not accept. We want nodes to resynchronize within as short an interval as possible, but not so short that it hurts block production performance. We therefore define two values: the period, which is 3 seconds, and a minimum interval. Before a node outputs a block it has to package and verify, so we deduct that time from the period. When packaging takes too long, the minimum interval applies and helps the next block come out quickly. If the result is less than 0, because earlier block production was slow or because the node should already have produced a block but did not, the node produces a block immediately.

  • Code implementation

Each Seal call registers the method to be called next time. This method is called registerNext.
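A self-rescheduling Seal can be sketched with time.AfterFunc. Only the name registerNext comes from the talk. The Sealer type, the nextDelay helper, and the exact clamping rule are assumptions: this is one possible reading of the "period minus time spent, with a minimum, and immediately if negative" description above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// nextDelay returns how long to wait before the next seal attempt.
// period is the block interval, elapsed the time already spent packaging and
// verifying, and minWait the floor applied to a small non-negative remainder.
func nextDelay(period, elapsed, minWait time.Duration) time.Duration {
	remaining := period - elapsed
	switch {
	case remaining < 0:
		return 0 // already late: try to produce immediately
	case remaining < minWait:
		return minWait
	default:
		return remaining
	}
}

// Sealer re-arms its own timer after every attempt, so it never depends on
// an external notification to wake up.
type Sealer struct {
	mu       sync.Mutex
	attempts int
	limit    int // stop after this many attempts (demo only)
	period   time.Duration
	minWait  time.Duration
	done     chan struct{}
}

// seal makes one attempt and then registers the next one.
func (s *Sealer) seal() {
	start := time.Now()
	s.mu.Lock()
	s.attempts++ // the round check and block production would go here
	finished := s.attempts >= s.limit
	s.mu.Unlock()
	if finished {
		close(s.done)
		return
	}
	s.registerNext(time.Since(start))
}

// registerNext schedules the next call to seal.
func (s *Sealer) registerNext(elapsed time.Duration) {
	time.AfterFunc(nextDelay(s.period, elapsed, s.minWait), s.seal)
}

func main() {
	s := &Sealer{limit: 3, period: 30 * time.Millisecond, minWait: 5 * time.Millisecond, done: make(chan struct{})}
	s.seal()
	<-s.done
	fmt.Println("attempts:", s.attempts)
}
```

The important property is that seal always ends by arming the next timer, whether or not a block was produced in this attempt.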

3.4.3 Custom Reward rules

We also changed the rewards so that the distribution rules can adapt to different business scenarios. First, the reward rule is fixed in the genesis block as the default reward rule, and rewards are issued according to it after a node finishes producing a block. Concretely, we only need to set up an interface and define a method. For example, if the block producer should receive five coins, then after each block is issued and verified by the other nodes, the producer's account balance in the state is increased by five coins. We factored this code out and then extended the reward allocation rules: the reward can depend on the block time, be withheld entirely, be paid in some other way, or be put into a pool.
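The "one interface, one method" idea can be sketched as follows. All names here (RewardRule, FixedReward, PoolReward, NoReward, Balances) are hypothetical, and a plain map stands in for the real account state.

```go
package main

import "fmt"

// Balances maps an account to its balance; a stand-in for the real state.
type Balances map[string]uint64

// RewardRule decides what happens to the block reward.
type RewardRule interface {
	Reward(state Balances, producer string)
}

// FixedReward credits the producer with a constant amount per block.
type FixedReward struct{ Amount uint64 }

func (r FixedReward) Reward(state Balances, producer string) {
	state[producer] += r.Amount
}

// PoolReward credits a shared pool account instead of the producer.
type PoolReward struct {
	Amount uint64
	Pool   string
}

func (r PoolReward) Reward(state Balances, _ string) {
	state[r.Pool] += r.Amount
}

// NoReward leaves all balances untouched.
type NoReward struct{}

func (NoReward) Reward(Balances, string) {}

func main() {
	state := Balances{}
	rules := []RewardRule{FixedReward{Amount: 5}, PoolReward{Amount: 5, Pool: "pool"}, NoReward{}}
	for _, rule := range rules {
		rule.Reward(state, "A")
	}
	fmt.Println(state["A"], state["pool"])
}
```

With this shape, the rule chosen in the genesis configuration only has to select which implementation the consensus engine calls after a block is finalized.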

4. Fulcrum practice of smart contracts

4.1 Contract language Solidity

Now let's talk about how we use smart contracts. Our DPOS mechanism is bound to smart contracts to a large extent. How voters vote and how council members elect witnesses are both handled by smart contracts, which ties the contracts directly to consensus.

4.2 Implementation of voting contract

4.2.1 The council

The idea behind council voting is that ordinary users can stake the voting rights they hold to vote for council candidates. The council is the Top N by votes, and it changes automatically. Some current DPOS mechanisms use a transition period, for example switching once a day. Ours is automatic: as soon as a candidate has enough votes, it becomes a council member.
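The Top N selection itself is simple. The sketch below shows only the ranking logic in Go, not the voting contract: the Candidate type and topN function are hypothetical, and breaking ties by address is an assumption added so that every node derives the same result.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a council candidate with its total staked votes.
type Candidate struct {
	Address string
	Votes   uint64
}

// topN returns the n candidates with the most staked votes.
// Ties are broken by address (an assumption, to keep the result deterministic).
func topN(candidates []Candidate, n int) []Candidate {
	ranked := append([]Candidate(nil), candidates...)
	sort.Slice(ranked, func(i, j int) bool {
		if ranked[i].Votes != ranked[j].Votes {
			return ranked[i].Votes > ranked[j].Votes
		}
		return ranked[i].Address < ranked[j].Address
	})
	if n < 0 {
		n = 0
	}
	if n > len(ranked) {
		n = len(ranked)
	}
	return ranked[:n]
}

func main() {
	candidates := []Candidate{{"a", 10}, {"b", 30}, {"c", 20}, {"d", 30}}
	for _, c := range topN(candidates, 2) {
		fmt.Println(c.Address, c.Votes)
	}
}
```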

4.2.2 Witnesses

Witnesses are appointed by the council. Any council member may initiate a proposal to nominate or remove a witness, and the other council members vote for or against it. A proposal has a limited validity period and is cancelled automatically once the time window has passed. If more than half of the members approve, the proposal passes; if more than half reject it, it is void.
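The proposal life cycle can be summarized as a small state function. This Go sketch only mirrors the rules described above and is not the contract code: the Proposal type, the Status values, and the use of plain second timestamps are all hypothetical.

```go
package main

import "fmt"

// Status is the outcome of a witness proposal.
type Status int

const (
	Pending Status = iota
	Passed
	Rejected
	Expired
)

// Proposal tracks the votes cast by council members and the voting deadline.
type Proposal struct {
	Approvals  int
	Rejections int
	Deadline   uint64 // timestamp after which the proposal lapses
}

// Status evaluates the proposal for a council of the given size at time now.
// A strict majority decides; an undecided proposal lapses after its deadline.
func (p Proposal) Status(councilSize int, now uint64) Status {
	switch {
	case p.Approvals*2 > councilSize:
		return Passed
	case p.Rejections*2 > councilSize:
		return Rejected
	case now > p.Deadline:
		return Expired
	default:
		return Pending
	}
}

func main() {
	p := Proposal{Approvals: 3, Rejections: 1, Deadline: 100}
	fmt.Println(p.Status(5, 50) == Passed)
}
```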

4.3 Smart contract design mode

There is a lot of material about smart contract design. One pattern separates the business contract from a controller, because our business logic will change. Once a smart contract is on the blockchain it cannot be modified, so changing it means writing a new contract to replace it. If only the controller is exposed as the interface and it is separated from the business layer, you just need to point the controller at the address of the new business contract. A further idea concerns data. We might store data inside a business contract, but once a new business implementation replaces it, that data is no longer available: the two contracts may be similar, yet they do not share data. So we split the data out into its own contract. It does not need a complex design, and it can be used by the business layer above it.

4.4 Smart contracts are not smart, but have too many holes

Smart contracts have many restrictions, such as the number of parameters, the requirement that return values have a fixed length, and a cap on contract size. From the current perspective, smart contracts still need time to mature.

4.5 Smart contract design mode, using a single contract

Because of these restrictions, we do not use the ideal design pattern with a controller layer and a data storage layer. Instead we start from a single contract, or a single type of contract, that does all the work. We found that with the layers in place many things cannot be done. We would like to develop smart contracts the way we write ordinary code, defining return values and variable-length arrays, but that is not possible across contracts, so we ended up using a single contract.

5. Problems exposed under pressure test

5.1 The Ethereum public chain is not built for stress-test scenarios, so a lot of optimization and testing is required



5.2 Traffic control/retransmission mechanism

The simplest measure is flow control plus a retransmission mechanism. External business calls do not connect directly to a witness node but to a gateway node. The gateway runs the same code, and it applies a limit to all requests: beyond a certain point, requests are rejected.

5.2.1 Check whether the transaction request reaches the upper limit

 

In the actual implementation we throttle the interface calls, because we do not want them to affect the P2P network.
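A minimal sketch of such an upper-limit check is shown below. The Gateway type, its methods, and the choice to count transactions that are accepted but not yet packed are assumptions made for illustration; the talk does not say what exactly is counted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrTooManyRequests is returned when the gateway is at its limit.
var ErrTooManyRequests = errors.New("transaction request limit reached")

// Gateway counts transactions that were accepted but not yet packed and
// refuses new ones once the configured limit is reached.
type Gateway struct {
	mu      sync.Mutex
	pending int
	limit   int
}

// NewGateway creates a gateway that admits at most limit pending requests.
func NewGateway(limit int) *Gateway {
	return &Gateway{limit: limit}
}

// Submit admits one transaction or returns ErrTooManyRequests.
func (g *Gateway) Submit() error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.pending >= g.limit {
		return ErrTooManyRequests
	}
	g.pending++
	return nil
}

// Done frees one slot, for example after the transaction is included in a block.
func (g *Gateway) Done() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.pending > 0 {
		g.pending--
	}
}

func main() {
	g := NewGateway(2)
	for i := 0; i < 3; i++ {
		fmt.Println(i, g.Submit())
	}
}
```

Rejecting at the gateway keeps excess load away from the witness nodes, so the P2P layer only ever sees traffic it can handle.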

5.2.2 Retransmission mechanism to prevent p2p network transaction broadcast failure

 

This problem is unlikely when the network has many nodes, but it can occur when the network is small. We added a retransmission mechanism for the pending list in the program.
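One way to picture retransmission over a pending list is sketched below. The PendingTx type, the dueForRebroadcast function, and the retry interval are hypothetical, and the actual broadcast call is left out.

```go
package main

import "fmt"

// PendingTx records when a pending transaction was last broadcast (in seconds).
type PendingTx struct {
	Hash          string
	LastBroadcast uint64
}

// dueForRebroadcast returns the hashes of pending transactions whose last
// broadcast is at least retryAfter seconds old, and stamps them with now so
// they are not selected again until another interval has passed.
func dueForRebroadcast(pending []PendingTx, now, retryAfter uint64) []string {
	var due []string
	for i := range pending {
		if now >= pending[i].LastBroadcast+retryAfter {
			due = append(due, pending[i].Hash)
			pending[i].LastBroadcast = now
		}
	}
	return due
}

func main() {
	pending := []PendingTx{{"0x01", 100}, {"0x02", 108}}
	fmt.Println(dueForRebroadcast(pending, 110, 5))
	fmt.Println(dueForRebroadcast(pending, 113, 5))
}
```

A periodic task would call this function and hand the returned hashes to the P2P layer again; transactions drop out of the list once they are included in a block.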