This series is targeted at blockchain developers and CS majors

Faced with the media’s interpretation and adulation of blockchain-related technologies, many people are at a loss. Driven by the fear of Missing out (FOMO), investors and big companies are rushing to announce all in blockchain. Various moguls discussed the social, political, economic and even philosophical implications of blockchain technology. Humans have a natural sense of insecurity about what they don’t know and what they don’t know, and as a developer, I think the best way to overcome anxiety (and the speculation that comes with it) is to be as aware of the underlying principles and implementation as possible.

From a technical point of view, both Bitcoin and Ethereum, as well as EOS and IPFS that have not yet been officially launched, are highly experimental and have various limitations, which inevitably affect the development of upper applications. Block chain applications are mostly involved in important sectors such as finance, credit, so the deep understanding of the underlying principle is a basic requirement for chain blocks developers, rather than just 10 minutes following tutorials to deploy a smart contracts, especially under the condition of various technical not mature early, made a little attention will cause great losses.

The first article in this series looks at the evolution of blockchain technology architecture by focusing on one of the core differences between the cryptocurrency architecture represented by bitcoin (Blockchain 1.0) and the programmable distributed credit infrastructure represented by Ethereum (blockchain 2.0) — whether or not a Turing-complete language is supported.

How Bitcoin and Ethereum came to be: Vitalik Buterin(born 1994) is the undisputed god of the bitcoin and blockchain community. What many people may not know is that God V, an active member of the early Bitcoin community, initially proposed that Bitcoin need to develop a common scripting language to support feature-rich application development, but did not gain the support of the Bitcoin development team. The ethereum project was launched in 2013, leading to today’s flourishing cryptographic tokens, collectible games, and DAOs. Let’s take a look at what kind of bitcoin script system V God is dissatisfied with?

Part I: Bitcoin scripting engine

trading

Transaction has a very broad meaning in the blockchain world. In the application of cryptocurrency, it can be narrowly understood as the transfer of bitcoin quota between different addresses, namely transfer. Money transfers have a long history, but the technology is evolving all the time. Understanding the Bitcoin transfer model is especially important because the Bitcoin script engine is built on top of it.

There are two ways to transfer money

1. Simplified traditional central transfer: Alice (Account A) transfers X yuan to Bob (account B), and the bank needs the atomized operation balance[A]-=x,balance[B]+= X. Of course, the implicit condition is that Alice completes the authentication of account A.

2. A way to explain how Bitcoin works: Each node in the network maintains a separate database to record the balance of each address. If Alice(addressA owner) wants to transfer X dollars to Bob(addressB owner), she will broadcast to the network that “addressA gives X units to addressB”. Bring pubkeyA and sign with privatekeyA. Balance [addressA]-=x,balance[addressB]+=x. (Note: The actual address is generated by Pubkey, which is simply omitted here).

1 above occupies the mainstream in reality and has mature expansion plans, but centralization inevitably brings problems such as cost and platform evil. The description of 2 comes from B-Money, an Anonymous and Distributed Electronic Cash System (this article is very important and has a profound influence on Satoshi Nakamoto’s design of Bitcoin), but he could not practice it at that time. It is difficult to maintain consistency because of the heavy reliance on a synchronous, undisturbed network environment. Furthermore, the existing consistency algorithms paxos, RAFT (non-Byzantine) and PBFT (Byzantine) cannot support the tens of thousands of nodes in bitcoin.

The design of bitcoin transaction model

The earliest Bitcoin transaction model comes from Satoshi Nakamoto’s Bitcoin: A peer-to-peer Electronic Cash System. In fact, Satoshi Nakamoto proposed two kinds of chains. The so-called chain of blocks is the explicit way of organizing data, and the other implicit one is that the chain of transactions is the flow chain of bitcoin value.

As shown in figure, the earliest transaction description model:

If Alice(addressA owner) wants to transfer X dollars to Bob(addressB owner), she also needs to sign the transaction and broadcast it over the network. The difference is that the addressA balance is not stored in the database of each node, but in the unspent transaction output (UTXO) that others transfer to addressA. We query the addressA balance, and what we actually get is the sum of all UTXOs credits that are addressed to adressA. “AddressA (combining combining xo1… UTXO3) gives X units to addressB”, take pubkeyA, sign with privatekeyA. After the transaction is confirmed in the network, Bob will have one more UTXO available. If he wants to spend the money, he needs to prove that he owns addressB’s privatekeyB, and Bob also signs with the privatekey. The transaction becomes a chain of signatures.

Obviously there are three problems:

1. If the input of any transaction requires the output of some previous transaction, where did bitcoin come from in the first place? So in the bitcoin trade, there’s something called Coinbase, which is known as a reward for mining. Bitcoin is created through the mining algorithm, where the input comes from the system reward. It actually verifies whether Coinbase is “mature”, that is, whether the block is sufficiently validated. In Bitcoin, if it doesn’t end up in the longest chain, it’s discarded as a orphan block, and the reward is voided.

2. Is it necessary to trace the entire blockchain to determine if a transaction output is UTXO? No, because transactions are organized according to the Merkel Tree structure, which makes it inefficient to query a transaction from the entire blockchain database. UTXOs is stored specifically in LevelDB’s database ChainState and cached in memory. Every time a new block is generated, the UTXOs set is updated; When chain reconstruction occurs on a node, the process is rolled back. Note that the UTXOs set is not a TxMemPool, but an input source for all pending transactions. UTXOs can also theoretically be rebuilt from the entire blockchain via –reindex.

3.Alice’s account balance comes from four UTXOs, which are 0.05, 0.2, 0.2 and 0.3 respectively. Now we need to transfer 0.6 to Bob.



Analysis of the

Note: Ethereum abandons the UTXOs model in favor of an account paradigm similar to BMoney. The specific reason will be analyzed in the design of ethereum virtual machine.

Do so much foreshadowing, finally can enter the bitcoin script design.

Script opcodes

Bitcoin transactions are handled by a Script engine. Here’s a quote from the Bitcoin-core source interpreter. CPP:

/**
 * Script is a stack machine (like Forth) that evaluates a predicate
 * returning a bool indicating valid or not.  There are no loops.
 */
Copy the code

Script is a Forth, stack-based, stateless, non-Turing-complete language. Opcodes are divided into constant, process control, stack operation, arithmetic operation, bit operation, cryptographic operation, reserved word and other categories, and also include three internal use of pseudo-instructions. Here are a few instructions that will appear in later scripts, all of which can be found in official documentation and source code.

  • OP_0 … OP_16: Pushes literal values onto the stack.
  • OP_DUP: Copies the top element on the stack and pushes it onto the stack.
  • OP_ADD: pops top and subtop elements, adds them and pushes them onto the stack.
  • OP_EQUAL: Pops the top and second top elements of the stack, compares whether they are equal, pushes 1 to the stack if they are equal, pushes 0 to the stack otherwise.
  • OP_SHA256: pops the top element of the stack, performs sha-256 encryption operation, and pushes the result to the stack.
  • OP_HASH160: pops the top element of the stack, sha-256 encryption is performed, and ripemd160 digest is performed, and the result is pushed onto the stack. It is worth noting that this is part of the process of generating an Address based on a public key.
  • OP_CHECKSIG: pops the top and subtop elements of the stack, in this case sig and pubkey respectively; There is an internal VerifySignature function that verifies whether the signature matches the public key.
  • OP_CHECKMULSIG: pushes m signatures and N public keys into the stack to check whether the signatures correspond to a subset of the n public keys one by one.

Pay-to-PubkeyHash(P2PKH)

The example Alice reproduced to Bob above is a typical P2PKH. In the paper, Satoshi nakamoto just gives the transaction model, let’s look at a more specific implementation.

As shown in the figure above, before Alice transfers money to Bob, Bob needs to provide a receipt address of her own, but the actual P2PKH uses a Public Key Hash. The private key generates a public key unidirectionally, and the public key generates a 160-bit PKH(public key hash) through the OP_HASH160 instruction. The PKH can be converted to a more readable address for users, but the encoding and verification processes are bidirectional. So providing an address is equivalent to providing PKH.

lock
unlock

Here are the two scripts.

Lock script: scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG Unlock script: scriptSig: <sig> <pubKey>Copy the code

Those included above between <> are the numbers to be pushed on the stack, and the push instruction defaults. When executed, the scriptSig and scriptPubkey are connected to run the scripts from left to right.

Verification process: <sig> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIGCopy the code

The situation on the stack is shown in the following two figures. Those of you who have a foundation in assembly will be familiar with how stack computer models work.

Generation rewards miners

Rewarding miners can be seen as a simplified P2PKH, with the difference that the input of the transaction comes from Coinbase rather than a UTXO.

Pay-to-Script-Hash(P2SH)

P2PKH design is relatively simple, the recipient Bob directly provides the payment address. There are many conditions involved in the actual circulation of value. In order to satisfy more complex functions, OP_EVAL instruction was introduced in BIP12 (eval means that the language has meta-programming capability in programming language design), and a more perfect trading standard P2SH was later proposed by BIP16.

Bob, the receiver, needs to first design a RedeemScript — a withdrawal script, which is regenerated into the Hash of the script and provided to Alice.

If Bob wants to spend the UTXO, he needs to provide the signature and RedeemScript, execute the content of RedeemScript after successful verification, and unlock the UTXO successfully after satisfying the conditions.

The following redeemScript is designed for specific scenarios. The corresponding examples are given in combination with the application of smart contract.

Lock script: Pubkey script: OP_HASH160 <Hash160(redeemScript)> OP_EQUAL Unlock script: Signature script: <sig> [sig] [sig...]  <redeemScript>Copy the code

In the P2SH transaction, Alice doesn’t actually know the details of the transaction because Bob provides a scripted Hash, and Bob needs to design the details of the transaction. This is called *” Moving the responsibility for supplying the conditions to redeem a transaction from the sender of the funds to The redeemer. They allow the sender to fund an arbitrary transaction, no matter how complicated, using a 20-byte hash. * This is not uncontroversial by design, but within bitcoin’s technical framework, it is a way to support more features with minimal modification.

Smart contracts for Bitcoin

When it comes to smart contracts, though, ethereum is more commonly thought of. But, as mentioned earlier, technology has evolved in a single line. As early as 1997, Nick Szabo proposed the concept of smart contract in his seminal paper Formalizing and Securing Relationships on Public Networks. Bitcoin’s scripting system supports the development of limited smart contracts, mainly through P2SH transactions.

MultiSig Multiple signature

BIP11 proposed m-of-N multiple signature transaction. The unlocking condition of a transaction is predetermined M signature authentication in N specified pubkeys (M<=N). P2PKH can be viewed as a 1-of-1 signature. Multiple signatures are useful in scenarios such as increased security and managed transactions. Therefore, the instruction of OP_CHECKMULTISIG is specially implemented in bitcoin. This can be done through the following script design.

<m> <A Pubkey > [B Pubkey] [C Pubkey...] <n> OP_CHECKMULTISIG: Signature script: OP_0 <A sig> [B sig] [C sig...Copy the code

If you use P2SH transactions, you can also design the following script.

Lock script: Pubkey script: OP_HASH160 <Hash160(redeemScript)> OP_EQUAL Lock script: Signature script: OP_0 <A sig> <C sig> <redeemScript> Where: Redeem script: <OP_2> <A pubkey> <B pubkey> <C pubkey> <OP_3> OP_CHECKMULTISIGCopy the code

Gavin Andresen has written an example of the use of a 2-of-3 multi-signed transaction, which is so detailed that I won’t carry it.

Part II: Ethereum Virtual Machine

Blockchain paradigm

Gavin Wood abstracts the blockchain system as a transaction-based state machine in his Yellow Book:

In Formula (1), S is the set of states inside the system, F is the transfer function of transaction state, T is the transaction information, and the initial state is Gensis state. In Formula (2), F is the block-level state transfer function, and B is the block information. Formula (3) defines B as a block of a series of transactions, each containing multiple transactions; Formula (4)G is a block finalization function, including uncle block check, bonus miner check, POW check, etc.

This mathematical model underlies not only Ethereum, but also most consensus-based decentralized trading systems today. Ethereum’s ascent over Bitcoin is essentially the F and S of this paradigm. At its heart is the idea of a blockchain with Turing-complete and unlimited storage space for internal transactions. Corresponding to:

  • The powerful function F, which can perform any calculation, bitcoin does not support loop;
  • State S records any type of data (including code), while Bitcoin’s UTXO model only calculates how much an address can be spent.

The data structure

In terms of data storage, Bitcoin calculates address balances through the UTXO model, and users are not encouraged to store other data. Through the P2SH scripting mechanism, it is theoretically possible to design various smart contracts, but due to the limited expression ability of scripting language, it is difficult to support complex contract development. This design makes sense for cryptocurrencies.

Ethereum needed to redesign its data structure to allow recording arbitrary information and performing arbitrary functions.

Merkle Patricia Trie

Ethereum uses Merkle Patricia Trie heavily to organize and store data, and as we’ll see below, this new data structure is achieved through an innovative combination of hash and prefix trees. Convention: MPT is used below in place of Merkle Patricia Trie.

Merkle Tree

Also called a hash tree: Each leaf of the tree is the hash of a block of data, and each non-leaf is the hash of a child. As shown, this tree does not store Data Blocks themselves. In the P2P network environment, if the malicious network node modifies the data on the tree, it will not pass Merkle Proof, thus ensuring the integrity and validity of the data. This depends on the nature of one-way hash encryption. This property makes it widely used in distributed system data verification, such as IPFS, Git, etc.

Patricia Trie

Also called Radix Trie, it is a spatially optimized variant of prefix tree: if a node in the tree is the only child of its parent, the two nodes can be merged. Its application here is to map long integer data from a 20bytes Ethereum Address to its Account, such as

, which is encoded into a hexadecimal number — in Patricia Trie’s case, a path of non-leaf nodes.

,account>

For example, if you store <“dog”,”Snoopy”> on Patricia Trie,” dog” will be encoded as “64 6f 67”. If you find the root node, you will find root->6->4->6->15->6->7->value. Value is just a hash that points to “Snoopy”. The advantage of this approach over hash tables is that there are no conflicts; But without optimization, the query steps are too long.

Improved point

To improve efficiency, Ethereum has a special design for node data types in the tree. Including the following four types of nodes

  • The null node represents an empty string
  • Branch a 17-element non-leaf node of the form
    ,i1…>
  • Leaf node A leaf node of two elements, such as

    . EncodedPath is part of a long integer number string encoded by address encryption
    ,value>
  • Extension node is a non-leaf node of two elements, such as <encodedPath, K >. The function of extension is to combine nodes on paths without forks to save space resources

    The figure, which is a simplified state tree (explained in more detail shortly, but not in any way a schematic), shows a map of < address, balance > in the upper right corner. The prefix item is used to assist coding and can be ignored. Addresses of 4 accounts, organized according to MPT. All extension nodes are just optimizations and can be replaced with multiple Branch nodes.

MPT requires a back-end database (levelDB in Ethereum) to maintain the connection between each node. This database is called the state database. The benefits of using MPT include: (1) The root node of this structure is encrypted and depends on all internal data. Its hash can be used for security verification, which is the property of Merkle tree. However, unlike Merkle tree, which does not store data blocks itself, MPT tree nodes store address data. It is the property of the Patricia tree (2) that allows any previous state (where the root hash is known) to be recalled by simply changing the root hash value.

state

The concept of a state tree was introduced above in explaining MPT. The concept of World states in Ethereum stores arbitrary states recorded by a decentralized transaction system through MPT mappings. This corresponds to the S in the blockchain paradigm, a core concept of Ethereum’s design.

The article
The entry point for querying the latest account status should be the state tree of the most recently confirmed block.

Ethereum’s account model needs a special introduction.

Account

Bitcoin uses the UTXO model to calculate balances, which cannot meet the need to record arbitrary states. Ethereum has designed the Account model, which stores the following: [Nonce, balance, storageRoot, codeHash] where Nonce is the transaction counter, balance is the balance information, storageRoot corresponds to another MPT, through which the variable information of the contract can be retrieved in the database. A codeHash is a codeHash value that cannot be changed after creation.

  • External accounts are controlled by private keys. In the corresponding Account model, the storageRoot and codeHash do not exist, that is, codes are not stored or executed. If there are only external accounts, then Ethereum can only support money transfers.
  • Contract accounts can be created by initiating transactions from an external account or by another contract account. When a contract account receives a message invocation, it loads code that performs the logic through EVM to modify the state of the internal store.

trading

Under the UTXO model, transactions are essentially unlocking input (via signed data) and locking output. Under the Account model, there are two types of transactions:

  • Create a contract, creating a new contract through code
  • Message call, which can transfer money or trigger a function of the contract

There are two types of transactions are includes the following fields: [nonce, gasPrice gasLimit, to value, [v, r, s]]

  • Nonce: indicates the number of transactions sent by an account
  • GasPrice,gasLimit: Used to limit the execution time of a transaction and prevent the program from running in an infinite loop
  • To: The recipient of the transaction
  • Value: Amount of transfer, in the case of creating a contract, the amount donated to the contract
  • V, R, S: transaction signature data that can be used to determine the sender of a transaction

Contract creation also requires:

  • Init: AN EVM code represented by an unlimited byte array that is run only once at contract creation time; Init returns the body snippet after execution, and subsequent contract calls run the body content.

    The address of the contract account is determined jointly by the Sender and the Nonce, so any two successful contract deployments will result in different addresses. As you can see from the figure above, code and state are stored separately. The compiled bytecode is actually stored in a virtual ROM and cannot be modified.

Message invocation also requires:

  • Data: An unlimited array of bytes representing the input to a message call

    Message invocation modifies the status of an account, which could be an EOA account or a contract account.

Transactions can be initiated from an external account or by contract. For example, block 5228886 contains 170 transactions and seven internal contract transactions.

block

Ethereum’s blocks add more data items, making them much more complex than Bitcoin, but not much different in nature. For example, tertiary hashing is added to optimize incentives to support mining agreements; There is also a lot of validation and serialization in the block itself. These are beyond the scope of this article and will not be discussed in depth.

EVM design and implementation

The Ethereum Virtual Machine (Ethereum Virtual Machine) is the runtime environment that performs Ethereum’s state transition function. Simple question: Can Ethereum reuse Java, Lisp, Lua, etc instead of developing a single underlying VM? In theory, yes, the Corda project is based entirely on the JVM platform. But more blockchain projects will choose to specialize in developing underlying infrastructure, including scripting engines for Bitcoin. Ethereum official explanation:

  • Ethereum’s VM specifications are simpler, whereas other general-purpose VMS have a lot of unnecessary complexity
  • Allow custom development, such as 32bytes of words
  • Avoid external dependency issues with other VMS, which can make installation difficult
  • With another VM, a complete security review is required, and the tradeoff may not be much

The memory model

EVM is also based on the stack computer model, but involves memory and storage in addition to the stack:

  • Stack the size of elements on the stack is 32bytes, which is different from the normal size of 4bytes and 8bytes, mainly for ethereum arithmetic objects with 20bytes of address and 32bytes of cryptography variables. The stack size does not exceed 1024; The call depth of the stack does not exceed 1024 to prevent memory overflow.
  • While all operations are performed on the stack, temporary variables can be stored in memory, regardless of memory size
  • Storage state variables are stored in storage. Unlike stack and memory variables that disappear when EVM instances are destroyed, data in storage will be persisted after modification

    This is a diagram of the EVM architecture, which has a profound impact on ethereum application development, including design patterns and security considerations.A classic problem is contract escalation:

    After the contract is deployed, it is compiled into bytecode and stored in virtual ROM. The code is not modifiable, which is a serious constraint for many DApps. One idea is to distribute the code among different contracts, and the inter-contract calls are made through addresses stored in storage, thus implementing the actual contract upgrade operation.

Execution model

EVM is technically a quasi-Turing machine, grammatically capable of performing arbitrary operations, but in order to prevent network abuse and avoid security issues due to Turing integrity, all operations in Ethereum are subject to economic restrictions, known as gas mechanisms, in three cases:

  • General operation cost, such as SLOAD,SSTORE, etc
  • Fuel consumed by submessage calls or contract creation is part of the cost of performing CREATE, CALL, and CALLCODE
  • The cost of memory usage is proportional to the number of 32bytes words required

The following figure shows the internal flow of EVM execution, fetching instructions from EVM code, all operations on the Stack, Memory as a temporary variable storage, storage is the account state. Execution is limited by Gas Avail.

Now with EVM, let’s look at the execution details of the transaction introduced earlier. As defined by the blockchain paradigm, T is the ethereum state transition function, which is also the most complex part of Ethereum. All transactions are subject to internal validation before execution:

  • Transactions are RLP format data with no extra suffix bytes;
  • The signature of the transaction is valid;
  • Random numbers of transactions are valid;
  • The fuel ceiling shall not be less than the fuel used in the actual transaction process;
  • The balance of the sender’s account is at least greater than the fee V0, which needs to be paid in advance;

The following figure shows the process of message invocation. Each transaction may form a deep call stack, with calls between different contracts within the transaction. The CALL is made through the CALL directive, and the parameters and return values are passed through memory.

Error handling

Several errors occur during EVM contract execution:

  • The fuel shortage
  • Invalid instructions
  • Missing stack data
  • The target address of the JUMP JUMPI command is invalid
  • The new stack size is greater than 1024
  • Stack call depth exceeds 1024

EVM error handling has a simple principle called revert-state-and-consume all-gas, where the state is restored to the checkpoint before the transaction was executed, but consumed gas is not returned. The virtual machine treats errors as code errors and does no specific error handling.

EVM analysis tools

Tools for EVM analysis can be found in the Ethereum Virtual Machine (EVM) Awesome List

Turing-complete Virtual Machine (WIP) like EVM

The full EVM specification is complex, but with a certain assembly base and the ability to simplify the model, implementing an EVM-like virtual machine is a challenge that can be attempted. I’ll put up my own implementation when I have time. If you’re interested, you can try it yourself.

reference

1.A Next-Generation Smart Contract and Decentralized Application Platform

2.ETHEREUM: A SECURE DECENTRALISED GENERALISED TRANSACTION LEDGER

3.Design Rationale

4.Stack Exchange: Ethereum block architecture

5.Go Ethereum

6.evm-illustrated

7.Diving Into The Ethereum VM