Author: Lin Guanhong (The Ghost at My Fingertips). If you repost this article, please credit the source.

Juejin: juejin.cn/user/178526…

Blog: www.cnblogs.com/linguanh/

GitHub: github.com/af913337456…

Tencent Cloud column: cloud.tencent.com/developer/u…


Note: Reading this article requires some basic knowledge of Bitcoin technology.

Contents

  • Background introduction
  • Protocols
  • Segregated Witness address generation
  • Expansion principle
  • Verification of Segregated Witness transactions
  • Malleability attack
    • Scenario
    • Questions and answers
    • Modifying S in code

Background introduction

Segregated Witness (SegWit) is an upgrade to Bitcoin made at the consensus layer. It adds support for an additional transaction format, the Segregated Witness transaction, and was deployed as a soft fork.

It was introduced to solve the following two problems:

  1. The Bitcoin “malleability attack”, which stems from a property of the elliptic curve signature algorithm ECDSA;

  2. Expanding the capacity of a Bitcoin block, to a certain extent.

Protocols

Note that the “Segregated Witness address” is only one part of Bitcoin’s Segregated Witness system, which is made up of several parts. The complete system is described in the Bitcoin Improvement Proposals (BIPs), mainly by the following documents:

  1. BIP-141, link: github.com/bitcoin/bip… BIP-141 gives a detailed introduction to Segregated Witness, including its definition and use;

  2. BIP-143, link: github.com/bitcoin/bip… BIP-143 describes in detail how signatures are produced and verified for transactions that use witness version 0;

  3. BIP-144, link: github.com/bitcoin/bip… BIP-144 describes in detail how Segregated Witness data is handled in the peer-to-peer (P2P) network;

  4. BIP-173, link: github.com/bitcoin/bip… BIP-173 introduces the Segregated Witness address, including the encoding format it uses and how its checksum is computed.

For more official information on Segregated Witness, see the full list of improvement proposals at github.com/bitcoin/bip…

Address generation

Let’s look at how a Segregated Witness address is generated. An ordinary Bitcoin address is generated starting from the public key: several pieces of data are concatenated, then hashed and encoded. A Segregated Witness address is generated in a similar way. Following BIP-173, the process can be summarized as follows:

  1. Prepare the hash byte stream (usually written in hexadecimal) taken from the Bitcoin script. There are currently two main script types, P2WPKH and P2WSH. The most obvious difference between them is that the hash in P2WPKH is 20 bytes, while the hash in P2WSH is 32 bytes. The corresponding script structures are:
  • P2WPKH: OP_0 <20-byte hash>
  • P2WSH: OP_0 <32-byte hash>

  2. Select the human-readable part (HRP) string for the target Bitcoin network: “bc” for mainnet, “tb” for testnet, and “bcrt” for a private regtest network. These values are defined in the chain parameter files of the source code. For example, the path in the Go version is github.com/btcsuite/btcd/chaincfg/params.go.

  3. Regroup the byte stream from step 1 into 5-bit groups instead of 8-bit bytes, and call the result B;

  4. Prepend the byte 0x00 (the witness version) to B, and call the result C;

  5. Use the Bech32 algorithm to compute the checksum D over the HRP and C;

  6. Append D to C, and call the result E;

  7. Assemble the address: HRP + “1” + E, with each 5-bit value of E mapped to a character through the encoding table.

In step 1, although the public key does not take part in the generation directly, the hash data is itself derived from the public key. The Bech32 encoding table used in step 7 is the character string qpzry9x8gf2tvdw0s3jn54khce6mua7l. Because the hash length differs from one script type to another, the resulting address differs as well. Below is the flow chart corresponding to the generation steps above:

The code for address generation can use the functions provided in the btcd source code directly, as shown below:

Expansion principle

In the ordinary packaging of Bitcoin transactions, each transaction includes its own signature data, as shown in the figure below.

We also know that a block can hold only a limited amount of data, so it can package only a limited number of transactions. If we can reduce the amount of data each transaction takes up, we indirectly achieve the goal of block expansion.

The essence of Segregated Witness is that when a block packages a transaction, the signature data is not packaged with the transaction itself but is placed somewhere else. If so, how can we verify that the transaction is signed correctly and that its data has not been tampered with?
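On the capacity side, BIP-141 makes the saving precise with a weight: a transaction’s weight is its size without witness data times 3, plus its total size, and a block may hold at most 4,000,000 weight units. The sketch below applies that formula to a made-up transaction. The 200-byte and 100-byte sizes are assumptions for illustration only (real SegWit transactions also carry a few extra marker bytes), and TxWeight and TxPerBlock are names invented here.

```go
package main

import "fmt"

// MaxBlockWeight is the block weight limit defined in BIP-141.
const MaxBlockWeight = 4000000

// TxWeight returns the BIP-141 weight of a transaction.
// baseSize is the serialized size without witness data;
// totalSize is the serialized size including witness data.
func TxWeight(baseSize, totalSize int) int {
	return baseSize*3 + totalSize
}

// TxPerBlock returns how many transactions of the given weight fit in a block,
// ignoring the block header and the coinbase transaction.
func TxPerBlock(weight int) int {
	return MaxBlockWeight / weight
}

func main() {
	// Hypothetical 200-byte transaction, 100 bytes of which are signature data.
	legacy := TxWeight(200, 200) // signature data counted as ordinary data
	segwit := TxWeight(100, 200) // signature data moved to the witness

	fmt.Println(legacy, TxPerBlock(legacy)) // 800 5000
	fmt.Println(segwit, TxPerBlock(segwit)) // 500 8000
}
```

With these assumed sizes, moving the signature data into the witness lets the same block hold 8000 transactions instead of 5000.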

Verification of Segregated Witness transactions

To initiate a transaction, construct the transaction input Vin and place the input’s unlock script data in a separate field, which the source code names Witness. When the transaction is sent to a node, the node code extracts the unlock script data from this field and then runs the signature check. Once the signature has been verified, the signature data is no longer packaged into the block; when a later transaction spends the output, the unlock script is again recovered from the Witness field.

In an ordinary (non-SegWit) transaction, the Witness field has no data. The unlock script data is carried in the signature field, which means the signature data must be packaged in the block so that it can be recovered. Here is the Vin structure involved:

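// Vin describes one transaction input (excerpt).
type Vin struct {
    // irrelevant fields omitted
    ScriptSig *ScriptSig `json:"scriptSig"`   // signature script of the input
    Witness   []string   `json:"txinwitness"` // witness items as hex strings
}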
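To see the two fields side by side, here is a small sketch that decodes two inputs from JSON and checks which one carries witness data. The ScriptSig type with its Asm and Hex fields is an assumption (the excerpt does not show its definition), HasWitness and parseVin are helper names invented for this example, and the hex strings are shortened placeholders rather than real signatures.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScriptSig is assumed to hold the script in assembly and hex form.
type ScriptSig struct {
	Asm string `json:"asm"`
	Hex string `json:"hex"`
}

// Vin repeats only the two fields discussed in the text.
type Vin struct {
	ScriptSig *ScriptSig `json:"scriptSig"`
	Witness   []string   `json:"txinwitness"`
}

// HasWitness reports whether the input carries data in the Witness field.
func (v Vin) HasWitness() bool {
	return len(v.Witness) > 0
}

// parseVin decodes one input from its JSON form.
func parseVin(raw string) (Vin, error) {
	var v Vin
	err := json.Unmarshal([]byte(raw), &v)
	return v, err
}

func main() {
	// Placeholder inputs: one ordinary, one Segregated Witness.
	legacy := `{"scriptSig":{"asm":"<sig> <pubkey>","hex":"4830450221"}}`
	segwit := `{"scriptSig":{"asm":"","hex":""},"txinwitness":["30450221","02a1b2"]}`

	for _, raw := range []string{legacy, segwit} {
		v, err := parseVin(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(v.HasWitness(), len(v.Witness))
	}
	// Prints:
	// false 0
	// true 2
}
```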

ScriptSig holds the signature data, and Witness holds the unlock script data. In the source code, the function that recovers the script is in the Bitcoin opcode virtual machine section, as shown below.

Malleability attack

One of the reasons Segregated Witness was introduced was to address the “malleability attack”.

In 2014, the Mt. Gox exchange (nicknamed “Mentougou”) lost 850,000 bitcoins, a loss that was later blamed on Bitcoin’s malleability attack. Here is how the attack works. In Bitcoin, the identifier that distinguishes a transaction is its transaction ID, or TxId. If two transactions have different TxIds, they are considered two different transactions. When we look up transactions in a blockchain explorer, we also query by TxId, and two different TxIds are treated as two different transactions.

Now consider this scenario:

A signs a transaction T with the elliptic curve signature algorithm ECDSA and sends it to Bitcoin node N. N uses the ID of T to check whether T already exists in the transaction pool, along with the related logic. If T already exists and does not meet the replacement conditions, an error message is returned to A. If it does not exist, T is placed in the transaction pool to wait for processing, and T’s ID is returned to A. Now suppose B has written a program that monitors the node’s transaction pool, finds transaction T, extracts it, and obtains the ScriptSig field of its Vin structure. The flowchart for this scenario is shown in the following figure.

What can B do after obtaining the ScriptSig of T? B cannot tamper with T’s data, so how can he carry out a “malleability attack”?

B then does the following:

B decodes ScriptSig to recover the integer values R and S of the ECDSA signature. He then modifies S by exploiting a property of the elliptic curve signature algorithm, regenerates the signed ScriptSig, and rebroadcasts the transaction. Pay attention here! When the transaction is rebroadcast, node N computes a new ID, TxId_2, from the whole transaction, including ScriptSig. Because ScriptSig has been modified, TxId_2 differs from the original TxId: the TxId is produced by a hash algorithm, and a hash algorithm gives a different output for a different input.
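The effect on the TxId can be sketched in a few lines. A Bitcoin TxId is the double SHA-256 of the serialized transaction, displayed in reversed byte order. The byte strings below are placeholders standing in for a serialized transaction, not a real one, and the function name TxID is chosen for this example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// TxID returns the double SHA-256 of the serialized transaction,
// byte-reversed and hex-encoded, as transaction IDs are usually displayed.
func TxID(rawTx []byte) string {
	first := sha256.Sum256(rawTx)
	second := sha256.Sum256(first[:])
	for i, j := 0, len(second)-1; i < j; i, j = i+1, j-1 {
		second[i], second[j] = second[j], second[i]
	}
	return hex.EncodeToString(second[:])
}

func main() {
	// Placeholder bytes: everything except the signature is identical.
	prefix := []byte("version-and-inputs|")
	suffix := []byte("|outputs-and-locktime")
	original := bytes.Join([][]byte{prefix, []byte("sig(R,S)"), suffix}, nil)
	mutated := bytes.Join([][]byte{prefix, []byte("sig(R,N-S)"), suffix}, nil)

	fmt.Println(TxID(original))
	fmt.Println(TxID(mutated))
	fmt.Println(TxID(original) == TxID(mutated)) // false
}
```

Only the signature bytes differ between the two inputs, yet the two IDs have nothing in common, which is why node N treats the replayed copy as a new transaction.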

The description of B’s operations leaves two questions:
  1. Won’t the chain detect that both transactions use the same input (UTXO)?

  2. Why does the signature check still pass after ScriptSig has been changed?

Answers:
  1. Bitcoin’s account model is based on UTXOs. As long as a UTXO has not yet been spent, a double spend can be attempted with it; only once it has already been spent does spending it again fail. In the example above, transaction T and the copy modified by B reference the same UTXO, but it is still unspent and can be referenced more than once. In other words, the check only detects UTXOs that have already been spent on the chain.

  2. The signature can still be verified because, in the elliptic curve signature algorithm ECDSA, if the pair (R, S) passes verification, then the pair (R, N − S), where N is the order of the curve, passes as well. But the negated S changes the TxId. The attack flow chart of B is as follows:

At this point, the node holds two transactions with the same content but different TxIds; the transaction fee and the recipient are exactly the same. Either transaction may be packaged, and once one of them is packaged the other will not be, because the UTXO involved has already been spent.

Now assume that A is an exchange that is processing a withdrawal for a user, and B is that user. When A reconciles, it checks against the TxId it obtained for its own transaction, unaware that a TxId_2 exists that performs the same withdrawal. Then TxId_2 is the one that gets packaged. B, the attacker, sees that the attack has succeeded, goes to the exchange, and asks why his withdrawal has not arrived. A checks, finds that its own TxId has failed, and initiates another withdrawal to the user. At this point, B has received two or more on-chain transfers.

The whole process above is a malleability attack.

Modifying S in code

The implementation in code is also very simple: a single added line that modifies the signature’s S is enough, and the signature still passes verification. Below is a working function I implemented, with the key lines obscured to avoid abuse.

Code that modifies the signature S:

After the