Project address: https://github.com/netwarps/r…

Preface

In Rust, async-std and Tokio are the two most widely used asynchronous runtimes, and each has its own strengths. rust-ipfs is the Rust implementation of IPFS; it runs on Tokio, and its networking layer is built on rust-libp2p. To try replacing the underlying rust-libp2p with libp2p-rs, we forked the original repository and ported the code, and that work is now complete. This article shares a hang we ran into during the migration.

Problem description

First, I started a go-ipfs daemon and used the ipfs id command to get its multiaddress. Then I ran simple, one of the rust-ipfs sample programs, to confirm that it could connect to go-ipfs. In IPFS, a DAG node can link to at most 174 blocks, which corresponds to 43.5 MiB of data. I stored a file of about 77 MiB through go-ipfs. When I fetched it through rust-ipfs, the number of blocks retrieved reached 128 and then the test code stopped responding.
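The figures above can be checked with a little arithmetic. The sketch below assumes the default go-ipfs chunk size of 256 KiB, which the text does not state explicitly; the function names are made up for illustration.

```rust
/// Chunk size in bytes (256 KiB). Assumption: the go-ipfs default chunker.
const CHUNK_SIZE: u64 = 256 * 1024;
/// Maximum number of links per DAG node, as quoted in the text.
const MAX_LINKS: u64 = 174;

/// Bytes covered by one DAG node whose links all point at full leaf chunks.
fn max_bytes_per_node() -> u64 {
    MAX_LINKS * CHUNK_SIZE
}

/// Number of leaf blocks needed for a file of `file_size` bytes
/// (ceiling division by the chunk size).
fn leaf_blocks(file_size: u64) -> u64 {
    (file_size + CHUNK_SIZE - 1) / CHUNK_SIZE
}

fn main() {
    let mib = 1024.0 * 1024.0;
    println!("one node covers {} MiB", max_bytes_per_node() as f64 / mib);
    let file = 77 * 1024 * 1024;
    println!("a 77 MiB file needs {} leaf blocks", leaf_blocks(file));
}
```

Under that assumption, 174 links cover exactly 43.5 MiB, and a 77 MiB file splits into 308 leaf blocks. That is well above 128, so the fetch has to perform far more than 128 block lookups.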

Search time too long?

Fetching a block by CID has a timeout: Bitswap reports an error if nothing comes back within 30 seconds. So at first we simply assumed the search was slow and set the problem aside. But after more than ten minutes the console had still not printed any BitswapError, so things were clearly not as simple as we had thought.

Blockstore hang

By adding log output layer by layer and ruling out candidates one at a time, we finally traced the problem to the call to get_block().

get_block() first looks for the block matching the CID in the local blockstore; if it is not there, it asks Bitswap to fetch it. The blockstore used in the test was a HashMap wrapped in a tokio::sync::Mutex, and the call stayed pending while retrieving the block from that HashMap, at line 383 in the figure.
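The lookup order described above can be sketched roughly as follows. This is not the rust-ipfs code: the types, the MemBlockStore name, and the fetch_remote callback are hypothetical, and std::sync::Mutex stands in for Tokio's async Mutex so that the sketch needs no external crates.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-ins for the real CID and block types.
type Cid = String;
type Block = Vec<u8>;

/// In-memory blockstore: a HashMap behind a mutex.
/// Assumption: std::sync::Mutex replaces the async Mutex used in the test.
struct MemBlockStore {
    blocks: Mutex<HashMap<Cid, Block>>,
}

impl MemBlockStore {
    fn new() -> Self {
        MemBlockStore { blocks: Mutex::new(HashMap::new()) }
    }

    fn put(&self, cid: Cid, block: Block) {
        self.blocks.lock().unwrap().insert(cid, block);
    }

    fn get(&self, cid: &str) -> Option<Block> {
        self.blocks.lock().unwrap().get(cid).cloned()
    }
}

/// Look in the local store first; fall back to the network only on a miss.
/// `fetch_remote` is a hypothetical placeholder for the Bitswap request.
fn get_block<F>(store: &MemBlockStore, cid: &str, fetch_remote: F) -> Option<Block>
where
    F: FnOnce(&str) -> Option<Block>,
{
    if let Some(block) = store.get(cid) {
        return Some(block);
    }
    fetch_remote(cid)
}

fn main() {
    let store = MemBlockStore::new();
    store.put("cid-local".to_string(), vec![1, 2, 3]);
    let local = get_block(&store, "cid-local", |_| None);
    let remote = get_block(&store, "cid-remote", |_| Some(vec![9]));
    println!("local hit: {:?}, remote fetch: {:?}", local, remote);
}
```

In the async original, the lock acquisition is itself a future that gets polled, which is where the pending state showed up.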

Tokio resource limitations

An article published by the Tokio project helped us find the cause of this problem and a solution to it.

Because Tokio does not schedule tasks preemptively, a single task can keep running indefinitely. When that happens, other tasks never get scheduled and starve. Some languages can interrupt execution by injecting yield points, but Rust's generators do not seem to offer a comparable capability.

To address this, Tokio introduced the concept of a budget in version 0.2.14. It can be thought of as a quota that every Tokio resource is aware of. The default budget is 128, a value the Tokio team settled on through testing. Each asynchronous operation decrements the budget. When it reaches 0, the task yields back to the scheduler, which resets the budget and waits to schedule the task again.
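The budget mechanism can be illustrated with a toy model. The sketch below is not Tokio's implementation; Task and run_with_reset are hypothetical names, and the only number taken from the text is the budget of 128.

```rust
/// Budget handed to a task each time the scheduler polls it (128, as quoted above).
const INITIAL_BUDGET: u32 = 128;

/// A toy task that needs `remaining` budgeted operations to finish.
struct Task {
    remaining: u32,
}

impl Task {
    /// Run until done or until the budget is used up.
    /// Returns true when the task has finished.
    fn poll(&mut self, budget: &mut u32) -> bool {
        while self.remaining > 0 {
            if *budget == 0 {
                return false; // out of budget: yield back to the scheduler
            }
            *budget -= 1;
            self.remaining -= 1;
        }
        true
    }
}

/// A toy scheduler that resets the budget before every poll.
/// Returns how many polls were needed to finish the task.
fn run_with_reset(mut task: Task) -> u32 {
    let mut polls = 0;
    loop {
        let mut budget = INITIAL_BUDGET;
        polls += 1;
        if task.poll(&mut budget) {
            return polls;
        }
    }
}

fn main() {
    let polls = run_with_reset(Task { remaining: 300 });
    println!("300 operations finished in {} polls", polls);
}
```

A task with 300 operations finishes in three polls here (128 + 128 + 44). The important part is the reset in the scheduler loop: without it, the task would stay stuck after its first 128 operations.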

Tokio's block_on performs the budget check.

coop::budget(), as shown in the figure, initializes the budget to 128 and then polls the future passed in, which in this case is the Acquire future inside Mutex. Acquire implements Future and checks the budget first when it is polled. If budget remains, or the budget is unconstrained, the check returns Ready and the rest of the operation proceeds; otherwise it returns Pending.

There is one more thing to note. The budget is reset only on Tokio's worker threads. Executors from other libraries are unaware of the budget and never reset it. For example, if block_on from the futures crate is called inside Tokio's executor, then after a certain number of operations the code running inside that block_on stays pending, and the program hangs.
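The difference between a budget-aware executor and a foreign one can be reproduced with standard-library futures. The following is a toy simulation under stated assumptions, not Tokio or futures code: BUDGET, Budgeted, and drive are hypothetical, and the poll count is capped so that the "hang" case terminates and can be observed.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    /// Toy per-thread budget, standing in for the runtime's internal one.
    static BUDGET: Cell<u32> = Cell::new(0);
}

/// A waker that does nothing; enough for a busy-polling toy executor.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// A future that performs `remaining` budgeted operations.
struct Budgeted {
    remaining: u32,
}

impl Future for Budgeted {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        while self.remaining > 0 {
            let left = BUDGET.with(|b| b.get());
            if left == 0 {
                return Poll::Pending; // budget exhausted
            }
            BUDGET.with(|b| b.set(left - 1));
            self.remaining -= 1;
        }
        Poll::Ready(())
    }
}

/// Poll `fut` up to `max_polls` times. When `reset` is true the budget is
/// refilled before each poll, as a budget-aware executor would do.
/// Returns the number of polls used, or None if the future never finished.
fn drive(mut fut: Budgeted, reset: bool, max_polls: u32) -> Option<u32> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    BUDGET.with(|b| b.set(128)); // budget granted by the outer task
    for n in 1..=max_polls {
        if reset {
            BUDGET.with(|b| b.set(128));
        }
        if Pin::new(&mut fut).poll(&mut cx).is_ready() {
            return Some(n);
        }
    }
    None
}

fn main() {
    println!("budget-aware: {:?}", drive(Budgeted { remaining: 300 }, true, 1000));
    println!("foreign:      {:?}", drive(Budgeted { remaining: 300 }, false, 1000));
}
```

With the reset, 300 operations complete in three polls. Without it, the future returns Pending on every poll after the first 128 operations, no matter how many times it is polled, which is the hang in miniature.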

In our code, we actually encounter the same problem:

First, the main function is annotated with #[tokio::main], which automatically creates a Tokio executor.

Second, the test method calls futures::executor::block_on, so the code inside it is driven only by the futures executor. When get_block() reads from the local blockstore, it stays pending once the 128 budgeted operations are used up.

Conclusion

To sum up, the solution is to avoid calling another library's executor::block_on() inside Tokio's executor.


Netwarps is made up of a senior domestic team of cloud computing and distributed technology developers with extensive experience in the financial, power, communications, and Internet industries. Netwarps has research and development centers in Shenzhen and Beijing and a team of more than 30 people, most of them technical staff with more than 10 years of development experience drawn from the Internet, finance, cloud computing, blockchain, scientific research, and other professional fields.


Netwarps focuses on the development and application of secure storage technology products, including a decentralized file system (DFS) and a decentralized computing platform (DCP). Netwarps is committed to providing distributed storage and computing platforms based on decentralized network technology, characterized by high availability, low power consumption, and low network usage, and suited to the Internet of Things, the industrial Internet, and similar scenarios.


Public account: Netwarps