One, Summary

In this article, we will talk about how HDFS optimizes performance when a large number of clients write data concurrently.

Two, Background Introduction

To introduce a little background: if multiple clients want to write to the same file on HDFS at the same time, is that allowed?

Obviously not, because HDFS does not allow concurrent writes to a file, such as several clients appending data to it simultaneously.

So HDFS has a mechanism called the file contract mechanism (in the HDFS source code this is known as the lease mechanism).

That is, before writing data, a client must first obtain the contract for that file from the NameNode, and only one client can hold the contract for a given file at a time.

If other clients try to obtain the contract for the same file, they will fail and can only wait.

This ensures that only one client is writing to a given file at any time.

After obtaining the file contract, while writing the file the client needs to run a thread that keeps sending renewal requests to the NameNode, essentially telling it:

"NameNode, I'm still writing this file. Please keep the contract for me."

The NameNode, in turn, has a dedicated background thread that monitors the renewal time of each contract.

If a contract has not been renewed for a long time, it automatically expires, and another client is then allowed to write to the file.

Having said all that, let's follow the usual routine and use a picture to give you an intuitive feel for the whole process.

[Figure: the file contract (lease) workflow between clients and the NameNode]
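To make the flow concrete, here is a minimal sketch of the idea in Java. This is illustrative code under assumed names (NaiveContractManager, Contract, and so on are made up for this article), not Hadoop's actual NameNode implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only -- names like NaiveContractManager are made up
// for this article; this is not Hadoop's actual NameNode code.
public class NaiveContractManager {

    static class Contract {
        final String holder;           // client currently writing the file
        volatile long lastRenewalTime; // updated on every renewal request

        Contract(String holder) {
            this.holder = holder;
            this.lastRenewalTime = System.currentTimeMillis();
        }
    }

    // file path -> contract of the client currently writing that file
    private final Map<String, Contract> contracts = new HashMap<>();

    // Only one client can obtain the contract for a given file at a time.
    public synchronized boolean acquire(String path, String client) {
        if (contracts.containsKey(path)) {
            return false; // another client holds it; the caller must wait
        }
        contracts.put(path, new Contract(client));
        return true;
    }

    // Called by the client's renewal thread: "I'm still writing this file."
    public synchronized void renew(String path, String client) {
        Contract c = contracts.get(path);
        if (c != null && c.holder.equals(client)) {
            c.lastRenewalTime = System.currentTimeMillis();
        }
    }

    // The naive monitor: scan EVERY contract on each periodic check.
    public synchronized void checkExpired(long expiryMillis) {
        long now = System.currentTimeMillis();
        contracts.values().removeIf(c -> now - c.lastRenewalTime > expiryMillis);
    }
}
```

Note that checkExpired scans every contract on every pass. That naive full scan is exactly the problem the next section describes.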
Three, The Problem Emerges

OK, so here's the problem: in a large-scale Hadoop cluster, there may be thousands of clients writing concurrently.

The list of file contracts maintained by the NameNode then becomes very large, and the background monitoring thread has to check all of them for expiration at regular intervals.

Traversing, say, thousands of contracts every few seconds performs poorly, so this naive contract-monitoring approach is clearly not suitable for large-scale Hadoop clusters.

Four, The Optimization Plan

So how can the file contract monitoring algorithm be optimized?

Let's walk through its implementation logic step by step. First, take a look at this hand-drawn image:

[Figure: hand-drawn sketch of contracts sorted by last renewal time]
The secret is quite simple: each time a client sends a renewal request, the NameNode updates that contract's last renewal time.

Then the contracts are kept in a TreeSet, sorted by last renewal time so that the contract with the oldest renewal time always comes first. This sorted contract data structure is the key.

TreeSet is a sorted collection implemented on top of TreeMap.

TreeMap is in turn backed by a red-black tree, which guarantees that elements are not duplicated and lets us supply a custom ordering that is maintained on every insertion.

So the sorting rule here is simply: order the contracts by their last renewal time, oldest first.

In fact, the whole optimization boils down to maintaining such a sorted data structure, as in the sketch below.
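Here is what that sorted structure might look like in Java. Again, this is an illustrative sketch (the SortedContracts and Contract names are made up); Hadoop's real lease class is different:

```java
import java.util.Comparator;
import java.util.TreeSet;

// Illustrative sketch only -- Hadoop's real lease class is different.
public class SortedContracts {

    static class Contract {
        final String path;    // file the contract covers
        long lastRenewalTime; // millis of the most recent renewal

        Contract(String path, long lastRenewalTime) {
            this.path = path;
            this.lastRenewalTime = lastRenewalTime;
        }
    }

    // Oldest renewal time first. The tie-break on path matters: a TreeSet
    // silently drops elements its comparator considers equal, so two
    // contracts renewed in the same millisecond must still compare unequal.
    private final TreeSet<Contract> sortedContracts = new TreeSet<>(
            Comparator.comparingLong((Contract c) -> c.lastRenewalTime)
                      .thenComparing(c -> c.path));

    // Renewing must remove, update, and re-insert, because mutating the
    // sort key of an element while it sits inside the tree breaks ordering.
    public synchronized void renew(Contract c, long now) {
        sortedContracts.remove(c);
        c.lastRenewalTime = now;
        sortedContracts.add(c);
    }
}
```

One subtlety worth noting: because the renewal time is the sort key, renewing a contract means removing it from the set, updating the time, and re-inserting it; changing the key in place while the element sits inside the tree would corrupt the ordering.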

Let’s now look at the source code implementation of contract monitoring in Hadoop:

[Figure: screenshot of Hadoop's contract-monitoring source code]
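The screenshot is not reproduced here, but the checking logic amounts to something like the following simplified sketch, continuing the SortedContracts class above (illustrative, not the verbatim Hadoop source):

```java
// Continuing the SortedContracts sketch above (illustrative, not the
// verbatim Hadoop source): check contracts starting from the oldest.
public synchronized void checkExpiredContracts(long expiryMillis) {
    long now = System.currentTimeMillis();
    while (!sortedContracts.isEmpty()) {
        Contract oldest = sortedContracts.first(); // oldest renewal time
        if (now - oldest.lastRenewalTime <= expiryMillis) {
            // The oldest contract has not expired, so no contract renewed
            // more recently can have expired either -- stop right away.
            break;
        }
        // Expired: release it and move on to the next-oldest contract.
        sortedContracts.remove(oldest);
    }
}
```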
You don't want to run through thousands of contracts on every check to see whether they have expired; that would be inefficient.

Instead, we simply take the contract with the oldest renewal time from the TreeSet. If even that contract has not expired, then no contract renewed more recently can possibly have expired!

For example: suppose the contract with the oldest renewal time was last renewed 10 minutes ago, and a contract is judged to expire only after 15 minutes without renewal.

If even the contract renewed 10 minutes ago has not expired, then the contracts renewed 8 minutes ago or 5 minutes ago certainly have not expired either!

This optimization helps performance a great deal, because under normal conditions expired contracts are a small minority, so there is no need to traverse all the contracts to check for expiration.

We only need to check the oldest contracts: if the oldest one has expired, delete it and check the next-oldest, and so on, stopping at the first contract that has not expired.

Through this mechanism of TreeSet sorting plus checking the oldest contract first, the performance of contract monitoring in a large-scale cluster can be improved by at least a factor of 10. This idea is well worth learning from and borrowing.

As an extension: Eureka, the service registry in the Spring Cloud microservices architecture, also has a renewal-checking mechanism similar to Hadoop's.

However, Eureka does not implement a similar optimization; instead, each round it brute-force iterates over the renewal times of all service instances.

If you’re dealing with a massively deployed microservice system, that’s not good!

In a large-scale system with hundreds of thousands of machines, the renewal records of hundreds of thousands of service instances reside in Eureka's memory. Does it really make sense to traverse hundreds of thousands of renewal records every few seconds?

Finally, a good open source project contains many good design ideas. Reading the source code of excellent open source projects is one of the quickest ways to substantially improve your technical foundation and skill level in a short time; it's well worth trying.