Welcome to follow our WeChat official account: Shishan100


Table of Contents

1. Prelude

2. The NameNode Architecture of HDFS

1. Prelude

Hadoop is the most mainstream technology stack in the big data field, and it includes several components:

HDFS (a distributed file system), YARN (a distributed resource scheduler), MapReduce (a distributed computing framework), and so on.

For those of you who may have heard of Hadoop but are not quite sure what it is, this article will explain it in plain English.

Suppose all of your company’s data is stored in MySQL, all on a single database server, and suppose this server has 2TB of disk space. Let’s take a look at the graph below.

Now here is the problem: you keep adding data to that MySQL server, and eventually the data grows beyond 2TB. What do you do?

You might say: I can set up multiple MySQL database servers and shard the databases and tables, putting part of the data on each server. As shown in the figure!

OK, no problem: let’s get 3 database servers running 3 MySQL instances, so each server can hold up to 2TB of data.

Now let me ask you a question: what does so-called “big data” actually do?

Let’s take a look at one of the most elementary uses of big data. Suppose you run an e-commerce website, and you need to store and analyze the behavior logs of all users clicking, buying, and browsing on the website and in the app.

You now have all this data spread across the 3 MySQL servers. It’s a lot of data, but it just barely fits.

One morning, your boss shows up wanting to see a report: for example, the website’s daily metric X, metric Y, metric Z, and so on, 20 or 30 metrics in total.

All right, buddy, now try writing the SQL that computes those 20 or 30 metrics from the click, buy, and browse logs.

I’d bet you end up writing an extremely complex SQL statement, hundreds or even thousands of lines long. Do you think that SQL could actually run against the 3 sharded MySQL servers?

If you think it could, then you’ve clearly never been through the pitfalls of sharded MySQL: a hundreds-of-lines SQL statement with cross-database joins and all kinds of complex computation is simply not realistic.

Therefore, big data storage and computation are not done with MySQL at all, and this is exactly why big data technology stacks such as Hadoop and Spark came into being.

In essence, big data technologies such as Hadoop and Spark are a series of distributed systems.

For example, HDFS in Hadoop is the cornerstone of the big data technology stack: it is responsible for storing data in a distributed way. What does that mean? Don’t worry, keep reading.

HDFS stands for Hadoop Distributed File System.

It consists of many machines, each running a DataNode process that is responsible for managing a portion of the data.

Then there is one machine running a NameNode process, which manages the entire HDFS cluster and stores all of the cluster’s metadata.

So there are many machines, and each machine stores a portion of the data! With this, HDFS can store and manage massive amounts of data very well.
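To make the division of labor concrete, here is a toy Python sketch (invented names, not real HDFS code): one NameNode process tracks which DataNode holds which block, and the DataNodes hold the actual bytes.

```python
# Toy sketch of the NameNode/DataNode split of responsibilities.
# All class and method names here are invented for illustration.

class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}          # block_id -> raw bytes

class NameNode:
    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.block_locations = {} # block_id -> name of the DataNode holding it

    def store(self, block_id, data):
        # naive placement: pick the DataNode currently holding the fewest blocks
        dn = min(self.datanodes, key=lambda d: len(d.blocks))
        dn.blocks[block_id] = data
        self.block_locations[block_id] = dn.name

datanodes = [DataNode(f"dn{i}") for i in range(3)]
nn = NameNode(datanodes)
for i in range(6):
    nn.store(f"blk_{i}", b"...")

# The NameNode only knows *where* data lives; the bytes sit on DataNodes
print(nn.block_locations["blk_0"])  # dn0
```

Real HDFS placement is far more sophisticated (rack awareness, replication), but the key point is the same: metadata lives in one place, data in many.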

At this point you might be wondering: isn’t that just like the MySQL servers? If you think so, you’d be badly mistaken.

It is not as simple as you think. HDFS is distributed by design, so uploading, storing, and managing huge amounts of data is exactly what it was built for.

If you instead rely on sharded MySQL, it will be far more painful, because MySQL was not designed as a distributed system, and it lacks many of the data guarantees that distributed storage requires.

OK, so now your data is stored in HDFS in a distributed way; next, you have to compute over that data in a distributed way too.

For distributed computing:

  • Many companies use Hive to write large, hundreds-of-lines SQL queries (with MapReduce as the underlying engine)
  • Many other companies are gradually starting to write those large SQL queries with Spark SQL (with Spark Core as the underlying engine).

In short, you write one big SQL query, and these engines break it up into a bunch of computing tasks, place them on different machines, and make each task responsible for a small portion of the data. This is called distributed computing.

This is definitely more reliable than running hundreds of lines of SQL against sharded MySQL.
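The split-then-merge idea above can be sketched in a few lines of Python (a simplified stand-in for what Hive/MapReduce or Spark do, not their actual APIs): each “machine” counts actions over its own slice of the click log, and the partial results are merged at the end.

```python
# Toy sketch of distributed computing: per-partition tasks plus a merge step.
# The partition data and function names are invented for illustration.

from collections import Counter

# pretend each "machine" holds one slice of the user-behavior log
partitions = [
    ["click", "buy", "click"],
    ["browse", "click"],
    ["buy", "browse", "browse"],
]

def map_task(rows):
    # runs on one machine, over its local slice only
    return Counter(rows)

def reduce_step(partials):
    # merge the per-machine partial counts into a global result
    total = Counter()
    for p in partials:
        total += p
    return total

partials = [map_task(p) for p in partitions]  # in reality these run in parallel
result = reduce_step(partials)
print(result)  # Counter({'click': 3, 'browse': 3, 'buy': 2})
```

No single machine ever scans the whole log; that is the whole trick.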

To illustrate, here is a picture; let’s walk through the whole process.

2. The NameNode Architecture of HDFS

All right, with the prelude out of the way, let’s get down to business. This article mainly discusses the core architecture of the NameNode in an HDFS cluster.

The NameNode has one core function: managing the metadata of the entire HDFS cluster, such as the file directory tree, permission settings, replica-count settings, and so on.

The most typical piece is the maintenance of the file directory tree. To give you an example, **look at the following figure.** Suppose a client now wants to upload a large 1TB file to the HDFS cluster.

It first communicates with the NameNode and says: brother, I want to create a new file named “/usr/hive/warehouse/access_20180101.log”, 1TB in size. Is that OK with you?

The NameNode then creates a new file object named access_20180101.log under the specified directory in the file directory tree it holds in memory.
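The in-memory directory tree can be pictured with a toy Python sketch (invented names, not real NameNode code): creating a file is just a matter of walking the tree and adding a node under its parent directory.

```python
# Toy sketch of the NameNode's in-memory file directory tree.
# Nested dicts represent directories; None marks a file entry.

class DirectoryTree:
    def __init__(self):
        self.root = {}

    def create_file(self, path):
        parts = path.strip("/").split("/")
        node = self.root
        for d in parts[:-1]:
            # create intermediate directories as needed
            node = node.setdefault(d, {})
        node[parts[-1]] = None  # in real HDFS, block info would hang here

tree = DirectoryTree()
tree.create_file("/usr/hive/warehouse/access_20180101.log")
print("access_20180101.log" in tree.root["usr"]["hive"]["warehouse"])  # True
```

The point is that this whole structure lives in memory, which is exactly what makes the durability question in the next paragraphs interesting.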

This file directory tree is a very core piece of HDFS metadata: it records which directories and files exist in the HDFS distributed file system.

But there is a problem: the file directory tree lives in the NameNode’s memory.

What if the NameNode crashes unexpectedly? Wouldn’t all the metadata be lost?

You might think of keeping the metadata in a disk file instead, but constantly modifying that file in place would make performance extremely low. After all, that means a lot of random disk reads and writes!

Never mind, let’s take a look at HDFS’s elegant solution.

Every time the in-memory metadata changes, the NameNode appends an edits log entry, a record of that metadata modification, to a disk file. It never modifies the file’s existing contents; it only does sequential appends, which gives much higher performance.

When the NameNode restarts, it reads the edits log back from disk and replays it in memory, restoring the metadata.
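The append-and-replay idea is simple enough to demonstrate in a toy Python sketch (invented file names and operations, not the real edits-log format):

```python
# Toy sketch of an edits log: every metadata change is appended to a file;
# replaying the file after a "crash" rebuilds the in-memory state.

import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "edits.log")

def apply_edit(metadata, op, path):
    if op == "CREATE":
        metadata.add(path)
    elif op == "DELETE":
        metadata.discard(path)

def record_edit(op, path):
    with open(log_path, "a") as f:   # sequential append, never in-place
        f.write(f"{op} {path}\n")

# normal operation: mutate memory AND append to the log
metadata = set()
for op, path in [("CREATE", "/a.log"), ("CREATE", "/b.log"), ("DELETE", "/a.log")]:
    apply_edit(metadata, op, path)
    record_edit(op, path)

# "restart": the in-memory set is gone; replay the log to recover it
recovered = set()
with open(log_path) as f:
    for line in f:
        op, path = line.split()
        apply_edit(recovered, op, path)

print(recovered == metadata)  # True
```

Appends are fast, but notice that recovery time grows with the length of the log, which is exactly the problem the article tackles next.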

Let’s go through the whole process with the following picture.

But if the edits log keeps growing, restarts become slower and slower, because the NameNode has to replay a huge number of edits log entries to recover the metadata!

So HDFS says: I can do better than this. I’ll introduce a new disk file called the fsimage, a JournalNodes cluster, and a Standby NameNode.

Each time the Active NameNode (the primary node) modifies the metadata, it generates an edits log entry, writing it to the JournalNodes cluster in addition to its local disk file.

The Standby NameNode can then pull the Edits log from the JournalNodes cluster and apply it to the file directory tree in its own memory, consistent with the Active NameNode.

Then, every once in a while, the Standby NameNode writes the file directory tree in its memory out to an fsimage file on disk. This is not a log; it is a complete snapshot of the metadata. This operation is known as a checkpoint.

The fsimage is then uploaded to the Active NameNode, after which the Active NameNode’s old edits log file, which may contain a million lines of modification records, is emptied!

The Active NameNode then continues to receive metadata-modification requests and keeps writing to the edits log; after a short while, it may contain just a few dozen lines!

If the Active NameNode restarts at this point, bingo! It reads the fsimage from the Standby NameNode directly into memory. The fsimage is the complete metadata, so no additional work is required.

Then it only needs to replay the few dozen new edits log lines into memory!

This startup is much faster, because there is no need to replay millions of edits log entries to recover the metadata! As shown in the figure below.
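The fsimage-plus-short-log recovery path can be sketched in toy Python as well (invented file names and a JSON stand-in for the real fsimage format):

```python
# Toy sketch of the checkpoint mechanism: dump full metadata to an fsimage,
# truncate the edits log, and on restart load the fsimage plus the few
# edits written since the checkpoint.

import json
import os
import tempfile

d = tempfile.mkdtemp()
fsimage_path = os.path.join(d, "fsimage")
edits_path = os.path.join(d, "edits.log")

metadata = {"/a.log": 1, "/b.log": 1}   # pretend this took a million edits to build

# checkpoint: write the complete metadata snapshot, then empty the edits log
with open(fsimage_path, "w") as f:
    json.dump(metadata, f)
open(edits_path, "w").close()

# a few more changes arrive after the checkpoint
with open(edits_path, "a") as f:
    f.write("CREATE /c.log\n")

# restart: load the fsimage directly, then replay only the short edits log
with open(fsimage_path) as f:
    recovered = json.load(f)
with open(edits_path) as f:
    for line in f:
        op, path = line.split()
        if op == "CREATE":
            recovered[path] = 1

print(sorted(recovered))  # ['/a.log', '/b.log', '/c.log']
```

Recovery cost is now proportional to the handful of post-checkpoint edits, not to the entire history of the cluster.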

In addition, as you can see in the picture above, we now have two NameNodes:

  • One is the Active NameNode, which serves requests from the outside world
  • The other, the Standby NameNode, simply receives and replays the Active NameNode’s edits log and performs the periodic checkpoints.

Did you notice? The metadata in their memory is almost identical!

So, if the Active NameNode fails, can we switch to the Standby NameNode immediately?

This is the NameNode failover mechanism.
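The reason failover can be instant is that the standby’s metadata is already in sync. Here is a toy Python sketch of the idea (the direct in-process sync below stands in for replication through the JournalNodes; it is not how the real mechanism works under the hood):

```python
# Toy sketch of NameNode failover: because the standby keeps replaying
# the same edits, its metadata matches the active's, so promotion is
# just a role change.

class ToyNameNode:
    def __init__(self, role):
        self.role = role
        self.metadata = set()

active = ToyNameNode("active")
standby = ToyNameNode("standby")

def modify(path):
    active.metadata.add(path)
    standby.metadata.add(path)  # stands in for syncing via the JournalNodes

modify("/a.log")
modify("/b.log")

# the active node dies; the standby already holds identical metadata
assert standby.metadata == active.metadata
standby.role = "active"
print(standby.role, sorted(standby.metadata))  # active ['/a.log', '/b.log']
```

In real deployments the role switch is coordinated automatically (via ZooKeeper-based failover controllers), but the precondition is the same: identical in-memory metadata on both nodes.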

Back to our upload: at this point, the HDFS client has added the new file to the file directory tree in the NameNode’s memory.

However, the client still has to upload the actual data to multiple DataNodes, and it’s a huge 1TB file! How is it transferred?

Very simple: the 1TB file is split into N blocks of 128MB each. 1TB = 1024GB = 1048576MB, so at 128MB per block, that makes 8192 blocks.
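The block arithmetic above, checked in code (128MB is the default HDFS block size in recent versions):

```python
# Verify the block-count arithmetic from the text.
TB_IN_MB = 1024 * 1024   # 1 TB = 1024 GB = 1048576 MB
BLOCK_SIZE_MB = 128      # default HDFS block size

num_blocks = TB_IN_MB // BLOCK_SIZE_MB
print(TB_IN_MB, num_blocks)  # 1048576 8192
```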

These blocks are spread across different machines and managed there. For example, in a cluster of 100 machines, each machine holds roughly 80 blocks, and that’s fine.

But here is the problem: if one machine goes down at this point, its roughly 80 blocks are lost.

That would mean losing part of the 1TB file you just uploaded. Don’t worry, HDFS has this all figured out!

By default, it keeps three copies of each block, identical replicas on different machines; if one machine goes down, two copies of each of its blocks still exist on other machines.

Take a look at the picture below: there are 3 copies of each block on different machines, so it doesn’t matter if any one machine goes down! You can still get that block from another machine.
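The replication idea can be sketched in toy Python (simplistic consecutive placement for illustration; real HDFS placement is rack-aware and more subtle):

```python
# Toy sketch of 3-way replication: each block lives on 3 different
# machines, so losing any single machine never loses a block.

from itertools import cycle

machines = ["m1", "m2", "m3", "m4", "m5"]
replicas = {}   # block -> set of machines holding a copy

picker = cycle(machines)
for blk in ["blk_0", "blk_1", "blk_2"]:
    # place each block on 3 consecutive machines from the rotation
    replicas[blk] = {next(picker) for _ in range(3)}

dead = "m1"
for blk, hosts in replicas.items():
    survivors = hosts - {dead}
    assert survivors, f"{blk} lost!"   # never triggers: at least 2 copies remain

print({blk: sorted(h) for blk, h in replicas.items()})
```

Kill any single machine in this sketch and every block is still readable from at least two survivors.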

Now you can upload a 1TB file to HDFS and rest easy!

OK, the above, in plain English with a series of hand-drawn diagrams, should let even a complete beginner understand the basic architecture of Hadoop.

Next, we will talk about the core mechanisms and principles behind how HDFS, this outstanding distributed storage system, supports highly concurrent requests and high-performance file uploads.

**How Can Hadoop Support Thousands of Concurrent Accesses per Second in a Large Cluster?**

[Secret Under the Iceberg] How Does Hadoop Improve the Upload Performance of Terabyte-Scale Files by 100x? Stay tuned!


Shishan’s Architecture Notes (ID: shishan100)

More than ten years of architecture experience at BAT companies