1. HDFS Concept

HDFS is a file system that stores and manages files through a unified namespace (similar to the directory tree of a local file system). It is distributed, with each node in the server cluster having its own roles and responsibilities.

1.1 The Hadoop Architecture

Hadoop consists of three modules:

  • HDFS (Hadoop Distributed File System): distributed storage
  • MapReduce: distributed computing
  • YARN: resource scheduling framework

1.2 Block Concept

A large number of files can be distributed across different servers.

For example, the client and the file servers (1 to n) are independent machines, and files are distributed from the client machine to those servers. A single file may also be too large to fit on a single disk, so it is split into many smaller pieces stored on different servers, with all the servers connected over the network to form one whole. Each piece defaults to 128 MB in size, and each such piece of a file is called a block in HDFS.

The default block size can be changed via a configuration parameter, but setting it too small or too large is harmful. If blocks are too small, the data is scattered across too many machines and addressing (seek) time dominates; if blocks are too large, the transfer time for a single block becomes too long.
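For instance, assuming the standard Hadoop client library is on the classpath, a client can override the block size through the dfs.blocksize property; a minimal sketch (the 256 MB value is only an example):

```java
import org.apache.hadoop.conf.Configuration;

public class BlockSizeConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration(); // picks up hdfs-site.xml if present
        // Override the block size used for files written by this client;
        // the cluster-wide default normally lives in hdfs-site.xml.
        conf.setLong("dfs.blocksize", 256L * 1024 * 1024); // 256 MB instead of the 128 MB default
        System.out.println("dfs.blocksize = " + conf.getLong("dfs.blocksize", 0));
    }
}
```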

After a user sends a 1 GB file request to the HDFS client, the client splits the file into eight blocks according to the configured block size (128 MB by default), and the blocks are stored across the servers; suppose each server holds two of them. In HDFS, the servers that store the data are called DataNodes (data nodes). This brings us to the three components of HDFS.
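As a rough sketch of that flow (assuming a configured Hadoop client; the path /data/big.bin is hypothetical), the client just writes one byte stream and HDFS does the splitting:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // The client sees one continuous stream; HDFS cuts the bytes into
        // 128 MB blocks behind the scenes and spreads them over DataNodes.
        try (FSDataOutputStream out = fs.create(new Path("/data/big.bin"))) {
            byte[] buf = new byte[8192];
            for (long written = 0; written < (1L << 30); written += buf.length) {
                out.write(buf); // ~1 GB in total -> eight 128 MB blocks
            }
        }
        fs.close();
    }
}
```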

1.3 HDFS Backup

A file is divided into eight blocks and stored on different DataNodes. If one of the DataNodes fails, its data becomes unavailable. Therefore, Hadoop keeps copies of each data block to ensure data reliability.
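A sketch of controlling this with the standard client API: dfs.replication sets the default copy count for new files, and FileSystem.setReplication changes it per file (the path and values here are only examples):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("dfs.replication", 3); // default copy count for new files (3 is stock)

        FileSystem fs = FileSystem.get(conf);
        // Replication can also be changed per existing file:
        fs.setReplication(new Path("/data/big.bin"), (short) 2);
        fs.close();
    }
}
```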

1.4 HDFS Features

  • Keeps multiple copies of the data and provides a fault-tolerance mechanism: a lost copy or a downed node is recovered automatically. Three copies are kept by default.
  • Runs on cheap commodity machines.
  • Suits big data processing. HDFS splits files into blocks (64 MB by default in Hadoop 1.x, 128 MB since Hadoop 2.x), and the block-to-file mapping is held in the NameNode's memory, so too many small files put a heavy burden on that memory.

1.5 HDFS Architecture

NameNode: the master node. It manages the block mapping, handles client read and write requests, configures the replication policy, and manages the HDFS namespace.

SecondaryNameNode: shares part of the NameNode's workload. It is a cold backup of the NameNode; it merges the fsimage and the edits log and sends the result back to the NameNode.

DataNode: a slave node. It stores the data blocks sent by the client and performs read/write operations on those blocks.

Hot backup: B is a hot backup of A. If A fails, B immediately takes over and runs A's job.

Cold backup: B is a cold backup of A. If A fails, B cannot immediately replace A, but B stores some of A's information, which reduces the loss after A goes down.

Fsimage: metadata image file (directory tree of the file system)

Edits: Metadata operation logs (records of changes made to the file system)

NameNode in-memory metadata = fsimage + edits

2. The Three Components of HDFS

Each big data framework follows this master/slave pattern: HDFS is one NameNode and multiple DataNodes, MapReduce is one JobTracker and multiple TaskTrackers, YARN is one ResourceManager and multiple NodeManagers, and Spark is one Master and multiple slaves.

2.1 Introduction to NameNode

Big data frameworks are distributed, and each role may run on a different server, so the roles need the network to communicate. When a client wants to read a file, it must know how the file is divided into blocks and on which server each block is stored. The information that describes a file is called the file's metadata, and the metadata is stored in the NameNode's memory.
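To make this concrete, here is a sketch that asks the NameNode for exactly this metadata, the block locations of a file, using the standard FileSystem API (the path is the hypothetical file from the earlier sketch):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationsSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/data/big.bin"));

        // The NameNode answers from its in-memory metadata: which blocks
        // make up the file and which DataNodes hold each of them.
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("offset=" + loc.getOffset()
                    + " length=" + loc.getLength()
                    + " hosts=" + String.join(",", loc.getHosts()));
        }
        fs.close();
    }
}
```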

2.2 Introduction to Metadata

Size of metadata: each file, block, and directory takes up roughly 150 bytes of metadata, which is why HDFS is suited to storing large files but not large numbers of small files. Storing one large file costs a single file entry plus one entry per block (a 1 GB file in eight blocks needs about nine 150-byte entries), whereas N small files cost at least N file entries plus N block entries. That is a bad deal for the NameNode's memory.

Metadata information is stored in a namespace image file (hereinafter called fsimage) and an edit log (hereinafter called edits log)

fsimage: the metadata image file; it stores the file system's directory tree and the mapping between files and blocks
edits log: the log file; it stores the change records of files

If the NameNode goes down, the metadata in its memory can no longer be read. To guard against this, and to speed up the NameNode's recovery from a failure, the SecondaryNameNode role was designed.

Log buffering: when a client writes to HDFS, the operation must be recorded in the log, so two buffer areas are prepared in advance. When the first buffer fills up, its records are flushed to disk (the edits log) and into the NameNode's memory while the second buffer takes over accepting new records. This ensures that client writes can be logged at all times.
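Below is a toy sketch of that double-buffer idea in plain Java. It is not Hadoop's actual FSEditLog code, just an illustration of the technique: writers append to one buffer while the other is being flushed.

```java
import java.util.ArrayList;
import java.util.List;

public class DoubleBufferLog {
    private List<String> current = new ArrayList<>(); // accepts new records
    private List<String> syncing = new ArrayList<>(); // being flushed to disk

    public synchronized void log(String record) {
        current.add(record); // writers never wait for a disk flush
    }

    public void flush() { // assumes a single flusher thread, as in the real design
        synchronized (this) {        // swap the two buffers under the lock...
            List<String> tmp = syncing;
            syncing = current;
            current = tmp;
        }
        for (String record : syncing) {
            System.out.println("edits <- " + record); // stand-in for a real disk write
        }
        syncing.clear();
    }

    public static void main(String[] args) {
        DoubleBufferLog log = new DoubleBufferLog();
        log.log("OP_MKDIR /user/demo");
        log.log("OP_ADD /user/demo/file1");
        log.flush();
    }
}
```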

2.3 Introduction to SecondaryNameNode

Its functions come down to the following points:

1. Back up the NameNode metadata
2. Speed up the restart of the NameNode
3. Serve as a new NameNode if necessary

Why does SecondaryNameNode speed up the recovery of NameNode?

When the cluster starts, the system records the start time. Once a set period has elapsed, or the edits log file on the NameNode becomes full, a checkpoint operation is triggered. (Spark uses the same checkpoint idea to back up important data.)
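In stock HDFS this trigger is governed by two properties, dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns, which normally live in hdfs-site.xml; a small sketch using their default values:

```java
import org.apache.hadoop.conf.Configuration;

public class CheckpointConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Checkpoint once an hour, or after one million logged transactions,
        // whichever comes first (these are the stock defaults).
        conf.setLong("dfs.namenode.checkpoint.period", 3600);   // seconds
        conf.setLong("dfs.namenode.checkpoint.txns", 1000000L); // edit-log transactions
        System.out.println("period=" + conf.getLong("dfs.namenode.checkpoint.period", 0)
                + "s, txns=" + conf.getLong("dfs.namenode.checkpoint.txns", 0));
    }
}
```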

The procedure is laid out step by step below:

1. The SecondaryNameNode pulls the edits log and fsimage from the NameNode via HTTP GET
2. The SecondaryNameNode merges the edits log and fsimage into a new file called fsimage.ckpt
3. When the merge completes, the SecondaryNameNode sends fsimage.ckpt back to the NameNode
4. Meanwhile, clients are likely still reading and writing through the NameNode, so newly generated log records go to a separate file, edits.new
5. The fsimage.ckpt that was just sent back replaces the old fsimage, and edits.new becomes the new edits log
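A schematic walk-through of these five steps, with the namespace reduced to an in-memory map; all names are hypothetical, and the real exchange happens over HTTP between the two daemons:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckpointWalkthrough {
    public static void main(String[] args) {
        Map<String, String> fsimage = new HashMap<>();          // last checkpoint
        List<String> edits = new ArrayList<>(
                Arrays.asList("PUT /a one", "PUT /b two"));     // changes since then
        List<String> editsNew = new ArrayList<>();              // step 4: fresh writes

        // Steps 1-2: fetch fsimage + edits, replay the log onto the image.
        Map<String, String> fsimageCkpt = new HashMap<>(fsimage);
        for (String op : edits) {
            String[] parts = op.split(" ");
            fsimageCkpt.put(parts[1], parts[2]);
        }

        // Step 4 happens concurrently: new client writes go to edits.new.
        editsNew.add("PUT /c three");

        // Steps 3 and 5: fsimage.ckpt is shipped back and becomes the new
        // fsimage; edits.new becomes the new edits log.
        fsimage = fsimageCkpt;
        edits = editsNew;

        System.out.println("fsimage=" + fsimage + " edits=" + edits);
    }
}
```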

Why does the SecondaryNameNode speed up the restart of the NameNode?

First, how does a NameNode recover after a crash on its own?

It first reads the fsimage image file into memory, then executes every operation recorded in the edits log to rebuild all the metadata and return to the state before the shutdown. This process is very slow.

With a SecondaryNameNode, however, a large part of the metadata can be recovered in one go from the fsimage.ckpt it produced; the NameNode then only needs to replay the new operations recorded in edits.new (merged into the edits log) to finish recovering.
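A toy sketch of why this is faster: the bulk of the namespace is loaded from the checkpoint, and only a short tail of edits is replayed (the data is made up for illustration):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RestartRecoverySketch {
    public static void main(String[] args) {
        // One bulk load from fsimage.ckpt restores most of the namespace...
        Map<String, String> namespace = new HashMap<>();
        namespace.put("/a", "one");
        namespace.put("/b", "two");

        // ...so only the short tail of operations logged after the
        // checkpoint (edits.new) has to be replayed one by one.
        List<String> tail = Arrays.asList("PUT /c three");
        for (String op : tail) {
            String[] parts = op.split(" ");
            namespace.put(parts[1], parts[2]);
        }
        System.out.println("recovered namespace: " + namespace);
    }
}
```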

Once it is determined that the NameNode cannot be restarted, the SecondaryNameNode can take over as the new NameNode by running the following command:

hadoop-daemon.sh start namenode

Of course, it is not hard to see that this approach is inelegant: there is inevitably a service gap between the NameNode going down and the SecondaryNameNode host taking over. The Hadoop HA (high availability) mechanism solves this problem.

2.4 DataNodes

Data is stored on DataNodes, and multiple backup copies are kept across DataNodes.

When a DataNode starts up, it registers with the NameNode and then maintains a heartbeat. If no heartbeat is received from a DataNode within the time threshold, HDFS considers that DataNode dead.
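The heartbeat interval and the dead-node threshold come from two standard properties; a sketch using the stock defaults (the widely quoted "10 minutes" works out to 10 minutes 30 seconds with these values):

```java
import org.apache.hadoop.conf.Configuration;

public class HeartbeatConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setLong("dfs.heartbeat.interval", 3);                       // seconds
        conf.setLong("dfs.namenode.heartbeat.recheck-interval", 300000); // milliseconds

        // A DataNode is declared dead after roughly
        // 2 * recheck-interval + 10 * heartbeat-interval.
        long timeoutMs = 2 * conf.getLong("dfs.namenode.heartbeat.recheck-interval", 300000)
                + 10 * 1000 * conf.getLong("dfs.heartbeat.interval", 3);
        System.out.println("DataNode considered dead after " + timeoutMs / 1000 + "s");
    }
}
```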

A block stores not only the data itself but also a copy of its metadata (including the block's length, its checksum, and a timestamp). The DataNode also periodically reports information about all of its current blocks to the NameNode, and this metadata is used to check whether each block is still in a normal state.
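A toy illustration of the checksum part (real HDFS keeps CRC32C checksums per 512-byte chunk in a sidecar .meta file; this simplified version checksums one whole buffer):

```java
import java.util.zip.CRC32;

public class ChecksumSketch {
    public static void main(String[] args) {
        byte[] block = "some block data".getBytes();

        CRC32 crc = new CRC32();
        crc.update(block, 0, block.length);
        long stored = crc.getValue(); // persisted next to the block data

        // Later (e.g. around a periodic block report) the block is re-read
        // and the checksum recomputed to detect silent corruption.
        CRC32 check = new CRC32();
        check.update(block, 0, block.length);
        System.out.println(check.getValue() == stored ? "block OK" : "block corrupt");
    }
}
```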

3. HDFS Mechanisms

3.1 Heartbeat Mechanism

The heartbeat mechanism solves the communication problem among the nodes of an HDFS cluster, and it is also how the NameNode commands DataNodes to perform operations.

1. When the master NameNode starts, it starts an IPC server
2. When a slave DataNode starts, it connects to the NameNode and sends it a heartbeat every 3 seconds
3. The NameNode sends task instructions to the DataNode through the return value of the heartbeat

The role of the heartbeat mechanism

1. The NameNode has full control over data block replication: it periodically receives heartbeat signals and block status reports from every DataNode in the cluster.

2. A DataNode registers with the NameNode when it starts, sends it block reports periodically, and sends a heartbeat every 3 seconds; the NameNode's reply to the heartbeat carries instructions for the DataNode, such as copying a data block to another machine. If a DataNode sends no heartbeat to the NameNode for more than 10 minutes, the NameNode judges that DataNode unavailable, and client reads and writes are no longer routed to it.

3. When the Hadoop cluster starts, it enters safe mode, and this is driven by the heartbeat mechanism: as the cluster comes up, each DataNode sends a block report to the NameNode, which tallies the total number of blocks reported. While reported blocks/total is below the threshold (the 99.99% cited here), the cluster stays in safe mode; in safe mode, clients can read from HDFS but cannot write to it.
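That threshold is the configurable property dfs.namenode.safemode.threshold-pct; note its stock default is 0.999 (99.9%). A sketch (administrators normally inspect or leave safe mode with `hdfs dfsadmin -safemode get|leave`):

```java
import org.apache.hadoop.conf.Configuration;

public class SafeModeConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Fraction of blocks that must be reported before the NameNode
        // leaves safe mode; 0.999f is the stock default.
        conf.setFloat("dfs.namenode.safemode.threshold-pct", 0.999f);
        System.out.println("safemode threshold = "
                + conf.getFloat("dfs.namenode.safemode.threshold-pct", 0.999f));
    }
}
```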

3.2 Load Balancing

Load balancing comes into play when nodes are added or removed, or when disk utilization differs between nodes; it migrates data over the network to keep the cluster highly available.

It is triggered with a command:

$HADOOP_HOME/sbin/start-balancer.sh -t 5%

The 5% is the disk-utilization difference mentioned earlier: once the gap between nodes exceeds 5%, the load-balancing policy is triggered.

Conclusion

1. HDFS is a distributed file system, which can simply be understood as a file system composed of multiple machines.

2. HDFS has three important roles: the client provides a unified operation interface to the outside, DataNodes store the data, and the NameNode coordinates and manages the data.

3. HDFS cuts large files into blocks and keeps backup copies of each block, ensuring high data availability, which makes it suitable for storing big data.

4. The NameNode achieves data recovery and high availability through the fsimage and the edits log.

5. HDFS is not suitable for storing large numbers of small files, and it does not support concurrent writes or random file modification.
