Preface

Hadoop has been under development for more than 10 years, and its versions have gone through numerous updates and iterations. At present, Hadoop is divided into three major versions: Hadoop 1, Hadoop 2, and Hadoop 3.

1 Hadoop 1

Hadoop 1 was originally released to solve two problems:

  • How to store massive data
  • How to compute over massive data

The two core components of Hadoop 1 are HDFS and MapReduce.

2 HDFS1 architecture

HDFS1 is a master-slave architecture with only one master node, called the NameNode, and multiple slave nodes, called DataNodes.

2.1 The NameNode

  • Manages the metadata (the file directory tree): the mapping from files to blocks, and from blocks to the DataNode hosts that store them (see the sketch after this list).
  • Loads the metadata into memory in order to respond quickly to user requests.
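
To make these two mappings concrete, here is a minimal sketch using plain Java collections. It is illustrative only, not the actual NameNode data structures, and all names are made up.

    import java.util.*;

    // Toy model of the NameNode's in-memory metadata (illustrative only).
    public class NameNodeMetadataSketch {
        // Mapping 1: file path -> ordered list of block IDs.
        private final Map<String, List<Long>> fileToBlocks = new HashMap<>();
        // Mapping 2: block ID -> DataNode hosts holding a replica.
        private final Map<Long, Set<String>> blockToDataNodes = new HashMap<>();

        public void addFile(String path, List<Long> blockIds) {
            fileToBlocks.put(path, blockIds);
        }

        public void addReplica(long blockId, String dataNodeHost) {
            blockToDataNodes.computeIfAbsent(blockId, k -> new HashSet<>()).add(dataNodeHost);
        }

        // To read a file, a client first asks the NameNode where its blocks live.
        public List<Set<String>> locateFile(String path) {
            List<Set<String>> locations = new ArrayList<>();
            for (long blockId : fileToBlocks.getOrDefault(path, List.of())) {
                locations.add(blockToDataNodes.getOrDefault(blockId, Set.of()));
            }
            return locations;
        }
    }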

2.2 The DataNode

  • Stores the data, splitting uploaded files into fixed-size blocks (the default block size is 64 MB in Hadoop 1 and 128 MB in Hadoop 2).
  • To keep the data safe, each block has three replicas by default (see the configuration sketch after this list).
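
Both values are configurable. Here is a hedged sketch of setting the block size and replication factor through the Hadoop Java client API; the property names shown are the Hadoop 2 ones (Hadoop 1 used dfs.block.size), and the file path is made up.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockConfigSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setLong("dfs.blocksize", 128L * 1024 * 1024); // 128 MB blocks
            conf.setInt("dfs.replication", 3);                 // 3 replicas (the default)

            // Files written through this client use the settings above.
            FileSystem fs = FileSystem.get(conf);
            try (FSDataOutputStream out = fs.create(new Path("/demo/data.txt"))) {
                out.writeUTF("hello hdfs");
            }
        }
    }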

3 HDFS1 architecture defects

Although HDFS1 was originally designed to solve the problem of massive data storage, some defects gradually appeared in use, of which the following two are the main ones:

  • Single point of failure
  • Memory constraints

3.1 Single point of failure

A single point of failure, as the name implies, means that the failure of a single node brings down the whole system, which is a fatal defect for HDFS. An HDFS cluster is made up of multiple nodes, the most important of which is the node where the NameNode resides. The NameNode holds all the metadata of the stored data, so if the NameNode fails, the entire HDFS becomes unavailable.

3.1.1 High availability

To solve this problem, HDFS put forward a high availability (HA) scheme: run one more NameNode, so that when one crashes or loses communication, the other can take its place. This keeps HDFS available and provides automatic failover.

To realize automatic failover, a third-party coordination cluster is introduced, and this is where ZooKeeper comes in. Each NameNode communicates with ZooKeeper and tries to create a lock there; the NameNode holding the lock is treated as the active NameNode, and the backup one as the standby NameNode. A monitoring process called ZKFC (ZKFailoverController) also runs alongside each NameNode to watch its health. Once a problem is detected in the active NameNode, the standby NameNode takes over automatically, as the election sketch below illustrates.
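
The lock-based election can be sketched with the plain ZooKeeper Java client. This is a simplified illustration of the idea, not the actual ZKFC implementation; the connection string, znode path, and NameNode ID are made up, and the parent znode is assumed to exist.

    import org.apache.zookeeper.*;

    public class ActiveElectionSketch {
        public static void main(String[] args) throws Exception {
            // Connect to the ZooKeeper ensemble (hosts are illustrative).
            ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 5000, event -> {});
            try {
                // An ephemeral znode acts as the lock: ZooKeeper removes it
                // automatically if the session of its creator dies.
                zk.create("/hdfs-ha/active-lock", "nn1".getBytes(),
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                System.out.println("Lock acquired: this NameNode becomes ACTIVE");
            } catch (KeeperException.NodeExistsException e) {
                // Another NameNode holds the lock: stay standby and watch for its removal.
                zk.exists("/hdfs-ha/active-lock", event -> {
                    if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                        System.out.println("Active gone: standby may try to take over");
                    }
                });
                System.out.println("Lock taken: this NameNode stays STANDBY");
            }
        }
    }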

3.1.2 Data consistency problem

Introducing a second NameNode raises a data consistency problem. If, when the current active NameNode fails, the standby NameNode's copy of the metadata does not match that of the failed NameNode, some requests will not find their corresponding data. From the user's point of view, data that should exist simply appears to be missing.

3.1.3 JournalNode

So, to avoid this, it is important to keep the metadata consistent. Here HDFS introduces the concept of a shared file system: a number of JournalNode processes run in the background (they only store metadata synchronously, so the load on them is light) and together form a shared file system that stores the metadata edit log (editlog). The JournalNode group is itself highly available. The active NameNode writes its metadata changes to the JournalNodes in real time, and the standby NameNode reads them back in real time, so the standby's metadata stays identical to the active's. A configuration sketch follows.
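
In a real deployment this shared edit log is configured through the Quorum Journal Manager. The sketch below expresses the relevant hdfs-site.xml properties through the Hadoop Configuration API; the host names and the nameservice ID mycluster are illustrative.

    import org.apache.hadoop.conf.Configuration;

    public class QjmConfigSketch {
        public static Configuration sharedEditsConfig() {
            Configuration conf = new Configuration();
            // One logical nameservice backed by two NameNodes.
            conf.set("dfs.nameservices", "mycluster");
            conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
            conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020");
            conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020");
            // The shared edit log: a quorum of three JournalNodes.
            conf.set("dfs.namenode.shared.edits.dir",
                    "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster");
            // Let ZKFC drive automatic failover through ZooKeeper.
            conf.setBoolean("dfs.ha.automatic-failover.enabled", true);
            return conf;
        }
    }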

3.2 Memory constraints

The high availability scheme solves the single point of failure problem, and it also shows how important the NameNode is. The reason data stored on the DataNodes can be retrieved so quickly through the NameNode is that the NameNode keeps all the metadata in memory, where lookups are far faster than on disk.

As data accumulates over time and the metadata grows larger and larger, does the NameNode's memory become a limit? The answer is yes: the NameNode is itself a single server, and the memory it can be fitted with is finite. When memory runs short, adding more to that machine is a temporary fix, but it treats the symptom rather than the cause. The cluster itself consists of many servers, so when one node's memory is not enough, we can use the memory of other servers to share the metadata and complete the data retrieval. A rough estimate of the pressure is sketched below.
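
As a rough back-of-envelope calculation: a commonly cited rule of thumb is on the order of 150 bytes of NameNode heap per metadata object (one object per file plus one per block). The exact figure varies by version, so treat the numbers below as an illustration only.

    public class NameNodeHeapEstimate {
        public static void main(String[] args) {
            long bytesPerObject = 150;      // rule-of-thumb heap cost, not exact
            long files = 500_000_000L;      // half a billion small files
            long blocksPerFile = 1;         // small files: one block each
            long objects = files * (1 + blocksPerFile);
            double heapGb = objects * bytesPerObject / (1024.0 * 1024 * 1024);
            // Prints roughly 140 GB: more than a single JVM heap comfortably holds.
            System.out.printf("Approx. NameNode heap needed: %.0f GB%n", heapGb);
        }
    }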

So, here HDFS uses Federation.

3.2.1 Federation

The federation mechanism works by splitting the namespace across multiple NameNodes and numbering the namespaces (think of them as the C, D, and E drives on a computer). Different namespaces do not interfere with each other. Each DataNode keeps a directory corresponding to each namespace number, so blocks carrying the same number are managed by the corresponding namespace. Of course, to avoid the single point of failure problem, each NameNode still adopts the high availability scheme. A configuration sketch follows.
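
A hedged sketch of what a two-namespace federation looks like in configuration terms, again via the Hadoop Configuration API; the nameservice IDs and host names are illustrative.

    import org.apache.hadoop.conf.Configuration;

    public class FederationConfigSketch {
        public static Configuration federatedConfig() {
            Configuration conf = new Configuration();
            // Two independent namespaces, each served by its own NameNode.
            conf.set("dfs.nameservices", "ns1,ns2");
            conf.set("dfs.namenode.rpc-address.ns1", "nn-a.example.com:8020");
            conf.set("dfs.namenode.rpc-address.ns2", "nn-b.example.com:8020");
            // Every DataNode registers with both NameNodes and keeps one
            // block pool (directory) per namespace.
            return conf;
        }
    }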

4 HDFS2 architecture

As a result, HDFS solved the single point of failure and the memory limitation, and HDFS1 entered the era of HDFS2, which is the version we most often use today and which is relatively stable.

  • HA solution (High Availability)

Solves the HDFS1 NameNode single point of failure.

  • Federation

Solves the HDFS1 memory limitation.

5 HDFS3 architecture

As for HDFS3, it optimizes the HA scheme on the basis of HDFS2 and adds erasure coding technology.

  • The HA scheme supports multiple NameNodes
  • Introduces erasure coding (EC)

The optimized HA solution supports more than one standby NameNode, making the production environment more reliable. The background of erasure coding is that the storage overhead of 3 replicas in HDFS is too large in some scenarios; for example, keeping three copies of cold data wastes space. A natural improvement is to use erasure coding (EC) instead of replication, which provides the same level of fault tolerance with much less storage. In a typical erasure coding (EC) setup, the storage overhead is no more than 50%. The replication factor of an EC file is meaningless: it is always 1 and cannot be changed with the -setrep command. A brief usage sketch follows.
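
To make the overhead claim concrete: with 3-way replication, 6 data blocks occupy 18 blocks of raw storage (200% overhead), while the common RS-6-3 scheme stores 6 data blocks plus 3 parity blocks in 9 blocks total (50% overhead) and still tolerates the loss of any 3 of them. Below is a hedged sketch of applying a built-in EC policy to a cold-data directory through the Hadoop 3 Java API; the directory path is made up.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class ErasureCodingSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

            // Apply the built-in Reed-Solomon 6+3 policy to a directory:
            // files written under it are striped into 6 data + 3 parity blocks.
            Path coldDir = new Path("/archive/cold");
            dfs.mkdirs(coldDir);
            dfs.setErasureCodingPolicy(coldDir, "RS-6-3-1024k");

            System.out.println("EC policy on " + coldDir + ": "
                    + dfs.getErasureCodingPolicy(coldDir));
        }
    }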

With the addition of erasure coding, HDFS3 not only improves storage efficiency but also maintains the same data durability.