I. HDFS HA

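1. Nameservice (dfs.nameservices)

  • A nameservice is a configuration parameter, not a process. Clients address the logical nameservice rather than a specific NN, so scripts and code do not need to change when the Active NN changes.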
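
As a rough illustration of what "configuration, not a process" means, the sketch below generates the hdfs-site.xml properties that define one nameservice with two NameNodes. The property keys are the standard HDFS HA ones; the nameservice ID mycluster, the NameNode IDs nn1 and nn2, the host names, and the two helper functions are placeholders made up for this example.

```python
import xml.etree.ElementTree as ET


def nameservice_properties(ns_id, namenodes, rpc_port=8020):
    """Return the HA properties for one logical nameservice.

    ns_id and the host names in `namenodes` are placeholders for illustration.
    """
    props = {
        "dfs.nameservices": ns_id,
        f"dfs.ha.namenodes.{ns_id}": ",".join(namenodes),
        f"dfs.client.failover.proxy.provider.{ns_id}":
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    }
    # One RPC address per NameNode, keyed by nameservice ID and NameNode ID.
    for nn_id, host in namenodes.items():
        props[f"dfs.namenode.rpc-address.{ns_id}.{nn_id}"] = f"{host}:{rpc_port}"
    return props


def to_hadoop_xml(props):
    """Render a property dict in the <configuration> layout Hadoop reads."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


props = nameservice_properties(
    "mycluster", {"nn1": "host1.example.com", "nn2": "host2.example.com"}
)
xml_text = to_hadoop_xml(props)
print(xml_text)
```

With a configuration like this, clients refer to hdfs://mycluster/... and never name nn1 or nn2 directly, which is why a failover does not require touching scripts.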

2. HDFS HA architecture

Active NameNode(Active NN)

  • Receives RPC requests from clients and writes each edit both to its own editlog file and to the JN cluster. It also receives heartbeats and block reports from every DN.

Standby NameNode(Standby NN)

  • Replay: reads the edits that the Active NN wrote to the JN cluster and applies them again locally, which keeps the two NNs in sync.
  • While replaying, it also receives heartbeats and block reports from every DN, so it can switch to the Active state at any time and serve clients.
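
The replay idea can be sketched with a toy model. This is not Hadoop code: Journal stands in for the JN cluster, the operation format ("mkdir"/"delete" tuples) is invented for the example, and a real Standby NN tails edits continuously rather than on demand.

```python
class Journal:
    """Stand-in for the JN cluster: an ordered list of (txid, op) records."""

    def __init__(self):
        self.records = []

    def append(self, op):
        txid = len(self.records) + 1
        self.records.append((txid, op))
        return txid

    def read_from(self, txid):
        """Return every record whose transaction ID is greater than txid."""
        return [r for r in self.records if r[0] > txid]


class NameNode:
    def __init__(self, journal):
        self.journal = journal
        self.namespace = set()   # paths that currently exist
        self.last_txid = 0       # last transaction applied locally

    def _apply(self, op):
        kind, path = op
        if kind == "mkdir":
            self.namespace.add(path)
        elif kind == "delete":
            self.namespace.discard(path)

    def write(self, op):
        """Active path: log the edit first, then apply it in memory."""
        self.last_txid = self.journal.append(op)
        self._apply(op)

    def tail_edits(self):
        """Standby path: replay only the edits not yet applied locally."""
        for txid, op in self.journal.read_from(self.last_txid):
            self._apply(op)
            self.last_txid = txid


journal = Journal()
active, standby = NameNode(journal), NameNode(journal)
active.write(("mkdir", "/data"))
active.write(("mkdir", "/tmp"))
active.write(("delete", "/tmp"))
standby.tail_edits()
print(standby.namespace == active.namespace)
```

Because the Standby tracks the last transaction ID it applied, calling tail_edits() repeatedly is safe: each call picks up only what is new.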

JournalNode(JN)

  • Serves as the communication channel for synchronizing edits between the Active NN and the Standby NN.
  • At least three nodes must be deployed, and the count should be odd.
  • With N JournalNodes, at most (N - 1) / 2 of them may fail.
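
The failure limit follows from majority voting: an edit counts as written only once more than half of the JNs have acknowledged it. A small sketch of the arithmetic (the function names are made up for this example):

```python
def majority(n):
    """Smallest number of JournalNodes that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """Largest number of JournalNodes that can fail while a majority survives."""
    return (n - 1) // 2


def edit_is_durable(acks, n):
    """An edit counts as written once a majority of the n JNs acknowledged it."""
    return acks >= majority(n)


for n in (3, 4, 5, 7):
    print(n, majority(n), tolerated_failures(n))
```

Three JNs tolerate one failure and five tolerate two. Four JNs still tolerate only one, which is why an odd number is deployed.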

DataNode(DN)

  • Sends heartbeats and block reports to both NNs simultaneously.

ZooKeeper(ZK)

  • Provides coordination services: unified naming, state synchronization, cluster management, management of distributed application configuration items, and so on.

ZKFC process

  • Monitors the health of its NN and periodically sends heartbeats to the ZK cluster so that the NN can take part in the election.
  • When the ZK cluster elects it, the ZKFC process calls its NN over RPC to transition it to Active.
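
The two ZKFC duties above can be sketched as a small simulation. Everything here is hypothetical: FakeZooKeeper is an in-memory stand-in rather than a ZooKeeper client, the class and method names are invented, and fencing of the old Active NN is left out entirely.

```python
class FakeZooKeeper:
    """In-memory stand-in for ZK: one ephemeral lock znode owned by a session."""

    def __init__(self):
        self.lock_owner = None

    def try_create_lock(self, session):
        """Create the lock if absent; return True if `session` now holds it."""
        if self.lock_owner is None:
            self.lock_owner = session
        return self.lock_owner == session

    def close_session(self, session):
        """Ephemeral znodes vanish when the owning session ends."""
        if self.lock_owner == session:
            self.lock_owner = None


class NameNode:
    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.state = "standby"


class FailoverController:
    """Hypothetical ZKFC loop body: health check, then join or leave the election."""

    def __init__(self, zk, nn):
        self.zk = zk
        self.nn = nn

    def tick(self):
        if not self.nn.healthy:
            # An unhealthy NN gives up its session so the peer can win the lock.
            self.zk.close_session(self.nn.name)
            self.nn.state = "standby"
        elif self.zk.try_create_lock(self.nn.name):
            self.nn.state = "active"   # in HDFS this step is an RPC to the NN
        else:
            self.nn.state = "standby"


zk = FakeZooKeeper()
nn1, nn2 = NameNode("nn1"), NameNode("nn2")
zkfc1, zkfc2 = FailoverController(zk, nn1), FailoverController(zk, nn2)

zkfc1.tick()
zkfc2.tick()
states_before = (nn1.state, nn2.state)

nn1.healthy = False   # simulate a failed health check on nn1
zkfc1.tick()
zkfc2.tick()
states_after = (nn1.state, nn2.state)
print(states_before, states_after)
```

The point of the design is that the NNs never negotiate with each other; whichever controller holds the lock in ZK decides which NN is Active.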

II. YARN HA

ResourceManager(RM)

  • At startup, each RM tries to write a lock file (znode) to the ZK cluster. YARN places it under /yarn-leader-election by default, whereas /hadoop-ha is the default parent used by the HDFS ZKFC. If the write succeeds, that RM becomes the Active RM; otherwise it becomes the Standby RM.
  • The Standby RM watches the lock file and tries to create it when it no longer exists.
  • The Active RM receives client requests, receives and monitors NM resource reports, allocates and schedules resources, and starts and monitors the Application Master.

NodeManager(NM)

  • Starts containers and runs tasks in them. It reports node resources to the RM, while the running tasks report their computing progress to the Application Master.

RMStore

  • RM job information is stored under /rmstore in the ZK cluster; the Active RM writes application information to this path.
  • When the Active RM fails and the Standby RM becomes Active, the new Active RM reads the stored job information and rebuilds its in-memory job state. It then starts its internal services, receives NM heartbeats, rebuilds cluster resource information, and accepts job requests from clients.
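
The recovery sequence can be sketched as follows. This is a toy model, not YARN code: StateStore is a plain dictionary standing in for the ZK-backed store, and the class names, method names, and application fields are all invented for illustration.

```python
class StateStore:
    """Stand-in for the ZK-backed store: application ID -> saved metadata."""

    def __init__(self):
        self.apps = {}

    def save(self, app_id, info):
        self.apps[app_id] = dict(info)

    def load_all(self):
        return {app_id: dict(info) for app_id, info in self.apps.items()}


class ResourceManager:
    def __init__(self, store):
        self.store = store
        self.state = "standby"
        self.apps = {}    # in-memory view, lost when the process dies
        self.nodes = {}   # node -> free memory in MB, rebuilt from NM heartbeats

    def transition_to_active(self):
        """Rebuild in-memory application state from the store, then serve."""
        self.apps = self.store.load_all()
        self.state = "active"

    def submit(self, app_id, info):
        if self.state != "active":
            raise RuntimeError("only the Active RM accepts submissions")
        self.store.save(app_id, info)   # persist before acknowledging
        self.apps[app_id] = dict(info)

    def node_heartbeat(self, node, free_mb):
        self.nodes[node] = free_mb


store = StateStore()
rm1, rm2 = ResourceManager(store), ResourceManager(store)
rm1.transition_to_active()
rm1.submit("app_0001", {"user": "alice", "queue": "default"})
rm1.node_heartbeat("nm1", 8192)

# rm1 fails; rm2 wins the election and recovers from the shared store.
rm2.transition_to_active()
rm2.node_heartbeat("nm1", 8192)
print(rm2.apps, rm2.nodes)
```

Note the split in this sketch: application state comes back from the store, while node resources are not persisted and reappear only as NMs heartbeat to the new Active RM.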

Comparing HDFS HA with YARN HA

ZKFC

  • In HDFS HA, ZKFC is a separate process.
  • In YARN HA, the equivalent of ZKFC is a thread inside the RM.

Master-slave reporting

  • HDFS HA: DNs report to both the Active NN and the Standby NN.
  • YARN HA: NMs report only to the Active RM.

State storage

  • HDFS HA: NN state is stored in a dedicated log cluster, the JNs.
  • YARN HA: RM state is stored in the existing ZK cluster.