Purpose

This document is a starting point for users working with the Hadoop Distributed File System (HDFS), either as part of a Hadoop cluster or as a stand-alone general-purpose distributed file system. While HDFS is designed to “just work” in many environments, a working knowledge of HDFS helps greatly with configuration and diagnostics.

Overview

HDFS is the primary distributed storage used by Hadoop applications. An HDFS cluster consists of a NameNode that manages the file system namespace and DataNodes that store the actual data. HDFS is described in detail in the HDFS Architecture Guide. This user guide focuses on the interaction of users and administrators with HDFS clusters. The HDFS architecture diagram depicts the basic interactions among the NameNode, the DataNodes, and the clients. Clients contact the NameNode for file metadata or file modifications and perform the actual file I/O directly with the DataNodes.

Here are some salient features that may be of interest to many users.

  • Hadoop, including HDFS, is well suited for distributed storage and distributed processing on commodity hardware. It is fault tolerant, scalable, and simple to expand. MapReduce, well known for its simplicity and applicability to a large number of distributed applications, is an integral part of Hadoop.
  • HDFS is highly configurable, with a default configuration well suited to many installations. Most of the time, the configuration needs to be tuned only for very large clusters.
  • Hadoop is written in Java and is supported on all major platforms.
  • Hadoop supports shell-like commands for interacting with HDFS directly.
  • The NameNode and DataNodes have built-in web servers that make it easy to check the current status of the cluster.
  • New features and improvements are implemented in HDFS regularly. The following is a subset of useful features (a short command sketch for a couple of them follows this list):
    • File permissions and authentication
    • Rack awareness: takes a node's physical location into account when scheduling tasks and allocating storage
    • Safemode: an administrative mode for maintenance
    • fsck: a utility to diagnose the health of the file system and to find missing files or blocks
    • fetchdt: a utility to fetch a delegation token and store it in a file on the local file system
    • Balancer: a tool to rebalance the cluster when data is unevenly distributed among DataNodes
    • Upgrade and rollback: after a software upgrade, it is possible to roll HDFS back to its state before the upgrade in case of unexpected problems
    • Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of the file containing the log of HDFS modifications within certain limits at the NameNode
    • Checkpoint node: performs periodic checkpoints of the namespace and helps minimize the size of the log of HDFS changes stored at the NameNode. It replaces the role previously filled by the Secondary NameNode, though it is not yet as battle-hardened. The NameNode allows multiple Checkpoint nodes simultaneously, as long as no Backup node is registered with the system.
    • Backup node: an extension of the Checkpoint node. In addition to checkpointing, it also receives a stream of edits from the NameNode and maintains its own in-memory copy of the namespace, which is always in sync with the active NameNode's namespace state. Only one Backup node may be registered with the NameNode at a time.
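As a hedged illustration of two of the tools above, the following sketch runs fsck against the root of the namespace and fetches a delegation token over the NameNode's HTTP interface. The NameNode address and token file path are placeholders, and a secured (Kerberos) cluster is assumed for fetchdt to return a useful token.

    # Check file system health; list files, blocks, and their locations under /
    bin/hdfs fsck / -files -blocks -locations

    # Fetch a delegation token from the NameNode's HTTP interface (placeholder
    # address) and store it in a local file for later use by other tools.
    bin/hdfs fetchdt --webservice http://namenode-name:50070 /tmp/my.delegation.token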

Prerequisites

The following documentation describes how to install and start a Hadoop cluster:

  • Single Node Setup, for first-time users
  • Cluster Setup, for large, distributed clusters

The rest of this document assumes the user is able to set up and run HDFS with at least one DataNode. For the purposes of this document, the NameNode and a DataNode may run on the same physical machine.

Web Interface

The NameNode and DataNodes each run an internal web server that displays basic information about the current status of the cluster. With the default configuration, the NameNode front page is at http://namenode-name:50070/. It lists the DataNodes in the cluster and basic statistics of the cluster. The web interface can also be used to browse the file system (click “Browse the file system” on the NameNode front page).
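The same information can also be pulled over HTTP for scripted checks. A minimal sketch, assuming the NameNode web UI is reachable at the placeholder address namenode-name:50070 and that its JMX servlet is enabled (the default in recent releases):

    # Fetch the NameNode front page (returns HTML)
    curl -s http://namenode-name:50070/

    # Query NameNode metrics as JSON via the JMX servlet
    curl -s "http://namenode-name:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState"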

Shell Commands

Hadoop includes various shell-like commands that directly interact with HDFS and the other file systems that Hadoop supports. The command bin/hdfs dfs -help lists the commands supported by the Hadoop shell, and bin/hdfs dfs -help command-name displays more detailed help for a command. These commands support most of the normal file system operations, such as copying files and changing file permissions, as well as HDFS-specific operations such as changing the replication of files. For details, see the File System Shell Guide.
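A few hedged examples of common operations; the paths and replication factor below are arbitrary placeholders:

    # Create a directory and copy a local file into HDFS
    bin/hdfs dfs -mkdir -p /user/alice/data
    bin/hdfs dfs -put localfile.txt /user/alice/data/

    # List the directory and tighten its permissions
    bin/hdfs dfs -ls /user/alice/data
    bin/hdfs dfs -chmod 750 /user/alice/data

    # Change the replication factor of a file and wait for it to take effect
    bin/hdfs dfs -setrep -w 2 /user/alice/data/localfile.txt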

DFSAdmin Command

The bin/hdfs dfsadmin command supports a few HDFS administration-related operations. bin/hdfs dfsadmin -help lists all the commands currently supported. For example (a short usage sketch follows the list):

  • -report: reports basic statistics of HDFS. Some of this information is also available on the NameNode front page.
  • -safemode: although usually not required, an administrator can manually enter or leave safemode.
  • -finalizeUpgrade: removes the backup of the previous version of the cluster made during the last upgrade.
  • -refreshNodes: updates the NameNode with the set of DataNodes allowed to connect to it; the NameNode re-reads the hosts include and exclude files.
  • -printTopology: prints the topology of the cluster; displays the tree of racks and the DataNodes attached to them, as seen by the NameNode.
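A minimal usage sketch for a few of these subcommands; all of them run against whichever NameNode the client configuration points at:

    # Print capacity, remaining space, and per-DataNode statistics
    bin/hdfs dfsadmin -report

    # Check, enter, and leave safemode manually
    bin/hdfs dfsadmin -safemode get
    bin/hdfs dfsadmin -safemode enter
    bin/hdfs dfsadmin -safemode leave

    # Re-read the hosts include/exclude files and print the rack topology
    bin/hdfs dfsadmin -refreshNodes
    bin/hdfs dfsadmin -printTopology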

For more information, see dfsadmin.

Secondary NameNode

The NameNode stores modifications to the file system as a log appended to a native file system file, edits. When a NameNode starts up, it reads the HDFS state from an image file, fsimage, and then applies the edits from the edits log file. It then writes the new HDFS state to fsimage and starts normal operation with an empty edits file. Since the NameNode merges fsimage and edits only during startup, the edits log file can become very large on a busy cluster, and a larger edits file makes the next NameNode restart take longer.

The Secondary NameNode merges the fsimage and edits files periodically and keeps the edits log size within a limit. It usually runs on a different machine than the primary NameNode since its memory requirements are of the same order as the primary NameNode's.

The start of the checkpoint process on the Secondary NameNode is controlled by two configuration parameters (a quick way to inspect and trigger a checkpoint is sketched after the list):

  • dfs.namenode.checkpoint.period, set to 1 hour by default, specifies the maximum delay between two consecutive checkpoints.
  • dfs.namenode.checkpoint.txns, set to 1 million by default, defines the number of uncheckpointed transactions on the NameNode which will force an urgent checkpoint, even if the checkpoint period has not been reached.
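A hedged sketch for inspecting the effective values and forcing an immediate checkpoint from the Secondary NameNode host; getconf reads whatever configuration is on the local classpath, so it should be run where the relevant hdfs-site.xml lives:

    # Show the effective checkpoint settings
    bin/hdfs getconf -confKey dfs.namenode.checkpoint.period
    bin/hdfs getconf -confKey dfs.namenode.checkpoint.txns

    # Force an immediate checkpoint from the Secondary NameNode
    bin/hdfs secondarynamenode -checkpoint force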

The Secondary NameNode stores the latest checkpoint in a directory that is structured the same way as the primary NameNode's directory, so the checkpointed image is always ready to be read by the primary NameNode if necessary.

For details, see Secondary NameNode.

Checkpoint Node

The NameNode persists its namespace using two files: fsimage, which is the latest checkpoint of the namespace, and the journal (edits log), which records changes to the namespace since the last checkpoint. When the NameNode starts up, it merges fsimage and the edits journal to provide an up-to-date view of the file system metadata. The NameNode then overwrites fsimage with the new HDFS state and begins a new edits journal.

The Checkpoint node periodically creates checkpoints of the namespace. It downloads fsimage and edits from the active NameNode, merges them locally, and uploads the new image back to the active NameNode. The Checkpoint node usually runs on a different machine than the NameNode since its memory requirements are of the same order as the NameNode's. The Checkpoint node is started by bin/hdfs namenode -checkpoint on the node specified in the configuration file.

The location of the Checkpoint (or Backup) node and its accompanying web interface are configured via the dfs.namenode.backup.address and dfs.namenode.backup.http-address configuration variables.
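A minimal sketch for confirming where the Checkpoint node will bind and then starting it; the values returned by getconf come from hdfs-site.xml and are deployment-specific:

    # Confirm the configured backup/checkpoint addresses
    bin/hdfs getconf -confKey dfs.namenode.backup.address
    bin/hdfs getconf -confKey dfs.namenode.backup.http-address

    # Start the Checkpoint node on the configured host
    bin/hdfs namenode -checkpoint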

Two configuration parameters affect the operation of a checkpoint:

  • dfs.namenode.checkpoint.period, set to 1 hour by default, specifies the maximum delay between two consecutive checkpoints.
  • dfs.namenode.checkpoint.txns, set to 1 million by default, defines the number of uncheckpointed transactions on the NameNode which will force an urgent checkpoint, even if the checkpoint period has not been reached.

The Checkpoint node stores the latest checkpoint in a directory that is structured the same way as the NameNode's directory, so the checkpointed image is always ready to be read by the NameNode if necessary.

Multiple Checkpoint nodes may be specified in the cluster configuration file.

For more information, see Namenode.

Backup Node

The Backup node provides the same checkpointing functionality as the Checkpoint node, as well as maintaining an in-memory, up-to-date copy of the file system namespace that is always synchronized with the active NameNode state. Along with accepting a journal stream of file system edits from the NameNode and persisting it to disk, the Backup node also applies those edits to its own copy of the namespace in memory, thus creating a backup of the namespace.

The Backup node does not need to download fsimage and edits files from the active NameNode in order to create a checkpoint, as would be required of a Secondary NameNode or Checkpoint node, since it already has an up-to-date state of the namespace in memory. The Backup node checkpoint process is more efficient because it only needs to save the namespace into the local fsimage file and reset edits.

As the Backup node maintains a copy of the namespace in memory, its RAM requirements are the same as the NameNode's.

The NameNode supports one Backup node at a time. No Checkpoint nodes may be registered while a Backup node is in use. Using multiple Backup nodes concurrently will be supported in the future.

The Backup node is configured in the same manner as the Checkpoint node. It is started with bin/hdfs namenode -backup.

The location of the Backup (or Checkpoint) node and its accompanying web interface are configured via the dfs.namenode.backup.address and dfs.namenode.backup.http-address configuration variables.
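Assuming those address variables are already set in hdfs-site.xml, starting the Backup node is a single command (a sketch; the Checkpoint node example above shows how to confirm the configured addresses first):

    # Start the Backup node on the host named by dfs.namenode.backup.address
    bin/hdfs namenode -backup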

The Backup node makes it possible to run the NameNode with no persistent storage of its own, delegating all responsibility for persisting the namespace state to the Backup node. To do this, start the NameNode with the -importCheckpoint option and configure no persistent storage directories of type edits (dfs.namenode.edits.dir) for the NameNode.

For a complete discussion of the motivation behind the Backup node and Checkpoint node, see HADOOP-4539.

Import Checkpoint

The latest checkpoint can be imported into the NameNode if all other copies of the fsimage and edits files are lost. To do so:

  • Create an empty directory at the location specified by the dfs.namenode.name.dir configuration variable;
  • Point the dfs.namenode.checkpoint.dir configuration variable at the location of the checkpoint directory;
  • Start the NameNode with the -importCheckpoint option, as sketched below.
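A hedged sketch of the import, assuming the directories below are placeholders matching dfs.namenode.name.dir and dfs.namenode.checkpoint.dir in hdfs-site.xml:

    # Empty name directory that the NameNode will populate (placeholder path)
    mkdir -p /data/dfs/name

    # The checkpoint to import must already exist under dfs.namenode.checkpoint.dir,
    # e.g. copied there from a Secondary NameNode or Checkpoint node (placeholder path).
    ls /data/dfs/namesecondary/current

    # Start the NameNode, reading the checkpoint and saving it into the name directory
    bin/hdfs namenode -importCheckpoint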

The NameNode uploads the checkpoint from the dfs.namenode.checkpoint.dir directory and saves it to the NameNode directory set in dfs.namenode.name.dir. The NameNode will fail to start if a valid fsimage already exists in dfs.namenode.name.dir. The NameNode verifies that the image in dfs.namenode.checkpoint.dir is consistent, but does not modify it in any way.

For more information, see Namenode.

Balancer

HDFS data might not always be placed uniformly across the DataNodes. One common reason is the addition of new DataNodes to an existing cluster. When placing new blocks, the NameNode weighs several competing factors before choosing the DataNodes that will receive the replicas:

  • Keep one of the replicas of a block on the same node as the node that is writing the block
  • Spread different replicas of a block across racks, so that the cluster can survive the loss of a whole rack
  • Place one of the replicas on the same rack as the node writing the file, so that cross-rack network I/O is reduced
  • Spread HDFS data uniformly across the DataNodes in the cluster

Due to these competing considerations, data might not be uniformly placed across the DataNodes. HDFS provides a tool for administrators that analyzes block placement and rebalances data across the DataNodes. A brief administrator's guide for the rebalancer is available as a PDF attached to HADOOP-1652.
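A minimal sketch of running the balancer; the threshold below is an arbitrary example value (the percentage by which a DataNode's utilization may deviate from the cluster average before it is rebalanced):

    # Rebalance until every DataNode is within 5% of average cluster utilization;
    # the balancer runs in the foreground and can be interrupted safely at any time.
    bin/hdfs balancer -threshold 5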

For details, see Balancer.