
Simple version

When a jar package is submitted for MapReduce processing, the entire run can be divided into five steps:

  1. The client submits the MapReduce job.
  2. Then the ResourceManager of YARN allocates resources.
  3. Containers are loaded and monitored by NodeManager.
  4. ApplicationMaster interacts with ResourceManager about resource application and status, and NodeManagers manage MapReduce running jobs.
  5. The HDFS is used to distribute job configuration files and JAR packages to each node.

A detailed version

The following is the sequence of events that occur when the MapReduce application is started in a YARN cluster.

  1. The client submits a MapReduce V2 (MRv2) application request to ResourceManager as follows:
hadoop jar wordcount.jar WordCount testdata output
  2. The ApplicationManager component of ResourceManager instructs a NodeManager (running on one of the worker nodes) to start a new ApplicationMaster instance for the application. This is container 0 for the application. The containers that run the mappers and reducers are created later and named 01, 02, 03, and so on.
  3. ApplicationMaster initializes itself by registering with ResourceManager.
  4. ApplicationMaster calculates the resources needed to complete the application. It determines the number of Map tasks to start from the splitting of the input data: it requests the names of the input files and the locations of the data blocks the application needs, and from this information calculates the number of input splits and hence the number of Map tasks needed to process the input data.
  5. ApplicationMaster requests ResourceManager to assign containers for the Map tasks. Throughout the application's life cycle, it stays in touch with ResourceManager to ensure that ResourceManager honors the list of resources the application needs, and sends kill requests for tasks when necessary.
  6. The Scheduler component of ResourceManager decides on which node each Map task runs. Key factors in this decision include data locality and the available memory on nodes that can host new containers. ResourceManager queues ApplicationMaster's resource requests and grants leases on containers on specific nodes as resources become available.
  7. ApplicationMaster instructs the NodeManagers on the nodes where containers have been allocated to create those containers.
  8. The NodeManagers create the requested containers and start them. Each container reports the running status of its MapReduce task to ApplicationMaster (there is only one ApplicationMaster per job).
  9. ApplicationMaster requests resources for Reduce tasks from ResourceManager. If the MapReduce application contains no Reduce tasks, these resources are not needed.
  10. ApplicationMaster asks the NodeManagers to start Reduce tasks on the nodes where ResourceManager allocated resources for them.
  11. The Reduce tasks shuffle and sort the intermediate data produced by the mappers and write their output to the output directory (in HDFS).
  12. The NodeManagers send status and health reports to ResourceManager. Once all tasks are complete, ApplicationMaster sends the results to the client application and the job information and logs to JobHistoryServer. The task containers clean up their state and remove intermediate output from the local file system.
  13. Once the application has finished running, ApplicationMaster notifies ResourceManager that the job completed successfully, deregisters itself from ResourceManager, and shuts down.
  14. ResourceManager releases all resources (containers) owned by the application for cluster reuse.
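The split calculation that ApplicationMaster performs can be sketched as follows. This is a simplification for illustration, not Hadoop's actual implementation; the function name and the 128 MB default split size are assumptions of the sketch (in practice the split size usually follows the HDFS block size).

```python
# Hypothetical sketch: one map task per input split, each file split independently.
import math

def count_map_tasks(input_sizes_bytes, split_size_bytes=128 * 1024 * 1024):
    """Return the number of map tasks for the given input file sizes."""
    splits = 0
    for size in input_sizes_bytes:
        # Empty files are skipped here to keep the sketch minimal.
        if size > 0:
            splits += math.ceil(size / split_size_bytes)
    return splits

# Two files: a 300 MB file yields 3 splits, a 100 MB file yields 1 split.
print(count_map_tasks([300 * 1024**2, 100 * 1024**2]))  # 4
```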

Supplement

Job initialization process

  1. After ResourceManager receives the submitApplication() call, its Scheduler allocates a container, and ResourceManager then asks the corresponding NodeManager to launch the ApplicationMaster process in it.
  2. ApplicationMaster decides how to run the tasks. If the job is small, ApplicationMaster chooses to run the tasks in its own JVM (as an uber task).

What is the basis for this judgment?

  1. The job has fewer than 10 mappers (mapreduce.job.ubertask.maxmaps)
  2. The job has only one reducer (mapreduce.job.ubertask.maxreduces)
  3. The input is smaller than one HDFS block (mapreduce.job.ubertask.maxbytes)
  3. Before running the tasks, ApplicationMaster calls the setupJob() method to create the job's output directory

This explains why the output directory is created even when a MapReduce job fails right at the start
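The uber-task decision above can be sketched as a single check. The property names are real MapReduce settings, but this function and the default values shown (in practice the byte limit defaults to the HDFS block size) are a simplified illustration, not Hadoop's implementation.

```python
# Illustrative sketch of the uber-task criteria; not Hadoop's actual code.
def is_uber_job(num_maps, num_reduces, input_bytes, conf):
    max_maps = conf.get("mapreduce.job.ubertask.maxmaps", 9)
    max_reduces = conf.get("mapreduce.job.ubertask.maxreduces", 1)
    max_bytes = conf.get("mapreduce.job.ubertask.maxbytes", 128 * 1024 * 1024)
    # All three criteria must hold for the job to run as an uber task.
    return (num_maps <= max_maps
            and num_reduces <= max_reduces
            and input_bytes <= max_bytes)

print(is_uber_job(5, 1, 1024, {}))   # True: small job, runs in the AM's JVM
print(is_uber_job(20, 1, 1024, {}))  # False: too many mappers
```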

Task assignment

  1. ApplicationMaster then requests containers from ResourceManager to execute the map and reduce tasks. Map tasks have higher priority than reduce tasks: only when all map tasks are completed does the sort (shuffle here) begin, followed by the reduce tasks.

Containers for reduce tasks are requested only after at least 5% of the map tasks have completed
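This 5% "slow start" rule can be sketched as follows. The property name is the real MapReduce setting that controls the threshold; the function itself is an illustrative simplification.

```python
# Sketch of the reduce slow-start rule: reduce containers are requested only
# after a fraction of the map tasks (0.05 by default) have completed.
def should_request_reducers(completed_maps, total_maps, conf=None):
    conf = conf or {}
    threshold = conf.get("mapreduce.job.reduce.slowstart.completedmaps", 0.05)
    return total_maps > 0 and completed_maps / total_maps >= threshold

print(should_request_reducers(4, 100))  # False: only 4% of maps are done
print(should_request_reducers(5, 100))  # True: the 5% threshold is reached
```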

  1. Tasks consume memory and CPU resources. By default, each map and reduce task is allocated 1024 MB of memory and one virtual core

These defaults can be changed with the following parameters: mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, mapreduce.map.cpu.vcores, mapreduce.reduce.cpu.vcores
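Whatever values these parameters request, the YARN scheduler normalizes them: a memory request is rounded up to a multiple of the scheduler minimum and capped at the scheduler maximum. A sketch of that behavior, with the typical default limits (yarn.scheduler.minimum-allocation-mb = 1024, yarn.scheduler.maximum-allocation-mb = 8192) assumed as parameters:

```python
# Sketch of YARN container-memory normalization; the rounding logic is a
# simplification of what the scheduler actually does.
import math

def normalize_memory_mb(requested, minimum=1024, maximum=8192):
    # Round the request up to the next multiple of the scheduler minimum...
    rounded = math.ceil(requested / minimum) * minimum
    # ...then clamp it into the [minimum, maximum] range.
    return min(max(rounded, minimum), maximum)

print(normalize_memory_mb(1500))  # 2048: rounded up to the next 1024 MB step
print(normalize_memory_mb(9000))  # 8192: capped at the scheduler maximum
```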

Task execution

  1. At this point, ResourceManager has assigned a container to the task, and ApplicationMaster tells the NodeManager to start the container. The task is run by a Java application whose main class is YarnChild; before running the task, it first localizes the JAR packages, configuration files, and cached files that the task requires.
  2. YarnChild runs in a dedicated JVM, so a problem in a single map or reduce task does not affect the whole NodeManager.
  3. Each task can perform setup and commit actions in the same JVM as the task itself, and the task's output is first written to a temporary file.
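The write-to-a-temporary-file-then-commit pattern mentioned above can be sketched in a few lines (Hadoop's OutputCommitter works along these lines, but the file names and helper below are illustrative, not Hadoop's actual layout):

```python
# Sketch: task output goes to a temporary file first, and is only promoted to
# its final name once the task has succeeded.
import os
import tempfile

def write_then_commit(final_path, data):
    # Write the output somewhere temporary in the same directory...
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(final_path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(data)
    # ...then atomically rename it into place on commit.
    os.replace(tmp_path, final_path)

write_then_commit("part-r-00000", "hello\t2\nworld\t1\n")
```

A failed task simply never reaches the rename, so a partially written temporary file never shadows committed output.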

Progress and status update

MapReduce is a long-running batch process; a job can take an hour, several hours, or even days, so monitoring job status is very important.

Each job and each of its tasks has a status, which includes the job state (running, successfully completed, failed), the values of its counters, and a status message or description (usually set in the code).
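A minimal sketch of the status information just described; the class and field names here are illustrative, not Hadoop's JobStatus API:

```python
# Hypothetical status record for a job or task.
from dataclasses import dataclass, field

@dataclass
class TaskStatus:
    state: str                          # "running", "succeeded", or "failed"
    progress: float                     # completion fraction, 0.0 to 1.0
    counters: dict = field(default_factory=dict)  # counter name -> value
    description: str = ""               # usually set from user code

s = TaskStatus(state="running", progress=0.42,
               counters={"MAP_INPUT_RECORDS": 1000})
print(s.state, s.progress)  # running 0.42
```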

How does this information communicate with the client?

When a task starts executing, it keeps track of its progress, i.e. the percentage of the task completed. For map tasks this is the fraction of the input that has been processed. For reduce tasks it is more complicated, but the system still estimates the percentage completed. While a map or reduce task is running, the child process communicates with ApplicationMaster every three seconds.
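One common way to estimate reduce progress, since the reduce side runs in copy, sort, and reduce phases, is to weight each phase as a third of the total. The equal weighting and function below are assumptions of this sketch, not the exact formula Hadoop uses:

```python
# Illustrative estimate of reduce-task progress from its three phases,
# each given a fraction in [0.0, 1.0].
def reduce_progress(copy_frac, sort_frac, reduce_frac):
    return (copy_frac + sort_frac + reduce_frac) / 3.0

# Copy phase finished, sort halfway through, reduce not started yet:
print(round(reduce_progress(1.0, 0.5, 0.0), 2))  # 0.5
```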