1. Case Description:

Running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 3.9 GB of 4.2 GB virtual memory used. Killing container.

2. Analysis and solution

The error log shows that memory is insufficient: the 2 GB of physical memory allocated to the Container has been used up, and 3.9 GB of the 4.2 GB virtual memory limit has been used, so the task is killed. Common fixes found online include increasing the Map and Reduce memory, raising the virtual memory ratio, and disabling the virtual memory check, but most of them do not address the root cause; the specific situation has to be analyzed. A concrete analysis follows.

1. View the memory range allocated to each Container in the big data cluster

According to the following parameters, each Container can request between 1 GB and 8 GB of memory. How much memory a Container is actually allocated depends on the current resource load of the platform and the task priority. One option is to raise yarn.scheduler.minimum-allocation-mb, i.e. the minimum amount of memory allocated to each Container. However, this value must be configured in the yarn-site.xml file and the cluster must be restarted for it to take effect, which is usually not feasible in real development.

--1. Check the memory range allocated by YARN for each Container
hive>set yarn.scheduler.minimum-allocation-mb;
yarn.scheduler.minimum-allocation-mb=1024
hive>set yarn.scheduler.maximum-allocation-mb;
yarn.scheduler.maximum-allocation-mb=8192

2. View the cluster configuration

As shown below, the cluster is configured so that each MapTask or ReduceTask uses 2 GB of memory by default. If some tasks now fail with insufficient memory because platform resources are tight (the same jobs ran fine before), then unilaterally raising mapreduce.map.memory.mb and mapreduce.reduce.memory.mb makes no sense: although each Container may in principle be allocated 1-8 GB, under resource pressure the memory actually allocated can be small, for example less than 2 GB, so increasing the memory available to the Map and Reduce tasks running inside those Containers achieves nothing. Raising mapreduce.map.memory.mb and mapreduce.reduce.memory.mb is only useful when platform resources are plentiful and each Container is actually allocated a lot of memory, say 6 GB; that speeds the job up, although setting the values too high wastes resources (see the hedged example after the listing below).

--1. View the default memory size for each Map and Reduce
hive>set mapreduce.map.memory.mb;
mapreduce.map.memory.mb=2048
hive>set mapreduce.reduce.memory.mb;
mapreduce.reduce.memory.mb=2048
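
When platform resources are plentiful and the job genuinely needs more memory, these values can also be raised per Hive session without touching the cluster configuration. Below is a minimal sketch, assuming the MapReduce execution engine; the 4096 MB container size and the roughly-80% JVM heap values are illustrative assumptions, not the cluster's actual settings.

--1. A hedged per-session example (values are illustrative assumptions)
hive>set mapreduce.map.memory.mb=4096;        -- container memory requested for each MapTask
hive>set mapreduce.reduce.memory.mb=4096;     -- container memory requested for each ReduceTask
hive>set mapreduce.map.java.opts=-Xmx3276m;   -- JVM heap kept below the container size
hive>set mapreduce.reduce.java.opts=-Xmx3276m;

The JVM heap (-Xmx) must stay below mapreduce.map.memory.mb / mapreduce.reduce.memory.mb, otherwise the container can still be killed for running beyond physical memory limits.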

3. Increase the virtual memory ratio

As shown below, the default virtual memory ratio for each MapTask and ReduceTask is 2.1. In this enterprise cluster each Map and Reduce uses 2 GB of memory by default, so the maximum virtual memory is 2 x 2.1 = 4.2 GB. This parameter is also set in yarn-site.xml; in practice it can be configured somewhat larger.

hive (default)> set yarn.nodemanager.vmem-pmem-ratio;
yarn.nodemanager.vmem-pmem-ratio=2.1

4. Disable the YARN virtual memory check

Set yarn.nodemanager.vmem-check-enabled to false in the yarn-site.xml file and restart the cluster. However, this approach is not recommended: it only removes the check, so memory overuse goes unnoticed instead of being fixed.

hive (default)> set yarn.nodemanager.vmem-check-enabled;
yarn.nodemanager.vmem-check-enabled=true

5. Increase the number of Map and Reduce tasks

Obviously this reduces the amount of data each Map and Reduce task has to handle, so it does help. In general, the number of MapTasks is determined by the input file size and the split size, while the number of ReduceTasks is determined by the amount of data to be aggregated at the end. Increasing the counts too far, however, produces many small files (see the hedged example after the listing below).

--1. To increase the number of Maps, reduce the split size, because one split generally starts one MapTask.
--   Each block is 128 MB, so setting mapred.max.split.size to 64 MB makes one block be processed by two MapTasks.
hive>set dfs.block.size;
dfs.block.size=134217728
--2. To increase the number of Reduces, configure hive.exec.reducers.bytes.per.reducer.
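
The following is a minimal sketch of such a session; the 64 MB split size and the 128 MB-per-reducer figure are illustrative assumptions rather than values from the original cluster.

--A hedged example (values are illustrative assumptions)
hive>set mapred.max.split.size=67108864;                   -- 64 MB splits: each 128 MB block is handled by 2 MapTasks
hive>set hive.exec.reducers.bytes.per.reducer=134217728;   -- ~128 MB per reducer: more, smaller ReduceTasks

Smaller splits and fewer bytes per reducer mean more but lighter tasks, which is exactly the trade-off described above: less memory pressure per task, at the cost of more small files.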

To sum up the analysis: in actual company development, when this situation occurs, restarting the cluster is not really an option, so the only realistic approach is to examine and optimize your own job.

  1. Generally, you can set the number of retries for a task; after being retried automatically a few times, the task may succeed (see the hedged sketch below).
  2. Check your job for data skew; this kind of failure is usually caused by data skew. Only if the company's cluster parameters really are misconfigured should you change them and increase the values as described above.
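
For point 1, a minimal sketch of the task retry settings, assuming the MapReduce execution engine; 4 is the usual default and is shown only as an illustration.

--A hedged example of task retry settings (values are illustrative)
hive>set mapreduce.map.maxattempts=4;       -- maximum attempts per MapTask before the job fails
hive>set mapreduce.reduce.maxattempts=4;    -- maximum attempts per ReduceTask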