牛骨文教育服务平台(让学习变的简单)
博文笔记

MapReduce job Shuffle 过程的ERROR

创建时间:2016-04-13 投稿人: 浏览次数:819

1.错误描述

error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#43
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1550)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:297)
at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:287)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:414)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:343)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:166)

2.问题解决

           从error的stack信息可以看出是mapreduce 的job shuffle过程中发生的OOM,map的输出数据在reduce节点load的过多导致的OOM,所以将reduce的shuffle.input 由默认的0.7 改为0.6,该参数的变化需要根据实际情况确定。
  conf.set("mapreduce.reduce.shuffle.input.buffer.percent", "0.6");

3.相关参数

  • mapreduce.reduce.shuffle.input.buffer.percent :
the percentage of the reducer"s heap memory to be allocated for the circular buffer to store the intermedite outputs copied from multiple mappers.
  • mapreduce.reduce.shuffle.memory.limit.percent
the maximum percentage of the above memory buffer that a single shuffle (output copied from single Map task) should take. The shuffle"s size above this size will not be copied to the memory buffer, instead they will be directly written to the disk of the reducer.
  • mapreduce.reduce.shuffle.merge.percent:
the threshold percentage by where the in-memory merger thread will run to merge the available shuffle contents on the memory buffer into a single file and immediately spills the merged file into the disk.

阅读更多
声明:该文观点仅代表作者本人,牛骨文系教育信息发布平台,牛骨文仅提供信息存储空间服务。