• Running a MapReduce job (HA cluster) remotely from IDEA: custom Mapper and Reducer classes not found


    This article runs the MR program's main method directly from IDEA rather than packaging it into a jar and copying it to the server by hand. Doing so produces the warning No job jar file set, and no jar for the project appears under /tmp on HDFS. From this we can infer that after the job is submitted to YARN, the local client never packages the project into a jar and ships it to the ResourceManager, which is why the Mapper and Reducer classes cannot be found.

    The overall fix, then, is to package the project into a jar with Maven and add the mapreduce.job.jar configuration property pointing at the jar's local path. Note that this must be a relative path (with the project directory as the root), not an absolute one; with an absolute path the No job jar file set warning disappears, but submission still effectively fails because the classes still cannot be found.

     

    package com.atguigu.mapreduce.wordcount;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    import java.io.IOException;

    public class WordCountDriver {

        public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable outV = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable value : values) {
                    sum += value.get();
                }
                outV.set(sum);
                context.write(key, outV);
            }
        }

        public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private final Text outK = new Text();
            private final IntWritable outV = new IntWritable(1);

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // 1. Read one line
                String line = value.toString();
                // 2. Split on spaces
                String[] words = line.split(" ");
                for (String word : words) {
                    // Wrap the word as the output key
                    outK.set(word);
                    // Emit <word, 1>
                    context.write(outK, outV);
                }
            }
        }

        public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
            // 1. Get the job
            Configuration conf = new Configuration();
            conf.set("mapreduce.app-submission.cross-platform", "true");
            // Relative path (project root), NOT an absolute path
            conf.set("mapreduce.job.jar", "test.jar");
            Job job = Job.getInstance(conf);
            // 2. Set the jar by the driver class
            job.setJarByClass(WordCountDriver.class);
            // 3. Attach the mapper and reducer
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);
            // 4. Set the map output key/value types
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);
            // 5. Set the final output key/value types
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // 6. Set the input and output paths
            // FileInputFormat.setInputPaths(job, new Path("D:\\BaiduNetdiskDownload\\11_input\\inputword"));
            // FileOutputFormat.setOutputPath(job, new Path("D:\\hadoop\\output\\output1"));
            FileInputFormat.setInputPaths(job, new Path("/input/word.txt"));
            FileOutputFormat.setOutputPath(job, new Path("/output"));
            // 7. Submit the job and wait for completion
            boolean result = job.waitForCompletion(true);
            System.exit(result ? 0 : 1);
        }
    }
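Because the class-not-found failure only surfaces after submission, it is worth confirming that the word-count logic itself is sound before involving the cluster. The sketch below replicates the Mapper's `split(" ")` and the Reducer's summation in plain Java with no Hadoop dependency; the class and method names here are illustrative, not part of the driver above.

```java
import java.util.HashMap;
import java.util.Map;

public class WordCountLocalCheck {
    // Mirrors the MR pipeline for a single line: split on spaces (Mapper),
    // then accumulate a count per word (Reducer).
    static Map<String, Integer> count(String line) {
        Map<String, Integer> sums = new HashMap<>();
        for (String word : line.split(" ")) {
            sums.merge(word, 1, Integer::sum); // same accumulation as the Reducer's sum
        }
        return sums;
    }

    public static void main(String[] args) {
        System.out.println(count("hello world hello"));
    }
}
```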

     Click the package button (in IDEA's Maven panel) to build the jar.
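The same packaging step can be sketched from the command line; the artifactId and version below are assumptions about the project, and the copy step is what makes the relative path test.jar in the driver resolve (the jar must end up in the project root, i.e. the IDE's working directory).

```shell
# Build the project jar; Maven writes it to target/
mvn clean package -DskipTests

# Copy it to the project root under the name the driver expects
# (conf.set("mapreduce.job.jar", "test.jar") is resolved relative to the project root)
cp target/wordcount-1.0-SNAPSHOT.jar test.jar
```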


  • Original article: https://blog.csdn.net/Adam_captain/article/details/128116145