MR Case Study: CombineFileInputFormat
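CombineFileInputFormat is an abstract class. Hadoop provides two concrete implementations: CombineTextInputFormat and CombineSequenceFileInputFormat.
This case study taught me three things. For details, see "Explained: MR Multi-Path Input" and "Explained: The CombineFileInputFormat Class".
- With a single input path:
- With multiple input paths, case ①:
- With multiple input paths, case ②:
Looking closely, you will also notice a difference between the two multi-path cases ① and ② (verified):
Complete code: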
package test0820;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.VLongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount0826 {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount0826.class);

        job.setMapperClass(IIMapper.class);
        job.setReducerClass(IIReducer.class);
        job.setNumReduceTasks(5);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(VLongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(VLongWritable.class);

        // CombineFileInputFormat settings: cap each combined split at 60 MB
        // job.setInputFormatClass(CombineTextInputFormat.class);
        CombineTextInputFormat.setMaxInputSplitSize(job, 60 * 1024 * 1024L);

        // Alternative: register both paths directly on the input format
        // CombineTextInputFormat.addInputPath(job, new Path(args[0]));
        // CombineTextInputFormat.addInputPath(job, new Path(args[1]));

        // Register each path through MultipleInputs with the combine format
        MultipleInputs.addInputPath(job, new Path(args[0]), CombineTextInputFormat.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), CombineTextInputFormat.class);

        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    // map: emit (word, 1) for every space-separated token in the line
    public static class IIMapper extends Mapper<LongWritable, Text, Text, VLongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] splited = value.toString().split(" ");
            for (String word : splited) {
                context.write(new Text(word), new VLongWritable(1L));
            }
        }
    }

    // reduce: sum the counts for each word
    public static class IIReducer extends Reducer<Text, VLongWritable, Text, VLongWritable> {
        @Override
        protected void reduce(Text key, Iterable<VLongWritable> v2s, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (VLongWritable vl : v2s) {
                sum += vl.get();
            }
            context.write(key, new VLongWritable(sum));
        }
    }
}
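To get a feel for what the 60 MB value passed to setMaxInputSplitSize does, here is a rough sketch in plain Java with no Hadoop dependency. It is only a simplified model, not Hadoop's actual algorithm: the class name SplitPackingSketch, the method pack, and the sample file sizes are all made up for illustration, and the real CombineFileInputFormat also takes node and rack locality into account when grouping blocks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop.
class SplitPackingSketch {

    // Pack file sizes, in order, into splits. A split is closed as soon as
    // its accumulated size reaches maxSplitSize (assumed simplification).
    static List<List<Long>> pack(long[] fileSizes, long maxSplitSize) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : fileSizes) {
            current.add(size);
            currentSize += size;
            if (currentSize >= maxSplitSize) {
                splits.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
        }
        // Whatever is left over forms the last, smaller split.
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024L;
        // Made-up sizes for five small input files.
        long[] sizes = {20 * mb, 25 * mb, 30 * mb, 10 * mb, 5 * mb};
        List<List<Long>> splits = pack(sizes, 60 * mb);
        System.out.println(splits.size() + " splits for " + sizes.length + " files");
    }
}
```

In this model the five small files are grouped into two splits instead of five, which is the point of a combining input format: fewer map tasks when the input consists of many small files.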
Reposted from: https://www.cnblogs.com/skyl/p/4761662.html