
MR Example: CombineFileInputFormat

Published: 2025/4/14

CombineFileInputFormat is an abstract class. Hadoop provides two concrete implementations: CombineTextInputFormat and CombineSequenceFileInputFormat.

Working through this example taught me three things (for details, see "Explained: MR multi-path input" and "Explained: the CombineFileInputFormat class"):

  • Single input path:
    // Set the input format to a CombineFileInputFormat implementation
    job.setInputFormatClass(CombineTextInputFormat.class);
    // Set the maximum split size (60 MB)
    CombineTextInputFormat.setMaxInputSplitSize(job, 60*1024*1024L);
    // Set the input path
    CombineTextInputFormat.addInputPath(job, new Path(args[0]));
  • Multiple input paths, approach ①:
    // Set the input format to a CombineFileInputFormat implementation
    job.setInputFormatClass(CombineTextInputFormat.class);
    // Set the maximum split size (60 MB)
    CombineTextInputFormat.setMaxInputSplitSize(job, 60*1024*1024L);
    // Set the input paths (two of them)
    CombineTextInputFormat.addInputPath(job, new Path(args[0]));
    CombineTextInputFormat.addInputPath(job, new Path(args[1]));
  • Multiple input paths, approach ②:
    // Set the maximum split size (60 MB)
    CombineTextInputFormat.setMaxInputSplitSize(job, 60*1024*1024L);
    // Set each input path together with its input format
    MultipleInputs.addInputPath(job, new Path(args[0]), CombineTextInputFormat.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), CombineTextInputFormat.class);

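On closer inspection, the two multi-path approaches ① and ② also differ in how the number of map tasks is determined (verified):

  • Approach ①: all inputs are pooled first to get the total input size, which is then divided by the split size to give the total number of map tasks.
  • Approach ②: the number of map tasks is computed separately for each MultipleInputs path, and the per-path counts are then added together.

  • Complete code: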

    package test0820;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.VLongWritable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount0826 {

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf);
            job.setJarByClass(WordCount0826.class);

            job.setMapperClass(IIMapper.class);
            job.setReducerClass(IIReducer.class);
            job.setNumReduceTasks(5);

            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(VLongWritable.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(VLongWritable.class);

            // CombineFileInputFormat settings
            //job.setInputFormatClass(CombineTextInputFormat.class);
            CombineTextInputFormat.setMaxInputSplitSize(job, 60*1024*1024L);

            // Approach ① (disabled): add both paths to one input format
            //CombineTextInputFormat.addInputPath(job, new Path(args[0]));
            //CombineTextInputFormat.addInputPath(job, new Path(args[1]));

            // Approach ② (active): register each path through MultipleInputs
            MultipleInputs.addInputPath(job, new Path(args[0]), CombineTextInputFormat.class);
            MultipleInputs.addInputPath(job, new Path(args[1]), CombineTextInputFormat.class);

            FileOutputFormat.setOutputPath(job, new Path(args[2]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }

        // map: emit (word, 1) for every space-separated token in the line
        public static class IIMapper extends Mapper<LongWritable, Text, Text, VLongWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] splited = value.toString().split(" ");
                for (String word : splited) {
                    context.write(new Text(word), new VLongWritable(1L));
                }
            }
        }

        // reduce: sum the counts for each word
        public static class IIReducer extends Reducer<Text, VLongWritable, Text, VLongWritable> {
            @Override
            protected void reduce(Text key, Iterable<VLongWritable> v2s, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (VLongWritable vl : v2s) {
                    sum += vl.get();
                }
                context.write(key, new VLongWritable(sum));
            }
        }
    }
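
The difference between multi-path approaches ① and ② described earlier comes down to where the rounding happens. The sketch below is plain arithmetic only and does not call Hadoop. The class name `SplitCountSketch`, its helper methods, and the 70 MB and 50 MB path sizes are hypothetical and chosen for illustration. It models the post's description as "round up total size / split size"; real split counts may differ, since CombineFileInputFormat also takes block placement into account when grouping data.

```java
// A simplified model of the two map-count calculations described in the post.
// All names and sizes here are illustrative assumptions, not Hadoop APIs.
class SplitCountSketch {

    // Number of splits needed for the given byte count, rounded up.
    static long splits(long bytes, long maxSplitSize) {
        return (bytes + maxSplitSize - 1) / maxSplitSize;
    }

    // Approach 1: pool the sizes of all input paths, then divide once.
    static long pooled(long maxSplitSize, long... pathSizes) {
        long total = 0;
        for (long size : pathSizes) {
            total += size;
        }
        return splits(total, maxSplitSize);
    }

    // Approach 2: compute the count for each path separately, then sum.
    static long perPath(long maxSplitSize, long... pathSizes) {
        long count = 0;
        for (long size : pathSizes) {
            count += splits(size, maxSplitSize);
        }
        return count;
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024L;
        long maxSplitSize = 60 * mb;          // same limit as in the post
        long[] pathSizes = {70 * mb, 50 * mb}; // hypothetical input sizes

        System.out.println("pooled:   " + pooled(maxSplitSize, pathSizes));  // 120 MB / 60 MB -> 2
        System.out.println("per path: " + perPath(maxSplitSize, pathSizes)); // 2 + 1 -> 3
    }
}
```

Under this model the per-path calculation can never yield fewer map tasks than the pooled one, because each path's leftover bytes get their own split instead of being merged with another path's data.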

Reposted from: https://www.cnblogs.com/skyl/p/4761662.html
