Hadoop Program Development: Java

Published: 2025/5/22

1. Create a Maven project

If you do not know how to configure Maven, click here: link

2. Add the dependency configuration to pom.xml

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-common</artifactId>
        <version>2.8.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.8.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.8.4</version>
    </dependency>
</dependencies>

3. Create the source file

src–>main–>java–>com–>test–>WordCount.java

WordCount.java

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
// The package must match the source directory (com/test) and the class name
// passed to "yarn jar" in step 5.
package com.test;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/**
 * This is an example Hadoop Map/Reduce application.
 * It reads the text input files, breaks each line into words
 * and counts them. The output is a locally sorted list of words and the
 * count of how often they occurred.
 *
 * To run: bin/hadoop jar build/hadoop-examples.jar wordcount
 *            [-m maps] [-r reduces] in-dir out-dir
 */
public class WordCount extends Configured implements Tool {

  /**
   * Counts the words in each line.
   * For each line of input, break the line into words and emit them as
   * (word, 1).
   */
  public static class MapClass extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
      String line = value.toString();
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        output.collect(word, one);
      }
    }
  }

  /**
   * A reducer class that just emits the sum of the input values.
   */
  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  static int printUsage() {
    System.out.println("wordcount [-m <maps>] [-r <reduces>] <input> <output>");
    ToolRunner.printGenericCommandUsage(System.out);
    return -1;
  }

  /**
   * The main driver for the word count map/reduce program.
   * Invoke this method to submit the map/reduce job.
   * @throws IOException When there are communication problems with the
   *                     job tracker.
   */
  public int run(String[] args) throws Exception {
    JobConf conf = new JobConf(getConf(), WordCount.class);
    conf.setJobName("wordcount");

    // the keys are words (strings)
    conf.setOutputKeyClass(Text.class);
    // the values are counts (ints)
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(MapClass.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    List<String> other_args = new ArrayList<String>();
    for (int i = 0; i < args.length; ++i) {
      try {
        if ("-m".equals(args[i])) {
          conf.setNumMapTasks(Integer.parseInt(args[++i]));
        } else if ("-r".equals(args[i])) {
          conf.setNumReduceTasks(Integer.parseInt(args[++i]));
        } else {
          other_args.add(args[i]);
        }
      } catch (NumberFormatException except) {
        System.out.println("ERROR: Integer expected instead of " + args[i]);
        return printUsage();
      } catch (ArrayIndexOutOfBoundsException except) {
        System.out.println("ERROR: Required parameter missing from " +
                           args[i - 1]);
        return printUsage();
      }
    }
    // Make sure there are exactly 2 parameters left.
    if (other_args.size() != 2) {
      System.out.println("ERROR: Wrong number of parameters: " +
                         other_args.size() + " instead of 2.");
      return printUsage();
    }
    FileInputFormat.setInputPaths(conf, other_args.get(0));
    FileOutputFormat.setOutputPath(conf, new Path(other_args.get(1)));

    JobClient.runJob(conf);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new WordCount(), args);
    System.exit(res);
  }
}
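To see what the job computes before deploying it, the map and reduce steps above can be imitated in plain Java without a cluster. This is only a minimal sketch: `LocalWordCount`, `mapLine`, and `count` are hypothetical names made up for illustration and are not part of Hadoop. It assumes the same whitespace tokenization as `MapClass` and the same per-word summation as `Reduce`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

/** Hypothetical helper, not part of Hadoop: imitates the WordCount job in memory. */
public class LocalWordCount {

    /** Map step: split one line on whitespace, as MapClass.map does. */
    public static List<String> mapLine(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            words.add(itr.nextToken());
        }
        return words;
    }

    /** Reduce step: add 1 per occurrence of each word; TreeMap keeps keys sorted. */
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : mapLine(line)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the lines of /input/word.txt.
        List<String> lines = Arrays.asList("hello hadoop", "hello world");
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For the two sample lines, `main` prints `hadoop 1`, `hello 2`, and `world 1`, one tab-separated pair per line. The real job distributes the same logic across map and reduce tasks.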

4. Package WordCount.java as a jar file

(1) Basic configuration

After making your selections, click Apply–>OK

(2) Build the jar

Build–>Build Artifacts–> XXX.jar–> Build

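(3) Locate the generated jar file

It is in the folder out–>artifacts–>WordCount_jar

5. Run

Here I uploaded WordCount.jar to the /usr/local/hadoop-jar directory.
Run the following command.
Important: the class name must be prefixed with its package name, which here is com.test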

yarn jar /usr/local/hadoop-jar/WordCount.jar com.test.WordCount /input/word.txt /output/01
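The class name passed to the command is the fully qualified name, which follows from where the source file sits under src/main/java. The sketch below illustrates that mapping. `MainClassName` and `toClassName` are hypothetical names used only for this example, and it assumes the file's package declaration matches its directory.

```java
/** Hypothetical helper: derives a fully qualified class name from a source path. */
public class MainClassName {

    /** Converts a path relative to src/main/java, e.g. com/test/WordCount.java. */
    public static String toClassName(String relativePath) {
        // Drop the ".java" extension if present.
        String withoutExt = relativePath.endsWith(".java")
                ? relativePath.substring(0, relativePath.length() - ".java".length())
                : relativePath;
        // Directory separators become package separators.
        return withoutExt.replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(toClassName("com/test/WordCount.java"));
    }
}
```

Running `main` prints `com.test.WordCount`, the same name used in the command above.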

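6. Conclusion

That concludes the example. To develop other programs, write another Java file, package it, upload it, and run it in the same way.
If you repost this article, please credit the source and support original work.
QQ study group: 779133600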
