Hadoop 2.x Programming Starter Example: MaxTemperature
@(HADOOP)[hadoop]
- Hadoop 2.x Programming Starter Example: MaxTemperature
- 1. Preparation
- 2. Writing the Code
- 2.1 Create the Mapper
- 2.2 Create the Reducer
- 2.3 Create the main method
- 2.4 Export as MaxTemp.jar and upload it to the server that will run the job
- 3. Running the Program
- 3.1 Create the input directory and copy sample.txt into it
- 3.2 Run the program
- 3.3 View the results
Note: everything below applies equally to Hadoop 2.x and 1.x; it has been tested on 2.4.1 and 1.2.0.

1. Preparation

1. Set up a pseudo-distributed Hadoop environment; see the official documentation, or http://blog.csdn.net/jediael_lu/article/details/38637277
2. Prepare the data file sample.txt shown below:
```
123456798676231190101234567986762311901012345679867623119010123456798676231190101234561+00121534567890356
123456798676231190101234567986762311901012345679867623119010123456798676231190101234562+01122934567890456
123456798676231190201234567986762311901012345679867623119010123456798676231190101234562+02120234567893456
123456798676231190401234567986762311901012345679867623119010123456798676231190101234561+00321234567803456
123456798676231190101234567986762311902012345679867623119010123456798676231190101234561+00429234567903456
123456798676231190501234567986762311902012345679867623119010123456798676231190101234561+01021134568903456
123456798676231190201234567986762311902012345679867623119010123456798676231190101234561+01124234578903456
123456798676231190301234567986762311905012345679867623119010123456798676231190101234561+04121234678903456
123456798676231190301234567986762311905012345679867623119010123456798676231190101234561+00821235678903456
```
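To make the fixed-width record layout concrete, here is a small standalone sketch (my addition; the class name RecordFormatDemo is made up) that pulls out the same character offsets the Mapper in section 2.1 relies on, using the first record above:

```java
// Illustration only (not part of the original post): how one fixed-width record
// decomposes into the fields that MaxTemperatureMapper extracts.
public class RecordFormatDemo {
    public static void main(String[] args) {
        String line = "123456798676231190101234567986762311901012345679867623119010"
                + "123456798676231190101234561+00121534567890356"; // first record of sample.txt
        String year = line.substring(15, 19);                           // "1901"
        char sign = line.charAt(87);                                    // '+'
        int airTemperature = Integer.parseInt(line.substring(88, 92));  // 12
        String quality = line.substring(92, 93);                        // "1", passes the [01459] check
        System.out.println(year + "\t" + airTemperature + "\t" + quality + "\t" + sign);
    }
}
```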
2. Writing the Code

2.1 Create the Mapper
```java
package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends
        Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();
        String year = line.substring(15, 19);

        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }

        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
```
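If you want to exercise the Mapper without starting a cluster, a unit test in the style popularised by Tom White's book could look like the sketch below. This is my addition, not part of the original post; it assumes JUnit and the Apache MRUnit library are on the classpath, and the expected output (1901, 12) comes from the first record of sample.txt above.

```java
package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class MaxTemperatureMapperTest {

    @Test
    public void parsesValidRecord() throws IOException {
        // First record of sample.txt: year 1901, temperature +0012, quality flag 1.
        Text value = new Text("123456798676231190101234567986762311901012345679867623119010"
                + "123456798676231190101234561+00121534567890356");
        new MapDriver<LongWritable, Text, Text, IntWritable>()
                .withMapper(new MaxTemperatureMapper())
                .withInput(new LongWritable(0), value)
                .withOutput(new Text("1901"), new IntWritable(12))
                .runTest();
    }
}
```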
2.2 Create the Reducer

```java
package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {

        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
```
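A matching test sketch for the Reducer (again my addition, assuming JUnit and MRUnit) simply checks that the maximum of the supplied values is emitted:

```java
package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Test;

public class MaxTemperatureReducerTest {

    @Test
    public void returnsMaximumIntegerInValues() throws IOException {
        new ReduceDriver<Text, IntWritable, Text, IntWritable>()
                .withReducer(new MaxTemperatureReducer())
                .withInput(new Text("1901"),
                        Arrays.asList(new IntWritable(12), new IntWritable(42)))
                .withOutput(new Text("1901"), new IntWritable(42))
                .runTest();
    }
}
```

Because taking a maximum is associative and commutative, the same reducer class could also be registered as a combiner with `job.setCombinerClass(MaxTemperatureReducer.class)` in the driver below; that is an optional optimisation the original code does not use.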
2.3 Create the main method

```java
package org.jediael.hadoopDemo.maxtemperature;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        // new Job() is deprecated in Hadoop 2.x but still works; Job.getInstance() is preferred.
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

2.4 Export the compiled classes as MaxTemp.jar and upload it to the server that will run the job.
3. Running the Program

3.1 Create the input directory and copy sample.txt into it
```
hadoop fs -put sample.txt /
```
3.2 Run the program

```
export HADOOP_CLASSPATH=MaxTemp.jar
hadoop org.jediael.hadoopDemo.maxtemperature.MaxTemperature /sample.txt output10
```

Note: the output directory (output10 here) must not already exist, otherwise the job will fail to create it.
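Since the job refuses to run if the output directory already exists, one option for repeated test runs, shown here as a hypothetical helper (the class and method names are mine, not from the original post), is to delete the old output through the HDFS FileSystem API before resubmitting:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical convenience helper: remove the previous output directory so that
// a rerun does not fail with "output directory already exists".
public class OutputCleaner {
    public static void deleteIfExists(Configuration conf, String dir) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path(dir);
        if (fs.exists(out)) {
            fs.delete(out, true); // true = delete recursively
        }
    }
}
```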
3.3 View the results

(1) Results
```
[jediael@jediael44 code]$ hadoop fs -cat output10/*
14/07/09 14:51:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1901 42
1902 212
1903 412
1904 32
1905 102
```

(2) Runtime output
```
[jediael@jediael44 code]$ hadoop org.jediael.hadoopDemo.maxtemperature.MaxTemperature /sample.txt output10
14/07/09 14:50:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/09 14:50:41 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/07/09 14:50:42 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
14/07/09 14:50:43 INFO input.FileInputFormat: Total input paths to process : 1
14/07/09 14:50:43 INFO mapreduce.JobSubmitter: number of splits:1
14/07/09 14:50:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1404888618764_0001
14/07/09 14:50:44 INFO impl.YarnClientImpl: Submitted application application_1404888618764_0001
14/07/09 14:50:44 INFO mapreduce.Job: The url to track the job: http://jediael44:8088/proxy/application_1404888618764_0001/
14/07/09 14:50:44 INFO mapreduce.Job: Running job: job_1404888618764_0001
14/07/09 14:50:57 INFO mapreduce.Job: Job job_1404888618764_0001 running in uber mode : false
14/07/09 14:50:57 INFO mapreduce.Job: map 0% reduce 0%
14/07/09 14:51:05 INFO mapreduce.Job: map 100% reduce 0%
14/07/09 14:51:15 INFO mapreduce.Job: map 100% reduce 100%
14/07/09 14:51:15 INFO mapreduce.Job: Job job_1404888618764_0001 completed successfully
14/07/09 14:51:16 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=94
        FILE: Number of bytes written=185387
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1051
        HDFS: Number of bytes written=43
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=5812
        Total time spent by all reduces in occupied slots (ms)=7023
        Total time spent by all map tasks (ms)=5812
        Total time spent by all reduce tasks (ms)=7023
        Total vcore-seconds taken by all map tasks=5812
        Total vcore-seconds taken by all reduce tasks=7023
        Total megabyte-seconds taken by all map tasks=5951488
        Total megabyte-seconds taken by all reduce tasks=7191552
    Map-Reduce Framework
        Map input records=9
        Map output records=8
        Map output bytes=72
        Map output materialized bytes=94
        Input split bytes=97
        Combine input records=0
        Combine output records=0
        Reduce input groups=5
        Reduce shuffle bytes=94
        Reduce input records=8
        Reduce output records=5
        Spilled Records=16
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=154
        CPU time spent (ms)=1450
        Physical memory (bytes) snapshot=303112192
        Virtual memory (bytes) snapshot=1685733376
        Total committed heap usage (bytes)=136515584
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=954
    File Output Format Counters
        Bytes Written=43
```
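The WARN line "Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this." refers to the driver in section 2.3. A sketch of a Tool-based driver that would silence the warning might look like the following; the class name MaxTemperatureDriver and the use of Job.getInstance are my choices, not code from the original post.

```java
package org.jediael.hadoopDemo.maxtemperature;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical alternative driver implementing Tool, as suggested by the warning above.
public class MaxTemperatureDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperatureDriver <input path> <output path>");
            return -1;
        }

        // Job.getInstance(conf, name) is the non-deprecated factory in Hadoop 2.x.
        Job job = Job.getInstance(getConf(), "Max temperature");
        job.setJarByClass(getClass());

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses the generic Hadoop options before calling run().
        System.exit(ToolRunner.run(new MaxTemperatureDriver(), args));
    }
}
```

Run through ToolRunner, the driver also accepts generic options such as `-D mapreduce.job.reduces=2` on the command line.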