

MapReduce Programming Practice: Aggregating Attributes of Objects


Contents

    • 1. Generating the Data
    • 2. The Entity Class
    • 3. The Mapper Class
    • 4. The Reducer Class
    • 5. The Driver Class
    • 6. Running the Job

Reference book: 《Hadoop大數據原理與應用》 (Hadoop Big Data: Principles and Applications)

Related article: MapReduce Programming Practice

1. Generating the Data

Supermarket consumer data, one record per line: id, timestamp, amount spent, member/non-member.

Generate synthetic data with Python:

import random
import time

consumer_type = ['會員', '非會員']   # member / non-member
vip, novip = 0, 0                    # per-group record counts
vipValue, novipValue = 0, 0          # per-group spending totals

with open("consumer.txt", 'w', encoding='utf-8') as f:
    for i in range(1000):  # 1000 records
        random.seed(time.time() + i)
        id = random.randint(0, 10000)
        t = time.strftime('%Y-%m-%d %H:%M', time.localtime(time.time()))
        value = random.randint(1, 500)
        type = consumer_type[random.randint(0, 1)]
        # Tab-separated line: id, time, amount, member type
        f.write(str(id) + '\t' + t + '\t' + str(value) + '\t' + type + '\n')
        if type == consumer_type[0]:
            vip += 1
            vipValue += value
        else:
            novip += 1
            novipValue += value

print(consumer_type[0] + ": 人數 " + str(vip) + ", 總金額: " + str(vipValue) + ", 平均金額:" + str(vipValue / vip))
print(consumer_type[1] + ": 人數 " + str(novip) + ", 總金額: " + str(novipValue) + ", 平均金額:" + str(novipValue / novip))

[dnn@master HDFS_example]$ python test.py
會員: 人數 510, 總金額: 128744, 平均金額:252.439215686
非會員: 人數 490, 總金額: 123249, 平均金額:251.528571429

(會員 = member, 非會員 = non-member; 人數 = count, 總金額 = total amount, 平均金額 = average amount.)

2. The Entity Class

  • Consumer.java
package com.michael.mapreduce;

import org.apache.hadoop.io.Writable;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

public class Consumer implements Writable {

    private String id;
    private int money;
    private int vip; // 0 = non-member, 1 = member

    public Consumer() {
    }

    public Consumer(String id, int money, int vip) {
        this.id = id;
        this.money = money;
        this.vip = vip;
    }

    public int getMoney() {
        return money;
    }

    public void setMoney(int money) {
        this.money = money;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public int getVip() {
        return vip;
    }

    public void setVip(int vip) {
        this.vip = vip;
    }

    // Serialization: the field order here must match readFields()
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        dataOutput.writeUTF(id);
        dataOutput.writeInt(money);
        dataOutput.writeInt(vip);
    }

    @Override
    public void readFields(DataInput dataInput) throws IOException {
        this.id = dataInput.readUTF();
        this.money = dataInput.readInt();
        this.vip = dataInput.readInt();
    }

    @Override
    public String toString() {
        return this.id + "\t" + this.money + "\t" + this.vip;
    }
}
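Note that the job below never actually uses this Writable: ConsumerMapper emits plain Text/IntWritable pairs. If you wanted the whole record to travel through the shuffle, a map method along these lines could emit Consumer as the value. This is a hypothetical sketch, not part of the original program; it would also require job.setMapOutputValueClass(Consumer.class) in the driver and a reducer that accepts Consumer values:

    // Hypothetical variant inside a Mapper<LongWritable, Text, Text, Consumer>:
    // emit the whole Consumer record as the map output value.
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        int vipFlag = "會員".equals(fields[3]) ? 1 : 0;  // 1 = member, 0 = non-member
        Consumer consumer = new Consumer(fields[0], Integer.parseInt(fields[2]), vipFlag);
        context.write(new Text(fields[3]), consumer);
    }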

3. The Mapper Class

  • ConsumerMapper.java
package com.michael.mapreduce;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class ConsumerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Input line: id \t time \t amount \t member-type
        String line = value.toString();
        String[] fields = line.split("\t");
        String id = fields[0];
        int money = Integer.parseInt(fields[2]);
        String vip = fields[3];
        // Key by member type so the reducer aggregates one group per type
        context.write(new Text(vip), new IntWritable(money));
    }
}
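This map method assumes every input line is well formed; a blank line or a non-numeric amount would throw an exception and fail the task. A slightly hardened variant (my sketch, not from the book) skips bad records instead:

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        if (fields.length < 4) {
            return;  // skip blank or malformed lines
        }
        try {
            int money = Integer.parseInt(fields[2]);
            context.write(new Text(fields[3]), new IntWritable(money));
        } catch (NumberFormatException e) {
            // skip records whose amount field is not an integer
        }
    }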

4. The Reducer Class

  • ConsumerReducer.java
package com.michael.mapreduce;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class ConsumerReducer extends Reducer<Text, IntWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        long sum = 0;
        for (IntWritable v : values) {
            count++;
            sum += v.get();
        }
        // Integer division truncates the average
        long avg = sum / count;
        context.write(key, new LongWritable(avg));
    }
}
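Because sum and count are integral types, sum / count is integer division, so 252.44 becomes 252. If the fractional average matters, a reducer along these lines (an assumed variant, not the article's code) can emit a DoubleWritable instead; the driver would then need job.setOutputValueClass(DoubleWritable.class):

package com.michael.mapreduce;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class ConsumerAvgReducer extends Reducer<Text, IntWritable, Text, DoubleWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        long sum = 0;
        long count = 0;
        for (IntWritable v : values) {
            sum += v.get();
            count++;
        }
        // Floating-point division keeps the fractional part of the average
        context.write(key, new DoubleWritable((double) sum / count));
    }
}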

5. The Driver Class

  • ConsumerDriver.java
package com.michael.mapreduce;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ConsumerDriver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(ConsumerDriver.class);
        job.setMapperClass(ConsumerMapper.class);
        job.setReducerClass(ConsumerReducer.class);

        // Map output types differ from the final output types, so set both
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean result = job.waitForCompletion(true);
        System.exit(result ? 0 : 1);
    }
}
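Two practical notes on the driver. First, the job fails if the output directory already exists; a common convenience (my addition, not in the original) is to delete it before submitting, using the FileSystem API. The snippet below would replace the setOutputPath call and needs an extra import of org.apache.hadoop.fs.FileSystem:

        // Remove a stale output directory so the job can be rerun
        // without "output directory already exists" errors.
        Path outPath = new Path(args[1]);
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(outPath)) {
            fs.delete(outPath, true);  // true = delete recursively
        }
        FileOutputFormat.setOutputPath(job, outPath);

Second, resist adding job.setCombinerClass(ConsumerReducer.class) here: the reducer emits averages, and an average of per-split averages is not the overall average (its LongWritable output would also clash with the IntWritable map output values).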

6. Running the Job

  • Copy the data to HDFS:
[dnn@master Desktop]$ hadoop dfs -copyFromLocal /home/dnn/eclipse-workspace/HDFS_example/consumer.txt /InputDataTest

(hadoop dfs is deprecated in current Hadoop releases; hdfs dfs -copyFromLocal is the equivalent.)
  • Export the jar and run it from the command line:
hadoop jar /home/dnn/eclipse-workspace/HDFS_example/consumer_avg.jar com.michael.mapreduce.ConsumerDriver /InputDataTest/consumer.txt /OutputTest2
  • Result:
[dnn@master Desktop]$ hadoop fs -cat /OutputTest2/part-r-00000
會員	252
非會員	251

The truncated averages match the Python script's output from section 1 (252.44 and 251.53), since the reducer performs integer division.
