
Spark WordCount complete project code (including pom.xml)

Published: 2025/1/21 · 豆豆

This article, collected and organized by 生活随笔, presents the complete Spark WordCount project code (including pom.xml), shared here for reference.

Project directory overview
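The screenshot of the project tree from the original post did not survive. Based on the paths referenced in the code and pom.xml below, the layout is roughly as follows (the `src/main/scala` source directory is an assumption, as it is the scala-maven-plugin's conventional default):

```
SparkDemo1/
├── pom.xml
└── src/
    └── main/
        ├── input/
        │   └── word.txt
        └── scala/
            └── com/zxl/spark/atguigu/
                └── L01_WordCount.scala
```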

Code

package com.zxl.spark.atguigu

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object L01_WordCount {
  def main(args: Array[String]): Unit = {
    // Create the Spark configuration object
    val sparkConf = new SparkConf().setMaster("local").setAppName("L01_WordCount")
    // Create the Spark context (the connection object)
    val sparkContext = new SparkContext(sparkConf)
    // Quiet the console so only errors (and our own output) are shown
    sparkContext.setLogLevel("ERROR")
    // Read the file data
    val fileRDD: RDD[String] = sparkContext.textFile("src/main/input/word.txt")
    // Split each line of the file into words
    val wordRDD: RDD[String] = fileRDD.flatMap(_.split(","))
    // Convert the structure: word => (word, 1)
    val word2OneRDD: RDD[(String, Int)] = wordRDD.map((_, 1))
    // Group by word and aggregate the counts
    val word2CountRDD: RDD[(String, Int)] = word2OneRDD.reduceByKey(_ + _)
    // Collect the aggregated result into driver memory
    val word2Count: Array[(String, Int)] = word2CountRDD.collect()
    // Print the result
    word2Count.foreach(println)
    // Block so the logs / Spark UI can be inspected; press Enter to exit.
    // (The original used `while (true) {}`, which made stop() unreachable.)
    scala.io.StdIn.readLine()
    sparkContext.stop()
  }
}
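The same transformation chain can be checked with plain Scala collections, without a SparkContext. This is a minimal sketch of that idea; the object name `LocalWordCount` and the sample input are mine, not part of the project. `groupBy` plus a sum plays the role of `reduceByKey`:

```scala
// Plain-Scala mirror of the RDD pipeline: flatMap -> map -> reduceByKey.
// Handy for unit-testing the word-count logic without starting Spark.
object LocalWordCount {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(","))   // split each line into words
      .map((_, 1))             // pair every word with a count of 1
      .groupBy(_._1)           // group the (word, 1) pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the 1s

  def main(args: Array[String]): Unit =
    count(Seq("hello,zxl", "hello,zhangxueliang")).foreach(println)
}
```

With that sample input, `count` returns `Map(hello -> 2, zxl -> 1, zhangxueliang -> 1)`, matching what the Spark job prints for the same data.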

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.zxl</groupId>
    <artifactId>SparkDemo1</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <spark.version>3.1.1</spark.version>
        <spark.scala.version>2.12</spark.scala.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${spark.scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${spark.scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${spark.scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <!--<dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
            <version>2.4.4</version>
        </dependency>-->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_${spark.scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.48</version>
        </dependency>
        <!-- May be needed to avoid a "compiler" error when integrating with other frameworks -->
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>3.1.0</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- This plugin compiles the Scala sources to class files -->
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <!-- Bind to Maven's compile phase. The original listed only
                             testCompile, which leaves the main sources uncompiled,
                             so the compile goal is added here as well. -->
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.1.1</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
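With this pom, building and running the job might look like the following. This is a hedged sketch, not part of the original post: it assumes Maven is installed, and the second command assumes a local Spark 3.1.1 installation with `spark-submit` on the PATH (the jar name follows from the artifactId, version, and the jar-with-dependencies descriptor above):

```shell
# Compile the Scala sources and build the fat jar (assembly plugin, package phase)
mvn clean package

# Submit the job to a local Spark; class and master match the code above
spark-submit \
  --class com.zxl.spark.atguigu.L01_WordCount \
  --master local \
  target/SparkDemo1-1.0-SNAPSHOT-jar-with-dependencies.jar
```

Since the app sets `local` as master in code, it can also be run directly from the IDE without `spark-submit`.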

input/word.txt

hello,zxl
hello,zhangxueliang

Summary

The above is the complete Spark WordCount project code (including pom.xml); hopefully it helps you solve the problem you ran into.
