apache-spark - Spark crashes when reading a json file when linked with aws-java-sdk (via Stack Overflow)


Let config.json be a small json file:

{
  "toto": 1
}

I wrote a simple piece of code that reads the json file with sc.textFile (textFile is convenient because the file may be on S3, local, or HDFS):

import org.apache.spark.{SparkContext, SparkConf}

object testAwsSdk {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("test-aws-sdk").setMaster("local[*]")
    val sc = new SparkContext(sparkConf)
    val json = sc.textFile("config.json")
    println(json.collect().mkString("\n"))
  }
}
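For illustration only (not part of the original snippet): because the argument is just a URI, the same textFile call can point at S3 or HDFS instead of the local filesystem. The bucket and namenode below are hypothetical placeholders, and the matching Hadoop filesystem connector plus credentials are assumed to be configured.

// continuing from the snippet above -- only the URI scheme changes
val fromLocal = sc.textFile("config.json")
val fromS3    = sc.textFile("s3n://some-bucket/config.json")        // hypothetical bucket
val fromHdfs  = sc.textFile("hdfs://namenode:8020/tmp/config.json") // hypothetical namenode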

The SBT file pulls in only the spark-core library:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.5.1" % "compile"
)

The program works as expected and writes the contents of config.json to standard output.

Now I want to link aws-java-sdk, Amazon's SDK, in order to access S3:

libraryDependencies ++= Seq(
  "com.amazonaws"    %  "aws-java-sdk" % "1.10.30" % "compile",
  "org.apache.spark" %% "spark-core"   % "1.5.1"   % "compile"
)

Running the same code, Spark throws the following exception:

Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
 at [Source: {"id":"0","name":"textFile"}; line: 1, column: 1]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
    at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
    at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
    at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
    at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
    at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
    at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
    at org.apache.spark.rdd.RDDOperationScope$$anonfun$5.apply(RDDOperationScope.scala:133)
    at org.apache.spark.rdd.RDDOperationScope$$anonfun$5.apply(RDDOperationScope.scala:133)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:133)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
    at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:1012)
    at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:827)
    at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:825)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
    at org.apache.spark.SparkContext.textFile(SparkContext.scala:825)
    at testAwsSdk$.main(testAwsSdk.scala:11)
    at testAwsSdk.main(testAwsSdk.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

Reading the stack trace, it seems that when aws-java-sdk is linked, sc.textFile detects that the file is a json file and tries to parse it with jackson, assuming some format that it of course cannot find. I need to link aws-java-sdk, so my questions are:

1 - Why does adding aws-java-sdk change the behaviour of spark-core?

2 - Is there a workaround (the file can be on HDFS, S3, or local)? (One commonly suggested fix is sketched below.)
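Editor's note (not part of the original question): a commonly suggested workaround for this kind of clash is to force sbt to resolve Jackson to the version Spark 1.5.1 was built against, since aws-java-sdk drags in a newer jackson-databind that can no longer deserialize Spark's RDDOperationScope. A minimal sketch for an sbt 0.13 build file follows; the 2.4.4 version is an assumption and should be checked against the Jackson version your Spark distribution actually ships.

// build.sbt -- sketch: keep both dependencies, but pin Jackson back to Spark's version
libraryDependencies ++= Seq(
  "com.amazonaws"    %  "aws-java-sdk" % "1.10.30" % "compile",
  "org.apache.spark" %% "spark-core"   % "1.5.1"   % "compile"
)

// 2.4.4 is assumed to match Spark 1.5.1; verify before relying on it
dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4",
  "com.fasterxml.jackson.core" % "jackson-core"     % "2.4.4"
)

With the override in place, both libraries stay on the classpath while Spark's JSON handling of its internal scopes keeps the Jackson version it expects.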
