

Inserting data into Hive from Spark using dynamic partitions

Published: 2024/8/23

Calling SQL from Spark to insert into Hive fails. The statement being executed:

spark.sql("INSERT INTO default.test_table_partition partition(province,city) SELECT xxx, xxx, md5(province), md5(city) FROM test_table")

It fails with the error below, because the insert needs dynamic partitioning:

Exception in thread "main" org.apache.spark.SparkException: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:314)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:66)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:61)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:77)
    at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:183)
    at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:183)
    at org.apache.spark.sql.Dataset$$anonfun$54.apply(Dataset.scala:2841)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:2840)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:632)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
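As the message says, strict mode only requires at least one static partition column. So an alternative to changing the mode is to supply a static value for one of the partition columns (a sketch against the example table; the value 'guangdong' is purely illustrative):

```sql
-- Works under strict mode: province is static, only city is dynamic
INSERT INTO default.test_table_partition PARTITION (province = 'guangdong', city)
SELECT xxx, xxx, md5(city)
FROM test_table
WHERE province = 'guangdong';
```

This only helps when the static value is known up front; for a fully dynamic insert, the mode must be switched to nonstrict as shown next.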

Add the following to the Spark configuration:

.config("hive.exec.dynamic.partition", true)
.config("hive.exec.dynamic.partition.mode", "nonstrict")

val spark = SparkSession.builder()
  // .master("local[2]")
  .appName("WeiBoAccount-Verified")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("hive.exec.dynamic.partition", true)
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .enableHiveSupport()
  .getOrCreate()
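If the SparkSession is created elsewhere and cannot be rebuilt, the same properties can also be set at runtime through SQL before running the insert (a sketch; this assumes the same Hive-enabled SparkSession as above):

```scala
// Enable dynamic partitioning for this session only,
// without touching the SparkSession builder
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
```

Session-level SET statements take effect only for the current session, which makes them convenient for one-off backfill jobs.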


Related parameters:

- hive.exec.dynamic.partition: whether dynamic partitioning is enabled; true (on) or false (off), default false.
- hive.exec.dynamic.partition.mode: the mode once dynamic partitioning is enabled; strict requires at least one static partition column, nonstrict does not.
- hive.exec.max.dynamic.partitions: the maximum number of dynamic partitions allowed; can be raised manually, default 1000.
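When an insert creates many partitions, the limit above can be raised the same way (the values here are illustrative, and there is also a per-node variant, hive.exec.max.dynamic.partitions.pernode, that may need raising as well):

```scala
// Illustrative limits; tune them to the number of partitions
// your insert actually produces
spark.sql("SET hive.exec.max.dynamic.partitions=5000")
spark.sql("SET hive.exec.max.dynamic.partitions.pernode=1000")
```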


Summary

The above covers inserting data into Hive from Spark with dynamic partitions; hopefully it helps you resolve the same error.
