Spark "memory too small" error
Submit the job:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 600M --executor-memory 500M --num-executors 1 /usr/local/spark/examples/jars/spark-examples_2.11-2.3.0.jar 3

The console reports an error:
2020-08-11 10:39:50 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2020-08-11 10:39:53 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:66 - Requesting driver to remove executor 1 for reason Container marked as failed: container_1597065725323_0004_02_000002 on host: bj3-dev-search-02.tencn. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1597065725323_0004_02_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1

Check the NodeManager's YARN log:
java.lang.IllegalArgumentException: System memory 466092032 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
	at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:217)
	at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:199)
	at org.apache.spark.SparkEnv$.create(SparkEnv.scala:330)
	at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175)
	at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:256)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:423)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at com.kk.search.spark.SparkPi.main(SparkPi.java:27)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)

This is related to how the container's memory is allocated.
https://www.jianshu.com/p/bb0bdcb26ccc
這個(gè)錯(cuò)誤目前還是很難完全排查,使用的是yarn client模式
報(bào)異常的代碼是在這里
這里的systemMemory 實(shí)際上就是Runtime.getRuntime.maxMemory,理論上這個(gè)值應(yīng)該等于(jvm設(shè)置的值減去一個(gè)survivor的值)。
在本地的idea中使用local模式運(yùn)行的時(shí)候顯示的值和預(yù)期的是一致的。但是在yarn上使用client模式則是不一致的,不太容易判斷。
也就是當(dāng)設(shè)置 --driver-memory 500M的時(shí)候?qū)?yīng)的到了yarn中并不是這些,這是需要注意的。具體的實(shí)現(xiàn)暫時(shí)沒(méi)有看到相關(guān)的分析,估計(jì)還有一些難度,畢竟不是java的,只能看一個(gè)大概。
With deploy-mode client, the job is submitted as follows:

[root@bj3-dev--03 search_jar]# spark-submit --class com.kk.search.spark.SparkPi --master yarn --deploy-mode client --driver-memory 600M --executor-memory 1500M --conf spark.yarn.am.memory=1500M --num-executors 1 spark-1.0-SNAPSHOT.jar 30

# Console output; the submit script and the application were modified to add some shell logging and code logging
----------------------will do submit
/usr/local/jdk1.8.0_91/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/*:/usr/local/hadoop/etc/hadoop/ -Xmx600M org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode client --conf spark.yarn.am.memory=1500M --conf spark.driver.memory=600M --class com.kk.search.spark.SparkPi --executor-memory 1500M --num-executors 1 spark-1.0-SNAPSHOT.jar 30
----------------------
2020-08-13 18:23:24 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
# The output below is the driver's main method inspecting the current JVM. Note that Runtime.getRuntime().maxMemory() equals neither -Xmx600M nor (Eden + Survivor*2 + Old)
---------------------------
max: 559415296 (533.50 M)
Non-heap: -1 (-0.00 M)
Heap: 559415296 (533.50 M)
Pool: Code Cache (type Non-heap memory) = 251658240 (240.00 M)
Pool: Metaspace (type Non-heap memory) = -1 (-0.00 M)
Pool: Compressed Class Space (type Non-heap memory) = 1073741824 (1024.00 M)
Pool: PS Eden Space (type Heap memory) = 166723584 (159.00 M)
Pool: PS Survivor Space (type Heap memory) = 21495808 (20.50 M)
Pool: PS Old Gen (type Heap memory) = 419430400 (400.00 M)
---------------------------

So even with -Xmx600M set, the maximum available memory the driver sees through Runtime.getRuntime().maxMemory() does not match the JVM's configured maximum.
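A diagnostic like the one above (the original logging code is not shown; this sketch assumes it simply walks the JMX memory beans) can be written as:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class JvmMemoryDump {
    static String mb(long bytes) {
        return String.format("%d (%.2f M)", bytes, bytes / 1024.0 / 1024.0);
    }

    public static void main(String[] args) {
        // What Spark's UnifiedMemoryManager reads as "system memory"
        System.out.println("max: " + mb(Runtime.getRuntime().maxMemory()));
        System.out.println("Non-heap: "
                + mb(ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage().getMax()));
        System.out.println("Heap: "
                + mb(ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax()));
        // Walk every JVM memory pool (Eden, Survivor, Old Gen, Metaspace, ...);
        // getMax() returns -1 when a pool is unbounded
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println("Pool: " + pool.getName()
                    + " (type " + pool.getType() + ") = " + mb(pool.getUsage().getMax()));
        }
    }
}
```

Running this inside the driver's main method is enough to compare maxMemory() against the individual heap generations.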
也就是說(shuō)Runtime.getRuntime().maxMemory()獲取的并不是jvm的內(nèi)存,jvm的內(nèi)存有可能以某種方式進(jìn)行了限制,Runtime.getRuntime().maxMemory()獲取的內(nèi)存是動(dòng)態(tài)變化的,某種對(duì)jvm內(nèi)存的使用方式可能導(dǎo)致可用內(nèi)存減少。
具體是那種使用方式會(huì)導(dǎo)致減小還不太清楚,可能是用來(lái)做緩存之類的。
https://blog.csdn.net/u011564172/article/details/68496848
https://www.cnblogs.com/mengrennwpu/p/11754341.html
https://cloud.tencent.com/developer/article/1198464
Summary
When the heap given to a Spark process is too small, UnifiedMemoryManager rejects it at startup. Increase --driver-memory (or spark.driver.memory), keeping in mind that Runtime.getRuntime().maxMemory() reports less than the -Xmx value actually passed to the JVM, so leave some headroom above the 450 MB floor.