Spark's three run modes, and how yarn-client and yarn-cluster differ in their submit commands
This article targets Spark 2.3.1.
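standalone: offline (non-production) mode
It is split into two modes: standalone-client and standalone-cluster.

yarn: online (production) mode
It is split into yarn-client (the debugging mode, since the driver runs on the submitting machine and its output appears in the local console) and yarn-cluster.
--master yarn and --master yarn-client have the same effect [2]. The yarn-client spelling has been deprecated since Spark 2.0, so prefer --master yarn together with --deploy-mode.

mesos: online (production) mode (officially recommended)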
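As a rough illustration of the equivalence noted above, the sketch below rewrites the combined master values yarn-client and yarn-cluster into the explicit --master yarn --deploy-mode form. The function name normalize_master is hypothetical and is not part of Spark; any other master value is passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): map a combined master value to
# the explicit "--master ... --deploy-mode ..." form.
normalize_master() {
    case "$1" in
        yarn-client)  printf '%s\n' "--master yarn --deploy-mode client" ;;
        yarn-cluster) printf '%s\n' "--master yarn --deploy-mode cluster" ;;
        *)            printf '%s\n' "--master $1" ;;
    esac
}

normalize_master yarn-client
normalize_master "spark://master:7077"
```

The output can be spliced into a spark-submit command line, which keeps scripts on the explicit spelling.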
#----------------------------------------------------
Local mode (8 cores, pseudo-distributed):
spark-submit --class WordCountLocal --master local[8] /home/appleyuchi/IdeaProjects/scala-learn/target/scalalearn-1.0-SNAPSHOT-jar-with-dependencies.jar

Standalone-Client mode:
spark-submit --class WordCountLocal --master spark://master:7077 /home/appleyuchi/IdeaProjects/scala-learn/target/scalalearn-1.0-SNAPSHOT-jar-with-dependencies.jar

Standalone-Client mode (Python project file):
spark-submit --master spark://master:7077 xxx.py
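The --class option names the entry point of a JVM application, so a Python application does not need it: the .py file is simply passed in place of the jar. The following sketch shows one way a wrapper script might choose the arguments from the file extension. The helper name app_args and the file name app.jar are hypothetical, and treating every non-.py file as a jar is a simplifying assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the application part of a spark-submit command.
# Assumption: a .py file is a Python application; anything else is a JVM jar.
app_args() {
    app="$1"
    main_class="$2"
    case "$app" in
        *.py) printf '%s\n' "$app" ;;                        # no --class for Python
        *)    printf '%s\n' "--class $main_class $app" ;;    # JVM entry point
    esac
}

app_args xxx.py WordCountLocal
app_args app.jar WordCountLocal
```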

Standalone-Cluster mode:
spark-submit --class WordCountLocal --master spark://master:7077 --deploy-mode cluster --supervise /home/appleyuchi/IdeaProjects/scala-learn/target/scalalearn-1.0-SNAPSHOT-jar-with-dependencies.jar

Yarn-client mode:
spark-submit --class WordCountLocal --master yarn --deploy-mode client /home/appleyuchi/IdeaProjects/scala-learn/target/scalalearn-1.0-SNAPSHOT-jar-with-dependencies.jar
In yarn-client mode, the trailing --deploy-mode client is optional, because client is the default deploy mode.

Yarn-cluster mode:
spark-submit --class WordCountLocal --master yarn --deploy-mode cluster /home/appleyuchi/IdeaProjects/scala-learn/target/scalalearn-1.0-SNAPSHOT-jar-with-dependencies.jar
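The two YARN commands above differ only in the value passed to --deploy-mode. A minimal sketch of a wrapper that makes this explicit is given below. It only prints the command rather than running it; the function name yarn_submit_cmd and the jar name app.jar are hypothetical, and the default of client mirrors spark-submit's own default deploy mode.

```shell
#!/bin/sh
# Hypothetical wrapper: print the spark-submit command for a YARN deploy mode.
# The deploy mode defaults to "client" when the third argument is omitted.
yarn_submit_cmd() {
    main_class="$1"
    jar="$2"
    mode="${3:-client}"
    case "$mode" in
        client|cluster) ;;
        *) printf '%s\n' "unknown deploy mode: $mode" >&2; return 1 ;;
    esac
    printf '%s\n' "spark-submit --class $main_class --master yarn --deploy-mode $mode $jar"
}

yarn_submit_cmd WordCountLocal app.jar
yarn_submit_cmd WordCountLocal app.jar cluster
```

Although the commands are nearly identical, the runtime behavior is not: in client mode the driver runs inside the spark-submit process on the submitting machine, while in cluster mode it runs inside the YARN ApplicationMaster on the cluster.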

yarn means that Hadoop's resource manager is used; standalone means that Spark's own built-in resource manager is used.
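Because --master yarn relies on the Hadoop cluster's configuration, spark-submit locates the cluster through the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable. The following pre-flight check is a sketch of how a script might verify this before submitting. The function name yarn_conf_ready is hypothetical, and the check only tests that a variable is set, not that the directory contains valid configuration files.

```shell
#!/bin/sh
# Hypothetical pre-flight check before submitting with --master yarn.
# Succeeds when HADOOP_CONF_DIR or YARN_CONF_DIR is set to a non-empty value.
yarn_conf_ready() {
    [ -n "${HADOOP_CONF_DIR:-}" ] || [ -n "${YARN_CONF_DIR:-}" ]
}

if yarn_conf_ready; then
    echo "YARN configuration directory is set"
else
    echo "set HADOOP_CONF_DIR or YARN_CONF_DIR before using --master yarn"
fi
```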

[4] Master URLs
The master URL passed to Spark can be in one of the following formats:
| Master URL | Meaning |
| --- | --- |
| local | Run Spark locally with one worker thread (i.e. no parallelism at all). |
| local[K] | Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine). |
| local[K,F] | Run Spark locally with K worker threads and F maxFailures (see spark.task.maxFailures for an explanation of this variable). |
| local[*] | Run Spark locally with as many worker threads as logical cores on your machine. |
| local[*,F] | Run Spark locally with as many worker threads as logical cores on your machine and F maxFailures. |
| spark://HOST:PORT | Connect to the given Spark standalone cluster master. The port must be whichever one your master is configured to use, which is 7077 by default. |
| spark://HOST1:PORT1,HOST2:PORT2 | Connect to the given Spark standalone cluster with standby masters with ZooKeeper. The list must have all the master hosts in the high availability cluster set up with ZooKeeper. The port must be whichever each master is configured to use, which is 7077 by default. |
| mesos://HOST:PORT | Connect to the given Mesos cluster. The port must be whichever one your Mesos master is configured to use, which is 5050 by default. Or, for a Mesos cluster using ZooKeeper, use mesos://zk://.... To submit with --deploy-mode cluster, the HOST:PORT should be configured to connect to the MesosClusterDispatcher. |
| yarn | Connect to a YARN cluster in client or cluster mode depending on the value of --deploy-mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable. |
| k8s://HOST:PORT | Connect to a Kubernetes cluster in cluster mode. Client mode is currently unsupported and will be supported in future releases. The HOST and PORT refer to the Kubernetes API Server. It connects using TLS by default. In order to force it to use an unsecured connection, you can use k8s://http://HOST:PORT. |
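To make the table concrete, the sketch below classifies a master URL by its prefix using a shell case statement. The function name classify_master and the labels it prints are hypothetical; it inspects only the shape of the string and does not check that the host, port, or thread count is valid.

```shell
#!/bin/sh
# Hypothetical classifier for the master URL formats listed in the table.
# Prints a label and returns non-zero for an unrecognized format.
classify_master() {
    case "$1" in
        local|local\[*\]) echo "local" ;;        # local, local[K], local[*], local[K,F]
        spark://*)        echo "standalone" ;;
        mesos://*)        echo "mesos" ;;
        yarn)             echo "yarn" ;;
        k8s://*)          echo "kubernetes" ;;
        *)                echo "unknown"; return 1 ;;
    esac
}

for m in 'local[8]' spark://master:7077 yarn; do
    printf '%s -> %s\n' "$m" "$(classify_master "$m")"
done
```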
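
Reference:
[1] The differences between the yarn-cluster and yarn-client submit modes
[2] 4.5.1 Yarn-Client mode: example deployment and run demonstration
[3] Running Spark on YARN
[4] Submitting Applications
[5] The two ways of submitting jobs in Standalone mode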