Detailed Spark Cluster Setup and Troubleshooting (Part 4)
In Detailed Spark Cluster Setup and Troubleshooting (Part 3) we walked through installing Hadoop; this article covers installing and configuring Spark.
spark@master:~/spark$ cd hadoop
spark@master:~/spark/hadoop$ cd $SPARK_HOME/conf
spark@master:~/spark/spark/conf$ cp slaves.template slaves
spark@master:~/spark/spark/conf$ vim slaves

Add the following content (the original shows this step as a screenshot; judging from the start-all.sh output below, the worker list on this cluster is master, worker1, and worker2):

master
worker1
worker2
spark@master:~/spark/spark/conf$ cp spark-env.sh.template spark-env.sh

spark-env.sh holds the environment configuration loaded when the Spark daemons start.
The template itself documents each option in detail; here we add just a few settings based on it:
spark@master:~/spark/spark/conf$ vim spark-env.sh

Add the following content:
export SPARK_PID_DIR=/home/spark/spark/spark/tmp/pid
export SCALA_HOME=/home/spark/spark/scala
export JAVA_HOME=/home/spark/spark/jdk
export HADOOP_HOME=/home/spark/spark/hadoop
export SPARK_MASTER_IP=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=2G
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

Adjust the directories to match your own layout, then save.
spark@master:~/spark/spark/conf$ hadoop fs -mkdir hdfs://master:9000/sparkHistoryLogs
mkdir: Cannot create directory /sparkHistoryLogs. Name node is in safe mode.

The directory cannot be created because the NameNode is in safe mode, so we turn safe mode off first.
Then recreate the directory.
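The original post shows this step as a screenshot; the implied sequence, using the same commands that appear later in this article, is:

spark@master:~/spark/spark/conf$ hdfs dfsadmin -safemode leave
spark@master:~/spark/spark/conf$ hadoop fs -mkdir hdfs://master:9000/sparkHistoryLogs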
Next, configure spark-defaults.conf, the file Spark reads by default when a job is submitted.
spark@master:~/spark/spark/conf$ vim spark-defaults.conf

Add the following content:
spark.master                     spark://master:7077
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://master:9000/sparkHistoryLogs
spark.eventLog.compress          true
spark.history.updateInterval     5
spark.history.ui.port            7777
spark.history.fs.logDirectory    hdfs://master:9000/sparkHistoryLogs

Now copy the configured Spark directory to the worker1 and worker2 nodes.
Switch to worker1 and run:
spark@worker1:~/spark$ scp -r spark@master:/home/spark/spark/spark ./spark

Note that the directory is copied into the spark directory.
Switch to worker2 and run:
spark@worker2:~/spark$ scp -r spark@master:/home/spark/spark/spark ./spark

Again, the directory is copied into the spark directory.
Switch back to master.
Now start Spark:
spark@master:~/spark/spark/conf$ $SPARK_HOME/sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/spark/spark/spark/logs/spark-spark-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/spark/spark/logs/spark-spark-org.apache.spark.deploy.worker.Worker-1-master.out
worker2: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/spark/spark/logs/spark-spark-org.apache.spark.deploy.worker.Worker-1-worker2.out
worker1: starting org.apache.spark.deploy.worker.Worker, logging to /home/spark/spark/spark/logs/spark-spark-org.apache.spark.deploy.worker.Worker-1-worker1.out

The output shows that the cluster started successfully.
To stop Spark, use:

$SPARK_HOME/sbin/stop-all.sh

Start the Spark history server:
spark@master:~/spark/spark/conf$ $SPARK_HOME/sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /home/spark/spark/spark/logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-master.out

Check all Spark- and Hadoop-related processes:
spark@master:~/spark/spark/conf$ jps -l
6711 org.apache.hadoop.hdfs.server.namenode.NameNode
18863 org.apache.spark.deploy.master.Master
7053 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
18966 org.apache.spark.deploy.worker.Worker
19122 sun.tools.jps.Jps
19070 org.apache.spark.deploy.history.HistoryServer
15529 org.apache.hadoop.hdfs.server.datanode.DataNode
7352 org.apache.hadoop.yarn.server.nodemanager.NodeManager
7222 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager

At this point the Spark cluster is up and running.
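The jps output above covers the master only. To confirm the worker daemons as well, one option (assuming the passwordless SSH set up in the earlier parts of this series, and using the full jps path under the JAVA_HOME from spark-env.sh, since jps may not be on the PATH of a non-interactive shell) is:

spark@master:~$ ssh worker1 /home/spark/spark/jdk/bin/jps
spark@master:~$ ssh worker2 /home/spark/spark/jdk/bin/jps

Each worker should list a DataNode, a NodeManager, and a Spark Worker.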
Testing the Spark cluster with spark-shell:
First run

hdfs dfsadmin -safemode leave

to make sure safe mode is off.
spark@master:~/spark/spark/conf$ $SPARK_HOME/bin/spark-shell --master spark://master:7077

The shell starts successfully.
Try a quick test:
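The original shows this test as a screenshot, so the exact commands are not preserved; a minimal smoke test to type at the scala> prompt is a parallel sum over the cluster, which should return 5050:

scala> val rdd = sc.parallelize(1 to 100)
scala> rdd.reduce(_ + _)

If the job completes and the result comes back, the shell is talking to the cluster correctly.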
Some web UIs:
Cluster node information: http://master:8080 (master can be replaced with its IP address)
Job history: http://master:7777. Since no job has been run yet, there is nothing to see here.
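To see an entry appear there, one option is to run the SparkPi example that ships with standard Spark distributions. Since spark.master and the event-log settings were placed in spark-defaults.conf above, its logs should land in hdfs://master:9000/sparkHistoryLogs (depending on the Spark version, you may need to point the example at the cluster explicitly; older releases honor the MASTER environment variable, while newer ones pick up spark.master from spark-defaults.conf):

spark@master:~$ $SPARK_HOME/bin/run-example SparkPi 10

After it finishes, refresh http://master:7777 and the completed application should be listed.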
Hadoop cluster information: http://master:50070/
That page shows that safe mode is off. To put the NameNode back into safe mode, run the following command:

spark@master:~/spark/spark/conf$ hdfs dfsadmin -safemode enter

With that, everything is installed.
To stop the Spark cluster, run:

spark@master:~/spark/spark/conf$ $SPARK_HOME/sbin/stop-all.sh
master: stopping org.apache.spark.deploy.worker.Worker
worker2: stopping org.apache.spark.deploy.worker.Worker
worker1: stopping org.apache.spark.deploy.worker.Worker
stopping org.apache.spark.deploy.master.Master

To stop the Hadoop cluster, run:
spark@master:~/spark/spark/conf$ $HADOOP_HOME/sbin/stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [master]
master: no namenode to stop
master: stopping datanode
worker1: stopping datanode
worker2: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: no secondarynamenode to stop
stopping yarn daemons
no resourcemanager to stop
master: no nodemanager to stop
worker1: no nodemanager to stop
worker2: no nodemanager to stop
no proxyserver to stop

Finally, some useful health-check commands, taken from http://ciscolinux.blog.51cto.com/746827/1313110:
1. Check whether the HDFS ports are open:
netstat -tupln | grep 9000
netstat -tupln | grep 9001
2. Check whether the master (NameNode) and slave (JobTracker) web UIs respond: http://192.168.0.202:50070 and port 50030. (Note that the source post targets Hadoop 1.x; on the Hadoop 2.x/YARN cluster built in this series, ResourceManager and NodeManager take the place of JobTracker and TaskTracker, as the jps output above shows.)
3. Use jps to check whether the daemons are running.
The master shows: JobTracker, Jps, SecondaryNameNode, NameNode
The slaves show: DataNode, Jps, TaskTracker
4. View cluster status statistics with hadoop dfsadmin -report.
The report lists information for both master and slaves.
5. Common HDFS commands:
hadoop dfs -ls                  # list files under HDFS
hadoop dfs -ls in               # list files under a given HDFS directory
hadoop dfs -put test.txt test   # upload a file to a given path under a new name; the upload only succeeds once every DataNode has received the data
hadoop dfs -get in getin        # fetch a file from HDFS, renaming it getin; like put, this works on files and directories
hadoop dfs -rmr out             # delete the out directory on HDFS
hadoop dfs -cat in/*            # show the contents of the in directory on HDFS
hadoop dfsadmin -safemode leave # leave safe mode
hadoop dfsadmin -safemode enter # enter safe mode
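A quick round trip using the commands above (the file names are arbitrary):

spark@master:~$ echo "hello spark" > test.txt
spark@master:~$ hadoop dfs -put test.txt test
spark@master:~$ hadoop dfs -cat test
hello spark
spark@master:~$ hadoop dfs -rmr test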
To add a new node, follow the same configuration steps used for worker1 and worker2, as sketched below.
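For example, to add a hypothetical worker3 (the hostname is illustrative, and the OS user, JDK, Hadoop setup, and passwordless SSH from the earlier parts of this series are assumed to be in place), the sketch mirrors the worker1/worker2 steps: copy the Spark directory over, add the hostname to conf/slaves on master, and restart the cluster:

spark@worker3:~/spark$ scp -r spark@master:/home/spark/spark/spark ./spark
spark@master:~$ echo "worker3" >> $SPARK_HOME/conf/slaves
spark@master:~$ $SPARK_HOME/sbin/stop-all.sh
spark@master:~$ $SPARK_HOME/sbin/start-all.sh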
Reposted from: https://blog.51cto.com/lefteva/1874268