News-Site Big Data Real-Time Analytics and Visualization System Project — 5. Hadoop 2.X HA Architecture and Deployment
1. Introduction to the HDFS-HA Architecture
Since Hadoop 2.x, Cloudera has offered QJM (Quorum Journal Manager), an HDFS HA solution based on the Paxos algorithm. It provides a clean way to solve the NameNode single-point-of-failure problem:
1) The basic principle is to store the edit log on 2N+1 JournalNodes (JNs). A write is considered successful once a majority (>= N+1) of them acknowledge it, so the data cannot be lost. The algorithm therefore tolerates at most N failed machines; if more than N fail, it no longer works. This majority rule comes from the Paxos algorithm.
2) In the HA architecture, the SecondaryNameNode cold-standby role no longer exists. To keep the standby NameNode's metadata continuously in sync with the active NameNode, the two exchange data through a group of lightweight daemon processes, the JournalNodes.
3) Whenever a modification is executed on the active NN, the corresponding edit log entry is also written to at least a majority of the JNs. The standby NN detects that the logs in the JNs have changed, reads the new edits, and applies them to its own namespace image.
When the active NN fails, the standby NN first reads all remaining edit logs from the JNs before becoming active, which reliably brings its namespace image in line with the failed NN's. It then seamlessly takes over serving client requests, achieving high availability.
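The majority rule above can be sketched in a few lines. This is purely illustrative (not Hadoop code): a QJM write is durable once a majority of the 2N+1 JournalNodes acknowledge it.

```python
def write_succeeds(total_journal_nodes: int, acks: int) -> bool:
    """An edit-log write is durable once a majority (>= N+1 of 2N+1)
    of JournalNodes acknowledge it."""
    majority = total_journal_nodes // 2 + 1
    return acks >= majority

# With 3 JournalNodes (N = 1), 2 acks suffice, and 1 failure is tolerated.
print(write_succeeds(3, 2))  # True
print(write_succeeds(3, 1))  # False
```

This is why this deployment runs JournalNodes on all three bigdata-pro nodes: with 3 JNs the cluster survives the loss of any single one.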
2. Detailed HDFS-HA Configuration
1) Edit the hdfs-site.xml configuration file
vi hdfs-site.xml
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>
        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>
        <property>
                <name>dfs.nameservices</name>
                <value>ns</value>
        </property>
        <property>
                <name>dfs.ha.namenodes.ns</name>
                <value>nn1,nn2</value>
        </property>
        <property>
                <name>dfs.namenode.rpc-address.ns.nn1</name>
                <value>bigdata-pro01.kfk.com:8020</value>
        </property>
        <property>
                <name>dfs.namenode.rpc-address.ns.nn2</name>
                <value>bigdata-pro02.kfk.com:8020</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.ns.nn1</name>
                <value>bigdata-pro01.kfk.com:50070</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.ns.nn2</name>
                <value>bigdata-pro02.kfk.com:50070</value>
        </property>
        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://bigdata-pro01.kfk.com:8485;bigdata-pro02.kfk.com:8485;bigdata-pro03.kfk.com:8485/ns</value>
        </property>
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/opt/modules/hadoop-2.5.0/data/jn</value>
        </property>
        <property>
                <name>dfs.client.failover.proxy.provider.ns</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <property>
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>sshfence</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/kfk/.ssh/id_rsa</value>
        </property>
</configuration>
2) Edit the core-site.xml configuration file
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://ns</value>
        </property>
        <property>
                <name>hadoop.http.staticuser.user</name>
                <value>kfk</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/opt/modules/hadoop-2.5.0/data/tmp</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file://${hadoop.tmp.dir}/dfs/name</value>
        </property>
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
        </property>
</configuration>
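With `fs.defaultFS` set to the logical nameservice `hdfs://ns`, clients never hard-code a NameNode host; the ConfiguredFailoverProxyProvider tries the NameNodes listed under `dfs.ha.namenodes.ns` until one reports itself active. A rough sketch of that lookup (illustrative only — the hostnames mirror the config above, and the `is_active` callback stands in for the real Hadoop RPC):

```python
# Mirrors dfs.ha.namenodes.ns and dfs.namenode.rpc-address.ns.* above.
NAMENODES = {
    "nn1": "bigdata-pro01.kfk.com:8020",
    "nn2": "bigdata-pro02.kfk.com:8020",
}

def resolve_active(is_active) -> str:
    """Try each configured NameNode in turn until one is active."""
    for nn_id, rpc_address in NAMENODES.items():
        if is_active(rpc_address):
            return rpc_address
    raise RuntimeError("no active NameNode for nameservice 'ns'")

# Example: pretend nn2 is currently the active NameNode.
print(resolve_active(lambda addr: addr.startswith("bigdata-pro02")))
# bigdata-pro02.kfk.com:8020
```

This is why a failover is transparent to applications: the logical name `ns` stays the same while the proxy provider re-resolves it to whichever NameNode is active.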
3) Distribute the modified configuration files to the other nodes
scp hdfs-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp hdfs-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp core-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp core-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
3. Starting the HDFS-HA Services and Testing Automatic Failover
1) Start the Zookeeper process on every node
zkServer.sh start
2) Start the JournalNode process on every node
sbin/hadoop-daemon.sh start journalnode
3) On [nn1], format the NameNode and start it
#format the namenode
bin/hdfs namenode -format
#format the HA state in ZooKeeper
bin/hdfs zkfc -formatZK
#start the namenode
bin/hdfs namenode
4) On [nn2], sync the metadata from nn1
bin/hdfs namenode -bootstrapStandby
5) After nn2 has finished syncing, press Ctrl+C on nn1 to stop the NameNode process, then stop the JournalNode processes on all nodes
sbin/hadoop-daemon.sh stop journalnode
6) Start all HDFS-related processes with one command
sbin/start-dfs.sh
After HDFS is up, kill the NameNode that is currently in Active state and check whether the other NameNode automatically switches to Active. The HA state of each NameNode can be queried with:
#check the HA state of each NameNode
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2
Also upload a file to HDFS from the command line to verify that HDFS is still usable.
4. Introduction to the YARN-HA Architecture
ResourceManager HA consists of an Active/Standby pair of nodes and uses an RMStateStore to persist internal state and the key data and markers of running applications. The currently supported RMStateStore implementations are the in-memory MemoryRMStateStore, the filesystem-based FileSystemRMStateStore, and the ZooKeeper-based ZKRMStateStore. The ResourceManager HA architecture is essentially the same as NameNode HA, except that shared state is handled by the RMStateStore, and the ZKFC runs as a service inside the ResourceManager process rather than as a standalone daemon.
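The active/standby election behind this can be pictured as racing to create an ephemeral lock node in ZooKeeper: whichever ResourceManager creates it first becomes active, and when that RM dies its ephemeral node vanishes, letting the standby take over. A toy simulation of the idea (a plain dict stands in for ZooKeeper; the znode path is made up for illustration and nothing here is real Hadoop or ZooKeeper API):

```python
zk = {}  # toy stand-in for the ZooKeeper ensemble
LOCK = "/yarn-leader-election/rs/lock"  # hypothetical znode path

def try_become_active(rm_id: str) -> bool:
    """The first RM to create the ephemeral lock node wins the election."""
    if LOCK not in zk:
        zk[LOCK] = rm_id
        return True
    return False

def crash(rm_id: str) -> None:
    """A crash deletes the RM's ephemeral node, releasing the lock."""
    if zk.get(LOCK) == rm_id:
        del zk[LOCK]

print(try_become_active("rm1"))  # True  -> rm1 becomes active
print(try_become_active("rm2"))  # False -> rm2 stays standby
crash("rm1")
print(try_become_active("rm2"))  # True  -> rm2 takes over
```

The real ZKFC also fences the old active and republishes the active RM's address, but the lock race above is the core of automatic failover.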
5. Detailed YARN-HA Configuration
1) Edit the mapred-site.xml configuration file
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>
2) Edit the yarn-site.xml configuration file
<configuration>
        <property>
                <name>yarn.resourcemanager.cluster-id</name>
                <value>rs</value>
        </property>
        <property>
                <name>yarn.resourcemanager.ha.rm-ids</name>
                <value>rm1,rm2</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm1</name>
                <value>bigdata-pro01.kfk.com</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm2</name>
                <value>bigdata-pro02.kfk.com</value>
        </property>
        <property>
                <name>yarn.resourcemanager.zk.state-store.address</name>
                <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
        </property>
        <property>
                <name>yarn.resourcemanager.zk-address</name>
                <value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
        </property>
        <property>
                <name>yarn.resourcemanager.recovery.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.ha.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
</configuration>
3) Distribute the modified configuration files to the other nodes
scp yarn-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp yarn-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp mapred-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp mapred-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
6. Starting the YARN-HA Services and Testing Automatic Failover
1) Start the YARN services on the rm1 node
sbin/start-yarn.sh
2) Start the ResourceManager service on the rm2 node
sbin/yarn-daemon.sh start resourcemanager
3) View the YARN web UIs
http://bigdata-pro01.kfk.com:8088
http://bigdata-pro02.kfk.com:8088
4) Check the active/standby state of the ResourceManager nodes
#run on bigdata-pro01.kfk.com
bin/yarn rmadmin -getServiceState rm1
#run on bigdata-pro02.kfk.com
bin/yarn rmadmin -getServiceState rm2
5) Run WordCount on the Hadoop cluster as a test
#wordcount requires both an input path and a not-yet-existing output path
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/kfk/data/wc.input /user/kfk/data/wc.output
Reprinted from: https://www.cnblogs.com/ratels/p/10844674.html