

Hadoop 2.4.1 Cluster Setup

Published: 2023/11/27

Prepare the Linux Environment

Modify the hostname

$ vim /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=hadoop001
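Instead of editing by hand, the HOSTNAME line can be rewritten non-interactively with sed. A minimal sketch; it works on a throwaway copy under /tmp so it is safe to try anywhere (the real target would be /etc/sysconfig/network):

```shell
# Rewrite the HOSTNAME= line in a sysconfig-style network file.
set_hostname_in_file() {
  local file="$1" name="$2"
  sed -i "s/^HOSTNAME=.*/HOSTNAME=${name}/" "$file"
}

# Demonstration against a throwaway copy of the file:
printf 'NETWORKING=yes\nHOSTNAME=localhost\n' > /tmp/network.demo
set_hostname_in_file /tmp/network.demo hadoop001
cat /tmp/network.demo
```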


Modify the IP address

# vim /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0

HWADDR=?????????????

TYPE=Ethernet

UUID=????????????????

ONBOOT=yes

NM_CONTROLLED=yes

BOOTPROTO=static

IPADDR=172.17.30.111

NETMASK=255.255.254.0

GATEWAY=172.17.30.1

DNS1=223.5.5.5

DNS2=223.6.6.6


Disable the firewall

Check the firewall status

service iptables status

Stop the firewall

service iptables stop

Check whether the firewall is enabled at boot

chkconfig iptables --list

Disable the firewall at boot

chkconfig iptables off
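To confirm the service really is off at every runlevel, the `chkconfig --list` line can be checked in a script. This sketch parses a captured sample line rather than querying a live system, so it runs anywhere:

```shell
# Succeed only if no runlevel in a "chkconfig --list" line is "on".
all_runlevels_off() {
  ! echo "$1" | grep -q ':on'
}

sample_off='iptables  0:off 1:off 2:off 3:off 4:off 5:off 6:off'
sample_on='iptables   0:off 1:off 2:on 3:on 4:on 5:on 6:off'

all_runlevels_off "$sample_off" && echo "iptables disabled at boot"
```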


Map hostnames to IP addresses

$ vim /etc/hosts

172.17.30.111   hadoop001
172.17.30.112   hadoop002
172.17.30.113   hadoop003
172.17.30.114   hadoop004
172.17.30.115   hadoop005
172.17.30.116   hadoop006
172.17.30.117   hadoop007
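Since the seven entries follow one pattern, they can be generated instead of typed. A sketch assuming the 172.17.30.111–117 range used above:

```shell
# Emit /etc/hosts entries for hadoop001..hadoop007 (172.17.30.111..117).
gen_hosts() {
  local i
  for i in 1 2 3 4 5 6 7; do
    printf '172.17.30.%d\thadoop%03d\n' $((110 + i)) "$i"
  done
}

gen_hosts    # append to the real file with: gen_hosts >> /etc/hosts (as root)
```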

Reboot the machine

# reboot


Install the JDK

Extract the JDK archive

# tar -zxvf jdk-7u79-linux-x64.tar.gz -C /opt/modules/

Add environment variables

# vim /etc/profile

##JAVA

JAVA_HOME=/opt/modules/jdk1.7.0_79

JRE_HOME=/opt/modules/jdk1.7.0_79/jre

PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

export JAVA_HOME JRE_HOME PATH CLASSPATH

Reload the configuration

# source /etc/profile
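After sourcing the profile, it is worth checking that JAVA_HOME actually points at a directory containing an executable bin/java before moving on. A small check function, demonstrated against a stand-in directory so the sketch runs anywhere:

```shell
# Succeed only if $1 looks like a JDK home (has an executable bin/java).
check_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Demonstration with a fake directory tree; on the real host you would
# call: check_java_home "$JAVA_HOME"
mkdir -p /tmp/fakejdk/bin
touch /tmp/fakejdk/bin/java && chmod +x /tmp/fakejdk/bin/java
check_java_home /tmp/fakejdk && echo "JAVA_HOME looks sane"
```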


Install Hadoop 2.4.1

Extract the Hadoop 2.4.1 archive

# tar -zxvf hadoop-2.4.1.tar.gz -C /opt/modules/

Add environment variables

# vim /etc/profile

##HADOOP

export HADOOP_HOME=/opt/modules/hadoop-2.4.1

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Reload the configuration

# source /etc/profile


Cluster plan:

Hostname    IP              Installed software       Running processes
hadoop001   172.17.30.111   jdk, hadoop              NameNode, DFSZKFailoverController (zkfc)
hadoop002   172.17.30.112   jdk, hadoop              NameNode, DFSZKFailoverController (zkfc)
hadoop003   172.17.30.113   jdk, hadoop              ResourceManager
hadoop004   172.17.30.114   jdk, hadoop              ResourceManager
hadoop005   172.17.30.115   jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
hadoop006   172.17.30.116   jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
hadoop007   172.17.30.117   jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain

Notes:

1. In Hadoop 2.0, HDFS HA typically consists of two NameNodes, one active and one standby. The active NameNode serves client requests; the standby does not, and only synchronizes the active NameNode's state so that it can take over quickly if the active fails.

Hadoop 2.0 officially provides two HDFS HA solutions: NFS and QJM. Here we use the simpler QJM. In this scheme, the active and standby NameNodes synchronize metadata through a group of JournalNodes; a write is considered successful once it has been persisted to a majority of the JournalNodes. An odd number of JournalNodes is usually configured.
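The "majority of JournalNodes" rule is why an odd count is preferred: with n nodes the write quorum is floor(n/2)+1, so 3 nodes tolerate 1 failure while 4 nodes still tolerate only 1. A sketch of the arithmetic:

```shell
# Write quorum for n JournalNodes: floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Failures tolerated while still reaching quorum: n - quorum(n).
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

echo "3 JNs: quorum $(quorum 3), tolerates $(tolerated 3) failure(s)"
echo "4 JNs: quorum $(quorum 4), tolerates $(tolerated 4) failure(s)"
```

Adding a fourth JournalNode raises the quorum to 3 without improving fault tolerance, which is why 3 (or 5) is the usual choice.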

A ZooKeeper ensemble is also configured here for ZKFC (DFSZKFailoverController) failover: when the active NameNode goes down, the standby NameNode is automatically promoted to active.

2. hadoop-2.2.0 still had only one ResourceManager and therefore a single point of failure. hadoop-2.4.1 fixes this with two ResourceManagers, one active and one standby, whose state is coordinated through ZooKeeper.


Configure HDFS:

Modify hadoop-env.sh

# vim hadoop-env.sh

export JAVA_HOME=/opt/modules/jdk1.7.0_79


Modify core-site.xml

# vim core-site.xml

<configuration>
    <!-- Set the HDFS nameservice to ns1 -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <!-- Hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/tmp</value>
    </property>
    <!-- ZooKeeper quorum addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop005:2181,hadoop006:2181,hadoop007:2181</value>
    </property>
</configuration>
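The same host:port list reappears later in yarn-site.xml, so it can help to derive it once from the host list rather than retype it. A sketch:

```shell
# Join hosts into a ZooKeeper quorum string: host:2181,host:2181,...
zk_quorum() {
  local out="" h
  for h in "$@"; do
    out="${out:+$out,}${h}:2181"
  done
  echo "$out"
}

zk_quorum hadoop005 hadoop006 hadoop007
```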

Modify hdfs-site.xml

# vim hdfs-site.xml

<configuration>
    <!-- Set the HDFS nameservice to ns1; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>hadoop001:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>hadoop001:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>hadoop002:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>hadoop002:50070</value>
    </property>
    <!-- Where NameNode metadata is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop005:8485;hadoop006:8485;hadoop007:8485/ns1</value>
    </property>
    <!-- Where each JournalNode stores its data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/data/journaldata</value>
    </property>
    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Failover proxy provider used by clients -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- sshfence requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- sshfence connection timeout -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
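Several of these values must agree across files (for example dfs.nameservices here and fs.defaultFS in core-site.xml). A rough consistency check can pull single-line property values out with grep and sed — fragile for general XML, but adequate for configs formatted like the ones above. Demonstrated on a heredoc sample file:

```shell
# Extract the <value> that follows a given <name> in a Hadoop-style XML file.
# Assumes one tag per line, as in the config files above.
get_prop() {
  grep -A1 "<name>$1</name>" "$2" | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p'
}

# Sample file standing in for a real hdfs-site.xml:
cat > /tmp/hdfs-site.sample.xml <<'EOF'
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
</configuration>
EOF

get_prop dfs.nameservices /tmp/hdfs-site.sample.xml
```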

Modify mapred-site.xml

# cp mapred-site.xml.template mapred-site.xml

# vim mapred-site.xml

<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Modify yarn-site.xml

# vim yarn-site.xml

<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- ResourceManager cluster id -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>
    <!-- Logical ids of the ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- Hostnames of the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop003</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop004</value>
    </property>
    <!-- ZooKeeper ensemble addresses -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop005:2181,hadoop006:2181,hadoop007:2181</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Modify slaves (the slaves file lists the worker nodes. Since HDFS is started on hadoop001 and YARN on hadoop003, the slaves file on hadoop001 specifies where the DataNodes run, and the slaves file on hadoop003 specifies where the NodeManagers run):

# vim slaves

hadoop005

hadoop006

hadoop007


Configure passwordless SSH login:

Generate a key pair on hadoop001

# ssh-keygen -t rsa

Configure passwordless login from hadoop001 to hadoop002, hadoop003, hadoop004, hadoop005, hadoop006, and hadoop007

Copy the public key to every node, including hadoop001 itself

# ssh-copy-id hadoop001

# ssh-copy-id hadoop002

# ssh-copy-id hadoop003

# ssh-copy-id hadoop004

# ssh-copy-id hadoop005

# ssh-copy-id hadoop006

# ssh-copy-id hadoop007
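The seven invocations above can be collapsed into a loop. Shown here in a dry-run form that only prints the commands; drop the echo to actually run them:

```shell
# Print (dry run) the ssh-copy-id commands for a list of hosts.
copy_keys() {
  local h
  for h in "$@"; do
    echo ssh-copy-id "$h"
  done
}

copy_keys hadoop001 hadoop002 hadoop003 hadoop004 hadoop005 hadoop006 hadoop007
```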

Generate a key pair on hadoop003

# ssh-keygen -t rsa

Configure passwordless login from hadoop003 to hadoop004, hadoop005, hadoop006, and hadoop007

# ssh-copy-id hadoop004

# ssh-copy-id hadoop005

# ssh-copy-id hadoop006

# ssh-copy-id hadoop007

Note: the two NameNodes must also have passwordless SSH between them.

Generate a key pair on hadoop002

# ssh-keygen -t rsa

Configure passwordless login from hadoop002 to hadoop001

# ssh-copy-id hadoop001

Copy the configured hadoop-2.4.1 directory to the other nodes

# scp -r hadoop-2.4.1/ hadoop002:/opt/modules/

# scp -r hadoop-2.4.1/ hadoop003:/opt/modules/

# scp -r hadoop-2.4.1/ hadoop004:/opt/modules/

# scp -r hadoop-2.4.1/ hadoop005:/opt/modules/

# scp -r hadoop-2.4.1/ hadoop006:/opt/modules/

# scp -r hadoop-2.4.1/ hadoop007:/opt/modules/
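These six transfers follow one pattern, so they too can be generated in a loop. A dry-run sketch that prints the commands (pipe through `sh` or drop the printf indirection to execute them):

```shell
# Print (dry run) the scp commands that push hadoop-2.4.1 to hosts 002..007.
push_hadoop() {
  local i
  for i in 2 3 4 5 6 7; do
    printf 'scp -r hadoop-2.4.1/ hadoop%03d:/opt/modules/\n' "$i"
  done
}

push_hadoop
```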

Install and configure the ZooKeeper ensemble (on hadoop005)

Extract ZooKeeper

# tar -zxvf zookeeper-3.4.5.tar.gz -C /opt/modules/

Add environment variables

# vim /etc/profile

##ZOOKEEPER

export ZOOKEEPER_HOME=/opt/modules/zookeeper-3.4.5

export PATH=$PATH:$ZOOKEEPER_HOME/bin

Modify the configuration

# pwd

/opt/modules/zookeeper-3.4.5/conf

# cp zoo_sample.cfg zoo.cfg

# vim zoo.cfg

Change: dataDir=/opt/modules/zookeeper-3.4.5/tmp

Append at the end of the configuration file:

server.1=hadoop005:2888:3888

server.2=hadoop006:2888:3888

server.3=hadoop007:2888:3888

Create the tmp directory

# mkdir tmp

In the tmp directory, create the file myid containing the text 1

# echo 1 > myid

Example

# cat myid

1

Copy the configured ZooKeeper directory to the other nodes

# scp -r zookeeper-3.4.5/ hadoop006:/opt/modules/

# scp -r zookeeper-3.4.5/ hadoop007:/opt/modules/

Note: update the contents of /opt/modules/zookeeper-3.4.5/tmp/myid on hadoop006 and hadoop007 accordingly.

hadoop006:

# echo 2 > myid

hadoop007:

# echo 3 > myid
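The host-to-myid assignment can be centralized in one function so the three echo commands cannot drift out of sync with the server.N lines in zoo.cfg. A sketch:

```shell
# Map a hostname to its ZooKeeper myid, matching the server.N lines in zoo.cfg.
myid_for() {
  case "$1" in
    hadoop005) echo 1 ;;
    hadoop006) echo 2 ;;
    hadoop007) echo 3 ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}

# On each ZooKeeper node one would then run (path as configured above):
#   myid_for "$(hostname)" > /opt/modules/zookeeper-3.4.5/tmp/myid
myid_for hadoop006
```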

Note: the first time the cluster is started, follow these steps strictly in order:

Start the ZooKeeper ensemble (on hadoop005, hadoop006, and hadoop007)

$ zkServer.sh start

Check the status (expect one leader and two followers)

# zkServer.sh status

Start the JournalNodes (run on hadoop005, hadoop006, and hadoop007)

# hadoop-daemon.sh start journalnode

Run jps: if a JournalNode process appears, the JournalNode started successfully

# jps

Example

2308 QuorumPeerMain

2439 JournalNode

2486 Jps
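The jps check can be scripted by grepping the process-name column. Demonstrated here against the captured sample output above rather than a live JVM:

```shell
# Succeed if a jps listing contains a given process name.
has_process() {
  echo "$1" | awk '{print $2}' | grep -qx "$2"
}

jps_sample='2308 QuorumPeerMain
2439 JournalNode
2486 Jps'

has_process "$jps_sample" JournalNode && echo "JournalNode is running"
```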

Format HDFS

# hdfs namenode -format

Formatting generates files under the hadoop.tmp.dir directory configured in core-site.xml (here /opt/data/tmp). Then copy /opt/data/tmp to /opt/data/ on hadoop002:

# scp -r tmp/ hadoop002:/opt/data/

Format ZKFC

# hdfs zkfc -formatZK

Start HDFS (on hadoop001):

# start-dfs.sh

Start YARN (note: run this on hadoop003. The NameNode and ResourceManager are kept on separate machines for performance reasons; both consume significant resources, so they are started on different hosts):

# start-yarn.sh
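The first-boot ordering (ZooKeeper, then JournalNodes, then NameNode format, then ZKFC format, then HDFS, then YARN) can be captured in one place so it is not forgotten on a rebuild. A dry-run sketch that just prints each step in order, with the target hosts from this guide as comments:

```shell
# Print the first-time startup sequence in the required order (dry run).
first_boot_plan() {
  echo 'zkServer.sh start                    # on hadoop005-007'
  echo 'hadoop-daemon.sh start journalnode   # on hadoop005-007'
  echo 'hdfs namenode -format                # on hadoop001 only'
  echo 'hdfs zkfc -formatZK                  # on hadoop001 only'
  echo 'start-dfs.sh                         # on hadoop001'
  echo 'start-yarn.sh                        # on hadoop003'
}

first_boot_plan
```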

Hadoop 2.4.1 is now configured. Verify in a browser:

http://hadoop001:50070

NameNode 'hadoop001:9000' (active)

http://hadoop002:50070

NameNode 'hadoop002:9000' (standby)

Some commands for testing cluster state:

# hdfs dfsadmin -report                     show status of each HDFS node

# hdfs haadmin -getServiceState nn1         get the HA state of a NameNode

# hadoop-daemon.sh start namenode           start a single NameNode process

# hadoop-daemon.sh start zkfc               start a single zkfc process

If only three hosts are available, the deployment can be planned as follows:

hadoop001    zookeeper   journalnode   namenode   zkfc   resourcemanager   datanode
hadoop002    zookeeper   journalnode   namenode   zkfc   resourcemanager   datanode
hadoop003    zookeeper   journalnode   datanode

Reposted from: https://www.cnblogs.com/goodcheap/p/6113098.html
