

Setting up a Hadoop Cluster Platform


Environment:
master: 192.168.1.20
slave1: 192.168.1.21
slave2: 192.168.1.22

Preparation:

```bash
# install the required packages, then disable the firewall and SELinux
yum -y install wget vim gcc net-tools curl lrzsz rsync
yum update
systemctl status firewalld
systemctl stop firewalld
systemctl disable firewalld
vim /etc/selinux/config        # set SELINUX=disabled
vim /etc/security/limits.conf  # raise the open-file limits; append at the end:
* soft nofile 65536            # open files (-n)
* hard nofile 65536
* soft nproc 65565             # max user processes (-u)
* hard nproc 65565
```

Change the hostname:

```bash
vim /etc/hostname   # on master and on both slaves, delete the existing hostname and add master / slave1 / slave2 respectively
```
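
On a systemd-based system (CentOS 7 is assumed here, given the yum/systemctl commands above), the same change can be made without editing the file, for example:

```bash
# run the matching command on each node
hostnamectl set-hostname master    # on 192.168.1.20
hostnamectl set-hostname slave1    # on 192.168.1.21
hostnamectl set-hostname slave2    # on 192.168.1.22
```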

Change the hosts file:

```bash
vim /etc/hosts   # append the same entries on all three hosts:
192.168.1.20 master
192.168.1.21 slave1
192.168.1.22 slave2
```
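
An optional sanity check (a sketch, assuming the three entries above are in place on the current host) confirms that each name resolves and is reachable:

```bash
for h in master slave1 slave2; do
  ping -c 1 "$h" > /dev/null && echo "$h reachable" || echo "$h NOT reachable"
done
```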

Install the JDK:

```bash
tar -xzvf /usr/local/src/jdk-16_linux-x64_bin.tar.gz -C /usr/local/
vim /etc/profile   # append:
export JAVA_HOME=/usr/local/jdk-16
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=$JAVA_HOME/jre
# reload
source /etc/profile
# verify
java -version
```

Add a hadoop user:

```bash
# add the user on all three machines
useradd hadoop
passwd hadoop    # password: 123456
```

Configure passwordless SSH login (required on every node)

Switch to the hadoop user:

```bash
[root@localhost ~]# su - hadoop
[hadoop@localhost ~]$
```

Generate a key pair on each node:

```bash
ssh-keygen -t rsa -P ''   # or simply ssh-keygen, confirming every prompt
# check that the key pair was generated under the hadoop home directory
[hadoop@localhost .ssh]$ cd /home/hadoop/.ssh
[hadoop@localhost .ssh]$ ls
id_rsa  id_rsa.pub
# append id_rsa.pub to the authorized keys file
[hadoop@localhost .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@localhost .ssh]$ ls -a
.  ..  authorized_keys  id_rsa  id_rsa.pub
# fix the permissions on authorized_keys
[hadoop@master .ssh]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@master .ssh]$ ll
total 16
-rw------- 1 hadoop hadoop  410 Mar 25 20:43 authorized_keys
-rw------- 1 hadoop hadoop 1679 Mar 25 20:34 id_rsa
-rw-r--r-- 1 hadoop hadoop  410 Mar 25 20:34 id_rsa.pub
-rw-r--r-- 1 hadoop hadoop  171 Mar 25 20:50 known_hosts
```

Configure the SSH service:

```bash
# as the root user
[root@localhost ~]# vim /etc/ssh/sshd_config
# find "#PubkeyAuthentication yes" and remove the leading #
PubkeyAuthentication yes
```

Restart the SSH service:

systemctl restart sshd

Verify SSH login to the local machine:

```bash
# switch to the hadoop user
su - hadoop
[hadoop@master ~]$ ssh localhost
```

On the first login SSH warns that the authenticity of the host cannot be established and shows only its key fingerprint; answer yes to continue. Subsequent logins go straight through with no confirmation prompt and no password, which shows that passwordless SSH login is working.

Exchange SSH keys

Exchange keys between master and slave1/slave2 so that master and the slaves can SSH to each other without a password.
Copy the master node's public key id_rsa.pub to each slave node (run as the hadoop user):

```bash
[hadoop@master ~]$ scp ~/.ssh/id_rsa.pub hadoop@slave1:~/
The authenticity of host 'slave1 (192.168.1.21)' can't be established.
ECDSA key fingerprint is SHA256:jxYxSoANdkaRE8gUXyYb0qmCBDBjg8lBsfbeXl+aM4E.
ECDSA key fingerprint is MD5:25:15:dd:51:12:ee:b5:6e:fd:08:81:b2:78:84:26:3c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.1.21' (ECDSA) to the list of known hosts.
hadoop@slave1's password:
Permission denied, please try again.
hadoop@slave1's password:
id_rsa.pub                100%  410   756.8KB/s   00:00
# do the same for slave2
scp ~/.ssh/id_rsa.pub hadoop@slave2:~/
```

On each slave node, append the public key copied from master to the authorized_keys file.
Log in as the hadoop user on slave1 and slave2 and run:

```bash
[hadoop@localhost ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys   # note: the source path here is ~/, not ~/.ssh/
```

On each slave node, delete the copied master public key file id_rsa.pub:

rm -rf ~/id_rsa.pub

Copy each slave node's public key back to master (repeat for every slave):

```bash
# copy the slave node's public key to master
[hadoop@localhost .ssh]$ scp ~/.ssh/id_rsa.pub hadoop@master:~/
The authenticity of host 'master (192.168.1.20)' can't be established.
ECDSA key fingerprint is SHA256:AlbOTMHeCJIgoXJOW7d9N9pSMRUs11+z++45WorTBKA.
ECDSA key fingerprint is MD5:14:20:a8:b5:b0:b7:54:f7:5e:07:b2:0b:31:ee:6a:fc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.20' (ECDSA) to the list of known hosts.
hadoop@master's password:
id_rsa.pub                100%  410    24.9KB/s   00:00
# on master, append the copied slave key to authorized_keys
[hadoop@master ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
# delete the copied slave public key file
[hadoop@master ~]$ rm -rf ~/id_rsa.pub
```
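
The same key exchange can usually be done in one step per direction with ssh-copy-id, which appends the local public key to the remote authorized_keys and sets sane permissions (a shortcut sketch, assuming ssh-copy-id is installed on the nodes):

```bash
# on master, as the hadoop user
ssh-copy-id hadoop@slave1
ssh-copy-id hadoop@slave2
# on each slave, as the hadoop user
ssh-copy-id hadoop@master
```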

Verification:

master's authorized_keys file should now contain three public keys (master, slave1, slave2); slave1 and slave2 should each contain two (their own plus master's).

```bash
# from master, log in to each slave:
[hadoop@master .ssh]$ ssh hadoop@slave1
Last failed login: Thu Mar 25 22:18:20 CST 2021 from master on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Thu Mar 25 22:10:12 2021 from localhost
[hadoop@localhost ~]$ exit
logout
Connection to slave1 closed.
[hadoop@master .ssh]$ ssh hadoop@slave2
Last login: Thu Mar 25 22:11:52 2021 from localhost
[hadoop@localhost ~]$ exit
logout
# from a slave, log in to master:
[hadoop@localhost .ssh]$ ssh hadoop@master
Last failed login: Thu Mar 25 22:42:12 CST 2021 from slave1 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Thu Mar 25 20:57:34 2021 from localhost
[hadoop@master ~]$ exit
logout
```

Install Hadoop on the master node

Download, extract, and move it to /usr/local/:

```bash
wget https://mirrors.bfsu.edu.cn/apache/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
tar -zxvf /usr/local/src/hadoop-3.2.2.tar.gz -C /usr/local/
mv /usr/local/hadoop-3.2.2 /usr/local/hadoop
```

Configure the Hadoop environment variables:

```bash
vim /etc/profile
# append:
#hadoop
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# reload
source /etc/profile
# check
[root@master hadoop]# /usr/local/hadoop/bin/hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar
```

Edit the hadoop-env.sh configuration file:

```bash
cd /usr/local/hadoop/etc/hadoop/
vim hadoop-env.sh
# append:
export JAVA_HOME=/usr/local/jdk-16
```

Configuration parameters:

Configure hdfs-site.xml:

Add the following between `<configuration>` and `</configuration>`:

```xml
<property>
    <name>dfs.namenode.http-address</name>
    <value>master:50070</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/dfs/name</value>
    <description>Where the HDFS namenode stores its data on the local file system</description>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/dfs/data</value>
    <description>Where the HDFS datanode stores its data on the local file system</description>
</property>
<property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Number of replicas: 3</description>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.1.20:50090</value>
    <description>Address and port of the secondary namenode HTTP server</description>
</property>
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
    <description>Whether HDFS files can be read over HTTP; enabling it weakens cluster security</description>
</property>
```

Configure core-site.xml:

Add the following between `<configuration>` and `</configuration>`:

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.1.20:9000</value>
    <description>File system host and port</description>
</property>
<property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
    <description>Buffer size for stream files: 128 KB</description>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>Temporary directory. If this is not set, the default is /tmp/hadoop-hadoop, which is wiped when Linux reboots; the file system would then have to be re-formatted, otherwise Hadoop fails to run.</description>
</property>
<!-- set the HTTP static user to root -->
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
</property>
<!-- disable permission checking -->
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
```

Configure mapred-site.xml:

```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>Defaults to local; options are local, classic and yarn. yarn means the YARN cluster handles resource allocation.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
        <description>Address and port of the job history server, used to look up completed MapReduce jobs</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
        <description>Address and port of the job history server web UI</description>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>/usr/local/hadoop/etc/hadoop,/usr/local/hadoop/share/hadoop/common/*,/usr/local/hadoop/share/hadoop/common/lib/*,/usr/local/hadoop/share/hadoop/hdfs/*,/usr/local/hadoop/share/hadoop/hdfs/lib/*,/usr/local/hadoop/share/hadoop/mapreduce/*,/usr/local/hadoop/share/hadoop/mapreduce/lib/*,/usr/local/hadoop/share/hadoop/yarn/*,/usr/local/hadoop/share/hadoop/yarn/lib/*</value>
    </property>
</configuration>
```

Configure yarn-site.xml:

```xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
        <description>Address the ResourceManager exposes to clients; clients use it to submit and kill applications</description>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
        <description>Address of the ResourceManager scheduler, used by ApplicationMasters to request and release resources</description>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
        <description>Address the ResourceManager exposes to the nodemanagers; they use it to send heartbeats and pick up tasks</description>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
        <description>Address the ResourceManager exposes to administrators for management commands</description>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
        <description>Address of the ResourceManager web UI, where cluster information can be viewed in a browser</description>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description></description>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        <description>Lets users plug custom services into the nodemanager; the MapReduce shuffle is implemented this way</description>
    </property>
</configuration>
```

Other Hadoop configuration:

1. Configure the workers file:

```bash
vim /usr/local/hadoop/etc/hadoop/workers
# delete localhost and add the following, so master and both slaves all act as data nodes:
master
slave1
slave2
```

2. Create the directories /usr/local/hadoop/tmp, /usr/local/hadoop/dfs/name and /usr/local/hadoop/dfs/data:

```bash
[root@master hadoop]# mkdir /usr/local/hadoop/tmp
[root@master hadoop]# mkdir /usr/local/hadoop/dfs/name -p
[root@master hadoop]# mkdir /usr/local/hadoop/dfs/data -p
```

3. Change the ownership of /usr/local/hadoop/:

```bash
[root@master hadoop]# chown -R hadoop:hadoop /usr/local/hadoop/
```

4. Sync the configuration to the slave nodes:

```bash
[root@master hadoop]# scp -r /usr/local/hadoop/ root@slave1:/usr/local/
[root@master hadoop]# scp -r /usr/local/hadoop/ root@slave2:/usr/local/
```

5. Configure the Hadoop environment variables on each slave node:

```bash
vim /etc/profile
# append:
#hadoop
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# reload
source /etc/profile
```

6. Give the hadoop user ownership of the Hadoop directory on the slave nodes:

```bash
[root@slave1 ~]# chown -R hadoop:hadoop /usr/local/hadoop/
[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/hadoop/
```

7. Switch to the hadoop user on the master and slave nodes:

```bash
su - hadoop
```

Format the namenode on the master node

Formatting wipes the data on the namenode. HDFS must be formatted before its first start; do not format again on later starts, or the datanodes will be lost. Also, once HDFS has been run, the Hadoop working directories contain data; if you do need to re-format, delete the data in the working directories first, otherwise problems will follow.

```bash
[hadoop@master ~]$ /usr/local/hadoop/bin/hdfs namenode -format
WARNING: /usr/local/hadoop/logs does not exist. Creating.
2021-03-26 19:22:49,107 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.1.20
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.2.2
... (omitted)
2021-03-27 09:53:13,285 INFO common.Storage: Storage directory /usr/local/hadoop/dfs/name has been successfully formatted.
... (omitted)
2021-03-26 19:22:50,686 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.20
************************************************************/
```

Start the namenode:

```bash
[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start namenode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
```

Check the Java processes:

```bash
[hadoop@master hadoop]$ jps
1625 NameNode
1691 Jps
```

Start the datanode:

```bash
[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start datanode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
[hadoop@master hadoop]$ jps
1825 Jps
1762 DataNode
1625 NameNode
```

Start the secondarynamenode:

```bash
[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start secondarynamenode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
[hadoop@master hadoop]$ jps
1762 DataNode
1893 SecondaryNameNode
1926 Jps
1625 NameNode
```

Check whether the cluster nodes have connected:

```bash
[hadoop@master sbin]$ hdfs dfsadmin -report
... (omitted)
Live datanodes (1):

Name: 192.168.1.20:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 30041706496 (27.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 4121952256 (3.84 GB)
DFS Remaining: 25919746048 (24.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.28%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:55 CST 2021
Num of Blocks: 0
```

Something is wrong: the two slave nodes have not connected.

Fix:

Stop all services with the one-click script (stop-all.sh), delete the data produced by the earlier format and service starts (remove every file under /usr/local/hadoop/logs/, /usr/local/hadoop/dfs/data/, /usr/local/hadoop/dfs/name/ and /usr/local/hadoop/tmp/), re-format the namenode on master, start the services again, then re-check (a command-level sketch of the cleanup follows the report below):

```bash
[hadoop@master sbin]$ hdfs dfsadmin -report
Configured Capacity: 92271554560 (85.93 GB)
Present Capacity: 81996009472 (76.36 GB)
DFS Remaining: 81995984896 (76.36 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.1.20:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 30041706496 (27.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 4121952256 (3.84 GB)
DFS Remaining: 25919746048 (24.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.28%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:55 CST 2021
Num of Blocks: 0

Name: 192.168.1.21:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 31114924032 (28.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3144146944 (2.93 GB)
DFS Remaining: 27970768896 (26.05 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.90%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:49 CST 2021
Num of Blocks: 0

Name: 192.168.1.22:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 31114924032 (28.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3009445888 (2.80 GB)
DFS Remaining: 28105469952 (26.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.33%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:49 CST 2021
Num of Blocks: 0
```
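
Spelled out as commands, the cleanup looks roughly like this (a sketch, assuming the same paths on every node and that all existing HDFS data can be discarded):

```bash
# on master: stop everything
/usr/local/hadoop/sbin/stop-all.sh

# on master, slave1 and slave2: clear the working directories
rm -rf /usr/local/hadoop/logs/* /usr/local/hadoop/tmp/*
rm -rf /usr/local/hadoop/dfs/name/* /usr/local/hadoop/dfs/data/*

# on master: re-format, restart, and check again
/usr/local/hadoop/bin/hdfs namenode -format
/usr/local/hadoop/sbin/start-all.sh
hdfs dfsadmin -report
```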

Stop the services:

```bash
/usr/local/hadoop/sbin/hadoop-daemon.sh stop secondarynamenode
/usr/local/hadoop/sbin/hadoop-daemon.sh stop datanode
/usr/local/hadoop/sbin/hadoop-daemon.sh stop namenode
```

Start and stop everything with one command:

```bash
/usr/local/hadoop/sbin/start-all.sh   # start the Hadoop services (namenode, datanode, secondarynamenode)
/usr/local/hadoop/sbin/stop-all.sh    # stop the Hadoop services
```

View the cluster in a web browser:

http://192.168.1.20:50070/
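
To confirm the web UI is up without opening a browser, a quick probe from the shell works too (a sketch; curl is assumed to be available, which the yum step at the beginning takes care of):

```bash
# 200 means the NameNode web UI is serving pages
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.20:50070/
```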


Run Hadoop's wordcount example as a test

First create an /input directory in HDFS:

```bash
[hadoop@master sbin]$ hdfs dfs -mkdir /input
[hadoop@master sbin]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2021-03-27 16:11 /input
[hadoop@master sbin]$
```

Copy the input data file into the HDFS /input directory:

```bash
[hadoop@master sbin]$ hdfs dfs -put /chenfeng/pzs.log /input
[hadoop@master sbin]$ hdfs dfs -ls /input
Found 1 items
-rw-r--r--   3 hadoop supergroup  199205376 2021-03-28 22:31 /input/pzs.log
[hadoop@master sbin]$
```

Check in the browser:

The file does not show up, and an error appears:

`Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error` — this happens because javax.activation was removed from the JDK starting with Java 11.

Fix 1:
Download link for javax.activation:

https://jar-download.com/?search_box=javax.activation
Download javax.activation; the download is a ZIP archive, so extract it and place the jar under hadoop/share/hadoop/common.

When downloading, pick the version with the most ratings.
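
A sketch of where the jar ends up, assuming the extracted file is named javax.activation-api-1.2.0.jar (the exact file name depends on the version downloaded):

```bash
# hypothetical file name; adjust to whatever the ZIP actually contains
cp javax.activation-api-1.2.0.jar /usr/local/hadoop/share/hadoop/common/
# restart the services so the NameNode web UI picks up the jar
/usr/local/hadoop/sbin/stop-all.sh
/usr/local/hadoop/sbin/start-all.sh
```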

Fix 2: switch the Java version back to JDK 8:

https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html

Problem solved.

Run the wordcount example:
If an /output directory already exists in HDFS, delete it first; otherwise the job cannot create a new /output directory and will fail.

```bash
# if the /output directory exists, delete it:
hdfs dfs -rm -r -f /output
```

Start the test:

```bash
[hadoop@master sbin]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input/pzs.log /output
2021-03-28 22:56:16,003 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
2021-03-28 22:56:17,202 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-28 22:56:18,203 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-28 22:56:19,203 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
```

An error appears:

The error (Retrying connect to server: master/192.168.1.20:8032 ...) shows that YARN has not been started. When YARN is running there are two kinds of processes: resourcemanager and nodemanagers.

Start YARN:

```bash
[hadoop@master sbin]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers
```

Check whether it started successfully:

```bash
[hadoop@master sbin]$ jps
4357 Jps
1750 DataNode
1910 SecondaryNameNode
1630 NameNode
```

It did not start successfully: no ResourceManager or NodeManager process appears in the jps output.

Check the logs at the same time:

```bash
[root@master logs]# tailf hadoop-hadoop-resourcemanager-master.log
[root@master logs]# tailf hadoop-hadoop-nodemanager-master.log
```

The log contains output like this:

```
2021-03-28 23:35:04,775 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: registered UNIX signal handlers for [TERM, HUP, INT]
2021-03-28 23:35:05,212 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/usr/local/hadoop/etc/hadoop/core-site.xml
2021-03-28 23:35:05,317 INFO org.apache.hadoop.conf.Configuration: resource-types.xml not found
2021-03-28 23:35:05,317 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-28 23:35:05,348 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/usr/local/hadoop/etc/hadoop/yarn-site.xml
2021-03-28 23:35:05,350 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
2021-03-28 23:35:05,390 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: NMTokenKeyRollingInterval: 86400000ms and NMTokenKeyActivationDelay: 900000ms
2021-03-28 23:35:05,392 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: ContainerTokenKeyRollingInterval: 86400000ms and ContainerTokenKeyActivationDelay: 900000ms
```

It reports that resource-types.xml cannot be found.

Cause: Hadoop 3.x needs a number of extra environment variables to be configured.

Solution:

Add the environment variables Hadoop needs to the profile:

```bash
vim /etc/profile
# append after the existing java and hadoop entries:
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH

source /etc/profile
```

Modify the configuration.
Stop the services:

```bash
[hadoop@master sbin]$ /usr/local/hadoop/sbin/stop-all.sh
vim mapred-site.xml
```

Add:

```xml
<property>
    <name>mapreduce.application.classpath</name>
    <value>/usr/local/hadoop/etc/hadoop,/usr/local/hadoop/share/hadoop/common/*,/usr/local/hadoop/share/hadoop/common/lib/*,/usr/local/hadoop/share/hadoop/hdfs/*,/usr/local/hadoop/share/hadoop/hdfs/lib/*,/usr/local/hadoop/share/hadoop/mapreduce/*,/usr/local/hadoop/share/hadoop/mapreduce/lib/*,/usr/local/hadoop/share/hadoop/yarn/*,/usr/local/hadoop/share/hadoop/yarn/lib/*</value>
</property>
```

Start the services:

```bash
[hadoop@master sbin]$ /usr/local/hadoop/sbin/start-all.sh
```

Run the wordcount example again:

```bash
[hadoop@master sbin]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input/pslstreaming_log1.txt /output
2021-03-30 10:25:36,653 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
2021-03-30 10:25:37,432 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1617071112579_0001
2021-03-30 10:25:37,979 INFO input.FileInputFormat: Total input files to process : 1
2021-03-30 10:25:38,225 INFO mapreduce.JobSubmitter: number of splits:1
2021-03-30 10:25:38,582 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1617071112579_0001
2021-03-30 10:25:38,583 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-03-30 10:25:38,712 INFO conf.Configuration: resource-types.xml not found
2021-03-30 10:25:38,712 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-30 10:25:39,091 INFO impl.YarnClientImpl: Submitted application application_1617071112579_0001
2021-03-30 10:25:39,123 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1617071112579_0001/
2021-03-30 10:25:39,123 INFO mapreduce.Job: Running job: job_1617071112579_0001
2021-03-30 10:25:46,264 INFO mapreduce.Job: Job job_1617071112579_0001 running in uber mode : false
2021-03-30 10:25:46,264 INFO mapreduce.Job:  map 0% reduce 0%
2021-03-30 10:25:53,462 INFO mapreduce.Job:  map 100% reduce 0%
2021-03-30 10:25:58,543 INFO mapreduce.Job:  map 100% reduce 100%
2021-03-30 10:25:59,555 INFO mapreduce.Job: Job job_1617071112579_0001 completed successfully
2021-03-30 10:25:59,617 INFO mapreduce.Job: Counters: 54
... (omitted)
```

It is not clear why the message (resource.ResourceUtils: Unable to find 'resource-types.xml'.) still appears; it is logged at INFO level and the job still completes successfully.

Inspect the output file:

```bash
[hadoop@master sbin]$ hdfs dfs -cat /output/part-r-00000 | head
"",	308
""],	2
"9716168072",	601
"9716168072"},	1
"?arrc=2&linkmode=7",	1
"Count=2	299
"a50_inactive_threshold":	300
"a50_refresh_interval":	119
"a50_state_check_interval":	300
"app_private_data":	299
cat: Unable to write to output stream.
# the full output is large; only the first 10 lines are shown
```
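
To browse the whole result locally instead of streaming it through cat, the file can be copied out of HDFS (a sketch; the local file name is arbitrary):

```bash
hdfs dfs -get /output/part-r-00000 ./wordcount_result.txt
wc -l ./wordcount_result.txt
```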

View it in the web UI:

Trying to create a file in the web UI fails; files and directories cannot be created:

    `Permission denied: user=dr.who, access=WRITE, inode="/output":hadoop:supergroup:drwxr-xr-x`

Analysis:
When browsing and deleting directories and files in the browser, why does the user show up as dr.who? dr.who is simply Hadoop's static username for HTTP access; it has no special meaning, and its configuration can be seen in core-default.xml:

    hadoop.http.staticuser.user=dr.who

We can change it to the current user by editing core-site.xml:

```xml
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>hadoop</value>
</property>
```

In addition, the HDFS default configuration hdfs-default.xml shows that permission checking is enabled by default:

dfs.permissions.enabled=true   # whether HDFS performs permission checks; the default is true

Fix 1:
Change the permissions of the /user directory directly, as follows:

```bash
hdfs dfs -chmod -R 755 /user
```

For some reason this did not take effect, so this method failed.

Fix 2:
Add the following to Hadoop's core-site.xml:

```xml
<!-- set the HTTP static user to the current user -->
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>hadoop</value>
</property>
<!-- disable permission checking -->
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
```
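
For the new core-site.xml settings to apply cluster-wide, the changed file also needs to reach the slaves and the services need a restart (a sketch, reusing the paths from earlier; rsync is assumed available from the packages installed at the start):

```bash
# on master, as a user that can write to /usr/local/hadoop on the slaves
rsync -av /usr/local/hadoop/etc/hadoop/core-site.xml root@slave1:/usr/local/hadoop/etc/hadoop/
rsync -av /usr/local/hadoop/etc/hadoop/core-site.xml root@slave2:/usr/local/hadoop/etc/hadoop/

# then restart the cluster from master
/usr/local/hadoop/sbin/stop-all.sh
/usr/local/hadoop/sbin/start-all.sh
```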

Check the web UI before and after the permission change: after the change, creating a chenfeng directory succeeds.

Open 192.168.1.20:8088 to view the jobs running on the YARN cluster.

Summary

With SSH keys exchanged between the nodes, the configuration files in place on master and both slaves, and HDFS plus YARN running, the three-node Hadoop 3.2.2 cluster reports all datanodes live and runs the wordcount example end to end.
