Installing a Hadoop 1.2.1 Cluster Environment
1. Planning
(1) Hardware resources
10.171.29.191   master
10.173.54.84    slave1
10.171.114.223  slave2
(2) Basic information
User: jediael
Directory: /opt/jediael/
2. Environment setup
(1) Create the same user and password on every machine, and grant jediael permission to execute all commands
# passwd
# useradd jediael
# passwd jediael
# vi /etc/sudoers

Add the following line:

jediael ALL=(ALL) ALL

(2) Create the directory /opt/jediael
$ sudo chown jediael:jediael /opt
$ cd /opt
$ sudo mkdir jediael

Note: /opt must be owned by jediael; otherwise formatting the NameNode will fail later.
(3) Set the hostname and edit the /etc/hosts file
1. Edit /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=*******   (master, slave1, or slave2, as appropriate)

2. Edit /etc/hosts
10.171.29.191   master
10.173.54.84    slave1
10.171.114.223  slave2

Note: the hosts file must not map these hostnames to 127.0.0.1, or connections will fail with errors such as:
org.apache.hadoop.ipc.Client: Retrying connect to server: master/10.171.29.191:9000. Already tried ...
3. Set the hostname with the hostname command
hostname ****
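For example, run the matching command on each machine, using the hostnames planned above:

$ sudo hostname master     # on master
$ sudo hostname slave1     # on slave1
$ sudo hostname slave2     # on slave2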
(4) Configure passwordless SSH login
Run the following commands on master as user jediael:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Then copy authorized_keys to slave1 and slave2:

$ scp ~/.ssh/authorized_keys slave1:~/.ssh/
$ scp ~/.ssh/authorized_keys slave2:~/.ssh/

Notes:
(1) If you are told that the .ssh directory does not exist, the machine has never run ssh; run ssh once and the .ssh directory will be created.
(2) The permissions on ~/.ssh must be 700 and on authorized_keys must be 600; passwordless login fails if they are either looser or stricter.
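To set the permissions explicitly and verify the passwordless login from master:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys
$ ssh slave1 hostname      # should print slave1 without prompting for a password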
(5) Install Java on all three machines and set the related environment variables
See http://blog.csdn.net/jediael_lu/article/details/38925871
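As a minimal sketch, assuming the JDK is installed at /usr/java/jdk1.7.0_51 (the same path used in hadoop-env.sh below), append the following to /etc/profile on each machine:

export JAVA_HOME=/usr/java/jdk1.7.0_51
export PATH=$JAVA_HOME/bin:$PATH

Then run source /etc/profile and confirm with java -version.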
(6) Download hadoop-1.2.1.tar.gz and extract it into /opt/jediael
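For example, assuming the Apache release archive still hosts the tarball:

$ cd /opt/jediael
$ wget http://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
$ tar -xzf hadoop-1.2.1.tar.gz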
3. Edit the configuration files
[Perform on all three machines]
(1) Edit conf/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51

(2) Edit core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/tmphadoop</value>
</property>
(3) Edit hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
(4) Edit mapred-site.xml
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>
(5) Edit masters and slaves
masters:
master

slaves:
slave1
slave2
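The two files can be written from the conf directory, for example:

$ cd /opt/jediael/hadoop-1.2.1/conf
$ echo master > masters
$ printf "slave1\nslave2\n" > slaves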
You can complete all of the above configuration on master and then copy the files to slave1 and slave2 with scp. For example:

$ scp core-site.xml slave2:/opt/jediael/hadoop-1.2.1/conf
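To push every modified file in one go, a simple loop over the files listed above works:

$ cd /opt/jediael/hadoop-1.2.1/conf
$ for f in hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml masters slaves; do
    scp $f slave1:/opt/jediael/hadoop-1.2.1/conf/
    scp $f slave2:/opt/jediael/hadoop-1.2.1/conf/
  done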
4. Start and verify
1. Format the NameNode [this step only needs to run on master, where the NameNode runs]
[jediael@master hadoop-1.2.1]$ bin/hadoop namenode -format
15/01/21 15:13:40 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/10.171.29.191
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.7.0_51
************************************************************/
Re-format filesystem in /opt/tmphadoop/dfs/name ? (Y or N) Y
15/01/21 15:13:43 INFO util.GSet: Computing capacity for map BlocksMap
15/01/21 15:13:43 INFO util.GSet: VM type       = 64-bit
15/01/21 15:13:43 INFO util.GSet: 2.0% max memory = 1013645312
15/01/21 15:13:43 INFO util.GSet: capacity      = 2^21 = 2097152 entries
15/01/21 15:13:43 INFO util.GSet: recommended=2097152, actual=2097152
15/01/21 15:13:43 INFO namenode.FSNamesystem: fsOwner=jediael
15/01/21 15:13:43 INFO namenode.FSNamesystem: supergroup=supergroup
15/01/21 15:13:43 INFO namenode.FSNamesystem: isPermissionEnabled=true
15/01/21 15:13:43 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
15/01/21 15:13:43 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
15/01/21 15:13:43 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
15/01/21 15:13:43 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/01/21 15:13:44 INFO common.Storage: Image file /opt/tmphadoop/dfs/name/current/fsimage of size 113 bytes saved in 0 seconds.
15/01/21 15:13:44 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/opt/tmphadoop/dfs/name/current/edits
15/01/21 15:13:44 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/opt/tmphadoop/dfs/name/current/edits
15/01/21 15:13:44 INFO common.Storage: Storage directory /opt/tmphadoop/dfs/name has been successfully formatted.
15/01/21 15:13:44 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/10.171.29.191
************************************************************/
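If the format succeeded, the name directory under hadoop.tmp.dir should now contain the initial image files:

$ ls /opt/tmphadoop/dfs/name/current/
# the exact file list may vary, but fsimage, edits and VERSION should be present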
2. Start Hadoop [this step runs on master only]
[jediael@master hadoop-1.2.1]$ bin/start-all.sh
starting namenode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-namenode-master.out
slave1: starting datanode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-datanode-slave1.out
slave2: starting datanode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-datanode-slave2.out
master: starting secondarynamenode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-secondarynamenode-master.out
starting jobtracker, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-jobtracker-master.out
slave1: starting tasktracker, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-tasktracker-slave1.out
slave2: starting tasktracker, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-tasktracker-slave2.out
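Besides the web pages and jps checks below, you can confirm from master that both datanodes registered with the namenode:

$ bin/hadoop dfsadmin -report
# should show two live datanodes (slave1 and slave2)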
3. Verify via the web UIs
NameNode     http://ip:50070
JobTracker   http://ip:50030
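If only shell access is available, a quick reachability check with curl serves the same purpose:

$ curl -s -o /dev/null -w "%{http_code}\n" http://master:50070/
$ curl -s -o /dev/null -w "%{http_code}\n" http://master:50030/
# 200 (or a redirect code) means the daemon's web server is up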
4. Check the Java processes on each host
(1) master:
$ jps
17963 NameNode
18280 JobTracker
18446 Jps
18171 SecondaryNameNode
(2) slave1:
$ jps
16019 Jps
15858 DataNode
15954 TaskTracker
(3) slave2:
$ jps
15625 Jps
15465 DataNode
15561 TaskTracker
5. Run a complete MapReduce program
All of the following steps are performed on master only.
1. Copy the wordcount.jar package to the server
For the program source, see http://blog.csdn.net/jediael_lu/article/details/37596469
2. Create the input directory and copy the input file into it
[jediael@master166 ~]$ hadoop fs -mkdir /wcin
[jediael@master166 projects]$ hadoop fs -copyFromLocal /opt/jediael/hadoop-1.2.1/conf/hdfs-site.xml /wcin
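Before running the job, you can confirm the file landed in HDFS:

[jediael@master166 projects]$ hadoop fs -ls /wcin
# should list /wcin/hdfs-site.xml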
3. Run the program
[jediael@master166 projects]$ hadoop jar wordcount.jar org.jediael.hadoopdemo.wordcount.WordCount /wcin /wcout
14/08/31 20:04:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/08/31 20:04:26 INFO input.FileInputFormat: Total input paths to process : 1
14/08/31 20:04:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/08/31 20:04:26 WARN snappy.LoadSnappy: Snappy native library not loaded
14/08/31 20:04:26 INFO mapred.JobClient: Running job: job_201408311554_0003
14/08/31 20:04:27 INFO mapred.JobClient: map 0% reduce 0%
14/08/31 20:04:31 INFO mapred.JobClient: map 100% reduce 0%
14/08/31 20:04:40 INFO mapred.JobClient: map 100% reduce 100%
14/08/31 20:04:40 INFO mapred.JobClient: Job complete: job_201408311554_0003
14/08/31 20:04:40 INFO mapred.JobClient: Counters: 29
14/08/31 20:04:40 INFO mapred.JobClient: Job Counters
14/08/31 20:04:40 INFO mapred.JobClient: Launched reduce tasks=1
14/08/31 20:04:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=4230
14/08/31 20:04:40 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/08/31 20:04:40 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/08/31 20:04:40 INFO mapred.JobClient: Launched map tasks=1
14/08/31 20:04:40 INFO mapred.JobClient: Data-local map tasks=1
14/08/31 20:04:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=8531
14/08/31 20:04:40 INFO mapred.JobClient: File Output Format Counters
14/08/31 20:04:40 INFO mapred.JobClient: Bytes Written=284
14/08/31 20:04:40 INFO mapred.JobClient: FileSystemCounters
14/08/31 20:04:40 INFO mapred.JobClient: FILE_BYTES_READ=370
14/08/31 20:04:40 INFO mapred.JobClient: HDFS_BYTES_READ=357
14/08/31 20:04:40 INFO mapred.JobClient: FILE_BYTES_WRITTEN=104958
14/08/31 20:04:40 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=284
14/08/31 20:04:40 INFO mapred.JobClient: File Input Format Counters
14/08/31 20:04:40 INFO mapred.JobClient: Bytes Read=252
14/08/31 20:04:40 INFO mapred.JobClient: Map-Reduce Framework
14/08/31 20:04:40 INFO mapred.JobClient: Map output materialized bytes=370
14/08/31 20:04:40 INFO mapred.JobClient: Map input records=11
14/08/31 20:04:40 INFO mapred.JobClient: Reduce shuffle bytes=370
14/08/31 20:04:40 INFO mapred.JobClient: Spilled Records=40
14/08/31 20:04:40 INFO mapred.JobClient: Map output bytes=324
14/08/31 20:04:40 INFO mapred.JobClient: Total committed heap usage (bytes)=238026752
14/08/31 20:04:40 INFO mapred.JobClient: CPU time spent (ms)=1130
14/08/31 20:04:40 INFO mapred.JobClient: Combine input records=0
14/08/31 20:04:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=105
14/08/31 20:04:40 INFO mapred.JobClient: Reduce input records=20
14/08/31 20:04:40 INFO mapred.JobClient: Reduce input groups=20
14/08/31 20:04:40 INFO mapred.JobClient: Combine output records=0
14/08/31 20:04:40 INFO mapred.JobClient: Physical memory (bytes) snapshot=289288192
14/08/31 20:04:40 INFO mapred.JobClient: Reduce output records=20
14/08/31 20:04:40 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1533636608
14/08/31 20:04:40 INFO mapred.JobClient: Map output records=20
4. View the results
[jediael@master166 projects]$ hadoop fs -cat /wcout/*
--> 1
<!-- 1
</configuration> 1
</property> 1
<?xml 1
<?xml-stylesheet 1
<configuration> 1
<name>dfs.replication</name> 1
<property> 1
<value>2</value> 1
Put 1
file. 1
href="configuration.xsl"?> 1
in 1
overrides 1
property 1
site-specific 1
this 1
type="text/xsl" 1
version="1.0"?> 1
cat: File does not exist: /wcout/_logs
(This last line is harmless: _logs is a directory the job writes into the output path, and cat cannot read directories; the word counts above are the actual result.)