1. Deploying a single-node Hadoop test environment
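I had read a lot of theory and still felt lost in the fog, so I quickly set up a single-node Hadoop to get something running, as the first step in teaching myself big data technology~~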
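1. In the open-source world I'm rich: whatever I want, I can have. So first you need a JDK, and since money is no object I use the latest, Java 8. The Hadoop version is 2.6.0.
2. Once Java is installed, set the environment variables in /etc/profile so they are available later, then unpack hadoop2.6.0.tar.gz.
3. Next, configure Hadoop. All the configuration files are in etc/hadoop under the Hadoop directory: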
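(1) hadoop-env.sh: in this script you only need to change the JAVA_HOME setting near the top so that it points to your own Java path.
(2) core-site.xml, mapred-site.xml, hdfs-site.xml: I'll fill these in later~~~ There are plenty of examples online, but find ones that match your own version, otherwise you will hit a lot of strange problems.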
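Steps 2 and 3 above could look roughly like the sketch below. It is not the author's actual configuration (the post does not show it); the property values follow the pseudo-distributed example in the Hadoop 2.6.0 single-node setup guide. The JDK path /usr/java/jdk1.8.0, the hdfs://localhost:9000 address, and the scratch directory name hadoop-conf-sketch are assumptions to adjust for your machine, and yarn-site.xml is included only because the later YARN run needs a shuffle service. The files are written to the scratch directory, so nothing under etc/hadoop is overwritten until you copy them over yourself.

```shell
# Environment variables, e.g. appended to /etc/profile.
# The JDK path is an assumption; point both at your own installs.
export JAVA_HOME="${JAVA_HOME:-/usr/java/jdk1.8.0}"
export HADOOP_HOME="${HADOOP_HOME:-/home/qiang/hadoop-2.6.0}"
export PATH="$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"

# Scratch output directory (hypothetical name), not etc/hadoop itself.
CONF_OUT="${CONF_OUT:-./hadoop-conf-sketch}"
mkdir -p "$CONF_OUT"

# core-site.xml: where clients find the NameNode.
cat > "$CONF_OUT/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: only one DataNode, so keep one replica per block.
cat > "$CONF_OUT/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

# mapred-site.xml: run MapReduce jobs on YARN.
cat > "$CONF_OUT/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

# yarn-site.xml: enable the shuffle service that MapReduce relies on.
cat > "$CONF_OUT/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

echo "wrote sample config files to $CONF_OUT"
```

After reviewing the generated files, copy them into etc/hadoop under the Hadoop directory and set JAVA_HOME in hadoop-env.sh to the same JDK path.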
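4. With the configuration done, it's time to start things up.
(1) Before the first start you must format the NameNode. This is only needed the first time Hadoop is started. It wipes everything in HDFS, so use it with care~~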
[qiang@localhost hadoop-2.6.0]$ bin/hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
15/08/11 08:25:43 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
.....
.....
.....
15/08/11 08:25:46 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/
Formatting prints a large amount of output. If there are no errors, the earlier configuration is probably fine~~~ As the DEPRECATED line points out, the hdfs command (bin/hdfs namenode -format) is the preferred way to run this in 2.6.0.
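(2) To start up you can simply run sbin/start-all.sh, but that is too crude: if something goes wrong while the cluster is starting, you won't know which part failed, which makes troubleshooting harder. So let's start the daemons one at a time.
Start the NameNode: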
[qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/qiang/hadoop-2.6.0/logs/hadoop-qiang-namenode-localhost.localdomain.out
Start the DataNode:
[qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /home/qiang/hadoop-2.6.0/logs/hadoop-qiang-datanode-localhost.localdomain.out
Use the jps command to check whether they are running:
[qiang@localhost ~]$ jps
17254 Jps
16473 NameNode
16698 DataNode
You can also check in a web browser through the open port (the HDFS web UI listens on port 50070).
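The jps check above can be scripted so that a missing daemon is reported by name. The sketch below is an illustration and not part of the original post: missing_daemons is a hypothetical helper that reads jps-style output on standard input and prints every expected daemon name that does not appear in it.

```shell
# missing_daemons NAME...
# Reads jps-style output ("<pid> <name>" per line) on stdin and prints
# each expected daemon name that is absent.
# Hypothetical helper for illustration; it is not part of Hadoop.
missing_daemons() {
  jps_output=$(cat)
  for name in "$@"; do
    # Match the second field exactly, so that "NameNode" is not
    # satisfied by a "SecondaryNameNode" line.
    if ! printf '%s\n' "$jps_output" |
         awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage against the live JVM list, only when jps is on the PATH.
if command -v jps >/dev/null 2>&1; then
  missing=$(jps | missing_daemons NameNode DataNode)
  if [ -n "$missing" ]; then
    echo "not running: $missing"
  else
    echo "NameNode and DataNode are both running"
  fi
fi
```

If a daemon is reported as not running, its log file under the logs directory (the path printed by hadoop-daemon.sh when it started) is the first place to look.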
Now that it's running, let's actually use it and see whether it's for real. Let's upload something to HDFS:
[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -mkdir /home
[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -mkdir /home/qiangweikang
[qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -put README.txt /home/qiangweikang
Open the file system browser under Utilities in the web UI and you will see what we just uploaded.
So happy~~
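With that done, let's go on and start YARN:
[qiang@localhost hadoop-2.6.0]$ sbin/start-yarn.sh
In the web UI you can now see the legendary elephant, and you can see that there is one active node (the YARN ResourceManager web UI uses port 8088 by default).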
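Next, let's run a demo to see how Hadoop actually executes a job (bundled examples for testing are under share). The pi calculation is an interesting one: it throws darts at a circle. The first argument is the number of map tasks and the second is the number of darts thrown per map. Pretty fancy, so pi can be computed like this~~~ Is this the legendary probability and statistics?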
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 2 100
Number of Maps = 2
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Starting Job
15/08/11 08:54:24 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/08/11 08:54:25 INFO input.FileInputFormat: Total input paths to process : 2
15/08/11 08:54:25 INFO mapreduce.JobSubmitter: number of splits:2
15/08/11 08:54:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1439308289430_0001
15/08/11 08:54:26 INFO impl.YarnClientImpl: Submitted application application_1439308289430_0001
15/08/11 08:54:26 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1439308289430_0001/
15/08/11 08:54:26 INFO mapreduce.Job: Running job: job_1439308289430_0001
15/08/11 08:54:41 INFO mapreduce.Job: Job job_1439308289430_0001 running in uber mode : false
15/08/11 08:54:41 INFO mapreduce.Job: map 0% reduce 0%
15/08/11 08:54:51 INFO mapreduce.Job: map 50% reduce 0%
15/08/11 08:54:52 INFO mapreduce.Job: map 100% reduce 0%
15/08/11 08:55:04 INFO mapreduce.Job: map 100% reduce 100%
15/08/11 08:55:05 INFO mapreduce.Job: Job job_1439308289430_0001 completed successfully
15/08/11 08:55:06 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=50
        FILE: Number of bytes written=317688
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=526
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=11
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=14463
        Total time spent by all reduces in occupied slots (ms)=10093
        Total time spent by all map tasks (ms)=14463
        Total time spent by all reduce tasks (ms)=10093
        Total vcore-seconds taken by all map tasks=14463
        Total vcore-seconds taken by all reduce tasks=10093
        Total megabyte-seconds taken by all map tasks=14810112
        Total megabyte-seconds taken by all reduce tasks=10335232
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=36
        Map output materialized bytes=56
        Input split bytes=290
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=56
        Reduce input records=4
        Reduce output records=0
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=412
        CPU time spent (ms)=4770
        Physical memory (bytes) snapshot=680353792
        Virtual memory (bytes) snapshot=6324887552
        Total committed heap usage (bytes)=501743616
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=236
    File Output Format Counters
        Bytes Written=97
Job Finished in 42.318 seconds
Estimated value of Pi is 3.12000000000000000000
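To make the dart-throwing idea concrete, here is a minimal single-process sketch in shell and awk. It is an illustration only, not the code Hadoop runs: the bundled example uses a quasi-Monte Carlo point sequence split across map tasks, whereas this sketch swaps in plain pseudo-random darts from awk's rand(). The function name estimate_pi and its optional seed argument are made up for the example. The idea is that darts land uniformly in the unit square, and the fraction that falls inside the quarter circle of radius 1 approaches pi/4, so multiplying that fraction by 4 estimates pi.

```shell
# estimate_pi MAPS SAMPLES_PER_MAP [SEED]
# Throws MAPS * SAMPLES_PER_MAP pseudo-random darts at the unit square
# and prints 4 * (darts inside the quarter circle) / (total darts).
# Hypothetical helper; a plain Monte Carlo stand-in for the Hadoop example.
estimate_pi() {
  awk -v maps="$1" -v per_map="$2" -v seed="${3:-1}" 'BEGIN {
    srand(seed)
    total = maps * per_map
    inside = 0
    for (i = 0; i < total; i++) {
      x = rand()
      y = rand()
      # The dart is inside the quarter circle when x^2 + y^2 <= 1.
      if (x * x + y * y <= 1) inside++
    }
    printf "%.4f\n", 4 * inside / total
  }'
}

# Same sample count as the job above (2 maps x 100 samples): a rough estimate.
estimate_pi 2 100
# More darts give a tighter estimate.
estimate_pi 10 100000
```

With only 200 samples the estimate is coarse, which is consistent with the job above reporting 3.12 rather than 3.14; raising either argument of the pi example increases the total number of samples.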
Finally, remember to shut YARN down~~
[qiang@localhost hadoop-2.6.0]$ sbin/stop-yarn.sh
Reposted from: https://www.cnblogs.com/qiangweikang/p/4723196.html