How to set up the Hadoop 2.2.0 (HDFS) stable release on CentOS

Written by  Face4store

This article walks you through installing the Hadoop 2.2.0 (HDFS) stable release on CentOS, step by step, as an introduction to big data.

1) Requirements:

First of all, make sure your CentOS machine has a Java JDK installed. To check whether Java is already present:

$ java -version

java version "1.7.0_25"

If Java is missing, install the OpenJDK development package: $ yum install java-1.7.0-openjdk-devel.x86_64
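Step 5 below needs a JAVA_HOME path. On CentOS the OpenJDK packages install under /usr/lib/jvm; a quick way to confirm the directory (the exact name can differ between machines):

```shell
# List candidate JVM directories; the jre-1.7.0 symlink is the path
# this guide uses for JAVA_HOME in step 5.
ls -d /usr/lib/jvm/jre-1.7.0* 2>/dev/null \
  || echo "no 1.7.0 JRE found under /usr/lib/jvm"
```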

2) Hadoop user & group: 

Create a hadoop group and an hdfs user belonging to it (run as root):

$ groupadd hadoop

$ adduser -g hadoop hdfs

3) Set up SSH keys

Hadoop's start scripts use ssh to reach localhost, so give the hdfs user a passwordless key pair (run these as hdfs):

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
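sshd silently ignores authorized_keys when its permissions are too open, which is a common reason the Hadoop start scripts keep prompting for a password. A small sketch, assuming the key pair created above lives in the hdfs user's home directory:

```shell
# Tighten the permissions sshd requires; run this as the hdfs user.
mkdir -p "$HOME/.ssh"
touch "$HOME/.ssh/authorized_keys"
chmod 700 "$HOME/.ssh"
chmod 600 "$HOME/.ssh/authorized_keys"
# Afterwards `ssh localhost` should log in without a password prompt.
```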

4) Set up Hadoop

Releases are listed at http://hadoop.apache.org/releases.html#Download

Switch to the hdfs user:

$ su hdfs

$ cd /home/hdfs

Download: 

$ wget http://mirrors.digipower.vn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

$ tar xvzf hadoop-2.2.0.tar.gz

$ mv hadoop-2.2.0 hadoop

5) Hadoop variables:

Switch back to the root user, open /root/.bashrc, and append the following lines:

#Hadoop variables

export JAVA_HOME=/usr/lib/jvm/jre-1.7.0/

export HADOOP_INSTALL=/home/hdfs/hadoop

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin

export HADOOP_MAPRED_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_HOME=$HADOOP_INSTALL

export HADOOP_HDFS_HOME=$HADOOP_INSTALL

export YARN_HOME=$HADOOP_INSTALL

###end of paste

Go back to the hdfs user, open /home/hdfs/hadoop/etc/hadoop/hadoop-env.sh, and set JAVA_HOME:

export JAVA_HOME=/usr/lib/jvm/jre-1.7.0/
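If you would rather script the edit than open vi, sed can rewrite the JAVA_HOME line. The sketch below works on a temporary stand-in file so it is safe to try anywhere; on your node, point it at the real /home/hdfs/hadoop/etc/hadoop/hadoop-env.sh (whose stock JAVA_HOME line this mimics):

```shell
# Create a temporary stand-in for hadoop-env.sh; replace "$env_file"
# with the real path when running on the Hadoop node.
env_file=$(mktemp)
echo 'export JAVA_HOME=${JAVA_HOME}' > "$env_file"

# Point JAVA_HOME at the JRE path used throughout this guide.
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/jre-1.7.0/|' "$env_file"

grep '^export JAVA_HOME' "$env_file"   # export JAVA_HOME=/usr/lib/jvm/jre-1.7.0/
```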

6) Check Hadoop in the system:

Log in again as root (or run source /root/.bashrc) so the new PATH takes effect, then run:

$ hadoop version

Hadoop 2.2.0

Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768

Compiled by hortonmu on 2013-10-07T06:28Z

Compiled with protoc 2.5.0

From source with checksum 79e53ce7994d1628b240f09af91e1af4

This command was run using /home/hdfs/hadoop/share/hadoop/common/hadoop-common-2.2.0.jar

7) Hadoop configuration:

$ su hdfs

$ cd /home/hdfs

$ vi /home/hdfs/hadoop/etc/hadoop/core-site.xml

(fs.default.name is the deprecated name for this setting; Hadoop 2.x prefers fs.defaultFS, but both are accepted in 2.2.0.)

<configuration>

                <property>

                   <name>fs.default.name</name>

                   <value>hdfs://localhost:9000</value>

                </property>

</configuration>

$ vi /home/hdfs/hadoop/etc/hadoop/yarn-site.xml

<configuration>

                <property>

                   <name>yarn.nodemanager.aux-services</name>

                   <value>mapreduce_shuffle</value>

                </property>

                <property>

                   <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

                   <value>org.apache.hadoop.mapred.ShuffleHandler</value>

                </property>

</configuration>

Hadoop 2.2.0 ships only a template for this file, so copy it into place first:

$ cp /home/hdfs/hadoop/etc/hadoop/mapred-site.xml.template /home/hdfs/hadoop/etc/hadoop/mapred-site.xml

$ vi /home/hdfs/hadoop/etc/hadoop/mapred-site.xml

<configuration>

                <property>

                   <name>mapreduce.framework.name</name>

                   <value>yarn</value>

                </property>

</configuration>

8) Namenode & Datanode:

Create the folders that will hold the NameNode and DataNode data (as the hdfs user):

$ mkdir -p /home/hdfs/namenode

$ mkdir -p /home/hdfs/datanode

9) Namenode & Datanode configuration:

$ vi /home/hdfs/hadoop/etc/hadoop/hdfs-site.xml

<configuration>

                <property>

                                <name>dfs.replication</name>

                                <value>1</value>

                 </property>

                 <property>

                   <name>dfs.namenode.name.dir</name>

                   <value>file:/home/hdfs/namenode</value>

                 </property>

                 <property>

                   <name>dfs.datanode.data.dir</name>

                   <value>file:/home/hdfs/datanode</value>

                 </property>

</configuration>

10) Format Namenode:

Almost done. Format the NameNode to initialise HDFS (the Hadoop Distributed File System):

$ cd /home/hdfs/hadoop

$ bin/hadoop namenode -format

(The hadoop namenode command is deprecated in 2.x; $ bin/hdfs namenode -format is the equivalent current form.)

On success, the output ends with lines like these:

14/04/11 23:52:00 INFO common.Storage: Storage directory /home/hdfs/namenode has been successfully formatted.

14/04/11 23:52:00 INFO namenode.FSImage: Saving image file /home/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression

14/04/11 23:52:00 INFO namenode.FSImage: Image file /home/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 196 bytes saved in 0 seconds.

11) Start Hadoop services:

$ start-dfs.sh

$ start-yarn.sh

12) Check NameNode & DataNode:

If everything went OK, the jps command lists all the Hadoop daemons (the NameNode web UI should also be reachable at http://localhost:50070 and the ResourceManager UI at http://localhost:8088):

$ jps

1913 NodeManager

1822 ResourceManager

1680 SecondaryNameNode

2299 Jps

1431 NameNode

1519 DataNode
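Beyond jps, the simplest proof that HDFS accepts writes is to create a directory and list it back. A minimal smoke test, assuming the hdfs command from step 5 is on the PATH (the /user/hdfs path is just an example); the guard lets it degrade gracefully on machines without Hadoop:

```shell
# Smoke-test HDFS: make a home directory for the hdfs user and list /.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p /user/hdfs
  hdfs dfs -ls /
else
  echo "hdfs not on PATH; re-check the variables from step 5"
fi
```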

We will post a follow-up article with a Hadoop WordCount example as soon as possible; Face4store.com will discuss this subject with you then.

Thanks for your comments.

Phuc Nguyen - Co-Founder
Son Tran - Chief Architect
