Massoud Mazar

Sharing The Knowledge

Setting up a hadoop/hive cluster on CentOS 5

If you are reading this post, chances are you are trying to set up a hadoop/hive cluster. I hope these instructions save you some time with your installation.

Install CentOS:

I assume you are starting from scratch and want to set up a small cluster with a minimal set of OS components (as minimal as I could make it). To start, download the OS installation media and boot each machine with it. Go through the screens of the GUI installer, keeping in mind to do the following:

- Disable IPv6 during network setup
- De-select all components, including all GUIs and even "Server"
- On the component selection page, click the "Customize now" radio button and click Next
- In the "Base System" group, de-select "Dialup Networking Support"
 
It is important (at least for this tutorial) to define the host names in your DNS server (or in the hosts file, if no DNS server is used). Be careful with the hosts file: one problem I ran into was caused by the loopback address (127.0.0.1) being assigned to the machine's own hostname (in this case, centos1). Java listeners use the hostname to look up the IP address to listen on, and they picked up the loopback address from the hosts file instead of the network address.
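As a hedged illustration (the IP addresses below are placeholders for your own network), a correct hosts file on each node keeps 127.0.0.1 mapped only to localhost and maps the real network addresses to the hostnames:

127.0.0.1       localhost.localdomain localhost
192.168.1.101   centos1
192.168.1.102   centos2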
 
When the OS installation is complete, make sure to turn off and disable the firewall on each machine:
 
service iptables save
service iptables stop
chkconfig iptables off
It may be a good idea to update your OS before installing hadoop:

yum update

Install java JDK:

You may not like the way I installed the Java JDK, so feel free to do it your way. Assuming you are using an SSH client like PuTTY, which handles the clipboard well:
  • Using a browser (on the machine you run putty on) visit: http://java.sun.com/javase/downloads/widget/jdk6.jsp
  • Select Linux as platform and click continue
  • On the popup click "skip this Step"
  • Copy the hyperlink address for "jdk-6u18-linux-i586-rpm.bin" to clipboard
  • Do the following in the PuTTY session:
cd /tmp
wget -O jdk-6u18-linux-i586-rpm.bin <the URL you copied from java download site>
chmod a+x jdk-6u18-linux-i586-rpm.bin
./jdk-6u18-linux-i586-rpm.bin
To make sure java was installed correctly:

java -version
(If you have followed the instructions, java is installed in /usr/java/jdk1.6.0_18)
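The exact build lines vary, but the output should report the version you just installed; the first line will look something like:

java version "1.6.0_18"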
 

Install hadoop:

- set up the hadoop user:

useradd hadoop
passwd hadoop

- create a folder for all hadoop operations:

mkdir /hadoop
chown -R hadoop /hadoop

- switch to user "hadoop":

su - hadoop

- set up passphrase-less ssh (only on master, while still logged in as hadoop):
ssh-keygen -t dsa
(leave passphrase empty)
- copy the identity to each machine (only on the master node, after creating the hadoop user on each machine):
ssh-copy-id -i /home/hadoop/.ssh/id_dsa.pub hadoop@centos1
ssh-copy-id -i /home/hadoop/.ssh/id_dsa.pub hadoop@centos2
...
- to test passwordless ssh:
ssh centos1
ssh centos2
...
- download and install hadoop:
cd /hadoop
wget http://mirror.cloudera.com/apache/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
tar -xvzf hadoop-0.20.2.tar.gz
mv hadoop-0.20.2 hadoop

Configure hadoop:

cd /hadoop/hadoop

Modify the core-site.xml:

nano conf/core-site.xml
Add the following inside the <configuration> tags:
<property>
  <name>fs.default.name</name>
  <value>hdfs://centos1:9000/</value>
</property>
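For reference, assuming the stock (empty) core-site.xml that ships with hadoop 0.20.2, the complete file would then look roughly like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>fs.default.name</name>
  <value>hdfs://centos1:9000/</value>
</property>
</configuration>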

Modify the hdfs-site.xml:

nano conf/hdfs-site.xml
Add the following inside the <configuration> tags:
<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/hdfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/hadoop/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

Modify the mapred-site.xml:

nano conf/mapred-site.xml

Add the following inside the <configuration> tags:

<property>
  <name>mapred.job.tracker</name>
  <value>centos1:9001</value>
</property>

Modify the hadoop-env.sh:

nano conf/hadoop-env.sh

Find the following exports and modify them accordingly:

export JAVA_HOME=/usr/java/jdk1.6.0_18
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
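As an optional sanity check (not part of the original steps), hadoop should now be able to find Java; running the following from /hadoop/hadoop should print the version banner instead of a JAVA_HOME error:

bin/hadoop version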

Define the masters and slaves (only on the master node):
nano conf/masters

In this case, the hostname of the master is centos1:

centos1

now modify the slaves file:

nano conf/slaves

Slaves in our example are centos1 and centos2:

centos1
centos2

Starting hadoop

Before using the hadoop HDFS, format the namenode:

su - hadoop
cd /hadoop/hadoop
bin/hadoop namenode -format

If there were no errors, you can start all components on all nodes from the master:
bin/start-all.sh
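To confirm the daemons actually started, you can run jps (it ships with the JDK) on each node. On the master, assuming centos1 is also listed as a slave, you would expect to see NameNode, SecondaryNameNode, JobTracker, DataNode and TaskTracker:

/usr/java/jdk1.6.0_18/bin/jps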

Testing hadoop

For this test, we download some text files, copy them to HDFS and use a sample MapReduce job to count words in those files:

cd /hadoop
mkdir gutenberg
cd gutenberg
wget http://www.gutenberg.org/files/20417/20417.txt
wget http://www.gutenberg.org/dirs/etext04/7ldvc10.txt
wget http://www.gutenberg.org/files/4300/4300.txt
cd /hadoop/hadoop
bin/hadoop dfs -ls
bin/hadoop dfs -copyFromLocal /hadoop/gutenberg gutenberg
bin/hadoop dfs -ls gutenberg
bin/hadoop dfs -rmr gutenberg-output
bin/hadoop jar hadoop-0.20.2-examples.jar wordcount gutenberg gutenberg-output

The above commands start a MapReduce job and show progress percentages as it runs.
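Once the job finishes, you can inspect the result. Assuming the standard wordcount output layout, the output directory contains one or more part files that can be listed and printed with the dfs shell:

bin/hadoop dfs -ls gutenberg-output
bin/hadoop dfs -cat gutenberg-output/part-* | head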

Adding new slave nodes to cluster:

If you want to add new slave nodes to the cluster, simply add the hostnames of the new slaves to the conf/slaves file on the master node. Make sure to follow the password-less SSH instructions above to allow communication from the master to the slaves. Then log in as the hadoop user on each new slave and start the datanode and tasktracker:

bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker

You can also restart the whole cluster from the master (if you want to!):
bin/stop-all.sh
bin/start-all.sh

Install hive:

hive is Java code that is only needed on nodes used to submit hive queries. In our example, we install it only on the master node. Since hive is not in a production-ready state yet, we need to download the source code and build it on our machine. If svn and ant are not installed, install them first:

yum install subversion
yum install ant

To install hive, log in as root, then:

cd /tmp
svn co http://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.1-rc2/ hive
cd hive
ant -Dhadoop.version="0.20.0" package
cd build/dist
mkdir /hadoop/hive
cp -R * /hadoop/hive/.
chown -R hadoop /hadoop

Configure hive:

Set up some paths in the login script:
nano /etc/profile
add the following to the end of the profile script:

export JAVA_HOME=/usr/java/jdk1.6.0_18
export HADOOP_HOME=/hadoop/hadoop
export HIVE_HOME=/hadoop/hive
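Optionally (this is my addition, not required by the rest of the tutorial), you can also extend the PATH in the same place so the hadoop and hive commands work from any directory:

export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin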

hive will store its data in specific directories in HDFS. These need to be created once:

su - hadoop
cd /hadoop/hadoop
bin/hadoop fs -mkdir /tmp
bin/hadoop fs -mkdir /user/hive/warehouse
bin/hadoop fs -chmod g+w /tmp
bin/hadoop fs -chmod g+w /user/hive/warehouse
bin/hadoop fs -ls /user

Testing hive with embedded derby server:

To make sure hive was installed correctly:

cd /hadoop/hive
bin/hive
hive> show tables;
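If the shell comes up, a slightly stronger smoke test (the table name here is just an example) is to create and drop a throwaway table, which exercises the metastore and the warehouse directory:

hive> CREATE TABLE smoke_test (id INT);
hive> SHOW TABLES;
hive> DROP TABLE smoke_test;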

Install derby server:

hive by default uses an embedded derby database. In real-world scenarios where multiple hive queries are executed from multiple sessions, a database server like MySQL or the derby network server is required. For our example, we will use the derby network server. Before doing that, make sure the cluster is down:

su - hadoop
cd /hadoop/hadoop
bin/stop-all.sh

Then download and install the same version of derby that comes with hive:

cd /hadoop
wget http://archive.apache.org/dist/db/derby/db-derby-10.4.2.0/db-derby-10.4.2.0-bin.tar.gz
tar -xzf db-derby-10.4.2.0-bin.tar.gz
mv db-derby-10.4.2.0-bin derby
mkdir derby/data

Log in as root, then run:
nano /etc/profile.d/derby.sh
enter the following into the file:

DERBY_INSTALL=/hadoop/derby
DERBY_HOME=/hadoop/derby
export DERBY_INSTALL
export DERBY_HOME

Also create (or modify) hive.sh:
nano /etc/profile.d/hive.sh
add the following into the file:

HADOOP=/hadoop/hadoop/bin/hadoop
export HADOOP

The rest of the modifications can be done while logged in as "hadoop":
su - hadoop
We can use start-dfs.sh to also start the derby server:
nano /hadoop/hadoop/bin/start-dfs.sh
Add the following to the end of the script:

cd /hadoop/derby/data
nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &

If you have followed this tutorial, you do not have hive-site.xml. Either create one or edit the existing one:

nano /hadoop/hive/conf/hive-site.xml

The content will look like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
  <description>controls whether to connect to remote metastore server or open a new metastore server in Hive Client JVM</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://centos1:1527/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.ClientDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
</configuration>

There is another file to modify (or create, if you do not have it):
nano /hadoop/hive/conf/jpox.properties
Note that you have to put the hostname of the server running derby instead of "centos1":

javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
org.jpox.autoCreateSchema=false
org.jpox.validateTables=false
org.jpox.validateColumns=false
org.jpox.validateConstraints=false
org.jpox.storeManagerType=rdbms
org.jpox.autoCreateSchema=true
org.jpox.autoStartMechanismMode=checked
org.jpox.transactionIsolation=read_committed
javax.jdo.option.DetachAllOnCommit=true
javax.jdo.option.NontransactionalRead=true
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL=jdbc:derby://centos1:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName=APP
javax.jdo.option.ConnectionPassword=mine

Now copy some derby-related jar files from the new derby installation to the hive folder:

cp /hadoop/derby/lib/derbyclient.jar /hadoop/hive/lib
cp /hadoop/derby/lib/derbytools.jar /hadoop/hive/lib

Now you can start the cluster:

cd /hadoop/hadoop
bin/start-all.sh
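Before testing hive, you can verify that the derby network server came up with the cluster; one way (using the ping command of derby's own NetworkServerControl script) is:

/hadoop/derby/bin/NetworkServerControl ping -h centos1 -p 1527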

To make sure the new derby server is configured correctly:

cd /hadoop/hive
bin/hive
hive> show tables;

Starting the hive web interface:

It is much easier to run hive queries using the hive web interface. If you have followed this tutorial and installed hive release-0.4.1-rc2, then you have to fix a config property manually. (This issue has been fixed in trunk, as I was told.) Edit hive-default.xml:
nano /hadoop/hive/conf/hive-default.xml

And modify value of hive.hwi.war.file to the following:

/hadoop/hive/lib/hive_hwi.war
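In hive-default.xml this is a regular property element, so after the edit the entry should look roughly like this (the description text in your copy may differ and can stay as-is):

<property>
  <name>hive.hwi.war.file</name>
  <value>/hadoop/hive/lib/hive_hwi.war</value>
</property>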

Now you can start the web server listener:

export ANT_LIB=/usr/share/ant/lib
bin/hive --service hwi 

To access the web interface, point your browser to (replace centos1 with your hostname):
http://centos1:9999/hwi
