Setting up Hadoop/Hive to use MySQL as metastore

1. February 2010

In a previous post I showed how to setup Hadoop/Hive to use Derby in server mode as the metastore. Many believe MySQL is a better choice for such purpose, so here I'm going to show how we can configure our cluster which we created previously to use a MySQL server as the metastore for Hive.

First we need to install MySQL. In this scenario, I'm going to install MySQL on our Master node, which is named centos1.

 

When logged in as root user:

yum install mysql-server

Now make sure MySQL server is started:

/etc/init.d/mysqld start

Next, I'm going to create a new MySQL user for hadoop/hive:

mysql
mysql> CREATE USER 'hadoop'@'centos1' IDENTIFIED BY 'hadoop';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hadoop'@'centos1' WITH GRANT OPTION;
mysql> exit

To make sure this new user can connect to MySQL server, switch to user hadoop:
 
su - hadoop
 
mysql -h centos1 -u hadoop -p
 
We need to change the hive configuration so it can use MySQL:
 
nano /hadoop/hive/conf/hive-site.xml
 
and new configuration values are:
 
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://centos1:3306/hive?createDatabaseIfNotExist=true</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hadoop</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hadoop</value>
</property>
 

Some of the above parameters do not match what we did to setup derby server in previous post, so I decided to delete the jpox.properties file:

rm /hadoop/hive/conf/jpox.properties

hive needs to have the MySQL jdbc drivers, so we need to download and copy it to hive/lib folder:
 
cd /hadoop
wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.11.tar.gz/from/http://mysql.he.net/
tar -xvzf mysql-connector-java-5.1.11.tar.gz
cp mysql-connector-java-5.1.11/*.jar /hadoop/hive/lib
 
To make sure all settings are done correctly, we can do this:
 
cd /hadoop/hive
bin/hive
hive> show tables;
 

hadoop, hive, mysql , ,