If you already have Nagios set up to monitor your servers, it's a good idea to use it to monitor your Hadoop data nodes as well. You can use NRPE to call your DFS monitoring plugin. I found this blog post about setting up NRPE on CentOS extremely useful. The only trick to getting it to work was to also set up the RPMforge repository for yum, so that it finds the NRPE packages (nagios-nrpe and nagios-plugins-nrpe).
The next step was to add a monitoring script for Hadoop DFS, which I found here. Since I was using Hadoop 0.20.2, I had to make some changes to the script so that it correctly parses values out of the DFS report: More...
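To give a rough idea of what such a check involves, here is a minimal Python sketch of a Nagios-style plugin. It is not the script referenced above. It assumes the output of `hadoop dfsadmin -report` contains lines shaped like `DFS Used%: 12.5%` and `Datanodes available: 3 (3 total, 0 dead)`, which may differ between Hadoop versions, and the function names and the 80/90 percent thresholds are made up for illustration.

```python
import re
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_dfs_report(report):
    """Return (used_pct, total_nodes, dead_nodes), or None if parsing fails.

    Assumption: the cluster summary comes first in the report, so the first
    'DFS Used%' line is the cluster-wide value, not a per-node one.
    """
    used = re.search(r"^DFS Used%:\s*([\d.]+)\s*%", report, re.MULTILINE)
    nodes = re.search(r"\((\d+) total, (\d+) dead\)", report)
    if used is None or nodes is None:
        return None
    return float(used.group(1)), int(nodes.group(1)), int(nodes.group(2))


def evaluate(report, warn_pct=80.0, crit_pct=90.0):
    """Map a report to a (nagios_exit_code, status_line) pair."""
    parsed = parse_dfs_report(report)
    if parsed is None:
        return UNKNOWN, "DFS UNKNOWN - could not parse dfs report"
    used_pct, total, dead = parsed
    message = "used=%.2f%% datanodes=%d dead=%d" % (used_pct, total, dead)
    # Any dead data node, or usage above the critical threshold, is CRITICAL.
    if dead > 0 or used_pct >= crit_pct:
        return CRITICAL, "DFS CRITICAL - " + message
    if used_pct >= warn_pct:
        return WARNING, "DFS WARNING - " + message
    return OK, "DFS OK - " + message


def main():
    """Run the report command, print one status line, return the exit code."""
    try:
        raw = subprocess.check_output(["hadoop", "dfsadmin", "-report"])
    except (OSError, subprocess.CalledProcessError) as exc:
        print("DFS UNKNOWN - %s" % exc)
        return UNKNOWN
    code, message = evaluate(raw.decode("utf-8", "replace"))
    print(message)
    return code
```

To use it as an actual plugin, the script would end with `sys.exit(main())` (after `import sys`) so that Nagios sees the exit code, and NRPE would point at it with a `command[...]` entry in nrpe.cfg.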
In a previous post I showed how to set up Hadoop/Hive to use Derby in server mode as the metastore. Many believe MySQL is a better choice for this purpose, so here I'm going to show how to configure the cluster we created previously to use a MySQL server as the Hive metastore.
First we need to install MySQL. In this scenario, I'm going to install MySQL on our Master node, which is named centos1. More...
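On the Hive side, pointing the metastore at MySQL usually comes down to a handful of JDO connection properties in hive-site.xml. The Python sketch below generates such a fragment. The host centos1 comes from the text above, while the database name, user name and password are placeholders invented for illustration, and the helper function names are hypothetical as well.

```python
import xml.etree.ElementTree as ET


def metastore_properties(host, database, user, password):
    """Build the JDO connection properties Hive reads for a MySQL metastore."""
    return {
        "javax.jdo.option.ConnectionURL":
            "jdbc:mysql://%s/%s?createDatabaseIfNotExist=true" % (host, database),
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }


def render_hive_site(properties):
    """Render the properties as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in sorted(properties.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values: only the host name centos1 is taken from the post.
print(render_hive_site(
    metastore_properties("centos1", "metastore_db", "hiveuser", "changeme")))
```

The MySQL JDBC driver jar also has to be on Hive's classpath for these settings to take effect.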
18. November 2009 23:10
If you are reading this post, chances are you are trying to set up a Hadoop/Hive cluster. I hope these instructions save you some time with your installation.
I assume you are starting from scratch and want to set up a small cluster with minimal OS components (as far as I could minimize them). To start, download the OS setup media, and boot each machine with it. Go through the More...