Background
Monitoring server hardware is always a good idea. In many cases the root cause of a problem is hardware related: a fan stops and the temperature gets too high, dust in the machine makes it too hot, disks fail, memory gets corrupted and so on. This article describes how to enable hardware monitoring on an HP Proliant running CentOS Linux and how to collect the data with Nagios or op5 Monitor. The procedure is the same on Red Hat Enterprise Linux and similar on SUSE Linux Enterprise Server.
The HP manuals are bloated with irrelevant information, and I had to struggle for several hours and ask colleagues to get it running. I hope this blog article makes it easier for others to set up monitoring of an HP Proliant using HP Insight Manager.
Installing the software on the target system
You need two packages from HP. Both can be downloaded from hp.com under "Support & Drivers". Search for your hardware platform and select the correct operating system:
- ProLiant Support Pack for Red Hat Enterprise Linux 5 (i686). As of 2010-03-02 the latest file is psp-8.25.rhel5.i686.en.tar.gz
- HP System Health Application and Insight Management Agents for Red Hat Enterprise Linux 5 (x86). The latest file is hpasm-8.0.0-173.rhel5.i386.rpm
Install kernel source code and rpm tools:
# yum install kernel-devel rpm-build rpm-devel
The Proliant Support Package is not supported on CentOS, so you have to make the installer think it is running on a Red Hat system. If you have a real RHEL system, skip this step (and the restore of /etc/redhat-release later on).
# cp /etc/redhat-release /etc/redhat-release.backup
# echo "Red Hat Enterprise Linux Server release 5.4 (Tikanga)" > /etc/redhat-release
Untar the Proliant Support Package
# tar xzvf psp-8.25.rhel5.i686.en.tar.gz
Install the Proliant Support Package
# cd compaq/csp/linux/
# ./install825.sh
A lot of text scrolls by, along with some questions. Answer them.
Install the HP System Health Application and Insight Management Agents. For some stupid reason the package conflicts with some of the packages just installed. I solved it in a dirty way:
# rpm -i --force --replacefiles --nodeps hpasm-8.0.0-173.rhel5.i386.rpm
Configure by running:
# /etc/init.d/hpasm configure
and answer the questions.
Do not forget to restore /etc/redhat-release
# cp /etc/redhat-release.backup /etc/redhat-release
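The backup, fake and restore steps around the installer can also be wrapped in one small helper, so that the original file comes back even if the installer exits with an error. This is only a sketch: with_fake_release is a hypothetical function name of my own, not something shipped by HP, and it does not handle an installer that is interrupted with Ctrl-C.

```shell
#!/bin/sh
# Sketch: temporarily replace a release file while a command runs,
# then restore the original, also when the command fails.
# with_fake_release is a hypothetical helper, not part of the HP tools.
with_fake_release() {
    release_file=$1   # e.g. /etc/redhat-release
    fake_text=$2      # the text the installer should see
    shift 2           # the remaining arguments are the command to run

    # Keep a copy of the original file, then overwrite it with the fake text.
    cp -p "$release_file" "$release_file.backup" || return 1
    printf '%s\n' "$fake_text" > "$release_file"

    # Run the command and remember its exit status.
    status=0
    "$@" || status=$?

    # Put the original file back no matter how the command ended.
    cp -p "$release_file.backup" "$release_file"
    return "$status"
}
```

As root, and from the compaq/csp/linux/ directory, it would be called like this: with_fake_release /etc/redhat-release "Red Hat Enterprise Linux Server release 5.4 (Tikanga)" ./install825.sh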
I modified my /etc/snmp/snmpd.conf to contain:
dlmod cmaX /usr/lib/libcmaX.so
rocommunity public
trapsink 10.1.1.20
syscontact peter@it-slav.net
syslocation PDC, Peters DataCenter
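A quick sanity check of the configuration above might look like the sketch below. It only tests that the two lines the HP agents depend on (the cmaX module and a read-only community) are present. check_snmpd_conf is a hypothetical helper name, and the path is passed as an argument so nothing about your system is assumed.

```shell
#!/bin/sh
# Sketch: verify that an snmpd configuration file loads the HP cmaX module
# and defines a read-only community.
# check_snmpd_conf is a hypothetical helper name.
check_snmpd_conf() {
    conf=$1
    status=0
    # The directive must start a line (leading blanks allowed).
    grep -Eq '^[[:space:]]*dlmod[[:space:]]+cmaX[[:space:]]' "$conf" || {
        echo "no 'dlmod cmaX' line in $conf"
        status=1
    }
    grep -Eq '^[[:space:]]*rocommunity[[:space:]]' "$conf" || {
        echo "no 'rocommunity' line in $conf"
        status=1
    }
    return "$status"
}
```

Run it as check_snmpd_conf /etc/snmp/snmpd.conf. On CentOS 5 a change to this file would typically be followed by a restart of the daemon, for example with service snmpd restart.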
To test that the installation and configuration work, run snmpwalk from your Nagios or op5 Monitor host:
# snmpwalk -c public -v1 <ip-address of your proliant box> 1.3.6.1.4.1.232
SNMPv2-SMI::enterprises.232.1.1.1.0 = INTEGER: 1
SNMPv2-SMI::enterprises.232.1.1.2.0 = INTEGER: 23
SNMPv2-SMI::enterprises.232.1.1.3.0 = INTEGER: 2
SNMPv2-SMI::enterprises.232.1.2.1.4.1.0 = INTEGER: 30
SNMPv2-SMI::enterprises.232.1.2.1.4.2.1.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.232.1.2.1.4.2.1.2.1 = STRING: "Compaq Standard Equipment Agent for Linux"
SNMPv2-SMI::enterprises.232.1.2.1.4.2.1.3.1 = ""
SNMPv2-SMI::enterprises.232.1.2.1.4.2.1.4.1 = Hex-STRING: 00 00 00 00 00 00 00
SNMPv2-SMI::enterprises.232.1.2.1.4.2.1.5.1 = STRING: "To gather Standard Equipment data for Linux."
...
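If you want to script this test, one option is to count how many answers fall under the HP/Compaq enterprise tree (enterprises.232). The sketch below reads snmpwalk output on stdin. It assumes the OIDs are printed as in the sample above, which is what you get when no HP MIB files are installed on the monitoring host. count_hp_oids is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count the lines of snmpwalk output that belong to the
# HP/Compaq enterprise tree (enterprises.232). Reads from stdin.
# count_hp_oids is a hypothetical helper name.
count_hp_oids() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at 0 when that number is 0.
    grep -c '^SNMPv2-SMI::enterprises\.232\.' || true
}
```

Pipe the snmpwalk command shown above into count_hp_oids. A result of 0 means the agents did not answer, anything above 0 means the HP subtree is reachable.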
Install check_hpasm on the Nagios or op5 Monitor host
check_hpasm can be downloaded from ConSol Labs.
Unpack the tarball
# tar xzvf check_hpasm-4.1.2.tar.gz
Configure and compile
# ./configure --prefix=/opt/plugins/custom/hp-insight --with-nagios-user=monitor --with-nagios-group=users --enable-perfdata
...
# make
...
# make install
Test
# /opt/plugins/custom/hp-insight/libexec/check_hpasm -H <ip-address of your proliant box> -C public
OK - System: 'proliant dl360 g3', S/N: '7J31LMW6N01D', ROM: 'P31 01/28/2004', hardware working fine, da: 1 logical drives, 1 physical drives | fan_1=50% fan_2=50% temp_1_cpu=16;50;50 temp_2_cpu=15;65;65 temp_3_ioBoard=21;56;56 temp_4_cpu=20;65;65
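Everything after the "|" in the plugin output is performance data in the usual Nagios form label=value;warn;crit. To read it more easily on the command line, a small filter like the sketch below can split it up. perfdata_table is a hypothetical helper name, and the filter assumes labels without spaces, as in the output above.

```shell
#!/bin/sh
# Sketch: turn the performance data of a plugin output line (read from
# stdin) into one line per metric. Assumes labels contain no spaces.
# perfdata_table is a hypothetical helper name.
perfdata_table() {
    # Keep only the text after the first '|', put each item on its own
    # line, then split every item at '=' and ';'.
    sed -n 's/^[^|]*| *//p' | tr ' ' '\n' |
        awk -F'[=;]' 'NF >= 2 { printf "%s value=%s warn=%s crit=%s\n", $1, $2, $3, $4 }'
}
```

Pipe the check_hpasm command into perfdata_table. For the output above, temp_1_cpu=16;50;50 becomes "temp_1_cpu value=16 warn=50 crit=50", and the fans, which carry no thresholds, get empty warn and crit fields.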
Congratulations, your plugin and hardware monitoring work!
Configure Nagios or op5 Monitor
checkcommands.cfg
# command 'check_hpasm'
define command{
    command_name    check_hpasm
    command_line    $USER1$/custom/libexec/check_hpasm -H $HOSTADDRESS$ -C $ARG1$
}
services.cfg
# service 'Insight Manager'
define service{
    use                     default-service
    host_name               humpa
    service_description     Insight Manager
    check_command           check_hpasm!public
    contact_groups          call_it-slav,it-slav_jabber,it-slav_mail
}
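The two definitions fit together through macros: the text after the "!" in check_command becomes $ARG1$, and $HOSTADDRESS$ is the address of the host the service belongs to. The sketch below imitates that substitution so you can see which command line would be run. expand_command is a hypothetical helper name and 10.1.1.30 is a made-up address. It handles a single argument only and leaves $USER1$ alone, since that comes from your resource file.

```shell
#!/bin/sh
# Sketch: imitate how Nagios fills $HOSTADDRESS$ and $ARG1$ into a
# command_line. Only one argument is handled, and the values must not
# contain '|' or '&' because they are passed through sed.
# expand_command is a hypothetical helper name.
expand_command() {
    command_line=$1    # command_line from the command definition
    host_address=$2    # value to use for $HOSTADDRESS$
    check_command=$3   # e.g. check_hpasm!public

    # Everything after the first '!' is treated as $ARG1$.
    arg1=${check_command#*!}

    printf '%s\n' "$command_line" |
        sed -e "s|\\\$HOSTADDRESS\\\$|$host_address|g" \
            -e "s|\\\$ARG1\\\$|$arg1|g"
}
```

Called as expand_command '$USER1$/custom/libexec/check_hpasm -H $HOSTADDRESS$ -C $ARG1$' 10.1.1.30 'check_hpasm!public', it prints the command line with the address and the community string public filled in.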
Screenshot, using Ninja
Useful links
- op5, a systems and network management company
- op5 Monitor, an enterprise monitoring system based on Nagios
- Ninja, Nagios is now just awesome
- Nagios, enterprise monitoring based on open source
- HP Support & Drivers, a place to start looking for the HP software used in this article
March 5th, 2010 at 1:50 am
Wow, thanks for writing about my plugin. Two things may be also interesting for you:
– the latest version can also be used to monitor HP Blade Centers (checks Blades, Power supplies). These Bladecenters seem to gain a lot of popularity.
– adding “-v” to check_hpasm outputs a line for each component of the system (fans, memory modules, power supplies, disks,…). The above graphic would look the same, but by clicking on “Insight Manager” you would see all this extra information in the service detail window.
Greetings from Munich,
Gerhard
May 11th, 2010 at 3:25 pm
Hello,
I’ve found that HP has a support pack for CentOS on their website. I have downloaded it and successfully installed it on my CentOS based HP PL320 G6 system. Just download and unpack, then install the rpm packages one by one.
There is a “ReleaseNotes.txt” that is “kinda” helpful 🙂
I use a plugin called “check_compaq” that works like a charm with this setup.
Regards,
Gusten