Nagios Installation and Configuration
From Debian Clusters
Installing Nagios
First, you'll need a list of all the hosts you want to monitor, given as fully qualified names. For example, mine are eyrie.raptor.loc, gyrfalcon.raptor.loc, and so on: one for each host I want to monitor.
Nagios must be installed on a machine running a web server. Install it with
apt-get install nagios2
Note: As part of the package install, this will also install exim, a mail transfer agent. More documentation to follow soon (hopefully) about setting up mail.
Configuring Nagios
Web
At this point, you should be able to visit http://yourserver/nagios2/ and be prompted for a username and password. Great! What are they? Well, they haven't been created yet. Create them with
htpasswd -c /etc/nagios2/htpasswd.users nagiosadmin
and specify the password you want. Reload the page, enter the username and password, and you should see the Nagios main page. If you click on anything under the "Monitoring" headline, you'll get a nice little error message: "Whoops! Error: Could not read host and service status information!" That's because there isn't any information for it to read yet. We need to go ahead and specify it.
Configuration: Hosts
Nagios drops several configuration files into /etc/nagios2/conf.d/ as part of the base install. We'll use these as a basis for configuring Nagios. First, we'll need a template file. From inside /etc/nagios2/conf.d/, copy host-gateway_nagios2.cfg to /tmp/cluster-nagios.skel with
cp host-gateway_nagios2.cfg /tmp/cluster-nagios.skel
Next, open /tmp/cluster-nagios.skel in a text editor. Cut out the top line and change the values to variables we'll be replacing, so the file looks like this:
define host {
host_name HOSTNAME
alias ALIAS
address IP
use generic-host
}
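If you would rather not edit the copied file by hand, the same template can be written out directly. This is a minimal sketch, assuming the four fields shown above are all you need; write_skel is a hypothetical helper name, not something shipped with Nagios.

```shell
#!/bin/bash
# write_skel TARGET (hypothetical helper)
# Writes the host template, with its HOSTNAME/ALIAS/IP placeholders, to TARGET.
write_skel() {
    local target="$1"
    cat > "$target" <<'EOF'
define host {
        host_name   HOSTNAME
        alias       ALIAS
        address     IP
        use         generic-host
}
EOF
}

# Same path the rest of this page uses
write_skel /tmp/cluster-nagios.skel
```

The quoted 'EOF' keeps the shell from expanding anything inside the template.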
We'll need to create a host entry for each machine to monitor. The script reads your list of fully qualified hostnames, one per line, from /tmp/hosts, so copy your list there. If you don't have one, make one, because you'll need it. Paste this script into a new file, make it executable (chmod u+x yourfilename) and then run it (./yourfilename).
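#!/bin/bash
# Short script to generate nagios host entries
# Assumes you have a file called /tmp/hosts with one fully qualified
# hostname per line, for example
# eyrie.raptor.loc
# gyrfalcon.raptor.loc
#
# Also assumes you have a /tmp/cluster-nagios.skel file that looks like this
# define host {
# host_name HOSTNAME
# alias ALIAS
# address IP
# use generic-host
# }

# Start from an empty output file so rerunning the script does not duplicate entries
: > /tmp/cluster-nagios.cfg

for x in $(cat /tmp/hosts)
do
    # The alias is the short hostname: everything before the first dot
    y=$(echo "$x" | cut -d . -f1)
    # Look up the IP address. Field 3 fits output of the form "name A address";
    # if your host command prints "name has address IP", use $4 instead.
    z=$(host "$x" | awk '{print $3}')
    echo "Creating entry host_name = $x , alias = $y, IP = $z"
    sed "s/HOSTNAME/$x/g" /tmp/cluster-nagios.skel | sed "s/IP/$z/g" | sed "s/ALIAS/$y/g" >> /tmp/cluster-nagios.cfg
done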
After you run it, /tmp/cluster-nagios.cfg should have one entry for each host. Copy this file to the nagios2 configuration directory:
cp /tmp/cluster-nagios.cfg /etc/nagios2/conf.d/
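A quick way to confirm that the generated file really has one entry per host is to compare two counts. This is only a sketch: count_check is a hypothetical helper name, and it assumes one hostname per non-empty line in the hosts file.

```shell
#!/bin/bash
# count_check HOSTS_FILE CFG_FILE (hypothetical helper)
# Succeeds when the number of hosts matches the number of "define host" entries.
count_check() {
    local hosts_file="$1" cfg_file="$2" want have
    want=$(grep -c . "$hosts_file")                          # non-empty lines = hosts
    have=$(grep -c '^[[:space:]]*define host' "$cfg_file")   # generated entries
    echo "hosts listed: $want, entries generated: $have"
    [ "$want" -eq "$have" ]
}

# Only run against the real files if they exist
if [ -r /tmp/hosts ] && [ -r /tmp/cluster-nagios.cfg ]; then
    count_check /tmp/hosts /tmp/cluster-nagios.cfg \
        || echo "Counts differ: check the hosts file and the generated entries."
fi
```

If the counts differ, it is also worth looking for entries with an empty or odd-looking address line, which would suggest a failed lookup.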
Next, try to start up (or restart) nagios with
/etc/init.d/nagios2 restart
Check the log to make sure it started up correctly. If it did, you should have one entry about "no services associated" for each host.
gyrfalcon:/etc/nagios2/conf.d# tail -n 20 /var/log/nagios2/nagios.log
[1197252453] Nagios 2.10 starting... (PID=7448)
[1197252453] LOG VERSION: 2.0
[1197252453] Finished daemonizing... (New PID=7449)
[1197253356] Caught SIGTERM, shutting down...
[1197253356] Successfully shutdown... (PID=7449)
[1197256272] Nagios 2.10 starting... (PID=8371)
[1197256272] LOG VERSION: 2.0
[1197256272] Warning: Host 'eagle.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'eyrie.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'goshawk.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'gyrfalcon.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'harrier.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'kestrel.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'kite.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'osprey.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'owl.raptor.loc' has no services associated with it!
[1197256272] Warning: Host 'peregrine.raptor.loc' has no services associated with it!
[1197256273] Finished daemonizing... (New PID=8372)
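With more than a handful of hosts, checking the log by eye gets tedious. The sketch below prints every host from the list that has no "no services associated" warning in the log. hosts_missing_from_log is a hypothetical helper name, and the warning text it matches is taken from the log excerpt above. Note that it searches the whole log, so warnings from an earlier start will also count.

```shell
#!/bin/bash
# hosts_missing_from_log HOSTS_FILE LOG_FILE (hypothetical helper)
# Prints each host in HOSTS_FILE that has no matching warning in LOG_FILE.
hosts_missing_from_log() {
    local hosts_file="$1" log_file="$2" h
    # The "|| [ -n ... ]" part handles a last line without a trailing newline
    while read -r h || [ -n "$h" ]; do
        [ -n "$h" ] || continue   # skip blank lines
        grep -qF "Host '$h' has no services associated" "$log_file" || echo "$h"
    done < "$hosts_file"
}

# Only run against the real files if they exist
if [ -r /tmp/hosts ] && [ -r /var/log/nagios2/nagios.log ]; then
    hosts_missing_from_log /tmp/hosts /var/log/nagios2/nagios.log
fi
```

No output means every host in the list was picked up.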
Configuration: Services
Hoorah. Now we've given Nagios the names of the hosts to monitor, but we haven't told it what we want to monitor. The easiest thing to monitor is whether a given host is currently up, and we can do that with ping. Again, we'll need our list of all the hosts. To create a comma-separated list of all the hosts to add, issue
for x in `cat /tmp/hosts`; do echo "$x," >> /tmp/hoststoadd; done
Then open /tmp/hoststoadd in a text editor and join the lines so that all of the hosts are on one line. Then copy the contents of the file. We'll need to add all the hosts to /etc/nagios2/conf.d/hostgroups_nagios2.cfg. Open hostgroups_nagios2.cfg and look for this section:
# nagios doesn't like monitoring hosts without services, so this is
# a group for devices that have no other "services" monitorable
# (like routers w/out snmp for example)
define hostgroup {
hostgroup_name ping-servers
alias Pingable servers
members gateway
}
Delete gateway (it should be in your list of hostnames anyway), then paste your list in. Again, restart nagios:
/etc/init.d/nagios2 restart
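The comma-separated list can also be built on a single line without a text editor. This is a sketch using paste; members_line is a hypothetical helper name, and it assumes one hostname per line in the hosts file. It prints a members line that you can paste into the ping-servers hostgroup in place of the existing one.

```shell
#!/bin/bash
# members_line HOSTS_FILE (hypothetical helper)
# Joins the non-empty lines of HOSTS_FILE with commas and prints a members directive.
members_line() {
    local list
    list=$(grep . "$1" | paste -sd, -)   # -s: join all lines, -d,: comma delimiter
    printf 'members %s\n' "$list"
}

# Only run against the real file if it exists
if [ -r /tmp/hosts ]; then
    members_line /tmp/hosts
fi
```

Building the list this way also avoids a trailing comma after the last host.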
Cleaning up Extra Cruft
Installing the nagios2 package causes some groups and services to be created for you automatically, and you'll see those entries on the web side of things. I prefer to have only what I've configured showing.
To clean up, first create a backup directory. I did
mkdir /etc/nagios2/old-conf.d-stuff
Then, move localhost_nagios2.cfg, host-gateway_nagios2.cfg, and extinfo_nagios2.cfg into that directory, and put a copy of hostgroups_nagios2.cfg there as well.
- mv /etc/nagios2/conf.d/localhost_nagios2.cfg /etc/nagios2/old-conf.d-stuff/
- mv /etc/nagios2/conf.d/extinfo_nagios2.cfg /etc/nagios2/old-conf.d-stuff/
- mv /etc/nagios2/conf.d/host-gateway_nagios2.cfg /etc/nagios2/old-conf.d-stuff/
- cp /etc/nagios2/conf.d/hostgroups_nagios2.cfg /etc/nagios2/old-conf.d-stuff/
Next, edit /etc/nagios2/conf.d/hostgroups_nagios2.cfg and take out (or comment out) the sections for the following hostgroups:
- debian-servers
- ssh-servers
- http-servers (or change the value in this entry from localhost to the fully qualified name of your web server, e.g. gyrfalcon.raptor.loc)
Leave no references to gateway or localhost.
After that, you'll need to comment out any services that you're no longer using in the file /etc/nagios2/conf.d/services_nagios2.cfg
Finally, restart nagios. If you haven't removed all the entries for localhost or gateway, or you missed moving a file, you'll see an error like this:
gyrfalcon:/etc/nagios2/conf.d# /etc/init.d/nagios2 restart
Restarting nagios2 monitoring daemon: nagios2
Nagios 2.10
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 10-21-2007
License: GPL
Reading configuration data...
Error: Could not find any host matching 'localhost'
Error: Could not expand member hosts specified in hostgroup
(config file '/etc/nagios2/conf.d/hostgroups_nagios2.cfg', starting on line 25)
***> One or more problems was encountered while processing the config files...
Check your configuration file(s) to ensure that they contain valid
directives and data defintions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.
errors in config!
failed!
At least the error message is pretty helpful - it will give you the name of the file with the error. Edit/move that file, then try restarting again.
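To catch leftovers before restarting, you can search the configuration directory for the two names. This is a rough sketch: find_leftovers is a hypothetical helper name, and it only skips lines that are commented out with a leading #.

```shell
#!/bin/bash
# find_leftovers DIR (hypothetical helper)
# Lists uncommented lines in DIR/*.cfg that still mention localhost or gateway.
find_leftovers() {
    # -H: always print the file name, -n: line number, -w: whole words only
    grep -HnEw 'localhost|gateway' "$1"/*.cfg 2>/dev/null \
        | grep -vE '^[^:]+:[0-9]+:[[:space:]]*#'
}

# Only run against the real directory if it exists
if [ -d /etc/nagios2/conf.d ]; then
    find_leftovers /etc/nagios2/conf.d || echo "No leftover references found."
fi
```

Each line of output has the form file:line:text, so you can jump straight to the entry that needs editing.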

