nagios-stat(1) nagios-statd client

SYNOPSIS

nagios-stat [OPTION] check server

DESCRIPTION

nagios-stat (nagios-statd client) is the client program for nagios-statd(8). Together, these programs comprise a systems monitoring tool for various platforms. They are designed to be integrated with the Nagios monitoring tool, although this is not a requirement.

nagios-stat connects to the nagios-statd server, tells the daemon which check it wants to run, parses the returned data, prints a result, and then exits with the appropriate exit code.

OPTIONS

-c, --critical=LEVEL
LEVEL is the level at which the client program exits with a critical error. The default depends on the check being run, and is documented under CHECKS below.

-d, --disk=DISK
A specific disk to check when running the disk check. It is only valid for this check, and will be ignored for all other checks.

-D, --ignoredisk=DISK
A comma delimited list of disks (device names) to ignore when running the disk check. It is only valid for this check, and will be ignored for all other checks.

-l, --lt
Reverse how the proc check exits on criticality and warnings. Instead of going critical/warning when the number of processes is greater than -c/-w, it now goes critical/warning when the number of processes is less than -c/-w.

-m, --mount=MOUNT
A specific mount point to check when running the disk check. It is only valid for this check, and will be ignored for all other checks.

-M, --ignoremount=MOUNT
A comma delimited list of mount points to ignore when running the disk check. It is only valid for this check, and will be ignored for all other checks.

-n, --processname=NAME
Name of a specific process to check for when running the proc check. It is only valid for this check, and will be ignored for all other checks.

-P, --perfdata
Enables output of Nagios performance data.

-p, --port=PORT
Port to connect to on remote nagios-statd(8) server.

-s, --state=STATE
Process states to check for (such as ZW, or just Z, etc.) when running the proc check. It is only valid for this check, and will be ignored for all other checks.

-v, --verbose
Gives verbose output for checks. Currently, it causes the disk check to always print out utilization information.

-w, --warning=LEVEL
LEVEL is the level at which the client program exits with a warning. The default depends on the check being run, and is documented under CHECKS below.

-V, --version
Output version information and exit.

-x, --debug
Prints out raw data that nagios-stat(1) receives from the nagios-statd(8) server. Useful for debugging connection issues.

-h, --help
Print short option information and exit.

CHECKS

disk (warning: 90%; critical: 95%)
The disk check allows you to check whether a disk's utilization is above a specified percentage. You can use the -d/-m options to check a specific disk/mount, or you can use the -D/-M options to ignore certain disks/mount points (like /cdrom). By default, the disk check will check all disks on a machine. Note that the -d/-m options and the -D/-M options are mutually exclusive, although -D and -M can be combined with each other. Also, if you specify a disk/mount and it isn't found, then nagios-stat(1) will exit at a critical level.

load (warning: 2; critical: 5)
The load check allows you to check whether a machine's 5-minute load average is above a specified value.

proc (warning: 100; critical: 200) / (warning: -1; critical: 1 if using -l/--lt)
The proc check allows you to check whether a machine is running more than the specified number of processes. You can use the -s option to restrict the results to processes in a specified state. Be warned: using the -s option with a state that the process check itself runs in may result in an off-by-one error. There is no workable way to fix this. You can also use the -n option to look only at certain processes as they appear in the process table. This matches the beginning of the process's name, so -n ora would match "oracle", "orablob", and "orafreep -n". Interpreted programs often find their command line modified (especially on Linux). For example, running "./foo.pl" will result in a process name of "/usr/bin/perl ./foo.pl". An easy solution (in Perl, for example) is to add "$0 = $0;" at the beginning of your programs.

swap (warning: 75%; critical: 90%)
The swap check allows you to check whether a machine's swap utilization is above a specified percentage. Due to the difficulty of ascertaining this information, your platform may not be supported for this check.

user (warning: 20; critical: 30)
The user check allows you to check to see if more than a specified number of users are currently logged in.

version (warning: nagios-stat(1) version - .01; critical: none)
The version check allows you to check which version of nagios-statd(8) the remote server is running. This is useful for organizations that run a large number of servers and want to make sure they are always up to date.
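
The threshold behavior described above for the proc check, including the -n prefix match and the -l/--lt reversal, can be sketched roughly as follows. This is an illustrative sketch only and is not taken from the nagios-stat source: the function names count_matching and evaluate are hypothetical, and the strict comparisons are an assumption based on the wording "greater than" and "less than" in this page.

```python
# Hypothetical sketch; not taken from the nagios-stat source.
OK, WARNING, CRITICAL = 0, 1, 2  # exit codes listed under EXIT CODES


def count_matching(process_names, prefix):
    """Count processes whose name begins with prefix, as -n is described."""
    return sum(1 for name in process_names if name.startswith(prefix))


def evaluate(value, warning, critical, reverse=False):
    """Map a measured value to an exit code.

    Assumption: comparisons are strict. By default a value above the
    threshold triggers it; with reverse=True (the -l/--lt behavior)
    a value below the threshold triggers it.
    """
    if reverse:
        if value < critical:
            return CRITICAL
        if value < warning:
            return WARNING
    else:
        if value > critical:
            return CRITICAL
        if value > warning:
            return WARNING
    return OK


# Made-up process table for illustration.
processes = ["oracle", "orablob", "orafreep -n", "/usr/sbin/cron", "sshd"]
ora_count = count_matching(processes, "ora")
cron_count = count_matching(processes, "/usr/sbin/cron")

# With the -l defaults (warning: -1; critical: 1), one running cron is OK
# and zero running crons is critical.
cron_status = evaluate(cron_count, warning=-1, critical=1, reverse=True)
```

Under these assumptions, the -l defaults of -1 and 1 mean that a missing process is critical and that the warning level is never reached, which matches the cron example in EXAMPLES.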

EXAMPLES

Check to see if /dev/hda is under 60%/80% full on server.domain.net:

nagios-stat -d /dev/hda -w 60 -c 80 disk server.domain.net

Check to see if all mounts except /cdrom and /tmp are under 75%/85% full:

nagios-stat -M /cdrom,/tmp -w 75 -c 85 disk server.domain.net

Check to see if the load average is below 2.5/10:

nagios-stat -w 2.5 -c 10 load server.domain.net

Check and warn if load is above 1 (only):

nagios-stat -w 1 -c 10000 load server.domain.net

Check to see if there are more than 5/10 zombie processes:

nagios-stat -w 5 -c 10 -s Z proc server.domain.net

Check and critical if more than 20 Z or N or W processes:

nagios-stat -w 10000 -c 20 -s NWZ proc server.domain.net

Check to see if cron is running - critical if it isn't:

nagios-stat -l -n '/usr/sbin/cron' proc server.domain.net

Check 'oracle' processes running, critical if less than 3, warn less than 5:

nagios-stat -l -n 'oracle' -c 3 -w 5 proc server.domain.net

Check if swap utilization is above 50%/75%:

nagios-stat -w 50 -c 75 swap server.domain.net

Check to see if there are more than 250/500 users connected:

nagios-stat -w 250 -c 500 user server.domain.net

Check and critical if server is running nagios-statd 3.05 or lower:

nagios-stat -w 10000 -c 3.05 version server.domain.net

EXIT CODES

To comply with Nagios specifications, the following exit codes are used:
0 : OK
1 : Warning
2 : Critical
3 : Invalid command line options, unable to check status (Unknown)
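
When nagios-stat is called from a script rather than from Nagios itself, the exit codes above can be translated back into status names. The following is a minimal sketch, assuming Python 3.7 or later for the wrapper script and that nagios-stat is on the PATH. The names status_name and run_check are hypothetical, and treating any unlisted exit code as Unknown is an assumption, not documented behavior.

```python
import subprocess

# Exit codes as listed in this section.
STATUS_NAMES = {0: "OK", 1: "Warning", 2: "Critical", 3: "Unknown"}


def status_name(exit_code):
    """Translate an exit code into a status name.

    Assumption: any code not listed above is reported as Unknown.
    """
    return STATUS_NAMES.get(exit_code, "Unknown")


def run_check(args, executable="nagios-stat"):
    """Run the client and return (status name, first line of its output)."""
    result = subprocess.run(
        [executable] + list(args),
        capture_output=True,
        text=True,
    )
    lines = result.stdout.splitlines()
    return status_name(result.returncode), (lines[0] if lines else "")
```

For example, run_check(["-w", "50", "-c", "75", "swap", "server.domain.net"]) would run the swap example from EXAMPLES and return the status name along with the first line the client printed.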

BUGS

There is a general lack of feedback for the more obscure platforms. As such, the program's behavior on them might not always be particularly deterministic. Feedback is always welcome.

Red Hat Linux contains Python 1.x as /usr/bin/python. This program requires Python 2.x to function. Either change the shebang at the top of the program to point to /usr/bin/python2, or change /usr/bin/python to be Python 2.x.

AUTHOR

April King http://www.twoevils.org
E-mail: april at twoevils dot org

COPYRIGHT

Copyright (C) 2002-2005 April King.

This is free software. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. This program is licensed under the BSD license. More information is available in the LICENSE file included with this program.

Nagios is a trademark of Ethan Galstad.