xymond(8) Master network daemon for a Xymon server

SYNOPSIS

xymond [options]

DESCRIPTION

xymond is the core daemon in the Xymon Monitor. It is designed to handle monitoring of a large number of hosts, with a strong focus on being a high-speed, low-overhead implementation of a Big Brother compatible server.

To achieve this, xymond stores all information about the state of the monitored systems in memory, instead of storing it in the host filesystem. A number of plug-ins can be enabled to enhance the basic operation; e.g. a set of plugins are provided to implement persistent storage in a way that is compatible with the Big Brother daemon. However, even with these plugins enabled, xymond still performs much faster than the standard bbd daemon.

xymond is normally started and controlled by the xymonlaunch(8) tool, and the command used to invoke xymond should therefore be in the tasks.cfg file.

OPTIONS

--hosts=FILENAME
Specifies the path to the Xymon hosts.cfg file. This is used to check if incoming status messages refer to known hosts; depending on the "--ghosts" option, messages for unknown hosts may be dropped. If this option is omitted, the default path used is set by the HOSTSCFG environment variable.

--checkpoint-file=FILENAME
With regular intervals, xymond will dump all of its internal state to this check-point file. It is also dumped when xymond terminates, or when it receives a SIGUSR1 signal.

--checkpoint-interval=N
Specifies the interval (in seconds) between dumps to the check-point file. The default is 900 seconds (15 minutes).

--restart=FILENAME
Specifies an existing file containing a previously generated xymond checkpoint. When starting up, xymond will restore its internal state from the information in this file. You can use the same filename for "--checkpoint-file" and "--restart".

--ghosts={allow|drop|log|match}
How to handle status messages from unknown hosts. The "allow" setting accepts all status messages, regardless of whether the host is known in the hosts.cfg file or not. "drop" silently ignores reports from unknown hosts. "log" works like drop, but logs the event in the xymond output file. "match" will try to match the name of the unknown host reporting with the known names by ignoring any domain-names - if a match is found, then a temporary client alias is automatically generated. The default is "log".

--no-purple
Prevent status messages from going purple when they are no longer valid. Unlike the standard bbd daemon, purple-handling is done by xymond.

--merge-clientconfig
The client-local.cfg(5) file contains client-configuration which can be found matching a client against its hostname, its classname, or the name of the OS the client is running. By default xymond will return one entry from the file to the client, looking for a hostname, classname or OS match (in that order). This option causes xymond to merge all matching entries together into one and return all of it to the client.

--listen=IP[:PORT]
Specifies the IP-address and port where xymond will listen for incoming connections. By default, xymond listens on IP 0.0.0.0 (i.e. all IP- addresses available on the host) and port 1984.

--lqueue=NUMBER
Specifies the listen-queue for incoming connections. You don't need to tune this unless you have a very busy xymond daemon.

--no-bfq
Tells xymond to NOT use the local messagequeue interface for receiving status- updates from xymond_client and xymonnet.

--daemon
xymond is normally started by xymonlaunch(8). If you do not want to use xymonlaunch, you can start xymond with this option; it will then detach from the terminal and continue running as a background task.

--timeout=N
Set the timeout used for incoming connections. If a status has not been received more than N seconds after the connection was accepted, then the connection is dropped and any status message is discarded. Default: 10 seconds.

--flap-count=N
Track the N latest status-changes for flap-detection. See the --flap-seconds option also. To disable flap-checks globally, set N to zero. To disable for a specific host, you must use the "noflap" option in hosts.cfg(5). Default: 5

--flap-seconds=N
If a status changes more than flap-count times in N seconds or less, then it is considered to be flapping. In that case, the status is locked at the most severe level until the flapping stops. The history information is not updated after the flapping is detected. NOTE: If this is set higher than the default value, you should also use the --flap-count option to ensure that enough status-changes are stored for flap detection to work. The flap-count setting should be at least (N/300)-1, e.g. if you set flap-seconds to 3600 (1 hour), then flap-count should be at least (3600/300)-1, i.e. 11. Default: 1800 seconds (30 minutes).

--delay-red=N
--delay-yellow=N
Sets the delay before a red/yellow status causes a change in the web page display. Is usually controlled on a per-host basis via the delayred and delayyellow settings in hosts.cfg(5) but these options allow you to set a default value for the delays. The value N is in minutes. Default: 0 minutes (no delay). Note: Since most tests only execute once every 5 minutes, it will usually not make sense to set N to anything but a multiple of 5.

--env=FILENAME
Loads the content of FILENAME as environment settings before starting xymond. This is mostly used when running as a stand-alone daemon; if xymond is started by xymonlaunch, the environment settings are controlled by the xymonlaunch tasks.cfg file.

--pidfile=FILENAME
xymond writes the process-ID it is running with to this file. This is for use in automated startup scripts. The default file is $XYMONSERVERLOGS/xymond.pid.

--log=FILENAME
Redirect all output from xymond to FILENAME.

--store-clientlogs[=[!]COLUMN]
Determines which status columns can cause a client message to be broadcast to the CLICHG channel. By default, no client messages are pushed to the CLICHG channel. If this option is specified with no parameter list, all status columns that go into an alert state will trigger the client data to be sent to the CLICHG channel. If a parameter list is added to this option, only those status columns listed in the list will cause the client data to be sent to the CLICHG channel. Several column names can be listed, separated by commas. If all columns are given as "!COLUMNNAME", then all status columns except those listed will cause the client data to be sent.

--status-senders=IP[/MASK][,IP/MASK]
Controls which hosts may send "status", "combo", "config" and "query" commands to xymond.

By default, any host can send status-updates. If this option is used, then status-updates are accepted only if they are sent by one of the IP-addresses listed here, or if they are sent from the IP-address of the host that the updates pertains to (this is to allow Xymon clients to send in their own status updates, without having to list all clients here). So typically you will need to list your servers running network tests here.

The format of this option is a list of IP-addresses, optionally with a network mask in the form of the number of bits. E.g. if you want to accept status-updates from the host 172.16.10.2, you would use

    --status-senders=172.16.10.2
whereas if you want to accept status updates from both 172.16.10.2 and from all of the hosts on the 10.0.2.* network (a 24-bit IP network), you would use

    --status-senders=172.16.10.2,10.0.2.0/24

--maint-senders=IP[/MASK][,IP/MASK]
Controls which hosts may send maintenance commands to xymond. Maintenance commands are the "enable", "disable", "ack" and "notes" commands. Format of this option is as for the --status-senders option. It is strongly recommended that you use this to restrict access to these commands, so that monitoring of a host cannot be disabled by a rogue user - e.g. to hide a system compromise from the monitoring system.

Note: If messages are sent through a proxy, the IP-address restrictions are of little use, since the messages will appear to originate from the proxy server address. It is therefore strongly recommended that you do NOT include the address of a server running xymonproxy in the list of allowed addresses.

--www-senders=IP[/MASK][,IP/MASK]
Controls which hosts may send commands to retrieve the state of xymond. These are the "xymondlog", "xymondboard" and "xymondxboard" commands, which are used by xymongen(1) and combostatus(1) to retrieve the state of the Xymon system so they can generate the Xymon webpages.

Note: If messages are sent through a proxy, the IP-address restrictions are of little use, since the messages will appear to originate from the proxy server address. It is therefore strongly recommended that you do NOT include the address of a server running xymonproxy in the list of allowed addresses.

--admin-senders=IP[/MASK][,IP/MASK]
Controls which hosts may send administrative commands to xymond. These commands are the "drop" and "rename" commands. Access to these should be restricted, since they provide an un-authenticated means of completely disabling monitoring of a host, and can be used to remove all traces of e.g. a system compromise from the Xymon monitor.

Note: If messages are sent through a proxy, the IP-address restrictions are of little use, since the messages will appear to originate from the proxy server address. It is therefore strongly recommended that you do NOT include the address of a server running xymonproxy in the list of allowed addresses.

--no-download
Disable the "download" command which can be used by clients to pull files from the Xymon server. The use of these may be seen as a security risk since they allow file downloads.

--ack-each-color
By default, sending an ACK for a yellow status stops alerts from being sent while the status remains yellow or red. A status change from yellow to red will not re-enable alerts - the ACK covers all non-green statuses. With this option, an ACK is valid only for the color of the status when the ACK was sent. So an ACK for a yellow status is ignored if the status later changes to red, but an ACK for a red status covers both yellow and red.
Note: An ACK for a red status will clear any existing yellow acks. This means that a long-lived ack for yellow is lost when you send a short-lived ack for red. Hence alerts will restart when the red ack expires, even if the status by then has changed to yellow.

--ack-log=FILENAME
Log acknowledgements created on the Critical Systems page to FILENAME. NB, acknowledgements created by the Acknowledge Alert CGI are automatically written to acknowledge.log in the Xymon server log directory. Alerts from the Critical Systems page can be directed to the same log.

--debug
Enable debugging output.

--dbghost=HOSTNAME
For troubleshooting problems with a specific host, it may be useful to track the exact communications from a single host. This option causes xymond to dump all traffic from a single host to the file "/tmp/xymond.dbg".

HOW ALERTS TRIGGER

When a status arrives, xymond matches the old and new color against the "alert" colors (from the "ALERTCOLORS" setting) and the "OK" colors (from the "OKCOLORS" setting). The old and new color falls into one of three categories:

OK: The color is one of the "OK" colors (e.g. "green").

ALERT: The color is one of the "alert" colors (e.g. "red").

UNDECIDED: The color is neither an "alert" color nor an "OK" color (e.g. "yellow").

If the new status shows an ALERT state, then a message to the xymond_alert(8) module is triggered. This may be a repeat of a previous alert, but xymond_alert(8) will handle that internally, and only send alert messages with the interval configured in alerts.cfg(5).

If the status goes from a not-OK state (ALERT or UNDECIDED) to OK, and there is a record of having been in a ALERT state previously, then a recovery message is triggered.

The use of the OK, ALERT and UNDECIDED states make it possible to avoid being flooded with alerts when a status flip-flops between e.g yellow and red, or green and yellow.

CHANNELS

A lot of functionality in the Xymon server is delegated to "worker modules" that are fed various events from xymond via a "channel". Programs access a channel using IPC mechanisms - specifically, shared memory and semaphores - or by using an instance of the xymond_channel(8) intermediate program. xymond_channel enables access to a channel via a simple file I/O interface.

A skeleton program for hooking into a xymond channel is provided as part of Xymon in the xymond_sample(8) program.

The following channels are provided by xymond:

status This channel is fed the contents of all incoming "status" and "summary" messages.

stachg This channel is fed information about tests that change status, i.e. the color of the status-log changes.

page This channel is fed information about tests where the color changes between an alert color and a non-alert color. It also receives information about "ack" messages.

data This channel is fed information about all "data" messages.

notes This channel is fed information about all "notes" messages.

enadis This channel is fed information about hosts or tests that are being disabled or enabled.

client This channel is fed the contents of the client messages sent by Xymon clients installed on the monitored servers.

clichg This channel is fed the contents of a host client messages, whenever a status for that host goes red, yellow or purple.

Information about the data stream passed on these channels is in the Xymon source-tree, see the "xymond/new-daemon.txt" file.

SIGNALS

SIGHUP
Re-read the hosts.cfg configuration file.

SIGUSR1
Force an immediate dump of the checkpoint file.

BUGS

Timeout of incoming connections are not strictly enforced. The check for a timeout only triggers during the normal network handling loop, so a connection that should timeout after N seconds may persist until some activity happens on another (unrelated) connection.

FILES

If ghost-handling is enabled via the "--ghosts" option, the hosts.cfg file is read to determine the names of all known hosts.