From collectd Wiki
Revision as of 15:43, 17 May 2011 by Tokkee (talk | contribs) (initial version; proposed new query and output configuration)

This is a small tool to interact with the monitoring suite Nagios. It reads values from collectd using the unixsock plugin and compares them with the specified ranges. collectd-nagios then terminates with an error code according to the Nagios plugin development guidelines.
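As a rough illustration of the range check described above, the following Python sketch maps a single value to one of the exit codes defined by the Nagios plugin development guidelines (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and the representation of ranges as `(low, high)` tuples are assumptions made for this example; the actual tool is more involved.

```python
# Exit codes defined by the Nagios plugin development guidelines.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def in_range(value, low, high):
    """Return True if value lies within [low, high]; None means unbounded."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def check(value, warn, crit):
    """Map one value to an exit code.

    warn and crit are hypothetical (low, high) tuples describing the
    acceptable ranges; a value outside a range triggers that status.
    """
    if value is None:
        # No value could be read from collectd.
        return UNKNOWN
    if not in_range(value, *crit):
        return CRITICAL
    if not in_range(value, *warn):
        return WARNING
    return OK
```

For example, `check(1.5, (None, 1.0), (None, 2.0))` yields `WARNING`, because the value leaves the warning range but stays inside the critical one.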

The following includes a few suggested improvements open for discussion:

Querying values

As of now (version 5.0), only a single dataset (possibly composed of multiple data-sources) may be queried from collectd. This is fairly inflexible. For example, it does not allow checking the percentage of used space on a disk, because collectd 5.0 and later use multiple datasets for this information.

The idea is to support a more flexible syntax when specifying a dataset using the -n (value spec) option:

  • mark -d (data-source) as deprecated and, optionally, append the data-source to the value spec (-n option) separated by another slash, e.g. load/load/midterm
  • support basic arithmetic operations, e.g. df-root/df_complex-used / (df-root/df_complex-used + df-root/df_complex-free); note: backward compatibility is preserved, as specifying a single dataset still works just like before
  • -d is forbidden if -n does not specify a single dataset
  • support simple functions like MIN, MAX, ...
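The arithmetic part of this proposal could be prototyped roughly as follows. This Python sketch is not the proposed implementation; it makes several assumptions: operators must be surrounded by whitespace (since `-` and `/` also occur inside identifiers), every identifier resolves to a single number through a caller-supplied `lookup` function, and functions such as MIN and MAX are left out.

```python
import re

OPERATORS = ("+", "-", "*", "/")


def tokenize(spec):
    """Split a value spec into parentheses and whitespace-separated tokens."""
    return re.findall(r"[()]|[^\s()]+", spec)


def evaluate(spec, lookup):
    """Evaluate a value spec; lookup maps an identifier to a number.

    Assumption: operators are standalone tokens, so 'df-root/df_complex-used'
    stays one identifier while ' / ' is a division.
    """
    tokens = tokenize(spec)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        if peek() is None:
            raise ValueError("unexpected end of value spec")
        tok = take()
        if tok == "(":
            value = expr()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return value
        if tok == ")" or tok in OPERATORS:
            raise ValueError("unexpected token: " + tok)
        try:
            return float(tok)      # numeric literal
        except ValueError:
            return lookup(tok)     # identifier such as 'load/load/midterm'

    def term():
        value = atom()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= atom()
            else:
                value /= atom()
        return value

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result
```

With a hypothetical lookup table such as `{"df-root/df_complex-used": 30.0, "df-root/df_complex-free": 70.0}`, the expression from the list above evaluates to the used fraction of the disk. A spec that names only a single dataset is passed to `lookup` unchanged, which mirrors the backward compatibility mentioned above.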

Output formatting

As of now (version 5.0), the output of collectd-nagios is hard-wired into the tool, e.g. OKAY: 0 critical, 0 warning, 3 okay. In many cases, more information (to be displayed in the Nagios frontend) might be desirable, especially when querying multiple values.

The idea is to make the output configurable through a configuration file by specifying a format string with placeholders. The syntax might look like the following:

 <Service "disk_percent">
   Output "DISK {status} - free space {plugin_instance}: {value:df-{plugin_instance}/df_complex-free} ({value}%)"
 </Service>
 <Host "hostname">
   # settings applying to a specific host only
   <Service "foo">
     # ...
   </Service>
 </Host>

The service name may then be specified using a newly added command line option.
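To get a feel for how such a format string could be expanded, here is a small Python sketch. It relies on the fact that Python's `str.format` allows one level of nested replacement fields inside a format spec, so `{value:df-{plugin_instance}/df_complex-free}` first has `{plugin_instance}` filled in and then hands the resulting identifier to the `value` object. The class and function names are made up for this example, and the reading of a bare `{value}` as the result of the service's own query is an assumption, since the proposal does not spell it out.

```python
class ValueLookup:
    """Hypothetical helper: formats itself as the value named by the spec."""

    def __init__(self, values, result):
        self._values = values    # identifier -> number, as read from collectd
        self._result = result    # assumed meaning of a bare {value}

    def __format__(self, spec):
        # An empty spec means a bare {value}; otherwise the spec is an
        # identifier such as 'df-root/df_complex-free'.
        number = self._values[spec] if spec else self._result
        return f"{number:g}"


def render(template, status, plugin_instance, values, result):
    """Expand the placeholders of an Output format string."""
    return template.format(
        status=status,
        plugin_instance=plugin_instance,
        value=ValueLookup(values, result),
    )


template = ("DISK {status} - free space {plugin_instance}: "
            "{value:df-{plugin_instance}/df_complex-free} ({value}%)")
line = render(template, "OKAY", "root",
              {"df-root/df_complex-free": 1234.0}, 42.5)
print(line)  # DISK OKAY - free space root: 1234 (42.5%)
```

A real implementation in collectd-nagios would need its own placeholder parser; the sketch only shows that the nested syntax can be resolved in two passes, inner placeholder first.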