You can find the answers to these questions on the FAQ page.
  • It doesn't work. Where can I find diagnostic output?
    Version 3.* writes warnings and error messages using the syslog(3) facility. Depending on your system, the syslog daemon writes these messages to files and/or sends them to another host. On most GNU/Linux distributions the place to look is either /var/log/syslog or /var/log/messages.
    Versions 4.0 and later come with the logfile and syslog plugins, which can be used to write status messages to a file or send them to the syslog daemon.
  • Some lines of the config seem to be ignored..?
    Yes, that's a known bug. You probably have one or more white spaces at the end of the lines being ignored.
    This is a bug in the library used by collectd 3.* to parse the configfile. Versions 4.0 and later use a different library and don't have this problem.
  • Can I adjust the interval in which data is collected?
    Yes, since version 3.9.0 this can be set at compile time. Keep in mind, though, that this will change the layout of the generated RRD files. Also, clients and servers should have the same setting here to avoid interesting results.
    Version 4.0 allows this setting to be adjusted in the configfile.
  • I try to use the ping-plugin, but keep getting the message "`ping_host_add' failed.". What's the matter?
    In order to generate ICMP packets one needs to open a so-called "RAW socket". On most UNIX systems only the superuser (root) may open such sockets.
    In addition, some virtualization environments, such as VServer and Solaris Zones, have been reported to cause trouble.
  • Who receives the multicast traffic?
    I don't know. That depends entirely on your network setup. By default collectd uses "site local" addresses, which should not be routed outside your AS. Whether that's really the case is up to you.
  • What does "Invalid value for config option `Mode': `Local'" mean?
    It means that the mode "Local" is not available. Most likely the "librrd" library wasn't found. If you want to write to RRD files, install "librrd" or, if you already did that, use the --with-rrdtool option of the ./configure script to point it in the right direction.
  • How do I use --with-rrdtool?
    If you installed libraries in a non-standard (or non-system) path you need to specify them when running the configure script. Otherwise it will not find them and build the binaries without linking against the library.
    You need to set the PATH as given to the --prefix option when compiling the library. The script actually looks for the two subdirectories PATH/include and PATH/lib, so check for their existence if things don't work. If, for example, you installed RRDTool in /opt/rrdtool-x.y.z you need to run configure like this:
    $ ./configure --with-rrdtool=/opt/rrdtool-x.y.z
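    If configure still does not pick up the library, the check for the two subdirectories can be automated. The following is a minimal Python sketch; the function name missing_prefix_dirs is made up for this example and is not part of collectd or its build system.

```python
import os


def missing_prefix_dirs(prefix):
    """Return which of the expected subdirectories are absent under prefix.

    The FAQ states that configure looks for PREFIX/include and PREFIX/lib,
    so those are the only two names checked here.
    """
    return [name for name in ("include", "lib")
            if not os.path.isdir(os.path.join(prefix, name))]
```

    An empty list means both directories exist; anything else names what is missing below the path you intend to pass to --with-rrdtool.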
  • The apache-plugin reports the following error: apache: curl_easy_perform failed: Failed writing body. What's wrong?
    The response received was too big and didn't fit into the buffer. Check the URL option in the configfile. In particular, check that the URL ends in "?auto": collectd requires the machine-readable output generated by the Apache module mod_status and will not work with anything else.
  • What do the version numbers mean?
    The version numbers consist of three numbers: The major- and minor-number and the patchlevel.
    • Versions with different major-numbers are basically not compatible. This means that the definitions of RRD files or config options have been changed or, in general, that the user has to do something in addition to installing the new version. This is not nice and is avoided when possible, but sometimes necessary to prevent old mistakes from becoming ancient mistakes. We try to provide migration scripts, though, to make a switch as easy as possible. See the v3 to v4 migration guide for details.
    • Versions with differing minor-numbers are backwards compatible, i. e. you can replace the lower version with the higher one and everything should still work. This means that features are added, but not removed or changed, and that the default behavior does not change.
    • Versions with different patchlevels are both forward- and backwards-compatible, because no new features have been introduced. The only difference between two such versions is one or more bugfixes, so you should generally install the higher version of the two.
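    The three rules can be written down as a small decision function. This is only an illustrative Python sketch; the function names and the returned labels are invented for this example and do not come from collectd.

```python
def parse_version(text):
    """Split 'major.minor.patchlevel' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def upgrade_kind(installed, candidate):
    """Classify the step from one version to another using the rules above."""
    old, new = parse_version(installed), parse_version(candidate)
    if old[0] != new[0]:
        # Different major number: files or options changed, migration needed.
        return "incompatible"
    if old[1] != new[1]:
        # Only replacing the lower minor version with the higher one is safe.
        return "backwards compatible" if new[1] > old[1] else "features may be missing"
    if old[2] != new[2]:
        # Same feature set, only bugfixes differ.
        return "bugfix only"
    return "same version"
```

    For instance, upgrade_kind("4.0.0", "4.1.0") falls into the second rule, while a step from any 3.* release to 4.0.0 falls into the first.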
  • I enabled the foo plugin using --enable-foo but now the build process fails. What's wrong?
    Since version 4.0.0 a server process doesn't need to load the plugins from which data should be received, in contrast to versions 3.*. This means that plugins with unmet dependencies no longer have any purpose. So we moved dependency checking into the configure script, starting with version 4.1.0. I. e. the configure script now automatically disables all plugins with unmet dependencies and enables all plugins whose dependencies are met.
    So, if a plugin is displayed as disabled, its dependencies are not met. The normal way to get a plugin compiled is to install the missing dependencies and re-run the configure script.
    You can force it to be built using --enable-foo, but you need to know exactly what you are doing. If you do this you're out in the dark, cold woods and totally on your own!
  • The build process fails with "relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC". What's wrong?
    Many plugins have to be linked against libraries. A few of them (currently iptables, netlink and nut are known to be affected) link against libraries that are only available as "static libraries" in many distributions. Most distributions (e. g. Debian and SuSE GNU/Linux) do not compile static libraries with the "-fPIC" option. Thus they cannot be linked with shared objects compiled with "-fPIC". Some architectures (among them i386) do not seem to care about that and handle it in some (probably magic) way. However, other architectures (mostly 64bit like amd64 or hppa) cannot handle that and thus the compiler aborts with the error message mentioned above.
    To fix this issue, you need a version of the static library compiled with "-fPIC" (or a shared library). Ask your distributor to provide a suitable version of the library or compile it yourself.
    For more detailed information please refer to:
  • Solaris support is broken! The build aborts! Help!
    There are two known issues with Solaris, but both can be fixed relatively easily:
    If you build a 32bit binary, the configure script will (try to) enable LFS. This will result in an error which looks something like this:
    config.h:832:1: error: "_FILE_OFFSET_BITS" redefined
    Also, the swap plugin has some problems of its own with this:
    swap.c:197: warning: implicit declaration of function 'swapctl'
    swap.c:197: error: 'SC_AINFO' undeclared (first use in this function)
    The solution is to build a 64bit binary! If you build a 64bit binary, LFS is not needed and the swap plugin works as intended. To do this, pass the -m64 flag to the compiler (assuming you're using the Sun C compiler).
    Another problem is that by default Sun defines a version of getgrnam_r that isn't POSIX-compatible. To enable POSIX compatibility, pass the _POSIX_PTHREAD_SEMANTICS define to the compiler.
    Putting it all together, you need to pass the following flags to the configure script:
    # Sun CC
    $ ./configure CFLAGS="-m64 -mt -D_POSIX_PTHREAD_SEMANTICS"
    Please note that we only test the Sun C compiler ourselves, but GCC may work, too. When using GCC you need to substitute the -pthreads flag for the -mt flag. So if you use GCC the above invocation of ./configure becomes:
    # GCC
    $ ./configure CFLAGS="-m64 -pthreads -D_POSIX_PTHREAD_SEMANTICS"
    Thanks to Christophe Kalt for sharing his insights :)
  • Why is the CPU usage split up in so many files? Can I change that?
    The short answer is: That is because otherwise backwards compatibility would be impossible and you would have to re-create your files from scratch regularly. And, "no".
    The long answer and explanation of the short answer is: collectd runs on a variety of operating systems. Each operating system has its own method for accounting CPU states, memory consumption, swap usage, and so on. If all these data sources were in one data set, every newly supported operating system or any addition to an already supported operating system would mean that we need to modify the data set. This cannot be done without breaking backwards compatibility.
    To give you a few examples: Sometime in mid-2.6 the Linux kernel added some Xen-patches which provided a new CPU state: "steal time". When adding support for BSD systems we had to add "wired" memory. NFSv4 added some new procedures that NFSv3 didn't have, etc pp.
    Interface traffic having two data sources is a different matter, because every operating system will account received and transmitted bytes. Likewise for the system load: The 1, 5, and 15 minute averages have been like that for ages and it's very unlikely that any weird UNIX does this differently.
    Changing the layout of the data is not just a matter of changing the types.db file. That file describes the layout of the data submitted by plugins. The plugins don't need it - they know what data they submit. It's needed by the daemon and the writing plugins to know how to store the data. If you mess with the file without knowing what you are doing, you will most likely end up with the data not being collected at all anymore.
  • Why doesn't collection.cgi draw foo graphs correctly?
    That script is meant as a starting point for your own development, not as a ready-to-use web frontend for RRD files written by collectd.
    It is just an example, because it's not really usable as it is. And it's not really usable because we are UNIX developers and don't enjoy doing web stuff much. Working on the daemon is just so much more fun.. ;) So in the best of free / open source traditions: Patches welcome!
    There are alternatives, though. We've heard from various people using Cacti to render the graphs. Sergiusz Pawlowicz of the BBC has written CollectGraph, a macro for the MoinMoin wiki. And of course there's drraw.

Manpage collectd-exec(5)


NAME

collectd-exec - Documentation of collectd's exec plugin


SYNOPSIS

  # See collectd.conf(5)
  LoadPlugin exec
  # ...
  <Plugin exec>
    Exec "myuser:mygroup" "myprog"
    Exec "otheruser" "/path/to/another/binary" "arg0" "arg1"
    NotificationExec "user" "/usr/lib/collectd/exec/handle_notification"
  </Plugin>


DESCRIPTION

The exec plugin forks off an executable either to receive values or to dispatch notifications to the outside world. The syntax of the configuration is explained in collectd.conf(5) but summarized in the above synopsis.

If you want/need better performance or more functionality you should take a long look at the perl plugin, collectd-perl(5).


EXECUTABLE TYPES

There are currently two types of executables that can be executed by the exec plugin:

Exec

These programs are forked and the values that they write to STDOUT are read back. The executable is forked in a fashion similar to init: It is forked once and not again until it exits. If it has exited, it will be forked again after at most Interval seconds. It is perfectly legal for the executable to run for a long time and continuously write values to STDOUT.

See EXEC DATA FORMAT below for a description of the output format expected from these programs.

Warning: If the executable only writes one value and then exits, it will be executed every Interval seconds. If Interval is short (the default is 10 seconds) this may result in serious system load.
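
To avoid being re-forked every Interval seconds, a program can stay alive and write one value per interval. The following Python sketch shows that shape under a few assumptions: the function name emit_forever, the read_value callable and the cycles parameter are made up for this example, and the line format is the PUTVAL format described under EXEC DATA FORMAT below.

```python
import sys
import time


def emit_forever(read_value, identifier, interval=10, cycles=None,
                 out=sys.stdout, sleep=time.sleep):
    """Write one PUTVAL line per interval instead of exiting after each value.

    read_value is a hypothetical callable returning the current value.
    cycles limits the number of iterations; None means run until stopped.
    """
    done = 0
    while cycles is None or done < cycles:
        line = "PUTVAL %s interval=%d %d:%s" % (
            identifier, interval, int(time.time()), read_value())
        out.write(line + "\n")
        # Output to a pipe is buffered by default; flush so the daemon
        # receives the value now rather than when the buffer fills up.
        out.flush()
        done += 1
        sleep(interval)
```

A script would call something like emit_forever(my_reader, "myhost/exec-demo/gauge"), where both the reader and the identifier are placeholders to be replaced with real ones.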

NotificationExec

The program is forked once for each notification that is handled by the daemon. The notification is passed to the program on STDIN in a fashion similar to HTTP-headers. In contrast to programs specified with Exec the execution of this program is not serialized, so that several instances of this program may run at once if multiple notifications are received.

See NOTIFICATION DATA FORMAT below for a description of the data passed to these programs.


EXEC DATA FORMAT

The forked executable is expected to print values to STDOUT. The expected format is as follows:

Comments

Each line beginning with a # (hash mark) is ignored.

PUTVAL Identifier [OptionList] Valuelist

Submits one or more values (identified by Identifier, see below) to the daemon, which will dispatch them to all its write plugins.

An Identifier is of the form host/plugin-instance/type-instance with both instance-parts being optional. If they're omitted the hyphen must be omitted, too. plugin and each instance-part may be chosen freely as long as the tuple (plugin, plugin instance, type instance) uniquely identifies the plugin within collectd. type identifies the type and number of values (i. e. data-set) passed to collectd. A large list of predefined data-sets is available in the types.db file. See types.db(5) for a description of the format of this file.
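
The rule that the hyphen is dropped together with an omitted instance can be sketched in Python as follows. The helper names are invented for this example, and split_identifier additionally assumes that plugin and type names themselves contain no hyphen, which the text above does not guarantee.

```python
def build_identifier(host, plugin, type_, plugin_instance=None, type_instance=None):
    """Assemble host/plugin-instance/type-instance, omitting empty instances."""
    plugin_part = plugin if not plugin_instance else "%s-%s" % (plugin, plugin_instance)
    type_part = type_ if not type_instance else "%s-%s" % (type_, type_instance)
    return "%s/%s/%s" % (host, plugin_part, type_part)


def split_identifier(identifier):
    """Reverse of build_identifier; missing instances come back as None.

    Assumption: the first hyphen in each part separates name and instance.
    """
    host, plugin_part, type_part = identifier.split("/")
    plugin, _, plugin_instance = plugin_part.partition("-")
    type_, _, type_instance = type_part.partition("-")
    return host, plugin, plugin_instance or None, type_, type_instance or None
```

With these helpers, build_identifier("alice", "interface", "if_octets", type_instance="eth0") yields an identifier with no plugin instance and therefore no hyphen in the middle part.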

The OptionList is an optional list of Options, where each option is a key-value pair. A list of currently understood options can be found below; all other options will be ignored.

Valuelist is a colon-separated list of the time and the values, each either an integer if the data-source is a counter, or a double if the data-source is of type ``gauge''. You can submit an undefined gauge-value by using U. When submitting U to a counter the behavior is undefined. The time is given as epoch (i. e. standard UNIX time).

You can mix options and values, but the order is important: Options only affect the values that follow them, so specifying an option as the last field is allowed, but useless. Also, an option applies to all following values, so you don't need to re-set an option over and over again.

The currently defined Options are:

interval=seconds

Gives the interval in which the data identified by Identifier is being collected.

Please note that this is the same format as used in the unixsock plugin, see collectd-unixsock(5). There's also a bit more information on identifiers in case you're confused.

Since examples usually let one understand a lot better, here are some:

  leeloo/cpu-0/cpu-idle N:2299366
  alice/interface/if_octets-eth0 interval=10 1180647081:421465:479194

Since PUTVAL was the only action supported by older versions of the exec plugin, all lines were treated as if they were prefixed with PUTVAL. This is still the case to maintain backwards compatibility, but it is deprecated.
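
Putting the pieces together, a complete PUTVAL line can be assembled as in the Python sketch below. The function names are made up for this example, and representing an undefined value as None on the Python side is a choice of this sketch, not something the plugin prescribes.

```python
def format_valuelist(timestamp, values):
    """Join the time and the values with colons; None is written as U."""
    fields = [str(int(timestamp))]
    for value in values:
        fields.append("U" if value is None else str(value))
    return ":".join(fields)


def format_putval(identifier, timestamp, values, interval=None):
    """Build 'PUTVAL Identifier [interval=seconds] Valuelist'."""
    parts = ["PUTVAL", identifier]
    if interval is not None:
        # Options must precede the values they are meant to apply to.
        parts.append("interval=%d" % interval)
    parts.append(format_valuelist(timestamp, values))
    return " ".join(parts)
```

Called with the data of the second example above, format_putval reproduces that line with the explicit PUTVAL prefix in front.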

PUTNOTIF [OptionList] message=Message

Submits a notification to the daemon which will then dispatch it to all plugins which have registered for receiving notifications.

The PUTNOTIF is followed by a list of options which further describe the notification. The message option is special in that it will consume the rest of the line as its value. The message, severity, and time options are mandatory.

Valid options are:

message=Message (REQUIRED)

Sets the message of the notification. This is the message that will be made accessible to the user, so it should contain some useful information. This option must be the last option because the rest of the line will be its value, even if there are spaces and equal-signs following it! This option is mandatory.

severity=failure|warning|okay (REQUIRED)

Sets the severity of the notification. This option is mandatory.

time=Time (REQUIRED)

Sets the time of the notification. The time is given as ``epoch'', i. e. as seconds since January 1st, 1970, 00:00:00. This option is mandatory.

host=Hostname
plugin=Plugin
plugin_instance=Plugin-Instance
type=Type
type_instance=Type-Instance

These ``associative'' options establish a relation between this notification and collected performance data. This connection is purely informal, i. e. the daemon itself doesn't do anything with this information. However, websites or GUIs may use this information to place notifications near the affected graph or table. All the options are optional, but plugin_instance without plugin or type_instance without type doesn't make much sense and should be avoided.

Please note that this is the same format as used in the unixsock plugin, see collectd-unixsock(5).
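
As a sketch of how the ordering rule plays out, the following Python function builds a PUTNOTIF line with message placed last. The function name and its error handling are assumptions of this example; only the option names and severities are taken from the list above.

```python
ASSOCIATIVE_OPTIONS = ("host", "plugin", "plugin_instance", "type", "type_instance")


def format_putnotif(message, severity, timestamp, **associations):
    """Build a PUTNOTIF line; message goes last because it eats the rest of the line."""
    if severity not in ("failure", "warning", "okay"):
        raise ValueError("unknown severity: %r" % (severity,))
    unknown = set(associations) - set(ASSOCIATIVE_OPTIONS)
    if unknown:
        raise ValueError("unknown options: %s" % ", ".join(sorted(unknown)))
    if "\n" in message:
        # The protocol is line based, so the message must fit on one line.
        raise ValueError("message must not contain a newline")
    parts = ["PUTNOTIF", "severity=%s" % severity, "time=%d" % int(timestamp)]
    for key in ASSOCIATIVE_OPTIONS:
        if key in associations:
            parts.append("%s=%s" % (key, associations[key]))
    parts.append("message=%s" % message)
    return " ".join(parts)
```

Spaces and equal signs inside the message need no escaping, since everything after message= belongs to the message.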

When collectd exits it sends a SIGTERM to all still running child-processes upon which they have to quit.
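
A long-running program should therefore react to SIGTERM. One possible way to do this in Python is sketched below; the function names are invented for this example, and turning the signal into a regular interpreter exit is just one of several reasonable designs.

```python
import signal
import sys


def exit_on_sigterm(signum, frame):
    """Turn SIGTERM into a normal exit so finally blocks and atexit hooks run."""
    sys.exit(0)


def install_sigterm_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, exit_on_sigterm)
```

Raising SystemExit from the handler also interrupts a pending sleep, so the program quits promptly instead of finishing the current interval first.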


NOTIFICATION DATA FORMAT

The notification executables receive values rather than providing them. In fact, after the program is started STDOUT is connected to /dev/null.

The data is passed to the executables over STDIN in a format very similar to HTTP: At first there is a ``header'' with one line per field. Every line consists of a field name, ended by a colon, and the associated value until end-of-line. The ``header'' is ended by two newlines immediately following one another, i. e. an empty line. The rest, basically the ``body'', is the message of the notification.

The following is an example notification passed to a program:

  Severity: FAILURE
  Time: 1200928930
  Host: myhost.mydomain.org
  \n
  This is a test notification to demonstrate the format

The following header fields are currently used. Please note, however, that you should ignore unknown header fields to be as forward-compatible as possible.

Severity

Severity of the notification. May either be FAILURE, WARNING, or OKAY.

Time

The time in epoch, i. e. as seconds since 1970-01-01 00:00:00 UTC.

Host
Plugin
PluginInstance
Type
TypeInstance

Identification of the performance data this notification is associated with. All of these fields are optional because notifications do not need to be associated with a certain value.
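
A NotificationExec program might read this format as in the following Python sketch. The function name parse_notification is made up for this example; unknown fields are simply kept in the dictionary and can be ignored by the caller.

```python
def parse_notification(text):
    """Split a notification into a dict of header fields and the message body."""
    # The header ends at the first empty line; everything after it is the body.
    header_text, _, body = text.partition("\n\n")
    fields = {}
    for line in header_text.splitlines():
        name, separator, value = line.partition(":")
        if not separator:
            continue  # Not a "Name: value" line; skip it.
        fields[name.strip()] = value.strip()
    return fields, body.rstrip("\n")
```

A program would typically call it as fields, message = parse_notification(sys.stdin.read()) and then look up only the fields it knows, for example fields.get("Severity").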


USING NAGIOS PLUGINS

Though the interface is far from perfect, there are tons of plugins for Nagios. You can use these plugins with collectd by using a simple transition layer, exec-nagios.px, which is shipped with the collectd distribution in the contrib/ directory. It is a simple Perl script that comes with embedded documentation. To see it, run the following command:

  perldoc exec-nagios.px

This script expects a configuration file, exec-nagios.conf. You can find an example in the contrib/ directory, too.

Even a simple mechanism to submit ``performance data'' to collectd is implemented. If you need a more sophisticated setup, please rewrite the plugin to make use of collectd's more powerful interface.


CAVEATS

The user the binary is executed as must not have root privileges, i. e. it must have a non-zero UID. This is for your own good.


SEE ALSO

collectd(1), collectd.conf(5), collectd-perl(5), collectd-unixsock(5), fork(2), exec(3)


AUTHOR

Florian Forster <octo@verplant.org>