Post 4.4.x roadmap
Life goes on. Which features are planned for future releases? This page is more a list of ideas than a fixed plan, but it may give you an idea of where the journey is going. If you want to contribute but don't know what needs to be done, simply drop us a short mail and we'll get you started.
This file is not automatically generated and may thus be out of date from time to time. To give you an idea how fresh this information is: This file was last modified Wednesday, 25-Jun-2008 15:40:54 CEST.
Planned features
available for testing
Allow flushing of specific values.
RRD files have a known weakness: The
RRAs (round robin archives) are spread out over the entire file. When updating a file,
a new value is written to some of those RRAs. Which ones depends on the number of
PDPs (primary data points) that are combined into one
CDP (consolidated data point). A typical RRD file created by
collectd has four different timespans with three different consolidation
functions, resulting in 12 RRAs. This means that with each update one value is written to three, six, nine
or all 12 RRAs.
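The arithmetic above can be sketched as a small model. The PDP-per-CDP values below are hypothetical round numbers chosen for illustration, not collectd's actual defaults, and the model simplifies RRDtool's behaviour to "an RRA gets a new CDP whenever enough PDPs have accumulated":

```python
# Hypothetical layout: number of PDPs combined into one CDP for each of the
# four timespans. These values are assumptions for illustration only.
PDP_PER_CDP = [1, 10, 100, 1000]

# Three consolidation functions per timespan, giving 4 * 3 = 12 RRAs.
CONSOLIDATION_FUNCTIONS = ["AVERAGE", "MIN", "MAX"]


def rras_written(update_number):
    """Return how many RRAs receive a new CDP on the given update (1-based)."""
    # A timespan is "due" when the update count is a multiple of its step.
    timespans_due = sum(
        1 for steps in PDP_PER_CDP if update_number % steps == 0
    )
    # Every due timespan writes one value per consolidation function.
    return timespans_due * len(CONSOLIDATION_FUNCTIONS)


def total_rras():
    """Return the total number of RRAs in the modelled file."""
    return len(PDP_PER_CDP) * len(CONSOLIDATION_FUNCTIONS)
```

In this model most updates touch only the three highest-resolution RRAs, and the rarer updates touch six, nine, or all 12.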
The problem is that these values are not very big. In almost all cases one or two values are written, which
equals eight or 16 bytes. So the result is that the disk has to seek around a lot just to
write a few bytes here and there. It's next to impossible to write RRD updates to disk faster than
1.5-2 MByte/s, even with fast disks and after a good look at
the "TuningRRD" article in the RRDTool
wiki.
One solution is to do multiple updates at once. Once it has seeked to the right position, the disk hardly cares whether it
writes eight, 16, 64, or 128 bytes: It writes 512 byte blocks anyway. And even if the update crosses
the border of one disk block, the performance penalty is almost negligible: Writing sequentially is what
disks do best. This is precisely the idea behind the
CacheTimeout
setting of the rrdtool-plugin: Do many updates at once and save a lot of
I/O operations.
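The batching idea can be sketched in a few lines. This is a minimal illustration and not the rrdtool-plugin's actual implementation; the class name `UpdateCache` and the `write` callback are hypothetical names introduced here:

```python
class UpdateCache:
    """Queue updates per file and write them out in one batch.

    Hypothetical sketch: `write` is any callable taking a filename and a
    list of (timestamp, value) pairs; it stands in for the real RRD update.
    """

    def __init__(self, cache_timeout, write):
        self.cache_timeout = cache_timeout  # seconds to hold values back
        self.write = write
        self.pending = {}  # filename -> list of (timestamp, value)

    def update(self, filename, timestamp, value):
        """Queue one value; write the batch once the oldest entry is too old."""
        queue = self.pending.setdefault(filename, [])
        queue.append((timestamp, value))
        if timestamp - queue[0][0] >= self.cache_timeout:
            self.flush(filename)

    def flush(self, filename):
        """Write all queued values for one file right now, if there are any."""
        queue = self.pending.pop(filename, [])
        if queue:
            self.write(filename, queue)
```

With a timeout of 120 seconds and a ten second interval, 13 values end up in a single write instead of 13 separate ones.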
There's an obvious drawback: You collect statistics at a very high resolution (such as the default ten seconds),
but then you have to wait a long time until you can finally see the high resolution graphs. This is where
the unixsock-plugin comes into
play: Starting with collectd 4.5 you can tell the daemon to "flush"
one value, i.e. to write all updates for one specific file to disk right now. Integrate
this into your graphing solution and you will end up with the most up-to-date graphs possible and very little
I/O.
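A graphing front end could request such a flush just before drawing a graph. The following is a sketch under assumptions: the `FLUSH` command with `timeout=` and `identifier=` options and the "status number, then message" reply format follow my reading of the collectd-unixsock manual and should be checked against the installed version; the socket path and all function names are hypothetical:

```python
import socket


def build_flush_command(identifier, timeout=None):
    """Build a FLUSH line for one value identifier (host/plugin/type)."""
    parts = ["FLUSH"]
    if timeout is not None:
        parts.append(f"timeout={int(timeout)}")
    parts.append(f"identifier={identifier}")
    return " ".join(parts) + "\n"


def parse_status(line):
    """Split a reply line into (status code, message)."""
    code, _, message = line.strip().partition(" ")
    return int(code), message


def flush_value(identifier, socket_path="/var/run/collectd-unixsock"):
    """Ask the daemon to write all cached updates for one value to disk.

    The default socket path is an assumption; use the SocketFile value
    from your own unixsock-plugin configuration.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(build_flush_command(identifier).encode("ascii"))
        reply = sock.makefile("r").readline()
    code, message = parse_status(reply)
    if code < 0:  # assumed convention: negative status signals an error
        raise RuntimeError(message)
    return message
```

Calling `flush_value("myhost/load/load")` right before rendering the graph for that value keeps the cache large while the graph still shows the latest data.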
Enhance the rrdtool-plugin to
allow type-specific configuration. This would make it possible to keep
interesting data longer and less interesting data only for a short
time. Seasonal data could also be added. It's not reasonable to add
seasonal data to all RRD files, because it operates on unconsolidated
primary data points (thus taking up a lot of disk space) and you can't
specify one seasonal period that fits all kinds of data. A
discussion of this feature can be found in the mailing list's
archive.
Wishlist / Ideas
Support distributed setups. Make it possible to have several "servers" with shared storage. When one server dies, another one takes its place, so that you don't have gaps in your statistics. A solution for non-shared storage is probably even harder to implement and likely not worth the effort.
Embed more scripting languages. Python and Lua come to mind.
Want to discuss or help to implement any of the above? Get in touch with us!
