Difference between revisions of "Binary protocol"
Latest revision as of 15:52, 9 September 2021
The binary protocol is the protocol implemented by the Network plugin and some external implementations to exchange data collected by collectd or to send data to an instance of collectd.
Until the Network plugin has been factored out into a library, this documentation is useful for anyone who needs to reimplement the protocol.
Well-known numbers
- Default UDP port: 25826
- Default IPv4 multicast group: 239.192.74.66
- Default IPv6 multicast group: ff18::efc0:4a42
- Maximum packet size: 1452 bytes (payload only, not including UDP / IP headers). In versions 4.0 through 4.7, the receive buffer had a fixed size of 1024 bytes; when longer packets were received, the trailing data was simply ignored. Since version 4.8, the buffer size can be configured.
Protocol structure
Each packet consists of one or more so-called “parts”. Each part starts with the same four bytes: two bytes that specify the “part type” (what kind of information is enclosed in the part) and two bytes that specify the length of the part, including the four header bytes themselves. Both type and length must be encoded in network byte order (big endian). Because the length field is 16 bits wide, the maximum length of payload in any part is 65535 − 4 = 65531 bytes.
With this layout, clients can determine the length of a part they don't know and skip the unknown data. This makes the protocol forward compatible, so new features can be added easily.
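The skipping behaviour described above can be sketched in a few lines of Python. This is only an illustration, not code taken from collectd: the function name `iter_parts` and the made-up part type `0x7FFF` are assumptions for the example.

```python
import struct


def iter_parts(packet: bytes):
    """Yield (part_type, payload) for every part in a packet.

    Hypothetical helper: it only walks the type/length headers and leaves
    the interpretation of each payload to the caller.
    """
    offset = 0
    # Fewer than four trailing bytes cannot form a header and are ignored.
    while offset + 4 <= len(packet):
        # Type and length are 16 bit unsigned integers in network byte order.
        part_type, length = struct.unpack_from("!HH", packet, offset)
        # The length includes the four header bytes, so it can never be < 4.
        if length < 4 or offset + length > len(packet):
            raise ValueError(f"bad part length {length} at offset {offset}")
        yield part_type, packet[offset + 4:offset + length]
        offset += length


# A host part (type 0x0000) followed by a part of an unknown, made-up type.
packet = (struct.pack("!HH", 0x0000, 8) + b"web\x00"
          + struct.pack("!HH", 0x7FFF, 6) + b"\x01\x02")
parts = list(iter_parts(packet))
```

A receiver built this way can simply drop every tuple whose type it does not recognise and continue with the next part.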
The protocol supports signature verification and encryption. Both are implemented using dedicated “parts”.
Two part layouts are shared by several part types: numeric (an 8 byte integer) and string.
Numeric parts
Numeric integer values, e.g. the interval and time values, are transferred as 8 byte integers. The length field of those parts must therefore always be set to 12.
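As a minimal sketch, a numeric part can be packed and unpacked with Python's `struct` module. The helper names are hypothetical, and the example assumes the 8 byte integer is unsigned, which the text above does not state explicitly.

```python
import struct

TYPE_TIME = 0x0001      # part type IDs as listed in the "Part types" table
TYPE_INTERVAL = 0x0007


def encode_numeric(part_type: int, value: int) -> bytes:
    # 2 bytes type + 2 bytes length + 8 bytes value, all big endian.
    # The length field is always 12 for numeric parts.
    return struct.pack("!HHQ", part_type, 12, value)


def decode_numeric(part: bytes) -> int:
    part_type, length, value = struct.unpack("!HHQ", part)
    if length != 12:
        raise ValueError("numeric parts must have length 12")
    return value


interval_part = encode_numeric(TYPE_INTERVAL, 10)
```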
String parts
Strings are transferred including a null byte at the end. For example, the string “foobar” is six characters long; followed by a null byte and appended to a four byte header, this leads to a length of 11 bytes for the part.
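The “foobar” example can be reproduced with the following sketch. The helper names are hypothetical, the host part type `0x0000` is used only for illustration, and ASCII encoding is an assumption, since the text does not name a character encoding.

```python
import struct


def encode_string(part_type: int, text: str) -> bytes:
    # The payload is the string followed by a single terminating null byte.
    data = text.encode("ascii") + b"\x00"
    # The length field counts the four header bytes plus the payload.
    return struct.pack("!HH", part_type, 4 + len(data)) + data


def decode_string(part: bytes) -> str:
    part_type, length = struct.unpack_from("!HH", part)
    payload = part[4:length]
    if not payload.endswith(b"\x00"):
        raise ValueError("string part is not null terminated")
    return payload[:-1].decode("ascii")


part = encode_string(0x0000, "foobar")
```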
Value parts
Value parts encode an actual data sample. Preceding time, plugin, plugin instance, type, and type instance parts must have set the context for the data sample. The value part consists of:
- Type 0x0006
- Length (16 bit field)
- Number of values (16 bit field)
- Values, for each one:
  - Data type code (8 bit field)
    - COUNTER → 0
    - GAUGE → 1
    - DERIVE → 2
    - ABSOLUTE → 3
  - Value (64 bit field)
    - COUNTER → network (big endian) unsigned integer
    - GAUGE → x86 (little endian) double
    - DERIVE → network (big endian) signed integer
    - ABSOLUTE → network (big endian) unsigned integer
Many data samples have a single value, such as "cpu" (see types.db(5)), but others have multiple values, such as "disk_merged", which has read and write values.
If a part carries several values, all data type codes come first, followed by all values: [type][type][type][value][value][value], not [type][value][type][value][type][value].
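The layout of a value part, including the “types first, then values” ordering and the little endian encoding of gauges, can be sketched as follows. The function `encode_values` and the sample numbers are hypothetical and serve only to illustrate the field order.

```python
import struct

COUNTER, GAUGE, DERIVE, ABSOLUTE = 0, 1, 2, 3

# struct format for the 64 bit value of each data type code.
VALUE_FORMATS = {
    COUNTER: "!Q",   # big endian unsigned integer
    GAUGE: "<d",     # little endian double
    DERIVE: "!q",    # big endian signed integer
    ABSOLUTE: "!Q",  # big endian unsigned integer
}


def encode_values(values) -> bytes:
    """Encode a list of (data_type_code, value) pairs as one value part."""
    count = len(values)
    # 4 header bytes + 2 bytes for the count + (1 + 8) bytes per value.
    length = 4 + 2 + 9 * count
    out = struct.pack("!HHH", 0x0006, length, count)
    # All data type codes first ...
    out += bytes(code for code, _ in values)
    # ... then all values, in the same order.
    for code, value in values:
        out += struct.pack(VALUE_FORMATS[code], value)
    return out


# Two values in one part, e.g. a read and a write value (numbers made up).
two_values = encode_values([(DERIVE, 5), (DERIVE, 7)])
one_gauge = encode_values([(GAUGE, 1.0)])
```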
Signature part
The signature part precedes the signed parts of the packet. Each signed packet should have its own signature part as its first part.
The signature part consists of the following fields:
- Type 0x0200
- Length (16 bit field)
- Hash (32 bytes fixed length string)
- Username (without null byte)
Unlike string parts, the Username value does not end with a null byte.
The Length field specifies the full length of the signature part. Unlike in the encrypted part, it does not include the length of the signed content, so the length of the Username field can be calculated as Length − 36 (2 bytes of Type field + 2 bytes of Length field + 32 bytes of Hash field).
The hash is calculated over the Username field followed by the content of the signed parts.
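A minimal sketch of building and checking a signature part is shown below. It uses HMAC-SHA-256, as named in the part type table. The function names are hypothetical, and the use of a shared password as the HMAC key is an assumption, since the text above does not say which key is used.

```python
import hashlib
import hmac
import struct


def sign_parts(username: str, password: str, signed_parts: bytes) -> bytes:
    user = username.encode("utf-8")
    # Assumption: the shared password is the HMAC key.
    # The hash covers the Username field followed by the signed parts.
    digest = hmac.new(password.encode("utf-8"), user + signed_parts,
                      hashlib.sha256).digest()
    # Length covers only the signature part itself: 4 + 32 + len(username).
    header = struct.pack("!HH", 0x0200, 4 + 32 + len(user))
    return header + digest + user + signed_parts


def verify_packet(packet: bytes, password: str):
    part_type, length = struct.unpack_from("!HH", packet)
    if part_type != 0x0200 or length < 36 or length > len(packet):
        raise ValueError("packet does not start with a signature part")
    digest = packet[4:36]
    user = packet[36:length]          # Username length is Length - 36
    signed = packet[length:]          # everything after the signature part
    expected = hmac.new(password.encode("utf-8"), user + signed,
                        hashlib.sha256).digest()
    return hmac.compare_digest(digest, expected), user.decode("utf-8")


host_part = struct.pack("!HH", 0x0000, 8) + b"web\x00"
packet = sign_parts("alice", "secret", host_part)
```

Comparing the digests with `hmac.compare_digest` rather than `==` avoids leaking timing information to a sender who is guessing signatures.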
Encrypted part
The encrypted part is a container: it holds other parts in encrypted form.
The encrypted part consists of the following fields:
- Type 0x0210
- Length (16 bit field)
- Username length (16 bit field, length of Username field)
- Username (without null byte)
- Init vector (16 bytes fixed length)
- Hash (20 bytes fixed length)
- Encrypted data
Unlike string parts, the Username value does not end with a null byte.
The Length field specifies the full length of the encrypted part, including the length of the Encrypted data field.
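The field layout above can be illustrated with a small parser that splits an encrypted part into its fields. This sketch deliberately stops before decryption, because AES-256 in OFB mode is not available in the Python standard library. The names `EncryptedPart` and `split_encrypted_part` are hypothetical.

```python
import struct
from typing import NamedTuple


class EncryptedPart(NamedTuple):
    username: bytes
    iv: bytes          # 16 byte init vector
    digest: bytes      # 20 byte hash field
    data: bytes        # encrypted data, left untouched here


def split_encrypted_part(part: bytes) -> EncryptedPart:
    part_type, length, user_len = struct.unpack_from("!HHH", part)
    if part_type != 0x0210:
        raise ValueError("not an encrypted part")
    # Type (2) + Length (2) + Username length (2) + Username + IV + Hash.
    fixed = 6 + user_len + 16 + 20
    if length < fixed or length > len(part):
        raise ValueError("inconsistent length fields")
    pos = 6
    username = part[pos:pos + user_len]
    pos += user_len
    iv = part[pos:pos + 16]
    pos += 16
    digest = part[pos:pos + 20]
    pos += 20
    # Length includes the encrypted data, so the data runs up to `length`.
    return EncryptedPart(username, iv, digest, part[pos:length])


# A hand-built part with placeholder IV, hash and data bytes.
body = b"bob" + bytes(16) + bytes(range(20)) + b"xyz"
sample = struct.pack("!HHH", 0x0210, 6 + len(body), 3) + body
fields = split_encrypted_part(sample)
```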
Part types
The following numeric types are currently used to identify the type of a “part”. Defines are available from src/network.h.
ID | Name | Data type | Comment |
---|---|---|---|
0x0000 | Host | String | The name of the host to associate with subsequent data values |
0x0001 | Time | Numeric | The timestamp to associate with subsequent data values, Unix time format (seconds since epoch) |
0x0008 | Time (high resolution) | Numeric | The timestamp to associate with subsequent data values. Time is defined in units of 2^-30 seconds since epoch. New in Version 5.0. |
0x0002 | Plugin | String | The plugin name to associate with subsequent data values, e.g. "cpu" |
0x0003 | Plugin instance | String | The plugin instance name to associate with subsequent data values, e.g. "1" |
0x0004 | Type | String | The type name to associate with subsequent data values, e.g. "cpu" |
0x0005 | Type instance | String | The type instance name to associate with subsequent data values, e.g. "idle" |
0x0006 | Values | other | Data values, see above |
0x0007 | Interval | Numeric | Interval used to set the "step" when creating new RRDs unless rrdtool plugin forces StepSize. Also used to detect values that have timed out. |
0x0009 | Interval (high resolution) | Numeric | The interval in which subsequent data values are collected. The interval is given in units of 2^-30 seconds. New in Version 5.0. |
0x0100 | Message (notifications) | String | |
0x0101 | Severity | Numeric | |
0x0200 | Signature (HMAC-SHA-256) | Signature | |
0x0210 | Encryption (AES-256/OFB/SHA-1) | Encrypted content | |
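The high resolution time and interval types count in units of 2^-30 seconds, so converting from seconds is a multiplication by 2^30. The following sketch uses hypothetical helper names and rounds to the nearest unit, which is a choice made for this example rather than something the table prescribes.

```python
import struct

TYPE_TIME_HR = 0x0008
TYPE_INTERVAL_HR = 0x0009


def to_hires(seconds: float) -> int:
    # One second corresponds to 2**30 high resolution units.
    return round(seconds * 2**30)


def from_hires(units: int) -> float:
    return units / 2**30


def encode_hires(part_type: int, seconds: float) -> bytes:
    # High resolution values use the ordinary numeric layout (length 12).
    return struct.pack("!HHQ", part_type, 12, to_hires(seconds))


interval_part = encode_hires(TYPE_INTERVAL_HR, 10)
```

Note that a Python float cannot represent every 2^-30 step at current epoch magnitudes, so timestamps converted this way are only accurate to roughly microsecond level.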
Implementations
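- Python script (http://people.igalia.com/aperez/files/collectd.py) by Adrian Perez
- collectd Python module (http://packages.python.org/collectd/) by Eli Courtwright
- PacketWriter.java (http://github.com/hyperic/jcollectd/blob/master/src/main/java/org/collectd/protocol/PacketWriter.java) of jcollectd
- Of course the Network plugin (src/network.c) of collectd itself
- Ruby implementation (http://github.com/astro/ruby-collectd/blob/master/lib/collectd/pkt.rb) by Astro
- erlang-collectd, an Erlang implementation by Astro
- Aviogram/CollectD (https://github.com/Aviogram/CollectD), a PHP server implementation
- go-collectd (https://github.com/collectd/go-collectd), a pure Go client and server implementation. See the "network" package (https://godoc.org/collectd.org/network).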
See also
- Network plugin
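- The network protocol analyzer Wireshark (http://www.wireshark.org/) has supported dissecting this binary protocol since version 1.4. The code can be found in epan/dissectors/packet-collectd.c (http://anonsvn.wireshark.org/viewvc/trunk/epan/dissectors/packet-collectd.c).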