Difference between revisions of "Plugin:DCPMM"
From collectd Wiki
(Add DCPMM plugin page) |
(Update DCMPP plugin page with info on metrics) |
||
Line 14: | Line 14: | ||
== Synopsis == | == Synopsis == | ||
+ | {{See|[[Plugin:DCPMM/Config]]}} | ||
<Plugin "dcpmm"> | <Plugin "dcpmm"> | ||
Line 22: | Line 23: | ||
</Plugin> | </Plugin> | ||
− | + | === Parameters === | |
+ | {|class="wikitable" | ||
+ | ! Name | ||
+ | ! Description | ||
+ | ! Comment | ||
+ | |- | ||
+ | | Interval | ||
+ | | The collection interval in seconds at which the metric counts are collected | ||
+ | | Defaults to global Interval value. This will override the global Interval value for dcpmm plugin. None of the other plugins will be affected. | ||
+ | |- | ||
+ | | CollectHealth | ||
+ | | Health information metrics will be collected if set to true | ||
+ | | Default value is false. | ||
+ | |- | ||
+ | | CollectPerfMetrics | ||
+ | | Memory performance metrics will be collected if set to true | ||
+ | | Default value is true. | ||
+ | |- | ||
+ | | EnableDispatchAll | ||
+ | | This parameter helps to seamlessly enable simultaneous health and memory performance metrics collection in future. | ||
+ | | This is unused at the moment and must always be false. | ||
+ | |} | ||
+ | |||
+ | == Metrics == | ||
+ | The DCPMM plugin collects either health metrics or performance metrics; collecting both sets simultaneously is not currently supported. | ||
+ | |||
+ | === Health Information Metrics === | ||
+ | The health information metrics are the following: | ||
+ | |||
+ | {|class="wikitable" | ||
+ | ! Metric | ||
+ | ! Description | ||
+ | |- | ||
+ | | health_status | ||
+ | | <nowiki>Overall health summary (0: normal | 1: non-critical | 2: critical | 3: fatal).</nowiki> | ||
+ | |- | ||
+ | | lifespan_remaining | ||
+ | | The module’s remaining life as a percentage value of factory expected life span. | ||
+ | |- | ||
+ | | lifespan_used | ||
+ | | The module’s used life as a percentage value of factory expected life span. | ||
+ | |- | ||
+ | | power_on_time | ||
+ | | The lifetime the DIMM has been powered on in seconds. | ||
+ | |- | ||
+ | | uptime | ||
+ | | The current uptime of the DIMM for the current power cycle in seconds. | ||
+ | |- | ||
+ | | last_shutdown_time | ||
+ | | The time the system was last shutdown. The time is represented in epoch (seconds). | ||
+ | |- | ||
+ | | media_temperature | ||
+ | | The media’s current temperature in degree Celsius. | ||
+ | |- | ||
+ | | controller_temperature | ||
+ | | The controller’s current temperature in degree Celsius. | ||
+ | |- | ||
+ | | max_media_temperature | ||
+ | | The media’s highest temperature reported, in degrees Celsius. | ||
+ | |- | ||
+ | | max_controller_temperature | ||
+ | | The controller’s highest temperature reported in degree Celsius. | ||
+ | |- | ||
+ | | tsc_cycles | ||
+ | | The number of tsc cycles during each interval. | ||
+ | |- | ||
+ | | epoch | ||
+ | | The timestamp in seconds at which the metrics are collected from DCPMM DIMMs. | ||
+ | |} | ||
+ | |||
+ | === Memory Performance Metrics === | ||
+ | The memory performance metrics are the following: | ||
+ | {|class="wikitable" | ||
+ | ! Metric | ||
+ | ! Description | ||
+ | |- | ||
+ | | total_bytes_read | ||
+ | | Number of bytes transacted by the read operations. | ||
+ | |- | ||
+ | | total_bytes_written | ||
+ | | Number of bytes transacted by the write operations. | ||
+ | |- | ||
+ | | read_64B_ops_rcvd | ||
+ | | Number of read operations performed to the physical media in 64 bytes granularity. | ||
+ | |- | ||
+ | | write_64B_ops_rcvd | ||
+ | | Number of write operations performed to the physical media in 64 bytes granularity. | ||
+ | |- | ||
+ | | media_read_ops | ||
+ | | Number of read operations performed to the physical media. | ||
+ | |- | ||
+ | | media_write_ops | ||
+ | | Number of write operations performed to the physical media. | ||
+ | |- | ||
+ | | host_reads | ||
+ | | Number of read operations received from the CPU (memory controller). | ||
+ | |- | ||
+ | | host_writes | ||
+ | | Number of write operations received from the CPU (memory controller). | ||
+ | |- | ||
+ | | read_hit_ratio | ||
+ | | Measures the efficiency of the buffer in the read path. Range of 0.0 - 1.0. | ||
+ | |- | ||
+ | | write_hit_ratio | ||
+ | | Measures the efficiency of the buffer in the write path. Range of 0.0 - 1.0. | ||
+ | |- | ||
+ | | tsc_cycles | ||
+ | | The number of tsc cycles during each interval. | ||
+ | |- | ||
+ | | epoch | ||
+ | | The timestamp in seconds at which the metrics are collected from DCPMM DIMMs. | ||
+ | |} | ||
== Example Graph == | == Example Graph == | ||
{{No Example Graph}} | {{No Example Graph}} | ||
Line 30: | Line 142: | ||
* [https://github.com/intel/intel-pmwatch libpmwapi] | * [https://github.com/intel/intel-pmwatch libpmwapi] | ||
+ | |||
+ | == Caveats == | ||
+ | * Health metrics and performance metrics cannot be collected simultaneously. | ||
+ | |||
+ | == History == | ||
+ | * {{Version|5.11}} New plugin for Intel Optane DC Persistent Memory (DCPMM) added. | ||
== See also == | == See also == | ||
* [[Plugin:DCPMM/tests]] | * [[Plugin:DCPMM/tests]] | ||
− | * [https://wiki.opnfv.org/display/fastpath/DCPMM DCPMM plugin high level design document] | + | * [https://wiki.opnfv.org/display/fastpath/DCPMM DCPMM plugin metric list] |
+ | * [https://wiki.opnfv.org/display/fastpath/Collectd+DCPMM+Plugin+HLD DCPMM plugin high level design document] | ||
[[Category:Plugins]] | [[Category:Plugins]] | ||
[[Category:Plugins requiring privileges]] | [[Category:Plugins requiring privileges]] | ||
{{DEFAULTSORT:DCPMM}} | {{DEFAULTSORT:DCPMM}} |
Revision as of 16:30, 18 March 2020
DCPMM plugin | |
---|---|
Type: | read |
Callbacks: | init, config, read, shutdown |
Status: | supported |
First version: | 5.11 |
Copyright: | 2019 Intel Corporation Hari TG |
License: | MIT license |
Manpage: | collectd.conf(5) |
List of Plugins |
The dcpmm plugin collects performance and health statistics for Intel(R) Optane(TM) DC Persistent Memory (DCPMM). The plugin requires root privileges to collect these statistics.
Synopsis
- → See: Plugin:DCPMM/Config
    <Plugin "dcpmm">
      Interval 10.0
      CollectHealth false
      CollectPerfMetrics true
      EnableDispatchAll false
    </Plugin>
Parameters
Name | Description | Comment |
---|---|---|
Interval | Interval, in seconds, at which the metrics are collected. | Defaults to the global Interval value. When set, it overrides the global Interval for the dcpmm plugin only; other plugins are not affected. |
CollectHealth | Health information metrics are collected if set to true. | Default value is false. |
CollectPerfMetrics | Memory performance metrics are collected if set to true. | Default value is true. |
EnableDispatchAll | Reserved for enabling simultaneous collection of health and memory performance metrics in the future. | Currently unused; must always be false. |
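Because the two metric sets cannot be collected at the same time, switching the plugin to health monitoring means turning performance collection off explicitly, since CollectPerfMetrics defaults to true. The following is a minimal sketch of such a configuration, not a tested setup: the 60-second interval is an arbitrary illustrative value, and the LoadPlugin line assumes the plugin is loaded in the usual collectd way.

```
# Sketch: collect health information instead of performance metrics.
LoadPlugin dcpmm

<Plugin "dcpmm">
  # Illustrative value; overrides the global Interval for this plugin only.
  Interval 60.0
  # Enable health information metrics (default: false).
  CollectHealth true
  # Disable performance metrics (default: true), since both sets
  # cannot be collected simultaneously.
  CollectPerfMetrics false
  # Currently unused; must always be false.
  EnableDispatchAll false
</Plugin>
```

To go back to performance metrics, swap the two Collect* values.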
Metrics
The DCPMM plugin collects either health metrics or performance metrics; collecting both sets simultaneously is not currently supported.
Health Information Metrics
The health information metrics are the following:
Metric | Description |
---|---|
health_status | Overall health summary (0: normal, 1: non-critical, 2: critical, 3: fatal). |
lifespan_remaining | The module’s remaining life as a percentage of the factory-expected life span. |
lifespan_used | The module’s used life as a percentage of the factory-expected life span. |
power_on_time | Total time the DIMM has been powered on over its lifetime, in seconds. |
uptime | Uptime of the DIMM in the current power cycle, in seconds. |
last_shutdown_time | The time the system was last shut down, expressed in seconds since the epoch. |
media_temperature | The media’s current temperature, in degrees Celsius. |
controller_temperature | The controller’s current temperature, in degrees Celsius. |
max_media_temperature | The media’s highest temperature reported, in degrees Celsius. |
max_controller_temperature | The controller’s highest temperature reported, in degrees Celsius. |
tsc_cycles | The number of TSC (time stamp counter) cycles during each interval. |
epoch | The timestamp in seconds at which the metrics are collected from DCPMM DIMMs. |
Memory Performance Metrics
The memory performance metrics are the following:
Metric | Description |
---|---|
total_bytes_read | Number of bytes transacted by the read operations. |
total_bytes_written | Number of bytes transacted by the write operations. |
read_64B_ops_rcvd | Number of read operations performed to the physical media in 64-byte granularity. |
write_64B_ops_rcvd | Number of write operations performed to the physical media in 64-byte granularity. |
media_read_ops | Number of read operations performed to the physical media. |
media_write_ops | Number of write operations performed to the physical media. |
host_reads | Number of read operations received from the CPU (memory controller). |
host_writes | Number of write operations received from the CPU (memory controller). |
read_hit_ratio | Measures the efficiency of the buffer in the read path. Range of 0.0 - 1.0. |
write_hit_ratio | Measures the efficiency of the buffer in the write path. Range of 0.0 - 1.0. |
tsc_cycles | The number of TSC (time stamp counter) cycles during each interval. |
epoch | The timestamp in seconds at which the metrics are collected from DCPMM DIMMs. |
Example Graph
None yet.
Dependencies
Caveats
- Health metrics and performance metrics cannot be collected simultaneously.
History
- 5.11 New plugin for Intel Optane DC Persistent Memory (DCPMM) added.