Plugin:DCPMM

DCPMM plugin
Type: read
Callbacks: init, config, read, shutdown
Status: supported
First version: 5.11
Copyright: 2019 Intel Corporation
Hari TG
License: MIT license
Manpage: collectd.conf(5)

The dcpmm plugin collects performance and health statistics for Intel(R) Optane(TM) DC Persistent Memory (DCPMM). The plugin requires root privileges to collect these statistics.

Synopsis

→ See: Plugin:DCPMM/Config
<Plugin "dcpmm">
  Interval 10.0
  CollectHealth false
  CollectPerfMetrics true
  EnableDispatchAll false
</Plugin>

Parameters

Name | Description | Comment
Interval | The collection interval, in seconds, at which the metrics are collected. | Defaults to the global Interval value. Setting it overrides the global Interval for the dcpmm plugin only; other plugins are not affected.
CollectHealth | Health information metrics are collected if set to true. | Default value is false.
CollectPerfMetrics | Memory performance metrics are collected if set to true. | Default value is true.
EnableDispatchAll | Reserved for enabling simultaneous collection of health and memory performance metrics in the future. | Currently unused; must always be false.

Metrics

The DCPMM plugin collects either health metrics or performance metrics. Collecting both sets of metrics simultaneously is not currently supported.
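
Because only one set of metrics can be collected at a time, switching to health metrics means changing both Collect options. The fragment below is a minimal sketch, assuming that CollectPerfMetrics has to be disabled explicitly when CollectHealth is enabled (the page does not say what happens if both are set to true). The LoadPlugin line is the usual collectd way of loading a plugin and is not part of the synopsis above; the 60-second interval is only an illustrative value.

```
# Assumption: the plugin is loaded with the standard LoadPlugin directive.
LoadPlugin dcpmm

<Plugin "dcpmm">
  # Illustrative value; overrides the global Interval for this plugin only.
  Interval 60.0
  # Collect health metrics instead of the default performance metrics.
  CollectHealth true
  CollectPerfMetrics false
  # Unused at the moment; must stay false.
  EnableDispatchAll false
</Plugin>
```

Reverting to the defaults (CollectHealth false, CollectPerfMetrics true) switches the plugin back to memory performance metrics.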

Health Information Metrics

The health information metrics are the following:

Metric | Description
health_status | Overall health summary (0: normal | 1: non-critical | 2: critical | 3: fatal).
lifespan_remaining | The module’s remaining life as a percentage of the factory-expected life span.
lifespan_used | The module’s used life as a percentage of the factory-expected life span.
power_on_time | The total time the DIMM has been powered on over its lifetime, in seconds.
uptime | The uptime of the DIMM for the current power cycle, in seconds.
last_shutdown_time | The time the system was last shut down, as an epoch timestamp in seconds.
media_temperature | The media’s current temperature in degrees Celsius.
controller_temperature | The controller’s current temperature in degrees Celsius.
max_media_temperature | The media’s highest reported temperature in degrees Celsius.
max_controller_temperature | The controller’s highest reported temperature in degrees Celsius.
tsc_cycles | The number of TSC (time stamp counter) cycles during each interval.
epoch | The timestamp, in seconds, at which the metrics are collected from the DCPMM DIMMs.

Memory Performance Metrics

The memory performance metrics are the following:

Metric | Description
total_bytes_read | Number of bytes transferred by read operations.
total_bytes_written | Number of bytes transferred by write operations.
read_64B_ops_rcvd | Number of read operations performed to the physical media at 64-byte granularity.
write_64B_ops_rcvd | Number of write operations performed to the physical media at 64-byte granularity.
media_read_ops | Number of read operations performed to the physical media.
media_write_ops | Number of write operations performed to the physical media.
host_reads | Number of read operations received from the CPU (memory controller).
host_writes | Number of write operations received from the CPU (memory controller).
read_hit_ratio | Measures the efficiency of the buffer in the read path. Range: 0.0 to 1.0.
write_hit_ratio | Measures the efficiency of the buffer in the write path. Range: 0.0 to 1.0.
tsc_cycles | The number of TSC (time stamp counter) cycles during each interval.
epoch | The timestamp, in seconds, at which the metrics are collected from the DCPMM DIMMs.

Dependencies

Caveats

  • Health metrics and performance metrics cannot be collected simultaneously.

History

  • 5.11 New plugin for Intel Optane DC Persistent Memory (DCPMM) added.

See also