Difference between revisions of "Plugin:IntelRDT"

From collectd Wiki
m
 
(3 intermediate revisions by one other user not shown)
Line 5: Line 5:
 
   | Status={{supported}}
 
   | Status={{supported}}
 
   | FirstVersion={{Version|5.7}}
 
   | FirstVersion={{Version|5.7}}
   | Copyright=''2016–2018'' Intel Corporation
+
   | Copyright=''2016–2018'' Intel Corporation<br /> ''Serhiy Pshyk'' <br /> ''Mateusz Starzyk'' <br /> ''Wojciech Andralojc'' <br /> ''Michał Aleksiński''
 
   | License={{MIT_License}}
 
   | License={{MIT_License}}
 
   | Manpage={{Manpage|collectd.conf|5|plugin_intel_rdt}}
 
   | Manpage={{Manpage|collectd.conf|5|plugin_intel_rdt}}
 
}}
 
}}
The ''intel_rdt'' plugin collects information provided by monitoring features of
+
The ''intel_rdt'' plugin collects information provided by the monitoring features of Intel Resource Director Technology (Intel(R) RDT). Intel RDT comprises Cache Monitoring Technology (CMT), Memory Bandwidth Monitoring (MBM), Cache Allocation Technology (CAT), and Code and Data Prioritization (CDP), which together provide the hardware framework to monitor and control the utilization of shared resources such as last level cache and memory bandwidth. As multithreaded and multicore platforms run single-threaded, multithreaded, or complex virtual machine workloads, the last level cache and memory bandwidth become key resources to manage. CMT, MBM, CAT and CDP allow these workloads to be managed across the shared resources.
Intel Resource Director Technology (Intel(R) RDT) like Cache Monitoring
+
 
Technology (CMT), Memory Bandwidth Monitoring (MBM).
+
{| class="wikitable" style="background-color:#FFF;"
CMT and MBM are features that allows an operating system (OS) or Hypervisor/virtual machine monitor (VMM) to determine the usage of cache and memory bandwidth by applications running on the platform.
+
|-
Using these monitoring technologies, the intel_rdt plugin collects the following metrics:
+
! <br />Name
* LLC - last level cache occupancy (CMT)
+
! <br />Type
* MBL - the bandwidth of accessing memory associated with the local socket (MBM)
+
! <br />Type Instance
* MBR - the bandwidth of accessing the remote socket (MBM)
+
! <br />Description
* IPC - instructions per clock
+
! <br />Comment
 +
|-
 +
| <br />LLC
 +
| <br />bytes
 +
| <br />llc
 +
| <br />last   level cache occupancy (CMT)
 +
| <br />Existing  type
 +
|-
 +
| <br />MBL
 +
| <br />memory_bandwidth
 +
| <br />local
 +
| <br />the   bandwidth of accessing memory associated with the local socket (MBM)
 +
| <br />Existing  type
 +
|-
 +
| <br />MBR
 +
| <br />memory_bandwidth
 +
| <br />remote
 +
| <br />the   bandwidth of accessing the remote socket (MBM)
 +
| <br />Existing  type
 +
|-
 +
| <br />IPC
 +
| <br />ipc
 +
| <br />
 +
| <br />instructions   per clock
 +
| <br />New  type introduced in types.db
 +
|}
  
 
For a full description of available options please refer to the {{Manpage|collectd.conf|5|plugin_intel_rdt}} manual page.
 
For a full description of available options please refer to the {{Manpage|collectd.conf|5|plugin_intel_rdt}} manual page.
Line 26: Line 51:
 
   Cores "0-2" "3,4,6" "8-10,15"
 
   Cores "0-2" "3,4,6" "8-10,15"
 
  </Plugin>
 
  </Plugin>
 +
 +
=== Parameters ===
 +
 +
{| class="wikitable"
 +
|-
 +
! <br />Name
 +
! <br />Description
 +
! <br />Comment
 +
|-
 +
| <br />Interval
 +
| <br />The interval  within which to retrieve statistics on monitored events in seconds
 +
| <br />Interval option is supported by collectd  and is defined in <LoadPlugin> block. No additional functionality  should be developed in intel_rdt plugin  to support this option.
 +
|-
 +
| <br />Cores
 +
| <br />Core groups  definition. Monitored metrics are reported as aggregated statistics per  group.
 +
| <br />The field is  represented as list of strings with core group values. Each string represents  a list of cores in a group. Allowed formats are: “0,1,2,3” “0-10,20-18”  “1,3,5-8,10,0x10-12”.<br />  <br />If an empty string is provided as value for this field default cores  configuration should be applied - a separate group for each core.
 +
|}
 +
 +
== Metrics ==
 +
 +
{| class="wikitable" style="background-color:#FFF;"
 +
|-
 +
! <br />Metric/Feature/Input
 +
! <br />Name
 +
! <br />Data Type
 +
! <br />Format Example
 +
! <br />Description
 +
! <br />Dependencies
 +
! <br />Limitations
 +
! <br />Comments
 +
|-
 +
| <br />Metric
 +
| <br />Memory  Bandwidth on Local NUMA Node
 +
| <br />Bytes/Second
 +
| <br />3934325
 +
| <br />Memory  bandwidth utilization by the relevant CPU core on the local NUMA memory  channel
 +
| <br />PQOS  ToolSet
 +
| <br />Does  not provide the value per process basis due to lack of resctrl fs support
 +
| <br />Dependent  on PQOS toolset to read the metric value
 +
|-
 +
| <br />Metric
 +
| <br />Memory  Bandwidth on Remote NUMA Node
 +
| <br />Bytes/Second
 +
| <br />3934325
 +
| <br />Memory  bandwidth utilization by the relevant CPU core on the remote NUMA memory  channel
 +
| <br />PQOS  ToolSet
 +
| <br />Does  not provide the value per process basis due to lack of resctrl fs support
 +
| <br />Dependent  on PQOS toolset to read the metric value
 +
|-
 +
| <br />Metric
 +
| <br />Total  Memory Bandwidth
 +
| <br />Bytes/Second
 +
| <br />3934325
 +
| <br />Total  memory bandwidth utilized by a CPU core on local and remote NUMA memory  channels
 +
| <br />PQOS  ToolSet
 +
| <br />Does  not provide the value per process basis due to lack of resctrl fs support
 +
| <br />Not  part of the Collectd plugin
 +
|-
 +
| <br />Metric
 +
| <br />L3  Cache Occupancy
 +
| <br />Bytes
 +
| <br />45345434
 +
| <br />Total  Last Level Cache occupancy by a CPU core
 +
| <br />PQOS  ToolSet
 +
| <br />Does  not provide the value per process basis due to lack of resctrl fs support
 +
| <br />Dependent  on PQOS toolset to read the metric value
 +
|-
 +
| <br />Metric
 +
| <br />Instructions  Per Cycle
 +
| <br />Integer
 +
| <br />23734
 +
| <br />Total  instructions per cycle executed by a CPU core
 +
| <br />PQOS  ToolSet
 +
| <br />None
 +
| <br />Dependent  on PQOS toolset to read the metric value
 +
|-
 +
| <br />Input
 +
| <br />Cores
 +
| <br />Integer  Array
 +
| <br />[0-12]  or [1,2,3]
 +
| <br />The  list of CPU core(s) to be provided as input by the user for which the  corresponding metrics are required
 +
| <br />None
 +
| <br />None
 +
| <br />Configuration  input in the plugin .conf file
 +
|-
 +
| <br />Input
 +
| <br />Configuration  Interval
 +
| <br />Integer
 +
| <br />1  or 10
 +
| <br />The  interval in seconds at which the metrics need to be collected
 +
| <br />None
 +
| <br />None
 +
| <br />Configuration  input in the plugin .conf file
 +
|-
 +
| <br />Metric
 +
| <br />Memory  Bandwidth of the process on Local NUMA Node
 +
| <br />Bytes
 +
| <br />3934325
 +
| <br />Memory  bandwidth utilization by the relevant process on the local NUMA memory  channel
 +
| <br />resctrl  fs & kernel 4.14
 +
| <br />TBD
 +
| <br />Still  to create Collectd PR
 +
|-
 +
| <br />Metric
 +
| <br />Memory  Bandwidth of the process on Remote NUMA Node
 +
| <br />Bytes
 +
| <br />3934325
 +
| <br />Memory  bandwidth utilization by the relevant process on the remote NUMA memory  channel
 +
| <br />resctrl  fs & kernel 4.14
 +
| <br />TBD
 +
| <br />Still  to create Collectd PR
 +
|-
 +
| <br />Metric
 +
| <br />Total  Memory Bandwidth of the process
 +
| <br />Bytes
 +
| <br />3934325
 +
| <br />Total  memory bandwidth utilized by a process on local and remote NUMA memory  channels
 +
| <br />resctrl  fs & kernel 4.14
 +
| <br />TBD
 +
| <br />Still  to create Collectd PR
 +
|-
 +
| <br />Metric
 +
| <br />L3  Cache Occupancy of the process
 +
| <br />Bytes
 +
| <br />45345434
 +
| <br />Total  Last Level Cache occupancy by a process
 +
| <br />resctrl  fs & kernel 4.14
 +
| <br />TBD
 +
| <br />Still  to create Collectd PR
 +
|-
 +
| <br />Metric
 +
| <br />Instructions  Per Cycle of the process
 +
| <br />Integer
 +
| <br />23734
 +
| <br />Total  instructions per cycle executed by a process
 +
| <br />resctrl  fs & kernel 4.14
 +
| <br />TBD
 +
| <br />Still  to create Collectd PR
 +
|-
 +
| <br />Input
 +
| <br />Process  List
 +
| <br />Array
 +
| <br />[qemu,  pmd]
 +
| <br />List  of processes for which metrics are to be collected
 +
| <br />resctrl  fs & kernel 4.14
 +
| <br />TBD
 +
| <br />Still  to create Collectd PR
 +
|}
  
 
== Example graph ==
 
== Example graph ==
Line 37: Line 210:
 
== See also ==
 
== See also ==
  
* [[Plugin|IntelRDT/tests]]
+
* [[Plugin:IntelRDT/tests]]
* [ https://wiki.opnfv.org/display/fastpath/Intel_RDT Intel RDT plugin high level design document]
+
* [https://wiki.opnfv.org/display/fastpath/Intel_RDT Intel RDT plugin high level design document]
  
 
[[Category:Plugins]]
 
[[Category:Plugins]]
 +
[[Category:Needs Info]]
 
{{DEFAULTSORT:Intel_Rdt}}
 
{{DEFAULTSORT:Intel_Rdt}}

Latest revision as of 16:19, 16 June 2020

Intel RDT plugin
Type: read
Callbacks: config, init, read, shutdown
Status: supported
First version: 5.7
Copyright: 2016–2018 Intel Corporation
Serhiy Pshyk
Mateusz Starzyk
Wojciech Andralojc
Michał Aleksiński
License: MIT license
Manpage: collectd.conf(5)
List of Plugins

The intel_rdt plugin collects information provided by the monitoring features of Intel Resource Director Technology (Intel(R) RDT). Intel RDT comprises Cache Monitoring Technology (CMT), Memory Bandwidth Monitoring (MBM), Cache Allocation Technology (CAT), and Code and Data Prioritization (CDP), which together provide the hardware framework to monitor and control the utilization of shared resources such as last level cache and memory bandwidth. As multithreaded and multicore platforms run single-threaded, multithreaded, or complex virtual machine workloads, the last level cache and memory bandwidth become key resources to manage. CMT, MBM, CAT and CDP allow these workloads to be managed across the shared resources.


{| class="wikitable"
|-
! Name
! Type
! Type Instance
! Description
! Comment
|-
| LLC
| bytes
| llc
| Last level cache occupancy (CMT)
| Existing type
|-
| MBL
| memory_bandwidth
| local
| Bandwidth of accessing memory associated with the local socket (MBM)
| Existing type
|-
| MBR
| memory_bandwidth
| remote
| Bandwidth of accessing the remote socket (MBM)
| Existing type
|-
| IPC
| ipc
| 
| Instructions per clock
| New type introduced in types.db
|}
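
The Type and Type Instance columns above map onto collectd's general identifier layout, host/plugin-plugin_instance/type-type_instance. The following is a minimal Python sketch of that mapping, not the plugin's own code. It assumes the plugin instance is the core group string from the configuration; the names `RdtMetric`, `METRICS` and `identifier` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple, Optional


class RdtMetric(NamedTuple):
    """One row of the metric table above (hypothetical helper type)."""
    name: str
    type: str
    type_instance: Optional[str]


# Rows copied from the table; IPC has no type instance.
METRICS = [
    RdtMetric("LLC", "bytes", "llc"),
    RdtMetric("MBL", "memory_bandwidth", "local"),
    RdtMetric("MBR", "memory_bandwidth", "remote"),
    RdtMetric("IPC", "ipc", None),
]


def identifier(host: str, group: str, metric: RdtMetric) -> str:
    """Build a collectd-style identifier string.

    Assumption: the plugin instance is the core group label, e.g. "0-2".
    """
    ident = f"{host}/intel_rdt-{group}/{metric.type}"
    if metric.type_instance:
        ident += f"-{metric.type_instance}"
    return ident


if __name__ == "__main__":
    for m in METRICS:
        print(m.name, "->", identifier("node1", "0-2", m))
```

Under these assumptions, each configured core group would yield four value lists, one per row of the table.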

For a full description of available options please refer to the collectd.conf(5) manual page.

Synopsis

<Plugin "intel_rdt">
  Cores "0-2" "3,4,6" "8-10,15"
</Plugin>

Parameters


{| class="wikitable"
|-
! Name
! Description
! Comment
|-
| Interval
| The interval, in seconds, at which statistics on monitored events are retrieved.
| Interval is a generic collectd option set in the <LoadPlugin> block; the intel_rdt plugin needs no additional functionality to support it.
|-
| Cores
| Core group definitions. Monitored metrics are reported as aggregated statistics per group.
| A list of strings, each describing the cores in one group. Allowed formats are: "0,1,2,3", "0-10,20-18", "1,3,5-8,10,0x10-12".<br /><br />If an empty string is given, the default core configuration is applied: a separate group for each core.
|}
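
To make the Cores formats concrete, here is a minimal Python sketch of how such strings could be expanded into core groups. It is an illustration only, not the plugin's parser. It assumes that a "0x" prefix marks a hexadecimal number, that a reversed range such as "20-18" is read as 18 to 20, and that the number of cores on the machine is passed in by the caller; the function names are hypothetical.

```python
from typing import List


def parse_number(token: str) -> int:
    """Parse a decimal or 0x-prefixed hexadecimal core number."""
    token = token.strip()
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token, 10)


def parse_group(spec: str) -> List[int]:
    """Expand one group string such as "8-10,15" into a sorted core list."""
    cores = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            start, end = parse_number(first), parse_number(last)
            if start > end:
                # Assumption: reversed ranges are normalized, not rejected.
                start, end = end, start
            cores.update(range(start, end + 1))
        else:
            cores.add(parse_number(part))
    return sorted(cores)


def parse_cores(specs: List[str], num_cores: int) -> List[List[int]]:
    """Expand the Cores option; an empty value means one group per core."""
    if not specs or specs == [""]:
        return [[core] for core in range(num_cores)]
    return [parse_group(spec) for spec in specs]


if __name__ == "__main__":
    print(parse_cores(["0-2", "3,4,6", "8-10,15"], num_cores=16))
```

With the Synopsis value `"0-2" "3,4,6" "8-10,15"`, this produces three groups, and metrics would be aggregated over each of them.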

Metrics


{| class="wikitable"
|-
! Metric/Feature/Input
! Name
! Data Type
! Format Example
! Description
! Dependencies
! Limitations
! Comments
|-
| Metric
| Memory Bandwidth on Local NUMA Node
| Bytes/Second
| 3934325
| Memory bandwidth utilization by the relevant CPU core on the local NUMA memory channel
| PQOS ToolSet
| Does not provide per-process values due to lack of resctrl fs support
| Depends on the PQOS toolset to read the metric value
|-
| Metric
| Memory Bandwidth on Remote NUMA Node
| Bytes/Second
| 3934325
| Memory bandwidth utilization by the relevant CPU core on the remote NUMA memory channel
| PQOS ToolSet
| Does not provide per-process values due to lack of resctrl fs support
| Depends on the PQOS toolset to read the metric value
|-
| Metric
| Total Memory Bandwidth
| Bytes/Second
| 3934325
| Total memory bandwidth utilized by a CPU core on local and remote NUMA memory channels
| PQOS ToolSet
| Does not provide per-process values due to lack of resctrl fs support
| Not part of the collectd plugin
|-
| Metric
| L3 Cache Occupancy
| Bytes
| 45345434
| Total last level cache occupancy by a CPU core
| PQOS ToolSet
| Does not provide per-process values due to lack of resctrl fs support
| Depends on the PQOS toolset to read the metric value
|-
| Metric
| Instructions Per Cycle
| Integer
| 23734
| Total instructions per cycle executed by a CPU core
| PQOS ToolSet
| None
| Depends on the PQOS toolset to read the metric value
|-
| Input
| Cores
| Integer Array
| [0-12] or [1,2,3]
| The list of CPU cores, provided by the user, for which the corresponding metrics are required
| None
| None
| Configuration input in the plugin .conf file
|-
| Input
| Configuration Interval
| Integer
| 1 or 10
| The interval in seconds at which the metrics are collected
| None
| None
| Configuration input in the plugin .conf file
|-
| Metric
| Memory Bandwidth of the process on Local NUMA Node
| Bytes
| 3934325
| Memory bandwidth utilization by the relevant process on the local NUMA memory channel
| resctrl fs & kernel 4.14
| TBD
| collectd PR still to be created
|-
| Metric
| Memory Bandwidth of the process on Remote NUMA Node
| Bytes
| 3934325
| Memory bandwidth utilization by the relevant process on the remote NUMA memory channel
| resctrl fs & kernel 4.14
| TBD
| collectd PR still to be created
|-
| Metric
| Total Memory Bandwidth of the process
| Bytes
| 3934325
| Total memory bandwidth utilized by a process on local and remote NUMA memory channels
| resctrl fs & kernel 4.14
| TBD
| collectd PR still to be created
|-
| Metric
| L3 Cache Occupancy of the process
| Bytes
| 45345434
| Total last level cache occupancy by a process
| resctrl fs & kernel 4.14
| TBD
| collectd PR still to be created
|-
| Metric
| Instructions Per Cycle of the process
| Integer
| 23734
| Total instructions per cycle executed by a process
| resctrl fs & kernel 4.14
| TBD
| collectd PR still to be created
|-
| Input
| Process List
| Array
| [qemu, pmd]
| List of processes for which metrics are to be collected
| resctrl fs & kernel 4.14
| TBD
| collectd PR still to be created
|}
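
The table lists Total Memory Bandwidth as not part of the collectd plugin. A consumer of the data could derive it from the two values the plugin does report. The Python sketch below assumes, as the table's description suggests, that the total is simply the sum of the local and remote bandwidth for the same core group and interval; the function names are hypothetical.

```python
def total_bandwidth(local_bps: float, remote_bps: float) -> float:
    """Sum local and remote memory bandwidth, both in bytes per second.

    Assumption: total = local + remote for one core group and interval.
    """
    return local_bps + remote_bps


def to_mib_per_s(bytes_per_s: float) -> float:
    """Convert bytes per second to MiB per second for display."""
    return bytes_per_s / (1024 * 1024)


if __name__ == "__main__":
    # Example values taken from the "Format Example" column above.
    total = total_bandwidth(3934325, 3934325)
    print(f"{total} B/s = {to_mib_per_s(total):.2f} MiB/s")
```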

Example graph

Rdt llc.png

Dependencies

See also