.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

===========================================
User Interface for Resource Control feature
===========================================

:Copyright: |copy| 2016 Intel Corporation
:Authors: - Fenghua Yu <fenghua.yu@intel.com>
          - Tony Luck <tony.luck@intel.com>
          - Vikas Shivappa <vikas.shivappa@intel.com>


Intel refers to this feature as Intel Resource Director Technology (Intel(R) RDT).
AMD refers to this feature as AMD Platform Quality of Service (AMD QoS).

This feature is enabled by the kernel configuration option
CONFIG_X86_CPU_RESCTRL and is indicated by the following x86 /proc/cpuinfo
flag bits:

============================================= ================================
RDT (Resource Director Technology) Allocation "rdt_a"
CAT (Cache Allocation Technology)             "cat_l3", "cat_l2"
CDP (Code and Data Prioritization)            "cdp_l3", "cdp_l2"
CQM (Cache QoS Monitoring)                    "cqm_llc", "cqm_occup_llc"
MBM (Memory Bandwidth Monitoring)             "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation)             "mba"
============================================= ================================

To use the feature mount the file system::

  # mount -t resctrl resctrl [-o cdp[,cdpl2][,mba_MBps]] /sys/fs/resctrl

mount options are:

"cdp":
    Enable code/data prioritization in L3 cache allocations.
"cdpl2":
    Enable code/data prioritization in L2 cache allocations.
"mba_MBps":
    Enable the MBA Software Controller (mba_sc) to specify MBA
    bandwidth in MBps.

L2 and L3 CDP are controlled separately.

RDT features are orthogonal. A particular system may support only
monitoring, only control, or both monitoring and control. Cache
pseudo-locking is a unique way of using cache control to "pin" or
"lock" data in the cache. Details can be found in
"Cache Pseudo-Locking".


The mount succeeds if either allocation or monitoring is present, but
only those files and directories supported by the system will be created.
For more details on the behavior of the interface during monitoring
and allocation, see the "Resource alloc and monitor groups" section.

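
The flag bits above can be used to check what a particular system supports
before mounting; a minimal sketch (the exact set of flags reported depends
on the CPU and kernel)::

  # grep -oE 'rdt_a|cat_l3|cat_l2|cdp_l3|cdp_l2|cqm_llc|cqm_occup_llc|cqm_mbm_total|cqm_mbm_local|mba' /proc/cpuinfo | sort -u
  cat_l3
  cqm_llc
  mba
  rdt_a
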
Info directory
==============

The 'info' directory contains information about the enabled
resources. Each resource has its own subdirectory. The subdirectory
names reflect the resource names.

Each subdirectory contains the following files with respect to
allocation:

Cache resource (L3/L2) subdirectory contains the following files
related to allocation:

"num_closids":
    The number of CLOSIDs which are valid for this
    resource. The kernel uses the smallest number of
    CLOSIDs of all enabled resources as limit.
"cbm_mask":
    The bitmask which is valid for this resource.
    This mask is equivalent to 100%.
"min_cbm_bits":
    The minimum number of consecutive bits which
    must be set when writing a mask.

"shareable_bits":
    Bitmask of shareable resource with other executing
    entities (e.g. I/O). The user can use this when
    setting up exclusive cache partitions. Note that
    some platforms support devices that have their
    own settings for cache use which can over-ride
    these bits.
"bit_usage":
    Annotated capacity bitmasks showing how all
    instances of the resource are used. The legend is:

    "0":
        Corresponding region is unused. When the system's
        resources have been allocated and a "0" is found
        in "bit_usage" it is a sign that resources are
        wasted.

    "H":
        Corresponding region is used by hardware only
        but available for software use. If a resource
        has bits set in "shareable_bits" but not all
        of these bits appear in the resource groups'
        schemata then the bits appearing in
        "shareable_bits" but in no resource group will
        be marked as "H".
    "X":
        Corresponding region is available for sharing and
        used by hardware and software. These are the
        bits that appear in "shareable_bits" as
        well as a resource group's allocation.
    "S":
        Corresponding region is used by software
        and available for sharing.
    "E":
        Corresponding region is used exclusively by
        one resource group. No sharing allowed.
    "P":
        Corresponding region is pseudo-locked. No
        sharing allowed.

Memory bandwidth (MB) subdirectory contains the following files
with respect to allocation:

"min_bandwidth":
    The minimum memory bandwidth percentage which
    the user can request.

"bandwidth_gran":
    The granularity in which the memory bandwidth
    percentage is allocated. The allocated
    b/w percentage is rounded off to the next
    control step available on the hardware. The
    available bandwidth control steps are:
    min_bandwidth + N * bandwidth_gran.

"delay_linear":
    Indicates if the delay scale is linear or
    non-linear. This field is purely informational.

"thread_throttle_mode":
    Indicator on Intel systems of how tasks running on threads
    of a physical core are throttled in cases where they
    request different memory bandwidth percentages:

    "max":
        the smallest percentage is applied
        to all threads
    "per-thread":
        bandwidth percentages are directly applied to
        the threads running on the core

If RDT monitoring is available there will be an "L3_MON" directory
with the following files:

"num_rmids":
    The number of RMIDs available. This is the
    upper bound for how many "CTRL_MON" + "MON"
    groups can be created.

"mon_features":
    Lists the monitoring events if
    monitoring is enabled for the resource.

"max_threshold_occupancy":
    Read/write file provides the largest value (in
    bytes) at which a previously used LLC_occupancy
    counter can be considered for re-use.

Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
via the file system (making new directories or writing to any of the
control files). If the command was successful, it will read as "ok".
If the command failed, it will provide more information than can be
conveyed in the error returns from file operations. E.g.
::

  # echo L3:0=f7 > schemata
  bash: echo: write error: Invalid argument
  # cat info/last_cmd_status
  mask f7 has non-consecutive 1-bits

Resource alloc and monitor groups
=================================

Resource groups are represented as directories in the resctrl file
system. The default group is the root directory which, immediately
after mounting, owns all the tasks and cpus in the system and can make
full use of all resources.

On a system with RDT control features additional directories can be
created in the root directory that specify different amounts of each
resource (see "schemata" below). The root and these additional top level
directories are referred to as "CTRL_MON" groups below.

On a system with RDT monitoring the root directory and other top level
directories contain a directory named "mon_groups" in which additional
directories can be created to monitor subsets of tasks in the CTRL_MON
group that is their ancestor. These are called "MON" groups in the rest
of this document.

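
For example, on a system with RDT monitoring a CTRL_MON group and a MON
group within it are created with plain mkdir (a minimal sketch; the names
"grp0" and "m0" are arbitrary)::

  # mkdir /sys/fs/resctrl/grp0                  # new CTRL_MON group
  # mkdir /sys/fs/resctrl/grp0/mon_groups/m0    # new MON group under it
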
Removing a directory will move all tasks and cpus owned by the group it
represents to the parent. Removing one of the created CTRL_MON groups
will automatically remove all MON groups below it.

All groups contain the following files:

"tasks":
    Reading this file shows the list of all tasks that belong to
    this group. Writing a task id to the file will add a task to the
    group. If the group is a CTRL_MON group the task is removed from
    whichever previous CTRL_MON group owned the task and also from
    any MON group that owned the task. If the group is a MON group,
    then the task must already belong to the CTRL_MON parent of this
    group. The task is removed from any previous MON group.


"cpus":
    Reading this file shows a bitmask of the logical CPUs owned by
    this group. Writing a mask to this file will add and remove
    CPUs to/from this group. As with the tasks file a hierarchy is
    maintained where MON groups may only include CPUs owned by the
    parent CTRL_MON group.
    When the resource group is in pseudo-locked mode this file will
    only be readable, reflecting the CPUs associated with the
    pseudo-locked region.


"cpus_list":
    Just like "cpus", only using ranges of CPUs instead of bitmasks.


When control is enabled all CTRL_MON groups will also contain:

"schemata":
    A list of all the resources available to this group.
    Each resource has its own line and format - see below for details.

"size":
    Mirrors the display of the "schemata" file to display the size in
    bytes of each allocation instead of the bits representing the
    allocation.

"mode":
    The "mode" of the resource group dictates the sharing of its
    allocations. A "shareable" resource group allows sharing of its
    allocations while an "exclusive" resource group does not. A
    cache pseudo-locked region is created by first writing
    "pseudo-locksetup" to the "mode" file before writing the cache
    pseudo-locked region's schemata to the resource group's "schemata"
    file. On successful pseudo-locked region creation the mode will
    automatically change to "pseudo-locked".

When monitoring is enabled all MON groups will also contain:

"mon_data":
    This contains a set of files organized by L3 domain and by
    RDT event. E.g. on a system with two L3 domains there will
    be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
    directories has one file per event (e.g. "llc_occupancy",
    "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
    files provide a read out of the current value of the event for
    all tasks in the group. In CTRL_MON groups these files provide
    the sum for all tasks in the CTRL_MON group and all tasks in
    MON groups. Please see example section for more details on usage.

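
As a short illustration of the "tasks", "cpus" and "cpus_list" files
described above (a sketch only; the group name, PID and CPU mask are
made up)::

  # echo 1234 > /sys/fs/resctrl/grp0/tasks    # move PID 1234 into grp0
  # echo f0 > /sys/fs/resctrl/grp0/cpus       # grp0 now owns CPUs 4-7
  # cat /sys/fs/resctrl/grp0/cpus_list
  4-7
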
Resource allocation rules
-------------------------

When a task is running the following rules define which resources are
available to it:

1) If the task is a member of a non-default group, then the schemata
   for that group is used.

2) Else if the task belongs to the default group, but is running on a
   CPU that is assigned to some specific group, then the schemata for the
   CPU's group is used.

3) Otherwise the schemata for the default group is used.

Resource monitoring rules
-------------------------

1) If a task is a member of a MON group, or non-default CTRL_MON group
   then RDT events for the task will be reported in that group.

2) If a task is a member of the default CTRL_MON group, but is running
   on a CPU that is assigned to some specific group, then the RDT events
   for the task will be reported in that group.

3) Otherwise RDT events for the task will be reported in the root level
   "mon_data" group.


Notes on cache occupancy monitoring and control
===============================================

When moving a task from one group to another you should remember that
this only affects *new* cache allocations by the task. E.g. you may have
a task in a monitor group showing 3 MB of cache occupancy. If you move
it to a new group and immediately check the occupancy of the old and new
groups you will likely see that the old group is still showing 3 MB and
the new group zero. When the task accesses locations still in cache from
before the move, the h/w does not update any counters. On a busy system
you will likely see the occupancy in the old group go down as cache lines
are evicted and re-used while the occupancy in the new group rises as
the task accesses memory and loads into the cache are counted based on
membership in the new group.

The same applies to cache allocation control. Moving a task to a group
with a smaller cache partition will not evict any cache lines. The
process may continue to use them from the old partition.

Hardware uses a CLOSID (Class of Service ID) and an RMID (Resource
Monitoring ID) to identify a control group and a monitoring group
respectively. Each of the resource groups is mapped to these IDs based on
the kind of group. The number of CLOSIDs and RMIDs is limited by the
hardware and hence the creation of a "CTRL_MON" directory may fail if we
run out of either CLOSID or RMID, and creation of a "MON" group may fail
if we run out of RMIDs.

max_threshold_occupancy - generic concepts
------------------------------------------

Note that an RMID once freed may not be immediately available for use as
the RMID is still tagged to the cache lines of the previous user of the
RMID. Hence such RMIDs are placed on a limbo list and checked back in once
their cache occupancy has gone down. If at some point the system has a lot
of limbo RMIDs but none that are ready to be used, the user may see an
-EBUSY error during mkdir.

max_threshold_occupancy is a user configurable value to determine the
occupancy at which an RMID can be freed.

Schemata files - general concepts
---------------------------------

Each line in the file describes one resource. The line starts with
the name of the resource, followed by specific values to be applied
in each of the instances of that resource on the system.

Cache IDs
---------

On current generation systems there is one L3 cache per socket and L2
caches are generally just shared by the hyperthreads on a core, but this
isn't an architectural requirement. We could have multiple separate L3
caches on a socket, and multiple cores could share an L2 cache. So instead
of using "socket" or "core" to define the set of logical cpus sharing
a resource we use a "Cache ID". At a given cache level this will be a
unique number across the whole system (but it isn't guaranteed to be a
contiguous sequence, there may be gaps). To find the ID for each logical
CPU look in /sys/devices/system/cpu/cpu*/cache/index*/id

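
For example, the L3 cache IDs present on a system can be listed from sysfs
(a sketch; "index3" is the L3 cache on typical topologies and the IDs shown
are illustrative)::

  # cat /sys/devices/system/cpu/cpu*/cache/index3/id | sort -un
  0
  1
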
Cache Bit Masks (CBM)
---------------------

For cache resources we describe the portion of the cache that is available
for allocation using a bitmask. The maximum value of the mask is defined
by each cpu model (and may be different for different cache levels). It
is found using CPUID, but is also provided in the "info" directory of
the resctrl file system in "info/{resource}/cbm_mask". Intel hardware
requires that these masks have all the '1' bits in a contiguous block. So
0x3, 0x6 and 0xC are legal 4-bit masks with two bits set, but 0x5, 0x9
and 0xA are not. On a system with a 20-bit mask each bit represents 5%
of the capacity of the cache. You could partition the cache into four
equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000.

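
The mask width and minimum number of bits can be read straight from the
"info" directory described earlier (a sketch; the values shown match the
20-bit example above and differ between cpu models)::

  # cat /sys/fs/resctrl/info/L3/cbm_mask
  fffff
  # cat /sys/fs/resctrl/info/L3/min_cbm_bits
  1
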
Memory bandwidth Allocation and monitoring
==========================================

For the memory bandwidth resource, by default the user controls the resource
by indicating the percentage of total memory bandwidth.

The minimum bandwidth percentage value for each cpu model is predefined
and can be looked up through "info/MB/min_bandwidth". The bandwidth
granularity that is allocated is also dependent on the cpu model and can
be looked up at "info/MB/bandwidth_gran". The available bandwidth
control steps are: min_bw + N * bw_gran. Intermediate values are rounded
to the next control step available on the hardware.

The bandwidth throttling is a core specific mechanism on some Intel
SKUs. Using a high bandwidth and a low bandwidth setting on two threads
sharing a core may result in both threads being throttled to use the
low bandwidth (see "thread_throttle_mode").

The fact that Memory bandwidth allocation (MBA) may be a core
specific mechanism whereas memory bandwidth monitoring (MBM) is done at
the package level may lead to confusion when users try to apply control
via the MBA and then monitor the bandwidth to see if the controls are
effective. Below are such scenarios:

1. User may *not* see an increase in actual bandwidth when percentage
   values are increased:

This can occur when aggregate L2 external bandwidth is more than L3
external bandwidth. Consider an SKL SKU with 24 cores on a package and
where L2 external is 10GBps (hence aggregate L2 external bandwidth is
240GBps) and L3 external bandwidth is 100GBps. Now a workload with '20
threads, having 50% bandwidth, each consuming 5GBps' consumes the max L3
bandwidth of 100GBps although the percentage value specified is only 50%
<< 100%. Hence increasing the bandwidth percentage will not yield any
more bandwidth. This is because although the L2 external bandwidth still
has capacity, the L3 external bandwidth is fully used. Also note that
this would be dependent on the number of cores the benchmark is run on.

2. The same bandwidth percentage may mean different actual bandwidth
   depending on the number of threads:

For the same SKU as in #1, a 'single thread, with 10% bandwidth' and a
'4 thread, with 10% bandwidth' can consume up to 10GBps and 40GBps
although they have the same percentage bandwidth of 10%. This is simply
because as threads start using more cores in an rdtgroup, the actual
bandwidth may increase or vary although the user specified bandwidth
percentage is the same.

In order to mitigate this and make the interface more user friendly,
resctrl added support for specifying the bandwidth in MBps as well. The
kernel underneath would use a software feedback mechanism or a "Software
Controller (mba_sc)" which reads the actual bandwidth using MBM counters
and adjusts the memory bandwidth percentages to ensure::

  "actual bandwidth < user specified bandwidth"

By default, the schemata would take the bandwidth percentage values
whereas the user can switch to the "MBA software controller" mode using
the mount option 'mba_MBps'. The schemata format is specified in the below
sections.

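
For example, switching an already mounted file system over to the software
controller means remounting with that option (a sketch; remounting discards
any existing resource groups, so it is assumed nothing is relying on them at
that point)::

  # umount /sys/fs/resctrl
  # mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl
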
L3 schemata file details (code and data prioritization disabled)
----------------------------------------------------------------

With CDP disabled the L3 schemata format is::

  L3:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

L3 schemata file details (CDP enabled via mount option to resctrl)
-------------------------------------------------------------------

When CDP is enabled L3 control is split into two separate resources
so you can specify independent masks for code and data like this::

  L3DATA:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
  L3CODE:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

L2 schemata file details
------------------------

CDP is supported at L2 using the 'cdpl2' mount option. The schemata
format is either::

  L2:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

or::

  L2DATA:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
  L2CODE:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...


Memory bandwidth Allocation (default mode)
------------------------------------------

Memory bandwidth domain is L3 cache.
::

  MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...

Memory bandwidth Allocation specified in MBps
---------------------------------------------

Memory bandwidth domain is L3 cache.
::

  MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...

Reading/writing the schemata file
---------------------------------

Reading the schemata file will show the state of all resources
on all domains. When writing you only need to specify those values
which you wish to change. E.g.
::

  # cat schemata
  L3DATA:0=fffff;1=fffff;2=fffff;3=fffff
  L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
  # echo "L3DATA:2=3c0;" > schemata
  # cat schemata
  L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
  L3CODE:0=fffff;1=fffff;2=fffff;3=fffff

Cache Pseudo-Locking
====================

CAT enables a user to specify the amount of cache space that an
application can fill. Cache pseudo-locking builds on the fact that a
CPU can still read and write data pre-allocated outside its current
allocated area on a cache hit. With cache pseudo-locking, data can be
preloaded into a reserved portion of cache that no application can
fill, and from that point on will only serve cache hits. The cache
pseudo-locked memory is made accessible to user space where an
application can map it into its virtual address space and thus have
a region of memory with reduced average read latency.

The creation of a cache pseudo-locked region is triggered by a request
from the user to do so that is accompanied by a schemata of the region
to be pseudo-locked. The cache pseudo-locked region is created as follows:

- Create a CAT allocation CLOSNEW with a CBM matching the schemata
  from the user of the cache region that will contain the pseudo-locked
  memory. This region must not overlap with any current CAT allocation/CLOS
  on the system and no future overlap with this cache region is allowed
  while the pseudo-locked region exists.
- Create a contiguous region of memory of the same size as the cache
  region.
- Flush the cache, disable hardware prefetchers, disable preemption.
- Make CLOSNEW the active CLOS and touch the allocated memory to load
  it into the cache.
- Set the previous CLOS as active.
- At this point the closid CLOSNEW can be released - the cache
  pseudo-locked region is protected as long as its CBM does not appear in
  any CAT allocation. Even though the cache pseudo-locked region will from
  this point on not appear in any CBM of any CLOS, an application running
  with any CLOS will be able to access the memory in the pseudo-locked
  region since the region continues to serve cache hits.
- The contiguous region of memory loaded into the cache is exposed to
  user-space as a character device.

Cache pseudo-locking increases the probability that data will remain
in the cache via carefully configuring the CAT feature and controlling
application behavior. There is no guarantee that data is placed in
cache. Instructions like INVD, WBINVD, CLFLUSH, etc. can still evict
"locked" data from cache. Power management C-states may shrink or
power off cache. Deeper C-states will automatically be restricted on
pseudo-locked region creation.

It is required that an application using a pseudo-locked region runs
with affinity to the cores (or a subset of the cores) associated
with the cache on which the pseudo-locked region resides. A sanity check
within the code will not allow an application to map pseudo-locked memory
unless it runs with affinity to cores associated with the cache on which the
pseudo-locked region resides. The sanity check is only done during the
initial mmap() handling; there is no enforcement afterwards and the
application itself needs to ensure it remains affine to the correct cores.

Pseudo-locking is accomplished in two stages:

1) During the first stage the system administrator allocates a portion
   of cache that should be dedicated to pseudo-locking. At this time an
   equivalent portion of memory is allocated, loaded into the allocated
   cache portion, and exposed as a character device.
2) During the second stage a user-space application maps (mmap()) the
   pseudo-locked memory into its address space.

Cache Pseudo-Locking Interface
------------------------------

A pseudo-locked region is created using the resctrl interface as follows:

1) Create a new resource group by creating a new directory in /sys/fs/resctrl.
2) Change the new resource group's mode to "pseudo-locksetup" by writing
   "pseudo-locksetup" to the "mode" file.
3) Write the schemata of the pseudo-locked region to the "schemata" file. All
   bits within the schemata should be "unused" according to the "bit_usage"
   file.

On successful pseudo-locked region creation the "mode" file will contain
"pseudo-locked" and a new character device with the same name as the resource
group will exist in /dev/pseudo_lock. This character device can be mmap()'ed
by user space in order to obtain access to the pseudo-locked memory region.

An example of cache pseudo-locked region creation and usage can be found below.

Cache Pseudo-Locking Debugging Interface
----------------------------------------

The pseudo-locking debugging interface is enabled by default (if
CONFIG_DEBUG_FS is enabled) and can be found in /sys/kernel/debug/resctrl.

There is no explicit way for the kernel to test if a provided memory
location is present in the cache. The pseudo-locking debugging interface uses
the tracing infrastructure to provide two ways to measure cache residency of
the pseudo-locked region:

1) Memory access latency using the pseudo_lock_mem_latency tracepoint. Data
   from these measurements are best visualized using a hist trigger (see
   example below). In this test the pseudo-locked region is traversed at
   a stride of 32 bytes while hardware prefetchers and preemption
   are disabled. This also provides a substitute visualization of cache
   hits and misses.
2) Cache hit and miss measurements using model specific precision counters if
   available. Depending on the levels of cache on the system the pseudo_lock_l2
   and pseudo_lock_l3 tracepoints are available.

When a pseudo-locked region is created a new debugfs directory is created for
it in debugfs as /sys/kernel/debug/resctrl/<newdir>. A single
write-only file, pseudo_lock_measure, is present in this directory. The
measurement of the pseudo-locked region depends on the number written to this
debugfs file:

1:
    writing "1" to the pseudo_lock_measure file will trigger the latency
    measurement captured in the pseudo_lock_mem_latency tracepoint. See
    example below.
2:
    writing "2" to the pseudo_lock_measure file will trigger the L2 cache
    residency (cache hits and misses) measurement captured in the
    pseudo_lock_l2 tracepoint. See example below.
3:
    writing "3" to the pseudo_lock_measure file will trigger the L3 cache
    residency (cache hits and misses) measurement captured in the
    pseudo_lock_l3 tracepoint.

All measurements are recorded with the tracing infrastructure. This requires
the relevant tracepoints to be enabled before the measurement is triggered.

Example of latency debugging interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this example a pseudo-locked region named "newlock" was created.
Here is how we can measure the latency in cycles of reading from this region
and visualize this data with a histogram that is available if
CONFIG_HIST_TRIGGERS is set::

  # :> /sys/kernel/debug/tracing/trace
  # echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
  # echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
  # echo 1 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
  # echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
  # cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist

  # event histogram
  #
  # trigger info: hist:keys=latency:vals=hitcount:sort=hitcount:size=2048 [active]
  #

  { latency: 456 } hitcount:    1
  { latency:  50 } hitcount:   83
  { latency:  36 } hitcount:   96
  { latency:  44 } hitcount:  174
  { latency:  48 } hitcount:  195
  { latency:  46 } hitcount:  262
  { latency:  42 } hitcount:  693
  { latency:  40 } hitcount: 3204
  { latency:  38 } hitcount: 3484

  Totals:
      Hits: 8192
      Entries: 9
      Dropped: 0

Example of cache hits/misses debugging
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this example a pseudo-locked region named "newlock" was created on the L2
cache of a platform. Here is how we can obtain details of the cache hits
and misses using the platform's precision counters.
::

  # :> /sys/kernel/debug/tracing/trace
  # echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
  # echo 2 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
  # echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
  # cat /sys/kernel/debug/tracing/trace

  # tracer: nop
  #
  #                              _-----=> irqs-off
  #                             / _----=> need-resched
  #                            | / _---=> hardirq/softirq
  #                            || / _--=> preempt-depth
  #                            ||| /     delay
  #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
  #              | |       |   ||||       |         |
  pseudo_lock_mea-1672  [002] ....  3132.860500: pseudo_lock_l2: hits=4097 miss=0


Examples for RDT allocation usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1) Example 1

On a two socket machine (one L3 cache per socket) with just four bits
for cache bit masks, minimum b/w of 10% with a memory bandwidth
granularity of 10%.
::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir p0 p1
  # echo -e "L3:0=3;1=c\nMB:0=50;1=50" > /sys/fs/resctrl/p0/schemata
  # echo -e "L3:0=3;1=3\nMB:0=50;1=50" > /sys/fs/resctrl/p1/schemata

The default resource group is unmodified, so we have access to all parts
of all caches (its schemata file reads "L3:0=f;1=f").

Tasks that are under the control of group "p0" may only allocate from the
"lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
Tasks in group "p1" use the "lower" 50% of cache on both sockets.

Similarly, tasks that are under the control of group "p0" may use a
maximum memory b/w of 50% on socket 0 and 50% on socket 1.
Tasks in group "p1" may also use 50% memory b/w on both sockets.
Note that unlike cache masks, memory b/w cannot specify whether these
allocations can overlap or not. The allocation specifies the maximum
b/w that the group may be able to use and the system admin can configure
the b/w accordingly.

If resctrl is using the software controller (mba_sc) then the user can enter
the max b/w in MB rather than the percentage values.
::

  # echo -e "L3:0=3;1=c\nMB:0=1024;1=500" > /sys/fs/resctrl/p0/schemata
  # echo -e "L3:0=3;1=3\nMB:0=1024;1=500" > /sys/fs/resctrl/p1/schemata

In the above example the tasks in "p1" and "p0" on socket 0 would use a max
b/w of 1024MB whereas on socket 1 they would use 500MB.

2) Example 2

Again two sockets, but this time with a more realistic 20-bit mask.

Two real time tasks pid=1234 running on processor 0 and pid=5678 running on
processor 1 on socket 0 on a 2-socket and dual core machine. To avoid noisy
neighbors, each of the two real-time tasks exclusively occupies one quarter
of L3 cache on socket 0.
::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl

First we reset the schemata for the default group so that the "upper"
50% of the L3 cache on socket 0 and 50% of memory b/w cannot be used by
ordinary tasks::

  # echo -e "L3:0=3ff;1=fffff\nMB:0=50;1=100" > schemata

Next we make a resource group for our first real time task and give
it access to the "top" 25% of the cache on socket 0.
::

  # mkdir p0
  # echo "L3:0=f8000;1=fffff" > p0/schemata

Finally we move our first real time task into this resource group. We
also use taskset(1) to ensure the task always runs on a dedicated CPU
on socket 0. Most uses of resource groups will also constrain which
processors tasks run on.
::

  # echo 1234 > p0/tasks
  # taskset -cp 1 1234

Ditto for the second real time task (with the remaining 25% of cache)::

  # mkdir p1
  # echo "L3:0=7c00;1=fffff" > p1/schemata
  # echo 5678 > p1/tasks
  # taskset -cp 2 5678

For the same 2 socket system with memory b/w resource and CAT L3 the
schemata would look like (assuming min_bandwidth is 10 and bandwidth_gran
is 10):

For our first real time task this would request 20% memory b/w on socket 0.
::

  # echo -e "L3:0=f8000;1=fffff\nMB:0=20;1=100" > p0/schemata

For our second real time task this would request another 20% memory b/w
on socket 0.
::

  # echo -e "L3:0=7c00;1=fffff\nMB:0=20;1=100" > p1/schemata

3) Example 3

A single socket system which has real-time tasks running on cores 4-7 and
a non real-time workload assigned to cores 0-3. The real-time tasks share
text and data, so a per task association is not required and due to
interaction with the kernel it's desired that the kernel on these cores
shares L3 with the tasks.
::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl

First we reset the schemata for the default group so that the "upper"
50% of the L3 cache on socket 0, and 50% of memory bandwidth on socket 0
cannot be used by ordinary tasks::

  # echo -e "L3:0=3ff\nMB:0=50" > schemata

Next we make a resource group for our real time cores and give it access
to the "top" 50% of the cache on socket 0 and 50% of memory bandwidth on
socket 0.
::

  # mkdir p0
  # echo -e "L3:0=ffc00\nMB:0=50" > p0/schemata

Finally we move cores 4-7 over to the new group and make sure that the
kernel and the tasks running there get 50% of the cache. They should
also get 50% of memory bandwidth assuming that the cores 4-7 are SMT
siblings and only the real time threads are scheduled on the cores 4-7.
::

  # echo F0 > p0/cpus

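
The assignment can be double-checked through the "cpus_list" file (a sketch;
the output assumes the F0 mask written above, i.e. CPUs 4-7)::

  # cat p0/cpus_list
  4-7
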
4) Example 4

The resource groups in previous examples were all in the default "shareable"
mode allowing sharing of their cache allocations. If one resource group
configures a cache allocation then nothing prevents another resource group
from overlapping with that allocation.

In this example a new exclusive resource group will be created on an L2 CAT
system with two L2 cache instances that can be configured with an 8-bit
capacity bitmask. The new exclusive resource group will be configured to use
25% of each cache instance.
::

  # mount -t resctrl resctrl /sys/fs/resctrl/
  # cd /sys/fs/resctrl

First, we observe that the default group is configured to allocate to all L2
cache::

  # cat schemata
  L2:0=ff;1=ff

We could attempt to create the new resource group at this point, but it will
fail because of the overlap with the schemata of the default group::

  # mkdir p0
  # echo 'L2:0=0x3;1=0x3' > p0/schemata
  # cat p0/mode
  shareable
  # echo exclusive > p0/mode
  -sh: echo: write error: Invalid argument
  # cat info/last_cmd_status
  schemata overlaps

To ensure that there is no overlap with another resource group the default
resource group's schemata has to change, making it possible for the new
resource group to become exclusive.
::

  # echo 'L2:0=0xfc;1=0xfc' > schemata
  # echo exclusive > p0/mode
  # grep . p0/*
  p0/cpus:0
  p0/mode:exclusive
  p0/schemata:L2:0=03;1=03
  p0/size:L2:0=262144;1=262144

A new resource group will on creation not overlap with an exclusive resource
group::

  # mkdir p1
  # grep . p1/*
  p1/cpus:0
  p1/mode:shareable
  p1/schemata:L2:0=fc;1=fc
  p1/size:L2:0=786432;1=786432

The bit_usage will reflect how the cache is used::

  # cat info/L2/bit_usage
  0=SSSSSSEE;1=SSSSSSEE

A resource group cannot be forced to overlap with an exclusive resource group::

  # echo 'L2:0=0x1;1=0x1' > p1/schemata
  -sh: echo: write error: Invalid argument
  # cat info/last_cmd_status
  overlaps with exclusive group

Example of Cache Pseudo-Locking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Lock a portion of the L2 cache from cache id 1 using CBM 0x3. The
pseudo-locked region is exposed at /dev/pseudo_lock/newlock and can be
provided to an application as an argument to mmap().
::

  # mount -t resctrl resctrl /sys/fs/resctrl/
  # cd /sys/fs/resctrl

Ensure that there are bits available that can be pseudo-locked. Since only
unused bits can be pseudo-locked, the bits to be pseudo-locked need to be
removed from the default resource group's schemata::

  # cat info/L2/bit_usage
  0=SSSSSSSS;1=SSSSSSSS
  # echo 'L2:1=0xfc' > schemata
  # cat info/L2/bit_usage
  0=SSSSSSSS;1=SSSSSS00

Create a new resource group that will be associated with the pseudo-locked
region, indicate that it will be used for a pseudo-locked region, and
configure the requested pseudo-locked region capacity bitmask::

  # mkdir newlock
  # echo pseudo-locksetup > newlock/mode
  # echo 'L2:1=0x3' > newlock/schemata

On success the resource group's mode will change to pseudo-locked, the
bit_usage will reflect the pseudo-locked region, and the character device
exposing the pseudo-locked region will exist::

  # cat newlock/mode
  pseudo-locked
  # cat info/L2/bit_usage
  0=SSSSSSSS;1=SSSSSSPP
  # ls -l /dev/pseudo_lock/newlock
  crw------- 1 root root 243, 0 Apr 3 05:01 /dev/pseudo_lock/newlock

::

  /*
   * Example code to access one page of pseudo-locked cache region
   * from user space.
   */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <sched.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/mman.h>

  /*
   * It is required that the application runs with affinity to only
   * cores associated with the pseudo-locked region. Here the cpu
   * is hardcoded for convenience of the example.
   */
  static int cpuid = 2;

  int main(int argc, char *argv[])
  {
          cpu_set_t cpuset;
          long page_size;
          void *mapping;
          int dev_fd;
          int ret;

          page_size = sysconf(_SC_PAGESIZE);

          CPU_ZERO(&cpuset);
          CPU_SET(cpuid, &cpuset);
          ret = sched_setaffinity(0, sizeof(cpuset), &cpuset);
          if (ret < 0) {
                  perror("sched_setaffinity");
                  exit(EXIT_FAILURE);
          }

          dev_fd = open("/dev/pseudo_lock/newlock", O_RDWR);
          if (dev_fd < 0) {
                  perror("open");
                  exit(EXIT_FAILURE);
          }

          mapping = mmap(0, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                         dev_fd, 0);
          if (mapping == MAP_FAILED) {
                  perror("mmap");
                  close(dev_fd);
                  exit(EXIT_FAILURE);
          }

          /* Application interacts with pseudo-locked memory @mapping */

          ret = munmap(mapping, page_size);
          if (ret < 0) {
                  perror("munmap");
                  close(dev_fd);
                  exit(EXIT_FAILURE);
          }

          close(dev_fd);
          exit(EXIT_SUCCESS);
  }

Locking between applications
----------------------------

Certain operations on the resctrl filesystem, composed of read/writes
to/from multiple files, must be atomic.

As an example, the allocation of an exclusive reservation of L3 cache
involves:

  1. Read the cbmmasks from each directory or the per-resource "bit_usage"
  2. Find a contiguous set of bits in the global CBM bitmask that is clear
     in all of the directory cbmmasks
  3. Create a new directory
  4. Set the bits found in step 2 to the new directory "schemata" file

If two applications attempt to allocate space concurrently then they can
end up allocating the same bits so the reservations are shared instead of
exclusive.

To coordinate atomic operations on the resctrlfs and to avoid the problem
above, the following locking procedure is recommended:

Locking is based on flock, which is available in libc and also as a shell
script command.

Write lock:

 A) Take flock(LOCK_EX) on /sys/fs/resctrl
 B) Read/write the directory structure.
 C) funlock

Read lock:

 A) Take flock(LOCK_SH) on /sys/fs/resctrl
 B) If success read the directory structure.
 C) funlock

Example with bash::

  # Atomically read directory structure
  $ flock -s /sys/fs/resctrl/ find /sys/fs/resctrl

  # Read directory contents and create new subdirectory

  $ cat create-dir.sh
  find /sys/fs/resctrl/ > output.txt
  mask = function-of(output.txt)
  mkdir /sys/fs/resctrl/newres/
  echo mask > /sys/fs/resctrl/newres/schemata

  $ flock /sys/fs/resctrl/ ./create-dir.sh

Example with C::

  /*
   * Example code to take advisory locks
   * before accessing resctrl filesystem
   */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/file.h>

  void resctrl_take_shared_lock(int fd)
  {
          int ret;

          /* take shared lock on resctrl filesystem */
          ret = flock(fd, LOCK_SH);
          if (ret) {
                  perror("flock");
                  exit(-1);
          }
  }

  void resctrl_take_exclusive_lock(int fd)
  {
          int ret;

          /* take exclusive lock on resctrl filesystem */
          ret = flock(fd, LOCK_EX);
          if (ret) {
                  perror("flock");
                  exit(-1);
          }
  }

  void resctrl_release_lock(int fd)
  {
          int ret;

          /* release lock on resctrl filesystem */
          ret = flock(fd, LOCK_UN);
          if (ret) {
                  perror("flock");
                  exit(-1);
          }
  }

  int main(void)
  {
          int fd;

          fd = open("/sys/fs/resctrl", O_DIRECTORY);
          if (fd == -1) {
                  perror("open");
                  exit(-1);
          }
          resctrl_take_shared_lock(fd);
          /* code to read directory contents */
          resctrl_release_lock(fd);

          resctrl_take_exclusive_lock(fd);
          /* code to read and write directory contents */
          resctrl_release_lock(fd);

          return 0;
  }

Examples for RDT Monitoring along with allocation usage
=======================================================

Reading monitored data
----------------------

Reading an event file (for ex: mon_data/mon_L3_00/llc_occupancy) would
show the current snapshot of LLC occupancy of the corresponding MON
group or CTRL_MON group.


Example 1 (Monitor CTRL_MON group and subset of tasks in CTRL_MON group)
------------------------------------------------------------------------

On a two socket machine (one L3 cache per socket) with just four bits
for cache bit masks::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir p0 p1
  # echo "L3:0=3;1=c" > /sys/fs/resctrl/p0/schemata
  # echo "L3:0=3;1=3" > /sys/fs/resctrl/p1/schemata
  # echo 5678 > p1/tasks
  # echo 5679 > p1/tasks

The default resource group is unmodified, so we have access to all parts
of all caches (its schemata file reads "L3:0=f;1=f").

Tasks that are under the control of group "p0" may only allocate from the
"lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
Tasks in group "p1" use the "lower" 50% of cache on both sockets.

Create monitor groups and assign a subset of tasks to each monitor group.
::

  # cd /sys/fs/resctrl/p1/mon_groups
  # mkdir m11 m12
  # echo 5678 > m11/tasks
  # echo 5679 > m12/tasks

fetch data (data shown in bytes)
::

  # cat m11/mon_data/mon_L3_00/llc_occupancy
  16234000
  # cat m11/mon_data/mon_L3_01/llc_occupancy
  14789000
  # cat m12/mon_data/mon_L3_00/llc_occupancy
  16789000

The parent ctrl_mon group shows the aggregated data.
::

  # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
  31234000

Example 2 (Monitor a task from its creation)
--------------------------------------------

On a two socket machine (one L3 cache per socket)::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir p0 p1

An RMID is allocated to the group once it is created and hence the <cmd>
below is monitored from its creation.
::

  # echo $$ > /sys/fs/resctrl/p1/tasks
  # <cmd>

Fetch the data::

  # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
  31789000

Example 3 (Monitor without CAT support or before creating CAT groups)
---------------------------------------------------------------------

Assume a system like HSW that has only CQM and no CAT support. In this case
resctrl will still mount but CTRL_MON directories cannot be created. The
user can still create different MON groups within the root group and thereby
monitor all tasks, including kernel threads.

This can also be used to profile a job's cache footprint before being
able to allocate it to a different allocation group.
::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir mon_groups/m01
  # mkdir mon_groups/m02

  # echo 3478 > /sys/fs/resctrl/mon_groups/m01/tasks
  # echo 2467 > /sys/fs/resctrl/mon_groups/m02/tasks

Monitor the groups separately and also get per domain data. From the
below it is apparent that the tasks are mostly doing work on
domain(socket) 0.
::

  # cat /sys/fs/resctrl/mon_groups/m01/mon_data/mon_L3_00/llc_occupancy
  31234000
  # cat /sys/fs/resctrl/mon_groups/m01/mon_data/mon_L3_01/llc_occupancy
  34555
  # cat /sys/fs/resctrl/mon_groups/m02/mon_data/mon_L3_00/llc_occupancy
  31234000
  # cat /sys/fs/resctrl/mon_groups/m02/mon_data/mon_L3_01/llc_occupancy
  32789


Example 4 (Monitor real time tasks)
-----------------------------------

A single socket system which has real time tasks running on cores 4-7
and non real time tasks on other cpus. We want to monitor the cache
occupancy of the real time threads on these cores.
::

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir p1

Move the cpus 4-7 over to p1::

  # echo f0 > p1/cpus

View the llc occupancy snapshot::

  # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
  11234000

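
If the MBM events are listed in "info/L3_MON/mon_features", the memory
bandwidth counters of the same group can be read in the same way (a sketch;
the values are illustrative)::

  # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
  23456789
  # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
  12340000
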
Intel RDT Errata
================

Intel MBM Counters May Report System Memory Bandwidth Incorrectly
-----------------------------------------------------------------

Errata SKX99 for Skylake server and BDF102 for Broadwell server.

Problem: Intel Memory Bandwidth Monitoring (MBM) counters track metrics
according to the assigned Resource Monitor ID (RMID) for that logical
core. The IA32_QM_CTR register (MSR 0xC8E), used to report these
metrics, may report incorrect system bandwidth for certain RMID values.

Implication: Due to the errata, system memory bandwidth may not match
what is reported.

Workaround: MBM total and local readings are corrected according to the
following correction factor table:

+---------------+---------------+---------------+-----------------+
|core count     |rmid count     |rmid threshold |correction factor|
+---------------+---------------+---------------+-----------------+
|1              |8              |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|2              |16             |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|3              |24             |15             |0.969650         |
+---------------+---------------+---------------+-----------------+
|4              |32             |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|6              |48             |31             |0.969650         |
+---------------+---------------+---------------+-----------------+
|7              |56             |47             |1.142857         |
+---------------+---------------+---------------+-----------------+
|8              |64             |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|9              |72             |63             |1.185115         |
+---------------+---------------+---------------+-----------------+
|10             |80             |63             |1.066553         |
+---------------+---------------+---------------+-----------------+
|11             |88             |79             |1.454545         |
+---------------+---------------+---------------+-----------------+
|12             |96             |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|13             |104            |95             |1.230769         |
+---------------+---------------+---------------+-----------------+
|14             |112            |95             |1.142857         |
+---------------+---------------+---------------+-----------------+
|15             |120            |95             |1.066667         |
+---------------+---------------+---------------+-----------------+
|16             |128            |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|17             |136            |127            |1.254863         |
+---------------+---------------+---------------+-----------------+
|18             |144            |127            |1.185255         |
+---------------+---------------+---------------+-----------------+
|19             |152            |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|20             |160            |127            |1.066667         |
+---------------+---------------+---------------+-----------------+
|21             |168            |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|22             |176            |159            |1.454334         |
+---------------+---------------+---------------+-----------------+
|23             |184            |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|24             |192            |127            |0.969744         |
+---------------+---------------+---------------+-----------------+
|25             |200            |191            |1.280246         |
+---------------+---------------+---------------+-----------------+
|26             |208            |191            |1.230921         |
+---------------+---------------+---------------+-----------------+
|27             |216            |0              |1.000000         |
+---------------+---------------+---------------+-----------------+
|28             |224            |191            |1.143118         |
+---------------+---------------+---------------+-----------------+

If rmid > rmid threshold, MBM total and local values should be multiplied
by the correction factor.

See:

1. Erratum SKX99 in Intel Xeon Processor Scalable Family Specification Update:
   http://web.archive.org/web/20200716124958/https://www.intel.com/content/www/us/en/processors/xeon/scalable/xeon-scalable-spec-update.html

2. Erratum BDF102 in Intel Xeon E5-2600 v4 Processor Product Family Specification Update:
   http://web.archive.org/web/20191125200531/https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-v4-spec-update.pdf

3. The errata in Intel Resource Director Technology (Intel RDT) on 2nd Generation Intel Xeon Scalable Processors Reference Manual:
   https://software.intel.com/content/www/us/en/develop/articles/intel-resource-director-technology-rdt-reference-manual.html

for further information.