      1.. SPDX-License-Identifier: GPL-2.0
      2
      3============================================================
      4Intel(R) Speed Select Technology User Guide
      5============================================================
      6
      7The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
      8collection of features that give more granular control over CPU performance.
      9With Intel(R) SST, one server can be configured for power and performance for a
     10variety of diverse workload requirements.
     11
     12Refer to the links below for an overview of the technology:
     13
     14- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
     15- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf
     16
     17These capabilities are further enhanced in some of the newer generations of
     18server platforms where these features can be enumerated and controlled
     19dynamically without pre-configuring via BIOS setup options. This dynamic
     20configuration is done via mailbox commands to the hardware. One way to enumerate
     21and configure these features is by using the Intel Speed Select utility.
     22
     23This document explains how to use the Intel Speed Select tool to enumerate and
     24control Intel(R) SST features. This document gives example commands and explains
     25how these commands change the power and performance profile of the system under
     26test. Using this tool as an example, customers can replicate the messaging
     27implemented in the tool in their production software.
     28
     29intel-speed-select configuration tool
     30======================================
     31
Many Linux distributions ship the "intel-speed-select" tool as a package. If it
is not available, it can be built from the Linux kernel tree, which can be
downloaded from kernel.org. The tool can be built without building the full
kernel.
     35
     36From the kernel tree, run the following commands::
     37
 # cd tools/power/x86/intel-speed-select/
 # make
 # make install
     41
     42Getting Help
     43------------
     44
     45To get help with the tool, execute the command below::
     46
 # intel-speed-select --help

The top-level help describes arguments and features. Notice that there is a
multi-level help structure in the tool. For example, to get help for the
feature "perf-profile"::

 # intel-speed-select perf-profile --help

To get help on a command, another level of help is provided. For example, for
the command "info"::

 # intel-speed-select perf-profile info --help
     57
     58Summary of platform capability
     59------------------------------
     60To check the current platform and driver capabilities, execute::
     61
 # intel-speed-select --info
     63
     64For example on a test system::
     65
     66 # intel-speed-select --info
     67 Intel(R) Speed Select Technology
     68 Executing on CPU model: X
     69 Platform: API version : 1
     70 Platform: Driver version : 1
     71 Platform: mbox supported : 1
     72 Platform: mmio supported : 1
     73 Intel(R) SST-PP (feature perf-profile) is supported
     74 TDP level change control is unlocked, max level: 4
     75 Intel(R) SST-TF (feature turbo-freq) is supported
     76 Intel(R) SST-BF (feature base-freq) is not supported
     77 Intel(R) SST-CP (feature core-power) is supported
     78
     79Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
     80------------------------------------------------------------------------
     81
This feature allows configuration of a server dynamically based on workload
performance requirements. This helps users during deployment as they do not have
to choose a specific server configuration statically. The Intel(R) Speed Select
Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system. Each profile
defines a set of CPUs that must be online, with the rest offline, to sustain a
guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the user can
expect the base frequency to change dynamically. This feature is called
"perf-profile" when using the Intel Speed Select tool.
     92
     93Number or performance levels
     94~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     95
     96There can be multiple performance profiles on a system. To get the number of
     97profiles, execute the command below::
     98
     99 # intel-speed-select perf-profile get-config-levels
    100 Intel(R) Speed Select Technology
    101 Executing on CPU model: X
    102 package-0
    103  die-0
    104    cpu-0
    105        get-config-levels:4
    106 package-1
    107  die-0
    108    cpu-14
    109        get-config-levels:4
    110
    111On this system under test, there are 4 performance profiles in addition to the
    112base performance profile (which is performance level 0).
    113
    114Lock/Unlock status
    115~~~~~~~~~~~~~~~~~~
    116
Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. There may be a BIOS setup option to unlock them; otherwise,
check with your system vendor.
    121
    122To check if the system is locked, execute the following command::
    123
    124 # intel-speed-select perf-profile get-lock-status
    125 Intel(R) Speed Select Technology
    126 Executing on CPU model: X
    127 package-0
    128  die-0
    129    cpu-0
    130        get-lock-status:0
    131 package-1
    132  die-0
    133    cpu-14
    134        get-lock-status:0
    135
    136In this case, lock status is 0, which means that the system is unlocked.
    137
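A script can make the same check by scanning the command output. The sketch
below assumes the output format shown above and that any non-zero
"get-lock-status" value means locked; the function name ``sst_pp_locked`` is
hypothetical and not part of the tool. It reads the tool output from standard
input, so the caller decides how to capture it (the tool may print to standard
error, in which case a ``2>&1`` redirect is needed).

```shell
# Hypothetical helper: succeeds (exit 0) if any package/die reports a
# non-zero lock status in "perf-profile get-lock-status" output on stdin.
sst_pp_locked() {
    grep -Eq 'get-lock-status:[[:space:]]*[1-9]'
}

# Example input copied from the output shown above (both packages unlocked).
sample='package-0
  die-0
    cpu-0
        get-lock-status:0
package-1
  die-0
    cpu-14
        get-lock-status:0'

if printf '%s\n' "$sample" | sst_pp_locked; then
    echo "locked"
else
    echo "unlocked"
fi
```
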
    138Properties of a performance level
    139~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    140
To get the properties of a specific performance level (level 0 in the example
below), execute::
    142
    143 # intel-speed-select perf-profile info -l 0
    144 Intel(R) Speed Select Technology
    145 Executing on CPU model: X
    146 package-0
    147  die-0
    148    cpu-0
    149      perf-profile-level-0
    150        cpu-count:28
    151        enable-cpu-mask:000003ff,f0003fff
    152        enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
    153        thermal-design-power-ratio:26
    154        base-frequency(MHz):2600
    155        speed-select-turbo-freq:disabled
    156        speed-select-base-freq:disabled
    157	...
    158	...
    159
Here, the -l option specifies a performance level.

If the -l option is omitted, this command prints information about all
performance levels. The command above prints the properties of performance
level 0.

For this performance profile, at most the CPUs listed in
"enable-cpu-mask/enable-cpu-list" can be online. When that condition is met,
the base frequency of 2600 MHz can be maintained. To understand more, execute
"intel-speed-select perf-profile info" for performance level 4::
    171
    172 # intel-speed-select perf-profile info -l 4
    173 Intel(R) Speed Select Technology
    174 Executing on CPU model: X
    175 package-0
    176  die-0
    177    cpu-0
    178      perf-profile-level-4
    179        cpu-count:28
    180        enable-cpu-mask:000000fa,f0000faf
    181        enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
    182        thermal-design-power-ratio:28
    183        base-frequency(MHz):2800
    184        speed-select-turbo-freq:disabled
    185        speed-select-base-freq:unsupported
    186	...
    187	...
    188
    189There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
    190the user only keeps these CPUs online and the rest "offline," then the base
    191frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
    192
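The "enable-cpu-mask" and "enable-cpu-list" fields describe the same set of
CPUs. As an illustration, the sketch below expands a mask into a list. It
assumes the mask is a comma-separated sequence of 32-bit hexadecimal words with
the most significant word first, which is consistent with the mask/list pairs
shown above; the function name ``mask_to_cpu_list`` is hypothetical.

```shell
# Hypothetical helper: expand a hex CPU mask such as "000003ff,f0003fff"
# into a comma-separated CPU list. Assumes 32-bit words, most significant
# word first.
mask_to_cpu_list() {
    local mask=$1 words=() cpus=() n i bit word
    IFS=',' read -r -a words <<< "$mask"
    n=${#words[@]}
    for (( i = 0; i < n; i++ )); do
        # Walk the words from least significant (last) to most significant.
        word=$(( 16#${words[n - 1 - i]} ))
        for (( bit = 0; bit < 32; bit++ )); do
            if (( (word >> bit) & 1 )); then
                cpus+=( $(( i * 32 + bit )) )
            fi
        done
    done
    local IFS=','
    echo "${cpus[*]}"
}

# Masks taken from the level 0 and level 4 output above.
mask_to_cpu_list 000003ff,f0003fff
mask_to_cpu_list 000000fa,f0000faf
```

For the two masks above, the printed lists match the "enable-cpu-list" fields
of performance levels 0 and 4.
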
    193Get current performance level
    194~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    195
    196To get the current performance level, execute::
    197
    198 # intel-speed-select perf-profile get-config-current-level
    199 Intel(R) Speed Select Technology
    200 Executing on CPU model: X
    201 package-0
    202  die-0
    203    cpu-0
    204        get-config-current_level:0
    205
    206First verify that the base_frequency displayed by the cpufreq sysfs is correct::
    207
    208 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
    209 2600000
    210
This matches the base-frequency(MHz) field value displayed by the
"perf-profile info" command for performance level 0 (the cpufreq frequency is
in kHz).
    214
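Because the two sources use different units, a script comparing them has to
convert first. The following is a minimal sketch, assuming the sysfs file holds
a single integer in kHz as shown above; the function name ``base_freq_mhz`` and
its optional path argument are hypothetical conveniences so the conversion can
be tried without the hardware.

```shell
# Hypothetical helper: print a cpufreq base frequency in MHz.
# $1 is the sysfs file to read; it defaults to the CPU 0 path used above.
base_freq_mhz() {
    local file=${1:-/sys/devices/system/cpu/cpu0/cpufreq/base_frequency}
    local khz
    khz=$(<"$file") || return 1
    echo $(( khz / 1000 ))
}

# Demonstrate with a temporary file holding the value shown above.
tmp=$(mktemp)
echo 2600000 > "$tmp"
base_freq_mhz "$tmp"
rm -f "$tmp"
```
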
    215To check if the average frequency is equal to the base frequency for a 100% busy
    216workload, disable turbo::
    217
 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Then run a busy workload on all CPUs, for example::

 # stress -c 64
    223
    224To verify the base frequency, run turbostat::
    225
    226 #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
    227
    228  Package	Core	CPU	Bzy_MHz
    229		-	-	2600
    230  0		0	0	2600
    231  0		1	1	2600
    232  0		2	2	2600
    233  0		3	3	2600
    234  0		4	4	2600
    235  .		.	.	.
    236
    237
    238Changing performance level
    239~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    240
To change the performance level to 4, execute::
    242
    243 # intel-speed-select -d perf-profile set-config-level -l 4 -o
    244 Intel(R) Speed Select Technology
    245 Executing on CPU model: X
    246 package-0
    247  die-0
    248    cpu-0
    249      perf-profile
    250        set_tdp_level:success
    251
    252In the command above, "-o" is optional. If it is specified, then it will also
    253offline CPUs which are not present in the enable_cpu_mask for this performance
    254level.
    255
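To see which CPUs the "-o" option would affect, the enable lists of two levels
can be compared. The sketch below only prints the difference and does not
change any CPU state. It assumes that the level 0 "enable-cpu-list" shown
earlier covers all CPUs of package 0; the function name ``cpus_not_in`` is
hypothetical.

```shell
# Hypothetical helper: print the CPUs present in list $1 but absent from
# list $2 (both comma-separated, no ranges).
cpus_not_in() {
    local all=$1 keep=",$2," cpu cpus=() out=()
    IFS=',' read -r -a cpus <<< "$all"
    for cpu in "${cpus[@]}"; do
        [[ $keep == *",$cpu,"* ]] || out+=( "$cpu" )
    done
    local IFS=','
    echo "${out[*]}"
}

# enable-cpu-list values copied from the level 0 and level 4 output above.
level0=0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
level4=0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
cpus_not_in "$level0" "$level4"
```

Under that assumption, the printed CPUs are the ones on package 0 that are
outside the level 4 enable list.
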
    256Now if the base_frequency is checked::
    257
    258 #cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
    259 2800000
    260
This shows that the base frequency has increased from 2600 MHz at performance
level 0 to 2800 MHz at performance level 4. As a result, any workload that can
use fewer CPUs can see a boost of 200 MHz compared to performance level 0.
    264
    265Changing performance level via BMC Interface
    266~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    267
It is possible to change the SST-PP level using an out-of-band (OOB) agent (via
a remote management console, through the BMC (Baseboard Management Controller)
interface). This mode is supported starting with the Sapphire Rapids processor
generation. The kernel and tool changes to support this mode were added in
Linux kernel version 5.18. To enable this feature, the kernel config
"CONFIG_INTEL_HFI_THERMAL" is required. The minimum tool version that supports
this feature is "v1.12", which is part of Linux kernel version 5.18.

To support such a configuration, the tool can be run as a daemon by adding
the --oob command line option::
    278
    279 # intel-speed-select --oob
    280 Intel(R) Speed Select Technology
    281 Executing on CPU model:143[0x8f]
    282 OOB mode is enabled and will run as daemon
    283
    284In this mode the tool will online/offline CPUs based on the new performance
    285level.
    286
    287Check presence of other Intel(R) SST features
    288---------------------------------------------
    289
Each of the performance profiles also specifies whether the other two
Intel(R) SST features are supported: Intel(R) Speed Select Technology - Base
Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo
Frequency (Intel(R) SST-TF).
    294
    295For example, from the output of "perf-profile info" above, for level 0 and level
    2964:
    297
For level 0::

       speed-select-turbo-freq:disabled
       speed-select-base-freq:disabled

For level 4::

       speed-select-turbo-freq:disabled
       speed-select-base-freq:unsupported
    305
    306Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
    307changed from "disabled" to "unsupported" compared to performance level 0.
    308
    309This means that at performance level 4, the "speed-select-base-freq" feature is
    310not supported. However, at performance level 0, this feature is "supported", but
    311currently "disabled", meaning the user has not activated this feature. Whereas
    312"speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both performance
    313levels, but currently not activated by the user.
    314
    315The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
    316technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
    317The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
    318is supported on a platform.
    319
    320Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
    321---------------------------------------------------------------
    322
Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
allows users to define per-core priority. It defines a mechanism to distribute
power among cores in a power-constrained scenario, using a class of service
(CLOS) configuration.

The user can configure up to 4 classes of service. Each CLOS group
configuration allows the definition of parameters that affect how the frequency
can be limited and how power is distributed. Each CPU core can be tied to a
class of service and hence an associated priority. The granularity is at the
core level, not at the per-CPU level.
    333
    334Enable CLOS based prioritization
    335~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    336
To use the CLOS-based prioritization feature, the firmware must be told to
enable it and which priority type to use. There is a default per-platform
priority type, which can be changed with an optional command line parameter.
    340
    341To enable and check the options, execute::
    342
    343 # intel-speed-select core-power enable --help
    344 Intel(R) Speed Select Technology
    345 Executing on CPU model: X
    346 Enable core-power for a package/die
    347	Clos Enable: Specify priority type with [--priority|-p]
    348		 0: Proportional, 1: Ordered
    349
There are two priority types:

- Ordered

Priority for ordered throttling is defined by the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).
    356
    357Priority order is:
    358CLOS0 > CLOS1 > CLOS2 > CLOS3.
    359
    360- Proportional
    361
    362When proportional priority is used, there is an additional parameter called
    363frequency_weight, which can be specified per CLOS group. The goal of
    364proportional priority is to provide each core with the requested min., then
    365distribute all remaining (excess/deficit) budgets in proportion to a defined
    366weight. This proportional priority can be configured using "core-power config"
    367command.
    368
    369To enable with the platform default priority type, execute::
    370
    371 # intel-speed-select core-power enable
    372 Intel(R) Speed Select Technology
    373 Executing on CPU model: X
    374 package-0
    375  die-0
    376    cpu-0
    377      core-power
    378        enable:success
    379 package-1
    380  die-0
    381    cpu-6
    382      core-power
    383        enable:success
    384
The scope of this enable operation is per package, or per die when a package
contains multiple dies. To check if CLOS is enabled and to get the priority
type, the "core-power info" command can be used. For example, to check the
status of the core-power feature on CPU 0, execute::
    389
    390 # intel-speed-select -c 0 core-power info
    391 Intel(R) Speed Select Technology
    392 Executing on CPU model: X
    393 package-0
    394  die-0
    395    cpu-0
    396      core-power
    397        support-status:supported
    398        enable-status:enabled
    399        clos-enable-status:enabled
    400        priority-type:proportional
    401 package-1
    402  die-0
    403    cpu-24
    404      core-power
    405        support-status:supported
    406        enable-status:enabled
    407        clos-enable-status:enabled
    408        priority-type:proportional
    409
    410Configuring CLOS groups
    411~~~~~~~~~~~~~~~~~~~~~~~
    412
Each CLOS group has its own attributes, including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config"
command. Defaults are used for any parameter the user does not set, except the
clos id, which is mandatory. To check the core-power config options, execute::
    417
    418 # intel-speed-select core-power config --help
    419 Intel(R) Speed Select Technology
    420 Executing on CPU model: X
    421 Set core-power configuration for one of the four clos ids
    422	Specify targeted clos id with [--clos|-c]
    423	Specify clos Proportional Priority [--weight|-w]
    424	Specify clos min in MHz with [--min|-n]
    425	Specify clos max in MHz with [--max|-m]
    426
    427For example::
    428
    429 # intel-speed-select core-power config -c 0
    430 Intel(R) Speed Select Technology
    431 Executing on CPU model: X
    432 clos epp is not specified, default: 0
    433 clos frequency weight is not specified, default: 0
    434 clos min is not specified, default: 0 MHz
    435 clos max is not specified, default: 25500 MHz
    436 clos desired is not specified, default: 0
    437 package-0
    438  die-0
    439    cpu-0
    440      core-power
    441        config:success
    442 package-1
    443  die-0
    444    cpu-6
    445      core-power
    446        config:success
    447
The user can change the defaults. For example, the user can set "min" to the
base frequency to always get the guaranteed base frequency.
    450
    451Get the current CLOS configuration
    452~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    453
    454To check the current configuration, "core-power get-config" can be used. For
    455example, to get the configuration of CLOS 0::
    456
    457 # intel-speed-select core-power get-config -c 0
    458 Intel(R) Speed Select Technology
    459 Executing on CPU model: X
    460 package-0
    461  die-0
    462    cpu-0
    463      core-power
    464        clos:0
    465        epp:0
    466        clos-proportional-priority:0
    467        clos-min:0 MHz
    468        clos-max:Max Turbo frequency
    469        clos-desired:0 MHz
    470 package-1
    471  die-0
    472    cpu-24
    473      core-power
    474        clos:0
    475        epp:0
    476        clos-proportional-priority:0
    477        clos-min:0 MHz
    478        clos-max:Max Turbo frequency
    479        clos-desired:0 MHz
    480
    481Associating a CPU with a CLOS group
    482~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    483
To associate a CPU with a CLOS group, the "core-power assoc" command can be used::
    485
    486 # intel-speed-select core-power assoc --help
    487 Intel(R) Speed Select Technology
    488 Executing on CPU model: X
    489 Associate a clos id to a CPU
    490	Specify targeted clos id with [--clos|-c]
    491
    492
For example, to associate CPU 10 with CLOS group 3, execute::
    494
    495 # intel-speed-select -c 10 core-power assoc -c 3
    496 Intel(R) Speed Select Technology
    497 Executing on CPU model: X
    498 package-0
    499  die-0
    500    cpu-10
    501      core-power
    502        assoc:success
    503
Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing the Linux "cpufreq" subsystem
scaling frequency limits.
    507
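To find out which CPUs are siblings of a given CPU, the Linux CPU topology
files in sysfs can be consulted. The following is a minimal sketch that reads
the standard "thread_siblings_list" file; the function name
``thread_siblings`` and its optional sysfs root argument are hypothetical.

```shell
# Hypothetical helper: print the hardware thread siblings of a CPU.
# $1 is the CPU number; $2 optionally overrides the sysfs root for testing.
thread_siblings() {
    local cpu=$1 root=${2:-/sys/devices/system/cpu}
    cat "$root/cpu$cpu/topology/thread_siblings_list"
}
```

For example, ``thread_siblings 10`` prints the CPUs that share a core with
CPU 10, and therefore share its CLOS association.
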
    508To check the existing association for a CPU, "core-power get-assoc" command can
    509be used. For example, to get association of CPU 10, execute::
    510
    511 # intel-speed-select -c 10 core-power get-assoc
    512 Intel(R) Speed Select Technology
    513 Executing on CPU model: X
    514 package-1
    515  die-0
    516    cpu-10
    517      get-assoc
    518        clos:3
    519
This shows that CPU 10 is part of CLOS group 3.
    521
    522
    523Disable CLOS based prioritization
    524~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    525
    526To disable, execute::
    527
 # intel-speed-select core-power disable

Some features, such as Intel(R) SST-TF, can only be enabled when CLOS-based
prioritization is enabled. Disabling it while Intel(R) SST-TF is enabled can
cause Intel(R) SST-TF to fail, so the "disable" command displays an error if
Intel(R) SST-TF is already enabled. To disable CLOS-based prioritization, the
Intel(R) SST-TF feature must be disabled first.
    535
    536Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
    537-------------------------------------------------------------------
    538
    539The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
    540the user control base frequency. If some critical workload threads demand
    541constant high guaranteed performance, then this feature can be used to execute
    542the thread at higher base frequency on specific sets of CPUs (high priority
    543CPUs) at the cost of lower base frequency (low priority CPUs) on other CPUs.
    544This feature does not require offline of the low priority CPUs.
    545
The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports Intel(R)
SST-BF. Consequently, first select the desired performance level to enable this
feature.

In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but is currently disabled. For example, for level 0::
    555
    556 # intel-speed-select -c 0 perf-profile info -l 0
    557 Intel(R) Speed Select Technology
    558 Executing on CPU model: X
    559 package-0
    560  die-0
    561    cpu-0
    562      perf-profile-level-0
    563        ...
    564
    565        speed-select-base-freq:disabled
    566	...
    567
Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, run a workload and measure its performance to establish a
baseline for comparison.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.
    579
    580
    581Measure baseline performance for comparison
    582~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    583
To compare, pick a multi-threaded workload where each thread can be scheduled on
a separate CPU. The "hackbench pipe" test is a good example of how performance
can be improved using Intel(R) SST-BF.

Below, the workload measures average scheduler wakeup latency, so a lower
number means better performance::
    590
    591 # taskset -c 3,4 perf bench -r 100 sched pipe
    592 # Running 'sched/pipe' benchmark:
    593 # Executed 1000000 pipe operations between two processes
    594     Total time: 6.102 [sec]
    595       6.102445 usecs/op
    596         163868 ops/sec
    597
    598While running the above test, if we take turbostat output, it will show us that
    5992 of the CPUs are busy and reaching max. frequency (which would be the base
    600frequency as the turbo is disabled). The turbostat output::
    601
    602 #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
    603 Package	Core	CPU	Bzy_MHz
    604 0		0	0	1000
    605 0		1	1	1005
    606 0		2	2	1000
    607 0		3	3	2600
    608 0		4	4	2600
    609 0		5	5	1000
    610 0		6	6	1000
    611 0		7	7	1005
    612 0		8	8	1005
    613 0		9	9	1000
    614 0		10	10	1000
    615 0		11	11	995
    616 0		12	12	1000
    617 0		13	13	1000
    618
    619From the above turbostat output, both CPU 3 and 4 are very busy and reaching
    620full guaranteed frequency of 2600 MHz.
    621
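When the turbostat table is long, the busy CPUs can be picked out with a small
filter. The sketch below assumes the four-column layout requested above
(Package, Core, CPU, Bzy_MHz) separated by whitespace; the function name
``cpus_at_or_above`` is hypothetical.

```shell
# Hypothetical helper: read turbostat rows (Package Core CPU Bzy_MHz) on
# stdin and print the CPU numbers whose Bzy_MHz is at least $1.
# Header and summary rows are skipped because their first field is not
# a number.
cpus_at_or_above() {
    awk -v min="$1" 'NF >= 4 && $1 ~ /^[0-9]+$/ && $4 + 0 >= min + 0 { print $3 }'
}

# Three rows copied from the output above.
printf '0\t\t3\t3\t2600\n0\t\t4\t4\t2600\n0\t\t5\t5\t1000\n' | cpus_at_or_above 2600
```
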
    622Intel(R) SST-BF Capabilities
    623~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    624
    625To get capabilities of Intel(R) SST-BF for the current performance level 0,
    626execute::
    627
    628 # intel-speed-select base-freq info -l 0
    629 Intel(R) Speed Select Technology
    630 Executing on CPU model: X
    631 package-0
    632  die-0
    633    cpu-0
    634      speed-select-base-freq
    635        high-priority-base-frequency(MHz):3000
    636        high-priority-cpu-mask:00000216,00002160
    637        high-priority-cpu-list:5,6,8,13,33,34,36,41
    638        low-priority-base-frequency(MHz):2400
    639        tjunction-temperature(C):125
    640        thermal-design-power(W):205
    641
The above capabilities show that some CPUs on this system can offer a base
frequency of 3000 MHz, compared to the standard base frequency at this
performance level. These CPUs are fixed, and they are presented via
high-priority-cpu-list/high-priority-cpu-mask. However, if the Intel(R) SST-BF
feature is selected, the low priority CPUs (those not in
high-priority-cpu-list) can only offer up to 2400 MHz. If this clipping of low
priority CPUs is acceptable, the user can enable the Intel(R) SST-BF feature.
This suits the "sched pipe" workload above: since only two CPUs are used, they
can be scheduled on high priority CPUs and get a boost of 400 MHz.
    652
    653Enable Intel(R) SST-BF
    654~~~~~~~~~~~~~~~~~~~~~~
    655
    656To enable Intel(R) SST-BF feature, execute::
    657
    658 # intel-speed-select base-freq enable -a
    659 Intel(R) Speed Select Technology
    660 Executing on CPU model: X
    661 package-0
    662  die-0
    663    cpu-0
    664      base-freq
    665        enable:success
    666 package-1
    667  die-0
    668    cpu-14
    669      base-freq
    670        enable:success
    671
In this case, the -a option is optional. It not only enables Intel(R) SST-BF,
but also adjusts the priority of cores using Intel(R) Speed Select Technology
Core Power (Intel(R) SST-CP) features. This option sets the minimum performance
of each Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
class to maximum performance so that the hardware will give the maximum
performance possible for each CPU.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:
    681
    682- Discover Intel(R) SST-BF and note low and high priority base frequency
    683- Note the high priority CPU list
    684- Enable CLOS using core-power feature set
    685- Configure CLOS parameters. Use CLOS.min to set to minimum performance
    686- Subscribe desired CPUs to CLOS groups
    687
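The manual steps above can be sketched as a short script. This is only an
illustration built from the commands shown in this document, not the tool's
own procedure. It assumes that high priority CPUs go into CLOS 0 with "min" set
to the high priority base frequency and that low priority CPUs go into CLOS 3
with "min" set to the low priority base frequency. The names ``run`` and
``sst_bf_manual_enable`` and the ``DRY_RUN`` variable are hypothetical. By
default the script only prints the commands.

```shell
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [[ ${DRY_RUN:-1} == 0 ]]; then
        "$@"
    fi
}

# Arguments: high priority MHz, low priority MHz, high priority CPU list,
# low priority CPU list (lists are comma-separated, no ranges).
sst_bf_manual_enable() {
    local hp_mhz=$1 lp_mhz=$2 cpu hp_cpus=() lp_cpus=()
    IFS=',' read -r -a hp_cpus <<< "$3"
    IFS=',' read -r -a lp_cpus <<< "${4:-}"

    # Enable CLOS and set the per-CLOS minimum (assumed CLOS 0 / CLOS 3 split).
    run intel-speed-select core-power enable
    run intel-speed-select core-power config -c 0 --min "$hp_mhz"
    run intel-speed-select core-power config -c 3 --min "$lp_mhz"

    # Subscribe CPUs to their CLOS groups, one CPU per command.
    for cpu in "${hp_cpus[@]}"; do
        run intel-speed-select -c "$cpu" core-power assoc -c 0
    done
    for cpu in "${lp_cpus[@]}"; do
        run intel-speed-select -c "$cpu" core-power assoc -c 3
    done

    run intel-speed-select base-freq enable
}

# Dry run with values from the "base-freq info" output above.
sst_bf_manual_enable 3000 2400 5,6 3,4
```
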
With this configuration, the same workload is executed again, this time pinned
to high priority CPUs (CPU 5 and 6 in this case)::
    690
    691 #taskset -c 5,6 perf bench -r 100 sched pipe
    692 # Running 'sched/pipe' benchmark:
    693 # Executed 1000000 pipe operations between two processes
    694     Total time: 5.627 [sec]
    695       5.627922 usecs/op
    696         177685 ops/sec
    697
    698This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
    699improved (latency reduced) by 7.79%. From the turbostat output, it can be
    700observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
    701The turbostat output::
    702
    703 #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
    704 Package	Core	CPU	Bzy_MHz
    705 0		0	0	2151
    706 0		1	1	2166
    707 0		2	2	2175
    708 0		3	3	2175
    709 0		4	4	2175
    710 0		5	5	3000
    711 0		6	6	3000
    712 0		7	7	2180
    713 0		8	8	2662
    714 0		9	9	2176
    715 0		10	10	2175
    716 0		11	11	2176
    717 0		12	12	2176
    718 0		13	13	2661
    719
    720Disable Intel(R) SST-BF
    721~~~~~~~~~~~~~~~~~~~~~~~
    722
    723To disable the Intel(R) SST-BF feature, execute::
    724
 # intel-speed-select base-freq disable -a
    726
    727
    728Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
    729--------------------------------------------------------------------
    730
    731This feature enables the ability to set different "All core turbo ratio limits"
    732to cores based on the priority. By using this feature, some cores can be
    733configured to get higher turbo frequency by designating them as high priority at
    734the cost of lower or no turbo frequency on the low priority cores.
    735
For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance
on some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
depends on the Intel(R) Speed Select Technology - Performance Profile (Intel(R)
SST-PP) performance level configuration. It is possible that only a certain
performance level supports Intel(R) SST-TF. It is also possible that only the base
performance level (level = 0) has the support of Intel(R) SST-TF. Hence, first
select the desired performance level to enable this feature.
    746
    747In the system under test here, Intel(R) SST-TF is supported at the base
    748performance level 0, but currently disabled::
    749
    750 # intel-speed-select -c 0 perf-profile info -l 0
    751 Intel(R) Speed Select Technology
    752 package-0
    753  die-0
    754    cpu-0
    755      perf-profile-level-0
    756        ...
    757        ...
    758        speed-select-turbo-freq:disabled
    759        ...
    760        ...
    761
    762
    763To check if performance can be improved using Intel(R) SST-TF feature, get the turbo
    764frequency properties with Intel(R) SST-TF enabled and compare to the base turbo
    765capability of this system.
    766
    767Get Base turbo capability
    768~~~~~~~~~~~~~~~~~~~~~~~~~
    769
    770To get the base turbo capability of performance level 0, execute::
    771
 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100

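To make the bucket table easier to read, the sketch below looks up the turbo
limit for a given number of busy cores. It assumes that each bucket gives the
maximum turbo frequency while up to ``core-count`` cores are active. The
function names and the abridged ``SAMPLE`` text are hypothetical and only
illustrate the lookup:

```python
import re


def parse_buckets(text):
    """Return a list of (core_count, max_mhz) pairs in bucket order."""
    counts = re.findall(r"core-count:(\d+)", text)
    freqs = re.findall(r"max-turbo-frequency\(MHz\):(\d+)", text)
    return [(int(c), int(f)) for c, f in zip(counts, freqs)]


def turbo_limit(buckets, active_cores):
    """Frequency of the first bucket that covers active_cores, else None."""
    for core_count, max_mhz in sorted(buckets):
        if active_cores <= core_count:
            return max_mhz
    return None


# Abridged copy of the output above: buckets 0, 1 and 7 only.
SAMPLE = """\
turbo-ratio-limits-sse
  bucket-0
    core-count:2
    max-turbo-frequency(MHz):3200
  bucket-1
    core-count:4
    max-turbo-frequency(MHz):3100
  bucket-7
    core-count:16
    max-turbo-frequency(MHz):3100
"""

buckets = parse_buckets(SAMPLE)
for busy in (2, 14):
    print(busy, "busy cores ->", turbo_limit(buckets, busy), "MHz")
```

With 14 busy cores, as in the workload used in this section, the lookup lands
in a 3100 MHz bucket.
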
Based on the data above, when all the CPUs are busy, a maximum frequency of 3100
MHz can be achieved. Suppose there is a busy workload (e.g. stress) on CPU 0 - 11,
and the "hackbench pipe" workload is executed on CPU 12 and CPU 13::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.705 [sec]
       5.705488 usecs/op
         175269 ops/sec

The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package	Core	CPU	Bzy_MHz
 0		0	0	3000
 0		1	1	3000
 0		2	2	3000
 0		3	3	3000
 0		4	4	3000
 0		5	5	3100
 0		6	6	3100
 0		7	7	3000
 0		8	8	3100
 0		9	9	3000
 0		10	10	3000
 0		11	11	3000
 0		12	12	3100
 0		13	13	3100

Based on the turbostat output, the performance is limited by the frequency cap
of 3100 MHz. To check if the hackbench performance can be improved for CPU 12
and CPU 13, first check the capability of the Intel(R) SST-TF feature for this
performance level.

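The same observation can be made programmatically. The following is a minimal
sketch, assuming the turbostat rows have been captured as text with the four
columns shown above. The function name and the abridged ``SAMPLE`` rows are
hypothetical:

```python
def busy_mhz_by_cpu(turbostat_text):
    """Map CPU number to Bzy_MHz from 'Package Core CPU Bzy_MHz' rows."""
    result = {}
    for line in turbostat_text.splitlines():
        fields = line.split()
        # Data rows have exactly four numeric fields; the header is skipped.
        if len(fields) == 4 and all(f.isdigit() for f in fields):
            _package, _core, cpu, mhz = map(int, fields)
            result[cpu] = mhz
    return result


# Abridged copy of the turbostat output above: CPUs 11, 12 and 13 only.
SAMPLE = """\
Package Core CPU Bzy_MHz
0  11 11 3000
0  12 12 3100
0  13 13 3100
"""

CAP_MHZ = 3100  # all-core turbo limit from the bucket table
mhz = busy_mhz_by_cpu(SAMPLE)
at_cap = sorted(cpu for cpu, freq in mhz.items() if freq >= CAP_MHZ)
print("CPUs running at the cap:", at_cap)
```
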
Get Intel(R) SST-TF Capability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capability, the "turbo-freq info" command can be used::

 # intel-speed-select turbo-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-turbo-freq
          bucket-0
            high-priority-cores-count:2
            high-priority-max-frequency(MHz):3200
            high-priority-max-avx2-frequency(MHz):3200
            high-priority-max-avx512-frequency(MHz):3100
          bucket-1
            high-priority-cores-count:4
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          bucket-2
            high-priority-cores-count:6
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          speed-select-turbo-freq-clip-frequencies
            low-priority-max-frequency(MHz):2600
            low-priority-max-avx2-frequency(MHz):2400
            low-priority-max-avx512-frequency(MHz):2100

Based on the output above, there is an Intel(R) SST-TF bucket for which there are
two high priority cores. If only two high priority cores are set, then the
maximum turbo frequency on those cores can be increased to 3200 MHz. This is
100 MHz more than the base turbo capability for all cores.

In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low priority
cores will be clipped to a lower frequency of 2600 MHz.

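The trade-off described above can be worked out from the two outputs. The sketch
below is illustrative only: the ``field`` helper and the abridged ``SAMPLE``
text are hypothetical, and the 3100 MHz all-core limit is taken from the base
turbo capability shown earlier:

```python
import re


def field(text, name):
    """Return the first integer value of 'name:<int>' found in text."""
    match = re.search(re.escape(name) + r":(\d+)", text)
    return int(match.group(1)) if match else None


# Abridged copy of the "turbo-freq info" output above (bucket-0 and clip).
SAMPLE = """\
bucket-0
  high-priority-cores-count:2
  high-priority-max-frequency(MHz):3200
speed-select-turbo-freq-clip-frequencies
  low-priority-max-frequency(MHz):2600
"""

ALL_CORE_TURBO_MHZ = 3100  # base turbo limit with all cores busy

hp_cores = field(SAMPLE, "high-priority-cores-count")
hp_gain = field(SAMPLE, "high-priority-max-frequency(MHz)") - ALL_CORE_TURBO_MHZ
lp_loss = ALL_CORE_TURBO_MHZ - field(SAMPLE, "low-priority-max-frequency(MHz)")

print(f"{hp_cores} high priority cores gain up to {hp_gain} MHz")
print(f"low priority cores lose up to {lp_loss} MHz")
```

The gain applies only to the high priority cores, while every other core is
subject to the clip, so the feature pays off only when the important work fits
on the high priority cores.
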
Enable Intel(R) SST-TF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-TF, execute::

 # intel-speed-select -c 12,13 turbo-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
 package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
 package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success

In this case, the option "-a" is optional. If set, it enables the Intel(R)
SST-TF feature and also sets the CPUs to high and low priority using
Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP) features. The
CPU numbers passed with the "-c" argument are marked as high priority,
including their siblings.

If the "-a" option is not used, then the following steps are required before
enabling Intel(R) SST-TF:

- Discover Intel(R) SST-TF and note the buckets of high priority cores and their
  maximum frequency

- Enable CLOS using the core-power feature set

- Configure CLOS parameters

- Subscribe desired CPUs to CLOS groups, making sure that high priority cores
  are set to the maximum frequency

If the same hackbench workload is executed, schedule hackbench threads on high
priority CPUs::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.510 [sec]
       5.510165 usecs/op
         180826 ops/sec

This improved performance by around 3.3% on a busy system. Here the turbostat
output will show that CPU 12 and CPU 13 are getting a 100 MHz boost.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package	Core	CPU	Bzy_MHz
 ...
 0		12	12	3200
 0		13	13	3200
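
As a cross-check of the improvement quoted above, the relative changes can be
computed from the two benchmark runs and from the frequency boost. This is a
small worked example that uses only the numbers printed in this section; the
helper name ``percent_change`` is hypothetical:

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# ops/sec and usecs/op from the two "perf bench sched pipe" runs above.
throughput_gain = percent_change(175269, 180826)
latency_change = percent_change(5.705488, 5.510165)

# Bzy_MHz on CPU 12 and CPU 13 before and after enabling Intel(R) SST-TF.
frequency_gain = percent_change(3100, 3200)

print(f"throughput: {throughput_gain:+.2f}%")
print(f"latency:    {latency_change:+.2f}%")
print(f"frequency:  {frequency_gain:+.2f}%")
```

The throughput and latency changes are both a little over 3%, which is in line
with the 100 MHz boost on a 3100 MHz limit.
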