cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux
Log | Files | Refs | README | LICENSE | sfeed.txt

bfq-iosched.rst (26861B)


      1==========================
      2BFQ (Budget Fair Queueing)
      3==========================
      4
      5BFQ is a proportional-share I/O scheduler, with some extra
      6low-latency capabilities. In addition to cgroups support (blkio or io
      7controllers), BFQ's main features are:
      8
      9- BFQ guarantees a high system and application responsiveness, and a
     10  low latency for time-sensitive applications, such as audio or video
     11  players;
     12- BFQ distributes bandwidth, and not just time, among processes or
     13  groups (switching back to time distribution when needed to keep
     14  throughput high).
     15
     16In its default configuration, BFQ privileges latency over
     17throughput. So, when needed for achieving a lower latency, BFQ builds
     18schedules that may lead to a lower throughput. If your main or only
     19goal, for a given device, is to achieve the maximum-possible
     20throughput at all times, then do switch off all low-latency heuristics
     21for that device, by setting low_latency to 0. See Section 3 for
     22details on how to configure BFQ for the desired tradeoff between
     23latency and throughput, or on how to maximize throughput.
     24
     25As every I/O scheduler, BFQ adds some overhead to per-I/O-request
     26processing. To give an idea of this overhead, the total,
     27single-lock-protected, per-request processing time of BFQ---i.e., the
     28sum of the execution times of the request insertion, dispatch and
     29completion hooks---is, e.g., 1.9 us on an Intel Core i7-2760QM@2.40GHz
     30(dated CPU for notebooks; time measured with simple code
     31instrumentation, and using the throughput-sync.sh script of the S
     32suite [1], in performance-profiling mode). To put this result into
     33context, the total, single-lock-protected, per-request execution time
     34of the lightest I/O scheduler available in blk-mq, mq-deadline, is 0.7
     35us (mq-deadline is ~800 LOC, against ~10500 LOC for BFQ).
     36
     37Scheduling overhead further limits the maximum IOPS that a CPU can
     38process (already limited by the execution of the rest of the I/O
     39stack). To give an idea of the limits with BFQ, on slow or average
     40CPUs, here are, first, the limits of BFQ for three different CPUs, on,
     41respectively, an average laptop, an old desktop, and a cheap embedded
     42system, in case full hierarchical support is enabled (i.e.,
     43CONFIG_BFQ_GROUP_IOSCHED is set), but CONFIG_BFQ_CGROUP_DEBUG is not
     44set (Section 4-2):
     45- Intel i7-4850HQ: 400 KIOPS
     46- AMD A8-3850: 250 KIOPS
     47- ARM CortexTM-A53 Octa-core: 80 KIOPS
     48
     49If CONFIG_BFQ_CGROUP_DEBUG is set (and of course full hierarchical
     50support is enabled), then the sustainable throughput with BFQ
     51decreases, because all blkio.bfq* statistics are created and updated
     52(Section 4-2). For BFQ, this leads to the following maximum
     53sustainable throughputs, on the same systems as above:
     54- Intel i7-4850HQ: 310 KIOPS
     55- AMD A8-3850: 200 KIOPS
     56- ARM CortexTM-A53 Octa-core: 56 KIOPS
     57
     58BFQ works for multi-queue devices too.
     59
     60.. The table of contents follow. Impatients can just jump to Section 3.
     61
     62.. CONTENTS
     63
     64   1. When may BFQ be useful?
     65    1-1 Personal systems
     66    1-2 Server systems
     67   2. How does BFQ work?
     68   3. What are BFQ's tunables and how to properly configure BFQ?
     69   4. BFQ group scheduling
     70    4-1 Service guarantees provided
     71    4-2 Interface
     72
     731. When may BFQ be useful?
     74==========================
     75
     76BFQ provides the following benefits on personal and server systems.
     77
     781-1 Personal systems
     79--------------------
     80
     81Low latency for interactive applications
     82^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     83
     84Regardless of the actual background workload, BFQ guarantees that, for
     85interactive tasks, the storage device is virtually as responsive as if
     86it was idle. For example, even if one or more of the following
     87background workloads are being executed:
     88
     89- one or more large files are being read, written or copied,
     90- a tree of source files is being compiled,
     91- one or more virtual machines are performing I/O,
     92- a software update is in progress,
     93- indexing daemons are scanning filesystems and updating their
     94  databases,
     95
     96starting an application or loading a file from within an application
     97takes about the same time as if the storage device was idle. As a
     98comparison, with CFQ, NOOP or DEADLINE, and in the same conditions,
     99applications experience high latencies, or even become unresponsive
    100until the background workload terminates (also on SSDs).
    101
    102Low latency for soft real-time applications
    103^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    104Also soft real-time applications, such as audio and video
    105players/streamers, enjoy a low latency and a low drop rate, regardless
    106of the background I/O workload. As a consequence, these applications
    107do not suffer from almost any glitch due to the background workload.
    108
    109Higher speed for code-development tasks
    110^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    111
    112If some additional workload happens to be executed in parallel, then
    113BFQ executes the I/O-related components of typical code-development
    114tasks (compilation, checkout, merge, ...) much more quickly than CFQ,
    115NOOP or DEADLINE.
    116
    117High throughput
    118^^^^^^^^^^^^^^^
    119
    120On hard disks, BFQ achieves up to 30% higher throughput than CFQ, and
    121up to 150% higher throughput than DEADLINE and NOOP, with all the
    122sequential workloads considered in our tests. With random workloads,
    123and with all the workloads on flash-based devices, BFQ achieves,
    124instead, about the same throughput as the other schedulers.
    125
    126Strong fairness, bandwidth and delay guarantees
    127^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    128
    129BFQ distributes the device throughput, and not just the device time,
    130among I/O-bound applications in proportion their weights, with any
    131workload and regardless of the device parameters. From these bandwidth
    132guarantees, it is possible to compute tight per-I/O-request delay
    133guarantees by a simple formula. If not configured for strict service
    134guarantees, BFQ switches to time-based resource sharing (only) for
    135applications that would otherwise cause a throughput loss.
    136
    1371-2 Server systems
    138------------------
    139
    140Most benefits for server systems follow from the same service
    141properties as above. In particular, regardless of whether additional,
    142possibly heavy workloads are being served, BFQ guarantees:
    143
    144* audio and video-streaming with zero or very low jitter and drop
    145  rate;
    146
    147* fast retrieval of WEB pages and embedded objects;
    148
    149* real-time recording of data in live-dumping applications (e.g.,
    150  packet logging);
    151
    152* responsiveness in local and remote access to a server.
    153
    154
    1552. How does BFQ work?
    156=====================
    157
    158BFQ is a proportional-share I/O scheduler, whose general structure,
    159plus a lot of code, are borrowed from CFQ.
    160
    161- Each process doing I/O on a device is associated with a weight and a
    162  `(bfq_)queue`.
    163
    164- BFQ grants exclusive access to the device, for a while, to one queue
    165  (process) at a time, and implements this service model by
    166  associating every queue with a budget, measured in number of
    167  sectors.
    168
    169  - After a queue is granted access to the device, the budget of the
    170    queue is decremented, on each request dispatch, by the size of the
    171    request.
    172
    173  - The in-service queue is expired, i.e., its service is suspended,
    174    only if one of the following events occurs: 1) the queue finishes
    175    its budget, 2) the queue empties, 3) a "budget timeout" fires.
    176
    177    - The budget timeout prevents processes doing random I/O from
    178      holding the device for too long and dramatically reducing
    179      throughput.
    180
    181    - Actually, as in CFQ, a queue associated with a process issuing
    182      sync requests may not be expired immediately when it empties. In
    183      contrast, BFQ may idle the device for a short time interval,
    184      giving the process the chance to go on being served if it issues
    185      a new request in time. Device idling typically boosts the
    186      throughput on rotational devices and on non-queueing flash-based
    187      devices, if processes do synchronous and sequential I/O. In
    188      addition, under BFQ, device idling is also instrumental in
    189      guaranteeing the desired throughput fraction to processes
    190      issuing sync requests (see the description of the slice_idle
    191      tunable in this document, or [1, 2], for more details).
    192
    193      - With respect to idling for service guarantees, if several
    194	processes are competing for the device at the same time, but
    195	all processes and groups have the same weight, then BFQ
    196	guarantees the expected throughput distribution without ever
    197	idling the device. Throughput is thus as high as possible in
    198	this common scenario.
    199
    200     - On flash-based storage with internal queueing of commands
    201       (typically NCQ), device idling happens to be always detrimental
    202       for throughput. So, with these devices, BFQ performs idling
    203       only when strictly needed for service guarantees, i.e., for
    204       guaranteeing low latency or fairness. In these cases, overall
    205       throughput may be sub-optimal. No solution currently exists to
    206       provide both strong service guarantees and optimal throughput
    207       on devices with internal queueing.
    208
    209  - If low-latency mode is enabled (default configuration), BFQ
    210    executes some special heuristics to detect interactive and soft
    211    real-time applications (e.g., video or audio players/streamers),
    212    and to reduce their latency. The most important action taken to
    213    achieve this goal is to give to the queues associated with these
    214    applications more than their fair share of the device
    215    throughput. For brevity, we call just "weight-raising" the whole
    216    sets of actions taken by BFQ to privilege these queues. In
    217    particular, BFQ provides a milder form of weight-raising for
    218    interactive applications, and a stronger form for soft real-time
    219    applications.
    220
    221  - BFQ automatically deactivates idling for queues born in a burst of
    222    queue creations. In fact, these queues are usually associated with
    223    the processes of applications and services that benefit mostly
    224    from a high throughput. Examples are systemd during boot, or git
    225    grep.
    226
    227  - As CFQ, BFQ merges queues performing interleaved I/O, i.e.,
    228    performing random I/O that becomes mostly sequential if
    229    merged. Differently from CFQ, BFQ achieves this goal with a more
    230    reactive mechanism, called Early Queue Merge (EQM). EQM is so
    231    responsive in detecting interleaved I/O (cooperating processes),
    232    that it enables BFQ to achieve a high throughput, by queue
    233    merging, even for queues for which CFQ needs a different
    234    mechanism, preemption, to get a high throughput. As such EQM is a
    235    unified mechanism to achieve a high throughput with interleaved
    236    I/O.
    237
    238  - Queues are scheduled according to a variant of WF2Q+, named
    239    B-WF2Q+, and implemented using an augmented rb-tree to preserve an
    240    O(log N) overall complexity.  See [2] for more details. B-WF2Q+ is
    241    also ready for hierarchical scheduling, details in Section 4.
    242
    243  - B-WF2Q+ guarantees a tight deviation with respect to an ideal,
    244    perfectly fair, and smooth service. In particular, B-WF2Q+
    245    guarantees that each queue receives a fraction of the device
    246    throughput proportional to its weight, even if the throughput
    247    fluctuates, and regardless of: the device parameters, the current
    248    workload and the budgets assigned to the queue.
    249
    250  - The last, budget-independence, property (although probably
    251    counterintuitive in the first place) is definitely beneficial, for
    252    the following reasons:
    253
    254    - First, with any proportional-share scheduler, the maximum
    255      deviation with respect to an ideal service is proportional to
    256      the maximum budget (slice) assigned to queues. As a consequence,
    257      BFQ can keep this deviation tight not only because of the
    258      accurate service of B-WF2Q+, but also because BFQ *does not*
    259      need to assign a larger budget to a queue to let the queue
    260      receive a higher fraction of the device throughput.
    261
    262    - Second, BFQ is free to choose, for every process (queue), the
    263      budget that best fits the needs of the process, or best
    264      leverages the I/O pattern of the process. In particular, BFQ
    265      updates queue budgets with a simple feedback-loop algorithm that
    266      allows a high throughput to be achieved, while still providing
    267      tight latency guarantees to time-sensitive applications. When
    268      the in-service queue expires, this algorithm computes the next
    269      budget of the queue so as to:
    270
    271      - Let large budgets be eventually assigned to the queues
    272	associated with I/O-bound applications performing sequential
    273	I/O: in fact, the longer these applications are served once
    274	got access to the device, the higher the throughput is.
    275
    276      - Let small budgets be eventually assigned to the queues
    277	associated with time-sensitive applications (which typically
    278	perform sporadic and short I/O), because, the smaller the
    279	budget assigned to a queue waiting for service is, the sooner
    280	B-WF2Q+ will serve that queue (Subsec 3.3 in [2]).
    281
    282- If several processes are competing for the device at the same time,
    283  but all processes and groups have the same weight, then BFQ
    284  guarantees the expected throughput distribution without ever idling
    285  the device. It uses preemption instead. Throughput is then much
    286  higher in this common scenario.
    287
    288- ioprio classes are served in strict priority order, i.e.,
    289  lower-priority queues are not served as long as there are
    290  higher-priority queues.  Among queues in the same class, the
    291  bandwidth is distributed in proportion to the weight of each
    292  queue. A very thin extra bandwidth is however guaranteed to
    293  the Idle class, to prevent it from starving.
    294
    295
    2963. What are BFQ's tunables and how to properly configure BFQ?
    297=============================================================
    298
    299Most BFQ tunables affect service guarantees (basically latency and
    300fairness) and throughput. For full details on how to choose the
    301desired tradeoff between service guarantees and throughput, see the
    302parameters slice_idle, strict_guarantees and low_latency. For details
    303on how to maximise throughput, see slice_idle, timeout_sync and
    304max_budget. The other performance-related parameters have been
    305inherited from, and have been preserved mostly for compatibility with
    306CFQ. So far, no performance improvement has been reported after
    307changing the latter parameters in BFQ.
    308
    309In particular, the tunables back_seek-max, back_seek_penalty,
    310fifo_expire_async and fifo_expire_sync below are the same as in
    311CFQ. Their description is just copied from that for CFQ. Some
    312considerations in the description of slice_idle are copied from CFQ
    313too.
    314
    315per-process ioprio and weight
    316-----------------------------
    317
    318Unless the cgroups interface is used (see "4. BFQ group scheduling"),
    319weights can be assigned to processes only indirectly, through I/O
    320priorities, and according to the relation:
    321weight = (IOPRIO_BE_NR - ioprio) * 10.
    322
    323Beware that, if low-latency is set, then BFQ automatically raises the
    324weight of the queues associated with interactive and soft real-time
    325applications. Unset this tunable if you need/want to control weights.
    326
    327slice_idle
    328----------
    329
    330This parameter specifies how long BFQ should idle for next I/O
    331request, when certain sync BFQ queues become empty. By default
    332slice_idle is a non-zero value. Idling has a double purpose: boosting
    333throughput and making sure that the desired throughput distribution is
    334respected (see the description of how BFQ works, and, if needed, the
    335papers referred there).
    336
    337As for throughput, idling can be very helpful on highly seeky media
    338like single spindle SATA/SAS disks where we can cut down on overall
    339number of seeks and see improved throughput.
    340
    341Setting slice_idle to 0 will remove all the idling on queues and one
    342should see an overall improved throughput on faster storage devices
    343like multiple SATA/SAS disks in hardware RAID configuration, as well
    344as flash-based storage with internal command queueing (and
    345parallelism).
    346
    347So depending on storage and workload, it might be useful to set
    348slice_idle=0.  In general for SATA/SAS disks and software RAID of
    349SATA/SAS disks keeping slice_idle enabled should be useful. For any
    350configurations where there are multiple spindles behind single LUN
    351(Host based hardware RAID controller or for storage arrays), or with
    352flash-based fast storage, setting slice_idle=0 might end up in better
    353throughput and acceptable latencies.
    354
    355Idling is however necessary to have service guarantees enforced in
    356case of differentiated weights or differentiated I/O-request lengths.
    357To see why, suppose that a given BFQ queue A must get several I/O
    358requests served for each request served for another queue B. Idling
    359ensures that, if A makes a new I/O request slightly after becoming
    360empty, then no request of B is dispatched in the middle, and thus A
    361does not lose the possibility to get more than one request dispatched
    362before the next request of B is dispatched. Note that idling
    363guarantees the desired differentiated treatment of queues only in
    364terms of I/O-request dispatches. To guarantee that the actual service
    365order then corresponds to the dispatch order, the strict_guarantees
    366tunable must be set too.
    367
    368There is an important flipside for idling: apart from the above cases
    369where it is beneficial also for throughput, idling can severely impact
    370throughput. One important case is random workload. Because of this
    371issue, BFQ tends to avoid idling as much as possible, when it is not
    372beneficial also for throughput (as detailed in Section 2). As a
    373consequence of this behavior, and of further issues described for the
    374strict_guarantees tunable, short-term service guarantees may be
    375occasionally violated. And, in some cases, these guarantees may be
    376more important than guaranteeing maximum throughput. For example, in
    377video playing/streaming, a very low drop rate may be more important
    378than maximum throughput. In these cases, consider setting the
    379strict_guarantees parameter.
    380
    381slice_idle_us
    382-------------
    383
    384Controls the same tuning parameter as slice_idle, but in microseconds.
    385Either tunable can be used to set idling behavior.  Afterwards, the
    386other tunable will reflect the newly set value in sysfs.
    387
    388strict_guarantees
    389-----------------
    390
    391If this parameter is set (default: unset), then BFQ
    392
    393- always performs idling when the in-service queue becomes empty;
    394
    395- forces the device to serve one I/O request at a time, by dispatching a
    396  new request only if there is no outstanding request.
    397
    398In the presence of differentiated weights or I/O-request sizes, both
    399the above conditions are needed to guarantee that every BFQ queue
    400receives its allotted share of the bandwidth. The first condition is
    401needed for the reasons explained in the description of the slice_idle
    402tunable.  The second condition is needed because all modern storage
    403devices reorder internally-queued requests, which may trivially break
    404the service guarantees enforced by the I/O scheduler.
    405
    406Setting strict_guarantees may evidently affect throughput.
    407
    408back_seek_max
    409-------------
    410
    411This specifies, given in Kbytes, the maximum "distance" for backward seeking.
    412The distance is the amount of space from the current head location to the
    413sectors that are backward in terms of distance.
    414
    415This parameter allows the scheduler to anticipate requests in the "backward"
    416direction and consider them as being the "next" if they are within this
    417distance from the current head location.
    418
    419back_seek_penalty
    420-----------------
    421
    422This parameter is used to compute the cost of backward seeking. If the
    423backward distance of request is just 1/back_seek_penalty from a "front"
    424request, then the seeking cost of two requests is considered equivalent.
    425
    426So scheduler will not bias toward one or the other request (otherwise scheduler
    427will bias toward front request). Default value of back_seek_penalty is 2.
    428
    429fifo_expire_async
    430-----------------
    431
    432This parameter is used to set the timeout of asynchronous requests. Default
    433value of this is 250ms.
    434
    435fifo_expire_sync
    436----------------
    437
    438This parameter is used to set the timeout of synchronous requests. Default
    439value of this is 125ms. In case to favor synchronous requests over asynchronous
    440one, this value should be decreased relative to fifo_expire_async.
    441
    442low_latency
    443-----------
    444
    445This parameter is used to enable/disable BFQ's low latency mode. By
    446default, low latency mode is enabled. If enabled, interactive and soft
    447real-time applications are privileged and experience a lower latency,
    448as explained in more detail in the description of how BFQ works.
    449
    450DISABLE this mode if you need full control on bandwidth
    451distribution. In fact, if it is enabled, then BFQ automatically
    452increases the bandwidth share of privileged applications, as the main
    453means to guarantee a lower latency to them.
    454
    455In addition, as already highlighted at the beginning of this document,
    456DISABLE this mode if your only goal is to achieve a high throughput.
    457In fact, privileging the I/O of some application over the rest may
    458entail a lower throughput. To achieve the highest-possible throughput
    459on a non-rotational device, setting slice_idle to 0 may be needed too
    460(at the cost of giving up any strong guarantee on fairness and low
    461latency).
    462
    463timeout_sync
    464------------
    465
    466Maximum amount of device time that can be given to a task (queue) once
    467it has been selected for service. On devices with costly seeks,
    468increasing this time usually increases maximum throughput. On the
    469opposite end, increasing this time coarsens the granularity of the
    470short-term bandwidth and latency guarantees, especially if the
    471following parameter is set to zero.
    472
    473max_budget
    474----------
    475
    476Maximum amount of service, measured in sectors, that can be provided
    477to a BFQ queue once it is set in service (of course within the limits
    478of the above timeout). According to what said in the description of
    479the algorithm, larger values increase the throughput in proportion to
    480the percentage of sequential I/O requests issued. The price of larger
    481values is that they coarsen the granularity of short-term bandwidth
    482and latency guarantees.
    483
    484The default value is 0, which enables auto-tuning: BFQ sets max_budget
    485to the maximum number of sectors that can be served during
    486timeout_sync, according to the estimated peak rate.
    487
    488For specific devices, some users have occasionally reported to have
    489reached a higher throughput by setting max_budget explicitly, i.e., by
    490setting max_budget to a higher value than 0. In particular, they have
    491set max_budget to higher values than those to which BFQ would have set
    492it with auto-tuning. An alternative way to achieve this goal is to
    493just increase the value of timeout_sync, leaving max_budget equal to 0.
    494
    4954. Group scheduling with BFQ
    496============================
    497
    498BFQ supports both cgroups-v1 and cgroups-v2 io controllers, namely
    499blkio and io. In particular, BFQ supports weight-based proportional
    500share. To activate cgroups support, set BFQ_GROUP_IOSCHED.
    501
    5024-1 Service guarantees provided
    503-------------------------------
    504
    505With BFQ, proportional share means true proportional share of the
    506device bandwidth, according to group weights. For example, a group
    507with weight 200 gets twice the bandwidth, and not just twice the time,
    508of a group with weight 100.
    509
    510BFQ supports hierarchies (group trees) of any depth. Bandwidth is
    511distributed among groups and processes in the expected way: for each
    512group, the children of the group share the whole bandwidth of the
    513group in proportion to their weights. In particular, this implies
    514that, for each leaf group, every process of the group receives the
    515same share of the whole group bandwidth, unless the ioprio of the
    516process is modified.
    517
    518The resource-sharing guarantee for a group may partially or totally
    519switch from bandwidth to time, if providing bandwidth guarantees to
    520the group lowers the throughput too much. This switch occurs on a
    521per-process basis: if a process of a leaf group causes throughput loss
    522if served in such a way to receive its share of the bandwidth, then
    523BFQ switches back to just time-based proportional share for that
    524process.
    525
    5264-2 Interface
    527-------------
    528
    529To get proportional sharing of bandwidth with BFQ for a given device,
    530BFQ must of course be the active scheduler for that device.
    531
    532Within each group directory, the names of the files associated with
    533BFQ-specific cgroup parameters and stats begin with the "bfq."
    534prefix. So, with cgroups-v1 or cgroups-v2, the full prefix for
    535BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group
    536parameter to set the weight of a group with BFQ is blkio.bfq.weight
    537or io.bfq.weight.
    538
    539As for cgroups-v1 (blkio controller), the exact set of stat files
    540created, and kept up-to-date by bfq, depends on whether
    541CONFIG_BFQ_CGROUP_DEBUG is set. If it is set, then bfq creates all
    542the stat files documented in
    543Documentation/admin-guide/cgroup-v1/blkio-controller.rst. If, instead,
    544CONFIG_BFQ_CGROUP_DEBUG is not set, then bfq creates only the files::
    545
    546  blkio.bfq.io_service_bytes
    547  blkio.bfq.io_service_bytes_recursive
    548  blkio.bfq.io_serviced
    549  blkio.bfq.io_serviced_recursive
    550
    551The value of CONFIG_BFQ_CGROUP_DEBUG greatly influences the maximum
    552throughput sustainable with bfq, because updating the blkio.bfq.*
    553stats is rather costly, especially for some of the stats enabled by
    554CONFIG_BFQ_CGROUP_DEBUG.
    555
    556Parameters
    557----------
    558
    559For each group, the following parameters can be set:
    560
    561  weight
    562        This specifies the default weight for the cgroup inside its parent.
    563        Available values: 1..1000 (default: 100).
    564
    565        For cgroup v1, it is set by writing the value to `blkio.bfq.weight`.
    566
    567        For cgroup v2, it is set by writing the value to `io.bfq.weight`.
    568        (with an optional prefix of `default` and a space).
    569
    570        The linear mapping between ioprio and weights, described at the beginning
    571        of the tunable section, is still valid, but all weights higher than
    572        IOPRIO_BE_NR*10 are mapped to ioprio 0.
    573
    574        Recall that, if low-latency is set, then BFQ automatically raises the
    575        weight of the queues associated with interactive and soft real-time
    576        applications. Unset this tunable if you need/want to control weights.
    577
    578  weight_device
    579        This specifies a per-device weight for the cgroup. The syntax is
    580        `minor:major weight`. A weight of `0` may be used to reset to the default
    581        weight.
    582
    583        For cgroup v1, it is set by writing the value to `blkio.bfq.weight_device`.
    584
    585        For cgroup v2, the file name is `io.bfq.weight`.
    586
    587
    588[1]
    589    P. Valente, A. Avanzini, "Evolution of the BFQ Storage I/O
    590    Scheduler", Proceedings of the First Workshop on Mobile System
    591    Technologies (MST-2015), May 2015.
    592
    593    http://algogroup.unimore.it/people/paolo/disk_sched/mst-2015.pdf
    594
    595[2]
    596    P. Valente and M. Andreolini, "Improving Application
    597    Responsiveness with the BFQ Disk I/O Scheduler", Proceedings of
    598    the 5th Annual International Systems and Storage Conference
    599    (SYSTOR '12), June 2012.
    600
    601    Slightly extended version:
    602
    603    http://algogroup.unimore.it/people/paolo/disk_sched/bfq-v1-suite-results.pdf
    604
    605[3]
    606   https://github.com/Algodev-github/S