===================
Block IO Controller
===================

Overview
========
The cgroup subsystem "blkio" implements the block IO controller. There is a
need for various kinds of IO control policies (like proportional BW, max BW)
both at leaf nodes and at intermediate nodes in a storage hierarchy. The plan
is to use the same cgroup-based management interface for the blkio controller
and to switch IO policies in the background based on user options.

One IO control policy is the throttling policy, which can be used to
specify upper IO rate limits on devices. This policy is implemented in the
generic block layer and can be used on leaf nodes as well as on higher
level logical devices like device mapper.

HOWTO
=====

Throttling/Upper Limit policy
-----------------------------
Enable Block IO controller::

	CONFIG_BLK_CGROUP=y

Enable throttling in block layer::

	CONFIG_BLK_DEV_THROTTLING=y

Mount blkio controller (see cgroups.txt, Why are cgroups needed?)::

        mount -t cgroup -o blkio none /sys/fs/cgroup/blkio

Specify a bandwidth rate on a particular device for the root group. The format
for the policy is "<major>:<minor>  <bytes_per_second>"::

        echo "8:16  1048576" > /sys/fs/cgroup/blkio/blkio.throttle.read_bps_device

This puts a limit of 1MB/second on reads issued by the root group
on the device with major/minor number 8:16.
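
The major/minor pair does not have to be typed by hand. The following is a
minimal sketch, assuming sysfs is mounted at /sys and exposes the pair in
/sys/class/block/<name>/dev. The helper names blkdev_numbers and throttle_rule
are made up for this example and are not part of any existing tool::

        # Print "<major>:<minor>" for a block device name such as "sdb".
        # sysfs_root defaults to /sys; it is a parameter only so that the
        # function can be tried against a scratch directory.
        blkdev_numbers() {
                name=$1
                sysfs_root=${2:-/sys}
                cat "$sysfs_root/class/block/$name/dev"
        }

        # Print a rule in the "<major>:<minor>  <bytes_per_second>" format.
        # Arguments: device name, bytes per second, optional sysfs root.
        throttle_rule() {
                printf '%s  %s\n' "$(blkdev_numbers "$1" "${3:-/sys}")" "$2"
        }

With these defined, "throttle_rule sdb 1048576" prints a rule whose output can
be redirected into blkio.throttle.read_bps_device in place of the quoted
string used in the echo command.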

Run dd to read a file and check whether the rate is throttled to 1MB/s::

        # dd iflag=direct if=/mnt/common/zerofile of=/dev/null bs=4K count=1024
        1024+0 records in
        1024+0 records out
        4194304 bytes (4.2 MB) copied, 4.0001 s, 1.0 MB/s

Limits for writes can be set using the blkio.throttle.write_bps_device file.
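
The roughly 4 seconds reported by dd follow directly from the limit: 4194304
bytes at 1048576 bytes per second. As a sanity check before running such a
test, the expected duration can be estimated with shell arithmetic. This is
only a sketch, and the function name throttled_seconds is hypothetical::

        # Seconds needed to move total_bytes at limit_bps, rounded up.
        throttled_seconds() {
                total_bytes=$1
                limit_bps=$2
                echo $(( (total_bytes + limit_bps - 1) / limit_bps ))
        }

        # The dd run above: 1024 blocks of 4K against a 1048576 bytes/s limit.
        throttled_seconds $((1024 * 4096)) 1048576

A measured time well below the estimate would suggest that the rule was not
applied to the device or cgroup actually performing the IO.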

Hierarchical Cgroups
====================

Throttling implements hierarchy support; however,
throttling's hierarchy support is enabled iff "sane_behavior" is
enabled from the cgroup side, which currently is a development option and
not publicly available.

Consider a hierarchy like the following::

			root
			/  \
		     test1 test2
			|
		     test3

Throttling with "sane_behavior" will handle this
hierarchy correctly. For throttling, all limits apply
to the whole subtree while all statistics are local to the IOs
directly generated by tasks in that cgroup.

Throttling without "sane_behavior" enabled from the cgroup side will
practically treat all groups as being at the same level, as if the
hierarchy looked like the following::

				pivot
			     /  /   \  \
			root  test1 test2  test3
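
Because limits apply to the whole subtree when hierarchy support is enabled,
IO from test3 in the first diagram is also bounded by any limits set on test1
and root. A rough sketch of that reasoning follows. The helper effective_limit
is hypothetical, and treating 0 as "no limit configured" is a convention of
this example only::

        # Print the smallest non-zero value among the arguments, which are
        # the bytes-per-second limits of a cgroup and each of its ancestors.
        # Prints 0 if no level has a limit.
        effective_limit() {
                min=0
                for limit in "$@"; do
                        if [ "$limit" -gt 0 ] &&
                           { [ "$min" -eq 0 ] || [ "$limit" -lt "$min" ]; }; then
                                min=$limit
                        fi
                done
                echo "$min"
        }

For instance, with 2097152 on test3, 1048576 on test1 and nothing on root,
"effective_limit 2097152 1048576 0" gives the tighter test1 value as the upper
bound for test3. In the flat layout only the group's own limit would count.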

Various user visible config options
===================================

  CONFIG_BLK_CGROUP
	  Block IO controller.

  CONFIG_BFQ_CGROUP_DEBUG
	  Debugging aid. Right now some additional stats files show up in the
	  cgroup if this option is enabled.

  CONFIG_BLK_DEV_THROTTLING
	  Enable block device throttling support in block layer.

Details of cgroup files
=======================

Proportional weight policy files
--------------------------------

  blkio.bfq.weight
	  Specifies per cgroup weight. This is the default weight of the group
	  on all the devices unless overridden by a per device rule
	  (see `blkio.bfq.weight_device` below).

	  Currently allowed range of weights is from 1 to 1000. For more details,
	  see Documentation/block/bfq-iosched.rst.

  blkio.bfq.weight_device
	  Specifies per cgroup per device weights, overriding the default group
	  weight. For more details, see Documentation/block/bfq-iosched.rst.

	  Following is the format::

	    # echo dev_maj:dev_minor weight > blkio.bfq.weight_device

	  Configure weight=300 on /dev/sdb (8:16) in this cgroup::

	    # echo 8:16 300 > blkio.bfq.weight_device
	    # cat blkio.bfq.weight_device
	    dev     weight
	    8:16    300

	  Configure weight=500 on /dev/sda (8:0) in this cgroup::

	    # echo 8:0 500 > blkio.bfq.weight_device
	    # cat blkio.bfq.weight_device
	    dev     weight
	    8:0     500
	    8:16    300

	  Remove specific weight for /dev/sda in this cgroup::

	    # echo 8:0 0 > blkio.bfq.weight_device
	    # cat blkio.bfq.weight_device
	    dev     weight
	    8:16    300
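
	  A rule can be checked before it is written. This is only a sketch:
	  bfq_weight_rule is a hypothetical helper, and it assumes that the
	  per device file accepts the same 1 to 1000 range as
	  blkio.bfq.weight, plus 0 for removing a rule::

	    # Print a "<major>:<minor> <weight>" rule, or fail if the weight
	    # is not an integer between 0 and 1000.
	    bfq_weight_rule() {
	        dev=$1
	        weight=$2
	        case $weight in
	            ''|*[!0-9]*)
	                echo "weight must be a non-negative integer" >&2
	                return 1 ;;
	        esac
	        if [ "$weight" -gt 1000 ]; then
	            echo "weight must be 1 to 1000, or 0 to remove" >&2
	            return 1
	        fi
	        echo "$dev $weight"
	    }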

  blkio.time
	  Disk time allocated to cgroup per device in milliseconds. First
	  two fields specify the major and minor number of the device and
	  third field specifies the disk time allocated to group in
	  milliseconds.

  blkio.sectors
	  Number of sectors transferred to/from disk by the group. First
	  two fields specify the major and minor number of the device and
	  third field specifies the number of sectors transferred by the
	  group to/from the device.

  blkio.io_service_bytes
	  Number of bytes transferred to/from the disk by the group. These
	  are further divided by the type of operation - read or write, sync
	  or async. First two fields specify the major and minor number of the
	  device, third field specifies the operation type and the fourth field
	  specifies the number of bytes.

  blkio.io_serviced
	  Number of IOs (bio) issued to the disk by the group. These
	  are further divided by the type of operation - read or write, sync
	  or async. First two fields specify the major and minor number of the
	  device, third field specifies the operation type and the fourth field
	  specifies the number of IOs.

  blkio.io_service_time
	  Total amount of time between request dispatch and request completion
	  for the IOs done by this cgroup. This is in nanoseconds to make it
	  meaningful for flash devices too. For devices with queue depth of 1,
	  this time represents the actual service time. When queue_depth > 1,
	  that is no longer true as requests may be served out of order. This
	  may cause the service time for a given IO to include the service time
	  of multiple IOs when served out of order which may result in total
	  io_service_time > actual time elapsed. This time is further divided by
	  the type of operation - read or write, sync or async. First two fields
	  specify the major and minor number of the device, third field
	  specifies the operation type and the fourth field specifies the
	  io_service_time in ns.

  blkio.io_wait_time
	  Total amount of time the IOs for this cgroup spent waiting in the
	  scheduler queues for service. This can be greater than the total time
	  elapsed since it is cumulative io_wait_time for all IOs. It is not a
	  measure of total time the cgroup spent waiting but rather a measure of
	  the wait_time for its individual IOs. For devices with queue_depth > 1,
	  this metric does not include the time spent waiting for service after
	  the IO is dispatched to the device but before it actually gets serviced
	  (there might be a time lag here due to re-ordering of requests by the
	  device). This is in nanoseconds to make it meaningful for flash
	  devices too. This time is further divided by the type of operation -
	  read or write, sync or async. First two fields specify the major and
	  minor number of the device, third field specifies the operation type
	  and the fourth field specifies the io_wait_time in ns.

  blkio.io_merged
	  Total number of bios/requests merged into requests belonging to this
	  cgroup. This is further divided by the type of operation - read or
	  write, sync or async.

  blkio.io_queued
	  Total number of requests queued up at any given instant for this
	  cgroup. This is further divided by the type of operation - read or
	  write, sync or async.

  blkio.avg_queue_size
	  Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
	  The average queue size for this cgroup over the entire time of this
	  cgroup's existence. Queue size samples are taken each time one of the
	  queues of this cgroup gets a timeslice.

  blkio.group_wait_time
	  Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
	  This is the amount of time the cgroup had to wait since it became busy
	  (i.e., went from 0 to 1 request queued) to get a timeslice for one of
	  its queues. This is different from the io_wait_time which is the
	  cumulative total of the amount of time spent by each IO in that cgroup
	  waiting in the scheduler queue. This is in nanoseconds. If this is
	  read when the cgroup is in a waiting (for timeslice) state, the stat
	  will only report the group_wait_time accumulated till the last time it
	  got a timeslice and will not include the current delta.

  blkio.empty_time
	  Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
	  This is the amount of time a cgroup spends without any pending
	  requests when not being served, i.e., it does not include any time
	  spent idling for one of the queues of the cgroup. This is in
	  nanoseconds. If this is read when the cgroup is in an empty state,
	  the stat will only report the empty_time accumulated till the last
	  time it had a pending request and will not include the current delta.

  blkio.idle_time
	  Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y.
	  This is the amount of time spent by the IO scheduler idling for a
	  given cgroup in anticipation of a better request than the existing ones
	  from other queues/cgroups. This is in nanoseconds. If this is read
	  when the cgroup is in an idling state, the stat will only report the
	  idle_time accumulated till the last idle period and will not include
	  the current delta.

  blkio.dequeue
	  Debugging aid only enabled if CONFIG_BFQ_CGROUP_DEBUG=y. This
	  gives statistics about how many times a group was dequeued
	  from the service tree of the device. First two fields specify the major
	  and minor number of the device and third field specifies the number
	  of times a group was dequeued from a particular device.

  blkio.*_recursive
	  Recursive version of various stats. These files show the
	  same information as their non-recursive counterparts but
	  include stats from all the descendant cgroups.
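
The per-operation stat files above lend themselves to simple text processing.
The sketch below pulls one counter out of such a file. It assumes that each
line has the form "<major>:<minor> <operation> <value>", for example
"8:16 Read 4194304", and the helper name blkio_stat is hypothetical::

        # Print the value recorded for one device and operation type.
        # Arguments: stat file, "<major>:<minor>", operation (e.g. Read).
        # Lines that do not match both the device and the operation are
        # ignored, so nothing is printed for an unknown device.
        blkio_stat() {
                file=$1
                dev=$2
                op=$3
                awk -v dev="$dev" -v op="$op" \
                    '$1 == dev && $2 == op { print $3 }' "$file"
        }

For example, "blkio_stat blkio.io_service_bytes 8:16 Read", run from inside a
cgroup directory, would print the number of bytes the group has read from the
device 8:16.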

Throttling/Upper limit policy files
-----------------------------------

  blkio.throttle.read_bps_device
	  Specifies upper limit on READ rate from the device. IO rate is
	  specified in bytes per second. Rules are per device. Following is
	  the format::

	    echo "<major>:<minor>  <rate_bytes_per_second>" > /cgrp/blkio.throttle.read_bps_device

  blkio.throttle.write_bps_device
	  Specifies upper limit on WRITE rate to the device. IO rate is
	  specified in bytes per second. Rules are per device. Following is
	  the format::

	    echo "<major>:<minor>  <rate_bytes_per_second>" > /cgrp/blkio.throttle.write_bps_device

  blkio.throttle.read_iops_device
	  Specifies upper limit on READ rate from the device. IO rate is
	  specified in IO per second. Rules are per device. Following is
	  the format::

	    echo "<major>:<minor>  <rate_io_per_second>" > /cgrp/blkio.throttle.read_iops_device

  blkio.throttle.write_iops_device
	  Specifies upper limit on WRITE rate to the device. IO rate is
	  specified in IO per second. Rules are per device. Following is
	  the format::

	    echo "<major>:<minor>  <rate_io_per_second>" > /cgrp/blkio.throttle.write_iops_device

	  Note: If both BW and IOPS rules are specified for a device, then IO is
	  subjected to both the constraints.

  blkio.throttle.io_serviced
	  Number of IOs (bio) issued to the disk by the group. These
	  are further divided by the type of operation - read or write, sync
	  or async. First two fields specify the major and minor number of the
	  device, third field specifies the operation type and the fourth field
	  specifies the number of IOs.

  blkio.throttle.io_service_bytes
	  Number of bytes transferred to/from the disk by the group. These
	  are further divided by the type of operation - read or write, sync
	  or async. First two fields specify the major and minor number of the
	  device, third field specifies the operation type and the fourth field
	  specifies the number of bytes.
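
When both a BW rule and an IOPS rule are set for a device, the tighter of the
two determines the achievable throughput. The arithmetic can be sketched as
follows, under the simplifying assumption that every IO has the same size.
The helper name combined_bps_ceiling is hypothetical::

        # Print the bytes-per-second ceiling that results from a bps limit,
        # an iops limit and a fixed IO size in bytes.
        combined_bps_ceiling() {
                bps_limit=$1
                iops_limit=$2
                io_size=$3
                via_iops=$((iops_limit * io_size))
                if [ "$via_iops" -lt "$bps_limit" ]; then
                        echo "$via_iops"
                else
                        echo "$bps_limit"
                fi
        }

With a 1048576 bytes/s rule and a 100 IO/s rule, 4096-byte IOs are held to
409600 bytes/s by the IOPS rule, whereas at 1000 IO/s the BW rule becomes the
binding constraint.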

Common files among various policies
-----------------------------------

  blkio.reset_stats
	  Writing an int to this file will result in resetting all the stats
	  for that cgroup.