.. _swap_numa:

===========================================
Automatically bind swap device to NUMA node
===========================================

If the system has more than one swap device and the swap devices have node
information, get_swap_pages() can use this information to decide which swap
device to use, to get better performance.


How to use this feature
=======================

Each swap device has a priority, which decides the order in which the devices
are used. Automatic binding needs no changes to the priority settings of the
swap devices. For example, on a 2 node machine, assume 2 swap devices, swapA
and swapB, are going to be swapped on, with swapA attached to node 0 and swapB
attached to node 1. Simply swap them on::

	# swapon /dev/swapA
	# swapon /dev/swapB

Then node 0 will use the two swap devices in the order swapA then swapB, and
node 1 will use them in the order swapB then swapA. Note that the order in
which they are swapped on doesn't matter.

A more complex example on a 4 node machine: assume 6 swap devices are going to
be swapped on. swapA and swapB are attached to node 0, swapC is attached to
node 1, swapD and swapE are attached to node 2 and swapF is attached to node 3.
They are swapped on the same way as above::

	# swapon /dev/swapA
	# swapon /dev/swapB
	# swapon /dev/swapC
	# swapon /dev/swapD
	# swapon /dev/swapE
	# swapon /dev/swapF

Then node 0 will use them in the order of::

	swapA/swapB -> swapC -> swapD -> swapE -> swapF

swapA and swapB will be used in round robin mode before any other swap device.

Node 1 will use them in the order of::

	swapC -> swapA -> swapB -> swapD -> swapE -> swapF

Node 2 will use them in the order of::

	swapD/swapE -> swapA -> swapB -> swapC -> swapF

Similarly, swapD and swapE will be used in round robin mode before any
other swap devices.

Node 3 will use them in the order of::

	swapF -> swapA -> swapB -> swapC -> swapD -> swapE

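The orderings in both examples follow one rule, which the small Python model
below tries to capture. It is only an illustration, not kernel code:
``node_order`` and the ``devices`` list are hypothetical names, and the model
assumes every device is swapped on without an explicit priority.

.. code-block:: python

    def node_order(devices, node):
        """Return the order in which `node` uses the swap devices.

        `devices` is a list of (name, home_node) pairs in swapon order.
        Each element of the result is a group of devices that share one
        priority on this node and are therefore used round robin.
        """
        # Devices attached to this node are promoted and come first.
        local = [name for name, home in devices if home == node]
        # All other devices follow, one by one, in swapon order.
        remote = [[name] for name, home in devices if home != node]
        return ([local] if local else []) + remote


    devices = [("swapA", 0), ("swapB", 0), ("swapC", 1),
               ("swapD", 2), ("swapE", 2), ("swapF", 3)]
    for node in range(4):
        groups = node_order(devices, node)
        print(node, " -> ".join("/".join(group) for group in groups))

For the 2 node example, ``node_order([("swapB", 1), ("swapA", 0)], 0)`` gives
swapA then swapB even though swapB was swapped on first.
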

Implementation details
======================

The current code uses a priority based list, swap_avail_list, to decide
which swap device to use; if multiple swap devices share the same
priority, they are used round robin. This change replaces the single
global swap_avail_list with a per-NUMA-node list, i.e. each NUMA node
sees its own priority based list of available swap devices. A swap
device's priority can be promoted on its matching node's swap_avail_list.

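As a rough illustration of the per-node lists, the Python sketch below keeps
one sorted list per node and rotates devices that share a key. The class
``NodeAvailList`` and its methods are hypothetical stand-ins for the kernel
plist, and the keys are written out by hand for three auto-prioritised
devices (swapA and swapB on node 0, swapC on node 1). The sketch never marks
a device as full, so a lone head device is picked repeatedly.

.. code-block:: python

    class NodeAvailList:
        """Toy stand-in for one node's swap_avail_list (not the kernel plist)."""

        def __init__(self):
            self.entries = []  # (key, name) pairs, lowest key first

        def add(self, key, name):
            # Insert behind every entry whose key is <= the new key, so
            # entries with equal keys keep their insertion order.
            pos = len(self.entries)
            for i, (other_key, _) in enumerate(self.entries):
                if other_key > key:
                    pos = i
                    break
            self.entries.insert(pos, (key, name))

        def pick(self):
            # Take the head, then requeue it behind its equal-key peers so
            # the next call returns another device of the same priority.
            key, name = self.entries.pop(0)
            self.add(key, name)
            return name


    # One list per node instead of a single global list.
    avail = {0: NodeAvailList(), 1: NodeAvailList()}
    # Keys are negated priorities; 1 is the promoted value on the home node.
    for key, name in [(1, "swapA"), (1, "swapB"), (4, "swapC")]:
        avail[0].add(key, name)
    for key, name in [(2, "swapA"), (3, "swapB"), (1, "swapC")]:
        avail[1].add(key, name)

    picks_node0 = [avail[0].pick() for _ in range(4)]
    picks_node1 = [avail[1].pick() for _ in range(3)]

Here node 0 alternates between swapA and swapB, while node 1 keeps picking
swapC, the only device promoted on its list.
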
A swap device's priority is currently set as follows: the user can set a
value >= 0, or the system will pick one, starting from -1 and going downwards.
The priority value in the swap_avail_list is the negated value of the swap
device's priority, because plist is sorted from low to high. The new policy
doesn't change the semantics for priorities >= 0. The system-picked
priorities, which previously started from -1 and went downwards, now start
from -2, and -1 is reserved as the promoted value. So if multiple swap devices
are attached to the same node, they will all be promoted to priority -1 on
that node's plist and will be used round robin before any other swap devices.
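
The numbering scheme can be written down as a short Python sketch. This is
one reading of the description above, not the kernel source: the names
``PROMOTED_KEY``, ``auto_priority`` and ``plist_key`` are hypothetical, and
the sketch assumes that only system-picked (negative) priorities are
promoted.

.. code-block:: python

    PROMOTED_KEY = 1  # negated form of the reserved priority -1


    def auto_priority(index):
        """Priority of the index-th device swapped on without a priority."""
        # System-picked priorities now start at -2 and go downwards.
        return -2 - index


    def plist_key(prio, home_node, list_node):
        """Key of a device on list_node's list; lower keys are used first."""
        if prio < 0 and home_node == list_node:
            # System-picked priority on the device's own node: promote.
            return PROMOTED_KEY
        # Otherwise the key is simply the negated priority.
        return -prio


    # swapA (node 0) and swapB (node 1) get system-picked priorities.
    prio_a, prio_b = auto_priority(0), auto_priority(1)
    keys_node0 = {"swapA": plist_key(prio_a, 0, 0),
                  "swapB": plist_key(prio_b, 1, 0)}
    keys_node1 = {"swapA": plist_key(prio_a, 0, 1),
                  "swapB": plist_key(prio_b, 1, 1)}

Under this reading, a device swapped on with an explicit priority of 5 gets
the key -5 on every node's list and therefore still sorts ahead of the
promoted devices.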