.. SPDX-License-Identifier: GPL-2.0

=====================
Fake NUMA For CPUSets
=====================

:Author: David Rientjes <rientjes@cs.washington.edu>

Using numa=fake and CPUSets for Resource Management

This document describes how the numa=fake x86_64 command-line option can be used
in conjunction with cpusets for coarse memory management. Using this feature,
you can create fake NUMA nodes that represent contiguous chunks of memory and
assign them to cpusets and their attached tasks. This is a way of limiting the
amount of system memory that is available to a certain class of tasks.

For more information on the features of cpusets, see
Documentation/admin-guide/cgroup-v1/cpusets.rst.
There are a number of different configurations you can use for your needs. For
more information on the numa=fake command line option and its various ways of
configuring fake nodes, see Documentation/x86/x86_64/boot-options.rst.

For the purposes of this introduction, we'll assume a very primitive NUMA
emulation setup of "numa=fake=4*512,". This will split our system memory into
four equal chunks of 512M each that we can now assign to cpusets. As
you become more familiar with using this combination for resource control,
you'll determine a better setup to minimize the number of nodes you have to deal
with.

A machine may be split as follows with "numa=fake=4*512," as reported by dmesg::

  Faking node 0 at 0000000000000000-0000000020000000 (512MB)
  Faking node 1 at 0000000020000000-0000000040000000 (512MB)
  Faking node 2 at 0000000040000000-0000000060000000 (512MB)
  Faking node 3 at 0000000060000000-0000000080000000 (512MB)
  ...
  On node 0 totalpages: 130975
  On node 1 totalpages: 131072
  On node 2 totalpages: 131072
  On node 3 totalpages: 131072

Now, following the instructions for mounting the cpusets filesystem from
Documentation/admin-guide/cgroup-v1/cpusets.rst, you can assign fake nodes (i.e.
contiguous memory address spaces) to individual cpusets::

  [root@xroads /]# mkdir exampleset
  [root@xroads /]# mount -t cpuset none exampleset
  [root@xroads /]# mkdir exampleset/ddset
  [root@xroads /]# cd exampleset/ddset
  [root@xroads /exampleset/ddset]# echo 0-1 > cpus
  [root@xroads /exampleset/ddset]# echo 0-1 > mems

Now this cpuset, 'ddset', will only be allowed access to fake nodes 0 and 1 for
memory allocations (1G).

You can now assign tasks to these cpusets to limit the memory resources
available to them according to the fake nodes assigned as mems::

  [root@xroads /exampleset/ddset]# echo $$ > tasks
  [root@xroads /exampleset/ddset]# dd if=/dev/zero of=tmp bs=1024 count=1G &
  [1] 13425

Notice the difference between the system memory usage as reported by
/proc/meminfo between the restricted cpuset case above and the unrestricted
case (i.e. running the same 'dd' command without assigning it to a fake NUMA
cpuset):

  ======== ============ ==========
  Name     Unrestricted Restricted
  ======== ============ ==========
  MemTotal 3091900 kB   3091900 kB
  MemFree  42113 kB     1513236 kB
  ======== ============ ==========

This allows for coarse memory management for the tasks you assign to particular
cpusets. Since cpusets can form a hierarchy, you can create some pretty
interesting combinations of use-cases for various classes of tasks for your
memory management needs.
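
The node boundaries and page counts in the dmesg excerpt above follow from
simple arithmetic, which can be handy when planning a different split. The
following is a minimal sketch, not kernel code: the helper names
(fake_node_range, pages_per_node, mems_total_mb) are hypothetical, and it
assumes 4 KiB pages and nodes laid out back to back from physical address 0
with no memory holes.

```shell
#!/bin/sh
# Sketch of the layout arithmetic for "numa=fake=4*512,".
# Assumptions (illustrative, not taken from the kernel source): 4 KiB pages,
# nodes placed back to back starting at physical address 0, no memory holes.

PAGE_SIZE=4096   # assumed page size in bytes
NODE_MB=512      # size of each fake node, from numa=fake=4*512,

# fake_node_range N: print "start-end" physical addresses of fake node N.
fake_node_range() {
    node_bytes=$((NODE_MB * 1024 * 1024))
    start=$(($1 * node_bytes))
    end=$((start + node_bytes))
    printf '%016x-%016x\n' "$start" "$end"
}

# pages_per_node: print the number of pages in one full fake node.
pages_per_node() {
    echo $((NODE_MB * 1024 * 1024 / PAGE_SIZE))
}

# mems_total_mb FIRST LAST: total size in MB of a "mems" range such as 0-1.
mems_total_mb() {
    echo $((($2 - $1 + 1) * NODE_MB))
}

for n in 0 1 2 3; do
    echo "node $n: $(fake_node_range "$n")"
done
echo "pages per node: $(pages_per_node)"
echo "mems 0-1 total: $(mems_total_mb 0 1) MB"
```

Under these assumptions the computed ranges match the "Faking node" lines, and
the per-node page count matches nodes 1 through 3. Node 0 reports slightly
fewer pages in the dmesg excerpt (130975), presumably because some low memory
is not counted there; the sketch does not model that. The 1024 MB total for
mems 0-1 corresponds to the 1G limit on 'ddset'.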