===========================
Device Whitelist Controller
===========================

1. Description
==============

Implement a cgroup to track and enforce open and mknod restrictions
on device files.  A device cgroup associates a device access
whitelist with each cgroup.  A whitelist entry has four fields:
type, major, minor, and access.  'type' is a (all), c (char), or
b (block).  'all' means it applies to all types and all major and
minor numbers.  Major and minor are either an integer or * for all.
Access is a composition of r (read), w (write), and m (mknod).

The root device cgroup starts with rwm access to 'all'.  A child
device cgroup gets a copy of the parent's whitelist.  Administrators
can then remove devices from the whitelist or add new entries.  A
child cgroup can never receive a device access which is denied by
its parent.

2. User Interface
=================

An entry is added using devices.allow, and removed using
devices.deny.  For instance::

    echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow

allows cgroup 1 to read and mknod the device usually known as
/dev/null.  Doing::

    echo a > /sys/fs/cgroup/1/devices.deny

will remove the default 'a \*:\* rwm' entry.  Doing::

    echo a > /sys/fs/cgroup/1/devices.allow

will add the 'a \*:\* rwm' entry to the whitelist.

3. Security
===========

Any task can move itself between cgroups.  This clearly won't
suffice, but we can decide the best way to adequately restrict
movement as people get some experience with this.  We may just want
to require CAP_SYS_ADMIN, which at least is a separate bit from
CAP_MKNOD.  We may want to just refuse moving to a cgroup which
isn't a descendant of the current one.  Or we may want to use
CAP_MAC_ADMIN, since we really are trying to lock down root.

CAP_SYS_ADMIN is needed to modify the whitelist or move another
task to a new cgroup.  (Again, we'll probably want to change that.)

A cgroup may not be granted more permissions than the cgroup's
parent has.

4. Hierarchy
============

Device cgroups maintain hierarchy by making sure a cgroup never has more
access permissions than its parent.  Every time an entry is written to
a cgroup's devices.deny file, all its children will have that entry removed
from their whitelist and all the locally set whitelist entries will be
re-evaluated.  If one of the locally set whitelist entries would provide
more access than the cgroup's parent, it will be removed from the whitelist.

Example::

      A
     / \
        B

    group        behavior        exceptions
    A            allow           "b 8:* rwm", "c 116:1 rw"
    B            deny            "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a device is denied in group A::

    # echo "c 116:* r" > A/devices.deny

it will propagate down, and after revalidating B's entries, the whitelist
entry "c 116:2 rwm" will be removed::

    group        whitelist entries                denied devices
    A            all                              "b 8:* rwm", "c 116:* rw"
    B            "c 1:3 rwm", "b 3:* rwm"         all the rest

If the parent's exceptions change and local exceptions are no longer
allowed, they will be deleted.

Notice that new whitelist entries will not be propagated::

      A
     / \
        B

    group        whitelist entries                denied devices
    A            "c 1:3 rwm", "c 1:5 r"           all the rest
    B            "c 1:3 rwm", "c 1:5 r"           all the rest

when adding ``c *:3 rwm``::

    # echo "c *:3 rwm" >A/devices.allow

the result::

    group        whitelist entries                denied devices
    A            "c *:3 rwm", "c 1:5 r"           all the rest
    B            "c 1:3 rwm", "c 1:5 r"           all the rest

but now it will be possible to add new entries to B::

    # echo "c 2:3 rwm" >B/devices.allow
    # echo "c 50:3 r" >B/devices.allow

or even::

    # echo "c *:3 rwm" >B/devices.allow

Allowing or denying all by writing 'a' to devices.allow or devices.deny will
not be possible once the device cgroup has children.
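
The rule that a child may only add entries its parent already allows can be
illustrated with a small Python model.  This is only a sketch and not the
kernel's implementation: the names ``Entry``, ``parse_entry``, ``covers`` and
``allowed_in_child`` are hypothetical, both cgroups are assumed to use a plain
whitelist, and a child entry is assumed to be acceptable only when a single
parent entry covers it completely.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    dev_type: str       # 'a', 'c' or 'b'
    major: str          # decimal number or '*'
    minor: str          # decimal number or '*'
    access: frozenset   # subset of {'r', 'w', 'm'}


def parse_entry(text):
    """Parse an entry such as 'c 1:3 rwm' (hypothetical helper)."""
    fields = text.split()
    if fields[0] == "a":
        # 'a' applies to all types and all major and minor numbers.
        return Entry("a", "*", "*", frozenset("rwm"))
    dev_type, numbers, access = fields
    major, minor = numbers.split(":")
    return Entry(dev_type, major, minor, frozenset(access))


def covers(outer, inner):
    """Return True if `outer` grants at least everything `inner` grants."""
    if outer.dev_type != "a" and outer.dev_type != inner.dev_type:
        return False
    if outer.major != "*" and outer.major != inner.major:
        return False
    if outer.minor != "*" and outer.minor != inner.minor:
        return False
    return inner.access <= outer.access


def allowed_in_child(parent_whitelist, text):
    """Assumption: one parent entry must fully cover the new child entry."""
    entry = parse_entry(text)
    return any(covers(parse_entry(p), entry) for p in parent_whitelist)


# Whitelist of A before and after 'echo "c *:3 rwm" >A/devices.allow'.
a_before = ["c 1:3 rwm", "c 1:5 r"]
a_after = ["c *:3 rwm", "c 1:5 r"]
```

With these lists, ``allowed_in_child(a_before, "c 2:3 rwm")`` is false, while
the same call against ``a_after`` is true.  This matches the example above:
B's own whitelist is unchanged, but B can now accept the new entries.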

4.1 Hierarchy (internal implementation)
---------------------------------------

Device cgroups are implemented internally using a behavior (ALLOW, DENY) and a
list of exceptions.  The internal state is controlled using the same user
interface to preserve compatibility with the previous whitelist-only
implementation.  Removal or addition of exceptions that will reduce the access
to devices will be propagated down the hierarchy.
For every propagated exception, the effective rules will be re-evaluated based
on the parent's current access rules.
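
As a rough illustration of the behavior-plus-exceptions model, the following
Python sketch decides a single access request for one cgroup.  It is a
simplified model under stated assumptions, not the kernel code: ``matches`` and
``may_access`` are hypothetical names, the request names one concrete device
and one access letter, and the parent's rules are not consulted.

```python
def matches(exception, dev_type, major, minor, access):
    """Return True if an exception such as 'c 116:1 rw' applies to the request.

    `access` is assumed to be a single letter: 'r', 'w' or 'm'.
    """
    ex_type, numbers, ex_access = exception.split()
    ex_major, ex_minor = numbers.split(":")
    return (ex_type in ("a", dev_type)
            and ex_major in ("*", str(major))
            and ex_minor in ("*", str(minor))
            and access in ex_access)


def may_access(behavior, exceptions, dev_type, major, minor, access):
    """Apply the default behavior, inverted for requests that hit an exception."""
    hit = any(matches(e, dev_type, major, minor, access) for e in exceptions)
    if behavior == "allow":
        # Default allow: the exceptions list what is denied.
        return not hit
    # Default deny: the exceptions list what is allowed.
    return hit


# Exceptions of groups A (behavior allow) and B (behavior deny) from the
# first example in section 4.
a_exceptions = ["b 8:* rwm", "c 116:1 rw"]
b_exceptions = ["c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"]
```

Under this model, ``may_access("allow", a_exceptions, "c", 116, 1, "w")`` is
false because the request hits a deny exception, while
``may_access("deny", b_exceptions, "b", 3, 7, "w")`` is true because it hits an
allow exception.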