.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: May 2022

The goal of Landlock is to enable restricting ambient rights (e.g. global
filesystem access) for a set of processes.  Because Landlock is a stackable
LSM, it makes it possible to create safe security sandboxes as new security
layers in addition to the existing system-wide access-controls.  This kind of
sandbox is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in user space applications.  Landlock empowers
any process, including unprivileged ones, to securely restrict themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root): ``dmesg | grep
landlock || journalctl -kg landlock``.  Developers can also easily check for
Landlock support with a :ref:`related system call <landlock_abi_versions>`.  If
Landlock is not currently supported, we need to :ref:`configure the kernel
appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object.  An object is currently a
file hierarchy, and the related filesystem actions are defined with `access
rights`_.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.  For this
example, the ruleset will contain rules that only allow read actions, but write
actions will be denied.  The ruleset then needs to handle both of these kinds
of actions.  This is required for backward and forward compatibility (i.e. the
kernel and user space may not know each other's supported restrictions), hence
the need to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER,
    };

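The snippets in this document call ``landlock_create_ruleset()``,
``landlock_add_rule()`` and ``landlock_restrict_self()`` as if they were
ordinary C functions.  The C library may not provide wrappers for them, in
which case a program can define its own on top of :manpage:`syscall(2)`.  The
following is a minimal sketch of such wrappers, assuming installed kernel
headers that define the ``__NR_landlock_*`` system call numbers; it is one
possible way to write them, not a required one.

.. code-block:: c

    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Thin wrappers: each one forwards its arguments to the raw system call. */
    static inline int
    landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
                            const size_t size, const __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }

    static inline int
    landlock_add_rule(const int ruleset_fd,
                      const enum landlock_rule_type rule_type,
                      const void *const rule_attr, const __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }

    static inline int
    landlock_restrict_self(const int ruleset_fd, const __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }

Like any :manpage:`syscall(2)` based wrapper, these return -1 and set ``errno``
on failure.
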
Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach.  Indeed, we
should try to protect users as much as possible whatever kernel they are
using.  To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses.  Let's check if we should
remove the `LANDLOCK_ACCESS_FS_REFER` access right which is only supported
starting with the second version of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 2) {
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
    }

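The check above does not distinguish an error return (Landlock missing or
disabled) from ABI version 1.  A best-effort program may want to tell these
cases apart so that it can skip sandboxing cleanly instead of failing later.
The following sketch shows one way to do this; ``best_effort_handled_fs()`` is
a hypothetical helper name chosen for this illustration, not part of the
Landlock API.

.. code-block:: c

    #include <linux/landlock.h>

    /*
     * Hypothetical helper: returns the subset of @wanted that a kernel
     * reporting Landlock ABI version @abi can handle, or 0 when Landlock is
     * unavailable (@abi < 1), in which case the caller may decide to run
     * without a sandbox.
     */
    static __u64 best_effort_handled_fs(const int abi, __u64 wanted)
    {
        if (abi < 1)
            return 0;
        /* LANDLOCK_ACCESS_FS_REFER is only supported since ABI version 2. */
        if (abi < 2)
            wanted &= ~LANDLOCK_ACCESS_FS_REFER;
        return wanted;
    }

A caller could assign the result to ``ruleset_attr.handled_access_fs`` and skip
the ruleset creation altogether when the result is 0.
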
This enables the creation of an inclusive ruleset that will contain our rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset.  The rule will only allow reading the
file hierarchy ``/usr``.  Without another rule, write actions would then be
denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

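When several hierarchies need rules, the open/add/close sequence above can be
factored into a function.  The following is a minimal sketch under that
assumption; ``allow_path_beneath()`` is a hypothetical helper name, and the
system call is issued through :manpage:`syscall(2)` in case the C library
provides no wrapper.

.. code-block:: c

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: allows @access beneath the directory @path in the
     * ruleset referred to by @ruleset_fd.  Returns 0 on success, or -1 with
     * errno set by open(2) or by the landlock_add_rule system call.
     */
    static int allow_path_beneath(const int ruleset_fd, const char *const path,
                                  const __u64 access)
    {
        struct landlock_path_beneath_attr path_beneath = {
            .allowed_access = access,
        };
        int err, saved_errno;

        path_beneath.parent_fd = open(path, O_PATH | O_CLOEXEC);
        if (path_beneath.parent_fd < 0)
            return -1;
        err = syscall(__NR_landlock_add_rule, ruleset_fd,
                      LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0);
        /* Close the descriptor without losing the system call's errno. */
        saved_errno = errno;
        close(path_beneath.parent_fd);
        errno = saved_errno;
        return err ? -1 : 0;
    }

With this helper, the ``/usr`` rule above would be added by passing
``ruleset_fd``, ``"/usr"`` and the same three access rights.  The caller keeps
ownership of ``ruleset_fd`` and remains responsible for closing it.
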
It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version.  In this example, this is not required because
`LANDLOCK_ACCESS_FS_REFER` is not allowed by any rule.

We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem.  The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the `landlock_restrict_self` system call succeeds, the current thread is now
restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is the merge of their parent domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible.  For instance, it is better to be able to have ``~/doc/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that don't
depend on their location (i.e. parent directories).  This is particularly
relevant when we want to allow linking or renaming.  Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see `LANDLOCK_ACCESS_FS_REFER`
documentation).  Having self-sufficient hierarchies also helps to tighten the
required access rights to the minimal set of data.  This also helps avoid
sinkhole directories, i.e. directories where data can be linked to but not
linked from.  However, this depends on data organization, which might not be
controlled by developers.  In this case, granting read-write access to
``~/tmp/``, instead of write-only access, would potentially allow moving
``~/tmp/`` to a non-readable directory while still keeping the ability to list
the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  Indeed, this complementary policy is stacked with
any other rulesets already restricting this thread.  A sandboxed thread can
then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

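The two statements above can be illustrated with a small user space model.
This is only a sketch for reasoning about stacked policies.  It assumes that
paths are compared as plain strings, that each requested access right may be
granted by a different rule found on the path, and that a layer does not
restrict access rights it does not handle.  The kernel works on the actual
file hierarchy rather than on path strings, and every type and function name
below is invented for this illustration.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Illustrative types only; they are not part of the Landlock ABI. */
    struct model_rule {
        const char *hierarchy;      /* e.g. "/usr" */
        unsigned long long allowed; /* access bits granted beneath hierarchy */
    };

    struct model_layer {
        unsigned long long handled; /* access bits this layer restricts */
        const struct model_rule *rules;
        size_t num_rules;
    };

    /* True if @path is @hierarchy itself or is located beneath it. */
    static bool is_beneath(const char *path, const char *hierarchy)
    {
        const size_t len = strlen(hierarchy);

        if (strncmp(path, hierarchy, len) != 0)
            return false;
        return path[len] == '\0' || path[len] == '/' ||
               (len > 0 && hierarchy[len - 1] == '/');
    }

    /*
     * Model assumption: a layer grants a request if every handled bit is
     * granted by at least one rule encountered on the path.
     */
    static bool layer_grants(const struct model_layer *layer, const char *path,
                             unsigned long long request)
    {
        unsigned long long missing = request & layer->handled;
        size_t i;

        for (i = 0; i < layer->num_rules && missing; i++) {
            if (is_beneath(path, layer->rules[i].hierarchy))
                missing &= ~layer->rules[i].allowed;
        }
        return missing == 0;
    }

    /* A domain grants a request only if all of its layers grant it. */
    static bool domain_grants(const struct model_layer *layers,
                              size_t num_layers, const char *path,
                              unsigned long long request)
    {
        size_t i;

        for (i = 0; i < num_layers; i++) {
            if (!layer_grants(&layers[i], path, request))
                return false;
        }
        return true;
    }

For example, if a first layer allows reading beneath ``/usr`` and a second
layer only allows reading beneath ``/usr/lib``, this model grants a read of a
file beneath ``/usr/lib`` but denies a read of a file beneath ``/usr/bin``,
because the second layer has no rule on that path.
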
Bind mounts and OverlayFS
-------------------------

Landlock enables restricting access to file hierarchies, which means that these
access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected in the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`.  For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel.  This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's `handled_access_fs`.  Making
handled access rights explicit enables the kernel and user space to have a
clear contract with each other.  This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features.  In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version.  Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

As for file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`.  However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still enable explicitly
restricting such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets.  This can be an issue for a
task willing to enforce a new ruleset in complement to its 16 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

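A program that may be started from an already deeply nested sandbox can report
this specific failure separately from other errors.  The following is a
minimal sketch; ``enforce_ruleset()`` is a hypothetical helper name, the
messages are illustrative, and the system call is issued through
:manpage:`syscall(2)` in case the C library provides no wrapper.

.. code-block:: c

    #include <errno.h>
    #include <stdio.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: enforces the ruleset referred to by @ruleset_fd on
     * the calling thread.  Returns 0 on success, or -1 with errno preserved.
     */
    static int enforce_ruleset(const int ruleset_fd)
    {
        int saved_errno;

        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
            return -1;
        if (!syscall(__NR_landlock_restrict_self, ruleset_fd, 0))
            return 0;
        /* Keep errno intact across the stdio calls below. */
        saved_errno = errno;
        if (saved_errno == E2BIG)
            fprintf(stderr, "Landlock layer limit reached\n");
        else
            perror("Failed to enforce ruleset");
        errno = saved_errno;
        return -1;
    }

Whether hitting the layer limit should be fatal is a policy decision left to
the caller: the restrictions inherited from the existing layers stay enforced
either way.
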
Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI 1)
---------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules.  Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new `LANDLOCK_ACCESS_FS_REFER`
access right.

.. _kernel_support:

Kernel support
==============

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with `CONFIG_SECURITY_LANDLOCK=y`.  Landlock must also be enabled at boot
time like the other security modules.  The list of security modules enabled by
default is set with `CONFIG_LSM`.  The kernel configuration should then
contain `CONFIG_LSM=landlock,[...]` with `[...]` as the list of other
potentially useful security modules for the running system (see the
`CONFIG_LSM` help).

If the running kernel doesn't have `landlock` in `CONFIG_LSM`, then we can
still enable it by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst thanks to the bootloader
configuration.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using user space processes to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and thus lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c