.. SPDX-License-Identifier: GPL-2.0

======================
Memory Protection Keys
======================

Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
which is found on Intel's Skylake (and later) "Scalable Processor"
Server CPUs. It will be available in future non-server Intel parts
and future AMD processors.

For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
17.04 image.

Memory Protection Keys provides a mechanism for enforcing page-based
protections, but without requiring modification of the page tables
when an application changes protection domains.  It works by
dedicating 4 previously ignored bits in each page table entry to a
"protection key", giving 16 possible keys.

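As a rough illustration of that 4-bit field, the sketch below reads and
writes the key in a raw 64-bit page table entry.  It assumes the x86-64
layout, where the key occupies bits 62:59 of the entry.  The names
PTE_PKEY_SHIFT, PTE_PKEY_MASK, pte_to_pkey() and pte_set_pkey() are made
up for this example and are not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 layout: the protection key sits in PTE bits 62:59. */
#define PTE_PKEY_SHIFT 59
#define PTE_PKEY_MASK  (UINT64_C(0xf) << PTE_PKEY_SHIFT)

/* Hypothetical helper: return the 4-bit key (0..15) stored in a raw PTE. */
static inline unsigned int pte_to_pkey(uint64_t pte)
{
	return (unsigned int)((pte & PTE_PKEY_MASK) >> PTE_PKEY_SHIFT);
}

/* Hypothetical helper: return a copy of the PTE tagged with 'pkey'. */
static inline uint64_t pte_set_pkey(uint64_t pte, unsigned int pkey)
{
	assert(pkey < 16);
	return (pte & ~PTE_PKEY_MASK) | ((uint64_t)pkey << PTE_PKEY_SHIFT);
}
```

Userspace never edits page table entries directly.  The kernel stores the
key on the application's behalf when pkey_mprotect() is called.
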
There is also a new user-accessible register (PKRU) with two separate
bits (Access Disable and Write Disable) for each key.  Being a CPU
register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.

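The per-key bit pairs can be sketched with plain bit arithmetic.  The
example assumes the usual x86 layout of the 32-bit PKRU value: key N uses
bit 2*N for Access Disable and bit 2*N+1 for Write Disable, matching the
values of PKEY_DISABLE_ACCESS (0x1) and PKEY_DISABLE_WRITE (0x2).  The
helper and macro names are hypothetical, and the code only computes
values; it does not touch the register.

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_BITS_PER_PKEY 2
#define PKRU_AD 0x1u /* Access Disable, same value as PKEY_DISABLE_ACCESS */
#define PKRU_WD 0x2u /* Write Disable, same value as PKEY_DISABLE_WRITE */

/* Hypothetical helper: extract the two rights bits for 'pkey'. */
static inline uint32_t pkru_get_rights(uint32_t pkru, unsigned int pkey)
{
	assert(pkey < 16);
	return (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & (PKRU_AD | PKRU_WD);
}

/* Hypothetical helper: return 'pkru' with the rights for 'pkey' replaced. */
static inline uint32_t pkru_set_rights(uint32_t pkru, unsigned int pkey,
				       uint32_t rights)
{
	unsigned int shift = pkey * PKRU_BITS_PER_PKEY;

	assert(pkey < 16);
	assert((rights & ~(PKRU_AD | PKRU_WD)) == 0);
	return (pkru & ~((PKRU_AD | PKRU_WD) << shift)) | (rights << shift);
}
```

For instance, pkru_set_rights(0, 1, PKRU_WD) yields 0x8: only the Write
Disable bit of key 1 is set.
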
There are two new instructions (RDPKRU/WRPKRU) for reading and writing
to the new register.  The feature is only available in 64-bit mode,
even though there is theoretically space in the PAE PTEs.  These
permissions are enforced on data access only and have no effect on
instruction fetches.

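A minimal sketch of wrappers around the two instructions is shown here.
It assumes an x86-64 build with GCC or Clang and the compiler-provided
<cpuid.h> header.  The names cpu_has_ospke(), rdpkru() and wrpkru() are
chosen for this example.  The instructions are emitted as raw opcode
bytes so the sketch does not depend on assembler support for the
mnemonics.

```c
#include <assert.h>
#include <cpuid.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):ECX bit 4 (OSPKE) is set once the OS has enabled PKU. */
static inline int cpu_has_ospke(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ecx >> 4) & 1;
}

/* RDPKRU: requires ECX = 0, returns PKRU in EAX and clears EDX. */
static inline uint32_t rdpkru(void)
{
	uint32_t eax, edx;

	assert(cpu_has_ospke()); /* the instruction faults without OSPKE */
	__asm__ volatile(".byte 0x0f,0x01,0xee"
			 : "=a"(eax), "=d"(edx)
			 : "c"(0));
	(void)edx;
	return eax;
}

/* WRPKRU: writes EAX to PKRU, ECX and EDX must both be 0. */
static inline void wrpkru(uint32_t pkru)
{
	assert(cpu_has_ospke()); /* the instruction faults without OSPKE */
	__asm__ volatile(".byte 0x0f,0x01,0xef"
			 : /* no outputs */
			 : "a"(pkru), "c"(0), "d"(0)
			 : "memory");
}
```

Callers are expected to check cpu_has_ospke() first; on a machine without
the feature the sketch never executes either instruction.
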
Syscalls
========

There are 3 system calls which directly interact with pkeys::

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
	int pkey_free(int pkey);
	int pkey_mprotect(unsigned long start, size_t len,
			  unsigned long prot, int pkey);

Before a pkey can be used, it must first be allocated with
pkey_alloc().  An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key.  In this example WRPKRU is wrapped by a C function
called pkey_set().
::

	int real_prot = PROT_READ|PROT_WRITE;
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
	... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access::

	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
	*ptr = foo; // assign something
	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

Now when it frees the memory, it will also free the pkey since it
is no longer in use::

	munmap(ptr, PAGE_SIZE);
	pkey_free(pkey);

.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
          An example implementation can be found in
          tools/testing/selftests/x86/protection_keys.c.

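The fragments above can be combined into one function with error
handling.  This is only a sketch: it assumes the wrappers that glibc 2.27
and later declare in <sys/mman.h> under _GNU_SOURCE (pkey_alloc(),
pkey_mprotect(), pkey_set() and pkey_free()) rather than the selftest
helpers, and the function name pkey_lifecycle_demo() is made up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 0 on success, 1 if no protection key could be allocated
 * (for example on a CPU or kernel without pkey support), -1 otherwise.
 */
static int pkey_lifecycle_demo(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	int pkey, ret = -1;
	int *ptr;

	assert(page_size > 0);

	/* The key starts out with writes disabled for this thread. */
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	if (pkey < 0)
		return 1;

	ptr = mmap(NULL, page_size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED)
		goto out_free;

	/* Page permissions allow writes; the key is what blocks them. */
	if (pkey_mprotect(ptr, page_size, PROT_READ | PROT_WRITE, pkey) < 0)
		goto out_unmap;

	/* Open a write window, update the data, then close the window. */
	if (pkey_set(pkey, 0) < 0)
		goto out_unmap;
	*ptr = 42;
	if (pkey_set(pkey, PKEY_DISABLE_WRITE) < 0)
		goto out_unmap;

	/* Reads are still permitted because only writes are disabled. */
	printf("value at ptr: %d\n", *ptr);
	ret = 0;

out_unmap:
	munmap(ptr, page_size);
out_free:
	pkey_free(pkey);
	return ret;
}
```

Treating every pkey_alloc() failure as "not available" keeps the sketch
short; a real application would likely inspect errno to tell a missing
feature apart from key exhaustion.
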
Behavior
========

The kernel attempts to make protection keys consistent with the
behavior of a plain mprotect().  For instance if you do this::

	mprotect(ptr, size, PROT_NONE);
	something(ptr);

you can expect the same effects with protection keys when doing this::

	pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE);
	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
	something(ptr);

That should be true whether something() is a direct access to 'ptr'
like::

	*ptr = foo;

or when the kernel does the access on the application's behalf like
with a read()::

	read(fd, ptr, 1);

The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.