cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

5level-paging.rst (2593B)


.. SPDX-License-Identifier: GPL-2.0

==============
5-level paging
==============

Overview
========
Original x86-64 was limited by 4-level paging to 256 TiB of virtual address
space and 64 TiB of physical address space. We are already bumping into
this limit: some vendors offer servers with 64 TiB of memory today.

To overcome the limitation, upcoming hardware will introduce support for
5-level paging. It is a straightforward extension of the current page
table structure, adding one more layer of translation.

It bumps the limits to 128 PiB of virtual address space and 4 PiB of
physical address space. This "ought to be enough for anybody" ©.

QEMU 2.9 and later support 5-level paging.

Virtual memory layout for 5-level paging is described in
Documentation/x86/x86_64/mm.rst


Enabling 5-level paging
=======================
CONFIG_X86_5LEVEL=y enables the feature.

A kernel built with CONFIG_X86_5LEVEL=y is still able to boot on 4-level
hardware. In this case the additional page table level, p4d, is folded at
runtime.

User-space and large virtual address space
==========================================
On x86, 5-level paging enables 56-bit userspace virtual address space.
Not all user space is ready to handle wide addresses. It's known that
at least some JIT compilers use higher bits in pointers to encode their
information. These bits collide with valid pointers under 5-level paging,
which leads to crashes.

To mitigate this, the kernel does not allocate virtual address space
above 47-bit by default.

But userspace can ask for allocation from the full address space by
specifying a hint address (with or without MAP_FIXED) above 47-bit.

If the hint address is set above 47-bit, but MAP_FIXED is not specified, we
try to look for an unmapped area at the specified address. If it's already
occupied, we look for an unmapped area in the *full* address space, rather
than in the 47-bit window.

A high hint address only affects the allocation in question, not any
future mmap()s.

Specifying a high hint address on an older kernel or on a machine without
5-level paging support is safe. The hint will be ignored and the kernel will
fall back to allocation from the 47-bit address space.

This approach makes it easy to make an application's memory allocator aware
of the large address space without manually tracking allocated virtual
address space.

One important case we need to handle here is interaction with MPX.
MPX (without the MAWA extension) cannot handle addresses above 47-bit, so we
need to make sure that MPX cannot be enabled if we already have a VMA above
the boundary, and forbid creating such VMAs once MPX is enabled.