cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack

memory.rst (7757B)


==============================
Memory Layout on AArch64 Linux
==============================

Author: Catalin Marinas <catalin.marinas@arm.com>

This document describes the virtual memory layout used by the AArch64
Linux kernel. The architecture allows up to 4 levels of translation
tables with a 4KB page size and up to 3 levels with a 64KB page size.

AArch64 Linux uses either 3 levels or 4 levels of translation tables
with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
(256TB) virtual addresses, respectively, for both user and kernel. With
64KB pages, only 2 levels of translation tables are used, allowing
42-bit (4TB) virtual addresses, but the memory layout is the same.

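The relationship between page size, VA width and translation levels can
be sketched in a few lines of C. This is an illustrative userspace
model, not kernel code: the helper names are hypothetical, and it
assumes that each table occupies one page of 8-byte descriptors, so
that every level resolves (page_shift - 3) bits.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Bytes covered by a VA space of the given width (va_bits < 64). */
   static uint64_t va_space_size(unsigned int va_bits)
   {
       assert(va_bits < 64);
       return UINT64_C(1) << va_bits;
   }

   /*
    * Number of translation levels, rounded up. The in-page offset
    * consumes page_shift bits; each level resolves page_shift - 3 bits
    * (assumption: one page of 8-byte descriptors per table).
    */
   static unsigned int translation_levels(unsigned int va_bits,
                                          unsigned int page_shift)
   {
       unsigned int bits_per_level;

       assert(page_shift > 3 && va_bits > page_shift);
       bits_per_level = page_shift - 3;
       return (va_bits - page_shift + bits_per_level - 1) / bits_per_level;
   }

Under these assumptions, ``translation_levels(39, 12)`` evaluates to 3,
``translation_levels(48, 12)`` to 4 and ``translation_levels(42, 16)``
to 2, matching the configurations described above.
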
ARMv8.2 adds optional support for Large Virtual Address space. This is
only available when running with a 64KB page size and expands the
number of descriptors in the first level of translation.

User addresses have bits 63:48 set to 0 while the kernel addresses have
the same bits set to 1. TTBRx selection is given by bit 63 of the
virtual address. The swapper_pg_dir contains only kernel (global)
mappings while the user pgd contains only user (non-global) mappings.
The swapper_pg_dir address is written to TTBR1 and never written to
TTBR0.

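A minimal sketch of the bits 63:48 rule, assuming the 48-bit
configuration and ignoring features such as tagged addresses. The
``classify_va`` helper and its enum are hypothetical names used only
for illustration:

.. code-block:: c

   #include <stdint.h>

   enum va_kind { VA_USER, VA_KERNEL, VA_INVALID };

   /* Classify a 64-bit virtual address by the value of bits 63:48. */
   static enum va_kind classify_va(uint64_t addr)
   {
       uint64_t top = addr >> 48;      /* bits 63:48 */

       if (top == 0x0000)
           return VA_USER;             /* translated via TTBR0 */
       if (top == 0xffff)
           return VA_KERNEL;           /* translated via TTBR1 */
       return VA_INVALID;              /* neither all zeros nor all ones */
   }
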

AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::

  Start			End			Size		Use
  -----------------------------------------------------------------------
  0000000000000000	0000ffffffffffff	 256TB		user
  ffff000000000000	ffff7fffffffffff	 128TB		kernel logical memory map
 [ffff600000000000	ffff7fffffffffff]	  32TB		[kasan shadow region]
  ffff800000000000	ffff800007ffffff	 128MB		bpf jit region
  ffff800008000000	ffff80000fffffff	 128MB		modules
  ffff800010000000	fffffbffefffffff	 124TB		vmalloc
  fffffbfff0000000	fffffbfffdffffff	 224MB		fixed mappings (top down)
  fffffbfffe000000	fffffbfffe7fffff	   8MB		[guard region]
  fffffbfffe800000	fffffbffff7fffff	  16MB		PCI I/O space
  fffffbffff800000	fffffbffffffffff	   8MB		[guard region]
  fffffc0000000000	fffffdffffffffff	   2TB		vmemmap
  fffffe0000000000	ffffffffffffffff	   2TB		[guard region]


AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support)::

  Start			End			Size		Use
  -----------------------------------------------------------------------
  0000000000000000	000fffffffffffff	   4PB		user
  fff0000000000000	ffff7fffffffffff	  ~4PB		kernel logical memory map
 [fffd800000000000	ffff7fffffffffff]	 512TB		[kasan shadow region]
  ffff800000000000	ffff800007ffffff	 128MB		bpf jit region
  ffff800008000000	ffff80000fffffff	 128MB		modules
  ffff800010000000	fffffbffefffffff	 124TB		vmalloc
  fffffbfff0000000	fffffbfffdffffff	 224MB		fixed mappings (top down)
  fffffbfffe000000	fffffbfffe7fffff	   8MB		[guard region]
  fffffbfffe800000	fffffbffff7fffff	  16MB		PCI I/O space
  fffffbffff800000	fffffbffffffffff	   8MB		[guard region]
  fffffc0000000000	ffffffdfffffffff	  ~4TB		vmemmap
  ffffffe000000000	ffffffffffffffff	 128GB		[guard region]


Translation table lookup with 4KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |         |         |         |         |
   |                 |         |         |         |         v
   |                 |         |         |         |   [11:0]  in-page offset
   |                 |         |         |         +-> [20:12] L3 index
   |                 |         |         +-----------> [29:21] L2 index
   |                 |         +---------------------> [38:30] L1 index
   |                 +-------------------------------> [47:39] L0 index
   +-------------------------------------------------> [63] TTBR0/1

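The field boundaries in the 4KB diagram can be expressed as shifts and
masks. The following is a userspace sketch with hypothetical names,
assuming the 4-level (48-bit) configuration:

.. code-block:: c

   #include <stdint.h>

   struct va_fields_4k {
       unsigned int ttbr;      /* bit 63: 0 selects TTBR0, 1 selects TTBR1 */
       unsigned int l0;        /* bits 47:39 */
       unsigned int l1;        /* bits 38:30 */
       unsigned int l2;        /* bits 29:21 */
       unsigned int l3;        /* bits 20:12 */
       unsigned int offset;    /* bits 11:0, in-page offset */
   };

   /* Split a virtual address into its 4KB-granule lookup fields. */
   static struct va_fields_4k split_va_4k(uint64_t va)
   {
       struct va_fields_4k f;

       f.ttbr   = (unsigned int)((va >> 63) & 0x1);
       f.l0     = (unsigned int)((va >> 39) & 0x1ff);
       f.l1     = (unsigned int)((va >> 30) & 0x1ff);
       f.l2     = (unsigned int)((va >> 21) & 0x1ff);
       f.l3     = (unsigned int)((va >> 12) & 0x1ff);
       f.offset = (unsigned int)(va & 0xfff);
       return f;
   }

For the start of the modules region in the 48-bit layout
(ffff800008000000), this yields an L0 index of 256, an L2 index of 64
and zero for the other indices and the offset, with bit 63 selecting
TTBR1.
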

Translation table lookup with 64KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |    |               |              |
   |                 |    |               |              v
   |                 |    |               |            [15:0]  in-page offset
   |                 |    |               +----------> [28:16] L3 index
   |                 |    +--------------------------> [41:29] L2 index
   |                 +-------------------------------> [47:42] L1 index (48-bit)
   |                                                   [51:42] L1 index (52-bit)
   +-------------------------------------------------> [63] TTBR0/1

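With 64KB pages only the width of the L1 index depends on the VA size.
A hedged sketch of that difference, using hypothetical names and taking
the VA width (48 or 52) as a parameter:

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   struct va_fields_64k {
       unsigned int l1;        /* bits 47:42 (48-bit) or 51:42 (52-bit) */
       unsigned int l2;        /* bits 41:29 */
       unsigned int l3;        /* bits 28:16 */
       unsigned int offset;    /* bits 15:0, in-page offset */
   };

   /* Split a virtual address into its 64KB-granule lookup fields. */
   static struct va_fields_64k split_va_64k(uint64_t va, unsigned int va_bits)
   {
       struct va_fields_64k f;
       uint64_t l1_mask;

       assert(va_bits == 48 || va_bits == 52);
       /* 6 index bits for 48-bit VAs, 10 index bits for 52-bit VAs. */
       l1_mask = (UINT64_C(1) << (va_bits - 42)) - 1;

       f.l1     = (unsigned int)((va >> 42) & l1_mask);
       f.l2     = (unsigned int)((va >> 29) & 0x1fff);
       f.l3     = (unsigned int)((va >> 16) & 0x1fff);
       f.offset = (unsigned int)(va & 0xffff);
       return f;
   }
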

When using KVM without the Virtualization Host Extensions, the
hypervisor maps kernel pages in EL2 at a fixed (and potentially
random) offset from the linear mapping. See the kern_hyp_va macro and
kvm_update_va_mask function for more details. MMIO devices such as
GICv2 get mapped next to the HYP idmap page, as do vectors when
ARM64_SPECTRE_V3A is enabled for particular CPUs.

When using KVM with the Virtualization Host Extensions, no additional
mappings are created, since the host kernel runs directly in EL2.

52-bit VA support in the kernel
-------------------------------
If the optional ARMv8.2-LVA feature is present and the kernel is
running with a 64KB page size, then it is possible to use 52 bits of
address space for both userspace and kernel addresses. However, any
kernel binary that supports 52-bit must also be able to fall back to
48-bit at early boot time if the hardware feature is not present.

This fallback mechanism requires the kernel .text to be placed in the
higher addresses, so that its addresses are invariant to 48/52-bit
VAs. Because the kasan shadow is a fraction of the entire kernel VA
space, the end of the kasan shadow must also be in the higher half of
the kernel VA space for both 48-bit and 52-bit. (When switching from
48-bit to 52-bit, the end of the kasan shadow is invariant and
dependent on ~0UL, whilst the start address "grows" towards the lower
addresses.)

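The way the shadow start "grows" downwards can be checked against the
two layout tables in this document. The sketch below assumes a shadow
that covers one eighth of the kernel VA space and a fixed end address
of ffff800000000000; both values are read off the tables above rather
than taken from kernel headers, and the names are hypothetical:

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Assumption: first address after the shadow, as in both tables. */
   #define SHADOW_END_ASSUMED   UINT64_C(0xffff800000000000)
   /* Assumption: shadow is 1/8 of the kernel VA space (shift of 3). */
   #define SHADOW_SCALE_ASSUMED 3

   /* Start of the kasan shadow region for a given kernel VA width. */
   static uint64_t kasan_shadow_start(unsigned int va_bits)
   {
       uint64_t shadow_size;

       assert(va_bits > SHADOW_SCALE_ASSUMED && va_bits < 64);
       shadow_size = UINT64_C(1) << (va_bits - SHADOW_SCALE_ASSUMED);
       return SHADOW_END_ASSUMED - shadow_size;
   }

With these assumptions, ``kasan_shadow_start(48)`` gives
ffff600000000000 and ``kasan_shadow_start(52)`` gives fffd800000000000,
which are the start addresses listed in the 48-bit and 52-bit tables.
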
In order to optimise phys_to_virt and virt_to_phys, the PAGE_OFFSET
is kept constant at 0xFFF0000000000000 (corresponding to 52-bit);
this obviates the need for an extra variable read. The physvirt
offset and vmemmap offsets are computed at early boot to enable
this logic.

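The constant quoted above is simply the lowest address of a kernel VA
space of the given width. A small sketch, with a hypothetical helper
name, of how the 48-bit and 52-bit values relate:

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /*
    * Lowest kernel address for a VA space of va_bits bits: the top
    * (64 - va_bits) bits are all ones, the remaining bits are zero.
    */
   static uint64_t page_offset_for(unsigned int va_bits)
   {
       assert(va_bits < 64);
       /* Unsigned negation wraps modulo 2^64, giving 2^64 - 2^va_bits. */
       return -(UINT64_C(1) << va_bits);
   }

Here ``page_offset_for(52)`` is 0xfff0000000000000 and
``page_offset_for(48)`` is 0xffff000000000000, the start addresses of
the kernel logical memory map in the two layout tables.
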
As a single binary will need to support both 48-bit and 52-bit VA
spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
also must be sized large enough to accommodate a fixed PAGE_OFFSET.

Most code in the kernel should not need to consider VA_BITS. For
code that does need to know the VA size, the variables are
defined as follows:

VA_BITS		constant	the *maximum* VA space size

VA_BITS_MIN	constant	the *minimum* VA space size

vabits_actual	variable	the *actual* VA space size


Maximum and minimum sizes can be useful to ensure that buffers are
sized large enough or that addresses are positioned close enough for
the "worst" case.

52-bit userspace VAs
--------------------
To maintain compatibility with software that relies on the ARMv8.0
VA space maximum size of 48 bits, the kernel will, by default,
return virtual addresses to userspace from a 48-bit range.

Software can "opt-in" to receiving VAs from a 52-bit space by
specifying an mmap hint parameter that is larger than 48 bits.

For example:

.. code-block:: c

   maybe_high_address = mmap((void *)~0UL, size, prot, flags,...);

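A fuller userspace sketch of the opt-in, with hypothetical helper
names. It assumes an anonymous private mapping is acceptable; the hint
is only advisory, so on kernels or hardware without 52-bit support the
call is expected to return an ordinary address from the 48-bit range,
and callers should check the result rather than rely on it:

.. code-block:: c

   #define _DEFAULT_SOURCE
   #include <stddef.h>
   #include <stdint.h>
   #include <sys/mman.h>

   /* Request an anonymous mapping, hinting at the top of the VA space. */
   static void *map_with_high_hint(size_t size)
   {
       return mmap((void *)~0UL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
   }

   /* Non-zero if the address lies outside the 48-bit user range. */
   static int uses_high_va_bits(const void *p)
   {
       return ((uintptr_t)p >> 48) != 0;
   }

A caller would compare the result against ``MAP_FAILED`` first and then
use ``uses_high_va_bits()`` to find out whether the kernel actually
returned an address above the 48-bit range.
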
It is also possible to build a debug kernel that returns addresses
from a 52-bit space by enabling the following kernel config options:

.. code-block:: sh

   CONFIG_EXPERT=y && CONFIG_ARM64_FORCE_52BIT=y

Note that this option is only intended for debugging applications
and should not be used in production.