From: Linus Torvalds Date: Fri, 10 May 2019 17:24:53 +0000 (-0400) Subject: Merge tag 'docs-5.2a' of git://git.lwn.net/linux X-Git-Tag: microblaze-v5.4-rc1~747 X-Git-Url: http://git.monstr.eu/?a=commitdiff_plain;h=1fb3b526df3bd7647e7854915ae6b22299408baf;p=linux-2.6-microblaze.git Merge tag 'docs-5.2a' of git://git.lwn.net/linux Pull more documentation updates from Jonathan Corbet: "Some late arriving documentation changes. In particular, this contains the conversion of the x86 docs to RST, which has been in the works for some time but needed a couple of final tweaks" * tag 'docs-5.2a' of git://git.lwn.net/linux: (29 commits) Documentation: x86: convert x86_64/machinecheck to reST Documentation: x86: convert x86_64/cpu-hotplug-spec to reST Documentation: x86: convert x86_64/fake-numa-for-cpusets to reST Documentation: x86: convert x86_64/5level-paging.txt to reST Documentation: x86: convert x86_64/mm.txt to reST Documentation: x86: convert x86_64/uefi.txt to reST Documentation: x86: convert x86_64/boot-options.txt to reST Documentation: x86: convert i386/IO-APIC.txt to reST Documentation: x86: convert usb-legacy-support.txt to reST Documentation: x86: convert orc-unwinder.txt to reST Documentation: x86: convert resctrl_ui.txt to reST Documentation: x86: convert microcode.txt to reST Documentation: x86: convert pti.txt to reST Documentation: x86: convert amd-memory-encryption.txt to reST Documentation: x86: convert intel_mpx.txt to reST Documentation: x86: convert protection-keys.txt to reST Documentation: x86: convert pat.txt to reST Documentation: x86: convert mtrr.txt to reST Documentation: x86: convert tlb.txt to reST Documentation: x86: convert zero-page.txt to reST ... --- 1fb3b526df3bd7647e7854915ae6b22299408baf diff --cc Documentation/x86/kernel-stacks.rst index 000000000000,c7c7afce086f..6b0bcf027ff1 mode 000000,100644..100644 --- a/Documentation/x86/kernel-stacks.rst +++ b/Documentation/x86/kernel-stacks.rst @@@ -1,0 -1,147 +1,152 @@@ + .. 
SPDX-License-Identifier: GPL-2.0 + + ============= + Kernel Stacks + ============= + + Kernel stacks on 64-bit x86 + =========================== + + Most of the text from Keith Owens, hacked by AK + + x86_64 page size (PAGE_SIZE) is 4K. + + Like all other architectures, x86_64 has a kernel stack for every + active thread. These thread stacks are THREAD_SIZE (2*PAGE_SIZE) big. + These stacks contain useful data as long as a thread is alive or a + zombie. While the thread is in user space the kernel stack is empty + except for the thread_info structure at the bottom. + + In addition to the per-thread stacks, there are specialized stacks + associated with each CPU. These stacks are only used while the kernel + is in control on that CPU; when a CPU returns to user space the + specialized stacks contain no useful data. The main CPU stacks are: + + * Interrupt stack. IRQ_STACK_SIZE + + Used for external hardware interrupts. If this is the first external + hardware interrupt (i.e. not a nested hardware interrupt) then the + kernel switches from the current task to the interrupt stack. Like + the split thread and interrupt stacks on i386, this gives more room + for kernel interrupt processing without having to increase the size + of every per-thread stack. + + The interrupt stack is also used when processing a softirq. + + Switching to the kernel interrupt stack is done by software based on a + per-CPU interrupt nest counter. This is needed because x86-64 "IST" + hardware stacks cannot nest without races. + + x86_64 also has a feature which is not available on i386, the ability + to automatically switch to a new stack for designated events such as + double fault or NMI, which makes it easier to handle these unusual + events on x86_64. This feature is called the Interrupt Stack Table + (IST). There can be up to 7 IST entries per CPU. The IST code is an + index into the Task State Segment (TSS).
The IST entries in the TSS + point to dedicated stacks; each stack can be a different size. + + An IST is selected by a non-zero value in the IST field of an + interrupt-gate descriptor. When an interrupt occurs and the hardware + loads such a descriptor, the hardware automatically sets the new stack + pointer based on the IST value, then invokes the interrupt handler. If + the interrupt came from user mode, then the interrupt handler prologue + will switch back to the per-thread stack. If software wants to allow + nested IST interrupts then the handler must adjust the IST values on + entry to and exit from the interrupt handler. (This is occasionally + done, e.g. for debug exceptions.) + + Events with different IST codes (i.e. with different stacks) can be + nested. For example, a debug interrupt can safely be interrupted by an + NMI. arch/x86_64/kernel/entry.S::paranoidentry adjusts the stack + pointers on entry to and exit from all IST events, in theory allowing + IST events with the same code to be nested. However in most cases, the + stack size allocated to an IST assumes no nesting for the same code. + If that assumption is ever broken then the stacks will become corrupt. + + The currently assigned IST stacks are: + -* DOUBLEFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE). ++* ESTACK_DF. EXCEPTION_STKSZ (PAGE_SIZE). + + Used for interrupt 8 - Double Fault Exception (#DF). + + Invoked when handling one exception causes another exception. Happens + when the kernel is very confused (e.g. kernel stack pointer corrupt). + Using a separate stack allows the kernel to recover from it well enough + in many cases to still output an oops. + -* NMI_STACK. EXCEPTION_STKSZ (PAGE_SIZE). ++* ESTACK_NMI. EXCEPTION_STKSZ (PAGE_SIZE). + + Used for non-maskable interrupts (NMI). + + NMI can be delivered at any time, including when the kernel is in the + middle of switching stacks. Using IST for NMI events avoids making + assumptions about the previous state of the kernel stack. 
+ -* DEBUG_STACK. DEBUG_STKSZ ++* ESTACK_DB. EXCEPTION_STKSZ (PAGE_SIZE). + + Used for hardware debug interrupts (interrupt 1) and for software + debug interrupts (INT3). + + When debugging a kernel, debug interrupts (both hardware and + software) can occur at any time. Using IST for these interrupts + avoids making assumptions about the previous state of the kernel + stack. + -* MCE_STACK. EXCEPTION_STKSZ (PAGE_SIZE). ++ To handle nested #DB correctly there are two instances of DB stacks. On ++ #DB entry the IST stack pointer for #DB is switched to the second instance ++ so a nested #DB starts from a clean stack. The nested #DB switches ++ the IST stack pointer to a guard hole to catch triple nesting. ++ ++* ESTACK_MCE. EXCEPTION_STKSZ (PAGE_SIZE). + + Used for interrupt 18 - Machine Check Exception (#MC). + + MCE can be delivered at any time, including when the kernel is in the + middle of switching stacks. Using IST for MCE events avoids making + assumptions about the previous state of the kernel stack. + + For more details see the Intel IA32 or AMD AMD64 architecture manuals. + + + Printing backtraces on x86 + ========================== + + The question about the '?' preceding function names in an x86 stacktrace + keeps popping up; here's an in-depth explanation. It helps if the reader + stares at print_context_stack() and the whole machinery in and around + arch/x86/kernel/dumpstack.c. + + Adapted from Ingo's mail, Message-ID: <20150521101614.GA10889@gmail.com>: + + We always scan the full kernel stack for return addresses stored on + the kernel stack(s) [1]_, from stack top to stack bottom, and print out + anything that 'looks like' a kernel text address. + + If it fits into the frame pointer chain, we print it without a question + mark, knowing that it's part of the real backtrace. + + If the address does not fit into our expected frame pointer chain we + still print it, but we print a '?'.
It can mean two things: + + - either the address is not part of the call chain: it's just stale + values on the kernel stack, from earlier function calls. This is + the common case. + + - or it is part of the call chain, but the frame pointer was not set + up properly within the function, so we don't recognize it. + + This way we will always print out the real call chain (plus a few more + entries), regardless of whether the frame pointer was set up correctly + or not - but in most cases we'll get the call chain right as well. The + entries printed are strictly in stack order, so you can deduce more + information from that as well. + + The most important property of this method is that we _never_ lose + information: we always strive to print _all_ addresses on the stack(s) + that look like kernel text addresses, so if debug information is wrong, + we still print out the real call chain as well - just with more question + marks than ideal. + + .. [1] For things like IRQ and IST stacks, we also scan those stacks, in + the right order, and try to cross from one stack into another, + reconstructing the call chain. This works most of the time. diff --cc Documentation/x86/topology.rst index 000000000000,5176e5315faa..6e28dbe818ab mode 000000,100644..100644 --- a/Documentation/x86/topology.rst +++ b/Documentation/x86/topology.rst @@@ -1,0 -1,221 +1,221 @@@ + .. SPDX-License-Identifier: GPL-2.0 + + ============ + x86 Topology + ============ + + This documents and clarifies the main aspects of x86 topology modelling and + representation in the kernel. Update/change it when changing the + respective code. + + The architecture-agnostic topology definitions are in + Documentation/cputopology.txt. This file holds x86-specific + differences/specialities which do not necessarily apply to the generic + definitions. Thus, the way to read up on Linux topology on x86 is to start + with the generic one and look at this one in parallel for the x86 specifics.
+ + Needless to say, code should use the generic functions - this file is *only* + here to *document* the inner workings of x86 topology. + + Started by Thomas Gleixner and Borislav Petkov. + + The main aim of the topology facilities is to present adequate interfaces to + code which needs to know/query/use the structure of the running system wrt + threads, cores, packages, etc. + + The kernel does not care about the concept of physical sockets because a + socket has no relevance to software. It's an electromechanical component. In + the past a socket always contained a single package (see below), but with the + advent of Multi Chip Modules (MCM) a socket can hold more than one package. So + there might still be references to sockets in the code, but they are of + historical nature and should be cleaned up. + + The topology of a system is described in the units of: + + - packages + - cores + - threads + + Package + ======= + Packages contain a number of cores plus shared resources, e.g. DRAM + controller, shared caches etc. + + AMD nomenclature for package is 'Node'. + + Package-related topology information in the kernel: + + - cpuinfo_x86.x86_max_cores: + + The number of cores in a package. This information is retrieved via CPUID. + + - cpuinfo_x86.phys_proc_id: + + The physical ID of the package. This information is retrieved via CPUID + and deduced from the APIC IDs of the cores in the package. + - - cpuinfo_x86.logical_id: ++ - cpuinfo_x86.logical_proc_id: + + The logical ID of the package. As we do not trust BIOSes to enumerate the + packages in a consistent way, we introduced the concept of logical package + ID so we can sanely calculate the maximum possible number of packages in + the system and have the packages enumerated linearly. + + - topology_max_packages(): + + The maximum possible number of packages in the system. Helpful for per + package facilities to preallocate per package information.
+ + - cpu_llc_id: + + A per-CPU variable containing: + + - On Intel, the first APIC ID of the list of CPUs sharing the Last Level + Cache + + - On AMD, the Node ID or Core Complex ID containing the Last Level + Cache. In general, it is a number identifying an LLC uniquely on the + system. + + Cores + ===== + A core consists of 1 or more threads. It does not matter whether the threads + are SMT- or CMT-type threads. + + AMD's nomenclature for a CMT core is "Compute Unit". The kernel always uses + "core". + + Core-related topology information in the kernel: + + - smp_num_siblings: + + The number of threads in a core. The number of threads in a package can be + calculated by:: + + threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings + + + Threads + ======= + A thread is a single scheduling unit. It is the equivalent of a logical Linux + CPU. + + AMD's nomenclature for CMT threads is "Compute Unit Core". The kernel always + uses "thread". + + Thread-related topology information in the kernel: + + - topology_core_cpumask(): + + The cpumask contains all online threads in the package to which a thread + belongs. + + The number of online threads is also printed in /proc/cpuinfo "siblings." + + - topology_sibling_cpumask(): + + The cpumask contains all online threads in the core to which a thread + belongs. + + - topology_logical_package_id(): + + The logical package ID to which a thread belongs. + + - topology_physical_package_id(): + + The physical package ID to which a thread belongs. + + - topology_core_id(): + + The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo + "core_id." + + + + System topology examples + ======================== + + .. note:: + The alternative Linux CPU enumeration depends on how the BIOS enumerates the + threads. Many BIOSes enumerate all threads 0 first and then all threads 1. + That has the "advantage" that the logical Linux CPU numbers of threads 0 stay + the same whether threads are enabled or not.
That's merely an implementation + detail and has no practical impact. + + 1) Single Package, Single Core:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + + 2) Single Package, Dual Core + + a) One thread per core:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + -> [core 1] -> [thread 0] -> Linux CPU 1 + + b) Two threads per core:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + -> [thread 1] -> Linux CPU 1 + -> [core 1] -> [thread 0] -> Linux CPU 2 + -> [thread 1] -> Linux CPU 3 + + Alternative enumeration:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + -> [thread 1] -> Linux CPU 2 + -> [core 1] -> [thread 0] -> Linux CPU 1 + -> [thread 1] -> Linux CPU 3 + + AMD nomenclature for CMT systems:: + + [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0 + -> [Compute Unit Core 1] -> Linux CPU 1 + -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2 + -> [Compute Unit Core 1] -> Linux CPU 3 + + 3) Dual Package, Dual Core + + a) One thread per core:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + -> [core 1] -> [thread 0] -> Linux CPU 1 + + [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2 + -> [core 1] -> [thread 0] -> Linux CPU 3 + + b) Two threads per core:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + -> [thread 1] -> Linux CPU 1 + -> [core 1] -> [thread 0] -> Linux CPU 2 + -> [thread 1] -> Linux CPU 3 + + [package 1] -> [core 0] -> [thread 0] -> Linux CPU 4 + -> [thread 1] -> Linux CPU 5 + -> [core 1] -> [thread 0] -> Linux CPU 6 + -> [thread 1] -> Linux CPU 7 + + Alternative enumeration:: + + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 + -> [thread 1] -> Linux CPU 4 + -> [core 1] -> [thread 0] -> Linux CPU 1 + -> [thread 1] -> Linux CPU 5 + + [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2 + -> [thread 1] -> Linux CPU 6 + -> [core 1] -> [thread 0] -> Linux CPU 3 + -> [thread 1] -> Linux CPU 7 + + AMD nomenclature for CMT systems:: + + [node 0] -> [Compute
Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0 + -> [Compute Unit Core 1] -> Linux CPU 1 + -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2 + -> [Compute Unit Core 1] -> Linux CPU 3 + + [node 1] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 4 + -> [Compute Unit Core 1] -> Linux CPU 5 + -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 6 + -> [Compute Unit Core 1] -> Linux CPU 7 diff --cc Documentation/x86/x86_64/mm.rst index 000000000000,52020577b8de..267fc4808945 mode 000000,100644..100644 --- a/Documentation/x86/x86_64/mm.rst +++ b/Documentation/x86/x86_64/mm.rst @@@ -1,0 -1,161 +1,161 @@@ + .. SPDX-License-Identifier: GPL-2.0 + + ================= + Memory Management + ================= + + Complete virtual memory map with 4-level page tables + ==================================================== + + .. note:: + + - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down + from the top of the 64-bit address space. It's easier to understand the layout + when seen both in absolute addresses and in distance-from-top notation. + + For example 0xffffe90000000000 == -23 TB, it's 23 TB lower than the top of the + 64-bit address space (ffffffffffffffff). + + Note that as we get closer to the top of the address space, the notation changes + from TB to GB and then MB/KB. + + - "16M TB" might look weird at first sight, but it's an easier to visualize size + notation than "16 EB", which few will recognize at first sight as 16 exabytes. + It also nicely shows how incredibly large the 64-bit address space is.
+ + :: + + ======================================================================================================================== + Start addr | Offset | End addr | Size | VM area description + ======================================================================================================================== + | | | | + 0000000000000000 | 0 | 00007fffffffffff | 128 TB | user-space virtual memory, different per mm + __________________|____________|__________________|_________|___________________________________________________________ + | | | | + 0000800000000000 | +128 TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical + | | | | virtual memory addresses up to the -128 TB + | | | | starting offset of kernel mappings. + __________________|____________|__________________|_________|___________________________________________________________ + | + | Kernel-space virtual memory, shared between all processes: + ____________________________________________________________|___________________________________________________________ + | | | | + ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor + ffff880000000000 | -120 TB | ffff887fffffffff | 0.5 TB | LDT remap for PTI + ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base) + ffffc88000000000 | -55.5 TB | ffffc8ffffffffff | 0.5 TB | ... unused hole + ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base) + ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole + ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base) + ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... 
unused hole + ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory + __________________|____________|__________________|_________|____________________________________________________________ + | + | Identical layout to the 56-bit one from here on: + ____________________________________________________________|____________________________________________________________ + | | | | + fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole + | | | | vaddr_end for KASLR + fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping + fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole + ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks + ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole + ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space + ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole + ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0 + ffffffff80000000 |-2048 MB | | | + ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space + ffffffffff000000 | -16 MB | | | + FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset + ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI + ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole + __________________|____________|__________________|_________|___________________________________________________________ + + + Complete virtual memory map with 5-level page tables + ==================================================== + + .. note:: + + - With 56-bit addresses, user-space memory gets expanded by a factor of 512x, - from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PT starting ++ from 0.125 PB to 64 PB. 
All kernel mappings shift down to the -64 PB starting + offset and many of the regions expand to support the much larger physical + memory supported. + + :: + + ======================================================================================================================== + Start addr | Offset | End addr | Size | VM area description + ======================================================================================================================== + | | | | + 0000000000000000 | 0 | 00ffffffffffffff | 64 PB | user-space virtual memory, different per mm + __________________|____________|__________________|_________|___________________________________________________________ + | | | | - 0000800000000000 | +64 PB | ffff7fffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical ++ 0100000000000000 | +64 PB | feffffffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical + | | | | virtual memory addresses up to the -64 PB + | | | | starting offset of kernel mappings. + __________________|____________|__________________|_________|___________________________________________________________ + | + | Kernel-space virtual memory, shared between all processes: + ____________________________________________________________|___________________________________________________________ + | | | | + ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor + ff10000000000000 | -60 PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI + ff11000000000000 | -59.75 PB | ff90ffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base) + ff91000000000000 | -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole + ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base) + ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... 
unused hole + ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base) + ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole - ffdf000000000000 | -8.25 PB | fffffdffffffffff | ~8 PB | KASAN shadow memory ++ ffdf000000000000 | -8.25 PB | fffffbffffffffff | ~8 PB | KASAN shadow memory + __________________|____________|__________________|_________|____________________________________________________________ + | + | Identical layout to the 47-bit one from here on: + ____________________________________________________________|____________________________________________________________ + | | | | + fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole + | | | | vaddr_end for KASLR + fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping + fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole + ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks + ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole + ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space + ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole + ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0 + ffffffff80000000 |-2048 MB | | | + ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space + ffffffffff000000 | -16 MB | | | + FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset + ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI + ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole + __________________|____________|__________________|_________|___________________________________________________________ + + Architecture defines a 64-bit virtual address. Implementations can support + less. Currently supported are 48- and 57-bit virtual addresses. 
Bits 63 + through to the most-significant implemented bit are sign extended. + This causes a hole between user space and kernel addresses if you interpret them + as unsigned. + + The direct mapping covers all memory in the system up to the highest + memory address (this means in some cases it can also include PCI memory + holes). + + vmalloc space is lazily synchronized into the different PML4/PML5 pages of + the processes using the page fault handler, with init_top_pgt as + reference. + + We map EFI runtime services in the 'efi_pgd' PGD in a 64 GB large virtual + memory window (this size is arbitrary, it can be raised later if needed). + The mappings are not part of any other kernel PGD and are only available + during EFI runtime calls. + + Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all + physical memory, vmalloc/ioremap space and the virtual memory map are randomized. + Their order is preserved but their base will be offset early at boot time. + + Be very careful vs. KASLR when changing anything here. The KASLR address + range must not overlap with anything except the KASAN shadow area, which is + correct as KASAN disables KASLR. + + For both 4- and 5-level layouts, the STACKLEAK_POISON value sits in the last 2 MB + hole: ffffffffffff4111