x64 Virtual Address Translation
This article is Day 15 of the Connehito Advent Calendar 2024.
Introduction to Paging
Paging is a memory management mechanism that maps virtual address space to physical address space. It divides virtual address space into fixed-size units called 'pages' and physical address space into units of the same size called 'frames', then maps pages to frames. The page size can be 4KiB, 2MiB, or 1GiB. In this explanation, we'll use the 4KiB page size, which is the most commonly used configuration.
Pages can be mapped to frames independently, so they don't need to be contiguous in physical address space. This mapping is managed by page tables, which are structured in a hierarchical manner in x64 systems.
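As a rough illustration of this division, the sketch below splits an address into a page number and an offset within the page, assuming the 4KiB page size used in this article. The names PAGE_SIZE, page_number, and page_offset are made up for this example and do not come from any particular OS.

```python
PAGE_SIZE = 4 * 1024   # 4 KiB pages, as assumed in this article
PAGE_SHIFT = 12        # log2(PAGE_SIZE)

def page_number(addr: int) -> int:
    """Return the number of the page (or frame) that contains addr."""
    return addr >> PAGE_SHIFT

def page_offset(addr: int) -> int:
    """Return the byte offset of addr inside its page."""
    return addr & (PAGE_SIZE - 1)

# Two addresses inside the same 4 KiB page share a page number
# and differ only in their offset.
a, b = 0x1000, 0x1FFF
print(page_number(a), page_number(b))
print(hex(page_offset(a)), hex(page_offset(b)))
```

Because the mapping is done per page, only the page number is translated; the offset is carried over unchanged into the physical address.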
x64 Paging Structure
A page table is a data structure that stores the necessary information for mapping pages to frames.
In x64 4-Level Paging, four levels of page tables are used as follows:
- PML4 (Page Map Level 4)
- PDP (Page Directory Pointer)
- PD (Page Directory)
- PT (Page Table)
Each table consists of 512 entries, with each entry being 8 bytes in size. Each entry stores the base address of the next-level table and its attributes. As an exception, PT entries store the base address of the corresponding frame; adding the page offset to this base address yields the actual physical address.
The base address of the PML4 Table is obtained from the CR3 register. All page tables are placed in physical address space, and each table occupies 4KiB (512 entries × 8 bytes).
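The table layout can be sketched as simple arithmetic: an entry's physical address is the table's base address plus the index times the entry size. The function name entry_address and the base address 0x1000 are hypothetical; the 4KiB alignment check reflects the fact that x64 paging structures are 4KiB aligned.

```python
ENTRIES_PER_TABLE = 512
ENTRY_SIZE = 8                                # bytes per entry
TABLE_SIZE = ENTRIES_PER_TABLE * ENTRY_SIZE   # 4096 bytes = 4 KiB

def entry_address(table_base: int, index: int) -> int:
    """Physical address of entry `index` in the table at `table_base`."""
    if not 0 <= index < ENTRIES_PER_TABLE:
        raise ValueError("index must be in the range 0..511")
    if table_base % TABLE_SIZE != 0:
        # Paging structures are 4 KiB aligned on x64.
        raise ValueError("table base must be 4 KiB aligned")
    return table_base + index * ENTRY_SIZE

# Hypothetical table placed at physical address 0x1000.
print(hex(entry_address(0x1000, 0)))     # first entry
print(hex(entry_address(0x1000, 511)))   # last entry
```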
Each process requires its own page tables since each process has a unique virtual address space. The OS switches page tables by loading the base address of the process's page table into the CR3 register before execution.
The entries to be referenced in each table are determined by interpreting the structure of the virtual address during address translation.
Virtual Address Interpretation
The conversion from virtual to physical address requires proper interpretation of the virtual address. Let's use 0xFFFF80001E141F3C as an example virtual address to explain this process.
Canonical Address
In the x64 architecture, bits 63:48 of the virtual address are treated as sign extension bits. These bits must be filled with the value of bit 47.
Virtual addresses that satisfy this constraint are called 'canonical addresses'. If an address that does not follow this format is used for a memory access, the CPU raises an exception (typically a general-protection fault, #GP).
Canonical addresses are classified into two ranges:
- Negative canonical addresses (bits 63:48 are all 1s)
  - 0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF
  - Known as kernel space, used by the operating system kernel
- Positive canonical addresses (bits 63:48 are all 0s)
  - 0x0000000000000000 to 0x00007FFFFFFFFFFF
  - Known as user space, used by user processes
Our example address 0xFFFF80001E141F3C is a negative canonical address.
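The canonical-address rule can be checked with a few bit operations. The following is a minimal sketch, assuming 48-bit virtual addresses as used by 4-Level Paging; the function names is_canonical and sign_extend_48 are illustrative only.

```python
MASK_64 = (1 << 64) - 1

def is_canonical(va: int) -> bool:
    """True if bits 63:48 of a 64-bit address all equal bit 47."""
    upper = (va & MASK_64) >> 47      # bits 63:47, 17 bits in total
    return upper == 0 or upper == 0x1FFFF

def sign_extend_48(va: int) -> int:
    """Build a canonical address from the low 48 bits by copying bit 47 upward."""
    low = va & ((1 << 48) - 1)
    if low & (1 << 47):
        low |= 0xFFFF << 48
    return low

print(is_canonical(0xFFFF80001E141F3C))   # the example address
print(is_canonical(0x0000800000000000))   # bit 47 set, bits 63:48 clear
print(hex(sign_extend_48(0x80001E141F3C)))
```

Checking bits 63:47 together works because a canonical address has all 17 of those bits equal: all zeros in the positive range, all ones in the negative range.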
Page Table Index and Page Offset
Bits 47:12 of the virtual address are divided into four 9-bit segments, each used as an index into a different page table:
- Bits 47:39 - PML4 Table index
- Bits 38:30 - PDP Table index
- Bits 29:21 - PD Table index
- Bits 20:12 - Page Table index
Bits 11:0 are used as the page offset, which is added to the frame address to obtain the final physical address.
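The bit fields listed above can be extracted with shifts and masks. This is a small sketch rather than code from any real kernel; the VirtualAddressParts type and the split_virtual_address function are names invented for this example.

```python
from typing import NamedTuple

class VirtualAddressParts(NamedTuple):
    pml4: int     # bits 47:39
    pdp: int      # bits 38:30
    pd: int       # bits 29:21
    pt: int       # bits 20:12
    offset: int   # bits 11:0

def split_virtual_address(va: int) -> VirtualAddressParts:
    """Split a virtual address into four 9-bit indices and a 12-bit offset."""
    index_mask = 0x1FF   # 9 bits -> values 0..511
    return VirtualAddressParts(
        pml4=(va >> 39) & index_mask,
        pdp=(va >> 30) & index_mask,
        pd=(va >> 21) & index_mask,
        pt=(va >> 12) & index_mask,
        offset=va & 0xFFF,
    )

parts = split_virtual_address(0xFFFF80001E141F3C)
print(parts)
```

Since each index is 9 bits wide, the fields do not line up with hexadecimal digit boundaries, so computing them with shifts and masks is less error-prone than reading them off the hex digits.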
Let's interpret our example address 0xFFFF80001E141F3C:
- Bits 47:39 - 0x100 (256) → references PML4 Table[256]
- Bits 38:30 - 0x0 (0) → references PDP Table[0]
- Bits 29:21 - 0xF0 (240) → references PD Table[240]
- Bits 20:12 - 0x141 (321) → references Page Table[321]
- Bits 11:0 - 0xF3C (3900) → offset of 3900 bytes from the frame address
Because each index is 9 bits wide, the fields do not align with hexadecimal digit boundaries, so they are easiest to read off from the binary form of the address.
If the frame address pointed to by the final Page Table entry is 0x200000, the physical address becomes 0x200000 + 0xF3C = 0x200F3C.
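To tie the steps together, here is a simplified simulation of the whole four-level walk. It is a sketch under several assumptions: physical memory is modelled as a Python dict from physical address to 64-bit entry, the table locations (0x1000 to 0x4000) are hypothetical, only the Present flag (bit 0) and the address field (bits 51:12) of each entry are examined, and large pages, permission checks, and the TLB are ignored.

```python
PRESENT = 1 << 0                    # bit 0 of an entry: Present flag
ADDR_MASK = 0x000F_FFFF_FFFF_F000   # bits 51:12: next table or frame base

def translate(memory: dict, cr3: int, va: int) -> int:
    """Walk PML4 -> PDP -> PD -> PT and return the physical address."""
    table_base = cr3 & ADDR_MASK
    for shift in (39, 30, 21, 12):              # one 9-bit index per level
        index = (va >> shift) & 0x1FF
        entry = memory.get(table_base + index * 8, 0)
        if not entry & PRESENT:
            raise LookupError(f"entry not present (index bits start at {shift})")
        table_base = entry & ADDR_MASK          # next table, or the frame at the last level
    return table_base + (va & 0xFFF)            # frame base + page offset

# Hypothetical physical locations of the four tables and the target frame.
PML4, PDP, PD, PT = 0x1000, 0x2000, 0x3000, 0x4000
FRAME = 0x200000

# Only the entries needed for the example address are populated.
memory = {
    PML4 + 256 * 8: PDP | PRESENT,
    PDP + 0 * 8: PD | PRESENT,
    PD + 240 * 8: PT | PRESENT,
    PT + 321 * 8: FRAME | PRESENT,
}

physical = translate(memory, PML4, 0xFFFF80001E141F3C)
print(hex(physical))
```

A real CPU performs this walk in hardware and raises a page fault when an entry is not present; the LookupError here merely stands in for that behaviour.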