Virtual memory

Although the base and address registers are used to create an abstraction of the address space, there is another problem that needs to be solved: managing bloatware. Although the size of memory is growing rapidly, the size of software is growing even faster than the size of memory. In 1980, many universities used a 4 MB VAX computer to run a time-sharing operating system for a dozen simultaneous users. Now Microsoft’s recommended 64-bit Windows 8 system requires at least 2 GB of ram, and many multimedia trends are driving demand even further.

This development as a result, the need to run the program, often to memory cannot accommodate, and must require the system to support multiple applications running at the same time, even if the memory can satisfy the demand of one single program, but from the overall memory still cannot satisfy the increasing requirements of software (feeling is similar to the contradiction of XXX and XXX). Swap technology is not a very effective solution, in some small and medium applications can still use swap, if the application is too large, do you need to swap several GB of memory each time? This is obviously inappropriate, as the peak transfer speed of a typical SATA disk is several hundred megabits per second, which means it takes several seconds to swap out or swap in a 1 GB program.

SATA disks, also known as Serial ATA disks, are the trend of PC disks in the future, and have basically replaced traditional PATA disks.

So is there an effective way to deal with it? Yes, it uses virtual memory. The basic idea of virtual memory is that each program has its own address space divided into blocks called pages. Each page is a contiguous range of addresses. These pages are mapped to physical memory, but not all pages must be in memory to run the program. When a program references a portion of the address space in physical memory, the hardware immediately performs the necessary mapping. When a program references a portion of the address space that is not in physical memory, it is the responsibility of the operating system to load the missing portion into physical memory and re-execute the failed instruction.

In a sense, the virtual address is an overview of the base address register and the indexed register. The 8088 has separate base address registers (but not indexing registers) for text and data.

With virtual memory, you can map the entire address space into physical memory in tiny units, rather than just relocating the Text and data areas. We will explore how virtual memory is implemented.

Virtual memory is well suited for use in multiprogramming systems, where fragments of many programs are kept in memory at the same time and the CPU can be handed over to another process while one program waits for part of it to be read into memory.

paging

A technique called paging is used in most systems that use virtual memory. On any computer, a program references a set of memory addresses. When the program executes

MOV REG,1000
Copy the code

This instruction copies the contents of the memory location 1000 into the REG (or vice versa, depending on the computer). Addresses can be generated by index, base address register, segment register, or other means.

The addresses generated by the program are called virtual addresses and form a virtual address space. On a computer without virtual memory, the system sends the virtual addresses directly to the in-memory line. Read and write operations use the physical memory of the same address. When virtual memory is used, the virtual address is not sent directly to the memory bus. Instead, virtual addresses are mapped to physical Memory addresses using the Memory Management Unit (MMU), as shown in the figure below

The picture below shows how this mapping works. Right

The page table maps virtual addresses to physical memory addresses. Each page starts at a multiple of 4096 and ends at 4095, so 4K to 8K is actually 4096-8191, and 8K-12K is 8192-12287

In this example, we might have a computer with 16 bit addresses ranging from 0 to 64 K-1. These are virtual addresses. However, there is only 32 KB of physical address. So while 64 KB programs can be written, they cannot all be brought into memory to run. A complete copy of the core image of the program, at most 64 KB, must be kept on disk to ensure that the program fragments can be brought into memory when needed.

How to map pages that have mappings

The virtual address space consists of fixed-size cells called pages. In contrast, physical memory also has physical units of fixed size, called page frames. The page is the same size as the page frame. In the example above, the page size is 4KB, but in real life the page size can range from 512 bytes to 1G bytes. For 64 KB of virtual address space and 32 KB of physical memory, you get 16 virtual pages and 8 page boxes. The swap between RAM and disk is always done in units of entire pages.

When a program attempts to access an address, for example, execute the following instruction

MOV REG, 0
Copy the code

Virtual address 0 is sent to the MMU. MMU sees that the virtual address is on page 0 (0-4095). According to its mapping result, this page corresponds to page 2 (8192-12287), so MMU transforms the address to 8192 and sends the address 8192 to the bus. Memory knows nothing about the MMU, it just sees a read/write request to address 8192 and executes it. The MMU effectively maps all virtual addresses 0-4095 to physical addresses 8192-12287. Again, instructions

MOV REG, 8192
Copy the code

Is also effectively converted to

MOV REG, 24576
Copy the code

The virtual address 8192 (in virtual page 2) is mapped to the physical address 24576 (in physical page box 6).

With proper MMU Settings, 16 virtual pages can be mapped to any of the eight page boxes. But this does not solve the problem of virtual address space being larger than physical memory.

There are eight physical page boxes in the figure above, so only eight virtual pages are mapped to physical memory. The other pages represented by X in the figure above are not mapped. In real hardware, a Present/ Absent bit is used to record the actual presence of the page in memory.

How are unmapped pages mapped

When a program accesses an unmapped page, such as an execution instruction

MOV REG, 32780
Copy the code

What’s going to happen? What is the physical address corresponding to the 12th byte of virtual page 8 (starting from 32768)? The MMU notices that the page is not mapped (represented by an X in the diagram) and the CPU is trapped into the operating system. This trap is called a page fault or a missing page error. The operating system selects a rarely used page and writes its contents to disk (if it is not on disk). Then read the page you want to visit into the reclaimed page box, modify the mapping, and restart the instruction that caused the drop. It’s a little bit confusing, but let’s do an example.

For example, if the operating system decides to abandon page box 1, it will load virtual machine page 8 at physical address 4096 and make two changes to the MMU mapping. First, it marks the 1 entry in the virtual page as unmapped, so that any future access to the virtual addresses 4096-8191 will result in a trap. Then change the cross of the entry on virtual page 8 to 1, so that when the booby trap instruction is restarted, it maps the virtual address 32780 to the physical address (4096 + 12).

Let’s take a look at the internals of the MMU to see how they work and why the page sizes we chose were all integers of powers of 2. Below we can see an example of a virtual address

Virtual address 8196 (binary 0010000000000100) is mapped using the MMU mapping mechanism shown in the page table mapping diagram above, and the input 16-bit virtual address is divided into 4-bit page numbers and 12-bit offsets. A 4-bit page number can represent 16 pages, and a 12-bit offset can represent all 4096 bytes in a page.

The page number can be used as the index of the Page table to get the box number corresponding to the virtual page. 0 if present/absent causes an operating system to trap. If the bit is 1, the page frame number found in the page table is copied to the higher three bits of the output register, plus the lower 12-bit offset in the input virtual address. This makes up a 15-bit physical address. The contents of the output register are then sent to the bus as a physical address.

A page table

In the simple example above, the virtual address to physical address mapping can be summarized as follows: The virtual address is divided into virtual page number (high part) and offset (low part). For example, for 16-bit addresses and 4-KB page sizes, the top four bits can specify one of the 16 virtual pages, while the bottom 12 bits then determine the offset (0-4095) in the selected page.

The virtual page number can be used as the index of the page table to find the contents of the virtual page. You can find the page frame number (if any) from the page table entry. The frame number is then concatenated to the high end of the offset to replace the virtual page number and form the physical address.

Therefore, the purpose of the page table is to map virtual pages to page frames. Mathematically, a page table is a function that takes a virtual page number and results in a physical page frame number.

This function converts the virtual pages in the virtual address into a page frame to form a physical address.

Structure of the page table entry

Now let’s talk about the structure of the page entry. You know the general structure of the page entry, which is composed of the box number and in/out position. Now let’s talk about the structure of the page entry

The structure of the page entry is machine-specific, but the page entry is roughly the same on different machines. Above is the composition of a page table entry. The page table entries may vary from computer to computer, but generally they are 32 bits. The most important field in a Page entry is the Page Frame number. After all, the most important step in the page-table to page-box operation is to map this value across. The next important bit is in/out. If the value of this bit is 1, the page table entry is valid and can be used. If this value is 0, it indicates that the virtual page corresponding to the page entry is not in memory, and accessing this page causes a page fault.

Protection tells us which access is allowed. In its simplest form, the field has only one bit, 0 for readable and writable and 1 for read-only.

Modified and Referenced track page usage. When a page is written, the hardware automatically sets the change bit. The modify bit is useful when reassigning page frames. If a page has been modified (that is, it is dirty), it must be written back to disk. If a page has not been modified (that is, it is clean), the page box is discarded when reassigned because the copy on disk is still valid. This bit is sometimes called the dirty bit because it reflects the state of the page.

Referenced The access bit is set when a page is visited, whether it is read or written. This value helps the operating system select pages to be culled in the event of a missing page interrupt. Pages that are no longer in use are better suited for elimination than pages that are in use. This bit is very useful in the page replacement algorithm discussed later.

The last bit is used to disable the page from being cached, which plays a key role in mapping to device registers or memory. This bit is used to disable caching. This bit is not required for machines that have a separate I/O space rather than memory-mapped I/O.

Before diving into the following, it is important to emphasize that virtual memory is essentially used to create an abstraction of the address space, which can be thought of as a process abstraction of the CPU. The implementation of virtual memory is essentially a decomposition of the virtual address space into pages, each item mapped to a page box in physical memory. Because our focus is on how to manage this abstraction of virtual memory.

Speed up paging

Now that we have the basics of virtual Memory and paging, we can focus on concrete implementations. In any system with paging, there are two main problems:

  • The mapping between virtual addresses and physical addresses must be fast
  • If the virtual address space is large enough, then the page table is large enough

The first problem is that since every access to memory requires a mapping of virtual addresses to physical addresses, all instructions ultimately come from memory, and many instructions also access operands in memory.

Operands: An operand is a component of a computer instruction that specifies the amount of numeric operations to be performed in the instruction. Operands indicate the source of data required for the operation performed by the instruction. An operand is a field in an assembly instruction. For example, MOV, ADD, etc.

Therefore, each instruction may access the page table multiple times, and if it takes 1 ns to execute an instruction, the page table query needs to be completed within 0.2 ns to avoid the mapping becoming a major performance bottleneck.

The second problem is that all modern operating systems use at least 32-bit virtual addresses, and 64-bit is becoming more common. Assuming a page size of 4 KB, the 32-bit address space is nearly 1 million pages, while the 64-bit address space is unimaginably large.

The need for large and fast page maps becomes a very important constraint on building computers. Just like the figure in the page table above, each entry corresponds to a virtual page with the virtual page number as the index. When a process is started, the operating system puts a copy of the process page table read stored in memory into a register.

Is that last sentence hard to understand? Remember what a page table is? It is a mapping page table of virtual addresses to memory addresses. Page tables are a key component of virtual address translation and are required to access in-memory data. The virtual address to physical address conversion is performed many times at process startup, and a copy of the physical address is read from memory into a register before the conversion process is performed.

As a result, there is no need to access memory for page tables while the process is running. The advantages of using this approach are simplicity and no memory access is required during the mapping process. The disadvantages are that page tables are expensive when they are too large, and the entire page table must be loaded every time a context switch occurs, which can cause performance degradation. In view of this, let’s discuss an implementation that speeds up paging and handles large virtual address Spaces

Conversion detection buffer

Let’s start by exploring the problem of accelerated paging. Most optimizations start with page tables in memory. This design has a huge impact on efficiency. Consider, for example, a 1-byte instruction that wants to copy data from one register to another. In the absence of paging, this instruction accesses memory only once, that is, it fetches instructions from memory. With paging, more memory access is required to access page tables. Since execution speed is often limited by how fast the CPU can fetch instructions and data from memory, the performance of memory access can be reduced by half if it takes two accesses to achieve one. In this case, paging is not used at all.

What is a 1-byte instruction? Let’s take the 8085 microprocessor as an example to illustrate, in the 8085 microprocessor, there are three kinds of byte instructions, they are 1-byte(1 byte), 2-byte(2 byte), 3-byte(3 byte), let’s talk about them respectively

1-byte: The 1-byte operand and opcode are both 1 byte. Operands are internal registers that are encoded into instructions; Instructions require a storage location to store a single register in a storage location. Instructions with no operands are also 1-byte instructions.

For example: MOV B,C, LDAX B, NOP, HLT

2-byte: 2 bytes including: the opcode specified by the first byte; The second byte specifies the operand; Instructions require two memory locations to be stored in memory.

For example, MVI B, 26 H, IN 56 H

3-byte: In a 3-byte instruction, the first byte specifies the opcode; The last two bytes specify a 16-bit address; The second byte holds the low-order address; The third byte holds the high-level address. Instructions require three memory locations to store a single byte in memory.

For example, LDA 2050 H and JMP 2085 H

Most programs always make multiple visits to a few pages rather than a few visits to a large number of pages. As a result, only a few pages can be accessed again, and other page table entries are rarely accessed.

Page Table entries are also commonly known as Page Table Entry(PTE).

Based on this idea, a solution is proposed, that is, to solve this problem from the hardware aspect, set up a small hardware device for the computer, which can map the virtual address directly to the physical address, without having to access the page table. This device is known as Translation Lookaside Buffer (TLB), sometimes referred to as associate memory.

Significant bit Virtual page number To modify a To protect a Page frame no.
1 140 1 RW 31
1 20 0 R X 38
1 130 1 RW 29
1 129 1 RW 62
1 19 0 R X 50
1 21 0 R X 45
1 860 1 RW 14
1 861 1 RW 75

TLB accelerated paging

The TLB is usually located in the MMU and contains a small number of entries. Each entry records information about the page. Except for the virtual page number, other entries correspond to the page table one by one

TLB, if you still don’t understand what TLB is, TLB is basically a memory cache that reduces the amount of time it takes to access memory, it’s part of the MMU, and TLB stores the translation of virtual addresses to physical addresses, This is often called an address-translation cache. TLB usually sits between the CPU and the CPU cache, which is a different cache level from the CPU cache. Let’s take a look at how TLB works.

When the virtual address of an MMU needs to be translated, the hardware checks that the virtual page number matches all entries in the TLB to determine whether the virtual page is in the TLB. If a valid match is found and the access operation does not violate the protection bit, the page frame number is fetched directly from the TLB instead of accessing the page table directly. A protection fault is generated if the virtual page is in the TLB but violates protection bit permissions (such as read only but a write instruction).

What if the virtual address is no longer in TLB? If the MMU detects no valid match, it does a normal page table lookup, then evicts an entry from the TLB and places the entry found from the page table in the TLB. When an entry is purged from the TLB, the modified bits are copied to the in-memory page entry, leaving the bits unchanged except for the access bits. When a page table entry is loaded from the page table into the TLB, all values come from memory.

Software TLB management

Until now, we’ve assumed that every computer has a hardware-recognized page table, plus a TLB. In this design, TLB management and TLB error handling is done entirely by the hardware. An operating system trap occurs only when the page is not in memory.

In the past, our above assumption was usually correct. However, on many modern RISC machines, including SPARC, MIPS, and HP PA, almost all page management is done in software.

A compact instruction set computer or RISC is a computer instruction set that enables a computer’s microprocessor to have fewer instruction per instruction (CPI) cycles than a complex instruction set computer (CISC)

On these computers, TLB entries are loaded by the operating system display. When TLB access loss occurs, instead of the MMU searching the page table and retrieving the required page table entry, a TLB failure is generated and the problem is resolved by the operating system. The operating system must find the page, remove it from the TLB (remove an item from the page table), place the newly found page in the TLB, and finally execute the instruction that failed earlier. However, all of this must be done with a few instructions, because TLB loss occurs much more often than error.

The common way to handle TLB failures, whether hardware or software, is to find the page table and perform an index operation to locate the page to be visited. The problem with searching in software is that the page that holds the page table may not be in the TLB, which will cause other TLB errors during the processing. The improvement is to maintain a large cache of TLB entries in a fixed location in memory to reduce TLB failures. By checking the software cache first, the operating system can effectively reduce TLB failures.

TLB software management has two kinds of TLB failures. When a page access is in memory but not in TLB, a soft miss will occur. Then all you need to do is update the page table to TLB (the process we discussed above) without causing disk I/O. Processing takes only a few machine instructions in a matter of nanoseconds. However, hard misses occur when the page itself is out of memory, and page table extraction is required from disk. Hard misses typically take millions of times longer than soft misses. The process of finding a mapping in a page table structure is called a Page Table walk.

Both of these are ideal phenomena, but in practice the situation is more complicated, and a miss may be neither hard nor soft. Some misses may be softer or harder (chuckles). For example, if the desired page is not found during the page table traversal, three situations will occur:

  • The required pages are in memory, but not recorded in the process’s page table. In this case, another process may have dropped the pages from disk into memory. In this case, the pages need only be mapped correctly, rather than being called in from disk againMinor Page fault.
  • Based on the above, if you need to load pages directly from your hard disk, this is itMajor page falut.
  • In another case, the program may access an invalid address without adding a mapping to the TLB at all. At this point, the operating system reports oneSegmentation faultsTo terminate the program. Only the third type of missing page is a program error; the others are fixed by the hardware or operating system at the expense of reduced procedural performance

Page tables for large memory

Remember what we were talking about? (her face), which may be discussed too much, you don’t know, I’ll remind you that the above speed paging process is discussed in the virtual addresses to physical address mapping must be fast, is another question If the virtual address space is enough big, there will be a big enough problem page table, how to deal with the huge virtual address space, the following our discussions.

Multistage page table

The first option is to use multilevel page tables (MULTI). Here is an example

32-bit virtual addresses are divided into a 10-bit PT1 field, a 10-bit PT2 field, and a 12-bit Offset field. Because the offset is 12 bits, the page size is 4KB, which is 2^20 pages.

The reason for introducing multi-level page tables is to avoid keeping the entire page table in memory all the time. Page tables that are not needed should not be kept.

A multilevel page table is a paging scheme that consists of two or more levels of paging tables, also known as hierarchical paging. Entries in level 1 (Level 1) page tables are Pointers to Level 2 (Level 2) page tables, entries in level 2 page tables are Pointers to Level 3 (Level 3) page tables, and so on. The last level of page tables stores the actual information.

Below is the working process of a two-level page table

On the far left is the top-level page table, which has 1024 entries corresponding to the 10-bit PT1 field. When a virtual address is sent to THE MMU, the MMU first extracts the PT1 field and uses that value as an index to access the top-level page table. Because the entire 4 GB (32-bit) virtual address has been partitioned into 4 KB sizes, each of the 1024 entries in the top-level page table represents a 4 MB block address range.

The entry from the index top-level page table contains the address or box number of the secondary page table. Item 0 of the top-level page table points to the page table of the program body, item 1 points to the page table containing data, item 1023 points to the page table of the stack, and other items (shaded) indicate unused. Now use the PT2 field as an index to access the selected secondary page table to find the corresponding box number for the virtual page.

Inverted page table

An alternative to the proliferation of Inverted Page tables in the paging hierarchy is to use inverted Page tables. Examples of this solution include PowerPC, UltraSPARC, and Itanium. In this design, there is one entry per page box in real memory, rather than one entry per virtual page.

Although the inverted page table saves a lot of space, it has its own disadvantage: converting virtual addresses to physical addresses can be difficult. When process N accesses virtual page P, the hardware can no longer look up the physical page by treating P as an index to the page table. Instead, you must search the entire inversion table to find an entry. In addition, the search must be performed once for every memory access operation, not when a page miss interrupt occurs.

The way to solve this problem is to use TLB. When TLB failure occurs, software is required to search the entire inversion page list. One possible way to do this is to create a hash table with virtual addresses. All virtual pages currently in memory with the same hash value are linked together. As shown in the figure below

If there are as many slots in the hash table as there are physical pages in the machine, then the length of the collision chain in the hash table will be one entry long, which will greatly improve the mapping speed. Once the page frame is found, the new (virtual page number, physical page frame number) is loaded into the TLB.

KanWu

On page P115, the positions of the two lines are drawn in reverse. It is easy to mislead others according to the following description. Please amend it.

Article Reference:

En.wikipedia.org/wiki/Page_r…

Faculty.salina.k-state.edu/tim/ossg/Me…

www.geeksforgeeks.org/page-replac…

www.geeksforgeeks.org/multilevel-…

En.wikipedia.org/wiki/Transl…

Electricalvoice.com/instruction…

En.wikipedia.org/wiki/Page_t…

www.javatpoint.com/os-page-tab…

Baike.baidu.com/item/ memory / 103…

baike.baidu.com/item/ data segment /51…

Blog.csdn.net/One_L_Star/…

Modern operating systems, fourth edition

Modern Operation System Fourth

Baike.baidu.com/item/SATA hard disk…

baike.baidu.com/item/ Virtual address /1…