Let’s start with a question that puzzled me for a long time: the difference between virtual and logical addresses.

Most books on operating systems write either “virtual address” or “logical address”, which confused me.

I finally found a precise answer in the book “Understanding the Linux Kernel”. I won’t reproduce it here. The two concepts do differ, but for everyday use and for understanding operating systems, we can treat virtual address and logical address as meaning the same thing for now.

None of the addresses you see are real

The following C code excerpt from Operating Systems: Three Easy Pieces by Remzi H. Arpaci-Dusseau prints the address of main, the address of a heap allocation returned by malloc (similar to the new operator in Java), and the address of an integer on the stack:

Running it prints three addresses, one for each region. The exact values vary from machine to machine and from run to run.

What we need to know is that all of these printed addresses are virtual. They are not locations in physical memory. The operating system and the CPU hardware translate each of them into a real physical address, and only then is the value fetched from that physical location.

OK, that was the introduction, meant to give you an intuitive feel for physical and virtual addresses. Now the main text begins.

Physical Addressing

The concept of a physical address is easy to understand; you can also call it a real address. Computer Systems: A Programmer’s Perspective, 3rd Edition defines it as follows:

The main memory of a computer system is organized as an array of M contiguous byte-sized cells. Each byte has a unique physical address.

For example, the first byte’s physical address is 0, the next byte’s address is 1, the next byte’s address is 2, and so on. Given this simple structure, the most natural way for a CPU to access memory is to use such physical addresses. We refer to this approach as physical addressing.

For example, suppose a program executes a load instruction that reads the 4-byte word starting at physical address 4 into a register.

Physical addressing then works as follows: when the CPU executes this instruction, it generates the physical address 4 and passes it to main memory over the memory bus. Main memory fetches the 4-byte word starting at physical address 4 and returns it to the CPU, which stores it in the specified register. See the figure below:

Physical addressing, in which each program accesses physical memory directly, has significant drawbacks:

1) First, user programs can address any byte of memory, so they can easily corrupt the operating system, intentionally or by accident, and bring the whole system to a grinding halt.

2) Second, this approach to addressing makes it nearly impossible to run two or more programs at the same time.

For example, suppose we open three copies of the same program (a calculator) and each runs to the same point. The user enters 10, 100, and 1000 in the three windows, and in each case the corresponding instruction saves the number at the same fixed address in memory. That location can hold only one number, so which one should it hold? That is a conflict.

Here’s another example, from Modern Operating Systems, 3rd Edition:

If one program stores some data at physical memory address 1000 and another program then writes a value to that same address, the second write overwrites the first, and both programs crash almost immediately.

Of course, we said “almost impossible”, not “impossible”: there are ways to run multiple programs concurrently even with physical addressing.

The simplest approach works like this. First, keep idle processes on disk so that they take up no memory while they are not running. Then let one program (or process) occupy all of memory and run alone for a short time. On a context switch, stop that process, save all of its state to disk, load the state of another process, and run that one for a while, and so on. As long as only one program is in memory at any given time, the address collisions described above cannot occur. This gives a crude form of concurrency.

There is a problem with this approach: saving the entire contents of memory to disk takes too long, especially as memory sizes grow.

Therefore, we consider keeping each process’s memory resident in physical memory at all times and simply switching to the right region on a context switch.

As shown in the following figure, there are three processes (A, B, and C), each of which is allocated a small portion of the 512 KB of physical memory. In effect, the three processes share physical memory:

Obviously, this approach carries security risks. After all, it would be a mess if processes could read and write each other’s memory at will.

So how do you protect the addresses used by each process? The physical memory model is no longer an option, so the operating system creates a new memory abstraction and introduces a new memory model called the virtual address space, which many books simply call the “address space”.

Virtual Addressing

Let me explain the concepts of virtual address space and virtual address in plain terms.

The actual physical memory addresses of each process’s stack, heap, code segment, and so on are invisible to the process, and no process can access a physical address directly.

So how does a process access its memory?

The operating system gives each process a virtual address space. The process’s stack, heap, and code segment are each assigned addresses from this space, and these are called virtual addresses. The addresses that appear in the underlying machine instructions are also virtual addresses.

Each process has its own address space, independent of the address spaces of other processes. That is, the physical address corresponding to virtual address 28 in one process is different from the physical address corresponding to virtual address 28 in another process, so there is no conflict.

Think of a physical address as a warehouse and a virtual address as a door. Suppose there are thirty doors. Every process sees the same thirty doors, but for each process the same door leads to a different warehouse.

Modern Operating Systems, 3rd Edition describes the address space as follows:

An address space is the set of addresses that a process can use to address memory. Each process has its own address space, and this address space is independent of the address spaces of other processes (except in some special cases where processes need to share their address spaces).

The concept of an address space is very general and comes up in many contexts. Take phone numbers. In the United States and many other countries, a local phone number is usually a seven-digit number. Therefore, the address space for phone numbers runs from 0,000,000 to 9,999,999.

An address space can also be non-numeric: the set of Internet domain names ending in “.com” is also an address space. It consists of all strings of 2 to 63 characters, made up of letters, digits, and hyphens, followed by “.com”.

By now you should understand the concept of an address space; it is quite simple.

With a virtual address space, the CPU can access main memory by generating a virtual address that is translated into an appropriate physical address before being sent to memory. This virtual address to physical address translation process is called address translation.

Address translation requires close cooperation between the CPU hardware and the operating system: the memory management unit (MMU) on the CPU is dedicated to converting virtual addresses to physical addresses, but the MMU relies on a lookup table stored in main memory, whose contents are managed by the operating system.

This process, in which the CPU generates virtual addresses and has them translated, is called virtual addressing. For example, take a look at the following figure:

References

  • Operating Systems: Three Easy Pieces – Remzi H. Arpaci-Dusseau
  • Modern Operating Systems – 3rd Edition
  • Computer Systems: A Programmer’s Perspective – 3rd Edition
