Binder is a forest of a topic: it is easy to start learning and just as easy to get lost. I had tried to read up on it before, only to give up with a sigh that it was too hard. Yet Binder really matters.

For a novice, the easiest way in is to understand Binder from a macro perspective, by answering four questions:

  1. What is Binder?
  2. What elements is Binder made of?
  3. How are these elements linked together by Binder?
  4. How is the communication process implemented with Binder?

Preliminary understanding of Binder

  • Literally, a "binder" is something that ties together two objects that need to form a relationship.
  • In practice, Binder is simply the most widely used inter-process communication mechanism in Android.

It’s still abstract, so let’s look at the problem it’s trying to solve:

The essential purpose of Binder is this: process 1 and process 2 want to communicate with each other, but because they are separate processes they cannot share resources directly, so a Binder (driver) is needed to link the two processes together.

The Binder prototype looks like this:

Binder acts as a bridge between two processes that are independent and unrelated.

Here’s the question:

Q: Operating systems already offer many ways to communicate between processes (pipes, sockets, semaphores, etc.), so why did Android need to create an IPC mechanism of its own? In other words, what makes Binder special?

Let's keep that question in mind as we continue.

What are the components of a Binder mechanism

Binder is a bridge, so there must be many elements that make up the bridge.

Binder is based on a C/S architecture, much like a TCP/IP network, and consists of the following elements:

  • Binder Client -> Client
  • Binder Server -> Server
  • Binder Driver -> Router
  • Service Manager -> DNS

Fill these elements into the previous Binder prototype and you get something like this:

Client and Server correspond to process 1 and process 2. The Service Manager is a separate process that also communicates with the Binder driver.

How do the components relate

A bit of background first

To make sense of what follows, we need a little background on Linux: process isolation, process space division, and system calls.

The figure above illustrates the following three points:

1. Process isolation: Process 1 and process 2 do not share memory, so process 1 cannot directly access process 2's data. That, in plain terms, is process isolation.

2. Process space division: As the figure shows, memory is divided into two areas: user space and kernel space. Kernel space is where the system kernel runs, and user space is where user programs run. The two are isolated from each other for security.

3. System calls: Although user space and kernel space are logically separated, user-space code inevitably needs to access kernel resources. System calls are the only way to break through that isolation and give user programs access to kernel space. For moving data across the boundary, the kernel mainly relies on the following two functions:

    copy_from_user()  // copies data from user space to kernel space
    copy_to_user()    // copies data from kernel space to user space
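
As a rough illustration of how a driver uses these two functions, here is a minimal, hypothetical character-device handler. This is not Binder code; the names demo_write and demo_read, and the toy device itself, are invented purely for this sketch:

    #include <linux/fs.h>
    #include <linux/uaccess.h>

    // Hypothetical write handler: data flows user space -> kernel space
    static ssize_t demo_write(struct file *filp, const char __user *buf,
                              size_t len, loff_t *off)
    {
        char kbuf[64];
        size_t n = len > sizeof(kbuf) ? sizeof(kbuf) : len;

        if (copy_from_user(kbuf, buf, n))   // copy the caller's data into kernel memory
            return -EFAULT;
        return n;
    }

    // Hypothetical read handler: data flows kernel space -> user space
    static ssize_t demo_read(struct file *filp, char __user *buf,
                             size_t len, loff_t *off)
    {
        const char kbuf[] = "hello from kernel space";
        size_t n = len > sizeof(kbuf) ? sizeof(kbuf) : len;

        if (copy_to_user(buf, kbuf, n))     // copy kernel data back out to the caller
            return -EFAULT;
        return n;
    }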

Q: What is the benefit of such a seemingly cumbersome design?

It puts all access to resources under the control of the kernel, prevents user programs from accessing system resources directly, and improves the security and stability of the system.

As the figure shows, it is on top of system calls that inter-process communication (IPC) becomes possible. So what do the concrete communication schemes look like?

Traditional IPC communication mode

In light of the above information, consider how data is transferred from the sending process to the receiving process in traditional IPC mode.

The usual practice is:

  1. The sending process first places the data it wants to send into a buffer in its own user-space memory.
  2. It then enters kernel mode through a system call. The kernel allocates memory in kernel space, creates a kernel buffer, and calls copy_from_user() to copy the data from the user-space buffer into the kernel buffer.
  3. Likewise, to receive the data, the receiving process provides a buffer in its own user space.
  4. The kernel then calls copy_to_user() to copy the data from the kernel buffer into the receiving process's buffer. At that point the sender and receiver have completed one data transfer, that is, one inter-process communication (sketched below from the user-space point of view).
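
Pipes are a classic example of this two-copy pattern: write() copies the data into a kernel buffer, and read() copies it back out into the receiver's buffer. A minimal sketch, with most error handling omitted:

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];
        char buf[64];

        if (pipe(fd) == -1)
            return 1;

        if (fork() == 0) {                          // child acts as the sending process
            const char msg[] = "hello via the kernel";
            write(fd[1], msg, sizeof(msg));         // copy #1: user buffer -> kernel buffer
            _exit(0);
        }

        ssize_t n = read(fd[0], buf, sizeof(buf));  // copy #2: kernel buffer -> user buffer
        if (n > 0)
            printf("received: %s\n", buf);
        return 0;
    }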

This traditional IPC process looks very reasonable, smooth and flawless.

There are actually two drawbacks:

  • First, it is inefficient: a single transfer requires two copies, user-space buffer -> kernel buffer -> user-space buffer.
  • Second, the buffer that receives the data is provided by the receiving process, but the receiver does not know in advance how much space the incoming data will need. It can either allocate as much memory as possible, or first call an API to read the message header and learn the size of the message body. Either way, space or time is wasted.

Now it is time for the main character, Binder, to enter the stage.

Design and implementation of Binder

A complete Binder IPC exchange usually proceeds as follows:

  1. First, the Binder driver creates a data-receiving buffer in kernel space.
  2. It then creates a kernel buffer and establishes a mapping between that kernel buffer and the data-receiving buffer, and another mapping between the data-receiving buffer and a user-space address of the receiving process.
  3. The sending process uses copy_from_user(), via a system call, to copy its data into the kernel buffer. Because the kernel buffer is memory-mapped into the receiving process's user space, the data is thereby delivered to the receiver's user space, completing one inter-process communication (a conceptual sketch follows this list).
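
To make the single-copy idea concrete, here is a deliberately simplified, hypothetical sketch of what the driver conceptually does on the sending side. It is not the real binder driver code; struct demo_proc, alloc_receiver_mapped_buf() and wake_up_receiver() are names invented for this illustration:

    #include <linux/errno.h>
    #include <linux/uaccess.h>

    // Conceptual sketch only, not the actual binder driver.
    static int demo_binder_send(const char __user *src, size_t len,
                                struct demo_proc *receiver)
    {
        // Hypothetical helper: return a kernel buffer that is already
        // mmap'ed into the receiving process's address space.
        void *kbuf = alloc_receiver_mapped_buf(receiver, len);
        if (!kbuf)
            return -ENOMEM;

        // The one and only copy: sender's user space -> kernel buffer.
        // Because kbuf is also visible in the receiver's user space,
        // no second copy_to_user() is required.
        if (copy_from_user(kbuf, src, len))
            return -EFAULT;

        wake_up_receiver(receiver);   // hypothetical: notify the receiver that data is ready
        return 0;
    }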

Compared with the traditional communication process, we can obviously see the difference:

In the Binder IPC model there is a dedicated data-receiving buffer, and it is this buffer that is mapped both to the kernel buffer in kernel space and to the user space of the receiving process. The biggest benefit is one fewer copy of the data, which roughly halves the cost of a transfer. Don't underestimate that factor of two: with large amounts of data and an enormous number of transactions, it adds up.

Memory mapping: in the Binder IPC mechanism it is implemented through mmap(), the operating system's memory-mapping facility. Memory mapping simply means associating a region of user-space memory with kernel space. Once the mapping is established, modifications to that region made in user space are directly reflected in kernel space; conversely, changes made to the region in kernel space are directly reflected in user space.
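
For reference, this is roughly how a user-space process maps the Binder device into its address space. It is a simplified sketch loosely modelled on what libbinder does; the buffer size here is an arbitrary value chosen for the example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define DEMO_VM_SIZE (1 * 1024 * 1024)   // arbitrary mapping size for this sketch

    int main(void)
    {
        int fd = open("/dev/binder", O_RDWR | O_CLOEXEC);
        if (fd < 0) {
            perror("open /dev/binder");
            return 1;
        }

        // Ask the kernel to map the driver's receive buffer into our address space.
        // The driver writes incoming transaction data there; user space only reads it.
        void *vm = mmap(NULL, DEMO_VM_SIZE, PROT_READ,
                        MAP_PRIVATE | MAP_NORESERVE, fd, 0);
        if (vm == MAP_FAILED) {
            perror("mmap");
            close(fd);
            return 1;
        }

        printf("binder receive buffer mapped at %p\n", vm);
        munmap(vm, DEMO_VM_SIZE);
        close(fd);
        return 0;
    }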

At this point we can come back to the original question: what is "special" about Binder?

Features of other IPC communication modes:

  • Sockets, as a general-purpose interface, have low transmission efficiency and high overhead. They are mainly used for inter-process communication across a network and for low-speed communication between processes on the local machine.
  • Message queues and pipes use a store-and-forward model: data is first copied from the sender's buffer into a buffer created by the kernel, and then copied from the kernel buffer into the receiver's buffer, requiring at least two copies.
  • Shared memory requires no copying at all, but it is difficult to control and awkward to use.

Binder, by contrast:

  • Good performance: only one data copy is required, second only to shared memory in performance.
  • High stability: it is based on a C/S architecture with clear responsibilities and a clean structure, so its stability is better than shared memory's.
  • Strong security: Android assigns a UID to each app, and the process UID is an important marker for identifying a process. With traditional IPC, the UID/PID can only be filled into the data packet by the sender itself, which is unreliable and easy for malicious programs to exploit; with Binder, the caller's identity is added by the kernel and can therefore be trusted.

Discussion:

  1. Traditional IPC mechanisms such as pipes and sockets are part of the kernel itself, so it is natural that they can mediate communication between processes. But Binder is not part of the Linux kernel, so how can it do the same job?

This is made possible by Linux's Loadable Kernel Module (LKM) mechanism. A module is a self-contained program that can be compiled separately but cannot run on its own; it is linked into the kernel at runtime and runs as part of the kernel. Android can therefore dynamically load a kernel module that runs in kernel space, and that module serves as the bridge between user processes.
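
To get a feel for what a loadable kernel module looks like, here is a generic hello-world skeleton (not the actual Binder module). Once built, such a module is loaded with insmod and removed with rmmod:

    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/printk.h>

    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Minimal LKM skeleton for illustration");

    static int __init demo_init(void)
    {
        pr_info("demo module loaded into kernel space\n");
        return 0;   // 0 means the module loaded successfully
    }

    static void __exit demo_exit(void)
    {
        pr_info("demo module unloaded\n");
    }

    module_init(demo_init);   // called when the module is inserted
    module_exit(demo_exit);   // called when the module is removed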

  2. The receiving process does not need its own kernel buffer and maps directly to the Binder receive buffer, but the sending process cannot map directly to that buffer. If the sender could map it too, no copy would be needed at all and performance would be even better. Why isn't it done that way?

I have had this question since the first time I looked at the communication diagram, but googling did not turn up an answer; everything I found only explained why a single copy suffices. Maybe the answer is too obvious and I simply did not see it…

I eventually came up with an explanation of my own (possibly wrong); if you know the real answer, please share it in the comments:

If both the sending and the receiving process mapped directly to the Binder buffer, it would amount to bypassing the Linux kernel, which is clearly unreasonable. The inter-process architecture is designed so that data exchanged between processes must be managed by kernel space, so the kernel still needs to receive the data through a copy from the sending process.

Finally

Learning Binder by working through questions like these turns out to be quite an interesting way to approach it.

You have probably noticed that one element is still missing from the Binder communication process: yes, the Service Manager. That is quite a lot of material in itself and is planned for another blog post.