When a program starts executing, the running instance of that program and the memory it occupies are called a process, from the moment execution begins until the moment it exits.

Linux is a multitasking operating system, which means that more than one process can be running at the same time. Yet a single-CPU computer can actually execute only one instruction at any given moment.

So how does Linux create the appearance of simultaneous execution? It uses a mechanism called process scheduling. The kernel assigns each process a slice of running time, usually very short, on the order of milliseconds. It then picks one process from the set of runnable processes according to its scheduling rules and lets it run while the other processes wait. When the running process uses up its time slice, finishes, or is suspended for some reason, Linux reschedules and picks another process to run. Because each process occupies the CPU for only a short interval, from the user's point of view several processes appear to be running at the same time.

In Linux, each process is assigned a data structure, called a process control block (PCB), when it is created. The PCB contains the information the system needs for scheduling and process execution, the most important of which is the process ID. The process ID, also known as the process identifier, is a non-negative integer that uniquely identifies a process in the Linux operating system. On the common i386 architecture, it takes values from 0 to 32767 by default; this is the ID number we can obtain for a process.
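The process ID described above can be read at runtime. A minimal sketch, using Python on Linux to illustrate the underlying getpid()/getppid() system calls:

```python
import os

# Every process can ask the kernel for its own ID and for the ID
# of the parent process that created it.
pid = os.getpid()    # this process's ID (non-negative integer)
ppid = os.getppid()  # the parent process's ID
print(pid, ppid)
```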

How zombie processes arise

A zombie process is a process that has terminated but has not yet been removed from the process table. A zombie occupies almost no system resources, but each one still holds a process-table entry, so too many zombies can fill the process table and bring the system down.

Among the possible process states, the zombie state is a special one. A zombie has given up almost all of its memory, contains no executable code, and can never be scheduled again; it merely keeps a slot in the process table that records its exit status for its parent (or another interested process) to collect. Since a zombie no longer occupies any memory, all it needs is for its parent to reap it. If the parent neither installs a SIGCHLD handler that calls wait() or waitpid() when the child finishes, nor explicitly ignores the signal, the child remains a zombie. If the parent process itself dies, init automatically adopts the child and reaps it, so the zombie is still removed. But if the parent runs in a loop that never ends and never waits, the child remains a zombie indefinitely.
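The zombie state can be observed directly. A hedged sketch, assuming a Linux system with /proc mounted: the parent forks a child that exits immediately, then inspects the child's state in /proc/&lt;pid&gt;/stat before calling waitpid(), and finally reaps it. Python's os module stands in here for the C fork()/waitpid() calls.

```python
import os
import time

pid = os.fork()
if pid == 0:
    os._exit(0)                       # child terminates at once
else:
    time.sleep(0.2)                   # parent has NOT called waitpid() yet
    with open(f"/proc/{pid}/stat") as f:
        # field after "(comm)" is the state; 'Z' means zombie
        state = f.read().rsplit(")", 1)[1].split()[0]
    print(state)                      # 'Z': dead, but still in the table
    os.waitpid(pid, 0)                # reaping removes the table entry
```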

Causes of zombie processes:

Each Linux process has an entry in the process table, where all the information the kernel needs to execute the process is stored. When you run the ps command to view the processes in the system, the data you see comes from this process table.

When the fork system call creates a new process, the kernel allocates an entry for it in the process table and stores the corresponding information there. One of these items is the identification number of the parent process.

When a process reaches the end of its life cycle, it makes an exit() system call. At that point the data in its process-table entry is replaced by the process's exit code, the CPU time used during execution, and so on, and this data is retained until the system delivers it to the parent process. The zombie therefore exists in the window after the child terminates but before the parent reads this data.
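That retained exit data is exactly what wait() hands back to the parent. A minimal sketch of the round trip, using Python bindings for the C fork()/exit()/waitpid() calls (os.waitstatus_to_exitcode requires Python 3.9+):

```python
import os

pid = os.fork()
if pid == 0:
    os._exit(7)                        # exit code stays in the process table
else:
    _, status = os.waitpid(pid, 0)     # parent collects the retained data;
                                       # the table entry is now freed
    code = os.waitstatus_to_exitcode(status)
    print(code)                        # 7
```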

How to avoid zombie processes

1. The parent waits for the child to finish using wait() or waitpid(), which blocks the parent until a child exits.

2. If the parent process is busy, it can use signal() to install a handler for SIGCHLD. When a child finishes, the parent receives the signal and can call wait() inside the handler.
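A sketch of this approach, assuming Linux; Python's signal and os modules mirror the C signal()/waitpid() calls, and WNOHANG keeps the handler from blocking:

```python
import os
import signal
import time

def reap(signum, frame):
    # Reap every finished child without blocking the parent.
    while True:
        try:
            pid, _ = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break                      # no children remain
        if pid == 0:
            break                      # children exist, but none has exited

signal.signal(signal.SIGCHLD, reap)    # install handler BEFORE forking

child = os.fork()
if child == 0:
    os._exit(0)                        # child finishes immediately
time.sleep(0.2)                        # parent keeps working; the handler
                                       # reaps the child asynchronously
```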

3. If the parent does not care when its children end, it can call signal(SIGCHLD, SIG_IGN) to tell the kernel it is not interested in child termination. The kernel will then reap children automatically when they exit and will no longer send the signal to the parent.

4. The parent can fork() twice. The first child immediately fork()s a grandchild, which goes on to do the real work, and then exits. The parent reaps the first child right away, and the orphaned grandchild is adopted by init, which reaps it when it finishes.
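The double-fork trick can be sketched as follows; this is a hedged illustration in Python, and fork_twice is a name invented here, not a standard API:

```python
import os
import time

def fork_twice():
    pid = os.fork()
    if pid == 0:
        # First child: create the grandchild, then exit at once.
        if os.fork() == 0:
            # Grandchild: its parent is already gone, so init adopts it
            # (and will later reap it) while it does the real work.
            time.sleep(0.1)
            os._exit(0)
        os._exit(0)
    os.waitpid(pid, 0)   # parent reaps the short-lived first child immediately

fork_twice()
# From here on, the parent has no children left to worry about.
```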

Process versus thread

Let's start with an analogy. Multithreading is like a flat road network with intersections: cheap to build, but full of traffic lights and prone to jams. Multiprocessing is like a system of overpasses: expensive to build, and climbing the ramps burns more fuel, but there are no jams. This is an abstract picture, but keep it in mind as you read on.

Process and thread are two related concepts. Generally speaking, a process is an instance of a program. In Win32, a process by itself does not execute anything; it merely owns the address space used by the application. For a process to do any work, it must own at least one thread, and it is the thread that executes the code in the process's address space.

In fact, a process can contain several threads that execute code in its address space concurrently. To make this possible, each thread has its own set of CPU registers and its own stack. Every process has at least one thread executing code in its address space; if no thread remains to execute that code, there is no reason for the process to continue to exist, and the system automatically destroys the process and its address space.
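The shared address space is easy to demonstrate. A minimal sketch in Python: two threads increment the same counter, guarded by a lock precisely because both see the same data:

```python
import threading

counter = {"n": 0}
lock = threading.Lock()

def work():
    # Both threads execute this code in the same address space,
    # so they see (and must guard) the same `counter` object.
    for _ in range(1000):
        with lock:
            counter["n"] += 1

threads = [threading.Thread(target=work) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["n"])  # 2000
```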

The realization principle of multithreading

When a process is created, its first thread, called the primary (main) thread, is generated automatically by the system. The main thread can then create additional threads, and those threads can in turn create more. When a multithreaded program runs, the threads appear to execute simultaneously, but this is not actually the case: to run them all, the operating system allocates each individual thread some CPU time.

On a single CPU, the operating system hands out time slices (quanta) to threads in round-robin fashion. Each thread yields control when its time slice is used up, and the system allocates the next slice to the next thread. Because each slice is sufficiently short, this creates the illusion that the threads are running at the same time. The purpose of creating additional threads is to make the most of the available CPU time.

The problem with multithreading

Multithreaded programming gives programmers great flexibility and makes some problems that once required complex techniques much easier to solve. However, you should not artificially carve a program into pieces just so each piece can execute on its own thread; that is not the right way to develop an application.

Threads are useful, but when used they can create new problems while solving old ones. Suppose you are developing a word processor and want the printing function to run as a separate thread. That sounds like a good idea: while printing, the user can immediately return to editing the document.

However, the document's data may then be modified while it is being printed, and the printed result will no longer be what was expected. It may be best not to put printing in a separate thread at all, but if you must use multiple threads, consider the following solutions. One is to lock the document being printed and let the user edit other documents, so that no changes are made to it until printing finishes. A more flexible alternative is to copy the document to a temporary file, print the contents of the temporary file, and allow the user to keep modifying the original document.

When the temporary copy of the document has been printed, delete it. As this analysis shows, multithreading can introduce new problems while helping to solve old ones, so it is important to work out when you need multiple threads and when you do not. In general, multithreading suits situations where background computation or logic must proceed at the same time as foreground operations.
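The snapshot idea can be sketched without a real printer: the "temporary file" below is just an in-memory deep copy, and print_job is a hypothetical stand-in for the actual print routine:

```python
import copy
import threading

document = ["line 1", "line 2"]
printed = []

def print_job(snapshot):
    # Works from a private copy, so concurrent edits to `document`
    # cannot change what gets "printed".
    printed.extend(snapshot)

snapshot = copy.deepcopy(document)         # the in-memory "temporary file"
t = threading.Thread(target=print_job, args=(snapshot,))
t.start()
document.append("line 3")                  # user edits while printing runs
t.join()
```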

Classification of threads

In MFC, threads are divided into two categories, worker threads and user interface threads. If a thread only performs background calculations and does not need to interact with the user, a worker thread can be used. If you need to create a thread to process the user interface, use the user interface thread. The main difference between the two is that the MFC framework adds a message loop to the user interface thread so that the user interface thread can process messages in its own message queue.

So, if you only need to do some simple calculation in the background (recalculating a spreadsheet, for example), consider a worker thread first. When a background thread must handle a more complex task, specifically, when what it executes must change with circumstances, you should use a user interface thread, so that it can respond to different messages.

Priority of the thread

When the system must execute multiple processes or threads at the same time, it is sometimes necessary to specify thread priorities. A thread's priority generally means its base priority: the combination of the thread's priority relative to its process and the priority of the process that contains it.

The operating system arranges all active threads on a priority basis, and each thread in the system is assigned a priority ranging from 0 to 31. At runtime, the system simply allocates CPU time to the first thread with priority 31, and when the thread’s time slice ends, the system allocates CPU time to the next thread with priority 31. When there are no threads at priority 31, the system starts allocating CPU time to threads at priority 30, and so on.

In addition to the programmer changing the priority of the thread in the program, sometimes the system will automatically change the priority of the thread dynamically during the program execution to ensure the system is highly responsive to the end user. For example, when a user presses a key on the keyboard, the system temporarily increases the priority of the thread that processes the WM_KEYDOWN message by 2 or 3. The CPU executes the thread in a complete time slice. When the time slice is complete, the system reduces the priority of the thread by 1.

Synchronization of threads

Another very important issue when using multithreaded programming is thread synchronization. Thread synchronization refers to the ability of threads to communicate with each other without corrupting their data. Synchronization problems are caused by the way Win32’s CPU time slices are allocated.

Although only one thread occupies CPU time at any moment (on a single CPU), there is no way to know when or where a thread will be interrupted, which makes it important to ensure that threads do not corrupt each other's data. In MFC, four synchronization objects can be used to coordinate concurrently running threads: the critical-section object (CCriticalSection), the mutex object (CMutex), the semaphore object (CSemaphore), and the event object (CEvent).

Of these objects, the critical-section object is the easiest to use, with the disadvantage that it can only synchronize threads within the same process. There is also a basic technique, which we will call the linearization method: arrange for all writes to a given piece of data to happen in a single thread. Since code in one thread always executes sequentially, it then becomes impossible for the data to be overwritten concurrently.
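The linearization method can be sketched with a single writer thread fed by a queue; other threads submit write requests instead of touching the data themselves:

```python
import queue
import threading

q = queue.Queue()
data = []

def writer():
    # All writes to `data` happen in this one thread, so they are
    # serialized naturally, with no explicit lock around `data`.
    while True:
        item = q.get()
        if item is None:   # sentinel: no more writes
            break
        data.append(item)

t = threading.Thread(target=writer)
t.start()
for i in range(5):
    q.put(i)               # other threads only *send* write requests
q.put(None)
t.join()
print(data)  # [0, 1, 2, 3, 4]
```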

Conclusion:

Compared with a process, a thread is a concept closer to a pure unit of execution: it shares data with the other threads in the same process but has its own stack space and an independent execution sequence. Both processes and threads can increase a program's concurrency and improve its efficiency and response time.

Threads and processes each have advantages and disadvantages: threads are cheap to execute, but less convenient for resource management and protection; processes are just the opposite. The fundamental difference is this: with multiple processes, each process has its own address space, whereas threads share an address space. In terms of speed, threads are faster to create, faster to communicate with one another, and faster to switch, because they live in the same address space.

In terms of resource usage, threads also use resources more efficiently, again because they share an address space. As for synchronization, threads that use shared variables or memory must synchronize their access for the same reason. A child process, by contrast, is a copy of its parent: it receives copies of the parent's data space, heap, and stack.

This article is reprinted from the WeChat public account: Migrant Brother Linux Operation and Maintenance.

Original address: http://embed.21ic.com/software/linuxos/201805/56190.html
