How much do we know about processes?

Java multithreading article 1

To talk about threads, you usually have to talk about processes. What is a process? Here is a brief introduction.

Take a look at the processes on Windows through Task Manager.

As the picture shows, each process occupies CPU, memory, disk, network and other resources. From the operating system's perspective, the process is the basic and smallest unit of resource allocation.

Why does a process appear?

The purpose of introducing processes is to enable multiple programs to execute concurrently, improving resource utilization and system throughput. How should we understand this sentence? A running program involves many kinds of operations: computation on the CPU, data transfer through disk I/O, and so on. Disk I/O is comparatively slow, so while a program waits on it the CPU would sit idle, wasting resources. With processes, when process A performs disk I/O, the CPU can be allocated to process B instead, which keeps the programs executing concurrently.

From the CPU's point of view, execution looks like this: the CPU is always busy executing instructions, and processes compete with each other for CPU time. The following figure shows processes A and B; at any one point in time, the CPU executes instructions for only one process. This is the benefit of processes: higher resource utilization and concurrent execution of multiple programs.

Of course, introducing processes is not free: it adds time and space overhead to the system. The space overhead is easy to understand: processes have their own components (described below) that take up space. The time overhead is the time required for process switching.

Process composition

A process consists of three parts: program code, the data set and stack, and the Process Control Block (PCB).

Their respective roles are as follows:

  1. Program code: describes what the process needs to do.
  2. Data set, stack: The data and workspace required by the program for execution.
  3. Process control block: contains description and control information about a process. It uniquely identifies the existence of a process.

How to compete for resources (scheduling algorithm)

Processes compete for resources, usually CPU time, because CPUs run far faster than the other media they work with. Competition requires rules, just as every game needs rules, and different rules emphasize different things. Anyone who has seen the show "The Strongest Brain" knows this well: each challenge has a different focus, some on spatial thinking, some on logical calculation, and so on. Let's briefly go over the rules of the game for competing for resources.

FCFS

First Come First Served: the process that enters the ready queue first runs first, and rescheduling happens when it finishes or blocks. In practice this algorithm is usually combined with a priority policy, for example one queue per priority level, with FCFS used within each queue.

Features: simple; favors long processes; its average turnaround time is longer than that of other scheduling algorithms.
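FCFS and its long average turnaround time can be illustrated with a small sketch. This is not from the article; the class and method names are illustrative, and it assumes for simplicity that all processes are ready at time 0, so turnaround time equals completion time.

```java
// Minimal FCFS sketch: processes run to completion in arrival order.
public class Fcfs {
    // bursts[i] is the CPU time needed by the i-th arriving process.
    static double averageTurnaround(int[] bursts) {
        int clock = 0;            // current time
        int totalTurnaround = 0;
        for (int burst : bursts) {
            clock += burst;           // the process runs to completion
            totalTurnaround += clock; // arrived at 0, finished at clock
        }
        return (double) totalTurnaround / bursts.length;
    }

    public static void main(String[] args) {
        // A long job arriving first delays every later short job:
        System.out.println(averageTurnaround(new int[]{24, 3, 3})); // 27.0
        // the same jobs in the opposite order finish much sooner on average:
        System.out.println(averageTurnaround(new int[]{3, 3, 24})); // 13.0
    }
}
```

Swapping the arrival order of the same three jobs cuts the average turnaround from 27 to 13, which is exactly why FCFS is said to favor long processes at the expense of the short ones queued behind them.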

RR

Round Robin: the processes in the ready queue take turns occupying the CPU, each running for a fixed time slice. A process that has not finished when its slice expires goes to the end of the ready queue to wait for its next turn.

Features: fair, with short response times for processes.
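The rotation described above can be sketched in a few lines. This is an illustrative simulation, not the article's code; it ignores arrival times and tracks only each process's remaining work.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Rough Round Robin sketch: each process gets a fixed quantum; if it
// still has work left, it rejoins the tail of the ready queue.
public class RoundRobin {
    // Returns the order in which process ids finish.
    static List<Integer> finishOrder(int[] bursts, int quantum) {
        Deque<int[]> ready = new ArrayDeque<>(); // each entry: {id, remaining}
        for (int i = 0; i < bursts.length; i++) {
            ready.add(new int[]{i, bursts[i]});
        }
        List<Integer> finished = new ArrayList<>();
        while (!ready.isEmpty()) {
            int[] p = ready.poll();
            p[1] -= quantum;             // run for one time slice
            if (p[1] > 0) ready.add(p);  // not done: back of the queue
            else finished.add(p[0]);     // done
        }
        return finished;
    }

    public static void main(String[] args) {
        // bursts 5, 2, 3 with quantum 2: the short process (id 1) finishes first
        System.out.println(finishOrder(new int[]{5, 2, 3}, 2)); // [1, 2, 0]
    }
}
```

Note how the short process finishes after a single slice even though it arrived second; that is the "short response time" in the feature list.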

SPN

Shortest Process Next: the process with the shortest expected running time runs first, until it completes or blocks, and then rescheduling occurs.

Features: Good for short processes.

SRT

Shortest Remaining Time: when a new process arrives, it preempts the current process if its expected running time is shorter than the current process's remaining running time.

Features: favors short processes. The difference from SPN is preemption; because of preemption, it is more efficient than SPN.

HRRN

Highest Response Ratio Next: scheduling occurs when the currently running process completes or blocks. Before each scheduling decision, the response ratio of every ready process is calculated, and the process with the highest response ratio runs first.

The response ratio formula is as follows:

Response ratio = (waiting time + service time) / service time = 1 + waiting time / service time

Features: favors short processes; among processes with the same service time, the one that arrived first runs first. Long processes are not starved forever, because their response ratio keeps rising while they wait.
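The anti-starvation property follows directly from the formula (waiting + service) / service: waiting time sits in the numerator, so a long job's ratio grows without bound while it waits. A small illustrative sketch (names are mine, not the article's):

```java
// HRRN sketch: response ratio = (waiting + service) / service.
public class Hrrn {
    static double responseRatio(int waiting, int service) {
        return (waiting + service) / (double) service;
    }

    // Picks the index of the ready process with the highest ratio.
    static int next(int[] waiting, int[] service) {
        int best = 0;
        for (int i = 1; i < waiting.length; i++) {
            if (responseRatio(waiting[i], service[i])
                    > responseRatio(waiting[best], service[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A fresh job has ratio 1.0 regardless of its length:
        System.out.println(responseRatio(0, 5));  // 1.0
        // A long job (service 20) that has waited 100 units beats a
        // fresh short job (service 5): ratios 6.0 vs 1.0.
        System.out.println(next(new int[]{100, 0}, new int[]{20, 5})); // 0
    }
}
```

With zero waiting time the ratio is always 1.0, so among newcomers shorter service time wins as soon as any waiting accrues; as waiting grows, even the 20-unit job eventually outranks every fresh arrival.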

FB

Feedback: A Feedback mechanism composed of multiple ready queues, which has the following rules:

  1. The processes in the same queue are scheduled according to FCFS algorithm, and the last ready queue is scheduled according to RR algorithm.
  2. The queue with a higher priority has a smaller time slice.
  3. If the process does not finish running in a time slice, it falls to the end of the next queue.
  4. A queue runs only when every higher-priority queue has no ready process; each lower queue waits on the queues above it, and so on.

The following figure shows how processes move through the queues.

Features: short processes have a huge advantage, because they finish within the first few high-priority queues.

The above describes several scheduling algorithms for competing for resources.

Process status

As mentioned above, processes compete for resources: they run when they get them and wait when they don't, and keeping track of this requires state. Like many systems, a process needs a state machine.

Three-state diagram

The three-state diagram is the simplest and most basic way to describe process state. It contains the three most basic states of a process: ready, running, and blocked.

Ready: the process has all the resources it needs except the CPU. Running: the process's instructions are being executed. Blocked: the process is waiting for a resource or for an event to occur.

When a ready process is scheduled, it enters the running state. If its time slice runs out or a higher-priority process preempts the CPU, it returns to the ready state to wait for scheduling again. If the running process has to wait for an event (such as an I/O event), it goes from running to blocked; a blocked process can only wait for the event to complete and then re-enter the ready state.
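The transitions just described form a small state machine, which can be written down directly. This is an illustrative sketch, not the article's code; the state names follow the diagram.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// The three-state diagram as a tiny state machine.
public class ProcessState {
    enum State { READY, RUNNING, BLOCKED }

    // Legal transitions from the three-state diagram.
    static final Map<State, Set<State>> TRANSITIONS = new EnumMap<>(Map.of(
        State.READY,   EnumSet.of(State.RUNNING),              // scheduled
        State.RUNNING, EnumSet.of(State.READY, State.BLOCKED), // preempted / waits for event
        State.BLOCKED, EnumSet.of(State.READY)                 // event completes
    ));

    static boolean canMove(State from, State to) {
        return TRANSITIONS.get(from).contains(to);
    }

    public static void main(String[] args) {
        // A blocked process cannot run directly; it must become ready
        // first and then be scheduled.
        System.out.println(canMove(State.BLOCKED, State.RUNNING)); // false
        System.out.println(canMove(State.BLOCKED, State.READY));   // true
    }
}
```

Notice there is no BLOCKED-to-RUNNING edge: even when its event completes, a process rejoins the ready queue and competes for the CPU again.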

Five-state diagram

Based on the three-state diagram, two new states are added, namely: new state and exit state.

New: the process is being created. Once memory and other resources are allocated, it is set to the ready state.

Exit: the process has ended, normally or abnormally, and its resources are reclaimed.

When a program is created but not yet allocated resources, it is in the new state. Once resources are allocated and the program is loaded, it enters the ready state. When the process finishes running, it goes from the running state to the exit state.

Seven-state diagram

Based on the five-state diagram, two suspended states are added: ready-suspended and blocked-suspended.

Ready-suspended: also called the external-storage ready state. Because memory capacity is limited, ready processes in memory may be swapped out to external storage (disk).

Blocked-suspended: also called the external-storage blocked state. Similarly, a blocked process in memory may be moved to external storage (disk) because memory capacity is limited.

The figure also adds the activation transitions: usually because a suspended process has a high priority, or because enough memory has been freed, the process is brought back from external storage (disk) into memory.

Process relationship

In fact, processes are fairly independent of one another. Take QQ and WeChat, which we use every day: is there any relationship between their processes? Apart from competing for resources, there is none.

Father and son

Although we just said processes are unrelated, there is one special relationship worth mentioning: the parent-child relationship.

Let’s do an experiment to verify the parent-child relationship of the process. Operation steps:

  1. Open CMD and label this window Father; in the Father window, run the command "start cmd" to start another CMD command-line program;
  2. Label the newly opened CMD window Son; in the Son window, run "start cmd" to start another CMD command-line program;
  3. Label the newest CMD window Grandson.

The operation process is shown in the figure below.

The relationship between the three CMD processes can be seen clearly through Process Explorer. (Process Explorer can be downloaded from the Baidu netdisk link: pan.baidu.com/s/19531gf5t… extraction code: QHC6)

We can see Father, Son and Grandson forming the tree structure we expected. So what is a parent-child relationship between processes? Simply put, when a new process is created inside a process, the new process is a child process. A process can have multiple child processes but only one parent process. On Unix, the parent process creates a child process by calling fork(). Parent and child processes have the following characteristics:

  1. The parent and child processes execute concurrently.
  2. The child process shares all the resources of the parent process.
  3. The child copies the parent's address space and even has the same text segment and program counter (PC) value;
  4. Copy On Write reduces unnecessary copying: after fork(), parent and child share the parent's address space, and a page is copied only when one of them tries to modify it.

With Copy On Write, the parent process does not copy all of its data to the child when creating it, which saves copying time and a large amount of memory. That copying is often unnecessary anyway: if the child loads a new program immediately after the fork, all the earlier copying would have been a waste of time and memory.
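Java has no fork(), but the standard library applies the same copy-on-write idea in CopyOnWriteArrayList: readers keep sharing the old backing array, and a copy is made only when someone writes. A small sketch of my own (the class name and method are illustrative):

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Copy-on-write in miniature: an iterator taken before a write keeps
// the shared pre-copy snapshot, while the write creates a new array.
public class CowDemo {
    // Returns how many elements a pre-modification iterator sees.
    static int snapshotSize() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(List.of("a", "b"));
        Iterator<String> snapshot = list.iterator(); // shares the current array
        list.add("c");                               // the write copies the array
        int seen = 0;
        while (snapshot.hasNext()) { snapshot.next(); seen++; }
        return seen; // the snapshot never sees "c"
    }

    public static void main(String[] args) {
        System.out.println(snapshotSize()); // 2
    }
}
```

The analogy to fork() is loose but useful: sharing is free until the first write, and only the writer pays the cost of the copy.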

Two special cases, the zombie process and the orphan process, deserve separate introductions.

Zombie process

Zombie process: after a child process exits, if the parent process does not call wait() or waitpid() to obtain the child's status information, the child's process descriptor remains in the system. Such a process is called a zombie process.

Hazards of zombie processes: a zombie process keeps occupying a process ID, and the number of available process IDs is limited. If a large number of zombie processes exist, new processes cannot be created because no process IDs are available.

Orphan process

Orphan process: the parent process exits while one or more of its child processes are still running. These orphan processes are adopted by the init process (process ID 1), which collects their exit status for them.

Orphan processes are harmless, because the init process takes them over and handles their status collection.

Execution mode

Instructions are divided into privileged instructions (usable only by the operating system kernel) and non-privileged instructions (available to user programs). Because instructions come in these two kinds, the CPU also has two execution modes: system mode (can execute all instructions, use all resources and change the CPU state) and user mode (can execute only non-privileged instructions).

The CPU switches between system mode and user mode, entering system mode on system calls and interrupts and returning to user mode when control goes back to the user program.

Interprocess communication

When processes need to transfer or share data, they need to communicate with each other. The main ways of communicating are listed below. This is just a brief overview without going into depth, since our focus is multithreading.

Pipe

A pipe carries data in only one direction, so two pipes are needed for two-way communication between processes. This communication mode can only be used between related processes, such as a parent and its child.
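Within a JVM, PipedOutputStream and PipedInputStream mimic this one-way pipe between two threads (real pipes connect processes, for example via the shell's | operator). A sketch of my own, with illustrative names:

```java
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// One-way pipe in miniature: a writer thread pushes bytes into the
// write end; the reader drains the read end until the writer closes.
public class PipeDemo {
    static String sendThroughPipe(String message) throws Exception {
        PipedOutputStream writeEnd = new PipedOutputStream();
        PipedInputStream readEnd = new PipedInputStream(writeEnd); // connect the ends

        Thread writer = new Thread(() -> {
            try (writeEnd) {                     // closing signals end-of-stream
                writeEnd.write(message.getBytes());
            } catch (Exception e) { throw new RuntimeException(e); }
        });
        writer.start();

        byte[] buf = readEnd.readAllBytes();     // blocks until the writer closes
        writer.join();
        return new String(buf);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendThroughPipe("data flows one way"));
    }
}
```

As with a real pipe, data flows in one direction only; for a reply you would need a second pipe pointing the other way.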

Flow Pipe

The stream pipe is an evolution of the pipe: data no longer flows one way but in both directions, though it is still only usable between related processes.

Named Pipe

Named pipes add the ability to name a pipe, which improves on the two pipe types above and supports communication between unrelated processes.

Semaphore

A semaphore is a counter used to control access by multiple processes to a shared resource. While process A is accessing the shared resource, the semaphore prevents other processes from accessing it; only when process A stops accessing the resource can other processes access it.
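This counter idea maps directly onto java.util.concurrent.Semaphore (between threads here rather than between processes). A minimal sketch; the probe() method is mine, for illustration:

```java
import java.util.concurrent.Semaphore;

// A one-permit semaphore guards a shared resource: while one user
// holds the permit, every other tryAcquire() fails.
public class SemaphoreDemo {
    static boolean[] probe() {
        Semaphore permits = new Semaphore(1);   // one user at a time
        boolean first = permits.tryAcquire();   // true: resource was free
        boolean second = permits.tryAcquire();  // false: resource is held
        permits.release();                      // leave the shared resource
        boolean third = permits.tryAcquire();   // true: free again
        return new boolean[]{first, second, third};
    }

    public static void main(String[] args) {
        for (boolean b : probe()) System.out.println(b); // true, false, true
    }
}
```

A semaphore initialized with more than one permit generalizes this to a resource that N users may hold at once.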

Signal

A signal can be sent to a process at any time, without needing to know the process's current state. If the receiving process is not executing, the signal is kept by the kernel until the process executes and the signal is delivered. If the receiving process is blocked, delivery is delayed until the process is unblocked.

Message Queue

A message queue is a linked list stored in the kernel that multiple processes can write to and read from. It overcomes the drawbacks of signals (which carry little information) and pipes (which carry only an unformatted byte stream and have a limited buffer size). There are POSIX message queues and System V message queues.
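Kernel message queues connect processes; within a JVM, a BlockingQueue plays the analogous role between threads: a bounded, ordered mailbox of discrete messages rather than a raw byte stream. An illustrative sketch of my own:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A thread-level stand-in for a message queue: the producer thread
// posts a message, the consumer takes it out in order.
public class MessageQueueDemo {
    static String passMessage(String msg) throws InterruptedException {
        BlockingQueue<String> mailbox = new ArrayBlockingQueue<>(10); // bounded

        Thread producer = new Thread(() -> mailbox.offer(msg)); // post a message
        producer.start();
        producer.join();                                        // message is now queued

        return mailbox.take(); // consumer receives the whole message, not bytes
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(passMessage("hello"));
    }
}
```

Unlike a pipe, each element is a complete message with its own boundaries, which is exactly the "format" advantage the article mentions.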

Shared Memory

Shared memory is a segment of memory that can be accessed by multiple processes; by reading and writing the same memory, they communicate.

Socket

A socket is the same socket we use in network programming. It can communicate over the network as well as on the local machine; its advantage is that it can communicate across hosts.
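Local-machine socket communication can be sketched with Java's standard networking classes. This is an illustrative example of my own: for brevity the two "processes" are threads in one JVM, and port 0 asks the OS for any free loopback port.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Minimal loopback round trip: a client thread sends one line to a
// server socket on the local machine, and the server reads it back.
public class SocketDemo {
    static String echoOnce(String message) throws Exception {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(),
                                           server.getLocalPort());
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println(message);            // send one line
                } catch (Exception e) { throw new RuntimeException(e); }
            });
            client.start();
            try (Socket peer = server.accept();      // wait for the client
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(peer.getInputStream()))) {
                return in.readLine();                // read what it sent
            } finally {
                client.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("hello over a socket"));
    }
}
```

The same code works across hosts by replacing the loopback address with a remote one, which is the cross-host advantage the article points out.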

conclusion

Generally speaking, a process is one execution of a program over a data set; it is the dynamic manifestation of a running program. This is the starting point of our study of multithreading. I hope this article gives you a basic understanding of what a process is; next we will dig into multithreading itself.

Recommended reading

Design patterns seen and forgotten, forgotten and read?
