Process

A process is one run of a program over a set of data in a computer. It is the basic unit of resource allocation and scheduling in the system and the foundation of operating system structure. In early, process-oriented designs, the process was the basic execution entity of a program; in modern, thread-oriented designs, a process is a container for threads. A program is a static description of instructions, data, and how they are organized, while a process is a running instance of that program.

Background and the problems to be solved

Single-channel batch system

A single-channel batch system (one that keeps only a single job in memory at a time) first writes a batch of jobs to tape. A monitor program in the system then processes the jobs one after another: each time, it loads the first job on the tape into memory, and it reads in the next job only after the current one has finished. To a certain extent, this improves the utilization of system resources and the system throughput. However, in a single-channel batch system, I/O requests block CPU execution, and each program monopolizes the entire memory space. As a result, system resources such as the CPU and memory are not fully utilized.

Multi-channel batch system

The multi-channel batch system (what is usually called a multiprogramming system) came into being to solve the poor resource utilization of the single-channel batch system. A multi-channel batch system reads multiple jobs into memory at a time. When the program currently occupying the CPU performs an I/O operation, the CPU is allocated to another program. In this way, the CPU is kept busy and the memory is used by multiple programs at the same time, which resolves the under-utilization of resources in the single-channel system.

Under this mechanism, multiple programs take turns executing within the same time period. This is what is usually called concurrency.

The problem is that a program executed concurrently loses its closed nature: it no longer runs in isolation, it runs intermittently, and its results may not be reproducible. This makes it impossible for ordinary programs to execute concurrently. For example, after executing for a period of time, program A starts waiting for I/O. The CPU is then temporarily allocated to program B. When program A finishes its I/O and gets the CPU back, the CPU is confused: where was I?

The concept of the process was introduced to solve these problems: it describes and controls the execution of a program so that programs can run concurrently. In other words, processes exist for the sake of concurrency.

If you read the above carefully, you may have noticed that it talks about processing a job in one place and executing a program in another. What is the difference between the two terms? In a single-channel batch system, the OS only has to work through the lines of code recorded on the tape, so the unit of work is simply called a job. In a multi-channel batch system, a program can comprise multiple processes, and a process is itself made up of several parts, so it can no longer simply be called a job.

Process composition

So how do processes solve the problem of control between concurrent programs? "Process" is generally short for "process entity", which consists of the following three parts:

  • Program segment
  • Related data segment
  • Process control block (PCB)
    • Process identifier
      • External identifier
      • Internal identifier
    • Processor state
    • Process scheduling information
    • Process control information

It is not hard to see that describing and controlling a program is the job of the process control block (PCB). In essence, therefore, creating a process means creating the PCB of its process entity, and terminating a process means destroying that PCB.
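To make the idea more concrete, here is a minimal sketch of a PCB and a process table in Python. It is only an illustration under stated assumptions: the field names, the `ProcessTable` class, and its methods are hypothetical and do not correspond to any real operating system's data structures. It shows that creating a process adds a PCB, terminating it removes the PCB, and the saved processor state is what lets a program answer "where was I?".

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PCB:
    """Hypothetical, heavily simplified process control block."""
    pid: int                       # internal identifier (assigned by the OS)
    name: str                      # external identifier (chosen by the user)
    program_counter: int = 0       # processor state: next instruction to run
    registers: dict = field(default_factory=dict)  # processor state: saved registers
    state: str = "ready"           # scheduling information
    priority: int = 0              # scheduling information
    code_address: int = 0          # control information: where the program segment lives
    data_address: int = 0          # control information: where the data segment lives


class ProcessTable:
    """Toy process table: a process exists exactly as long as its PCB does."""

    def __init__(self):
        self._pcbs = {}
        self._next_pid = count(1)

    def create(self, name, priority=0):
        """Create a process by creating its PCB."""
        pcb = PCB(pid=next(self._next_pid), name=name, priority=priority)
        self._pcbs[pcb.pid] = pcb
        return pcb

    def terminate(self, pid):
        """Terminate a process by destroying its PCB."""
        del self._pcbs[pid]

    def save_context(self, pid, program_counter, registers, new_state="ready"):
        """Record where a process stopped so that it can resume later."""
        pcb = self._pcbs[pid]
        pcb.program_counter = program_counter
        pcb.registers = dict(registers)
        pcb.state = new_state

    def __contains__(self, pid):
        return pid in self._pcbs


if __name__ == "__main__":
    table = ProcessTable()
    a = table.create("program A")
    # Program A starts waiting for I/O: remember where it was.
    table.save_context(a.pid, program_counter=42, registers={"r0": 7}, new_state="blocked")
    print(a.program_counter, a.state)   # 42 blocked
    table.terminate(a.pid)
    print(a.pid in table)               # False
```

A real PCB holds far more (open files, memory maps, accounting data, and so on); the sketch keeps only one or two fields per category from the list above.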

Process definition

After all that long-winded preamble, we can finally define a process:

A process is one run of a process entity. It is an independent unit that the system uses for resource allocation and scheduling.

That is:

  1. A process is the basic unit of resource allocation in the system.
  2. A process is the basic unit of scheduling in the system.

Doesn’t something feel off? This is different from what you may have seen in other places. Don’t worry, Rome wasn’t built in a day. Let me tell you how the definition came to change.

Process characteristics and limitations

  1. A process has a separate address space, so there is no need to worry about one process accidentally overwriting another process’s virtual memory. But this also means that sharing state between processes has to rely on interprocess communication (IPC), and IPC is expensive.
  2. Because a process has a separate address space, creating a process requires allocating to it every resource it needs except the processor, and all of those resources must be reclaimed when the process is terminated. This increases the time and space overhead.
  3. A context switch between processes has to save the CPU environment of the current process and set up the CPU environment of the new process, which consumes a lot of CPU time.
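The first point can be observed directly. The sketch below is a small illustration, not a general recipe: it assumes a Unix-like system where `os.fork` is available, and the names `counter` and `fork_and_increment` are made up for the example. The child process increments its own copy of a variable, and the parent only learns the new value because the child sends it back through a pipe, which is one simple form of IPC.

```python
import os

counter = 0  # lives in the address space of whichever process reads it


def fork_and_increment():
    """Fork a child that increments `counter`; return (parent_view, child_view)."""
    global counter
    read_fd, write_fd = os.pipe()        # a pipe: one simple form of IPC
    pid = os.fork()
    if pid == 0:                         # child process
        os.close(read_fd)
        counter += 1                     # changes only the child's copy
        os.write(write_fd, str(counter).encode())
        os.close(write_fd)
        os._exit(0)                      # exit at once, skipping the parent's cleanup
    os.close(write_fd)                   # parent process
    with os.fdopen(read_fd) as reader:
        child_view = int(reader.read())  # state arrives only via the pipe
    os.waitpid(pid, 0)                   # reap the child
    return counter, child_view


if __name__ == "__main__":
    print(fork_and_increment())          # (0, 1)
```

The parent still sees `0` after the child has finished: the write never reached the parent's memory, and the value `1` had to be copied across explicitly.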

Imagine a program with three basic functions:

  1. Get multiple inputs from the keyboard
  2. Perform a series of processing steps on the input
  3. Display the corresponding content according to the processing result

Without the concept of threads, we could only do the following:

  1. If all three functions are implemented in a single process A, the flow inside the process is as follows:
Get input -> Process -> Show output -> Get input -> Process -> Show output -> ...

Although process A runs concurrently with other processes, inside it the functions run strictly one after another, and each one blocks the next.

  2. Create three processes A, B, and C, one for each of the three functions, and the flow is as follows:
Get input (A) -> Process data (B) -> Show output (C) -> Get input (A) -> Process data (B) -> Show output (C)

That is, the three processes can execute concurrently, but the system has to switch between them frequently.

The root cause is that a process is at once the basic unit of resource allocation and the basic unit of scheduling. If we give each function its own process, we have to allocate system resources to each of those processes and then schedule each one independently. If instead we allocate all the resources the functions need to a single process, the functions inside that process block one another and cannot run concurrently.

To resolve this conflict, the process is no longer treated as the basic unit of independent scheduling, and the concept of the thread was born.

Threads

Threads, sometimes called lightweight processes (LWP), are the smallest unit of a program’s execution flow. A standard thread consists of a thread ID, the current instruction pointer (PC), a register set, and a stack. A thread is an entity within a process and is the basic unit that the system schedules and dispatches independently. A thread owns no system resources apart from the few that are essential for it to run, but it shares all the resources of its process with the other threads in that process. One thread can create and terminate another thread, and multiple threads in the same process can execute concurrently. Because threads constrain one another, their execution is intermittent.

Threads also have three basic states: ready, blocked, and running. In the ready state, the thread has everything it needs to run and is only waiting for the processor. In the running state, the thread holds the processor and is executing. In the blocked state, the thread is waiting for an event (such as a semaphore) and logically cannot execute.

Every program has at least one thread. If the program has only one thread, that thread is the program itself. A thread is a single sequential flow of control in a program: a relatively independent, schedulable execution unit inside a process. Running multiple threads in a single program to accomplish different tasks is called multithreading.

In an OS with threads, each process can have multiple threads, and concurrency exists not only between processes but also between threads. This resolves the conflict between resource allocation and scheduling mentioned above.
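As a small illustration of threads sharing the resources of their process, the following sketch uses Python's standard `threading` module. The names `shared_results`, `worker`, and `run_workers` are made up for the example. Every thread appends to the same list, and a lock handles the synchronization that sharing makes necessary.

```python
import threading

shared_results = []          # one list, visible to every thread in this process
lock = threading.Lock()      # synchronizes access to the shared list


def worker(worker_id, n_items):
    """Append n_items entries to the list shared by all threads."""
    for i in range(n_items):
        with lock:
            shared_results.append((worker_id, i))


def run_workers(n_workers=4, n_items=1000):
    """Start n_workers threads, wait for them, and return the number of entries."""
    shared_results.clear()
    threads = [
        threading.Thread(target=worker, args=(w, n_items))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(shared_results)


if __name__ == "__main__":
    print(run_workers())     # 4000
```

No pipe or other IPC mechanism is needed here: the threads communicate simply by reading and writing the same object.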

Independence

Threads exist only as part of a process, and a process can have multiple threads. Threads themselves do not own resources, but they share the resources of the process they belong to. Threads are therefore less independent of one another than processes are, which makes communication between threads easier.

Concurrency

Just as processes can execute concurrently, so can multiple threads within one process, and so can threads that belong to different processes. Isn’t that cool?

Overhead

When the OS creates or destroys a process, it has to allocate or release the corresponding memory space. Threads own almost no resources of their own, so creating or destroying a thread avoids most of this work. At the same time, because the threads of one process share the same address space, communication and synchronization between threads are much easier than between processes. A context switch between threads of the same process is also much cheaper than a switch between processes.

So in the example above, we can assign resources for multiple functions to a single process, and then use multiple threads in the process to handle each function separately. In this way, multiple functions can be executed concurrently by multiple threads without switching between processes.
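A rough sketch of that arrangement is shown below, using Python's standard `threading` and `queue` modules. Several things are assumptions made for the sake of a runnable example: the keyboard is replaced by a list of strings, the "processing" step is just upper-casing, the "display" step collects results in a list, and all function names are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel that tells the next stage there is no more work


def get_input(lines, raw_queue):
    """Stage 1: stands in for reading from the keyboard."""
    for line in lines:
        raw_queue.put(line)
    raw_queue.put(STOP)


def process(raw_queue, done_queue):
    """Stage 2: a placeholder processing step (upper-casing)."""
    while True:
        item = raw_queue.get()
        if item is STOP:
            done_queue.put(STOP)
            break
        done_queue.put(item.upper())


def show_output(done_queue, shown):
    """Stage 3: stands in for displaying the results."""
    while True:
        item = done_queue.get()
        if item is STOP:
            break
        shown.append(item)


def run_pipeline(lines):
    """Run the three functions as three threads inside one process."""
    raw_queue, done_queue, shown = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=get_input, args=(lines, raw_queue)),
        threading.Thread(target=process, args=(raw_queue, done_queue)),
        threading.Thread(target=show_output, args=(done_queue, shown)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shown


if __name__ == "__main__":
    print(run_pipeline(["hello", "world"]))   # ['HELLO', 'WORLD']
```

The three stages no longer block one another: while one thread waits for the next input, the others can keep processing and displaying earlier items, and the queues they share live in the single address space of the process.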

Summary (TL;DR)

  • Process
    • Emerged in order to achieve concurrency.
    • It is the basic unit of resource allocation.
    • A program can have more than one process. Processes are independent of each other.
    • Processes have separate memory spaces, so exchanging information between processes depends on interprocess communication (IPC).
    • Memory space has to be allocated/freed on creation/destruction.
    • Context switching is expensive.
  • Thread
    • Emerged in order to reduce the time and space overhead of executing processes concurrently.
    • It is the basic unit of independent scheduling and does not own resources.
    • Threads exist only as part of a process and are less independent. One process can have multiple threads.
    • Threads share the resources of the process they belong to, so the time and space cost of communication and synchronization is low.
    • Creation/destruction does not require allocating or freeing a separate memory space.
    • Context switching costs less.
  • Other
    • State changes applied to a process (suspend, activate, etc.) also take effect on its threads.