Author: hackett
Wechat official account: Overtime Ape

1. Please explain the concepts of process and thread: why threads exist alongside processes, how the two differ, and how they are synchronized.

Basic Concepts:

A process is the run-time encapsulation of a program. It is the basic unit of system resource scheduling and allocation, and it is what makes concurrent execution possible in the operating system.

A thread is a subtask of a process and the basic unit of CPU scheduling and dispatch. Threads keep a program responsive and enable concurrency within a process; a thread is the smallest unit of execution and scheduling that the operating system recognizes.

Differences:

1. A thread belongs to exactly one process, while a process can have multiple threads, but at least one. Threads depend on their process.

2. A process has an independent memory space during execution, while the threads of a process share that process's memory. Resources are allocated to processes, and all threads of the same process share all of that process's resources: the code segment (code and constants), the data segment (global and static variables), and the extended segment (the heap). However, each thread has its own stack segment (its run-time stack), which holds its local and temporary variables.

3. The process is the smallest unit of resource allocation, and the thread is the smallest unit of CPU scheduling.

4. System overhead: when creating or destroying a process, the system must allocate or reclaim resources such as memory space and I/O devices, so the cost is significantly greater than that of creating or destroying a thread. Likewise, switching processes involves saving the CPU environment of the entire current process and setting up the environment of the newly scheduled one, while switching threads only requires saving and restoring a small number of registers and does not involve memory-management work. Process switching is therefore much more expensive than thread switching.

5. Communication: because the threads of a process share the same address space, synchronization and communication between them are much easier to implement than inter-process communication (IPC). Threads can communicate simply by reading and writing the process's data segment (for example, global variables), using synchronization and mutual exclusion to keep the data consistent. On some systems, threads can even switch, synchronize, and communicate without the intervention of the operating system kernel.

6. Processes are simple and reliable to program and debug, but expensive to create and destroy; threads, by contrast, have low overhead and switch quickly, but are relatively complex to program and debug.

7. Processes do not affect one another; within a process, however, the failure of one thread can bring down the entire process.

8. Processes suit both multi-core and multi-machine (distributed) deployment; threads can only exploit multiple cores within a single machine.

Inter-process communication (IPC) mechanisms:

  1. Pipe (a kernel buffer that processes access on a first-in, first-out basis)
  2. Named pipe / FIFO (corresponds to a disk index node, so any process can access it)
  3. Message queue (each message carries a specific type, so a receiver can select messages of a particular type)
  4. Shared memory (high efficiency: memory is read and written directly, no data copying required)
  5. Semaphore (atomic operations; passing actual data requires combining it with shared memory)
  6. Socket (for communication between processes on different machines)
  7. Signal (used to notify the receiving process that an event has occurred)

2. What is the port number of MySQL, and how do I change it?

Check the port number:

Use the command show global variables like 'port'; the default MySQL port is 3306.

Changing the port number:

To change the port number: edit the /etc/my.cnf file, change the port setting, and restart the MySQL service.
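A minimal sketch of the relevant section of /etc/my.cnf (3307 here is just an example value; the service must be restarted for it to take effect):

```ini
[mysqld]
port = 3307
```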

3. Please tell me why there are threads when there are already processes.

Why threads exist:

Processes let multiple programs execute concurrently, improving resource utilization and system throughput, but they have drawbacks:

A process can only do one thing at a time.

If a process blocks during execution, the whole process hangs: even work inside the process that does not depend on the awaited resource cannot proceed.

Threads have the following advantages over processes:

In terms of resources, threads are a very frugal way to multitask. Under Linux, starting a new process means allocating it its own address space and building numerous tables to track its code, stack, and data segments, which is an "expensive" way of multitasking.

In terms of switching efficiency, the threads of a process share one address space, so switching between threads takes far less time than switching between processes; by some measurements, a process switch costs roughly 30 times as much as a thread switch.

In terms of communication, threads have a convenient mechanism. Different processes have independent data spaces, so data can only be passed between them through inter-process communication, which is time-consuming and inconvenient. Threads of the same process, by contrast, share the process's data space, so data produced by one thread can be used directly by the others, which is both fast and convenient.

In addition to the above advantages, multithreaded programs, as a multitasking, concurrent way of working, have the following advantages:

1. They make multi-CPU systems more efficient: the operating system can run different threads on different CPUs when the number of threads does not exceed the number of CPUs.

2. They improve program structure: a long, complex task can be divided into several threads that run as independent or semi-independent parts, making the program easier to understand and modify.

4. Do you need to consider locking when writing multithreaded programs on a single-core machine? Why?

Yes: multithreaded programs on a single-core machine still need locks. Locks are used for synchronization and communication between threads, and even on a single core the scheduler can preempt a thread in the middle of a critical section, so the thread-synchronization problem still exists.

5. Please describe synchronization between threads, ideally with the specific system calls.

Semaphores

A semaphore is a special variable that can be used for thread synchronization. It takes only non-negative integer values and supports only two operations:

P(SV): if the semaphore SV is greater than 0, decrement it by one; if SV is 0, suspend the calling thread.

V(SV): if other threads are suspended waiting for SV, wake one of them up; otherwise increment SV by one.

The corresponding system calls are:

sem_wait(sem_t *sem): atomically decrements the semaphore by 1. If the semaphore's value is 0, sem_wait blocks until it becomes non-zero.

sem_post(sem_t *sem): atomically increments the semaphore by 1. When the semaphore becomes greater than 0, a thread blocked in sem_wait on that semaphore is woken up.

Mutexes

A mutex (mutual-exclusion lock) is mainly used for mutual exclusion between threads; by itself it does not guarantee ordering, but it can be combined with condition variables for synchronization. A thread must acquire and lock the mutex when entering the critical section, and unlock it when leaving, which wakes other threads waiting on the mutex. The main system calls are as follows:

pthread_mutex_init: initializes a mutex.

pthread_mutex_destroy: destroys a mutex.

pthread_mutex_lock: atomically locks a mutex. If the target mutex is already locked, the call blocks until the holder of the mutex unlocks it.

pthread_mutex_unlock: atomically unlocks a mutex.

Condition variables

Condition variables, also known as condition locks, synchronize threads on the value of shared data. They provide an inter-thread notification mechanism: when the shared data reaches a certain value, one or all of the threads waiting on it are woken up; that is, the thread that changes the shared variable calls signal/broadcast. Access to the shared variable must be protected by a lock. The main system calls are as follows:

pthread_cond_init: initializes a condition variable.

pthread_cond_destroy: destroys a condition variable.

pthread_cond_signal: wakes up one thread waiting on the target condition variable; which thread is woken depends on the scheduling policy and priorities.

pthread_cond_wait: waits on the target condition variable. A locked mutex must be passed in to make the operation atomic: the function releases the mutex before the thread enters the wait state, and re-acquires it for the thread after a signal wakes it up.

6. Should a game server create one thread or one process per user? Why?

The game server should create a process for each user, because threads within one process can affect each other: the death of one thread can affect the other threads and crash the whole process, while processes are isolated from one another.

7. Please talk about struct alignment and byte alignment in the operating system.

1. Reasons:

1) Platform (portability) reasons: not all hardware platforms can access arbitrary data at arbitrary addresses; some platforms can fetch certain types of data only at certain addresses, and otherwise raise a hardware exception.

2) Performance reasons: data structures (especially stacks) should be aligned on natural boundaries whenever possible, because accessing unaligned memory may require the processor to make two memory accesses, whereas aligned access needs only one.

2. Rules:

1) Alignment of data members: for the data members of a struct (or union), the first member is placed at offset 0; each subsequent member is then aligned to the smaller of the value specified by #pragma pack(n) and that member's own size.

2) Overall alignment of the struct (or union): after the members have been aligned, the struct (or union) itself is aligned to the smaller of the value specified by #pragma pack(n) and the size of its largest member.

3) Structs as members: if a struct contains struct members, each such member is placed at an offset that is an integer multiple of the largest element size inside it.

3. Specifying struct alignment:

The alignment can be changed with the preprocessor directive #pragma pack(n), where n = 1, 2, 4, 8, or 16 is the specified "alignment coefficient".

8. When do static variables get initialized?

Static variables are stored in the data segment and the BSS segment of the virtual address space.

In C, they are initialized before the code executes, i.e. at compile/load time.

In C++, because objects require constructor calls, a local static object is constructed the first time control passes through its declaration; non-local (global) static objects are constructed before main runs.

9. Please tell the difference between user mode and kernel mode.

User mode and kernel mode are two privilege levels at which the operating system runs; the biggest difference between them is the privilege level itself. User mode has the lowest privilege and kernel mode the highest. Programs running in user mode cannot directly access the operating system kernel's data structures or code. The main ways to switch from user mode to kernel mode are system calls, exceptions, and interrupts.

10. How can I design a server to receive requests from multiple clients?

Multithreading, thread pools, and I/O multiplexing.

11. Creating a new thread for each connection in an infinite accept loop is inefficient. How can it be improved?

Create a thread pool ahead of time and use the producer-consumer model: maintain a task queue as the critical resource; when a new connection arrives, append it to the task queue; when the queue is empty, all worker threads sleep. The infinite loop itself can be improved with techniques such as select or epoll.

12. How do I wake up a thread blocked on a socket?

Provide the resource it is blocked waiting for (for example, send data to the socket, or close it).

13. How do I determine whether the current thread is busy or blocked?

Check the thread's state with the ps command.

14. What is a process in the ready state waiting for?

For the scheduler to grant it the right to run on the CPU.

15. If two processes access a critical-section resource, can both acquire the spin lock?

On a single-core CPU with preemption enabled, this can happen.

16. Do you know the Windows message mechanism? Please explain it.

When the user performs an action (with the mouse, keyboard, etc.), the system converts the event into a message. The system maintains a message queue for every open process, places each message into the queue of the target process, and the application loops, retrieving messages from its queue and performing the corresponding operations.

17. Tell me about the locks you have used in C++.

The producer-consumer problem can be solved cleanly with a mutex and a condition variable, which together act as a substitute for a semaphore.

18. Please talk about memory overflow and memory leaks in C++.

1. Memory overflow

A memory overflow occurs when a program requests memory and there is not enough to satisfy the request: the amount of memory needed exceeds what the system can allocate, so the system reports an overflow (out-of-memory) error.

Causes of memory overflow:

Too much data loaded into memory, for example fetching too many rows from the database at once.

Collections holding references to objects that are never cleared after use, so the objects cannot be reclaimed.

Infinite loops, or loops that create too many duplicate object instances.

Bugs in third-party software being used.

The memory limit configured in the program's startup parameters is too small.

2. Memory leaks

A memory leak is when a program fails to release memory that is no longer used due to negligence or error. A memory leak is not the physical disappearance of memory, but rather the loss of control over memory allocated by an application due to design errors.

Memory leak classification:

1. Heap leak. Heap memory is memory allocated from the heap at run time with malloc, realloc, new, etc.; it must be released with the corresponding free or delete. If a design error in the program prevents this memory from being released, the memory can no longer be used, producing a heap leak.

2. System resource leak. This refers to the program using resources allocated by the system, such as bitmaps, handles, and sockets, without releasing them via the corresponding functions, wasting system resources; it can seriously reduce system performance and make the system unstable.

3. The base-class destructor is not declared virtual. When a base-class pointer points to a derived-class object and the base class's destructor is not virtual, the derived class's destructor is never called, so the derived class's resources are not released correctly, resulting in a memory leak.

19. Would you tell me something about coroutines?

1) Concept:

A coroutine, also known as a micro-thread or fiber (English: coroutine), looks like a subroutine, but execution can be suspended inside it to switch to another subroutine, and later resumed from where it left off.

2) Coroutines and threads

The biggest advantage of coroutines over multithreading is their execution efficiency. Because switching between subroutines is controlled by the program itself rather than being a thread switch, there is no thread-switching overhead; the more concurrent tasks there are, the greater the performance advantage of coroutines over multithreading.

The second advantage is that no multithreaded locking mechanism is needed. Because there is only one thread, there are no conflicting simultaneous writes to variables; shared resources can be managed inside the coroutines without locks, just by checking state, so execution is much more efficient than with multithreading.

3) Other

To use multiple CPU cores with coroutines, combine multiple processes with coroutines: this exploits all the cores while keeping the high efficiency of coroutines, yielding very high performance.

Python's built-in support for coroutines is limited; yield in generators can be used to implement coroutines to some extent. Although the support is incomplete, it is already quite powerful.

20. What can you tell us about zombie processes?

1) Normal process

Normally, a child process is created by its parent, and the child may in turn create new processes of its own.

2) Orphan process

If a parent exits while one or more of its children are still running, those children become orphan processes. Orphan processes are adopted by the init process (process number 1), which collects their exit status for them.

3) Zombie processes

A process creates a child with fork. If the child exits but the parent does not call wait or waitpid to obtain the child's exit status, the child's process descriptor remains in the system. Such processes are called zombie processes.

Getting rid of zombies:

Send SIGTERM or SIGKILL to the zombie's parent process; the zombie children then become orphans, which are taken over by the init process. init calls wait() on them, releasing their entries in the system process table.

Solutions inside the program:

1. When the child exits, it sends the SIGCHLD signal to its parent; the parent installs a handler for SIGCHLD and calls wait in the signal handler to reap the zombie.

2. Fork twice: the first child forks a grandchild and exits immediately; the grandchild is thereby orphaned and adopted by init, which reaps it, so no zombie is left.

21. Please introduce the five I/O models.

1. Blocking I/O: the caller invokes a function and blocks, doing nothing else until the function returns; only after it returns can the caller proceed to the next step.

2. Non-blocking I/O: the caller polls, checking at intervals whether the I/O event is ready; while it is not ready, the caller can do other work.

3. Signal-driven I/O: Linux supports signal-driven I/O on sockets. The process installs a signal handler and continues running without blocking; when the I/O event is ready, the process receives the SIGIO signal and then handles the I/O event.

4. I/O multiplexing: Linux implements the I/O multiplexing model with the select/poll functions. These functions also block the process, but unlike blocking I/O they can block on multiple I/O operations at once, detecting multiple readable and writable descriptors at the same time; the actual I/O functions are called only when a descriptor is known to be readable or writable.

5. Asynchronous I/O: on Linux, the process can call aio_read to pass the kernel the descriptor, the buffer pointer and size, the file offset, and the notification method, and then return immediately; the kernel notifies the application once it has copied the data into the buffer.

22. What is the state of the process when the server is listening on a port but no client has connected?

It depends on the server's programming model: with the blocking model described in the previous answer the process is blocked, while with I/O multiplexing such as epoll or select it is running.

23. How can C++ implement a thread pool?

Set up a producer-consumer task queue as the critical resource: worker threads created in advance pop tasks from the queue and execute them, while producers push new tasks in.

24. How do I print lines 100 to 200 of a file in Linux?

sed -n '100,200p' inputfile

awk 'NR>=100 && NR<=200' inputfile

head -n 200 inputfile | tail -n 101

25. Please talk about the use of awk.

1) Functions:

awk is a pattern scanning and processing language. It lets you create short programs that read input files, sort data, process data, perform calculations on the input, generate reports, and much more.

2) Usage:

awk [-F field-separator] 'commands' input-file(s)

3) Built-in variables

ARGC     number of command-line arguments
ARGV     array of command-line arguments
ENVIRON  array of system environment variables
FILENAME name of the current input file
FNR      record number in the current input file
FS       input field separator, equivalent to the -F command-line option
NF       number of fields in the current record
NR       number of records read so far
OFS      output field separator
ORS      output record separator
RS       input record separator
