I have been learning Go for a while and have built a few demos by following various tutorials. After working with many languages, picking up a new one for work, a project, or out of interest is not too hard, and going over basic syntax that looks much like every other language's can even be boring, but it is a step that cannot be skipped. For a technology enthusiast, breadth, depth, and new features are often the best stimulants. Drawing on material I have studied recently, this article takes a slightly deeper look at how the Go runtime works and how a Go program runs, and builds a simple high-level picture of how a Go program starts up and executes.

Why is Go suitable for modern back-end programming environments?

  • Server applications are mostly APIs and are IO-intensive, with network IO making up most of that IO;
  • Low running cost: there is no VM, and memory usage is low when there are not many network connections;
  • Strongly typed, easy to use, and easy to maintain.

Why is Go suitable for infrastructure?

  • k8s, etcd, istio, and docker have proven Go's ability.

I. Understand executable files

1. Prepare basic experimental environment

Use Docker to build the base environment

FROM centos
RUN yum install golang -y \
   && yum install dlv -y \
   && yum install binutils -y \
   && yum install vim -y \
   && yum install gdb -y
# docker build -t test .
# docker run -it --rm test bash

2. Go language compilation process

Go program compilation process: source text -> compile -> link -> binary executable

Compile: source code -> object files (.o, .a)

Link: merges the object files into an executable file

Use go build -x xxx.go to observe this process.

3. Specifications of executable files for different systems

Executable file formats differ from one operating system to another.

Take ELF (Executable and Linkable Format), the executable file format on Linux, as an example. An ELF file consists of several parts:

  • ELF header
  • Section header
  • Sections

How the operating system runs an executable file (using Linux as an example): it parses the ELF header, loads the file contents into memory, and starts executing from the entry point.

4. How to find the entry point of a Go process

Use readelf to read the entry point address from the ELF header, then use that address to locate where the Go process starts executing.
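The same entry point address can also be read from Go itself. The following is a minimal sketch using the standard library's debug/elf package; the helper name entryPoint is made up for this illustration, and the program assumes it is running on a system whose executables are ELF files (for example Linux).

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// entryPoint (a hypothetical helper) returns the entry point address stored
// in the ELF header of the file at path.
func entryPoint(path string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	return f.Entry, nil
}

func main() {
	// Inspect this program's own binary. This only succeeds where the
	// executable is an ELF file.
	self, err := os.Executable()
	if err != nil {
		fmt.Println("cannot locate executable:", err)
		return
	}
	entry, err := entryPoint(self)
	if err != nil {
		fmt.Println("cannot read ELF header:", err)
		return
	}
	fmt.Printf("entry point: %#x\n", entry)
}
```

The printed address is the one a debugger breakpoint can be set on to see the first instructions the process executes.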

II. Start and initialize the Go process

1. How does the computer execute our program

The CPU cannot understand text and can only execute binary machine code instructions one at a time. After each instruction, the PC register points to the next one and continues execution.

On 64-bit x86 platforms, the PC register is RIP.

The computer executes the assembly instructions in order, from top to bottom.
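To make the idea of a program counter a little more concrete, here is a small sketch that asks the Go runtime for a PC value inside a function and maps it back to the function's name. The function name currentFunc is invented for this example; runtime.Caller and runtime.FuncForPC are standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentFunc (a hypothetical helper) returns a program counter value taken
// inside this function and the name of the function that address belongs to.
func currentFunc() (uintptr, string) {
	pc, _, _, ok := runtime.Caller(0) // 0 means "the function calling runtime.Caller"
	if !ok {
		return 0, ""
	}
	return pc, runtime.FuncForPC(pc).Name()
}

func main() {
	pc, name := currentFunc()
	fmt.Printf("pc=%#x is inside %s\n", pc, name)
}
```

The exact address differs from build to build; the point is only that every piece of running code has an address that the PC register can hold.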

2. What a runtime is, and the Go runtime

Go is a language that has a runtime, so what is a runtime?

Think of the runtime as a set of modules that are loaded and run automatically alongside the program to provide additional functionality.

In Go, the runtime is linked into the same executable as the programmer-defined code and sits between that code and the operating system.

In Go, the runtime consists of:

  • Scheduler: manages all Gs, Ms, and Ps and executes the scheduling loop in the background
  • Netpoll: network polling, which manages read/write readiness events on FDs
  • Memory management: allocates memory when the code needs it
  • Garbage collector: reclaims memory when it is no longer needed

At the heart of these modules is the scheduler, which ties all of the runtime's processes together.

Find the entry point of the Go process:

runtime.rt0_go:

  • Process argc and argv
  • Initialize the global m0 and g0
  • Get the number of CPU cores
  • Initialize the built-in data structures
  • Start the user's main function (from here on the program is in the scheduling loop)

m0 is the first thread created when the Go program starts.
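Some of what this startup code prepares can be observed from user code once main runs. The sketch below only reads values through public standard library APIs (os.Args, runtime.NumCPU, runtime.GOMAXPROCS); the helper name startupInfo is invented for illustration, and nothing here touches rt0_go itself.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// startupInfo (a hypothetical helper) returns values that are already set up
// by the time main.main runs: the program arguments, the detected number of
// logical CPUs, and the number of Ps.
func startupInfo() (args []string, cpus int, procs int) {
	// GOMAXPROCS(0) only queries the current setting; it does not change it.
	return os.Args, runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	args, cpus, procs := startupInfo()
	fmt.Println("argv:", args)
	fmt.Println("logical CPUs:", cpus)
	fmt.Println("GOMAXPROCS (number of Ps):", procs)
}
```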

III. Scheduling components and scheduling cycles

1. Overview of the production-consumption process of Go

Whenever I write:

go func() {
  println("hello alex")
}()

What happens? The body of the func is the content of one computing task.

Go's scheduling is essentially a production-consumption process. The following figure shows the general flow:

  • The producer on the right is each go func() {} statement, which submits a task;
  • In the middle is the queue. Each submitted task is packaged into a coroutine G, that is, a goroutine;
  • The goroutine goes into this queue. At the other end of the queue are the threads, which consume tasks in a loop;
  • The queue in the middle is divided into two main parts: the local queue and the global queue.
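Seen from user code, the production-consumption flow above looks like the following sketch: the go statement produces tasks, and the runtime's threads consume them. The helper name runTasks and the task count are made up for illustration; the queues themselves are internal to the runtime and are not visible here.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runTasks (a hypothetical helper) submits n tasks with the go statement and
// waits until all of them have been executed. It returns how many ran.
func runTasks(n int) int64 {
	var done int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() { // producer side: each go statement hands the runtime one G
			defer wg.Done()
			atomic.AddInt64(&done, 1) // the task body, executed by some thread
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&done)
}

func main() {
	fmt.Println("tasks completed:", runTasks(1000)) // tasks completed: 1000
}
```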

2. Go scheduling component P, G, M structure

First, an overall definition of P, G, and M:

  • G: goroutine, a computing task. It consists of the code to be executed and its context, including the current code location, the top and bottom addresses of the stack, the state, and so on.
  • M: machine, a system thread and the entity that executes. Code can only run on the CPU through a thread; as with threads in C, an M is created through the clone system call.
  • P: processor. An M must obtain a P to execute code, otherwise it must sleep (the background monitoring thread is the exception). P can be thought of as a token: holding the token gives the right to execute on a physical CPU core.

Once you have read the rest of this section, come back to these definitions; they will be easier to understand.
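The number of Gs, Ms, and Ps in a running program can be roughly observed through public APIs. This is a sketch under the assumption that the "threadcreate" profile count is a reasonable stand-in for the number of Ms created so far; the helper name counts is invented for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// counts (a hypothetical helper) reports the number of goroutines (G), the
// number of OS threads the runtime has created so far (used here as an
// approximation of M), and the number of Ps.
func counts() (g, m, p int) {
	g = runtime.NumGoroutine()
	m = pprof.Lookup("threadcreate").Count()
	p = runtime.GOMAXPROCS(0) // 0 queries the value without changing it
	return g, m, p
}

func main() {
	g, m, p := counts()
	fmt.Printf("G=%d M=%d P=%d\n", g, m, p)
}
```

In a small program, G and M are usually low single digits, while P matches the number of logical CPUs unless GOMAXPROCS has been changed.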

The overall structure is as follows:

  • The blue, yellow, and green Ms on the right are threads. Most of them run the scheduling loop all the time, which simply means that a thread repeatedly takes a task from the task queues on the left (the local run queue and the global run queue) and executes it;
  • Threads are created on demand throughout this process, so some of them may be idle. Idle threads are placed in a structure called midle, and when a thread is needed the runtime looks in midle for one to reuse;
  • As the figure shows, besides the local run queue and the global run queue there is one more structure, runnext. runnext and the local run queue essentially exist to address program locality (the principle of locality: code that was called most recently is very likely to be called again; it divides into code locality and data locality). We generally do not want every newly produced task to go into the global run queue;
  • If all threads consumed only the global run queue, additional locking would be needed. That is why the queues are split into a local run queue and a global run queue.

3. Production-consumption details of Go

The production side of goroutines (the process of entering runnext, the local run queue, and the global run queue):

  • A goroutine is created in the upper left corner of the figure. Creating it goes through the runtime: runtime.newproc produces a G;
  • Among the queues, runnext has the highest priority for a new G, so the new G goes into runnext first;
  • When the new G goes in, it may squeeze out the old G that was there, and that needs cleaning up: the old G enters the local queue, and if the local queue is already full, half of the local queue is taken out and pushed into the global queue, and so on;
  • Note: runnext is not a queue in itself; it is a pointer to a single element. It is only referred to alongside the local queue (essentially an array, with a length of only 256) and the global queue (essentially a linked list) for ease of understanding.
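The production path described above can be modeled with a small toy program. This is not the runtime's code: the types toyP and toySched, the method put, and the use of plain slices without locking are all assumptions made for illustration, and the detail that the displaced task travels to the global queue together with half of the local queue is also an assumption of this sketch.

```go
package main

import "fmt"

const localCap = 256 // local run queue capacity, as stated in the text

// task stands in for a G; the toy model only tracks an ID.
type task int

// toyP is a hypothetical, single-threaded model of one P's queues.
type toyP struct {
	runnext *task  // a single slot, not a queue
	local   []task // holds at most localCap tasks
}

// toySched holds the global queue shared by all Ps (no locking in this toy).
type toySched struct {
	global []task
}

// put models the production path: runnext first, then local, then global.
func (s *toySched) put(p *toyP, t task) {
	old := p.runnext
	p.runnext = &t // the newest task always takes the runnext slot
	if old == nil {
		return
	}
	// The task squeezed out of runnext goes to the local queue if it fits.
	if len(p.local) < localCap {
		p.local = append(p.local, *old)
		return
	}
	// Local queue full: move its first half, plus the displaced task,
	// to the global queue.
	half := localCap / 2
	s.global = append(s.global, p.local[:half]...)
	s.global = append(s.global, *old)
	p.local = append(p.local[:0], p.local[half:]...)
}

func main() {
	s, p := &toySched{}, &toyP{}
	for i := 0; i < 258; i++ {
		s.put(p, task(i))
	}
	fmt.Printf("runnext=%d local=%d global=%d\n", *p.runnext, len(p.local), len(s.global))
}
```

With 258 tasks submitted to a single P, the last task sits in runnext, the local queue overflows once, and the overflow lands in the global queue.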

The consumer side of goroutines:

  • The consumer side is essentially multiple threads repeatedly executing a loop that takes values from the queues. The blue part on the right of the figure above is the standard scheduling loop, which consists of four runtime functions: runtime.schedule, runtime.execute, runtime.gogo, and runtime.goexit;
  • The red areas in the figure are logic related to garbage collection (GC). The three yellow boxes to the left of schedule are all functions that fetch a G; as long as any of them returns a G to schedule, the loop on the right keeps executing;
  • Among these functions, globrunqget/61 means that once every 61 rounds of the scheduling loop a G is fetched from the global queue, so that Gs in the global queue are not delayed excessively;
  • If no G is fetched from the global queue, or the global queue does not need to be checked in this round, a G is fetched from the local queue (runnext first); this is done by the runqget function;
  • If there is still no G, findrunnable is executed. The function has two parts, called top and stop. The top part tries again, in the order local queue -> global queue; if that still yields nothing, it uses netpoll to check network polling, and any G found there is put into the global queue; if there is still nothing, it uses runqsteal to steal half of the Gs from another P, which is the work-stealing principle (runqsteal -> runqgrab);
  • If the whole top part still yields no G, the M has nothing to execute, so the stop part begins, which puts the thread to sleep. Before stopm is executed, the queues are checked once more for a G, and only then does the thread sleep;
  • Note that an M must be bound to a P while it executes the scheduling loop, and all operations on global structures require locks.
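The lookup order on the consumer side can also be sketched as a toy model. Everything here is hypothetical and single-threaded: the type toyQueues and the methods pick and step are invented names, tasks are plain strings, and the network-polling and work-stealing steps are only mentioned in a comment rather than modeled.

```go
package main

import "fmt"

// toyQueues is a hypothetical model of the queues one P can take work from.
type toyQueues struct {
	schedtick int    // how many tasks this P has executed so far
	runnext   string // "" means the slot is empty
	local     []string
	global    []string
}

// pick chooses the next task in the order described above and reports
// which queue it came from.
func (q *toyQueues) pick() (name, from string) {
	// Every 61st round looks at the global queue first, so that tasks
	// waiting there are not delayed excessively.
	if q.schedtick%61 == 0 && len(q.global) > 0 {
		name, q.global = q.global[0], q.global[1:]
		return name, "global"
	}
	if q.runnext != "" {
		name, q.runnext = q.runnext, ""
		return name, "runnext"
	}
	if len(q.local) > 0 {
		name, q.local = q.local[0], q.local[1:]
		return name, "local"
	}
	if len(q.global) > 0 {
		name, q.global = q.global[0], q.global[1:]
		return name, "global"
	}
	// The real scheduler would poll the network and try to steal here.
	return "", "idle"
}

// step picks one task and, if there was one, counts it as executed.
func (q *toyQueues) step() (name, from string) {
	name, from = q.pick()
	if name != "" {
		q.schedtick++
	}
	return name, from
}

func main() {
	q := &toyQueues{runnext: "r", local: []string{"l1", "l2"}, global: []string{"g1", "g2"}}
	for {
		name, from := q.step()
		if name == "" {
			break
		}
		fmt.Printf("%s from %s\n", name, from)
	}
}
```

Because the tick starts at 0, the very first round checks the global queue; after that the toy falls back to runnext, the local queue, and finally the global queue again.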

The scheduling loop on the right is described separately below:

  • The most important function in the scheduling loop is schedule, which finds a task to execute;
  • When schedule gets a G, it enters the execute flow (to run Go code). gogo restores G's saved context and continues from the saved PC register. goexit finishes the current flow, caches the related G struct resources, and returns to schedule to continue the loop;
  • During the scheduling loop, P.schedtick records how many times the loop has executed; it is used for globrunqget/61 and similar checks. Each time execute runs, P.schedtick is incremented by 1.

The previous section covered the scheduling loop and the scheduling components, but that only handles the normal case. If something in the program blocks, the runtime has to keep the thread itself from being blocked.

IV. Handle blocking

1. Common blocking conditions in Go language

channel

time.Sleep

Network reads

Network writes

The select statement

Locks

These six kinds of blocking do not block the scheduling loop; they suspend the goroutine. Suspending means placing the G in some data structure until it is ready to run again, so it does not occupy a thread.

At this point, the thread enters schedule again, continues consuming the queues, and executes other Gs.
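A small experiment can illustrate that a goroutine blocked on a channel does not hold on to a thread. In this sketch the program is limited to a single P, many goroutines block on the same channel, and the main goroutine still gets to run and wake them. The helper name parkAndWake is invented for the example, and the call to runtime.Gosched only gives the new goroutines a chance to run first; it does not guarantee that they have all blocked.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parkAndWake (a hypothetical helper) starts n goroutines that block on a
// channel receive while only one P is available, then wakes them all and
// returns how many finished.
func parkAndWake(n int) int {
	prev := runtime.GOMAXPROCS(1) // one P: at most one goroutine runs at a time
	defer runtime.GOMAXPROCS(prev)

	gate := make(chan struct{})
	results := make(chan int, n) // buffered so the sends below never block
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-gate // blocks: this G is suspended and the thread moves on
			results <- id
		}(i)
	}
	runtime.Gosched() // give the new goroutines a chance to run and block
	close(gate)       // makes every suspended G runnable again
	wg.Wait()
	return len(results)
}

func main() {
	fmt.Println("goroutines finished:", parkAndWake(1000)) // goroutines finished: 1000
}
```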

2. How G is suspended for each kind of blocking

  1. Channel send: if it blocks, the channel has a sendq wait queue; G is packaged into a sudog data structure and placed in that queue;
  2. Channel receive: if it blocks, the channel has a recvq wait queue; G is packaged into a sudog data structure and placed in that queue;
  3. Network write blocking: G is hung on the wg field of the underlying pollDesc;
  4. Network read blocking: G is hung on the rg field of the underlying pollDesc;
  5. Select blocking: taking the 3 channels in the figure as an example, each of the 3 channels has a sendq or recvq queue, and G is packaged into a sudog placed at the end of each of these queues;
  6. time.Sleep blocking: G is hung on a field of the timer structure.

Lock blocking is somewhat special, so it is covered separately.

  • As with the kinds of blocking above, lock blocking still packages G into a sudog, which is stored in a treap. The treap is a balanced binary tree, and each of its nodes is a linked list.

From the above we can see that some of the suspended structures are sudogs and some are Gs. Why is that?

Because one G may correspond to multiple sudogs; for example, one G may select on multiple channels at the same time.
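In user code, the situation where one G waits on several channels at once looks like the sketch below. The function name waitAny and the channel names are made up for illustration; the sudog bookkeeping itself is internal to the runtime and cannot be seen from here.

```go
package main

import "fmt"

// waitAny (a hypothetical helper) blocks until one of the three channels
// delivers a value and reports which channel it was.
func waitAny(a, b, c <-chan string) string {
	select { // while blocked, this one G is waiting on all three channels
	case v := <-a:
		return "a:" + v
	case v := <-b:
		return "b:" + v
	case v := <-c:
		return "c:" + v
	}
}

func main() {
	a, b, c := make(chan string), make(chan string), make(chan string)
	go func() { b <- "hello" }()  // only b ever becomes ready
	fmt.Println(waitAny(a, b, c)) // b:hello
}
```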

3. Blocking that the runtime cannot handle

CGO

Blocking in a syscall

When a goroutine is executing C code or is blocked in a syscall, it must occupy a thread.

4. sysmon

sysmon: system monitor

sysmon runs in the background with high priority, on a dedicated thread, without binding to a P.

sysmon has three main functions:

  1. checkdead -> checks whether all threads are blocked. If every thread is blocked, the program is written incorrectly and has to crash. For network applications this is generally not triggered, and it is a common misconception that it can detect deadlocks in general;
  2. netpoll -> puts the Gs that are ready into the global queue;
  3. retake -> if an M has been stuck in a syscall for a long time, P is detached from that M (handoffp); from go1.14 onward, if a user G has been running for a long time, it is preempted by a signal.
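The preemption of long-running Gs can be observed with a small experiment. This sketch assumes Go 1.14 or later on a platform that supports signal-based preemption; on older versions the program would be expected to hang. The helper name spinUntilStopped and the 20 ms sleep are arbitrary choices for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spinUntilStopped (a hypothetical helper) runs a busy loop on the only P
// and returns true once the calling goroutine has managed to run again and
// stop it.
func spinUntilStopped() bool {
	prev := runtime.GOMAXPROCS(1) // a single P, shared with the busy loop
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	go func() {
		for atomic.LoadInt32(&stop) == 0 {
			// busy loop: it never blocks and never yields on its own
		}
		close(done)
	}()

	time.Sleep(20 * time.Millisecond) // lets the busy loop take over the P
	atomic.StoreInt32(&stop, 1)       // only reached if the loop was preempted
	<-done
	return true
}

func main() {
	fmt.Println("busy loop was preempted:", spinUntilStopped())
}
```

The busy loop never gives up the P voluntarily, so the fact that the program finishes at all shows that something outside the loop interrupted it.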

V. Development history of the scheduler

VI. Knowledge summary

1. Executable file (ELF):

  • Use go build -x to watch the compilation and linking process
  • Use readelf -h to find the program's entry point in the ELF header
  • In the dlv debugger, use b *entry_addr to find the code location

2. Startup process:

  • Process arguments -> initialize internal data structures -> main thread -> start the scheduling loop

3. Runtime structure:

  • Scheduler, Netpoll, memory management, garbage collection

4. GMP:

  • M, task consumer; G, computing task; P, a token granting the right to use the CPU

5. Queues:

  • P's local runnext field -> P's local run queue -> global run queue; multi-level queues reduce lock contention

6. Scheduling cycle:

  • The process by which thread M continuously consumes G in the run queue while holding P.

7. Handling blocking:

  • Blocking the runtime can take over: channel send and receive, locks, network connection reads and writes, select
  • Blocking the runtime cannot take over: syscall and cgo; when these run for a long time, P has to be stripped from the M

8. sysmon:

  • A background high-priority loop that executes without binding to any P. It is responsible for:
  • Checking whether there are no active threads left and, if so, crashing the program;
  • Polling netpoll;
  • Stripping the P from an M that is blocked in a syscall;
  • Sending a signal to preempt a G that has been executing for too long.