This article explores the question “Does a system call occur when time.Now() is called in the Go language?” Before we do that, let’s review what a system call is.

What is a system call?

A system call is a request from a program running in user space for a service that requires higher privileges and is provided by the operating system kernel. These are the services the kernel manages: process management, storage, memory, networking, and so on. To read a file, for example, a user program has to issue the open and read system calls, either through the C library (libc), whose functions wrap the underlying system calls, or by invoking the system calls directly.

Why does accessing such a resource on Linux require a system call? Couldn’t the access be done entirely in user space? The design exists to improve security and fault tolerance and to guard against malicious attacks. On x86, the CPU defines four privilege levels, also called CPU rings. At any given moment the CPU runs at one particular privilege level, which determines what the running code can and cannot do. The levels can be pictured as concentric rings, with the most privileged, Ring 0, in the center, followed by Ring 1, Ring 2, and finally Ring 3. When a system call occurs, execution moves from user space into kernel space, the privilege level is raised from Ring 3 to Ring 0, and control jumps to the code of the requested system call.

In the early days, system calls were implemented with the software interrupt int 0x80. Because a software interrupt has to go through the interrupt descriptor table to find the entry address of the system call handler, its performance is poor. Dedicated system call instructions were therefore introduced, and Linux uses them; on 64-bit systems these are the SYSCALL/SYSRET instructions. The key point is that a system call switches from user mode to kernel mode and back, which carries a performance cost.

Analyzing the time.Now() call

Having reviewed the concept of system calls, let’s use the strace command to see whether time.Now() makes a system call in the following code.

package main

import "time"

func main() {
	time.Now()
}

Build the binary executable test, and then use strace to list the system calls made while test runs, to see whether any time-related system calls are used:

go build -gcflags="-N -l" -v -o test

strace ./test 2>&1 | grep time

The output shows no time-related system calls when time.Now() is called, so we can tentatively conclude that time.Now() does not make a system call. Yet this seems to conflict with the concept introduced above: obtaining the time means reading the system clock, which is maintained by the kernel at Ring 0, so a system call should be required.

Let’s examine the implementation of time.Now() to see what happens when it is called.

There are two ways to analyze the source code. The first is to read the source directly. The source is varied, though: it includes assembly and platform-specific code, and code editors often lack good hints and jump-to-definition support for it, so we sometimes have to search globally for keywords to find a function or variable. The second is to use a debugger such as GDB or dlv and trace the executing source with breakpoints. The two methods are usually combined. This time we will use GDB. The author’s environment is as follows:

vagrant@vagrant:~$ go version
go version go1.14.15 linux/amd64
vagrant@vagrant:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
vagrant@vagrant:~$ gdb --version
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2

First we start GDB, then set a breakpoint at main and run the program:

Next we set a breakpoint at time.Now() (line 6) and use the continue and step commands to look at the internal implementation of time.Now():

The source code for time.Now() is at line 1121 of time/time.go. time.Now() calls the now() function to get the current seconds and nanoseconds. Let’s look at the implementation of now():

As the figure above shows, stepping into now() jumps to time_now(). This is due to the go:linkname compiler directive, which links a private function or variable in the current source file to the specified function or variable at compile time. For example, //go:linkname time_now time.now links time_now to time.now, so time.now is implemented by time_now at line 15 of runtime/timestub.go.

Next, let’s look at walltime, which time_now calls. As the figure below shows, walltime is at line 23 of runtime/time_nofake.go and calls walltime1. walltime1 is implemented in assembly; its source is at line 209 of runtime/sys_linux_amd64.s.

Now let’s look at the assembly. We only care about the core of the runtime·walltime1 function, specifically the assembly code between lines 209 and 210 of sys_linux_amd64.s, indicated by the arrows:

The assembly code in the figure above does two main things. First, it switches from the goroutine stack to the g0 stack.

get_tls(CX) // Load TLS into the CX register
MOVQ	g(CX), AX // Save g information in TLS to AX register
MOVQ	g_m(AX), BX // BX unchanged by C code.

// Set vdsoPC and vdsoSP for SIGPROF traceback.
LEAQ	sec+0(FP), DX
MOVQ	-8(DX), CX
MOVQ	CX, m_vdsoPC(BX)
MOVQ	DX, m_vdsoSP(BX)

CMPQ	AX, m_curg(BX)	// Compare the g stored in TLS with m.curg. If they are not equal, we are not on the user goroutine stack, so skip the switch and jump to noswitch
JNE	noswitch
// Switch from the goroutine stack to the g0 stack
MOVQ	m_g0(BX), DX 
MOVQ	(g_sched+gobuf_sp)(DX), SP	// Set SP to g0 stack

In the GMP model, an M may be executing on the system stack (the g0 stack), on the signal stack, or on a user goroutine stack. getg() returns the currently executing g, which may be the M’s g0, its gsignal, or the goroutine associated with the M. getg().m.curg always returns the goroutine associated with the M. So comparing getg() with getg().m.curg tells us which stack the M is on: if they are equal, the M is on the user goroutine stack; otherwise it is on the system stack or the signal stack. The assembly above performs exactly this check before switching to the system stack.

The second is to call the function pointed to by the runtime·vdsoClockgettimeSym variable to get the current seconds and nanoseconds. This is the heart of the time.Now() implementation.

noswitch:
	SUBQ	$16, SP		// Space for results
	ANDQ	$~15, SP	// Align for C code
	MOVQ	runtime·vdsoClockgettimeSym(SB), AX // Save the function address stored in the vdsoClockgettimeSym variable into the AX register
	CMPQ	AX, $0 // Compare the function address saved by vdsoClockgettimeSym with 0, if equal, jump to the fallback branch
	JEQ	fallback
	MOVL	$0, DI // CLOCK_REALTIME
	LEAQ	0(SP), SI
	CALL	AX // Call the function pointed to by vdsoClockgettimeSym
	MOVQ	0(SP), AX	// sec
	MOVQ	8(SP), DX	// nsec
	MOVQ	BP, SP		// Restore real SP
	MOVQ	$0, m_vdsoSP(BX)
	MOVQ	AX, sec+0(FP)
	MOVL	DX, nsec+8(FP)

time.Now() ultimately calls the function pointed to by the runtime·vdsoClockgettimeSym variable, whose entry address here is 0x7ffff7ffe8e0. Why use a variable holding a function address instead of an ordinary function symbol? Presumably because the address is not fixed: it varies from process to process and has to be obtained dynamically at run time.

Now let’s look at the assembly code of the function at entry address 0x7ffff7ffe8e0:

We can see that the function at address 0x7ffff7ffe8e0 is named clock_gettime.

Having come this far with GDB, we still need to understand how the runtime·vdsoClockgettimeSym variable was assigned the entry address of clock_gettime.

When a Go application starts, the Go runtime initializes global variables such as ncpu, g0, and sched, and runtime·vdsoClockgettimeSym is no exception: all of them are initialized before main runs. So to observe changes to vdsoClockgettimeSym with the watch command, we have to restart the application.

Watching vdsoClockgettimeSym shows that the function vdsoParseSymbols changed its value: it assigned 0x7ffff7ffe8e0, the entry address of clock_gettime, to vdsoClockgettimeSym.

Note that in GDB this variable is accessed as runtime.vdsoClockgettimeSym, with an ordinary period (.) between runtime and vdsoClockgettimeSym. This is not the same character as the middle dot (·) used in the assembly.

Next we run the bt command to see the whole call stack; we can then open a code editor and follow along with the figure:

This is where we stop tracing time.Now() with GDB. Switching to the code editor, we look at the function vdsoParseSymbols in runtime/vdso_linux.go. The file is described as “Look up symbols in the Linux vDSO.” Together with the function name, this tells us that vdsoParseSymbols resolves symbols in the vDSO. That brings us to the concept of vDSO.

What is vDSO?

vDSO stands for virtual dynamic shared object. It is a mechanism by which the Linux kernel exposes some kernel functionality to user space. It works by mapping the code of certain system calls that are not security-sensitive directly into user space, so that user code can get the same results without making a system call. Because no switch from user space to kernel space is needed, the vDSO mechanism reduces the performance cost. The vDSO supports calls such as clock_gettime, time, getcpu, and so on.

We can find the vDSO module by looking at the memory map of the process:

From the output above, the vDSO occupies addresses 0x7ffff7ffe000 through 0x7ffff7fff000.

For security, the starting address of the vDSO is not fixed; it differs from process to process. We can verify this with the following command, which shows the vDSO starting at a different address on each run:

vagrant@vagrant:~$ LD_SHOW_AUXV=1 cat /proc/self/maps | egrep '\[vdso|AT_SYSINFO'
AT_SYSINFO_EHDR:      0x7fff3d725000
7fff3d725000-7fff3d726000 r-xp 00000000 00:00 0                          [vdso]

Next, let’s save the vDSO from memory to a file and see what format it has. There are two ways to do this.

The first is to use GDB’s dump command to save the vDSO portion of the process memory. We first use info proc mappings to find the start and end addresses of the vDSO in the process memory, and then use the dump memory command to save that address range to the file vdso.so.

The second way is to write code that does it yourself. The core of the implementation is:

	// output and pid are pointers defined elsewhere in the full program
	// (presumably command-line flags).
	outputFile, err := os.Create(*output)
	if err != nil {
		log.Fatal(err)
	}
	defer outputFile.Close()

	mapFile := "/proc/self/maps"
	memFile := "/proc/self/mem"
	if *pid > 0 {
		mapFile = fmt.Sprintf("/proc/%d/maps", *pid)
		memFile = fmt.Sprintf("/proc/%d/mem", *pid)
	}

	mapFileH, err := os.Open(mapFile)
	if err != nil {
		log.Fatal(err)
	}

	// Find the line of the memory map that describes the vDSO.
	bufReader := bufio.NewReader(mapFileH)
	var vdsoSectionLine string
	for {
		line, err := bufReader.ReadString('\n')
		if err != nil {
			if err == io.EOF {
				break
			}
			log.Fatal(err)
		}
		line = strings.Trim(line, "\n")
		if strings.HasSuffix(line, "[vdso]") {
			vdsoSectionLine = line
			break
		}
	}
	if len(vdsoSectionLine) == 0 {
		log.Fatal("can't find vdso module")
	}

	// The first field of the line has the form "start-end" in hexadecimal.
	addrs := strings.Split(strings.SplitN(vdsoSectionLine, " ", 2)[0], "-")
	vdsoStartAddr, _ := strconv.ParseInt(addrs[0], 16, 64)
	vdsoEndAddr, _ := strconv.ParseInt(addrs[1], 16, 64)

	memFileH, err := os.Open(memFile)
	if err != nil {
		log.Fatal(err)
	}

	// Seek to the start of the vDSO and copy the whole range to the output file.
	if _, err = memFileH.Seek(vdsoStartAddr, 0); err != nil {
		log.Fatal(err)
	}

	buf := make([]byte, vdsoEndAddr-vdsoStartAddr)
	if _, err = io.ReadFull(memFileH, buf); err != nil {
		log.Fatal(err)
	}

	if _, err = outputFile.Write(buf); err != nil {
		log.Fatal(err)
	}

After obtaining the vDSO file with either method, we can use the file command to check its type and objdump -T to list its dynamic symbols.

In the figure above we see clock_gettime again.
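The same inspection can be sketched in Go with the standard debug/elf package. The helper name dynamicSymbolNames is invented for this example, and the program assumes the dump was saved as vdso.so in the current directory, as in the GDB method above.

```go
package main

import (
	"bytes"
	"debug/elf"
	"fmt"
	"os"
)

// dynamicSymbolNames parses an ELF image held in memory and returns the
// names in its dynamic symbol table. (Hypothetical helper for illustration.)
func dynamicSymbolNames(image []byte) ([]string, error) {
	f, err := elf.NewFile(bytes.NewReader(image))
	if err != nil {
		return nil, err
	}
	defer f.Close()

	syms, err := f.DynamicSymbols()
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(syms))
	for _, s := range syms {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	// vdso.so is assumed to be the dump produced by one of the two methods.
	image, err := os.ReadFile("vdso.so")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read dump:", err)
		return
	}
	names, err := dynamicSymbolNames(image)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse ELF image:", err)
		return
	}
	for _, name := range names {
		fmt.Println(name)
	}
}
```

If the dump is a valid ELF shared object, the printed names should include the clock_gettime symbol that vdsoParseSymbols looks up.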

How is vDSO used in the Go language?

From the discussion above, we know that no system call occurs when time.Now() is called in Go: the vDSO mechanism maps the code for clock_gettime into the application’s address space, and Go calls that code in user space, avoiding the system call.

As mentioned above, the entry address of vDSO is not fixed. How does Go find the entry address and the address of the clock_gettime function?

Go gets the start address of the vDSO by reading the auxiliary vector, and then parses the vDSO to resolve the address of clock_gettime. The auxiliary vector is a collection of information that the kernel’s ELF binary loader passes to user space, including the executable’s entry address, the process’s user and group IDs, and the vDSO entry address.

The auxiliary vector is a series of key-value pairs. The key for the vDSO entry address is AT_SYSINFO_EHDR. See the manual page for getauxval for details. The relevant Go runtime source is as follows, without further commentary:

func vdsoauxv(tag, val uintptr) {
	switch tag {
	case _AT_SYSINFO_EHDR:
		if val == 0 {
			// Something went wrong
			return
		}
		var info vdsoInfo
		// TODO(rsc): I don't understand why the compiler thinks info escapes
		// when passed to the three functions below.
		info1 := (*vdsoInfo)(noescape(unsafe.Pointer(&info)))
		vdsoInitFromSysinfoEhdr(info1, (*elfEhdr)(unsafe.Pointer(val)))
		vdsoParseSymbols(info1, vdsoFindVersion(info1, &vdsoLinuxVersion))
	}
}

Does a system call occur when time.Sleep() is called in Go?

Further reading

  • Creating a vDSO: the Colonel’s Other Chicken
  • man: VDSO
  • stackexchange: Are system calls the only way to interact with the Linux kernel from user land?
  • Sysenter Based System Call Mechanism in Linux 2.6
  • Linux syscall, vsyscall, and vDSO… Oh My!
  • About ELF Auxiliary Vectors
  • Atp’s external memory: linux syscalls on x86_64