Hello everyone, I’m Zhang Jintao.

Binaries built with Go contain a lot of useful information by default. For example, you can find out which Go version was used for the build:

(I’m using KIND, an open source project I’ve been working on, as an example.)

➜ kind git:(master) go version ./bin/kind
./bin/kind: go1.16

You can also get information about the modules the binary depends on:

➜ kind git:(master) go version -m ./bin/kind
./bin/kind: go1.16
	path	sigs.k8s.io/kind
	mod	sigs.k8s.io/kind	(devel)
	dep	github.com/BurntSushi/toml	v0.3.1
	dep	github.com/alessio/shellescape	v1.4.1
	dep	github.com/evanphx/json-patch/v5	v5.2.0
	dep	github.com/mattn/go-isatty	v0.0.12
	dep	github.com/pelletier/go-toml	v1.8.1	h1:1Nf83orprkJyknT6h7zbuEGUEjcyVlCxSUGTENmNCRM=
	dep	github.com/pkg/errors	v0.9.1
	dep	github.com/spf13/cobra	v1.1.1
	dep	github.com/spf13/pflag	v1.0.5
	dep	golang.org/x/sys	v0.0.0-20210124154548-22da62e12c0c	h1:VwygUrnw9jn88c4u8GD3rZQbqrP/tgas88tPUbBxQrk=
	dep	gopkg.in/yaml.v2	v2.2.8
	dep	gopkg.in/yaml.v3	v3.0.0-20210107192922-496545a6307b	h1:h8qDotaEPuJATrMmW04NCwg7v22aHH28wwpauUhK9Oo=
	dep	k8s.io/apimachinery	v0.20.2
	dep	sigs.k8s.io/yaml	v1.2.0

Compare this with the go.mod file in the KIND repository: every dependency listed there is included.

Embedding extra information in binary files on Linux is not unique to Go. Below I explain the underlying mechanism and how Go implements it. Binaries built with Go remain the focus of this article.
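The same data can also be read programmatically. The following is a minimal sketch, assuming Go 1.18 or later, where the standard library provides the debug/buildinfo package (it is not part of the Go 1.16 toolchain used in this article). The formatDep helper is a name made up for this example, and the printed layout only approximates what go version -m shows.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// formatDep is a hypothetical helper that renders one dependency as a
// tab-separated line, similar in spirit to the `go version -m` output.
func formatDep(path, version, sum string) string {
	line := "dep\t" + path + "\t" + version
	if sum != "" {
		line += "\t" + sum
	}
	return line
}

func main() {
	// Inspect the file named on the command line, or this program itself.
	name, err := os.Executable()
	if len(os.Args) > 1 {
		name, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	info, err := buildinfo.ReadFile(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s: %s\n", name, info.GoVersion)
	fmt.Printf("path\t%s\n", info.Path)
	for _, dep := range info.Deps {
		fmt.Println(formatDep(dep.Path, dep.Version, dep.Sum))
	}
}
```

Running it with ./bin/kind as the argument should list the same modules as the command above, although the exact fields available depend on the Go version that built the binary.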

Linux ELF format

ELF stands for Executable and Linkable Format. It is a standard file format for executable files, object files, shared libraries, and core dumps. ELF files are usually produced by compilers and linkers and are in binary format. For an executable compiled by Go, the file command reports the specific type, ELF 64-bit LSB executable:

➜ kind git:(master) file ./bin/kind
./bin/kind: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

In this article, we take a closer look at the structure of the ELF file format used by 64-bit executables and its definition in the Linux kernel source code.

An executable file that uses the ELF file format starts with an ELF Header followed by a Program Header or a Section Header, or both.

ELF header

The ELF header is always at the zero offset of the file, and the offset of the program header and section header is defined in the ELF header.

We can view the ELF header of the executable with the readelf command, as follows:

  kind git:(master)  readelf -h ./bin/kind 
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x46c460
  Start of program headers:          64 (bytes into file)
  Start of section headers:          400 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         6
  Size of section headers:           64 (bytes)
  Number of section headers:         15
  Section header string table index: 3

As you can see from the output above, the ELF header starts with a Magic field that identifies the file. Its first four bytes, 7f 45 4c 46, indicate that this is an ELF file. Converting the last three to their ASCII counterparts gives:

45 = E

4c = L

46 = F

7f is the prefix byte. The exact definitions can be found in the Linux kernel source:

// include/uapi/linux/elf.h#L340-L343
#define	ELFMAG0		0x7f		/* EI_MAG */
#define	ELFMAG1		'E'
#define	ELFMAG2		'L'
#define	ELFMAG3		'F'

The next byte, 02, corresponds to the Class field and indicates whether the file is 32-bit (01) or 64-bit (02). Here it is 02, meaning 64-bit, which readelf displays as ELF64. These values can also be found in the Linux kernel source:

// include/uapi/linux/elf.h#L347-L349
#define	ELFCLASSNONE	0		/* EI_CLASS */
#define	ELFCLASS32	1
#define	ELFCLASS64	2

The Data field has two valid values, LSB (01) and MSB (02), for little-endian and big-endian byte order; it needs no further explanation. Version has only one valid value, 01.

// include/uapi/linux/elf.h#L352-L358
#define ELFDATANONE	0		/* e_ident[EI_DATA] */
#define ELFDATA2LSB	1
#define ELFDATA2MSB	2

#define EV_NONE		0		/* e_version, EI_VERSION */
#define EV_CURRENT	1
#define EV_NUM		2
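To tie these constants together, here is a minimal Go sketch that decodes the first seven bytes of e_ident by hand. The function name describeIdent and the strings it returns are assumptions made for this example; they are not taken from the kernel or from readelf.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// elfMagic is the four-byte signature defined by ELFMAG0..ELFMAG3.
var elfMagic = []byte{0x7f, 'E', 'L', 'F'}

// describeIdent decodes bytes 0-6 of e_ident: magic, class, data encoding
// and version. The returned strings are labels chosen for this example.
func describeIdent(ident []byte) (class, data string, version byte, err error) {
	if len(ident) < 7 || !bytes.Equal(ident[:4], elfMagic) {
		return "", "", 0, errors.New("not an ELF file")
	}
	switch ident[4] { // EI_CLASS
	case 1:
		class = "ELF32"
	case 2:
		class = "ELF64"
	default:
		class = "none"
	}
	switch ident[5] { // EI_DATA
	case 1:
		data = "little endian"
	case 2:
		data = "big endian"
	default:
		data = "none"
	}
	return class, data, ident[6], nil // ident[6] is EI_VERSION
}

func main() {
	// The Magic line printed by readelf -h for ./bin/kind.
	ident := []byte{0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00, 0, 0, 0, 0, 0, 0, 0, 0}
	class, data, version, err := describeIdent(ident)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(class, data, version)
}
```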


The next thing to notice is the offsets I mentioned earlier, which appear in this part of the output:

  Start of program headers:          64 (bytes into file)
  Start of section headers:          400 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         6
  Size of section headers:           64 (bytes)
  Number of section headers:         15

The ELF header is always at the beginning of the file. In this binary, the program header table comes next and is followed by the section header table. The output shows that the program header table starts at offset 64 and holds 6 entries of 56 bytes each, so the section header table starts at:

64 + 56 * 6 = 400

This agrees with the "Start of section headers" value in the output. In the same way, the section header table (15 entries of 64 bytes each) ends at:

400 + 15 * 64 = 1360
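The arithmetic can be written as a small Go sketch. The helper name tableEnd is made up for this example, and the calculation relies on the two tables sitting back to back as they do in this binary; in general the start of the section header table should be read from the ELF header rather than derived.

```go
package main

import "fmt"

// tableEnd returns the file offset just past a header table that starts at
// start and holds count entries of entSize bytes each.
func tableEnd(start, entSize, count uint64) uint64 {
	return start + entSize*count
}

func main() {
	// Program header table: starts at 64, 6 entries of 56 bytes.
	phEnd := tableEnd(64, 56, 6)
	// Section header table: starts where the program headers end in this
	// binary, 15 entries of 64 bytes.
	shEnd := tableEnd(phEnd, 64, 15)
	fmt.Println(phEnd, shEnd)
}
```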

You’ll use this knowledge in the next section.

Program header

As readelf -l shows, the program header table describes several segments. When the kernel loads the binary, it maps these segments into the virtual address space with mmap. This part is not the focus of this article, so we will only skim it to get an impression.

➜ kind git:(master) readelf -l ./bin/kind

Elf file type is EXEC (Executable file)
Entry point 0x46c460
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x0000000000000150 0x0000000000000150  R      0x1000
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000333a75 0x0000000000333a75  R E    0x1000
  LOAD           0x0000000000334000 0x0000000000734000 0x0000000000734000
                 0x00000000002b3be8 0x00000000002b3be8  R      0x1000
  LOAD           0x00000000005e8000 0x00000000009e8000 0x00000000009e8000
                 0x0000000000020ac0 0x00000000000552d0  RW     0x1000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x8
  LOOS+0x5041580 0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0x8

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .text 
   02     .rodata .typelink .itablink .gosymtab .gopclntab 
   03     .go.buildinfo .noptrdata .data .bss .noptrbss 
   04     
   05     

Section header

Use readelf -S to view the section headers. Each entry is structured as follows:

// include/uapi/linux/elf.h#L317-L328
typedef struct elf64_shdr {
  Elf64_Word sh_name;		/* Section name, index in string tbl */
  Elf64_Word sh_type;		/* Type of section */
  Elf64_Xword sh_flags;		/* Miscellaneous section attributes */
  Elf64_Addr sh_addr;		/* Section virtual addr at execution */
  Elf64_Off sh_offset;		/* Section file offset */
  Elf64_Xword sh_size;		/* Size of section in bytes */
  Elf64_Word sh_link;		/* Index of another section */
  Elf64_Word sh_info;		/* Additional section information */
  Elf64_Xword sh_addralign;	/* Section alignment */
  Elf64_Xword sh_entsize;	/* Entry size if section holds table */
} Elf64_Shdr;
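As an illustration of this layout, the sketch below mirrors the struct in Go and decodes one 64-byte entry with encoding/binary. The type and function names (shdr64, decodeShdr64) are invented for this example, the byte order is assumed to be little-endian as in this binary, and the sample values match the .text section of ./bin/kind.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// shdr64 mirrors the field order and widths of struct elf64_shdr.
type shdr64 struct {
	Name      uint32
	Type      uint32
	Flags     uint64
	Addr      uint64
	Offset    uint64
	Size      uint64
	Link      uint32
	Info      uint32
	Addralign uint64
	Entsize   uint64
}

// decodeShdr64 reads one little-endian section header entry from raw.
func decodeShdr64(raw []byte) (shdr64, error) {
	var h shdr64
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &h)
	return h, err
}

func main() {
	// A hand-built 64-byte entry; only a few fields are filled in.
	raw := make([]byte, 64)
	binary.LittleEndian.PutUint32(raw[4:], 1)         // sh_type = 1 (SHT_PROGBITS)
	binary.LittleEndian.PutUint64(raw[16:], 0x401000) // sh_addr
	binary.LittleEndian.PutUint64(raw[24:], 0x1000)   // sh_offset
	binary.LittleEndian.PutUint64(raw[32:], 0x332a75) // sh_size

	h, err := decodeShdr64(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("type=%d addr=%#x offset=%#x size=%#x\n", h.Type, h.Addr, h.Offset, h.Size)
}
```

The ten fields add up to 64 bytes, which is the "Size of section headers" value reported in the ELF header.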

The meaning of each field becomes clear when you look at the actual command output.

➜ kind git:(master) readelf -S ./bin/kind
There are 15 section headers, starting at offset 0x190:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000401000  00001000
       0000000000332a75  0000000000000000  AX       0     0     32
  [ 2] .rodata           PROGBITS         0000000000734000  00334000
       000000000011f157  0000000000000000   A       0     0     32
  [ 3] .shstrtab         STRTAB           0000000000000000  00453160
       00000000000000a4  0000000000000000           0     0     1
  [ 4] .typelink         PROGBITS         0000000000853220  00453220
       00000000000022a0  0000000000000000   A       0     0     32
  [ 5] .itablink         PROGBITS         00000000008554c0  004554c0
       0000000000000978  0000000000000000   A       0     0     32
  [ 6] .gosymtab         PROGBITS         0000000000855e38  00455e38
       0000000000000000  0000000000000000   A       0     0     1
  [ 7] .gopclntab        PROGBITS         0000000000855e40  00455e40
       0000000000191da8  0000000000000000   A       0     0     32
  [ 8] .go.buildinfo     PROGBITS         00000000009e8000  005e8000
       0000000000000020  0000000000000000  WA       0     0     16
  [ 9] .noptrdata        PROGBITS         00000000009e8020  005e8020
       0000000000017240  0000000000000000  WA       0     0     32
  [10] .data             PROGBITS         00000000009ff260  005ff260
       0000000000009850  0000000000000000  WA       0     0     32
  [11] .bss              NOBITS           0000000000a08ac0  00608ac0
       000000000002f170  0000000000000000  WA       0     0     32
  [12] .noptrbss         NOBITS           0000000000a37c40  00637c40
       0000000000005690  0000000000000000  WA       0     0     32
  [13] .symtab           SYMTAB           0000000000000000  00609000
       0000000000030a20  0000000000000018          14   208     8
  [14] .strtab           STRTAB           0000000000000000  00639a20
       000000000004178d  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Exploring the Go binary

In this article, we focus on a section called .go.buildinfo. Use objdump to see what it contains:

➜ kind git:(master) objdump -s -j .go.buildinfo ./bin/kind

./bin/kind:     file format elf64-x86-64

Contents of section .go.buildinfo:
 9e8000 ff20476f 20627569 6c64696e 663a0800  . Go buildinf:..
 9e8010 a0fc9f00 00000000 e0fc9f00 00000000  ................

Let's go through it in order, starting with the first 16 bytes.

  • The first 14 bytes are magic bytes and must be \xff Go buildinf:
  • Byte 15 indicates the pointer size. Here the value is 0x08, that is, 8 bytes.
  • Byte 16 indicates the byte order: non-zero means big-endian, and 0 means little-endian.
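The three checks in this list can be sketched in Go as follows. This is an illustration of the layout described here (the format used by Go 1.16), not the toolchain's own code, and parseHeader is a name made up for the example.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// buildInfoMagic is the 14-byte signature at the start of .go.buildinfo.
var buildInfoMagic = []byte("\xff Go buildinf:")

// parseHeader checks the 14 magic bytes and returns the pointer size and
// byte order flag stored in bytes 15 and 16 of the section.
func parseHeader(data []byte) (ptrSize int, bigEndian bool, err error) {
	if len(data) < 16 || !bytes.HasPrefix(data, buildInfoMagic) {
		return 0, false, errors.New("not a .go.buildinfo header")
	}
	return int(data[14]), data[15] != 0, nil
}

func main() {
	// First row of the objdump output for the .go.buildinfo section.
	data := []byte{
		0xff, 0x20, 0x47, 0x6f, 0x20, 0x62, 0x75, 0x69,
		0x6c, 0x64, 0x69, 0x6e, 0x66, 0x3a, 0x08, 0x00,
	}
	ptrSize, bigEndian, err := parseHeader(data)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ptrSize, bigEndian)
}
```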

Let’s move on to byte 17.

Go Version Information

Bytes 17 to 24 hold a pointer. As we saw above, the byte order is little-endian, so the bytes a0 fc 9f 00 00 00 00 00 read as the address 0x009ffca0.

Let’s take out 16 bytes of content:

➜ kind git:(master) objdump -s --start-address 0x009ffca0 --stop-address 0x009ffcb0 ./bin/kind

./bin/kind:     file format elf64-x86-64

Contents of section .data:
 9ffca0 f5027d00 00000000 06000000 00000000  ..}.............

Here, the first 8 bytes are a pointer to the Go version string (0x007d02f5), and the last 8 bytes are the length of that string (6 bytes here). Reading 6 bytes from that address gives:
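Decoding those 16 bytes can be sketched in Go with encoding/binary. The helper name stringHeader is an assumption for this example, and the code hard-codes the little-endian, 8-byte-pointer case seen in this binary.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// stringHeader decodes a 16-byte string header on a little-endian 64-bit
// target: an 8-byte data pointer followed by an 8-byte length.
// It panics if hdr is shorter than 16 bytes.
func stringHeader(hdr []byte) (addr, length uint64) {
	return binary.LittleEndian.Uint64(hdr[:8]), binary.LittleEndian.Uint64(hdr[8:16])
}

func main() {
	// The 16 bytes dumped from address 0x009ffca0.
	hdr := []byte{0xf5, 0x02, 0x7d, 0x00, 0, 0, 0, 0, 0x06, 0, 0, 0, 0, 0, 0, 0}
	addr, length := stringHeader(hdr)
	// The start and stop addresses to pass to objdump next.
	fmt.Printf("start=%#x stop=%#x\n", addr, addr+length)
}
```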

➜ kind git:(master) objdump -s --start-address 0x007d02f5 --stop-address 0x007d02fb ./bin/kind

./bin/kind:     file format elf64-x86-64

Contents of section .rodata:
 7d02f5 676f31 2e3136                        go1.16

So, as shown above, we have recovered the version of Go used to build this binary: Go 1.16.

Go module information

Earlier we used bytes 17 to 24 of the section. This time we use the next 8 bytes, 25 to 32, which hold a second pointer: e0 fc 9f 00 00 00 00 00, that is, 0x009ffce0.

➜ kind git:(master) objdump -s --start-address 0x009ffce0 --stop-address 0x009ffcf0 ./bin/kind

./bin/kind:     file format elf64-x86-64

Contents of section .data:
 9ffce0 5a567e00 00000000 e6020000 00000000  ZV~.............

As with the Go version information, the first 8 bytes are a pointer and the last 8 bytes are the size. The data starts at 0x007e565a and is 0x000002e6 bytes long, so it ends at 0x007e5940:

➜ kind git:(master) objdump -s --start-address 0x007e565a --stop-address 0x7e5940 ./bin/kind

./bin/kind:     file format elf64-x86-64

Contents of section .rodata:
 7e565a 3077 af0c9274 080241e1 c107e6d6 18e6  0w...t..A.......
 7e566a 7061 74680973 6967732e 6b38732e 696f  path.sigs.k8s.io
 7e567a 2f6b 696e640a 6d6f6409 73696773 2e6b  /kind.mod.sigs.k
 7e568a 3873 2e696f2f 6b696e64 09286465 7665  8s.io/kind.(deve
 7e569a 6c29 090a6465 70096769 74687562 2e63  l)..dep.github.c
 7e56aa 6f6d 2f427572 6e745375 7368692f 746f  om/BurntSushi/to
 7e56ba 6d6c 0976302e 332e3109 0a646570 0967  ml.v0.3.1..dep.g
 7e56ca 6974 6875622e 636f6d2f 616c6573 7369  ithub.com/alessi
 7e56da 6f2f 7368656c 6c657363 61706509 7631  o/shellescape.v1
 7e56ea 2e34 2e31090a 64657009 67697468 7562  .4.1..dep.github
 7e56fa 2e63 6f6d2f65 76616e70 68782f6a 736f  .com/evanphx/jso
 7e570a 6e2d 70617463 682f7635 0976352e 322e  n-patch/v5.v5.2.
 7e571a 3009 0a646570 09676974 6875622e 636f  0..dep.github.co
 7e572a 6d2f 6d617474 6e2f676f 2d697361 7474  m/mattn/go-isatt
 7e573a 7909 76302e30 2e313209 0a646570 0967  y.v0.0.12..dep.g
 7e574a 6974 6875622e 636f6d2f 70656c6c 6574  ithub.com/pellet
 7e575a 6965 722f676f 2d746f6d 6c097631 2e38  ier/go-toml.v1.8
 7e576a 2e31 0968313a 314e6638 336f7270 726b  .1.h1:1Nf83orprk
 7e577a 4a79 6b6e5436 68377a62 75454755 456a  JyknT6h7zbuEGUEj
 7e578a 6379 566c4378 53554754 454e6d4e 4352  cyVlCxSUGTENmNCR
 7e579a 4d3d 0a646570 09676974 6875622e 636f  M=.dep.github.co
 7e57aa 6d2f 706b672f 6572726f 72730976 302e  m/pkg/errors.v0.
 7e57ba 392e 31090a64 65700967 69746875 622e  9.1..dep.github.
 7e57ca 636f 6d2f7370 6631332f 636f6272 6109  com/spf13/cobra.
 7e57da 7631 2e312e31 090a6465 70096769 7468  v1.1.1..dep.gith
 7e57ea 7562 2e636f6d 2f737066 31332f70 666c  ub.com/spf13/pfl
 7e57fa 6167 0976312e 302e3509 0a646570 0967  ag.v1.0.5..dep.g
 7e580a 6f6c 616e672e 6f72672f 782f7379 7309  olang.org/x/sys.
 7e581a 7630 2e302e30 2d323032 31303132 3431  v0.0.0-202101241
 7e582a 3534 3534382d 32326461 36326531 3263  54548-22da62e12c
 7e583a 3063 0968313a 56777967 55726e77 396a  0c.h1:VwygUrnw9j
 7e584a 6e38 38633475 38474433 725a5162 7172  n88c4u8GD3rZQbqr
 7e585a 502f 74676173 38387450 55624278 5172  P/tgas88tPUbBxQr
 7e586a 6b3d 0a646570 09676f70 6b672e69 6e2f  k=.dep.gopkg.in/
 7e587a 7961 6d6c2e76 32097632 2e322e38 090a  yaml.v2.v2.2.8..
 7e588a 6465 7009676f 706b672e 696e2f79 616d  dep.gopkg.in/yam
 7e589a 6c2e 76330976 332e302e 302d3230 3231  l.v3.v3.0.0-2021
 7e58aa 3031 30373139 32393232 2d343936 3534  0107192922-49654
 7e58ba 3561 36333037 62096831 3a683871 446f  5a6307b.h1:h8qDo
 7e58ca 7461 4550754a 4154724d 6d573034 4e43  taEPuJATrMmW04NC
 7e58da 7767 37763232 61484832 38777770 6175  wg7v22aHH28wwpau
 7e58ea 5568 4b394f6f 3d0a6465 70096b38 732e  UhK9Oo=.dep.k8s.
 7e58fa 696f 2f617069 6d616368 696e6572 7909  io/apimachinery.
 7e590a 7630 2e32302e 32090a64 65700973 6967  v0.20.2..dep.sig
 7e591a 732e 6b38732e 696f2f79 616d6c09 7631  s.k8s.io/yaml.v1
 7e592a 2e32 2e30090a f9324331 86182072 0082  .2.0...2C1.. r..
 7e593a 4210 4116d8f2                         B.A...

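We have recovered the modules the binary depends on. The content matches the output of go version -m ./bin/kind at the beginning of the article, except that here it is in serialized form: tab-separated fields, one record per line, with 16 bytes of framing before and after.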
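Turning that serialized blob back into a list of dependencies can be sketched as follows. The dep type and parseModInfo function are names invented for this example; the 16-byte framing and the tab and newline separators are taken from the dump above, and the framing bytes in main are placeholders rather than the real binary values.

```go
package main

import (
	"fmt"
	"strings"
)

// dep holds the fields of one "dep" line of the module data.
type dep struct {
	Path, Version, Sum string
}

// parseModInfo strips the 16-byte framing on both ends of the blob and
// collects the tab-separated "dep" lines. Other lines (path, mod) are skipped.
func parseModInfo(blob string) []dep {
	if len(blob) < 33 {
		return nil
	}
	var deps []dep
	for _, line := range strings.Split(blob[16:len(blob)-16], "\n") {
		fields := strings.Split(line, "\t")
		if len(fields) < 3 || fields[0] != "dep" {
			continue
		}
		d := dep{Path: fields[1], Version: fields[2]}
		if len(fields) > 3 {
			d.Sum = fields[3]
		}
		deps = append(deps, d)
	}
	return deps
}

func main() {
	frame := strings.Repeat("#", 16) // placeholder for the binary framing bytes
	blob := frame +
		"path\tsigs.k8s.io/kind\n" +
		"mod\tsigs.k8s.io/kind\t(devel)\t\n" +
		"dep\tgithub.com/pkg/errors\tv0.9.1\t\n" +
		frame
	for _, d := range parseModInfo(blob) {
		fmt.Println(d.Path, d.Version)
	}
}
```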

The implementation in Go

In the previous section, I covered how to use the readelf and objdump commands to get the Go version and module information from a binary. Here is how the Go source code does the same thing.

The section name is hard-coded in the source:

//src/cmd/go/internal/version/exe.go#L106-L110
	for _, s := range x.f.Sections {
		if s.Name == ".go.buildinfo" {
			return s.Addr
		}
	}

Also, magic bytes are defined as follows:

var buildInfoMagic = []byte("\xff Go buildinf:")

The logic for reading the version and module information is as follows. Most of it was described in the previous section; pay attention to how the byte order is handled.

	ptrSize := int(data[14])
	bigEndian := data[15] != 0
	var bo binary.ByteOrder
	if bigEndian {
		bo = binary.BigEndian
	} else {
		bo = binary.LittleEndian
	}
	var readPtr func([]byte) uint64
	if ptrSize == 4 {
		readPtr = func(b []byte) uint64 { return uint64(bo.Uint32(b)) }
	} else {
		readPtr = bo.Uint64
	}
	vers = readString(x, ptrSize, readPtr, readPtr(data[16:]))
	if vers == "" {
		return
	}
	mod = readString(x, ptrSize, readPtr, readPtr(data[16+ptrSize:]))
	if len(mod) >= 33 && mod[len(mod)-17] == '\n' {
		// Strip module framing.
		mod = mod[16 : len(mod)-16]
	} else {
		mod = ""
	}

Conclusion

In this article, I showed how to find out, from a Go binary, which version of Go built it and which modules it depends on. If you are not interested in the internals, you can get the same information directly by running go version -m on the binary.

The approach relies on the ELF file format and on basic use of the readelf and objdump tools. The ELF format is useful in many scenarios beyond the ones described here, such as reverse engineering for security purposes.

You might also be wondering what this information is good for. Most directly, it can be used for security scanning, such as checking whether a binary's dependencies have known vulnerabilities. It is also useful for analyzing dependencies, mainly when the source code is not available.


Please feel free to subscribe to my official account [MoeLove]