1. Space and address allocation

  • Space and address allocation: the linker scans all input object files, merges sections of the same kind, and collects symbol definitions and references.
  • Symbol resolution and relocation

The sample code

/* a.c */
extern int shared;
extern void swap(int* a, int* b);

int main(void)
{
    int a = 100;
    swap(&a, &shared);
}

/* b.c */
int shared = 1;

void swap(int* a, int* b)
{
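    /* XOR swap, one step per statement. The chained form
       "*a ^= *b ^= *a ^= *b" modifies *a twice without sequencing,
       which is undefined behavior in C. */
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;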
}

Before and after the link

a.o:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000039  00000000  00000000  00000034  2**0
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data            00000000  00000000  00000000  0000006d  2**0
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000000  00000000  00000000  0000006d  2**0
                     ALLOC
  3 .comment         00000036  00000000  00000000  0000006d  2**0
                     CONTENTS, READONLY
  4 .note.GNU-stack  00000000  00000000  00000000  000000a3  2**0
                     CONTENTS, READONLY
  5 .eh_frame        00000044  00000000  00000000  000000a4  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

b.o:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000039  00000000  00000000  00000034  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data            00000004  00000000  00000000  00000070  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000000  00000000  00000000  00000074  2**0
                     ALLOC
  3 .comment         00000036  00000000  00000000  00000074  2**0
                     CONTENTS, READONLY
  4 .note.GNU-stack  00000000  00000000  00000000  000000aa  2**0
                     CONTENTS, READONLY
  5 .eh_frame        00000038  00000000  00000000  000000ac  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

ab:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000072  08048094  08048094  00000094  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .eh_frame        00000064  08048108  08048108  00000108  2**2
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data            00000004  0804916c  0804916c  0000016c  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  3 .comment         00000035  00000000  00000000  00000170  2**0
                     CONTENTS, READONLY

After linking, the VMA column holds each section's virtual address in the process address space. The executable also allocates space for the .bss section: static_uninit_var is at address 0x08049170, immediately after the .data section.

SYMBOL TABLE:
08048094 l    d  .text      00000000 .text
08048108 l    d  .eh_frame  00000000 .eh_frame
0804916c l    d  .data      00000000 .data
08049170 l    d  .bss       00000000 .bss
00000000 l    d  .comment   00000000 .comment
00000000 l    df *ABS*      00000000 a.c
08049170 l     O .bss       00000004 static_uninit_var.1485
00000000 l    df *ABS*      00000000 b.c
080480cd g     F .text      00000039 swap
0804916c g     O .data      00000004 shared
08049170 g       .bss       00000000 __bss_start
08049174 g     O .bss       00000004 global_uninit_var
08048094 g     F .text      00000039 main
08049170 g       .data      00000000 _edata
08049178 g       .bss       00000000 _end

Symbol resolution and relocation

Symbol table '.symtab' contains 16 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     9: 080480cd    57 FUNC    GLOBAL DEFAULT    1 swap
    10: 0804916c     4 OBJECT  GLOBAL DEFAULT    3 shared

This is the symbol table of the executable. Once the sections have been merged and each one has been assigned a virtual address, the linker computes every symbol's virtual address from the symbol's offset within its section in the object file.

Relocation table for object file a.o
a.o:     file format elf32-i386

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
0000001c R_386_32          shared
00000025 R_386_PC32        swap

The relocation table records, for each symbol that needs relocating, the offset within the object file's section where the reference sits. For example, the reference to shared is at offset 0x0000001c in the .text section of a.o. Since .text is placed at virtual address 0x08048094, that reference ends up at 0x08048094 + 0x1c = 0x080480b0.

Disassembly of section .text:

08048094 <main>:
 ...
 80480af:  68 6c 91 04 08    push   $0x804916c
 ...

The 4 bytes starting at 0x080480b0, which hold the reference to shared, have been patched with the correct address 0x0804916c.

If a global symbol is referenced more than once in a compilation unit, each reference gets its own entry in the relocation table.

a.o:     file format elf64-x86-64

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE 
0000000000000018 R_X86_64_PC32     shared-0x0000000000000004  # shared is referenced more than once, so it has several relocation entries
0000000000000049 R_X86_64_PC32     shared-0x0000000000000004
0000000000000051 R_X86_64_PLT32    swap-0x0000000000000004
000000000000006a R_X86_64_PLT32    __stack_chk_fail-0x0000000000000004

The COMMON block

An uninitialized global variable is a weak symbol in the object file. It is not allocated space in the .bss section there; instead it is marked COMMON. Because it is global, it may also be defined in another compilation unit, possibly with a type of a larger size, so its size can only be settled at final link time, when space is allocated for it in the .bss section of the output file.

  • If a weak symbol is larger than the strong symbol of the same name, the linker issues a warning.
  • An attribute (in GCC, __attribute__((nocommon))) can be added to force an uninitialized global variable not to be marked COMMON, which makes it equivalent to a strong symbol.