Preface


One morning I, Sunskey, was sipping coffee and listening to a tune (not really: I was grinding away at code) when my team lead hurried over. Our encryption tool had a security risk, so we needed to remove its symbol table to harden it. My first thought was, “What is a symbol table?” but I said “yes” without hesitation. After some quick reading on the subject, I had a basic understanding of symbol tables.

1. Symbols and symbol tables


1.1 What is a symbol table

When a program is compiled into an executable file, the file contains a special table that maps function names, variable names, and section names to the corresponding code or data. This table, generated during compilation, is called the symbol table.

ELF files typically have two symbol tables: the symbol table (.symtab) and the dynamic symbol table (.dynsym). Generally only .symtab is removed, because the dynamic linker still needs .dynsym at run time to resolve symbols between the program and its shared libraries.

ELF (Executable and Linkable Format) is the file format currently used on Linux and Android for executables, shared libraries (.so), object files (.o), and core dumps.

Symbol table    Description
.symtab         Contains all symbols, including the global symbols
.dynsym         Contains only the global symbols from .symtab that dynamic linking needs
1.2 Functions of symbol tables

1. Collect symbol attributes

The compiler scans the declarations to collect the attributes of each identifier and records them in that symbol's entry in the symbol table.

For example, given int A; float B[5]; the symbol A is an integer variable and the symbol B is a floating-point array.

2. Check the validity of context semantics

The same identifier may appear in several places, so its attributes must be checked for consistency and validity in each context.

For example, int A[3,5]; float A(3,5); declares the identifier A twice with different attributes, which is clearly a semantic conflict.

3. Serve as the basis for address allocation when generating object code

A variable's storage class and type, together with the order in which it is declared, determine which storage region it is placed in.

For example, common storage classes are extern, external (file-scope) static, function static, auto, etc.
1.3 Types of symbols
  • Global symbols: symbols defined by the current module that can be referenced by other modules.

  • External symbols (externally defined global symbols): symbols defined by other modules and referenced by the current module.

  • Local symbols: symbols defined and referenced only by the current module.

2. How to remove the symbol table


2.1 strip command

The strip command is commonly used to remove the symbol table and debugging information from object files, static libraries (.a), shared libraries (.so), and executables in order to reduce their size.

Usage: strip <file name>

Run ls -l before and after stripping to compare the file size.

The file command shows whether a file has been stripped.

The nm command lists the symbol table of a file.

Strip command usage: www.linuxidc.com/Linux/2011-…

2.2 Use the -s and -S options at link time

-s: removes all symbol information

-S: removes only debugging symbol information

You can also pass these options to the linker directly from GCC by using -Wl,-s and -Wl,-S.

3. Summary


The symbol table is mainly useful during compilation, linking, and debugging. Removing it reduces the size of the program and makes reverse engineering somewhat harder, which improves security to a degree. The cost is that a stripped program is very hard to debug, because the debugger can no longer map addresses back to names. Choose whether to remove the symbol table according to your actual situation.
