Mach-O

Mach-O is short for Mach Object. It is the standard file format for programs and libraries on macOS and iOS. The following file types belong to the Mach-O format (see the XNU source code for the detailed definitions):

  • MH_OBJECT

    • Object file (.o)

    • Static library file (.a); a static library is really just a collection of multiple .o files

  • MH_EXECUTE: indicates an executable file

  • MH_DYLIB: dynamic libraries (.dylib, .framework)

  • MH_DYLINKER: the dynamic link editor (/usr/lib/dyld)

  • MH_DSYM: a file that stores binary symbol information (.dSYM/Contents/Resources/DWARF/xx), often used to analyze app crashes
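
As a rough illustration, the file types listed above correspond to numeric values of the filetype field in the Mach-O header. The sketch below maps those values back to their names. The numeric values are the ones defined in <mach-o/loader.h>; the SK_ prefix and the function name macho_filetype_name are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Local copies of the filetype constants from <mach-o/loader.h>.
   The SK_ prefix is hypothetical and only avoids clashing with the real header. */
#define SK_MH_OBJECT   0x1u
#define SK_MH_EXECUTE  0x2u
#define SK_MH_DYLIB    0x6u
#define SK_MH_DYLINKER 0x7u
#define SK_MH_DSYM     0xau

/* Map the filetype field of a Mach-O header to the name used in the list above.
   File types that the list does not cover are reported as "other". */
const char *macho_filetype_name(uint32_t filetype) {
    switch (filetype) {
    case SK_MH_OBJECT:   return "MH_OBJECT";
    case SK_MH_EXECUTE:  return "MH_EXECUTE";
    case SK_MH_DYLIB:    return "MH_DYLIB";
    case SK_MH_DYLINKER: return "MH_DYLINKER";
    case SK_MH_DSYM:     return "MH_DSYM";
    default:             return "other";
    }
}
```

For example, calling macho_filetype_name(0x2) yields the string MH_EXECUTE, which is the type you would expect for an app's main binary.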

You can also check a target's Mach-O type in Xcode

The basic structure of Mach-O

The official description

A Mach-O file contains three main areas:

  • Header: file type, target architecture, etc.
  • Load Commands: describe the logical structure of the file and its layout in virtual memory
  • Raw Segment Data: the raw data of the segments defined in the Load Commands

Viewing the structure of Mach-O

  • Command line tools: file, otool, lipo -info
  • GUI tools: MachOView

Use the objdump --macho -p [file path] command to view the Mach header

You can also use the otool -h [file path] command to view the header, but some of its fields are printed as raw numbers and are harder to read

magic number

The magic value printed by otool can be looked up in the following table:

Magic             FEEDFACE    CEFAEDFE    FEEDFACF       CFFAEDFE
IsLittleEndian    YES         NO          YES            NO
Is64Bits          NO          NO          YES            YES
MachO Type        MH_MAGIC    MH_CIGAM    MH_MAGIC_64    MH_CIGAM_64
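
The table can be turned into a small lookup. The following is a minimal sketch, assuming the first four bytes of the file have already been read into a 32-bit integer in the host's byte order. The four values are the ones from the table; the SK_ names, the magic_info struct and classify_magic are hypothetical. Instead of an IsLittleEndian answer, the sketch reports whether the file's byte order differs from the reader's, which is what the CIGAM variants signal (the table's YES/NO row appears to assume a little-endian host).

```c
#include <stdbool.h>
#include <stdint.h>

/* Magic values as defined in <mach-o/loader.h>. The SK_ prefix is a
   hypothetical local name so the sketch compiles without Apple's headers. */
#define SK_MH_MAGIC    0xfeedfaceu
#define SK_MH_CIGAM    0xcefaedfeu
#define SK_MH_MAGIC_64 0xfeedfacfu
#define SK_MH_CIGAM_64 0xcffaedfeu

typedef struct {
    bool is_macho;      /* one of the four magic values matched */
    bool is_64bit;      /* the header is the 64-bit variant */
    bool byte_swapped;  /* the file's byte order differs from the reader's */
} magic_info;

/* Classify the first 32-bit word of a file, read in the host's byte order. */
magic_info classify_magic(uint32_t magic) {
    magic_info info = { false, false, false };
    switch (magic) {
    case SK_MH_MAGIC:
        info.is_macho = true;
        break;
    case SK_MH_CIGAM:
        info.is_macho = true;
        info.byte_swapped = true;
        break;
    case SK_MH_MAGIC_64:
        info.is_macho = true;
        info.is_64bit = true;
        break;
    case SK_MH_CIGAM_64:
        info.is_macho = true;
        info.is_64bit = true;
        info.byte_swapped = true;
        break;
    default:
        break;  /* not a (thin) Mach-O magic value */
    }
    return info;
}
```

Fat (universal) binaries start with a different magic value and are deliberately not handled here.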

Viewing __TEXT

Run the objdump -d [file path] command to view code segments

Tips: To view the objdump command and its parameters, run the man objdump command

The compile-and-link process does the following:

  • Code assembly: the code is converted into assembly instructions
  • Symbol classification: for example, external symbols such as NSLog are placed in the relocation symbol table
  • Multiple object files are merged, their symbol tables are merged into one table, and finally the executable file is generated
  • Linking is therefore essentially the process of resolving the symbols of the object files

You can also view the relocation symbol table by running objdump --reloc [file path]

The comparative information is classified as follows:

symbol

Global symbol, local symbol

Define some variables as follows

Use the command objdump --macho --syms [file path] to view the symbol table

All of the variables defined above fall into one of two categories: local symbols and global symbols. A variable declared with static is a local symbol; the others are global symbols. The difference between a local symbol and a global symbol is essentially the symbol's visibility. If you add __attribute__((visibility("hidden"))) to the default_x variable, it becomes a local symbol

The visibility attribute controls whether a file's symbols are exported:

  • A symbol defined with default visibility is exported
  • A symbol defined with hidden visibility is not exported
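
To make the three cases concrete, here is a minimal sketch of variable definitions of the kind discussed above. Only the name default_x comes from the text; hidden_y, static_z and visibility_sum are hypothetical names chosen for this example, and the initial values are arbitrary.

```c
/* Global symbol: external linkage with default visibility, so it can be exported. */
int default_x = 10;

/* Hidden visibility: other files linked into the same image can still use it,
   but it is not exported from the final binary. */
__attribute__((visibility("hidden"))) int hidden_y = 20;

/* static: internal linkage, a local symbol visible only inside this file. */
static int static_z = 30;

/* References all three variables. */
int visibility_sum(void) {
    return default_x + hidden_y + static_z;
}
```

Compiling a file like this and inspecting it with objdump --macho --syms should show the variables in different categories, although the exact output depends on the compiler and its optimization settings.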

Make changes to variable visibility:

Symbol table after modification:

Demonstration of global and local symbols

Create a framework and a test project, and define a method in the framework that is not declared in its header file. What happens if the test project calls that method?

The result is a successful invocation of the method in the framework

If you add the static modifier to the method, a compile-time error is reported:

If you write a method with the same name in the test project and then call it, does the call go to the method in the framework, to the method in the project, or does it produce an error?

The result is a successful call to the method in this project. The reason is namespaces.

Two-level namespace and flat namespace (twolevel_namespace & flat_namespace)

By default, the linker uses a two-level namespace: in addition to the symbol name, it records which Mach-O image the symbol belongs to, for example that _NSLog comes from Foundation.

The linker documentation on the two-level namespace:

So in the example above, the call goes directly to the method in this project without error.

Import symbol & export symbol

The NSLog function that we use so often lives in the Foundation library. For the file or project that uses it, NSLog is an imported symbol; for Foundation, it is an exported symbol. It therefore appears in the project's indirect symbol table, which you can view with objdump --macho --indirect-symbols [file path]

So global symbols are basically exported symbols. That raises a question: if you build a dynamic library, are all of its global symbols exported without any extra handling?

Objc classes are exported symbols by default

View the exported symbols with objdump --macho --exports-trie [file path]

After adding an Objc class:

As you can see, Objc classes are added to the export symbol table by default without any extra handling, but they can be made non-exported by passing some options to the linker.

OTHER_LDFLAGS=$(inherited) -Xlinker -unexported_symbol -Xlinker _OBJC_CLASS_$_TestObject

With this setting, the symbol is no longer exported

Weak symbol

Weak symbols can be divided into weak definition symbols and weak reference symbols. Each is introduced below.

Weak definition symbol

A weak definition marks the symbol's definition as weak: if the static or dynamic linker finds another, non-weak definition of the symbol, the weak definition is ignored. Only symbols in coalesced sections can be marked as weak definitions.

Let's take an example

Obviously, this produces an error at build time, the classic duplicate symbol error

But if you define this symbol as a weakly defined symbol:

This time no error is reported, and the function implementation in main is called
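
Since the original code is not shown here, the following is a minimal sketch of what a weak definition can look like. The names weak_function and call_weak_function are hypothetical. The file contains only the weak definition; the competing non-weak definition would live in another source file, as described in the comment.

```c
/* Weak definition: the linker keeps it only if no non-weak definition
   of weak_function is found among the objects being linked. */
__attribute__((weak)) int weak_function(void) {
    return 1;
}

/* A non-weak definition in another file, for example
       int weak_function(void) { return 2; }
   would replace the one above instead of causing a duplicate symbol error. */

/* Calls whichever definition the linker selected. */
int call_weak_function(void) {
    return weak_function();
}
```

When this file is linked on its own, call_weak_function uses the weak definition. Once a file with a non-weak definition is added to the link, the same call reaches that definition instead.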


Weak Reference Symbol

Weak reference symbol: indicates that this undefined symbol is a weak reference. If the dynamic linker cannot find a definition for the symbol, it sets the symbol's address to 0. The linker marks such a symbol as a weak-link symbol. In the example, the function weak_import_func is decorated with weak_import and is not defined anywhere

It is called directly from the main function

The program reports an error saying that the symbol was not found

In this case, you just need to tell the linker that this symbol is resolved dynamically: the dynamic linker looks it up at run time and, if it is not found, sets it to 0. OTHER_LDFLAGS=$(inherited) -Xlinker -U -Xlinker _weak_import_func
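
Because an unresolved weak reference ends up as address 0, calling it unconditionally would crash, so code normally tests the address first. The sketch below shows that pattern. It is an assumption-laden example: it uses the weak attribute on the declaration rather than weak_import so that it also builds with non-Apple toolchains, and call_if_available is a hypothetical helper. On Apple platforms the attribute and the -U linker flag described above still apply.

```c
#include <stddef.h>

/* Declared but defined nowhere. Marking the declaration weak turns the
   undefined symbol into a weak reference instead of a hard requirement. */
extern void weak_import_func(void) __attribute__((weak));

/* Calls weak_import_func only when a definition was found.
   Returns 1 if the call was made and 0 if the symbol resolved to address 0. */
int call_if_available(void) {
    if (weak_import_func != NULL) {
        weak_import_func();
        return 1;
    }
    return 0;
}
```

With no definition of weak_import_func available at run time, call_if_available skips the call and returns 0 rather than crashing.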

Doing so makes working with dynamic libraries more flexible. For example, an entire library can be marked as weakly referenced with a linker option

Reexport of symbols

As the previous analysis showed, the _NSLog symbol is exported by Foundation, so in our program it appears in the indirect symbol table

To re-export _NSLog, set OTHER_LDFLAGS=$(inherited) -Xlinker -alias -Xlinker _NSLog -Xlinker HD_NSLog. This gives _NSLog an alias, HD_NSLog

The linker documentation shows that you can alias not only a single symbol; you can also pass in a file to define aliases for multiple symbols.

You can now view the exported symbol table with the objdump --macho --exports-trie command

Or with the command nm -m [file path] | grep "HD"

In this way, callers can use the re-exported symbol without directly referencing the dynamic library that defines it.

Swift symbol

So far we've been looking at Objc symbols; now let's briefly look at Swift symbols

Run objdump --macho --syms ${file path} | grep 'SwiftClassSymbol'

Also take a look at the exported symbol table

Make the following changes

View the symbol table information again

The symbols that were previously global have mostly become local, which also confirms that Swift is a static language in which many things are already determined at compile time.

Supplement

When looking at symbol tables, we often see letters such as l and g. They are summarized below

By function:

Type     Description
f        File
F        Function
O        Data
d        Debug
*ABS*    Absolute
*COM*    Common
*UND*    Undefined

By type of symbol:

①: A lowercase letter represents a local symbol

Symbol   Description
U        Undefined
A        Absolute symbol
T        Text section symbol (__TEXT,__text)
D        Data section symbol (__DATA,__data)
B        bss section symbol (__DATA,__bss)
C        Common symbol (appears only in Mach-O files of type MH_OBJECT)
-        Debugger symbol table entry
S        Symbol in a section other than those above; for example, uninitialized global variables are stored in (__DATA,__common)
I        Indirect symbol
u        A lowercase u in a dynamic shared library indicates an undefined reference to a private external symbol in another module of the same library
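
The letters in the table above can be decoded mechanically. The following is a minimal sketch with two hypothetical helpers, nm_type_is_external and nm_type_description. It applies the lowercase-means-local rule uniformly and does not try to model the special meaning of a lowercase u inside a dynamic shared library.

```c
#include <ctype.h>
#include <stdbool.h>

/* Uppercase letters mark external (global) symbols, lowercase letters local ones. */
bool nm_type_is_external(char type) {
    return isupper((unsigned char)type) != 0;
}

/* Describe the category of a symbol type letter, ignoring its case. */
const char *nm_type_description(char type) {
    switch (toupper((unsigned char)type)) {
    case 'U': return "undefined";
    case 'A': return "absolute";
    case 'T': return "text section";
    case 'D': return "data section";
    case 'B': return "bss section";
    case 'C': return "common";
    case 'S': return "other section";
    case 'I': return "indirect";
    case '-': return "debugger entry";
    default:  return "unknown";
    }
}
```

For instance, a line of symbol output whose type letter is t describes a local symbol in the text section, while T describes a global one.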