One, understanding symbols

1. Basic concepts

  • Symbol: put simply, a collective term for classes, functions, and variables. The class name, function name, or variable name is the symbol name.

  • There are three types of symbols:

    • Global symbols: symbols visible outside the object file. They can be referenced by other object files, or they require a definition that another object file provides;
    • Local symbols: functions and variables visible only inside the object file that defines them;
    • Debug symbols: debugging information, including line number information that records the file and line number corresponding to functions and variables.
  • Symbol table: a table that maps memory addresses to function names, file names, and line numbers. Each defined symbol has a corresponding value, called the symbol value. For variables and functions, the symbol value is their address. A symbol table entry looks like this:

    <start address> <end address> <function> [<file name:line number>]
  • Debug symbol file (dSYM): the iOS symbol table file, which stores the mapping between hexadecimal addresses and symbols. The file is usually named xxx.app.dSYM and is similar to the mapping file generated by an Android release build. With the dSYM file, the addresses in a stack trace can be restored to the corresponding symbols to help troubleshoot problems.
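As a rough illustration of how a table with such entries is used, the sketch below looks up an address in a tiny hand-written table. The `SymbolEntry` struct, the `find_symbol` function, and every address and name in `kSampleTable` are hypothetical and exist only for this example; a real symbol table lives inside the Mach-O or dSYM file and is not laid out this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry in the form shown above: <start> <end> <function> [<file:line>]. */
typedef struct {
    uintptr_t start;      /* first address covered by the function */
    uintptr_t end;        /* one past the last address covered */
    const char *function; /* symbol name */
    const char *location; /* "file name:line number", or NULL if unknown */
} SymbolEntry;

/* Hypothetical sample data, for illustration only. */
static const SymbolEntry kSampleTable[] = {
    {0x1000, 0x1040, "-[ViewController viewDidLoad]", "ViewController.m:18"},
    {0x1040, 0x10a0, "-[ViewController onTap:]", "ViewController.m:32"},
    {0x10a0, 0x1100, "main", NULL},
};

/* Return the entry whose [start, end) range contains address, or NULL. */
const SymbolEntry *find_symbol(const SymbolEntry *table, size_t count,
                               uintptr_t address) {
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (address >= table[i].start && address < table[i].end) {
            return &table[i];
        }
    }
    return NULL;
}
```

Looking up 0x1050 in the sample table yields the `-[ViewController onTap:]` entry, while an address outside every range yields NULL, which is what happens conceptually when an address belongs to a symbol that is missing from the table.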

2. Symbol storage location

  • Mach-O is the common binary format on Mac/iOS platforms. App executables, dynamic libraries, static libraries, and so on are all in Mach-O format. For more information about Mach-O, see "Two or three things around the Mach-O file".
  • Debugging information can be stored in the Mach-O file itself, using the standard DWARF (Debugging With Attributed Record Formats) debugging information format. The DWARF debugging information can also be saved as a separate file, called the debug symbol file (dSYM), with the suffix .dSYM.
  • DWARF: a storage format for debugging information, used to support source-level debugging. When a Release build is packaged, debug symbols and the like are stripped out, but we still need to map the stacks collected online back to the source code, so the symbols are written into a separate file, which is the dSYM.
  • In general, a static library's global symbols, local symbols, line number information, and so on are all saved in the corresponding binary file: the file's symbol table stores the global and local symbols, and its DWARF data stores the line number information for those symbols.
  • Generally, for App executables and dynamic libraries, both the Mach-O file and the dSYM file store global and local symbols, while the DWARF data in the dSYM file stores the line number information.

3. The role of symbols

  • Symbols allow library files to be referenced and object files to be linked to each other to produce executables;
  • Symbols also help developers locate problems. Most commonly, the addresses in crash logs are converted into the corresponding function names, file names, and line numbers;
  • In Release mode, symbols are stripped from the binary, but a dSYM file is generated by default, so the dSYM file can be used to symbolize addresses.

Two, Xcode symbol configuration

1. Configuration Overview

Xcode provides several build settings that affect symbols.

  • Debug Information Format: DWARF or DWARF with dSYM File. This setting has no effect on static libraries. For dynamic libraries, DWARF with dSYM File generates the corresponding dSYM file when the dynamic library is built; if it is set to DWARF, the DWARF segment, that is, the debugging information, is lost because there is nowhere to store it.

  • Generate Debug Symbols: if YES, debugging information is generated when the object files are compiled. If NO, no DWARF is generated, no dSYM file is generated, and breakpoints do not take effect during debugging because addresses can no longer be associated with the corresponding lines of code.

  • Deployment:

    • Deployment Postprocessing: if YES, further processing is performed after the target is compiled and built; if NO, no further processing is performed. When building with Xcode Archive, Deployment Postprocessing is always YES.

    • Strip Linked Product: if YES, symbols are stripped; if NO, nothing is stripped. Which level of symbols is stripped is determined by the Strip Style setting. If Deployment Postprocessing is NO, the Strip Linked Product setting has no effect;

    • Strip Style (takes effect only when Deployment Postprocessing and Strip Linked Product are both YES; removes symbols from the binary):

      • Debugging Symbols: removes the debugging symbols, that is, the DWARF information, from the binary;
      • Non-Global Symbols: removes local symbols and debugging symbols from the binary, that is, the DWARF information and part of the symbol table;
      • All Symbols: removes all symbols, that is, the DWARF debugging information plus the global and local symbols that the target module defines in its symbol table.

      Supplement 1: dynamic libraries and static libraries cannot strip All Symbols; they must keep their global symbols (select Non-Global Symbols), which are the bridge between a library and other libraries. Without global symbols, dynamic and static libraries become black boxes.

      Supplement 2: stripping symbols does not affect the symbol information in dSYM files. For dynamic libraries and executables, strip as many symbols as possible to reduce the binary size. When symbols are needed to symbolize crash logs, look them up in the dSYM file.

  • Symbols Hidden by Default: a global switch that sets the default visibility of symbols. If set to YES, all symbols are defined as "private extern";

    • You can also use the compiler attributes __attribute__((visibility("default"))) and __attribute__((visibility("hidden"))) to control the visibility of individual symbols;

      __attribute__((visibility("default"))) void MyFunction1() {}
      __attribute__((visibility("hidden"))) void MyFunction2() {} // Not visible

2. App configuration in Debug mode

  • Debug Information Format: set to DWARF. Generating a dSYM file is time-consuming, so selecting DWARF shortens debug builds.

  • Generate Debug Symbols: set to YES, which enables breakpoint debugging. Note that in Debug mode, Deployment Postprocessing must be NO; otherwise, even with Generate Debug Symbols set to YES, the built package does not support breakpoint debugging.

  • Deployment configuration:

    • Deployment Postprocessing: set to NO
    • Strip Linked Product: set to NO
    • Strip Style: set to All Symbols (because Strip Linked Product is NO, Strip Style has no effect)

    With this configuration, the App binary contains both global and local symbol information, so the binary can resolve symbols by itself without a dSYM file (self-resolved symbols do not include line numbers).

  • Symbols Hidden by Default set to YES;

3. App configuration in Release mode

  • Set Debug Information Format to DWARF with dSYM File. In this way, a dSYM file is generated along with the IPA.

  • Generate Debug Symbols: set to YES. As described in the configuration overview, setting it to NO produces no DWARF information, so no dSYM file could be generated.

  • Deployment configuration:

    • Deployment Postprocessing: set to YES
    • Strip Linked Product: set to YES
    • Strip Style: set to All Symbols (it takes effect because Deployment Postprocessing and Strip Linked Product are both YES)

    With this configuration, the App binary contains no symbols, which reduces the size of the installation package and avoids symbol leakage. To locate problems, obtain the symbols from the dSYM file.

  • Symbols Hidden by Default set to YES;

4. Static and dynamic library configuration

  • Static library configuration: leave Debug Information Format at its default, set Generate Debug Symbols to YES, Deployment Postprocessing to NO, and Strip Linked Product to NO; leave Strip Style at its default (with Strip Linked Product set to NO, the Strip Style setting does not matter); set Symbols Hidden by Default to NO;

    With this configuration, the static library's binary symbols are not stripped. The static library binary therefore grows considerably, but its size does not affect the size of the final installation package binary, and its debug symbols allow the installation package, or a dynamic library that links it, to generate the corresponding dSYM file, which makes it easier to locate problems inside the static library.

  • Dynamic library configuration: Debug Information Format differs between Debug and Release and is configured the same way as for the App; set Generate Debug Symbols to YES and Symbols Hidden by Default to NO;

    • Release: Deployment Postprocessing set to YES; Strip Linked Product set to YES; Strip Style set to Non-Global Symbols
    • Debug: Deployment Postprocessing set to NO; Strip Linked Product set to NO; Strip Style has no effect because Strip Linked Product is NO

    If Symbols Hidden by Default is set to YES in the dynamic library, the dynamic library itself still compiles, but the App that links it reports a series of link errors because the symbols are hidden.

Three, symbol knowledge

1. Weak symbols

  • Symbols are strong by default, but adding the __attribute__((weak)) attribute makes a symbol weak. Strong and weak symbols are treated differently at link time:

    • A strong symbol must have an implementation, or the linker reports an error;
    • Two strong symbols cannot share the same name;
    • A strong symbol overrides a weak symbol's implementation.
  • Application scenario: use a weak symbol to provide a default implementation, and let external code supply a strong symbol to inject its own implementation, which achieves dependency injection.
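The default-implementation pattern can be sketched roughly as follows. The names `log_prefix` and `format_log` are made up for this example and do not come from any real library; the only language feature relied on is `__attribute__((weak))`, which Clang and GCC both support.

```c
#include <assert.h>
#include <stdio.h>

/* Weak default implementation. A strong definition of log_prefix() in
 * another object file linked into the same binary takes its place. */
__attribute__((weak)) const char *log_prefix(void) {
    return "[default]";
}

/* Library code calls log_prefix() without knowing which definition won. */
int format_log(char *buffer, size_t size, const char *message) {
    assert(buffer != NULL && message != NULL);
    return snprintf(buffer, size, "%s %s", log_prefix(), message);
}
```

Built on its own, `format_log` uses the weak default. If the host app adds an ordinary (strong) definition of `log_prefix` in one of its own source files, the linker is expected to pick that one instead, which is the injection point described above.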

2. Other Linker Flags configuration

  • When ld (the static linker) links a static library, a .o file inside the .a is written into the final binary only if one of its symbols is referenced; otherwise it is discarded. Other Linker Flags provides three options to control which code is kept:
    • -ObjC: keep all Objective-C code;
    • -force_load: keep all the code of one specified static library;
    • -all_load: keep the code of all static libraries involved in linking;
  • Many SDKs that are integrated into apps require adding -ObjC to Other Linker Flags.

3. Symbol not found errors

  • The symbol _OBJC_CLASS_$_CLASSNAME cannot be found;

  • This happened earlier when integrating AlipaySDK (the cause: AlipaySDK conflicts with the Alibaichuan SDK, so the UTDID framework has to be added):

    Undefined symbols for architecture x86_64:
    "_OBJC_CLASS_$_UTDevice", referenced from:
    objc-class-ref in AlipaySDK
    ld: symbol(s) not found for architecture x86_64
    clang: error: linker command failed with exit code 1 (use -v to see invocation)

    Supplement 1: if the class symbol has not been stripped, the class pointer can be obtained at runtime by passing _OBJC_CLASS_$_CLASSNAME to dlsym.

    Supplement 2: in the output of nm app_name.app/app_name, lowercase type letters correspond to local symbols and uppercase letters to global symbols. U stands for undefined, that is, an undefined external symbol;
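A minimal sketch of that reading rule, for anyone who post-processes nm output in a script or tool. The `SymbolScope` enum and the `classify_nm_type` function are hypothetical names introduced here; the function only encodes the upper/lower-case convention stated above and does not interpret what each individual letter means.

```c
#include <ctype.h>

typedef enum {
    SYMBOL_UNDEFINED, /* 'U': referenced here, defined elsewhere */
    SYMBOL_GLOBAL,    /* any other uppercase letter */
    SYMBOL_LOCAL,     /* lowercase letter */
    SYMBOL_OTHER      /* anything that is not a letter */
} SymbolScope;

/* Classify the single type letter that nm prints between address and name. */
SymbolScope classify_nm_type(char type) {
    if (type == 'U') return SYMBOL_UNDEFINED;
    if (isupper((unsigned char)type)) return SYMBOL_GLOBAL;
    if (islower((unsigned char)type)) return SYMBOL_LOCAL;
    return SYMBOL_OTHER;
}
```

For instance, a `T` entry would be classified as global and a `t` entry as local, so counting the two classes before and after changing Strip Style gives a quick check of what was actually removed.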

4. LLDB symbol debugging

  • At runtime, you can use LLDB to query symbol related information.

  • View the symbol definition

    image lookup -t symbol_name
  • View the position of the symbol

    image lookup -s symbol_name 
  • Set a symbolic breakpoint (this can also be done through Xcode's GUI)

    breakpoint set -n "symbol"

Four, after symbols are stripped…

1. Overview

  • During development, symbols are not stripped, so everything works. Symbolizing an address is straightforward: find the memory image that contains the address, locate the symbol table in that image, and match the target address against the symbols in that table.

  • Packages with stripped symbols are different. For example, App Store packages and enterprise internal test packages strip symbols, possibly even all symbols (Strip Style set to All Symbols), and conventional symbolization no longer works;

    Benefits of stripping symbols: it reduces the installation package size and avoids symbol leakage;

  • Enterprise internal test packages differ from App Store packages: they are mainly used for internal testing and gray releases. Sometimes the context of a problem has to be collected, including the line number, the source file, and the function name, and even the stack symbols.

  • After symbols are stripped, the stack addresses obtained through [NSThread callStackSymbols] need symbol recovery, because dladdr can no longer obtain symbol information from an address;

2. Getting the current file, line number, and function name

  • Whether or not symbols are stripped, this information is available through the following predefined identifiers in C:

    __FILE__     // File path
    __LINE__     // Line number
    __FUNCTION__ // Function name

    // demo
    printf("File = %s\nLine = %d\nFunc=%s\n", __FILE__, __LINE__, __FUNCTION__);
  • Whether or not symbols are stripped, the current method name is also available through Objective-C's implicit _cmd parameter, for example:

    printf("call %s", [NSStringFromSelector(_cmd) UTF8String]);
  • The predefined identifiers and _cmd provide the current debugging context. They are also commonly used in system APIs, such as the NSAssert macro, whose source is as follows:

    #define NSAssert(condition, desc, ...) \
        do { \
            __PRAGMA_PUSH_NO_EXTRA_ARG_WARNINGS \
            if (__builtin_expect(!(condition), 0)) { \
                NSString *__assert_file__ = [NSString stringWithUTF8String:__FILE__]; \
                __assert_file__ = __assert_file__ ? __assert_file__ : @"<Unknown File>"; \
                [[NSAssertionHandler currentHandler] handleFailureInMethod:_cmd \
                    object:self file:__assert_file__ \
                    lineNumber:__LINE__ description:(desc), ##__VA_ARGS__]; \
            } \
            __PRAGMA_POP_NO_EXTRA_ARG_WARNINGS \
        } while(0)

    NSAssert is meant for Objective-C methods: it uses __FILE__, __LINE__, and _cmd to obtain the source file path, the line number, and the method name at the point of failure, then passes them to [[NSAssertionHandler currentHandler] handleFailureInMethod:object:file:lineNumber:description:];

3. A scheme for restoring the current thread's stack symbols

  • [NSThread callStackSymbols] is the usual way to get symbol information for the current thread's call stack, but it only works well in Debug mode. When symbols are stripped, the addresses it returns require symbol recovery.

  • Symbolizing with the dSYM file works if the file is available. Otherwise, consider this alternative: symbolize the call stack addresses using the method names and method implementation addresses of all classes. It is similar to Frida's call stack symbol recovery, and works as follows:

    • X seconds after the App starts, collect the method names and method implementation addresses of all classes;
    • Call [NSThread callStackReturnAddresses] to get the addresses on the call stack;
    • Walk through all method addresses and compute each one's distance to the call stack address. The method whose address is smaller than the target address and closest to it is the symbol we are looking for;
    • Finally, replace every address on the call stack with its corresponding symbol.
  • Note that the symbols here are Objective-C method symbols; C function symbols cannot be restored once they are stripped.

  • When symbols are stripped, dladdr cannot obtain a symbol from an address. You can test this with the following code (it requires #include <dlfcn.h>):

    NSArray<NSNumber *> *addresses = [NSThread callStackReturnAddresses];
    NSNumber *firstAddress = [addresses objectAtIndex:0];
    Dl_info info;
    int result = dladdr((const void *)[firstAddress unsignedIntegerValue], &info);
    if (result != 0 && info.dli_sname) {
        // Debug mode configuration
        printf("got symbol_name = %s via dladdr\n", info.dli_sname);
    } else {
        // Release mode configuration
        printf("symbols are stripped: dladdr cannot resolve the address, a symbol recovery scheme is needed\n");
    }

    The same applies to [NSThread callStackSymbols]: if dladdr can retrieve a symbol from an address, the symbols have not been stripped.
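The address-matching step of the recovery scheme described above can be sketched in plain C. `MethodEntry` and `nearest_method` are hypothetical names for this illustration; in a real app the entries would be filled from the Objective-C runtime, which is outside the scope of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* A method name paired with its implementation (IMP) address. */
typedef struct {
    uintptr_t address;
    const char *name;
} MethodEntry;

/* Return the entry with the largest address that is still <= target,
 * i.e. the method that starts closest below the call stack address.
 * Returns NULL when every method starts above target. */
const MethodEntry *nearest_method(const MethodEntry *methods, size_t count,
                                  uintptr_t target) {
    const MethodEntry *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (methods[i].address > target) {
            continue; /* starts after the target, cannot contain it */
        }
        if (best == NULL || methods[i].address > best->address) {
            best = &methods[i];
        }
    }
    return best;
}
```

A linear scan keeps the sketch short; with many thousands of methods, sorting the entries once and using a binary search per frame would be the more likely choice. Because only method start addresses are known, a frame that actually belongs to a stripped C function is attributed to whichever Objective-C method happens to precede it, so results for such frames should be treated with caution.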

4. To be continued…

  • When symbols are stripped, it helps to collect additional context through other means, including device information, the user's main operation path, and the current ViewController.
  • For crashes, proper symbolization is the natural choice; crashes are the main application scenario of symbolization;

Five, Crash and symbolization

1. Overview

  • Analyzing a crash requires crash capture, stack information collection, and stack symbolization.

2. Crash capture

  • There are two main types of crash: Mach exceptions and Objective-C exceptions (NSException);

  • Mach exceptions are the lowest-level, kernel-level exceptions, such as EXC_BAD_ACCESS (a memory access exception). The Objective-C layer cannot observe Mach exceptions directly, but the BSD layer converts them into the corresponding signals, so we can register handlers for SIGABRT, SIGBUS, SIGSEGV, and other signals.

    // Register a handler for SIGSEGV
    signal(SIGSEGV, handleSignal);
    // Register handlers for other signals ...

    static void handleSignal(int sig) {
    }
  • NSException is an exception thrown by the iOS system libraries, third-party libraries, or the runtime when they detect an error, such as NSRangeException (array index out of bounds). These exceptions can be caught with try/catch (which Apple does not recommend) or raised with @throw; if they are not caught, a handler registered through NSSetUncaughtExceptionHandler can capture and process them.

    // Register the exception handler
    NSSetUncaughtExceptionHandler(&uncaught_exception_handler);

    static void uncaught_exception_handler(NSException *exception) {
        // Handle the exception here
        abort();
    }
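For the signal side of crash capture, here is a slightly fuller sketch. It uses the POSIX sigaction call in place of the signal call shown above, which is a substitution made for this example and not something the original setup prescribes. The names `g_last_signal`, `handle_signal`, and `install_signal_handler` are likewise hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose sigaction under strict -std modes */
#endif
#include <signal.h>
#include <string.h>

/* Last signal seen by the handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t g_last_signal = 0;

static void handle_signal(int sig) {
    /* Only async-signal-safe work belongs here. A real crash reporter would
     * record the stack at this point and then let the process terminate. */
    g_last_signal = sig;
}

/* Install handle_signal for one signal. Returns 0 on success, -1 on failure. */
int install_signal_handler(int sig) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = handle_signal;
    sigemptyset(&action.sa_mask);
    return sigaction(sig, &action, NULL);
}
```

A crash reporter would typically call `install_signal_handler` once at startup for each signal it cares about, such as SIGABRT, SIGBUS, and SIGSEGV. Returning normally from a handler for a real fault like SIGSEGV usually re-executes the faulting instruction, so production handlers restore the default action and re-raise the signal after collecting their data.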

3. Stack information collection

  • Once a crash is captured, stack information must be collected immediately. Ready-made solutions such as PLCrashReporter and KSCrash already support this.
  • Umeng and Bugly go further: besides crash capture and stack information collection, they integrate analysis, statistics, and other services, and are very complete.

4. Stack symbolization

  • Common symbolization approaches today:

    • symbolicatecrash + atos
    • Extract the address-to-symbol mapping from the dSYM file and restore the symbols from it.
  • The first approach is generally used by developers themselves; the second suits a standardized solution that restores the symbols of online crash stacks in batches;

5. Follow-up

  • After symbolization comes analyzing and fixing the crash, which is another topic. Years ago I wrote two articles about crashes that can serve as introductory material: iOS Transcript 14: A Brief Introduction to iOS Crash (1) and iOS Transcript 15: A Brief Introduction to iOS Crash (2).
