This analysis is based on Android R (Android 11).

SIGSEGV is signal 11, which is generated when a memory access error occurs. Once generated, the signal is delivered to user space for handling. In a pure native process it is handled by debuggerd_signal_handler; in an application process (Zygote and its child processes) it is handled by SignalChain::Handler.

Compared with pure native processes, application processes add a layer of wrapping and dispatch, mainly to detect NPE (NullPointerException) and SOE (StackOverflowError) in the Java world. Java code can execute in two modes: interpreted execution and machine code execution. Interpreted execution does not produce SIGSEGV, because the operands of each instruction can be checked before it is interpreted, so an NPE or SOE is thrown as soon as a check fails. Machine code execution runs assembly instructions directly; no LDR/STR is preceded by a check, so SIGSEGV may be generated.

The following analyzes, from the point of view of the source code, how the handlers are registered and how the signal is dispatched.

Registration of signal handlers

Android applications are forked from the Zygote process, so the way each signal is handled is inherited from Zygote.

Zygote is created when init forks and executes the app_process executable. We usually think of main() in app_process as the entry point of the program, but main is only the logical entry point. When the exec system call happens, execution actually starts at the _start entry of /system/bin/linker64, which starts the linker; only afterwards is main called.

ENTRY(_start)
  // Force unwinds to end in this function.
  .cfi_undefined x30

  mov x0, sp
  bl __linker_init

  /* linker init returns the _entry address in the main image */
  br x0
END(_start)

linker_debuggerd_init() is eventually called from __linker_init and registers debuggerd_signal_handler as the signal handler for SIGSEGV. So the first registration of SIGSEGV for the whole process takes place during linker bootstrap, before the main function of app_process is executed.

When the Zygote process is running, it needs to start the ART virtual machine. The global fault_manager variable is initialized during Runtime::Init and NPE and SOE handlers are registered.

// Dex2Oat's Runtime does not need the signal chain or the fault handler.
if (implicit_null_checks_ || implicit_so_checks_ || implicit_suspend_checks_) {
  fault_manager.Init();

  // These need to be in a specific order. The null point check handler must be
  // after the suspend check and stack overflow check handlers.
  //
  // Note: the instances attach themselves to the fault manager and are handled by it. The
  // manager will delete the instance on Shutdown().
  if (implicit_suspend_checks_) {
    new SuspensionHandler(&fault_manager);
  }

  if (implicit_so_checks_) {
    new StackOverflowHandler(&fault_manager);
  }

  if (implicit_null_checks_) {
    new NullPointerHandler(&fault_manager);
  }

  if (kEnableJavaStackTraceHandler) {
    new JavaStackTraceHandler(&fault_manager);
  }
}

During Init, the global fault_manager registers a handler for SIGSEGV again: it saves the original debuggerd_signal_handler pointer in action_ and installs SignalChain::Handler as the new handler function.

void Register(int signo) {
    struct sigaction64 handler_action = {};
    sigfillset64(&handler_action.sa_mask);
    ...
    handler_action.sa_sigaction = SignalChain::Handler;
    handler_action.sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK | SA_UNSUPPORTED | SA_EXPOSE_TAGBITS;
    linked_sigaction64(signo, &handler_action, &action_);
}
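
The save-and-replace pattern described above can be tried outside ART. The following is a minimal sketch, not ART or bionic code: it uses the plain POSIX sigaction call and SIGUSR1 instead of linked_sigaction64 and SIGSEGV, and the names FirstHandler, SecondHandler, Install and RunChainDemo are made up for illustration.

```cpp
#include <signal.h>

// Hypothetical names for illustration; none of these exist in ART or bionic.
static struct sigaction g_previous_action;        // plays the role of action_
static volatile sig_atomic_t g_first_calls = 0;   // calls seen by the older handler
static volatile sig_atomic_t g_second_calls = 0;  // calls seen by the newer handler

// The handler installed first (stands in for debuggerd_signal_handler).
static void FirstHandler(int, siginfo_t*, void*) {
  g_first_calls = g_first_calls + 1;
}

// The handler installed second (stands in for SignalChain::Handler).
// It does its own work, then forwards to the handler it displaced.
static void SecondHandler(int signo, siginfo_t* info, void* context) {
  g_second_calls = g_second_calls + 1;
  if ((g_previous_action.sa_flags & SA_SIGINFO) != 0 &&
      g_previous_action.sa_sigaction != nullptr) {
    g_previous_action.sa_sigaction(signo, info, context);
  }
}

// Installs `handler` for `signo`; the displaced action is written to
// `old_action` when it is not null.
static int Install(int signo, void (*handler)(int, siginfo_t*, void*),
                   struct sigaction* old_action) {
  struct sigaction action = {};
  sigfillset(&action.sa_mask);
  action.sa_sigaction = handler;
  action.sa_flags = SA_RESTART | SA_SIGINFO;
  return sigaction(signo, &action, old_action);
}

// Registers both handlers in order and raises SIGUSR1 once.
// Returns true when each handler ran exactly once.
bool RunChainDemo() {
  g_first_calls = 0;
  g_second_calls = 0;
  if (Install(SIGUSR1, FirstHandler, nullptr) != 0) return false;
  if (Install(SIGUSR1, SecondHandler, &g_previous_action) != 0) return false;
  if (raise(SIGUSR1) != 0) return false;
  return g_first_calls == 1 && g_second_calls == 1;
}
```

Calling RunChainDemo() should return true: the second registration receives the signal first and then passes it on to the first one through the saved action.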

One thing to note about this code is that the function used to register is linked_sigaction64, not sigaction64. By default sigaction64 is implemented by libc, but libsigchain also implements a sigaction64 function. libsigchain keeps libc's sigaction64 under the name linked_sigaction64 and shadows the libc symbol, so any later call to sigaction64 from an app's dynamic library goes into libsigchain.

The purpose is to ensure that handler registration done by the app's dynamic libraries does not affect the detection of NPE and SOE.

Here is the part of libsigchain's Android.bp file that shadows libc's sigaction symbol by means of the -z,global linker option.

// Make libsigchain symbols global, so that an app library which
// is loaded in a classloader linker namespace looks for
// libsigchain symbols before libc.
// -z,global marks the binary with the DF_1_GLOBAL flag which puts the symbols
// in the global group. It does not affect their visibilities like the version
// script does.
ldflags: ["-Wl,-z,global"],
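
To see why shadowing sigaction protects the runtime's handler, here is a small simulation under stated assumptions. It does not call the real sigaction and does not reproduce libsigchain's code: the installed-handler table is modeled by an array, and ChainSlot, ClaimSignal and WrappedSetHandler are hypothetical names. The idea sketched is that once a signal has been claimed, a later registration is only recorded, and the installed dispatcher stays in place.

```cpp
#include <cstddef>

// Hypothetical stand-ins; this is not the libsigchain API.
using Handler = void (*)(int);

struct ChainSlot {
  bool claimed;          // true once the runtime owns this signal
  Handler user_handler;  // handler that callers believe is installed
};

const std::size_t kSignalCount = 65;
static ChainSlot g_slots[kSignalCount] = {};
static Handler g_installed[kSignalCount] = {};  // simulated installed-handler table

// Three distinguishable example handlers.
static int g_runtime_calls = 0;
static int g_app_calls = 0;
static int g_crash_calls = 0;
void RuntimeDispatcher(int) { ++g_runtime_calls; }
void AppHandler(int) { ++g_app_calls; }
void CrashReporter(int) { ++g_crash_calls; }

// Runtime side: install the dispatcher and remember what it displaced.
void ClaimSignal(int signo, Handler dispatcher) {
  g_slots[signo].user_handler = g_installed[signo];
  g_slots[signo].claimed = true;
  g_installed[signo] = dispatcher;
}

// Interposed registration: returns the handler the caller is replacing.
Handler WrappedSetHandler(int signo, Handler new_handler) {
  Handler* target = g_slots[signo].claimed ? &g_slots[signo].user_handler
                                           : &g_installed[signo];
  Handler previous = *target;
  *target = new_handler;  // for a claimed signal this only updates the record
  return previous;
}
```

In this model, an app library that registers AppHandler for signal 11 after the runtime has claimed it gets CrashReporter back as the "old" handler, while RuntimeDispatcher remains the one that is actually installed.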

In the current version of Android, implicit_null_checks_ and implicit_so_checks_ are enabled by default, while implicit_suspend_checks_ and kEnableJavaStackTraceHandler are disabled by default.

StackOverflowHandler and NullPointerHandler both inherit from FaultHandler and register their Action methods in the generated_code_handlers_ array at construction time. For example, StackOverflowHandler registers StackOverflowHandler::Action in the array. The array is called “generated code” because an APK initially contains only dex files; machine code is produced only after dex2oat runs on the phone, so that machine code is referred to here as “generated code”.

Signal distribution

The distribution rules of SIGSEGV are as follows:

  1. SIGSEGV is received by SignalChain::Handler, which is passed both the fault PC and the fault address.
  2. generated_code_handlers_ is traversed first. Its handlers are registered at VM startup: one throws the NPE and the other throws the SOE. Each handler decides by its own rules whether the current fault is of its type. If it is, the handler throws a Java exception and dispatch ends; if not, the fault is passed to the next handler.
  3. other_handlers_ is traversed next. By default, this handler array is empty.
  4. debuggerd_signal_handler is called last. It generates a tombstone file containing the stack information of all threads and the memory map.
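
The four rules above amount to a first-match-wins chain with a fallback. The sketch below models only that control flow, under the assumption that each handler is a callable returning true when it recognizes the fault; Fault, Outcome and Dispatch are illustrative names, not ART types.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Illustrative types; not taken from ART.
struct Fault {
  std::uintptr_t pc;       // fault PC
  std::uintptr_t address;  // fault address
};

enum class Outcome { kJavaException, kOtherHandler, kTombstone };

using Handler = std::function<bool(const Fault&)>;

// Tries the generated-code handlers, then the other handlers, and falls
// back to the crash reporter (modeled here as Outcome::kTombstone).
Outcome Dispatch(const Fault& fault,
                 const std::vector<Handler>& generated_code_handlers,
                 const std::vector<Handler>& other_handlers) {
  for (const Handler& handler : generated_code_handlers) {
    if (handler(fault)) return Outcome::kJavaException;  // rule 2
  }
  for (const Handler& handler : other_handlers) {
    if (handler(fault)) return Outcome::kOtherHandler;   // rule 3
  }
  return Outcome::kTombstone;                            // rule 4
}
```

With an empty other_handlers list, a fault that no generated-code handler recognizes ends at Outcome::kTombstone, which corresponds to the debuggerd path.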

The purpose of SignalChain was explained at the beginning of this article, but the comment in the source code states it more precisely.

// libsigchain provides an interception layer for signal handlers, to allow ART and others to give
// their signal handlers the first stab at handling signals before passing them on to user code.
//
// It implements wrapper functions for signal, sigaction, and sigprocmask, and a handler that
// forwards signals appropriately.

SignalChain::Handler first iterates over the handlers in special_handlers_ and then calls the function stored in action_.

void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) {
  // Try the special handlers first.
  // If one of them crashes, we'll reenter this handler and pass that crash onto the user handler.
  if (!GetHandlingSignal()) {
    for (const auto& handler : chains[signo].special_handlers_) {
      if (handler.sc_sigaction == nullptr) {
        break;
      }
      sigset_t previous_mask;
      linked_sigprocmask(SIG_SETMASK, &handler.sc_mask, &previous_mask);
      ScopedHandlingSignal restorer;
      SetHandlingSignal(true);
      if (handler.sc_sigaction(signo, siginfo, ucontext_raw)) {
        return;
      }
      linked_sigprocmask(SIG_SETMASK, &previous_mask, nullptr);
    }
  }

  // Forward to the user's signal handler.
  chains[signo].action_.sa_sigaction(signo, siginfo, ucontext_raw);
}

There are two things to notice in the code that traverses special_handlers_:

  1. Before the handler function is called, the signal mask is changed to handler.sc_mask, and it is restored after the handler returns. For art_fault_handler, handler.sc_mask is set up as follows. It blocks every signal except SIGABRT, SIGBUS, SIGFPE, SIGILL and SIGSEGV, so unrelated signals cannot interrupt the handler, while a fault raised inside the handler itself can still be delivered.
sigfillset(&mask);
sigdelset(&mask, SIGABRT);
sigdelset(&mask, SIGBUS);
sigdelset(&mask, SIGFPE);
sigdelset(&mask, SIGILL);
sigdelset(&mask, SIGSEGV);
  2. Before the handler function is called, SetHandlingSignal(true) is required. Together with point 1, this makes a second entry into SignalChain::Handler skip ART's processing, because a second entry usually means that ART's own handler has a problem.
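
Both points can be illustrated in isolation. The sketch below builds the same mask with the POSIX signal-set calls and models the re-entrancy flag with a thread_local bool and an RAII guard. The thread_local flag, ScopedHandlingFlag and DispatchOnce are assumptions made for this example; they are not how libsigchain stores its flag.

```cpp
#include <signal.h>

// Builds the mask shown above: everything blocked except the listed signals.
sigset_t BuildFaultHandlerMask() {
  sigset_t mask;
  sigfillset(&mask);
  sigdelset(&mask, SIGABRT);
  sigdelset(&mask, SIGBUS);
  sigdelset(&mask, SIGFPE);
  sigdelset(&mask, SIGILL);
  sigdelset(&mask, SIGSEGV);
  return mask;
}

// Hypothetical per-thread flag: true while a special handler is running.
thread_local bool g_handling_signal = false;

// Sets the flag for the current scope and restores the old value on exit.
class ScopedHandlingFlag {
 public:
  ScopedHandlingFlag() : previous_(g_handling_signal) { g_handling_signal = true; }
  ~ScopedHandlingFlag() { g_handling_signal = previous_; }

 private:
  bool previous_;
};

// Returns how many times the special handlers were attempted.
// `handler_faults_again` simulates a crash inside the special handler,
// which re-enters the dispatcher while the flag is still set.
int DispatchOnce(bool handler_faults_again) {
  if (g_handling_signal) {
    return 0;  // second entry: skip the special handlers entirely
  }
  ScopedHandlingFlag guard;
  int attempts = 1;
  if (handler_faults_again) {
    attempts += DispatchOnce(false);  // nested entry contributes nothing
  }
  return attempts;
}
```

Even when the handler "faults" again, DispatchOnce reports a single attempt, because the nested entry sees the flag and goes straight to the fallback path.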

art_fault_handler calls the FaultManager::HandleFault function, which first checks whether the fault occurred in generated code. If so, NPE and SOE are checked further; otherwise generated_code_handlers_ is skipped and other_handlers_ is traversed directly.

bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
  if (IsInGeneratedCode(info, context, true)) {
    VLOG(signals) << "in generated code, looking for handler";
    for (const auto& handler : generated_code_handlers_) {
      VLOG(signals) << "invoking Action on handler " << handler;
      if (handler->Action(sig, info, context)) {
        // We have handled a signal so it's time to return from the
        // signal handler to the appropriate place.
        return true;
      }
    }
  }

  // We hit a signal we didn't handle. This might be something for which
  // we can give more information about so call all registered handlers to
  // see if it is.
  if (HandleFaultByOtherHandlers(sig, info, context)) {
    return true;
  }
  return false;
}

IsInGeneratedCode checks the following. If the current thread is in the Runnable state and holds the mutator lock in shared mode (meaning it may operate on the Java heap), the thread is very likely running machine code compiled from Java. The ArtMethod is then located using the layout of the Java stack (the ArtMethod pointer is stored at the top of the stack), and the function checks whether the fault PC lies within that ArtMethod's instruction range. If it does, this further confirms that the fault happened in generated code.

// This function is called within the signal handler. It checks that
// the mutator_lock is held (shared). No annotalysis is done.
bool FaultManager::IsInGeneratedCode(siginfo_t* siginfo, void* context, bool check_dex_pc) {
  // We can only be running Java code in the current thread if it
  // is in Runnable state.
  Thread* thread = Thread::Current();
  ThreadState state = thread->GetState();
  if (state != kRunnable) {
    VLOG(signals) << "not runnable";
    return false;
  }
  // Current thread is runnable.
  // Make sure it has the mutator lock.
  if (!Locks::mutator_lock_->IsSharedHeld(thread)) {
    VLOG(signals) << "no lock";
    return false;
  }

  ArtMethod* method_obj = nullptr;
  uintptr_t return_pc = 0;
  uintptr_t sp = 0;
  bool is_stack_overflow = false;

  // Get the architecture specific method address and return address. These
  // are in architecture specific files in arch/<arch>/fault_handler_<arch>.
  GetMethodAndReturnPcAndSp(siginfo, context, &method_obj, &return_pc, &sp, &is_stack_overflow);

  const OatQuickMethodHeader* method_header = method_obj->GetOatQuickMethodHeader(return_pc);  // If the PC is not in ArtMethod, nullptr is returned

  if (method_header == nullptr) {
    VLOG(signals) << "no compiled code";
    return false;
  }

  uint32_t dexpc = method_header->ToDexPc(reinterpret_cast<ArtMethod**>(sp), return_pc, false);
  return !check_dex_pc || dexpc != dex::kDexNoIndex;
}
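
Stripped of the ART-specific lookups, the decision above is two state checks followed by a range test on the PC. The sketch below shows that shape only; ThreadSnapshot, CodeRange and LooksLikeGeneratedCode are hypothetical names, and the dex PC lookup done by the real function is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative inputs; the real function reads these from Thread and ArtMethod.
struct ThreadSnapshot {
  bool runnable;                   // thread is in the Runnable state
  bool holds_mutator_lock_shared;  // thread holds the mutator lock (shared)
};

struct CodeRange {
  std::uintptr_t begin;  // first instruction of the method's machine code
  std::size_t size;      // size of the machine code in bytes
};

// True when `pc` lies in [begin, begin + size).
bool ContainsPc(const CodeRange& code, std::uintptr_t pc) {
  return pc >= code.begin && pc - code.begin < code.size;
}

// Mirrors the order of the checks: thread state, lock, then PC range.
bool LooksLikeGeneratedCode(const ThreadSnapshot& thread,
                            const CodeRange& method_code,
                            std::uintptr_t fault_pc) {
  if (!thread.runnable) return false;
  if (!thread.holds_mutator_lock_shared) return false;
  return ContainsPc(method_code, fault_pc);
}
```

Writing the range test as pc - begin < size avoids computing begin + size, which could wrap around for a range near the top of the address space.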

Next, the specific detection rules for NPE and SOE are introduced.

NullPointerException detection rule

NullPointerException detection is performed by the NullPointerHandler::Action function.

bool NullPointerHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* info, void* context) {
  if (!IsValidImplicitCheck(info)) {
    return false;
  }
  // The code that looks for the catch location needs to know the value of the
  // PC at the point of call. For Null checks we insert a GC map that is immediately after
  // the load/store instruction that might cause the fault.

  struct ucontext *uc = reinterpret_cast<struct ucontext* >(context);
  struct sigcontext *sc = reinterpret_cast<struct sigcontext* >(&uc->uc_mcontext);

  // Push the gc map location to the stack and pass the fault address in LR.
  sc->sp -= sizeof(uintptr_t);
  *reinterpret_cast<uintptr_t*>(sc->sp) = sc->pc + 4;
  sc->regs[30] = reinterpret_cast<uintptr_t>(info->si_addr);

  sc->pc = reinterpret_cast<uintptr_t>(art_quick_throw_null_pointer_exception_from_signal);
  VLOG(signals) << "Generating null pointer exception";
  return true;
}

The check is performed by IsValidImplicitCheck, which tests whether the fault address is less than one page. Why “less than one page” rather than “equal to 0”? Because most of the time we access a field or the vtable of an object rather than the object itself. Fields and vtables sit at offsets from the object's start address, so if that start address is 0, the address finally accessed is a small offset.

static bool IsValidImplicitCheck(siginfo_t* siginfo) {
  // Our implicit NPE checks always limit the range to a page.
  // Note that the runtime will do more exhaustive checks (that we cannot
  // reasonably do in signal processing code) based on the dex instruction
  // faulting.
  return CanDoImplicitNullCheckOn(reinterpret_cast<uintptr_t>(siginfo->si_addr));
}

// Returns whether the given memory offset can be used for generating
// an implicit null check.
static inline bool CanDoImplicitNullCheckOn(uintptr_t offset) {
  return offset < kPageSize;
}
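
A small worked example makes the "small offset" argument concrete. It assumes a 4096-byte page (kAssumedPageSize) and an illustrative struct named Sample; neither comes from ART. Reading a field through a null base yields a fault address equal to the field's offset, which the one-page test accepts.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption for this example: 4 KiB pages.
const std::uintptr_t kAssumedPageSize = 4096;

// Illustrative object layout: `value` sits 8 bytes after the object start.
struct Sample {
  std::int32_t header;
  std::int32_t count;
  std::int64_t value;
};

// Address touched when a field at `field_offset` is read through `object_base`.
std::uintptr_t FaultAddressFor(std::uintptr_t object_base, std::size_t field_offset) {
  return object_base + field_offset;
}

// Same shape as the check above: any address inside the first page is
// treated as a dereference of null plus a small offset.
bool LooksLikeNullDereference(std::uintptr_t fault_address) {
  return fault_address < kAssumedPageSize;
}
```

For a null Sample, reading value faults at address 8, which passes the test; an offset of 4096 or more, or a non-null base such as 0x10000, does not.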

Once the fault is identified as an NPE, NullPointerHandler::Action modifies the PC value in the saved context. We are currently inside a signal handler, and by default, when the handler returns, the program re-executes the faulting instruction. But if we change the PC value in the saved context, execution continues at the location that PC specifies after the handler returns.
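
The register edits made by NullPointerHandler::Action can be replayed on a plain struct to see exactly what changes. This is a simulation under assumptions: FakeContext stands in for the saved sigcontext, the "stack" is an ordinary array supplied by the caller, and RedirectToThrow is a made-up name. Nothing here touches a real signal context.

```cpp
#include <cstdint>

// Stand-in for the saved register state; not the kernel's sigcontext.
struct FakeContext {
  std::uintptr_t sp;  // stack pointer
  std::uintptr_t pc;  // program counter of the faulting instruction
  std::uintptr_t lr;  // link register (x30)
};

// Replays the three edits from the handler:
//  1. push the address after the faulting instruction onto the stack,
//  2. pass the fault address in LR,
//  3. point PC at the throw entrypoint.
// `context->sp` must point just past writable, suitably aligned memory.
void RedirectToThrow(FakeContext* context, std::uintptr_t fault_address,
                     std::uintptr_t throw_entry) {
  context->sp -= sizeof(std::uintptr_t);
  // The handler adds 4 because each A64 instruction is 4 bytes long.
  *reinterpret_cast<std::uintptr_t*>(context->sp) = context->pc + 4;
  context->lr = fault_address;
  context->pc = throw_entry;
}
```

After the call, the context no longer points at the faulting instruction, so "returning" from the handler would resume at throw_entry with the fault address available in lr.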

art_quick_throw_null_pointer_exception_from_signal does two things, which are explained in detail in “How to throw an exception”. Here is a brief list.

  1. Generate a NullPointerException object for the Java layer.
  2. Jump to a catch block that can catch the exception.

Check for StackOverflowError

Before introducing the detection rules of SOE, we must first understand the structure of the stack in ART.

There are two pages at the top of the stack that can be neither read nor written; any access to them causes a memory fault. In addition, the stack grows dynamically as functions are called, so the detection has to be tied to function calls. On the AArch64 architecture, the following assembly instructions are executed on each function call to access the address SP-0x2000. The access is a load into the zero register, so the loaded value is discarded. If more than 2 pages of stack are still free, SP-0x2000 is within the readable and writable range; if fewer than 2 pages are free, SP-0x2000 falls into the inaccessible red zone, and the access generates SIGSEGV.

sub x16, sp, #0x2000 (8192)
ldr wzr, [x16]

The actual detection therefore checks whether SP-0x2000 equals the fault address. If they are equal, the SIGSEGV was generated by the instructions above, which means an SOE has really occurred.

bool StackOverflowHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* info ATTRIBUTE_UNUSED,
                                  void* context) {
  struct ucontext *uc = reinterpret_cast<struct ucontext* >(context);
  struct sigcontext *sc = reinterpret_cast<struct sigcontext* >(&uc->uc_mcontext);
  VLOG(signals) << "stack overflow handler with sp at " << std::hex << &uc;
  VLOG(signals) << "sigcontext: " << std::hex << sc;

  uintptr_t sp = sc->sp;
  VLOG(signals) << "sp: " << std::hex << sp;

  uintptr_t fault_addr = sc->fault_address;
  VLOG(signals) << "fault_addr: " << std::hex << fault_addr;
  VLOG(signals) << "checking for stack overflow, sp: " << std::hex << sp <<
      ", fault_addr: " << fault_addr;

  uintptr_t overflow_addr = sp - GetStackOverflowReservedBytes(InstructionSet::kArm64);  // sp - 0x2000

  // Check that the fault address is the value expected for a stack overflow.
  if (fault_addr != overflow_addr) {
    VLOG(signals) << "Not a stack overflow";
    return false;
  }

  VLOG(signals) << "Stack overflow found";

  // Now arrange for the signal handler to return to art_quick_throw_stack_overflow.
  // The value of LR must be the same as it was when we entered the code that
  // caused this fault. This will be inserted into a callee save frame by
  // the function to which this handler returns (art_quick_throw_stack_overflow).
  sc->pc = reinterpret_cast<uintptr_t>(art_quick_throw_stack_overflow);

  // The kernel will now return to the address in sc->pc.
  return true;
}
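
The arithmetic behind the probe is easy to check with concrete numbers. The sketch below assumes the 0x2000 reserve used in this article and models the protected region as a half-open address range; kAssumedReservedBytes, ProbeAddress, ProbeHitsProtectedRegion and IsImplicitStackOverflow are illustrative names.

```cpp
#include <cstdint>

// Assumption taken from the article: the probe reads 0x2000 bytes below SP.
const std::uintptr_t kAssumedReservedBytes = 0x2000;

// Address accessed by the probe for a given stack pointer.
std::uintptr_t ProbeAddress(std::uintptr_t sp) {
  return sp - kAssumedReservedBytes;
}

// True when the probe lands in [region_begin, region_begin + region_size).
bool ProbeHitsProtectedRegion(std::uintptr_t sp, std::uintptr_t region_begin,
                              std::uintptr_t region_size) {
  const std::uintptr_t probe = ProbeAddress(sp);
  return probe >= region_begin && probe - region_begin < region_size;
}

// The handler's test: the fault address must be exactly the probe address.
bool IsImplicitStackOverflow(std::uintptr_t sp, std::uintptr_t fault_address) {
  return fault_address == ProbeAddress(sp);
}
```

Take a protected region of two pages at 0x10000..0x12000. With SP at 0x15000 there are 0x3000 bytes free and the probe at 0x13000 is harmless; with SP at 0x13000 only 0x1000 bytes are free and the probe at 0x11000 falls inside the region.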

If the SOE check passes, art_quick_throw_stack_overflow is executed when the handler returns.

How to throw an exception

After the NPE check passes, the following code is executed.

ENTRY art_quick_throw_null_pointer_exception_from_signal
    // The fault handler pushes the gc map address, i.e. "return address", to stack
    // and passes the fault address in LR. So we need to set up the CFI info accordingly.
    .cfi_def_cfa_offset __SIZEOF_POINTER__
    .cfi_rel_offset lr, 0
    // Save all registers as basis for long jump context.
    INCREASE_FRAME (FRAME_SIZE_SAVE_EVERYTHING - __SIZEOF_POINTER__)
    SAVE_REG x29, (FRAME_SIZE_SAVE_EVERYTHING - 2 * __SIZEOF_POINTER__)  // LR already saved.
    SETUP_SAVE_EVERYTHING_FRAME_DECREMENTED_SP_SKIP_X29_LR
    mov x0, lr                        // pass the fault address stored in LR by the fault handler.
    mov x1, xSELF                     // pass Thread::Current.
    bl  artThrowNullPointerExceptionFromSignal  // (arg, Thread*).
    brk 0
END art_quick_throw_null_pointer_exception_from_signal

extern "C" NO_RETURN void artThrowNullPointerExceptionFromSignal(uintptr_t addr, Thread* self)
    REQUIRES_SHARED(Locks::mutator_lock_) {
  ScopedQuickEntrypointChecks sqec(self);
  ThrowNullPointerExceptionFromDexPC(/* check_address= */ true, addr);
  self->QuickDeliverException();
}

After the SOE check passes, the following code is eventually executed.

extern "C" NO_RETURN void artThrowStackOverflowFromCode(Thread* self)
    REQUIRES_SHARED(Locks::mutator_lock_) {
  ScopedQuickEntrypointChecks sqec(self);
  ThrowStackOverflowError(self);
  self->QuickDeliverException();
}

ThrowNullPointerExceptionFromDexPC and ThrowStackOverflowError both construct a Throwable object in the Java world; one constructs a NullPointerException, the other a StackOverflowError. The constructed Throwable object carries two key pieces of information: the message string and the call stack. The object is stored in the thread's tlsPtr_.exception field so that other code running on the thread can retrieve it.

Next, we focus on the QuickDeliverException function, whose job is to jump to the corresponding catch block.

void Thread::QuickDeliverException() {
  // Get exception from thread.
  ObjPtr<mirror::Throwable> exception = GetException();
  // Don't leave exception visible while we try to find the handler, which may cause class
  // resolution.
  ClearException();
  QuickExceptionHandler exception_handler(this, false);
  exception_handler.FindCatch(exception);
  exception_handler.DoLongJump();
}

FindCatch first finds two pieces of information.

  1. The frame that contains the catch block able to catch the exception; the SP of that frame is recorded.
  2. The start address of that catch block; either the machine code start address (PC) or the bytecode address (dex_pc) is recorded.
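
The search for those two values can be pictured as a walk over the stack frames from the innermost one outward. The sketch below is a simplified model, not QuickExceptionHandler: Frame, CatchTarget and FindCatchTarget are hypothetical, and whether a frame has a matching catch block is given as input instead of being looked up in the method's exception table.

```cpp
#include <cstdint>
#include <vector>

// Simplified frame description; illustrative only.
struct Frame {
  std::uintptr_t sp;        // stack pointer of this frame
  bool has_matching_catch;  // frame has a catch block for the exception
  std::uintptr_t catch_pc;  // start address of that catch block, if any
};

// The two values the search records.
struct CatchTarget {
  std::uintptr_t sp;
  std::uintptr_t pc;
};

// Walks the frames from the innermost outward and stops at the first one
// that can catch the exception. Returns false when no frame can.
bool FindCatchTarget(const std::vector<Frame>& frames_innermost_first,
                     CatchTarget* target) {
  for (const Frame& frame : frames_innermost_first) {
    if (frame.has_matching_catch) {
      target->sp = frame.sp;
      target->pc = frame.catch_pc;
      return true;
    }
  }
  return false;
}
```

When several frames could catch the exception, the innermost one wins, which matches the usual semantics of nested try/catch blocks.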

DoLongJump then performs the jump.

void QuickExceptionHandler::DoLongJump(bool smash_caller_saves) {
  // Place context back on thread so it will be available when we continue.
  self_->ReleaseLongJumpContext(context_);
  context_->SetSP(reinterpret_cast<uintptr_t>(handler_quick_frame_));
  CHECK_NE(handler_quick_frame_pc_, 0u);
  context_->SetPC(handler_quick_frame_pc_);
  context_->SetArg0(handler_quick_arg0_);
  if (smash_caller_saves) {
    context_->SmashCallerSaves();
  }
  if (!is_deoptimization_ &&
      handler_method_header_ != nullptr &&
      handler_method_header_->IsNterpMethodHeader()) {
    context_->SetNterpDexPC(reinterpret_cast<uintptr_t>(
        GetHandlerMethod()->DexInstructions().Insns() + handler_dex_pc_));
  }
  context_->DoLongJump();
  UNREACHABLE();
}

The address of the target frame is first stored in the SP field, and the machine code address is then stored in the PC field. If the frame is executed by the interpreter, the machine code address points to a trampoline function, and the real bytecode address dex_pc is stored in the x22 field, from which the interpreter eventually retrieves it.

All fields are then loaded into the actual registers, and the br instruction jumps to the start of the catch block without ever returning.

ENTRY art_quick_do_long_jump
    // Load FPRs
    ldp d0, d1, [x1, #0]
    ldp d2, d3, [x1, #16]
    ldp d4, d5, [x1, #32]
    ldp d6, d7, [x1, #48]
    ldp d8, d9, [x1, #64]
    ldp d10, d11, [x1, #80]
    ldp d12, d13, [x1, #96]
    ldp d14, d15, [x1, #112]
    ldp d16, d17, [x1, #128]
    ldp d18, d19, [x1, #144]
    ldp d20, d21, [x1, #160]
    ldp d22, d23, [x1, #176]
    ldp d24, d25, [x1, #192]
    ldp d26, d27, [x1, #208]
    ldp d28, d29, [x1, #224]
    ldp d30, d31, [x1, #240]

    // Load GPRs. Delay loading x0, x1 because x0 is used as gprs_.
    ldp x2, x3, [x0, #16]
    ldp x4, x5, [x0, #32]
    ldp x6, x7, [x0, #48]
    ldp x8, x9, [x0, #64]
    ldp x10, x11, [x0, #80]
    ldp x12, x13, [x0, #96]
    ldp x14, x15, [x0, #112]
    // Do not load IP0 (x16) and IP1 (x17), these shall be clobbered below.
    // Don't load the platform register (x18) either.
    ldr      x19, [x0, #152]      // xSELF.
    ldp x20, x21, [x0, #160]      // For Baker RB, wMR (w20) is reloaded below.
    ldp x22, x23, [x0, #176]
    ldp x24, x25, [x0, #192]
    ldp x26, x27, [x0, #208]
    ldp x28, x29, [x0, #224]
    ldp x30, xIP0, [x0, #240]     // LR and SP, load SP to IP0.

    // Load PC to IP1, it's at the end (after the space for the unused XZR).
    ldr xIP1, [x0, #33*8]

    // Load x0, x1.
    ldp x0, x1, [x0, #0]

    // Set SP. Do not access fprs_ and gprs_ from now, they are below SP.
    mov sp, xIP0

    REFRESH_MARKING_REGISTER

    br  xIP1
END art_quick_do_long_jump
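
As a loose analogy only, the same "restore a saved context and never return" behaviour is available in standard C and C++ through setjmp and longjmp. ART does not use them here; it restores the registers by hand, as the assembly above shows. In this sketch g_catch_context, g_pending_exception, DeliverException and RunWithCatch are made-up names, and the global integer stands in for the exception stored on the thread.

```cpp
#include <csetjmp>

static std::jmp_buf g_catch_context;  // saved state of the "catch" site
static int g_pending_exception = 0;   // stands in for the thread's exception slot

// Stores the exception and jumps to the saved context. Never returns.
[[noreturn]] void DeliverException(int exception_code) {
  g_pending_exception = exception_code;
  std::longjmp(g_catch_context, 1);
}

// Returns 0 when the body finishes normally, otherwise the delivered code.
int RunWithCatch(bool should_throw) {
  if (setjmp(g_catch_context) != 0) {
    // Reached via longjmp: this branch plays the role of the catch block.
    const int caught = g_pending_exception;
    g_pending_exception = 0;
    return caught;
  }
  if (should_throw) {
    DeliverException(7);  // control does not come back to this line
  }
  return 0;
}
```

RunWithCatch(true) yields 7 and RunWithCatch(false) yields 0. The sketch keeps only trivially destructible state between the two points, since jumping over objects with non-trivial destructors is undefined behaviour in C++.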

Refer to the article blog.csdn.net/hl09083253c…