In iOS reverse engineering, disassembly is an essential step. When you open an app's executable file in a disassembler such as IDA or Hopper, what you see is assembly code. There are many assembly languages, but iOS reverse engineering mainly deals with ARM64.

ARM64 assembly

  • For iOS work, three areas of assembly knowledge matter most:

    • Registers

    • Instructions

    • The stack

Registers

  • 64-bit general-purpose registers

    x0, x1, …, x27, x28

    • x0 to x7 usually hold function arguments; any remaining arguments are passed on the stack

    • x0 usually holds the function return value

    • 32-bit general-purpose registers (the 64-bit registers remain backward compatible with 32-bit code)

      w0, w1, …, w27, w28 (the low 32 bits of the corresponding x registers)

    These 32-bit registers are not the same as the 32-bit registers in ARMv7

    In ARMv7 assembly, the registers are named r0, r1, r2, …

  • Program counter

    • pc (Program Counter)
    • Holds the address of the instruction the CPU is currently executing
    • Similar to the IP register in 8086 assembly
  • Stack pointer

    • sp (Stack Pointer)
    • fp (Frame Pointer), i.e. x29
  • Link register

    • lr (Link Register), also known as x30

      Stores the return address of a subroutine call

  • Program status registers

    • cpsr (Current Program Status Register)

    • spsr (Saved Program Status Register), which is used in exception states
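
The relationship between an x register and its w view can be sketched in C. This is only a model built on fixed-width integers, not real register access, and the names `read_w`, `write_w` and `demo_w_view` are made up for this illustration. It also shows one detail the list above does not mention: on ARM64, writing a w register clears the upper 32 bits of the matching x register.

```c
#include <assert.h>
#include <stdint.h>

/* Reading wN yields the low 32 bits of xN. */
static uint32_t read_w(uint64_t x) {
    return (uint32_t)(x & 0xFFFFFFFFu);
}

/* Writing wN produces a new xN whose upper 32 bits are zero.
   The previous contents of xN do not matter, so they are not a parameter. */
static uint64_t write_w(uint32_t value) {
    return (uint64_t)value;
}

/* Walks through both directions with one concrete value (hypothetical data). */
static void demo_w_view(void) {
    uint64_t x10 = 0x1122334455667788ULL;
    assert(read_w(x10) == 0x55667788u); /* w10 is the low half of x10 */
    x10 = write_w(0xa);                 /* like: mov w10, #0xa */
    assert(x10 == 0xa);                 /* upper half of x10 is now zero */
}
```

Calling `demo_w_view()` from a `main` function runs the two checks; nothing is printed when they hold.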

Instructions

  • mov

    mov w10, #0xa: puts the value 10 into register w10

  • ret

    Returns from a function

    Copies the value of the lr (x30) register into the pc register

  • add

    add w10, w10, w11: adds the value of w10 and the value of w11, and saves the result to register w10

  • sub

    sub w10, w10, w11: subtracts the value of w11 from the value of w10, and saves the result to register w10

  • cmp

    cmp w10, w11: compares the values in the two registers

    The comparison is made by computing w10 - w11; the result itself is discarded

    The subtraction updates the flag bits of the cpsr (program status register)

  • b

    Jump instruction without a return address (it does not update lr, so a later ret cannot return to the instruction after the jump)

    Combined with cmp, it becomes a conditional jump (for example, b.eq jumps only if the compared values are equal; 32-bit ARM writes this as beq)

  • bl

    • Jump instruction with a return address (used to call a function; the ret in the callee returns to the instruction after the bl)

      First, the address of the next instruction is stored in lr (x30, the link register)

      Then execution jumps to the target address
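
The difference between b, bl and ret comes down to what happens to pc and lr. The following C sketch models just those two registers; the `Cpu` struct and the `exec_*` names are hypothetical, and the addresses used with it are arbitrary. It relies on the fact that every ARM64 instruction is 4 bytes long, so "the next instruction" is at pc + 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal model of the two registers involved in jumps, calls and returns. */
typedef struct {
    uint64_t pc; /* address of the instruction being executed */
    uint64_t lr; /* x30: return address */
} Cpu;

/* b target: jump only; lr is left untouched. */
static void exec_b(Cpu *cpu, uint64_t target) {
    assert(cpu != NULL);
    cpu->pc = target;
}

/* bl target: save the address of the next instruction in lr, then jump. */
static void exec_bl(Cpu *cpu, uint64_t target) {
    assert(cpu != NULL);
    cpu->lr = cpu->pc + 4;
    cpu->pc = target;
}

/* ret: continue at the address stored in lr. */
static void exec_ret(Cpu *cpu) {
    assert(cpu != NULL);
    cpu->pc = cpu->lr;
}
```

With pc at 0x1000, `exec_bl(&cpu, 0x2000)` followed by `exec_ret(&cpu)` ends at 0x1004, the instruction after the call. Replacing the bl with `exec_b` leaves lr holding whatever it held before, so the ret lands somewhere unrelated to the jump.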

  • Condition codes

    • EQ: equal

    • NE: not equal

    • GT: greater than (signed)

    • GE: greater than or equal to (signed)

    • LT: less than (signed)

    • LE: less than or equal to (signed)
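
To make the link between cmp and the condition codes concrete, here is a small C model of the four condition flags (N, Z, C, V) and of how each code listed above is evaluated from them. The `Flags` struct and the function names are assumptions made for this sketch. GT, GE, LT and LE are evaluated as signed comparisons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The four condition flags updated by cmp. */
typedef struct {
    bool n; /* negative: bit 31 of the result is set */
    bool z; /* zero: the result is 0 */
    bool c; /* carry: no borrow occurred (a >= b as unsigned values) */
    bool v; /* overflow: the signed result does not fit in 32 bits */
} Flags;

/* cmp a, b: compute a - b and keep only the flags. */
static Flags exec_cmp(uint32_t a, uint32_t b) {
    uint32_t result = a - b; /* wraps modulo 2^32 */
    Flags f;
    f.n = (result >> 31) != 0;
    f.z = (result == 0);
    f.c = (a >= b);
    /* Signed overflow: the operands have different signs and the sign of
       the result differs from the sign of a. */
    f.v = (((a ^ b) & (a ^ result)) >> 31) != 0;
    /* Sanity check: Z is set exactly when the operands are equal. */
    assert(f.z == (a == b));
    return f;
}

static bool cond_eq(Flags f) { return f.z; }
static bool cond_ne(Flags f) { return !f.z; }
static bool cond_gt(Flags f) { return !f.z && f.n == f.v; }
static bool cond_ge(Flags f) { return f.n == f.v; }
static bool cond_lt(Flags f) { return f.n != f.v; }
static bool cond_le(Flags f) { return f.z || f.n != f.v; }
```

For example, `cond_lt(exec_cmp(3, 7))` is true, which corresponds to a b.lt after cmp being taken when the first register holds 3 and the second holds 7.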

  • Memory operation instructions

    • Load: reads data from memory

      • ldr

        ldr w10, [sp, #0x8]: reads the value at address sp + 0x8 and saves it to register w10

      • ldur

        ldur w10, [sp, #-0x8]: reads the value at address sp - 0x8 and saves it to register w10

      • ldp

        ldp w10, w11, [sp, #0x8]: reads two consecutive values starting at address sp + 0x8 and saves them to registers w10 and w11

      ldr and ldur differ in the offsets they accept: in this form ldr takes a non-negative offset that is a multiple of the access size, while ldur takes a small signed offset (-256 to 255), so negative offsets show up with ldur

    • Store: writes data to memory

      • str

        str w10, [sp, #0x8]: writes the value of register w10 to address sp + 0x8

      • stur

        Same as str, but it accepts a small signed offset, so it is what you see when the offset is negative

      • stp

        stp x10, x11, [sp, #0x20]: writes the values of registers x10 and x11 to consecutive memory locations starting at address sp + 0x20

    • The zero registers, which always read as 0

      • wzr (32-bit, Word Zero Register)

      • xzr (64-bit)
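
The load and store instructions above can be modeled with a byte array standing in for memory. This is a simplified sketch: "addresses" are just offsets into the array, the helper names are invented for illustration, and the model does not distinguish ldr from ldur or str from stur because the offset parameter is simply signed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64

/* A tiny byte-addressable memory; addresses are offsets into this array. */
typedef struct {
    uint8_t bytes[MEM_SIZE];
} Memory;

/* str/stur wN, [base, #offset]: write 4 bytes at base + offset. */
static void str_w(Memory *m, uint32_t value, int64_t base, int64_t offset) {
    int64_t addr = base + offset;
    assert(addr >= 0 && addr + 4 <= MEM_SIZE);
    memcpy(&m->bytes[addr], &value, sizeof value);
}

/* ldr/ldur wN, [base, #offset]: read 4 bytes from base + offset. */
static uint32_t ldr_w(const Memory *m, int64_t base, int64_t offset) {
    int64_t addr = base + offset;
    uint32_t value;
    assert(addr >= 0 && addr + 4 <= MEM_SIZE);
    memcpy(&value, &m->bytes[addr], sizeof value);
    return value;
}

/* stp xA, xB, [base, #offset]: write xA at the address and xB 8 bytes above it. */
static void stp_x(Memory *m, uint64_t a, uint64_t b, int64_t base, int64_t offset) {
    int64_t addr = base + offset;
    assert(addr >= 0 && addr + 16 <= MEM_SIZE);
    memcpy(&m->bytes[addr], &a, sizeof a);
    memcpy(&m->bytes[addr + 8], &b, sizeof b);
}

/* ldp xA, xB, [base, #offset]: read two consecutive 8-byte values. */
static void ldp_x(const Memory *m, uint64_t *a, uint64_t *b, int64_t base, int64_t offset) {
    int64_t addr = base + offset;
    assert(addr >= 0 && addr + 16 <= MEM_SIZE);
    memcpy(a, &m->bytes[addr], sizeof *a);
    memcpy(b, &m->bytes[addr + 8], sizeof *b);
}
```

Storing the zero register, as in str wzr, [sp, #0x8], corresponds to `str_w(&m, 0, sp, 0x8)` in this model: the value written is always 0.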

Function stack balancing

  • Leaf function

    void test1() {
        int a = 3;
        int b = 4;
    }

    // Assembly
    ArmAssembly`test1:
        0x1047b20dc <+0>:  sub    sp, sp, #0x10             ; =0x10
        0x1047b20e0 <+4>:  mov    w8, #0x3
        0x1047b20e4 <+8>:  str    w8, [sp, #0xc]
        0x1047b20e8 <+12>: mov    w8, #0x4
        0x1047b20ec <+16>: str    w8, [sp, #0x8]
        0x1047b20f0 <+20>: add    sp, sp, #0x10             ; =0x10
        0x1047b20f4 <+24>: ret

    sub sp, sp, #0x10: sp = sp - 0x10, which allocates 16 bytes of stack space

    mov w8, #0x3

    str w8, [sp, #0xc]: these two instructions save 3 to the top four bytes of the allocated space

    mov w8, #0x4

    str w8, [sp, #0x8]: saves 4 to the four bytes directly below the 3

    add sp, sp, #0x10: restores sp to its original position

    ret: returns to the caller
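
The walkthrough above can be replayed on a small model of the stack. In this sketch the stack is a byte array, sp is an offset into it, and the starting value of sp is an arbitrary assumption; the `Machine` struct and helper names are hypothetical. The ret instruction is not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 64u

typedef struct {
    uint8_t  stack[STACK_SIZE]; /* simulated stack memory */
    uint64_t sp;                /* stack pointer, as an offset into stack[] */
    uint32_t w8;                /* scratch register used by the listing */
} Machine;

/* str w8, [sp, #offset] */
static void store_w8(Machine *m, uint64_t offset) {
    assert(m->sp + offset + 4 <= STACK_SIZE);
    memcpy(&m->stack[m->sp + offset], &m->w8, sizeof m->w8);
}

/* Reads back a 4-byte value at an absolute offset, for inspection only. */
static uint32_t peek32(const Machine *m, uint64_t addr) {
    uint32_t value;
    assert(addr + 4 <= STACK_SIZE);
    memcpy(&value, &m->stack[addr], sizeof value);
    return value;
}

/* Replays the instructions of test1 on the model. */
static void run_test1(Machine *m) {
    assert(m->sp >= 0x10 && m->sp <= STACK_SIZE);
    m->sp -= 0x10;      /* sub sp, sp, #0x10  */
    m->w8 = 0x3;        /* mov w8, #0x3       */
    store_w8(m, 0xc);   /* str w8, [sp, #0xc] */
    m->w8 = 0x4;        /* mov w8, #0x4       */
    store_w8(m, 0x8);   /* str w8, [sp, #0x8] */
    m->sp += 0x10;      /* add sp, sp, #0x10  */
}
```

Starting with sp at 0x30, the function leaves sp at 0x30 again, with 3 stored at 0x2c and 4 stored at 0x28. That is the stack balance the section title refers to: the space taken by sub is given back by add before the function returns.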

  • Non-leaf function

    void test2() {
        int c = 5;
        int d = 6;
        test1();
    }

    // Assembly
    ArmAssembly`test2:
        0x104dd20cc <+0>:  sub    sp, sp, #0x20             ; =0x20
        0x104dd20d0 <+4>:  stp    x29, x30, [sp, #0x10]
        0x104dd20d4 <+8>:  add    x29, sp, #0x10            ; =0x10
        0x104dd20d8 <+12>: mov    w8, #0x5
        0x104dd20dc <+16>: stur   w8, [x29, #-0x4]
        0x104dd20e0 <+20>: mov    w8, #0x6
        0x104dd20e4 <+24>: str    w8, [sp, #0x8]
        0x104dd20e8 <+28>: bl     0x1000060b0               ; test1 at main.m:11
        0x104dd20ec <+32>: ldp    x29, x30, [sp, #0x10]
        0x104dd20f0 <+36>: add    sp, sp, #0x20             ; =0x20
        0x104dd20f4 <+40>: ret

    sub sp, sp, #0x20: sp = sp - 0x20, which allocates 32 bytes of stack space

    stp x29, x30, [sp, #0x10]: saves the original values of fp (x29) and lr (x30) in the top 16 bytes of the allocated space

    add x29, sp, #0x10: points fp at sp + 0x10

    ...

    bl 0x1000060b0: 1. Saves the address of the next instruction to the x30 (lr) register. 2. Jumps to test1. 3. When test1 completes, its ret jumps back to the address in lr

    ldp x29, x30, [sp, #0x10]: loads the saved values back into fp (x29) and lr (x30), so fp returns to its original position

    add sp, sp, #0x20: restores sp to its original position

    In a non-leaf function, if you draw the memory layout, you can see that the local data is stored between the address in sp and the address in fp. Together, sp and fp delimit the area where the function keeps its data, which is why they are called the stack pointer and the frame pointer.
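
The reason a non-leaf function saves fp and lr while the leaf function does not is that its own bl overwrites lr. The sketch below models the prologue and epilogue of test2 on a simulated stack to show that the caller's values survive the call. The `Machine` struct, the helper names and all starting values are assumptions made for illustration; the stores of the locals 5 and 6 are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 128u

typedef struct {
    uint8_t  stack[STACK_SIZE]; /* simulated stack memory */
    uint64_t sp;                /* stack pointer, as an offset into stack[] */
    uint64_t fp;                /* x29 */
    uint64_t lr;                /* x30 */
} Machine;

static void store64(Machine *m, uint64_t addr, uint64_t value) {
    assert(addr + 8 <= STACK_SIZE);
    memcpy(&m->stack[addr], &value, sizeof value);
}

static uint64_t load64(const Machine *m, uint64_t addr) {
    uint64_t value;
    assert(addr + 8 <= STACK_SIZE);
    memcpy(&value, &m->stack[addr], sizeof value);
    return value;
}

/* Prologue of test2: reserve 32 bytes and save the caller's fp and lr. */
static void prologue(Machine *m) {
    assert(m->sp >= 0x20 && m->sp <= STACK_SIZE);
    m->sp -= 0x20;                    /* sub sp, sp, #0x20         */
    store64(m, m->sp + 0x10, m->fp);  /* stp x29, x30, [sp, #0x10] */
    store64(m, m->sp + 0x18, m->lr);
    m->fp = m->sp + 0x10;             /* add x29, sp, #0x10        */
}

/* bl overwrites lr with the address of the instruction after the call. */
static void call(Machine *m, uint64_t return_address) {
    m->lr = return_address;
}

/* Epilogue of test2: reload fp and lr, then release the 32 bytes. */
static void epilogue(Machine *m) {
    m->fp = load64(m, m->sp + 0x10);  /* ldp x29, x30, [sp, #0x10] */
    m->lr = load64(m, m->sp + 0x18);
    m->sp += 0x20;                    /* add sp, sp, #0x20         */
}
```

Running `prologue`, then `call` with the address after the bl (0x104dd20ec in the listing), then `epilogue` leaves sp, fp and lr exactly as they were on entry, so the final ret of test2 returns to its own caller rather than back into test2.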


    There is of course much more to assembly than this, but it is beyond today’s discussion. The goal here is to learn enough to analyze disassembled code; readers with the time and energy can study it in more depth.

    These notes reflect my own understanding, so they may contain mistakes and omissions. If you find any, please point them out. Thank you.