Preface:

This article builds on the introduction to Mach-O files in the previous article, so reading that one first is recommended.

  • Hooking is commonly used for code injection in reverse engineering, and also in security protection and monitoring.

  • In addition, for intermediate and advanced developers (who want both to save effort and to show off a little), knowing only how to swap IMPs through the runtime is clearly not enough.

So today we will discuss hooking and how fishhook works.

Overview of hooks

In iOS reverse engineering, a hook is a technique for changing a program's flow of execution. Hooking lets your own code run inside someone else's program. The technique is used constantly in reverse engineering, so while learning it we should focus on understanding the principle, so that we can effectively protect against malicious code.

Hooking is well known, and people have found countless creative uses for it, which we won't list here.

Common hooking techniques in iOS

1. Method Swizzle

Using the runtime features of OC, the mapping between a SEL (method selector) and its IMP (method implementation) is changed dynamically, which changes what an OC method call actually executes. It is mainly used for OC methods.

Commonly used functions:

  • method_exchangeImplementations: exchange the IMPs of two methods
  • class_replaceMethod: replace a method's implementation
  • method_getImplementation / method_setImplementation: get or set an IMP directly

If you are interested, the basic usage of these runtime functions and how they work are explained, with a demo, at the end of the article on debugging and modifying the code of re-signed applications.
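As a rough mental model of what exchanging IMPs means, here is a small sketch in plain C. It does not use the real Objective-C runtime: the method table, the selector names and every function name below are hypothetical and exist only for illustration.

```c
#include <string.h>

typedef const char *(*imp_t)(void);   /* IMP: pointer to an implementation */

struct method {
    const char *sel;   /* SEL: the selector name */
    imp_t imp;         /* IMP currently bound to that selector */
};

const char *original_impl(void) { return "original"; }
const char *swizzled_impl(void) { return "swizzled"; }

/* Hypothetical two-entry method list for a single class. */
struct method methods[] = {
    { "viewDidLoad",    original_impl },
    { "my_viewDidLoad", swizzled_impl },
};

/* Look a selector up by name, as message dispatch does conceptually. */
struct method *find_method(const char *sel) {
    for (size_t i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        if (strcmp(methods[i].sel, sel) == 0) return &methods[i];
    }
    return NULL;
}

/* Model of method_exchangeImplementations: swap the two IMPs. */
void exchange_implementations(struct method *a, struct method *b) {
    imp_t tmp = a->imp;
    a->imp = b->imp;
    b->imp = tmp;
}

/* Model of sending a message: find the SEL, call whatever IMP it has now. */
const char *send_message(const char *sel) {
    struct method *m = find_method(sel);
    return m ? m->imp() : "unrecognized selector";
}
```

After exchange_implementations(find_method("viewDidLoad"), find_method("my_viewDidLoad")), sending "viewDidLoad" runs swizzled_impl, and sending "my_viewDidLoad" runs the original. That is why a swizzled method usually calls "itself" to reach the original code.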

2. fishhook

fishhook is a tool from Facebook for dynamically rebinding symbols in linked Mach-O binaries. It relies on how Mach-O files are loaded: C functions are hooked by modifying the pointers in the lazy and non-lazy symbol pointer tables.

3. Cydia Substrate

Cydia Substrate, formerly known as Mobile Substrate, is mainly used to hook OC methods, C functions, and function addresses. It is not designed only for iOS; it works on Android as well. Official address: www.cydiasubstrate.com/

It uses Logos syntax, which I'll discuss in more detail in a future article.

Basic use of fishhook

Download

Git address: fishhook

If needed, you can also download this annotated version (link; extraction code: F4F8).

Demo

#import "ViewController.h"
#import "fishhook.h"

@interface ViewController ()
@end

// ---------- Replacing NSLog ----------
// Function pointer that will receive the address of the original NSLog
static void (*sys_nslog)(NSString *format, ...);

// The new function
void myNslog(NSString *format, ...) {
    format = [format stringByAppendingString:@" Hook!\n"];
    // Call the original NSLog through the saved pointer.
    // The variadic arguments are not forwarded, which is fine here because
    // the demo's format string takes no arguments.
    sys_nslog(format);
}

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Rebinding structure
    struct rebinding nslog;
    // Name of the function to hook, as a C string
    nslog.name = "NSLog";
    // Address of the new function
    nslog.replacement = myNslog;
    // Address of the pointer that will receive the original function's address
    nslog.replaced = (void *)&sys_nslog;
    // Rebinding array
    struct rebinding rebs[1] = {nslog};
    /**
     * Parameter 1: the array of rebinding structures
     * Parameter 2: the length of the array
     */
    rebind_symbols(rebs, 1);
}

- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    NSLog(@"Click the screen!!");
}

@end

Tap the screen and the following is printed:

001--fishHookDemo[15776:645816] Click the screen!! Hook!

The key function

rebind_symbols, source code:

int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
  // prepend_rebindings adds the whole rebindings array to the list headed by _rebindings_head.
  // fishhook stores the arguments of every rebind_symbols call in a linked list:
  // each call inserts one node at the head of the list, and the head is _rebindings_head.
  int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
  // If prepend_rebindings returned a value below 0, return it as the error code.
  if (retval < 0) {
    return retval;
  }
  // If _rebindings_head->next is NULL, this is the first call.
  if (!_rebindings_head->next) {
    // First call: register the callback with _dyld_register_func_for_add_image.
    // The callback is invoked immediately for every image dyld has already loaded,
    // and again for each image dyld loads later.
    _dyld_register_func_for_add_image(_rebind_symbols_for_image);
  } else {
    // Not the first call: walk the images that are already loaded and hook each one.
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
      _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
    }
  }
  return retval;
}
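To make the linked-list bookkeeping concrete, here is a minimal sketch of what a prepend_rebindings-style helper could look like. It is modeled on the behavior described in the comments above, not copied from fishhook: the node layout and field names are assumptions for illustration, and only struct rebinding's three fields (name, replacement, replaced) come from the demo.

```c
#include <stdlib.h>
#include <string.h>

/* The three fields used in the demo above. */
struct rebinding {
    const char *name;
    void *replacement;
    void **replaced;
};

/* One node per rebind_symbols call (layout assumed for illustration). */
struct rebindings_entry {
    struct rebinding *rebindings;
    size_t rebindings_nel;
    struct rebindings_entry *next;
};

/* Copy the caller's array into a new node and push it onto the head of the list.
   Returns 0 on success, -1 if an allocation fails. */
int prepend_rebindings(struct rebindings_entry **head,
                       const struct rebinding rebindings[], size_t nel) {
    struct rebindings_entry *entry = malloc(sizeof *entry);
    if (!entry) return -1;

    entry->rebindings = malloc(sizeof(struct rebinding) * nel);
    if (!entry->rebindings) {
        free(entry);
        return -1;
    }

    /* Copy, so the caller's array may live on the stack. */
    memcpy(entry->rebindings, rebindings, sizeof(struct rebinding) * nel);
    entry->rebindings_nel = nel;
    entry->next = *head;   /* the old head becomes the second node */
    *head = entry;
    return 0;
}
```

With this shape, the test `!_rebindings_head->next` in rebind_symbols is true exactly when the list holds a single node, that is, on the first call. Copying the array is also what makes it safe for the demo to build its rebs array as a local variable in viewDidLoad.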

Basic use of fishhook is simple.

  • Define a pointer in which fishhook can save the address of the original system function, then put the name of the function to replace and the address of your custom function into a struct, and call rebind_symbols. That's it.
  • You can also put several structs into the array to hook multiple functions at once.

Fishhook analysis

We won't repeat the principle behind hooking OC methods: it simply replaces a method's IMP, which is easy to understand given the dynamic runtime mechanism of the OC language.

But what about C?

We know that C functions are static: the implementation address is already known at build time, which is why calling a C function that has only a declaration and no definition produces an error. So why can fishhook change calls to C functions? Do these functions have some dynamic property after all? Let's explore how it works.

Note:

fishhook can hook system functions, not every C function. In other words, fishhook can only rebind functions that are reached through the symbol table, such as system library functions; it has no way to hook C functions implemented in your own binary.

We could write a C function ourselves and try it out.

#import "ViewController.h"
#import "fishhook.h"

@interface ViewController ()
@end

// The custom C function we will try to hook
void func(const char *str) {
    NSLog(@"%s", str);
}

// Pointer meant to record the original function
static void (*funcOri)(const char *);

// The intended replacement
void newfunc(const char *str) {
    NSLog(@"Hooked!");
    funcOri(str);
}

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    // Try to hook the custom C function
    struct rebinding Cfunction;
    Cfunction.name = "func";
    Cfunction.replacement = newfunc;
    Cfunction.replaced = (void *)&funcOri;
    struct rebinding resbs[1] = {Cfunction};
    rebind_symbols(resbs, 1);
}

// Call the function on every tap (assumed from the log output that follows)
- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    func("click on screen");
}

@end

Run it and tap the screen; the output is:

2019-11-12 14:54:19.001680+0800 fishhookDemo[35238:1563336] click on screen
2019-11-12 14:54:19.706819+0800 fishhookDemo[35238:1563336] click on screen

The custom C function was not hooked. Let's use the fishhook principle to explain why.

Fishhook principle

First of all:

C                                             OC
static                                        dynamic
function address determined at compile time   function address determined at run time

System C functions, however, do have a dynamic part. They are reached through what is usually called the symbol table, using a technique called position-independent code (PIC), and this is exactly what fishhook takes advantage of.

[Figure: the original diagram from fishhook]

Overview of the principle

In iOS, system libraries such as UIKit and Foundation are loaded into memory by dyld. To save space, Apple keeps them in one place: the dyld shared cache (macOS has one too).

Therefore the implementation of a function such as NSLog is not, and cannot be, in our own project's Mach-O. So how does a call to NSLog in our project find its real implementation address?

The process is as follows:

  • At build time, a region is set aside in the generated Mach-O executable. This region is in effect a symbol table, and it is stored in the __DATA segment (because the __DATA segment is readable and writable at run time).

  • Compile time: every reference in the project to a system library function in the shared cache points to a symbol's address. (For example, if the project calls NSLog, an NSLog symbol is created in the Mach-O at compile time, and the NSLog call in the project points to that symbol.)

  • Run time: when dyld loads the application into memory, it performs binding according to the libraries listed in the Load Commands. (For NSLog, dyld finds the real address of NSLog in Foundation and writes it into the NSLog symbol's entry in the __DATA symbol table.)

This mechanism is called PIC (position-independent code).
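The compile-time and run-time steps above can be modeled in a few lines of plain C. This is only a toy sketch: a global function pointer stands in for the symbol's entry in the __DATA segment, binder_stub stands in for dyld's binding work, and all of the names are made up for illustration.

```c
typedef int (*add_fn)(int, int);

/* Stands in for the real implementation inside a shared library. */
int real_add(int a, int b) { return a + b; }

int binder_stub(int a, int b);   /* forward declaration */

/* The symbol's slot. At launch it does not yet hold the real address;
   it points at a stub that knows how to perform the binding. */
add_fn add_slot = binder_stub;

/* Counts how many times binding actually ran. */
int bind_count = 0;

/* Runs on the first call only: write the real address into the slot,
   then forward the call that triggered the binding. */
int binder_stub(int a, int b) {
    bind_count++;
    add_slot = real_add;
    return add_slot(a, b);
}

/* Every call site in the "app" goes through the slot, never directly
   to real_add. */
int call_add(int a, int b) { return add_slot(a, b); }
```

The first call_add goes through binder_stub and patches the slot; every later call jumps straight to real_add. Because the slot is ordinary writable data, anything that can write to it can redirect the call.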

Now that we know how system functions are bound, look at the name of fishhook's function:

rebind_symbols: it rebinds symbols, which is about as direct as a name can be.

The principle is:

The symbols that calls to system library functions point to are rebound at run time to the address of a user-specified function, and the real address of the original system function is stored in a user-specified pointer.

So, looking back, why can't a custom C function be hooked?

The answer is simple:

  • The actual address of a custom C function is inside our own Mach-O, so there is no symbol and no binding process for it.
  • Its address is fixed at compile time, and there is nothing to manipulate afterwards.
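Both halves of this argument, the rebinding itself and the reason a custom function escapes it, fit into one small model. This is a sketch in plain C under loud assumptions: the one-entry table, the rebind helper and all function names are invented for illustration, and the return value (the number of patched slots) is the model's own convention, not that of rebind_symbols.

```c
#include <string.h>

typedef const char *(*fn_t)(void);

const char *system_log(void)  { return "system"; }  /* lives in a "shared library" */
const char *custom_func(void) { return "custom"; }  /* lives in our own binary */

/* Model of the symbol pointer table: only imported functions get a slot. */
struct symbol_slot {
    const char *name;
    fn_t pointer;
};
struct symbol_slot symbol_table[] = {
    { "system_log", system_log },
};

/* Imported function: the call site loads the pointer from its slot. */
const char *call_system_log(void) { return symbol_table[0].pointer(); }

/* Local function: the call site jumps straight to a fixed address. */
const char *call_custom_func(void) { return custom_func(); }

/* Patch every slot called `name`, handing the old pointer back through
   `replaced`. Returns the number of slots patched. */
int rebind(const char *name, fn_t replacement, fn_t *replaced) {
    int patched = 0;
    for (size_t i = 0; i < sizeof symbol_table / sizeof symbol_table[0]; i++) {
        if (strcmp(symbol_table[i].name, name) == 0) {
            if (replaced) *replaced = symbol_table[i].pointer;
            symbol_table[i].pointer = replacement;
            patched++;
        }
    }
    return patched;
}

fn_t original_log;   /* receives the original pointer */
const char *hooked_log(void) { return "hooked"; }
```

rebind("system_log", hooked_log, &original_log) patches one slot, so call_system_log() now reaches hooked_log while original_log still reaches the real function. rebind("custom_func", hooked_log, NULL) finds no slot and patches nothing, so call_custom_func() behaves exactly as before. That is the situation in the failed demo above.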

View the symbol table in Mach-O

Use MachOView to view directly.

Some readers will say: that all sounds fine in theory, but how do we verify it?

After all, crash-collection tools also rely heavily on symbol tables to restore symbols. So let's do a hands-on exercise that tests the theory and helps it stick at the same time.

Symbol tables: testing the theory in practice

As we can see in MachOView, there are two kinds of symbol tables:

  • Lazy Symbol Pointers
  • Non-Lazy Symbol Pointers

As the names suggest, one is bound lazily and the other is not. A lazy symbol is bound only the first time the function is called.

Therefore, when using fishhook, it is best to call the original function once beforehand, to rule out the case where the function has never been used and its symbol is still unbound.

So, shall we try it out?

Without further ado, go back to the demo we just wrote that hooks NSLog, and in viewDidLoad add NSLog(@"123");

Let's begin

1 Prepare code and breakpoints

2 View in MachOView

Press Cmd + R to run the project and stop at the breakpoint, then find the app's Mach-O file and open it in MachOView.

3 Calculate the symbol's address

  • First, MachOView shows that this symbol's offset from the start of the Mach-O file is 0x3028 (everyone's is different; use your own).

    So where is the Mach-O's address?

    Go to the project's LLDB console and enter the command: image list

    The first entry is the actual memory address of our project's Mach-O.

  • Open Calculator, press Cmd + 3, select hexadecimal, press Cmd + V to paste the Mach-O's memory address, and add the symbol offset 0x3028 from MachOView.

    Press Cmd + C to copy the result.
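If you would rather not use the calculator, the same arithmetic is a one-line addition. This is a sketch under two assumptions. First, it follows the shortcut used here and treats the offset shown by MachOView as the distance from the image's load address. Second, the base address 0x1042C5000 is not given anywhere in the article; it is what the article's two numbers imply (0x1042C8028 minus 0x3028). The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Run-time address of a symbol pointer = image load address + offset. */
uint64_t symbol_slot_address(uint64_t image_base, uint64_t offset) {
    return image_base + offset;
}

/* Format an address as a hexadecimal string that can be pasted after
   LLDB's x command. */
void format_address(uint64_t address, char *out, size_t out_size) {
    snprintf(out, out_size, "0x%llX", (unsigned long long)address);
}
```

Calling symbol_slot_address(0x1042C5000, 0x3028) gives 0x1042C8028, the value used in the next step. Your own base address comes from the first entry of image list and changes between launches.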

4 View memory and assembly code in LLDB

x 0x1042C8028 (your calculation result)

(memory read also works; x is short for memory read)

5 View the content of the first eight bytes

Note: iOS is little-endian, so read the bytes from right to left.

So my actual address in the figure above is 0x01042C69C0.
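Reading "from right to left" is just little-endian decoding: the byte at the lowest address is the least significant. Here is a minimal sketch of that rule in C. The helper name is made up, and the byte sequence mentioned after it is simply the article's address written out in little-endian order, since the memory dump itself is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine 8 bytes, lowest address first, into one 64-bit value. */
uint64_t read_le64(const uint8_t bytes[8]) {
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        /* Byte i contributes bits 8*i .. 8*i+7. */
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}
```

Given the bytes C0 69 2C 04 01 00 00 00 in memory order, read_le64 returns 0x01042C69C0, which matches reading the dump backwards by hand.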

dis -s 0x01042C69C0 (your own address)

We can see that there is nothing recognizable as NSLog there, which means that at this breakpoint the symbol has not yet been bound.

Step past this breakpoint and continue to the second one.

(If you are unfamiliar with assembly, just compare this with the result at the second breakpoint. The author may also write more about assembly in the future.)

Look at the symbol again

x 0x1042C8028 (your calculation result)

You can see that the content has obviously changed.

dis -s 0x01042C69C0 (your own address, as read from memory this time)

There it is. Compare this with the principle section above: fully verified!

Not so fast: this only verifies the PIC part of iOS. What about fishhook?

  • Add a breakpoint in touchesBegan (it doesn't have to be in touchesBegan; I put it there only because in my demo there is no code after fishhook's rebind_symbols call to break on).
  • Step past the current breakpoint (rebind_symbols), then tap the screen to reach the next breakpoint.
  • Look at memory and the assembly again.

The results are as follows:

After fishhook rebinds it, the symbol points to the address of our custom function. The hypothesis is fully verified.

Finally

fishhook is very important in reverse engineering, and many tools have fishhook built in, so I hope you will take the time to understand and master the principle.

Thanks for watching, and we’ll see you next time.