Author: Doug, 10+ years embedded development veteran, focusing on: C/C++, embedded, Linux.


Other people’s experience, our ladder!

Hello everyone, I am Brother Dao. The technical topic I want to explain today is: [dynamic library memory processing].

In an article I reposted last week, I described a technique for replacing the functions called inside a dynamic library in order to achieve specific goals.

This technique is xHook, open-sourced by iQiyi. Its GitHub address is: github.com/iqiyi/xHook.

In the official documentation, the author describes the scenario in terms of Android. Because Android is built on Linux, the hook technique described here also applies to other Linux systems.

In this article, we will learn how to find the target (the address of the called function) step by step, and then replace it with another function address.

It’s a long article, but it’s worth spending half a day, or even a few days, researching the knowledge points.

It may not improve your programming skills immediately, but as material for building up your fundamentals it is absolutely first-class!

In the process of learning, I will add my own learning experience, or understanding, in some important places with orange fonts. If you have any misunderstanding, please point it out and discuss it together.

In order to make it easier to read, I added font colors to key words in the original text.

Theory and Practice

One of the best books on dynamic libraries on the market is probably Self-training for Programmers – Linking, loading, and Libraries.

The copy in my hand is the 29th printing, from June 2019, which shows how popular this book is!

If you have read this book, you may feel that it is too theoretical. Even if you know the truth, how should you put it into practice? Or, what can you do with it?

iQiyi's xHook is a perfect example of putting this theoretical knowledge into practice!

Self-training for Programmers – Linking, loading, and Libraries is a rare book. If you are interested in dynamic libraries, you are advised to get a paper book and support the author!

If you just want to browse, I have a PDF version here (I forget where I downloaded it from) and it’s on a web disk.

If you need it, please leave a message at the background of the public account [IOT Town] : 1031, and you can get the download link.

Start

New dynamic library

We have a new dynamic library: libtest.so.

The header file test.h

#ifndef TEST_H
#define TEST_H 1

#ifdef __cplusplus
extern "C" {
#endif

void say_hello();

#ifdef __cplusplus
}
#endif

#endif

The source file test.c

#include <stdlib.h>
#include <stdio.h>

void say_hello()
{
    char *buf = malloc(1024);
    if(NULL != buf)
    {
        snprintf(buf, 1024, "%s", "hello\n");
        printf("%s", buf);
    }
}

The function of say_hello is to print the six characters “hello\n” (including the ending \n) on the terminal.

We need a test program: main.

The source file main.c

#include <test.h>

int main()
{
    say_hello();
    return 0;
}

Compile them to generate libtest.so and main, respectively. Run it:

caikelun@debian:~$ adb push ./libtest.so ./main /data/local/tmp
caikelun@debian:~$ adb shell "chmod +x /data/local/tmp/main"
caikelun@debian:~$ adb shell "export LD_LIBRARY_PATH=/data/local/tmp; /data/local/tmp/main"
hello
caikelun@debian:~$

That’s great! The libtest.so code looks silly, but it works correctly, so what’s to complain about?

Start using it in the new version of the APP!

Unfortunately, as you may have noticed, libtest.so has a serious memory leak problem, leaking 1024 bytes of memory every time the say_hello function is called.

After the new version of the app was released, the crash rate began to rise, with all kinds of weird crash messages and reports.

Problems faced

Fortunately, we fixed the problem with libtest.so. But what about the future? We face two problems:

  1. When test coverage is insufficient, how can we promptly discover and accurately locate such problems in the released app?

  2. If libtest.so is a system library on some device models, or a closed-source third-party library, how can we fix it? And how can we monitor its behavior?

What can we do?

If we can hook (replace, intercept, eavesdrop, or whatever you want to describe correctly) function calls in dynamic libraries, we can do a lot of the things we want to do.

For example, by hooking malloc, calloc, realloc and free, we can count how much memory each dynamic library allocates and which memory stays occupied.

Can it really be done? The answer is: it is perfectly ok to hook our own processes.

Hooking other processes requires root privileges (without root, you cannot modify another process's memory space or inject code into it).

Fortunately, we only need to hook ourselves.

If you hook a process that does not belong to you, it really belongs to a virus!

Process-level isolation, typically handled by the operating system!

ELF

Doug’s note:

For a more detailed introduction to ELF, take a look at my previous article: The compiled, linked cornerstone of Linux -ELF files: Peel back its layers and explore the granularity of bytecode.

This article is as detailed as peeling an onion, analyzing the structure of the ELF file layer by layer.

In addition, the binary contents in the ELF file can be directly compared with the relevant structure member variables by means of pictures.

Overview

ELF (Executable and Linkable Format) is an industry-standard binary data encapsulation format, mainly used to encapsulate executable files, dynamic libraries, object files and core dump files.

The source code is compiled and linked using the Google NDK, resulting in dynamic libraries or executables in ELF format.

readelf lets you view the basic information of ELF files, and objdump lets you view their disassembly output.

An overview of the ELF format can be found here, and a complete definition can be found here.

The most important parts are: ELF file headers, SHT (Section Header Table), and PHT (Program Header Table).

The ELF file header

ELF files start with a fixed-length header in a fixed format (52 bytes for 32-bit architectures and 64 bytes for 64-bit architectures). The ELF file header starts with the magic number 0x7F 0x45 0x4C 0x46 (the last three bytes correspond to the visible characters E, L and F).
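As a small illustration of the magic-number check, the sketch below tests whether a buffer starts with these four bytes and then looks at the class byte that follows them. The function name elf_class is made up for this example; the class values (1 for 32-bit, 2 for 64-bit) come from the ELF specification, not from this article.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 32 or 64 for a buffer that starts with a
 * valid ELF identification, or 0 if the magic number does not match. */
int elf_class(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (buf == NULL || len < 5) return 0;            /* magic + class byte */
    if (memcmp(buf, magic, sizeof(magic)) != 0) return 0;

    switch (buf[4]) {                                /* e_ident[EI_CLASS] */
    case 1:  return 32;                              /* ELFCLASS32 */
    case 2:  return 64;                              /* ELFCLASS64 */
    default: return 0;
    }
}
```

For a 32-bit library such as libtest.so, whose identification bytes begin 7f 45 4c 46 01, this returns 32.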

ELF header for libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -h ./libtest.so
 
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12744 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         25
  Section header string table index: 24

The ELF file header contains the starting position and length of SHT and PHT in the current ELF file.

For example, the SHT of libtest.so starts at offset 12744 and has 25 entries of 40 bytes each.

The PHT starts at offset 52 and has 8 entries of 32 bytes each.

SHT (Section Header table)

ELF organizes and manages information in sections.

ELF uses SHT to record basic information for all sections.

It mainly includes: section type, offset in file, size, relative address of virtual memory after loading into memory, alignment of bytes in memory, etc.

The SHT of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -S ./libtest.so

There are 25 section headers, starting at offset 0x31c8:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .note.android.ide NOTE            00000134 000134 000098 00   A  0   0  4
  [ 2] .note.gnu.build-i NOTE            000001cc 0001cc 000024 00   A  0   0  4
  [ 3] .dynsym           DYNSYM          000001f0 0001f0 0003a0 10   A  4   1  4
  [ 4] .dynstr           STRTAB          00000590 000590 0004b1 00   A  0   0  1
  [ 5] .hash             HASH            00000a44 000a44 000184 04   A  3   0  4
  [ 6] .gnu.version      VERSYM          00000bc8 000bc8 000074 02   A  3   0  2
  [ 7] .gnu.version_d    VERDEF          00000c3c 000c3c 00001c 00   A  4   1  4
  [ 8] .gnu.version_r    VERNEED         00000c58 000c58 000020 00   A  4   1  4
  [ 9] .rel.dyn          REL             00000c78 000c78 000040 08   A  3   0  4
  [10] .rel.plt          REL             00000cb8 000cb8 0000f0 08  AI  3  18  4
  [11] .plt              PROGBITS        00000da8 000da8 00017c 00  AX  0   0  4
  [12] .text             PROGBITS        00000f24 000f24 0015a4 00  AX  0   0  4
  [13] .ARM.extab        PROGBITS        000024c8 0024c8 00003c 00   A  0   0  4
  [14] .ARM.exidx        ARM_EXIDX       00002504 002504 000100 08  AL 12   0  4
  [15] .fini_array       FINI_ARRAY      00003e3c 002e3c 000008 04  WA  0   0  4
  [16] .init_array       INIT_ARRAY      00003e44 002e44 000004 04  WA  0   0  1
  [17] .dynamic          DYNAMIC         00003e48 002e48 000118 08  WA  4   0  4
  [18] .got              PROGBITS        00003f60 002f60 0000a0 00  WA  0   0  4
  [19] .data             PROGBITS        00004000 003000 000004 00  WA  0   0  4
  [20] .bss              NOBITS          00004004 003004 000000 00  WA  0   0  1
  [21] .comment          PROGBITS        00000000 003004 000065 01  MS  0   0  1
  [22] .note.gnu.gold-ve NOTE            00000000 00306c 00001c 00      0   0  4
  [23] .ARM.attributes   ARM_ATTRIBUTES  00000000 003088 00003b 00      0   0  1
  [24] .shstrtab         STRTAB          00000000 0030c3 000102 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  y (noread), p (processor specific)

The sections that are more important and have a greater relationship with hook are:

.dynstr: The string table used for dynamic linking. It stores the strings (such as symbol names) that other sections refer to by index.

.dynsym: Stores symbol information (symbol type, starting address, size, index of the symbol name in .dynstr, etc.). A function is also a symbol.

.text: The machine instructions generated by compiling the program code.

.dynamic: Information used by the dynamic linker. It records the external dependencies of the current ELF and the starting positions of the other important sections.

.got: Global Offset Table. It records the entry addresses of external calls. When the dynamic linker performs relocation, the absolute addresses of the real external calls are filled in here.

.plt: Procedure Linkage Table. A springboard for external calls, mainly used to support lazy binding when relocating external calls. (Android currently only supports lazy binding on the MIPS architecture.)

.rel.plt: Relocation information for direct calls to external functions.

.rel.dyn: Relocation information other than .rel.plt. (For example, calling an external function through a global function pointer.)

Doug’s note:

The dynamic section in ELF files is very important!

When a dynamic library is loaded into memory, the dynamic linker reads the contents of this section. For example:

which other shared objects it depends on;

the location of the dynamic symbol table (.dynsym);

the location of the dynamic relocation tables;

the location of the initialization code;

and so on.

To view the contents of .dynamic in a dynamic library, use the readelf -d xxx.so command.

In addition, the .got and .plt sections are what make position-independent code possible.

If you look up material on -fPIC, you will find these two topics explained.

To sum up: in a Linux dynamic library, the address-dependent parts of the code segment are all made "address-independent" by the principle of "adding a layer of indirection".

In this way, once the code segment of a dynamic library has been loaded into physical memory, it can be shared by multiple processes by mapping its physical addresses into the virtual address space of each process.

The "address-dependent" parts are placed in .got (references to variables) and .plt (references to functions).

PHT (Program Header Table)

ELF is loaded into memory in units of segments. A segment contains one or more sections.

ELF uses PHT to record basic information for all segments.

It includes the type of segment, offset in the file, size, relative address of virtual memory after loading into memory, and alignment of bytes in memory.

The PHT of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -l ./libtest.so 

Elf file type is DYN (Shared object file)
Entry point 0x0
There are 8 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x00000034 0x00000034 0x00100 0x00100 R   0x4
  LOAD           0x000000 0x00000000 0x00000000 0x02604 0x02604 R E 0x1000
  LOAD           0x002e3c 0x00003e3c 0x00003e3c 0x001c8 0x001c8 RW  0x1000
  DYNAMIC        0x002e48 0x00003e48 0x00003e48 0x00118 0x00118 RW  0x4
  NOTE           0x000134 0x00000134 0x00000134 0x000bc 0x000bc R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  EXIDX          0x002504 0x00002504 0x00002504 0x00100 0x00100 R   0x4
  GNU_RELRO      0x002e3c 0x00003e3c 0x00003e3c 0x001c4 0x001c4 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .note.android.ident .note.gnu.build-id .dynsym .dynstr .hash .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .rel.plt .plt .text .ARM.extab .ARM.exidx 
   02     .fini_array .init_array .dynamic .got .data 
   03     .dynamic 
   04     .note.android.ident .note.gnu.build-id 
   05     
   06     .ARM.exidx 
   07     .fini_array .init_array .dynamic .got

All segments of type PT_LOAD are mmapped into memory by the dynamic linker.
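Because mmap works on whole pages, each PT_LOAD segment ends up occupying a page-aligned address range. The sketch below computes that range from p_vaddr and p_memsz, assuming 4 KiB pages (the 0x1000 in the Align column). The names load_range and struct range are illustrative only and are not part of any linker API.

```c
#include <stdint.h>

#define PAGE_SZ 0x1000u   /* assumption: 4 KiB pages, as in the Align column */

struct range { uint32_t start; uint32_t end; };   /* half-open: [start, end) */

/* Page-aligned range covered by one PT_LOAD segment,
 * expressed relative to the base address of the library. */
struct range load_range(uint32_t p_vaddr, uint32_t p_memsz)
{
    struct range r;
    r.start = p_vaddr & ~(PAGE_SZ - 1);                            /* round down */
    r.end   = (p_vaddr + p_memsz + PAGE_SZ - 1) & ~(PAGE_SZ - 1);  /* round up   */
    return r;
}
```

With the two LOAD entries of libtest.so, load_range(0x0, 0x2604) gives [0x0, 0x3000) and load_range(0x3e3c, 0x1c8) gives [0x3000, 0x5000).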

Linking View and Execution View

Link view: Data organized in sections before ELF is loaded into memory.

Execution view: ELF data organized in segments after it has been loaded into memory.

The hook operations we care about are dynamic memory operations, so we are mainly concerned with the execution view, that is, how the data in the ELF is organized and stored after the ELF is loaded into memory.

.dynamic section

This is a very important and special section that contains information such as the memory location of the other ELF sections.

In the execution view, there is always a segment of type PT_DYNAMIC that contains the contents of the .dynamic section.

Both the hook code and the dynamic linker go through the PT_DYNAMIC segment to find the memory location of the .dynamic section, and then read the information about the other sections from it.

The .dynamic section of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -d ./libtest.so 

Dynamic section at offset 0x2e48 contains 30 entries:
  Tag        Type                         Name/Value
 0x00000003 (PLTGOT)                     0x3f7c
 0x00000002 (PLTRELSZ)                   240 (bytes)
 0x00000017 (JMPREL)                     0xcb8
 0x00000014 (PLTREL)                     REL
 0x00000011 (REL)                        0xc78
 0x00000012 (RELSZ)                      64 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffa (RELCOUNT)                   3
 0x00000006 (SYMTAB)                     0x1f0
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000005 (STRTAB)                     0x590
 0x0000000a (STRSZ)                      1201 (bytes)
 0x00000004 (HASH)                       0xa44
 0x00000001 (NEEDED)                     Shared library: [libc.so]
 0x00000001 (NEEDED)                     Shared library: [libm.so]
 0x00000001 (NEEDED)                     Shared library: [libstdc++.so]
 0x00000001 (NEEDED)                     Shared library: [libdl.so]
 0x0000000e (SONAME)                     Library soname: [libtest.so]
 0x0000001a (FINI_ARRAY)                 0x3e3c
 0x0000001c (FINI_ARRAYSZ)               8 (bytes)
 0x00000019 (INIT_ARRAY)                 0x3e44
 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
 0x0000001e (FLAGS)                      BIND_NOW
 0x6ffffffb (FLAGS_1)                    Flags: NOW
 0x6ffffff0 (VERSYM)                     0xbc8
 0x6ffffffc (VERDEF)                     0xc3c
 0x6ffffffd (VERDEFNUM)                  1
 0x6ffffffe (VERNEED)                    0xc58
 0x6fffffff (VERNEEDNUM)                 1
 0x00000000 (NULL)                       0x0
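The .dynamic section is an array of (tag, value) pairs terminated by a NULL tag, so looking up an entry is a linear scan. The sketch below uses a hand-written struct and a few rows copied from the readelf output instead of the real Elf32_Dyn type, so that it stays self-contained. The names dyn_entry, dyn_lookup and sample_dynamic are invented for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for Elf32_Dyn: one tag and one value. */
struct dyn_entry { uint32_t tag; uint32_t val; };

/* Tag numbers as printed in the first column of `readelf -d`. */
enum {
    TAG_NULL = 0x0, TAG_PLTRELSZ = 0x2, TAG_PLTGOT = 0x3,
    TAG_STRTAB = 0x5, TAG_SYMTAB = 0x6, TAG_JMPREL = 0x17
};

/* Scan the array up to the terminating NULL tag.
 * Returns 1 and stores the value in *out if the tag is present, else 0. */
int dyn_lookup(const struct dyn_entry *dyn, uint32_t tag, uint32_t *out)
{
    for (; dyn->tag != TAG_NULL; dyn++) {
        if (dyn->tag == tag) {
            *out = dyn->val;
            return 1;
        }
    }
    return 0;
}

/* A few entries of libtest.so, copied from the readelf -d output. */
static const struct dyn_entry sample_dynamic[] = {
    { TAG_PLTGOT,   0x3f7c },
    { TAG_PLTRELSZ, 240    },
    { TAG_JMPREL,   0xcb8  },
    { TAG_SYMTAB,   0x1f0  },
    { TAG_STRTAB,   0x590  },
    { TAG_NULL,     0      },
};
```

Looking up TAG_JMPREL in sample_dynamic yields 0xcb8, which is the relative address of .rel.plt in the section table.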

Dynamic linker

The dynamic linker program in Android is linker. The source code is here.

The general steps for dynamic linking (such as executing dlopen) are:

  1. Check the list of ELFs that are already loaded. (If libtest.so is already loaded, it is not loaded again; its reference count is simply incremented by one and the call returns.)

  2. Read the list of external dependencies of libtest.so from its .dynamic section, remove the ELFs that are already loaded from this list, and obtain the complete list of ELFs to load this time (including libtest.so itself).

  3. Load the ELFs in the list one by one. Loading steps:

(1) Use mmap to reserve a block of memory large enough for the subsequent mapping of the ELF. (MAP_PRIVATE)

(2) Read the PHT of the ELF and use mmap to map all segments of type PT_LOAD into memory.

(3) Read each entry from the .dynamic segment, mainly the relative virtual memory address of each section, then calculate and save the absolute virtual memory address of each section.

(4) Perform relocation (relocate). This is the most critical step. Relocation information may exist in one or more of the following sections: .rel.plt, .rela.plt, .rel.dyn, .rela.dyn, .rel.android, .rela.android. The dynamic linker has to process the relocation requests in these .relxxx sections one by one. Based on the information of the ELFs already loaded, the dynamic linker looks up the address of the required symbol (for example malloc, needed by libtest.so) and, once found, writes the address value to the "destination address" specified in .relxxx. These "destination addresses" usually live in .got or .data.

(5) Increment the reference count of the ELF by one.

  4. Call the constructors of the ELFs in the list one by one. Their addresses were read from the .dynamic segment earlier (types DT_INIT and DT_INIT_ARRAY). The constructors are called layer by layer according to the dependency relationships: the constructors of the dependencies are called first, and the constructor of libtest.so itself is called last. (An ELF can also define destructors, which are called automatically when the ELF is unloaded.)

Wait a minute! We seem to be on to something! Look at the relocation step again.

Do we only need to get the "destination address" from these .relxxx sections, write a new function address into that "destination address", and the hook is done? Maybe.
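The idea can be tried in miniature without any ELF parsing. In the sketch below an ordinary function pointer plays the role of a .got entry: the caller always jumps through the slot, so overwriting the slot redirects the call. All names here (alloc_slot, counting_malloc, library_code, install_hook, bytes_seen) are invented for this illustration, and this is only a model of the principle, not how xHook is implemented.

```c
#include <stdlib.h>
#include <stddef.h>

typedef void *(*alloc_fn)(size_t);

/* Plays the role of a .got entry: initially it holds the real function,
 * as it would after the dynamic linker has finished relocation. */
static alloc_fn alloc_slot = malloc;

static size_t hooked_bytes = 0;   /* total bytes requested through the hook */

/* Replacement function: record the size, then call the real malloc. */
void *counting_malloc(size_t size)
{
    hooked_bytes += size;
    return malloc(size);
}

/* Plays the role of library code: it only ever calls through the slot. */
void library_code(void)
{
    void *p = alloc_slot(1024);
    free(p);
}

/* The "hook": overwrite the slot and hand back the previous value. */
alloc_fn install_hook(alloc_fn replacement)
{
    alloc_fn old = alloc_slot;
    alloc_slot = replacement;
    return old;
}

size_t bytes_seen(void)
{
    return hooked_bytes;
}
```

Calling library_code() before install_hook(counting_malloc) leaves bytes_seen() at 0; calling it once afterwards raises it to 1024, even though library_code itself was never changed.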

Tracking

Verifying this with static analysis is easy. Take the armeabi-v7a build of libtest.so as an example.

Let's look at the assembly code of the say_hello function. First, find it in the dynamic symbol table:

caikelun@debian:~/$ arm-linux-androideabi-readelf -s ./libtest.so

Symbol table '.dynsym' contains 58 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND __cxa_finalize@LIBC (2)
     2: 00000000     0 FUNC    GLOBAL DEFAULT  UND snprintf@LIBC (2)
     3: 00000000     0 FUNC    GLOBAL DEFAULT  UND malloc@LIBC (2)
     4: 00000000     0 FUNC    GLOBAL DEFAULT  UND __cxa_atexit@LIBC (2)
     5: 00000000     0 FUNC    GLOBAL DEFAULT  UND printf@LIBC (2)
     6: 00000f61    60 FUNC    GLOBAL DEFAULT   12 say_hello
................

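Found it! say_hello is at address f61, and its machine code occupies 60 (decimal) bytes. (The odd address marks a Thumb function: bit 0 of the symbol value is set, and the code itself starts at f60.)

Use objdump to view the disassembly output of say_hello.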

caikelun@debian:~$ arm-linux-androideabi-objdump -D ./libtest.so
...............

00000f60 <say_hello@@Base>:
     f60:  b5b0        push    {r4, r5, r7, lr}
     f62:  af02        add     r7, sp, #8
     f64:  f44f 6080   mov.w   r0, #1024       ; 0x400
     f68:  f7ff ef34   blx     dd4 <malloc@plt>
     f6c:  4604        mov     r4, r0
     f6e:  b16c        cbz     r4, f8c <say_hello@@Base+0x2c>
     f70:  a507        add     r5, pc, #28     ; (adr r5, f90 <say_hello@@Base+0x30>)
     f72:  a308        add     r3, pc, #32     ; (adr r3, f94 <say_hello@@Base+0x34>)
     f74:  4620        mov     r0, r4
     f76:  f44f 6180   mov.w   r1, #1024       ; 0x400
     f7a:  462a        mov     r2, r5
     f7c:  f7ff ef30   blx     de0 <snprintf@plt>
     f80:  4628        mov     r0, r5
     f82:  4621        mov     r1, r4
     f84:  e8bd 40b0   ldmia.w sp!, {r4, r5, r7, lr}
     f88:  f001 ba96   b.w     24b8 <_Unwind_GetTextRelBase@@Base+0x8>
     f8c:  bdb0        pop     {r4, r5, r7, pc}
     f8e:  bf00        nop
     f90:  7325        strb    r5, [r4, #12]
     f92:  0000        movs    r0, r0
     f94:  6568        str     r0, [r5, #84]   ; 0x54
     f96:  6c6c        ldr     r4, [r5, #68]   ; 0x44
     f98:  0a6f        lsrs    r7, r5, #9
     f9a:  0000        movs    r0, r0
...............

The call to malloc corresponds to the instruction blx dd4, which jumps to address dd4.

Let's see what is at this address:

caikelun@debian:~$ arm-linux-androideabi-objdump -D ./libtest.so
...............

00000dd4 <malloc@plt>:
     dd4:  e28fc600    add     ip, pc, #0, 12
     dd8:  e28cca03    add     ip, ip, #12288  ; 0x3000
     ddc:  e5bcf1b4    ldr     pc, [ip, #436]! ; 0x1b4
...............

Sure enough, the call jumps into .plt. After a few address calculations, execution finally jumps to the address stored at address 3f90, which is a function pointer.

A quick explanation: because ARM processors use a three-stage pipeline, the value of pc read by the first instruction is the address of the currently executing instruction + 8.

So: dd4 + 8 + 3000 + 1b4 = 3f90 (all values in hexadecimal).
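The sum can be checked mechanically. The snippet below replays the three instructions of malloc@plt on a plain integer, assuming ARM (not Thumb) mode, where reading pc yields the address of the instruction plus 8. The function name plt_target_slot is made up for this example.

```c
#include <stdint.h>

/* Replays the three instructions of malloc@plt.
 * insn_addr is the address of the first PLT instruction (0xdd4 here). */
uint32_t plt_target_slot(uint32_t insn_addr)
{
    uint32_t ip = insn_addr + 8;  /* add ip, pc, #0, 12   : pc reads as insn + 8 */
    ip += 0x3000;                 /* add ip, ip, #12288                          */
    ip += 0x1b4;                  /* ldr pc, [ip, #436]!  : load from ip + 0x1b4 */
    return ip;                    /* address of the slot the jump target is read from */
}
```

plt_target_slot(0xdd4) evaluates to 0x3f90.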

Where is address 3F90?

caikelun@debian:~$ arm-linux-androideabi-objdump -D ./libtest.so
...............

00003f60 <.got>:
    ...
    3f70:  00002604    andeq   r2, r0, r4, lsl #12
    3f74:  00002504    andeq   r2, r0, r4, lsl #10
    ...
    3f88:  00000da8    andeq   r0, r0, r8, lsr #27
    3f8c:  00000da8    andeq   r0, r0, r8, lsr #27
    3f90:  00000da8    andeq   r0, r0, r8, lsr #27
...............

Sure enough, it is in .got.

While we are at it, take a look at .rel.plt:

caikelun@debian:~$ arm-linux-androideabi-readelf -r ./libtest.so

Relocation section '.rel.plt' at offset 0xcb8 contains 30 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00003f88  00000416 R_ARM_JUMP_SLOT   00000000   __cxa_atexit@LIBC
00003f8c  00000116 R_ARM_JUMP_SLOT   00000000   __cxa_finalize@LIBC
00003f90  00000316 R_ARM_JUMP_SLOT   00000000   malloc@LIBC
...............

The address of malloc is stored exactly at 3f90. This is definitely no coincidence.

Doug’s note:

The .rel.plt section holds the relocation table, that is, it records which function addresses need to be relocated.

After the linker has loaded all the dependent shared objects into memory, it merges the symbols of each shared object into a global symbol table.

It then checks the .rel.plt of each shared object to see whether any addresses need to be relocated.

If so, it looks up the memory address of the symbol in the global symbol table and writes it into the corresponding slot in .got.

What are you waiting for? Let’s change the code. Our main.c should look like this:

#include <stdio.h>
#include <stdlib.h>
#include <test.h>

void *my_malloc(size_t size)
{
    printf("%zu bytes memory are allocated by libtest.so\n", size);
    return malloc(size);
}

int main()
{
    void **p = (void **)0x3f90;
    *p = (void *)my_malloc; // do hook
    
    say_hello();
    return 0;
}

Compile and run:

caikelun@debian:~$ adb push ./main /data/local/tmp
caikelun@debian:~$ adb shell "chmod +x /data/local/tmp/main"
caikelun@debian:~$ adb shell "export LD_LIBRARY_PATH=/data/local/tmp; /data/local/tmp/main"
Segmentation fault
caikelun@debian:~$

The idea is correct, but it still failed, because this code has three problems:

  1. 3f90 is a relative memory address; it needs to be converted to an absolute address.

  2. The absolute address corresponding to 3f90 probably has no write permission, so assigning to it directly causes a segmentation fault.

  3. Even if the new function address is written successfully, my_malloc may not be executed, because the processor has an instruction cache.

We need to address these issues.

Memory

Base address

In the memory space of the process, the loading address of various ELFs is random, and the loading address, the base address, is available only at runtime.

Doug’s note:

When we inspect a dynamic library file, the addresses we see start at 0x0000_0000.

When a dynamic library is loaded into memory, its load address is not fixed, because it depends on the loading order.

Looked at from another angle: if the dynamic libraries that a process depends on were always loaded into memory in the same order, the load address of each dynamic library would also be fixed, so in theory the relocated code could be saved after the first relocation.

Then, when the process is started again later, no relocation would be needed, which speeds up program startup.

We need to know the ELF base address to convert relative addresses into absolute ones.

Readers who are familiar with Linux development will know that we can simply call dl_iterate_phdr. See here for a detailed definition.

Doug’s note:

The dl_iterate_phdr function is really useful: through a callback, it reports the load address of every dynamic library that has been loaded.

Without this function, this information would have to be retrieved from /proc/xxx/maps, which is slow because a lot of string processing is involved.

Hmm, wait. Years of being burned by Android development tell us to take another look at the link.h header in the NDK:
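For reference, a minimal sketch of calling dl_iterate_phdr on a Linux system with glibc is shown below. The callback signature and the dlpi_addr and dlpi_name fields are part of the documented interface; the names print_object and list_loaded_objects are made up for this example, and _GNU_SOURCE is defined because glibc only declares the function with it.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>
#include <stddef.h>

/* Callback invoked once for every loaded ELF object.
 * dlpi_addr is the base address; dlpi_name is empty for the main program. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name : "(main program)";

    (void)size;   /* size of the struct, unused in this sketch */
    printf("%p %s\n", (void *)info->dlpi_addr, name);
    (*count)++;
    return 0;     /* a non-zero return value would stop the iteration */
}

/* Prints every loaded object and returns how many were visited. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Each printed line pairs a base address with a path, which is the same information the offset-0 lines of /proc/self/maps provide, without any string parsing.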

#if defined(__arm__)

#if __ANDROID_API__ >= 21
int dl_iterate_phdr(int (*__callback)(struct dl_phdr_info*, size_t, void*), void* __data) __INTRODUCED_IN(21);
#endif /* __ANDROID_API__ >= 21 */

#else
int dl_iterate_phdr(int (*__callback)(struct dl_phdr_info*, size_t, void*), void* __data);
#endif

Why?! On the ARM architecture, Android versions below 5.0 (API level 21) do not support dl_iterate_phdr!

Our app has to support all Android versions from 4.0 upwards.

And it is ARM, of all architectures. How can it not be supported?! How is anyone supposed to write code like this?

Fortunately, we realized that we can also parse /proc/self/maps:

root@android:/ # ps | grep main
ps | grep main
shell     7884  7882  2616   1016  hrtimer_na b6e83824 S /data/local/tmp/main

root@android:/ # cat /proc/7884/maps
cat /proc/7884/maps

address           perms offset  dev   inode  pathname
---------------------------------------------------------------------
...........
b6e42000-b6eb5000 r-xp 00000000 b3:17 57457  /system/lib/libc.so
b6eb5000-b6eb9000 r--p 00072000 b3:17 57457  /system/lib/libc.so
b6eb9000-b6ebc000 rw-p 00076000 b3:17 57457  /system/lib/libc.so
b6ec6000-b6ec9000 r-xp 00000000 b3:19 753708 /data/local/tmp/libtest.so
b6ec9000-b6eca000 r--p 00002000 b3:19 753708 /data/local/tmp/libtest.so
b6eca000-b6ecb000 rw-p 00003000 b3:19 753708 /data/local/tmp/libtest.so
b6f03000-b6f20000 r-xp 00000000 b3:17 32860  /system/bin/linker
b6f20000-b6f21000 r--p 0001c000 b3:17 32860  /system/bin/linker
b6f21000-b6f23000 rw-p 0001d000 b3:17 32860  /system/bin/linker
b6f25000-b6f26000 r-xp 00000000 b3:19 753707 /data/local/tmp/main
b6f26000-b6f27000 r--p 00000000 b3:19 753707 /data/local/tmp/main
becd5000-becf6000 rw-p 00000000 00:00 0      [stack]
ffff0000-ffff1000 r-xp 00000000 00:00 0      [vectors]
...........

maps returns the mmap information of the memory space of the specified process, including the various dynamic libraries, executables (such as linker), stack space, heap space, and even font files.

A detailed explanation of the maps format is available here.

Our libtest.so has 3 lines in maps.

The start address of the first line, the one with offset 0 (b6ec6000), is in most cases the base address we are looking for.
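To make the lookup concrete, here is a small sketch that parses maps lines, picks the offset-0 line of a library as its base address, and reports the permissions of the mapping that contains a given address. The three sample lines are the libtest.so lines from the output above; struct map_entry, parse_maps_line, find_base and perms_at are names invented for this example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The fields of a maps line that matter for hooking. */
struct map_entry {
    uintptr_t start;     /* first address of the mapping */
    uintptr_t end;       /* one past the last address    */
    uintptr_t offset;    /* offset into the mapped file  */
    char      perms[5];  /* for example "r-xp"           */
};

/* Parses "start-end perms offset ..." and returns 1 on success. */
int parse_maps_line(const char *line, struct map_entry *e)
{
    return sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s %" SCNxPTR,
                  &e->start, &e->end, e->perms, &e->offset) == 4;
}

/* Base address: start of the line that names the library with offset 0. */
uintptr_t find_base(const char *const *lines, size_t n, const char *name)
{
    struct map_entry e;
    size_t i;

    for (i = 0; i < n; i++) {
        if (strstr(lines[i], name) != NULL &&
            parse_maps_line(lines[i], &e) && e.offset == 0)
            return e.start;
    }
    return 0;
}

/* Copies the permissions of the mapping containing addr into out. */
int perms_at(const char *const *lines, size_t n, uintptr_t addr, char out[5])
{
    struct map_entry e;
    size_t i;

    for (i = 0; i < n; i++) {
        if (parse_maps_line(lines[i], &e) && addr >= e.start && addr < e.end) {
            memcpy(out, e.perms, sizeof(e.perms));
            return 1;
        }
    }
    return 0;
}

/* The three libtest.so lines from the maps output in this article. */
static const char *const sample_maps[] = {
    "b6ec6000-b6ec9000 r-xp 00000000 b3:19 753708 /data/local/tmp/libtest.so",
    "b6ec9000-b6eca000 r--p 00002000 b3:19 753708 /data/local/tmp/libtest.so",
    "b6eca000-b6ecb000 rw-p 00003000 b3:19 753708 /data/local/tmp/libtest.so",
};
```

For these lines find_base returns 0xb6ec6000, so the slot at relative address 0x3f90 sits at 0xb6ec9f90. perms_at reports "r--p" for that address, which is why writing to it directly ends in a segmentation fault.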

Memory access

The information returned by maps already includes the access permissions.

To perform a hook we need write permission, which can be obtained with mprotect:

#include <sys/mman.h>

int mprotect(void *addr, size_t len, int prot);

Note that memory access permissions can only be changed in units of pages.

Detailed documentation for mprotect is available here.

Instruction cache

Note that the section type of .got and .data is PROGBITS, the same type used for sections that hold executable code, and the processor may cache this data.

After changing the value at the memory address, we need to clear the processor's instruction cache so that the processor reads this part from memory again.

The way to do this is to call __builtin___clear_cache:

void __builtin___clear_cache (char *begin, char *end);

Note that the instruction cache can only be cleared in “pages”. __builtin___clear_cache is explained here.

Validation

Let’s change main.c to:

#include <inttypes.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <test.h>

#define PAGE_START(addr) ((addr) & PAGE_MASK)
#define PAGE_END(addr)   (PAGE_START(addr) + PAGE_SIZE)

void *my_malloc(size_t size)
{
    printf("%zu bytes memory are allocated by libtest.so\n", size);
    return malloc(size);
}

void hook()
{
    char       line[512];
    FILE      *fp;
    uintptr_t  base_addr = 0;
    uintptr_t  addr;

    //find base address of libtest.so
    if(NULL == (fp = fopen("/proc/self/maps", "r"))) return;
    while(fgets(line, sizeof(line), fp))
    {
        if(NULL != strstr(line, "libtest.so") &&
           sscanf(line, "%"PRIxPTR"-%*lx %*4s 00000000", &base_addr) == 1)
            break;
    }
    fclose(fp);
    if(0 == base_addr) return;

    //the absolute address
    addr = base_addr + 0x3f90;

    //add write permission
    mprotect((void *)PAGE_START(addr), PAGE_SIZE, PROT_READ | PROT_WRITE);

    //replace the function address
    *(void **)addr = my_malloc;

    //clear instruction cache
    __builtin___clear_cache((void *)PAGE_START(addr), (void *)PAGE_END(addr));
}

int main()
{
    hook();

    say_hello();
    return 0;
}

Recompile to run:

caikelun@debian:~$ adb push ./main /data/local/tmp
caikelun@debian:~$ adb shell "chmod +x /data/local/tmp/main"
caikelun@debian:~$ adb shell "export LD_LIBRARY_PATH=/data/local/tmp; /data/local/tmp/main"
1024 bytes memory are allocated by libtest.so
hello
caikelun@debian:~$

Yes, it worked!

We didn’t modify the libtest.so code, or even recompile it. We just changed the main program.

The source code for libtest.so and main is available on GitHub.

(Depending on the compiler you use, or its version, the address of the malloc slot in the generated libtest.so may not be 0x3f90. Check it with readelf first, then change it in main.c.)

Using xhook

Of course, we have open-sourced a tool library called xhook.

With xhook, you can hook libtest.so more gracefully without having to worry about compatibility issues with hard-coding 0x3f90.

#include <stdlib.h>
#include <stdio.h>
#include <test.h>
#include <xhook.h>

void *my_malloc(size_t size)
{
    printf("%zu bytes memory are allocated by libtest.so\n", size);
    return malloc(size);
}

int main()
{
    xhook_register(".*/libtest\\.so$", "malloc", my_malloc, NULL);
    xhook_refresh(0);
    
    say_hello();
    return 0;
}

xhook supports armeabi, armeabi-v7a and arm64-v8a.

It supports Android 4.0 and later versions (API level >= 14).

It has passed product-level stability and compatibility verification. You can get xhook here.

To summarize the process by which xhook implements PLT hooks:

  1. Read MAPS to get ELF’s start address.

  2. Verify ELF header information.

  3. Find segment of type PT_LOAD with offset 0 from PHT. Calculate ELF base addresses.

  4. Get the.dynamic section from the segment whose type is PT_DYNAMIC. Get the memory addresses of other sections from the.dynamic section.

  5. In the.dynstr section find the index value for the symbol that needs to hook.

  6. Iterate through all.relxxx sections to find the symbol index and symbol type matching items. For this relocation item, hook. Hook process is as follows:

(1) Read MAPS and confirm the memory access permission of the current hook address.

(2) If the access is not readable or writable, use mProtect to change the access to readable or writable.

(3) If the caller needs, the current value of the hook address is retained for return.

(4) Replace the value of hook address with the new value. (Execute hook)

(5) If you have changed the memory access permissions with mProtect, now restore the previous permissions.

(6) Clear the processor instruction cache of the memory page where hook address is located.
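
Sub-steps (2) to (6) can be sketched in C roughly as follows. This is only an illustration, not xHook code. The helper name replace_slot is hypothetical, and instead of parsing maps as in sub-step (1) it takes the previous page protection as the old_prot parameter. __builtin___clear_cache is a GCC/Clang builtin and is assumed to be available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: overwrite one pointer-sized slot (for example a .got
 * entry). old_prot is the protection the page had before the hook, which the
 * real flow would read from maps. Returns 0 on success, -1 on failure. */
int replace_slot(void **slot, void *new_value, void **old_value, int old_prot)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    /* Start address of the page that contains the slot. */
    uintptr_t page = (uintptr_t)slot & ~((uintptr_t)page_size - 1);
    int rw = PROT_READ | PROT_WRITE;
    int changed = 0;

    /* (2) Make the page readable and writable if it is not already. */
    if ((old_prot & rw) != rw) {
        if (mprotect((void *)page, (size_t)page_size, old_prot | rw) != 0)
            return -1;
        changed = 1;
    }

    /* (3) Keep the current value for the caller, if requested. */
    if (old_value != NULL)
        *old_value = *slot;

    /* (4) The hook itself: replace the stored address. */
    *slot = new_value;

    /* (5) Restore the previous permissions if we changed them. */
    if (changed && mprotect((void *)page, (size_t)page_size, old_prot) != 0)
        return -1;

    /* (6) Clear the instruction cache for this page. */
    __builtin___clear_cache((char *)page, (char *)page + page_size);
    return 0;
}
```

A caller would pass the address of the relocation target found in step 6, for example replace_slot(slot, (void *)my_malloc, &old, PROT_READ). A pointer-sized, pointer-aligned slot never crosses a page boundary, which is why changing a single page is enough in this sketch.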

FAQ

Can ELF information be read directly from files?

You can.

And for format parsing alone, reading the file is the most reliable way, because at run time many ELF sections do not need to stay in memory and can be discarded after loading, which saves a small amount of memory.

But in practice, the dynamic linkers and loaders of the various platforms do not do this, perhaps because they decided that the added complexity is not worth it.

So compared with reading the ELF information from memory, reading it from the file only adds a performance cost.

In addition, an app may not have permission to access the ELF files of some system libraries.
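
For comparison, here is a minimal sketch of the file-based approach: it reads only the ELF identification bytes from a file and checks the magic number. The function name elf_class_from_file is hypothetical and this is not how xHook works (xHook reads the ELF from memory).

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns ELFCLASS32 or ELFCLASS64 if the file starts
 * with a valid ELF identification, or -1 if it cannot be read or is not ELF. */
int elf_class_from_file(const char *path)
{
    assert(path != NULL);

    unsigned char ident[EI_NIDENT];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;                 /* e.g. no permission to read the file */

    size_t n = fread(ident, 1, sizeof(ident), fp);
    fclose(fp);
    if (n != sizeof(ident))
        return -1;

    /* The first four bytes must be 0x7f 'E' 'L' 'F'. */
    if (memcmp(ident, ELFMAG, SELFMAG) != 0)
        return -1;

    if (ident[EI_CLASS] != ELFCLASS32 && ident[EI_CLASS] != ELFCLASS64)
        return -1;
    return ident[EI_CLASS];
}
```

Calling elf_class_from_file("/proc/self/exe") checks the running executable. The early return after fopen is where the permission problem mentioned above would show up for protected system libraries.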

What is the exact method for calculating the location of a base?

As you may have noticed, the earlier description of how to get the base address of libtest.so said “most of the time”. That was to simplify the concepts and the code.

For hooking, the accurate base address calculation is as follows:

  1. In maps, find the line whose offset is 0 and whose pathname is the target ELF. Save the start address of this line as p0.

  2. In the PHT of the ELF, find the first segment whose type is PT_LOAD and whose offset is 0. Save the virtual memory relative address (p_vaddr) of this segment as p1.

  3. p0 - p1 is the current base address of the ELF.

For most ELFs, the p_vaddr of the first PT_LOAD segment is 0.
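
The three steps can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the xHook implementation: the names elf_location_t, locate_elf and check_self are hypothetical, the target is matched by pathname suffix, and the program headers are assumed to lie inside the mapping that starts at p0. check_self cross-checks the result on the running executable, using the fact that base + e_entry should equal the entry address reported by getauxval(AT_ENTRY).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <inttypes.h>
#include <link.h>      /* ElfW() */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>
#include <unistd.h>

typedef struct {
    uintptr_t start;   /* p0: start address of the maps line with offset 0 */
    uintptr_t base;    /* p0 - p1: base address of the ELF */
} elf_location_t;

/* Hypothetical helper: locate a mapped ELF whose pathname in maps ends with
 * path_suffix. Returns 0 on success, -1 otherwise. */
int locate_elf(const char *path_suffix, elf_location_t *out)
{
    assert(path_suffix != NULL && out != NULL);
    FILE *fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    char line[4096 + 128];
    size_t suffix_len = strlen(path_suffix);
    uintptr_t p0 = 0;
    int found = 0;

    /* Step 1: find the line with offset 0 whose pathname matches. */
    while (!found && fgets(line, sizeof(line), fp) != NULL) {
        uintptr_t start, end, offset;
        char perms[5];
        int pos = 0;
        /* Line format: start-end perms offset dev inode pathname */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s %" SCNxPTR " %*s %*s %n",
                   &start, &end, perms, &offset, &pos) < 4 || pos == 0)
            continue;
        if (offset != 0 || perms[0] != 'r')
            continue;
        char *path = line + pos;
        size_t len = strcspn(path, "\n");
        path[len] = '\0';
        if (len >= suffix_len && strcmp(path + len - suffix_len, path_suffix) == 0) {
            p0 = start;
            found = 1;
        }
    }
    fclose(fp);
    if (!found)
        return -1;

    /* Check the ELF magic in memory before trusting the header. */
    if (memcmp((const void *)p0, ELFMAG, SELFMAG) != 0)
        return -1;

    /* Step 2: first PT_LOAD segment with offset 0 gives p1 (p_vaddr). */
    const ElfW(Ehdr) *ehdr = (const ElfW(Ehdr) *)p0;
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)(p0 + ehdr->e_phoff);
    for (size_t i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0) {
            out->start = p0;
            out->base = p0 - phdr[i].p_vaddr;   /* Step 3: p0 - p1 */
            return 0;
        }
    }
    return -1;
}

/* Cross-check on the running executable. Returns 0 if the values agree. */
int check_self(void)
{
    char exe[4096];
    ssize_t n = readlink("/proc/self/exe", exe, sizeof(exe) - 1);
    if (n <= 0)
        return -1;
    exe[n] = '\0';

    elf_location_t loc;
    if (locate_elf(exe, &loc) != 0)
        return -1;
    const ElfW(Ehdr) *ehdr = (const ElfW(Ehdr) *)loc.start;
    return (loc.base + ehdr->e_entry == (uintptr_t)getauxval(AT_ENTRY)) ? 0 : -1;
}
```

For the example in this article, the call would look like locate_elf("/libtest.so", &loc). For a position-independent ELF, p1 is normally 0, so base equals start. For an executable linked at a fixed address, p1 is nonzero and base comes out as 0.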

In addition, we look for the line with offset 0 in maps because, before hooking, we want to check the ELF header in memory to make sure that we are operating on a valid ELF, and the ELF header can only appear in the mmap region with offset 0.

You can search for “load_bias” in the Android Linker source code to find many detailed comments, as well as refer to linker’s assignment logic for the load_bias_ variable.

How does the compilation option used by target ELF affect the hooks?

There will be some impact.

External function calls can be divided into three cases:

  1. Direct call. It can be hooked regardless of the compilation options. The external function address is always stored in .got.

  2. Call through a global function pointer. It can be hooked regardless of the compilation options. The external function address is always stored in .data.

  3. Call through a local function pointer. If the compilation option is -O2 (the default), the call is optimized into a direct call (same as case 1). If the compilation option is -O0, a PLT hook cannot affect an external function pointer that was assigned to a local variable before the hook was executed. Pointers assigned after the hook was executed can still be hooked.

In general, production-grade ELFs are rarely compiled with -O0, so there is no need to worry too much.

However, if you want your ELF to be as hard to hook as possible, try compiling with -O0, assign the external function pointer to a local pointer as early as possible, and then always call the external function through that local pointer.

In short, reading the C/C++ source code does not help with this question. You need to disassemble the ELFs generated with different compilation options and compare the output to see which cases cannot be hooked by PLT, and why.
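
To make the three cases concrete, here is a small C file that contains one function per case. The function and variable names are made up for illustration. What matters is the machine code the compiler generates, so this is meant to be compiled with different options (for example -O0 and -O2) and inspected with a disassembler. The source alone does not show which form a given compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Case 2: global function pointer. The address of malloc is stored in a data
 * slot that is filled in by a relocation when the ELF is loaded. */
void *(*g_alloc)(size_t) = malloc;

/* Case 1: direct call to the external function. */
void *alloc_direct(size_t n)
{
    return malloc(n);
}

/* Case 2: call through the global function pointer. */
void *alloc_global_ptr(size_t n)
{
    return g_alloc(n);
}

/* Case 3: call through a local function pointer. With optimization the
 * compiler may turn this into the same direct call as case 1; without
 * optimization the address is first copied into a stack variable. */
void *alloc_local_ptr(size_t n)
{
    void *(*fn)(size_t) = malloc;
    return fn(n);
}

/* Exercise all three paths once. */
void call_all(void)
{
    void *a = alloc_direct(16);
    void *b = alloc_global_ptr(16);
    void *c = alloc_local_ptr(16);
    assert(a != NULL && b != NULL && c != NULL);
    free(a);
    free(b);
    free(c);
}
```

In alloc_local_ptr the pointer is assigned on every call, so it always picks up the current value. The case that escapes a PLT hook is a long-lived local pointer (for example in a loop that never returns) that was assigned before the hook was executed.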

What causes the occasional segmentation fault during a hook? How should it be handled?

We sometimes run into problems like these:

  1. After reading /proc/self/maps, we find that a memory region is readable. But when we read the contents of that region to verify the ELF header, a segmentation fault occurs (sig: SIGSEGV, code: SEGV_ACCERR).

  2. mprotect() has returned success, and reading /proc/self/maps again confirms that the memory region is writable. But a segmentation fault occurs when the hook is executed (sig: SIGSEGV, code: SEGV_ACCERR).

  3. The ELF header is read and verified successfully, but a segmentation fault (sig: SIGSEGV, code: SEGV_ACCERR or SEGV_MAPERR) occurs when we go on to read the PHT or the .dynamic section using the relative addresses in the ELF header.

Possible reasons:

  1. The memory space of a process is shared by all of its threads. While we are executing a hook, other threads (or even the linker) may be calling dlclose(), or changing the access permissions of this memory region with mprotect().

  2. Android ROMs from different manufacturers, models and versions may have undocumented behaviors. For example, in some cases certain memory regions are write-protected or read-protected in ways that are not reflected in /proc/self/maps.

Problem analysis:

  1. A segmentation fault while reading memory is actually harmless.

  2. In the hook process, the only place where we write directly to a calculated memory address is the most critical line: the one that replaces the function pointer. As long as the logic elsewhere is correct, a failed write here does not damage any other memory region.

  3. When an app process is loaded on Android, the loader has already registered a signal handler for it. When the app crashes, this handler communicates with the system debuggerd daemon, debuggerd uses ptrace to debug the crashed process, collects the crash information it needs and records it in a tombstone file, and then the app kills itself.

  4. The system delivers the segmentation fault signal precisely to the thread in which the fault occurred.

  5. We want a discreet, controllable way to avoid app crashes caused by segmentation faults.

Let’s be clear:

Do not look at segmentation faults only from the perspective of application-layer development. A segmentation fault is not a disaster; it is just a normal way for the kernel to communicate with a user process.

When a user process accesses a virtual memory address that it has no permission for, or that is not mapped, the kernel sends SIGSEGV to notify the user process, and that is all.

As long as the location where the segmentation fault occurs is under our control, we can handle it in the user process.

Solution:

  1. When the hook logic enters the danger zone (reading or writing a directly calculated memory address), set a global flag. Reset the flag after leaving the danger zone.

  2. Register our own signal handler that catches only segmentation faults. In the signal handler, use the flag to decide whether the current thread is in the danger zone. If it is, use siglongjmp to jump out of the signal handler, straight to the “next line of code outside the danger zone” that we set up in advance. If it is not, restore the signal handler that the loader injected for us and return. The system then sends the segmentation fault signal to our thread again, and this time the restored handler runs the normal logic.

  3. We call this mechanism SFP (Segmentation Fault Protection).

  4. Note: SFP needs a switch so that we can turn it on and off at any time. SFP should always be off during app development and debugging, so that segmentation faults caused by coding errors are not hidden and get fixed. SFP should be on after release, so that the app does not crash. (Of course, you can also turn SFP off for a sample of users in order to observe and analyze crashes caused by the hook mechanism itself.)

For the concrete code, refer to the xHook implementation and search the source for siglongjmp and sigsetjmp.
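
The SFP idea can be sketched in a few lines of C. This is a simplified illustration, not the xHook source: the names sfp_install, sfp_read_word and sfp_demo are hypothetical, and the flag and jump buffer are plain globals, so the sketch assumes that only one thread is in the danger zone at a time.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static struct sigaction g_old_action;           /* handler installed before ours */
static volatile sig_atomic_t g_in_danger_zone;  /* 1 while inside the danger zone */
static sigjmp_buf g_jmp_env;                    /* where to continue after a fault */

static void sfp_handler(int sig)
{
    if (g_in_danger_zone)
        siglongjmp(g_jmp_env, 1);   /* leave the handler, resume at sigsetjmp */

    /* Not our fault: put the previous handler back and return. The faulting
     * instruction runs again and the signal goes to that handler. */
    sigaction(sig, &g_old_action, NULL);
}

/* Register the SIGSEGV handler and remember the previous one. */
int sfp_install(void)
{
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    sigemptyset(&act.sa_mask);
    act.sa_handler = sfp_handler;
    return sigaction(SIGSEGV, &act, &g_old_action);
}

/* Try to read one word from addr. Returns 0 on success, -1 if the read
 * caused a segmentation fault. */
int sfp_read_word(const uintptr_t *addr, uintptr_t *out)
{
    assert(out != NULL);
    int rc;

    g_in_danger_zone = 1;
    if (sigsetjmp(g_jmp_env, 1) == 0) {
        *out = *(const volatile uintptr_t *)addr;   /* may raise SIGSEGV */
        rc = 0;
    } else {
        rc = -1;                                    /* reached via siglongjmp */
    }
    g_in_danger_zone = 0;
    return rc;
}

/* Demo: a PROT_NONE page cannot be read; the fault is reported as -1. */
int sfp_demo(void)
{
    uintptr_t value = 0;
    void *page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 1;
    return sfp_read_word((const uintptr_t *)page, &value);
}
```

After sfp_install(), sfp_demo() returns -1 instead of crashing the process. Passing 1 as the second argument of sigsetjmp saves the signal mask, so SIGSEGV is unblocked again after the jump and later faults can still be caught.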

Can calls between functions inside an ELF be hooked?

The hook method introduced here is the PLT hook, which cannot hook calls between functions inside an ELF.

Doug’s note:

The external function is recorded in the .plt section, so you can step through that section to find its relocation address and modify it.

For internal functions, such as one declared with the static keyword, the compiler may hard-code the address of the function directly at the call site at compile time.

That is why, if a function is used only inside one file, it is best to declare it static.

One reason is safety: it prevents name clashes with symbols in other files. Another reason is faster startup, because no relocation is needed.

Inline hooks can do this. You first need to know the symbol name or address of the internal function that you want to hook.

There are many open-source and closed-source inline hook implementations, such as:

Substrate: www.cydiasubstrate.com/

Frida: www.frida.re/

The inline hook approach is powerful, but it can cause the following problems:

  1. Because it has to parse and modify the machine instructions (assembly) in the ELF directly, there may be compatibility and stability problems across processor architectures, instruction sets, compiler optimization options and operating system versions.

  2. When problems occur, they can be hard to analyze and locate, and some well-known inline hook solutions are closed source.

  3. It is relatively complex and difficult to implement.

  4. There are relatively many unknown pitfalls; you can search Google for these.

If PLT hooks are sufficient, it is recommended not to try inline hooks.





Article comes from: my.oschina.net/nomagic/blo…

I sent the author a private message asking for permission to reprint this article, but have not received a reply. The article is so well written that I would like to share it with you.

If this infringes any rights, please send me a private message and I will delete it. Thank you!

Copyright (C) 2018, IQiyi, Inc. All Rights Reserved.

This article is licensed under a Creative Commons license.
