Understanding inline API hooking

I managed to make an API hooking DLL that takes advantage of the hot-patching feature.
Now I want to do inline API hooking. I looked a bit here, a bit there, did some Google searches and thought I could try it. Well...

I managed to replace the first instructions of the victim function (I test my code by Hooking CreateThread from Kernel32.dll), but after the first call the victim process crashes.

My idea:
- get CreateThread address
- overwrite the first instructions with a jmp to my function (this takes 5 bytes)
- save somewhere those 5 bytes before overwriting
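
The overwrite step above can be sketched on a plain byte buffer, without touching a live process. This is only an illustration under stated assumptions: 32-bit x86, a near `jmp rel32` (opcode 0xE9), and a hypothetical helper name `write_jmp_rel32` that does not come from the thread. The displacement is measured from the end of the 5-byte instruction, which is where the `- 5` in the offset calculation comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_SIZE 5u  /* 1 opcode byte (0xE9) + 4 displacement bytes */

/* Hypothetical helper: encode "jmp rel32" into dst[0..4].
   from_addr is the address the patch will live at, to_addr is the jump target.
   Addresses are 32-bit values, matching the 32-bit code in this thread. */
void write_jmp_rel32(uint8_t *dst, uint32_t from_addr, uint32_t to_addr)
{
    uint32_t rel;
    unsigned i;

    assert(dst != NULL);

    /* The CPU adds rel32 to the address of the NEXT instruction. */
    rel = to_addr - (from_addr + JMP_SIZE);

    dst[0] = 0xE9;
    /* x86 stores the displacement least significant byte first. */
    for (i = 0; i < 4; i++)
        dst[1 + i] = (uint8_t)(rel >> (8 * i));
}
```

For example, a jump placed at 0x1000 that targets 0x2000 gets the displacement 0x2000 - 0x1005 = 0x0FFB, so the encoded bytes are E9 FB 0F 00 00. Unsigned wrap-around makes backward jumps work with the same formula.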

In my replacement function (LogCreateThread) I wanted to do the following:
- log the call in a file
- unhook (restore those first 5 bytes)
- give control back to the original function
- re-hook in order to catch future calls

The reason I wanted to do this and not create a trampoline function is because I want to have as little hardcoded things as possible and to avoid disassembling every function I want to hook.

This works, but with a little big problem: after the first call to CreateThread, the victim process crashes regardless of what I do in the function that replaces it.

My little big problem is that I don't think I know / fully understand how to do inline API hooking.

Here's my attempt at shaping the above idea into working (heh) code (most error checks are stripped to make it easier to read):

LPBYTE original = 0;
DWORD oldFuncAdr = 0;

// I want to log the time of the call and the parameters
void
LogCreateThread(
		_In_opt_   LPSECURITY_ATTRIBUTES lpThreadAttributes,
                _In_       SIZE_T dwStackSize,
                _In_       LPTHREAD_START_ROUTINE lpStartAddress,
                _In_opt_   LPVOID lpParameter,
                _In_       DWORD dwCreationFlags,
                _Out_opt_  LPDWORD lpThreadId
		);

void
UnHook();

void
Hook()
{
    HMODULE hModule = NULL;
    DWORD newFuncAdr = (DWORD)LogCreateThread;
    DWORD continueAdr = 0;
    DWORD jmpAdr = 0;
    LPBYTE pOldFuncAdr = 0;
    LPBYTE pJmpAdr = 0;
    DWORD oldProtect;
    DWORD i = 0;

    hModule = LoadLibrary(TEXT("kernel32.dll"));

    oldFuncAdr = (DWORD)GetProcAddress(hModule,
        "CreateThread");

    // offset for the relative jump
    jmpAdr = newFuncAdr - oldFuncAdr - 5;

    pOldFuncAdr = (LPBYTE)oldFuncAdr;

    pJmpAdr = (LPBYTE)(&jmpAdr);

    if( !VirtualProtect(pOldFuncAdr,
            5,
            PAGE_EXECUTE_READWRITE,
            &oldProtect) )
    {
        return;
    }

    // save the original first 5 bytes
    original = pOldFuncAdr;
    for(i = 0; i < 5; i++)
    {
        original[i] = pOldFuncAdr[i];
    }

    // overwrite first byte with jmp opcode
    pOldFuncAdr[0] = (BYTE)0xE9;
    for(i = 0; i < 5; i++)
    {
        pOldFuncAdr[i + 1] = pJmpAdr[i]; // overwrite next 4 bytes with the jump offset
    }

    
    if( !VirtualProtect(pOldFuncAdr,
        5,
        oldProtect,
        &oldProtect) )
    {
        return;
    }
}

void
UnHook()
{
    LPBYTE pOldFuncAdr = 0;
    DWORD i = 0;
    DWORD oldProtect;

    pOldFuncAdr = (LPBYTE)oldFuncAdr;
    VirtualProtect(pOldFuncAdr,
        5,
        PAGE_EXECUTE_READWRITE,
        &oldProtect);

    // put back the first 5 bytes
    for(i = 0; i < 5; i++)
    {
        pOldFuncAdr[i] = original[i];
    }
    VirtualProtect(pOldFuncAdr,
        5,
        oldProtect,
        &oldProtect);
}

void
LogCreateThread(
		_In_opt_   LPSECURITY_ATTRIBUTES lpThreadAttributes,
                _In_       SIZE_T dwStackSize,
                _In_       LPTHREAD_START_ROUTINE lpStartAddress,
                _In_opt_   LPVOID lpParameter,
                _In_       DWORD dwCreationFlags,
                _Out_opt_  LPDWORD lpThreadId
		)
{
    UnHook();
    // if I want, I can log the first call, but after this it crashes
    // it crashes even if I remove every instruction from this function
    // here I should call the original CreateThread, but I can't figure out how
    Hook(); // after all is done, I should re-hook it
}

DWORD
APIENTRY
DllMain(
        HANDLE hModule,
        DWORD reason,
        LPVOID reserved
        )
{
    if(DLL_PROCESS_ATTACH == reason)
    {
        Hook();
    }
    return TRUE;
}
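
A few things in the listing above look suspicious and may explain the crash. `original = pOldFuncAdr` makes `original` point at the very bytes that are about to be overwritten, so nothing is actually saved. The copy loop writes 5 displacement bytes although the displacement is only 4 bytes long, so a sixth byte of CreateThread is clobbered. And LogCreateThread is declared `void` without WINAPI, whereas CreateThread is a `__stdcall` function that returns a HANDLE. The following is a minimal sketch of the save / patch / restore cycle only, run against an ordinary buffer rather than a live function; `hook_t`, `hook_install` and `hook_remove` are hypothetical names, and the VirtualProtect calls are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_SIZE 5u  /* 0xE9 + 4-byte displacement */

/* Hypothetical bookkeeping for one hook. */
typedef struct {
    uint8_t *target;             /* first byte of the patched function */
    uint8_t  saved[PATCH_SIZE];  /* separate copy of the original bytes */
    int      installed;
} hook_t;

/* Save the original bytes, then write E9 followed by exactly 4 bytes. */
void hook_install(hook_t *h, uint8_t *target, uint32_t rel32)
{
    unsigned i;

    assert(h != NULL && target != NULL);

    h->target = target;
    memcpy(h->saved, target, PATCH_SIZE);  /* copy, do not alias */

    target[0] = 0xE9;
    for (i = 0; i < 4; i++)                /* 4 bytes, not 5 */
        target[1 + i] = (uint8_t)(rel32 >> (8 * i));

    h->installed = 1;
}

/* Put the saved bytes back. */
void hook_remove(hook_t *h)
{
    assert(h != NULL && h->installed);
    memcpy(h->target, h->saved, PATCH_SIZE);
    h->installed = 0;
}
```

Because `saved` is its own array, `hook_remove` restores the real prologue even after the patch has been written, and the byte that follows the 5-byte patch is never touched.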


Should I go the extra mile and make a trampoline function? Should LogCreateThread be a naked function in order for the intercepting of the parameters to work?
I'm a bit lost. Some guidelines / feedback / pointing out of the really wrong things / simple examples will help me a lot.

EDIT: after some more reading and some more trial & error coding I think that if I want to hook something that's not a Windows API I should load my target in IDA (or some other disassembler), see how many instructions my overwriting destroys and work around that.
As for Windows APIs, I should just use the hot-patching feature and keep my life simple.
closed account (13bSLyTq)
Hi,

If you wish to learn the essence of inline hooking I would strongly recommend you look at my blog: http://codeempire.blogspot.co.uk/

It has more than enough inline hooking examples. You may find for yourself that the code there is more effective than the code shown above in terms of code size, portability and dependencies.

Why are you unhooking and then re-hooking? There is no point: all you are doing is making the computer do extra work. Don't ever do that. Not only do you execute more instructions than you need to, you also slow down CreateThread() itself. That may be unnoticeable now, but if you inject this code into a high-performance application you may slow it down noticeably, so it is not advised. Next, it is always a good idea to declare the function as:

 
static __declspec(naked)


This is because __declspec(naked) makes the function a simple stub in memory rather than a function with compiler-generated prologue and epilogue instructions. With code injection, this cuts down on the "waste" bytes being copied over: across 100 processes that can add up to a few KB of wasted data. It may not sound like much, but it is always good to cut down as much as possible.

Do be aware that a __declspec(naked) function has to be written in assembly rather than C\C++, because the compiler takes the exact code and copies it into memory; ordinary C\C++ code isn't allowed, at least in VS. (Note that Visual C++ supports naked functions and inline assembly only in x86 builds, not x64.) With that in mind, I strongly recommend being comfortable with assembly, as a callback can sometimes require a lot of assembly work and a decent amount of assembly development knowledge.

Nevertheless, you MUST always create a trampoline function. When you redirect the function to a user callback, the trampoline gives you complete control over it, which is the whole point of hooking, so a trampoline is one of the most necessary parts of any hook. My blog has good examples of how hooking and even unhooking can be done much more easily:

http://codeempire.blogspot.co.uk/2013/11/hooking-x64-system-call-stub.html - Good Example!
http://codeempire.blogspot.co.uk/2013/10/hooking-x86-system-call-stub.html - Good Example!

http://codeempire.blogspot.co.uk/2014/01/global-remote-detour-kifastsystemcall.html - Expert Level!

For those interested in learning the art of unhooking, read my blog here:

http://codeempire.blogspot.co.uk/2013/10/how-to-unhook-ntopenprocess.html - Expert Level!

Anyway, to learn more about hooking and security you should try to learn code injection, as hooking relies on code injection a lot, especially global-hook-based loading of a DLL into process address spaces:

http://codeempire.blogspot.co.uk/2013/10/code-injection-into-mozilla-firefox.html - Good for novices!

http://codeempire.blogspot.co.uk/2013/11/security-process-injection-hooking.html - Good for those who understand hooking and know basic Code-Injection!

With that in mind, we can also block DLL injections. To learn how: http://codeempire.blogspot.co.uk/2013/10/security-blocking-dll-injections.html - Easy!

There are other valuable blogs and articles around the Internet about hooking\detouring in general:

http://www.codeproject.com/Articles/2082/API-hooking-revealed
http://www.codeproject.com/Articles/11985/Hooking-the-native-API-and-controlling-process-cre
http://www.codeproject.com/Articles/20084/A-More-Complete-DLL-Injection-Solution-Using-Creat

http://www.codeproject.com/Articles/4610/Three-Ways-to-Inject-Your-Code-into-Another-Proces - VERY GOOD!


http://jbremer.org/x86-api-hooking-demystified/

I have supplied a lot of information pertaining to hooking and I can even give you code samples if you PM me. I hope this helped!
I always did it like you say when it comes to APIs (with much cleaner code).

    // replicate first instructions (mov edi, edi omitted)
    __asm
    {
        push ebp
        mov ebp, esp
    };

    // do my thing here
    
    pJmpAdr = pOldFuncAdr + 5;

   // jump back to the hooked function
    __asm
    {
        jmp pJmpAdr
    };


Thanks for the links. Should be enough.
closed account (13bSLyTq)
Hi,

You made a lot of errors in the callback you presented, for example:

pJmpAdr = pOldFuncAdr + 5;


This is bad because these variables will not be available once the code has been moved into another process by any code-injection method, whether it is APC, WriteProcessMemory(), or anything else that copies only the raw bytes of the callback function. Once the code has been copied, the variables are lost: they do not exist at that place in the target process's virtual memory, so all you are doing is jumping into protected memory. That leads to an access violation and the process crashes as a result. If you compile with optimizations you will see that the access violation is caused by this:

__asm
    {
        jmp pJmpAdr
    };


With that in mind, I recommend manually copying the address to the end of the function. I suggest that this:


__asm
    {
        jmp pJmpAdr
    };


be changed to:

__asm
{
nop  //5 Bytes
nop
nop
nop
nop
}


Then simply unprotect the area (Callback) locally before injection using this Semi-pseudo-code [1*]:

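// Using Hungarian notation.
// Assumes Stub is a function placed directly after Callback, so the difference is Callback's size.
DWORD dwOldProtection = 0;
SIZE_T dwMemory = (SIZE_T)((PBYTE)Stub - (PBYTE)Callback); // Calculate size of Callback
VirtualProtect(Callback, dwMemory, PAGE_EXECUTE_READWRITE, &dwOldProtection); // Unprotect the area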


Then simply overwrite the last 5 bytes of the callback with jmp pJmpAdr, where pJmpAdr is now replaced by the actual memory address. After that, make sure to restore the callback's previous protection using VirtualProtect:
VirtualProtect(Callback, dwMemory, dwOldProtection, &dwOldProtection);
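
The "edit the last 5 bytes" step can be sketched as follows. This is an illustration under assumptions rather than the poster's actual code: `patch_tail_jump` is a hypothetical helper, the callback is treated as a plain byte buffer, and the jump is encoded as a relative `jmp rel32`, so the address the stub will run at has to be known when the displacement is computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_SIZE 5u    /* 0xE9 + 4-byte displacement */
#define NOP_BYTE 0x90

/* Hypothetical helper. stub points at a writable copy of the callback's bytes,
   stub_addr is the address those bytes will execute at, and resume_addr is
   where execution should continue (for example hooked function + 5).
   Returns 0 on success, -1 if the 5-byte nop placeholder is not found. */
int patch_tail_jump(uint8_t *stub, size_t stub_size,
                    uint32_t stub_addr, uint32_t resume_addr)
{
    uint8_t *tail;
    uint32_t rel;
    size_t i;

    assert(stub != NULL);
    if (stub_size < JMP_SIZE)
        return -1;

    tail = stub + (stub_size - JMP_SIZE);
    for (i = 0; i < JMP_SIZE; i++)
        if (tail[i] != NOP_BYTE)
            return -1;              /* refuse to overwrite real code */

    /* The displacement is relative to the byte after the jmp,
       which is the end of the stub. */
    rel = resume_addr - (stub_addr + (uint32_t)stub_size);

    tail[0] = 0xE9;
    for (i = 0; i < 4; i++)
        tail[1 + i] = (uint8_t)(rel >> (8 * i));
    return 0;
}
```

Checking for the nop placeholder first is a design choice of this sketch: it makes the helper fail instead of silently overwriting instructions if the callback was compiled differently than expected.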

Then the code should/would look something like this before and after:

Before you edit the callback
// replicate first instructions (mov edi, edi omitted)
    __asm
    {
        push ebp
        mov ebp, esp
        nop
        nop
        nop
        nop
        nop 
    };


then after you use the Semi-pseudo-code plus the editing of the last 5 bytes of the callback, the callback should look like so:

// replicate first instructions (mov edi, edi omitted)
    __asm
    {
        push ebp
        mov ebp, esp
        jmp [memory address of (pOldFuncAdr + 5).]
    };


Effect of this change:

With that in mind, it should completely avoid the code that can crash and make the code MORE dynamic. It also lets the code be carried over to future projects with the same need without any immediate attention or fixing, which makes the code more reusable.

Reusability is an important thing to consider when software engineers develop programs. The result is more dynamic and reusable, saves a little time at run time (only around 5 nanoseconds at most, but still...), and improves code quality.

Hope this helped,
OrionMaster
But what if before each jump back to the original function I get the original function's address with GetProcAddress, add 5 to it and then jump to that address?
closed account (13bSLyTq)
Hi,

That method will not work either, especially if you are using Visual Studio or one of the many other well-known IDEs (Integrated Development Environments). They use call-gates for their applications to work, and a call-gate in VS simply looks like this:

 
GetProcAddress(...,...) // You Call this function 


then the EIP is moved through the call-gate:

GetProcAddress_Callgate:
jmp [Real-Address of GetProcAddress]


After the code is ported, these call-gates do not remain the same; they are different, due to ASLR protection on Windows.

Nevertheless, there is one more reason on why it will not work:

When you call a function, the PC does not just jump to the address: it first has to move the address of the function into a few registers, and only then does it jump. After porting, the integrity of the code is highly vulnerable. Compilers normally plan the use of registers beforehand, so the transfer of control between functions is deterministic within one virtual memory space. With code injection this determinism is lost and the entire plan for the function is destroyed. So when you call, say, GetProcAddress with 2 parameters, you may instead jump to a random function, which may crash the process through run-time memory corruption or a run-time execution issue.

There is very little chance that your exact code is executed correctly in a deterministic fashion; the chance is comparable to drawing 10000 lottery tickets in a row and every one of them going to the same person. In other words, it is very nearly IMPOSSIBLE.

That's why you must supply the function address manually, then fill in the parameters manually using inline assembly and registers (the ESP register).

I suggest you try executing a function like GetProcAddress as you said and see what happens. I bet it will crash as it normally does, but it may not, as every PC is different.
Yep, it crashes.

But I'm not sure I understood your example.

   __asm
    {
        push ebp
        mov ebp, esp
        jmp [memory address of (pOldFuncAdr + 5).]
    };


Let's say I want to edit some parameters before giving control back to the old function, so naturally, I'll add that code before the jmp ([ebp + 4] is the return address, [ebp + 8] is the first parameter and so on), then I calculate the jmp address. But I can't figure out how I'm supposed to do that from your example.

When I'm hooking the function, I calculate the relative jmp offset like this:
jmpOff = newFuncAdr - oldFuncAdr - 5; which calculates the distance for the jump.

But while I'm in my function, how can I know where to jump back into the old function? Being an API function, it's at its entry point + 5, but how do I get that?
That in mind, I would recommend you to manually copy the address to the end of function

I don't know what function are you talking about here.
@ OrionMaster: Welcome back! When you stopped posting on your blog and MediaFire stopped hosting your older stuff I thought something had happened. Good to see that you're OK.

That's why you must give the function address manually then fill out parameters manually using inline assembly and using registers (ESP register).

What's this about? Reading this thread I got the impression that you and OP were talking about targeting exported functions. In that case this sentence doesn't apply here since the information you need for the remote process is available at run-time making ASLR a non-issue.
closed account (13bSLyTq)
Hi,

I stopped posting on my blog because I recently had a lot of work and, to be honest, I had a feeling my blog was getting the wrong kind of attention from hackers and malware developers. A few messages came in asking for contact details, which I provided, only to find out that they wanted me to develop malware. I declined, and decided not to post much security material for a while, to stop attracting the malware\hacking community and to avoid the kind of attention that gets law-enforcement agencies involved in my life and probably ends up ruining it. At least I have got good attention as well, so thanks. I will still continue to post security material, only I'll attempt to make it less attractive to hackers :D.

Anyway, no: OP and I were talking about functions in general, not just exported functions. As I showed, calling any function in a foreign process will crash it. If you re-read my post you will see that VS and a lot of other compilers use call-gates, which differ in the virtual memory of each process, and that is what makes it an issue. But you are right, it is NOT ASLR but the basic randomization of call-gates: if the call-gates were static we could call any function and expect it not to crash the process. If you test this theory you will see that the call-gate addresses do change in each process's virtual memory.

With that in mind, you are right that it is NOT ASLR; the basic randomization of call-gates just creates the impression that ASLR acts per process. Besides, even if ASLR did act per process, we could still get the addresses of the functions by walking the EAT. I only wanted to get a point across, but yes, you are correct.



@zdzero

Let me explain again: you know how you added a simple 0xE9 (jmp) to the prologue of the function? Similarly, you can add a jmp back to the old function.
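
A common way to package that "jmp back" is a trampoline: the overwritten prologue bytes followed by a jump to the first untouched instruction of the original. The sketch below is not taken from the linked article; `build_trampoline` is a hypothetical helper working on plain buffers, and it assumes the stolen bytes are whole instructions that do not depend on their own address (true for the 5-byte `mov edi, edi` / `push ebp` / `mov ebp, esp` prologue discussed in this thread).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_SIZE 5u  /* 0xE9 + 4-byte displacement */

/* Hypothetical helper: fill tramp with the stolen prologue followed by
   "jmp orig_addr + stolen_len". tramp must have room for
   stolen_len + JMP_SIZE bytes; tramp_addr is the address it will run at.
   Returns the number of bytes written. */
size_t build_trampoline(uint8_t *tramp, uint32_t tramp_addr,
                        const uint8_t *stolen, size_t stolen_len,
                        uint32_t orig_addr)
{
    uint32_t jmp_at, rel;
    size_t i;

    assert(tramp != NULL && stolen != NULL);
    assert(stolen_len >= JMP_SIZE);  /* the hook overwrote at least 5 bytes */

    /* 1. Replay the instructions the hook destroyed. */
    memcpy(tramp, stolen, stolen_len);

    /* 2. Jump to the first instruction the hook left intact. */
    jmp_at = tramp_addr + (uint32_t)stolen_len;
    rel = (orig_addr + (uint32_t)stolen_len) - (jmp_at + JMP_SIZE);

    tramp[stolen_len] = 0xE9;
    for (i = 0; i < 4; i++)
        tramp[stolen_len + 1 + i] = (uint8_t)(rel >> (8 * i));

    return stolen_len + JMP_SIZE;
}
```

With a trampoline in place the hook never has to be removed and re-installed: the callback calls or jumps to the trampoline whenever it wants the original behaviour.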

I suggest you read my Article: http://codeempire.blogspot.co.uk/2013/11/security-process-injection-hooking.html

This shows what I mean and is extremely basic and you could easily grasp this, I believe.
Now I get it :) Clever.

EDIT: is there a chance of finding some information about intercepting out parameters on your blog?

Being out parameters, they will store valid data only after the hooked function finishes its business. Just storing their address and accessing it later seems wrong.
closed account (13bSLyTq)
Hi,

It's not difficult at all; in fact it's a really simple procedure, exactly how you would handle an out parameter in a normal function. However, I'm not sure what you mean by out parameters. If you mean _Out_ parameters, handling them is exactly the same as handling normal routine parameters; there is nothing special about it. Even if you want to handle them in pure assembly, it's really simple:
; PARAMETER.ASM
; Compute and check for parameters
;
       PUBLIC _parameter2
_TEXT SEGMENT WORD PUBLIC 'CODE'
_parameter2 PROC            ; Imagine this being your callback, similar to __declspec(naked)

        push ebp            ; Save EBP
        mov ebp, esp        ; Move ESP into EBP so we can refer
                            ;   to arguments on the stack
        mov eax, [ebp+8]    ; Get first argument ([ebp+4] is the return address)
        mov edx, 10         ; Move our comparison value into EDX

        cmp edx, eax
        je ret_1

else_failed:
        mov eax, 1          ; Failed
        pop ebp             ; Restore EBP
        ret                 ; Return with 1 in EAX, equivalent to return 1;

ret_1:                      ; Successful
        nop                 ; These 5 bytes will later be changed into a JMP to the old function
        nop
        nop
        nop
        nop

_parameter2 ENDP
_TEXT   ENDS
        END



As you can see, it's not too difficult for any worthy assembly programmer. Using only low-level assembly we can compare any number of parameters in a given function, just by stepping the offset by 2 or 4 depending on the size of the parameters.

If you are using C++ then it's even easier: move the parameters, via the esp register, into an integer variable (or a variable of whatever type is needed, which may require a type cast, a very basic skill), then simply compare that variable rather than using hard-coded assembly like in my example.

Kind Regards,
OrionMaster
I should have given an example.
Let's say I hook a function like CreateProcess, which receives as its last parameter a pointer to a PROCESS_INFORMATION structure that receives identification information about the new process.
If I want to log somewhere all the process IDs of the processes created by a monitored process, I need that PROCESS_INFORMATION structure, but it will contain the information I'm looking for only after the new process is created. Also, I should not alter that structure in any way (so I cannot replace it with one of my own) because the initial caller may need information from that structure.
How should one approach this situation?
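
One common way to approach this, sketched here with stand-in types rather than the real Win32 ones, is to have the hook call the original through a saved pointer (for example a trampoline) and read the out structure only after that call has returned. Everything named below (`proc_info_t`, `hooked_create`, `fake_create`, the log array) is hypothetical and only illustrates the call order; the caller's structure is read, never replaced.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PROCESS_INFORMATION: only the field we want to log. */
typedef struct { unsigned long process_id; } proc_info_t;

/* Heavily simplified signature shared by the original and the hook. */
typedef int (*create_fn)(const char *cmd, proc_info_t *out_info);

create_fn     g_original;         /* would point at the trampoline */
unsigned long g_logged_pids[16];  /* stand-in for the log file */
size_t        g_logged_count;

/* The hook: let the original run first, then look at the out parameter. */
int hooked_create(const char *cmd, proc_info_t *out_info)
{
    int ok;

    assert(g_original != NULL);

    ok = g_original(cmd, out_info);  /* the original fills *out_info */

    /* Only now does the structure hold valid data. */
    if (ok && out_info != NULL && g_logged_count < 16)
        g_logged_pids[g_logged_count++] = out_info->process_id;

    return ok;                       /* pass the result through unchanged */
}

/* Fake "original" used only to demonstrate the flow. */
int fake_create(const char *cmd, proc_info_t *out_info)
{
    (void)cmd;
    if (out_info == NULL)
        return 0;
    out_info->process_id = 4242;
    return 1;
}
```

This requires the hook to call the original and get control back, rather than ending with a plain `jmp` into it, which is one more reason to keep a trampoline around.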

Thanks boss. @OrionMaster, I tried to hook send(); my source compiles, following the hook function I showed you last time.

From what I did: I bootstrapped it into a DLL, which compiles without issues, and tried injecting it into svchost, but it didn't inject. So I was wondering where I could inject to hook send() and recv(). I ran Process Explorer and saw some places I could inject into.

Like you said, the hook function I provided was OK? Please confirm, and advise where I can inject to check for packets (send(), recv()).
closed account (13bSLyTq)
Hi,

You could store this data using IPC, which will be a bit of work but is not too difficult if you try hard enough. Nevertheless, it has to be added completely MANUALLY, like I told you last time.

This applies ONLY if you are planning on doing this sort of injection. If you want to make it easier and avoid manually copying data into your hook callback, I would recommend learning PE injection. However, I think that is too far to attempt right now, and I strongly urge you to stick to this method of injection: PE injection will probably overcomplicate things, so master this before moving on.

@m0mathur

You can inject this into pretty much any process if you have the correct permissions, usually Administrator. A lot of processes secure themselves against injection from non-administrator users via ACLs and DACLs, which requires you to elevate your permissions via UAC. To learn how to elevate at run time, I suggest you read this: http://www.codeproject.com/Articles/320748/Haephrati-Elevating-during-runtime

I will still encourage you to NOT INJECT into these very important processes:

1. Winlogon.exe - Login Process! VERY IMPORTANT! DO NOT INJECT! WILL BSOD IF YOU CRASH IT
2. csrss.exe - Shutdown Process! VERY IMPORTANT! DO NOT INJECT! WILL BSOD IF YOU CRASH IT

Apart from this, don't try to inject into processes you cannot inject into, such as:

1. Anti-Virus solution - This will never work!

2. Google Chrome - I just about managed to inject into it! I used a different, secret method. It is still possible if you plan on starting it up yourself and then injecting into it using process pointers.

_________________________________________

Now let me explain why svchost.exe failed. If you have an anti-virus solution running, it tends to protect svchost.exe from injection, as malware often hijacks svchost.exe, zombifies the process and uses it to communicate with its C&C.

I'd try disabling the AV and then injecting into svchost.exe; by my reasoning that should work. Otherwise I think the problem is your injector, or (GOOD IDEA):

A better idea is to create a sample network client and server using Winsock, where the client uses send() to send messages across, and check that they arrive at the server. Then run the hook and, in the callback, change the message slightly or return 0, so that different results show up. That way you can check your hook without any tools.

AND @M0MATHUR, YOU MUST COMMENT IN YOUR OWN THREAD: http://www.cplusplus.com/forum/windows/139704/

OTHERWISE IT IS GOING TO CONFUSE AND CHANGE THE PURPOSE OF THIS THREAD.
Any place where I should start reading about how to use IPC to do this? It gets a bit more complex than I would have expected.

As for m0mathur's problem, I managed to hook and log Internet Explorer's send() activity on Windows 7 x64 (but IE is 32 bit). For recv() I think I'll stop in the same place I stopped for CreateProcess().
closed account (13bSLyTq)
Hi,

You can learn about IPC using CodeProject: http://www.codeproject.com/Articles/34073/Inter-Process-Communication-IPC-Introduction-and-S

I think you should create a DLL instead, then inject and hook, then set up the IPC; that'd be easier.

Next, IE is a bad idea, as it sends out routine pings to collect data, which makes it a bad test process for someone inexperienced with injection and hooking.