IDA -> Inline Assembly [Possible]??
TheNoobie (28)
I am trying to write something in Visual Studio using inline assembly, but I'm not quite sure how to convert from the IDA View disassembly. At first I thought that if the IDA View looked like this (pulled from Google)..
Then the Visual Studio conversion would be...
Basically I was thinking that whatever registers IDA shows, I must set, and in the same order. However, after looking at some examples online I realized I'm pretty lost. So could you guys either give me an IDA2InlineASM script that converts an IDA function into inline assembly that can be put in C++, or post a basic example of a small function in IDA View converted to Visual Studio 2008, with a small explanation of how you converted it? For example: do you call the registers backwards, which ones do I need to call, why is some of it left out? Any small bit of info like that helps.
Zaita (1576)
You have not set eax, nor do the values in edx and ecx correctly match what you want to do by calling memset.
TheNoobie (28)
I think you're missing my point; like I said, I pulled that code from Google. I can PM you my code whenever I get back home if you want, that was just an example. Basically I'm asking how I convert a function I'm reading in IDA View to inline assembly in Visual Studio C++.
Duoas (1596)
I think you are missing Zaita's point. Just because you found something somewhere on the internet doesn't make it smart. And when messing with disassemblers you need to be very careful about register values --compilers will often move things around because they know something about what values a register holds. Simply cut-n-paste'ing code is dangerous.
https://www.openrce.org/repositories/users/dennis/launch_image_in_memory.html
Notice how it takes care to provide an address in EAX?
Also, c_memset is non-standard, and the only references I can find for it are for IDA and Haskell. The standard cdecl calling convention expects the caller to push the arguments and pop them again afterwards:
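The listing that originally followed is not preserved here. As a stand-in, here is a minimal sketch of the cdecl pattern in VC++ inline assembly. The names sum3 and call_sum3 are made up for illustration, and the sketch assumes a 32-bit x86 build with the default cdecl convention. On any other compiler or target the plain C++ fallback runs instead.

```cpp
// Hypothetical helper; with default MSVC settings it uses cdecl.
int sum3(int a, int b, int c)
{
    return a + b + c;
}

// Calls sum3 the way a cdecl caller does: push right to left,
// call, then remove the arguments from the stack.
int call_sum3(int a, int b, int c)
{
    int result = 0;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        push c            // last argument is pushed first
        push b
        push a            // first argument ends up on top of the stack
        call sum3
        add  esp, 12      // cdecl: the caller pops 3 arguments * 4 bytes
        mov  result, eax  // integer return values come back in EAX
    }
#else
    result = sum3(a, b, c);   // portable fallback, same behaviour
#endif
    return result;
}
```

The add esp, 12 is the "pop" part: nothing is actually popped into a register, the stack pointer is just moved back past the three arguments.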
You can learn everything you want to know about inline assembly from Microsoft
http://msdn.microsoft.com/en-us/library/4ks26t93(VS.71).aspx
Another useful link (has an example of a pure assembly function)
http://www.cplusplus.com/forum/beginner/3280/
Also (just to be sure), you have to know at least basic Intel/MASM assembly if you plan to use it. Good luck!
[edit] I've never used VC++ __asm before. Line 14 might be wrong. Perhaps it is just mov eax, s ?
jmc (64)
In VC++ you have to write what kind of pointer it is: byte, word or dword. Xor eax, eax may still sometimes be faster than mov eax, 0 afaik. And I think if you are not using naked functions you do not even have to push eax yourself, because a prolog and an epilog are created by the compiler.
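To illustrate the size specifiers, here is a small sketch (the function name fill_sizes is hypothetical). It assumes 32-bit VC++; elsewhere the fallback branch does the same writes in plain C++. When an immediate is stored through a register, nothing tells the assembler how wide the store is, so byte ptr / word ptr / dword ptr has to say it.

```cpp
// Writes 0xFF to byte 0, 0xFFFF to bytes 2..3 and zero to bytes 4..7
// of an 8-byte buffer. Byte 1 is left untouched.
void fill_sizes(unsigned char* buf)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov edx, buf                  // EDX = address of the buffer
        mov byte ptr [edx], 0xFF      // 1-byte store, size must be stated
        mov word ptr [edx+2], 0xFFFF  // 2-byte store, size must be stated
        xor eax, eax                  // EAX = 0 (2-byte encoding)
        mov dword ptr [edx+4], eax    // 4-byte store, size also implied by EAX
    }
#else
    buf[0] = 0xFF;
    buf[2] = 0xFF;
    buf[3] = 0xFF;
    buf[4] = 0; buf[5] = 0; buf[6] = 0; buf[7] = 0;
#endif
}
```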
kryptonite (36)
PM me the code.
TheNoobie (28)
Going to try and explain myself a little bit better in two separate posts. The first post (this one) is going to try and explain a few things that I guess I didn't make clear before. The second post (below this) is going to ask the same question, reformatted, so that hopefully you guys understand my question better, and it includes an IDA View code snippet of the ACTUAL code that I'm trying to convert to C++.
Hehe alright, maybe I didn't explain myself properly because no one seems to understand what I'm asking, so let me try again. First of all, Duoas, that did help me a little bit, but again it isn't exactly what I was asking. The code that I posted above was something I found in IDA View on Google (no IDA on work computer) and was just a snippet of a much larger disassembly of some random file. That IDA assembly I posted has NOTHING to do with what I want to accomplish; as a matter of fact I don't even have a clue what file it's disassembling.
The reason I posted the IDA View is to show you guys how I have been converting to inline assembly (finding the function in IDA, copying the assembly, then pasting it in the same order into Visual Studio __asm, except obviously changing the formatting so it will compile). I know it's wrong, I was just showing you what I've been doing so far. I also know that c_memset is not a valid assembly expression and I would have to replace it with whatever the address of the function in the exe is.
TheNoobie (28)
Question 1
I've always been a little confused about how C++ knows what function you're writing the assembly for. I have an idea, I'll provide an example with the inline assembly below. Note that this is just assembly written off the top of my head as an example, and most likely is not coded properly or won't compile.
So my guess is that C++ doesn't actually NEED to know what offset the assembly is from. The reason is that when you set whatever registers you plan to set and then call the function you're passing them to, the function you're calling will use these registers to execute whatever it is that it is supposed to do. I may be wrong, this is just a guess and wasn't read anywhere or anything like that.
Question 2
What happens when you find a function you're trying to write assembly for and it uses the same register to declare two different variables that are both needed? Look below for an example, using a replaced MoveLocation function from the code posted in Question 1.
Because if my guess on Question 1 is right, when you use inline assembly and call a function, it takes the passed registers to execute whatever it is that the called function does. So what happens if eax is used for X, Y, and Z? Wouldn't the eax register get reset each time so that you're really just passing eax with the z parameter in the end?
Question 3
So here's my final question, the one that I really came on the forums to ask. I need to convert the following IDA assembly code to inline assembly in Visual Studio. How would I go about doing this? Look below for the code, I removed all the ".text:0xoffset's" on the left to make it easier to read. I've also marked up the 3 areas I will be editing here.
Duoas (1596)
Only if you count the amount of time it takes to load the instruction, since the XOR is two bytes and the MOV EAX is five. (So XOR is still better...)
Going to try to explain my answer a little bit better, but in just one single post because I think you are intelligent enough to notice it without resorting to spamming.
(NB. You are lucky you are getting any response at all. You've just done the internet equivalent of biting the hand that feeds you, then adding insult to injury. If you don't think you have got a satisfactory answer, don't just turn up the volume and repeat the question. The reason I am responding at all is because of the clear effort you made to explain yourself better --so I assume that the inordinate rudeness was an accidental faux pas.)
Answer 1
If you stick inline assembly in the middle of a function, C and C++ presume that the assembly code is part of that very function, and simply insert the assembly code (assembled to machine code, of course) into the function exactly where your __asm statement(s) are. There is no magic to it.
But, if I understand your example correctly, you are wondering about the local variables x, y, and z. Typically, functions have specific entry and exit frame code, or (as jmc referred to it) a prolog and an epilog. Basically it means that a set amount of code is inserted at the beginning and end of a function. It works something like this simplified example:
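The original listing is not preserved here, so what follows is only a sketch of the idea. The function local_plus_one is hypothetical, and the frame code in the comments is what an unoptimized 32-bit x86 build typically emits; the exact instructions, sizes and offsets are up to the compiler. The __asm part assumes 32-bit VC++ and falls back to plain C++ elsewhere.

```cpp
// Hypothetical function used only to illustrate the frame layout.
int local_plus_one()
{
    // Typical prolog, emitted by the compiler and not written by you:
    //     push ebp          save the caller frame pointer
    //     mov  ebp, esp     start the frame of this function
    //     sub  esp, 8       room for locals (a real build may reserve more)
    int z = 42;
    int result = 0;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov eax, z        // assembled as something like mov eax, [ebp-4]
        add eax, 1
        mov result, eax   // also rewritten to an ebp-relative address
    }
#else
    result = z + 1;       // portable fallback, same behaviour
#endif
    return result;
    // Typical epilog, again emitted by the compiler:
    //     mov esp, ebp
    //     pop ebp
    //     ret
}
```

The point is that the names z and result never reach the machine code. The compiler swaps each one for its offset from EBP before assembling the instruction.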
The prolog adjusts the stack's frame pointer (EBP on Intel architectures) and makes room on the stack for all local variables. A "frame" is simply a reference to a specific position on the program's stack. (For purposes of illustration I will assume that the compiler does not optimize that local variable out of existence and that we are using the default cdecl calling convention.)
So, when the compiler comes across something like
mov eax, [z]
it knows that you want to move the contents of memory at address 'z' into the EAX register. The compiler looks in its current symbol table and discovers that 'z' is a local variable at (in my example) ebp-4. So it replaces [z] with [ebp-4] and assembles that into machine code. When you call a function, like
call _quux
again the compiler looks in its current symbol table to find the actual address of the function 'quux', and again produces the correct machine code.
Exact details of how the stack frame is constructed and used depend on the function's calling convention. There are many besides the old cdecl. (Hopefully you understand now why I said that you must know basic assembly to use this stuff.)
Answer 2
The registers EAX, ECX, and EDX are typically understood to be "use as you will". Other registers (including EBX) may (or may not) have restrictions on them. Exactly which registers you can clobber and which registers you must preserve depend on your compiler. You will have to read the documentation to know for sure.
Your assumption is wrong: the variables aren't being declared. They already exist. They are just being used. In the example you posted, EAX is used as a temporary value. (It wasn't strictly necessary --the code could have been written as just
In any case, EAX is changed each time you assign it a value (using the MOV instruction, or any other instruction that modifies it). Now, if you were using the __fastcall calling convention (MS VC++), you could just use ECX and EDX for the first two arguments. The function
void __fastcall MovePlayerToXYZCoords( float x, float y, float z )
would be called with:
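The call sequence that followed is not preserved here. Below is a hedged sketch of a __fastcall call from inline assembly with a made-up function, scale_add. It deliberately uses int parameters: Microsoft documents ECX and EDX as carrying the first two DWORD-or-smaller arguments, and as far as I know float arguments still go on the stack, so check the compiler output before relying on registers for the float signature above. FASTCALL is a helper macro introduced only so the sketch also builds outside 32-bit VC++.

```cpp
#if defined(_MSC_VER) && defined(_M_IX86)
#define FASTCALL __fastcall
#else
#define FASTCALL            // no-op where __fastcall is not available
#endif

// Hypothetical function: a and b arrive in ECX and EDX, c on the stack.
int FASTCALL scale_add(int a, int b, int c)
{
    return a * b + c;
}

int call_scale_add(int a, int b, int c)
{
    int result = 0;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        push c            // third argument goes on the stack
        mov  edx, b       // second argument in EDX
        mov  ecx, a       // first argument in ECX
        call scale_add
        // no add esp here: with __fastcall the callee pops stack arguments
        mov  result, eax
    }
#else
    result = scale_add(a, b, c);   // portable fallback
#endif
    return result;
}
```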
Answer 3
All those things named 'dword_12345' are the names the IDA disassembler gives to things it doesn't know the names of. This is typical of global or static variables (which you already named in your original program) and temporary values produced by the compiler (which don't have a name to begin with).
You cannot use these IDA names in your inline assembly. The C++ compiler would have no clue what you are talking about. And, even if you did by some lucky miracle get some name right:
1) it could change the next time you compile and/or disassemble
2) the name wouldn't be listed in the local symbol table the compiler keeps for user names. (It keeps a separate table for symbols that it makes up.)
Your first task will be to figure out what each of those names refers to.
The things with names like 'loc_40A365' are address labels. If there isn't a label at the given address, you'll have to add one.
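As an illustration of replacing 'loc_' labels with your own, here is a small sketch with hypothetical names (sum_down, l_loop, l_done). It is not taken from the function being discussed. It assumes 32-bit VC++, with a plain C++ loop as the fallback elsewhere.

```cpp
// Adds n + (n-1) + ... + 1 and returns 0 for n <= 0.
int sum_down(int n)
{
    int total = 0;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov  ecx, n       // ECX = loop counter
        xor  eax, eax     // EAX = running total
        test ecx, ecx
        jle  l_done       // nothing to add when n <= 0
    l_loop:               // IDA would show this as something like loc_XXXXXX
        add  eax, ecx
        dec  ecx
        jnz  l_loop
    l_done:
        mov  total, eax
    }
#else
    for (int i = n; i > 0; --i) {
        total += i;       // portable fallback, same result
    }
#endif
    return total;
}
```

Every jump target in the pasted listing needs a label like this inside the __asm block, because the original addresses mean nothing once the code is recompiled.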
Make sure you look up the meanings of the instructions you are using. RETN means "near return" --are you using a medium, small, or tiny memory model?
Beyond that I cannot guess much to what the code you posted means or does. Hopefully this helps you get nearer to understanding your goal.
TheNoobie (28)
I have not read your entire reply yet, I am just about to. Right now I just read that first comment you made about how I am lucky you responded at all. I am extremely sorry if you took my response the wrong way, I absolutely did not mean to sound rude in any way. I just simply thought to myself that I must have worded the question wrong, because even though it sounded right to me it seemed as though the responses people gave me had thought I was asking about something I wasn't.
Please don't hold my response against me, because again, I absolutely meant no disrespect to anyone at all. I want everyone that may have taken offense to what I said to please not hold it against me. I highly respect everyone on this website that devotes their personal time to helping others, sorry if it came out differently than that.
Thanks for the big write up Duoas, I'm going to read it now. To everyone else that offered their help, I appreciate that as well - I wasn't saying I didn't appreciate it, merely that I think you may have misunderstood what I was asking.
TheNoobie (28)
Yes, that did help me a ton, thanks. Full IDA view of the function is found here - http://pastebin.com/d376b6aea
Thanks a lot for that write-up Duoas, much appreciated and well written. I'm going to be doing a little more research and learning on assembly so I can better understand everything. I provided you a link with the full process (not much was missing), but if you do not want to convert that to inline for me then it's no problem, I need to learn more myself regardless. Any further help is appreciated, but what's provided is already more than I expected.
Duoas (1596)
That link was useful. If you get hung up anywhere I (or someone else here) will be glad to help.
It looks like you can pretty much copy most of the function into some __asm blocks. Just a few pointers:
Make sure that you declare your C routine as a near call. (In VC++ I think you have to tag it with __near --but I can't get the darn thing to install yet so I haven't used it in a while...)
Oh yeah, almost forgot: you'll have to find all the places where the old function was called and fix them to call your new function's address. You can only do this after you have compiled-in the new code.
Rename the labels (that start with 'loc_') to something better, like l_send_message: and l_return:
For the labels that start with 'sub_', find in the IDA listing the line with the address. For example, for 'sub_658FB0', find the routine that starts at 658FB0 (the front of the line in IDA should read
At this point, you can see the code for that function, so you can try to figure out what its actual name is. Once you know, replace the 'sub_658FB0' with the actual name --case sensitive. You may also have to mangle the name to match the calling convention and your compiler's type info additions.
For the stuff that starts 'dword_', 'word_', and 'byte_', you should be able to find those in the .data segment (using the same way you did for the routine). However, chances are that you can just drop the address in there directly (assuming you don't change any variables or constants in the program --otherwise you'll have to use its proper name).
Don't put the retn statement in. Let the compiler do that itself for the wrapper function.
Test your modified program as thoroughly as you can. You will probably have to try a few times to get it right. Good luck!
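For the case where a 'sub_' routine is reached by address rather than by name, here is a rough sketch of an indirect call from inline assembly. Everything in it is hypothetical: twice stands in for the unknown routine, and the sketch assumes that routine is cdecl and takes one int, which you would have to confirm from the disassembly. In a real patch the pointer would come from the routine's address in the target process; here it simply points at twice so the example can run on its own.

```cpp
// Pointer type for a cdecl routine taking and returning one int (assumed).
typedef int (*unary_fn)(int);

// Stand-in for the routine IDA shows as sub_XXXXXX.
int twice(int v)
{
    return v * 2;
}

// Calls whatever routine 'target' points at with one argument.
int call_by_address(unary_fn target, int v)
{
    int result = 0;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        push v
        call dword ptr [target]   // indirect call through the pointer variable
        add  esp, 4               // cdecl: the caller removes the argument
        mov  result, eax
    }
#else
    result = target(v);           // portable fallback
#endif
    return result;
}
```

Note there is no retn anywhere in the block: call_by_address gets its own epilog and return from the compiler.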
This topic is archived - New replies not allowed.
