Signature scrambler?

Hello everyone! So I am looking for some solutions on how I would go about writing a signature scrambler program. What I eventually want to be able to do is to create a piece of software that can accept an .exe file and randomize the source code of that .exe (with junk that does not affect the functionality) in order to obtain different signatures at any time.

Is there anyone here who has experience with this? A method of approach, or a reference to documentation that might guide me in the right direction?

All help appreciated.

----------------------------------------------------------------------------------------------
Edit:

It occurred to me: if I define macros, for example
#define myOwnBreak break;

and include that in the top section of my .cpp file, then iterate over the .cpp file and replace every occurrence, something like
if(line.ReadLine() == "break")
// replace break with myOwnBreak;

Wouldn't that approach work? It should generate different signatures if I include some random aspect in it, shouldn't it?
What is a signature, and what does it mean to scramble it?
Hey. The program signature of the .exe (which is based on the combined code of a program).

I do not really know how to describe it, but a signature for a function, for example, would be the unique bytes for that specific function. You can see the program signature (.exe) as an identifier that describes the "code" that lies in the .exe.

Basically, signatures reflect what a program is doing, and a signature is based on what is inside the .exe.

So if I were to write a program

int k = 5;

this will have a signature. But if I instead write a program

bool fs = true;

the signature will be different. Common anti-cheat systems use signature scanning of processes in order to detect publicly known cheats. So if that signature is found, they are able to differentiate between legit players and cheating players.

Hope I described it well enough?

Scrambling code means changing aspects of the source code (that do not affect functionality) in order to obtain different signatures.
I probably shouldn’t be here, but:
Why would that change the signature if you're using a macro? It is immediately expanded by the preprocessor, so wouldn't it just end up having the same sig? What if instead you made an encapsulating function that just uses the operators, and made classes that just wrap the elemental types?
What I eventually want to be able to do is to create a piece of software that can accept an .exe file and randomize the source code of that .exe (with junk that does not affect the functionality) in order to obtain different signatures at any time.

An executable file doesn't contain source code. Your source code is only used by your C++ compiler in the process of translating that C++ program into machine language.

Machine language is a representation of the instructions that your machine can directly respond to. It is very primitive.

A signature for a function for example would be unique bytes for that specific function.

I thought the term you might be looking for is cryptographic hash. See:
https://en.wikipedia.org/wiki/Cryptographic_hash_function
For most practical purposes a cryptographic hash uniquely identifies a dataset (e.g., some representation of a function), although it can't be used to get the data back. The good thing about decent hash functions is that any minuscule change in the input results in a large, unpredictable change in the output, which makes your job easy.

Maybe of interest is Christian Collberg's textbook Surreptitious Software, ISBN 978-0321549259.
I believe this is done to a degree by virus scanners.
they can look in the machine code of the exe for patterns that are parts of known viruses and throw a flag.


the difference is the pattern match is done against their database file, not injected into the program, and it's not breaking it down by routines, it's just looking for a pattern.


you can break machine code into 'functions' but it's not going to look like the source, and I can't see any benefit to this. It goes a bit nuts when the compiler has inlined functions or eliminated them (optimized out) or the like.


I would attempt this on something like python instead, where the source code is what it is (interpreted languages). And for that a crypto hash would work: find the begin and end of the block and hash it, put it in a comment in a known location (say right above the function def) and see if you can manage that. But dealing in compiled code... it won't be portable and is very hard to do for c++. It's doable, but you would want a disassembler and you need a good working knowledge of the binary format so you can put the signature in without damaging the program (in the wrong place it will become data or instructions and break things). A crypto hash would be very different for even a single space difference in the code. If you want to find nearly identical code, you would have to clean it first, and possibly replace variable names with some tag as well.

you can also do this on the c++ source code (like the python idea). I dunno if that is useful to you.
Hmm okay. Can anyone explain how this works then?
https://github.com/yash-bansod/CodeScrambler

Because this has the same functionality as I explained earlier with #define and typedef. It is supposed to be a code scrambler that replaces keywords with custom user-defined macros, which will generate different signatures once the project is compiled.

What it does is take a .cpp file, search through it, and replace common keywords with junk names.

I also found this website, which is supposed to do what I am trying to do. This would obviously generate different signatures though? Am I wrong? http://stunnix.com/prod/cxxo/
What it does is that it takes a .cpp file

You said exe file.

And defines and typedefs (used in the way you are describing) don't change anything in the exe, only in the cpp.

However, the obfuscator in your second link will produce a different signature. Its purpose is to make the code harder to reverse-engineer by randomizing symbols in the exe so that they don't give any hints as to the structure and functionality.
Yes, sorry, that is what I meant. What I am doing is compiling/building an .exe from my C++ project. So right before I build my exe, I will run through my .cpp file with my own code scrambler that will insert junk names defined with

typedef & #define

and just replace the associated names with my own defined names. Would that generate different signatures once I have built my .exe from that process?

No it won't, as I already said.

Do you really think these two programs will produce different executables?

// Program 1

#include <iostream>

#define MYFOR for

typedef int MYINT;

int main()
{
    MYFOR (MYINT i = 0; i < 10; ++i)
        std::cout << i << '\n';
}


// Program 2

#include <iostream>

int main()
{
    for (int i = 0; i < 10; ++i)
        std::cout << i << '\n';
}


Maybe just adding a random global string would change the signature, as long as it doesn't get optimized out.
 
char SignatureRandomizer[] {"owiefohgdhslkjslfdj"};

Yep, you're right, now I understand. So I've read up on some things, and what I need to do instead is take in an .exe, then programmatically read assembly code from it. Based on known instructions, I can replace them with equivalent instructions that have the same functionality but different byte code. That way, when someone reverse engineers my .exe and derives a signature for a specific region, that region's signature will be different from my other .exe, since I've replaced the byte codes.

This is what i need to do.

For example, the functionality of these two

mov eax, 0
and
xor eax, eax

is equivalent, but they are different byte operations. Any suggestions how I would read assembly from an .exe?
there are disassemblers out there. Look for an open source one.
this is extremely risky. You *will* break something at some point if you do this, and you won't know it until someone runs the program and gets a weird bug that was not in the original.
EXE is not assembly. It's machine language. Assembly is 'almost' machine language, but it's also different enough that tampering without a great deal of knowledge == disaster. It can be done, of course. Also, changing exe files like this can set off a billionty alarms in virus scanners / watchdog services.

Another thing: the above 2 statements may DO the same thing but they may also be radically different in performance. If this is in the middle of the graphics engine's deep inner loop, this could take the frame rate down from 80 FPS to 60 FPS, if the difference were significant. There will be cases where looking at one instruction is not enough as well. You may make it the 'same' but miss a side-effect that was critical for the next instruction. Many of these instructions set cpu flags and things differently from each other.
Any suggestions how I would read assembly from an .exe?
In Visual Studio, you can actually open the .exe as a file directly, and it will show the entire instruction list. If you don't have the source code/associated PDB file, VS will just step through each assembly instruction directly (start the debugger with F11).

https://www.youtube.com/watch?v=w0sz5WbS5AM
xor eax, eax is 2 bytes long, mov eax, 0 is 5 bytes :)
xor can be predicated easily by CPU magic to effectively have 0 cost, for reasons I don't understand.
Ask Matt Godbolt.
if you have the code, you can generate the assembler yourself, modify it as text file into your various versions, and assemble them to exe from there (skip the compiler). This is safer, but it will again damage performance and may have the side effects and other weirdness.

if you have the code, you can generate the assembler yourself, modify it as text file into your various versions, and assemble them to exe from there (skip the compiler). This is safer, but it will again damage performance and may have the side effects and other weirdness.

That is what I was looking for! g++ has an option to emit pure assembly code from a .cpp source file. I am using that to generate my assembly.
Now, how do I assemble a .asm source file into an exe? This is exactly the information I have been trying to find, as modifying assembly is a bit easier.

If you please have any links or methods on how i would be able to assemble my .asm code into the .exe, that would be very helpful.

Also, someone mentioned mov eax, 0 is 5 bytes and xor eax, eax is 2 bytes. I will fill the remaining bytes with NOPs in the engine I am writing :) , or other garbage junk code that does not affect functionality.
If you please have any links or methods on how i would be able to assemble my .asm code into the .exe, that would be very helpful.
Just call G++ again.

Example: Pay attention to the command-line at the bottom of the window.
http://coliru.stacked-crooked.com/a/ff091e49202dc290

Just call G++ again.

Example: Pay attention to the command-line at the bottom of the window.
http://coliru.stacked-crooked.com/a/ff091e49202dc290


Thanks. Yep, I recently found the g++ command lines to
(1) generate assembly from the .cpp file
(2) modify the assembly file (this is my polymorph engine)
(3) generate an object file from the assembly
(4) link the object file into an .exe
Done!

This will be my method of approach.

Also, someone mentioned mov eax, 0 is 5 bytes and xor eax, eax is 2 bytes. I will fill the remaining bytes with NOPs in the engine I am writing :) , or other garbage junk code that does not affect functionality.

your junk is going to be seen as a bad instruction unless it's in the padding, and if it's in the padding, it was already junked. Let the assembler do it for you.

IIRC on Windows you can append bytes to the end of an exe with no effect, if you want to make the file sizes match. The end of the program is hit and the junk is ignored. This is not true on all types of systems, and it may not still be true on Windows, but you can take a look.
Well, I do not care about the file size, nor do I care about useless instructions, because that is what a polymorphic engine is for: changing up byte patterns throughout the program in order to bypass byte/signature recognition.

Yes, and junk code is useless instructions; that is what it is meant to be.

If I were to insert junk only at the end, some parts of my program could still be caught by signature recognition.
Hmm. This thread just took a shady turn :(