Some confusions...

Hello all of you....
I'm a beginner in C++, and there are some things about the language that confuse me.
First of all, I want to know:
Why can't a program written in C++ be converted back to its source code? :/
And does it make any difference which compiler is used?

Thanks in Advance

Regards
Butch Cavendish :)
Why can't a program written in C++ be converted back to its source code?
After compiling the program you have binary code, which is a sequence of instructions for the processor. The CPU has no notion of variables, loops, objects, and many other things, so they are either dropped (if they are needed only at compile time) or converted into something else.

For example, it is impossible to tell an int from a bool in the resulting executable, because the CPU has no bools. Nor can you tell which kind of loop was used (while, do-while, or for) unless you know the quirks of the specific compiler. There are rarely any traces of std::unique_ptr either (it looks and behaves like a raw pointer). References are indistinguishable from pointers (they are commonly implemented as pointers internally).
Inlined functions, macros, and templates cannot be reconstructed because of their nature. Code that was excluded by a preprocessor condition does not appear in the executable at all, for obvious reasons.

Then there are optimisations, which can completely change the structure of the code. For example, a function can disappear entirely because it got inlined. There is also operation reordering, dead code elimination, compile-time calculation...

Therefore it is impossible to reconstruct the original source. At best you can get source code that works like the original but usually lacks all the high-level abstractions.
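To make the points above a bit more concrete, here is a small sketch. All the names in it (bump_ref, bump_ptr, kAnswer, twice, demo_ref, demo_ptr) are made up for illustration. The assumption is a typical optimising compiler: the reference version and the pointer version will usually end up as the same instructions, and the constant and the small helper are likely to be folded away, but none of that is guaranteed by the language.

```cpp
// Hypothetical names; nothing here comes from a real program.

// Increment through a reference.
constexpr void bump_ref(int& value) { ++value; }

// Increment through a raw pointer. In the executable this usually
// looks the same as bump_ref, since references are commonly
// implemented as pointers.
constexpr void bump_ptr(int* value) { ++*value; }

// A named constant and a tiny helper. After inlining and constant
// folding, typically neither name nor call survives.
constexpr int kAnswer = 42;
constexpr int twice(int n) { return n * 2; }

// Two differently written routines with the same observable result.
constexpr int demo_ref() { int v = kAnswer; bump_ref(v); return twice(v); }
constexpr int demo_ptr() { int v = kAnswer; bump_ptr(&v); return twice(v); }
```

Both demo_ref() and demo_ptr() return 86. An optimising compiler is free to replace either one with just that number, at which point the references, pointers, helper, and constant are all gone.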
Can't we get back to the source code after compiling by exactly reversing the process that was carried out at compile time?
And is it possible to make a compiler that can reproduce the source code from an executable file it generated itself?

Thanks for your previous Answer :? ++Rep :)
Can't we get back to the source code after compiling by exactly reversing the process that was carried out at compile time?
The problem here is that there are multiple, perhaps almost infinite versions of source code which will all generate the same final executable program. That's a many-to-one relationship. In the forward direction that's fine. But when attempting to reverse it, one is faced with a fork in the road having multiple possibilities. Which path should one choose when each is equally valid? Any such source would be purely arbitrary and might not resemble the actual original source at all.
33

↑ This number is the result of some arithmetic operations. Reconstruct the original formula.
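Here is the same puzzle as code. The four functions are invented for illustration (they are not "the" original formula, which is exactly the point). With constant folding, a compiler can reduce each of them to the bare constant 33, so the result alone cannot tell you which one was written.

```cpp
// Four hypothetical formulas; each evaluates to the same number.
constexpr int formula_a() { return 3 * 11; }
constexpr int formula_b() { return 30 + 3; }
constexpr int formula_c() { return (100 - 1) / 3; }
constexpr int formula_d() { return (1 << 5) + 1; }
```

Every one of these returns 33, and there are endlessly many more formulas that do the same.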

You can also think about things like simplification:
((a^2 + 2ab + b^2)(a - b))/(a + b) → ((a + b)^2 (a - b))/(a + b) → (a + b)(a - b) → a^2 - b^2
You cannot reconstruct the original from the simplified formula.
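A quick sketch of that simplification in C++, with made-up function names (original, simplified). It assumes a + b is not zero and that the values are small enough not to overflow. It only shows that the two forms agree on their results; whether a given compiler actually performs this particular rewrite is not something I'm claiming.

```cpp
// The formula as first written. Requires a + b != 0.
constexpr long long original(long long a, long long b) {
    return ((a * a + 2 * a * b + b * b) * (a - b)) / (a + b);
}

// The fully simplified form.
constexpr long long simplified(long long a, long long b) {
    return a * a - b * b;
}
```

For example, original(5, 3) and simplified(5, 3) both give 16. Given only the simplified form, there is no way to tell that it started out as the longer one.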

Another example:
L1:
add eax, ebx;
jmp L1; 
This can be the result of any of the following:
while(true)
    a += b;
//...
do
    a += b;
while (true);
//...
for(;;)
    a += b;
Guess which.
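If you want something you can actually run, here is a variation on the three loops. The exit condition (limit) and the function names are my own additions so that the loops terminate; it assumes b > 0. The three bodies are identical apart from the loop keyword, so a compiler has no reason to treat them differently, although the exact output depends on the compiler and its settings.

```cpp
// Same loop written three ways. The limit check is an added
// assumption so the loop ends; b must be positive.
constexpr int with_while(int a, int b, int limit) {
    while (true) {
        a += b;
        if (a >= limit) break;
    }
    return a;
}

constexpr int with_do_while(int a, int b, int limit) {
    do {
        a += b;
        if (a >= limit) break;
    } while (true);
    return a;
}

constexpr int with_for(int a, int b, int limit) {
    for (;;) {
        a += b;
        if (a >= limit) break;
    }
    return a;
}
```

Called with (0, 3, 10), all three return 12. You could compare the assembly your compiler produces for each one and see whether you can tell them apart.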

Also, variable names are not stored in the program (unless it was built with debug information). Where would you get them from?

In my previous post I also gave examples of things that cannot be reconstructed because of their very nature.

Decompilers that generate code which works like your program do exist. Decompilers that reconstruct the original source are impossible.
The problem here is that there are multiple, perhaps almost infinite versions of source code which will all generate the same final executable program. That's a many-to-one relationship.

What does that mean?
I gave an example of that: three loops which are compiled to the same code.
Four formulas that are essentially the same.

This code:
mov %eax, %ebx;
shl $1, %eax;
add %ebx, %eax;
could come from any of these:
y = x;
x <<= 1;
x += y;
//...
x = (x << 1) + x;
//...
x *= 3;
On top of this, the compiler may generate the following from any of these examples:
imul $3, %eax;

Here we actually have a many-to-many relationship: several different sources can result in the same executable, and the same source can result in different executables.
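The three source variants can be written out as complete functions like this. The function names are invented, and I've used unsigned so that the shift is well defined for every input. Which instructions come out (shift and add, a multiply, or something else) depends on the compiler and the optimisation level.

```cpp
// Three hypothetical ways to write "multiply by three".

// Step by step, mirroring the mov / shl / add sequence.
constexpr unsigned triple_steps(unsigned x) {
    unsigned y = x;  // y = x
    x <<= 1;         // x = 2 * original x
    x += y;          // x = 3 * original x
    return x;
}

// As a single shift-and-add expression.
constexpr unsigned triple_expr(unsigned x) { return (x << 1) + x; }

// As a plain multiplication.
constexpr unsigned triple_mul(unsigned x) { return x * 3; }
```

All three return 21 for an input of 7, and a compiler is free to emit the same instructions for each of them, or different instructions for the same one under different settings.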
I read this somewhere under the Time Complexity topics in Data Structures, but I really don't know what it means. I'd be very thankful for any help :)
"3-4 GHz processors on the market
– still …
– researchers estimate that the computation of various transformations for 1 single DNA chain for one single protein on a 1 terahertz computer would take about 1 year to run to completion"