Encryption: The Binary side of things...

Hello. I am 'teaching' myself C++ (internet/forums, etc...) and have written some pretty nice programs. I have also written an encryption algorithm, and understand why the longer an encryption algorithm is, the harder it is to "hack" it.

I wrote my algorithm so that it would be easy to add to any program by simply copying the header and the .cpp file. Right now, all it does is encrypt text, so the output isn't necessarily that garbled. I would like to learn more advanced encryption techniques, but the ones I have found (google...) are hard to understand, or their explanations are too vague for me to see how to implement them.

I specifically want to learn binary encryption (bit-shift, etc...). I know how to open a file for binary, but from then on, nothing. Could someone give me some example code and explain it in detail (very thoroughly; I want to know what each and every function does)? As I said, I am trying to teach myself C++.

Thank you for your time!
understand why the longer an encryption algorithm is, the harder it is to "hack" it
I'm sorry to inform you that you understand something that's false. Some encryption techniques can be described in just a few sentences and accordingly short programs, but are capable of generating cyphertexts impossible to break.

Suppose you have an integer x in the range [0;255] which you want to encrypt. If you take a random number y in the same interval, it's impossible to get back x from the result of (x XOR y) without knowing y. Congratulations; you can now encrypt any kind of data and implement a one-time pad, one of the techniques I was referring to in the previous paragraph.
@helios I'm talking about hacking. There are people who will decompile a program and find the encryption algorithm. I think it is self-explanatory from there. The goal in writing a longer encryption algorithm is that, when seen in assembly, it is harder to separate the algorithm from the rest of the program's tasks.

But that is beside the point. Please only post answers to my question. Thank you.
Which is why the best encryption algorithms assume that an attacker knows the algorithm that was used to generate the cyphertext.

For example, I say to you: "17 is what I got after adding two integers. Figure out one of the integers I used."
There's no decompilation to be done; I told you what I did to get the encrypted data. Can you answer my question? No, because + doesn't put in the output any information about the operands, and because the search space is enormous. It's simply impossible for you to know with certainty which two numbers I added if I don't tell you at least one of them.

Relying on the fact that an attacker doesn't know the method of encryption is known as "security through obscurity", and it's generally regarded as the weakest and most easily defeated form of security. Mainly because it's remarkably hard to come up with good encryption algorithms if you don't have half a dozen or so applied mathematicians.
helios... when you decompile a program, you get assembly code. It doesn't mean jack if no one knows what the algorithm is; I can literally see that x = 10 + 7.

If you post again and it is not an answer, you will not receive a response, and I believe repeated irrelevancies are reportable. Stick to the subject at hand.
i can literally see that x = 10 + 7


That's just silly. The key will never be hard-coded. That would mean that if you found the key for a single instance of that encryption, you could use the same key for every other instance. What helios was referring to is a property that's pretty much dominant in current encryption: it's computationally infeasible to get the key back. Look up the RSA encryption algorithm. It's used all over the place. Its public key contains the product of two large prime numbers, and the private key can only be derived by someone who knows those two primes. As far as anyone publicly knows, the only practical way for an attacker to get them is to factor that product, and at the sizes used (hundreds of digits) no known classical algorithm can do that in a reasonable amount of time.

Until quantum computing gets anywhere reasonable, this technique will likely be the "best" way to encrypt data.
It doesn't mean jack if no one knows what the algorithm is

Are you even reading my replies? Encryption algorithms are publicly known. Everyone knows how the data was encrypted.

i can literally see that x = 10 + 7
Except you don't see that. What you see is
cyphertext = plaintext + key
key is a run-time variable. It's not stored in the program, and it's known only to the person who performed the encryption and the person who will perform the decryption. You cannot and will not figure it out by just looking at the executable. And this is remarkably easy to prove.
I just used this program to generate the cyphertext 1236:
#include <iostream>

//#define ENCRYPT

int main(){
	unsigned plaintext,
		key,
		cyphertext;
#ifdef ENCRYPT
	std::cout <<"Encryption\n";
	std::cout <<"Plaintext:\n";
	std::cin >>plaintext;
	std::cout <<"Key:\n";
	std::cin >>key;
	cyphertext=plaintext+key;
	std::cout <<"Cyphertext: "<<cyphertext<<std::endl;
#else
	std::cout <<"Decryption\n";
	std::cout <<"Cyphertext:\n";
	std::cin >>cyphertext;
	std::cout <<"Key:\n";
	std::cin >>key;
	plaintext=cyphertext-key;
	std::cout <<"Plaintext: "<<plaintext<<std::endl;
#endif
}

There, you have the source and the cyphertext. You don't even need to do any reverse engineering. Can you figure out the plain text and/or the key?

EDIT:
if you post again and it is not an answer, you will not receive a response
I gave you an answer in my first post in this thread. You just decided to ignore it and hang on to the irrelevant bit of my post.
@helios
Judging by your profile, you seem (seem) to be experienced, so I will not argue. Your first post DID NOT answer my question. I want to know how to literally modify the bits of a character.
I want to know how to literally modify the bits of a character


You're pretty much just stuck with the bitwise operators for this.
http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Bitwise_operators
It did. All it takes is to know how to see implications.

#include <iostream>
#include <cstdlib>
#include <ctime>

void f(char *text,char *key,size_t size){
	for (size_t a=0;a<size;a++)
		text[a]^=key[a];
}

int main(){
	srand(time(0));
	char hello[]="Hello, World!";
	const size_t n=sizeof(hello)-1;
	
	char key[n];
	for (size_t a=0;a<n;a++)
		key[a]=(char)rand();
	
	std::cout <<hello<<std::endl;
	
	f(hello,key,n);
	std::cout <<hello<<std::endl;
	
	f(hello,key,n);
	std::cout <<hello<<std::endl;
	return 0;
}
@helios
I have some questions:

char *text means text is a pointer (or a memory variable)? Why is this necessary? (same for key, why?)

What does the ^= operator mean?

Why are you using a variable to declare a variable? (size_t a)

Can you use a vector instead of an array?

You're casting rand as a character, can you cast any number (float, int, etc..) as a character?

How can you decrypt a file if the encryption is randomly determined?

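char *text means text is a pointer (or a memory variable)? Why is this necessary?
Because a single character is not enough to represent a string of text. text points to the first character of an array, and size says how many characters it holds. The same goes for key.

What does the ^= operator mean?
Bitwise XOR and assign. text[a]^=key[a] is short for text[a]=text[a]^key[a].

Why are you using a variable to declare a variable?
size_t is a type, obviously, or the program wouldn't compile. It's the unsigned integer type used for sizes and array indices.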

Can you use a vector instead of an array?
Yes.

You're casting rand as a character, can you cast any number (float, int, etc..) as a character?
Characters are integers. If you can cast something to int then you can cast it to char.

How can you decrypt a file if the encryption is randomly determined?
The program is a proof of concept. Rather than randomizing the key, I could have had the user input it. It's just easier like this.
Even if the key is randomized, as long as it's known by both the sender and the receiver, decryption is possible.
Thank you, helios, for this insightful information.

I also want to add:

If characters are integers, would this be possible:

int x = 10; // num between 0-25
char ch;
ch = char(x);
// would char(x) == 'j'?


In other words, is it possible to reverse the process and turn integers into characters, assuming the integers are less than 25?
You can set a char to equal an int constant, but the character value wouldn't be 'j'. The ASCII value for 'j' is 106 (http://www.asciitable.com/). The cast in ch = char(x); is legal but unnecessary; ch = x; is enough.
Yes, the conversion is legal. However, I only said that chars are integers. I never said anything about what character any specific value represents or what ranges of values a char can hold.

Regarding the former, it's possible that a platform has 'a' mapped to 13 and 'b' mapped to 83. Making assumptions in this regard is not theoretically safe. Practically, the ASCII mapping is nearly universal. In ASCII, 'A' maps to 65, 'a' to 97, '0' to 48, and alphanumeric characters are sorted and contiguous by class.
If one wanted to not make assumptions about the mappings, character literals (e.g. 'a', '#', '\\', and so on) can be used, which the compiler translates to the appropriate integers for the platform.
As a mostly historic note, the competing code page for the sub-128 range is EBCDIC, a horrible abomination by IBM.

Regarding the latter, by definition char has to hold exactly as many different values as a byte can hold. This doesn't say much because a byte can technically be of any size. Some computers used to use 12-bit bytes. However, this is also mostly a theoretical concern, because I haven't heard of any modern computer with non-8-bit bytes. Meaning, a char can be reasonably safely assumed to be able to hold 256 values.
There's one other concern which isn't theoretical: the signedness of char. The standard allows the implementation to decide whether 'char' behaves like 'signed char' or like 'unsigned char', meaning char could be a non-negative integer lower than 256, or it could be one in the range [-128;127], or even [-127;127] plus a negative zero.

In other words, char(10) != 'j' with a high degree of certainty.

EDIT: On the other hand, casting from char to int is rarely necessary. If you need to perform operations on it, char admits them, since it is an integer:
char x=10;
x*=10;
x-=11;
x/=2;
x %=4;
Hmm... I think I will just stick with my old technique (using vectors to assign character IDs, and shifting the character using its position in the string). I'm not so worried about decryption of my algorithm anyway.

Regarding the former, it's possible that a platform has 'a' mapped to 13 and 'b' mapped to 83. Making assumptions in this regard is not theoretically safe. Practically, the ASCII mapping is nearly universal.


I have no clue what you're talking about. From the way you describe it, I assume ASCII is the standard way of identifying the characters on the keyboard. I will probably learn that when I start my semester.
I have no clue what you're talking about.
I mean that it's not safe to make assumptions about the output of this:
std::cout <<(int)'a'<<std::endl
    <<(int)'b'<<std::endl;
The keyboard has little to do with what I'm talking about.

I think I will just stick with my old technique
You mean the one that used a quadratic function composed with a modulo operation to create an irreversible non-encryption function? Good luck with that.
@helios
You mean the one that used a quadratic function composed with a modulo operation to create an irreversible non-encryption function? Good luck with that.


It works, dude. I made it in a way that I can easily put it into any program. And since it takes a team of mathematicians to make a really secure encryption, I think I will just settle for protection against all script kiddies/computer geeks (anyone who isn't a professional). Obviously it isn't completely solid, but it is way better than absolutely nothing, and I prefer something.

I mean that it's not safe to make assumptions about the output of this


I thought of something: toupper(ch) returns a number. So if, on startup, the program went through a 'calibration' (go through a-z and A-Z and store all their values in a vector), we could make a program that would be able to effectively encrypt/decrypt something. (This is a theory; I haven't tried it and I'm tired right now.)
It works dude.
Sure, sure.

And since it takes a team of mathematicians to make a really secure encryption, I think I will just settle for protection against all script kiddies/computer geeks (anyone who isn't a professional). Obviously it isn't completely solid, but it is way better than absolutely nothing, and I prefer something.
If you're trying to get practical results, don't roll your own algorithm. You're just making it easier for the attackers. Seriously, a cryptographic algorithm that only takes a plaintext (without a key) as input is among the weakest forms of security, as it preserves the entropy of the input. Any such transformation isn't too different from ROT13.
For practical results, just use any of the widespread encryption algorithms, such as AES or RSA. Many of these algorithms are conveniently included in the Crypto++ library:
http://www.cryptopp.com/

Again, don't roll your own encryption for practical purposes. The fact that you can't figure out how to break it is meaningless.
http://security.stackexchange.com/questions/2202/lessons-learned-and-misconceptions-regarding-encryption-and-cryptology

So if, on startup, the program went through a 'calibration' (go through a-z and A-Z and store all their values in a vector), we could make a program that would be able to effectively encrypt/decrypt something. (This is a theory; I haven't tried it and I'm tired right now.)
1. What would this accomplish?
2. How is knowing that 'a' binds to one particular number rather than another helpful in any way, as far as encryption is concerned?
You seem to have very confused notions of how computers deal with various types of information. It's possible to encrypt a message without any understanding of what kind of information it contains, and in fact this is the sensible way to do it.
@helios

I'm not using this program to store SSN numbers. It's a ballot program, so it doesn't matter what encryption it has. I just want to have the files protected from prying eyes that don't know what a decompiler is. That covers about 98% of the people that will be using this program. That is good enough for me, as the data that's being 'protected' isn't even that sensitive.