binary code

Here is an example of random binary code:

000000000110000001100000000000000000000001000000
000100000000000001110000001001000010000001101001
000100000000000000101001011001010110010000100000
000000000000000001010100000000000110001000100001
001011000110000001010000001000000010100100000010
000000000000000000001001000000100111010000100001
010000000110000101000010000101000110000101000110
001000000010000000000100000000100001011100000101

The code above is only for illustration.

My question is:

How do I decode a random binary code to plain text?

And what are, generally, the methods used to decode a random binary code?

Bourgond Aries (415)

Assuming a char is one byte of 8 bits, divide the binary string into chunks of 8.

For the first (rightmost, least significant) bit of a chunk, you have the equation:

bit * 2^0

For the second:

bit * 2^1

Third:

bit * 2^2

etc.

Adding all those together gives you a number that is between 0 and 255.
You can simply cast this number into a char.

Another way to do it would be to fiddle around with binary operators, which would probably be a little faster.
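The weighted sum described above can be sketched as follows. This is only an illustration, assuming the input is a string of '0' and '1' characters with the most significant bit written first; the names chunk_to_value and decode_bits are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert one 8-character chunk of '0'/'1' into 0..255.
// The rightmost character is bit 0 (weight 2^0). Returns -1 on malformed input.
int chunk_to_value(const std::string& chunk)
{
    if (chunk.size() != 8) return -1;
    int value = 0;
    int weight = 1;                        // 2^0 for the rightmost bit
    for (std::size_t i = chunk.size(); i-- > 0; )
    {
        if (chunk[i] != '0' && chunk[i] != '1') return -1;
        value += (chunk[i] - '0') * weight;
        weight *= 2;                       // each bit to the left is worth twice as much
    }
    return value;
}

// Hypothetical helper: decode a whole bit string, 8 bits at a time.
// Trailing bits that do not fill a complete byte are ignored.
std::string decode_bits(const std::string& bits)
{
    std::string out;
    for (std::size_t pos = 0; pos + 8 <= bits.size(); pos += 8)
    {
        int v = chunk_to_value(bits.substr(pos, 8));
        if (v < 0) break;                  // stop at the first malformed chunk
        out += static_cast<char>(v);       // the "cast this number into a char" step
    }
    return out;
}
```

For example, decode_bits("0100100001101001") gives the two characters "Hi" (72 and 105).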

AbstractionAnon (6954)

 
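bool more_bits = true;
char c;
int b;
while (more_bits)
{  c = 0;
    for (int i=0;i<8;i++)  // operate on 8 bits at a time
    { b = get_bit();       // get_bit() is a placeholder: it returns the next bit, 0 or 1
       c = c << 1 | b;     // shift the bits read so far left and append the new bit
    }
    cout << c;  // should be an ASCII character if encoding was correct
    // set more_bits = false here once the input is exhausted
}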

Chervil (7320)

    // needs <iostream>, <iomanip>, <string>, <bitset> and using namespace std; goes inside main()
    string input =
    "000000000110000001100000000000000000000001000000"
    "000100000000000001110000001001000010000001101001"
    "000100000000000000101001011001010110010000100000"
    "000000000000000001010100000000000110001000100001"
    "001011000110000001010000001000000010100100000010"
    "000000000000000000001001000000100111010000100001"
    "010000000110000101000010000101000110000101000110"
    "001000000010000000000100000000100001011100000101";

    size_t pos = 0;
    while (pos < input.size())
    {
        string word = input.substr(pos,8);
        pos+= 8;

        bitset<8> bits(word);
        int n = bits.to_ulong();
        cout << word << setw(5) << n << " "<< char(n) << endl;
    }

Since the input is random, the character output is not meaningful.
Output:


00000000    0
01100000   96 `
01100000   96 `
00000000    0
00000000    0
01000000   64 @
00010000   16 ?
00000000    0
01110000  112 p
00100100   36 $
etc.

JLBorges (13770)

You might want to consider use of a binary-to-text encoding scheme.
http://en.wikipedia.org/wiki/Binary-to-text_encoding

dilver (142)

Sorry, but none of your answers helped me. They all concentrate on the normal way of decoding binary code, which I have known for a long time:

convert from binary to decimal
then replace each decimal value by its character in the ASCII table

Let's say that

000000000110000001100000000000000000000001000000
000100000000000001110000001001000010000001101001
000100000000000000101001011001010110010000100000
000000000000000001010100000000000110001000100001
001011000110000001010000001000000010100100000010
000000000000000000001001000000100111010000100001
010000000110000101000010000101000110000101000110
001000000010000000000100000000100001011100000101

is the encryption of a plain-text message. How do we know which type of encryption was used? Is there software to check for the encryption type? Yes, there is, but none of the tools I found could give any information about the encryption of the code above.

So what is the solution to decode it? It is really a big problem.
Any suggestion or help would be appreciated.

AbstractionAnon (6954)

If you don't know how it was encoded, then you're not going to be able to decode it.

If you take the first row and divide it into groups of 8, it's clear it's not 8 bit ASCII data.

00000000 01100000 01100000 00000000 00000000 01000000
nul backtick backtick nul nul @

If you know the plaintext, that can sometimes be a help in determining how it is encoded.
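The check on the first row can be automated in a rough way: ordinary ASCII text consists almost entirely of printable bytes (32 to 126) plus a few control characters such as newlines, so a large share of other values suggests the data is not plain 8-bit ASCII. A minimal sketch, where AsciiStats and count_printable are names invented for this example:

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: how many complete bytes were seen, and how many were printable.
struct AsciiStats { std::size_t total; std::size_t printable; };

// Split a string of '0'/'1' characters into 8-bit groups (most significant bit first)
// and count how many groups fall in the printable ASCII range 32..126.
AsciiStats count_printable(const std::string& bits)
{
    AsciiStats s{0, 0};
    for (std::size_t pos = 0; pos + 8 <= bits.size(); pos += 8)
    {
        int v = 0;
        for (std::size_t k = 0; k < 8; ++k)
            v = (v << 1) | (bits[pos + k] == '1' ? 1 : 0);
        ++s.total;
        if (v >= 32 && v <= 126) ++s.printable;
    }
    return s;
}
```

On the first row above this reports 6 bytes, of which 3 are printable (96, 96 and 64) and 3 are nul. This is only a heuristic: it can suggest that data is not plain ASCII, but it says nothing about what the data actually is.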

Chervil (7320)

Well, the original question very clearly stated that this was random binary code.
Now it seems to be suggested that it is not random at all, but instead it has some meaning. I find that a bit confusing. If it has meaning, how do you know? Where did this data come from? (Or shouldn't I ask)

dilver (142)

Of course, you are welcome to ask.

Here is how it began.

I found a site with many computer challenges in fields like encryption, steganography, PHP, etc.

I was solving the challenges one by one until I reached a challenge in the encoding field.

I opened the HTML source code of the challenge and found three large binary codes. Because they are too long, I will post only the first one:

000000000110000001100000000000000000000001000000
000100000000000001110000001001000010000001101001
000100000000000000101001011001010110010000100000
000000000000000001010100000000000110001000100001
001011000110000001010000001000000010100100000010
000000000000000000001001000000100111010000100001
010000000110000101000010000101000110000101000110
001000000010000000000100000000100001011100000101
011010000010100000100000011001000000000100000000
010000100010111000000000011101000000001000100000
000101000110000001000001001000000000011000100100
010110000111010000100000001000110111010001000001
001001110100010000001100000000000100011000100011
001000000111010001001000011000010101000001000101
001100100000100100001010000000000010000100000000
000100000011000100100001001100000011000000010000
000100010010000000000000001000010000000000100000
001100010001000000000000000000000010000000110000
000000010001000000010001000000000001000000110001
000000000000000100110000000000000010000000001101
000010000001000000010000001000000010000000110000
000000010000000000000001001000000001000100100000
000100000010000100010001000000010001000000110000
000100010011000000000000000100010000000000000000
000100000000000000000001001000010000000000110000
001100010000000000110000000010000000000000000000
000000000001000100110000001000000001000100000000
001100000001000000100001001100000001000000100000
001100000011000000010001001000000010000000010000
000100000010000000000000000000000001000000100000
000100000011000000010001000000000010000000010000
000100000000100000001000001000000011000000010000
000100000000000000000000000100000011000000000000
001100010010000100010001000000000001000000000000
000000000011000000010001001100010011000000000001
001100000001000000110000000000000001000000010001
001100000001000000110000000000000001000000000001
00000000

As you can see, the earlier binary code is a piece of this larger code: it forms its beginning.

Now, coming to your question: how do I know that the binary code is random?
We know that a code unit, a fixed-size sequence of bits, is used to encode a character. For example:

US-ASCII, code unit is 7 bits
UTF-8, code unit is 8 bits
EBCDIC, code unit is 8 bits
UTF-16, code unit is 16 bits
UTF-32, code unit is 32 bits

We also know that in order to decode from binary to plain text we need to know the type of encoding used. To know the type of encoding and how to decode it, we need to know how many bits form one character. If there is no code unit pattern in a binary code, it means that the code is random, and the code above does not follow any specific code unit.
We have seen that when we tried to decode it according to a specific code unit, it gave a weird mix of characters and numbers like (0,96,96,0,0,@,p,112,p,$) or (nul backtick backtick nul nul @). This is why I said it is a random binary code, unless someone proves me wrong.
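Trying several code unit widths, as described above, can be sketched like this. It is only an illustration: split_units is a made-up name, the input is assumed to be a string of '0' and '1' characters with the most significant bit first, and leftover bits at the end are simply dropped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: cut a bit string into units of `width` bits (1..32)
// and return the numeric value of each unit. An unsupported width gives an empty result.
std::vector<unsigned long> split_units(const std::string& bits, std::size_t width)
{
    std::vector<unsigned long> units;
    if (width == 0 || width > 32) return units;
    for (std::size_t pos = 0; pos + width <= bits.size(); pos += width)
    {
        unsigned long v = 0;
        for (std::size_t k = 0; k < width; ++k)
            v = (v << 1) | (bits[pos + k] == '1' ? 1UL : 0UL);
        units.push_back(v);
    }
    return units;
}
```

For the 16 bits "0100100001101001", a width of 8 gives 72 and 105 ("Hi"), a width of 7 gives 36 and 26, and a width of 16 gives 18537. Note that getting unreadable values for the usual widths does not prove the data is random; it may just mean the data was produced by a different scheme than the ones tried.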

There is another possibility: the binary code was obfuscated to mislead the challenger and make the problem harder to solve. But how, and with which method, I do not know.

The site is:

http://www.wechall.net/

The name of the challenge is: Enlightment

The URL of the challenge page is: http://www.wechall.net/challenge/anto/enlightment/index.php

If you want, Chervil, you can take a look to understand more about what is going on.

JLBorges (13770)

See: http://www-archive.mozilla.org/projects/intl/UniversalCharsetDetection.html

http://lxr.mozilla.org/seamonkey/source/extensions/universalchardet/src/base/

dilver (142)

I looked at the two sites, but they did not help me: they have nothing to do with the binary code above or with my problem. The information is too abstract and general, and it talks about Chinese, Japanese, and Russian. Besides, the three auto-detection techniques are not explained very well and cannot help me with the binary code I posted above. Take the first and most dominant technique, the coding scheme method:

If an illegal byte or byte sequence (i.e. unused code point) is encountered when verifying a certain encoding, we can immediately conclude that this is not the right guess. A small number of code points are also specific to a certain encoding, and that fact can lead to an immediate positive conclusion

It uses technical terms that I could not find explained, like illegal byte or code point. Moreover, it is too ambiguous, and I do not see how it can help me determine the encoding of the binary code above.
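As a concrete illustration of the two terms: a code point is the number assigned to a character (for example 65 for 'A'), and an illegal byte sequence is a run of bytes that can never occur in a given encoding. The sketch below checks one encoding, UTF-8, in that spirit. It is not the Mozilla detector's code, and is_valid_utf8 is a name chosen for this example.

```cpp
#include <cstddef>
#include <string>

// Returns true if `data` is well-formed UTF-8, false as soon as an illegal
// byte sequence is found (which rules out UTF-8 as the encoding).
bool is_valid_utf8(const std::string& data)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n)
    {
        const unsigned char b = static_cast<unsigned char>(data[i]);
        std::size_t extra;                 // number of continuation bytes expected
        unsigned long cp;                  // code point being assembled
        if (b < 0x80)                { extra = 0; cp = b; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else return false;                 // stray continuation byte or invalid lead byte
        if (extra > n - i - 1) return false;            // sequence cut off at the end
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(data[i + k]);
            if ((c & 0xC0) != 0x80) return false;       // not of the form 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[extra]) return false;           // overlong encoding
        if (cp > 0x10FFFF) return false;                // beyond the Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate, not a character
        i += extra + 1;
    }
    return true;
}
```

A false result is the "immediate negative conclusion" from the quoted passage. A true result proves much less: any data made only of bytes below 0x80 passes this check, so passing does not show that the data is meaningful text.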

The rest, at http://lxr.mozilla.org/seamonkey/source/extensions/universalchardet/src/base/ , are only C++ header and source files that have nothing to do with my request or with the encoding of the binary code above.

Sorry, it does not help me at all.

I need help: an article, a reference, a hint, or a free book on the internet that helps in determining the type of encoding of the binary code above.

I need new techniques and new methods in order to solve my problem.
