Groups of 8 to groups of 6

So this is more of a logic question than a code question. I am working on a project for school and this is the part I'm stuck on. I have an array of integers that I got from different characters. I can get them into 8-bit binary, but then I have to separate them into groups of 6.

i.e. 01011010 01101111 01110010 01101011 ===>
010110 100110 111101 110010
011010 11xxxx xxxxxx xxxxxx

Then I have to analyze the groups afterwards. Basically I am making a base64 encoder. I spent about three hours last night messing around with bitwise operators and couldn't for the life of me figure out how to get the grouping right. Could anyone push me in the right direction?
If you can convert them into their binary forms in the form of booleans, consider a std::vector<bool> (which is optimized for this) to hold the bits. Then you can just have nested loops getting six at a time with space in between ;)
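A minimal sketch of that idea, in case it helps. The helper names to_bits and groups_of_six are made up for illustration, and the 'x' filler simply mirrors the notation in the question. It only shows the last partial group and does not add the extra all-'x' groups from the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: unpack each character into 8 bits, most significant bit first.
std::vector<bool> to_bits(const std::string& text)
{
    std::vector<bool> bits;
    for (unsigned char c : text)
        for (int b = 7; b >= 0; --b)
            bits.push_back((c >> b) & 1);
    assert(bits.size() == 8 * text.size());   // exactly eight bits per character
    return bits;
}

// Hypothetical helper: write the bits as groups of six separated by spaces,
// using 'x' for positions past the end of the data.
std::string groups_of_six(const std::vector<bool>& bits)
{
    std::string out;
    for (std::size_t i = 0; i < bits.size(); i += 6) {
        if (i != 0) out += ' ';
        for (std::size_t j = i; j < i + 6; ++j)
            out += j < bits.size() ? (bits[j] ? '1' : '0') : 'x';
    }
    return out;
}
```

Calling groups_of_six(to_bits("Zork")) should give the first six groups from the question: 010110 100110 111101 110010 011010 11xxxx.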
C++ has a special container named std::bitset. I think it is what you need.
std::bitset cannot be resized. std::vector<bool> can. Consider what behavior you need.
Does vector work like a stack in Java with push and pop?


In my opinion, all you need is a vector of bitsets. For example:

std::vector<std::bitset<6>> v;
The above would still waste space for padding. Might as well simply use a vector of chars and ignore the first two bits, no?
Why have you decided that the implementation of bitset cannot use a character array? Are there any requirements in the C++ Standard for the implementation of bitset?
Why would the semantics of a char ever be useful for a bitset? The opposite might be true, though.
Yeah, a vector of bitsets is quite handy.


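#include <bitset>
#include <string>
#include <vector>

typedef std::bitset<8> octet ;
typedef std::bitset<6> sextet ;

std::vector<sextet> to_sextet( const std::vector<octet>& octets )
{
    // concatenate all the octets into one string of '0' and '1' characters
    std::string bits ;
    for( const octet& o : octets ) bits += o.to_string() ;

    enum { NBITS = 6 } ;
    // pad with zero bits on the right so that the last group is complete
    if( bits.size() % NBITS != 0 ) bits.append( NBITS - bits.size() % NBITS, '0' ) ;

    // each run of six characters becomes one sextet
    std::vector<sextet> result ;
    for( std::size_t i=0 ; i<bits.size() ; i +=  NBITS )
        result.emplace_back( bits.substr( i,  NBITS ) ) ;

    return result ;
}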

Only the enum should come before the typedefs. :)
For example

enum { SEXTET = 6, OCTET = 8 };

typedef std::bitset<OCTET> octet ;
typedef std::bitset<SEXTET> sextet ;

...

    for( std::size_t i=0 ; i<bits.size() ; i +=  SEXTET )
...
You guys helped me to a point. I'm trying to write my own code using the things I already know. So I have a vector full of 8-bit segments. How would I cast or transfer that to a vector of 6-bit segments? And what do you think would be the easiest way to do the base64 library? The only way I know would be to do 63 if statements.
void groupSix(string str){
    typedef std::bitset<8> eBytes;
    typedef std::bitset<6> sBytes;
    vector<eBytes> vec;
    vector<sBytes> six;   // still empty: the 6-bit groups are not built yet
    for(std::size_t i = 0; i != str.length(); i++){
        eBytes bits;   // fresh, zeroed bitset for every character
        // go through unsigned char so that n stays in the range 0..255
        int n = static_cast<unsigned char>(str[i]);
        for(int j = 0; n != 0; j++)
        {
            bits.set(j, n % 2);   // bit j is the remainder of dividing by 2
            n = n/2;
        }
        vec.push_back(bits);
    }
    for(std::size_t v = 0; v < vec.size(); v++){
        cout << vec[v] << " ";   // one space after each 8-bit group
    }
}


This is what I have
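On the "63 if statements" worry: one common alternative is a lookup table, where the 6-bit value is used directly as an index into a string holding the 64 characters. A minimal sketch, assuming the "ABCabc+/" alphabet; the names kAlphabet and sextet_to_char are made up for illustration.

```cpp
#include <cassert>

// The 64 characters of the "ABCabc+/" alphabet, in value order.
const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

// Hypothetical helper: map a 6-bit value (0..63) to its base64 character.
char sextet_to_char(unsigned value)
{
    assert(value < 64);        // anything larger is not a 6-bit group
    return kAlphabet[value];   // the value itself is the index, no ifs needed
}
```

For example, the group 010110 is 22 in decimal, so sextet_to_char(22) picks the character at index 22 of the table.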
Which base 64 do you speak of? There is mathematical base 64 with digits 0-9, digits A-Z, digits a-z, and the last two digits of your choice, then there is this other base 64: http://en.wikipedia.org/wiki/Base_64

Also, I recommend not using bitsets at all for this. It is far easier without them.
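For what it's worth, here is one possible way to do the regrouping without bitsets, using only shifts and masks. This is a sketch under my own assumptions, not necessarily what the poster above had in mind: the name to_sextet_values is made up, a final partial group is filled with zero bits on the right, and the '=' padding of the standard encoding is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: regroup 8-bit characters into 6-bit values (0..63).
std::vector<unsigned> to_sextet_values(const std::string& text)
{
    std::vector<unsigned> out;
    unsigned buffer = 0;   // bits read so far that have not been emitted yet
    int pending = 0;       // how many low bits of buffer are still waiting

    for (unsigned char c : text) {
        buffer = (buffer << 8) | c;      // append the 8 new bits on the right
        pending += 8;
        while (pending >= 6) {           // emit complete groups, oldest bits first
            pending -= 6;
            out.push_back((buffer >> pending) & 0x3F);
        }
        buffer &= (1u << pending) - 1;   // keep only the leftover bits
    }
    assert(pending < 6);                 // at most a partial group remains
    if (pending > 0)                     // left-align the last partial group
        out.push_back((buffer << (6 - pending)) & 0x3F);
    return out;
}
```

With the four bytes from the original question ("Zork"), this should yield 22, 38, 61, 50, 26, 48, which are the groups 010110 100110 111101 110010 011010 110000.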
Whoops! Didn't know there was more than one. I'm looking for the "ABCabc+/" one!
> Whoops! Didn't know there was more than one.

Wiki lists thirteen of them: http://en.wikipedia.org/wiki/Base64


> I'm looking for the "ABCabc+/" one!

That is the 'standard' encoding as per RFC 4648: http://tools.ietf.org/html/rfc4648


> I recommend not using bitsets at all for this. It is far easier without them.

Doing it the terribly hard way:
#include <iostream>
#include <bitset>
#include <string>
#include <vector>

typedef std::bitset<8> octet ;
typedef std::bitset<6> sextet ;

// the caller passes a multiple of three octets, so the bits split evenly into sextets
std::vector<sextet> to_sextets( const std::vector<octet>& octets )
{
    std::string bits ;
    for( const octet& o : octets ) bits += o.to_string() ;

    std::vector<sextet> result ;
    enum { NBITS = 6 } ;
    for( std::size_t i=0 ; i<bits.size() ; i +=  NBITS )
        result.emplace_back( bits.substr( i,  NBITS ) ) ;

    return result ;
}

// 'standard' base64 encoding as per RFC 4648
std::string b64_encode_rfc4648( const std::string& utf8_str )
{
    std::vector<octet> octets( utf8_str.begin(), utf8_str.end() ) ;
    // number of zero octets needed to reach a multiple of three (0, 1 or 2)
    const std::size_t npad = ( 3 - octets.size()%3 ) % 3 ;
    octets.resize( octets.size() + npad ) ;
    std::vector<sextet> sextets = to_sextets(octets) ;

    std::string b64_str ;
    static const char b64_index_table[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                 "abcdefghijklmnopqrstuvwxyz0123456789+/";
    for( const auto& bs : sextets ) b64_str += b64_index_table[ bs.to_ulong() ] ;

    // each padding octet turns one trailing character into '='
    // (checking for a trailing 'A' would also hit real data that ends in zero bits)
    for( std::size_t i = 0 ; i < npad ; ++i ) b64_str[ b64_str.size() - 1 - i ] = '=' ;

    return b64_str ;
}

int main()
{
    // plain literal: the text is ASCII, and u8"..." has type char8_t[] in C++20
    const char* const u8str = "Man is distinguished, not only by his reason,"
        " but by this singular passion from other animals, which is a lust of"
        " the mind, that by a perseverance of delight in the continued and"
        " indefatigable generation of knowledge, exceeds the short vehemence of"
        " any carnal pleasure." ;
    std::cout << u8str << "\n\n" << b64_encode_rfc4648(u8str) << '\n' ;
}
