Reading and writing variable-width binary data

I'm stuck trying to figure out how to read and write an arbitrary number of bits from/to a file stream.

For instance, how to repeatedly read 9 bits from a file, then change to 10 bits, then 11 bits, and so on?

Obviously one way is to do a lot of bit shifting and masking, but honestly, I'm too dumb to get it right. So I thought about using std::bitset and std::vector<bool>.

Do you have any suggestions?
Can you give an example of what you are trying to read in? Is it 1's and 0's, or something else? You can use a condition to decide when to switch to reading a new number of bits; you don't have to do any bit shifting.
Suppose the input file is:
00110011 01110111 10001000 10111011


And suppose my goal is to read three integers a, b, c of type uint16_t such that:
a = _______0 01100110 (first 9 bits)
b = _______1 11011110 (next 9 bits)
c = ______00 10001011 (next 10 bits, and so on)


Edit: for clarity, the same bits grouped as 9 + 9 + 10, with 4 bits left over:
001100110 111011110 0010001011 1011
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Read up to nbits characters ('0' or '1') from a text stream and return
// them as an unsigned integer, most significant bit first.
unsigned long long read_bits( std::istream& stm, std::size_t nbits )
{
    constexpr std::size_t MAX_BITS = std::numeric_limits<unsigned long long>::digits ;
    if( nbits > MAX_BITS ) throw std::out_of_range( "too many bits" ) ;

    std::string str ;
    char c ;
    for( std::size_t i = 0 ; i < nbits ; ++i )
    {
        if( stm >> c ) // formatted input skips whitespace between the digits
        {
            if( c == '0' || c == '1' ) str += c ;
            else throw std::domain_error( "invalid character for bit" ) ;
        }
        else break ; // end of input: return whatever bits were read so far
    }

    return std::bitset<MAX_BITS>(str).to_ullong() ;
}

int main()
{
    std::istringstream stm( "00110011 01110111 10001000 10111011" ) ;

    std::uint16_t first_9 = read_bits( stm, 9 ) ;
    std::cout <<  std::bitset<9>(first_9) << '\n' ; // 001100110

    std::uint16_t next_9 = read_bits( stm, 9 ) ;
    std::cout <<  std::bitset<9>(next_9) << '\n' ; // 111011110

    std::uint16_t next_10 = read_bits( stm, 10 ) ;
    std::cout <<  std::bitset<10>(next_10) << '\n' ; // 0010001011
}

http://liveworkspace.org/code/2dfO26$0
JLBorges, there is a misunderstanding.

My example represented a binary (non-text) file made up of four bytes, not a text file containing the characters '0' and '1'.
#include <bitset>
#include <cstddef>
#include <iostream>
#include <limits>
#include <numeric>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Read raw bytes from a binary stream and split their bits, most significant
// bit first, into consecutive fields whose widths are given by nbits.
std::vector<unsigned long long> read_bits_binary( std::istream& stm,
                                                  const std::vector<std::size_t>& nbits )
{
    using bitset_byte = std::bitset< std::numeric_limits<unsigned char>::digits > ;
    using bitset_ull = std::bitset< std::numeric_limits<unsigned long long>::digits > ;

    std::vector<unsigned long long> numbers ;

    const auto total_bits = std::accumulate( nbits.begin(), nbits.end(), std::size_t() ) ;
    std::string str ; // the input bits as '0'/'1' characters
    char c ;
    // test the size first, so that no byte beyond the last one needed is consumed
    while( str.size() < total_bits && stm.get(c) )
        str += bitset_byte( static_cast<unsigned char>(c) ).to_string() ;
    if( str.size() < total_bits ) throw std::runtime_error( "not enough bits in input" ) ;

    std::string::size_type start = 0 ;
    for( auto n : nbits )
    {
        numbers.push_back( bitset_ull( str.substr( start, n ) ).to_ullong() ) ;
        start += n ;
    }

    return numbers ;
}

int main()
{
    const auto n = 0xf0f0f0f0 ; // every byte is 0xf0, so byte order does not matter here
    std::stringstream stm( std::ios::in|std::ios::out|std::ios::binary ) ;
    stm.write( reinterpret_cast<const char*>(&n), sizeof(n) ) ;

    const auto nums = read_bits_binary( stm, { 9, 9, 10 } ) ;
    for( unsigned long long v : nums ) std::cout <<  std::bitset<16>(v) << '\n' ;
}

http://liveworkspace.org/code/jMf44$0