Data Compression Question

I recently learned about Huffman Compression in one of my classes and wrote a program that read a text file, built a tree, and created the Huffman code for each letter of that file. Then I output the compressed file to another text file, created a decode file, and so on.
But now the one thing I've been wondering is: how do I actually compress a file? If I take an input text file containing, say, abc and output 10110 to another text file, it's not actually compressed, because I'm not creating a binary file or editing the bits/bytes of a file, correct?
So I was wondering if anyone could point me in the right direction to learn more about editing a file's bits, or creating a new file with bits based on the compressed output. Hopefully my question makes sense; sorry if it's a bit confusing.
The data "abc" stored as ASCII text is the binary sequence "01100001 01100010 01100011" (24 bits), which could be represented by a shorter Huffman-compressed stream such as "100110", depending on your Huffman tree.

Normally a character like 'a', 'b' or 'c' is stored as a complete byte, which contains 8 bits.
A Huffman code, however, may represent a frequent character such as 'a', 'b' or 'c' with a bit sequence shorter than 8 bits (rare characters may end up with longer codes).

Thus by packing together the Huffman tree bit sequences that relate to the characters "abc" we could actually end up with a binary sequence that is shorter than the uncompressed sequence for these characters.
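To make the size difference concrete, here is a small Python sketch. The code table is made up for illustration (it happens to turn "abc" into the 10110 from the question); your own table will come from your tree. It also shows why writing the characters '1' and '0' to a text file does not compress anything: each of those characters costs a full byte.

```python
# Hypothetical prefix-free code table; a real one comes from your Huffman tree.
CODES = {"a": "10", "b": "11", "c": "0"}


def encode_to_bitstring(text, codes):
    """Concatenate the code of each character into a string of '0'/'1'."""
    return "".join(codes[ch] for ch in text)


text = "abc"
bits = encode_to_bitstring(text, CODES)

original_bits = len(text.encode("ascii")) * 8  # 3 bytes of text
huffman_bits = len(bits)                       # bits actually needed
as_text_bits = len(bits.encode("ascii")) * 8   # cost of writing '1'/'0' as characters
packed_bytes = (huffman_bits + 7) // 8         # bytes needed once the bits are packed

print(bits, original_bits, huffman_bits, as_text_bits, packed_bytes)
# prints: 10110 24 5 40 1
```

So the "compressed" text file is actually larger than the input (40 bits versus 24), while the same five bits packed into real bits fit in a single byte.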

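In order to pack and unpack the bits within a Huffman stream we need to use bit operations such as & and |, together with the shifts << and >>, and then write the resulting bytes to a file opened in binary mode.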
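A minimal sketch of that packing in Python, assuming the encoder has already produced a string of '0'/'1' characters. The function names, the file name and the 4-byte bit-count header are my own choices for illustration, not a standard format. The bit count is stored because the last byte is padded with zero bits, and the decoder needs to know where the real data ends.

```python
import os
import tempfile


def pack_bits(bits):
    """Pack a string of '0'/'1' into bytes, most significant bit first.

    Returns (data, bit_count); bit_count tells the decoder how many bits are real.
    """
    out = bytearray()
    current = 0  # byte being assembled
    filled = 0   # number of bits already placed in `current`
    for bit in bits:
        current = (current << 1) | (1 if bit == "1" else 0)
        filled += 1
        if filled == 8:
            out.append(current)
            current = 0
            filled = 0
    if filled:
        out.append(current << (8 - filled))  # pad the last byte with zero bits
    return bytes(out), len(bits)


def unpack_bits(data, bit_count):
    """Turn packed bytes back into a '0'/'1' string, dropping the padding."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bits.append("1" if (byte >> shift) & 1 else "0")
    return "".join(bits[:bit_count])


data, n = pack_bits("10110")  # data is b'\xb0', i.e. 10110000

# Write and re-read a real binary file (note the "wb" / "rb" modes).
path = os.path.join(tempfile.mkdtemp(), "abc.huf")  # hypothetical file name
with open(path, "wb") as f:
    f.write(n.to_bytes(4, "big"))  # assumed header: bit count as 4 bytes
    f.write(data)

with open(path, "rb") as f:
    stored_count = int.from_bytes(f.read(4), "big")
    restored = unpack_bits(f.read(), stored_count)

print(restored)  # prints: 10110
```

For an input as tiny as "abc" the header makes the file bigger than the original; the savings only show up on larger inputs. A real decoder also needs the code table or tree, which has to be stored alongside the packed bits or agreed on in advance.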

Hope this helps.