Bitwise and Shift for compression

Would bitwise operators and shift operators be a good way to create your own compression software?

For example, could I take a file and determine the bits within the file. Then write an algorithm that will separate the bits in a byte with contiguous 0's into another byte that also has left over contigous 0's, removing the 0's to save space and perhaps separating different data elements by a single 0?

I just learned about the bitwise operators and found them to be interesting, if the above isn't true, what is bitwise used for?
Last edited on
You're on the right track, but you seem to not understand how data is represented. Your 'compression' technique would just corrupt the data. Those repeated 0s are important, just like the repeated 1s.

There is a compression technique that involves storing the number of consecutive digits rather than the actual digits, but it's not very effective.

Most highly effective compression is just pattern analysis and optimization - dealing with things in the bit levels don't always work too well.
Last edited on
Could I not store somewhere else in the code, the bits that were removed and the locations, so during decompression they could be returned?

or would this just make the file just as big as it always was ,lol
Last edited on
How would you reverse your compression? You'd need to mark how many 0s you deleted somehow, which would probably take up just as much space, or more, as you saved.

I think most compression algorithms find patterns in the bytes/bits of the data and try to represent patterns using less space. You could encode part of a picture as 14*red instead of red,red,red,red...

Bitwise operators are generally used for high performance code, encryption and very low level programming.
For example, var<<1 performs better for most processors than 2*var. (Although I think most compilers will turn a multiplication by a power of two into left shifts anyway, so this optimization is pretty much implicit)

In encryption, many algorithms use the xor and shift operations. (Haha, don't know too much about encryption algorithms.)

For low level programmers, it can once again be used for optimizations, ie.
 
xor var, var; //Sets var to zero, usually faster than an assignment 

Bitwise ops are also used to manipulate very dense blocks of data, where different bits in a single byte could represent different states.

EDIT: I probably missed some bitwise uses, if anyone wants to add on :P

EDIT2: Just saw L B's and your reply. Yeah, you'd probably make the file a little bigger actually, because you'd have to store not only the number of 0s, but also where they go.
Last edited on
Great insight BlackSheep - I think you're right about pattern detection and consolidation.

L B - I do not know much about data structure as it pertains to filesystems and operating systems, but I do know about IP address structuring. For example, each of the four octets used in an IP address are bytes that represent up to 255 and contain 8 bits. However, I do not exactly know how it's represented in data storage. Would it be similar, for example data is stored in bytes and the bytes are representative of 255 values, which in programming is usually converted into ASCII for characters and integers?
Last edited on
Another use for shift operators is in random algorithms.
@rcast the dotted string format of IPv6 is literally what you would get if you took each byte of the IP address and printed it in base 10 separated by dots.

Again, you are misunderstanding fundamental concept in computers as I once did. it has nothing to do with the storage structure.
Does anybody care to expand on the subject of storage structure conventions?
What do you mean by "storage structure conventions"? Do you understand the difference between data representation/interpretation and data?
Last edited on
I understand too little on this subject. I'll find it out in my readings in the near future. I asked, because Thinking in C++ expects me to understand bitwise and shift operators, without much explanation. However, I think I'm jumping the gun a bit - perhaps the topic will be expanded later in Volume 2. Thanks
Last edited on
Topic archived. No new replies allowed.