Using the << and >> Operators on Bits

Pages: 12
What exactly are the effects of these operators, and how do you use them? I know that they shift bits, but in what way? I can't seem to find any documentation on this.
<< left shifts, and >> right shifts.

1
2
3
4
5
6
7
int foo = 0xF0;

foo >>= 4;
// foo == 0x0F

foo <<= 2;
// foo == 0x3C 



EDIT: for those unaware of what bitshifting is...

0xF0 is %11110000 in binary:

1
2
3
4
5
6
 hex      bin 
0xF0   %11110000   <-- original
0x78   %01111000   <-- right shift 1
0x3C   %00111100   <-- right shift 2
0x1E   %00011110   <-- right shift 3
0x3C   %00001111   <-- right shift 4


Right shifting by X is similar (but not exactly the same as) dividing by 2X, and left shifting by X is similar (but not exactly the same as multiply by 2X
Last edited on
To put it in simple mathematical terms:
x<<y == x*2^y
x>>y == x/2^y (integer division)

For example, 3 in binary is 11b 3>>1==1 because 3/2==1, and 1==1b.
Another example:
21==10101b
21>>2==5
10101b>>10b==101b

21==10101b
21<<2==84
10101b<<10b==1010100b
Last edited on
http://en.wikipedia.org/wiki/Arithmetic_shift
<< to the left
>> to the right
Thanks Disch, your post was the least confusing.
There are two types of right shift:

Arithmetic Shift
http://en.wikipedia.org/wiki/Arithmetic_shift

Logical Shift
http://en.wikipedia.org/wiki/Logical_shift

In C (and C++) the choice depends upon the signedness of the operand. For example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
#include <bitset>
#include <iostream>
#include <string>
using namespace std;

template <typename ValueType>
inline void print( const char* name, ValueType value )
  {
  cout << name << " = "
       << bitset <8> ( value ).to_string()
       << " = "
       << (int)value << endl;
  }

int main()
  {
  int           i;
  signed   char s;
  unsigned char u;

  cout << "Please enter a signed value in [-128, 127]: " << flush;
  cin >> i;

  s = i;
  u = i;

  cout << endl;
  print( "signed       ", s );
  print( "unsigned     ", u );

  cout << endl;
  print( "signed   >> 1", s >> 1 );
  print( "unsigned >> 1", u >> 1 );

  cout << endl;
  print( "signed   >> 2", s >> 2 );
  print( "unsigned >> 2", u >> 2 );

  return 0;
  }

Notice how the unsigned values always get zeros in the msb, while the signed value keeps the same bit value as it had before.

Hope this helps.
Actually, it showed the same values for EVERYTHING for both signed and unsigned values.
Hmm, it seems I spoke too soon. Apparently right-shifting signed values need not use an arithmetic shift. See ISO/IEC 9899:1999 Section 6.5.7. I suppose this is due to hardware implementations of shifting operators... but personally I would think that is something simple enough for the compiler to compensate...

In any case, if you want to be sure to do an arithmetic shift in C and C++, you have to copy the sign bit yourself...
1
2
3
4
inline long ash( long value, unsigned bits )
  {
  return (value >> bits) | (value & (1 << ((sizeof( long ) * 8) - 1) ) );
  }

Note that this presupposes a Ones' or Twos' Complement computer...

Alas.
Last edited on
So... What sort of CPU is NGen using, anyway?
Could be his compiler just doesn't do it.
Visual C++ 2008. And I'm not sure what you mean by CPU. The manufacturer, the model, or both?

You know what, I'm just going to ask about where I'm seeing this implemented. I still can't figure out how this works.

There's an OpenGL tutorial I'm doing on loading textures, and this guy made an algorithm for loading BMP files. In it, there are a few functions for converting char arrays into shorts and ints using little-endian form, and he uses bit-shifting to accomplish it. Here's what he has:

1
2
3
4
5
6
	int toInt(const char* bytes) {
		return (int)(((unsigned char)bytes[3] << 24) |
					 ((unsigned char)bytes[2] << 16) |
					 ((unsigned char)bytes[1] << 8) |
					 (unsigned char)bytes[0]);
	}

and
1
2
3
4
	short toShort(const char* bytes) {
		return (short)(((unsigned char)bytes[1] << 8) |
					   (unsigned char)bytes[0]);
	}


He's using them, from what I'm seeing, to read the headers of the BMP file. I'll post a link to the tutorial, if you want.
Last edited on
Oh, I see the problem, now. Duoas forgot to mention that the most significant bit will be set only when the number is a negative.
Try inputting -1.
Er, yes, sorry about that...

A signed number's msb is either 0 or 1, and bit shifting should preserve that value.
An unsigned number's msb should always become zero when shifted.

For values where the msb is zero to begin with, it doesn't matter what version of shift you use (logical or arithmetic).

But if the msb is one, then it makes a difference if you are dealing with negative numbers (you want the number to remain negative after the shift).


NGen
That shifting stuff is correct and the signedness of the bytes[] array's elements is not important. Where it could be important it is fixed:
1 each element is converted to an eight-bit, unsigned quantity before shifting
2 the elements are composed using logical OR
3 the final result is converted to a signed value
However, the code does presume that an int is four bytes wide...

:-)
But wouldn't that keep the number of bits the same? From my understanding, something like "01010101 | 10101010" should return a single byte value of "11111111", not combine the 2. Or does the shifting make room for the new values when casted?
Last edited on
Could you maybe rephrase your question?
Your or instruction would (could) look like this:
1
2
mov al, 0b01010101
or al, 0b10101010 ; al contains now 0b11111111

What do you mean by "does the shifting make room for the new values when casted"?
Last edited on
The input value is cast onto a byte.
Then it is type-promoted to a machine word.
Then shifted.
Then ORed.
Then cast onto the output value.
The way I see it, the guy is shifting the bits of each char so that they can be ORed into a new 4-byte value. That's the only way this would make any sense. Otherwise, he would just be shifting chars into chars, and would probably end up with either all true bits, or 1 or 2 false bits (although, most would probably be moved out of range and the whole byte would probably be '00000000'). Here's what I'm talking about:

char[0] = 00001111
char[1] = 01100100
char[2] = 10011001
char[3] = 11000010


If we treat these as ints:

char[0] = 00000000 00000000 00000000 00001111
char[1] = 00000000 00000000 00000000 01100100
char[2] = 00000000 00000000 00000000 10011001
char[3] = 00000000 00000000 00000000 11000010

He's shifting them (from what I'm seeing) like this:

char[0] << 0 = 00000000 00000000 00000000 00001111
char[1] << 8 = 00000000 00000000 01100100 00000000
char[2] << 16= 00000000 10011001 00000000 00000000
char[3] << 24= 11000010 00000000 00000000 00000000

Then we would OR them together into a whole int:

11000010 10011001 01100100 00001111


That's what I think he's doing, since it makes the most sense. I think the casting treats the chars as ints (he casts them as an int at the beginning of the function), and basically just fills in 24 bits after the char byte, as I showed earlier.
Last edited on
Here yet another possibility to convert little/big endian:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
unsigned int toUInt(const unsigned char bytes[4]){
 unsigned int ret;
 *((unsigned char*)&ret) = bytes[3];
 *((unsigned char*)&ret + 1) = bytes[2];
 *((unsigned char*)&ret + 2) = bytes[1];
 *((unsigned char*)&ret + 3) = bytes[0];
 return ret;
}
int main(void){
 // example
 unsigned int a = 0x0F00F0FFu;
 unsigned char *b;

 b = (unsigned char *)&a;
 printf("a : %u, b0 : %u, b1 : %u, b2 : %u, b3 : %u\n", a, b[0], b[1], b[2], b[3]);
 a = toUInt(b); // or toUInt((unsigned char *)&a)
 printf("a : %u, b0 : %u, b1 : %u, b2 : %u, b3 : %u\n", a, b[0], b[1], b[2], b[3]);

 return 0;
}
NGen
Yes, you have it right.

jmc
Why start playing with endiannesses?
Also, that also assumes a 32-bit int...

[edit] Having said that, that is an interesting way to mess with it...

Heh. What fun messing with computers... :-@
Last edited on
As well as the function above...
This is just the way it would be done in assembly and it might be easier to understand for the TS ;)
Pages: 12