Bit shifting question

Hey guys

I'm reading a book on algorithms and the author illustrates a simple hash function

this hash function takes a long integer and converts it to a hashcode

  int hashCode(long x){
      typedef unsigned long ulong;
      // add the high 32 bits of x to the low 32 bits, then hash the resulting int
      return hashCode(int(ulong(x) >> 32) + int(x));
  }


The author never explicitly mentioned that there is an overloaded hashCode function that takes an int. I'm guessing there is one, because if not this would lead to infinite recursion.

Anyway, the author says we can add the higher end of the long to the lower end of the long. I'm guessing the lower end is the first 4 bytes and the upper end is the last 4 bytes. He mentions that to do this you need to shift x by 32 places, but wouldn't that just put 32 zeros before the last 4 bytes (the upper end) of the long? So effectively it would look something like this:
00000000 00000000 00000000 00000000 11111111 11111111 11111111 11111010
- which is the number 4294967290

We then convert this long to an int, BUT wouldn't this just keep those 4 bytes (32 bits) of zeros and add them to the original first 4 bytes of x (converted to an int, so the final 4 bytes are shaved off), which would just be int(x)?

I can't see how this is adding the lower end of x to the upper end of x. To me it just looks like adding 4 bytes of zeros to the first (original) 4 bytes of x.

Thanks
You are right about the recursion issue, not sure what the author meant there.

Right-shifting a negative signed integer is not well-defined in C++ (the result is implementation-defined), so the trick is to first convert it to an unsigned integer.

The second (faulty) assumption here is that sizeof(long) != sizeof(int). I presume the author takes it as something like 64 and 32 bits respectively.
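If you want to guard against that assumption instead of hoping, a one-line compile-time check (using nothing more than sizeof) would do it:

  // This trick only works if long can hold two ints side by side; check it at compile time.
  static_assert( sizeof(long) >= 2 * sizeof(int), "long must be at least twice as wide as int" );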

This hash function is a very simple (stupid) one. It simply takes the high-order bits of x and adds them to the low-order bits of x, not bothering to care about overflow.
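For reference, here is a sketch of what I take the intended function to be, assuming 64-bit long and 32-bit int, and returning the combined value directly instead of recursing:

  int hashCode( long x )
  {
      unsigned long ux = (unsigned long)x;   // shift an unsigned value, not a signed one
      unsigned hi = (unsigned)(ux >> 32);    // high 32 bits of x
      unsigned lo = (unsigned)ux;            // low 32 bits of x
      return (int)(hi + lo);                 // add them; wrap-around does not matter for a hash
  }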

If we consider a byte version of this, producing a nibble-sized hash code:

  1011 0110   →   [0000] 1011 + [1011] 0110   →   1011 + 0110   →   0001 0001

The bracketed bits are the ones thrown away by the cast-to-int in your author's code snippet. (In our case, we are casting to nibble.)
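If you want to watch it happen, here is a quick demo of the byte-to-nibble version (the mask plays the role of the cast):

  #include <cstdio>

  int main()
  {
      unsigned char x  = 0xB6;      // 1011 0110
      unsigned char hi = x >> 4;    // 0000 1011  (the shifted copy)
      unsigned char lo = x & 0x0F;  // 0000 0110  (low nibble only)
      printf( "%#x\n", (unsigned)(hi + lo) );   // prints 0x11, i.e. 0001 0001
  }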

Hope this helps.
Thanks Duthomhas

That makes sense. I thought that casting down from an unsigned long shaved off the first 32 bits (the low-order bits) rather than the last 32 bits (the high-order bits).

Is there a reason why the low-order bits are shaved off as opposed to the high-order bits?

I thought it would be more logical to take the first 32 bits in the cast and disregard anything after that, since the first 32 bits of the unsigned long would fit inside the 4 bytes of an int.

thanks
The high-order bits are lost because it would not make sense to lose the low-order bits — such a number conversion would be useless.

For simplicity, let us deal with one- and two-digit numbers.

Given 27, if we want to cast that value down to a single digit, which digit makes the most sense to keep?

Granted, that is a matter of need. And the usual need is to keep the low-order (least-significant) digits.

If you want to keep the high-order value, then shift (that is, divide) before truncating.

  (single_digit)27 → 7

  (single_digit)(27 / 10) → 2

Either way, the value now fits neatly in a single digit. Hope this helps.
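The same idea with the actual integer types, assuming a 64-bit unsigned long long and a 32-bit unsigned int:

  #include <cstdio>

  int main()
  {
      unsigned long long x = 0x12345678ABCDEF00ULL;

      printf( "%#x\n", (unsigned)x );           // 0xabcdef00 : plain truncation keeps the low bits
      printf( "%#x\n", (unsigned)(x >> 32) );   // 0x12345678 : shift first to keep the high bits
  }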
That makes perfect sense

For example, if we were to convert a long whose value is 42 to an int, truncating the high-order bits would still leave us with 42,

but truncating the low-order bits would give us a completely unrelated number.
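Something like this, if I have it right (assuming a 64-bit long):

  #include <cstdio>

  int main()
  {
      long x = 42;

      printf( "%d\n", (int)x );                          // 42 : the high-order bits were all zero anyway
      printf( "%d\n", (int)((unsigned long)x >> 32) );   // 0  : the low-order bits (the actual 42) are gone
  }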



thanks!
Exactly! :O)