Fast finding midpoint between two values

While reviewing a book on code elegance and testing ( Beautiful Code ), I came across an interesting section regarding Binary Searches. A binary search is pretty simple and fast, when run against an already sorted list of items.

The Basics

It basically goes like this:

1) find the midpoint of the list.
– is the item GREATER than value at midpoint? find midpoint between current and last ‘top’
– is the item LESSER than the value ad midpoint? find midpoint between current and last ‘bottom’

Continue with routine until either is true:
– you found the item you are looking for.
– you have exhausted your search list (midpoint = current).

History of the problem

To effect his binary search, you need to reliably locate that midpoint in the list. I’m sure a lot of people think they could effectively knock this out in a short time. And for the most part their solutions would work for most cases. However, when operating against enormous lists, it’s been found that the registers holding the ends and midpoint, when operated upon, can suddenly (due to signing issues) wrap and become negative! This is a huge bug, and until the 80’s was largely ignored and/or unsolved.

Then a most elegant solution was devised.

The reason I bring this up is that I rely upon this method for a lot of the code I write. It’s fast, it’s effective, and operating in the binary world provides ultimate control. This is direct manipulation of the data, at the binary level.

One solution to fast mid-point calculation

Locating the midpoint between two values is fairly simple, but perhaps a little confusing to some. In this example, there is another advantage in it’s side effect, you will always have a whole number upon which to work after division.

The old solution – straight division


int mid = (low + high) / 2;

The problem is, if mid > max storage value for int (varies by language) then the value overflows, and results in a negative number. A negative index on a list is going to product unpredictable results. That, is a problem!

The new solution – bitshift


int mid = (low + high) >>> 1;

What is that you might ask? Quite simple, it’s a bit shift operations (in this case it’s JAVA and it’s an unsigned bitshift operation, which you’ll need to use or the bug will simply re-manifest itself).

The concept here is simple. Given the value (which is always stored as binary data), to divide it by two, one only need to shift the bits to the right, in effect dropping the single (0|1), moving all bits by one value to the right and viola. You have found the midpoint (middle).

At this point, thinking I’ve lost some of you in the theory, so I’ll do my best to explain, by looking at the number 217, represented in 8 bits, then use the bit shift to see the result.

Each ‘bit’ in this 8-bit register represents a base10 value.

  217 in bits
    128   64   32  16   8   4   2   1      (value at each bit)

     1     1    0   1   1   0   0   1

The furthest left (in this ordering) represents a value of 128, the next to the right 64, and so forth. If you add up the bits that are ‘on’ (set to 1) you have the following:


128 + 64 + 16 + 8 + 1 = 217

So, study that and get it straight in your head, because next we look at the magic of the bit shift.

By shifting the value 1 bit to the right, the bitshift operation back-fills the vacant values with 0 (unset). Here is the before and after:

  BEFORE
    128   64   32  16   8   4   2   1      (value at each bit)

     1     1    0   1   1   0   0   1

  AFTER
    128   64   32  16   8   4   2   1      (value at each bit)

     0     1   1    0   1   1   0   0     1 is dropped
             moved to the right --->

Now, using the bit values as before, the calculation now becomes:

64 + 32 + 8 + 4 = 108

108 is 1/2 of 217, represented by a whole number (the fractional .5 or 1/2 is automatically rounded down by the action of the bit shift itself. Simple, fast, effective, repeatable and eminently safe!

Conclusion

Bit shift operators are your friend, once you understand how to use them. And now you know one of my programming secrets. Bit-shift operations can also be implemented in the other direction, and that can be a very powerful coding weapon indeed, but I’m not giving away all my tricks.