Is there a built-in function for binary search? That is, given a sorted list and a number, find the position at which the number can be inserted so that the list stays sorted.

I know that LengthWhile could manage that, but it's slow.
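
For concreteness, the LengthWhile approach can be sketched as follows. This is only an illustration with a made-up list and number; it scans the list from the front, so its cost grows with the position it finds.

```mathematica
(* Illustrative data, made up for this example *)
list = {1, 2, 5, 5, 7, 12};
x = 6;

(* Count the leading elements that are smaller than x, then add 1
   to get a position at which x can be inserted *)
pos = LengthWhile[list, # < x &] + 1
(* 5 *)

(* Inserting at that position keeps the list sorted *)
Insert[list, x, pos]
(* {1, 2, 5, 5, 6, 7, 12} *)
```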

MMM

3 Answers

There is some built-in binary search code but not in the core language as far as I know.

  • There is BinarySearch from the Combinatorica package, which is still the function I use most often, even though that package is now deprecated and loading it shadows some symbols.

  • There is the undocumented GeometricFunctions`BinarySearch but this function does not appear to perform particularly well.

When I need greater performance I typically use a compiled form of Leonid's code from:

Mr.Wizard

It seems that since 2021 there is a ResourceFunction that does this.

Example:

ResourceFunction["BinarySearch"][{1, 2, 5, 5, 7, 12}, 6]

I looked at the definition of the resource function: it seems to mainly add error handling to GeometricFunctions`BinarySearch which, according to @Mr.Wizard's answer, "does not appear to perform particularly well."

The documentation for the resource function mentions under "Properties and Relations" that:

BinarySearch can be considerably faster for packed arrays

with the example

packed = Sort[RandomReal[{0, 100}, 100000]];
RepeatedTiming[ResourceFunction["BinarySearch"][packed, 50]]

Here are some comparisons:

AbsoluteTiming[LengthWhile[packed, # < 50 &]]

(* {0.016084, 50113} *)

bsearchResource = ResourceFunction["BinarySearch"];
AbsoluteTiming[bsearchResource[packed, 50]]

(* {0.000474, 50113} *)

Here's BinarySearch from the Combinatorica package mentioned in Mr.Wizard's answer. The package is deprecated and the user has to load the package beforehand with Needs["Combinatorica`"]:

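AbsoluteTiming[BinarySearch[packed, 50]]

(* {0.000183, 100227/2} *)

Combinatorica's BinarySearch returns p + 1/2 when the key is absent and falls between positions p and p + 1, so 100227/2 = 50113.5 means that 50 lies between positions 50113 and 50114.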

Next is the bsearch function from the link in Mr.Wizard's answer. This is the uncompiled version:

AbsoluteTiming[bsearchMinNoCompile[packed, 50]]

(* {0.000159, 50113} *)

The version of bsearch compiled to C (I changed the original complex argument type to real):

AbsoluteTiming[bsearchMinCompile[packed, 50]]

(* {0.000042, 50113} *)
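
The definitions of bsearchMinNoCompile and bsearchMinCompile are not reproduced here. As a rough idea of what such functions can look like, here is a minimal sketch of my own, not the code from the linked answer. The names lowerCount and lowerCountCompiled are hypothetical. Both return the number of elements smaller than x, which is the same value that LengthWhile[list, # < x &] gives for a sorted list.

```mathematica
(* Hypothetical helper, NOT the code from the linked answer.
   Returns how many elements of the sorted list are < x. *)
lowerCount[list_List, x_] :=
 Module[{lo = 0, hi = Length[list], mid},
  (* Invariant: the first lo elements are < x,
     and every element after position hi is >= x *)
  While[lo < hi,
   mid = Quotient[lo + hi, 2] + 1; (* always lo < mid <= hi *)
   If[list[[mid]] < x, lo = mid, hi = mid - 1]];
  lo]

(* The same loop wrapped in Compile for real-valued packed arrays.
   Assumption: the default compilation target is used; no C compiler is needed. *)
lowerCountCompiled = Compile[{{list, _Real, 1}, {x, _Real}},
  Module[{lo = 0, hi = Length[list], mid = 0},
   While[lo < hi,
    mid = Quotient[lo + hi, 2] + 1;
    If[list[[mid]] < x, lo = mid, hi = mid - 1]];
   lo]];

lowerCount[{1, 2, 5, 5, 7, 12}, 6]
(* 4 *)
```

Adding 1 to the result gives a position at which x can be inserted while keeping the list sorted. Each pass through the loop halves the search interval, so the number of comparisons grows with the logarithm of the list length rather than with the length itself.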

Now let's compare bsearchMinNoCompile and the resource function after increasing the length of packed by a factor of 100:

packed = Sort[RandomReal[{0, 100}, 10000000]];

RepeatedTiming[bsearchMinNoCompile[packed, 50]]

(* {0.0000492547, 4998443} *)

RepeatedTiming[bsearchResource[packed, 50]]

(* {0.0000291153, 4998443} *)

Summary: Judging by the example in the resource function's documentation, ResourceFunction["BinarySearch"] is a convenient way to search sorted lists quickly. It also adds some error handling.

MarcoB
userrandrand

There are two relevant functions at Wolfram Function Repository (WFR) submitted by "Wolfram Staff":


Of course, one can see or follow the examples in the WFR pages. Nevertheless, here are examples that demonstrate the speed of BinarySearch.

SeedRandom[32];
packed = Sort[RandomReal[{0, 100}, 100000]];

s = packed[[1332]]

(* 1.35978 *)

ResourceFunction["BinarySearch"][packed, s]

(* 1332 *)

Timings comparison:

AbsoluteTiming[
 Do[ResourceFunction["BinarySearch"][packed, s], 1000]]

(* {0.194593, Null} *)

AbsoluteTiming[Do[Position[packed, s], 1000]]

(* {4.6829, Null} *)

Anton Antonov
  • For a packed array of up to 140-150K elements, a full linear search Pick[Range@Length@packed, Unitize[packed - s], 0] is just as fast or faster. And finds duplicates. (+1) – Michael E2 Sep 14 '22 at 11:41
  • @MichaelE2 Interesting and good to know -- thanks! (You can write to WFR staff suggesting to enhance BinarySearch with that code.) – Anton Antonov Sep 14 '22 at 11:43