
Suppose we have a sorted list of values. Let's use list = Sort@RandomReal[1, 1000000]; for this example.

I need a fast function window[list, {xmin, xmax}] which will return all list elements $x$ for which $x_\textrm{min} \le x \le x_\textrm{max}$.

How can this be implemented in Mathematica? I am looking both for fast and for elegant solutions. The direct solution is implementing binary search, but there are several ways to do this, and perhaps Mathematica already has something built in that I am not aware of.


Here's the most naïve implementation:

window[list_, {xmin_, xmax_}] := 
 list[[LengthWhile[list, # < xmin &] + 1 ;; LengthWhile[list, # <= xmax &]]]
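For reference, the Select one-liner (mentioned in the comments on kglr's answer below as the more natural but slower approach) is even simpler; a sketch, with windowSelect being my name for it:

windowSelect[list_, {xmin_, xmax_}] := Select[list, xmin <= # <= xmax &]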

Summary:

Here are the timings I get for the different solutions, on random data of a million machine reals that also contains duplicates (a minimal benchmarking sketch follows the summary):

  • My original naive solution: 3.85 s

  • Leonid, using binary search: 0.01 s (close to the measurable limit, $\log n$ complexity)

  • R.M., using Clip: 0.59 s (linear time, no sorting required)

  • faleichik, using Nearest: 1.29 s (by measurement, this also runs in linear time, somewhat surprisingly)

  • kguler, using Map (which auto-compiles) and Pick: 0.30 s (also linear time; the fastest simple linear-time solution so far, and it does not require sorting either)

For sorted data, the fastest solution is Leonid's, which uses binary search and has logarithmic complexity.

For unsorted data, the fastest (and also one of the simplest) is kguler's. A not-so-obvious trick is applying Boole to the condition so that the Mapped function is automatically compiled.
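For reference, here is a minimal sketch of the kind of harness used for these timings (window stands for whichever implementation is being measured; exact numbers will of course vary by machine):

list = Sort@RandomReal[1, 10^6];
{min, max} = Sort@RandomReal[1, 2];
First@AbsoluteTiming[window[list, {min, max}];]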

Szabolcs

6 Answers


As you said, essentially you need binary search, since you have a sorted list and binary search has a logarithmic complexity. However, since

  • the limiting numbers may not be present in the list
  • some numbers may be present more than once

we need a modified binary search. Here is a possible implementation:

(* position of the largest element smaller than or equal to the limit *)
bsearchMin[list_List, elem_] :=
  Module[{n0 = 1, n1 = Length[list], m},
    While[n0 <= n1,
     m = Floor[(n0 + n1)/2];
     If[list[[m]] == elem, 
         While[list[[m]] == elem, m++]; 
         Return[m - 1]];
     If[list[[m]] < elem, n0 = m + 1, n1 = m - 1]
    ];
    If[list[[m]] < elem, m, m - 1] 
  ];

and

(* position of the smallest element larger than or equal to the limit *)
bsearchMax[list_List, elem_] :=
  Module[{n0 = 1, n1 = Length[list], m},
    While[n0 <= n1,
      m = Floor[(n0 + n1)/2];
      If[list[[m]] == elem, 
         While[list[[m]] == elem, m--]; 
         Return[m + 1]];
      If[list[[m]] < elem, n0 = m + 1, n1 = m - 1]
    ];
    If[list[[m]] > elem, m, m + 1] 
  ];

With the help of these:

window[list_, {xmin_, xmax_}] :=
  With[{minpos = bsearchMax[list, xmin], maxpos =  bsearchMin[list, xmax]},
    Take[list, {minpos, maxpos}] /; ! MemberQ[{minpos, maxpos}, -1]
  ];
window[__] := {};

For example:

lst = {1, 4, 4, 4, 6, 7, 7, 11, 11, 11, 11, 13, 15, 18, 19, 22, 23, 25, 27, 30}

window[lst, {4, 11}]

(* ==> {4, 4, 4, 6, 7, 7, 11, 11, 11, 11} *)

You can Compile the functions bsearchMin and bsearchMax if you expect lots of duplicate elements (this will speed up the inner While loop). Compiling them per se won't improve the speed much (unless you call them very often), since the complexity is logarithmic in any case.

This is generally more efficient than Nearest for sorted lists (except perhaps when you have lots of repeated elements, but then you can compile), because Nearest is a general algorithm which cannot take advantage of the sorted nature of the list. I tried it on a list of 10^7 elements, and it took about 0.0003 seconds.

Compiled version

Compiled versions speed up bsearchMin and bsearchMax, but they do not seem to improve the performance of window[]; see the discussion in the comments below.

bsearchMax = Compile[{{list, _Real, 1}, {elem, _Real}},
  Block[{n0 = 1, n1 = Length[list], m = 0},
    While[n0 <= n1,
      m = Floor[(n0 + n1)/2];
      If[list[[m]] == elem,
        While[m >= n0 && list[[m]] == elem, m--]; Return[m + 1]  ];
      If[list[[m]] < elem, n0 = m + 1, n1 = m - 1]];
    If[list[[m]] > elem, m, m + 1]
  ]
  ,
  RuntimeAttributes -> {Listable},
  CompilationTarget -> "C"
]

bsearchMin = Compile[{{list, _Real, 1}, {elem, _Real}},
  Block[{n0 = 1, n1 = Length[list], m = 0},
    While[n0 <= n1,
      m = Floor[(n0 + n1)/2];
      If[list[[m]] == elem,
        While[m <= n1 && list[[m]] == elem, m++]; Return[m - 1]  ];
      If[list[[m]] < elem, n0 = m + 1, n1 = m - 1]];
    If[list[[m]] < elem, m, m - 1]
  ]
  ,
  RuntimeAttributes -> {Listable},
  CompilationTarget -> "C"
]
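A usage sketch (my example, not part of the original answer): because of RuntimeAttributes -> {Listable}, the compiled functions thread over a whole vector of query values, which is the regime where compilation pays off (this mirrors the test in the comments below):

list = Sort@RandomReal[1, 10^4];
queries = RandomReal[1, 3000];
First@AbsoluteTiming[bsearchMax[list, queries]]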
Leonid Shifrin
  • I wanted to say that I think this should become a built-in part of Mathematica. I use it all the time, definitely one of my favorite answers on the site. – s0rce Jul 14 '13 at 13:58
  • @s0rce Then you should know that it likely has a bug. I seem to remember that I had to fix it in this answer, but I did not propagate the changes back. Please see the edit history of that answer to see which changes I made there. – Leonid Shifrin Jul 14 '13 at 14:12
  • A minor quibble: I'd use Quotient[n0 + n1, 2] instead of Floor[(n0 + n1)/2]. – J. M.'s missing motivation Oct 15 '15 at 14:37
  • @karolis Your bsearchMin contains a mistake: m isn't initialized. And what sample have you used for testing the compiled function? With lst = RandomReal[1, 10^8]; test = window[lst, {0.3, 0.6}]; // AbsoluteTiming the compiled version becomes slower. – xzczd Dec 06 '15 at 05:50
  • @xzczd, the typo in code has been fixed (by someone, thanks!). My testing code: list = Sort@RandomReal[1, 10000]; find = RandomReal[1, 3000]; and then First@AbsoluteTiming[bsearchMax[list, #] & /@ find]. On my Macbook pro laptop uncompiled version takes 0.312 and compiled version takes 0.069. – Karolis Dec 06 '15 at 21:14
  • @Karolis Oh, just noticed I used the wrong lst, it should be lst = Sort@RandomReal[1, 10^8]. After this correction, the compiled version is almost as fast as the uncompiled one in my test. So our tests are both consistent to Leonid's prediction: Compiling them per se won't improve the speed much (unless you call these very often), since the complexity is logarithmic in any case. – xzczd Dec 07 '15 at 04:19
  • @Karolis It was me who fixed the typo. Thanks for providing the compiled versions. – Leonid Shifrin Dec 07 '15 at 08:14
  • @xzczd, curious that the uncompiled version runs faster than the compiled one. On what system did you test? I just tried on Windows 7 with i7: compiled version took 0.034s, and uncompiled version took 0.218s – Karolis Dec 07 '15 at 20:39
  • @Karolis Seems that again I've made a mistake in yesterday's test… With the correct sample i.e. lst = Sort@RandomReal[1, 10^8]; (test = window[lst, {0.3, 0.6}]; // AbsoluteTiming) the compiled version is still slower (2.78s vs 0.10s). With your sample my timing is similar to yours. I'm on v9.0.1, win10 64bit with i7, the C compiler is TDM-GCC 4.9.2. – xzczd Dec 08 '15 at 05:18
  • @xzczd, I get the same result with window[] function. There might be hidden costs for calling C code infrequently. I updated the description to reflect this. – Karolis Dec 09 '15 at 11:12
  • Do you know why the non-compiled version is about 3 orders of magnitude faster than the compiled version for large lists (10^8)? I'm using Mathematica 10.4. – RunnyKine Mar 17 '16 at 04:07
  • @RunnyKine I don't have 10.4 on this machine, and neither do I have C compiler here currently installed, so I can't test. Do you include the compilation time in the measurements? If so, it might explain it. If not, I am really not sure - compiled version should be faster. – Leonid Shifrin Mar 17 '16 at 05:47
  • I don't include compilation time. Both the "WVM" and "C" compiled versions run about the same time, but the non-compiled version just smokes them. – RunnyKine Mar 17 '16 at 05:52
  • @RunnyKine Indeed. Tested on WVM target, M10.2. I have only one possible explanation: the compiled function copies the list internally, while the top-level version does not. Since the time it takes to do a single lookup is logarithmic in the size of the list, it is negligible w.r.t. internal array copy. Why array is copied, I don't know - it should've reused / shared the original array passed to it. – Leonid Shifrin Mar 17 '16 at 08:30
  • Ah thank you, that makes sense. On further testing my initial timing may have been a one time thing, but the top-level version is still an order of magnitude faster, on average, than the compiled one. Regardless, this function is insanely fast. Thanks for sharing. – RunnyKine Mar 17 '16 at 08:37
  • Any chance of "releasing" a windowBy version? – P. Fonseca May 29 '16 at 12:24
  • @P.Fonseca Well, it is kind of trivial. You just have to extract a list of all "by" property values, and find positions there, then extract elements like in window. – Leonid Shifrin May 29 '16 at 16:40
  • @LeonidShifrin This would be a nice addition to the function repository. – Rohit Namjoshi Apr 23 '21 at 22:59
  • @RohitNamjoshi If you say so. I am not a big fan of FR. – Leonid Shifrin Apr 23 '21 at 23:17

Using Pick with Boole selector

window[list_, {xmin_, xmax_}] := 
 Pick[list, Boole[xmin <= # <= xmax] & /@ list, 1]

With

 list = Sort@RandomReal[1, 1000000];
 {min, max} = Sort@RandomReal[1, 2];

Timings:

 Table[ClearSystemCache[]; 
 Timing[window[list, {min, max}];], {50}] // Mean
 (* ==> {0.0674, Null} *)

on a laptop with Vista 64bit OS, Intel Core2 Duo T9600 2.80GHz, 8G memory.

UPDATE: Using Pick with alternative selector arrays:

UnitStep

  windowUnitStep1[list_, {xmin_, xmax_}] := 
  Pick[list, UnitStep[(list-xmin)(xmax-list)], 1]

or

  windowUnitStep2[list_, {xmin_, xmax_}] := 
  Pick[list, UnitStep[list-xmin]UnitStep[xmax-list], 1]

both are twice as fast as Boole.

UnitStep compiled (using Ruebenko's compiled function win, defined in his answer below)

  windowUnitStep3[list_, {xmin_, xmax_}] := 
  Pick[list, win[list,xmin,xmax], 1]

is twice as fast as uncompiled UnitStep.

Using GatherBy with Boole:

  windowGatherBy[list_, {xmin_, xmax_}] := Last@GatherBy[list, Boole[xmin <= # <= xmax] &]

gives similar timings to window. (Note that taking Last relies on the first list element lying outside the window, so that the in-window group is gathered last; this holds for the test data above.)

Using SparseArray with Part or Take:

The following alternatives take into account the fact that the input data is sorted: the first and the last non-zero positions in SparseArray[UnitStep[(list - xmin) (xmax - list)]] give the first and the last positions of the portion of the input list that satisfies the bounds.

 windowSparseArray1[list_, xmin_, xmax_] := 
 With[{fromTo = SparseArray[UnitStep[(list - xmin) (xmax - list)]][
  "NonzeroPositions"][[{1, -1}]]}, 
  list[[fromTo[[1, 1]] ;; fromTo[[2, 1]]]]]

or

 windowSparseArray2[list_, xmin_, xmax_] := 
 With[{fromTo = SparseArray[UnitStep[(list - xmin) (xmax - list)]][
  "NonzeroPositions"][[{1, -1}]]}, 
  Take[list, {fromTo[[1, 1]], fromTo[[2, 1]]}]]

both give roughly a 50 percent speed improvement over window above. Using Ruebenko's compiled UnitStep to construct the selector array again doubles the speed:

 windowSparseArray3[list_, xmin_, xmax_] := 
 With[{fromTo = SparseArray[win[list,xmin,xmax]][
  "NonzeroPositions"][[{1, -1}]]}, 
  Take[list, {fromTo[[1, 1]], fromTo[[2, 1]]}]]
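As a quick sanity check (a sketch; it assumes a sorted list so that the in-window elements are contiguous, and a non-empty window), the SparseArray variants should agree with the Pick-based window above:

windowSparseArray2[list, min, max] === window[list, {min, max}]
(* ==> True *)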
kglr
  • @Szabolcs By looking at the timings, which are rather impressive here for a linear-time top-level solution, I would guess that Map auto-compiles the test. Coupled with Pick being optimized on packed arrays, this represents a viable linear-time alternative, to my mind. – Leonid Shifrin Feb 27 '12 at 10:50
  • @Szabolcs, and @Leonid, this was intended as a baseline. It is much faster than alternatives using LengthWhile (1.52662), Position (1.4015), Clip (0.1819) or Nearest (0.42962) on the same data set. Of course, a method that explicitly uses binary search is incomparably better: Leonid's binary search method gives over 40x better results than plain Pick. – kglr Feb 27 '12 at 11:46
  • It is really not so obvious why this is so fast. If you regularly use this pattern then could you include some additional explanations? Select is more natural here, but it's also much slower (due to unpacking?) Related: http://mathematica.stackexchange.com/a/11/12 and http://mathematica.stackexchange.com/q/1803/12 – Szabolcs Feb 27 '12 at 12:59

I think Nearest[] is the most effective way. You don't even need to sort your data.

a = RandomReal[1, 100];
nf = Nearest@a;
xmin = 0.01; xmax = 0.6;
x0 = (xmin + xmax)/2; dx = (xmax - xmin)/2;
nf[x0, {\[Infinity], dx}] // Sort

{0.0117819, 0.013102, 0.0177269, 0.0356801, 0.040019, 0.0504563, \
0.0627056, 0.0749593, 0.0758206, 0.106541, 0.107941, 0.112281, \
0.117172, 0.132445, 0.143151, 0.157252, 0.166585, 0.179652, 0.217876, \
0.241301, 0.242821, 0.254276, 0.258477, 0.267544, 0.268951, 0.280489, \
0.290386, 0.305346, 0.315458, 0.318908, 0.337006, 0.338169, 0.339338, \
0.362153, 0.366946, 0.371712, 0.386563, 0.396061, 0.416329, 0.426874, \
0.430932, 0.439427, 0.460844, 0.473224, 0.475559, 0.476573, 0.479037, \
0.480472, 0.503684, 0.513969, 0.521916, 0.535221, 0.541562, 0.54198, \
0.554534, 0.558954, 0.563491, 0.565873, 0.582683, 0.58919, 0.592807, \
0.593541}

For an array of 100,000 numbers it took 0.062 seconds on my machine; for a million, 0.688.
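Packaged into the window form asked for in the question (a sketch along the same lines; windowNearest is my name for it):

windowNearest[list_, {xmin_, xmax_}] :=
 With[{x0 = (xmin + xmax)/2., dx = (xmax - xmin)/2.},
  Sort@Nearest[list][x0, {\[Infinity], dx}]]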

faleichik
  • I did not know about this syntax of a NearestFunction (i.e. an argument of the form {\[Infinity], dx}). It is not documented on the doc page for Nearest. Your answer shows that it was definitely worth asking this question. I do wonder how Nearest works though. It works on two-dimensional data as well, so I doubt it does a plain sort internally. – Szabolcs Feb 27 '12 at 09:50
  • It is documented here: http://reference.wolfram.com/mathematica/tutorial/UsingNearest.html – faleichik Feb 27 '12 at 09:56
  • I was always curious about how Nearest worked, and I don't know much about data structures, so I finally asked on SO. – Szabolcs Feb 27 '12 at 10:13

Here are a few approaches:

1: Using Clip

This should definitely be faster than the naïve implementation, and it is a good un-compiled option for unsorted lists:

 window[list_, {xmin_, xmax_}] :=  Clip[list, {xmin, xmax}, {{}, {}}] // Flatten

However, as Leonid notes, this also unpacks the array (causing a slight drop in speed) because the last argument is not numerical; this can be handled by clipping differently, as sketched below.
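A sketch of one such "clip differently" variant (my example): clip to the window bounds themselves, so all arguments are numeric and the array stays packed, then keep exactly the elements that survive clipping unchanged:

windowClip2[list_, {xmin_, xmax_}] :=
 Pick[list, Unitize[Clip[list, {xmin, xmax}] - list], 0]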

2: Using Pick and IntervalMemberQ

This is a straightforward mathematical implementation of the problem, and is again faster than the naïve approach.

window[list_, {xmin_, xmax_}] := Pick[list, 
    IntervalMemberQ[Interval[{xmin, xmax}], list], True]

This will also unpack the array.

3: Forward-backward search (Compiled)

Since you have a sorted list, the following first searches forward until it hits the first element >= xmin, then searches backward until it hits the last element <= xmax, and returns everything in between. Compiling to C and parallelizing it makes it very fast (300x faster than naïve, 30x faster than Clip and 170x faster than IntervalMemberQ on my machine).

window = Compile[{{list, _Real, 1}, {xmin, _Real}, {xmax, _Real}},
    Module[{i, j},
        i = 1; While[list[[i]] < xmin, i++];
        j = 1; While[list[[-j]] > xmax, j++];
        list[[i ;; -j]]
    ],
    CompilationTarget -> "C", Parallelization -> True, 
    "RuntimeOptions" -> "Speed"
]
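Note that this compiled version takes the bounds as two separate arguments rather than as a list; a usage sketch:

list = Sort@RandomReal[1, 10^6];
window[list, 0.3, 0.6] // Length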
rm -rf
  • This is still bound to be linear time, although perhaps with a small time constant. A good one for unsorted lists though. – Leonid Shifrin Feb 27 '12 at 10:06
  • @Leonid In Mathematica sometimes the fastest solution for input data of practical sizes is not the same as the best complexity solution, so this is pretty useful. In this case I was interested in using the same list over and over, so a pre-processing (like sorting) is affordable. This is why I asked about sorted lists. – Szabolcs Feb 27 '12 at 10:12
  • @Szabolcs And this is why I used binary search in my solution. You can't beat a log with linear for large lists, even in Mathematica where some time constants (e.g. packed array vs. unpacked arrays) are very different and create rather unnatural performance handicaps. So, in this particular case: the complexity is important. – Leonid Shifrin Feb 27 '12 at 10:15
  • The less obvious problem here is that Clip unpacks, because the last argument is not numerical. This seriously degrades the performance. This can be fixed by Clip-ping differently and using other methods of eliminating clipped elements, which won't unpack. – Leonid Shifrin Feb 27 '12 at 11:55
  • @Szabolcs Please see my edit and let me know the timings on your data (in your question) – rm -rf Feb 27 '12 at 13:22

Here is my entry. It's O(n), but quite fast, so if you ever have unsorted data, this is a choice:

win = Compile[{{inVec, _Real, 1}, {min, _Real, 0}, {max, _Real, 0}},
  UnitStep[(inVec - min)*(-inVec + max)]
  ]
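As the comments below discuss, win produces a 0/1 mask meant to be combined with Pick; a usage sketch:

list = RandomReal[1, 10^6];
Pick[list, win[list, 0.3, 0.6], 1]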
  • You meant to use this inside Pick, I guess? – Leonid Shifrin Feb 27 '12 at 14:30
  • One interesting thing about kguler's solution compared to this is that it will auto-specialize to the type of input. I only managed to get explicitly compiled solutions (like this one) working if I specified the type of the input (_Real or _Integer). When we let Map auto-compile, it seems to choose the correct one automatically. Of course in this particular situation using a function that works with reals will also work with integers (in practice). But if it is compiled specifically for integers, it will be faster on an integer vector. – Szabolcs Feb 27 '12 at 14:41
  • Now, for this particular solution: it is more than twice as fast as kguler's Map solution, and compilation is not really needed (it doesn't seem to influence the timing as everything is vectorized anyway), so types are not an issue. But in general I do wonder: isn't it maybe better to just let Mathematica auto-compile through Map or a similar function when we want a general function that works for several input types? Or is it possible to create a compiled function that will work for several input types (maybe it'll auto-select the correct version to run internally)? – Szabolcs Feb 27 '12 at 14:44
  • @LeonidShifrin, that is one option, depends on what to do next. Positions might be another option. –  Feb 27 '12 at 14:52
  • @Szabolcs, I don't know of a way to compile for several types out of the box, you'd have to set something up, like a templateCompile, that could compile for a list of types and default to the expr given. –  Feb 27 '12 at 14:55
  • @Szabolcs Implementing auto-compiled function is not that difficult. You have to dynamically determine the type by something like ArrayQ, and query the dimensions. For really important cases, I'd actually prefer manual version to have more control over things. What is IMO even more important, auto-compilation AFAIK always compiles to byte-code, so if you want to compile to C, this won't be an option. You can memoize the compiled function for a given set of argument types, in that case, to avoid compilation overhead in subsequent calls. – Leonid Shifrin Feb 27 '12 at 15:02
  • @Szabolcs I gave an explicit example of how the memoization can be done, in this answer, section "Making JIT-compiled functions...". All that remains is to write a dispatcher which determines the types and forms the right sets of type declarations. This is not difficult at all. – Leonid Shifrin Feb 27 '12 at 15:04
  • @LeonidShifrin We should really start a site blog (please see the chat room). Would you like to volunteer for a couple of posts on these topics? I think it's better suited for a blog than answers. – Szabolcs Feb 27 '12 at 15:10
  • @Szabolcs Yes, that's a good idea, I actually wrote somewhere that I'd support it, but perhaps that was somewhere in comments. I'd sure be glad to do a couple of posts on these matters. It is a question of finding time also, since blog posts are typically larger than an average SE answer, and are supposed to be more comprehensive and logically complete. But, you don't have to blog every day, so it should be managable :). Besides, the blog format allows more extended discussions which are frowned upon on SE and don't quite fit the SE format. – Leonid Shifrin Feb 27 '12 at 15:17
  • @LeonidShifrin, the default for auto-compilation has to be byte code: 1) Mathematica does not ship with a C compiler. 2) compilation to C takes forever, compared to compiling for byte code. – Feb 27 '12 at 15:18
  • @ruebenko Yes, I realize all that, and also think that these are sensible choices. This was not a critique of the internal Compile defaults, but just a word of caution, so that one does not rely on auto-compilation as the always best solution. Sometimes compilation to C is beneficial despite the compilation time, this is one reason why I said that for really important cases (in terms of performance), I'd prefer to retain the manual control over Compile, including control over the compilation target. – Leonid Shifrin Feb 27 '12 at 15:25

Assuming that you are interested in multiple different windows for the same list, then the following approach will be much faster than the other answers. Basically, compute a NearestFunction of the data once, and then use that NearestFunction for each window of interest. Here is a function that does this:

WindowFunction[list_] := With[{s = Sort@list},
    WindowFunction[Nearest[s->"Index"], s]
]

WindowFunction[nf_, list_][min_, max_] := Module[{r,s},
    {r, s} = nf[{min, max}][[All,1]];
    Take[list, {r + Boole[list[[r]] < min], s - Boole[list[[s]] > max]}]
]

Here is a comparison with the accepted answer. Sample data:

list = Sort @ RandomReal[1, 10^6];

Compute the WindowFunction (this step is a bit slow, but only needs to be done once):

wf = WindowFunction[list]; //AbsoluteTiming

{0.044266, Null}

Compare:

r1 = wf[.49, .51]; //RepeatedTiming
r2 = window[list, {.49, .51}]; //RepeatedTiming

r1 === r2

{0.000043, Null}

{0.00018, Null}

True

About 4 times faster. One could also add a summary box format for WindowFunction if desired.

Carl Woll
  • Your answers always make me wonder where is the limit to optimizations in Mathematica. I thought I understood its computation model / optimizations fairly well to squeeze most of the speed out, but your answers regularly put me to shame. Hats off, and of course a +1. – Leonid Shifrin Jan 07 '18 at 17:40
  • Also, I will use an opportunity to say this once again, and I am sure that the community as a whole shares that opinion. Your contributions make this already great resource just that much better, and you also set a great example of following up old questions and providing high-quality answers to them - something that is alas not commonly done. It would be great if more of us would follow your path in this regard too. – Leonid Shifrin Jan 07 '18 at 17:45
  • I really appreciate your contribution, which helped me so much. But there's a fly in the ointment: the method you provided seems to work only when there are no repeated elements in the list. Obviously it is easy to fix this (a sketch of one possible fix follows below). – Ren Witcher Mar 24 '19 at 18:31
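Following up on the last comment, here is a hedged sketch of such a fix (my code, not part of Carl's answer): widen both endpoints across runs of equal values before taking the window. With many duplicates this degrades toward linear time, but it handles repeated elements correctly:

WindowFunction[nf_, list_][min_, max_] := Module[{r, s},
    {r, s} = nf[{min, max}][[All, 1]];
    (* walk r to the first element >= min *)
    While[r > 1 && list[[r - 1]] >= min, r--];
    While[r <= Length[list] && list[[r]] < min, r++];
    (* walk s to the last element <= max *)
    While[s < Length[list] && list[[s + 1]] <= max, s++];
    While[s >= 1 && list[[s]] > max, s--];
    Take[list, {r, s}]  (* empty when r == s + 1 *)
]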