
Yesterday I got into an argument with @UnchartedWorks over in the comment thread here. At first glance his post looked like a duplicate of Marius' answer, just with some unnecessary memoization:

unitize[x_] := unitize[x] = Unitize[x]
pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt]

and proposed the following test to justify his claim that his approach is faster:

RandomSeed[1];
n = -1;
data = RandomChoice[Range[0, 10], {10^8, 3}];

AbsoluteTiming[Pick[data, Unitize@data[[All, n]], 1] // Length]

AbsoluteTiming[pick[data, unitize@data[[All, n]], 1] // Length]

(*
{7.3081, 90913401}
{5.87919, 90913401}
*)

A significant difference. Naturally, I was skeptical. The evaluation queue for his pick is (I believe) as follows:

  1. pick is inert, so evaluate the arguments.
  2. data is just a list, 1 is inert, data[[All, n]] quickly evaluates to a list
  3. unitize@data[[All, n]] writes a large DownValue...
  4. ...calling Unitize@data[[All, n]] in the process, returning the unitized list.
  5. Another large DownValue of the form pick[data] = *pickedList* is created (data here is, of course, meant in its evaluated form), never to be called again (unless, for some reason, we explicitly type pick[data]).
  6. The *pickedList* is returned.

What about the evaluation queue for Pick[data, Unitize@data[[All, n]], 1]?

  1. Pick is inert.
  2. data becomes an inert list, 1 is inert, data[[All, n]] quickly evaluates to an inert list.
  3. Nothing happens here.
  4. Unitize@data[[All, n]] returns the unitized list.
  5. Nothing happens here either.
  6. The same step as before is taken to get us the picked list.

So, clearly pick has more things to do than Pick.
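
To make step 5 concrete, here is a minimal sketch on a tiny list (tinyData is a made-up stand-in for data), showing the stored DownValue that gets created and then never reused:

unitize[x_] := unitize[x] = Unitize[x];
pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt];

tinyData = {{1, 0}, {2, 3}, {4, 0}};
pick[tinyData, unitize@tinyData[[All, -1]], 1]
(* {{2, 3}} *)

DownValues[pick]
(* besides the general rule, this now holds
   pick[{{1, 0}, {2, 3}, {4, 0}}] = {{2, 3}},
   which is only ever used if we explicitly type pick[tinyData] *)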

To test this out I run the following code:

Quit[]

$HistoryLength = 0;

Table[
  Clear[pick, unitize, data];
  unitize[x_] := unitize[x] = Unitize[x];
  pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt];
  data = RandomChoice[Range[0, 10], {i*10^7, 3}];
 {Pick[data, Unitize@data[[All, -1]], 1]; // AbsoluteTiming // First,
  pick[data, unitize@data[[All, -1]], 1]; // AbsoluteTiming // First},
  {i, 5}]

Much to my surprise, pick is consistently faster!

{{0.482837, 0.456147},
 {1.0301,   0.90521},
 {1.46596,  1.35519},
 {1.95202,  1.8664},
 {2.4317,   2.37112}}

How can I protect myself from black magic, i.e. make a representative test? Or should I embrace the black magic: is this real and a valid way to speed things up?

Update re: answer by Szabolcs

Reversing the order of the two timings in the list like so:

{pick[data, unitize@data[[All, -1]], 1]; // AbsoluteTiming // First,
 Pick[data, Unitize@data[[All, -1]], 1]; // AbsoluteTiming // First}

gave me the following result:

{{0.466251, 0.497084},
 {1.18016,  1.17495},
 {1.34997,  1.42752},
 {1.80211,  1.93181},
 {2.25766,  2.39347}}

Once again, regardless of the order of operations, pick is faster. Caching could be a suspect, and as mentioned in the comment thread of the other question, I did try throwing in a ClearSystemCache[] between the pick and Pick, but that didn't change anything.

Szabolcs suggested that I throw out the memoization and just use wrapper functions. I presume he meant this:

unitize[x_] := Unitize[x];
pick[xs_, sel_, patt_] := Pick[xs, sel, patt];

As before, on a fresh kernel I set $HistoryLength to 0 and run the Table loop. I get this:

{{0.472934, 0.473249},
 {0.954632, 0.96373},
 {1.42848,  1.43364},
 {1.91283,  1.90989},
 {2.37743,  2.40031}}

i.e. nearly equal results, sometimes one is faster, sometimes the other (left column is pick, right is Pick). The functions perform as well as Pick in a fresh kernel.

I try again with the memoization as described towards the beginning of the question:

{{0.454302, 0.473273},
 {0.93477,  0.947996},
 {1.35026,  1.4196},
 {1.79587,  1.90001},
 {2.24727, 2.38676}}

The memoized pick and unitize perform consistently better out of a fresh kernel. Of course, it uses twice the memory along the way.
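
A rough way to see that extra memory, assuming the memoized definitions and data from the last Table iteration are still in the session (ByteCount ignores sharing of subexpressions, so treat these as upper bounds):

ByteCount[data]                          (* the original array *)
Total[ByteCount /@ DownValues[unitize]]  (* stored slice plus its unitized copy *)
Total[ByteCount /@ DownValues[pick]]     (* stored key (data) plus the picked result *)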

LLlAMnYP
  • Have you tried removing the "memoization" part and simply using wrapper functions? – Szabolcs May 05 '17 at 10:25
  • @szabolcs excellent suggestion. No, I have not. I'll report back in the next hour or so. – LLlAMnYP May 05 '17 at 10:35
  • @Szabolcs I tried it out, wrappers are essentially not different from built-ins. – LLlAMnYP May 05 '17 at 12:28
  • @Szabolcs also, I tried to put SeedRandom[42] right after the Clear statement, so that I could repeatedly have Pick act on the same data. I don't witness the speedup from your answer. – LLlAMnYP May 05 '17 at 12:34

4 Answers


Cause of speed up

This is definitely not memoization. The reason for the observed speed-up is that for large arrays (e.g. 10^8 elements), memory clean-up operations may take a noticeable amount of time. If one doesn't free memory, one can perform some operations a bit faster.

Here is a simple example:

Let's create a large array, then perform a calculation, and remove the array:

AbsoluteTiming[
    Total[ConstantArray[0, 10^8]];
]

{0.422509, Null}

It takes 0.42 seconds. Let's now do the same thing, but keep the array in memory:

AbsoluteTiming[
    Total[garbage = ConstantArray[0, 10^8]];
]

{0.366755, Null}

This evaluation is a bit faster.

Let's check how long it takes to remove the large array:

AbsoluteTiming[
    Remove[garbage]
]

{0.061982, Null}

Note that 0.06 seconds is roughly the difference between the calculation times above. This example shows that if we keep the large array instead of removing it, our code can run faster, because we don't need to spend time on freeing memory.

Your example

In the example you provide, removing the result of Unitize@data[[All, n]] from memory takes some time. If one saves this array in a redundant variable, one avoids the immediate memory clean-up and the evaluation seems to be faster. In the case of pseudo-memoization, the Clear[pick, unitize] command will take extra time to free the memory, but this command is placed outside the AbsoluteTiming scope. That is why "memoization" seems to speed up the calculation.
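
A minimal sketch of this effect, assuming a fresh kernel (the sizes here are illustrative, not the ones from the question):

$HistoryLength = 0;
unitize[x_] := unitize[x] = Unitize[x];

vec = RandomInteger[{0, 10}, 10^7];
before = MemoryInUse[];

Unitize[vec];          (* result is discarded, so its memory can be freed right away *)
MemoryInUse[] - before

unitize[vec];          (* result stays alive in a DownValue of unitize *)
MemoryInUse[] - before (* noticeably larger: the unitized copy is still retained *)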

How to make a representative test?

You should put Clear[pick, unitize] inside your timing function. This test shows that the pseudo-memoization technique is actually slower than the built-in functions:

Table[
    Clear[data];
    data=RandomInteger[{0,10},{i*10^7,3}];
    {
        Pick[data,Unitize@data[[All,-1]],1]; // AbsoluteTiming // First
        ,
        Clear[pick,unitize];
        unitize[x_]:=unitize[x]=Unitize[x];
        pick[xs_,sel_,patt_]:=pick[xs,sel,patt]=Pick[xs,sel,patt];
        pick[data,unitize@data[[All,-1]],1]; // AbsoluteTiming // First
    },
    {i,5}]

(*
{{0.534744, 0.469538},
{1.03776, 1.05842},
{1.58536, 1.65404},
{2.10422, 2.11284},
{2.48129, 2.71405}}
*)

Technical note: as noted by Carl Woll in the comments, if one wants to measure the symbol-removal time using the following code:

In[1] := garbage = ConstantArray[0, 10^8];
In[2] := AbsoluteTiming[Remove[garbage]]

one should set $HistoryLength to zero, otherwise the Out[1] variable will retain the contents of the large array. If Out[1] retains the large data, Remove[garbage] will only delete the reference, not the data itself. The deletion time of a reference is almost zero, so it does not reflect the time needed to free the large data.
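
A sketch of the difference this makes, assuming a fresh kernel; MemoryInUse[] lets one confirm whether the data was actually freed:

$HistoryLength = 0;               (* per the note above, prevents Out from retaining the array *)
garbage = ConstantArray[0, 10^8];
MemoryInUse[]                     (* roughly 800 MB above an empty kernel *)
AbsoluteTiming[Remove[garbage]]
MemoryInUse[]                     (* drops back down, so the data really was freed *)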

Ray Shadow
  • Brilliant! Upvote and likely candidate for accept. Must go now, probably will be on mobile till next Wednesday. – LLlAMnYP May 05 '17 at 14:34
  • Not really. unitize is still faster. https://i.stack.imgur.com/LAuI0.png – webcpu May 05 '17 at 14:54
  • However, I do find that strange. This is a packed array, doesn't garbage collection amount to removing an entry like "memory from a to b is allocated"? I'd expect that to be an O(1) operation. – LLlAMnYP May 05 '17 at 14:55
  • @Shadowray The result of AbsoluteTiming@Remove[garbage] is 1.*10^-6 seconds. https://i.stack.imgur.com/798I5.png – webcpu May 05 '17 at 15:19
  • @UnchartedWorks The problem in your screenshot is that Out[9] still has the contents of garbage, so Remove[garbage] doesn't cause the allocated memory to get recovered. Try using $HistoryLength=0 and repeat your experiment. Shadowray's comment is another way to avoid sticking the contents of garbage into Out. – Carl Woll May 05 '17 at 15:44
  • It still doesn't add up. https://i.stack.imgur.com/UFt7u.png – webcpu May 05 '17 at 16:14
  • @Shadowray 25.5 ms is too much. – webcpu May 05 '17 at 18:04
  • I finally found time to test this out. The answer looked convincing at first, but now that I'm actually trying this out I'm not getting such nice results. I'll update my question; the details are too long for a comment. – LLlAMnYP May 17 '17 at 06:17
  • Ok, turns out I don't have meaningful data to add, but here's this: the first two cells in your answer aren't quite representative as they also clock the time spent on generating the array. I tried a With[{arr=RandomReal...},AbsoluteTiming[Total[arr]]] as well as Total[garbage=arr] and saw no noticeable difference (~<1%). 0.165s for length 3*10^8 array. I am also concerned that AbsoluteTiming[Clear[garbage];Remove[garbage];] takes 0.159s wall clock time, but Timing the same takes 0. This is odd, in line with my suspicion that freeing memory is $O(1)$. Thoughts? – LLlAMnYP May 17 '17 at 06:29
  • Finally, your test with the Table isn't right: note that you're clearing the previous downvalues of pick and unitize and then saving a larger dataset. To give your answer credit though, I tried having (pick[data, unitize@data[[All, -1]], 1]; DownValues@pick = DownValues@unitize = {};) // AbsoluteTiming // First, this gave me almost perfect matching of the two timings. – LLlAMnYP May 17 '17 at 06:34
  • @LLlAMnYP I have several questions: Have you set $HistoryLength=0 in your tests? Could you provide exact code and the result you get? Which OS and MMA Version do you use? – Ray Shadow May 17 '17 at 07:34
  • @Shadowray I'm building a paste-bin right now. I realized, that my With constructs prevent garbage collection, but I'm severely bothered that the generation of an array may considerably distort timings. – LLlAMnYP May 17 '17 at 08:11
  • Table[Clear[pick, unitize, data]; unitize[x_] := unitize[x] = Unitize[x]; pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt]; data = RandomChoice[Range[0, 10], {i*10^7, 3}]; {Pick[data, Unitize@data[[All, -1]], 1]; // AbsoluteTiming // First, (pick[data, unitize@data[[All, -1]], 1]; DownValues@pick = DownValues@unitize = {};) // AbsoluteTiming // First}, {i, 5}] forget it, all my tests are no better than yours, but this one shows that you're right. Discrepancy between Timing@*Remove and AbsoluteTiming@*Remove really bothers me (cont'd) – LLlAMnYP May 17 '17 at 08:18
  • (cont'd) since freeing memory should be fast in any case and the zero result of Timing sort of hints at that. I would like to time operation on an array plus garbage collection, but not the generation of said array, but see no viable way to do so. Since the block of code in my previous comment seems to confirm your hypothesis, I'm accepting this answer. – LLlAMnYP May 17 '17 at 08:21
  • @LLlAMnYP Timing may not include time for system calls like free. See this answer. – Ray Shadow May 17 '17 at 08:23

You are absolutely correct that this memoization is completely unnecessary.

What seems to happen is that from the second run onwards on the same data, the built-in functions become faster. I do not understand why (perhaps some internal caching), but it does show that the speed-up has absolutely nothing to do with the memoization:

In[38]:= AbsoluteTiming[Pick[data, Unitize@data[[All, n]], 1] // Length]
AbsoluteTiming[Pick[data, Unitize@data[[All, n]], 1] // Length]
AbsoluteTiming[Pick[data, Unitize@data[[All, n]], 1] // Length]

Out[38]= {10.6117, 90909421}

Out[39]= {8.08706, 90909421}

Out[40]= {7.96311, 90909421}

Another comment:

RandomSeed is an option, not a function. It should be SeedRandom, otherwise it does nothing.
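
A quick sanity check, assuming a fresh kernel, that only SeedRandom actually makes the draws reproducible:

RandomSeed[1]; a = RandomInteger[10, 5];
RandomSeed[1]; b = RandomInteger[10, 5];
a === b   (* almost certainly False: nothing was seeded *)

SeedRandom[1]; a = RandomInteger[10, 5];
SeedRandom[1]; b = RandomInteger[10, 5];
a === b   (* True *)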

Szabolcs
  • @Shadowray and Szabolcs, I did suspect caching, but couldn't "fix" this behavior with a ClearSystemCache[]. Even though pick calls Pick internally, the order of operations does not seem to change much: please see my update. +1 for the above example, but no accept :) – LLlAMnYP May 05 '17 at 08:20
  • Yeah, RandomSeed is my mistake. – webcpu May 05 '17 at 08:48

I can't reproduce the claimed speed-up on "11.0.1 for Linux x86 (64-bit) (September 21, 2016)".

In my tests, custom function wrappers without memoization (as suggested by Szabolcs) consistently add an overhead of about 1 µs, and functions with memoization add a 2-3 µs overhead, compared to the built-ins. This overhead is measurable only for small lists; for larger lists it's completely negligible.
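
A rough way to see that per-call overhead on a tiny list, using RepeatedTiming to average many runs (a sketch with the non-memoized wrappers; exact numbers will vary by machine):

unitize[x_] := Unitize[x];
pick[xs_, sel_, patt_] := Pick[xs, sel, patt];

small = RandomInteger[{0, 10}, 10];
RepeatedTiming[Pick[small, Unitize@small, 1]] // First
RepeatedTiming[pick[small, unitize@small, 1]] // First  (* typically on the order of 1 µs slower *)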

An important point is that the results of AbsoluteTiming are quite volatile: for larger lists the median deviation from the minimal value ranges from a few percent to about ten percent. I'm sure there are better ways to measure this volatility; I used the median deviation just to have some estimate.

Code used for timings:

$HistoryLength = 0;

minDev // ClearAll
minDev // Attributes = HoldFirst;
minDev[expr_, n_Integer?Positive] := Module[{res, min},
  res = Table[expr, n];
  min = Min@res;
  {min, Median[res - min]}
]

testBuiltin@data_ := (
  ClearSystemCache[];
  Pick[data, Unitize@data, 1] // AbsoluteTiming // First
)
testCustom@data_ := (
  ClearSystemCache[];
  ClearAll[unitize, pick];
  unitize[x_] := Unitize[x];
  pick[xs_, sel_, patt_] := Pick[xs, sel, patt];
  pick[data, unitize@data, 1] // AbsoluteTiming // First
)
testMemo@data_ := (
  ClearSystemCache[];
  ClearAll[unitize, pick];
  unitize[x_] := unitize[x] = Unitize[x];
  pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt];
  pick[data, unitize@data, 1] // AbsoluteTiming // First
)

testAll[k_, n_] := 
  With[{data = (SeedRandom[1]; RandomChoice[Range[0, 10], k])},
    minDev[#@data, n] & /@ {testMemo, testCustom, testBuiltin}
  ]
format = TableForm@Map[
  NumberForm[#, ExponentFunction -> (Null &)] &, 
  SetAccuracy[#, Min[Accuracy@SetPrecision[Min@#, 2], 7]],
  {-1}
] &;

The first argument of testAll is the size of the data, the second is the number of repeated timings. The first column of each result is the minimal absolute timing, the second is the median deviation from this minimal value. The first row is for the memoized custom pick and unitize, the second row is for the non-memoized custom functions, and the third row is for the built-in Pick and Unitize.

testAll[10^1, 10^5]//format
(* 0.000004 0.*10^(-7)
   0.000002 0.000001
   0.000001 0.000001 *)

testAll[10^2, 10^5]//format
(* 0.000005 0.000001
   0.000003 0.000001
   0.000002 0.000001 *)

testAll[10^3, 10^5]//format
(* 0.000015 0.000001
   0.000014 0.*10^(-7)
   0.000013 0.000001 *)

testAll[10^4, 10^5]//format
(* 0.000124 0.000002
   0.000122 0.000002
   0.000121 0.000001 *)

testAll[10^5, 10^4]//format
(* 0.001297 0.000093
   0.001296 0.000069
   0.001295 0.000103 *)

testAll[10^6, 10^3]//format
(* 0.0201   0.0014
   0.0201   0.0011
   0.0201   0.0012 *)

testAll[10^7, 10^2]//format
(* 0.2004   0.0148
   0.2003   0.0099
   0.2004   0.0088 *)

testAll[5 10^7, 2 10^1]//format
(* 0.972    0.021
   0.974    0.017
   0.973    0.022 *)

Fresh kernel

To make sure that we're not using any of Mathematica's internal caches that might not be cleared by ClearSystemCache, we can launch a separate kernel for each test using:

freshKernelEvaluate // ClearAll
freshKernelEvaluate // Attributes = HoldAll;
freshKernelEvaluate@expr_ := Module[{link, result},
  link = LinkLaunch[First@$CommandLine <> " -mathlink -noprompt"];
  LinkWrite[link, Unevaluated@EvaluatePacket@expr];
  result = LinkRead@link;
  LinkClose@link;
  Replace[result, ReturnPacket@x_ :> x]
]

Timings of built-ins:

resBuiltin = Table[
  freshKernelEvaluate[
    SeedRandom@1;
    data = RandomChoice[Range[0, 10], 5 10^7];
    Pick[data, Unitize@data, 1] // AbsoluteTiming // First
  ],
  100
]

{1.28392, 1.23527, 1.25863, 1.23625, 1.33601, 1.24361, 1.26809, 1.23502, 1.34473, 1.23813, 1.24654, 1.23617, 1.27127, 1.25661, 1.22674, 1.58978, 1.26939, 1.37024, 1.24581, 1.54075, 1.23516, 1.23805, 1.3053, 1.40044, 1.42726, 1.39822, 1.46109, 1.27038, 1.39617, 1.2588, 1.29047, 1.23082, 1.25069, 1.34985, 1.27281, 1.24016, 1.2642, 1.2511, 1.23745, 1.27978, 1.24066, 1.38282, 1.32234, 1.30623, 1.26118, 1.58021, 1.27522, 1.24706, 1.27051, 1.2493, 1.24819, 1.28184, 1.46254, 1.24269, 1.26356, 1.24011, 1.35468, 1.27491, 1.35288, 1.24462, 1.27119, 1.26811, 1.23685, 1.33249, 1.23138, 1.29139, 1.23725, 1.28638, 1.23906, 1.27579, 1.3872, 1.31602, 1.29556, 1.26464, 1.27076, 1.24602, 1.25735, 1.24667, 1.27297, 1.23757, 1.34311, 1.26616, 1.35083, 1.24861, 1.23788, 1.25357, 1.24262, 1.28117, 1.25753, 1.28231, 1.23406, 1.27971, 1.22885, 1.27199, 1.24191, 1.23346, 1.26387, 1.24803, 1.27653, 1.23953}

Timings of memoized custom functions:

resMemo = Table[
  freshKernelEvaluate[
    SeedRandom@1;
    data = RandomChoice[Range[0, 10], 5 10^7];
    unitize[x_] := unitize[x] = Unitize[x];
    pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt];
    pick[data, unitize@data, 1] // AbsoluteTiming // First
  ],
  100
]

{1.35284, 1.23307, 1.27167, 1.23678, 1.27437, 1.25009, 1.27847, 1.2418, 1.23227, 1.39655, 1.26371, 1.26179, 1.27424, 1.27965, 1.236, 1.28489, 1.25988, 1.26318, 1.24007, 1.24381, 1.2672, 1.25462, 1.26703, 1.24123, 1.28868, 1.24192, 1.27177, 1.23488, 1.23468, 1.27525, 1.26571, 1.27287, 1.23757, 1.26981, 1.25737, 1.2729, 1.23705, 1.24429, 1.26927, 1.23292, 1.28266, 1.23352, 1.28423, 1.23743, 1.26883, 1.23515, 1.27272, 1.25892, 1.23213, 1.23746, 1.3435, 1.27545, 1.23472, 1.49113, 1.42916, 1.56421, 1.5238, 1.37695, 1.27734, 1.23146, 1.2388, 1.24054, 1.27661, 1.23467, 1.43818, 1.51605, 1.28172, 1.24674, 1.34043, 1.36447, 1.28034, 1.23788, 1.3027, 1.25299, 1.26136, 1.24514, 1.23405, 1.26157, 1.24994, 1.27737, 1.23637, 1.26785, 1.411, 1.24163, 1.2301, 1.29223, 1.25492, 1.25177, 1.26862, 1.25825, 1.23715, 1.25327, 1.2694, 1.6624, 1.24317, 1.26682, 1.27915, 1.25705, 1.23258, 1.25804}

I don't see any consistent difference; the distribution of results looks similar:

res = <|"Built-in" -> resBuiltin, "Memo" -> resMemo|>;
ListPlot[res, PlotRange -> All]
Histogram[res, PlotRange -> All, ChartStyle -> {Blue, Orange}, ChartLegends -> Automatic]

[ListPlot and Histogram of the timings for the data vector]

Edit

UnchartedWorks writes in a comment that unitize alone is enough to show the difference in speed and that pick is not necessary. Carl Woll points out that a data matrix should be used instead of a data vector.

After making these changes I do see a difference between the memoized and built-in versions. The memoized function is consistently slower than the built-in one.

Used test functions:

testBuiltin = freshKernelEvaluate[
  $HistoryLength = 0;
  SeedRandom@1;
  data = RandomChoice[Range[0, 10], {#, 3}];
  Pick[data, Unitize@data[[All, -1]], 1] // AbsoluteTiming // First
] &;
testCustom = freshKernelEvaluate[
  $HistoryLength = 0;
  SeedRandom@1;
  data = RandomChoice[Range[0, 10], {#, 3}];
  unitize[x_] := Unitize[x];
  Pick[data, unitize@data[[All, -1]], 1] // AbsoluteTiming // First
] &;
testMemo = freshKernelEvaluate[
  $HistoryLength = 0;
  SeedRandom@1;
  data = RandomChoice[Range[0, 10], {#, 3}];
  unitize[x_] := unitize[x] = Unitize[x];
  Pick[data, unitize@data[[All, -1]], 1] // AbsoluteTiming // First
] &;

Timings:

SetDirectory@NotebookDirectory[];
s = OpenWrite@"results.dat";
k = 3 10^7;
Do[
  Write[s, 
    If[OddQ@i,
      {testBuiltin@k, testCustom@k, testMemo@k}, 
      Reverse@{testMemo@k, testCustom@k, testBuiltin@k}
    ]
  ],
  {i, 10^2}
] // AbsoluteTiming
file = Close@s;
(* {898.653, Null} *)

Result analysis:

results = AssociationThread[{"Built-in", "Custom", "Memo"} -> Transpose@ReadList@file]
colors = {Blue, Orange, Darker@Green};
ListPlot[results, PlotRange -> All, PlotStyle -> colors]
Histogram[results, PlotRange -> All, ChartStyle -> colors, ChartLegends -> Automatic]

<|"Built-in" -> {1.22985, 1.22461, 1.23061, 1.23184, 1.2402, 1.22937, 1.25221, 1.21342, 1.23612, 1.22765, 1.23061, 1.23409, 1.25464, 1.21786, 1.23144, 1.24461, 1.24803, 1.24498, 1.24818, 1.23294, 1.2348, 1.51256, 1.51016, 1.46498, 1.48277, 1.49113, 1.38432, 1.23417, 1.23139, 1.23475, 1.23356, 1.22846, 1.23629, 1.25202, 1.23593, 1.24975, 1.22473, 1.23137, 1.2266, 1.25627, 1.21828, 1.2525, 1.23725, 1.24693, 1.24163, 1.23324, 1.28597, 1.23083, 1.22618, 1.23927, 1.22844, 1.23095, 1.21823, 1.23546, 1.23057, 1.22338, 1.22514, 1.23199, 1.23086, 1.21832, 1.22947, 1.22668, 1.2302, 1.24527, 1.23862, 1.48311, 1.48445, 1.47365, 1.24457, 1.25607, 1.26731, 1.22819, 1.23567, 1.23589, 1.27261, 1.22645, 1.22554, 1.23832, 1.22731, 1.2334, 1.25166, 1.26591, 1.22114, 1.24653, 1.22359, 1.22788, 1.22567, 1.25535, 1.23223, 1.24091, 1.24912, 1.23169, 1.23663, 1.23177, 1.2278, 1.55135, 1.4796, 1.49146, 1.49611, 1.23101}, "Custom" -> {1.23652, 1.23587, 1.23412, 1.22896, 1.22707, 1.23646, 1.25783, 1.26341, 1.24158, 1.22581, 1.22999, 1.24083, 1.23376, 1.23851, 1.24782, 1.22384, 1.2431, 1.23661, 1.23801, 1.24318, 1.23982, 1.53433, 1.48343, 1.54463, 1.48097, 1.47601, 1.23676, 1.24323, 1.2311, 1.22642, 1.23351, 1.23296, 1.23254, 1.23407, 1.23169, 1.24395, 1.24042, 1.24769, 1.23167, 1.21756, 1.2301, 1.23421, 1.24282, 1.23704, 1.23525, 1.2351, 1.25029, 1.23524, 1.22839, 1.22839, 1.23667, 1.26583, 1.22544, 1.22955, 1.22292, 1.22819, 1.27443, 1.24958, 1.24789, 1.22195, 1.21883, 1.22279, 1.21813, 1.22052, 1.23921, 1.5044, 1.49484, 1.50915, 1.23095, 1.23694, 1.22373, 1.24806, 1.22945, 1.24085, 1.23373, 1.22282, 1.2362, 1.23099, 1.23932, 1.24258, 1.25047, 1.26868, 1.23042, 1.22579, 1.2229, 1.23243, 1.2368, 1.22925, 1.2387, 1.23014, 1.21772, 1.2259, 1.22549, 1.23208, 1.26501, 1.33781, 1.48822, 1.48658, 1.25979, 1.26228}, "Memo" -> {1.29497, 1.29798, 1.29918, 1.29907, 1.31014, 1.29503, 1.29095, 1.3249, 1.29036, 1.30051, 1.2789, 1.29959, 1.2988, 1.2882, 1.29519, 1.28946, 1.31952, 1.32948, 1.32447, 1.29627, 1.31841, 1.5721, 1.57097, 1.55392, 1.56358, 1.55974, 1.28744, 1.3029, 1.28567, 1.2914, 1.29167, 1.29062, 1.29471, 1.29797, 1.30193, 1.30423, 1.30097, 1.29706, 1.29027, 1.29005, 1.29543, 1.2929, 1.29996, 1.29386, 1.29502, 1.31621, 1.31506, 1.29105, 1.30462, 1.28348, 1.30922, 1.28715, 1.30386, 1.29361, 1.29596, 1.30149, 1.28943, 1.29833, 1.31909, 1.2911, 1.31163, 1.28986, 1.29063, 1.28847, 1.29451, 1.46695, 1.55118, 1.55433, 1.29779, 1.29201, 1.29947, 1.29045, 1.28494, 1.29003, 1.29385, 1.2856, 1.31603, 1.33432, 1.28929, 1.29873, 1.29259, 1.28694, 1.28868, 1.28838, 1.29824, 1.29435, 1.29401, 1.30137, 1.2971, 1.29248, 1.29333, 1.2847, 1.28666, 1.28647, 1.29923, 1.30116, 1.56112, 1.56282, 1.29155, 1.2936}|>

[ListPlot and Histogram of the timings for the data matrix]

jkuczm
  • testMemo@data_ := (ClearSystemCache[]; ClearAll[unitize, pick]; unitize[x_] := unitize[x] = Unitize[x]; pick[xs_, sel_, patt_] := pick[xs] = Pick[xs, sel, patt]; Pick[data, unitize@data, 1] // AbsoluteTiming // First) – webcpu May 05 '17 at 18:58
  • If you only run Pick once and then Clear all of definitions, pick is not necessary, and you can use Pick instead. You will see that testMemo is the fastest. – webcpu May 05 '17 at 19:01
  • @jkuczm Could you, please, measure this time $HistoryLength = 0; garbage = ConstantArray[0, 10^8]; AbsoluteTiming[Remove[garbage]] on your system? – Ray Shadow May 06 '17 at 13:55
  • @Shadowray I get {0.033196, Null}. – jkuczm May 06 '17 at 14:13
  • As I mentioned before, there is a problem in testMemo. The last statement in testMemo should be Pick[data, unitize@data, 1] // AbsoluteTiming // First, not pick[data, unitize@data, 1] // AbsoluteTiming // First. – webcpu May 06 '17 at 14:28
  • You are using a vector RandomChoice[Range[0, 10], k] as data, while everybody else is using a matrix, e.g., RandomChoice[Range[0, 10], {k, 3}] as data. – Carl Woll May 06 '17 at 15:41
  • @CarlWoll Thanks, this seems to be crucial. – jkuczm May 06 '17 at 22:06

What if we let pick and unitize run before Pick and Unitize? pick is still faster than Pick. In:

Clear[unitize, pick, n, data]
SeedRandom[1];
n = -1;
data = RandomChoice[Range[0, 10], {10^8, 6}];

unitize[x_] := unitize[x] = Unitize[x];
pick[xs_, sel_, patt_] := pick[xs, sel, patt] = Pick[xs, sel, patt]
AbsoluteTiming[
 pick[data, unitize@data[[All, n]], 1] // Length]
AbsoluteTiming[Pick[data, Unitize@data[[All, n]], 1] // Length]

Out:

{4.71476, 90911166}
{5.14919, 90911166}

Memoization

This technique is memoization. I learned it from Haskell.

Memoization in GHC's interactive environment (GHC = Glasgow Haskell Compiler): [screenshot]

The first execution of add1 x takes 0.73 seconds and uses 676 MB of memory; the second execution of add1 x takes 0.08 seconds and uses only 86 KB.

If you use this technique wisely, it can save a lot of CPU time and memory. In worse scenarios, it might only save CPU time while using too much memory.
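
In the Wolfram Language the same pattern only pays off when the very same call is repeated; here is a minimal illustrative sketch (not from the original answer):

slow[x_] := slow[x] = (Pause[1]; x^2);  (* stores each result as a DownValue *)

First@AbsoluteTiming[slow[3]]  (* about 1 s: computed, then cached *)
First@AbsoluteTiming[slow[3]]  (* about 0 s: looked up from the cached DownValue *)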

webcpu
  • I appreciate your joining this thread, but this really doesn't answer the "why". I'd rather like to know where you learned of this technique. – LLlAMnYP May 05 '17 at 09:26
  • @LLlAMnYP I have to add a screenshot and explain it briefly, So I answered your question in my answer. – webcpu May 05 '17 at 11:57
  • "This technical is Memoization. I learned it from Haskell." There is obviously no memoization happening due to your code, because it is only run once. – Szabolcs May 05 '17 at 12:08
  • @Szabolcs I'm not a Haskell user, but I guess there is. Correct me if I'm wrong, but let x = (sum [1..2^22]) stores that basically as an expression, but as soon as add1 x is run, the machine is forced to actually go and calculate this sum. After this it now remembers that x = 8796095119360. However we know that MMA does not work like that. – LLlAMnYP May 05 '17 at 12:11
  • @LLlAMnYP I was talking about the Mathematica code, not the Haskell screenshot, which is irrelevant here. The answer to your question about why it is faster seemed to be "it's just memoization". This is not correct. – Szabolcs May 05 '17 at 12:13
  • Anybody dare to try Trace? Maybe on a small sample?? – Jack LaVigne May 05 '17 at 12:56