
I have a task in which I have to combine pairs of elements with a function, but the number of pairs I actually need is much smaller than the number of all pairs. Whether a pair is kept depends on some function of both elements. Originally, I used something like:

Select[Flatten@Outer[function,array1,array2],condition]

However, this creates a huge computational and memory overhead: it generates the entire matrix (which quickly becomes bigger than RAM, and wastes time applying function to pairs I will just throw away), and then it also takes time to filter.

I was looking for a non-procedural way to do this, and it seems most builtins are oriented towards constructing everything and filtering later. Map is out of the question, as are Table and Array: even leaving Nulls in the array is an overhead, and using Sequence appeared worse. I ended up writing

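ApplyIf[f_, a_, b_, condition_] := 
  Module[{result = {}, n1 = Length[a], n2 = Length[b], i, j, r1, r2},
   (* loop bounds use <= so the last element of each list is included *)
   For[i = 1, i <= n1, i++,
    For[j = 1, j <= n2, j++,
     r1 = a[[i]];
     r2 = b[[j]];
     (* only apply f to pairs that satisfy the condition *)
     If[condition[r1, r2],
      AppendTo[result, f[r1, r2]]
      ]
     ]
    ];
   result
   ];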

which does solve the memory overhead problem, but being procedural, it's not in the spirit of Mathematica, and I'm also concerned about the time complexity of AppendTo, which copies the whole list on every call.

Is there a better way to do this?

Is Reap / Sow advisable in these situations?

orion
  • "the number of pairs I actually need is much smaller than the number of all pairs" -- have you thought about SparseArray to reduce memory usage? – Stitch Mar 10 '17 at 15:49
  • @Stitch I still have to go over all pairs to figure out which ones I need. – orion Mar 10 '17 at 15:50
  • Forget about For for good and use Sow/Reap with Do. – Szabolcs Mar 10 '17 at 15:54
  • I often hit the memory limits of my kernels, crashing them. An effective trick I use to avoid both the time complexity of AppendTo and procedural programming is to simply chunk my arrays. Alternatively, Reap and Sow will indeed circumvent the time complexity of AppendTo, and by using a Do loop instead of your For loops you can remove the overhead of all those For tests and increments. – b3m2a1 Mar 10 '17 at 15:56
  • Thank you, I quite forgot about Do, it also avoids a double loop, much cleaner. – orion Mar 10 '17 at 16:02
  • Alternatively, you could give us an idea of what condition is (i.e. explain what you are trying to do), and perhaps we could come up with a functional way of doing what you want to do. That said, the Sow-Reap construction is a good one, and I use it a lot. – march Mar 10 '17 at 16:23
  • I agree with @Szabolcs: For should be avoided at all costs. – Ali Hashmi Mar 10 '17 at 17:04
  • Not only does it avoid a double loop, it also avoids explicit indexing. Instead of Do[f[array[[i]]], {i, Length[array]}] you can use Do[f[elem], {elem, array}]. All this will effectively save you the Module, as you will not need local variables anymore. Reap@Do[If[condition[r1, r2], Sow@f[r1, r2]], {r1, a}, {r2, b}] – Szabolcs Mar 10 '17 at 18:41
  • @Szabolcs just write it as an answer for me to accept, it works beautifully. – orion Mar 10 '17 at 22:25
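
The Sow/Reap suggestion from the comments above can be wrapped into a small function. This is only a sketch: the name applyIf and the test inputs are made up for illustration, and f and cond stand for the function and condition from the question. It also unpacks the nested result that Reap returns and covers the case where no pair passes the condition.

```mathematica
(* applyIf is a hypothetical name chosen for this sketch *)
applyIf[f_, a_List, b_List, cond_] :=
 Module[{sown},
  (* Do iterates over the elements directly, so no index variables are needed; *)
  (* f is applied only to pairs for which cond gives True *)
  sown = Last@Reap[
     Do[If[TrueQ@cond[r1, r2], Sow[f[r1, r2]]], {r1, a}, {r2, b}]];
  (* Reap returns {result, {{sown values}}}, or {result, {}} if nothing was sown *)
  If[sown === {}, {}, First[sown]]
 ]

(* illustrative inputs: keep pairs whose product is 4 and add them *)
applyIf[Plus, Range[4], Range[4], #1 #2 == 4 &]
(* {5, 4, 5} *)
```

Only the pairs that pass the condition are ever stored, so memory use scales with the number of kept results rather than with Length[a] Length[b], although every pair is still visited once.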
