Efficient way to utilise Parallel features to make use of many cores

Question

Let's say I have 100 cores/kernels at my disposal and want to compute a function of two variables f[x,y] over {x,1,10}, {y,0,59}, so 600 data points in total. Ideally I would like to utilize all 100 cores, giving each core 6 data points to compute. How can I achieve this?

ParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 10 }]

Would only parallelize on the first 60 cores and give each core a workload of 10 points and

ParallelTable[f[x, y], {x, 1, 10 }, {y, 0, 59 }]

would do even worse, parallelizing on only the first 10 cores and giving each a workload of 60 points.

I think doing

ParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 3 }];
ParallelTable[f[x, y], {y, 0, 59 }, {x, 4, 6 }];
ParallelTable[f[x, y], {y, 0, 59 }, {x, 7, 10}];

would only evaluate the three calls sequentially, each starting once the previous ParallelTable had finished, so it would do no better?

Is there a way around this?

score 14 · Accepted Answer · edited Mar 06 '13 at 00:06


I usually work around this by first generating all argument combinations, then using ParallelMap:

ParallelMap[f, Tuples[{Range[0,59], Range[1,10]}]]

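You'll need to define your function so that it takes a single list argument, f[{x,y}], not f[x,y]. Note that each pair arrives in the order of the lists given to Tuples, so with the call above the first element is the value from Range[0,59].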

Often it is more practical to use the form

ParallelTable[{arg, f[arg]}, {arg, Tuples[{Range[0,59], Range[1,10]}]}]

as this will save both the result and the arguments in the output.

Note that this method gives you a flat 1D list, not a 2D one like a Table with two iterators would. If you do need a 2D table, use Partition or ArrayReshape (v9) on the result.
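The reshaping step can be sketched as follows. This is only an illustration: the definition of f here is a hypothetical stand-in for the real function, written to accept one {y, x} pair.

```mathematica
(* hypothetical stand-in for the real function; takes a single {y, x} pair *)
f[{y_, x_}] := x + 100 y

(* flat list of 600 results, one per argument pair *)
flat = ParallelMap[f, Tuples[{Range[0, 59], Range[1, 10]}]];

(* Tuples varies the last list fastest, so each block of 10 consecutive
   results holds the x = 1..10 values for one value of y *)
grid = Partition[flat, 10];

Dimensions[grid]  (* {60, 10} *)
```

The row length passed to Partition is the length of the last list given to Tuples, so it has to change if the iterator ranges change.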

edited Mar 06 '13 at 00:06 by Mr.Wizard

answered Mar 06 '13 at 00:03 by Szabolcs

Or you could use f @@ # &. (+1) – Mr.Wizard Mar 06 '13 at 00:06
+1 for anticipating and bettering my offer ;-) I literally didn't see this until you posted. Ah, well... something to which I can aspire. – Jagra Mar 06 '13 at 00:06
@Szabolcs thanks, this is going to save me a lot of time. – fpghost Mar 06 '13 at 00:22
@Jagra This happens all the time here on SE :) it happened to me countless times that just before I finished typing my answer I saw the "new answer" notification – Szabolcs Mar 06 '13 at 01:08
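Mr.Wizard's f @@ # & suggestion from the comments can be sketched like this. It lets f keep its two-argument form; the definition of f below is a hypothetical placeholder, and the lists are passed to Tuples with x first so that each pair is {x, y}.

```mathematica
(* hypothetical two-argument function, standing in for the real f *)
f[x_, y_] := x^2 + y

(* x is listed first, so each tuple is {x, y} and f @@ {x, y} gives f[x, y] *)
pairs = Tuples[{Range[1, 10], Range[0, 59]}];
results = ParallelMap[f @@ # &, pairs];

Length[results]  (* 600 *)
```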

score 3 · Answer 2 · answered Mar 05 '13 at 23:56


You want to use the EvaluationsPerKernel option for Method:

ParallelTable[f[x, y], {y, 0, 59}, {x, 1, 10},
              Method -> "EvaluationsPerKernel" -> 6]

To check and see if it's doing what you want, you can Tally the number of evaluations assigned to each kernel:

Tally@Flatten@ParallelTable[$KernelID, {y, 0, 59}, {x, 1, 10}, 
                            Method -> "EvaluationsPerKernel" -> 6]

answered Mar 05 '13 at 23:56 by Guillochon

This doesn't solve the OP's problem, which is that ParallelTable only parallelizes with respect to the 1st iterator. If your first iterator is {y,0,59}, it will never use more than 60 kernels, even if 100 are available. – Szabolcs Mar 06 '13 at 00:00
Yikes, I wasn't aware of that limitation. Seems like a really easy thing for them to fix. – Guillochon Mar 06 '13 at 00:06
No, it appears that it only parallelizes over the first index. See @Szabolcs's answer. – Guillochon Mar 06 '13 at 00:08
@Szabolcs Yep, that is the problem. Is there no way around this then? – fpghost Mar 06 '13 at 00:09
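The $KernelID check from this answer can also be applied to a flattened computation, to see how many of the 600 items each kernel receives. This is a rough sketch; the counts depend on how many kernels are running and on the scheduling method, so no particular output should be expected.

```mathematica
(* one entry per argument pair, recording which kernel evaluated it *)
ids = ParallelMap[$KernelID &, Tuples[{Range[0, 59], Range[1, 10]}]];

(* list of {kernel ID, number of items handled by that kernel} *)
Tally[ids]
```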

matheorem · Answer 3 · 2015-11-29T13:17:31.523

The ParallelMap method has an apparent downside: it loses the parameter names.

From this line of code

ParallelMap[f, Tuples[{Range[0,59], Range[1,10]}]]

you can't tell directly what the first and second Range mean.

With the Inactivate and Activate functions introduced in Mathematica 10, we can improve on this using the following code:

ParallelMap[Activate, Flatten@Table[Inactivate@f[x, y], {y, 0, 59}, {x, 1, 10}]]

I think this is clearer than Szabolcs's ParallelMap approach.

You could also define a function:

Clear[finestParallelTable]
SetAttributes[finestParallelTable,HoldAll]
finestParallelTable[expr_,parameter__]:=
ParallelMap[Activate,Flatten@Table[Inactivate@expr,parameter]]

Now

finestParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 10 }]

looks exactly like the original ParallelTable call.

Finally, the result is of course a flat list. You could reshape it afterwards, or build the reshaping into finestParallelTable itself.
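One possible way to build the reshaping in is sketched below. The name finestParallelTable2D is hypothetical, and the sketch assumes rectangular iterators (bounds that don't depend on each other), at least two of them, and an expression that does not itself return a list, since ArrayReshape would flatten list-valued results.

```mathematica
Clear[finestParallelTable2D]  (* hypothetical name, not a built-in *)
SetAttributes[finestParallelTable2D, HoldAll]
finestParallelTable2D[expr_, iters__] :=
 Module[{inactive, dims},
  (* nested table of inactive expressions, one per grid point *)
  inactive = Table[Inactivate[expr], iters];
  (* dimensions of the iterator grid only: one level per iterator *)
  dims = Dimensions[inactive, Length[Hold[iters]]];
  (* evaluate all grid points as one flat list, then restore the shape *)
  ArrayReshape[
   ParallelMap[Activate, Flatten[inactive, Length[dims] - 1]], dims]]

(* usage with a hypothetical test function *)
f[x_, y_] := x + 100 y;
DistributeDefinitions[f];  (* precaution: make sure the subkernels know f *)
Dimensions[finestParallelTable2D[f[x, y], {y, 0, 59}, {x, 1, 10}]]  (* {60, 10} *)
```

The explicit DistributeDefinitions call is a precaution rather than a confirmed requirement: f only appears inside inactive data here, so it is not obvious that automatic distribution would pick it up.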

Jagra · Answer 4 · 2013-03-06T00:12:27.803

Perhaps just an extended comment (with some speculation) rather than a specific answer, but I'll hazard a few lines.

I think you've made some assumptions about how ParallelTable works that may be an artifact of procedural coding. Mathematica, and specifically ParallelTable, might loop through nested loops, but it ought not to have to.

(Note @Szabolcs's comment above that ParallelTable only parallelizes "...with respect to the 1st iterator." I'd have guessed it was cleverer than that.)

As all of the applications of your function f are independent, I see no reason that Mathematica wouldn't employ all available kernels to calculate them. Positing equal time for each calculation, it could employ each of your 100 available kernels 6 times.

It's kind of interesting to think about how Mathematica might do this internally. Why wouldn't it create a matrix of all the paired x's and y's and then run f on them?

You might try creating such a matrix then just Map or ParallelMap f to the matrix.

You can monitor this to some degree from the Evaluation menu item. Click through to Parallel Status and observe as you run the calculation.

the matrix of pairs and ParallelMap might just be the way to go, thanks. — fpghost, Mar 06 '13 at 00:11
