Let's say I have 100 cores/kernels at my disposal and want to compute a function of two variables f[x,y] over {x,1,10}, {y,0,59}, so 600 data points in total. Ideally I would like to utilize all 100 cores, giving each core 6 data points to compute. How can I achieve this?
ParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 10 }]
would only parallelize over the first 60 cores, giving each core a workload of 10 points, and
ParallelTable[f[x, y], {x, 1, 10 }, {y, 0, 59 }]
would do even worse, parallelizing over only the first 10 cores and giving each a workload of 60 points.
I think doing
ParallelTable[f[x, y], {y, 0, 59 }, {x, 1, 3 }];
ParallelTable[f[x, y], {y, 0, 59 }, {x, 4, 6 }];
ParallelTable[f[x, y], {y, 0, 59 }, {x, 7, 10}];
would just run the three calls one after another, each starting only once the previous ParallelTable had finished, so it would do no better either.
Is there a way around this?
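One possible way around it, as a minimal sketch rather than a definitive answer: flatten the two iterators into a single list of 600 {x, y} pairs, so that the parallelization happens over all points instead of only over the outer iterator, and reshape the result afterwards. The definition of f below is a hypothetical stand-in for the real function, and the names pairs, flat and result are made up for illustration. Method -> "CoarsestGrained" asks for the work to be split into as many chunks as there are kernels, which with 100 kernels should come out to 6 points each, assuming every point costs about the same.

```wolfram
(* Hypothetical stand-in for the real function *)
f[x_, y_] := x^2 + y

(* All 600 {x, y} pairs in the same order as Table[..., {y, 0, 59}, {x, 1, 10}] *)
pairs = Flatten[Table[{x, y}, {y, 0, 59}, {x, 1, 10}], 1];

(* Parallelize over the flat list; one chunk per available kernel *)
flat = ParallelMap[f @@ # &, pairs, Method -> "CoarsestGrained"];

(* Restore the 60 x 10 shape of the original nested table *)
result = Partition[flat, 10];

(* Sanity check against the sequential version *)
result === Table[f[x, y], {y, 0, 59}, {x, 1, 10}]
```

If the cost of f varies a lot between points, a finer-grained Method setting may balance the load better, at the price of more communication overhead.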
f @@ # &. (+1) – Mr.Wizard Mar 06 '13 at 00:06