
Here is a simplified example illustrating my issue:

G = GaussianMatrix[{1000, 1000}];

f[] := (
  n = RandomInteger[];
  n + Total[G, 2]
  )

ParallelTable[f[], {2}] (* two kernels *)

During the evaluation, twice as much memory is needed as for a plain Table: the variable G is duplicated in each subkernel, even though G is never modified.

How can I avoid this?
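The duplication can be observed directly by comparing subkernel memory before and after the data is distributed. A minimal sketch (exact numbers vary by version and platform):

```mathematica
LaunchKernels[2];
ParallelEvaluate[MemoryInUse[]]   (* baseline memory in each subkernel *)
DistributeDefinitions[G];         (* copies G into every subkernel *)
ParallelEvaluate[MemoryInUse[]]   (* grows by roughly ByteCount[G] in each *)
```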

asked by Leon
edited by m_goldberg
  • Welcome to Mathematica.SE! I suggest that: 1) You take the introductory Tour now! 2) When you see good questions and answers, vote them up by clicking the gray triangles, because the credibility of the system is based on the reputation gained by users sharing their knowledge. Also, please remember to accept the answer, if any, that solves your problem, by clicking the checkmark sign! 3) As you receive help, try to give it too, by answering questions in your area of expertise. – bbgodfrey Feb 12 '15 at 19:06
  • Can you explain in more detail why you need the same data to be transferred to each kernel? In this simplified example the Total could be computed on the main kernel only once, and only the result needs to be transferred. I'm sure that your real problem is not that simple, but can you explain in more detail what sort of computation you are doing and precisely what data it needs? – Szabolcs Feb 12 '15 at 19:18
  • I'm doing particle tracking on a given vector field G (30 GB in memory); each subkernel computes one particle path with a random initial position in this field G. I want to avoid G being copied, since that is not necessary. – Leon Feb 12 '15 at 19:24
  • @Leon When you respond to comments, please address people using the @ character, as I'm doing here. Otherwise they don't get notified about the response and may never return to check. – Szabolcs Feb 12 '15 at 19:58
  • @Szabolcs Thank you for this hint! – Leon Feb 12 '15 at 20:00
  • @Leon Mathematica's parallel tools are limited and it is not possible to have even read access to the same data structure without duplicating it across kernels. What you can do is create a function that retrieves only the information you need, and use SetSharedFunction on it. This function will always be evaluated on the main kernel. Calling it will be slow. However, you will avoid data duplication. If your algorithm is such that it doesn't spend most of its time in this function, this approach might work without a slowdown that is too significant. ... – Szabolcs Feb 12 '15 at 20:02
  • ... @Leon However if the function is called too often, then the parallel version will be slower than the serial one. – Szabolcs Feb 12 '15 at 20:02
  • @Szabolcs It is a real pity that Mathematica can't do parallel computing with large shared data. My rig has 16 cores with 128 GB of RAM, but I can't use that power. – Leon Feb 12 '15 at 20:10
  • @Leon Can you parallelize elsewhere? Or perhaps vectorize (operate on full arrays)? There's another sort of parallelization available for Compiled functions. It's much more limited but it also has much less overhead. It might be used to parallelize some simple basic steps of the algorithm. – Szabolcs Feb 12 '15 at 20:12
  • @Szabolcs thanks! I will try to parallelize the computation of a single particle path. – Leon Feb 12 '15 at 20:18
  • What is your goal: to simulate two particles, or many particles? If the latter, I recommend simulating them simultaneously. That allows you to use packed arrays and sometimes built-in threading (e.g. in matrix multiplication). – ybeltukov Feb 12 '15 at 20:49
  • @ybeltukov Hello, yes, I simulate thousands of particle paths, each with NDSolve, using the field G as input. – Leon Feb 12 '15 at 21:04
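Szabolcs's SetSharedFunction suggestion in the comments above can be sketched as follows; the accessor name getField is illustrative, and every call incurs main-kernel round-trip overhead:

```mathematica
getField[pos_] := Extract[G, pos]   (* always evaluated on the main kernel *)
SetSharedFunction[getField]

ParallelTable[
  getField[{i, i}],   (* subkernels fetch single values; G is never copied *)
  {i, 2}]
```

This trades per-call speed for memory: it only pays off if the algorithm spends most of its time outside getField.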
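The Compile-based parallelism Szabolcs mentions threads a compiled function over its arguments inside a single kernel, so no data is copied between kernels. A minimal sketch with an illustrative per-particle update rule:

```mathematica
step = Compile[{{x, _Real}},
   x^2 + 1.,   (* illustrative update; replace with the real per-particle step *)
   RuntimeAttributes -> {Listable}, Parallelization -> True];

step[RandomReal[1, 10^6]]   (* threaded over the million inputs, multiple cores *)
```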
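ybeltukov's suggestion amounts to advancing all particle positions at once as one packed array on the main kernel; built-in threading then uses multiple cores without duplicating G. A sketch with an illustrative Euler step and a stand-in velocity function (the real one would interpolate G at all positions simultaneously):

```mathematica
n = 10000;
pos = RandomReal[{0, 1}, {n, 3}];   (* packed array of all particle positions *)
dt = 0.01;
velocity[p_] := -p;                 (* illustrative stand-in for sampling G *)
Do[pos += dt velocity[pos], {100}]  (* one vectorized step advances every path *)
```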

0 Answers