I have to do a lot of calculations that take a lot of time, and a plain Do loop is simply too slow. This is the first time I am using ParallelDo, and it is not running as fast as I expected. I suspect the fix is easy and has to do with how I define the variables (shared or not shared; I cannot work out how to do it right).
My code is the following:
(*Parameter that sets the size of the computation; it will be 15 in the final version*)
Dim = 8;
(*Matrices considered*)
eHe = RandomReal[{-1, 1}, {2^Dim, 2^Dim}];
eHe = eHe + ConjugateTranspose[eHe];
(*Some parameters*)
Nst = Log2[Length[eHe]]
SysDim = N[2^Nst];
(*Matrices for the scalar product*)
PauliString = {"Id", "X", "Y", "Z"};
S0 = SparseArray[{{1, 1} -> N[1], {2, 2} -> N[1]}];
S1 = SparseArray[{{1, 2} -> N[1], {2, 1} -> N[1]}];
S2 = SparseArray[{{1, 2} -> N[-I], {2, 1} -> N[I]}];
S3 = SparseArray[{{1, 1} -> N[1], {2, 2} -> N[-1]}];
SVec = {S0, S1, S2, S3};
(*Parameters for the ParallelDo*)
Soglia = 10^-10;
Hper = eHe;
resHper = {};
SetSharedVariable[resHper];
(*ParallelDo*)
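ParallelDo[
 (*Coefficient of Hper on the Pauli string labelled by el*)
 tmp = Tr[
    ConjugateTranspose[
      Apply[KroneckerProduct,
       Table[SVec[[el[[j]]]], {j, 1, Nst}]]].Hper]/SysDim;
 (*Store the coefficient and its labels only if it is above the threshold*)
 If[Abs[tmp] > Soglia,
  AppendTo[resHper,
   Flatten[
    Append[{tmp},
     Table[PauliString[[el[[j]]]], {j, 1, Nst}]
     ]
    ]
   ]
  ]
 , {el, Tuples[{1, 2, 3, 4}, Nst]}]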
Here, Hper and Apply[KroneckerProduct, Table[SVec[[el[[j]]]], {j, 1, Nst}]] are matrices, and Table[PauliString[[el[[j]]]], {j, 1, Nst}] is a list of strings. The ParallelDo loop does run, but it seems much slower than a normal Do loop...
I have spent a long time on this and still do not understand where the problem is. Any help would be greatly appreciated. Thanks!
UPDATE: The code above now seems to yield the correct result, but it is much slower than expected. I benchmarked it by replacing ParallelDo with Do, and the ParallelDo version takes up to 18 times longer than the Do version. I also checked the CPU, and it stays mostly idle while ParallelDo is running. I guess this is because Mathematica has trouble with the tmp variable inside the ParallelDo?
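One thing that might be worth testing is the shared variable rather than tmp: a variable declared with SetSharedVariable lives on the main kernel, so every AppendTo from a subkernel involves communication with it. The following is only a sketch of an alternative that returns each result from ParallelTable instead of appending to a shared list. It uses a small size (nst = 3) for illustration, and the names nst, h, sVec, names, threshold and coeff are hypothetical stand-ins for the variables in the code above, not a tested replacement.

```mathematica
(* Sketch under stated assumptions: small problem size, hypothetical names *)
nst = 3;
h = RandomReal[{-1, 1}, {2^nst, 2^nst}];
h = h + ConjugateTranspose[h];
sysDim = N[2^nst];
threshold = 10^-10;

names = {"Id", "X", "Y", "Z"};
sVec = {
   SparseArray[{{1, 1} -> 1., {2, 2} -> 1.}],
   SparseArray[{{1, 2} -> 1., {2, 1} -> 1.}],
   SparseArray[{{1, 2} -> -1. I, {2, 1} -> 1. I}],
   SparseArray[{{1, 1} -> 1., {2, 2} -> -1.}]};

(* coeff is a hypothetical helper: coefficient of h on the Pauli string el *)
coeff[el_] :=
  Tr[ConjugateTranspose[KroneckerProduct @@ sVec[[el]]] . h]/sysDim;

(* Send the definitions to the subkernels explicitly *)
DistributeDefinitions[h, sysDim, threshold, names, sVec, coeff];

(* Each iteration returns its own result; Nothing drops the small ones, *)
(* so no shared variable is needed *)
res = ParallelTable[
   With[{c = coeff[el]},
    If[Abs[c] > threshold, Prepend[names[[el]], c], Nothing]],
   {el, Tuples[Range[4], nst]}];
```

Each element of res has the same layout as the entries appended to resHper above: the coefficient followed by the Pauli labels. Whether this is actually faster depends on the problem size and the number of kernels, so it should be timed against the Do version.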
ParallelDo is using a bad choice by default, possibly an explicit setting will help. – Daniel Lichtblau Dec 06 '22 at 21:24
ParallelTable or similar (more generally: ParallelCombine), it is likely just not a good fit for parallelization. – Szabolcs Dec 07 '22 at 09:49
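Assuming the "explicit setting" in the first comment refers to the Method option of the parallel functions (the comment as shown does not say so), a minimal sketch of how such a setting is passed looks like this. The function f and the list args are hypothetical toy placeholders, not the computation from the question.

```mathematica
(* Toy workload; f and args are hypothetical placeholders *)
f[x_] := Total[Sin[Range[200.] x]];
DistributeDefinitions[f];
args = Range[2000];

(* "FinestGrained": one item at a time is handed to each kernel *)
fine = ParallelTable[f[x], {x, args}, Method -> "FinestGrained"];

(* "CoarsestGrained": the work is split into as many pieces as kernels *)
coarse = ParallelTable[f[x], {x, args}, Method -> "CoarsestGrained"];

(* Both settings should give the same values; only the scheduling differs *)
sameQ = (fine == coarse);
```

Wrapping each call in AbsoluteTiming would show which setting suits a given workload. For many cheap iterations, coarser batches usually mean less communication overhead, but that is worth measuring rather than assuming.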