2

I have these functions that are used to defineMyFun below

F[x_, y_] = {{0, (17 I)/10, 
   2  Cos[1/2  (x - Sqrt[3]  y)], (0.2` + 
      0.34641016151377546`  I)  Cos[1/2  (x - Sqrt[3]  y)], 
   2  Cos[1/2  (x + Sqrt[3]  y)], (-0.2` + 
      0.34641016151377546`  I)  Cos[1/2  (x + Sqrt[3]  y)]}, {-((
    17 I)/10), 
   0, (-0.2` + 0.34641016151377546`  I)  Cos[1/2  (x - Sqrt[3]  y)], 
   2  Cos[1/2  (x - Sqrt[3]  y)], (0.2` + 
      0.34641016151377546`  I)  Cos[1/2  (x + Sqrt[3]  y)], 
   2  Cos[1/2  (x + Sqrt[3]  y)]}, {2  Cos[
     1/2  (x - Sqrt[3]  y)], (-0.2` - 0.34641016151377546`  I)  Cos[
     1/2  (x - Sqrt[3]  y)], 0, -1.4722431864335457` - 0.85`  I, 
   2  Cos[x], -((2 Cos[x])/
    5)}, {(0.2` - 0.34641016151377546`  I)  Cos[
     1/2  (x - Sqrt[3]  y)], 
   2  Cos[1/2  (x - Sqrt[3]  y)], -1.4722431864335457` + 0.85`  I, 
   0, (2 Cos[x])/5, 
   2  Cos[x]}, {2  Cos[
     1/2  (x + Sqrt[3]  y)], (0.2` - 0.34641016151377546`  I)  Cos[
     1/2  (x + Sqrt[3]  y)], 2  Cos[x], (2 Cos[x])/5, 0, 
   1.4722431864335457` - 
    0.85`  I}, {(-0.2` - 0.34641016151377546`  I)  Cos[
     1/2  (x + Sqrt[3]  y)], 
   2  Cos[1/2  (x + Sqrt[3]  y)], -((2 Cos[x])/5), 2  Cos[x], 
   1.4722431864335457` + 0.85`  I, 0}}
f1[x_, y_] = D[F[x1, y1], x1] /. { x1 -> x, y1 -> y};
f2[x_, y_] = D[F[x1, y1], y1] /. { x1 -> x, y1 -> y};
tft[r_, er_] = If[r <= er, 1, 0];     

and this is MyFun

MyFun[x_, y_, z_] := Block[{val, fun, order, reslt},

{val, fun} = Eigensystem[F[x, y]]; order = Ordering[val]; val = val[[order]]; fun = fun[[order]];

reslt = (Sum[ If[in == jn, 0, (tft[val[[jn]], z] - tft[val[[in]], z]) ((( Conjugate[fun[[jn]]] . f1[x, y] . fun[[in]] )*(Conjugate[fun[[in]]] . f2[x, y] . fun[[jn]])))], {in, 6}, {jn, 6}]); Im@reslt ]

with this data

data = Flatten[Table[{x, y}, {x, -2, 2, 0.5}, {y, -2, 2, 0.5}], 1];   

I evaluate

Table[ParallelSum[
    1/Length@data   MyFun[data[[i, 1]], data[[i, 2]], z], {i, 
     Length@data}], {z, -1, 1, 0.1}]; // AbsoluteTiming
{4.16042, Null}   

with Parall on Table, it is faster

ParallelTable[
   Sum[1/Length@data   MyFun[data[[i, 1]], data[[i, 2]], z], {i, 
     Length@data}], {z, -1, 1, 0.1}]; // AbsoluteTiming
{2.66483, Null}       

and finally with Compile but almost the same speed

compilMyFun = 
 Compile[{{x, _Real}, {y, _Real}, {z, _Real}}, MyFun[x, y, z], 
  CompilationTarget -> "WVM", RuntimeOptions -> "Speed"]    

which gives

ParallelTable[
   Sum[compilMyFun[data[[i, 1]], data[[i, 2]], z], {i, 
     Length@data}], {z, -1, 1, 0.1}]; // AbsoluteTiming
{2.71059, Null}

I was wondering if speed can be boosted more because my real data length is about 40000.

Henrik Schumacher
  • 106,770
  • 7
  • 179
  • 309
MMA13
  • 4,664
  • 3
  • 15
  • 21
  • 1
    MyFun contains Eigensystem and Ordering which are already compiled. That's why you should not expect any improvement here. – Henrik Schumacher Mar 04 '24 at 16:04
  • @HenrikSchumacher, is there another way to boost speed? – MMA13 Mar 04 '24 at 16:05
  • Maybe you can find a formulation of the result for MyFun that does not require the eigendecomposition? – Henrik Schumacher Mar 04 '24 at 16:05
  • If that's not possible, I would probably try to call SelfAdjointEigenSolver of the Eigen package via LibraryLink. The point about using Eigen is that it might have a more performant implementation for small matrices of fixed size. It is a C++ library, so this requires some C++ experimence and some fiddling. – Henrik Schumacher Mar 04 '24 at 16:08
  • Alternatively, one can of course also try to call the LAPACK routine zheev directly. (One should assume that this is what Mathematica does, but I am not sure.) – Henrik Schumacher Mar 04 '24 at 16:10
  • 1
    Just as a reminder: Compile`CompilerFunctions[] gives a list of functions for which Compile has maybe some benefit. – Rolf Mertig Mar 04 '24 at 21:54
  • Have you read https://mathematica.stackexchange.com/a/104031/1871 ? – xzczd Mar 05 '24 at 08:05
  • 1
    @HenrikSchumacher OP's function involves heavy low-level calculation, so Compile still helps quite a bit. See my answer below. – xzczd Mar 05 '24 at 09:18

2 Answers2

7

Well, it's a bit sad to see that OP seems to have learned nothing from my answer to his previous question. Please remember Compile is advanced tool and it's not something that'll have effect if you just use it blindly. Please consider starting from here if you really want to learn its usage, and please don't miss those links in my previous answer.

Anyway, let me do a favor. The keypoint is to take the uncompilable Eigensystem out. Function and variable definitions that are same as yours are omitted in this answer.

ref = ParallelTable[
       Sum[1/Length@data MyFun[data[[i, 1]], data[[i, 2]], z], {i, 
           Length@data}], {z, -1, 1, 0.1}]; // AbsoluteTiming
(* {2.96491, Null} *)

help = func |-> Unevaluated@ Compile[{x, y}, func[x, y], RuntimeOptions -> "Speed", CompilationTarget -> "C"] /. DownValues@func;

{cf1, cf2, cF, ctft} = help /@ {f1, f2, F, tft};

MyFun[x_, y_, z_, val_, fun_] := Im@Sum[If[in == jn, 0., (tft[val[[jn]], z] - tft[val[[in]], z])(Conjugate[fun[[jn]]] . f1[x, y] . fun[[in]] Conjugate[fun[[in]]] . f2[x, y] . fun[[jn]])], {in, 6}, {jn, 6}]

cMyFun = Hold@Compile[{x, y, z, {val, _Real, 1}, {fun, _Complex, 2}}, MyFun[x, y, z, val, fun], RuntimeOptions -> "Speed", CompilationTarget -> "C"] /. DownValues@MyFun /. {f1 -> cf1, f2 -> cf2, tft -> ctft} /. Part -> Compile`GetElement // ReleaseHold;

eigen[x_, y_] := Module[{val, fun, order}, {val, fun} = Eigensystem[cF[x, y]]; order = Ordering[val]; {val[[order]], fun[[order]]}]

eigenlst = eigen @@@ data // Transpose; // AbsoluteTiming (* {0.0050079, Null} *)

cf = Hold@Compile[{{data, _Real, 2}, {val, _Real, 2}, {fun, _Complex, 3}}, Table[Sum[MyFun[data[[i, 1]], data[[i, 2]], z, val[[i]], fun[[i]]]/ Length[data], {i, Length[data]}], {z, -1, 1, 0.1}], CompilationTarget -> "C", RuntimeOptions -> "Speed"] /. MyFun -> cMyFun /. Part -> Compile`GetElement // ReleaseHold;

tst = cf[data, eigenlst[[1]], eigenlst[[2]]]; // AbsoluteTiming (* {0.0585079, Null} *)

ref - tst // Abs // Max (* 2.7755610^-17 )

xzczd
  • 65,995
  • 9
  • 163
  • 468
2

Just launching the parallel Kernels can take a few seconds. Putting Once @ LaunchKernels[]; on top and just adding an N@ operation in your F[x_,y_] definition like F[x_, y_] = N@{ ... does speed things up a bit and on a 16-core Windows computer using Mathematica 14.0 I get a speed of half a minute for a data list of 6561 elements, so I think your problem is just solvable within minutes. If you really need more speed: Use Fortran !

Rolf Mertig
  • 17,172
  • 1
  • 45
  • 76