Can any one help me make my program work faster?

Question

In order to construct the matrix h1+h2, I have used three Table commands. I am using fxn[n1a_, n1b_, n2a_, n2b_] as a function in a loop. I need my program to be a lot faster than what it is now, since this is only a part of my main program. Could any one tell me how I can make it work faster? In other words, what are the points I should pay attention to in Mathematica, in case of timing?

Description:

In the original program I am using an iteration method in which I need to construct the matrix h1+h2, calculate the new eigenvectors in each step, reconstruct the matrix h1+h2, until I observe convergence in eigenenergies.

avec, is a guessed vector, and I am using it as an initial vector to construct the matrix h1+h2.

Later on, I need to calculate the eigenvectors of this matrix, choose the vectors related to the minimum eigenvalues and put them instead of the previous avec.

h1 is the kinetic term of my matrix and h2 is the exchange potential (the formula which I have used came from the math I did for Hartree-Fock approximation).

fxn is a function I constructed based on the pre-calculated integrals in my code (so that I do not need to calculate the integrals in each iteration).

The problem now is the calculation for h2, which consumes most of the time. I assume the problem is the way I am using the Table commands. But I couldn't think of a more efficient way.

  avec = {{1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0}, {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0}};
  nvec = {1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6};
  svec = {-1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1};
  kdfxn[i_, j_] := If[i == j, 1, 0]

  ne = 3;
  n\[Mu] = 12;
  \[Delta] = -50;
  \[Beta] = 1;

   fxn[n1a_, n1b_, n2a_, n2b_] := N[ Which[
   n1a == n1b && n2a == n2b,
   1/6 (1 - 3/(n1a^2 \[Pi]^2) - 3/(n2a^2 \[Pi]^2)),
   n1a == n1b && n2a != n2b,
   (4 (1 + (-1)^(n2a + n2b)) n2a n2b)/((n2a^2 - n2b^2)^2 \[Pi]^2),
   n1a != n1b && n2a == n2b,
   (4 (1 + (-1)^(n1a + n1b)) n1a n1b)/((n1a^2 - n1b^2)^2 \[Pi]^2),
   True, -((
    32 (-1 + (-1)^(n1a + n1b)) (-1 + (-1)^(
   n2a + n2b)) n1a n1b n2a n2b)/((n1a^2 - n1b^2)^2 (n2a^2 - 
   n2b^2)^2 \[Pi]^4))]]

hmat = Table[
n0 = nvec[[n\[Mu]0]];
n1 = nvec[[n\[Mu]1]];
h1 = (n0^2 \[Pi]^2 + \[Beta] svec[[n\[Mu]0]]) kdfxn[n\[Mu]0, 
  n\[Mu]1] ;
h2 = Total[Table[
  n2 = nvec[[n\[Mu]2]];
  n3 = nvec[[n\[Mu]3]];
  Table[
   temp1 = 
    fxn[n1, n0, n3, n2] avec[[j, n\[Mu]3]] avec[[j, n\[Mu]2]] ;
   temp2 = 
    fxn[n1, n3, n0, n2] avec[[j, n\[Mu]3]] avec[[j, n\[Mu]2]];
   temp = \[Delta] (temp1 - temp2), {j, 1, ne}],
  {n\[Mu]2, 1, n\[Mu]},
  {n\[Mu]3, 1, n\[Mu]}], Infinity];
  h1 + h2
 , {n\[Mu]0, 1, n\[Mu]}, {n\[Mu]1, 1, n\[Mu]}] // Timing

I want to use the code @Henrik Schumacher gave me to find my new avecs (which are coming from the eigenvectors of hcmat) to recalculate hcmat and iterate. The problem is I can do it manually by substituting new avec instead of the previous one, but as I start using table it doesn't work. I don't know why it's acting like that.

Table[chmat = 
    With[{ccfxn = cfxn}, 
    Compile[{{nm, _Integer}, {ne, _Integer}, {d, _Real}, {b, \
   _Real}, {avec, _Real, 2}, {nvec, _Real, 1}, {svec, _Real, 1}}, 
        Block[{n0, n1, n2, n3, h1, h2, tmp, tmp2}, 
        Table[n0 = Compile`GetElement[nvec, nm0];
        n1 = Compile`GetElement[nvec, nm1];
        tmp = 0.;
        Do[n2 = Compile`GetElement[nvec, nm2];
        n3 = Compile`GetElement[nvec, nm3];
        tmp2 = (ccfxn[n1, n0, n3, n2] - ccfxn[n1, n3, n0, n2]);

        Do[tmp += 
        tmp2 Compile`GetElement[avec, j, nm3] Compile`GetElement[
           avec, j, nm2], {j, 1, ne}], {nm2, 1, nm}, {nm3, 1, 
        nm}];
        d tmp + 
        If[nm0 == 
          nm1, (n0^2 Pi^2 + b Compile`GetElement[svec, nm0]), 
          0.], {nm0, 1, nm}, {nm1, 1, nm}]], 
       CompilationTarget -> "C", 
       CompilationOptions -> {"InlineCompiledFunctions" -> True}, 
       RuntimeOptions -> "Speed"]];

       hmat2 = chmat[n\[Mu], ne, \[Delta], \[Beta], avec, nvec, svec]; //
       AbsoluteTiming // First
            hmat = hmat2

       eout = Eigensystem[hmat];
       evals = eout[[1]];
       evecs = eout[[2]];
       sortedevals = Sort[evals];
       bvec = Chop[Table[bpos = Position[evals, sortedevals[[i]]][[1, 1]];
       evecs[[bpos]], {i, 1, ne}]];
       Table[If[Sign[bvec[[i, 1]]] == Sign[avec[[i, 1]]], avec = ( bvec) , 
       avec = (Sign[avec[[i, 1]]] bvec) ], {i, 1, ne}];
       normalizedavec = Table[avec[[i]]/norm[avec, i], {i, 1, ne}];
       avec = normalizedavec;
       en = Total[Take[sortedevals, ne]], {j, 1, 5}]

In the rest of the code, I am finding the first ne(3) minimum eigenenergies and finding the the related eigenvectors, to construct bvec. by doing some modification on bvec and normalizing it ,I could get the new avec I want to use instead of the previous one. Now my problem is doing the iteration. I would appreciate it if you could help me with this.

I want to calculate this integrals

 sol = Assuming[
 A \[Element] Reals && 0 <= s1 <= 1 && 1 <= n1a <= 10 && 
 1 <= n2a <= 10 && 1 <= n1b <= 10 && 1 <= n2b <= 10, 
 Expand@FullSimplify[
 Integrate[
  4 A Exp[-B (s2 - s1)] Sin[n1a \[Pi] s1] Sin[n1b \[Pi] s1] Sin[
    n2a \[Pi] s2] Sin[n2b \[Pi] s2], {s2, s1, 1}]]];

 sol1 = Assuming[
  A \[Element] Reals && 0 <= s2 <= 1 && 1 <= n1a <= 10 && 
  1 <= n2a <= 10 && 1 <= n1b <= 10 && 1 <= n2b <= 10, 
  Expand@FullSimplify[
  Integrate[
   4 A Exp[-B (s1 - s2)] Sin[n1a \[Pi] s1] Sin[n1b \[Pi] s1] Sin[
    n2a \[Pi] s2] Sin[n2b \[Pi] s2], {s1, s2, 1}]]];

(int[n1a_, n2a_, n1b_, n2b_] = 
 Integrate[sol, {s1, 0, 1}, 
  Assumptions -> 
  B \[Element] Reals && A \[Element] Reals && 1 <= n1a <= 10 && 
  1 <= n2a <= 10 && 1 <= n1b <= 10 && 1 <= n2b <= 10] + 
  Integrate[sol1, {s2, 0, 1}, 
  Assumptions -> 
  B \[Element] Reals && A \[Element] Reals && 1 <= n1a <= 10 && 
  1 <= n2a <= 10 && 1 <= n1b <= 10 && 1 <= n2b <= 10])

and replace n1a in the int function with n1a+0.0001 . Then I need to use the result of int after the replacement, as the material for code :code=result in:

        "cfxn = Block[{n1a, n1b, n2a, n2b}, 
         With[{code = "result"}, 
         Compile[{{n1a, _Integer}, {n1b, _Integer}, {n2a, _Integer}, {n2b, \
          _Integer}}, code, CompilationTarget -> "C"]]];"

but I am getting these errors:

CompiledFunction::cfse: Compiled expression CompiledFunction[{10,11.,5464},<<7>>,LibraryFunction[/Users/delaramnematollahi/Library/Mathematica/ApplicationData/CCompilerDriver/BuildFolder/Delarams-MacBook-Pro-3755/compiledFunction3.dylib,compiledFunction3,{<<1>>},{Real,2}]] should be a machine-size real number.

which I do not understand why.

Could you please help me with this?

It would be immensely more helpful if you could, besides the code, give explanation of what you want to achieve; e.g. what are matrices h1 and h2, how they are obtained, what is fxn supposed to do? — ercegovac, Sep 30 '17 at 21:38
My avec, is a guessed vector, and I am using it as an initial vector to construct the matrix h1+h2, later on I need to calculate the Eigen vectors of this matrix and put instead of the previous avec. — Delaram Nematollahi, Sep 30 '17 at 21:44
@ercegovac Sorry for that, I fixed that. I have also put an easier function from my previous program. but my actual function (fxn) is more complicated which I don't think I can do anything about that. I also edited my question and added some explanation. I hope that helps. Thank you so much for your time. — Delaram Nematollahi, Sep 30 '17 at 22:17
Again, the code is not reproducible. It appears that you did not include at least avec and nvec. Verify that you have included all data by executing the code in the clean context by executing command ClearAll["Global\*"]` — ercegovac, Sep 30 '17 at 22:25
A few ideas: fxn must calculate either 4/pi^2 or 32/pi^4 every time. Can that be factored out? What's the fastest way to raise -1 to an integer power? avec is full of zeros. Does the code call fxn and then multiply the result by zero? Instead of a function, maybe fxn should be a sparse array with a Band structure. Maybe fxn should be the sum of a Symmetric array and an Antisymmetric array. — LouisB, Oct 01 '17 at 00:10
Your update seems to be a new question. If so, you should ask it separately. — Michael E2, Oct 06 '17 at 18:18

Henrik Schumacher · Accepted Answer · 2017-10-02T06:42:38.670

Compile can be your friend. Fortunately, your code is almost perfectly prodedural and thus can be compiled with only few changes.

avec = {{1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0}};
nvec = {1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6};
svec = {-1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1};
ne = 3;
nμ = 12;
δ = -50;
β = 1;

cfxn = Block[{n1a, n1b, n2a, n2b},
   With[{code = Which[
       n1a == n1b && n2a == n2b,
       Evaluate[N[1/6 (1 - 3/(n1a^2 π^2) - 3/(n2a^2 π^2))]],

       n1a == n1b && n2a != n2b,
       Evaluate[N[(4 (1 + (-1)^(n2a + n2b)) n2a n2b)/((n2a^2 - n2b^2)^2 π^2)]],

       n1a != n1b && n2a == n2b,
       Evaluate[N[(4 (1 + (-1)^(n1a + n1b)) n1a n1b)/((n1a^2 - n1b^2)^2 π^2)]],       

       True, 
       Evaluate[N[-((32 (-1 + (-1)^(n1a + n1b)) (-1 + (-1)^(n2a + n2b)) n1a n1b n2a n2b)/((n1a^2 - n1b^2)^2 (n2a^2 - n2b^2)^2 π^4))]]
       ]},
    Compile[{{n1a, _Integer}, {n1b, _Integer}, {n2a, _Integer}, {n2b, _Integer}},
     code,
     CompilationTarget -> "C"
     ]
    ]];

Here I tried to enforce that cfxn uses floating point numbers internally (therefore the Evaluate) such that less Integers have to be type cast to Reals.

chmat = With[{ccfxn = cfxn},
   Compile[{{nm, _Integer}, {ne, _Integer}, {d, _Real}, {b, _Real}, {avec, _Real, 2}, {nvec, _Real, 1}, {svec, _Real, 1}},
    Block[{n0, n1, n2, n3, h1, h2, tmp, tmp2},
     Table[
      n0 = nvec[[nm0]];
      n1 = nvec[[nm1]];
      tmp = 0.;
      Do[
       n2 = nvec[[nm2]];
       n3 = nvec[[nm3]];
       tmp2 = (ccfxn[n1, n0, n3, n2] - ccfxn[n1, n3, n0, n2]);
       Do[tmp += tmp2 avec[[j, nm3]] avec[[j, nm2]], {j, 1, ne}],
        {nm2, 1, nm}, {nm3, 1, nm}];
      d tmp + If[nm0 == nm1, N[(n0^2 Pi^2 + b svec[[nm0]])], 0.],
      {nm0, 1, nm}, {nm1, 1, nm}]
     ],
    CompilationTarget -> "C",
    CompilationOptions -> {"InlineCompiledFunctions" -> True}
    ]];

Edit: My former implementation of chmat did not produce the correct result. I removed it and also optimized away the Total: Summing up the results in a fixed Real register has the advantage that no auxiliary arrays have to be allocated and that no indexing into such arrays takes place within Total.

You can compute hmat with

hmat2= chmat[nμ, ne, δ, β, avec, nvec, svec];//AbsoluteTiming//First
hmat==hmat2

(* 0.003411 *)
(* True *)

On my machine, this results in a 600-fold speed-up.

Addendum:

This need not to be the end of the story. By deactivating some security checks and replacing Part with CompileGetElement` I even get a speed-up factor of over 2500 (compared to the original code). This is how it looks:

chmat = With[{ccfxn = cfxn}, 
   Compile[{{nm, _Integer}, {ne, _Integer}, {d, _Real}, {b, _Real},  {avec, _Real, 2}, {nvec, _Real, 1}, {svec, _Real, 1}},
    Block[{n0, n1, n2, n3, h1, h2, tmp, tmp2},
     Table[
      n0 = Compile`GetElement[nvec, nm0];
      n1 = Compile`GetElement[nvec, nm1];
      tmp = 0.;
      Do[
       n2 = Compile`GetElement[nvec, nm2];
       n3 = Compile`GetElement[nvec, nm3];
       tmp2 = (ccfxn[n1, n0, n3, n2] - ccfxn[n1, n3, n0, n2]);
       Do[
         tmp += tmp2 Compile`GetElement[avec, j, nm3] Compile`GetElement[avec, j, nm2],
        {j, 1, ne}],
       {nm2, 1, nm}, {nm3, 1, nm}];
      d tmp + If[nm0 == nm1, (n0^2 Pi^2 + b Compile`GetElement[svec, nm0]), 0.],
      {nm0, 1, nm}, {nm1, 1, nm}]
     ],
    CompilationTarget -> "C",
    CompilationOptions -> {"InlineCompiledFunctions" -> True},
    RuntimeOptions -> "Speed"
    ]];

I have another question Henrik, If I use this code and calculate the Eigen energies and Eigenvectors of chmat, and replace the new eigen vectors with the previous avec, I can do that, as far as I do it manually. But as I use Table to apply the iteration it gives me error. I will edit my question and add my question to it. Could you please help me with this also? — Delaram Nematollahi, Oct 03 '17 at 19:19
@DelaramNematollahi I will see what I can do for you. Maybe it is better to start a new question; it might get confusing otherwise. At the moment, it sounds like Nest or NestList instead of Table might help you. — Henrik Schumacher, Oct 03 '17 at 19:26
oh, ok. sorry I just modified this. should I ask it in a separate question now? — Delaram Nematollahi, Oct 03 '17 at 19:32
Could you take a look at a new modification I want to implement? — Delaram Nematollahi, Oct 06 '17 at 17:21
Could you take a look at a new modification I want to implement? — Delaram Nematollahi, Oct 06 '17 at 17:39

score 5 · Answer 2 · answered Oct 01 '17 at 20:03

This is around 30-40 times faster than the original, so not as fast as Henrik's Compile version.

I replaced the inner loop with Outer and rearranged slightly to avoid computing the same thing multiple times - this includes memoizing fxn and moving the avec multiplications outside the loop.

I also expressed fxn differently and changed some of the symbol names, this was not for performance but to help me understand the code better.

avec = Developer`ToPackedArray@N@{
     {1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
     {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
     {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0}};
nvec = {1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6};
svec = {-1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1};

ne = 3; nμ = 12; δ = -50; β = 1;

Clear[g, fxn]

g[x_, y_] := 8 x y/((x^2 - y^2)^2 π^2)

mem : fxn[x_, x_, u_, u_] := mem = N[1/6 (1 - 3/(x^2 π^2) - 3/(u^2 π^2))]
mem : fxn[x_, x_, u_, v_] := mem = N@If[OddQ[u + v], 0, g[u, v]]
mem : fxn[x_, y_, u_, u_] := mem = N@If[OddQ[x + y], 0, g[x, y]]
mem : fxn[x_, y_, u_, v_] := mem = N@If[EvenQ[x + y] || EvenQ[u + v], 0, -2 g[x, y] g[u, v]]

a = Transpose[avec];
oa = Outer[Times, a, a, 1];

hmat = Table[
   n0 = nvec[[i]]; n1 = nvec[[j]];
   h1 = (n0^2 π^2 + β svec[[i]]) KroneckerDelta[i, j];
   of = Outer[fxn[n1, n0, #1, #2] - fxn[n1, #1, n0, #2] &, nvec, nvec];
   h2 = δ Total[of oa, Infinity];
   h1 + h2, {i, nμ}, {j, nμ}] // Timing

score 1 · Answer 3 · answered Oct 03 '17 at 20:18

1

The reworked code for the second answer could look like this. Here, we need also the definitions of chmat and cfxn from my other post. It is not meaningful to include the code for chmat with the Compile command in the Table: Compile creates the function chmat that is highly optimized for multiple execution. We can simply reuse the function chmat.

Apart from that, there was some bad syntax, e.g. due to commands in the loop that were not well seperated by semicoli. Also, the rest of the code can be shortened a bit. Eigensystem always throws orthonormalized eigenvalues when supplied with numerical arguments. So no normalization is needed. The sign computations at the end are probably to prevent the rows of avec to flip back and forth. Since some of the first entries tend to be close to zero, a sign rule that depends only on the first entry is not very stable. Hence I took the freedom to enforce that the sum of each row of avec is nonnegative. The table returns now pairs of the distances of current to former instance of avec and the desired value en. As can be read of easily, the method converges nicely after a few dozen of iterations.

Table[
 hmat = chmat[n\[Mu], ne, \[Delta], \[Beta], avec, nvec, svec];
 {evals, evecs} = Eigensystem[hmat];
 pos = Ordering[evals][[1 ;; ne]];
 bvec = Map[x \[Function] If[Total[x] < 0, -x, x], evecs[[pos]]];
 residual = Max[Abs[avec - bvec]];
 avec = bvec;
 {residual, Total[evals[[pos]]]},
 {j, 1, 30}]

answered Oct 03 '17 at 20:18

Henrik Schumacher

106,770
7
179
309

Thank you so much for your great explanation. I learned a lot. I just wanted to ask you what are the other possible ways I can get master in mathematica. Is there any special way you would suggest instead of the asking, solving I am currently doing? – Delaram Nematollahi Oct 03 '17 at 22:16
@DelaramNematollahi You can find a quite comprehensive list of advices here: https://mathematica.stackexchange.com/questions/18/where-can-i-find-examples-of-good-mathematica-programming-practice. The best advice that I can give you personally is to learn about Function and Map: While procedural programming with Do and Table is good for compilation (in this case, one leaves the Mathematica universe and indirectly writes C-code), functional programming helps enormously to produce decently fast and expressive code. – Henrik Schumacher Oct 04 '17 at 07:25
Sorry to disturb you again, I don't know why I am getting this error: "CompiledFunction::cfne: Numerical error encountered; proceeding with uncompiled evaluation." if I use this term for code in cfxn: "code = -(( 16 A B^2 E^-B ((-1)^(1 + n1a + n1b) + (-1)^( 1 + n2a + n2b) + (1 + (-1)^(n1a + n1b + n2a + n2b)) E^ B) n1a n1b n2a n2b [Pi]^4)/((B^2 + (n1a - n1b)^2 [Pi]^2) (B^2 + (n1a + n1b)^2 [Pi]^2) (B^2 + (n2a - n2b)^2 [Pi]^2) (B^2 + (n2a + n2b)^2 [Pi]^2)))}" – Delaram Nematollahi Oct 10 '17 at 15:53
Hi. I think that happens because you haven't declared all symbols (e.g. A and B) in code as arguments to Compile. Please, look up Compile in the documentation to learn what it actually does and how to set it up correctly. See also https://mathematica.stackexchange.com/questions/1803. The function CompilePrint in the "CompiledFunctionTools`" package might also be quite helpful. – Henrik Schumacher Oct 10 '17 at 19:55
@ Henrik Schumacher could you look at this question I asked? https://mathematica.stackexchange.com/questions/158445/how-can-i-make-this-program-work-faster – Delaram Nematollahi Oct 23 '17 at 20:32

Can any one help me make my program work faster?

3 Answers3

Linked