Optimization of power tower fractal generator

Question

I tried to optimize the code for generating power tower fractals from here. As the author suggested, I tried to memorize the points already tested in a list.

Here is my code :

convlist = {};
convQ[z_, max_] := Module[{list},
  If[MemberQ[convlist, z], 1,
   list = 
    Quiet[NestWhileList[N[(z^#)] &, z, Abs[#1 - #2] > 0.01 &, 2, max]];
   If[Length[list] < max,
    convlist = Union[convlist, list];
    1, 0]]]

LaunchKernels[8];
ParallelEvaluate[SetSystemOptions["CatchMachineUnderflow" -> False]];
DistributeDefinitions[convQ];
SetSharedVariable[convlist];

towerFract[xmin_, xmax_, ymin_, ymax_, step_] :=
 ArrayPlot[
  ParallelTable[
   convQ[x + I y, 50], {x, xmin, xmax, step}, {y, ymin, ymax, step}]]

towerFract[-1, 0, -1, 0, 0.01]

However, my code does not seem to work properly :

the CPU is not used at 100% (whereas in the original version it is)
I don't have a proper result.

I think that it may be related to the CatchMachineUnderflow option, or to the SetSharedVariable declaration.

My questions are :

How to make my code work properly, and, if possible, efficiently?
Is there a better way of doing what I am trying to do?

Related: http://mathematica.stackexchange.com/questions/48295/why-wont-parallelize-speed-up-my-code — Michael E2, Jun 17 '14 at 05:07

Michael E2 · Accepted Answer · 2014-06-17T04:41:37.370

The main problem is that at three points in the code of convQ, the shared variable convlist is accessed (read or write). Each access necessitates a call back to the main kernel. The upshot is that each of your parallel kernels spends a lot of time waiting on the main kernel to deal with the requests of the other parallel kernels.

Another issue is that the computed values in convlist are highly unlikely to ever appear again. This means that the MemberQ test will almost certainly always be False. To get speed up from this technique, you should round the numbers to the pixel coordinates. That will most likely introduce extra numerical error that will blur the fractal boundary, but one could test that hypothesis.

It seems to me that the greatest speed up would come from compiling convQ, since one is looking at dealing with machine precision numbers anyway. We can manually keep track of the size of the numbers to deal with overflow and underflow programmatically. It seems the Mathematica Virtual Machine (that runs compiled code) does not handle such exceptions gracefully.

convQC =
 With[{logmaxnum = Log[$MaxMachineNumber], logminnum = Log[$MinMachineNumber]}, 
  Compile[{{z, _Complex}, {max, _Integer}},
   Module[{z1, z2, z3, i},
    i = 2;
    If[z != 0,
     z1 = z;
     z2 = z^z;
     While[i <= max && Abs[z2 - z1] > 0.01,
      z3 = z2 Log[z];
      If[Re[z3] > logmaxnum,   (* guess: overflow => diverges *)
       i = max + 1,
       If[Re[z3] < logminnum,  (* guess: underflow => converges *)
        Break[],
        z1 = z2;
        z2 = Exp[z3]; i++]]
      ]
     ];
    If[i > max, 0, 1]
    ],
    RuntimeOptions -> "Speed"
   ]
  ]

towerFract[xmin_, xmax_, ymin_, ymax_, step_] := ArrayPlot[
  ParallelTable[
   convQC[x + I y, 50], {y, ymin, ymax, step}, {x, xmin, xmax, step}]
  ]

This takes about 20 seconds on a 2.7 GHz Intel i7 (MacBook Pro).

towerFract[-2, 2, -2, 2, 0.002]

Mathematica graphics

Thank you very much for this answer! Your explanations are clear and your optimizations even better... I guess that I'm not used to this permanent communication with the main kernel. — deltux, Jun 20 '14 at 14:33

Optimization of power tower fractal generator

1 Answers1