
So I have what I assumed to be a textbook example of where compiling my code would be faster. I just make a little ellipse in the complex plane:

oval[θ_, a_] :=  N[1/2 (Cos[π (1 - θ)] + I a Sin[π (1 - θ)] + 1)]

ovalC := Compile[ {{θ, _Real}, {a, _Real}}
, (Cos[π (1 - θ)] + I a Sin[π (1 - θ)] + 1)/2
]

But when I run the compiled code, it is consistently about four times slower than the uncompiled code:

oval[0.2, 1] // Timing
ovalC[0.2, 1] // Timing

{0.000075, 0.0954915 + 0.293893 I}
{0.000268, 0.0954915 + 0.293893 I} 

My understanding is that I have written a purely numerical function, so compiling it should be quicker, since it shouldn't need to do as much of the checking for symbolic vs. numerical expressions, etc., that Mathematica normally does.

Am I misunderstanding what Compile is meant to do?

Kuba
Jojo
  • It's the := you use to define ovalC. Since evaluation is delayed, it recompiles the function every time it's called, instead of compiling once when it's defined and then using the faster code. If I use ovalC = Compile[...] I see a slight improvement over the uncompiled function. – N.J.Evans Nov 06 '17 at 12:57
  • 1
  • You might also want to add the option CompilationTarget -> "C", and if you're going to run it over a lot of arguments, RuntimeAttributes -> Listable (and then run it over lists). – aardvark2012 Nov 06 '17 at 13:18
  • Oh great, thanks a lot, I'm just so used to writing := everywhere >_<. I will try your suggestions, thank you aardvark. – Jojo Nov 06 '17 at 14:12
  • 1
  • Righty, Mathematica =!= Pascal. I also had quite a hard time learning that... – Henrik Schumacher Nov 06 '17 at 18:42

1 Answer


You used ovalC :=, which means that Compile will be re-evaluated on every single use of ovalC.
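As a minimal sketch of the fix, only = (Set) replaces := (SetDelayed), so that Compile runs once, at definition time:

ovalC = Compile[{{θ, _Real}, {a, _Real}},
   (Cos[π (1 - θ)] + I a Sin[π (1 - θ)] + 1)/2   (* same body as in the question *)
  ];

(* optional check, if you want to inspect the compiled instructions: *)
Needs["CompiledFunctionTools`"]
CompilePrint[ovalC]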

You use Timing for benchmarking, but that is unreliable for such short timings (and may report wrong results when some internal parallelization is going on). The code should be evaluated many times, and the total time measured. I would not fully trust even that for such a quick evaluation. But after fixing the above problem, I get

oval[.2, 1] // RepeatedTiming
(* {5.8*10^-6, 0.0954915 + 0.293893 I} *)

ovalC[.2, 1] // RepeatedTiming
(* {5.9*10^-7, 0.0954915 + 0.293893 I} *)
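If you prefer to cross-check without RepeatedTiming, a rough sketch is to run many evaluations in a Do loop and divide the total time by the count:

n = 100000;
First[AbsoluteTiming[Do[ovalC[0.2, 1.], {n}]]]/n   (* mean time per call, in seconds *)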

Finally, if you need to evaluate this for a large set of pre-computed numbers, the way to go is vectorization. This will typically be faster than the compiled function (unless that is written to take advantage of vectorization too).

inputs = RandomReal[1, {1000000, 2}];

oval @@@ inputs; // AbsoluteTiming
(* {5.92498, Null} *)

ovalC @@@ inputs; // AbsoluteTiming
(* {0.804943, Null} *)

oval @@ Transpose[inputs]; // AbsoluteTiming
(* {0.055662, Null} *)

Note that @@@ also unpacks inputs, which takes a bit of additional time, and a lot more memory.
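If you want to see the unpacking for yourself, here is a quick sketch using the standard Developer` utilities (not something you need for the timings above):

Developer`PackedArrayQ[inputs]                (* True: RandomReal returns a packed array *)
ByteCount[inputs]                             (* compact packed storage *)
ByteCount[Developer`FromPackedArray[inputs]]  (* an unpacked copy is several times larger *)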

Why is vectorization so fast compared to a compiled function, or even a function you may write yourself in C? It's because when working with arrays of values, it is possible to take advantage both of SIMD processing and automatic parallelization. If you were writing this program in C, you would need to take advantage of these features explicitly (which is a lot of extra work), while Mathematica (and similar systems) have this built in.
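For completeness, here is a sketch of a compiled function that does exploit listability and threading, roughly along the lines of aardvark2012's comment (timings will depend on your machine):

ovalCL = Compile[{{θ, _Real}, {a, _Real}},
   (Cos[π (1 - θ)] + I a Sin[π (1 - θ)] + 1)/2,
   RuntimeAttributes -> {Listable}, Parallelization -> True
   (* add CompilationTarget -> "C" here if a C compiler is available *)
  ];

ovalCL @@ Transpose[inputs]; // AbsoluteTiming   (* threads over the two argument lists *)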

Szabolcs