Suppose I have a function in two versions: one compiled to the Wolfram virtual machine (WVM) and one compiled to C:
fcp1[C0_, {xmin_, xmax_}, {tmin_, tmax_, dt_}] :=
 Compile[{},
  Module[{xls, Nx, tls, Ct, cdiagT, cdiagh, dx},
   Nx = Length[C0];
   dx = N@(xmax - xmin)/(Nx - 1);
   tls = Range[tmin, tmax, dt] + dt/2.;
   cdiagT = -1./(2. dx^2);
   Do[
    cdiagh = cdiagT;
    , {t, tls}]]
  , CompilationOptions -> {"InlineExternalDefinitions" -> True,
    "InlineCompiledFunctions" -> True}
  ]
fcp2[C0_, {xmin_, xmax_}, {tmin_, tmax_, dt_}] :=
 Compile[{},
  Module[{xls, Nx, tls, Ct, cdiagT, cdiagh, dx},
   Nx = Length[C0];
   dx = N@(xmax - xmin)/(Nx - 1);
   tls = Range[tmin, tmax, dt] + dt/2.;
   cdiagT = -1./(2. dx^2);
   Do[
    cdiagh = cdiagT;
    , {t, tls}]]
  , CompilationOptions -> {"InlineExternalDefinitions" -> True,
    "InlineCompiledFunctions" -> True}, CompilationTarget -> "C"
  ]
Now if we compare their compilation times (the time it takes MMA to compile the function, not the time to execute it), the C compilation is very slow:
Nxgrid = 2000;
Ct0 = Array[Exp[-5. #^2] &, Nxgrid, {-1., 1.}] //
Developer`ToPackedArray;
xRange = {-199.9`, 199.9`};
dt = 0.05515999116515042`;
Ntgrid = 20000;
Needs["CompiledFunctionTools`"]
f1 = fcp1[Ct0, xRange, {0, dt*Ntgrid, dt}]; // AbsoluteTiming
f2 = fcp2[Ct0, xRange, {0, dt*Ntgrid, dt}]; // AbsoluteTiming
(* {0.000642, Null} *)
(* {14.959721, Null} *)
CompilePrint@f1 == CompilePrint@f2
(* True *)
So why is the C compilation so much slower than the WVM version, and how can I speed it up?
Update
MarcoB gave a good suggestion: look at the compilation time independently of Mathematica. So I tested the compilation.
The documentation says CCompilerDriver is invoked automatically when compiling to C, and it does indeed seem quite slow:
Needs["CCompilerDriver`"]
file = Export[FileNameJoin[{$TemporaryDirectory, "fcp1.c"}],
f1]; // AbsoluteTiming
CreateObjectFile[{file}, "fcp1"]; // AbsoluteTiming
(* {0.505879, Null} *)
(* {7.194721, Null} *)
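To see what CCompilerDriver actually runs, and how large the generated source is, a small self-contained check like the following may help. It is only a sketch: the function cfDemo and the file name demo_sq.c are made up for illustration, and it assumes a working C compiler is installed. The "ShellCommandFunction" and "ShellOutputFunction" options print the command line and the compiler's output, so the flags can be compared with a hand-written clang call.

```mathematica
Needs["CCompilerDriver`"]

(* Hypothetical stand-in for f1: any CompiledFunction can be exported this way *)
cfDemo = Compile[{{x, _Real}}, x^2 + 1.];

(* Export writes C source for the compiled function and returns the file name *)
demoFile = Export[FileNameJoin[{$TemporaryDirectory, "demo_sq.c"}], cfDemo];

(* Size of the generated source in bytes; large inlined constants would show up here *)
demoBytes = FileByteCount[demoFile]

(* Print the exact compiler command and any compiler output while building *)
demoObj = CreateObjectFile[{demoFile}, "demo_sq",
   "ShellCommandFunction" -> Print,
   "ShellOutputFunction" -> Print]
```

Running the same two steps on the exported fcp1.c would show whether the driver passes extra flags that the bare clang call does not use.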
The compiler that CCompilerDriver invokes on my system is Clang
DefaultCCompiler[]
(* CCompilerDriver`ClangCompiler`ClangCompiler *)
so I also tested it outside MMA:
Import["!clang -v 2>&1", "Text"]
Import["!clang -shared -o " <>
ToString@FileNameJoin[{$TemporaryDirectory, "fcp1.so"}] <> " " <>
ToString@FileNameJoin[{$TemporaryDirectory, "fcp1.c"}] <> " -I" <>
ToString[
FileNameJoin[{$InstallationDirectory,
"SystemFiles/IncludeFiles/C/"}]] <> " 2>&1",
"Text"]; // AbsoluteTiming
(* "Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.5.0 Thread model: posix" *)
(* {0.750312, Null} *)
Import["!nm " <> ToString@FileNameJoin[{$TemporaryDirectory, "fcp1.so"}], "Text"]
".....
0000000000013140 T _fcp1
0000000000017030 b _funStructCompile
0000000000017020 d _initialize
U dyld_stub_binder"
It looks like compiling outside of MMA is very fast. So:
- Why does CCompilerDriver take a long time to compile?
- CCompilerDriver takes about 7 s in the last example, so why does Compile take about 15 s?
Comments:
[...] f2 takes about 2 seconds. So maybe a different compiler will help you here. – RunnyKine Jul 23 '15 at 03:36
[...] f2. I guess its machine specific compiler issue. – PlatoManiac Jul 23 '15 at 09:06
[...] f2. I think this is within the normal range. When you test Clang by itself, you are only compiling, not linking (I think; I've never used Clang). Linking may take considerable time and should certainly be included as well. – Oleksandr R. Jul 23 '15 at 09:58
[...] -o switch instead of -c in most compilers. – Oleksandr R. Jul 23 '15 at 15:11
[...] -shared. But you would be better off to read the Clang manual than to ask me, because I've never used it. – Oleksandr R. Jul 23 '15 at 15:36
[...] C0 to be inlined, although you only need its length. And I am not sure whether the Mathematica and the C compiler are able to optimize this out. I think the MTensor representing C0 has first to be instantiated at runtime before its size can be queried. And instantiation needs to store all its values in the program code. So a lot of useless stuff to let the C compiler chew on. – Henrik Schumacher Jan 03 '24 at 13:40
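Following the last comment's guess, one thing worth trying is to keep the array out of the compiled body altogether. The sketch below assumes the long compile time really does come from C0 being inlined, which is not verified here. It computes the length once with With, so that only an integer is substituted into Compile. The name fcp3 is made up for illustration, and the unused locals xls and Ct are dropped.

```mathematica
(* Hypothetical variant of fcp2: only Length[C0] reaches the compiled code *)
fcp3[C0_, {xmin_, xmax_}, {tmin_, tmax_, dt_}] :=
 With[{nx = Length[C0]}, (* evaluated before Compile sees the body *)
  Compile[{},
   Module[{tls, cdiagT, cdiagh, dx},
    dx = N@(xmax - xmin)/(nx - 1);
    tls = Range[tmin, tmax, dt] + dt/2.;
    cdiagT = -1./(2. dx^2);
    Do[
     cdiagh = cdiagT;
     , {t, tls}]],
   CompilationOptions -> {"InlineExternalDefinitions" -> True,
     "InlineCompiledFunctions" -> True},
   CompilationTarget -> "C"]]

(* Same inputs as in the question; compare the timing against f2 *)
Ct0 = Developer`ToPackedArray@Array[Exp[-5. #^2] &, 2000, {-1., 1.}];
f3 = fcp3[Ct0, {-199.9, 199.9}, {0, 0.05515999116515042*20000, 0.05515999116515042}]; // AbsoluteTiming
```

If the timing drops noticeably, that would support the inlining explanation. If it does not, the difference is more likely in the compiler flags or the link step.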