30

Consider the following code:

zs = N /@ Range[0, 12, 10^-5];
AbsoluteTiming[bessels = BesselJ[1, #] & /@ zs;]
Length @ zs

I've tried to measure only the computation of BesselJ[1, #] & and the filling of bessels with the results. What I get is:

{12.103338, Null}
1200001

So OK, 12 seconds. Now I try the same with GSL in C and get this:

1200001 computations, time taken to compute: 0.223292 s

What's even stranger: at first I thought GSL just suffered from some accuracy problems, and that this was why it was so fast. But when I actually compared the numbers, I found that it's Mathematica that gives worse precision, in particular for larger z (for small z the output is identical to GSL's). I used N[#, 100] & to compute the reference values.
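
For concreteness, here's a sketch of the kind of comparison I mean (the sample points and working precision are illustrative choices, not the exact code I ran):

sample = Range[0, 12, 1/10];             (* exact rational sample points *)
ref = N[BesselJ[1, #], 50] & /@ sample;  (* 50-digit reference values *)
mach = BesselJ[1, N[#]] & /@ sample;     (* machine-precision values under test *)
Max[Abs[mach - ref]]                     (* worst absolute error over the sample *)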

I assume there must still be some way to achieve performance in Mathematica similar to that of GSL; what is it?

Ruslan
  • I deleted my answer since it doesn't actually compute the Bessel function. Nevertheless, the advice to use packed arrays and prefer listable functions still holds. – halirutan Jun 02 '15 at 11:12
  • Btw, you do realize that (if the C program is your own) with your knowledge of C it is a simple task to create your own LibraryFunction that calls the GSL code? Check this tutorial and all the references. – halirutan Jun 02 '15 at 11:18
  • MATLAB's performance is on par with GSL. There's no fundamental reason why Mathematica should be slow. You could report this to WRI and ask them to improve performance. Not only is MATLAB much faster, it is also much more precise, as verified with Mathematica's own arbitrary precision arithmetic (30 digits): http://i.stack.imgur.com/3nrOL.png – Szabolcs Jun 02 '15 at 11:26
  • If need be, you can use the trapezoidal rule to evaluate a Bessel function of low integer order and small to moderately sized argument. That should be easy to do within a compiled function, if speed is of the essence. (For large arguments, use asymptotic expansions instead.) – J. M.'s missing motivation Jun 02 '15 at 11:49
  • @Szabolcs, I'd wager that this is because the built-in algorithm is a bit too general; that is, it's equipped to handle complex orders. For integer orders, the computations are not as complex, but it seems Mathematica does not know about them. – J. M.'s missing motivation Jun 02 '15 at 11:52
  • Hi @halirutan, I found it really odd: all of the special functions will unpack packed arrays even when used listably. I tried Gamma, PolyLog, etc. Why? I thought Mathematica claimed to do a lot of optimization for packed arrays, yet this whole family of special functions is not optimized?! I think this deserves a bug tag. (A quick check is sketched below.) – matheorem Sep 05 '16 at 14:08
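
A quick way to check the unpacking mentioned in the comments above (a sketch using the standard packed-array diagnostics; variable names are illustrative):

zs = Developer`ToPackedArray @ N[Range[0, 12, 10^-5]];
Developer`PackedArrayQ[zs]               (* True: the input is a packed array of machine reals *)
On["Packing"];                           (* print a message whenever an array is unpacked *)
res = BesselJ[1, zs];                    (* listable call; watch for unpacking messages here *)
Developer`PackedArrayQ[res]
Off["Packing"];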

3 Answers

33

Following the advice in the comments, I've made a test library to evaluate BesselJ[1, #] & via GSL. I still consider this a workaround, so if you find a way to get good performance out of Mathematica's built-in functions, please do post a new answer.

Needs["CCompilerDriver`"]
besselJ1src = "
  #include \"WolframLibrary.h\"
  DLLEXPORT mint WolframLibrary_getVersion() {return WolframLibraryVersion;}
  DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData) {return 0;}
  DLLEXPORT void WolframLibrary_uninitialize(WolframLibraryData libData) {}

  #include <gsl/gsl_sf_bessel.h>
  DLLEXPORT int j1(WolframLibraryData libData, mint argc, MArgument* args, MArgument result)
  {
    MArgument_setReal(result,gsl_sf_bessel_J1(MArgument_getReal(args[0])));
    return LIBRARY_NO_ERROR;
  }
  ";
besselJ1lib = CreateLibrary[besselJ1src, "besselj1", 
  "Libraries" -> {"gsl", "gslcblas"}];
j1 = LibraryFunctionLoad[besselJ1lib, "j1", {Real}, Real];

Now I can execute my original code with j1 instead of BesselJ[1, #] &:

zs = N /@ Range[0, 12, 10^-5];
AbsoluteTiming[bessels = j1 /@ zs;]
{0.241519, Null}

And bessels does indeed contain the numerical values.
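
As a quick spot check (a sketch; the sample points are arbitrary), the GSL-based values can be compared against Mathematica's arbitrary-precision BesselJ:

errs = Table[j1[N[z]] - N[BesselJ[1, z], 30], {z, {1/2, 3, 29/4, 12}}];
Max[Abs[errs]]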

Ruslan
  • +1, excellent self-answer! Nice use of the library function. – dr.blochwave Jun 02 '15 at 13:22
  • @Ruslan: I am trying to reproduce your solution and need your help: ... "IncludeDirectories" -> "C:\dev\vcpkg\packages\gsl_x64-windows\include\", "LibraryDirectories" -> "C:\dev\vcpkg\packages\gsl_x64-windows\lib\", "Libraries" -> {"gsl", "gslcblas"}]; j1 = ,,, Message: LibraryFunction::libload: The function j1 was not loaded from the file C:\Users\ld00032397\AppData\Roaming\Mathematica\SystemFiles\LibraryResources\Windows-x86-64\besselj1.dll. – stocha Apr 06 '21 at 20:22
  • @stocha my guess is that your system can't find gsl and gslcblas libraries. You may need to alter the system environment variable PATH, and maybe relaunch Mathematica after that. – Ruslan Apr 06 '21 at 20:23
  • Is "IncludeDirectories" ,"LibraryDirectories" not enough? – stocha Apr 06 '21 at 20:29
  • @stocha this is for compilation, while your problem is at the stage of loading the library after it has been compiled. – Ruslan Apr 06 '21 at 20:35
18

I present in this answer a compiled implementation of one of the simpler algorithms for numerically evaluating a Bessel function of (modestly-sized) integer order and (small to medium-sized) real argument. This uses Miller's algorithm:

bessj = With[{bjl = N[Log[1*^16]]},
             Compile[{{n, _Integer}, {x, _Real}},
                     Module[{h, hb, k, l, m, r, s},
                            If[x == 0., Return[If[n == 0, 1., 0.]]];
                            l = r = 1;
                            While[r = 2 l;
                                  r (Log[r] - Log[Abs[x/2]] - 1) + Log[6 r + 1]/2 < bjl,
                                  l = r];
                            While[r - l > 1, m = Quotient[l + r, 2];
                                  If[m (Log[m] - Log[Abs[x/2]] - 1) + Log[6 m + 1]/2 < bjl,
                                     l = m, r = m]];
                            h = 0.; s = 0.; hb = Internal`Bag[Most[{0.}]];
                            Do[h = x/(2 k - h x); Internal`StuffBag[hb, h];
                               s = h (s + 2 Mod[k + 1, 2]), {k, l, 1, -1}];
                            Internal`StuffBag[hb,
                                              If[n >= 0 || Mod[n, 2] == 0, 1, -1]/(1 + s)];
                            Times @@ Take[Internal`BagPart[hb, All], -Abs[n] - 1]],
                     RuntimeAttributes -> {Listable}]];
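
In outline: the main loop computes the ratios $h_k = J_k(x)/J_{k-1}(x)$ by the downward recurrence $h_k = x/(2k - x\,h_{k+1})$ (the two While loops pick a starting order $l$ large enough for machine precision), while s accumulates the even-order terms so that the Neumann identity $J_0(x) + 2\sum_{k \ge 1} J_{2k}(x) = 1$ yields the normalization $J_0(x) = 1/(1+s)$; $J_n(x)$ is then recovered as the product of the last $|n|$ stored ratios and this factor (with a sign flip for negative odd $n$).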

Compare:

zs = N[Range[0, 12, 1*^-5]];
AbsoluteTiming[BesselJ[1, #] & /@ zs;]
   {76.729389, Null}
AbsoluteTiming[bessj[1, #] & /@ zs;]
   {39.575264, Null}

However, the listability of both the built-in function and the compiled version is not being exploited here, so here's another test:

AbsoluteTiming[j1 = BesselJ[1, zs];]
   {74.120239, Null}
AbsoluteTiming[j1t = bessj[1, zs];]
   {24.816419, Null}
Max[Abs[j1t - j1]]
   2.94459*10^-13

Note that I did not compile to C in this case; doing so ought to result in a function with even better performance.
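
One way to try that without repeating the body is to inline the already-compiled bessj into a new Compile call; a sketch (bessjC is my name for the wrapper, and it assumes a C compiler is configured for use by Compile):

bessjC = With[{cf = bessj},              (* inject the compiled function defined above *)
   Compile[{{n, _Integer}, {x, _Real}},
           cf[n, x],
           RuntimeAttributes -> {Listable},
           CompilationTarget -> "C",
           CompilationOptions -> {"InlineCompiledFunctions" -> True}]];

bessjC[1, zs] can then be used exactly like bessj[1, zs].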

J. M.'s missing motivation
  • Very useful answer! Indeed, using listability makes the biggest difference. Timings on my MacBook Pro are 14s for BesselJ[1,zs] and 3.8 s for bessj[1,zs]. Using Map, the difference is much smaller: 14 seconds versus 11 seconds. CompilationTarget -> "C" gets me down to 2.3 s (with bessj[1,zs]). – Jens Jun 13 '15 at 23:47
12

If need be, you can use the trapezoidal rule to evaluate a Bessel function of low integer order and small to moderately sized argument. That should be easy to do within a compiled function, if speed is of the essence.

I mentioned this in a comment months ago. Here is the long-overdue follow-through. What follows is a simple routine for $J_1(x)$ for real, moderately sized $x$, using the midpoint rule (since, as it turns out, the expressions look simpler with it; the performance is similar either way):

bj1 = With[{bjl = N[Log[1*^16]], p2 = N[π/2]},
           Compile[{{x, _Real}},
                   Module[{m = 1, k, r, s},
                          If[x != 0., While[(8 m - 2) (Log[4 m - 1] -
                             Log[Abs[0.5 x]] - 1) + Log[25 m - 5] < 2 bjl, m++]];
                          s = 0.;
                          Do[r = Sin[(k + 0.5) p2/(m + 1)];
                             s += r Sin[r x],
                             {k, 0, m}];
                          s/(m + 1)],
                   RuntimeAttributes -> {Listable}]];
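
For reference, this is the midpoint rule applied to the integral representation $J_1(x) = \frac{2}{\pi}\int_0^{\pi/2} \sin(x \sin\theta)\,\sin\theta\,d\theta$: with $\theta_k = \frac{(k + 1/2)\,\pi}{2(m+1)}$, the approximation reads $J_1(x) \approx \frac{1}{m+1}\sum_{k=0}^{m} \sin\theta_k \sin(x \sin\theta_k)$, which is exactly the quantity s/(m + 1) returned above; the initial While loop just grows $m$ until the quadrature error is negligible at machine precision.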

Admittedly, it does not have the generality of my previous answer, but it is concise and quite efficient:

AbsoluteTiming[j1 = BesselJ[1, zs];]
   {191.703, Null}

AbsoluteTiming[j1t = bj1[zs];]
   {37.6003, Null}

Max[Abs[j1t - j1]]
   4.44089*10^-16

For an explanation of why the trapezoidal/midpoint rule works so well, see this or this.

J. M.'s missing motivation