
We know that when solving linear equations, LinearSolve is much faster than Solve, despite its lower-level, matrix-based input:

n = 1000;
m = DiagonalMatrix@RandomReal[{1, 2}, n];
b = RandomReal[1, n];
sol1 = LinearSolve[m, b]; // AbsoluteTiming
x = Table[Unique[], {n}];
sol2 = Solve[m.x == b, x]; // AbsoluteTiming
Last /@ First@sol2 == sol1
{0.124800, Null}
{1.216800, Null}
True

I wonder whether NRoots has such a counterpart, too?

To be honest, I came across this problem because a friend of mine claimed that the Matlab function roots() is an order of magnitude faster than NRoots[]. The following is a comparison between NRoots and roots.

NRoots in Mathematica 9.0.1:

n = 10; m = 10000; list = RandomInteger[{1, 100}, {m, n + 1}];
equations = list.x^Range[0, n];
NRoots[#, x] & /@ equations; // AbsoluteTiming

{7.984318, Null}


roots in Matlab R2014a:

 clear;clc;
 n = 10;
 m = 10000;
 lst=randi(100,m,n+1);
 tic;
 for i=1:m
     roots(lst(i,:));
 end
 toc

Elapsed time is 0.756003 seconds.

Given the performance difference between LinearSolve and Solve, I believe that if the analysis of the polynomial's structure were eliminated, Mathematica would reach the same speed.

If such an internal function doesn't exist, can we build one ourselves that eliminates the preprocessing? Although I hardly know Matlab syntax, the definition of roots() seems simple, so I think it won't be that hard to implement the same algorithm in Mathematica.
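For reference, the algorithm behind Matlab's roots() is simply: normalize the polynomial to be monic, build its companion matrix, and take the eigenvalues (NumPy's np.roots works the same way). A minimal sketch in Python/NumPy, just to illustrate the idea:

```python
import numpy as np

def companion_roots(c):
    """Roots of c[0]*x^n + c[1]*x^(n-1) + ... + c[n] via the companion matrix."""
    c = np.asarray(c, dtype=float)
    c = c / c[0]                      # normalize to a monic polynomial
    n = len(c) - 1
    A = np.diag(np.ones(n - 1), -1)   # ones on the first subdiagonal
    A[0, :] = -c[1:]                  # first row holds the (negated) coefficients
    return np.linalg.eigvals(A)       # eigenvalues of A are the roots
```

For example, `companion_roots([1, -3, 2])` returns the roots of x² − 3x + 2, namely 1 and 2.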


OK, since the goal of this question hasn't been completely achieved (NRoots has 3 methods, and currently only "CompanionMatrix" is implemented), let me add something to try to draw more attention. Remember this?

[image: density plot of polynomial roots with coefficients ±1]

It's said that Sam Derbyshire spent 4 days on a similar one, while the above image took me no more than 3 minutes on my old 2 GHz laptop with the roots[] that chyanog shows in his answer below:

(* I modified chyanog's roots[] a little for this specific task. *)
modifiedroots[c_List] := 
  Block[{a = DiagonalMatrix[ConstantArray[1., Length@c - 1], -1]},
         a[[1]] = -c;
         Eigenvalues[a]];

s = 1000;
l = ConstantArray[0., {s + 1, s + 1}];

l[[#, #2]] += 1. & @@@ 
   Round[1 + s Rescale@
       Function[z, {Im@z, Re@z}, Listable]@
        Flatten[modifiedroots /@ Tuples[{-1., 1.}, 18]]]; // AbsoluteTiming

{161.737000, Null}

ArrayPlot[UnitStep[80 - l] l, ColorFunction -> "AvocadoColors"]

Let's ignore the fact that my image is only degree 19.
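By the way, the plotting step above is nothing but a 2D histogram of the roots over the complex plane; that's all the Round[1 + s Rescale@...] line does. The same binning step, sketched in Python/NumPy for illustration:

```python
import numpy as np

def root_histogram(roots, bins=1000):
    """Bin complex roots into a 2D count matrix (rows = Im, columns = Re)."""
    H, _, _ = np.histogram2d(roots.imag, roots.real, bins=bins)
    return H
```

The resulting count matrix can then be thresholded and colored, exactly as the ArrayPlot line does with UnitStep and "AvocadoColors".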

This is not the end. According to my simple test (I simply added different Method options to the 2nd sample of this post), the "Aberth" method is probably more efficient than the "CompanionMatrix" method inside NRoots, so if it were implemented in a similarly stripped-down form, the above image would likely be done in an even shorter time!
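For anyone interested in trying: Aberth's method refines approximations to all n roots simultaneously, combining Newton corrections with pairwise "repulsion" terms that keep the iterates from collapsing onto the same root. A bare-bones sketch in Python, for illustration only — the circle-of-points initial guess and the fixed iteration count are crude placeholders, not what a careful implementation would use:

```python
import numpy as np

def aberth(c, iters=100):
    """All roots of c[0]*x^n + ... + c[n] by Aberth's simultaneous iteration."""
    c = np.asarray(c, dtype=complex)
    d = np.polyder(c)                       # derivative coefficients
    n = len(c) - 1
    # crude initial guesses: points on a circle, slightly offset to break symmetry
    z = np.exp(2j * np.pi * np.arange(n) / n) * 0.5 + 0.1j
    for _ in range(iters):
        w = np.polyval(c, z) / np.polyval(d, z)   # Newton corrections
        rep = np.array([np.sum(1.0 / (z[k] - np.delete(z, k))) for k in range(n)])
        z = z - w / (1 - w * rep)                 # coupled Aberth update
    return z
```

For example, `aberth([1, 0, -1])` converges to the roots ±1 of x² − 1.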

xzczd
  • (1) NRoots takes a Method option. The three documented possibilities are "Aberth", "JenkinsTraub", and "CompanionMatrix". (I do not know if there are undocumented possibilities). From what I could see, the third of those possibilities should be similar to what roots() does. – Daniel Lichtblau Feb 05 '15 at 16:16
    (2) I confess I fail to see an analogy to Solve/LinearSolve, unless NRoots plays the role of LinearSolve with Roots being the "more general" function (insofar as it handles exact and symbolic input). I suppose one could go from the coefficient list and directly form a companion matrix and invoke Eigenvalues. Would that be faster or as reliable? I don't know. – Daniel Lichtblau Feb 05 '15 at 16:18
  • To the downvoter: I'm interested in what's wrong with my question; would you please elaborate? I'm not trying to complain here, I just want to improve my question if possible. – xzczd Feb 06 '15 at 12:23
    I don't know exactly (I didn't vote one way or the other) but I can say a bit about things I thought could bear improvement. One issue is that it could have been phrased differently, maybe as "Are there different methods of NRoots that could have impact on performance vs. quality?" As I noted in a comment, the analogy to Solve/LinearSolve is just not clear to me. A bigger issue, in my mind, is that there is no clear example where NRoots is seen to perform relatively badly as compared to roots(). – Daniel Lichtblau Feb 06 '15 at 16:45
  • @DanielLichtblau That sounds reasonable. Question edited. (Thanks for the test from Amita!) – xzczd Feb 09 '15 at 09:16
  • list.x^Range[0,n]==x^Range[0,n].Transpose@list – chyanog Feb 09 '15 at 09:26
  • @chyanog Yeah, list.x^Range[0, n] is simpler, but Amita chose the latter and took the screenshot, so let it be; anyway, it's not the main issue here :) – xzczd Feb 09 '15 at 09:33
  • @xzczd, according to your edited code, NRoots in Mathematica 10.0.2 is more than 4 times as fast as in Mathematica 9; with the roots[] from my answer, the speed is about the same as Matlab's roots(). (Translated from Chinese.) – chyanog Feb 09 '15 at 09:41
  • @chyanog Well, that's interesting. – xzczd Feb 10 '15 at 10:48

2 Answers


I translated the roots() function from the Matlab code to Mathematica; it's about 4 times faster than NRoots.

Clear["`*"];
n=2000;
m=RandomReal[1,{n,10}];
res1=x/.(ToRules@NRoots[FromDigits[#,x]==0.,x]& /@ m);//AbsoluteTiming
(*-------------------------------*)
roots[c_List]:=Block[{a},
 a=DiagonalMatrix[ConstantArray[1.,Length@c-2],-1];
 a[[1]]=-Rest@c/First@c;
 Eigenvalues[a]
];

res2=Join @@ roots /@ m;//AbsoluteTiming

Union@Chop[Sort@res1 - Sort@res2]

Output:

{1.080062, Null}

{0.241012, Null}

{0}

n = 10;
m = 10000;
list = RandomReal[{1, 100}, {m, n + 1}];
equations = list.x^Range[0, n];
res1 = x /. (ToRules@NRoots[#, x] & /@ equations); // AbsoluteTiming
res2 = Join @@ roots /@ Reverse /@ list; // AbsoluteTiming
Sort@res1 - Sort@res2 // Chop // Union

[images: timing screenshots]

chyanog
  • If you try this with modestly higher degree (so preprocessing time does not dominate) you will see comparable timings. E.g. with n = 100; m = RandomReal[1, {n, 100}]; Also the residuals from the NRoots result will be smaller. – Daniel Lichtblau Feb 08 '15 at 20:19
  • @DanielLichtblau Yeah, and eliminating the preprocessing is the target of this question. I just edited the question and made this part clearer (I think). – xzczd Feb 09 '15 at 09:18
  • Ah. No, I'm not aware of a lower level function that extracts numeric roots from a list of polynomial equations. – Daniel Lichtblau Feb 09 '15 at 16:18
  • As for the higher degree case I had mentioned, here is a plausible test: n = 100; m = 100; list = RandomReal[{1, 100}, {m, n + 1}]; equations = list.x^Range[0, n]; res1 = x /. (ToRules@NRoots[#, x] & /@ equations); // AbsoluteTiming res2 = Join @@ roots /@ Reverse /@ list; // AbsoluteTiming – Daniel Lichtblau Feb 09 '15 at 16:18
  • Any plan to implement the other 2 methods of NRoots[]? Though I admit this question was motivated by my competitive mood against Matlab, its goal is now to get a counterpart of NRoots[] anyway :) – xzczd Feb 10 '15 at 10:55
  • @xzczd, this is a bit late, but: as I show here, there's a built-in undocumented routine to compute the (Frobenius) companion matrix (which can then be passed to Eigenvalues[]). For Aberth's method, I have a barebones implementation here; Jenkins-Traub, however, is much more complicated and finicky to implement properly. – J. M.'s missing motivation Nov 14 '19 at 12:48
  • Very nice. On my machine roots2 = Eigenvalues[NRoots`CompanionMatrix[Reverse[#]]]&; is about twice as fast. – Greg Hurst Dec 04 '23 at 13:28

We can also use GSL via LibraryLink; this is almost ten times faster than before.

LaunchKernels[];
ParallelMap[roots, Tuples[N@{-1, 1}, 15]]; // RepeatedTiming

{1.1, Null}

gslRoots@Tuples[N@{-1, 1}, 15]; // RepeatedTiming

{0.12, Null}

It takes only about 3 minutes to calculate the roots of all polynomial equations of degree 24 with coefficients ±1.

[image: roots of all degree-24 polynomials with coefficients ±1]

For visualization, it's better to find the location of each root within the image matrix in the C code, instead of returning all the roots to Mathematica. It takes about 150 seconds to generate a 100-megapixel image.

[image: high-resolution root-density plot]

Mathematica code:

Needs["CCompilerDriver`"];
$CCompiler={"Compiler"->CCompilerDriver`GenericCCompiler`GenericCCompiler,"CompilerInstallation"->"C:/msys64/mingw64","CompilerName"->"gcc.exe","CompileOptions"->"-Ofast"};

LibraryUnload["poly_roots"];
lib = CreateLibrary[{"poly_roots.c"}, "poly_roots",
   "CompileOptions" -> "-Ofast -fopenmp", "Libraries" -> {"gsl"}];
func1 = LibraryFunctionLoad[lib, "roots", {{Real, 1}}, {Real, 1}];
func2 = LibraryFunctionLoad[lib, "modifiedRoots", {{Real, 1}}, {Real, 1}];
gslRoots = func1 // Compile[{{i, _Real, 1}}, #[i],
     RuntimeAttributes -> {Listable}, Parallelization -> True] &;
gslModifiedRoots = func2 // Compile[{{i, _Real, 1}}, #[i],
     RuntimeAttributes -> {Listable}, Parallelization -> True] &;
rootsCounts = LibraryFunctionLoad[lib, "rootsCounts",
   {Integer, {Real, 1}, {Real, 1}}, {Integer, 1}];

n = 18; {w, h} = {10^3, 10^3};
Dimensions[counts = rootsCounts[n, {-2, 2, w}, {-2, 2, h}]] // AbsoluteTiming

With[{idx = Round[255 Rescale[N[counts UnitStep[Quantile[counts, 0.999] - counts]]]] + 1},
  Part[Developer`ToPackedArray@Table[List @@ ColorData["SunsetColors", i], {i, 0., 1, 1/255}], idx]
 ] // ArrayReshape[#, {w, h, 3}] & // Image

C code (poly_roots.c):

#include "WolframLibrary.h"
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#include <gsl/gsl_poly.h>

void solve(int n, double *a, double *z) {
    gsl_poly_complex_workspace *ws = gsl_poly_complex_workspace_alloc(n + 1);
    gsl_poly_complex_solve(a, n + 1, ws, z);
    gsl_poly_complex_workspace_free(ws);
}

/* Note that the coefficient order of the earlier roots[] is reversed here:
   GSL expects ascending order a[0] + a[1] x + ... + a[n] x^n. */
DLLEXPORT int roots(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    MTensor in, out;
    in = MArgument_getMTensor(Args[0]);
    mreal *in_data = libData->MTensor_getRealData(in);
    mint n = libData->MTensor_getFlattenedLength(in);
    mint len = 2*(n - 1);                    /* n-1 complex roots, interleaved re/im */
    double *z = malloc(len * sizeof(double));
    solve(n - 1, in_data, z);
    int err = libData->MTensor_new(MType_Real, 1, &len, &out);
    mreal *out_data = libData->MTensor_getRealData(out);
    memcpy(out_data, z, len * sizeof(double));
    free(z);
    MArgument_setMTensor(Res, out);
    return LIBRARY_NO_ERROR;
}

DLLEXPORT int modifiedRoots(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    MTensor in, out;
    in = MArgument_getMTensor(Args[0]);
    mreal *in_data = libData->MTensor_getRealData(in);
    mint n = libData->MTensor_getFlattenedLength(in);
    mint n1 = n + 1;
    mreal *a = malloc(n1 * sizeof(mreal));   /* append leading coefficient 1 */
    memcpy(a, in_data, n * sizeof(mreal));
    a[n] = 1.0;
    mint len = 2*n;
    double *z = malloc(len * sizeof(double));
    solve(n, a, z);
    int err = libData->MTensor_new(MType_Real, 1, &len, &out);
    mreal *out_data = libData->MTensor_getRealData(out);
    memcpy(out_data, z, len * sizeof(double));
    free(a);
    free(z);
    MArgument_setMTensor(Res, out);
    return LIBRARY_NO_ERROR;
}

DLLEXPORT int rootsCounts(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    MTensor out;
    mint n = MArgument_getInteger(Args[0]);

    MTensor T_arg1 = MArgument_getMTensor(Args[1]);
    mreal *xRange = libData->MTensor_getRealData(T_arg1);
    mreal x1 = xRange[0], x2 = xRange[1];
    int w = (int)xRange[2];

    MTensor T_arg2 = MArgument_getMTensor(Args[2]);
    mreal *yRange = libData->MTensor_getRealData(T_arg2);
    mreal y1 = yRange[0], y2 = yRange[1];
    int h = (int)yRange[2];

    mint len = w * h;
    int err = libData->MTensor_new(MType_Integer, 1, &len, &out);
    mint *out_data = libData->MTensor_getIntegerData(out);
    memset(out_data, 0, len * sizeof(mint));

    #pragma omp parallel for
    for (int ii = 0; ii < (1 << n); ii++) {
        double a[n + 1], z[2*n];              /* per-thread work buffers */
        for (int jj = 0; jj < n; jj++) {
            a[jj] = 2*((ii >> jj) & 1) - 1.0; /* coefficients are -1 or +1 */
        }
        a[n] = 1.0;
        solve(n, a, z);
        for (int k = 0; k < n; k++) {
            double x = z[2*k];
            double y = z[2*k + 1];
            int i = (int)(w*(x - x1)/(x2 - x1));
            int j = (int)(h*(y - y1)/(y2 - y1));
            if (0 < i && i < w && 0 < j && j < h) {
                #pragma omp atomic
                out_data[i + w*j]++;
            }
        }
    }

    MArgument_setMTensor(Res, out);
    return LIBRARY_NO_ERROR;
}

xzczd
chyanog