
I'm doing FEM with Mathematica and constantly have to call LinearSolve on a large matrix inside a Newton loop, so I need to speed up LinearSolve. I'm aware of this post.

Similarly, I test with the following code. First I tried with dense matrix:

Clear["Global`*"];
dim = 5000;
a = RandomReal[{1, 2}, {dim, dim}];
b = RandomReal[{1, 2}, dim];
Table[SetSystemOptions["ParallelOptions" -> "MKLThreadNumber" -> i];
  Print["Case=", i];
  Print[SystemOptions["ParallelOptions" -> "MKLThreadNumber"]];
  t = AbsoluteTime[];
  LinearSolve[a, b];
  time2 = AbsoluteTime[] - t;
  Print["t(", i, ")=", time2];
  Print["******"], {i, 4}];

The output is:

Case=1

{ParallelOptions->{MKLThreadNumber->1}}

t(1)=2.155357

******

Case=2

{ParallelOptions->{MKLThreadNumber->2}}

t(2)=1.279845

******

Case=3

{ParallelOptions->{MKLThreadNumber->3}}

t(3)=1.525075

******

Case=4

{ParallelOptions->{MKLThreadNumber->4}}

t(4)=1.246443

******

From the above test, it seems that LinearSolve is automatically parallelized, which is good. My laptop has only 2 cores, hence there is not much improvement for 2, 3, and 4 threads. Then I ran it again with a sparse matrix:

Clear["Global`*"];
dim = 5000;
a = RandomReal[{1, 2}, {dim, dim}];
b = RandomReal[{1, 2}, dim];
sa = SparseArray[a];
sb = SparseArray[b];
Table[SetSystemOptions["ParallelOptions" -> "MKLThreadNumber" -> i];
  Print["Case=", i];
  Print[SystemOptions["ParallelOptions" -> "MKLThreadNumber"]];
  t = AbsoluteTime[];
  LinearSolve[sa, sb];
  time2 = AbsoluteTime[] - t;
  Print["t(", i, ")=", time2];
  Print["******"], {i, 4}];

However, it seems that LinearSolve needs much more time to deal with the sparse matrix than with the dense matrix.

Case=1

{ParallelOptions->{MKLThreadNumber->1}}

t(1)=11.346750

******

Case=2

{ParallelOptions->{MKLThreadNumber->2}}

t(2)=5.508803

******

Case=3

{ParallelOptions->{MKLThreadNumber->3}}

t(3)=5.551608

******

Case=4

{ParallelOptions->{MKLThreadNumber->4}}

t(4)=5.518430

******

This seems a little weird to me, since in MATLAB I have seen efficiency improve after converting a dense matrix into a sparse one. Can anyone explain this? Is there any other way to speed up LinearSolve?

Wilhelm
    Using a sparse array is effective when the array is actually sparse. You are converting a dense array into a sparse array that has a sparsity of 0, so I'm not surprised that sparse methods are slower. – Carl Woll Nov 08 '17 at 23:28
    If sparse matrix methods were faster than methods for dense matrices, when working with dense matrices, then the dense methods would disappear. Voting to close on grounds that there is no clear reason to expect behavior any different from what was observed. – Daniel Lichtblau Nov 09 '17 at 00:00
  • As already noted, trying to use sparse methods on a demonstrably dense matrix is a fool's errand. You had chosen an especially poor example to present your premise. – J. M.'s missing motivation Nov 09 '17 at 00:55
  • I'm voting to close this question as off-topic because the issue raised is not really a problem; it arises from the OP's not having the necessary understanding of the Mathematica objects he is working with. – m_goldberg Nov 09 '17 at 02:09
    With only 2 kennels there is an upper bound on how many dogs you can have running efficiently in parallel. – mathreadler Nov 09 '17 at 06:54

1 Answer


You cannot benefit from SparseArray if your matrix is not sparse! The data structure backing a SparseArray carries a certain amount of overhead that only pays off if the matrix is really, really sparse (density below a per mille or so). Moreover, whenever you find yourself converting a large dense matrix into a sparse matrix, you have most likely done something wrong: sparse arrays should be built in a sparse way from the start, with the dedicated constructor functions (called SparseArray in Mathematica and sparse in MATLAB).
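As an illustration (a minimal sketch; the tridiagonal matrix and the dimension are arbitrary choices, not taken from the question), a genuinely sparse system can be assembled directly with the constructor instead of densifying first:

dim = 100000;
(* assemble a tridiagonal matrix directly as a SparseArray; density is about 3/dim *)
sa = SparseArray[
   {Band[{1, 1}] -> 2., Band[{2, 1}] -> -1., Band[{1, 2}] -> -1.},
   {dim, dim}];
b = RandomReal[{1, 2}, dim];
AbsoluteTiming[LinearSolve[sa, b];]

A system of this size solves in a fraction of a second, while the corresponding dense 100000 x 100000 matrix would not even fit into memory on a typical laptop.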

But admittedly, one point that bugs me is that Mathematica uses UMFPACK to factorize SparseArrays (as the backend of LinearSolve). While UMFPACK is quite robust, it is rather slow, extremely memory hungry, and not parallelized. However, there is the alternative of the mostly undocumented option Method -> "Pardiso" in LinearSolve, which uses the Intel MKL Pardiso solver; and Pardiso is (partially) parallelized (back substitution is hardly parallelizable).
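A minimal sketch of how the two solvers could be compared (the tridiagonal test matrix here is an arbitrary stand-in for an FEM stiffness matrix, not from the question):

dim = 100000;
sa = SparseArray[
   {Band[{1, 1}] -> 2., Band[{2, 1}] -> -1., Band[{1, 2}] -> -1.},
   {dim, dim}];
b = RandomReal[{1, 2}, dim];
(* default sparse direct solver, backed by UMFPACK *)
AbsoluteTiming[LinearSolve[sa, b, Method -> "Multifrontal"];]
(* Intel MKL Pardiso solver *)
AbsoluteTiming[LinearSolve[sa, b, Method -> "Pardiso"];]

For repeated solves inside a Newton loop, it also pays to call LinearSolve[sa] once to obtain a LinearSolveFunction, so the factorization is reused across iterations instead of being recomputed.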

Henrik Schumacher
  • Thank you for pointing that out. Do you imply that LinearSolve is not parallelized by saying "While UMFPACK is quite robust, it is rather slow, extremely memory hungry, and not parallelized"? – Wilhelm Nov 08 '17 at 23:57
  • The reason I said that is that I set SetSystemOptions["ParallelOptions" -> "MKLThreadNumber" -> 1] and SetSystemOptions["ParallelOptions" -> "MKLThreadNumber" -> 16], respectively. However, there is little difference in computing time. – Wilhelm Nov 09 '17 at 00:05
    Yes, that's what I say. Still, LinearSolve will perform orders of magnitude better for SparseArrays arising in FEM than for the corresponding densified matrices. – Henrik Schumacher Nov 09 '17 at 00:12
  • I think you may assume that the default parameters "ParallelOptions" are already optimized for your system. – Henrik Schumacher Nov 09 '17 at 00:13
  • If this is the case, how is the computing time decreased from 2.155357 to 1.279845 when changing MKLThreadNumber from 1 to 2? @Henrik Schumacher – Wilhelm Nov 09 '17 at 16:46
  • Ah, interesting. Seemingly UMFPACK still profits from parallelization. Thinking of it: UMFPACK might call BLAS or SparseBLAS routines and those are parallelized. – Henrik Schumacher Nov 09 '17 at 17:01