I need to find the rank (and eventually do the Gaussian elimination) of a large, sparse, non-square matrix of integers. There are a few methods in Mathematica to find the rank of a non-square matrix (or decompose it, and thereby find the rank) that work more or less out of the box: MatrixRank, SingularValueList, QRDecomposition, HermiteDecomposition
For large and sparse matrices there is an ultra fast algorithm of LU decomposition implemented through LinearSolve[ ... , Method -> "Multifrontal"] . It needs a square matrix but that can be achieved just by completing the square with zeroes (or "Background" in the language of SparseArray). This gives a very fast method to calculated the SparseMatrixRank.
The problem is that the underlying algorithm does not always work when the matrix is singular (LinearSolve gives a warning also), but if the decomposition reproduces the original matrix then we can trust the result.
In my real case it does not work maybe because the system is too big (30k x 100k) or too dense (~4/30k) or it has to do with its shape. So I am doing a very simple LU decomposition by hand that gives me the rank. I was hoping to use the functionalities of SparseArray to make it as fast as possible, using as little memory as possible.
So far, my algorithm gives always the correct result, even for a bit denser matrices where Multifrontal fails. But it is nowhere near as fast, and sometimes it eats all my RAM.
If you have suggestions on how to improve this code, maybe exploiting the Methods of SparseArray, or maybe spoting some obvious inefficiencies of my coding, I think that it could be useful for many people that do not want to resort to other programming languages to face large systems.
My simpleLU
Clear[simpleLU]
simpleLU[A_SparseArray] :=
Module[{L, U, m, n, rk, ck, rank = 0, auxA = A, js, subt},
{m, n} = Dimensions[A];
L = IdentityMatrix[m, SparseArray];
subt = ConstantArray[0, m];
(Loop over all rows i that aren't zero )
Do[
If[Total[Abs[auxA[[i]]]] != 0,
rank++;
(*Pivot has indices {rk,ck}*)
rk = i;
ck =
auxA[[i]][
"NonzeroPositions"][[FirstPosition[
Flatten[
auxA[[i]]["NonzeroValues"]], _?(# != 0 &)][[1]]]][[1]];
(*Preselect which rows (js) have nonzero elements below the pivot*)
js = Select[
Flatten[auxA[[All, ck]][
"NonzeroPositions"]], (# >= rk + 1 &)];
js = Select[js, (auxA[[#, ck]] != 0 &)];
(*Make zero all elements below the pivot auxA[[rk,ck]] *)
L[[js, rk]] = auxA[[js, ck]]/auxA[[rk, ck]];
Do[subt[[j]] = L[[j, rk]]*auxA[[rk]], {j, js}];
auxA[[js]] -= subt[[js]]; (*<- bottleneck*)
, {}];
, {i, m}];
U = auxA;
Print["simpleLU worked? ", A == L . U];
Print["simpleLU rank: ", rank];
{L, U}]
The SparseMatrixRank that does not always work
SparseMatrixRank[A_SparseArray?MatrixQ] :=
Module[{a, S, U, L, p, q, d1, d2},
{d1, d2} = Dimensions[A];
a = If[d1 > d2,
Join[Transpose[A],
SparseArray[{}, {d1 - d2, d1}, A["Background"]]],
Join[A, SparseArray[{}, {d2 - d1, d2}, A["Background"]]]];
S = Quiet@LinearSolve[a, Method -> "Multifrontal"];
U = S["getU"];
L = S["getL"];
{p, q} = S["getPermutations"];
Print["Multifrontal worked? ", (L . U)[[p, q]] == a];
Print["Multifrontal rank: ", Total[Unitize[Diagonal[Chop[U]]]]];
]
Denser matrices where sometimes SparseMatrixRank fails
n = 3000;
d2 = 100;
d1 = 50;
i = RandomInteger[{1, d1}, n];
j = RandomInteger[{1, d2}, n];
v = RandomInteger[{-10, 10}, n];
A = SparseArray[Transpose[{i, j}] -> v, {d1, d2}];
A // SparseMatrixRank // EchoTiming;
A // simpleLU // EchoTiming;
Update:
I have improved some basic syntax of simpleLU and now it runs 5x faster (the code is updated above). Depending on the density and size of the matrix, it outperforms other built-in methods (SingularValueList, QRDecomposition, HermiteDecomposition). And always gives the correct rank (unlike Multifrontal). It is still slow for very large systems, but so far it is the best I have for my problem.
If someone is willing to have a look I'm fairly sure it can still be improved, I have marked the bottleneck.
As @HenrikSchumacher pointed out, it would be great to have an interface to the methods of SuiteSparse. If you have untreatably large systems, you probably have to go there. Besides C++ they have an interface with MATLAB.
SparseArray`NestedDissectionis typically a good starting point, at least to compute a decent elimination tree... What you have in mind is probably closer toSparseArray`ApproximateMinimumDegree. – Henrik Schumacher Apr 24 '23 at 13:30There is no method ApproximateMinimumDegree for SparseArray objects. BothApproximateMinimumDegreeandNestedDissectionappear under Names["SparseArray`*"]. Can anybody give a very minimal example of how to use these methods? – Albercoc May 02 '23 at 12:49SparseArray`ApproximateMinimumDegree[A]andSparseArray`NestedDissection[A](of courseAhas to be aSparseArraythat is also a matrix). – Henrik Schumacher May 02 '23 at 18:37p=SparseArray`ApproximateMinimumDegree[A]; A=A[[p,p]]and it saves a 30% of time (!) if the matrix is already square. Unfortunately it gets compensated from having a larger matrix (making square a rectangular matrix). On the other hand it seems likeNestedDissection only works with structurally symmetric matrices– Albercoc May 03 '23 at 07:47