2

I have a list bdrs of SparseArrays, which represent the boundary matrices of a chain complex $R^{d_0}\overset{\partial_1}{\longleftarrow}R^{d_1}\longleftarrow\ldots\longleftarrow R^{d_{N-1}}\overset{\partial_N}{\longleftarrow}R^{d_N}$ over $R\!\in\!\{\mathbb{Z}, \mathbb{Q}, \mathbb{Z}_p\}$. I store this data as $d_0,\ldots,d_N$=dims, $N$=dim, $R$=p$\!\in\!\{$"Z",0,p$\}$. Typically, $\partial_k$ has dimensions $10^6\!\times\!10^6$ and density $10^{-5}$. See this or this or this link, to understand how SparseArray stores data.


For $k=1,\ldots,N$, I wish to (a) compute a sequence Mm=$M_1,\ldots,M_N$, in which $M_k$ is the set of all invertible entries in $\partial_k$, that have zeros left of them (in their row) and below them (in their column). Then for every $(i,j)$-entry in $M_k$, I wish to (b) delete it from $\partial_k$, and delete the $i$-column from $\partial_{k-1}$ and $j$-row from $\partial_{k+1}$.


First, if $R\!=\!\mathbb{Z}_p$ I mod out the entries, then delete all zero entries, and sort the rest of the entries:

If[p!="Z" && p!=0, bdrs=Mod[bdrs,p]];
bdrs = Table[SparseArray[Sort@ArrayRules@bdrs[[k]],dims[[k;;k+1]]], {k,dim}];

Attempt 1:

Note that some $d_k$ may be $0$. The construction is done by

Mm = DeleteCases[ Table[
      i=bdrs[[k,All,j]]["NonzeroPositions"][[-1,1]]; w=bdrs[[k,i,j]];   
      If[bdrs[[k,i]]["NonzeroPositions"][[1,1]]==j && If[p=="Z", w^2==1, True], 
       bdrs[[k,i,j]]=0;  {i,j}->w, ""], 
      {k,dim}, {j, If[dims[[k]]*dims[[k+1]]==0, {}, 
        DeleteDuplicates@Flatten@bdrs[[k]]["ColumnIndices"]]}], "",2];

or

Mm = DeleteCases[ Table[ {i,j}=e; w=bdrs[[k,i,j]]; 
    If[bdrs[[k,All,j]]["NonzeroPositions"][[-1]]=={i} && If[p=="Z", w^2==1, True],
     bdrs[[k,i,j]]=0;  {i,j}->w, ""], 
    {k,dim}, {e, If[dims[[k]]*dims[[k+1]]==0, {}, 
      First/@ GatherBy[bdrs[[k]]["NonzeroPositions"], First]]}], "",2];

and then deletion of rows/columns:

Do[ If[1<k,   bdrs[[k-1, All, #[[1,1]]&/@Mm[[k]] ]] = 0];   
    If[k<dim, bdrs[[k+1, #[[1,2]]&/@Mm[[k]], All ]] = 0];, {k,dim}]; 

Both constructions (a) are awfully slow on large matrices, but (b) is really fast.

Attempt 2:

Instead of a SparseArray, I use an Association of rows and of columns.

rows[bdr_] := With[
  {l=GatherBy[ ArrayRules[bdr][[1;;-2]] /.({i_,j_}->w_Integer) :> {i,j,w}, First]}, 
  Association@Table[r[[1,1]]->(Rest/@r), {r,l}]]; 
mm[r_,c_] := Module[{j,w}, Association@ DeleteCases[ Table[ {j,w}=r[i][[1]]; 
  If[c[j][[-1]]=={i,w} && If[p=="Z",w^2==1,True], {i,j}->w,""],{i,Keys@r}],""]];
Mm = Table[r=rows@bdrs[[k]]; c=rows@Transpose@bdrs[[k]]; mm[r,c], {k,dim}];

Here (a) is much faster, but if I replace each bdrs[[k]] with its association of rows and of columns, then achieving (b) becomes very slow (applying DeleteCases[#,MemberQ[#[[1,1]]&/@Mm[[k]],#]&]& to all values in the Association).

Comment:

Later, I will be computing a lot sums of rows, so I need to keep either bdrs or rows & columns of these matrices as Associations.

Examples for testing purposes:

Let us use a command, that builds up a chain complex from the faces of a simplicial complex:

chainComplexSC[bases_] := Module[{dim=Length@bases, dims=Length/@bases, basesk, baseskk, bdrs={}, entries, bdr, x},
   basesk = AssociationThread[bases[[1]],Range@dims[[1]]];     
   Do[ baseskk=AssociationThread[bases[[k]],Range@dims[[k]]];
    entries=Flatten[Table[{basesk[Delete[s,{{i}}]],baseskk[s]}->(-1)^(i+1),{s,Keys@baseskk},{i,k}],1];
    AppendTo[bdrs, SparseArray[entries, dims[[k-1;;k]]]]; 
   basesk=baseskk; Clear[baseskk];, {k,2,dim}]; bdrs];
sCxSimplices[facets_,k_] := ParallelCombine[ DeleteDuplicates@
   Flatten[Table[Subsets[s,{k}],{s,#}],1]&, facets,Union,Method->"CoarsestGrained"];

Let us create the $m\!\times\!n$ chessboard complex:

m = 4; 
n = 4; 
facets = FindClique[GraphComplement@LineGraph@CompleteGraph[{m, n}], Infinity, All];
dim = Max[Length /@ facets];
bases = Table[sCxSimplices[facets, k + 1], {k, 0, dim - 1}];
dims = Length /@ bases;
bdrs = chainComplexSC[bases];

A good testing example is $m\!=\!8, n\!=\!9$, which has the f-vector dims=$72, 2016, 28224, 211680, 846720, 1693440, 1451520, 362880$.

Leo
  • 1,155
  • 5
  • 14
  • 1
    It would be much easier to help you if you provided an example of such a complex for testing purposes. – Henrik Schumacher Jan 07 '19 at 07:12
  • Don't get me wrong: I find this problem very interesting and tried to start right away. This problem is simply to complex to attack it without thourough testing and constructing good test cases is too time consuming. E.g. it took me quite a while to realize that instead of two sorts of entries (zeroes and nonzeroes) we need three (zeroes, noninvertible nonzeroes, and invertible nonzeroes) and that was kind of frustrating. So a potential example should feature also noninvertible nonzeroes. – Henrik Schumacher Jan 07 '19 at 07:39
  • @HenrikSchumacher You are right, I added the construction of test examples. Since it is the chain complex of a simplicial complex, it has only entries $0,\pm1$. For now, let us focus on this case (where all nonzero entries are invertible). If needed, one can replace all $-1$ entries with $2$ to get an example with nonunit entries. – Leo Jan 07 '19 at 21:09

1 Answers1

2

This might solve problem (a). You have to preprocess the input matrix A such that it has 1 at positions where the original matrix bdrs[[k]] had an invertible element and -1 at positions where the original matrix had a noninvertible nonzero element.

Here is a test matrix:

n = 10000;
m = 10000;
nedges = 4000000;
A = SparseArray`SparseArraySort@SparseArray[
    Transpose[{
       RandomInteger[{1, m}, nedges],
       RandomInteger[{1, n}, nedges]
       }] -> RandomChoice[{-1, 1}, nedges],
    {m, n}];

This computes the set of pairs (i,j} as described by OP; it runs through in about a quarter of a second.

M = A \[Function] Module[{Ainv, At, rowswithinv, firstinvcolumns, firstnonzerocolumns, rowswithleadinginv, leadinginvcols, lastnonzerorows},
   Ainv = SparseArray[Clip[A, {0, 1}]];
   rowswithinv = Random`Private`PositionsOf[Unitize[Differences[Ainv["RowPointers"]]], 1];
   If[Length[rowswithinv] > 0,
    firstinvcolumns = Flatten[Ainv["ColumnIndices"][[Ainv["RowPointers"][[rowswithinv]] + 1]]];
    firstnonzerocolumns = Flatten[A["ColumnIndices"][[A["RowPointers"][[rowswithinv]] + 1]]];
    With[{picker = Random`Private`PositionsOf[UnitStep[firstnonzerocolumns - firstinvcolumns], 1]},
     rowswithleadinginv = rowswithinv[[picker]];
     leadinginvcols = firstinvcolumns[[picker]];
     ];
    If[Length[leadinginvcols] > 0,
     At = Transpose[A[[All, leadinginvcols]]];
     lastnonzerorows = Flatten[At["ColumnIndices"][[At["RowPointers"][[2 ;; -1]]]]];
     With[{picker = Random`Private`PositionsOf[UnitStep[rowswithleadinginv - lastnonzerorows], 1]},
      Transpose[{
        rowswithleadinginv[[picker]],
        leadinginvcols[[picker]]
        }]
      ]
     ,
     {}
     ]
    ,
    {}
    ]
   ];

Now,

 M[A]; // AbsoluteTiming //First

0.227727

The only ingredients of measureable costs are the lines

Ainv = SparseArray[Clip[A, {0, 1}]]; // AbsoluteTiming // First

0.138703

and

At = Transpose[A[[All, leadinginvcols]]];

which costs about 0.096315 seconds.

However, I have not tested the implementation thoroughly, yet.

Side Remark

You might find this interesting. Your example can be generated much quicker by using Szabolcs' package "IGraphM`" along with the following code:

Needs["IGraphM`"]
facets = IGCliques[GraphComplement@LineGraph@CompleteGraph[{m, n}]];
bases = Sort@*Developer`ToPackedArray /@ GatherBy[facets, Length];
dims = Length /@ bases;

Moreover, the following function might be faster at generating the boundary operators:

chainComplexSC2[bases_] := 
  Module[{dim, dims, basesk, baseskk, bdrs = {}, entries, bdr, x, a, cf},
   dim = Length@bases;
   dims = Length /@ bases;
   cf = Compile[{{list, _Integer, 1}},
     Table[Delete[list, i], {i, 1, Length[list]}],
     CompilationTarget -> "WVM",
     RuntimeAttributes -> {Listable},
     Parallelization -> True
     ];
   basesk = AssociationThread[bases[[1]], Range@dims[[1]]];
   Do[
    baseskk = AssociationThread[bases[[k]], Range@dims[[k]]];
    a = SparseArray[
      Rule[
       Transpose[{
         Lookup[basesk, Flatten[cf[Keys@baseskk], 1]],
         Flatten[Transpose@ConstantArray[Range[dims[[k]]], k]]
         }],
       Flatten[ConstantArray[(-1)^Range[2, k + 1], dims[[k]]]]
       ],
      dims[[k - 1 ;; k]]
      ];
    AppendTo[bdrs, a];
    basesk = baseskk;
    Clear[baseskk];,
    {k, 2, dim}];
   bdrs
   ];
Henrik Schumacher
  • 106,770
  • 7
  • 179
  • 309
  • Was the family of examples that I provided of any help to you? – Leo Jan 11 '19 at 22:07
  • Oh, I've been quite busy in the last couple of days. Was the code that I provided of any help to you? – Henrik Schumacher Jan 11 '19 at 22:13
  • Umm, I don't know what I'm doing wrong, but for A=SparseArray[{{2,3}->7,{5,1}->1,{4,2}->-1,{5,5}->-1},{5,5}]; your M returns {{2,3},{5,1}} instead of {{5,1}->1,{4,2}->-1}. Also, what does SparseArray\SparseArraySort@SparseArray[...]do? Sorts the entries lexicographically? Is this better thanSparseArray[Sort@...]`? – Leo Jan 13 '19 at 23:48
  • 1
    SparseArray`SparseArraySort is totally different from SparseArray[Sort@...]; it enforces that the internal representation of the sparse array is conforming to CRS standard. This standard is heavily used in the posted piece of code, so the input matrix must be conforming. Secondly, I said that you have to preproces your input matrix: Only entries 0, -1, and 1 are allowed: 1 stands for an invertible nonzero entry and -1 for a noninvertible nonzero entry. – Henrik Schumacher Jan 14 '19 at 08:02
  • Ahaa, ok now it works correctly, and very fast! I preprocessed the matrix (over $\mathbb{Z}$) with f[w_] := Which[w==0,0,w^2==1,1,True,-1]; SetAttributes[f,Listable]; A=f@A;, but now this piece of code uses up 85% of the whole running time. Is there a better way? – Leo Jan 16 '19 at 21:38
  • 1
    In that case, Clip[A^2, {0, 1}, {-1, -1}] should work also much faster... – Henrik Schumacher Jan 16 '19 at 21:47
  • I wanted to let you know that your solutions work extremely efficiently. By providing such code, you're doing me and the community a big favour, as well as promoting Mathematica, so thank you! :D Also, thank you for the IGraphM package; it's a shame I can't upvote your answer more times. I might have some questions later on, after I've pondered on the code long enough... – Leo Jan 19 '19 at 17:53
  • I am glad to hear that this has been of help to you! – Henrik Schumacher Jan 19 '19 at 18:09
  • I think instead of SparseArraySparseArraySort[..]you could just useSparseArrayagain, e.g., ``SparseArraySparseArraySort@SparseArray[...]``. – Carl Woll Jun 25 '19 at 23:48
  • @CarlWoll I see, you're right. Actually, I had presumed that SparseArray were implemented as identity operator on SparseArrays. Nonetheless, SparseArray`SparseArraySort has the advantage that it is self-documenting and SparseArray@*SparseArray is not--a later version of mine could attempt to remove the seemingly redundant SparseArray and break the code. – Henrik Schumacher Jun 26 '19 at 00:30
  • 2
    Note that SparseArray @* SparseArray both sorts and removes background elements, while SparseArray`SparseArraySort only sorts. You may or may not want this behavior. – Carl Woll Jun 26 '19 at 00:38
  • @CarlWoll @HenrikSchumacher Does transposing or permuting rows/cols (e.g. a[[{3,1,2,5,4}]]) mess up the CSR/CSC structure? After those operations, do we need to apply SparseArray @* SparseArray@a or SparseArraySparseArraySort` again? – Leo Oct 16 '20 at 17:08
  • 1
    Apparently, that is not necessary: a = IdentityMatrix[10, SparseArray]; SparseArraySparseArraySortedQ[ a[[{3, 1, 2, 5, 4}]]]returnsTrue`, hence the new array is sorted already. – Henrik Schumacher Oct 16 '20 at 17:44
  • 1
    How come a=SparseArray[{{1,4}->14,{1,2}->12},{2,5}]; SparseArraySparseArraySortedQ@areturns True, even thougha["NonzeroValues"]returns{14,12}? I'm running MMA11.3. Also, givena, is justa=SparseArray@a;enough to make surea` is without zeros and in CSR format? – Leo Oct 16 '20 at 20:00
  • Huh. That's indeed shocking. Hm. I don't know what to say about that... – Henrik Schumacher Oct 16 '20 at 20:48