3

I am interested in the most efficient way to transform a huge rectangular array to a sparse array structure.

Consider a numeric rectangular table that is rather sparse, with zero positions being set to Null rather than 0, which saves time and memory during the fill. The problem is that SparseArray function can not be directly applied to that kind of table. First, every Null should be replaced by 0, as can be seen in the following example:

Q = 100;(*the matrix dimension is QxQ.*)
f[p1_,p2_]:=If[PrimeQ[p1+p2],1,Null];(*the specific form of the matrix elements chosen for illustrative purposes only*)
mat=ParallelTable[f[p1,p2],{p1,Q},{p2,Q}];(*the matrix to be later compressed to SparseArray*)
mat0=mat/.(Null)->0;(*Here we replace Nulls with zeroes*)
sparse=SparseArray[mat0];(*Now the SparseArray may be applied*)

However, during such replacement procedure, the ByteCount[mat0] is dramatically increased, and at sufficiently large Q one runs out of memory. Besides, it does not seem logical to artificially inflate the matrix in order to be able to compress it afterwards.

So, how can I transform a huge rectangular array full of Nulls to a sparse array structure, without replacing Nulls with zeroes?

Thank you!

P.S. Other ways to form a sparse array that I am aware of are getting too speed- and memory inefficient as Q is taken large (say, Q > 50000), e.g. this way

SparseArray @ Flatten[ParallelTable[{p1, p2} -> f[p1, p2], {p1,Q}, {p2,Q}] /. (_ -> Null) -> Sequence[]];

takes 100Gb of memory to form a sparse array of 10 Gb.

P.P.S. One could ask, what's the need to compress a matrix that is already quite sparse? The SparseArray version takes less memory, anyway: ByteCount[sparse] is less than ByteCount[mat].

P.P.P.S. Above was the toy problem. The more realistic code is as follows:

<< CompiledFunctionTools`
Compiler`$CCompilerOptions = {"SystemCompileOptions" -> 
"-fPIC -Ofast -march=native"};
On[ Compile::noinfo]
n = 4;(*the number of electrons*)
Upp = 6;(*Upp is the upper occupied 
quantum state*)
a = Subsets[Range[Upp], {n}];(*a is the set of all many-particle states*)
Q = Binomial[Upp, n];(*The number of many-particle states in a*)
(*Fast compiled function that compares two vectors and returns the positions of different elements.*)
VectorCompare = 
Compile[{{v1, _Integer, 1}, {v2, _Integer, 1}}, Block[{i1 = 1, i2 = 1, d1 = Internal`Bag@Most[{0}], 
 d2 = Internal`Bag@Most[{0}]},
(*Run along the lists,recording differences as we go*)
 While[i1 <= Length[v1] && i2 <= Length[v2], 
 Which[v1[[i1]] < v2[[i2]], Internal`StuffBag[d1, i1]; i1++, 
  v1[[i1]] > v2[[i2]], Internal`StuffBag[d2, i2]; i2++, True, i1++;
   i2++]];
(*Fix up in case we ran off the end of one of the lists*)
While[i1 <= Length[v1], Internal`StuffBag[d1, i1]; i1++];
While[i2 <= Length[v2], Internal`StuffBag[d2, i2]; i2++];
{Internal`BagPart[d1, All], Internal`BagPart[d2, All]}], 
"CompilationTarget" -> "C", 
CompilationOptions -> {"ExpressionOptimization" -> True, 
 "InlineExternalDefinitions" -> True}, RuntimeOptions -> "Speed", 
RuntimeAttributes -> {Listable}, Parallelization -> True];

CompareMatrix = SparseArray@
Developer`ToPackedArray[
ParallelTable[
  If[p1 < p2, vc = VectorCompare[a[[p1]], a[[p2]]]; 
   diff = Length@Flatten@vc; 
   Which[diff == 2, {vc[[1, 1]], vc[[2, 1]], 0, 0}, 
    diff == 4, {vc[[1, 1]], vc[[2, 1]], vc[[1, 2]], 
     vc[[2, 2]]}]], {p1, Q}, {p2, Q}] /. (Null) -> {0, 0, 0, 0}];

The array I need could be built without any unnecessary zeros by running the following code:

CMB2 = Internal`Bag@Most[{0}];
CMB4 = Internal`Bag@Most[{0}];
Do[vc = VectorCompare[a[[p1]], a[[p2]]]; l = Length@Flatten@vc; 
Which[l == 2,Internal`StuffBag[CMB2, {p1, p2, vc[[1, 1]], vc[[2, 1]]}],
l == 4,Internal`StuffBag[CMB4, {p1, p2, vc[[1, 1]], vc[[2, 1]], vc[[1, 2]], vc[[2, 2]]}]], {p1, 1, Q - 1}, {p2, p1 + 1, Q}];

If only it were parallelizable...

Edit: The fastest and memory-efficient solution is as follows:

CM=Join @@ ParallelMap[Developer`ToPackedArray,
Table[vc=VectorCompare[a[[p1]],a[[p2]]];diff = Length@Flatten@vc;
Which[diff > 4, empty, 
diff == 4, {p1,p2,vc[[1,1]],vc[[1,2]],vc[[2,1]],vc[[2,2]]},
True, {p1,p2,vc[[1,1]],vc[[2,1]],0,0}],
{p1,1,Q-1},{p2,p1+1,Q}]];

In essence, this closes the question.

Yasha Gindikin
  • 458
  • 3
  • 7
  • Using a symbol like Null here instead of 0 (or 0. for a matrix of reals) is much more likely to increase memory usage ... – Szabolcs May 16 '14 at 19:45
  • @Szabolcs Not in the case of an unpacked array it seems, at least in version 7. What do you get for the three ByteCounts in my example? – Mr.Wizard May 16 '14 at 19:48
  • I don't believe ByteCount is accurate here. It's known to double count in certain cases when several parts of an expression are really a reference to the same thing in memory. ByteCount@Outer[If[PrimeQ[#1 + #2], 1] &, Range[100], Range[100]] is indeed about half the size of ByteCount@Outer[If[PrimeQ[#1 + #2], 1, 0] &, Range[100], Range[100]]. But if I use MaxMemoryUsed (introduced in v9 I think, at least for use this way) instead of ByteCount, I get the same result for both. – Szabolcs May 16 '14 at 19:53
  • 1
    @Mr.Wizard Here's v7-compatible proof that there's no difference: screenshot pastebin – Szabolcs May 16 '14 at 19:56
  • @Szabolcs Point taken. – Mr.Wizard May 16 '14 at 19:58
  • 1
    @Szabolcs Upon increasing the dimensions, some differences arise: screenshot – Yasha Gindikin May 16 '14 at 23:08
  • I notice that you Accepted my answer. Thanks. I'm sorry I never responded to your update, but besides being a bit busy I also cannot compile-to-C in version 7, so it will be hard to know if anything I suggest is an improvement or not. Nevertheless I'm taking a look at it now and I'll let you know if I have any ideas. – Mr.Wizard May 19 '14 at 20:05
  • Would you please describe the output of VectorCompare? – Mr.Wizard May 19 '14 at 20:33
  • @Mr.Wizard Your answer speeds my calculation a lot, thank you! However, my algorithm by design suffers from the huge redundant memory usage. I wish there were more economical alternative to build the sparse array, and ideally a parallelizable one.

    The VectorCompare functions takes two vectors consisting of integers, taken in increasing order, e.g. {1, 6, 7, 8} and {1, 5, 6, 9}. It returns the positions of different elements, i.e. {3, 4} and {2, 4} in the above example. It is useful for Slater's rules in quantum mechanics. The function easily compiles to MVM, just remove the CompilationTarget.

    – Yasha Gindikin May 19 '14 at 21:31
  • Thanks. I new I could compile it, and did, but the output wasn't as I expected because I didn't understand what you were doing (which seem quite clever). I notice that you have included RuntimeAttributes -> {Listable} in your function but you don't appear to be making use of that; comment? – Mr.Wizard May 19 '14 at 23:42
  • @Mr.Wizard It's only because I don't know how to use it effectively.:) I heard it allows one for automatic parallelization and all that, but I was only able to parallelize 'by hand'. – Yasha Gindikin May 20 '14 at 04:10

1 Answers1

3

You can specify a background for the array using the third parameter:

SparseArray[mat, Automatic, Null]
SparseArray[<2089>,{100,100},Null]

However, if you can pack the (non-sparse) array it should take less memory to use 0 than it does to use Null, assuming the rest of the array elements are machine-size integers. Example:

f2[p1_, p2_] := If[PrimeQ[p1 + p2], 1, 0];
mat2 = Table[f2[p1, p2], {p1, Q}, {p2, Q}];

mat3 = Developer`ToPackedArray[mat2];
ByteCount /@ {mat, mat2, mat3}
{117456, 244032, 40128}

If you can give your actual f function I can perhaps suggest code to build a packed array row by row. You could create a packed array with e.g. ConstantArray, then set values as needed using Part and Set. However, I shall wait for an example of your actual code before making additional recommendations.

Mr.Wizard
  • 271,378
  • 34
  • 587
  • 1,371
  • Specifing a background for the array is a brilliant idea, thank you! In essence, that solves my problem. As to packed arrays, I tried them of course. You see, a packed array with zeroes takes less memory then unpacked with Nulls, although SparseArray is even more compact. The real problem is that building the sparse array via the packed array inflated with zeroes takes memory larger by orders of magnitude. Please see examples in the following comment. – Yasha Gindikin May 16 '14 at 20:21
  • Q = 10000;
    f[p1_, p2_] := If[PrimeQ[p1 + p2], 1];
    LaunchKernels[];
    mat = ParallelTable[f[p1, p2], {p1, Q}, {p2, Q}]; // AbsoluteTiming
    sparse = SparseArray[mat, Automatic, Null];
    ByteCount[sparse]
    MaxMemoryUsed[] {76.715798, Null} 175463256 1204043984
    – Yasha Gindikin May 16 '14 at 20:21
  • Q = 10000;
    f2[p1_, p2_] := If[PrimeQ[p1 + p2], 1, 0];
    LaunchKernels[];
    mat2 = Developer`ToPackedArray[ ParallelTable[f[p1, p2], {p1, Q}, {p2, Q}]]; // AbsoluteTiming
    sparse2 = SparseArray[mat2];
    ByteCount[sparse2]
    MaxMemoryUsed[] {266.039460, Null} 10400080728 17037906848
    – Yasha Gindikin May 16 '14 at 20:23
  • @Yasha I was proposing constructing the array from packed rows, rather than packing the entire array afterward. Is this function with PrimeQ a toy example or is it actually representative of your application? – Mr.Wizard May 16 '14 at 20:28
  • It is a toy example. I would like to provide the actual code but I am not sure that comment is roomy enough for this. May I add it to the original post? Besides, I have difficulties with this mini-markdown formatting in comments, sorry for that. – Yasha Gindikin May 16 '14 at 20:30
  • @Yasha Yes, you may edit your question to add it. Please do. – Mr.Wizard May 16 '14 at 20:31
  • I provided the code. The CompareMatrix is the sparse array I need. – Yasha Gindikin May 16 '14 at 21:05
  • @Yasha Thanks. I don't have time to look at it now (I'm going hiking!) but I'll try to come back to it later. – Mr.Wizard May 16 '14 at 21:08
  • Thank you very much and have a nice day! Meanwhile, I added a piece of code that could allow one to do without SparseArray at all, but, unfortunately, it can not be straightforwardly parallelized, which is crucial for me. – Yasha Gindikin May 16 '14 at 21:16
  • I think I managed to write the code to build a packed array row by row. It is added to the end of the post. – Yasha Gindikin Jun 17 '14 at 12:53