I need to create matrix from a large array (>>10000 elements) for further use. I have tried Outer and ParallelTable commands, among which Outer seems to be faster when the array size is not so large;
Edit: Array ti below is given in terms Range for simplicity but in general it is intended to represent an arbitrary time series.
<< Developer`
tf = 100;
d = 1;
ti = N@Range[-tf, tf, d];
Nt = Length[ti];
m1 = Developer`ToPackedArray[Outer[Subtract, ti, ti]]; // RepeatedTiming
{0.007300590625, Null}
m2 = Developer`ToPackedArray[ParallelTable[Table[ti[[i]] - ti[[j]], {j, 1, Nt}],
{i, 1, Nt}]]; // RepeatedTiming
{0.03681925625, Null}
But for larger array size I have got
tf = 1000;
d = 1;
ti = N@Range[-tf, tf, d];
Nt = Length[ti];
m1 = Developer`ToPackedArray[
Outer[Subtract, ti, ti]]; // RepeatedTiming; // RepeatedTiming
{0.74297815, Null}
m2 = Developer`ToPackedArray[ParallelTable[Table[ti[[i]] - ti[[j]], {j, 1, Nt}], {i, 1,
Nt}]]; // RepeatedTiming
{0.1460876, Null}
which is run on a 64-core machine (ver. 12.3). I was wondering if there is a more efficient way of doing this. I also need to construct the matrix:
tf = 1000;
d = 1;
ti = N@Range[-tf, tf, d];
Nt = Length[ti];
m3 = Developer`ToPackedArray[
ParallelTable[Table[If[i != j, 1/(ti[[i]] - ti[[j]]), 0], {i, 1, Nt}], {j, 1,
Nt}]]; // RepeatedTiming
{0.8566911, Null}
where diagonal values are to be excluded. Apparently, the if statement inside Table introduces a slow down. I have seen a similar problem posted here, but it is not obvious to me how to implement those solutions inside ParallelTable.
ToPackedArray[Outer[Subtract, ti, ti]]is about 750 slower thanToPackedArray[Outer[Plus, ti, -ti]];– Craig Carter Mar 05 '24 at 18:20m1is a rank-1-matrix and there are many ways to exploit this.m2is a Toeplitz matrix, and there also many algorithms that avoid forming it in the first place (e.g., some use fast Fourier transform). So I think it is quite likely that you don't have to brute-force it. – Henrik Schumacher Mar 06 '24 at 04:03I_{ij}. But you don't have to store them all at once in memory. Instead, you create on accumulation variablesum = 0.and then runDo[sum += f[i,j], {i, -tf, tf, s}, {j, -tf, tf, s}]. Problem is that this is a loop and loops are slow in Mathematica. But they are fast inCompile. That spares you from working on large memory chunk - which have to be loaded chunkwise into the processor, first. And this kind of bulk memory is much, much slower than the little scratch space that the processor can work on directly. – Henrik Schumacher Mar 06 '24 at 13:12Do[sum += f[i,j], {i, -tf, tf, s}, {j, -tf, tf, s}]is still quadratic inft. The point here is to involve only fast components of the machine. – Henrik Schumacher Mar 06 '24 at 13:16