Problem
Suppose a matrix-vector product of the form $M\vec{v}$ has to be calculated, where the storage needed for $M$ is noticeably larger than the RAM available on the machine. What is the fastest way to perform such an operation?
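For a sense of scale (the dimension below is only an illustrative number, not my actual problem size): even a moderately large dense matrix of machine reals no longer fits into RAM.

(* rough storage estimate for a dense n x n matrix of 8-byte machine reals *)
n = 50000;       (* illustrative size, not my actual dimension *)
n^2*8/10.^9      (* 20 GB, more than the RAM of a typical workstation *)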
Slow solution approach
Since Mathematica handles such problems by providing InputStream objects, I thought of using them and implemented a version where the matrix $M$ is given as a (very long) stream. I Dot it row-wise with the vector $\vec{v}$ and collect the results into the final result vector:
Table[Dot[NextStreamEntries[matrixStr, VectorLength], v], {ii, VectorLength}]
Here matrixStr is the matrix as an InputStream and VectorLength is the Length of the vector $\vec{v}$. By the way, I use a Table to do exactly the same thing VectorLength times. This seems weird, but I did not find another fast (!) solution.
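For reference, here is a minimal self-contained version of what I do; the file name "matrix.bin" and the assumption that the file stores the matrix row by row as Real64 values are placeholders, and v is the in-memory vector $\vec{v}$:

(* open the binary matrix file written by the external code *)
matrixStr = OpenRead["matrix.bin", BinaryFormat -> True];

(* read the next count Real64 values from the stream, i.e. one matrix row *)
NextStreamEntries[stream_, count_] := BinaryReadList[stream, "Real64", count];

VectorLength = Length[v];

(* dot each row with v as it arrives, so only one row is held in memory at a time *)
result = Table[
   Dot[NextStreamEntries[matrixStr, VectorLength], v],
   {ii, VectorLength}];

Close[matrixStr];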
The performance of this is a catastrophe. Using RuntimeTools`Profile I found that the bottleneck is this function
NextStreamEntries[stream_, count_] := BinaryReadList[stream, "Real64", count];
with which I step through the rows of the matrix.
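A simpler way to see the same thing as the profile is to time a single row read against a single dot product (a sketch only; it reopens the stream and looks at the first row):

(* compare the cost of reading one row with the cost of dotting it with v *)
matrixStr = OpenRead["matrix.bin", BinaryFormat -> True];
{readTime, row}    = AbsoluteTiming[NextStreamEntries[matrixStr, VectorLength]];
{dotTime, rowDotV} = AbsoluteTiming[row.v];
Close[matrixStr];
{readTime, dotTime}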
On request: What does the matrix store and where does it come from?
The matrix comes from outside MMA since performance matters. A C++ routine writes it to disk as a binary file, and I read it in bit by bit from MMA with BinaryReadList. The matrix elements are quantum mechanical expectation values (needed in the context of the iterated equations of motion approach to describe non-equilibrium physics), with very few elements equal to zero.
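To make explicit what I assume about the file layout: a flat, row-major sequence of 8-byte Real64 values in the machine's native byte order. The round trip below (written in Mathematica rather than C++, with a made-up file name and a tiny test matrix) only checks that reading with this layout is consistent:

(* write a small test matrix in the assumed layout: flattened rows of Real64 *)
testM = RandomReal[1, {4, 4}];
out = OpenWrite["testmatrix.bin", BinaryFormat -> True];
BinaryWrite[out, Flatten[testM], "Real64"];
Close[out];

(* read it back row by row, exactly as in the streaming approach above *)
in = OpenRead["testmatrix.bin", BinaryFormat -> True];
readBack = Table[BinaryReadList[in, "Real64", 4], {4}];
Close[in];

readBack == testM   (* True if the layout assumption holds *)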


Comments

ParallelMap[#.v &, m] – Edmund Aug 19 '17 at 12:40

SparseArray. – Henrik Schumacher Aug 19 '17 at 14:16

… SparseArray in my own code. For reference: My matrix has about 1% of elements equal to zero. – pbx Aug 19 '17 at 15:02