Big matrix multiplication on a single machine

Question

For example, I have two matrices that can't fit in RAM. I need an algorithm or library that can handle this, preferably in Matlab or Python.

I think some form of block matrix multiplication could work. There also seems to be an analogy between the hard drive <-> RAM, GPU RAM <-> CPU RAM, and CPU RAM <-> CPU cache levels, so could techniques that optimize for the CPU cache be reused here?

It seems that in Python I can use numpy.memmap, but I don't understand the memory consumption of this approach, and it may not be the optimal solution at all.

score 9 · Answer 1 · answered Oct 17 '13 at 12:50

I think you should have a look at PyTables, especially the tutorial given at PyData 2012. PyTables combines hierarchical datasets with a computational engine. It uses the Blosc compressor to avoid I/O bottlenecks and provides an optimized expression evaluator, tables.Expr (based on Numexpr).

GertVdE


Can you provide any example of matrix multiplication using pytables? – mrgloom Oct 21 '13 at 13:19
PyTables does not appear to provide this, nor is there a package that pulls data in reasonable chunks into SciPy matrices. – Brian Dolan Jul 18 '17 at 21:22

score 2 · Answer 2 · answered Oct 21 '13 at 12:31

More information about your matrix type would help. The easiest way to work with matrices this large is to distribute them across multiple machines and use ScaLAPACK to do the operations in parallel; it will be faster too. If you need to do it on one machine, the out-of-core techniques you were alluding to will work. PyTables, mentioned in the other answer, supports this. Generally you want to take advantage of the structure of your matrix (sparse, banded, symmetric, and so on).
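As a rough illustration of the out-of-core idea on one machine, here is a minimal sketch of blocked multiplication over numpy.memmap arrays, which the question mentions. It is not a PyTables example and not a tuned implementation. The function names `blocked_matmul` and `demo`, the file names, and the block size are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np


def blocked_matmul(a, b, out, block=1024):
    """Compute out = a @ b tile by tile (hypothetical helper).

    a, b and out can be np.memmap arrays backed by files on disk.
    Only a few block x block tiles are explicitly held in RAM at once.
    """
    n, k = a.shape
    k2, m = b.shape
    assert k == k2 and out.shape == (n, m)
    for i in range(0, n, block):
        for j in range(0, m, block):
            # Accumulator for one output tile; edge tiles may be smaller.
            acc = np.zeros((min(block, n - i), min(block, m - j)), dtype=out.dtype)
            for p in range(0, k, block):
                # np.array(...) copies each tile from the mapped file into RAM.
                a_tile = np.array(a[i:i + block, p:p + block])
                b_tile = np.array(b[p:p + block, j:j + block])
                acc += a_tile @ b_tile
            out[i:i + block, j:j + block] = acc
    return out


def demo(n=300, k=200, m=250, block=64):
    """Small self-check: compare against an in-memory product."""
    rng = np.random.default_rng(0)
    with tempfile.TemporaryDirectory() as tmp:
        a = np.memmap(os.path.join(tmp, "a.dat"), dtype="float64", mode="w+", shape=(n, k))
        b = np.memmap(os.path.join(tmp, "b.dat"), dtype="float64", mode="w+", shape=(k, m))
        out = np.memmap(os.path.join(tmp, "out.dat"), dtype="float64", mode="w+", shape=(n, m))
        a[:] = rng.standard_normal((n, k))
        b[:] = rng.standard_normal((k, m))
        blocked_matmul(a, b, out, block=block)
        ok = bool(np.allclose(out, np.asarray(a) @ np.asarray(b)))
        del a, b, out  # release the mappings before the directory is removed
    return ok
```

With float64 data, the tiles this loop allocates explicitly (two inputs, the temporary product, and the accumulator) take about 4 * block^2 * 8 bytes, so roughly 32 MiB for block=1024. The operating system may additionally cache pages of the mapped files, so the process's observed memory use can be higher. Larger blocks mean fewer passes over the files, so in practice you would probably pick the largest block that fits comfortably in RAM.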

mostlyWright


Can you provide any example of matrix multiplication using pytables? – mrgloom Oct 21 '13 at 13:18
