For example, I have two matrices that don't fit in RAM. I need an algorithm or library that can handle this, preferably in MATLAB or Python.
I think some form of block matrix multiplication could work. There also seems to be an analogy between the levels of the memory hierarchy (hard drive <-> RAM, GPU RAM <-> CPU RAM, CPU RAM <-> CPU cache), so could techniques developed for CPU cache optimization be applied here?
In Python it seems I could use numpy.memmap, but I don't understand the memory consumption of this approach, and it may not be the optimal solution at all.
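To make the idea concrete, here is a rough sketch of what I have in mind: a blocked multiplication where the operands and the result live in files opened with numpy.memmap. The names `blocked_matmul`, `demo`, the block size, and the file names are my own placeholders, not part of any library. I'm assuming that copying each tile with `np.array(...)` is what actually pulls it into RAM, so my own buffers should hold roughly three tiles at a time (about 3 * block^2 * itemsize bytes), with whatever the OS page cache decides to keep on top of that.

```python
import os
import tempfile

import numpy as np


def blocked_matmul(a, b, out, block=1024):
    """Compute out = a @ b one tile at a time.

    a, b and out may be np.memmap objects or ordinary arrays; only
    three block-by-block tiles are held in explicit buffers at once.
    """
    n, k = a.shape
    k2, m = b.shape
    if k != k2 or out.shape != (n, m):
        raise ValueError("incompatible shapes")

    for i in range(0, n, block):
        for j in range(0, m, block):
            rows = min(block, n - i)
            cols = min(block, m - j)
            # Accumulator for the current output tile, kept in RAM.
            acc = np.zeros((rows, cols), dtype=out.dtype)
            for p in range(0, k, block):
                # np.array(...) copies the tile out of the mapped file.
                a_tile = np.array(a[i:i + block, p:p + block])
                b_tile = np.array(b[p:p + block, j:j + block])
                acc += a_tile @ b_tile
            # Write the finished tile back to the (possibly mapped) output.
            out[i:i + block, j:j + block] = acc
    return out


def demo(n=300, k=200, m=250, block=64):
    """Small self-check using temporary memory-mapped files."""
    rng = np.random.default_rng(0)
    with tempfile.TemporaryDirectory() as d:
        a = np.memmap(os.path.join(d, "a.dat"), dtype=np.float64,
                      mode="w+", shape=(n, k))
        b = np.memmap(os.path.join(d, "b.dat"), dtype=np.float64,
                      mode="w+", shape=(k, m))
        c = np.memmap(os.path.join(d, "c.dat"), dtype=np.float64,
                      mode="w+", shape=(n, m))
        a[:] = rng.standard_normal((n, k))
        b[:] = rng.standard_normal((k, m))

        blocked_matmul(a, b, c, block=block)
        c.flush()

        # The demo matrices are tiny, so a direct in-memory check is fine.
        ok = bool(np.allclose(c, np.array(a) @ np.array(b)))
        # Drop the mappings before the temporary directory is removed.
        del a, b, c
    return ok
```

For real data the block size would be picked so that three tiles fit comfortably in RAM. I haven't measured whether this access pattern is anywhere near optimal for disk I/O, which is part of what I'm asking.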