
I have a large 1D dataset of double-precision numbers stored in many external database files. To load the whole dataset into Mathematica 8, I iterate over the files, using AppendTo to build up a List. The problem is that my computer keeps running out of memory, even though the available memory should be more than enough to hold the whole set. While researching the issue I discovered some strange behaviour (see the code below): the list built with AppendTo consumes far more memory per element than the equivalent list produced by Range. I presume that Mathematica uses a different internal representation to store the data in the AppendTo-built list.

Any suggestions on how to reduce the memory usage?

In[78]:= ByteCount[Range[1, 10000]]
Out[78]= 40168

In[79]:= data = List[];
For[i = 0, i < 10000, ++i,
  AppendTo[data, i];
];

In[81]:= ByteCount[data]
Out[81]= 320040
Petr
    In addition to Rojo's answer, for better performance avoid AppendTo and do something similar to this: Join @@ (Import /@ FileNames["data*.txt"]). Also avoid For if you can. – Szabolcs May 29 '12 at 14:45
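A minimal sketch of that pattern, assuming the files are plain text with numbers in columns (the "data*.txt" name comes from the comment above; the "Table" import element, the N coercion, and the explicit packing step are assumptions of mine, not from the thread):

(* Import each file as rows of numbers ("Table" element), flatten the *)
(* rows into one flat 1D list, coerce to machine reals with N, and pack. *)
allData = Developer`ToPackedArray[
   N @ Flatten[Import[#, "Table"] & /@ FileNames["data*.txt"]]
   ];
ByteCount[allData]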

1 Answer


You're looking to pack your array.

It's done with Developer`ToPackedArray:

dataPacked = Developer`ToPackedArray[data];
ByteCount[dataPacked]

40168

There's more on this following the link.

The links were taken from the packed array section of this answer.
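As a quick check (an aside of mine, not part of the original answer): Developer`PackedArrayQ reports whether a list is stored packed. The AppendTo-built list from the question is not; the converted one is. Constructors such as Range and Table return packed arrays directly, which is why Range in the question is already compact without any conversion.

Developer`PackedArrayQ /@ {data, dataPacked}

{False, True}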

Rojo