
Loading less than 2 GB of CSV data took 5.29 hours:

In[1]:= FileByteCount@"Downloads/fann_feature_point_word_\
        model_1000000_Y_grid_8_test.csv"        
Out[1]= 1833545126

In[2]:= AbsoluteTiming[x = Import["Downloads/fann_feature_point_word_model_1000000_Y_\
        grid_8_test.csv"];]
Out[2]= {19060.3, Null}

In[3]:= Dimensions[x]    
Out[3]= {175224, 2070}

Any suggestions for circumventing this or speeding it up?

M.R.
  • Perhaps unpacking? Maybe for files this large, it might be faster to roll your own CSV importer by reading each line (ReadList) and splitting on ","? Or is it phoning home to WRI to convert it into some bs Entity["CSVRowItem", Quantity[1]]? – rm -rf Jun 05 '15 at 03:34
  • Related (dupe?): http://mathematica.stackexchange.com/q/35371/5 – rm -rf Jun 05 '15 at 03:36
  • What exactly do you need to do with the 2 GB CSV after loading it? Is it just plotting? Is it interpolation? What you can do depends heavily on how you want to manipulate the data. (I assume manipulating 2 GB of data will be a challenge, as it has to be duplicated for some procedures, and it might just be inconvenient.) – Bichoy Jun 05 '15 at 05:12
  • I agree that this looks like a duplicate. But to decide that and give any useful advice, we would have to know what kind of data you are trying to import. Is it purely numerical? Another important thing you should mention is what Bichoy raised: do you really need all of that data? If not, you would certainly be better off filtering what you need while importing; there is also a related question about reading only parts of a (binary) file here... – Albert Retey Jun 05 '15 at 09:02
  • My first advice is to not use CSV in the first place. Assuming it's all numbers, use a binary format. That said, I've found raw Open/Read operations to be faster than Import (again assuming it's all numbers). – george2079 Jun 05 '15 at 16:07
  • You will increase your chances of getting a better answer if you link to a smaller sample of your file (say, 10-50 MB). – Leonid Shifrin Jun 05 '15 at 21:36
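
Two of the suggestions in the comments, reading lines with ReadList and splitting on commas, and caching the result in a binary format, might be sketched roughly as follows. This is only an illustration and has not been timed on the file from the question. The file names and the helpers parseLine and readNumericCSV are hypothetical, and the code assumes every field is a plain decimal number with no header row, no quoted fields, and no exponent notation.

```mathematica
(* Hypothetical file names; substitute the real paths. *)
csvFile = "data.csv";
binFile = "data.bin";

(* Assumption: each line holds plain decimal numbers separated by commas
   (no header row, no quoted fields, no exponent notation such as 1e-5,
   which ToExpression would not read as a number). *)
parseLine[line_String] := ToExpression /@ StringSplit[line, ","];

(* Read the file in chunks of lines, so that only one chunk of raw
   strings is held in memory at a time. *)
readNumericCSV[file_String, chunkSize_Integer: 10000] :=
  Module[{stream = OpenRead[file], lines, chunks = {}},
    While[(lines = ReadList[stream, String, chunkSize]) =!= {},
      AppendTo[chunks, parseLine /@ lines]];
    Close[stream];
    Join @@ chunks];

data = readNumericCSV[csvFile];
dims = Dimensions[data];

(* Cache once as raw 64-bit reals so later sessions can skip CSV parsing. *)
Export[binFile, N[Flatten[data]], "Real64"];

(* Later: reload the flat list and restore the rows using the stored width. *)
reloaded = Partition[BinaryReadList[binFile, "Real64"], dims[[2]]];
```

The binary file stores only the numbers, so the row length (dims[[2]]) has to be kept somewhere alongside it. Whether the per-field ToExpression is fast enough for a matrix of this size would need to be checked on a small sample first.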
