Answer
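In my test below, DumpSave was the fastest at about 0.14 s, with Export to MX practically tied at about 0.15 s. BinarySerialize took roughly 3 to 15 s depending on PerformanceGoal, and Put took over two minutes.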
Albert Retey offers a very relevant comment to this answer, which I think deserves to be highlighted here:
There is an important difference between Export and DumpSave: while Export will write (just) the data to file so it can be re-imported with Import, DumpSave will store the definition for the symbol (here ds) and recreate that definition when the file is loaded with Get. It will overwrite previous definitions for that symbol and when loading data you will need to know which definition a file will restore. So I would suggest to use Export[_,_,"MX"] despite the fact that it is a bit slower due to the extra overhead of the export framework...
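To make the difference in the comment concrete, here is a minimal sketch of both round trips on a tiny stand-in dataset. The names small, original, viaExport, exportFile and dumpFile are made up for this illustration and are not part of the benchmark.

```wolfram
(* Tiny illustrative dataset; all names here are hypothetical *)
small = Dataset[{<|"a" -> 1, "b" -> 2.5|>, <|"a" -> 3, "b" -> 4.5|>}];
original = small;

exportFile = FileNameJoin[{$TemporaryDirectory, "small-export.mx"}];
dumpFile = FileNameJoin[{$TemporaryDirectory, "small-dump.mx"}];

(* Export writes only the expression; Import returns it and the caller chooses the name *)
Export[exportFile, small, "MX"];
viaExport = Import[exportFile];

(* DumpSave writes the definition of the symbol small itself *)
DumpSave[dumpFile, small];
Clear[small];
Get[dumpFile]; (* redefines small without any assignment in this session *)

(* Compare the underlying data of both restored values with the original *)
{Normal[viaExport] === Normal[original], Normal[small] === Normal[original]}
```

With Import the result can be assigned to any symbol, whereas Get on a DumpSave file will replace whatever small happened to hold at load time.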
Dummy data
ds = Dataset[
   Array[<|
      "Words" -> RandomWord["KnownWords", 100],
      "Country" -> RandomEntity["Country"],
      "Reals" -> RandomReal[{0, 1}, 10^6],
      "Integers" -> RandomInteger[{1, 10}, 10^6],
      "Image" -> RandomImage[1, {100, 100}, ColorSpace -> "RGB"]
      |> &, 20]];
This is only about 0.3 GB in memory; I didn't bother testing anything larger.
UnitConvert[Quantity[N@ByteCount[ds], "Bytes"], "Gigabytes"]
(* Quantity[0.324982, "Gigabytes"] *)
Put performance
AbsoluteTiming[
 Put[ds, "ds.m"];
 UnitConvert[Quantity[N@FileByteCount["ds.m"], "Bytes"], "Gigabytes"]
 ]
(* {131.745, Quantity[0.5267, "Gigabytes"]} *)
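BinarySerialize performance
AbsoluteTiming[
 (* OpenWrite creates (or truncates) the file itself, so no separate CreateFile call is needed and the block can be re-run *)
 ow = OpenWrite["PerformanceGoalSize.bin", BinaryFormat -> True];
 BinaryWrite[ow, BinarySerialize[ds, PerformanceGoal -> "Size"]];
 (* Close returns the file name, which FileByteCount then measures *)
 UnitConvert[Quantity[N@FileByteCount[Close[ow]], "Bytes"],
  "Gigabytes"]
 ]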
(* {15.4984, Quantity[0.169483, "Gigabytes"]} *)
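AbsoluteTiming[
 (* OpenWrite creates (or truncates) the file itself, so no separate CreateFile call is needed and the block can be re-run *)
 ow = OpenWrite["PerformanceGoalSpeed.bin", BinaryFormat -> True];
 BinaryWrite[ow, BinarySerialize[ds, PerformanceGoal -> "Speed"]];
 (* Close returns the file name, which FileByteCount then measures *)
 UnitConvert[Quantity[N@FileByteCount[Close[ow]], "Bytes"],
  "Gigabytes"]
 ]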
(* {3.43482, Quantity[0.324826, "Gigabytes"]} *)
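The .bin files above hold the raw bytes produced by BinarySerialize, so they are read back with BinaryDeserialize rather than Import or Get. The following is a minimal sketch of that round trip, assuming a Wolfram Language version that provides ReadByteArray; expr, path, stream and restored are names invented for the example, and the data is a small stand-in rather than ds.

```wolfram
(* Small illustrative expression and file name; not part of the benchmark *)
expr = <|"Reals" -> RandomReal[{0, 1}, 1000],
   "Integers" -> RandomInteger[{1, 10}, 1000]|>;
path = FileNameJoin[{$TemporaryDirectory, "roundtrip-example.bin"}];

(* Write the serialized bytes; OpenWrite creates or truncates the file *)
stream = OpenWrite[path, BinaryFormat -> True];
BinaryWrite[stream, BinarySerialize[expr, PerformanceGoal -> "Size"]];
Close[stream];

(* Read every byte back as a ByteArray and rebuild the expression *)
restored = BinaryDeserialize[ReadByteArray[path]];

(* Check that the restored expression matches the original *)
restored === expr
```

As with Export and Import, the deserialized value is just an expression, so it can be assigned to any symbol.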
Export performance
AbsoluteTiming[
 Export["Export.mx", ds];
 UnitConvert[Quantity[N@FileByteCount["Export.mx"], "Bytes"],
  "Gigabytes"]
 ]
(* {0.149372, Quantity[0.324832, "Gigabytes"]} *)
DumpSave performance
AbsoluteTiming[
 DumpSave[File["DumpSave.mx"], ds];
 UnitConvert[Quantity[N@FileByteCount["DumpSave.mx"], "Bytes"],
  "Gigabytes"]
 ]
(* {0.142341, Quantity[0.324832, "Gigabytes"]} *)