Your ds2 has a highly inefficient shape. Using 10 in place of 1000000 in the data code (observations made in Mathematica 10.1), we get something like:
ds2 // InputForm
Dataset[{<|"x1" -> 10, "x2" -> 4, "x3" -> 8, "x4" -> 3, "x5" -> 6,
"x6" -> 5, "x7" -> 5|>, <|"x1" -> 2, "x2" -> 9, "x3" -> 6, "x4" -> 10,
"x5" -> 10, "x6" -> 5, "x7" -> 1|>, <|"x1" -> 9, "x2" -> 6, "x3" -> 6,
"x4" -> 10, "x5" -> 2, "x6" -> 10, "x7" -> 6|>,
<|"x1" -> 8, "x2" -> 2, "x3" -> 2, "x4" -> 2, "x5" -> 6, "x6" -> 2,
"x7" -> 8|>, <|"x1" -> 1, "x2" -> 8, "x3" -> 10, "x4" -> 3, "x5" -> 8,
"x6" -> 3, "x7" -> 9|>, <|"x1" -> 3, "x2" -> 6, "x3" -> 1, "x4" -> 10,
"x5" -> 3, "x6" -> 5, "x7" -> 8|>, <|"x1" -> 6, "x2" -> 7, "x3" -> 5,
"x4" -> 2, "x5" -> 10, "x6" -> 4, "x7" -> 9|>,
<|"x1" -> 7, "x2" -> 10, "x3" -> 8, "x4" -> 1, "x5" -> 3, "x6" -> 3,
"x7" -> 8|>, <|"x1" -> 4, "x2" -> 3, "x3" -> 6, "x4" -> 1, "x5" -> 8,
"x6" -> 5, "x7" -> 1|>, <|"x1" -> 8, "x2" -> 7, "x3" -> 8, "x4" -> 1,
"x5" -> 3, "x6" -> 9, "x7" -> 10|>}, TypeSystem`Vector[
TypeSystem`Struct[{"x1", "x2", "x3", "x4", "x5", "x6", "x7"},
{TypeSystem`Atom[Integer], TypeSystem`Atom[Integer],
TypeSystem`Atom[Integer], TypeSystem`Atom[Integer],
TypeSystem`Atom[Integer], TypeSystem`Atom[Integer],
TypeSystem`Atom[Integer]}], 10], <|"ID" -> 127397422492264|>]
Not only is this redundant (the keys are repeated in every row), but its form also prohibits packing: every Integer is stored separately.
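To check the packing claim on a small scale, here is a minimal sketch. The names keys, mat, rowWise and colWise are hypothetical stand-ins, not taken from your code, and I am assuming that the columns produced by Transpose on a packed matrix stay packed inside the Association; Developer`PackedArrayQ will tell you whether that holds on your version.

```mathematica
(* Hypothetical small stand-ins for the two layouts; names are illustrative only *)
keys = {"x1", "x2", "x3"};
mat = RandomInteger[{1, 10}, {5, 3}];  (* 5 rows, 3 columns of random integers *)

(* Row-wise layout: one Association per row, keys repeated every time (like ds2) *)
rowWise = Map[AssociationThread[keys -> #] &, mat];

(* Column-wise layout: one Association whose values are whole columns *)
colWise = AssociationThread[keys -> Transpose[mat]];

Developer`PackedArrayQ[mat]            (* is the raw matrix packed? *)
Developer`PackedArrayQ /@ colWise      (* is each column vector packed? *)
ByteCount /@ {mat, rowWise, colWise}   (* rough size comparison of the three forms *)
```

ByteCount is only a rough measure, but it should make the overhead of repeating the keys in every row visible.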
Compare the Transpose of your ds2:
ds2\[Transpose] // InputForm
Dataset[<|"x1" -> {10, 2, 9, 8, 1, 3, 6, 7, 4, 8},
"x2" -> {4, 9, 6, 2, 8, 6, 7, 10, 3, 7},
"x3" -> {8, 6, 6, 2, 10, 1, 5, 8, 6, 8},
"x4" -> {3, 10, 10, 2, 3, 10, 2, 1, 1, 1},
"x5" -> {6, 10, 2, 6, 8, 3, 10, 3, 8, 3},
"x6" -> {5, 5, 10, 2, 3, 5, 4, 3, 5, 9},
"x7" -> {5, 1, 6, 8, 9, 8, 9, 8, 1, 10}|>,
TypeSystem`Struct[{"x1", "x2", "x3", "x4", "x5", "x6", "x7"},
{TypeSystem`Vector[TypeSystem`Atom[Integer], 10],
TypeSystem`Vector[TypeSystem`Atom[Integer], 10],
TypeSystem`Vector[TypeSystem`Atom[Integer], 10],
TypeSystem`Vector[TypeSystem`Atom[Integer], 10],
TypeSystem`Vector[TypeSystem`Atom[Integer], 10],
TypeSystem`Vector[TypeSystem`Atom[Integer], 10],
TypeSystem`Vector[TypeSystem`Atom[Integer], 10]}],
<|"Origin" -> HoldComplete[AssociationTranspose,
Dataset`DatasetHandle[127397422492264]]|>]
Observe that not only is the representation more compact but the data is typed as vector arrays: TypeSystem`Vector[TypeSystem`Atom[Integer], 10].
Create your Dataset in the efficient shape to begin with to avoid a very slow Transpose operation:
ds3 = Dataset @ AssociationThread[head -> (data\[Transpose])];
From here you can quickly apply a sort like this:
ds3[[All, ds3["x1", Ordering] // Normal]]; // RepeatedTiming
{0.183, Null}
Note: this Ordering uses only the "x1" data, so the output will be similar to the stable SortBy[{#x1 &}] rather than the tie-breaking SortBy[#x1 &]. My guess is that this is what you will want most of the time. If a full tie-breaking form is required, use:
ds3[[All, ds3[Values] // Normal // Transpose // Ordering]]; // RepeatedTiming
{0.320, Null}
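The difference between the two orderings only shows up when "x1" contains ties. A minimal sketch on plain lists, using hypothetical toy columns x1 and x2 (not your data), may help to see it:

```mathematica
(* Hypothetical toy columns with ties in x1 *)
x1 = {2, 1, 2, 1};
x2 = {1, 9, 0, 3};
rows = Transpose[{x1, x2}];

(* Ordering computed from x1 alone: x2 plays no part in the comparison *)
byFirstOnly = Ordering[x1];

(* Ordering computed from whole rows: x2 breaks ties between equal x1 values *)
byFullRow = Ordering[rows];

rows[[byFirstOnly]]
rows[[byFullRow]]
```

Comparing the two results, rows with equal x1 should differ only in their relative order: the first form ignores x2 entirely, while the second sorts tied rows by x2 as well.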
The dependence on shape in Dataset is similar to the case of SparseArray; see:
[...] ds2[SortBy[#"x1" &]]; // AbsoluteTiming. And this takes even longer; ds2[SortBy[#["x1"] &]]; // AbsoluteTiming. Very curious. v 11.1 Win 8.1 Pro – Edmund Mar 29 '17 at 01:49

[...] ByteCount (I know it isn't perfect), you will see factor 20 difference in size. Very inefficient. You can't really work with big data sets and use Dataset with named columns etc. -- you have to use simple lists, otherwise that overhead would eat all the space on your machine before you even started processing it. – Stitch Mar 29 '17 at 19:21

[...] Dataset did mention that they were looking into ways to not repeat the Association Keys in every row. I think it would be a challenge in the general case because of the free-form construction. Maybe some sort of packed dataset will be introduced for regularly structured datasets at some point. – Edmund Mar 29 '17 at 21:06