Let's use the example Dataset:

dataset = Dataset[{
   <|"a" -> 1, "b" -> "x", "c" -> {1}|>,
   <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
   <|"a" -> 3, "b" -> "z", "c" -> {3}|>,
   <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
   <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>,
   <|"a" -> 6, "b" -> "z", "c" -> {}|>}]

And data for a new field "d" that I'd like to add:

d = {6, 5, 4, 3, 2, 1}

I can add the field by completely unpacking and repacking the data:

Dataset[Map[Association,Transpose[Append[Transpose[Normal[Normal[dataset]]], Thread["d"->d]]]]]

There must be a simpler way! I would like to do this routinely and with large datasets, so I'm looking for something more compact and potentially much more efficient. What am I missing?

ArgentoSapiens

4 Answers

With this auxiliary function:

tr = Transpose[#, AllowedHeads -> All] &;

you can do

dataset[tr /* Append["d" -> {6, 5, 4, 3, 2, 1}] /*  tr]

The formatting of the result won't be as nice as the original, because of type inference limitations, but the result is correct.
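
To make the idea concrete, here is a minimal sketch of the same rows-to-columns round trip written out on plain lists of associations. It is only an illustration, not the code above: Merge and MapThread stand in for the tr helper, the names rows, newValues, cols and newRows are made up for the example, and it assumes every row has the same keys.

```mathematica
(* Illustrative sketch only: Merge/MapThread stand in for the tr helper *)
rows = {
   <|"a" -> 1, "b" -> "x", "c" -> {1}|>,
   <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
   <|"a" -> 3, "b" -> "z", "c" -> {}|>};
newValues = {6, 5, 4};

(* Rows -> columns: one list of values per key *)
cols = Merge[rows, Identity];

(* Adding a field is now a single Append on the outer association *)
cols = Append[cols, "d" -> newValues];

(* Columns -> rows: rebuild one association per row *)
newRows = MapThread[AssociationThread[Keys[cols], {##}] &, Values[cols]];

Dataset[newRows]
```

In the column-oriented form the new field is just one more key, which is why a single Append is enough between the two transpositions.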

Leonid Shifrin
  • Could you please show how to generalize to ds=ExampleData[{"Dataset", "Planets"}] when, say, we want to add a column with the distance from Earth? – b.gates.you.know.what Nov 04 '14 at 12:10
  • @b.gatessucks It's a bit harder. This is what I came up with (used consecutive integers for distances, to make it simple): ds[tr /* Append["distance" -> AssociationThread[Normal@ds[Keys], {1, 2, 3, 4, 5, 6, 7, 8}]] /* tr] – Leonid Shifrin Nov 04 '14 at 12:21
  • Many thanks, much better than de-/re-constructing the dataset from scratch. Hopefully it won't be too long until this kind of operation is built-in. – b.gates.you.know.what Nov 04 '14 at 12:51
  • @b.gatessucks Indeed, it looks like this operation must be built-in. – Leonid Shifrin Nov 04 '14 at 13:18
7
dataset[MapThread[Append, {#, "d" -> d // Thread}] &]

or

dataset[Join[#, "d" -> d // Thread /* Map[Association], 2] &]

or

Module[{ds = Normal@dataset},
 ds[[All, "d"]] = d;
 Dataset@ds]
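
For routine use, the first form can be wrapped in a small helper. This is only a sketch under stated assumptions: addColumn is a hypothetical name, not a built-in, and the length check and the $Failed return value are choices made for the example.

```mathematica
(* addColumn is a hypothetical helper, not part of the Wolfram Language *)
addColumn[ds_Dataset, key_String, values_List] :=
 Module[{rows = Normal[ds]},
  (* Refuse to guess when the new column does not match the row count *)
  If[Length[rows] =!= Length[values], Return[$Failed, Module]];
  (* Append one key -> value rule to each row, then rewrap as a Dataset *)
  Dataset[MapThread[Append, {rows, Thread[key -> values]}]]]

sample = Dataset[{
    <|"a" -> 1, "b" -> "x"|>,
    <|"a" -> 2, "b" -> "y"|>}];

result = addColumn[sample, "d", {6, 5}]
```

With the data from the question, the call would be addColumn[dataset, "d", d].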
Rojo

Perhaps this:

Module[{i = 1}, dataset[All, <| #, "d" -> d[[i++]] |> &]]

WReach

Either

MapIndexed[Append[#1, "d" -> d[[#2[[1]]]]] &, dataset]

or

MapIndexed[Insert[#1, "d" -> d[[#2[[1]]]], -1] &, dataset]

works

Sjoerd C. de Vries