18

I have a Dataset and a List, called ds and intervals, and I would like to append the list as a new column of the dataset. I have tried:

ds = Dataset[{
   <|"char" -> 1, "freq" -> 0.1|>,
   <|"char" -> 2, "freq" -> 0.2|>,
   <|"char" -> 3, "freq" -> 0.3|>,
   <|"char" -> 0, "freq" -> 0.4|>
   }]

intervals = {{0, 0.4}, {0.4, 0.7}, {0.7, 0.9}, {0.9, 1.}}
ds[All, <|#, "intervals" -> intervals|> &]

but this adds the whole list to every row, whereas I would like each item of the list to be added to a different row.

Can this be achieved?

C. E.
damjandd

5 Answers

14

One way:

ds // Transpose // Append["intervals" -> intervals] // Transpose

In Mathematica it is usually easier to operate on rows, which is what this solution demonstrates: transposing turns the columns into rows, so the new column can be appended as one more row before transposing back. I would have done the same if I were working with plain lists. However, as other answers show, there are dataset-specific solutions that might work better.

C. E.
13

Another way:

ds[MapThread[Append[#1, "intervals" -> #2] &, {#, intervals}] &]
swish
12

Suppose your column is a list of Association objects or a Dataset:

ds = Dataset[{
<|"char" -> 1, "freq" -> 0.1|>,
<|"char" -> 2, "freq" -> 0.2|>,
<|"char" -> 3, "freq" -> 0.3|>,
<|"char" -> 0, "freq" -> 0.4|>
}];

intervals = {{0, 0.4}, {0.4, 0.7}, {0.7, 0.9}, {0.9, 1.}};

assoc=<|"interval"->#|>& /@ intervals;
col = Dataset[assoc];

Then you can simply use Join to add a column:

Join[ds, assoc, 2]
Join[ds, col, 2]

Carl Woll
  • your first example doesn't return a Dataset for me on v12.1, but rather just a list of associations (on which you can simply apply Dataset to get a dataset) – glS Dec 24 '20 at 13:41
3

Here is a way using MapIndexed:

ds[MapIndexed[<| #, "intervals" -> intervals[[#2[[1]]]]|> &]]

Streams

In some other software environments, streams are often used to solve problems like this. For lists, we can define a poor-man's version like this:

stream[list_List] := Module[{i = 1}, If[i > Length[list], Missing[], list[[i++]]] &]

Then we can write:

nextInterval = intervals // stream;

ds[All, <| #, "interval" -> nextInterval[] |> &]
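
To see what the closure does in isolation, here is a minimal sketch that repeats the definition above and drains a two-element list. The name next and the sample strings are illustrative only. Note also that the Dataset examples depend on the rows being processed in order, which is an assumption here rather than a documented guarantee.

```wolfram
(* Same definition as above: the Module-local counter i is captured
   by the pure function that stream returns *)
stream[list_List] := Module[{i = 1}, If[i > Length[list], Missing[], list[[i++]]] &]

(* Hypothetical two-element source, used only for illustration *)
next = stream[{"a", "b"}];

(* Each call yields the next element; once exhausted, Missing[] is returned *)
{next[], next[], next[]}
(* {"a", "b", Missing[]} *)
```

Because an exhausted stream returns Missing[] instead of raising an error, a source list shorter than the dataset would show up as Missing[] entries in the trailing rows.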

Streams can greatly simplify merging operations. Let's say we wanted to add each interval as two columns instead of one:

nextLimit = intervals // Flatten // stream;

ds[All, <| #, "lower" -> nextLimit[], "upper" -> nextLimit[] |> &]

Or perhaps we wanted to add columns from multiple sources:

nextInterval = intervals // stream;
nextCode = "ROYG" // Characters // stream;
nextColor = {Red, Orange, Yellow, Green} // stream;

ds[All, <| #, "interval"->nextInterval[], "code"->nextCode[], "color"->nextColor[] |>&]

Coming Soon?

Since at least version 10.4 of Mathematica there has been an undocumented set of iterator functions. Perhaps they will become documented some day? Then we could officially write:

nextInterval = GeneralUtilities`ToIterator[intervals];

ds[All, <| #, "interval" -> Read[nextInterval] |> &]

Also in the Coming Soon? department, we have the Streaming package of Leonid Shifrin.

WReach
  • 68,832
  • 4
  • 164
  • 269
2

An easier way:

ds[All, Append["intervals" -> # & /@ intervals]]

Mr.Wizard
yode