8

I have a dataset, and I want to group by one of my columns, and then show the Length of another set of columns for each group. For example:

letter color
 a     red
 a     red
 a     blue
 b     blue

I want my new dataset to show:

    red blue
a    2   1
b    0   1

where red and blue are now column headers.

What code can I use to do this?

4 Answers4

7

Starting with the dataset from the question:

dataset =
  {{"a", "red"}, {"a", "red"}, {"a", "blue"}, {"b", "blue"}} //
    Query[Dataset, AssociationThread[{"letter", "color"} -> #]&]

input dataset

... then we can get the desired result as follows:

dataset[
  KeyUnion[#, 0&]& /* (GroupBy[#, Key["letter"] -> KeyDrop["letter"], Merge[Total]]&)
, <|"letter" -> #letter, #color -> 1|>&
]

result dataset

This works as follows:

  1. <|"letter" -> #letter, #color -> 1|>& is used to set the count of each row's color to one.
  2. KeyUnion is used to introduce a count of zero for colors missing from each row.
  3. GroupBy is used to group the rows together by letter and Total the color counts.
WReach
  • 68,832
  • 4
  • 164
  • 269
4
dt = {{letter, color}, {a, red}, {a, red}, {a , blue}, {b, blue}}; 
TableForm[Outer[Count[dt[[2 ;;]], {##}] &, ## & @@ #], 
    TableHeadings -> #] &[DeleteDuplicates /@ Transpose[dt[[2 ;;]]]] // TeXForm

$\begin{array}{ccc} & \text{red} & \text{blue} \\ a & 2 & 1 \\ b & 0 & 1 \\ \end{array}$

kglr
  • 394,356
  • 18
  • 477
  • 896
2

You may use GroupBy and Transpose.

Taking @WReach Dataset definition.

dataset =
  {{"a", "red"}, {"a", "red"}, {"a", "blue"}, {"b", "blue"}} //
    Query[Dataset, AssociationThread[{"letter", "color"} -> #]&]

Then

dataset[GroupBy[{#["color"] &, #["letter"] &}] /*
  Transpose /*
  ReplaceAll[_Missing -> 0],
 All,
 Length]

Mathematica graphics

Hope this helps.

Edmund
  • 42,267
  • 3
  • 51
  • 143
0
defaultds = <|"blue" -> 0, "red" -> 0|>;

dataset[GroupBy[#, Key@"letter" -> Key@"color", 
   (Join[defaultds, #] &)@GroupBy[#, Identity, Length] &] &]

Blockquote

SquareOne
  • 7,575
  • 1
  • 15
  • 34