
I have a Dataset built from a list of associations.

  d = Dataset@{
   <|"a" -> "Red", "b" -> .1|>,
   <|"a" -> "Blue", "b" -> .2|>,
   <|"a" -> "Red", "b" -> .11|>,
   <|"a" -> "Blue", "b" -> .21|>,
   <|"a" -> "Blue", "b" -> .23|>
   }

I can count the occurrences of each value of "a" with

d[GroupBy["a"], Length, "b"]

Red  2
Blue  3

and I can find the mean value of "b" for each "a" with

d[GroupBy["a"], Mean, "b"]

Red  0.105
Blue  0.21333

Is there any easy way I can do both at the same time, getting a result that has both a count of the number of occurrences and the mean "b" for each "a"? I'd want the result to look like

Red 2 0.105
Blue 3 0.21333

Obviously I know how to do the queries separately, and I could figure out how to glue the results together. I also know how to apply different functions to different columns without GroupBy, and how to Map multiple functions over individual entries of a column to create new columns after GroupBy. But the GroupBy trips me up when I try to apply Mean or Length to a whole column. Is there some arcane syntax to do this elegantly?

Chris Nadovich

1 Answer


I think I figured it out

d[GroupBy["a"], {Query[Length, "b"], Query[Mean, "b"]}]

although I can't quite explain why that works.
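One possible reading of why it works, as a rough sketch rather than an authoritative account: within a group, the part specification "b" is applied to each row first, and the ascending operator (Length or Mean) is then applied to the extracted values. A list of operators applies each one to the same group and collects the results. The name `redGroup` below is made up for illustration; it holds just the two "Red" rows from the question.

```wolfram
(* Hypothetical single group: the two "Red" rows from the question *)
redGroup = {
   <|"a" -> "Red", "b" -> .1|>,
   <|"a" -> "Red", "b" -> .11|>
   };

(* "b" descends into each row first; Length/Mean then act on the extracted values *)
count = Query[Length, "b"][redGroup];  (* 2 *)
mean = Query[Mean, "b"][redGroup];     (* 0.105 *)

(* A list of operators applies each one to the same group and collects the results *)
both = Query[{Query[Length, "b"], Query[Mean, "b"]}][redGroup]
```

GroupBy["a"] at the first level produces one such group per value of "a", so the second-level list runs once per group.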

UPDATE: it doesn't really work. If you look at the Normal of the result, each group's values come back as an unlabeled list:

<|"Red" -> {2, 0.105}, "Blue" -> {3, 0.213333}|>

A better answer is

d[GroupBy["a"], <| "Length" -> Query[Length, "b"], 
  "Mean" -> Query[Mean, "b"]|>]

But an even better answer is simply

 d[GroupBy["a"], <|"Length" -> Length, "Mean" -> Mean|>, "b"]

Dunno why I didn't see it earlier.
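For comparison, here is a minimal sketch of the same summary without Dataset query syntax, using the three-argument form of GroupBy on a plain list of associations. The names `rows` and `summary` are my own and not part of the question; the data is copied from it.

```wolfram
(* Plain list of associations, same rows as the question's Dataset d *)
rows = {
   <|"a" -> "Red", "b" -> .1|>,
   <|"a" -> "Blue", "b" -> .2|>,
   <|"a" -> "Red", "b" -> .11|>,
   <|"a" -> "Blue", "b" -> .21|>,
   <|"a" -> "Blue", "b" -> .23|>
   };

(* Group on "a", keep only the "b" values in each group,
   then reduce each group to a labeled association *)
summary = GroupBy[rows, (#["a"] &) -> (#["b"] &),
   <|"Length" -> Length[#], "Mean" -> Mean[#]|> &];

(* Wrap in Dataset to get the same tabular display as the query version *)
Dataset[summary]
```

This should give the same labeled count and mean per value of "a" as the query above; whether it is any more readable is a matter of taste.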

Chris Nadovich