I have a large panel data set with more than 2.5 million observations and more than 15,000 groups. I want to calculate the group-wise means, and I also need the Id variable in my result. This is what I have done:
dat = {{2, .1, .2}, {2, .2, .4}, {2, .3, .6}, {2, .4, .8},
   {5, 1, 2}, {5, 2, 4}, {7, 20, 10}, {7, 40, 20}, {7, 60, 30},
   {7, 80, 40}, {7, 100, 50}, {10, 30, 50}};
results = Table[N[Mean[Select[dat, #[[1]] == i &]]], {i, {2, 5, 7, 10}}];
But this takes too long. How can I make it faster? I also want to calculate the group-wise maximum of the second column.
Any help is greatly appreciated.
Mean /@ GatherBy[dat, First] OR Mean /@ GroupBy[dat, First]? – RunnyKine Sep 05 '14 at 16:23
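To spell out the comment's suggestion, here is a minimal sketch using the sample dat from the question. The Select inside Table rescans all of dat once per group, whereas GatherBy and GroupBy split the rows in a single pass. The names groups, means, maxSecond, meansById and maxById are illustrative, not from the original post, and GroupBy assumes version 10 or later.

```mathematica
dat = {{2, .1, .2}, {2, .2, .4}, {2, .3, .6}, {2, .4, .8},
   {5, 1, 2}, {5, 2, 4}, {7, 20, 10}, {7, 40, 20}, {7, 60, 30},
   {7, 80, 40}, {7, 100, 50}, {10, 30, 50}};

(* Split the rows into groups by the first column (the Id) in one pass *)
groups = GatherBy[dat, First];

(* Column-wise mean of each group; the Id column averages to the Id itself,
   so each result row still starts with its Id *)
means = N[Mean /@ groups];

(* {Id, maximum of the second column} for each group *)
maxSecond = {#[[1, 1]], Max[#[[All, 2]]]} & /@ groups;

(* Alternative (assumes version 10+): Associations keyed by Id *)
meansById = GroupBy[dat, First -> Rest, N@*Mean];
maxById = GroupBy[dat, First -> (#[[2]] &), Max];
```

The GatherBy form returns plain lists in order of first appearance of each Id, which matches the shape of the original results. The GroupBy form returns an Association, so a single group can be looked up directly, for example meansById[7].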