11

I am trying to do a first order interpolation on a multidimensional data set containing duplicate abscissa values (which Mathematica does not like), like this one:

{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}

After playing around with DeleteDuplicates and Union I found that DeleteDuplicates does not work on multidimensional data, at least not when represented like above. I found that Union could delete the duplicates by using the SameTest-option, like this:

Union[{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}, SameTest -> (#1[[1]] == #2[[1]] &)]

which returns

{{{1, 2, 3}, 10}, {{1, 2, 4}, 30}}

However, it is very slow with larger data sets, which I have, so that is not an option.

What I really would like is a function that will make an average of the duplicates, so that

{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}

will turn into

{{{1, 2, 3}, 15}, {{1, 2, 4}, 30}}

I am by no means an expert in Mathematica programming, so I could really use some help.

The solution has to be reasonable fast. My data set is not sorted in any way but I guess it can be sorted quickly before the duplicates are combined if that is needed.

Thanks in advance :-)

pvh1987
  • 245
  • 2
  • 4

5 Answers5

13

DeleteDuplicatesBy may be faster if used as follows:

DeleteDuplicatesBy[{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}, First]
{{{1, 2, 3}, 10}, {{1, 2, 4}, 30}}

For your second need:

lis = GatherBy[{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}, First];
mean = Mean /@ Map[Last, lis, {2}];

Then:

Transpose[{lis[[All, 1, 1]], mean}]
{{{1, 2, 3}, 15}, {{1, 2, 4}, 30}}

As a function:

delDupMean[data_] := With[{lis = GatherBy[data, First]},
  Transpose[{lis[[All, 1, 1]], Mean /@ Map[Last, lis, {2}]}]
 ]

Use:

delDupMean[{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}]
{{{1, 2, 3}, 15}, {{1, 2, 4}, 30}}
RunnyKine
  • 33,088
  • 3
  • 109
  • 176
6
lst = {{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}};

DeleteDuplicates[lst, First@#1 == First@#2 &]
(* {{{1,2,3},10},{{1,2,4},30}} *)

First /@ GatherBy[lst, First]
(* {{{1,2,3},10},{{1,2,4},30}} *)

{#[[1, 1]], Mean[#[[All, 2]]]} & /@ GatherBy[lst, First]
(* or {#[[1, 1]], Mean[#[[All, 2]]]} & /@Gather[lst, First@#1 == First@#2 &]  *)
(* {{{1,2,3},15},{{1,2,4},30}} *)
kglr
  • 394,356
  • 18
  • 477
  • 896
2

Just an approach with Reap and Sow. Probably not be efficient.

data = {{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}};
Last@Reap[Sow[#2, w[#1]] & @@@data, _, {ReplacePart[#1, 0 -> Identity], Mean@#2} &]

yields:

(*{{{1, 2, 3}, 15}, {{1, 2, 4}, 30}}*)
ubpdqn
  • 60,617
  • 3
  • 59
  • 148
2
lis = GatherBy[{{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}}, First];

Plus @@@ lis/Length /@ lis

Plus @@ #/Length@# & /@ lis

Plus[##]/Length@{##} & @@@ lis

(*{{{1, 2, 3}, 15}, {{1, 2, 4}, 30}}*)
Basheer Algohi
  • 19,917
  • 1
  • 31
  • 78
1
list = {{{1, 2, 3}, 10}, {{1, 2, 3}, 20}, {{1, 2, 4}, 30}};

result = GroupBy[list, First -> Last, Mean]

enter image description here

Normal @ result /. Rule -> List

enter image description here

OR

Thread[{Keys @ #, Mean /@ Values @ #}] &[GroupBy[list, First -> Last]]

enter image description here

eldo
  • 67,911
  • 5
  • 60
  • 168