
Consider the following:

data={{1,a},{10,a},{5,b},{4,c}};

In my case, two elements {x1,y1} and {x2,y2} count as duplicates when y2 is equal to y1. Hence, for data, MyFunction[data] should return the following result:

{{1,a},{10,a}}.

Does anyone have an idea?

J. M.'s missing motivation
John

4 Answers

Select[GatherBy[{{1, a}, {10, a}, {5, b}, {4, c}}, Last], Length[#] > 1 &]

seems to do what you want. Alternatives to this construction include:

DeleteCases[GatherBy[{{1, a}, {10, a}, {5, b}, {4, c}}, Last], {_List}]

and

DeleteCases[GatherBy[{{1, a}, {10, a}, {5, b}, {4, c}}, Last], {{__}}]
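
For reference, the first construction can be wrapped into the function the question asks for. This is only a sketch: the name myFunction is borrowed from the question's MyFunction, and flattening the groups with Catenate is an assumption about the desired output shape, since GatherBy returns one sublist per group.

```mathematica
(* keep only the groups whose last element occurs more than once *)
myFunction[data_List] := Select[GatherBy[data, Last], Length[#] > 1 &]

myFunction[{{1, a}, {10, a}, {5, b}, {4, c}}]
(* {{{1, a}, {10, a}}} *)

(* assumption: flatten one level to get the shape shown in the question *)
Catenate[myFunction[{{1, a}, {10, a}, {5, b}, {4, c}}]]
(* {{1, a}, {10, a}} *)
```

Keeping the groups separate is probably more useful once several different values are duplicated; Catenate only matters if a flat list is wanted.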
Mr.Wizard
J. M.'s missing motivation

GatherBy is usually fastest, but Sow and Reap are more flexible. Here is a method using those.

data = {{5, "f"}, {10, "b"}, {10, "e"}, {6, "c"}, {3, "c"}, {6, "e"},
        {4, "a"}, {2, "c"}, {6, "f"}, {2, "g"}, {9, "e"}, {0, "d"},
        {10, "c"}, {6, "b"}, {6, "c"}};

Reap[Sow @@@ data, _, {#2,#}&][[2]] ~Cases~ {{_, __},_}
{{{5, 6}, "f"},
 {{10, 6}, "b"},
 {{10, 6, 9}, "e"},
 {{6, 3, 2, 10, 6}, "c"}}

As you can see, the output is in a somewhat different format, but one that IMHO may itself be useful. You can recover the original format with Thread; for example, mapping it over the output above (Thread /@ ...) gives:

{{{5, "f"}, {6, "f"}},
 {{10, "b"}, {6, "b"}},
 {{10, "e"}, {6, "e"}, {9, "e"}},
 {{6, "c"}, {3, "c"}, {2, "c"}, {10, "c"}, {6, "c"}}}
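
To see what the Sow/Reap step does before filtering, it may help to run it on a smaller list. The sketch below uses the question's data with string tags; the variable names small and grouped are only illustrative.

```mathematica
small = {{1, "a"}, {10, "a"}, {5, "b"}, {4, "c"}};

(* Sow @@@ small evaluates Sow[1, "a"], Sow[10, "a"], ...;
   Reap collects the sown values per tag and applies {#2, #} & to (tag, values) *)
grouped = Reap[Sow @@@ small, _, {#2, #} &][[2]]
(* {{{1, 10}, "a"}, {{5}, "b"}, {{4}, "c"}} *)

(* the pattern {{_, __}, _} keeps only tags with two or more collected values *)
Cases[grouped, {{_, __}, _}]
(* {{{1, 10}, "a"}} *)
```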
J. M.'s missing motivation
Mr.Wizard

Using some functions which were not available at the time the question was posed:

list = {{1, a}, {10, a}, {5, b}, {4, c}};

f[x : {{_, a_}, {_, a_}}] := Splice[x]
f[_] := Nothing

BlockMap[f, list, 2, 1]

{{1, a}, {10, a}}
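
Note that BlockMap with block size 2 and offset 1 only compares neighbouring elements, so this approach appears to rely on duplicates being adjacent. A small sketch of that behaviour follows; the helper name adjacentDup and the list shuffled are hypothetical, and sorting by the last element first is just one possible workaround.

```mathematica
(* same idea as f above, restated so the block is self-contained *)
adjacentDup[x : {{_, a_}, {_, a_}}] := Splice[x]
adjacentDup[_] := Nothing

shuffled = {{1, "a"}, {5, "b"}, {10, "a"}};

(* the two "a" rows are not neighbours, so no block matches *)
BlockMap[adjacentDup, shuffled, 2, 1]
(* {} *)

(* sorting by the last element brings equal keys next to each other *)
BlockMap[adjacentDup, SortBy[shuffled, Last], 2, 1]
(* {{1, "a"}, {10, "a"}} *)
```

A run of three or more equal keys produces overlapping blocks, so the middle rows would show up twice; applying DeleteDuplicates to the result is one way to handle that.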

eldo

Using Cases:

data1 = {{1, a}, {10, a}, {5, b}, {4, c}};
data2 = {{1, a}, {10, a}, {5, b}, {4, c}, {7, c}};

Cases[#, {_, Alternatives @@ Keys[Select[CountsBy[#, Last], # > 1 &]]}] &@data1

(* {{1, a}, {10, a}} *)

Cases[#, {_, Alternatives @@ Keys[Select[CountsBy[#, Last], # > 1 &]]}] &@data2

(* {{1, a}, {10, a}, {4, c}, {7, c}} *)

Or using Pick, GatherBy and the third argument of GroupBy:

data3 = {{5, "f"}, {10, "b"}, {10, "e"}, {6, "c"}, {3, "c"}, {6, "e"}, {4, "a"},
        {2, "c"}, {6, "f"}, {2, "g"}, {9, "e"}, {0, "d"}, {10, "c"}, {6, "b"},
        {6, "c"}};

Pick[GatherBy[#, Last], Values[GroupBy[#, Last, Length[#] > 1 &]]] &@data3

(* {{{5, "f"}, {6, "f"}}, {{10, "b"}, {6, "b"}}, {{10, "e"}, {6, "e"}, {9, "e"}}, {{6, "c"}, {3, "c"}, {2, "c"}, {10, "c"}, {6, "c"}}} *)
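
The Pick version works because the third argument of GroupBy reduces each group to a single True/False, and GroupBy and GatherBy both order groups by first appearance, so the mask lines up with the gathered groups. A short sketch of the intermediate values, with small and mask as illustrative names:

```mathematica
small = {{1, "a"}, {10, "a"}, {5, "b"}, {4, "c"}};

(* one Boolean per distinct last element, in order of first appearance *)
mask = Values[GroupBy[small, Last, Length[#] > 1 &]]
(* {True, False, False} *)

(* Pick keeps the gathered groups whose mask entry is True *)
Pick[GatherBy[small, Last], mask]
(* {{{1, "a"}, {10, "a"}}} *)
```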

E. Chan-López