12

How do I select only the first pair of numbers in which a number occurs from a list of pairs of numbers?

For example, I have

{{20, 11}, {17, 20}, {26, 5}, {14, 9}, {18, 13}, {19, 11}}

I would like to get

{{20, 11}, {26, 5}, {14, 9}, {18, 13}}

and If I have

{{20, 11}, {17, 20}, {26, 13}, {14, 26}, {11, 20}, {18, 13}, {19, 11}}

I would want

{{20, 11}, {26, 13}}

That is: the resulting list must have the properties that

  • it contains only pairs from the original list, and that
  • it has been pruned (just*) to the extent that each number occurs in only one pair.

*Ideally the result should retains the most pairs (realizing that my examples above may not accomplish this).

orome
  • 12,819
  • 3
  • 52
  • 100
  • Your problem is under-specified. A single number might appear in several pairs in such a way that there is no unique answer. Please constrain your problem to avoid such ambiguity. – David G. Stork Mar 30 '15 at 17:29
  • @DavidG.Stork: Goot point; but I'll accept any answer that has the two properties listed. Ideally, I'd like the one that produces retains the most pairs though (I'll add that to the question). – orome Mar 30 '15 at 17:37
  • How large are the lists you intend to use this on? The current only answer is clean and suffices for small lists but will become horribly inefficient for large lists, but if your lists are small, it's probably as tight of an answer possible. – ciao Mar 30 '15 at 20:18
  • @rasher: Never more than 10. – orome Mar 30 '15 at 20:19
  • @raxacoricofallapatorius: Ah, then you're golden with Kguler`s very pretty answer. Interesting problem on large lists... – ciao Mar 30 '15 at 20:21
  • 1
    @rasher: A general answer would be good here to (others may need it to apply to longer lists). – orome Mar 30 '15 at 20:34
  • @raxacoricofallapatorius: Does order matter? That is, must results appear in same order as the original list? – ciao Mar 30 '15 at 22:00
  • @rasher: Short answer, yes: original list order should be preserved. – orome Mar 30 '15 at 22:04
  • @raxacoricofallapatorius: Added my first thought, will ponder further... – ciao Mar 30 '15 at 23:06
  • @rasher: Looks great! – orome Mar 30 '15 at 23:07

5 Answers5

13

Update: I was able to improve the performance of pruner2b below with a couple of suggestions from the screw on my cell block. In honor of the booze we make in the toilets here, I'll call it pruno:

pruno[lst_] := Module[{f, g},
   g[_] = True;
   f[a_, b_] := If[g[a] && g[b], g[a] = g[b] = False; {a, b},Unevaluated@Sequence[]];
   f @@@ lst];

The original idea and updated benchmark follow.

Just a quick-and-dirty idea:

pruner2b[lst_] := Module[{f},
                f[_] = True;
                Map[If[f[#[[1]]] && f[#[[2]]], (f[#] = False)&/@#; #,Unevaluated@Sequence[]] &, lst]]

This returns precisely the same results as Kguler's DeleteDuplicates solution (I've not proofed that this is "optimal", in the sense of maximizing length of result).

A quick performance comparison using lstx = RandomInteger[{1, 10000}, {10000, 2}]; to generate a test list and then incrementally increasing the amount used: enter image description here

By 500 pairs pruno is over 2 orders of magnitude faster than using DeleteDuplicates and significantly leads pruner2b, ran out of patience much beyond that...

Taking further advantage of the speed of the pattern matcher, this is even faster for lists >~1K pairs on the loungebook (with a nod to Mr. W's "cool kids" comment - no performance difference using ##&[], but certainly prettier):

prunod[lst_] := Module[{f, g},
   g[_] = True;
   f[a_?g, b_?g] := (g[a] = g[b] = False; {a, b});
   f[_, _] = ## &[];
   f @@@ lst];

enter image description here

ciao
  • 25,774
  • 2
  • 58
  • 139
11

Maybe

lst = {{20, 11}, {17, 20}, {26, 5}, {14, 9}, {18, 13}, {19, 11}};
DeleteDuplicates[lst, Intersection[##] != {} &]
(* {{20, 11}, {26, 5}, {14, 9}, {18, 13}} *)

lst2 = {{20, 11}, {17, 20}, {26, 13}, {14, 26}, {11, 20}, {18, 13}, {19, 11}};
DeleteDuplicates[lst2, Intersection[##] != {} &]
(* {{20, 11}, {26, 13}} *)
kglr
  • 394,356
  • 18
  • 477
  • 896
  • Looks good. Bear with me while I check it in production. – orome Mar 30 '15 at 18:03
  • I'd venture on large lists this will get very slow... but neat and tidy, +1 – ciao Mar 30 '15 at 19:32
  • 1
    thank you @rasher. Re slow, thought about using Message to add " caveat emptor: ..." , but that would have made it too long :) – kglr Mar 31 '15 at 05:39
4

A bit shorter than the answer by kglr is

list = {{20, 11}, {17, 20}, {26, 13}, {14, 26}, {11, 20}, {18, 13}, {19, 11}};
DeleteDuplicates[list, IntersectingQ]

But it also gets really slow for large lists.

Henrik Schumacher
  • 106,770
  • 7
  • 179
  • 309
1

Similar to ciao's

list = {{20, 11}, {17, 20}, {26, 5}, {14, 9}, {18, 13}, {19, 11}};

Clear[f]

f[_] = True;

If[f[#1], f[#1] = False;
   If[f[#2], f[#2] = False;
    {##}, Nothing], Nothing] & @@@ list
{{20, 11}, {26, 5}, {14, 9}, {18, 13}}
Chris Degnen
  • 30,927
  • 2
  • 54
  • 108
1

The following works and is fast for big lists.

delDuPairs[list_] :=
 Module[{flist = Flatten@list},
  Delete[list, Transpose@
    {Union@Flatten@Ceiling[Rest /@ Select[
           GatherBy[Range@Length[flist], flist[[#]] &], 
           Length[#] > 1 &]/2]}
   ] ]

The function is based upon this answer to a different question. I'm sure it could be cleaned up a bit.

lstx = RandomInteger[{1, 100000}, {100000, 2}];

First@AbsoluteTiming[delDuPairs[lstx]]

0.397049

geordie
  • 3,693
  • 1
  • 26
  • 33