11

Suppose that I have two lists of different sizes, the smaller of which is a subset of the larger, e.g.

a = {"A", "A", "A", "B", "B", "C"}
b = {"A", "B"}

How can I subtract b from a, treating each element as distinct, such that the result is {"A", "A", "B", "C"}?

One solution I can think of is to use

Merge[{Counts[a], -Counts[b]}, Total]

then reconstruct the result somehow, but is there a simpler way?

Taiki
  • Please hold off on accepting. Better answers may appear, and let's not discourage others. :) – Kuba Nov 19 '14 at 19:46
  • @Kuba As I've asked for a simpler way, marking your answer as accepted is appropriate. – Taiki Nov 20 '14 at 10:11

5 Answers

11
Fold[DeleteCases[##, 1, 1] &, a, b]
{"A", "A", "B", "C"}
Kuba
7

For efficiency, treat them as variables and let Plus do the work:

List @@ (Plus @@ a - Plus @@ b)

and, if you want, expand the result (here {2 "A", "B", "C"}) back into repeated elements:

{2 "A", "B", "C"} /. (n_Integer*v_ :> 
   Sequence @@ ConstantArray[v, n])
(* => {"A", "A", "B", "C"} *)
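
A note on edge cases, as far as I can tell: when only one distinct element survives, the difference is no longer a Plus, so List @@ (2 "A") gives {2, "A"}; and an element that occurs more often in b than in a ends up with a negative coefficient, which ConstantArray will not accept. A minimal sketch that sidesteps both by working on Counts directly is below. The name multisetSubtract is made up for illustration, and KeyValueMap and Catenate need version 10.1 or later.

```mathematica
(* multisetSubtract is a hypothetical name used only for this sketch *)
multisetSubtract[a_List, b_List] :=
 Catenate@KeyValueMap[ConstantArray,
   (* keep only elements whose remaining count is positive *)
   Select[Merge[{Counts[a], -Counts[b]}, Total], Positive]]

multisetSubtract[{"A", "A", "A", "B", "B", "C"}, {"A", "B"}]
(* {"A", "A", "B", "C"} *)

multisetSubtract[{"A", "A"}, {"A", "C"}]
(* {"A"}; the surplus "C" in b is ignored *)
```

The result is grouped by element in order of first appearance in a, rather than sorted.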

Update: Kuba's solution is elegant, but it is not efficient, nor does it maintain a sorted order. Look at the performance:

In[1]:= {a, b} = Map[
   FromCharacterCode /@ RandomInteger[{67, 77}, #] &, {100000, 
    10000}];

In[2]:= AbsoluteTiming[Fold[DeleteCases[##, 1, 1] &, a, b]]
Out[2]= $Aborted (* still running after 20 seconds... *)

In[3]:= AbsoluteTiming[List @@ (Plus @@ a - Plus @@ b)]
Out[3]= {0.041489, {8114 "C", 8177 "D", 8182 "E", 8328 "F", 
  8123 "G", 7934 "H", 8191 "I", 8267 "J", 8348 "K", 8223 "L", 
  8113 "M"}}
M.R.
5

To reconstruct the result somehow:

reconstructF = Join @@ ConstantArray @@@ Normal@# &;

reconstructF@Merge[Total][{Counts[a], -Counts[b]}]
(* {"A", "A", "B", "C"} *)
kglr
4

If (a) preserving the order of the minuend list and (b) respecting the order of the subtrahend list are important to you (or to someone else reading this post), here's a neat solution that seems to perform well:

listComplement1[a_List, b_List] :=
 Module[{j = 1},
  With[{bn = Length@b},
   Select[a, j > bn || # =!= b[[j]] || ++j &]
   ]
  ]

listComplement1[a, b]

(* Out: {"C", "A", "A", "A", "B", "C"} *)

And just because I can, here's a semi-compiled version that uses integer codes for distinct elements! Wheee!

listComplement2Helper =
  Compile[{{a, _Integer, 1}, {b, _Integer, 1}},
   Module[{j = 1},
    With[{bn = Length@b},
     Select[a,
      If[j > bn || # =!= b[[j]],
        True,
        ++j; False
        ] &
      ]
     ]
    ]
   ];

listComplement2[a_List, b_List] := 
 With[{rels = MapIndexed[# -> #2[[1]] &, DeleteDuplicates@a]},
  With[
   {fwdMap = Dispatch@Append[rels, _ -> 0],
    revMap = Dispatch[Reverse /@ rels]},
   Replace[
    listComplement2Helper[
     Replace[a, fwdMap, {1}],
     Replace[b, fwdMap, {1}]],
    revMap,
    {1}
    ]
   ]
  ]

listComplement2[a, b]

(* Out: {"C", "A", "A", "A", "B", "C"} *)

At first I thought that integer codes could speed things up when the list elements are (a) very many, (b) very complicated, and (c) very similar, but now I'm thinking it won't help much (if at all) due to the comparisons that have to happen anyway for DeleteDuplicates... Oh well, still fun.


Caveat: The elements of b are matched against a in order. Once some element of b has no match in the remainder of a, neither it nor any later element of b is removed from a.
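
To make the caveat concrete, here is a small sketch with made-up inputs (a2 and b2 are names chosen only for this example). It repeats the definition of listComplement1 from above so that it runs on its own, and compares the result with the order-insensitive Fold approach from another answer.

```mathematica
listComplement1[a_List, b_List] :=
 Module[{j = 1},
  With[{bn = Length@b},
   Select[a, j > bn || # =!= b[[j]] || ++j &]]]

a2 = {1, 2, 3, 2};
b2 = {2, 9, 3};   (* 9 does not occur in a2 *)

listComplement1[a2, b2]
(* {1, 3, 2}: the first 2 is removed, then matching stalls at 9, so 3 stays *)

Fold[DeleteCases[##, 1, 1] &, a2, b2]
(* {1, 2}: for comparison, Fold removes both the 2 and the 3 *)
```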


Note: This answer previously also suggested a solution using SequenceAlignment. Unfortunately, this approach doesn't seem to work. I couldn't figure out a nice way to adapt SequenceAlignment to the requirements here.

William
  • Sadly it seems that the SequenceAlignment method is not correct. Consider for example: a = {3, 5, 0, 1, 2, 0, 0, 4, 5, 2}; b = {2, 5, 4, 3, 0} -- the result should be {1, 0, 0, 5, 2} but instead it gives {3, 5, 0, 1, 0, 0, 5, 2}. – Mr.Wizard Feb 21 '15 at 08:18
  • @Mr.Wizard, you're right. Playing with it just now, it seems like SequenceAlignment is probably just ill-suited for this. (Or I'm not seeing how to use it correctly.) Thanks for pointing out this error. Answer's now updated. – William Feb 22 '15 at 04:50
  • @Mr.Wizard, incidentally, +1 for this. Very, very neat. – William Feb 22 '15 at 05:20
  • @Mr.Wizard, quite a puzzling ellipsis there. Text lacks tone, though, so what's it for? – William Feb 23 '15 at 16:15
  • That was just because of the minimum character requirement for comments. I did not realize the confusion I created. I'll be more verbose in the future. – Mr.Wizard Feb 23 '15 at 17:55
  • @Mr.Wizard, ah! [facepalm] I completely forgot the comment minimum. I thought I had erred in some informal MSE cultural etiquette and was being gently if vaguely reprimanded, so I was treading lightly... I do feel silly, lol. Sorry for the noise. – William Feb 23 '15 at 19:12
  • No problem. I am glad you asked. Please always do if something seems "off" like that; no need to guess. – Mr.Wizard Feb 23 '15 at 19:16
  • P.S. I forgot to upvote after you addressed the SequenceAlignment problem. Corrected. – Mr.Wizard Feb 23 '15 at 19:18
2
a = {2 "A", "A", "A", "B", "B", "C"};
b = {"A", "B"}; 

Delete[a, First[Position[a, #, 1]] & /@ b]
(*{2 "A", "A", "B", "C"}*)
Basheer Algohi