This question has been addressed adequately for pattern matching. However, I have a similar task with floating-point data: I am looking for an efficient way to delete duplicates between two lists to within a tolerance value (preferably with a conditional test between the elements of the two lists).
2 Answers
master = RandomReal[{0, 1}, 1000];
sub = RandomReal[{0, 1}, 100];
{roundm, rounds} = Round[{master, sub}, 0.01];
s2 = DeleteDuplicates[Join[roundm, {Null}, rounds]] /. {a__, Null, b___} :> {b};
sub2 = Extract[sub, Flatten[Position[rounds, #] & /@ s2, 1]]
sub2 contains all the reals in sub whose rounded value does not match any rounded value in master; the tolerance is set by the second argument of Round.
For a faster version, use makePositionFunction in the last step:
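makePositionFunction[f_Symbol, data_, level_: {-1}] := Block[{},
  ClearAll[f];
  (* store the positions of each distinct element of data as a definition f[element] *)
  Reap[
    MapIndexed[Sow[#2, #1] &, data, level, Heads -> True],
    _, (f[#] = #2) &];
  (* fall back to Position for anything that was not stored *)
  f[other_] := Position[data, other, level]]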
makePositionFunction[pos, rounds];
sub2 = Extract[sub, Flatten[pos /@ s2, 1]]
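Note that rounding compares bins rather than distances, so two values closer than the bin width can still land in different bins (0.0149 and 0.0151 round to 0.01 and 0.02 with a step of 0.01). If a strict distance test is needed, a minimal sketch along the lines of the "conditional statement" asked for might look like the following. The names tol, farFromMasterQ and sub3 are made up for illustration, and the tolerance value is an arbitrary assumption.

```mathematica
(* Sketch only: tol, farFromMasterQ and sub3 are hypothetical names, not part of the answer above *)
SeedRandom[1];
master = RandomReal[{0, 1}, 1000];
sub = RandomReal[{0, 1}, 100];
tol = 0.005; (* assumed tolerance *)

(* True when x is farther than tol from every element of master *)
farFromMasterQ[x_] := NoneTrue[master, Abs[x - #] <= tol &];

(* keep only the elements of sub with no match in master within tol *)
sub3 = Select[sub, farFromMasterQ]
```

This tests every element of sub against every element of master, so it is simple rather than fast; for long lists a Nearest-based lookup is likely the better choice.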
Chris Degnen
SeedRandom[1]
ClearAll[list, masterlist]
{list, masterlist} = Round[RandomReal[1, {2, 10}], 0.01];
tolerance = .02;
dist = Nearest[masterlist->"Distance"];
Pick[list, dist[#][[1]]>tolerance & /@ list] (* or *)
Pick[list, UnitStep[dist[#][[1]] & /@ list - tolerance], 1]
{0.11, 0.79, 0.07, 0.54, 0.7}
dist0 = Nearest[masterlist -> {"Element", "Distance"}];
res = {#, dist0[#][[1]]} & /@ list;
Grid[Prepend[If[#[[2, -1]] > tolerance, Flatten@Map[Style[#, Red, Bold] &, #, {-1}],
Flatten@#] & /@ res,
{Column[{"element in", "list"}, Alignment -> Center],
Column[{"Nearest" , "element in", "masterlist"}, Alignment -> Center], "distance"}],
Dividers -> All, Alignment -> {Center, Center}]
kglr

DeleteDuplicates/DeleteDuplicatesBy? – Kuba Feb 28 '18 at 11:09
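One possible reading of this comment, as a sketch only: DeleteDuplicates accepts a pairwise test as its second argument, and DeleteDuplicatesBy compares elements after applying a function. Both remove near-duplicates within a single list and do not by themselves compare against a separate master list. The sample data and the names data, tol, uniqueByTest and uniqueByBin are made up for illustration.

```mathematica
(* Sketch only: data, tol, uniqueByTest and uniqueByBin are hypothetical names *)
data = {0.10, 0.104, 0.50, 0.503, 0.90};
tol = 0.01; (* assumed tolerance *)

(* treat two elements as duplicates when they differ by at most tol *)
uniqueByTest = DeleteDuplicates[data, Abs[#1 - #2] <= tol &]

(* treat two elements as duplicates when they round to the same multiple of tol *)
uniqueByBin = DeleteDuplicatesBy[data, Round[#, tol] &]
```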