4

Given a list like

list={{1},{2,3},{3,4,6},{6,7,8,5}};

I would like to very efficiently delete all sublists that are longer than a specific length, i.e.:

delLongSublists[list,2]

{{1},{2,3}}

My attempt at writing this function is:

delLongSublists[in_,q_]:=DeleteCases[in, Length[_List] > q, 1]

Unfortunately this does not work at all:

delLongSublists[list, 2]

{{1},{2,3},{3,4,6},{6,7,8,5}}

Any suggestion on how to write this function computationally efficiently?

march
Kagaratsch

4 Answers

9

This is as fast as I can do:

delLong[list_, length_] := Pick[list, UnitStep[Length /@ list - (1 + length)], 0]

delLong[{{1}, {2, 3}, {3, 4, 6}, {6, 7, 8, 5}}, 2]
(*  {{1}, {2, 3}}  *)
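To see how the mask is built, the intermediate values for the small example can be inspected step by step. This is only an illustrative walk-through; the names small, lens and mask are hypothetical and not part of the code above:

```mathematica
small = {{1}, {2, 3}, {3, 4, 6}, {6, 7, 8, 5}};

(* length of every sublist *)
lens = Length /@ small
(* {1, 2, 3, 4} *)

(* UnitStep gives 1 for arguments >= 0, i.e. for lengths >= 1 + 2 *)
mask = UnitStep[lens - (1 + 2)]
(* {0, 0, 1, 1} *)

(* keep the elements whose mask entry is 0 *)
Pick[small, mask, 0]
(* {{1}, {2, 3}} *)
```

The subtraction and UnitStep act on the whole list of lengths at once, which is presumably why they add so little on top of Length /@ list.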

Big example, with check and timing analysis:

SeedRandom[0];
list = Table[RandomInteger[1, RandomInteger[5]], {10000}];

listlengths = Length /@ list;
Accumulate[Last /@ Sort@Tally@listlengths]
Table[Length@delLong[list, n], {n, 0, 5}]
(*
  {1707, 3385, 4960, 6645, 8351, 10000}
  {1707, 3385, 4960, 6645, 8351, 10000}
*)

Table[delLong[list, n], {n, 6}]; // RepeatedTiming
(*  {0.012, Null}  *)

Table[Length /@ list, {n, 6}]; // RepeatedTiming
(*  {0.012, Null}  *)

It's hard to see how it could be done faster than Length /@ list.

Michael E2
  • I don't quite understand what the Accumulate is doing. But the last two commands show that deleting the long sublists using your delLong effectively takes as much time as just counting the lengths of all sublists? That sounds pretty fast, since the extra operations seem to produce no overhead. – Kagaratsch Aug 09 '16 at 12:31
  • 1
    @Kagaratsch The sorted tally tells how many lists there are of lengths 0, 1, 2, etc. Accumulate adds them up, so that, say, the third entry in the output indicates how many lists have length 0, 1, or 2. That's just a check to compare with the output of delLong -- Yes, after counting the lengths of the sublists, remaining computation is at least an order of magnitude faster. See "vectorized", sects. 1.3 & 2.3. – Michael E2 Aug 09 '16 at 12:44
  • I wonder if it's worth packing the array given by Length /@ list. – LLlAMnYP Aug 10 '16 at 07:27
  • @LLlAMnYP I observed no significant difference for lists up to length 10^6. – Michael E2 Aug 10 '16 at 10:50
7

I am sure there are many ways to do this in Mathematica. One way could be:

del[in_List, (q_Integer)?Positive]:=DeleteCases[in, x_ /; Length[x] > q];
list = {{1}, {2, 3}, {3, 4, 6}, {6, 7, 8, 5}};
del[list, 2]

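For context on why the attempt in the question returns the list unchanged, as far as I can tell: DeleteCases evaluates its arguments, so Length[_List] is computed before any matching happens. Since _List is Blank[List], an expression with one argument, its length is 1, and the pattern collapses to the result of 1 > q (False for q = 2), which matches no sublist. A minimal sketch, where delFixed is a hypothetical name used only for illustration:

```mathematica
(* _List is Blank[List], an expression with one argument *)
Length[_List]
(* 1 *)

(* attach the length test as a condition on a named pattern instead *)
delFixed[in_List, q_Integer] := DeleteCases[in, x_List /; Length[x] > q]

delFixed[{{1}, {2, 3}, {3, 4, 6}, {6, 7, 8, 5}}, 2]
(* {{1}, {2, 3}} *)
```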

Another way, using Pick.

del2[in_List, (q_Integer)?Positive] := Pick[in, (Length[#] <= q & /@ in)]


Nasser
4

Just another way:

del[lst_, t_] := Pick[lst, Length@# <= t & /@ lst]

e.g.

del[list, #] & /@ Range[4]

yields:

{{{1}}, {{1}, {2, 3}}, {{1}, {2, 3}, {3, 4, 6}}, {{1}, {2, 3}, {3, 4, 6}, {6, 7, 8, 5}}}

ubpdqn
  • oh, I added Pick also and just saw your answer using Pick. I do not know which is faster though. Did not time them. Normally Pick is supposed to be fast. – Nasser Aug 09 '16 at 10:03
  • @Nasser I have not tested and am not sure how large a collection is targeted. It may not make any meaningful difference for small sizes. As you commented, 'many ways to do this'... the joy of Mathematica :) – ubpdqn Aug 09 '16 at 10:06
3

Example

del[lst_, t_] := Select[lst, Length @ #  <= t &]
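To compare this with a Pick-based approach on a larger input, something along these lines could be tried. It is only a sketch: the test data and the names big, viaSelect and viaPick are assumptions made for illustration, and no timings are quoted here because they depend on the machine and version.

```mathematica
SeedRandom[0];
(* hypothetical test data: 10000 sublists of random length 0 to 5 *)
big = Table[RandomInteger[1, RandomInteger[5]], {10000}];

viaSelect = Select[big, Length[#] <= 2 &];
viaPick = Pick[big, UnitStep[Length /@ big - 3], 0];

(* sanity check that both approaches keep the same sublists *)
viaSelect === viaPick

(* timings in seconds; RepeatedTiming returns {time, result} *)
First@RepeatedTiming[Select[big, Length[#] <= 2 &];]
First@RepeatedTiming[Pick[big, UnitStep[Length /@ big - 3], 0];]
```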
e.doroskevic