22

If you have a simple list of lists as follows:

test = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}}

How do you ask Mathematica to return the sublist of greatest length?

I've been trying to write a Select command using pure functions without success.

Arnoud Buzing
Todd Allen

10 Answers

23

If you only want one item from the resulting list, you can use the two-argument form of Ordering instead of Sort to be a bit more efficient:

test[[Ordering[test, -1]]]

biglist = 
  Table[RandomInteger[10, RandomInteger[100]], {10^5}];

Timing[biglist[[Ordering[biglist, -1]]]]

(*
==> {0.006476, {{10, 10, 10, 3, 4, 7, 4, 3, 9, 8, 8, 1, 2, 1, 5, 
   10, 10, 10, 9, 4, 6, 6, 9, 1, 2, 10, 8, 3, 0, 9, 1, 2, 5, 1, 1, 2, 
   7, 8, 9, 10, 8, 4, 8, 4, 7, 9, 3, 4, 5, 1, 6, 6, 4, 5, 8, 6, 3, 2, 
   6, 4, 9, 9, 9, 7, 1, 10, 4, 2, 10, 8, 0, 8, 1, 0, 9, 10, 7, 4, 5, 
   3, 6, 6, 6, 4, 2, 3, 1, 4, 9, 6, 5, 1, 8, 10, 0, 1, 3, 5, 10, 4}}}
*)

Timing[Last@Sort@biglist]

(*
==> {0.170369, {10, 10, 10, 3, 4, 7, 4, 3, 9, 8, 8, 1, 2, 1, 5, 
  10, 10, 10, 9, 4, 6, 6, 9, 1, 2, 10, 8, 3, 0, 9, 1, 2, 5, 1, 1, 2, 
  7, 8, 9, 10, 8, 4, 8, 4, 7, 9, 3, 4, 5, 1, 6, 6, 4, 5, 8, 6, 3, 2, 
  6, 4, 9, 9, 9, 7, 1, 10, 4, 2, 10, 8, 0, 8, 1, 0, 9, 10, 7, 4, 5, 3,
   6, 6, 6, 4, 2, 3, 1, 4, 9, 6, 5, 1, 8, 10, 0, 1, 3, 5, 10, 4}}
*)
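
Note that Ordering[test, -1] returns a one-element list of positions, so indexing with it gives the longest sublist wrapped in an extra list, as in the first timing above. A minimal sketch of unwrapping it, assuming the small test list from the question (the name pos is only for illustration):

```mathematica
test = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}};

(* Ordering gives positions, not elements; -1 asks for the last position in sorted order *)
pos = Ordering[test, -1]     (* {2} *)

(* Part with a list of positions returns a list of elements *)
test[[pos]]                  (* {{4, 5, 6, 7}} *)

(* First unwraps the single result *)
First[test[[pos]]]           (* {4, 5, 6, 7} *)
```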
Brett Champion
  • How could I forget about this one? +2 if I could! (It should be noted that this only returns one list and not ties.) – Mr.Wizard Feb 05 '12 at 03:30
  • Like R.M.'s and Spartacus' answers, this returns only one of the longest lists, so none of these is a complete solution when there are ties. – Artes Feb 05 '12 at 10:49
18

One possibility:

test = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}};
lengths = Length /@ test;
max = Max[lengths];
pos = Position[lengths, max];
Extract[test, pos]

gives:

{{4, 5, 6, 7}}

If there are two or more sublists that are of 'greatest length' those will also be returned.
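
To see how ties are handled, here is a sketch of the intermediate values on a list with two sublists of maximal length. The name test2 and its contents are chosen only for illustration:

```mathematica
test2 = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {2, 2, 3, 4}};

lengths = Length /@ test2          (* {2, 4, 3, 4} *)
max = Max[lengths]                 (* 4 *)

(* Position returns every matching index, each wrapped in a list *)
pos = Position[lengths, max]       (* {{2}, {4}} *)

(* Extract accepts that position format directly *)
Extract[test2, pos]                (* {{4, 5, 6, 7}, {2, 2, 3, 4}} *)
```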

Arnoud Buzing
  • This is faster than I thought it would be, slightly faster than Last@SortBy[test, {Length}] on my test data, and quite a bit faster than the infix thing. – Mr.Wizard Feb 05 '12 at 02:26
  • Your solution is the best, though not in price, see http://mathematica.stackexchange.com/questions/1342/selecting-a-sublist-based-on-length/1346#1346 – Artes Feb 06 '12 at 11:14
15

Sort automatically sorts by length, so it is as simple as

Last@Sort@test
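
For lists of numbers like these, the canonical order used by Sort puts shorter lists first and then compares elements within lists of equal length. A small sketch with a tie (test2 is an illustrative name) shows that Last then picks whichever of the longest sublists sorts last, not necessarily the one that appears first:

```mathematica
test2 = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {2, 2, 3, 4}};

Sort[test2]
(* {{1, 2}, {5, 4, 3}, {2, 2, 3, 4}, {4, 5, 6, 7}} *)

Last[Sort[test2]]
(* {4, 5, 6, 7} *)
```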
rm -rf
13

Reasonably fast and quite direct, but returns only one list if there are ties:

Last@SortBy[test, {Length}]

More whimsical but catching ties (warning: infix ahead):

test ~SortBy~ Length ~SplitBy~ Length // Last

Since Arnoud's method tests the fastest for functions that include ties, here is my terse version of it:

longest[L_List] := L ~Extract~ Position[#, Max@#] &[Length /@ L]
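
For readers unfamiliar with infix notation, here is a sketch of the same pipeline written step by step. The names test2, sorted and groups are only for illustration. Because SortBy breaks ties by canonical order, the longest sublists may come back in a different order than in the input:

```mathematica
test2 = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {2, 2, 3, 4}};

(* a ~f~ b ~g~ c groups to the left: g[f[a, b], c] *)
sorted = SortBy[test2, Length]
(* {{1, 2}, {5, 4, 3}, {2, 2, 3, 4}, {4, 5, 6, 7}} *)

(* SplitBy groups adjacent elements that share the same length *)
groups = SplitBy[sorted, Length]
(* {{{1, 2}}, {{5, 4, 3}}, {{2, 2, 3, 4}, {4, 5, 6, 7}}} *)

Last[groups]
(* {{2, 2, 3, 4}, {4, 5, 6, 7}} *)

(* Same result as the infix one-liner *)
Last[groups] === (test2 ~SortBy~ Length ~SplitBy~ Length // Last)
(* True *)
```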
Mr.Wizard
12

If you have V10, consider using MaximalBy:

MaximalBy[test, Length, 1]

Notice that MaximalBy[test, Length] returns all of the longest lists. Similarly, there is also MinimalBy.
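
A short sketch of the difference between the two forms, on an illustrative list test2 with a tie. Only the lengths of the results are shown, since which of the tied sublists the three-argument form returns is not something this sketch relies on:

```mathematica
test2 = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {2, 2, 3, 4}};

(* All sublists of maximal length *)
all = MaximalBy[test2, Length];
Length /@ all               (* {4, 4} *)

(* Limited to one sublist of maximal length *)
one = MaximalBy[test2, Length, 1];
Length /@ one               (* {4} *)

(* The shortest sublists, by symmetry *)
MinimalBy[test2, Length]    (* {{1, 2}} *)
```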

Juho
10

A solution using Select is :

max = Max[Length /@ test];
Select[test, Length[#] == max &]

This solution, like Arnoud's and J.M.'s, is preferable when more than one sublist has the maximal length. E.g. for

test = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {2, 2, 3, 4}};

this returns

 {{4, 5, 6, 7}, {2, 2, 3, 4}}

Edit

Since performance may matter, I compared the approaches presented here (only those that return all longest sublists) on a very long list; the results below are ordered from fastest to slowest. On smaller lists the timing ratios may change slightly, but the order is generally preserved.

longlist = Table[RandomInteger[{-10, 10}, RandomInteger[100]], {10^6}];

{lengths = Length /@ longlist;                            (* Arnoud *)
  max = Max[lengths];
  pos = Position[lengths, max];
  Extract[longlist, pos];} // Timing
(* ==> {0.422, {Null}} *)

{max = Max[Length /@ longlist];                           (* Artes *)
  Select[longlist, Length[#] == max &];} // Timing
(* ==> {1.685, {Null}} *)

Pick[longlist, #, Max[#]] &[Length /@ longlist]; // Timing    (* J.M. *)
(* ==> {2.012, Null} *)

longlist~SortBy~Length~SplitBy~Length // Last; // Timing      (* Spartacus *)
(* ==> {7.098, Null} *)

allMaxBy[longlist, Length]; // Timing                         (* Szabolcs *)
(* ==> {7.144, Null} *)

Artes
  • Somewhat different results are had with longlist = RandomInteger[99, #] & /@ RandomInteger[{1, 5000}, 15000]; – Mr.Wizard Feb 06 '12 at 19:06
  • For this list your method and Szabolcs' seem to be 7 times faster than J.M.'s approach, though more than 2 times slower than mine and 7 times slower than Arnoud's. – Artes Feb 06 '12 at 19:22
5

Alternatively:

test = {{2, 3}, {1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {8, 9, 10, 11}};

Pick[test, #, Max[#]] &[Length /@ test]
{{4, 5, 6, 7}, {8, 9, 10, 11}}
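
The pure function above can be unpacked as follows; this is only a sketch, and the intermediate name lengths is introduced for illustration:

```mathematica
test = {{2, 3}, {1, 2}, {4, 5, 6, 7}, {5, 4, 3}, {8, 9, 10, 11}};

lengths = Length /@ test      (* {2, 2, 4, 3, 4} *)

(* Pick[list, sel, patt] keeps the elements of list whose partner in sel matches patt *)
Pick[test, lengths, Max[lengths]]
(* {{4, 5, 6, 7}, {8, 9, 10, 11}} *)
```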
J. M.'s missing motivation
3

I sometimes use a little function MaxBy, made to be analogous with SortBy:

MaxBy[list_, fun_] := list[[First@Ordering[fun /@ list, -1]]]

You need the largest element by length, so you can evaluate

MaxBy[data, Length]

Note: this is based on the same principle as @Brett's solution, but it is slower. @Brett's and @R.M's exploit the fact that Mathematica sorts by length by default, while my solution explicitly uses Length. I still think it's a useful little function, so I shared it again.

The problem with MaxBy is that it only returns a single element, while there may be more than one list of the same length. Here's a somewhat slow but simple implementation that returns all maxima:

allMaxBy[data_, fun_] := Last@SplitBy[SortBy[data, fun], fun]
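
A brief usage sketch of both helpers on the lists from this thread (test2 is an illustrative name). Because allMaxBy relies on SortBy, the tied sublists come back in sorted order rather than input order:

```mathematica
(* Definitions repeated from above so the example is self-contained *)
MaxBy[list_, fun_] := list[[First@Ordering[fun /@ list, -1]]]
allMaxBy[data_, fun_] := Last@SplitBy[SortBy[data, fun], fun]

test = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}};
MaxBy[test, Length]         (* {4, 5, 6, 7} *)

test2 = Append[test, {2, 2, 3, 4}];
allMaxBy[test2, Length]     (* {{2, 2, 3, 4}, {4, 5, 6, 7}} *)
```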
Szabolcs
3

We could also use TakeLargestBy:

list = {{1, 2}, {4, 5, 6, 7}, {5, 4, 3}};

Take the largest list by length:

TakeLargestBy[Length, 1] @ list

{{4, 5, 6, 7}}

Take the two largest lists with additional information:

TakeLargestBy[list -> All, Length, 2]

gives

{<|"Element" -> {4, 5, 6, 7}, "Index" -> 2, "Value" -> 4|>,
 <|"Element" -> {5, 4, 3}, "Index" -> 3, "Value" -> 3|>}
eldo
3

Nearest was improved in V10.1. Nearest, applied to Length /@ longlist, and MaximalBy both compete with the pre-V10 solution by Arnoud. Two ways of using Nearest are presented, although there is not much difference between them. If we speed up Arnoud's by compiling Position, they are in a virtual dead heat. For ease of use and elegance of expression, MaximalBy seems the winner.

longlist = Table[RandomInteger[{-10, 10}, RandomInteger[100]], {10^6}];

{lengths = Length /@ longlist;                       (*Arnoud*)
  max = Max[lengths];
  pos = Position[lengths, max];
  Extract[longlist, pos];} // RepeatedTiming
(*  {0.424, {Null}}  *)

MaximalBy[longlist, Length, 1]; // RepeatedTiming    (* mrm *)
(*  {0.396, Null}  *)

Part[longlist, 
   Nearest[# -> Automatic, Max[#]] &[Length /@ longlist]]; // RepeatedTiming
(*  {0.388, Null}  *)

Nearest[# -> longlist, Max[#]] &[Length /@ longlist]; // RepeatedTiming
(*  {0.392, Null}  *)

Packing the lengths of longlist helps a bit here.

{lengths = Developer`ToPackedArray[Length /@ longlist];   (*Arnoud*)
  max = Max[lengths];
  pos = Compile[{{lengths, _Integer, 1}, {max, _Integer}}, 
     Position[lengths, max]][lengths, max];
  Extract[longlist, pos];} // RepeatedTiming
(*  {0.38, {Null}}  *)

Part[longlist, 
   Nearest[# -> Automatic, Max[#]] &[
    Developer`ToPackedArray[Length /@ longlist]]]; // RepeatedTiming
(*  {0.375, Null}  *)

Nearest[# -> longlist, Max[#]] &[
   Developer`ToPackedArray[Length /@ longlist]]; // RepeatedTiming
(*  {0.376, Null}  *)

And packing really helps here (original was 1.26 sec. on my machine):

Pick[longlist, #, Max[#]] &[                              (* Guess... *)
   Developer`ToPackedArray[Length /@ longlist]]; // RepeatedTiming
(*  {0.365, Null}  *)
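
To check whether packing is doing anything in a given session, one can test the list of lengths before and after ToPackedArray. This is a sketch on a smaller list; the names small, lengths and packed are illustrative, and whether the unpacked result of Map is already packed may depend on version and system settings, so no output is claimed for that line:

```mathematica
small = Table[RandomInteger[{-10, 10}, RandomInteger[100]], {1000}];

lengths = Length /@ small;
packed = Developer`ToPackedArray[lengths];

(* May be True or False depending on version and settings *)
Developer`PackedArrayQ[lengths]

(* After explicit packing, a flat list of machine integers is packed *)
Developer`PackedArrayQ[packed]     (* True *)

(* Packing changes the storage only, not the value *)
packed === lengths                 (* True *)
```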
Michael E2