
Given a list:

lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101}

What is a simple way to extract the second largest elements?

In[1]:= someFunction[lis]

Out[1]= {44, 44, 44}

Conor

9 Answers


One way, not highly efficient:

lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101};

lis ~Cases~ Union[lis][[-2]]
{44, 44, 44}

This should be a bit more efficient:

ConstantArray @@ Sort[Tally@lis][[-2]]

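Caveat: both of these methods rely on sorting, which uses canonical order rather than numerical order, and therefore require explicit numbers. Exact numeric expressions such as Pi or Sqrt[2] are not sorted by value.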
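To see why the Tally version works, it may help to look at the intermediate results. This is only an illustrative breakdown of the one-liner above; the names tally and sorted are introduced here for the example.

```mathematica
lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101};

(* Tally pairs each distinct element with its count, in order of first appearance *)
tally = Tally[lis]
(* {{37.21, 2}, {37.2, 1}, {44, 3}, {101, 2}} *)

(* Sort orders the {value, count} pairs by value, since all values are distinct *)
sorted = Sort[tally]
(* {{37.2, 1}, {37.21, 2}, {44, 3}, {101, 2}} *)

(* The second-to-last pair is {value, count}; ConstantArray rebuilds the run *)
ConstantArray @@ sorted[[-2]]
(* {44, 44, 44} *)
```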


flinty's method with refinements by both C. E. and me:

Pick[lis, lis, RankedMax[DeleteDuplicates@lis, 2]]

This appears to be the fastest overall and it avoids the sorting issue referenced above.
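A small sketch of the sorting issue, using a hypothetical list exact of exact quantities that is not from the original question. Sort and Union order expressions canonically, so a symbol such as Pi lands after all explicit numbers, whereas RankedMax compares by numeric value and should therefore pick Pi as the second largest here.

```mathematica
exact = {Pi, 3, 4, 4, Pi};

(* Union sorts canonically: explicit numbers come before the symbol Pi *)
Union[exact]
(* {3, 4, Pi} *)

(* so the sort-based method selects 4, which is actually the largest value *)
exact ~Cases~ Union[exact][[-2]]
(* {4, 4} *)

(* RankedMax ranks by numeric value instead of canonical order *)
second = RankedMax[DeleteDuplicates@exact, 2];

(* Pick keeps the elements of exact that match the value found above *)
picked = Pick[exact, exact, second]
```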


Benchmarking

A quick test of the methods posted so far reveals an interesting pattern. Note that in the benchmark I use a list of a fixed length of one million and vary the number of unique elements within that list.

Update: added methods f5, f6, and f7, and a second test with unpackable data.

Performed in Mathematica 10.1

Needs["GeneralUtilities`"]

SetOptions[Benchmark, TimeConstraint -> 30];

f1[lis_] := lis ~Cases~ Union[lis][[-2]]
f2[lis_] := ConstantArray @@ Sort[Tally@lis][[-2]]
f3[lis_] := MaximalBy[DeleteCases[lis, Max@lis], # &] (* Conor/kglr *)
f4[lis_] := Split[Sort@lis][[-2]]  (* kglr *)
f5[lis_] := Pick[lis, lis - RankedMax[DeleteDuplicates@lis, 2], 0]; (* flinty/C. E. *)
f6[lis_] := Extract[List/@KeySort[PositionIndex[lis]][[-2]]][lis] (* CA Trevillian *)
f7[lis_] := Pick[lis, lis, RankedMax[DeleteDuplicates@lis, 2]] (* flinty/C.E./me *)

BenchmarkPlot[{f1, f2, f3, f4, f5, f6, f7},
  RandomInteger[#, 1*^6] &, 10^Range[6], Joined -> True]

BenchmarkPlot[{f1, f2, f3, f4, f5, f6, f7},
  Prepend[0.5]@RandomInteger[#, 1*^6] &, 10^Range[6], Joined -> True]

[Figure: benchmark plot for the first BenchmarkPlot call (integer data)]

[Figure: benchmark plot for the second BenchmarkPlot call (unpackable data)]
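For readers unfamiliar with the two test inputs, here is a quick check of what the generators produce. It is a sketch only; the names packed and mixed are introduced for illustration, and Developer`PackedArrayQ is used to inspect the storage format.

```mathematica
(* RandomInteger[k, n] draws n integers from 0..k, so at most k + 1 distinct values *)
packed = RandomInteger[10, 1*^6];
Length@DeleteDuplicates@packed <= 11
(* True *)

(* An integer-only result is stored as a packed array *)
Developer`PackedArrayQ[packed]
(* True *)

(* Prepending a Real to the integers yields an ordinary, unpacked list *)
mixed = Prepend[0.5]@RandomInteger[10, 1*^6];
Developer`PackedArrayQ[mixed]
(* False *)
```

Varying the first argument of RandomInteger is what changes the number of unique elements while the list length stays at one million.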

Mr.Wizard
  • +1 Using v12.1 on my Mac, the benchmarks for f1, f3, and f4 all stop at n == 10^5, only f2 goes up to n == 10^6. Also, the PlotMarkers are visible in the plot just like in the PlotLegends. – Bob Hanlon Jun 13 '20 at 14:19
  • flinty's DeleteDuplicates + RankedMax approach appears to be competitive when used in conjunction with Pick to select all elements. Benchmark - f5 is flinty's method. – C. E. Jun 13 '20 at 20:41
  • @BobHanlon Try SetOptions[BenchmarkPlot, TimeConstraint -> 30]. I don't need this in version 10.1 when explicitly specifying test points, but I think that should do it. – Mr.Wizard Jun 13 '20 at 22:36
  • @C.E. Benchmark updated. – Mr.Wizard Jun 13 '20 at 22:37
  • @Mr.Wizard - I get the error message SetOptions::optnf : TimeConstraint is not a known option for BenchmarkPlot. On the first plot, only f2 and f5 go to 10^6. The others only go to 10^5. On the second plot, f2 only goes to 10^5 and the others only go to 10^4. There appears to have been significant changes since v10.1, not all for the better. – Bob Hanlon Jun 14 '20 at 00:42
  • @BobHanlon It seems that the TimeConstraint option is no longer passed down. Please try SetOptions[Benchmark, TimeConstraint -> 30] after loading GeneralUtilities and see if that works. – Mr.Wizard Jun 14 '20 at 01:00
  • @Mr.Wizard - that fixed everything. Thanks. – Bob Hanlon Jun 14 '20 at 01:42

Another way:

MaximalBy[DeleteCases[lis, Max@lis], # &]
{44, 44, 44}
Conor
Split[Sort @ lis][[-2]]
{44, 44, 44}

Also

Nearest[DeleteCases[Max @ #] @ #, Max @ #] & @ lis
{44, 44, 44}
kglr

Find the second largest unique element:

RankedMax[DeleteDuplicates@lis, 2]

... or alternatively:

Last@TakeLargest[DeleteDuplicates@lis, 2]

There are multiple ways to get them all:

Cases[lis, RankedMax[DeleteDuplicates@lis, 2]]
Cases[lis, Last@TakeLargest[DeleteDuplicates@lis, 2]]
Select[lis, # == Last@TakeLargest[DeleteDuplicates@lis, 2] &]
flinty
  • +1. It's faster to use Pick to get all, e.g. Pick[lis, lis - RankedMax[DeleteDuplicates@lis, 2], 0]; The Select one can be sped up by using With to inject the sought after value into the anonymous function (but it will still be much slower than alternatives). – C. E. Jun 13 '20 at 20:16
  • @C.E. This seems faster still: Pick[lis, lis, RankedMax[DeleteDuplicates@lis, 2]] – Mr.Wizard Jun 14 '20 at 06:58
  • @Mr.Wizard Thank you. I was messing around and somehow didn’t notice... – C. E. Jun 14 '20 at 08:35
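The With suggestion from the comments can be sketched as follows. The names v and viaWith are illustrative only. The point is that the second-largest value is computed once and substituted into the function body, instead of being recomputed for every element that Select tests.

```mathematica
lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101};

(* Compute the target once; With substitutes its value into the pure function *)
viaWith = With[{v = RankedMax[DeleteDuplicates@lis, 2]},
  Select[lis, # == v &]]
(* {44, 44, 44} *)
```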

This is probably far more expensive than the other methods, and it could likely be improved. I also find it odd that Ordering has no built-in handling for duplicated values.

Extract[List/@KeySort[PositionIndex[lis]][[-2]]][lis]
{44, 44, 44}

You can just grab the positions directly with

KeySort[PositionIndex[lis]][[-2]]
{4, 5, 6}

Though, I will say this is the only presented method so far that "Extracts" the second-largest value(s) in a list ;)

This form is easier to read:

lis[[KeySort[PositionIndex[lis]][[-2]]]]
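The intermediate results make the PositionIndex approach easier to follow. This is a step-by-step sketch of the expressions above; idx, sortedIdx, and pos are names introduced here for illustration.

```mathematica
lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101};

(* Map each distinct value to the positions where it occurs *)
idx = PositionIndex[lis]
(* <|37.21 -> {1, 2}, 37.2 -> {3}, 44 -> {4, 5, 6}, 101 -> {7, 8}|> *)

(* Order the association by its keys, i.e. by value *)
sortedIdx = KeySort[idx]
(* <|37.2 -> {3}, 37.21 -> {1, 2}, 44 -> {4, 5, 6}, 101 -> {7, 8}|> *)

(* Part on an association returns the value at that position *)
pos = sortedIdx[[-2]]
(* {4, 5, 6} *)

lis[[pos]]
(* {44, 44, 44} *)
```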
CA Trevillian

Another solution:

lis // DeleteCases[#, Max@#]& // Cases[#, Max@#]&
sakra
Select[Select[c=Sort[lis],#!=Last[c] &],#==Last[Select[c,#!=Last[c] &]]&]
Reda.Kebbaj

Another solution using Tally, SortBy and ConstantArray:

ConstantArray[#[[1]], #[[2]]] &@SortBy[Tally[lis], First][[-2]]

{44, 44, 44}
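When building the result from Tally, the sort key determines which element is selected. A small check with a hypothetical list data, chosen so that the most frequent value is not the second largest:

```mathematica
data = {1, 1, 1, 2, 3};

Tally[data]
(* {{1, 3}, {2, 1}, {3, 1}} *)

(* Sorting by value (First) and taking the second-to-last pair gives the second largest *)
byValue = ConstantArray @@ SortBy[Tally[data], First][[-2]]
(* {2} *)

(* Sorting by count (Last) and taking the last pair gives the most frequent value instead *)
byCount = ConstantArray @@ Last@SortBy[Tally[data], Last]
(* {1, 1, 1} *)
```

For the list in the question the two keys happen to agree, because 44 is both the second-largest and the most frequent value.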

E. Chan-López
list = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101};

Using TakeLargestBy

spl = Split @ list

{{37.21, 37.21}, {37.2}, {44, 44, 44}, {101, 101}}

n = 2;

TakeLargestBy[spl, First, n][[n]]

{44, 44, 44}

To also get the position (in the split list):

n = 3;

TakeLargestBy[spl -> {"Element", "Index"}, First, n][[n]]

{{37.21, 37.21}, 1}
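One point worth keeping in mind: Split only groups adjacent equal elements, so applying it directly works for the list above because equal values are already contiguous. A sketch with a hypothetical shuffled list unsorted shows the difference and one possible remedy, sorting first:

```mathematica
unsorted = {44, 101, 37.2, 44, 101, 44};

(* No equal neighbours, so every element ends up in its own run *)
Split[unsorted]
(* {{44}, {101}, {37.2}, {44}, {101}, {44}} *)

(* Sorting first makes equal values adjacent *)
spl = Split[Sort[unsorted]]
(* {{37.2}, {44, 44, 44}, {101, 101}} *)

TakeLargestBy[spl, First, 2][[2]]
(* {44, 44, 44} *)
```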

eldo