6

The situation is that I have a List like this:

A = { {1, 11}, {1, 12}, {1, 17}, {1, 20}, {1, 52}, {4, 179}, {5, 10},
      {5, 55}, {12, 16}, {13, 274}, {15, 206}, {17, 16}, {17, 23}, {17, 24},
      {17, 25}, {22, 82}, {26, 131}, {30, 222}, {30, 391}, {40, 24},
      {41, 31}, {41, 54}, {41, 78}, {41, 82}, {43, 42}, {44, 67},
      {45, 21}, {47, 81}, {49, 192}, {52, 10}, {56, 67}, {56, 76},
      {56, 79}, {56, 81}, {56, 89}, {56, 90}, {56, 91}, {56, 92} }

I would like to order the list A by frequencies from first component.

I tried to figure out this order with this function:

 frec = Table[i[[1, 1]]  , {i, 
      Sort[Tally[A, #1[[1]] ==   #2[[1]]  & ], #1[[2]] > #2[[2]] & ]}]

{56, 1, 41, 17, 30, 5, 52, 49, 47, 45, 44, 43, 40, 26, 22, 15, 13, 12, 4}

It means that 56 appears as first component more than 1, and so on.

But I really don't know how to use this order to produce something like this:

A = { {56, 67}, {56, 76}, {56, 79}, {56, 81}, {56, 89}, {56, 90}, {56, 91}, {56, 92},
      { 1, 11}, { 1, 12}, { 1, 17}, { 1, 20}, { 1, 52},
      {41, 31}, {41, 54}, {41, 78}, {41, 82},
      {17, 16}, {17, 23}, {17, 24}, {17, 25}, ... }
Adrian
  • 411
  • 4
  • 14
  • d555, you should not be so quick to Accept an answer. Despite receiving some fine ones already, you cannot know what you may get if you wait a bit longer. I suggest waiting 24 hours to give everyone around the world a chance to reply, before the question appears concluded. – Mr.Wizard Sep 08 '13 at 20:56
  • Ok, Sorry. I consider your hint for the next time. But, I should to say that the accepted answer was so fast with large list opposite others scripts. So, I taked it. – Jonathan Prieto-Cubides Sep 08 '13 at 21:13
  • 1
    If speed is your goal I think you'll want to see my answer. – Mr.Wizard Sep 08 '13 at 21:18

6 Answers6

10

After reading the other answers I propose this:

Join @@ SortBy[GatherBy[a, #[[1]] &], {-Length@# &}]

It is directly based on belisarius's method but I believe it is an improvement.

  • By using {-Length@# &} we attain a stable sort(1)(2) which is also faster.

  • Join @@ is faster than Flatten on packed sub-lists and returns a packed array.

Timings:

a = RandomInteger[99999, {500000, 2}];

Flatten[SortBy[GatherBy[a, #[[1]] &], -Length@# &], 1] // Timing // First
Join @@ SortBy[GatherBy[a, #[[1]] &], {-Length@# &}]   // Timing // First

0.608

0.2838

More than 2X faster on this data.

Mr.Wizard
  • 271,378
  • 34
  • 587
  • 1,371
6
Flatten[SortBy[GatherBy[a, #[[1]] &], -Length@# &], 1]

{{56, 67}, {56, 76}, {56, 79}, {56, 81}, {56, 89}, {56, 90}, {56, 91}, {56, 92}, 
 {1, 11}, {1, 12}, {1, 17}, {1, 20}, {1, 52}, 
 {17, 16}, {17, 23}, {17, 24}, {17, 25}, 
 {41, 31}, {41, 54}, {41, 78}, {41, 82}, 
 {5, 10}, {5, 55}, 
 {30, 222}, {30, 391}, 
 {4, 179}, {12, 16}, {13, 274}, {15, 206}, {22, 82}, {26, 131}, 
 {40, 24}, {43, 42}, {44, 67}, {45, 21}, {47, 81}, {49, 192}, {52, 10}}
Dr. belisarius
  • 115,881
  • 13
  • 203
  • 453
  • @Kuba I partly agree, but. The use of First[], Last[], etc. has been criticized before on this site as some kind of "generality loss" (while I don't share that opinion, I try to adapt my code to the site preferences), and as for the Reverse[] instead of -Length[], I really prefer to keep my orig (one less op) – Dr. belisarius Sep 08 '13 at 19:28
  • You are right. And you are right, I think -Length is better because it preserves sorting of equal frequency types by the second element. Which in case of reverse is reversed :) – Kuba Sep 08 '13 at 19:33
6

Here is another options:

rule = Dispatch@Rule@@@Tally[a[[All, 1]]];
Reverse@SortBy[a, #[[1]] /. rule &]

rule = Dispatch@Rule@@@Tally[a[[All, 1]]];
a[[Ordering[a[[All, 1]] /. rule]]]

The second form is faster

Murta
  • 26,275
  • 6
  • 76
  • 166
5

Here is one way that uses the Tally of the first elements to create a function that maps an element to the number of first occurrences. That function is then used in SortBy, with a secondary function that look at the last elements:

lookupmethod[a_] := Module[{lookup},
  (lookup[#1] = -#2) & @@@ Tally[a[[All, 1]]];
  SortBy[a, {lookup@First@# &, Last}]
  ]
lookupmethod[a]//Short
(* {{56,67},{56,76},{56,79},{56,81},<<30>>,{4,179},{49,192},{15,206},{13,274}} *)

The minus sign is there to get the ordering desired.

ssch
  • 16,590
  • 2
  • 53
  • 88
  • It seems unique elements are not sorted by the second argument, but I do not know if they have to. – Kuba Sep 08 '13 at 19:27
  • @Kuba, me neither, in that case {lookup@First@#&,First,Last} :) – ssch Sep 08 '13 at 19:30
  • 1
    @Kuba & ssch, if you want pairs to be sorted also by the second element in the case of a tie, simply use SortBy[a, lookup@First@# &] -- the single function without {} will automatically use canonical sort for tiebreaking. – Mr.Wizard Sep 09 '13 at 07:38
  • @Mr.Wizard The question is not "how" but "do we really want to"? :) OP has not made this issue clear. – Kuba Sep 09 '13 at 11:49
  • @Kuba I'm just providing options; don't shoot the messenger. :-p – Mr.Wizard Sep 09 '13 at 11:55
3

An approach using Sow and Reap

Flatten[Thread /@ (SortBy[
    Reap[Sow @@@ 
       Reverse /@ A, _, {#1, #2} &][[2]], -Length[#[[2]]] &]), 1]
ubpdqn
  • 60,617
  • 3
  • 59
  • 148
2

One more way might be this

    Block[{w={}},Table[If[A[[s,1]]==frec[[i]],w={w,A[[s]]},0],
{i,1,Length[frec]},{s,1,Length[A]}];Partition[w//Flatten,2]]

{{56,67},{56,76},{56,79},{56,81},{56,89},{56,90},{56,91},{56,92},{1,11},{1,12},{1,17},{1,20},{1,52},{41,31},{41,54},{41,78},{41,82},{17,16},{17,23},{17,24},{17,25},{30,222},{30,391},{5,10},{5,55},{52,10},{49,192},{47,81},{45,21},{44,67},{43,42},{40,24},{26,131},{22,82},{15,206},{13,274},{12,16},{4,179}}

Pankaj Sejwal
  • 2,063
  • 14
  • 23