3

I have a list where a few distinct numbers occur, e.g.

{41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41}

I want to extract how many times in a row each number occurs. For the above list, the desired result is

{{3,7},{1}}

I actually already have something that works, but it seems excessively complicated and amateurish. Any ideas to improve this?

(Take[#, All, {2}] // Flatten) & /@
 Sort[
  GatherBy[{#[[1]], Length[#]} & /@ Union@Split@INPUTLIST,
   First],
  First@First[#1] < First@First[#2] &]
A l'Maeaux
  • So you don't have to know which result corresponds to which value? – Kuba Feb 11 '14 at 07:15
  • They're supposed to be in increasing order, so here it's runs of 41 first, then runs of 1009. I suppose if they were more general than integers, you might want to include that information. – A l'Maeaux Feb 11 '14 at 07:58

6 Answers

8

Same idea, a bit cleaner:

DeleteDuplicates /@ MapAt[Length, GatherBy[Split[yourListHere], First], {All, All}]
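To see what each stage of that one-liner contributes, here is a minimal step-by-step sketch on the list from the question. The names runs, byValue and lengths are only illustrative, and Map at level {2} is used here in place of MapAt with {All, All}; both should give the same result for this input.

```mathematica
list = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

(* Step 1: Split breaks the list into runs of identical adjacent elements. *)
runs = Split[list]
(* {{41, 41, 41}, {1009}, {41, 41, 41, 41, 41, 41, 41}, {1009}, {41, 41, 41}} *)

(* Step 2: GatherBy collects the runs that share the same first element. *)
byValue = GatherBy[runs, First]
(* {{{41, 41, 41}, {41, 41, 41, 41, 41, 41, 41}, {41, 41, 41}}, {{1009}, {1009}}} *)

(* Step 3: replace every run (level 2) by its length. *)
lengths = Map[Length, byValue, {2}]
(* {{3, 7, 3}, {1, 1}} *)

(* Step 4: keep each distinct run length once per value. *)
DeleteDuplicates /@ lengths
(* {{3, 7}, {1}} *)
```

GatherBy keeps the groups in order of first appearance, so 41 comes first here only because it occurs first in the list. If increasing order of the values is required, as in the question, the groups would need an extra sort.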

And a generic function, with option to annotate with element identities:

runsList[list_, names_: False] := 
  Module[{gb = GatherBy[Split[list], First], runs},
   runs = DeleteDuplicates /@ MapAt[Length, gb, {All, All}];
   If[names, Transpose[{Flatten[Union @@@ gb], runs}],runs]];

test = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41, 41};

runsList[test]
runsList[test, True]

(*

{{3, 7, 4}, {1}}

{{41, {3, 7, 4}}, {1009, {1}}}

*)

And a shorter version if you always want annotations:

{#[[1, 1]], Length /@ #} & /@ GatherBy[(Tally@Split@#)[[All, 1]], First] &[yourListHere]
ciao
5
list = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

GroupBy

GroupBy[Split @ list, First -> Length, DeleteDuplicates]

<|41 -> {3, 7}, 1009 -> {1}|>

Values @ %

{{3, 7}, {1}}

SequenceCases

SequenceCases[list, x : {Repeated[a_]} :> Length[x]]

{3, 1, 7, 1, 3}
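The SequenceCases result above lists the run lengths in order of occurrence but does not say which value each run belongs to. One possible way to get back to the grouped form, sketched here under the assumption that the pattern may emit value -> length rules instead of bare lengths (the name rules is illustrative):

```mathematica
list = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

(* Emit a rule value -> run length for every run. *)
rules = SequenceCases[list, x : {Repeated[a_]} :> (a -> Length[x])]
(* {41 -> 3, 1009 -> 1, 41 -> 7, 1009 -> 1, 41 -> 3} *)

(* Group the rules by their left-hand side, keep the right-hand sides,
   and drop repeated run lengths. *)
GroupBy[rules, First -> Last, DeleteDuplicates]
(* <|41 -> {3, 7}, 1009 -> {1}|> *)
```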

kglr
4

Using PositionIndex, SequenceCases and ConsecutiveQ:

list = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

Length /@ SequenceCases[#, {__}?ConsecutiveQ] & /@ PositionIndex[list]

(<|41 -> {3, 7, 3}, 1009 -> {1, 1}|>)

E. Chan-López
3
list = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

Using PositionIndex (new in 10.0)

With[{p = PositionIndex @ list},
 Thread[Keys[p] -> (Length /@ Split[#, #2 - #1 == 1 &] & /@ Values[p])]]

{41 -> {3, 7, 3}, 1009 -> {1, 1}}
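The position-based approach may be easier to follow with the intermediate values shown. The sketch below uses the same list; the names p and blocks are illustrative, and the final DeleteDuplicates step is an addition for readers who want the de-duplicated form asked for in the question.

```mathematica
list = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

(* PositionIndex maps each distinct value to the positions where it occurs. *)
p = PositionIndex[list]
(* <|41 -> {1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15}, 1009 -> {4, 12}|> *)

(* A run in the original list is a block of consecutive positions,
   so split wherever neighbouring positions differ by more than 1. *)
blocks = Split[p[41], #2 - #1 == 1 &]
(* {{1, 2, 3}, {5, 6, 7, 8, 9, 10, 11}, {13, 14, 15}} *)

(* Lengths of the blocks are the run lengths; remove repeats per value. *)
DeleteDuplicates[Length /@ Split[#, #2 - #1 == 1 &]] & /@ p
(* <|41 -> {3, 7}, 1009 -> {1}|> *)
```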

eldo
3

I have supported rasher's answer. I just post this as an example using Reap and Sow.

fun[u_, n_: Identity] := 
 Last@Reap[Sow[Length[#], First@#] & /@ Split[u], _, #1 -> n@#2 &]

Using:

data = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 
   41};

If you want to collect all the runs:

fun[data]

gives:

{41 -> {3, 7, 3}, 1009 -> {1, 1}}

If you just want the unique run lengths:

fun[data,DeleteDuplicates]

gives:

{41 -> {3, 7}, 1009 -> {1}}

If you want to do something with the runs, e.g. Mean:

fun[data, Mean]

yields:

{41 -> 13/3, 1009 -> 1}

It is slower than any of rasher's versions, e.g.

{#[[1, 1]], Length /@ #} & /@ 
     GatherBy[(Tally@Split@#)[[All, 1]], First] &[
   RandomInteger[{1, 100}, 10000000]] // AbsoluteTiming // First

yields 14.403590 seconds.

versus:

fun[RandomInteger[{1, 100}, 10000000]] // AbsoluteTiming // First

16.118732 seconds.
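For readers unfamiliar with Reap and Sow, the following sketch unpacks what fun does on the small example, using the same data as in this answer. Each run's length is sown with the run's value as its tag, and Reap collects the sown lengths per tag.

```mathematica
data = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 41, 41};

(* Sow each run length, tagged with the value that makes up the run.
   Reap returns {value of the expression, sown items grouped by tag}. *)
Reap[Sow[Length[#], First[#]] & /@ Split[data]]
(* {{3, 1, 7, 1, 3}, {{3, 7, 3}, {1, 1}}} *)

(* A third argument f is applied as f[tag, {items}] for each tag;
   with Rule this produces tag -> items. *)
Last@Reap[Sow[Length[#], First[#]] & /@ Split[data], _, Rule]
(* {41 -> {3, 7, 3}, 1009 -> {1, 1}} *)
```

In fun, the third argument is #1 -> n@#2 &, so the optional function n is applied to each tag's list of run lengths before the rule is built.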

ubpdqn
1

Use SequenceCases

ClearAll["Global`*"];
INPUTLIST = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 
   41, 41};
SequenceCases[INPUTLIST, x : {Repeated[a_]} :> {a, Length[x]}]
{{41, 3}, {1009, 1}, {41, 7}, {1009, 1}, {41, 3}}

Use Split

ClearAll["Global`*"];
INPUTLIST = {41, 41, 41, 1009, 41, 41, 41, 41, 41, 41, 41, 1009, 41, 
   41, 41};

grouped = Split[INPUTLIST]; output = Map[{First[#], Length[#]} &, grouped]

{{41, 3}, {1009, 1}, {41, 7}, {1009, 1}, {41, 3}}
138 Aspen