Positions of elements in the initial flattened list in a split list

Question

Say we have a list:

m1 = {2, 2, 7, 0, 7, 7, 2, 2, 2}

It can be split easily:

Split @ m1
(* {{2, 2}, {7}, {0}, {7, 7}, {2, 2, 2}}  *)

Wanted: a method to get the following list:

{{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}}

It should be as simple as possible and fast for long lists.

Related: (3585), (23607), (69906) – Mr.Wizard Jan 24 '16 at 19:06

score 13 · Accepted Answer · answered Jan 23 '16 at 10:00

You can use

SplitBy[Range@Length@m1, m1[[#]] &]

Simon Woods

Immortal solution – garej Jan 23 '16 at 10:03

so clean and nice +1 :-) – ubpdqn Jan 23 '16 at 10:09

+1 Looks useful. Is there some way to illustrate how this works? – Chris Degnen Jan 23 '16 at 11:24

As discussed here, Split and SplitBy are very slow for long lists. I've tried it with m1 = Flatten[Table[#, # + 1] & /@ RandomInteger[{1, 100}, 10^5]]; and it is an order of magnitude slower than its counterparts. – garej Jan 23 '16 at 12:11

How it works: see the output of list = {}; SplitBy[Range@Length@m1, (AppendTo[list, #]; m1[[#]]) &]; list, plus the docs: SplitBy[list, f] splits list into sublists consisting of runs of successive elements that give the same value when f is applied. – Chris Degnen Jan 23 '16 at 14:33
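
To make the explanation in the comments concrete, here is a minimal sketch of what SplitBy sees on the small example. The names idx and keys are illustrative only and do not appear in the answer.

```mathematica
m1 = {2, 2, 7, 0, 7, 7, 2, 2, 2};

(* the positions that get split *)
idx = Range @ Length @ m1       (* {1, 2, 3, 4, 5, 6, 7, 8, 9} *)

(* the value SplitBy computes for each position: the element of m1 at that position *)
keys = m1[[#]] & /@ idx         (* {2, 2, 7, 0, 7, 7, 2, 2, 2} *)

(* consecutive positions with equal keys end up in the same sublist *)
SplitBy[idx, m1[[#]] &]         (* {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}} *)
```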

garej · Answer 2 · 2016-01-24T21:54:16.910

There is a complicated trade-off between speed and compactness in this case, so I have decided to post the version with Range, which I consider simple enough (comprehensible for new users) and the second fastest among its counterparts (at least on my machine).

It is heavily based on @Mr.Wizard's solution, farsightedly provided by Chris Degnen, so I do not claim originality:

dynS[p_] := Range @@@ Thread[{Accumulate@p - p + 1, Accumulate@p}]

Method

After looking at @Mr.Wizard's SparseArray solution, I finally realized that we may use the Listable attribute of Range to get an even more compact version (this time I prefer to keep the pure-function notation #). So this is my favorite method (though not mine :)!

dynSP[p_] := Range[# - p + 1, #]& @ Accumulate @ p
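
As a rough illustration of why both definitions work, the intermediate values for the small example from the question can be traced step by step. The names p, ends, and starts are illustrative only.

```mathematica
m1 = {2, 2, 7, 0, 7, 7, 2, 2, 2};

p = Length /@ Split @ m1        (* run lengths: {2, 1, 1, 2, 3} *)
ends = Accumulate @ p           (* last position of each run: {2, 3, 4, 6, 9} *)
starts = ends - p + 1           (* first position of each run: {1, 3, 4, 5, 7} *)

(* Range is Listable, so it threads over the two lists element by element *)
Range[starts, ends]             (* {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}} *)
```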

Timing benchmarking

I use a long list for benchmarking:

m1 = Flatten[Table[#, # + 1] & /@ RandomInteger[{1, 200}, 10^5]];
Length[m1]
(* 10156647 *)

The packed version is used afterwards (the second timing output for each method):

m1 = Developer`ToPackedArray[m1];
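
One way to confirm that packing took effect is Developer`PackedArrayQ. A minimal sketch on a small list follows; the names small and packed are illustrative and not part of the benchmark.

```mathematica
small = {2, 2, 7, 0, 7, 7, 2, 2, 2};

(* convert the integer list to a packed array *)
packed = Developer`ToPackedArray[small];

(* check the internal representation *)
Developer`PackedArrayQ[packed]   (* True *)
```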

Range[Prepend[# + 1, 1], Append[#, Length @ m1]] & 
@ SparseArray[Differences @ m1]["AdjacencyLists"] // Length // RepeatedTiming
(* {0.403, 99530} *)
(* {0.274, 99489} *)

dynSP[Length /@ Split @ m1] // Length // RepeatedTiming
(* {0.476, 99439} *)
(* {0.626, 99439} *)

dynS[Length /@ Split @ m1] // Length // RepeatedTiming
(* {0.506, 99495} *)
(* {0.715, 99489} *)

Internal`PartitionRagged[Range[Length@m1], Length /@ Split@m1] // Length // RepeatedTiming
(* {0.589, 99495} *)
(* {0.78, 99489}  *)

dynP[Range@Length@m1, Length /@ Split @ m1] // Length // RepeatedTiming
(* {0.613, 99495} *)
(* {0.83, 99489}  *)

Module[{i = 1}, Replace[Split@m1, _ :> i++, {-1}]] // Length // RepeatedTiming
(* {3.845, 99439} *)
(* {4.1, 99439}   *)

Module[{i = 0}, Map[++i &, Split[m1], {-1}]] // Length // RepeatedTiming
(* {6.57, 99495} *)
(* {6.85, 99489} *)

SplitBy[Range @ Length @ m1, m1[[#]] &] // Length // RepeatedTiming
(* {24.6, 99495} *)
(* {25., 99489}  *)

Note: the fastest function, the one with SparseArray, was added a bit later, so its result in terms of length is slightly different. The same holds for the Module with Split version and my favorite, dynSP.

score 8 · Answer 3 · answered Jan 23 '16 at 09:53

Perhaps:

s = Split@m1;
Internal`PartitionRagged[Range[Length@m1], Length /@ s]
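
Since functions in the Internal` context are undocumented, a documented alternative may be preferable where it is available. The sketch below assumes a version that has TakeList (11.2 or later, as far as I know) and is not part of the original answer.

```mathematica
m1 = {2, 2, 7, 0, 7, 7, 2, 2, 2};

(* take consecutive chunks of the position list, one chunk per run length *)
TakeList[Range @ Length @ m1, Length /@ Split @ m1]
(* {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}} *)
```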

ubpdqn

Oh, nice, I missed that possibility, thank you. Related – garej Jan 23 '16 at 10:01

score 7 · Answer 4 · answered Jan 02 '21 at 12:16

Internal`CopyListStructure is quite fast:

Internal`CopyListStructure[Split @ #, Range @ Length @ #] & @ m1

 {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}}

kglr

thank you, I did not know about this structure. Upvoted a couple of similar questions. – garej Jan 02 '21 at 13:25

Kuba · Answer 5 · 2016-01-23T19:53:09.170

6

Too late for the party so here's something old style:

Module[{i = 0}, Map[++i &, Split[m1], {-1}]]

or

SplitBy[MapIndexed[Flatten@*List, m1], First][[;; , ;; , 2]]

edited Jan 23 '16 at 19:53

answered Jan 23 '16 at 19:48

Kuba

it is never too late to make a good contribution. I like the old style... – garej Jan 23 '16 at 19:52

score 5 · Answer 6 · edited Apr 13 '17 at 12:55

Using Mr.Wizard's ragged partition function here

dynP[l_, p_] := MapThread[l[[# ;; #2]] &,
  {{0}~Join~Most@# + 1, #} &@Accumulate@p]

m1 = {2, 2, 7, 0, 7, 7, 2, 2, 2};
m2 = Split@m1;

dynP[Range@Length@m1, Length /@ m2]

{{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}}

answered Jan 23 '16 at 10:04

Chris Degnen

Thank you; ironically, I saw this solution a year ago but was unable to grasp it :)) – garej Jan 23 '16 at 10:21

march · Answer 7 · 2016-01-24T19:29:13.367

5

m1 = {2, 2, 7, 0, 7, 7, 2, 2, 2};
Module[{i = 1}, Replace[Split@m1, _ :> i++, {-1}]]
(* {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}} *)

edited Jan 24 '16 at 19:29

answered Jan 24 '16 at 18:26

march

@Mr.Wizard, I remember warnings concerning Block, so never use(d) it. – garej Jan 24 '16 at 19:06
@Mr.Wizard. That's interesting. I should go read the use cases for different scoping constructs post again. – march Jan 24 '16 at 19:06
march, using Block incorrectly seems to get a lot of people; as stated, I did it myself many times before more experienced users (Szabolcs or Leonid probably) pointed out my mistake. Even Wolfram developers do it! Using his code on numbering[{{a, b}, {m, n}, {x, y}}], for example, note that n has been incorrectly replaced by 0 in the output. – Mr.Wizard Jan 24 '16 at 19:12

score 4 · Answer 8 · edited Apr 13 '17 at 12:55

Adapted from my answer to a related question:

runs[a_List] := 
 Range[Prepend[# + 1, 1], Append[#, Length@a]] &@
  SparseArray[Differences@a]["AdjacencyLists"]

Now:

runs @ {2, 2, 7, 0, 7, 7, 2, 2, 2}

{{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}}
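
For readers who want to see how runs gets there, the intermediate values can be traced on the same input. This is a sketch that assumes a numeric list (so that Differences gives exact zeros inside a run); the names d, breaks, starts, and ends are illustrative.

```mathematica
a = {2, 2, 7, 0, 7, 7, 2, 2, 2};

(* a nonzero difference marks the last position of a run *)
d = Differences @ a                           (* {0, 5, -7, 7, 0, -5, 0, 0} *)

(* positions of the nonzero entries *)
breaks = SparseArray[d]["AdjacencyLists"]     (* {2, 3, 4, 6} *)

starts = Prepend[breaks + 1, 1]               (* {1, 3, 4, 5, 7} *)
ends = Append[breaks, Length @ a]             (* {2, 3, 4, 6, 9} *)

(* Range threads over both lists *)
Range[starts, ends]                           (* {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}} *)
```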

answered Jan 24 '16 at 19:30

Mr.Wizard

Oh, SparseArray!!! I was about to ask how to apply it here :))) – garej Jan 24 '16 at 19:32
@garej Glad I could be of help. :-) Please add this to your benchmark. – Mr.Wizard Jan 24 '16 at 19:34
@garej I see that a new syntax has been added for Table, after version 10.1.0 (which I use). Anyway (at least in 10.1) your m1 is not packed; would you please try your benchmark also with m1 = Developer`ToPackedArray[m1]; ? – Mr.Wizard Jan 24 '16 at 19:42

score 3 · Answer 9 · answered Jan 23 '16 at 21:44

I'll add another option, but be warned: It's rather slow.

FoldPairList[TakeDrop, Range@Length@m1, Length /@ (Split @ m1)]

V.E.

Yes, it hangs my machine with m1 = Flatten[Table[#, # + 1] & /@ RandomInteger[{1, 100}, 10^5]]; but it is nice to have it here. It is as slow as using the new SequencePosition ;) – garej Jan 23 '16 at 22:00

score 3 · Answer 10 · answered Jan 24 '16 at 18:22

SplitBy[Transpose[{m1, Range@Length@m1}], First][[;; , ;; , -1]]

or

m2 = Range@Length@m1;
i = 1; Split[m2, m1[[j = i++]] === m1[[j + 1]] &]

Basheer Algohi

thank you, {14., 99530} for the first method, but I cannot check the second one. – garej Jan 24 '16 at 18:41
you can check it like this Module[{m2 = Range@Length@m1,i=1},Split[m2, m1[[j = i++]] === m1[[j + 1]] &]] – Basheer Algohi Jan 24 '16 at 18:44
{24., 99496} on my machine, that is too long!!! ):- – Basheer Algohi Jan 24 '16 at 18:49

Pillsy · Answer 11 · 2021-01-02T20:40:14.743

3

Just wanted to belatedly add a new-style spin on Kuba's classic old-style answer, using the "Counter" DataStructure

With[{counter = CreateDataStructure["Counter", 1]},
 Map[counter["Increment"] &, Split[m1], {2}]]
(* {{1, 2}, {3}, {4}, {5, 6}, {7, 8, 9}} *)

edited Jan 02 '21 at 20:40

answered Jan 02 '21 at 19:23

Pillsy
