Partitioning with varying partition size

Question

How can I partition a list into partitions whose sizes vary? The length of the $k$'th partition is a function $f(k)$.

For example, if $l = \{1, 2, 3, 4, 5, 6\}$ and $f(k) = k$, then the partitioning $p$ would be $p = \{\{1\},\{2, 3\},\{4,5,6\}\}$.

As of Mathematica 11.2, the built-in TakeList will do this.

@Leonid that's not a discussion it's a solution. :-) (However, I believe I tested it before and it came up slower than dynP/dynamicPartition -- can you confirm for v8?) — Mr.Wizard, Jun 27 '12 at 09:53
@Mr.Wizard Yes, I can. But then, have a look at the listSplit function in my third post here :-) — Leonid Shifrin, Jun 27 '12 at 09:56
@Leonid I'm claiming "great minds think alike" for this one. :-) Since you're testing, how does Internal`PartitionRagged compare to dynP? — Mr.Wizard, Jun 27 '12 at 10:02
@Mr.Wizard dynP is about 40-50 % faster on my test: test = Flatten[Range /@ Range[5000]];, and then dynP[test, Range[5000]], and similarly for the internal function. — Leonid Shifrin, Jun 27 '12 at 10:08
@Mr.Wizard But, if we convert test to packed array with Developer`ToPackedArray, then the internal function is a little faster. I would generally mention in your answer that for packed arrays, your function creates a ragged list where however all sublists remain packed (because Part does not unpack). This allows for much faster execution and vastly more efficient storage as well, even though the resulting array is ragged. — Leonid Shifrin, Jun 27 '12 at 10:29

score 59 · Answer 1 · edited May 23 '17 at 12:35

The core solution

If I understand your question correctly, I previously wrote a function for this purpose.
The core of that function is:

dynP[l_, p_] := 
 MapThread[l[[# ;; #2]] &, {{0} ~Join~ Most@# + 1, #} & @ Accumulate @ p]

Version 8 users have Internal`PartitionRagged, which has the same syntax for the basic case.

dynP[Range@6, {1, 2, 3}]

{{1}, {2, 3}, {4, 5, 6}}

dynP[Range@8, {3, 1, 2, 1}]

{{1, 2, 3}, {4}, {5, 6}, {7}}
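
To see why this works, it can help to evaluate the pieces of dynP separately. The following is only an illustrative breakdown; the names ends and starts are introduced here and are not part of the original function:

p = {3, 1, 2, 1};

ends = Accumulate[p]   (* {3, 4, 6, 7}: position of the last element of each partition *)

starts = Join[{0}, Most[ends]] + 1   (* {1, 4, 5, 7}: position of the first element of each partition *)

MapThread[Range[8][[#1 ;; #2]] &, {starts, ends}]   (* {{1, 2, 3}, {4}, {5, 6}, {7}} *)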

Extended version

Since this answer proved popular I decided to do a full rewrite of dynamicPartition:

Shorter code with less duplication
Better performance and lower argument testing overhead
Partitioning of expressions with heads other than List

dynamicPartition[list, runs] splits list into lengths runs.

dynamicPartition[list, runs, All] appends all remaining elements in a single partition.

dynamicPartition[list, runs, spec₁, spec₂, ...] passes specifications spec_n to Partition for the remaining elements.

dPcore[L_, p : {q___, _}] := Inner[L[[# ;; #2]] &, {0, q} + 1, p, Head@L]

dPcore[L_, p_, All] := dPcore[L, p] ~Append~ Drop[L, Last@p]

dPcore[L_, p_, n__] := dPcore[L, p] ~Join~ Partition[L ~Drop~ Last@p, n]

dynamicPartition[L_, p : {__Integer}, x___] :=
  dPcore[L, Accumulate@p, x] /; ! Negative@Min@p && Length@L >= Tr@p

(This code no longer uses dynP shown above.)

Usage Examples:

dynamicPartition[Range@12, {4, 3}, All]

{{1, 2, 3, 4}, {5, 6, 7}, {8, 9, 10, 11, 12}}

dynamicPartition[Range@12, {4, 3}, 2]

{{1, 2, 3, 4}, {5, 6, 7}, {8, 9}, {10, 11}}

dynamicPartition[h[1, 2, 3, 4, 5, 6, 7], {3, 1}, 2, 1, 1, "x"]

h[h[1, 2, 3], h[4], h[5, 6], h[6, 7], h[7, "x"]]
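
The condition attached to dynamicPartition also acts as the argument check: when it fails, no definition applies and the call comes back unevaluated. A few calls that should illustrate this, assuming the definitions above are loaded:

dynamicPartition[Range@5, {4, 3}]   (* runs sum to 7 > 5, so the call is returned unevaluated *)

dynamicPartition[Range@5, {2, -1}]   (* a negative run also leaves the call unevaluated *)

dynamicPartition[Range@5, {2, 0, 1}]   (* zero-length runs are allowed: {{1, 2}, {}, {3}} *)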

Packed arrays

Please note that one special but practically important case is when the list you want to split is a packed array, or can be converted into one. Here is an illustration. First, we create a large (and apparently unpacked) test list:

(test = Flatten[Range/@Range[5000]])//Developer`PackedArrayQ

(*  False  *)

We now split it:

(res = dynP[test,Range[5000]]);//AbsoluteTiming

(* {0.2939453,Null} *)

We can see that the sublists are, of course, unpacked as well:

Developer`PackedArrayQ/@res//Short

(*  
      {False,False,False,False,False,False,False,False,
      <<4984>>,False,False,False,False,False,False,False,False}
*)

Converting to a packed array admittedly takes some time:

test1 = Developer`ToPackedArray[test]; // AbsoluteTiming

(* {0.1660157, Null} *)

But if you do some manipulations with this list many times, this will pay off. Also, often you end up with a packed list from the start. Anyway, now splitting this list is several times faster:

(res1 = dynP[test1,Range[5000]]);//AbsoluteTiming

(*  {0.0644531,Null}  *)

and all the sublists are now also packed:

Developer`PackedArrayQ/@res1//Short

(*
   {True,True,True,True,True,True,True,True,True,
    <<4982>>,True,True,True,True,True,True,True,True,True}
*)

which has a large impact on the total memory consumption as well:

ByteCount/@{res,res1}

(*    {400320040,50900040}    *)

The technique of converting the sublists of a ragged list to packed form has already been discussed a few times here on SE, e.g. here. In this particular case, dynP will do that automatically when the initial list is packed, but it is still good to keep in mind, for example to avoid accidental unpacking of sublists during whatever further processing you want to perform on the resulting ragged list.
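
The linked discussion is not reproduced here, but the basic idea can be sketched as follows, assuming every sublist holds machine-sized numbers of a single type (otherwise ToPackedArray leaves that sublist unpacked). The name repacked is introduced only for this illustration:

repacked = Developer`ToPackedArray /@ res;

And @@ (Developer`PackedArrayQ /@ repacked)   (* True *)

Only the sublists become packed; the outer ragged list itself cannot be packed because its rows have different lengths.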

+1 very cool! Perhaps for consistency you could implement dynamicPartition[Range@12, {4, 3}, None] and/or dynamicPartition[Range@12, {4, 3}, 0] as well? — Ajasja, Jun 27 '12 at 09:44
@Ajasja those would be reasonable extensions, but I think I'll leave it as it is since I want to focus on the core method dynP. I thought it was better to copy the full function here rather than merely reference it on a separate site. — Mr.Wizard, Jun 27 '12 at 09:51
I agree, it's better to have the full answer here. For anybody else reading (and for my future ref): adding 0 and None is just a matter of adding dynamicPartition[l_List, p : {_Integer?NonNegative ..}, None | 0] := dynamicPartition[l, p] — Ajasja, Jun 27 '12 at 10:01
@Ajasja if you go down that road you could use None | 0 | PatternSequence[] and do it with one definition. — Mr.Wizard, Jun 27 '12 at 10:04
Thanks. ("PatternSequence[] represents a sequence of zero length.") — Ajasja, Jun 27 '12 at 10:08
@Ajasja also, I didn't want to clutter the answer further, but since additional arguments are passed to Partition more complicated specifications are possible, e.g. dynamicPartition[Range@20, {4, 3}, 2, 3, 1, "x"] — Mr.Wizard, Jun 27 '12 at 10:09
I'll just note that Wizard's dynP[] is effectively equivalent to the prize-winning solution in the 1992 Mathematica Programming Competition in Rotterdam, with a few modifications. The actual submission used Inner[] instead of MapThread[], but again, the algorithm is identical. — J. M.'s missing motivation, Jun 27 '12 at 16:02
@J.M. I'm out of time for today but it looks like Inner is faster than MapThread. I guess this function is due for a rewrite! — Mr.Wizard, Jun 27 '12 at 20:36
Sure, it used Fold[] and Take[] instead of Accumulate[] and Span[] (along with Inner[] as I mentioned earlier), but as I said, essentially the same method. Should be somewhere in The Mathematica Journal archives... look for the solution by Kris Thielemans. — J. M.'s missing motivation, Jun 28 '12 at 14:24
@J.M. yes, truly nothing new under the sun. Here is a reference to the competition that includes the code, and an alternative. At least I learned a fair bit by working the problem out myself, even if I did it over ten years after the competition. :-) — Mr.Wizard, Jun 28 '12 at 20:59
@Mr.Wizard: Forgot to +1 this beautifully thought-out and executed solution. Done. — ciao, Apr 17 '14 at 22:05

Mr.Wizard · Accepted Answer · 2013-05-14T01:23:45.040

Update: see section three for a significant optimization.

Reading your question again today I realize that I did not understand it completely the first time. Since my existing answer is already quite long I am posting an additional answer.

This method is not as fast as dynamicPartition but it finally does what you asked.

partitionBy[L_List, func_] := Reap[partitionBy[L, func, 1, 0]][[2, 1]]

partitionBy[L_List, func_, i_, pos_] :=
  With[{x = pos + func[i]},
    partitionBy[Sow @ L[[pos + 1 ;; x]]; L, func, i + 1, x] /; x <= Length@L
  ]

Examples:

partitionBy[Range@10, # &]

{{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}}

partitionBy[Range@10, 2 &]

{{1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}}

partitionBy[Range@12, Mod[#, 3, 1] &]

{{1}, {2, 3}, {4, 5, 6}, {7}, {8, 9}, {10, 11, 12}}

On long lists you may need to increase $IterationLimit.

While I enjoyed writing the functional code above, it seems a procedural approach is faster:

partitionBy2[L_List, func_] :=
 Reap[Block[{i = 1, p = 0, x, n = Length@L},
   While[
     (x = p + func[i++]) <= n,
     Sow @ L[[p + 1 ;; (p = x)]];
   ]
 ]][[2, 1]]

Compiled function

For considerably greater speed with compilable length-functions the following may be used:

partitionBy3[L_List, func_] := 
 Inner[L[[# ;; #2]] &, ##, List] & @@ 
  Compile[{{n, _Integer}}, 
    Module[{i = 1},
     {#[[;; -3]] + 1, #[[2 ;; -2]]} & @
       NestWhileList[# + func[i++] &, 0, # <= n &]
    ]
  ] @ Length @ L

Example:

partitionBy2[Range@1*^7, Mod[#, 17, 1] &] // Timing // First

partitionBy3[Range@1*^7, Mod[#, 17, 1] &] // Timing // First

3.76

1.014
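
What the compiled part computes can be checked without Compile. The following uncompiled sketch uses the length function # & and a list length of 10; the names lenFunc and sums are introduced here for illustration only:

lenFunc = # &;

sums = Module[{i = 1}, NestWhileList[# + lenFunc[i++] &, 0, # <= 10 &]]   (* {0, 1, 3, 6, 10, 15} *)

sums[[;; -3]] + 1   (* {1, 2, 4, 7}: start positions *)

sums[[2 ;; -2]]   (* {1, 3, 6, 10}: end positions *)

NestWhileList keeps the first running total that fails the test (15 here), which is why the last element is dropped from both lists before Inner takes the spans.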

You may be interested to see the definition of Internal`PartitionRagged here in chat — Jacob Akkerboom, Jan 07 '14 at 14:35
@Xavier Certainly worth mentioning I think. You may wish to post that as a separate answer. I suggest you note in it however that it follows my initial and incorrect interpretation of the question, rather than what the OP actually wanted. (The latter answered by partitionBy.) — Mr.Wizard, Dec 04 '15 at 01:04

score 25 · Answer 3 · answered Oct 04 '17 at 01:05

New in 11.2 is TakeList:

TakeList[Range[10], {2, 3, 5}]

{{1, 2}, {3, 4, 5}, {6, 7, 8, 9, 10}}
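
TakeList expects an explicit list of lengths, whereas the question specifies the lengths through a function $f(k)$. One possible way to bridge the two is sketched below. The helper takeListBy is hypothetical (it is not a built-in), and it assumes that f returns positive integers so that the loop terminates:

takeListBy[list_, f_] :=
 Module[{lens = {}, total = 0, k = 1},
  (* collect lengths f[1], f[2], ... while the next partition still fits *)
  While[total + f[k] <= Length[list],
   total += f[k]; AppendTo[lens, f[k]]; k++];
  TakeList[list, lens]]

takeListBy[Range[10], # &]   (* {{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}} *)

Elements that do not fill a complete final partition are dropped, which matches the behaviour of partitionBy in the earlier answer.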

Carl Woll

score 15 · Answer 4 · edited Aug 14 '16 at 14:34

This can be implemented elegantly with FoldPairList and TakeDrop (both new in v10.2); in fact, it's one of the examples in the documentation:

FoldPairList[TakeDrop, Range[10], {2, 3, 5}]

{{1, 2}, {3, 4, 5}, {6, 7, 8, 9, 10}}

FoldPairList[TakeDrop, Range[20], Range[5]]

{{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}, {11, 12, 13, 14, 15}}

edited Aug 14 '16 at 14:34

Karsten7

answered Aug 09 '16 at 15:24

masterxilo

score 13 · Answer 5 · answered May 14 '13 at 13:03

This won't win any prizes for performance, but perhaps if there was a prize for using the second argument of Split in ways that were never intended...

partitionBy[list_, func_] :=
 Module[{f, i = func[1], k = 1},
  _f := i-- > 1 || (i = func[++k]);
  Split[list, f]]

partitionBy[Range@12, Mod[#, 3, 1] &]

{{1}, {2, 3}, {4, 5, 6}, {7}, {8, 9}, {10, 11, 12}}
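
Two features of this code may be unfamiliar: the definition _f := ..., which applies to f called with any arguments, and the use of Or to return a value other than True. Both can be tried in isolation; the symbol g below is introduced only for this illustration:

ClearAll[g];

_g := 7;

{g[], g[1], g[1, 2]}   (* {7, 7, 7} *)

True || 3   (* True *)

False || 3   (* 3 *)

Split keeps two neighbouring elements in the same run only when the test returns True, so the integer returned by the second branch of Or ends the current run.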

Simon Woods

That is an odd construction: _f := ... – rcollyer May 14 '13 at 13:09
@rcollyer It may seem odd only because we are used to thinking of function definitions as if they were similar to those in other languages. Once we recall that they are rules, this is no more odd than the usual ones. I use this construct (_f:=...) from time to time too. – Leonid Shifrin May 14 '13 at 13:24
@LeonidShifrin I see how it works, but it is still a bit brain warping, though. Not that that's a bad thing, just odd to look at. – rcollyer May 14 '13 at 13:27
@rcollyer Yes, I agree. I first saw this, IIRC, in Roman Maeder's book a long time ago, and I had the same feeling back then. – Leonid Shifrin May 14 '13 at 13:29
@rcollyer, I don't often use it, but here it seemed a natural way to express that the arguments of f are completely irrelevant to its purpose. – Simon Woods May 14 '13 at 13:52
Effective. Plus, I like the "abuse" of Or, a form I use frequently when I want to either return True or another value. +1 – rcollyer May 14 '13 at 14:03
In my opinion, in this case writing _f is just a confusing way of writing f[_]. These expressions do not have the same FullForm, but they are the same for pattern matching, as far as I know. When writing f[_] it is also more clear that the definition will be added to the DownValues of f. SetDelayed wants as its first argument an expression with as its head the symbol you want to add the definition to. It is kind enough to treat Blank[f] as f[Blank[]]. But something like x_ /; Head[x] == f := 9 does not work, which is ambiguous anyway. Nor does _:=3 work, no good head here. – Jacob Akkerboom Jul 11 '13 at 10:03
Trace[_g := 7, TraceInternal -> True, TraceOriginal -> True] does not reveal much, but the TraceInternal-> True adds an additional {g}. Also fun: Block[{HoldForm}, SetAttributes[HoldForm, HoldAll]; HoldForm[x_ /; Head[x] =!= FullForm] := HoldForm[FullForm[x]]; Trace[_g := 7, TraceInternal -> True, TraceOriginal -> True] ] – Jacob Akkerboom Jul 11 '13 at 10:42
@JacobAkkerboom, interesting point! There must be special handling so that SetDelayed knows you are trying to set a downvalue for f rather than for Blank. I wonder if it is even possible to set a downvalue for Blank (in the unlikely event that you would ever want to...) – Simon Woods Jul 11 '13 at 10:55
@SimonWoods I think the only thing standing in your way there is Blank being protected :). You can always restart your kernel ;). – Jacob Akkerboom Jul 11 '13 at 13:10
I missed this answer until now. This is some amazing code. I'm still not making use of Or for flow control the way that you do, but I like it. – Mr.Wizard Aug 26 '13 at 08:47
@Mr.Wizard, I like using it, but I can see why people might prefer using more explicit flow control structures. Unrelated, looking at this answer, I can't understand why I used f rather than just putting it as a pure function in Split – Simon Woods Aug 26 '13 at 10:08
Like I said, I'd like to use it but somehow I never do; I'll have to make a point of trying to use it I guess. Regarding f, perhaps it was just the style you felt like using that day, but a quick test suggests that it is faster than the pure function (which I find a bit surprising) and also slightly faster than the form f[__] which some people wanted you to use, which I find less surprising as I remember seeing that behavior before. – Mr.Wizard Aug 26 '13 at 15:31

score 10 · Answer 6 · answered Jun 27 '12 at 09:45

This is a bit different from Mr.Wizard's excellent solution: it calculates the number of successively longer partitions (thus no irregular partitioning argument can be given) using the summation formula, and then does the same Accumulate & extract inside a MapThread.

myPartition[list_] := Module[
   {num = Ceiling[n /. First@Solve[{ n (1 + n)/2 == Length@list, n > 0}]]},
   MapThread[
    Take[list, {#2, Min[Length@list, #2 + #1 - 1]}] &,
    {Range@num, Most@FoldList[Plus, 1, Range@num]}]
   ];

myPartition@Range@20

{{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}, {11, 12, 13, 14, 15}, {16, 17, 18, 19, 20}}

score 7 · Answer 7 · edited Apr 13 '17 at 12:56

Today I posted a question that turned out to be a duplicate of this one (thanks to Pinguin Dirk).

My attempts are not very sophisticated, but one may find them useful:

Let f[k] be a list of lengths:

l = Range[10];
p = {2, 3, 5};

1

Take[l, {1, 0} + #] & /@ (Partition[Prepend[Accumulate@p, 0], 2, 1])

{{1, 2}, {3, 4, 5}, {6, 7, 8, 9, 10}}

2

FoldList[{Take[#1[[ 2]], #2], Drop[#1[[ 2]], #2]} &, {1, l}, p
        ][[ ;; , 1]] // Rest

{{1, 2}, {3, 4, 5}, {6, 7, 8, 9, 10}}

edited Apr 13 '17 at 12:56

Community

answered Jul 11 '13 at 08:09

Kuba

Solution 1 nicely explains "the core solution" of Mr.Wizard's older answer. – Jacob Akkerboom Jul 11 '13 at 10:52

score 2 · Answer 8 · answered Feb 04 '21 at 09:01

A fairly pedestrian take, but I haven't seen it posted:

f[l_List] := Table[l[[(k (k - 1))/2 + 1 ;; (k (k + 1))/2]],
{k, 1, Floor[1/2 (-1 + Sqrt[1 + 8 Length@l])]}]
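
The upper limit of the Table appears to come from requiring that the first $k$ partitions fit into a list of length $n$, that is $k(k+1)/2 \le n$, which gives $k \le \frac{1}{2}\left(-1 + \sqrt{1 + 8n}\right)$. A short check, assuming the definition of f above:

f[Range[10]]   (* {{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}} *)

f[Range[12]]   (* {{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}} *)

In the second case the two leftover elements, 11 and 12, do not fill a fifth partition and are dropped.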

Whelp
