27

A simple array of integers is given. The problem is to detect if a pattern is repeatedly occurring in the array, and find the length of that pattern.

For example, for

{19, 6, 19, 6, 19, 6, 19, 6, 19, 6, 19, 6}

pattern {19, 6} should be detected and its length is 2.

For

{73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7}

pattern {73, 7, 4} should be detected and its length is 3. (at the end of the array there need not be the complete pattern, but the pattern should start at the beginning of the array)

For

{73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}

the whole array is the pattern and its length is 14.

Related links

MATLAB function seqperiod()

SO question on cycle detection

Related question on this site

Wikipedia article on cycle detection

VividD
  • 3,660
  • 4
  • 26
  • 42
  • This problem was solved in python pretty cleverly not too long ago. http://stackoverflow.com/questions/29481088/how-can-i-tell-if-a-string-repeats-itself-in-python

    I suspect that algorithm is going to be the cleanest and fastest solution in any high-level language that supports finding sequences within a list.

    – QuestionC Apr 17 '15 at 16:05
  • @QuestionC The answer to that question that compares all other answers is great. – VividD Apr 17 '15 at 16:12
  • It's an interesting problem so it drew a lot of attention. The accepted (s+s).find(s, 1, -1) answer is (IMO) pretty amazing, so I wanted to raise attention in case that's applicable in Mathematica. – QuestionC Apr 17 '15 at 16:19
  • 1
    For the case of missing or corrupted values, there is some discussion in this MathGroup thread – Daniel Lichtblau Apr 17 '15 at 19:12
  • 1
    I wonder if a clever solution based on Fourier is possible here? (I'm the guy who came up with the (s+s).find(s, 1, -1) solution, btw--flattered to see it's gotten so much attention!) – David Zhang Apr 17 '15 at 21:19
  • How large are the lists you plan to use this on - I think the answers so far can be bettered for large lists, but it that's not the case... – ciao Apr 18 '15 at 07:50
  • 5000 elements. @rasher – VividD Apr 18 '15 at 07:52
  • @VividD: Ah, then most s/b fine, though Chris's in particular seems quite snappy! Interesting question, +1 on you... – ciao Apr 18 '15 at 07:58
  • 1
    @QuestionC actually I don't think that algorithm works here, because in this question the cycle does not have to be complete at the end of the list. For example {1,2,3,1,2,3,1,2} should display a cycle length of 3, but the concatenate-and-search algorithm would indicate that the string is not periodic. – David Z Apr 19 '15 at 08:57

8 Answers8

16

This uses partitioning, with padding if required, to make sublists.

f = Module[{b, c = 1},
    While[Length[b = Union@Partition[#, c, c, {1, 1}, Take[#, c]]] > 1, c++];
    {Length@First@b, First@b}] &;

Example

f@{73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7}

{3, {73, 7, 4}}

Chris Degnen
  • 30,927
  • 2
  • 54
  • 108
12
ClearAll[len]

len[{p__, p__ .., e___}] /; MatchQ[{p}, {e, __}] := Length[{p}]
len[p_] := Length[p]

len /@ lists
(* {2, 3, 14} *)
C. E.
  • 70,533
  • 6
  • 140
  • 264
  • 1
    Are you sure you want to use .. instead of ...? If p has only one partial repeat, above construct doesn't quite get it. – kirma Apr 17 '15 at 16:48
  • @kirma Yes, I consciously made it that way because I didn't consider p to be a pattern if it never fully repeats itself. For example I thought it best that {1, 2, 7, 3, 5, 1} should be counted as one pattern of length six, instead of a pattern of length five. It depends on what the solution is used for I suppose. – C. E. Apr 17 '15 at 18:12
  • Very nice. +1 . – ciao Apr 18 '15 at 03:30
  • @Pickett for my application it is not so critical to make such distinction or not, but I would think your way – VividD Apr 18 '15 at 08:08
11

I won't bet my hand for this but seems to be ok:

ClearAll[return];
return[x : {0 ..., 1}, list_] := {#, list[[;; #]]} &[Length@x];
return[x_, y_] := {Length@y, y};

sqPeriod[list_] := return[FindLinearRecurrence[list], list]



sqPeriod /@ {
   {19, 6, 19, 6, 19, 6, 19, 6, 19, 6, 19, 6},
   {73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7},
   {73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}
   } // Column

enter image description here

Kuba
  • 136,707
  • 13
  • 279
  • 740
7

In versions 10+, there is FindTransientRepeat:

  • FindTransientRepeat[list, n] returns a pair of lists {transient,repeat} where the elements of repeat occur successively at least n times at the end of list.
  • FindTransientRepeat accepts an incomplete copy of the repeated sublist in the last position
  • The minimum number of repetitions refers to complete repetitions

For the examples in the posted question

lists = {{19, 6, 19, 6, 19, 6, 19, 6, 19, 6, 19, 6}, 
         {73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7},
         {73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}};

we get

FindTransientRepeat[#, 2]& /@ lists

{{{}, {19, 6}},
{{}, {73, 7, 4}},
{{73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17}, {7}}}

We can define a function that processes the output of FindTransientRepeat to get the results in desired form:

Clearall[repeatsF]
repeatsF = Module[{ftr = FindTransientRepeat[#, 2], lst = #},
 If[First @ ftr === {}, {Length @ Last @ ftr, Last @ ftr}, {Length @ lst, lst}]]&;

 repeatsF /@ lists

{{2, {19, 6}},
{3, {73, 7, 4}},
{14, {73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}}}

kglr
  • 394,356
  • 18
  • 477
  • 896
7

This answer only returns the period. If you want to extract the repeating substring, just use Take[list, period].

sequencePeriod = Compile[{{l, _Integer, 1}},
  With[{n = Length[l]},
   Catch[
    Do[
     If[
       Catch[
        Do[
         Do[
           If[l[[j]] != l[[k]], Throw[False]];,
           {k, i + j, n, i}
           ];,
         {j, i}
         ];
        Throw[True];
        ],
       Throw[i];
       ];,
     {i, Quotient[n, 2]}
     ];
    Throw[n];
    ]
   ]
  ]

For each trial period i, I go through the elements j of the trial list (1 through i) and make sure that each one is repeated in each subsequent copy of the list (k, offset from j by multiples of the period i).

This is the fastest solution so far:

enter image description here

I generated the lists for these tests with

Join @@ ConstantArray[RandomInteger[n, n], {n}]
  • Pickett's solution is the most elegant, but seems to have exponential complexity.
  • Kuba's has around quartic complexity due to the generality of FindLinearRecurrence.
  • Chris has cubic performance, since he splits and compares the entire list at each trial period.
  • My solution has quadratic performance, since we stop comparison as soon as we encounter a nonmatching element. It also has low memory usage, since I don't manipulate the array at all.
2012rcampion
  • 7,851
  • 25
  • 44
6

FindRepeat meets your need perfectly.

FindRepeat[{19, 6, 19, 6, 19, 6, 19, 6, 19, 6, 19, 6}]

{19, 6}

FindRepeat[{73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7}]

{73, 7, 4}

FindRepeat[{73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}]

{73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}

AsukaMinato
  • 9,758
  • 1
  • 14
  • 40
6

Using the undocumented "Periodic" padding as the third argument of PadRight:

ClearAll[fpF, fpF2]
fpF = Block[{i = 1}, While[i < Length@# && 
      PadRight[#[[;; i]], Length@#, "Periodic"] != #, i++]; i] &;
fpF2 = Block[{i = 1}, While[i < Length@# && 
      PadRight[#[[;; i]], Length@#, "Periodic"] != #, i++]; {i, #[[;; i]]}] &;

Examples: Using tc from @ubpdqn's answer:

tc = {{19, 6, 19, 6, 19, 6, 19, 6, 19, 6, 19, 6}, 
      {73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7, 4, 73, 7},
      {73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 7}, 
      {6, 3, 6, 3, 3, 6, 3, 6, 3, 3, 6, 3, 6, 3, 3, 6, 3, 6, 3, 3},
      {6, 1, 6, 2, 6, 3, 6, 1, 6, 2, 6, 3, 6, 1, 6, 2},
      {1, 1, 1}, 
      {1, 2, 1, 2, 1}};

fpF /@ tc
(* {2, 3, 14, 5, 6, 1, 2} *)

{#, ## & @@ fpF2@#} & /@ tc // Grid[#, Dividers -> All] & 
(* or {#,fpF @ #, #[[;;fpF @ #]]}&/@tc //Grid *)

enter image description here

If a complete periodic pattern were sought, we could search for periods less than or equal to half the Length of the input list:

ClearAll[fpFa, fpFb]
fpFa = Block[{i = 1, n = Length@#},  While[i < 1 + n/2 && 
        PadRight[#[[;; i]], n, "Periodic"] != #, i++]; i = If[i < 1 + n/2, i, n]] &;
fpFb = Block[{i = 1, n = Length@#}, While[i < 1 + n/2 && 
        PadRight[#[[;; i]], n, "Periodic"] != #, i++]; i = If[i < 1 + n/2, i, n]; 
        {i, #[[;; i]]}] &;
kglr
  • 394,356
  • 18
  • 477
  • 896
3

Late to party, and liking all answers but esp Chris Degnen:

per[u_] := Module[{j = 1, lg = Length@u},
  While[Total[
     Abs[Take[Join @@ ConstantArray[u[[;; j]], Ceiling[lg/j]], lg] - 
       u]] != 0, j++]; {j, u[[;; j]]}]

Some test cases:

tc={{19, 6, 19, 6, 19, 6, 19, 6, 19, 6, 19, 6}, {73, 7, 4, 73, 7, 4, 73, 
  7, 4, 73, 7, 4, 73, 7}, {73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 9, 17, 7, 
  7}, {6, 3, 6, 3, 3, 6, 3, 6, 3, 3, 6, 3, 6, 3, 3, 6, 3, 6, 3, 
  3}, {6, 1, 6, 2, 6, 3, 6, 1, 6, 2, 6, 3, 6, 1, 6, 2}, {1, 1, 1}, {1,
   2, 1, 2, 1}}

Testing: per/@tc

yields:

{{2, {19, 6}}, {3, {73, 7, 4}}, {14, {73, 7, 4, 7, 2, 6, 7, 2, 7, 73, 
   9, 17, 7, 7}}, {5, {6, 3, 6, 3, 3}}, {6, {6, 1, 6, 2, 6, 
   3}}, {1, {1}}, {2, {1, 2}}}
ubpdqn
  • 60,617
  • 3
  • 59
  • 148