How to locate a list within a list?

Question

For example, f[{1,2,3,4},{2,3}] = True and f[{1,2,3,4},{2,3,4}] = True, but f[{1,2,3,4},{2,4}] = False. Which function should I use? I don't want to reinvent the wheel. Thanks in advance.

What is the result of f[{1,2,3,4},{3,2}]? – halirutan May 29 '18 at 09:39

halirutan · Answer 1 · 2018-05-29T11:24:44.290

It really depends on how fast you need it and how many elements your list will have. Let's assume the worst:

list = Range[1000000];
contain = {1, 2, 3, 4, 5, 6};

Checking fun from Lotus

fun[list, contain] // RepeatedTiming
(* {0.27, True} *)

and f from kglr

f[list, contain] // RepeatedTiming
(* {0.043, True} *)

Here is a version that is again an order of magnitude faster, although it looks awful:

h[list_, sub_] := With[{l = Length[sub]},
  Catch[Developer`PartitionMap[If[# === sub, Throw[True]] &, list, l, 1]] === True
  ]

h[list, contain] // RepeatedTiming
(* {0.00271, True} *)

However

As you can see, my example was designed to test the influence of a large input list combined with a contain that matches early. I should note that when contain is located at the far end of list, @kglr's version works best, and it seems to have a constant running time.
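A rough way to see this trade-off is to time both approaches with a match at the front and a match at the end. This is only a sketch: big, early and late are names made up for the illustration, f1 is assumed to be kglr's SequencePosition variant, and h is repeated from above so the snippet stands alone. No timings are quoted because they depend on the machine and version.

```mathematica
big = Range[1000000];
early = Range[1, 6];               (* match at the very front of big *)
late = Range[999995, 1000000];     (* match at the very end of big *)

(* assumed to be the SequencePosition-based test from kglr's answer *)
f1 = SequencePosition[##] != {} &;

(* same definition as h above, with the first argument renamed to lst *)
h[lst_, sub_] := With[{l = Length[sub]},
  Catch[Developer`PartitionMap[If[# === sub, Throw[True]] &, lst, l, 1]] === True]

(* rows: f1, h; columns: early, late; entries are timings in seconds *)
Table[First@RepeatedTiming[g[big, s]], {g, {f1, h}}, {s, {early, late}}]
```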

Final notes

I tried to find a simple algorithm that matches the speed of SequencePosition and is faster in most cases. A satisfactory solution seems to be to walk linearly through the list and test whether the current element equals the first element of the sublist. Only if it does do we compare the full sublist.
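As an uncompiled illustration of that idea, the scan could look roughly like the sketch below. scanFirst is a hypothetical name, and Catch/Throw is used here for the early exit instead of Return; the pattern {first_, ___} restricts it to non-empty sublists.

```mathematica
(* scanFirst is a hypothetical name for this sketch, not a built-in *)
scanFirst[list_List, sub : {first_, ___}] :=
 With[{l = Length[sub]},
  Catch[
   Do[
    (* cheap test on a single element first; compare the full window only on a hit *)
    If[list[[i]] === first && list[[i ;; i + l - 1]] === sub, Throw[True]],
    {i, 1, Length[list] - l + 1}];
   False]]

scanFirst[{1, 2, 3, 4}, {2, 3}]   (* True *)
scanFirst[{1, 2, 3, 4}, {2, 4}]   (* False *)
```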

In high-level Mathematica, this still cannot compete, but once we compile it down to C it seems to be a fast solution. A clear disadvantage of this solution is that it only works with typed lists (here, integers).

uglyC = Compile[{{list, _Integer, 1}, {sub, _Integer, 1}},
  With[{l = Length[sub], first = sub[[1]]},
   Do[
     If[list[[i]] === first && 
       list[[i ;; i + Length[sub] - 1]] === sub,
      Return[True]
      ], {i, 1, Length[list] - Length[sub] + 1}
     ] === True
   ], CompilationTarget -> "C", RuntimeOptions -> "Speed"
  ]

Let's do some test-cases with a random large list

list = RandomInteger[100, 1000000];

First we try a long sublist that is not contained in list. I need to use AbsoluteTiming for the compiled code to get meaningful results:

f[list, Range[50, 150]] // RepeatedTiming
(* {0.0057, False} *)

Median@Table[First@AbsoluteTiming[uglyC[list, Range[50, 150]]], {50}]
(* 0.004662 *)

Sublists that match, but where the match is at the very end of list, will have an equivalent runtime. However, the closer the match is to the front of list, the faster the Do loop will be:

f[list, list[[100 ;; 110]]] // RepeatedTiming
(* {0.0057, True} *)

Median@Table[First@AbsoluteTiming[uglyC[list, list[[100 ;; 110]]]], {50}]
(* 5.*10^-6 *)

I have not tested all scenarios, and I believe the ugly compiled code should not be used unless the task is absolutely time-critical. A simple SequencePosition should be preferred.

On my machine, kglr's f2 seems to be significantly slower than f1. I think it's also important to examine a negative case, such as {889, 893, 894, 895}. h's construction seems like it would have quite a bit of difficulty getting False results quickly. — eyorble, May 29 '18 at 10:18
@eyorble Yes, that was what I tried to express in my last section. All mapping methods check each sublist and the worst case is when the sublist is not in the original list. For this case, kglr's approach still works fast while all others drop like hell. A fast solution could theoretically be done in a Do loop, but I don't get this fast enough (not without compiling at least). — halirutan, May 29 '18 at 10:40

kglr · Answer 2 · 2018-05-29T11:08:48.943

7

f1 = SequencePosition[##] != {}&;
f1b = Length[SequencePosition[##]] > 0&; (* thanks: @Henrik Schumacher *)

f1[{1, 2, 3, 4}, #] & /@ {{2, 3}, {2, 3, 4}, {2, 4},{2, 3, 1}}

{True, True, False, False}
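Since the question asks about locating the sublist, it may help to look at what SequencePosition returns before it is compared with {}. The inputs below are made up for illustration; the three-argument form limits the result to the first occurrence.

```mathematica
(* each result is a {start, end} index pair for one occurrence of the sublist *)
SequencePosition[{1, 2, 3, 4}, {2, 3}]       (* {{2, 3}} *)
SequencePosition[{1, 2, 3, 4}, {2, 4}]       (* {} : 2 and 4 are not adjacent *)
SequencePosition[{a, b, a, b}, {a, b}]       (* {{1, 2}, {3, 4}} *)

(* optional third argument: return at most the first occurrence *)
SequencePosition[{a, b, a, b}, {a, b}, 1]    (* {{1, 2}} *)
```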

f2 = MemberQ[Subsequences[#, {Length @ #2}],#2]&;
f2[{1, 2, 3, 4},#] & /@ {{2, 3}, {2, 3, 4}, {2, 4}, {2, 3, 1}}

{True, True, False, False}

Two variations on halirutan's approach:

ClearAll[f3,f4]
f3[lst_,sub_]:= Or @@ BlockMap[# === sub&, lst, Length@sub, 1]
f4[lst_,sub_]:=Catch[BlockMap[If[# === sub, Throw[True]]&, lst, Length@sub, 1]] === True

edited May 29 '18 at 11:08

answered May 29 '18 at 09:44


It is often a bit more efficient to test lists for length > 0 instead of comparing against {}. Length[SequencePosition[##]] > 0 & should be a bit faster than f1. – Henrik Schumacher May 29 '18 at 10:27

Nice, now we have cyclic references between our posts :) I still like your SequencePosition best. It's clear, short and should work for a wide range of cases. +1 – halirutan May 29 '18 at 10:42
@HenrikSchumacher, thank you, great point. I added that variant. – kglr May 29 '18 at 11:09

Mr.Wizard · Accepted Answer · 2018-05-29T11:23:18.493

6

We may use pattern matching:

f[a_, {ss__}] := MatchQ[a, {___, ss, ___}]

Testing:

f[{1, 2, 3, 4}, {2, 3}]          (* True  *)
f[{1, 2, 3, 4}, {2, 3, 4}]       (* True  *)
f[{1, 2, 3, 4}, {2, 4}]          (* False *)

This should work in any version of Mathematica, and its performance is nearly equivalent to that of SequencePosition (as measured in version 10.1 under Windows):

f1[Range[10000], {5000, 5001, 5002}] // RepeatedTiming
f[Range[10000], {5000, 5001, 5002}]  // RepeatedTiming

{0.000444, True}

{0.000475, True}
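One caveat worth keeping in mind with the pattern-matching approach: the elements of the sublist are themselves used as patterns. The sketch below shows this and one possible workaround using Verbatim. fLiteral is a hypothetical name, not part of the answer above.

```mathematica
f[a_, {ss__}] := MatchQ[a, {___, ss, ___}]

f[{x, y, z}, {y, z}]     (* True : symbolic elements are fine *)
f[{1, 2, 3}, {_, 3}]     (* True : _ acts as a wildcard, not as a literal element *)

(* hypothetical variant that wraps every element in Verbatim so it is matched literally *)
fLiteral[a_, sub : {__}] := MatchQ[a, {___, Sequence @@ (Verbatim /@ sub), ___}]

fLiteral[{1, 2, 3}, {_, 3}]   (* False *)
fLiteral[{1, _, 3}, {_, 3}]   (* True *)
```

For lists of plain numbers or strings, the two definitions should behave the same.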

edited May 29 '18 at 11:23

answered May 29 '18 at 11:03


great! neat and simple – Georgy May 29 '18 at 12:03

score 4 · Answer 4 · answered May 29 '18 at 09:44

How about this?

fun[list1_, list2_] := Module[{n, list3},
  n = Length[list2];
  list3 = Partition[list1, n, 1];
  MemberQ[list3, list2]
  ]

In[10]:= fun[{1, 2, 3, 4}, {2, 3}]

Out[10]= True

In[8]:= fun[{1, 2, 3, 4}, {2, 3, 4}]

Out[8]= True

In[9]:= fun[{1, 2, 3, 4}, {2, 4}]

Out[9]= False
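To see why this works, here is the intermediate list that Partition builds with offset 1, followed by the membership tests. This is just an illustration on the example from the question.

```mathematica
(* all length-2 windows of the list, sliding by one element *)
Partition[{1, 2, 3, 4}, 2, 1]               (* {{1, 2}, {2, 3}, {3, 4}} *)

MemberQ[{{1, 2}, {2, 3}, {3, 4}}, {2, 3}]   (* True *)
MemberQ[{{1, 2}, {2, 3}, {3, 4}}, {2, 4}]   (* False *)
```

Note that every window is materialized before MemberQ runs, so for a very long list this creates a correspondingly long list of sublists.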
