Select over many lists with changing selection criteria

Question

I have many individual timeseries that I am trying to align so that I can compute the mean and other statistics. The problem is that I am only interested in aligning a particular part of each timeseries, and that part changes from one timeseries to the next. For example, "timeseries A" may run from t=1...200 but I only want to use t=25...125 from it, while "timeseries B" may also run from t=1...200 but I want to use t=35...135 from it.

I am guessing I need to use Thread or MapThread for this but am not coming up with a solution. Below is what works on an individual timeseries, where #[[2]] refers to the second element of each sublist in timeseries[[i]], which holds the timestamp I want to use as the criterion for Select:

Select[timeseries[[i]], #[[2]] > 24 && #[[2]] < 126 &]

To do this I know I will need a second list containing the timestamp range selection parameters that are relevant to each particular timeseries[[i]]. For example a list like:

{{24,126},{34,136},{17,119},{74,176}...}

where each ordered pair gives the range of interest for the corresponding timeseries. In other words, for timeseries[[1]] I want to extract the sublists of data for timepoints 25...125, for timeseries[[2]] the sublists for timepoints 35...135, for timeseries[[3]] the sublists for timepoints 18...118, etc.

Thanks for any suggestions you can provide.

score 2 · Answer 1 · edited Apr 13 '17 at 12:55

Are you using TimeSeries? You do not include this function in your question so I shall assume not.

Let's start with this sample data:

SeedRandom[0]

data = RandomReal[99, {5, 4, 3}];

ranges = {{29, 69}, {35, 74}, {23, 59}, {7, 45}, {13, 51}};

You could then use MapThread as you anticipated:

MapThread[
 Cases[#, x_ /; #2[[1]] < x[[2]] < #2[[2]]] &,
 {data, ranges}
]

{{{64.5943, 62.674, 67.5985}, {23.6067, 63.1187, 10.0087}},
 {{89.5738, 63.4305, 30.3474}},
 {{27.9594, 29.3512, 68.8993}, {38.1301, 37.5996, 42.8385}},
 {{48.5995, 36.6677, 31.1134}, {65.0027, 29.8126, 83.9426}},
 {{2.24306, 16.1958, 68.7413}, {75.8672, 20.8045, 68.5393}, {41.9572, 29.9591, 9.34261}}}
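If you would rather keep the Select form from the question, the same threading works with a small helper. This is only a sketch: selectRange, bySelect and byCases are names made up here for illustration, and the strict inequalities mirror the question's > and < tests.

```mathematica
SeedRandom[0]
data = RandomReal[99, {5, 4, 3}];
ranges = {{29, 69}, {35, 74}, {23, 59}, {7, 45}, {13, 51}};

(* selectRange is a hypothetical helper: keep the rows whose second
   element lies strictly between tmin and tmax *)
selectRange[series_, {tmin_, tmax_}] := Select[series, tmin < #[[2]] < tmax &]

(* MapThread pairs each series with its own {tmin, tmax} *)
bySelect = MapThread[selectRange, {data, ranges}];

(* the Cases form from above, for comparison *)
byCases = MapThread[Cases[#, x_ /; #2[[1]] < x[[2]] < #2[[2]]] &, {data, ranges}];

bySelect === byCases  (* True *)
```

Destructuring the range in the helper's pattern avoids nesting one pure function inside another, which is where #-slot confusion tends to creep in.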

Instead of Cases you could use Interval, IntervalMemberQ and Pick:

intv = Interval /@ ranges;

MapThread[
 Pick[#, IntervalMemberQ[#2, #[[All, 2]]]] &,
 {data, intv}
]
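One difference worth checking before swapping the two approaches: Interval represents a closed interval, so the Pick version includes timestamps that fall exactly on a bound, while the Cases version uses strict inequalities. With real-valued random data this rarely matters, but with integer timestamps it can. A small illustration follows; t and strictPick are names invented for this sketch.

```mathematica
t = {1, 2, 3, 4};

(* closed interval: both endpoints count as members *)
IntervalMemberQ[Interval[{1, 3}], t]
(* {True, True, True, False} *)

(* strict inequalities: endpoints are excluded *)
(1 < # < 3) & /@ t
(* {False, True, False, False} *)

(* strictPick is a hypothetical Pick-based variant that keeps the strict bounds *)
strictPick[series_, {tmin_, tmax_}] :=
 Pick[series, (tmin < # < tmax) & /@ series[[All, 2]]]
```

If the inclusive behavior is what you want, the ranges from the question could be written as {25, 125} rather than {24, 126}.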

If your timeseries data is already sorted on the t parameter then it would be faster to use a binary search to find the start and end points in each set. If you provide some representative data and this is applicable I shall include an example.
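In the meantime, here is a rough sketch of what the binary-search approach might look like. It assumes each series is sorted in ascending order on its second column and keeps the strict bounds used above; boundIndex, selectSorted and the sorted sample data are all made up for illustration.

```mathematica
(* boundIndex (hypothetical helper): smallest index i for which
   cmp[series[[i, 2]], target] is False, or Length[series] + 1 if none.
   cmp = LessEqual gives the first row with t > target;
   cmp = Less gives the first row with t >= target. *)
boundIndex[series_, target_, cmp_] :=
 Module[{lo = 1, hi = Length[series] + 1, mid},
  While[lo < hi,
   mid = Quotient[lo + hi, 2];
   If[cmp[series[[mid, 2]], target], lo = mid + 1, hi = mid]];
  lo]

(* rows with tmin < t < tmax, located in O(log n) comparisons *)
selectSorted[series_, {tmin_, tmax_}] :=
 series[[boundIndex[series, tmin, LessEqual] ;; boundIndex[series, tmax, Less] - 1]]

(* made-up sample data: the second column is the sorted timestamp 1..10 *)
sorted = Table[{i^2, i, -i}, {i, 1, 10}];

selectSorted[sorted, {3, 7}]
(* {{16, 4, -4}, {25, 5, -5}, {36, 6, -6}} *)

(* one range per series, as before *)
MapThread[selectSorted, {{sorted, sorted}, {{3, 7}, {0, 2}}}]
```

Whether this beats Cases or Pick in practice depends on the length of each series; for short lists the overhead of the While loop may well outweigh the savings.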

score 2 · Answer 2 · answered Mar 04 '15 at 23:01

selF[tseries_, indexranges_] := MapIndexed[tseries[[#2[[1]], Span@@ #]] &, indexranges]

Example:

SeedRandom[0]
ts = RandomInteger[10 {#, # + 1}, 200] & /@ Range[5];
starts = {24, 34, 17, 74, 10};
samplesize = 100;
ranges1 = {#, # + samplesize} & /@ starts;
(* {{24,124}, {34,134}, {17,117}, {74,174}, {10,110}} *)
increment = 5;
ranges2 = {#, # + samplesize, increment} & /@ starts;
(* {{24,124,5}, {34,134,5}, {17,117,5}, {74,174,5}, {10,110,5}} *)

Row[ListLinePlot[#, ImageSize -> 300] & /@ {ts, selF[ts, ranges1], selF[ts, ranges2]}]
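Note that selF selects by position rather than by timestamp value, so it matches the question only if the samples sit at t = 1, 2, 3, ... within each series. Also, Span includes both ends, so {24, 124} picks 101 points. For comparison, the same selection can be written with MapThread, as the question anticipated; viaMapThread is a name introduced only for this sketch.

```mathematica
SeedRandom[0]
ts = RandomInteger[10 {#, # + 1}, 200] & /@ Range[5];
starts = {24, 34, 17, 74, 10};
ranges1 = {#, # + 100} & /@ starts;

selF[tseries_, indexranges_] :=
 MapIndexed[tseries[[#2[[1]], Span @@ #]] &, indexranges]

(* pair each series with its own {start, end} and take that span *)
viaMapThread = MapThread[#1[[Span @@ #2]] &, {ts, ranges1}];

viaMapThread === selF[ts, ranges1]  (* True *)

Length /@ viaMapThread
(* {101, 101, 101, 101, 101} *)
```

Because Span also accepts a third element, the three-element ranges2 form with an increment works with this MapThread version as well.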
