3

I have one table which contains time periods ("start" and "end") and a "type" for every period:

table1 = {{"start", "end", "type"}, 
          {{2013, 8, 10, 8, 5, 0.`}, {2013, 8, 10, 10, 6, 0.`}, "a"}, 
          {{2013, 8, 10, 10, 6, 0.`}, {2013, 8, 10, 10, 50, 0.`}, "b"}, 
          {{2013, 8, 10, 10, 50, 0.`}, {2013, 8, 10, 12, 10, 10.`}, "c"}} 

Now, I have a second table which contains dates:

table2 = {"date", 
          {2013, 8, 10, 11, 5, 0.`}, 
          {2013, 8, 10, 10, 15, 0.`}, 
          {2013, 8, 10, 10, 35, 0.`},
          {2013, 8, 10, 11, 10, 0.`}, 
          {2013, 8, 10, 12, 5, 0.`}} 

What I want to do now, is to test whether a date is within one of the periods and if yes in which period. The result should be a table which shows in which period the date is. For my small example, the table should look like this:

result = {{"date", "coresp. type"}, 
          {{2013, 8, 10, 11, 5, 0.`}, "c"}, 
          {{2013, 8, 10, 10, 15, 0.`}, "b"}, 
          {{2013, 8, 10, 8, 5, 0.`}, "a"}, 
          {{2013, 8, 10, 11, 10, 0.`}, "c"}, 
          {{2013, 8, 10, 12, 5, 0.`}, "c"}, 
          {{2013, 9, 10, 10, 10, 0.`}, "none"}}

Is there an way to create the result table automatically?

mmal
  • 3,508
  • 2
  • 18
  • 38
RMMA
  • 2,710
  • 2
  • 18
  • 33
  • Ah, I deleted my answer but now I know whats wrong. Your result does not fit table2... – Kuba Aug 01 '13 at 08:45
  • You are right I did copy the false version of table2, sorry for that. – RMMA Aug 01 '13 at 10:39
  • Frink, I humbly ask that you review the comparative timing that I have added to my answer and reconsider your selection. – Mr.Wizard Aug 02 '13 at 16:40

4 Answers4

9

I would do it like this, using AbsoluteTime, which is often much faster than alternatives.

Module[{
  intv = Interval /@ Map[AbsoluteTime, table1[[2 ;;, {1, 2}]], {-2}],
  type = table1[[2 ;;, 3]],
  data = Rest[table2],
  out
 },
 out = Pick[type, intv ~IntervalMemberQ~ #] & /@ AbsoluteTime /@ data;
 Join[List /@ data, out /. {} -> {"none"}, 2] ~Prepend~ {"date", "coresp. type"}
]
{{"date", "coresp. type"},
 {{2013, 8, 10, 11, 5, 0.`}, "c"},
 {{2013, 8, 10, 10, 15, 0.`}, "b"},
 {{2013, 8, 10, 10, 35, 0.`}, "b"},
 {{2013, 8, 10, 11, 10, 0.`}, "c"},
 {{2013, 8, 10, 12, 5, 0.`}, "c"}}

Comparative Timings

With apologies to Kuba, since I feel that the method that was Accepted is vastly inferior to this one I am compelled to provide support for my position.

I will generate a large set of sample data. I will leave out the header rows in all data for simplicity.

t1big =
  Join[
    Partition[DateList /@ Range[1.43*^9, 3*^9, 3*^7], 2, 1],
    List /@ CharacterRange["A", "z"] ~Drop~ {27, 32},
    2
  ];

t2big = RandomSample[DateList /@ Range[1*^9, 3*^9, 1*^6]];

Length /@ {t1big, t2big}
{52, 2001}

The timing function:

timeAvg = 
  Function[func,
    Do[If[# > 0.3, Return[#/5^i]] & @@ Timing@Do[func, {5^i}], {i, 0, 15}],
    HoldFirst];

First Kuba's method:

With[{
   table1 = t1big,
   table2 = t2big
  },
  new = {};
  len = Length@table1;
  Do[
     If[i == len + 1, AppendTo[new, {#, "None"}]; Break[];];
     If[DateDifference[#, table1[[i, 2]]] >= 0 && DateDifference[table1[[i, 1]], #] >= 0, 
      AppendTo[new, {#, table1[[i, 3]]}]; Break[]],
     {i, 1, len + 1}
  ] & /@ table2;
  new
] // timeAvg

80.2

Then mine:

Module[{
   intv = Interval /@ Map[AbsoluteTime, t1big[[All, {1, 2}]], {-2}],
   type = t1big[[All, 3]],
   data = t2big,
   out
  },
  out = Pick[type, intv ~IntervalMemberQ~ #] & /@ AbsoluteTime /@ data;
  Join[List /@ data, out /. {} -> {"none"}, 2]
] // timeAvg

0.03304

So my method is ~2400X faster than Kuba's.

Mr.Wizard
  • 271,378
  • 34
  • 587
  • 1,371
  • Another neat use of Pick...I have to use this more. – ubpdqn Aug 02 '13 at 03:23
  • I cannot get the code to run without removing Module, i.e. making separate expressions. The "none" case is not listed but not labelled: a minor point. My answer uses the same principle of mapping dates to real numbers, though my function is poor it is effective. The "none" case can bet tested either from the second dataset in the answer or from my answer (res). However, I have stripped the header. – ubpdqn Aug 02 '13 at 04:03
  • @ubpdqn Sorry, minor transcription error. Fixed now. – Mr.Wizard Aug 02 '13 at 04:06
  • Thanks @Mr.Wizard : works now. Meant to write "none" case is listed but not labelled, but this is a minor point. I am learning a lot studying your coding practice. – ubpdqn Aug 02 '13 at 04:14
  • @ubpdqn You're right, I forgot "none", and I was too busy yesterday to fix it; I'll do that now. I'm glad you like my code, but remember it's only one style and Mathematica supports many. – Mr.Wizard Aug 02 '13 at 15:55
  • @ubpdqn I didn't notice that you were already using IntervalMemberQ. +1 on your answer. :-) – Mr.Wizard Aug 02 '13 at 16:12
  • Thank you @Mr.Wizard. That was whole point of creating my own strictly mono tonic increasing silly function. The ugly code surrounding it was largely to label the none case but even this is much more ugly than necessary. So, I learned a lot from your code. – ubpdqn Aug 02 '13 at 21:24
4

It can be done faster if we know more about those intervals, if they can overlap etc.

But let's make an assumption that there is only one or none date interval matching for each date in table2.

Edit: your result is for different table2 than you showed so let's take what you've taken:

table2 = Rest[result][[ ;; , 1]]

Also, probably there is something built in but meanwhile:

new = {};
len = Length@table1;
Do[
   If[i == len + 1, AppendTo[new, {#, "None"}]; Break[];];
   If[DateDifference[#, table1[[i, 2]]] >= 0 && 
      DateDifference[table1[[i, 1]], #] >= 0
      ,
      AppendTo[new, {#, table1[[i, 3]]}]; Break[]], 
  {i, 1, len + 1}] & /@ table2;

new
{{{2013, 8, 10, 11, 5, 0.}, "c"}, {{2013, 8, 10, 10, 15, 0.}, "b"}, 
  {{2013, 8, 10, 8, 5, 0.}, "a"}, {{2013, 8, 10, 11, 10, 0.}, "c"}, 
  {{2013, 8, 10, 12, 5, 0.}, "c"}, {{2013, 9, 10, 10, 10, 0.}, "None"}}
Kuba
  • 136,707
  • 13
  • 279
  • 740
2

I make the assumptions, as above, of disjoint intervals and test data belongs to one or no interval.

I note that your original and result test data are not the same. The following is not elegant and I look forward to better solutions. It does work.

(* Strip header *)    
tab1=Drop[table1,1];
(* Define function to map date to number *)
    df[u_] := {10000, 100, 1, 0.01, 0.0001, 0.000001}.u;
(* Use Which to categorize *)
    f[x_] := Which @@ 
       Join[Flatten[{IntervalMemberQ[#[[1]], 
             df[x]], {x, #[[2]]}} & /@ ({Interval[{df@#[[1]], 
                df@#[[2]]}], #[[3]]} & /@ tab1), 
         1], {True, {x, "none"}}];

Applying f/@res to your second test set:

res={{2013, 8, 10, 11, 5, 0.}, {2013, 8, 10, 10, 15, 0.}, {2013, 8, 10, 8,
   5, 0.}, {2013, 8, 10, 11, 10, 0.}, {2013, 8, 10, 12, 5, 0.}, {2013,
   9, 10, 10, 10, 0.}}

yields:

   {
{{2013, 8, 10, 11, 5, 0.},"c"},
{{2013, 8, 10, 10, 15, 0.},"b"},
{{2013, 8, 10, 8, 5, 0.},"a"},
{{2013, 8, 10, 11, 10, 0.},"c"}, 
{{2013, 8, 10, 12, 5, 0.},"c"},
{{2013, 9, 10, 10, 10, 0.},"none"}
}

consistent with the question.

For the original test data:

tab2={{2013, 8, 10, 11, 5, 0.}, {2013, 8, 10, 10, 15, 0.}, {2013, 8, 10, 
  10, 35, 0.}, {2013, 8, 10, 11, 10, 0.}, {2013, 8, 10, 12, 5, 0.}}

yields:

    {
{{2013, 8, 10, 11, 5, 0.}, "c"},
{{2013, 8, 10, 10, 15, 0.},"b"},
{{2013, 8, 10, 10, 35, 0.},"b"},
{{2013, 8, 10, 11, 10, 0.},"c"}, 
{{2013, 8, 10, 12, 5, 0.}, "c"}
}
ubpdqn
  • 60,617
  • 3
  • 59
  • 148
0

I just factored it differently, nothing special:

period[dt_, {start_, end_, period_}] := 
  If[DateDifference[start, dt ] > 0 &&   DateDifference[dt, end] > 0, 
     period];
getPeriod[dt_] := 
  First[DeleteCases[period[dt, #] & /@ Rest[table1], Null]];
Table[{date, getPeriod[date]}, {date , Rest[table2]}]

{
 {{2013, 8, 10, 11, 5, 0.}, "c"}, 
 {{2013, 8, 10, 10, 15, 0.}, "b"}, 
 {{2013, 8, 10, 10, 35, 0.}, "b"}, 
 {{2013, 8, 10, 11, 10, 0.}, "c"}, 
 {{2013, 8, 10, 12, 5, 0.}, "c"}
}   
cormullion
  • 24,243
  • 4
  • 64
  • 133