I have a large file with irregularly spaced data in the format:
{{"2012/08/06 21:05:22", 29}, {"2012/08/06 21:10:14",
28}, {"2012/08/06 21:15:12", 29}, {"2012/08/06 21:20:14",
29}, {"2012/08/06 21:30:12", 28}, {"2012/08/06 21:35:12",
28}, {"2012/08/06 21:40:13", 30}, {"2012/08/06 21:45:13",
30}, {"2012/08/06 22:00:13", 29}, {"2012/08/06 22:05:08", 28}}
The whole file is about 100,000 lines long and can be downloaded here: 2.4 MB CSV file
I have converted the time strings with AbsoluteTime and removed comments and empty lines:
imp = DeleteCases[{#[[1]],
Mean[DeleteCases[
ToExpression[
Map[If[Head[#] === String,
StringReplace[#, "#" ~~ __ -> ""], #] &, #[[2 ;; -1]]]],
Null]]} & /@
Rest[Import["dht22.csv", "Data"]][[
1 ;; -1]], {_, Mean[{}]}];
absTimp = {AbsoluteTime[#[[1]]], #[[2]]} & /@ imp;
Now I want to calculate hourly means of this data. This is problematic because there can be hours, or even days, when no data was recorded, and the samples are not taken exactly on the hour but with a delay of some seconds or minutes. So I want to interpolate the data and calculate hourly means of the interpolated function. I have this code:
interpolation = Interpolation[absTimp, InterpolationOrder -> 1]
timeFirst = imp[[1, 1]];
listFirst = DateList[timeFirst];
start = AbsoluteTime[Join[listFirst[[1 ;; 3]], {listFirst[[4]] + 1, 0, 0}]];
timeLast = imp[[-1, 1]];
listLast = DateList[timeLast];
end = AbsoluteTime[Join[listLast[[1 ;; 3]], {listLast[[4]] - 1, 0, 0}]];
AbsoluteTiming[
Table[Integrate[interpolation[x], {x, t, t + 3600}], {t, start,
Min[ (* shorter range for testing *)
end,
start + 3600*24
], 3600}]/3600]
This calculation of 25 hourly means takes 2 seconds with Integrate and 3.5 seconds with NIntegrate on my computer, so it is clearly unusable for my large data set. How can I speed up the calculation?
MovingAverage.Differences method, I don't think it's useable because my integration intervals are usually between data points. – shrx Jan 06 '14 at 18:05
MovingAverage.Differences approach will give you huge speedup (you also can Compile it to get even better performance). – Alexey Popkov Jan 06 '14 at 18:18
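For illustration, here is a minimal sketch of how the MovingAverage and Differences idea from the comments might be applied even though the hour boundaries fall between data points. It rests on one assumption: because the interpolation is piecewise linear (InterpolationOrder -> 1), the trapezoid rule is exact once the hour boundaries are inserted as extra nodes. The name hourlyMeans and its optional width argument are hypothetical, not part of the original code, and the sketch has not been timed on the full data set.

```mathematica
(* Hypothetical helper: hourly means of the order-1 interpolant.
   data is a list of {time, value} pairs such as absTimp. Bins start at
   start, start + width, ..., end, and each bin covers [t, t + width],
   as in the Table/Integrate code above. *)
hourlyMeans[data_, start_, end_, width_: 3600] :=
 Module[{f, edges, ts, vals, cum, idx},
  f = Interpolation[data, InterpolationOrder -> 1];
  (* bin boundaries as machine reals *)
  edges = N[Range[start, end + width, width]];
  (* merge the data abscissas inside the range with the bin boundaries *)
  ts = Union[
    Select[N[data[[All, 1]]], First[edges] <= # <= Last[edges] &],
    edges];
  (* interpolant values at every node, including the inserted boundaries *)
  vals = f /@ ts;
  (* cumulative trapezoid sums: interval width times mean of the two endpoint values *)
  cum = Prepend[Accumulate[Differences[ts] MovingAverage[vals, 2]], 0.];
  (* positions of the bin boundaries within the merged node list *)
  idx = Lookup[AssociationThread[ts -> Range[Length[ts]]], edges];
  (* integral over each bin, divided by the bin width *)
  Differences[cum[[idx]]]/width]
```

With the variables defined above, the call would be hourlyMeans[absTimp, start, end]. The interpolating function is evaluated only once per node, and everything after that is list arithmetic, so no per-hour call to Integrate is needed. Lookup and AssociationThread require version 10 or later.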