I want to algorithmically process energy meter data. The energy meter measures a heat or power producer or a heat or power consumer (but not both, so the measured energy will always have a positive sign). No additional information is known about the energy system (such as its maximum load) or about the type of energy meter; only data stored in a database can be accessed. Processing will be done by an algorithm looking at data for a given time interval (no live processing).
Usually, the data are weakly monotonically increasing and have the form
2015-04-01 00:00 20.78 kWh
2015-04-01 00:05 30.80 kWh
2015-04-01 00:10 73.99 kWh
2015-04-01 00:20 82.30 kWh
2015-04-01 00:25 82.30 kWh
2015-04-01 00:30 83.44 kWh
...
The energy produced or consumed for a given period is simply the difference of the energy meter counts. So far, so good. However, the algorithm has to deal with the following three problems:
1. Outliers "above" have to be detected as invalid data.
2015-04-01 00:00 20.78 kWh
2015-04-01 00:05 30.80 kWh
2015-04-01 00:10 500 kWh
2015-04-01 00:20 82.30 kWh
2015-04-01 00:25 82.30 kWh
2015-04-01 00:30 83.44 kWh
...
2. Outliers "below" have to be detected as invalid data.
2015-04-01 00:00 20.78 kWh
2015-04-01 00:05 30.80 kWh
2015-04-01 00:10 20 kWh
2015-04-01 00:20 82.30 kWh
2015-04-01 00:25 82.30 kWh
2015-04-01 00:30 83.44 kWh
...
In unlikely cases, there might be several consecutive outliers above or below or a combination of both.
3. A reset of the energy meter has to be detected automatically.
2015-04-01 00:00 20.78 kWh
2015-04-01 00:05 30.80 kWh
2015-04-01 00:10 3.99 kWh
2015-04-01 00:20 12.30 kWh
2015-04-01 00:25 12.30 kWh
2015-04-01 00:30 13.44 kWh
...
After a reset, counting starts again from another level (a reset is simply a level shift). The level from which counting restarts is often zero, but can also be any other positive number. A reset can occur at an arbitrary point in time, though usually not often.
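To make the three cases concrete, here is a minimal Python sketch of one possible heuristic. It is not an established method, only an illustration of the distinction I am after. The names `classify_readings`, `consumed_energy` and `lookahead` are made up for this example, and it rests on several assumptions: the first reading is trusted, a majority vote over the next few readings decides each case, and runs of consecutive outliers are shorter than half the look-ahead window. A drop is treated as an outlier if the following readings recover to the old level, and as a reset if they stay below it. An upward jump is treated as an outlier if the following readings fall back to a level between the last valid reading and the jump.

```python
def classify_readings(values, lookahead=3):
    """Label each reading as 'ok', 'outlier' or 'reset' (heuristic sketch)."""
    labels = []
    last_valid = None
    for i, v in enumerate(values):
        # Readings used as evidence for or against the current one.
        following = values[i + 1 : i + 1 + lookahead]
        if last_valid is None:
            # Assumption: the first reading of the interval is trusted.
            labels.append("ok")
            last_valid = v
        elif v >= last_valid:
            # Upward step: suspicious if most later readings fall back
            # below v while staying at or above the last valid level.
            fell_back = sum(1 for f in following if last_valid <= f < v)
            if following and 2 * fell_back > len(following):
                labels.append("outlier")
            else:
                labels.append("ok")
                last_valid = v
        else:
            # Downward step: outlier if most later readings recover to the
            # old level, otherwise a reset (level shift). With no later
            # readings there is no evidence, so the point is not trusted.
            recovered = sum(1 for f in following if f >= last_valid)
            if not following or 2 * recovered > len(following):
                labels.append("outlier")
            else:
                labels.append("reset")
                last_valid = v
    return labels


def consumed_energy(values, labels):
    """Sum the increments between valid readings, skipping outliers and
    not counting the (unknown) difference across a reset."""
    total = 0.0
    previous = None
    for v, label in zip(values, labels):
        if label == "outlier":
            continue
        if previous is not None and label != "reset":
            total += v - previous
        previous = v
    return total


above = [20.78, 30.80, 500.0, 82.30, 82.30, 83.44]
below = [20.78, 30.80, 20.0, 82.30, 82.30, 83.44]
reset = [20.78, 30.80, 3.99, 12.30, 12.30, 13.44]

for series in (above, below, reset):
    labels = classify_readings(series)
    print(labels, round(consumed_energy(series, labels), 2))
```

On the three example series above, this marks the third reading as an outlier in the first two cases and as a reset in the third. Any energy metered between the last reading before a reset and the reset itself is lost, and a reset to a level above the previous reading cannot be told apart from ordinary consumption by this sketch.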
To my eyes, problems 1 to 3 seem ubiquitous in measurement engineering and must already have been addressed. Nevertheless, I couldn't find any literature on this topic. Does anybody know of existing solutions to this problem? Any help will be highly appreciated.
If you decide that you can go ahead, all you need is a simple program to detect and remove those data points.
– DLS3141 Aug 05 '15 at 14:04