
Does anyone have another, or even a more direct, way of finding the last date of each month available in a list of successive dates?

I currently do the following (note: nothing special about WeatherData[] here, I just used it to generate a list of dates):

WeatherData["KMDZ", "MeanTemperature", {{2007, 1, 1}, {2007, 12, 31}, "Day"}][[All, 1]];
Last[#] & /@ SplitBy[%, DateList[#][[2]] &]

{{2007, 1, 31}, {2007, 2, 28}, {2007, 3, 31}, {2007, 4, 30}, {2007, 5, 31}, {2007, 6, 30}, {2007, 7, 31}, {2007, 8, 31}, {2007, 9, 30}, {2007, 10, 31}, {2007, 11, 30}, {2007, 12, 31}}

This works fine. I just wondered if I could think about this problem in a different way.


It seems the question needs clarification. The starting list of dates may not include every date in a given month, and it may not include the actual last date of any given month. I need to find the last date available in the list for each month.

Jagra

2 Answers


Since your data already has dates in the form {Y, M, D}, you can do without DateList, and as J. M. mentioned, you can use Last in place of Last[#] &. You could therefore use:

Last /@ SplitBy[data, #[[2]] &]

Since, as the operation above shows, you are only looking at the second column (the month), you might use a numeric operation and Pick:

Pick[data, Differences@data[[All, 2]] ~Append~ 1, 1]

This is actually quite fast (timings from timeAvg, in seconds):

Last /@ SplitBy[data, #[[2]] &] // timeAvg

Pick[data, Differences@data[[All, 2]] ~Append~ 1, 1] // timeAvg

0.00031456

0.000020992
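
Note that timeAvg is not a built-in function. For anyone who wants to reproduce the comparison, a rough stand-in can be sketched with the built-in RepeatedTiming, assuming a Mathematica version that provides it. The definition below is only an illustration and is not the helper used for the numbers above:

```mathematica
(* Hypothetical stand-in for timeAvg; an assumption, not the original helper. *)
(* HoldFirst keeps the expression unevaluated until RepeatedTiming runs it.   *)
SetAttributes[timeAvg, HoldFirst];
timeAvg[expr_] := First@RepeatedTiming[expr]

(* Usage mirrors the lines above; data is the list of {Y, M, D} dates. *)
Last /@ SplitBy[data, #[[2]] &] // timeAvg
Pick[data, Append[Differences[data[[All, 2]]], 1], 1] // timeAvg
```

Absolute timings will differ by machine and version; only the ratio between the two methods is of interest.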

This can be made slightly faster still by using the undocumented properties of SparseArray. (Evaluate SparseArray[{1}]["Properties"] for a list.)

data ~Extract~ 
  SparseArray[Differences@data[[All, 2]] ~Append~ 1]["NonzeroPositions"]

EDIT: The Differences methods assume that there is at least one date in each month, so that every transition point is characterized by a delta of 1. If that is not the case, you need to add something like Unitize, which maps every nonzero difference to 1:

Pick[data, Unitize@Differences@data[[All, 2]] ~Append~ 1, 1]
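
To see why Unitize matters, here is a small worked sketch. The list sample is made up for illustration (it is not from the question) and deliberately has no dates in February:

```mathematica
(* Hypothetical sample with a missing month (no February dates). *)
sample = {{2007, 1, 5}, {2007, 1, 20}, {2007, 3, 2}, {2007, 3, 15}, {2007, 4, 1}};

deltas = Differences[sample[[All, 2]]]
(* {0, 2, 0, 1} : the January -> March jump is 2, not 1 *)

Pick[sample, Append[deltas, 1], 1]
(* {{2007, 3, 15}, {2007, 4, 1}} : the last January date is missed *)

Pick[sample, Append[Unitize[deltas], 1], 1]
(* {{2007, 1, 20}, {2007, 3, 15}, {2007, 4, 1}} *)

(* Assumption beyond the original answer: if the list spans several years, *)
(* grouping on {year, month} avoids merging, say, March 2007 with March 2008. *)
Last /@ SplitBy[sample, #[[{1, 2}]] &]
(* {{2007, 1, 20}, {2007, 3, 15}, {2007, 4, 1}} *)
```

Comparing months alone cannot distinguish the same month in two different years, so the last form is the safer choice when that situation can arise.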

For the sake of exploring less practical alternatives you could also use:

Reap[
  If[#[[2]] < #2[[2]], Sow@#] & @@@ Partition[data, 2, 1]
][[2, 1]]

Or:

Reap[
  Fold[(If[#[[2]] < #2[[2]], Sow@#]; #2) &, First@data, Rest@data]
][[2, 1]]

(Each of these misses the last date which would need to be appended.)
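
A minimal sketch of appending that final date, using a made-up list named sample. It assumes the list contains at least one month transition; otherwise Reap collects nothing and the [[2, 1]] part fails:

```mathematica
(* Hypothetical sample list of {Y, M, D} dates. *)
sample = {{2007, 1, 5}, {2007, 1, 20}, {2007, 3, 2}, {2007, 3, 15}, {2007, 4, 1}};

(* Sow the first date of each consecutive pair whenever the month increases, *)
(* then append the very last date, which no pair can report.                 *)
Append[
  Reap[
    If[#[[2]] < #2[[2]], Sow@#] & @@@ Partition[sample, 2, 1]
  ][[2, 1]],
  Last@sample
]
(* {{2007, 1, 20}, {2007, 3, 15}, {2007, 4, 1}} *)
```

Because the test is a strict less-than on the month number, a December to January transition would not be caught by this form.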

Mr.Wizard

Just ask for the 0th day and you'll get the last day of the previous month:

DateList[{2012, 2, 0}]

{2012, 1, 31, 0, 0, 0.}

or:

Take[DateList@{2007, #, 0}, 3] & /@ (Range@12 + 1)

{{2007, 1, 31}, {2007, 2, 28}, {2007, 3, 31}, {2007, 4, 30}, {2007, 5, 31}, {2007, 6, 30}, {2007, 7, 31}, {2007, 8, 31}, {2007, 9, 30}, {2007, 10, 31}, {2007, 11, 30}, {2007, 12, 31}}
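
The same trick can be tied back to a list of dates. A sketch, assuming a hypothetical list sample of {Y, M, D} triples: collect the distinct {year, month} pairs and ask for day 0 of the following month. Note that this returns calendar month ends, which need not be dates that actually occur in the list:

```mathematica
(* Hypothetical sample; 2012 is a leap year. *)
sample = {{2012, 1, 5}, {2012, 2, 10}, {2012, 2, 20}};

(* Distinct {year, month} pairs present in the list. *)
months = DeleteDuplicates[sample[[All, {1, 2}]]]
(* {{2012, 1}, {2012, 2}} *)

(* Day 0 of the next month is the last day of this month. *)
Take[DateList[{#[[1]], #[[2]] + 1, 0}], 3] & /@ months
(* {{2012, 1, 31}, {2012, 2, 29}} *)
```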

Chris Degnen
  • Somehow I had interpreted the question as asking how to determine, in a list of dates split by month, the latest date for each month. But I guess your interpretation's the one. – J. M.'s missing motivation Jun 25 '12 at 14:28
  • @J.M. -- Actually you got it right. Perhaps the question had some ambiguity. I added a clarification at the end of the original question. – Jagra Jun 25 '12 at 18:45