Hopefully this question has a simple answer. I have the following bit of code:
fits = Quiet[
  Check[
    NMinimize[
      {ssd, And @@ cons}, vars,
      AccuracyGoal -> accur,
      PrecisionGoal -> Infinity,
      Method -> {"DifferentialEvolution", "ScalingFactor" -> 1.0, "CrossProbability" -> 0.0},
      MaxIterations -> maxIter,
      StepMonitor :> (fitStep++)
    ],
    {} (* Check returns {} if NMinimize generates a message; Quiet suppresses the printout *)
  ]
];
where the variables accur and maxIter are set elsewhere. I've noticed that fitStep sometimes exceeds maxIter during the minimization, which seems to indicate that an "iteration" and a "step" are not the same thing in this context. If they're different, is there a way to monitor the "iteration" and not the "step"? I just want to be able to limit the fit to a certain number of iterations (to prevent it from going on too long) and monitor the progress against that limit.
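For reference, here's a minimal sketch with a made-up objective (not my real problem) showing that StepMonitor and EvaluationMonitor fire different numbers of times:

(* Toy objective, just to compare the two monitors' counts. *)
f[x_?NumericQ] := Sin[.4 x] + .7 Cos[.6 x] x;
steps = 0; evals = 0;
NMinimize[{f[x], -10 < x < 10}, {x},
  Method -> "DifferentialEvolution",
  MaxIterations -> 100,
  StepMonitor :> steps++,
  EvaluationMonitor :> evals++];
{steps, evals}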
EDIT: 16 May 2013
Alright, I've put together a self-contained chunk of code that demonstrates the MaxIterations issue. I know it's not the most basic example, but it reliably reproduces what I'm seeing. The code below is commented, but to summarize: I have some spectral data that I'm trying to fit with a sum of six Logistic Power Peak (LPP) functions using the Differential Evolution method. The important variables here are vars, cons, args, and ssd, as those are the arguments fed to NMinimize[]. The ssd variable contains a very large expression, so evaluate it with caution.
Evaluate the following expression first in a separate cell to watch the progress of the minimization.
(* Watch the progress here. *)
Dynamic["Iterating: " <> ToString[fitStep] <> " / " <> ToString[maxIter] <> " ..."]
Change maxIter and the "PostProcess" option inside NMinimize[] (bottom) to see the various behaviors.
(* Set the number of iterations. *)
maxIter = 500;
(* Here's some data I'm trying to fit. *)
pts={
{525, 3.95}, {527, 5.07}, {529, 5.15}, {531, 5.58}, {533, 5.85},
{535, 6.53}, {537, 7.01}, {539, 6.79}, {541, 7.01}, {543, 7.43},
{545, 7.18}, {547, 6.92}, {549, 6.34}, {551, 7.21}, {553, 7.75},
{555, 7.65}, {557, 8.71}, {559, 8.89}, {561, 9.02}, {563, 9.72},
{565, 9.62}, {567, 9.88}, {569, 10.50}, {571, 9.65}, {573, 9.71},
{575, 8.52}, {577, 7.70}, {579, 8.15}, {581, 8.29}, {583, 10.00},
{585, 10.00}, {587, 9.28}, {589, 10.80}, {591, 11.90}, {593, 13.80},
{595, 14.90}, {597, 14.30}, {599, 15.30}, {601, 16.20}, {603, 16.30},
{605, 17.90}, {607, 17.70}, {609, 19.20}, {611, 20.40}, {613, 20.80},
{615, 21.40}, {617, 20.30}, {619, 18.40}, {621, 19.80}, {623, 19.90},
{625, 20.70}, {627, 19.80}, {629, 18.60}, {631, 19.40}, {633, 20.40},
{635, 20.50}, {637, 23.10}, {639, 22.50}, {641, 25.20}, {643, 26.40},
{645, 26.10}, {647, 25.60}, {649, 25.20}, {651, 23.30}, {653, 23.10},
{655, 23.00}, {657, 24.00}, {659, 22.70}, {661, 22.00}, {663, 21.70},
{665, 21.30}, {667, 19.60}, {669, 20.30}, {671, 17.60}, {673, 18.20},
{675, 17.40}, {677, 16.10}, {679, 15.60}, {681, 14.10}, {683, 12.20},
{685, 12.10}, {687, 13.50}, {689, 13.90}, {691, 12.50}, {693, 12.00},
{695, 11.60}, {697, 11.10}, {699, 9.41}, {701, 8.56}, {703, 6.69},
{705, 6.50}, {707, 7.20}, {709, 7.03}, {711, 6.75}, {713, 6.01},
{715, 6.97}, {717, 7.54}, {719, 7.76}, {721, 7.47}, {723, 7.33},
{725, 6.93}, {727, 6.64}, {729, 6.25}, {731, 4.93}, {733, 4.67},
{735, 4.06}, {737, 4.17}, {739, 4.90}, {741, 4.64}, {743, 4.63},
{745, 5.11}, {747, 4.75}, {749, 4.45}
};
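A quick plot of the data, just to see the spectrum being fitted:

(* Quick look at the data. *)
ListPlot[pts, PlotRange -> All]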
(* This is the Logistic Power Peak (LPP) function. *)
(* I'm fitting with a sum of these (the sum is handled below). *)
(* It's called as LPP[x, a, b, c, d, s], so the slots are: *)
(* #1 = x = abscissa *)
(* #2 = a = height *)
(* #3 = b = center *)
(* #4 = c = width *)
(* #5 = d = asymmetry *)
(* #6 = s = reflection *)
LPP = (#2/#5) (#5 + 1)^((#5 + 1)/#5) *
  (1 + Exp[(#6 (#1 - #3) + #4*Log[#5])/#4])^(-((#5 + 1)/#5)) *
  Exp[(#6 (#1 - #3) + #4*Log[#5])/#4] &;
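As a quick sanity check on that call convention (my own check, not part of the fit), a single peak evaluated at the first peak's starting values should have height about 4.26 centered near 534.67:

(* Sanity check: one LPP peak using the first peak's starting values from below. *)
Plot[LPP[x, 4.26, 534.67, 5.02, 5.94, 1], {x, 510, 560}, PlotRange -> All]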
(* These are just labels for each of the six peaks being fitted. *)
pkIndx = {121101022, 121111022, 121121022, 121131022, 121141022, 121151022};
(* For each of the six peaks, set initial values for the parameters *)
(* a (height), b (center), c (width), d (asymmetry), and s (reflection). *)
MapThread[(a[#1] = #2) &, {pkIndx, {4.26, 6.41, 3.68, 11.70, 3.87, 15.38}}];
MapThread[(b[#1] = #2) &, {pkIndx, {534.67, 559.62, 583.72, 607.08, 629.76, 651.83}}];
MapThread[(c[#1] = #2) &, {pkIndx, {5.02, 5.49, 5.96, 6.43, 6.91, 7.39}}];
MapThread[(d[#1] = #2) &, {pkIndx, {5.94, 6.16, 6.38, 6.59, 6.79, 6.98}}];
MapThread[(s[#1] = #2) &, {pkIndx, {1, 1, 1, 1, 1, 1}}];
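To double-check that this bookkeeping landed where I expect, a quick table (purely optional):

(* Optional: tabulate the starting values stored for each peak. *)
TableForm[{#, a[#], b[#], c[#], d[#], s[#]} & /@ pkIndx,
  TableHeadings -> {None, {"peak", "a", "b", "c", "d", "s"}}]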
(* Set lower constraints for each peak parameter. *)
MapThread[(daMins[#1] = #2) &, {pkIndx, {3, 5, 2, 5, 0, 5}}];
MapThread[(dbMins[#1] = #2) &, {pkIndx, {10, 10, 10, 10, 10, 10}}];
MapThread[(dcMins[#1] = #2) &, {pkIndx, {3, 3, 3, 3, 3, 3}}];
MapThread[(ddMins[#1] = #2) &, {pkIndx, {3, 3, 3, 3, 3, 3}}];
(* Set upper constraints for each peak parameter. *)
MapThread[(daPlus[#1] = #2) &, {pkIndx, {5, 5, 5, 5, 5, 5}}];
MapThread[(dbPlus[#1] = #2) &, {pkIndx, {10, 10, 10, 10, 10, 10}}];
MapThread[(dcPlus[#1] = #2) &, {pkIndx, {3, 3, 3, 3, 3, 3}}];
MapThread[(ddPlus[#1] = #2) &, {pkIndx, {3, 3, 3, 3, 3, 3}}];
(* In my larger program, this checks to see which peak parameters are being fitted. *)
(* Here we're fitting all but s, which always takes the value 1. *)
bool = {
(daPlus[#] != 0 || daMins[#] != 0),
(dbPlus[#] != 0 || dbMins[#] != 0),
(dcPlus[#] != 0 || dcMins[#] != 0),
(ddPlus[#] != 0 || ddMins[#] != 0)
} & /@ pkIndx;
aPos = Flatten[Position[bool[[All, 1]], True]];
bPos = Flatten[Position[bool[[All, 2]], True]];
cPos = Flatten[Position[bool[[All, 3]], True]];
dPos = Flatten[Position[bool[[All, 4]], True]];
(* Make a table of variables for NMinimize[]. *)
vars = Join[
a0[#] & /@ pkIndx[[aPos]],
b0[#] & /@ pkIndx[[bPos]],
c0[#] & /@ pkIndx[[cPos]],
d0[#] & /@ pkIndx[[dPos]]
];
(* Make a table of constraints for NMinimize[]. *)
cons = Join[
(a[#] - daMins[#] < a0[#] < a[#] + daPlus[#]) & /@ pkIndx[[aPos]],
(b[#] - dbMins[#] < b0[#] < b[#] + dbPlus[#]) & /@ pkIndx[[bPos]],
(c[#] - dcMins[#] < c0[#] < c[#] + dcPlus[#]) & /@ pkIndx[[cPos]],
(d[#] - ddMins[#] < d0[#] < d[#] + ddPlus[#]) & /@ pkIndx[[dPos]]
];
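Since everything except s is fitted here, vars should have 24 entries (4 fitted parameters for each of the 6 peaks), and the first constraint should evaluate to 1.26 < a0[121101022] < 9.26:

Length[vars]  (* 24 = 4 fitted parameters x 6 peaks *)
cons[[1]]     (* 1.26 < a0[121101022] < 9.26 *)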
(*
Make a table of arguments to be fed to the LPP[] function in the Sum below.
Normally this table contains numerical values for the parameters that aren't being fitted
and variables for the ones that are.
Here we're fitting everything except s, so the table contains only variables.
*)
args = MapThread[
{
aDum -> #2[[1]] /. {True -> a0[#1], False -> a[#1]},
bDum -> #2[[2]] /. {True -> b0[#1], False -> b[#1]},
cDum -> #2[[3]] /. {True -> c0[#1], False -> c[#1]},
dDum -> #2[[4]] /. {True -> d0[#1], False -> d[#1]}
} &,
{pkIndx, bool}
];
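Because bool is all True in this example, each entry of args maps the dummy symbols to fit variables; the first entry, for instance, should be:

args[[1]]
(* {aDum -> a0[121101022], bDum -> b0[121101022],
    cDum -> c0[121101022], dDum -> d0[121101022]} *)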
(*
Create the sum of squared deviations for NMinimize[] to minimize.
The inner Sum[] adds up the six LPP[] functions at a given x, and each LPP[] gets its
arguments from the "args" table.
The outer Sum[] sums the squared deviations over all the data points.
*)
ssd = Sum[
(
Sum[
LPP[pts[[i, 1]], aDum, bDum, cDum, dDum, s[pkIndx[[j]]]] /. args[[j]],
{j, Length[pkIndx]}
]
- pts[[i, 2]]
)^2,
{i, 1, Length[pts], 1}
];
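Because ssd is a very large symbolic expression, I check its size with LeafCount rather than printing it:

LeafCount[ssd]  (* large; avoid printing ssd itself *)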
(* Run the fit. *)
fitStep = 0;
fits = NMinimize[{ssd, And @@ cons}, vars,
AccuracyGoal -> 3,
PrecisionGoal -> Infinity,
Method -> {
"DifferentialEvolution",
"ScalingFactor" -> 1.0,
"CrossProbability" -> 0.0,
"PostProcess" -> False
},
MaxIterations -> maxIter,
StepMonitor :> (fitStep++)
];
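Once the fit finishes, here's a quick sketch for overlaying the fitted sum of peaks on the data (model is just my name for the fitted curve):

(* Last[fits] holds the fitted parameter rules. *)
model[x_] = Sum[
    LPP[x, aDum, bDum, cDum, dDum, s[pkIndx[[j]]]] /. args[[j]],
    {j, Length[pkIndx]}] /. Last[fits];
Show[ListPlot[pts], Plot[model[x], {x, 525, 749}, PlotStyle -> Red]]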
My observations:

With "PostProcess" -> False:
  MaxIterations = 300  --->  # iterations = 220
  MaxIterations = 500  --->  # iterations = 250
  MaxIterations = 1000 --->  # iterations = 500
  MaxIterations = 1200 --->  # iterations = 600

With "PostProcess" -> FindMinimum:
  MaxIterations = 300  --->  # iterations = 221
  MaxIterations = 500  --->  # iterations = 251
  MaxIterations = 1000 --->  # iterations = 501

With "PostProcess" -> True:
  MaxIterations = 300  --->  # iterations = 286
  MaxIterations = 500  --->  # iterations = 316
  MaxIterations = 1000 --->  # iterations = 566
  MaxIterations = 1500 --->  # iterations = 816
Lastly, I did find a comment in The Mathematica Guidebook for Numerics by Michael Trott claiming (I'm paraphrasing) that in FindMinimum[], MaxIterations determines the effective number of evaluations in each search direction. I believe it was on page 371, but I'll have to double-check.
Comments:

Sow them and find out? (I don't know a lot about how this works, so I could be wrong.) In any case, please include a minimal example (i.e., definitions for ssd, cons, vars, etc.). – rm -rf May 14 '13 at 22:03

[...] NMinimize that is sufficient to illustrate your problem. Regardless of whether it matters or not, it makes life simple for everyone who might answer your question if they can simply copy your code and run it to test it. Most of us don't have the time or inclination to get creative and think up a valid function to minimize, appropriate conditions, etc., all of this before we actually get to the heart of your problem. – rm -rf May 14 '13 at 22:15

f[x_?NumericQ] := Sin[.4 x] + .7 Cos[.6 x] x;
{steps, eval} = First /@ Last@Reap[
    NMinimize[{f@x, -10 < x < 10}, {x},
      Method -> "DifferentialEvolution",
      StepMonitor :> Sow[{x, f@x}, "steps"],
      EvaluationMonitor :> Sow[{x, f@x}, "eval"]],
    {"steps", "eval"}];
Plot[f@x, {x, -10, 10}, Epilog -> {{Red, Point[eval]}, {Green, Point[steps]}}]
Also look at the lengths of steps and eval. – rm -rf May 14 '13 at 23:12

[...] what NMinimize thinks of as a step or an iteration, that's a different matter, especially for constrained optimizations where barrier parameters or penalty functions may need to be adjusted in between DE iterations to guide the process into the correct minimum. For more, see my answers here and here. – Oleksandr R. May 15 '13 at 00:30

[...] NMinimize steps and FindMinimum postprocessing steps. If you want to disable the postprocessing (not recommended, since FindMinimum converges very quickly), use Method -> {"DifferentialEvolution", "PostProcess" -> False}. Interestingly, "PostProcess" -> {FindMinimum, MaxIterations -> #} has no effect on the actual number of postprocessing iterations, while "PostProcess" -> {KKT, MaxIterations -> #} does. I guess FindMinimum is not so easily dissuaded from its mission. – Oleksandr R. May 15 '13 at 00:39

[...] "ScalingFactor" -> 1.0, "CrossProbability" -> 0.0 deliberately so that the optimization never converges? This may influence how NMinimize behaves if after a certain number of iterations it continues to see no improvement. Anyway, I would repeat my previous suggestion to try this again with diagnostics switched on: this way, NMinimize will tell you explicitly what it is doing. If you specify MaxIterations much larger than needed for convergence, some may well be used to adjust barrier parameters, etc.; for small MaxIterations, the limit is obeyed. – Oleksandr R. May 17 '13 at 02:49

Anyway, I'll take a look at the diagnostics and report back what I find; thanks very much for the suggestion! – Adam May 17 '13 at 04:21

[...] CrossProbability is reversed relative to the literature: traditionally 1 retains the mutant vector while 0 keeps the target, but here it's the other way around. That is, a value of 0 (Mathematica's 1) yields a one-dimensional hill-climbing approach, assuming that Mathematica implements DE as described by Storn and Price rather than using this as a literal probability. (I did look at the code at one point, but I've now forgotten what it does.) [...] – Oleksandr R. May 18 '13 at 06:40

[...] NMinimize won't produce one if the global optimization doesn't converge but the later postprocessing does; both must fail for an error to be returned to the user. I suppose this is because global optimization algorithms like DE provide no useful error estimate, whereas local methods do, and thus the message is only really meaningful in the latter case. – Oleksandr R. May 18 '13 at 06:45