I have a type of minimization problem that I frequently solve, which involves warping a large 2D triangle mesh (~25,000 vertices) to fit a model. The vertices of the mesh carry the empirical measurements, each of which has a predicted position in the model (a continuous field). The potential/objective function for the system is something like this:
potentialFn[{X0_List, triangles_List}, Xideal_List, X_List] := With[
  (* EdgeLengths and AngleValues are my own helpers: the per-edge
     lengths and per-triangle interior angles of the mesh *)
  {elens0 = EdgeLengths[X0, triangles],
   elens = EdgeLengths[X, triangles],
   angs0 = AngleValues[X0, triangles],
   angs = AngleValues[X, triangles]},
  Plus[
   (* squared deviations of edge lengths and angles from the reference mesh *)
   Total[(elens0 - elens)^2],
   Total[(angs0 - angs)^2],
   (* goal term: 0 when X == Xideal, increasing with per-vertex distance *)
   Total[1 - Exp[-Total[(X - Xideal)^2, {2}]]],
   (* barrier term: diverges as any angle approaches 0 *)
   0.25*Total[1 + (angs0/angs)^2 - 2*(angs0/angs)]]];
The idea is that the potential is the sum of the squared deviations in the distances between neighboring vertices (the edges), plus the sum of the squared deviations in the angles of the mesh, plus the goal function, which is 0 when the vertices are perfectly aligned with the model and otherwise increases monotonically with distance from an ideal fit. The final term is designed to make the potential approach infinity as any triangle approaches inversion; i.e., that term (which resembles models of the van der Waals potential) constrains the potential such that triangle ABC would have to cross a singularity in order to become triangle ACB (in terms of a counterclockwise ordering of vertices). Additionally, I have well-tested analytical functions that calculate the gradients (but nothing for the Hessians). I've confirmed that the gradient descent works correctly by, among other things, running a gradient-descent search with a very small step size for a very long time.
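To make the inversion condition concrete, here is a minimal sketch (not my production code; signedArea and validQ are hypothetical helper names) of the check that no triangle has flipped its counterclockwise orientation:

(* Signed area of a triangle {a, b, c}: positive exactly when the
   vertices are in counterclockwise order, negative once inverted *)
signedArea[{a_, b_, c_}] := Det[{b - a, c - a}]/2;

(* A coordinate configuration X is valid iff no triangle of the
   mesh has inverted, i.e., every signed area remains positive *)
validQ[X_, triangles_] := AllTrue[triangles, signedArea[X[[#]]] > 0 &];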
Most of those details are irrelevant to my question, however. What is important is that, for any set of vertex coordinates that is valid (i.e., no triangles have been inverted), I can calculate a maximum step size S such that for any actual step size s (0 < s < S) the resulting vertex coordinates will also be valid; as long as my step sizes always follow this rule, I can guarantee that no triangles will invert. The problem I have is that there doesn't seem to be a way for me to provide this information to Mathematica's line-search algorithm in functions like FindMinimum.
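For what it's worth, the rule itself is cheap to evaluate. Here is a rough sketch of how such an S can be computed for a candidate direction d (same shape as X), using signedArea from the sketch above; my actual implementation works from the analytic quadratic coefficients rather than calling Solve, and maxStepSize is a hypothetical name:

(* Along X + s d, each triangle's signed area is a quadratic polynomial
   in s, so its first positive real root is the step at which that
   triangle degenerates; S is the minimum over all triangles,
   scaled by a safety factor to stay strictly inside the valid region *)
maxStepSize[X_, d_, triangles_, safety_ : 0.9] :=
 Module[{s, roots},
  roots = N@Flatten@Table[
      s /. Solve[signedArea[X[[tri]] + s d[[tri]]] == 0, s],
      {tri, triangles}];
  roots = Re@Select[roots, NumericQ[#] && Chop[Im[#]] == 0 && Re[#] > 0 &];
  If[roots === {}, Infinity, safety Min[roots]]]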
Initially, I thought that something like this would be the solution:
FindMinimum[
  potentialFn[{X0, triangles}, Xideal, X],
  {X, X0},
  Gradient :> potentialGradient[{X0, triangles}, X],
  Method -> {
    "ConjugateGradient",
    "StepControl" -> {
      "LineSearch",
      "MaxRelativeStepSize" :> CalculateMaxRelativeStepSize[{X0, triangles}, X]}}]
This, however, gives me an error (FindMinimum::lsmrss, with message "-- Message Text not found -- (CalculateMaxRelativeStepSize[{X0, triangles}, X])") that I can only assume is due to FindMinimum's inability to interpret the delayed rule. I've spent a lot of time looking through the Mathematica documentation on conjugate-gradient and related unconstrained-search methods and have found nothing indicating that I can control the step size, aside from setting a permanent step-size limit relative to the norm of the total gradient. Unfortunately, that is fairly useless in this kind of case: I can use it, but it results in a very slow search.
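For reference, the slow fallback I'm currently using looks roughly like this (a sketch; potentialGradient and CalculateMaxRelativeStepSize are my own helpers, as above): restart FindMinimum every few iterations with a fixed cap recomputed from the current configuration.

Xcur = X0;
Do[
 (* recompute a conservative fixed cap from the current configuration *)
 smax = CalculateMaxRelativeStepSize[{X0, triangles}, Xcur];
 (* run a few conjugate-gradient iterations under that fixed cap;
    Quiet suppresses the expected max-iterations warning *)
 {fmin, sol} = Quiet[
   FindMinimum[
     potentialFn[{X0, triangles}, Xideal, X],
     {X, Xcur},
     Gradient :> potentialGradient[{X0, triangles}, X],
     MaxIterations -> 10,
     Method -> {"ConjugateGradient",
       "StepControl" -> {"LineSearch", "MaxRelativeStepSize" -> smax}}],
   FindMinimum::cvmit];
 Xcur = X /. sol,
 {50}]

This keeps all triangles valid, but the constant restarts and the overly conservative fixed cap make convergence very slow.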
My question is this: are there existing (undocumented?) methods for providing Mathematica's line-search method with a way to calculate a maximum gradient step-size?
Note: I realize I haven't provided a minimal working example of this problem. I can do so, but this would be quite an undertaking as there is a lot of context around the specific problem instances --- if anyone believes that they can help me with this kind of optimization given an example, I will work on this, but I'd appreciate some indication that the work of translating the problem into a compact instance won't be for naught before I do it.
There is an extra ) in the last line of potentialFn. Also, potentialGradient and CalculateMaxRelativeStepSize are undefined. EdgeLengths and AngleValues also appear to be undefined or misused. – bbgodfrey Aug 14 '16 at 03:55