
While working on an estimation problem, I created three Gaussian-like functions. The ML estimate can be seen as the argmax of the product of the three functions. In trying to get the code to run faster, I discovered the related functions NArgMax and FindArgMax. However, I cannot get NArgMax and FindArgMax to agree with the ArgMax solution. For reference, I believe the ArgMax solution is correct and the other two are not. Without pasting all of my code, here is an MWE that demonstrates the effect:

test1[x_, y_] := 1/.01^2*Exp[-(x^2 + y^2)/0.01^2]
test2[x_, y_] := 1/.1^2*Exp[-((x - 5)^2 + (y - 3)^2)/0.1^2]
ArgMax[test1[x, y] test2[x, y], {x, y}] // AbsoluteTiming
FindArgMax[
  test1[x, y]*test2[x, y], {{x, 2.5}, {y, 1.5}}] // AbsoluteTiming
NArgMax[test1[x, y] test2[x, y], {x, y}] // AbsoluteTiming

The results are

{0.0388864, {0.049505, 0.029703}}
{0.102196, {2.5, 1.5}}
{0.0687933, {1.02414*10^30, 8.98768*10^29}}

However, it is worth noting that FindArgMax issues the warning FindArgMax::fmgz: "Encountered a gradient that is effectively zero. The result returned may not be a maximum; it may be a minimum or a saddle point." NArgMax does not produce any warnings or errors.

I believe this warning may be the reason that neither FindArgMax nor NArgMax finds the proper solution: FindArgMax simply treats the initial estimate as the final solution (as it warns), but I do not know why NArgMax returns such large values. ArgMax, however, does correctly identify the maximum.
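
For reference, the maximizer of a product of two isotropic Gaussians has a simple closed form: a precision-weighted average of the two centres. A quick sanity check, using the widths and centres from the MWE and my hand-derived formula:

(* precision-weighted average of the two centres, using the widths 0.01 and 0.1 from the MWE *)
With[{s1 = 0.01, s2 = 0.1, c1 = {0, 0}, c2 = {5, 3}},
 (c1/s1^2 + c2/s2^2)/(1/s1^2 + 1/s2^2)]
(* {0.049505, 0.029703} *)

which matches the ArgMax output, so I take that result to be correct.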

Any help in resolving this would be much appreciated.

Michael Witt
  • How is AbsoluteTiming relevant to the question? – Taiki Nov 03 '15 at 03:36
  • It isn't really relevant to the question, it was more me trying to figure out which would be faster, if they were in agreement. I didn't remove it because AbsoluteTiming is well documented and I figured it didn't really matter. I can remove it, if you think it really detracts from the question. – Michael Witt Nov 03 '15 at 03:38
  • Your test function test1[x, y] * test2[x, y] produces ridiculously small numbers for any x and y. I guess none of the results are correct. – Taiki Nov 03 '15 at 03:44
  • No, I'm fairly certain that the result produced by ArgMax is correct. The size of the numeric solution shouldn't matter. If you let the variance of test1 equal the variance of test2, the ML estimate should be the midpoint of the two centroids, which is what it produces in that case; and FindArgMax also would 'find' that result, although that is its initial estimate. – Michael Witt Nov 03 '15 at 03:52
  • OK. My next guess is that since numerically test1[x, y] * test2[x, y] is very small, numerical methods by FindArgMax and NArgMax wouldn't work. Only an exact method done by ArgMax would. – Taiki Nov 03 '15 at 04:01
  • Also, you can change test2 to produce larger values overall by changing its center to (0.25,0.25), and I still do not get correct answers, e.g. test2[x_, y_] := 1/.1^2*Exp[-((x - .25)^2 + (y - .25)^2)/0.1^2] – Michael Witt Nov 03 '15 at 04:01
  • try using exact values (100^2 instead of 1/.01^2) and increasing WorkingPrecision – george2079 Nov 03 '15 at 04:01
  • Indeed NArgMax[test1[x, y] * test2[x, y], {x, y}, WorkingPrecision -> 10] for example helps even without using exact numbers. – Taiki Nov 03 '15 at 04:10
  • Ok. So changing WorkingPrecision does actually help make both NArgMax and FindArgMax work, assuming your estimate for FindArgMax is close enough. This is the first time that changing WorkingPrecision has actually helped me resolve a problem. Thanks! – Michael Witt Nov 03 '15 at 04:28
  • Well, on further review, even setting WorkingPrecision->2 works better than whatever the default is. Why would that be? – Michael Witt Nov 03 '15 at 04:31
  • I can't reproduce WorkingPrecision->2 producing a good result. (I wouldn't call a wrong result w/o warnings an improvement). Anyway is this a real problem? ArgMax works, use it... – george2079 Nov 03 '15 at 17:09

1 Answer


As pointed out by george2079 in the comments, it is advisable to use exact numbers in this case, so that WorkingPrecision can be set as high as needed.

test1[x_, y_] := 1/(1/100)^2*Exp[-(x^2 + y^2)/(1/100)^2]
test2[x_, y_] := 1/(1/10)^2*Exp[-((x - 5)^2 + (y - 3)^2)/(1/10)^2]
f[x_, y_] := test1[x, y]*test2[x, y]

Then, for example,

NArgMax[f[x, y], {x, y}, WorkingPrecision -> 10]

gives a result that agrees with that given by ArgMax.
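
For reference, with the exact definitions above ArgMax itself returns the maximizer as exact rationals, which makes the numeric results easy to check (the values in the comments are what I get from solving the quadratic exponent by hand):

ArgMax[f[x, y], {x, y}]
(* {5/101, 3/101} *)
N[%]
(* {0.049505, 0.029703} *)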

As for FindArgMax, things are a little trickier. Even

FindArgMax[f[x, y], {x, y}, WorkingPrecision -> 10000]

doesn't help. We can remedy this by supplying the gradient and Hessian explicitly:

grad = D[f[x, y], {{x, y}}]
hess = D[f[x, y], {{x, y}}, {{x, y}}]
FindArgMax[
  f[x, y],
  {x, y},
  Gradient -> grad, 
  Method -> {"Newton", Hessian -> hess},
  WorkingPrecision -> 10000
]

FindArgMax now gives a result that agrees with that given by ArgMax.
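
A related workaround, sketched here assuming the same exact f as above: since the objective is a product of exponentials, one can maximize its logarithm instead. The log-objective is just a quadratic plus a constant, so the extremely small function values that trip up the machine-precision methods never appear:

(* expand Log[f] symbolically so the tiny exponentials are never evaluated numerically *)
logf[x_, y_] = PowerExpand[Log[f[x, y]]];
NArgMax[logf[x, y], {x, y}]
FindArgMax[logf[x, y], {{x, 2.5}, {y, 1.5}}]

Since the argmax of Log[f] is the same as the argmax of f, both should agree with the ArgMax result without any WorkingPrecision settings.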

Taiki
  • That 'fix' for FindArgMax still requires ridiculously high WorkingPrecision. Simply changing the estimate is a simpler and easier way of getting FindArgMax to be in agreement. Furthermore, there is the issue that I pointed out, that even setting WorkingPrecision -> 2 for NArgMax allows it to find the solution properly. Why would that help? A precision of 2 digits is extremely low, yet it allows NArgMax to converge. – Michael Witt Nov 03 '15 at 04:42
  • I don't know why NArgMax works with low WorkingPrecision and FindArgMax requires high WorkingPrecision. What did you mean by saying that 'that fix ... still requires ridiculously high WorkingPrecision'? Does the fix not resolve your problem as asked for in the question? – Taiki Nov 03 '15 at 04:51
  • Well I mean, it works now. But the default WorkingPrecision is MachinePrecision, which is roughly 15.95. So setting WorkingPrecision to 2 shouldn't really help, yet it does. So, yes it resolves the problem, but only raises more questions. – Michael Witt Nov 03 '15 at 04:57
  • Indeed it's interesting to investigate further why that's the case. Also strange is that NArgMax[f[x, y], {x, y}, WorkingPrecision -> $MachinePrecision] gives the result that agrees with ArgMax, while NArgMax[f[x, y], {x, y}, WorkingPrecision -> MachinePrecision] doesn't! – Taiki Nov 03 '15 at 05:03