Tremendous bug using EstimatedDistribution (using NMinimize) in v8.0 and v10.2

Question

Let's define F4 and estimate the distribution of the data:

W = Import["https://pastebin.com/raw/kE49s1Fj", "Package"];
F4 = Log[W];

EstimatedDistribution[F4, MixtureDistribution[{0.65, 1 - 0.65},
                      {NormalDistribution[α, c], NormalDistribution[d, e]}]]

EstimatedDistribution[F4, MixtureDistribution[{0.65, 1 - 0.65},
                      {NormalDistribution[Subscript[μ, 4], Subscript[σ, 4]],
                       NormalDistribution[Subscript[ν, 4], Subscript[τ, 4]]}]]

EstimatedDistribution[F4, MixtureDistribution[{0.65, 1 - 0.65},
                      {NormalDistribution[a, c], NormalDistribution[d, e]}]]

The code gives three different results, in both version 8.0 and version 10.2. Only the last one is appropriate. To my knowledge, all three calls use NMinimize (maximum likelihood estimation) internally.

Some users have reported similar issues in other functions when variable/symbol names are changed. I thought that had been fixed in version 10?

What does version 11 do?

11.1.0 for Mac OS X. The last two agree closely (to many digits): MixtureDistribution[{0.65, 0.35}, {NormalDistribution[8.69107, 0.568114], NormalDistribution[5.62755, 2.06378]}]. The first one gives: MixtureDistribution[{0.65, 0.35}, {NormalDistribution[6.86847, 2.28485], NormalDistribution[8.646, 0.386817]}] — John Joseph M. Carrasco, Aug 24 '17 at 11:33
This is definitely about the lexicographic ordering of the variable names. Even without exotic variable names, the same dichotomy appears between D1 = EstimatedDistribution[Log[W], MixtureDistribution[{0.65, 1 - 0.65}, {NormalDistribution[a1, a2], NormalDistribution[a3, a4]}]] and D2 = EstimatedDistribution[Log[W], MixtureDistribution[{0.65, 1 - 0.65}, {NormalDistribution[a3, a4], NormalDistribution[a1, a2]}]] — John Joseph M. Carrasco, Aug 24 '17 at 11:39
So, it is looking better in 11.1. In 8 and 10 there is a huge difference. Can you try replacing 'a' by 'q' in the last one. — JHT, Aug 24 '17 at 11:41
{NormalDistribution[q, c], NormalDistribution[d, e]} ↦ {NormalDistribution[a3, a4], NormalDistribution[a1, a2]} === {NormalDistribution[α, c], NormalDistribution[d, e]} — John Joseph M. Carrasco, Aug 24 '17 at 11:46
This probably isn't considered a bug. This kind of problem doesn't have a single correct solution in many cases. When you change the variable names, you're effectively changing the expression being given to NMinimize and NMinimize can't guarantee that it returns the same result in that case - even if they're symbolically equal up to a variable name change. — Searke, Aug 24 '17 at 14:34
This function could be made to do what you want and give consistent results regardless of the variable names. I'm afraid, however, that this would mean removing any symbolic processing from the core of the function. That could have some serious downsides. — Searke, Aug 24 '17 at 14:40
That would mean the result may change on every run, which is not the case. It only changes when the variable names are changed. This is the worst thing that can happen in a CAS. — JHT, Aug 24 '17 at 15:45
I think it's a precision issue. Using WorkingPrecision -> 30 gets one pretty much the same answer. — JimB, Aug 24 '17 at 19:23
What exactly is your measure of bugginess? You keep using dramatic words like "tremendous" and "huge", but these adjectives provide no real information beyond the fact that you think it is a bug (which, according to the comments, is not true in this case). — István Zachar, Aug 25 '17 at 12:10
It doesn't matter what I think; it is provably a bug. It's not about that particular function. If you are doing science, you rely on software. Since you cannot cross-check every computation, you have to trust your software, and that trust is massively undermined by behavior like that shown here and in the link. I'm not aware of similarly big bugs in other software. — JHT, Aug 25 '17 at 13:00

Bob Hanlon · Answer 1 · 2017-08-25T03:12:56.870

$Version

(*  "11.1.1 for Mac OS X x86 (64-bit) (April 18, 2017)"  *)

W = ToExpression@Import["https://pastebin.com/raw/kE49s1Fj"];

F4 = Log[W];

The default ParameterEstimator for EstimatedDistribution is "MaximumLikelihood":

Options[EstimatedDistribution, ParameterEstimator]

(*  {ParameterEstimator -> "MaximumLikelihood"}  *)

With this default option the first case differs from the last two.

(dist=EstimatedDistribution[F4,
    MixtureDistribution[{0.65, 1 - 0.65},
     {NormalDistribution @@ #[[1]],
      NormalDistribution @@ #[[2]]}]] & /@
  {{{α, c}, {d, e}},
   {{Subscript[μ, 4], Subscript[σ, 4]},
    {Subscript[ν, 4], Subscript[τ, 4]}},
   {{a, c}, {d, e}}}) // Column
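A comment on the question attributes the difference to the lexicographic ordering of the variable names. The following is a minimal sketch for checking that idea. It assumes (nothing in this thread confirms it) that the fitting code puts the parameters into canonical order at some point. The list roles and the names paramSets and roleOrder are made up for illustration.

```mathematica
(* The three parameter namings used above, each listed as
   {mean1, sd1, mean2, sd2}. *)
paramSets = {
   {α, c, d, e},
   {Subscript[μ, 4], Subscript[σ, 4], Subscript[ν, 4], Subscript[τ, 4]},
   {a, c, d, e}};

(* Hypothetical labels for the role each position plays in the mixture. *)
roles = {"mean1", "sd1", "mean2", "sd2"};

(* For each naming, list the roles in the canonical (Sort) order
   of the corresponding symbols. *)
roleOrder = roles[[Ordering[#]]] & /@ paramSets
```

If the three lists in roleOrder differ, a routine that works on canonically sorted variables would see the means and standard deviations in a different sequence for each naming, which would be consistent with landing on different local optima.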

EDIT: With the ParameterEstimator option set as suggested by @JimBaldwin, the results are equivalent across the three cases:

(dist = N[EstimatedDistribution[F4, 
       MixtureDistribution[{0.65, 1 - 0.65}, {NormalDistribution @@ #[[1]], 
         NormalDistribution @@ #[[2]]}], 
       ParameterEstimator -> {Automatic, 
         Method -> {Automatic, WorkingPrecision -> 30, 
           MaxIterations -> 300}}]] & /@ {{{α, c}, {d, 
       e}}, {{Subscript[μ, 4], 
       Subscript[σ, 4]}, {Subscript[ν, 4], 
       Subscript[τ, 4]}}, {{a, c}, {d, e}}}) // Column
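Another way to make the fit less dependent on the internal search is to pass explicit starting values, using the documented form EstimatedDistribution[data, dist, {{p, p0}, {q, q0}, ...}]. The sketch below runs on synthetic data, not on the data from the question. The helper fit, the seed, the sample size, and the starting values are all assumptions chosen for the example.

```mathematica
(* Synthetic stand-in data: a two-component normal mixture with
   the same fixed weights as in the question. *)
SeedRandom[1];
data = RandomVariate[
   MixtureDistribution[{0.65, 0.35},
    {NormalDistribution[8.7, 0.6], NormalDistribution[5.6, 2.0]}],
   2000];

(* Hypothetical helper: fit the mixture with the given four symbols,
   always starting the search from the same point. *)
fit[{m1_, s1_, m2_, s2_}] := EstimatedDistribution[data,
   MixtureDistribution[{0.65, 0.35},
    {NormalDistribution[m1, s1], NormalDistribution[m2, s2]}],
   {{m1, 8.5}, {s1, 0.5}, {m2, 5.5}, {s2, 2.0}}];

(* Same model, two different parameter namings. *)
fits = fit /@ {{α, c, d, e}, {a, c, d, e}}
```

Because both calls start from the same point, one would expect the two fits to agree closely, although that is not guaranteed.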

Legended[
 Show[
  Histogram[F4, Automatic, "PDF"],
  SmoothHistogram[F4,
   PlotStyle -> {{Blue, Thick}}],
  Plot[PDF[dist[[1]], x], {x, 0, 11},
   PlotRange -> All,
   PlotStyle -> {{Red, Thick}}],
  PlotLabel -> Style[dist[[1]], Bold],
  ImageSize -> Large,
  Epilog -> Inset[
    DistributionFitTest[F4, dist[[1]], {"TestDataTable", All}] //
      Rasterize // Image,
    {3, 0.35}]],
 Placed[
  LineLegend[
   {Directive[Blue, Thick], Directive[Red, Thick]},
   {"SmoothHistogram", "PDF"}],
  {0.3, 0.3}]]
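Which of two candidate fits is better in the maximum likelihood sense can be checked numerically by comparing their log-likelihoods on the same data; the larger value wins. The following is a minimal sketch. It uses a synthetic stand-in sample (an assumption, not the data from the question) and the two parameter sets reported for 11.1 in the comments on the question. The names compareFits, fitA, fitB, and sample are hypothetical.

```mathematica
(* Hypothetical helper: log-likelihood of each candidate on the same data. *)
compareFits[fits_List, data_List] := LogLikelihood[#, data] & /@ fits;

(* The two results reported for version 11.1 in the comments. *)
fitA = MixtureDistribution[{0.65, 0.35},
   {NormalDistribution[8.69107, 0.568114],
    NormalDistribution[5.62755, 2.06378]}];
fitB = MixtureDistribution[{0.65, 0.35},
   {NormalDistribution[6.86847, 2.28485],
    NormalDistribution[8.646, 0.386817]}];

(* Stand-in sample, for illustration only. *)
SeedRandom[1];
sample = RandomVariate[NormalDistribution[8, 2], 200];

ll = compareFits[{fitA, fitB}, sample]
```

With the variables defined in this answer, the corresponding call would be compareFits[dist, F4].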

Unfortunately, this favors the very bad solution. The case a, c, d, e provides a much better fit. — JHT, Aug 24 '17 at 15:42
Explicitly giving the defaults (that sounds kinda circularly redundant?) and including WorkingPrecision and MaxIterations seems to make it give consistent results: ParameterEstimator -> {Automatic, Method -> {Automatic, WorkingPrecision -> 30, MaxIterations -> 300}}. — JimB, Aug 25 '17 at 02:10
For whatever it's worth my comment was not intended to be a correction but rather just an observation that might lead someone much more knowledgeable than me as to what might be causing the issue. Just putting in WorkingPrecision -> 30 gives one 3 different answers. Another oddity is that the default MaxIterations for NMaximize is 100 and no warnings are given if MaxIterations isn't included or if MaxIterations -> 100. However, one does get warnings of failure to converge if one puts in MaxIterations -> 150 for two of the parameterizations. — JimB, Aug 25 '17 at 03:24
Okay, the above solution also works in v8, although the running time is much longer. But it does not address the bug that the same code gives different results when other variable names are used, which violates a fundamental principle of such software. See also https://mathematica.stackexchange.com/questions/25182/variable-naming-changes-everything — JHT, Aug 25 '17 at 10:38
