I'm trying to fit a transformed (actually shifted to the right) Coxian distribution to some data ( if needed an example set is here Sim data ). I've tried a number of ways for this:
FindDistributionParametersin my case yields the worst results.FindFitto fit the PDF of the distribution to aHistogramListworks the best atmNMinimizethe distance between the distribution moments to data moments
I as well tried to combine those methods by providing the starting values (and here random starting values work a little bit better than some meaningfull values obtained via FindInstance).
My question is how to enchance my solutions and why some of them do not work like others?
Here's something I tried: (Off-topic question - how can I hide some part of the post for it not to be long?)
The first approach is with direct FindDistributionParameters
simData =
Import["fitData.txt", "Table"];
phasesToUse = 3;
distrShift = 10;
distrStart = 0;
Clear[coxModel];
coxModel[p_] := coxModel[p] = TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha][j], {j, 1, p - 1}],
Table[\[Lambda][j], {j, 1, p}]
]
];
Clear[coxModelPDF];
coxModelPDF[p_] := coxModelPDF[p] = PDF[
coxModel[p],
x
];
resFindParam = FindDistributionParameters[
simData,
coxModel[phasesToUse]
];
resFindParam
Show[{
Histogram[simData, {0.5}, "PDF"],
Plot[
(*Evaluate[*)coxModelPDF[x](*]*),
{x, distrShift, 30}
]
}]
As you can see, FindDistributionParameters failed horribly here. Think it found some local minimum and failed to find anything better.
Next I tried to supply an initial guess by finding a point that fits the first moment of data:
initVals = FindInstance[
Evaluate[And @@ Join[
{0.9 <= Moment[coxModel[phasesToUse], 1]/
Moment[simData[[All, pointToApprox]], 1] <= 1.1},
{coxModelPDF[phasesToUse, distrShift] == distrStart},
Table[0 <= \[Alpha][i] <= 1, {i, 1, phasesToUse - 1}],
Table[\[Lambda][i] >= 0, {i, 1, phasesToUse}]
]],
Join[
Table[\[Alpha][i], {i, 1, phasesToUse - 1}],
Table[\[Lambda][i], {i, 1, phasesToUse}]
]
][[1]] /. {\[Alpha] -> \[Alpha]I, \[Lambda] -> \[Lambda]I};
resFindParam = FindDistributionParameters[
simData[[All, pointToApprox]],
coxModel[phasesToUse],
Join[
Table[{\[Alpha][i], \[Alpha]I[i]}, {i, 1, phasesToUse - 1}],
Table[{\[Lambda][i], \[Lambda]I[i]}, {i, 1, phasesToUse}]
] /. initVals
];
Unfo, in this case it repeats the previous case or just almost infinitely tries to find parameters.
Then I tried to NMinimize the distance between at least the first two moments and then to feed it to FindDistributionParameters:
simData =
Import["fitData.txt", "Table"];
momentsToMin = 2;
phasesToUse = 3;
distrShift = 10;
Clear[simMoment];
simMoment[n_, i_] := simMoment[n, i] = Moment[
simData,
i
];
Clear[coxMoment];
coxMoment[p_, i_] := coxMoment[p, i] = Moment[
TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha]T[i, j], {j, 1, p - 1}],
Table[\[Lambda]T[i, j], {j, 1, p}]
]
],
i
];
tmpEnum = 0;
minRes = NMinimize[
{
Evaluate[Plus @@ Table[
SquaredEuclideanDistance[
Join[Table[\[Alpha][i], {i, 1, phasesToUse - 1}],
Table[\[Lambda][i], {i, 1, phasesToUse}]],
Join[Table[\[Alpha]T[k, j], {j, 1, phasesToUse - 1}],
Table[\[Lambda]T[k, j], {j, 1, phasesToUse}]]
],
{k, 1, momentsToMin}
]],
Evaluate[And @@ Join[
{PDF[
TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha][i], {i, 1, phasesToUse - 1}],
Table[\[Lambda][i], {i, 1, phasesToUse}]
]
],
distrShift
] == 0},
Table[
coxMoment[phasesToUse, k] == simMoment[pointToApprox, k],
{k, 1, momentsToMin}
],
Table[0 < \[Alpha][i] < 1, {i, 1, phasesToUse - 1}],
Table[\[Lambda][i] > 0.01, {i, 1, phasesToUse}],
Flatten[
Table[0 < \[Alpha]T[k, j] < 1, {j, 1, phasesToUse - 1}, {k, 1,
momentsToMin}]],
Flatten[
Table[\[Lambda]T[k, j] > 0.01, {j, 1, phasesToUse}, {k, 1,
momentsToMin}]]
]]
},
Join[
Table[\[Alpha][i], {i, 1, phasesToUse - 1}],
Table[\[Lambda][i], {i, 1, phasesToUse}],
Flatten[
Table[\[Alpha]T[k, j], {j, 1, phasesToUse - 1}, {k, 1,
momentsToMin}]],
Flatten[
Table[\[Lambda]T[k, j], {j, 1, phasesToUse}, {k, 1,
momentsToMin}]]
],
EvaluationMonitor :> (tmpEnum++;
If[Mod[tmpEnum, 200] == 0, Print[tmpEnum];];)
];
distrRes = FindDistributionParameters[
simData,
TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha]F[i], {i, 1, phasesToUse - 1}],
Table[\[Lambda]F[i], {i, 1, phasesToUse}]
]
],
Join[Table[{\[Alpha]F[i], \[Alpha][i]}, {i, 1, phasesToUse - 1}],
Table[{\[Lambda]F[i], \[Lambda][i]}, {i, 1, phasesToUse}]] /.
minRes[[2]]
];
minRes
distrRes
Show[{
Histogram[simData, {0.5}, "PDF"],
Plot[
Evaluate[PDF[
TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha][i], {i, 1, phasesToUse - 1}],
Table[\[Lambda][i], {i, 1, phasesToUse}]
]
] /. minRes[[2]],
x
]],
{x, distrShift, 30}
],
Plot[
Evaluate[PDF[
TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha]F[i], {i, 1, phasesToUse - 1}],
Table[\[Lambda]F[i], {i, 1, phasesToUse}]
]
] /. distrRes,
x
]],
{x, distrShift, 30},
PlotStyle -> Red
]
}]
The result is:
Where the blue line is for the parameters after NMinimize and the red is for FindDistributionParameters after that.
This is better now, but still not the best thing. Yet it is possible to obtain better results.
The last thing I tried is to directly fit the PDF to the HistogramList
simData =
Import["fitData.txt", "Table"];
phasesToUse = 3;
distrShift = 10;
distrStart = 0;
histStep = 0.5;
tmpFitData =
HistogramList[simData, {histStep}, "PDF"];
fitData =
Join[{{tmpFitData[[1, 1]], 0}},
Table[{(tmpFitData[[1, i + 1]] + tmpFitData[[1, i]])/2,
tmpFitData[[2, i]]}, {i, 1, Length[tmpFitData[[2]]]}]];
Clear[coxModel];
coxModel[p_] := coxModel[p] = TransformedDistribution[
x + distrShift,
x \[Distributed] CoxianDistribution[
Table[\[Alpha][j], {j, 1, p - 1}],
Table[\[Lambda][j], {j, 1, p}]
]
];
Clear[coxModelPDF];
coxModelPDF[p_, x_] := coxModelPDF[p, x] = PDF[
coxModel[p],
x
];
fitPars = FindFit[
fitData,
{
coxModelPDF[phasesToUse, x],
Evaluate[And @@ Join[
{coxModelPDF[phasesToUse, distrShift] == distrStart},
Table[0 < \[Alpha][i] < 1, {i, 1, phasesToUse - 1}],
Table[\[Lambda][i] > 0.01, {i, 1, phasesToUse}](*,
Flatten[Table[(\[Lambda][i]/\[Lambda][
j]\[LessEqual]0.95||\[Lambda][i]/\[Lambda][
j]\[GreaterEqual]1.05),{i,1,phasesToUse},{j,i+1,
phasesToUse}]]*)
]]
},
Join[
Table[\[Alpha][i], {i, 1, phasesToUse - 1}],
Table[\[Lambda][i], {i, 1, phasesToUse}]
],
x(*,
MaxIterations\[Rule]1000*),
Method -> NMinimize(*,
NormFunction\[Rule](Norm[#, 2] & )*)
];
fitPars
Show[{
Histogram[simData, {0.5}, "PDF"],
Plot[
Evaluate[coxModelPDF[phasesToUse, x] /. fitPars],
{x, distrShift, 30}
]
}]
This is the best I could obtain till now. But I'm sure that there is a way to find a better fit and/or improve my codes.
Any feedback is welcome and sorry for the long post.












ParameterEstimatorsetting? – J. M.'s missing motivation Mar 04 '16 at 18:52ParameterEstimatorforFindDistributionParameters. For exampleParameterEstimator -> {"MethodOfMoments", MaxIterations -> 500, AccuracyGoal -> 8}ignores the distribution assumptions. And other come to the same result - flat line or ignore assumptions as well. – Andrew S. Mar 04 '16 at 19:00p=2, use only 100 sample points (rather than 20,000) to get things working, useFindDistributionParameterswithout the shift parameter to get starting values, and finally useFindDistributionParameterswith all parameters (including the shift). Then when that's working write the code for the general case. – JimB Mar 04 '16 at 19:54FindFit) like the plague. (FindFit for least-squares estimation is fine. Just not when you're estimating a density function from a random sample.) – JimB Mar 04 '16 at 19:58ParameterEstimator -> {"MaximumLikelihood", Method -> "NMaximize"}does a semi-decent job here. The default estimator is failing quite badly, however. – Stefan R Mar 04 '16 at 21:33FindDistributionParameters, but I can't get it working well. THe worst thing I noticed is that you can't (or can you?) specify any constranits, for example assume that the resulting PDF must be 0 when x=0. Even thoughFindFitis not recommended, it yields the best results atm.@StefanR Somehow it likes to ignore distribution assumtions as well...
@AntonAntonov Yeah, changed it, as I really am interested in just CoxianDistribution.
– Andrew S. Mar 05 '16 at 03:42FindDistributionParametersin some cases ignore assumptions? I as well noticed that when different lambdas are almost equal, fitting functions fail to go any further (finding some local extremum there?). What can I do in this case? Like explicitly specify in constraints (when aplicable) that their ratio should differ for at least 5%? – Andrew S. Mar 05 '16 at 03:47