To start with, the datasets and the corresponding values of K:
link3 = {679, 531, 379, 272, 198, 143, 99, 67, 46, 32, 22, 14, 9, 5, 3, 1};
klink3 = 20;
B1997 = {14, 10, 6, 3, 3, 6, 2, 3, 2, 1, 1, 1, 0, 3, 2, 0, 3, 1, 1, 1, 0, 0, 1, 2, 0, 0, 1};
kB1997 = 50;
taxicabsA = {142, 81, 49, 7, 3, 1};
ktaxicabsA = 10;
rabbits = {43, 16, 8, 6, 0, 2, 1};
krabbits = 18;
mylik[data_, kdata_] := Block[
 {K, mypdf, fj, pj, loglik},
 K = kdata;
 (* mixture pmf: both components share the Binomial[K, j] factor *)
 mypdf = ProbabilityDistribution[
   Binomial[K, j]*(
     w*p^j*(1 - p)^(K - j)
     +
     (1 - w)*Beta[alpha + j, beta + K - j]/Beta[alpha, beta]), {j, 0, K, 1}
   ];
 (* observed counts padded to length K, with the unknown f0 prepended at j = 0 *)
 fj = Prepend[PadRight[data, K], f0];
 pj = PDF[mypdf, #] & /@ Range[0, K];
 (* multinomial log-likelihood *)
 loglik =
  LogGamma[Total[fj] + 1] - Total[LogGamma[fj + 1]] + fj.Log[pj]
 ]
cons = {f0 > 0, alpha > 0, beta > 0, 0 < p < 1, 0 < w < 1};
pars = {f0, p, w, alpha, beta};
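As a sanity check (not part of the original question), one can verify that the mixture probabilities sum to one for admissible parameter values; the trial values below are arbitrary, chosen only for illustration:

```mathematica
Module[{K = 20, p = 0.3, w = 0.6, alpha = 2., beta = 5.},
 Total@Table[
   Binomial[K, j]*(w*p^j*(1 - p)^(K - j)
      + (1 - w)*Beta[alpha + j, beta + K - j]/Beta[alpha, beta]),
   {j, 0, K}]
 ] (* should return 1. up to rounding, since both components are proper pmfs *)
```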
We will first take a look at:
NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {"RandomSearch", "SearchPoints" -> Automatic}]
NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {"RandomSearch", "SearchPoints" -> 200}]
(* Automatic is better here; more search points do not help *)
NMaximize[{mylik[B1997,kB1997], cons}, pars, Method -> {"RandomSearch", "SearchPoints" -> Automatic}]
NMaximize[{mylik[B1997,kB1997], cons}, pars, Method -> {"RandomSearch", "SearchPoints" -> 200}]
(* Automatic is not good enough here; more search points help *)
How about some other methods?
Try with DifferentialEvolution:
OptMethod="DifferentialEvolution";
NMaximize[{mylik[B1997,kB1997], cons}, pars, Method -> {OptMethod, "SearchPoints" -> Automatic}]
NMaximize[{mylik[B1997,kB1997], cons}, pars, Method -> {OptMethod, "SearchPoints" -> 200}]
NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {OptMethod, "SearchPoints" -> Automatic}]
NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {OptMethod, "SearchPoints" -> 200}]
(* None of the above results is optimal *)
Try with SimulatedAnnealing:
OptMethod="SimulatedAnnealing";
NMaximize[{mylik[B1997,kB1997], cons}, pars, Method -> {OptMethod, "SearchPoints" -> Automatic}]
NMaximize[{mylik[B1997,kB1997], cons}, pars, Method -> {OptMethod, "SearchPoints" -> 200}]
NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {OptMethod, "SearchPoints" -> Automatic}]
NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {OptMethod, "SearchPoints" -> 200}]
(* None of the above results is optimal *)
So my main questions are:
- If I want to write a general routine (function) for different datasets, which method should I use? I don't want to use too many search points, and as seen above, more points are sometimes even worse.
- What is the proper way to set up a mixture distribution? In this example it is a mixture of a binomial and a beta-binomial distribution. I have tried to set up the probabilities in several other ways, such as:
pj = Table[Binomial[K, j]*(w*p^j*(1 - p)^(K - j) + (1 - w)*Beta[alpha + j, beta + K - j]/Beta[alpha, beta]), {j, 0, K}]
(* OR *)
mix = MixtureDistribution[{w, 1 - w}, {BinomialDistribution[K, p], BetaBinomialDistribution[alpha, beta, K]}];
pj = Table[PDF[mix, j], {j, 0, K}]
(* OR *)
pj = Table[w*PDF[BinomialDistribution[K, p], j] + (1 - w)*PDF[BetaBinomialDistribution[alpha, beta, K], j], {j, 0, K}]
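For what it's worth, these constructions should all give identical probabilities; a quick numerical comparison of the explicit formula against the built-in distributions (with arbitrary trial parameter values, chosen only for illustration) is:

```mathematica
Module[{K = 10, p = 0.3, w = 0.6, alpha = 2., beta = 5., pjA, pjB},
 (* explicit mixture formula *)
 pjA = Table[
   Binomial[K, j]*(w*p^j*(1 - p)^(K - j)
      + (1 - w)*Beta[alpha + j, beta + K - j]/Beta[alpha, beta]),
   {j, 0, K}];
 (* same mixture via built-in distributions *)
 pjB = Table[
   w*PDF[BinomialDistribution[K, p], j]
    + (1 - w)*PDF[BetaBinomialDistribution[alpha, beta, K], j],
   {j, 0, K}];
 Max[Abs[pjA - pjB]] (* should be ~ 0, i.e. machine precision *)
 ]
```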
I have even tried to compile the likelihood function, but it does not seem to help, either in speed or in the optimization result. In most cases, if I check the output of CompilePrint, there are still a lot of MainEvaluate calls; I think that is why there is no speed-up.
All I want is (hopefully) to speed up the optimization, settle at the global optimum using fewer "SearchPoints", and have a general way to adapt to the toy datasets. Any suggestion would be appreciated.
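One possible pattern for such a general routine is to run NMaximize with several candidate methods and keep the best result. This is only a sketch under my own assumptions, not a definitive recipe; the name `bestFit` is mine, and the DifferentialEvolution settings are just one plausible tuning:

```mathematica
bestFit[data_, kdata_] := Module[{methods, results},
  methods = {
    {"RandomSearch", "SearchPoints" -> Automatic},
    {"RandomSearch", "SearchPoints" -> 200},
    {"DifferentialEvolution", "ScalingFactor" -> 0.9, "CrossProbability" -> 0.05},
    {"SimulatedAnnealing"}};
  (* run each method, suppressing warnings from bad parameter regions *)
  results = Quiet@Table[
     NMaximize[{mylik[data, kdata], cons}, pars, Method -> m],
     {m, methods}];
  (* keep the run with the largest log-likelihood *)
  First@MaximalBy[results, First]
  ]
```

This trades extra runtime for robustness: no single method has to find the global optimum on every dataset, only one of them.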
Update:
A lot of the tips (tricks) say that built-in functions are better, so I tried this:
mylik3[data_, kdata_] := Block[
{K, mypdf, fj, pj, loglik},
K = kdata;
mypdf = ProbabilityDistribution[
w*PDF[BinomialDistribution[K,p],j]
+
(1 - w)*PDF[BetaBinomialDistribution[alpha,beta,K],j], {j, 0, K, 1}
];
fj = Prepend[PadRight[data, K], f0];
pj = PDF[mypdf, #] & /@ Range[0, K];
loglik =
LogGamma[Total[fj] + 1] - Total[LogGamma[fj + 1]] + fj.Log[pj]
];
But this is not better (or faster) at all.
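This can be checked directly by timing both versions at fixed trial parameter values (the values below are arbitrary, for illustration only):

```mathematica
Block[{f0 = 10., p = 0.3, w = 0.6, alpha = 2., beta = 5.},
 (* compare the explicit-formula and built-in-PDF likelihoods on the same data *)
 {First@AbsoluteTiming[mylik[link3, klink3]],
  First@AbsoluteTiming[mylik3[link3, klink3]]}
 ]
```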
Comments:
- "SearchPoints" has hardly any useful effect for differential evolution; modify "ScalingFactor" and "CrossProbability" instead. Try NMaximize[{mylik[B1997, kB1997], cons}, pars, Method -> {"DifferentialEvolution", "ScalingFactor" -> 0.90, "CrossProbability" -> 0.05}] and NMaximize[{mylik[link3, klink3], cons}, pars, Method -> {"DifferentialEvolution", "ScalingFactor" -> 0.90, "CrossProbability" -> 0.05}] ... – Oleksandr R. Dec 19 '14 at 23:09
- It would also help NMaximize if your function did not return nonsense values like Indeterminate for certain values of the parameters. – Oleksandr R. Dec 19 '14 at 23:10
- For taxicabsA, it returns a likelihood value of -16.3518. But I knew there is a better one, -16.3354, which gives a huge difference in the parameter values. I want to keep search points merely as a way to ensure I found the best global optimum. I spent a lot of time searching for different ways to produce mixture distributions because I don't know whether they make a difference in the optimization. If built-in functions are faster, would they be more efficient to use? – Chen Stats Yu Dec 19 '14 at 23:35
- The taxicabsA data should have these values: {alpha -> 5714973, beta -> 31443412, w -> 1 - 0.6243, f0 -> 180.3501, p -> 0.0317}, which give a target function value of -16.3354. – Chen Stats Yu Dec 19 '14 at 23:40
- ... the proper way to compute the likelihood function, so that I can use more search points (using RandomSearch). I think RandomSearch will explore a wider parameter space if more points are used, so there is a better chance of finding the global optimum? – Chen Stats Yu Dec 19 '14 at 23:46