I want to assess the goodness of a fit. Currently I use NonlinearModelFit:

nlmSimple = 
NonlinearModelFit[data, model, {{a, 20000}, {k1, 300}, {b, 20000}}, t, Weights -> 1/dataErr^2, VarianceEstimatorFunction -> (1 &)];

with $data = \{\{x_1, y_1\},...,\{x_n,y_n\}\}$ and $dataErr = \{w_1,...,w_n\}$. Here $w_n$ is the standard deviation of the $n$-th distribution (assumed to be normal) from which $y_n$ is drawn, so each data point is drawn from a distribution with a different width. This works fine, but I would like a chi-square test for the overall goodness of the fit (or an equivalent: I am used to chi-square, but I know there are other goodness-of-fit tests, so I am open to any proposal).

I use the function from "Performing a chi-square goodness of fit test", to which I added the degrees of freedom:

pearsonTest[obs_, exp_, dof_] /; Length[obs] == Length[exp] := 
Block[{t}, t = Total[(obs - exp)^2/exp] // N;
{t/(Length[exp] - dof), 
SurvivalFunction[ChiSquareDistribution[Length[exp] - dof], t]}];

with $obs = \{y_1,...,y_n\}$ and $exp= \{model[x_1],...,model[x_n]\}$. But this does not take into account the standard deviation of $y_n$ and gives the same $\chi^2$ for any set of $w_n$. $model$ is a function that can be anything from a simple exponential to a "complicated" function with many parameters.

So my question: does NonlinearModelFit include a built-in tool for the overall goodness of the fit that I can use (I looked at the properties of the fitted model, but what I used only gives the parameter errors)? And if not, how can weighted data be included in a Pearson test (so this is more of a mathematical problem)?

Dalnor
  • Using $(o-e)^2/e$ doesn't make any sense in a regression model. Maybe you're thinking about the following: https://en.wikipedia.org/wiki/Reduced_chi-squared_statistic. I suggest asking the question first on CrossValidated (https://stats.stackexchange.com/) and then coming back here for implementation. – JimB Jun 18 '19 at 18:39
  • OK, I thought about it a bit, and you are right: it doesn't make sense, since the underlying assumption is that the variable $o$ follows a Poisson distribution (which is true in my case) with mean $e$ and variance $e$ (which is not true). In my case the variance is $w_i^2$, so I should use $(o_i-e_i)^2/w_i^2$. I will ask on CrossValidated as you suggest (I edited the first post to add a few details about the problem). – Dalnor Jun 19 '19 at 08:51

1 Answer

OK, I solved the mathematical part of the problem. Since each $y_i$ follows a normal distribution with standard deviation $\sigma_i = w_i$, I need to use $\chi^2= \sum_{i=1}^n (f(x_i)- y_i)^2/w_i^2$ instead of $\chi^2= \sum_{i=1}^n (f(x_i)- y_i)^2/f(x_i)$ (which assumes $\sigma_i^2 = f(x_i)$).

(You can find it in, I think, any data analysis book, for example Fundamental Numerical Methods and Data Analysis by George W. Collins II.)
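A minimal sketch of this weighted statistic, adapted from the pearsonTest function in the question. The name weightedChiSquareTest and the arguments err (the list of $w_i$) and nParams (the number of fitted parameters) are made up for illustration; like pearsonTest, it returns the reduced $\chi^2$ and the corresponding p-value.

```mathematica
(* Weighted chi-square: each residual is scaled by its own standard
   deviation err = {w_1, ..., w_n} instead of by the expected value.
   nParams (hypothetical name) is the number of fitted parameters. *)
weightedChiSquareTest[obs_, exp_, err_, nParams_] /;
   Length[obs] == Length[exp] == Length[err] :=
  Module[{chi2, dof},
   dof = Length[obs] - nParams;            (* degrees of freedom *)
   chi2 = N@Total[((obs - exp)/err)^2];    (* sum of squared scaled residuals *)
   {chi2/dof, SurvivalFunction[ChiSquareDistribution[dof], chi2]}];

(* Toy numbers: every residual is exactly one standard deviation,
   so chi2 = 4 and the reduced value is 4/3 with 3 degrees of freedom *)
weightedChiSquareTest[{1.1, 1.9, 3.2, 3.9}, {1, 2, 3, 4}, {0.1, 0.1, 0.2, 0.1}, 1]
```

Unlike the unweighted version, the result here changes when the $w_i$ change, which is the behaviour asked for in the question.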

But it is strange that Mathematica doesn't provide a built-in tool for that, no? For example, a property of the FittedModel output.
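One way to build the statistic from the FittedModel itself is its "FitResiduals" property, which returns $y_i - f(x_i)$. The sketch below uses synthetic data and a simple exponential model, both invented purely for illustration (the names xs, truth and ys are not from my real problem), with the same Weights and VarianceEstimatorFunction settings as in the question.

```mathematica
SeedRandom[1];
(* Synthetic, illustrative data: exponential decay with point-dependent noise *)
xs = Range[0., 10., 0.5];
truth = 100. Exp[-xs/3.] + 10.;
dataErr = 1. + 0.5 Sqrt[truth];   (* assumed standard deviations w_i *)
ys = truth + (RandomVariate[NormalDistribution[0, #]] & /@ dataErr);
data = Transpose[{xs, ys}];

nlm = NonlinearModelFit[data, a Exp[-t/k1] + b,
   {{a, 90}, {k1, 2}, {b, 5}}, t,
   Weights -> 1/dataErr^2, VarianceEstimatorFunction -> (1 &)];

(* Residuals scaled by the known standard deviations *)
chi2 = Total[(nlm["FitResiduals"]/dataErr)^2];
dof = Length[data] - Length[nlm["BestFitParameters"]];

(* Reduced chi-square and p-value *)
{chi2/dof, SurvivalFunction[ChiSquareDistribution[dof], chi2]}
```

For a model consistent with the data and correctly estimated $w_i$, the reduced value should come out near 1.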

Dalnor
  • That is a good book but definitely a bit dated and doesn't cover generalized linear models or generalized linear mixed models. This answer mentions that your data follows a Poisson distribution but a comment in your original post says that it does not. You might consider looking at Mathematica's GeneralizedLinearModelFit, get a more up-to-date textbook written by a statistician, and wean yourself off of thinking only in terms of $\chi^2$ statistics. – JimB Jun 19 '19 at 15:43
  • OK, I will look at GeneralizedLinearModelFit. About the distribution: I edited my posts. Originally this is a Poisson distribution (count data), but it can be approximated by a normal distribution (high count rate), and once I apply systematic corrections and errors I assume a normal distribution (error propagation). So $y_i$ follows a normal distribution of width $w_i$. – Dalnor Jun 20 '19 at 08:46