7

I have distilled an error analysis problem into the following:

I have a multinomial distribution, $X$, consisting of $n$ independent trials where each trial takes on the values $\{0,1,\ldots,k-1\}$ with uniform probability of $\frac{1}{k}$.

Now I am interested in finding the distribution of the sum of all the outcomes in the $n$ trials, but not sure how to approach the problem. I suspect it has something to do with partition functions. Would be glad if the relevant probability distribution function in MATLAB could also be pointed out.

2 Answers2

5

Since you mentioned in a comment that $n=4$ in your case, here's a way to derive the distribution for small values of $n$.

The number of ways of writing $m$ as a sum of $n$ values from $0$ to $k-1$ is the coefficient of $x^m$ in

$$ (1+x+\dotso+x^{k-1})^n=\left(\frac{1-x^k}{1-x}\right)^n=(1+x+x^2+\dotso)^n\sum_{j=0}^n\binom nj(-x^k)^j\;. $$

Thus you just have to place the binomial coefficients in a sequence at distances of $k$ and then sum the sequence $n$ times; e.g. for $n=4$ and $k=3$:

$$ \begin{array}{rrrrrrrrr} 1&0&0&-4&0&0&6&0&0&-4&0&0&1\\ 1&1&1&-3&-3&-3&3&3&3&-1&-1&-1\\ 1&2&3&0&-3&-6&-3&0&3&2&1\\ 1&3&6&6&3&-3&-6&-6&-3&-1\\ 1&4&10&16&19&16&10&4&1 \end{array} $$

Then dividing through by the total number $k^n=81$ of possibilities gives you the probabilities for the values of the sum.

joriki
  • 238,052
4

If you have $n\geq 30$, you can apply the central limit theorem.

Assume $u_1,u_2,\cdots,u_n\sim {\cal{U}}[0,k-1]$ i.i.d., then we know for each $u_k$ its mean and variance are $$\mu={(k-1)\over 2} {\rm\,and\,} \sigma^2 = {k^2-1\over 12}.$$ Now construct the summation random variable $$X=\sum_{k=1}^n u_k$$ Then you can find the corresponding mean and variance of r.v. $X$ as $$\mu_X = n\mu {\rm\,and\,} \sigma^2_X = n\sigma^2$$ So if $n$ is sufficiently large, then the distribution of $X$ can be considered as a Gaussian, i.e. $$X\sim {\cal N}(n\mu,n\sigma^2)$$ Based on this information, I think it is easy for your to use Matlab to compute the CDF of $X$. The last step is to get the discrete PMF from this continuous CDF, which can be thought of the following form $$Pr(X=m) = CDF(m+.5)-CDF(m-.5)$$

pitfall
  • 551