Unable to calculate binomial coefficients correctly in matlab

Question

I tried to compute the value of the following expression in Matlab:

$$ 2^l\sum_{i=0}^{n-1}{N\choose i}\varepsilon^i(1-\varepsilon)^{N-i} $$

where $N=16272$, $n=499$, $l=107$, $\varepsilon=0.02$.

I tried to use nchoosek(n, k) but I got warnings of “Result may not be exact. The coefficient has a maximum relative error of 4.2633e-14, corresponding to absolute error 1.147533136012604e+104. ” The result is NaN, which is obviously wrong.

Is the way I calculate the binomial coefficient wrong? And how can I solve this problem in Matlab?

It would help your Readers willing to respond to know why you are interested in this sum. The factor $2^\ell$ seems unrelated to the rest of the expression. The Matlab computing engine does not have (natively) an extended precision integer faculty, and its unclear if you want to get a numerical approximation or an exact result. — hardmath, May 19 '23 at 16:56
It appears that you're computing the cumulative binomial probability density. There are algorithms for this that are numerically stable but do not involve explicitly computing the sum in your question. The MATLAB statistics toolbox has a function binocdf() that can do this for you. — Brian Borchers, May 19 '23 at 17:26

helloworld922 · Answer 1 · 2023-05-19T18:57:39.873

The expression $$ \sum_{i=0}^{n-1} \binom{N}{i} \epsilon^i (1-\epsilon)^{N-i} $$ is the binomial cumulative distribution function. In Matlab, you can evaluate this as

binocdf(n-1,N,epsilon)

However, this will be basically 1. You can more accurately compute how close to 1 this will be by computing its complement:

binocdf(n-1,N,epsilon,'upper')

For me the latter evaluates to about $1.0947174122631063 \cdot 10^{-19}$. So your resulting answer will be essentially just $2^l$, at least as far as floating point precision can represent.

I computed the answer symbolically using Sympy, and using 100 billion digits precision the complement is about $1.09471741224404 \cdot 10^{-19}$.

score 2 · Answer 2 · answered May 19 '23 at 10:21

The numbers that you have mentioned will blow up the computations as the floating point arithmetic will be pushed to its limits which will result in high numerical errors.

For context, let us just consider $\epsilon^i(1-\epsilon)^{N-i}$. With $N = 16272, i=107$ and $1-\epsilon = 9.8 \times 10^{-1}$, the second term evaluates to $\simeq 9.8 \times 10^{-16000}$, which is clearly out of bounds for IEEE floating points in double precision so you get a zero which makes the whole expression zero or NaN depending on the floating point implementation.

Even with some extended precision libraries, I am not sure if you will be able to get an answer to your expression. I am just curious where this arises for you!

Federico Poloni · Answer 3 · 2023-05-19T16:22:05.663

1

If you are interested in bounding that quantity a priori, you can look at tail bounds. It rates to be very small, so you might find out it's even too small to fit in a Float64 double-precision value.

edited May 19 '23 at 16:22

answered May 19 '23 at 16:08

Federico Poloni

11,344
1
31
59

Unable to calculate binomial coefficients correctly in matlab

3 Answers3