7

I'm trying to replicate a graph I created in Desmos over here,

with the following code in Overleaf

\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.7}

% https://tex.stackexchange.com/questions/198572/tikz-binomial-distribution

\begin{document}
\begin{tikzpicture}[
    declare function={binom(\x,\n,\p)=\n!/(\x!*(\n-\x)!)*\p^\x*(1-\p)^(\n-\x);},
    declare function={normd(\x,\n,\p)=binom(\x*\n,\n,\p);}
]

\begin{axis}
    \addplot[cyan, domain=0:1]{normd(x, 15, 0.7)};
\end{axis}

\end{tikzpicture}
\end{document}

However, the resulting graph is not as smooth (and rather jaggy): jagged

I've tried to increase the number of samples, samples=500 and samples at={0, 0.001, ..., 0.999, 1}, but both still can't replicate the granularity that Desmos produces.

Are there just some functions which are hard to be plotted through pgfplot?

Sentient
  • 275
  • You could just add smooth to the plot. However, then the plot is no longer accurate, i.e. the interpolation will cause the plot to deviate from the true values (a bit). I guess that the problem is that factorial is only defined for integers. You may want to use the Gamma function for a continuous interpolation. –  Sep 05 '18 at 22:27
  • Per your suggestion, I tried adding smooth into the axis parameters, but that only results in the jagged edges to being smoothed but does not create the normalized binomial plot I'm trying to create :( – Sentient Sep 05 '18 at 22:29
  • Yes, but my comment had a second part. ;-) –  Sep 05 '18 at 23:00

2 Answers2

8

I get a very different result from Ruixi if I use the well-known relation Gamma(n+1)=n! to provide a continuous version of factorial. However, the result seems to match your Desmos curve very accurately, so I think it is correct. The definition of the Gamma function is from this answer, and I used it to provide a continuous generalization of factorial here, when encountering the same problem.

\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.16}

% https://tex.stackexchange.com/questions/198572/tikz-binomial-distribution

\begin{document}
\begin{tikzpicture}[% gamma definition from https://tex.stackexchange.com/a/120449/121799
    declare function={gamma(\z)=(2.506628274631*sqrt(1/\z) + 0.20888568*(1/\z)^(1.5) +
    0.00870357*(1/\z)^(2.5) - (174.2106599*(1/\z)^(3.5))/25920 -
    (715.6423511*(1/\z)^(4.5))/1244160)*exp((-ln(1/\z)-1)*\z);
    smoothbinom(\x,\n,\p)=gamma(\n+1)/(gamma(\x+1)*gamma((\n-\x)+1))*pow(\p,\x)*(1-\p)^(\n-\x);
    smoothnormd(\x,\n,\p)=smoothbinom(\x*\n,\n,\p);
    }]

\begin{axis}
    \addplot[blue, domain=0:1,samples=100,smooth]{smoothnormd(x, 15, 0.7)};
\end{axis}
\end{tikzpicture}
\end{document}

enter image description here

CROSS CHECKS: I first check that the Gamma function has been implemented correctly and then draw your contour with it.

\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.16}

% https://tex.stackexchange.com/questions/198572/tikz-binomial-distribution

\begin{document}
\begin{tikzpicture}[
    declare function={binom(\x,\n,\p)=\n!/(\x!*(\n-\x)!)*\p^\x*(1-\p)^(\n-\x);
    normd(\x,\n,\p)=binom(\x*\n,\n,\p);
    gamma(\z)=(2.506628274631*sqrt(1/\z) + 0.20888568*(1/\z)^(1.5) +
    0.00870357*(1/\z)^(2.5) - (174.2106599*(1/\z)^(3.5))/25920 -
    (715.6423511*(1/\z)^(4.5))/1244160)*exp((-ln(1/\z)-1)*\z);
    smoothbinom(\x,\n,\p)=gamma(\n+1)/(gamma(\x+1)*gamma((\n-\x)+1))*pow(\p,\x)*(1-\p)^(\n-\x);
    smoothnormd(\x,\n,\p)=smoothbinom(\x*\n,\n,\p);
    }]

\begin{axis}
    \addplot[cyan, domain=1:10,samples=10,only marks,mark=*]{x!};
    \addplot[blue, domain=1:10]{gamma(x+1)};
\end{axis}
\end{tikzpicture}


\begin{tikzpicture}[
    declare function={binom(\x,\n,\p)=\n!/(\x!*(\n-\x)!)*\p^\x*(1-\p)^(\n-\x);
    normd(\x,\n,\p)=binom(\x*\n,\n,\p);
    gamma(\z)=(2.506628274631*sqrt(1/\z) + 0.20888568*(1/\z)^(1.5) +
    0.00870357*(1/\z)^(2.5) - (174.2106599*(1/\z)^(3.5))/25920 -
    (715.6423511*(1/\z)^(4.5))/1244160)*exp((-ln(1/\z)-1)*\z);
    smoothbinom(\x,\n,\p)=gamma(\n+1)/(gamma(\x+1)*gamma((\n-\x)+1))*pow(\p,\x)*(1-\p)^(\n-\x);
    smoothnormd(\x,\n,\p)=smoothbinom(\x*\n,\n,\p);
    }]

\begin{axis}
    \addplot[cyan, domain=0:1]{normd(x, 15, 0.7)};
    \addplot[blue, domain=0:1]{smoothnormd(x, 15, 0.7)};
\end{axis}
\end{tikzpicture}
\end{document}

enter image description here

As you see, the smoothed out version of your plot is very different from the one with factorial in, but factorial is per se not defined for non-integers.

  • 1
    Oh, amazingly it looks like this would reproduce the Desmos (whatever that is) curve. –  Sep 05 '18 at 22:47
  • Oof, I probably would've never though about using the Gamma function to smooth it out. Thanks! – Sentient Sep 05 '18 at 22:52
  • @Sentient This is a standard trick which (essentially) gave t'Hooft a Nobel prize. ;-) You need it in order to compute loop diagrams in 4-\epsilon dimensions, where \epsilon is a real number. –  Sep 05 '18 at 22:57
6

Added: Please note that the following discussions are based on my background in probability theory. In your example, binomial probabilities are defined only when \x*\n are integers; that is, these probabilities are defined only when \x is equal to one of the following 16 numbers

0, 1/15, 2/15, ... , 14/15, 15/15

Thus, in order to smoothly extend the graph, you have many options. These extensions include but are not limited to

  1. Analytical continuation via the gamma function. But, you should be aware of the fact that

    TeX is no Mathematica or MATLAB or Maple

    and therefore the gamma function is not available on the fly. So, @marmot provided an excellent answer in which the gamma function is hardcoded using its asymptotic expansions. Because it uses asymptotic expansions, there could be unexpected computation errors (but only if you choose extreme domain to be plotted).

  2. Use smooth in combination of the appropriate sample points. The smooth key basically interpolates between two points using splines. Usually, a twice continuously differentiable spline (a cubic spline) is used and it is good enough to fool the human eyes. This is what I propose in my new answer.

  3. Use Gaussian or normal density approximation. There is a Local Central Limit Theorem which states

    The discrete binomial probabilities can be approximated by a continuous Gaussian/normal density curve: The larger the \n*\p and \n*(1-\p), the better the approximation. (cf. Probability: Theory and Examples (4th ed.) by Rick Durrett, Sections 3.1 and 3.5).

    This is what I propose in my old answer.

Please note that care must be taken when doing factorial calculations! If you pass real numbers into binom, then be prepared to get a “probability” bigger than one (cf. this question of mine). In your original graph, some “probabilities” seems to be bigger than two!

New answer

I just realized the OP wanted a “normalized binomial probability mass plot” instead of a “normal density approximation plot”. In this case, you can tell TikZ to draw only the points at which \x*\n are integers by specifying samples=\n+1. Remember to use integers for factorial calculations and to normalize in the coordinates afterward.

\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.7}

% https://tex.stackexchange.com/questions/198572/tikz-binomial-distribution

\begin{document} \begin{tikzpicture}[ declare function={binom(\x,\n,\p)=\n!/(\x!(\n-\x)!)\p^\x(1-\p)^(\n-\x);}, % declare function={normd(\x,\n,\p)=binom(\x\n,\n,\p);}% <- This is bad practice ]

\begin{axis} \addplot[cyan, samples at={0,1,...,15}, smooth](x/15,{binom(x, 15, 0.7)}); \end{axis}

\end{tikzpicture} \end{document}

binomial

Notice the top of the curve is around 0.2. Indeed, binom(11,15,0.7) = 0.218623131...


Old answer

Here, let’s use “Gaussian/normal approximation”. The approximating Gaussian/normal density takes the form

1/sqrt(2 * pi * n * p * (1-p)) * exp( - n * (x - p)^2 / (2 * p * (1-p)) )

Of course, this approximation works better if n*p and n*(1-p) are larger. If you are illustrating a normal approximation, then this would be your choice:

\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.7}

% https://tex.stackexchange.com/questions/198572/tikz-binomial-distribution

\begin{document} \begin{tikzpicture}[ declare function={binom(\x,\n,\p)=\n!/(\x!(\n-\x)!)\p^\x(1-\p)^(\n-\x);}, % declare function={normd(\x,\n,\p)=binom(\x\n,\n,\p);}% <- This is bad practice declare function={normaldensity(\x,\n,\p)=exp(-\n(\x-\p)^2/(2\p(1-\p)))/sqrt(2pi\n\p*(1-\p));} ]

\begin{axis} \addplot[cyan, samples at={0,1,...,15}, smooth](x/15,{binom(x, 15, 0.7)}); \addplot[orange, domain=0:1, smooth]{normaldensity(x, 15, 0.7)}; \end{axis}

\end{tikzpicture} \end{document}

normal

Here, the cyan curve is drawn using my new answer, and the orange curve is drawn using normal approximation.

Ruixi Zhang
  • 9,553
  • What formula is that, or how is it derived from the normalized binomial that I gave? – Sentient Sep 05 '18 at 22:34
  • @Sentient Answer updated. ;-) I assume your naming normd is referring to “normal density”, correct? – Ruixi Zhang Sep 05 '18 at 22:39
  • oh sorry, I just used normd as a shorten way to say normalized, since I wanted the normalized binomial. That was also a very good explanation of how you obtained the approximation -- thanks! :) – Sentient Sep 05 '18 at 22:51