
standard uniform distribution

The standard uniform distribution, $X \sim U(0, 1)$, has a probability density function (PDF):

$$ f(x) = \begin{cases} 1 \quad \text{if } 0 < x < 1\\ 0 \quad \text{otherwise} \end{cases} $$

$X \sim U(0, 1)$ has

  • mean $E(X) = \mu_X = 0.5$;
  • variance $V(X) = \frac 1 {12}$; and,
  • $P(a < X < b) = \int_a^b f(x)dx$
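
Both moments follow from direct integration:

$$ E(X) = \int_0^1 x\,dx = \frac{1}{2}, \qquad V(X) = E(X^2) - E(X)^2 = \int_0^1 x^2\,dx - \frac{1}{4} = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}. $$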

sum of two independent random variables, $X+Y$

Given two independent random variables, $X$ and $Y$, we have

  • $E(X+Y) = E(X) + E(Y)$; and,
  • $V(X+Y) = V(X) + V(Y)$.

Also, if $X$ and $Y$ have PDFs $f$ and $g$, respectively, then the PDF of $X + Y$ is the convolution of $f$ and $g$, which, assuming $f$ and $g$ are supported only on $[0, \infty)$, is

$$ (f \ast g)(x) = \int_0^x f(u)g(x-u)du $$

and

$$P(a < X+Y < b) = \int_a^b (f \ast g)(x)dx$$
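
For example, taking both $X$ and $Y$ to be $U(0, 1)$, the convolution of the uniform density with itself is the triangular density

$$ (f \ast f)(x) = \int_0^x f(u)f(x-u)\,du = \begin{cases} x & \text{if } 0 \le x \le 1\\ 2 - x & \text{if } 1 < x \le 2\\ 0 & \text{otherwise,} \end{cases} $$

so, for instance, $P(X + Y < \tfrac12) = \int_0^{1/2} x\,dx = \tfrac18$.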

$n$th self-convolution, $f^n$

Let $f^n$ be the $n$-fold self-convolution of $f$, i.e. the convolution of $n$ copies of $f$. More formally,

$$ f^n(x) = \begin{cases} f(x) & \text{if } n = 1\\ (f \ast f^{n-1})(x) & \text{if } n > 1 \end{cases} $$

i.e.

  • $f^1(x) = f(x)$
  • $f^2(x) = (f \ast f)(x)$
  • $f^3(x) = (f \ast f^2)(x)$
  • etc.
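
For concreteness, here is a rough numerical sketch of $f^n$ in Python (a discretized convolution on a uniform grid; the grid size is an arbitrary choice of mine):

    import numpy as np

    def self_convolution(n, grid_points=10_000):
        """Approximate f^n for the U(0, 1) density f on a uniform grid over [0, n]."""
        dx = 1.0 / grid_points
        f = np.ones(grid_points)            # f(x) = 1 on [0, 1), sampled on the grid
        fn = f.copy()
        for _ in range(n - 1):
            fn = np.convolve(fn, f) * dx    # discrete analogue of (f * f^k)(x)
        x = np.arange(fn.size) * dx
        return x, fn, dx

    # e.g. P(Y_3 < 0.5) as a Riemann sum of f^3 below 0.5; prints roughly 0.0208
    x, f3, dx = self_convolution(3)
    print(f3[x < 0.5].sum() * dx)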

sum of $n$ i.i.d. random variables, $Y_n = X_1 + \cdots + X_n$

Let $Y_n = \sum_{i=1}^n X_i$ for $n \geq 1$, i.e. the sum of $n$ random variables $X_1, \ldots, X_n$.

If the $X_i$ are i.i.d. (independent and identically distributed) and each with PDF $f(x)$, we have

  • $E(Y_n) = n E(X_1)$; and,
  • $V(Y_n) = nV(X_1)$.
  • $P(a < Y_n < b) = \int_a^b f^n(x)dx$
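
These facts are easy to spot-check by simulation. A quick Monte Carlo sketch (sample size chosen arbitrarily):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    n, trials = 3, 1_000_000

    # Y_n = X_1 + ... + X_n, with each X_i drawn independently from U(0, 1)
    y = rng.random((trials, n)).sum(axis=1)

    print(y.mean())                        # ~ n * E(X_1) = 1.5
    print(y.var())                         # ~ n * V(X_1) = 3/12 = 0.25
    print(np.mean((0 < y) & (y < 0.5)))    # ~ P(0 < Y_3 < 0.5), roughly 0.0208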

How do I compute $P(Y_n < 0.5)$ and $P(Y_n > 1)$, where $Y_n = X_1 + \cdots + X_n$ and $X_i \sim U(0, 1)$?

We have $n$ i.i.d. random variables $X_i \sim U(0, 1)$ each with a PDF of

$$ f(x) = \begin{cases} 1 \quad \text{if } 0 < x < 1\\ 0 \quad \text{otherwise} \end{cases} $$

I need to compute

  • $P(Y_n < 0.5) = \int_0^{0.5} f^n(x)\,dx$; and,
  • $P(Y_n > 1) = \int_1^\infty f^n(x)\,dx$

where, again, $Y_n = X_1 + \cdots + X_n$ for $n \geq 1$.

This involves computing the $n$th self-convolution of $f$ for arbitrary $n$. Is there a known formula for the $n$th self-convolution of $f$? Can I use some trick to compute said probabilities using the fact that $E(Y_n) = nE(X)$ and $V(Y_n) = nV(X)$, or the fact that the Central Limit Theorem tells us that $f^n$ tends towards a normal density as $n$ tends to $\infty$?
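
Spelled out, the CLT approximation I have in mind is

$$ Y_n \approx N\!\left(\frac{n}{2}, \frac{n}{12}\right), \qquad \text{e.g. } P(Y_n > 1) \approx 1 - \Phi\!\left(\frac{1 - n/2}{\sqrt{n/12}}\right), $$

though I'm not sure how accurate this would be for small $n$.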

Ultimately, I need to compute

$$ \sum_{i=2}^\infty \Bigg(\bigg( \prod_{j=1}^{i-1} P(Y_j < 0.5) \bigg) P(Y_i > 1) \Bigg) $$

to $d$ digits past the decimal point. Comparing consecutive terms of the sum, say the $n$th and the $(n+1)$th, we have

$$ \begin{aligned} n\text{th term:} &\quad P(Y_1 < 0.5) \cdot P(Y_2 < 0.5) \cdots P(Y_{n-1} < 0.5) \cdot P(Y_n > 1)\\ (n+1)\text{th term:} &\quad P(Y_1 < 0.5) \cdot P(Y_2 < 0.5) \cdots P(Y_{n-1} < 0.5) \cdot P(Y_n < 0.5) \cdot P(Y_{n+1} > 1) \end{aligned} $$

Factoring out the common factor $C = P(Y_1 < 0.5) \cdot P(Y_2 < 0.5) \cdots P(Y_{n-1} < 0.5)$, we get

$$ \begin{aligned} n\text{th term:} &\quad C \cdot P(Y_n > 1)\\ (n+1)\text{th term:} &\quad C \cdot P(Y_n < 0.5) \cdot P(Y_{n+1} > 1) \end{aligned} $$

How do they compare? I think $P(Y_n < 0.5) \ll P(Y_n > 1) < P(Y_{n+1} > 1)$, so that $P(Y_n > 1) > P(Y_n < 0.5) \cdot P(Y_{n+1} > 1)$, and my gut feeling is that the terms are decreasing substantially (proof needed). Assuming that the terms are decreasing, there is a term, say the $n$th, whose value is less than $10^{-d}$ and which therefore doesn't contribute to the first $d$ digits of the sum. Once we hit that term, we can stop, because it and all subsequent terms don't contribute to the first $d$ digits of the sum$^*$.

$^*$ Actually, I think this only holds if we also prove something like: the $i$th term is at least $10$ times the $(i+1)$th term for all $i \geq n$, so that the tail sum itself stays negligibly small.
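
To make the stopping rule concrete, here is a rough Python sketch, assuming the terms really do keep decreasing; it approximates each probability by numerically self-convolving $f$ as above, so the grid resolution limits the accuracy actually achievable:

    import numpy as np

    def tail_probs(n, grid_points=4_000):
        """Approximate P(Y_n < 0.5) and P(Y_n > 1) via the n-th self-convolution of f."""
        dx = 1.0 / grid_points
        f = np.ones(grid_points)
        fn = f.copy()
        for _ in range(n - 1):
            fn = np.convolve(fn, f) * dx
        x = np.arange(fn.size) * dx
        return fn[x < 0.5].sum() * dx, fn[x > 1.0].sum() * dx

    def truncated_sum(d):
        """Add terms of the series until one falls below 10^(-d)."""
        total, prefix = 0.0, 1.0          # prefix = prod_{j=1}^{i-1} P(Y_j < 0.5)
        i = 1
        while True:
            p_lt_half, p_gt_one = tail_probs(i)
            if i >= 2:                    # the series starts at i = 2
                term = prefix * p_gt_one
                total += term
                if term < 10.0 ** (-d):
                    break
            prefix *= p_lt_half
            i += 1
        return total

    print(truncated_sum(6))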


I computed $P(Y_n < 0.5)$ and $P(Y_n > 1)$ manually and also using Desmos for a few $n$:

    n    P(Y_n < 0.5)       P(Y_n > 1)
    1    0.5                0
    2    0.125              0.5
    3    0.020833...        0.833...
    4    0.002604166...     0.95833...
joseville
    Notation $nX$ to denote sum of $n$ independent copies of $X$ is terrible, where did you get it from? – Esgeriath Mar 17 '23 at 19:12
  • I made it up, because writing out $X + X + X + ... + X$ seemed very verbose and error prone, especially when conveying the $\sum_{i=2}^\infty P(...$. Is there a better notation? Maybe $X_n$? – joseville Mar 17 '23 at 19:33
    Usually it's done by defining $X_1, X_2, \ldots \text{i.i.d.} U([0, 1]), Y_n = X_1 + \ldots +X_n$ and using $Y_n$ in calculations – Esgeriath Mar 17 '23 at 19:36
  • @Esgeriath I fixed the notation. – joseville Mar 18 '23 at 00:22

2 Answers


Note $nX_1=X_1+...+X_1$ is certainly not the same as $X_1+...+X_n$ for iid $X_1,...X_n$. In particular, the first sum has dependent identically distributed terms.

Coming to your question, if $X_1,...,X_n$ are iid $U(0,1)$, their sum has the Irwin-Hall distribution, which has a known closed-form CDF.

In particular, the CDF of the sum is given by $$P\left(\sum_{i=1}^nX_i\leq x\right)=\frac{1}{n!}\sum_{k=0}^{\lfloor{x\rfloor}}(-1)^k {n \choose k}(x-k)^n,\quad x\in [0,n].$$

You can use this to compute your probabilities of interest.
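
For instance, a direct translation of this CDF into Python (as a sketch) looks like:

    from math import comb, factorial, floor

    def irwin_hall_cdf(x, n):
        """P(X_1 + ... + X_n <= x) for X_1, ..., X_n i.i.d. U(0, 1)."""
        if x <= 0:
            return 0.0
        if x >= n:
            return 1.0
        return sum((-1) ** k * comb(n, k) * (x - k) ** n
                   for k in range(floor(x) + 1)) / factorial(n)

    # the probabilities in the question, e.g. for n = 3
    print(irwin_hall_cdf(0.5, 3))       # P(Y_3 < 0.5) = 0.0208333...
    print(1 - irwin_hall_cdf(1.0, 3))   # P(Y_3 > 1)   = 0.8333...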

Golden_Ratio
  • Thanks! Does $X_1 + ... + X_1$ just represent the same sample from $U(0, 1)$ added to itself $n$ times? Whereas $X_1 + ... + X_n$ for iid $X_1, ..., X_n \sim U(0, 1)$ represent $n$ different samples from $U(0, 1)$? – joseville Mar 17 '23 at 22:07
    @joseville Right, $X_1+...+X_1$ is the sum of $n$ copies of the same individual observation that is drawn from $U(0,1)$. In contrast, for iid $\{X_i\}_{i=1,\ldots,n}$, $X_1+...+X_n$ is the sum of $n$ independent observations, each one drawn from $U(0,1)$. – Golden_Ratio Mar 18 '23 at 01:00

If $y\in[0,1]$ there is a nice closed form for $P(Y_n\le y)$: $$P(Y_n\le y)=\frac {y^n}{n!},$$ which you can prove by induction. From this you can obtain $P(Y_n<\frac12)$, which is the same as $P(Y_n\le\frac12)$, and $P(Y_n>1) = 1 - P(Y_n \leq 1) = 1 - \frac {1^n} {n!} = 1 - \frac 1 {n!}$.
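
For completeness, the induction step follows by conditioning on the last summand: for $y \in [0, 1]$,

$$ P(Y_{n+1} \le y) = \int_0^1 P(Y_n \le y - u)\,du = \int_0^y \frac{(y-u)^n}{n!}\,du = \frac{y^{n+1}}{(n+1)!}. $$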

grand_chat