I would like to determine the probability $\mathbb{P}(X_1+\dots+X_n\leq 1)$, where $X=(X_i)_{1\leq i\leq n}$ is a family of independent uniform random variables on $[0,1]$. My first idea is to do this by induction. The first three cases are straightforward to compute and give $\mathbb{P}(X_1\leq 1)=1$, $\mathbb{P}(X_1+X_2\leq 1)=\frac{1}{2}$ and $\mathbb{P}(X_1+X_2+X_3\leq 1)=\frac{1}{6}$, which suggests that $\mathbb{P}(X_1+\dots+X_n\leq 1)=\frac{1}{n!}$. Supposing this is true for some integer $n$, I am having difficulty establishing the result for $n+1$, i.e. $\mathbb{P}(X_1+\dots+X_n+X_{n+1}\leq 1)=\frac{1}{(n+1)!}$. I believe the starting point should be: $$\mathbb{P}(X_1+\dots+X_n+X_{n+1}\leq 1)=\mathbb{P}(X_1+\dots+X_n\leq 1-X_{n+1}),$$ and then somehow condition on $X_{n+1}$, but I am stuck at this point of the calculation. Any ideas, references to the literature, or even an alternative direct proof would be greatly appreciated.
-
https://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution – Mar 04 '16 at 23:21
-
@d.k.o. Yes I am aware of the Irwin Hall distribution, however I would still like to establish the result as per above. – user223935 Mar 04 '16 at 23:24
-
Oh. Maybe a geometric approach will suffice. This probability is represented by the volume of a part of a (hyper)cube (in 2 dimensions it's the lower triangle) – Mar 04 '16 at 23:26
-
@d.k.o. I used the geometric approach for the base cases $n=2$ and $n=3$, but it is harder to prove for an arbitrary integer – user223935 Mar 04 '16 at 23:35
-
Related: https://math.stackexchange.com/q/769545/321264. – StubbornAtom Jun 29 '20 at 07:57
-
https://math.stackexchange.com/q/1382459/321264 – StubbornAtom Oct 08 '20 at 15:39
2 Answers
Prove by induction the more general result: If $0\le t\le 1$, then $$ P(S_n\le t)=\frac{t^n}{n!}, $$ where $S_n$ denotes the sum $X_1+\cdots+X_n$. The base case $n=1$ is clear. If it holds for $n$, then, writing $f$ for the density of $X_{n+1}$ (which equals $1$ on $[0,1]$), calculate for $0\le t\le 1$: $$ P(S_{n+1}\le t)=\int_0^1P(S_n+x\le t)f(x)\,dx\stackrel{(1)}=\int_0^t\frac{(t-x)^n}{n!}\,dx=\frac{t^{n+1}}{(n+1)!} $$ Note that in (1) the quantity $P(S_n+x\le t)=P(S_n\le t-x)$ is zero when $x>t$.
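As a numerical sanity check of the formula $P(S_n\le t)=t^n/n!$ for $0\le t\le 1$, here is a minimal Monte Carlo sketch. It is not part of the proof; the function names `estimate_prob` and `exact_prob`, the trial count, and the seed are illustrative choices, and only the Python standard library is assumed.

```python
import math
import random


def estimate_prob(n, t=1.0, trials=100_000, seed=0):
    """Monte Carlo estimate of P(X_1 + ... + X_n <= t) for iid Uniform(0,1) X_i."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    hits = 0
    for _ in range(trials):
        # Draw n independent uniforms and test whether their sum is at most t.
        if sum(rng.random() for _ in range(n)) <= t:
            hits += 1
    return hits / trials


def exact_prob(n, t=1.0):
    """Closed form t^n / n!, claimed only for 0 <= t <= 1."""
    return t ** n / math.factorial(n)


if __name__ == "__main__":
    # Compare the simulated frequency with t^n / n! for a few n and t.
    for n in range(1, 6):
        for t in (0.5, 1.0):
            print(n, t, estimate_prob(n, t), exact_prob(n, t))
```

The estimates should agree with $t^n/n!$ up to sampling error of order $1/\sqrt{\text{trials}}$; for larger $n$ the event becomes rare, so more trials are needed for a meaningful comparison.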
-
-
@user223935: It does, and you then get $\mathbb{P}(S_n\le 1)=\frac{1^n}{n!}$, but the more general hypothesis is easier to prove by induction – Henry Mar 04 '16 at 23:46
-
@user223935 Sure! And even if $t=1$ were not included in the proof, you could take the limit as $t\uparrow 1$ (since $S_n$ has a continuous CDF). – grand_chat Mar 04 '16 at 23:47
-
Where did the expression $P(S_{n+1} \leq t) = \int_{0}^1 P(S_n+x\leq t) f(x) dx$ come from? – 24n8 May 04 '20 at 00:50
-
This is what I start with $P(S_{n+1} \leq t) = \int_{0}^t f_{S_{n+1}}(s_{n+1}) ds$, where $S_{n+1}$ is a continuous random variable with pdf $f_{S_{n+1}}$, but I'm kind of confused how you got the next expression. Looks like the definition of conditional probability was somehow applied, but it's not jumping out at me right now. – 24n8 May 04 '20 at 00:53
-
@Iamanon I am using $P(S_{n+1}\le t)=P(S_n+X_{n+1}\le t)=\int P(S_n+X_{n+1}\le t\mid X_{n+1}=x)f(x)\,dx$. Notice now that $P(S_n+X_{n+1}\le t\mid X_{n+1}=x)=P(S_n+x \le t\mid X_{n+1}=x)=P(S_n+x\le t)$ since $S_n$ and $X_{n+1}$ are independent. – grand_chat May 04 '20 at 01:26
-
Makes sense! Also, just from a general perspective, how do you build the intuition to condition over an extra variable? Like, in this case, what pointed you in the direction of conditioning on $X_{n+1}$? – 24n8 May 04 '20 at 02:37
-
@Iamanon Since we are attempting induction, we are trying to derive a result for $S_{n+1}$ from the result for $S_n$. This will be possible if we can turn $X_{n+1}$ from a random variable into something we can treat as a constant, which motivates conditioning on $X_{n+1}$. – grand_chat May 04 '20 at 05:08
-
Oh right. I think in this case, it is a little more obvious. I feel like in some other cases, perhaps not involving induction, it's not as obvious to introduce an extra condition. – 24n8 May 04 '20 at 12:33
A geometric argument should suffice. Given that $\{X_k\}_{k\ge 1}$ are all iid Uniform$(0,1)$ random variables, then:
$\mathsf P(X_1+X_2\leq 1)$ is the probability that points distributed uniformly over the unit square lie in the lower left triangle; which is $1/2$ the area of the unit square.
$\mathsf P(X_1+X_2+X_3\leq 1)$ is the probability that points distributed uniformly over the unit cube lie in the $(0,0,0)$-corner pyramid; which is $1/6$ the volume of the unit cube.
$\mathsf P(X_1+X_2+X_3+X_4\leq 1)$ is the probability that points distributed uniformly over the unit tesseract lie in the $(0,0,0,0)$-corner pentachoron; which is $1/24$ of the hypervolume of the unit tesseract.
And so forth.
$\mathsf P(\sum\limits_{k=1}^n X_k\leq 1)$ is the probability that points distributed uniformly over a unit $n$-hypercube lie in a corner $n$-hyperpyramid; which is $1/n!$ of the $n$-hypervolume of the unit $n$-hypercube.
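One possible way to justify the factor $1/n!$ without assuming the hypervolume of the corner hyperpyramid is sketched here; it is a standard change-of-variables argument, not something stated in the answer itself. Set $y_k=x_1+\dots+x_k$ for $1\le k\le n$. This linear map has a lower-triangular matrix with ones on the diagonal, so its determinant is $1$ and it preserves volume. It maps the corner region $\{x_i\ge 0,\ x_1+\dots+x_n\le 1\}$ bijectively onto the ordered region $\{0\le y_1\le y_2\le\dots\le y_n\le 1\}$. Up to sets of volume zero (points with tied coordinates), the unit $n$-hypercube is the disjoint union of the $n!$ regions $\{y_{\sigma(1)}\le\dots\le y_{\sigma(n)}\}$, one for each permutation $\sigma$, and by symmetry they all have the same volume. Hence $$\operatorname{vol}\{x\in[0,1]^n: x_1+\dots+x_n\le 1\}=\operatorname{vol}\{y\in[0,1]^n: y_1\le y_2\le\dots\le y_n\}=\frac{1}{n!}.$$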
-
This is true, but it rather presupposes that you know the hypervolume of a corner hyperpyramid – Henry Mar 04 '16 at 23:48