What is the reason that $p_i$ does not depend on $i$?

Question

Consider the following formula :

$$p_i=\sum_{j=1}^n\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}},$$

where $i,n$ are positive integer and $i=1,\ldots,n^2$.

Suppose there are $n$ boxes (the boxes are labeled with $1,2,\ldots,n$) each containing $n$ balls. The balls in each box are ordered according to their size. The $i$th $(i=1,2,\ldots,n^2)$ smallest ball among all $n^2$ balls is the $j$th smallest ball $(j=1,2,\ldots,n)$ of a box if and only if (1) the first $(i-1)$ values contain exactly $(j-1)$ 1's. This can be done in $\binom{i-1}{j-1}$ ways, $(2)$ the $i$th value is a $1$, (3) And therefore there are exactly $(n-j)$ 1's distributed among the remaining $(n^2-i)$ values. This can be done in $\binom{n^2-i}{n-j}$ ways.

The total number of ways in which the $n^2$ balls can be randomly allocated to $n$ boxes is $\binom{n^2}{n}$.

Then the probability that the $i$th smallest ball among all $n^2$ balls is the $j$th smallest ball of a box is

$$\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}}.$$

We select a ball if it is the $1$st smallest ball of the $1$st box, or $2$nd smallest ball of the $2$nd box, thus $n$th smallest ball of the $n$th box. It is not possible that the $i$th smallest ball overall was in two boxes, but it had to be in some box. Therefore the event that "the $i$th smallest ball was in box $j$", is an exhaustive partition of the possibilities. Consequently there chances add up. Hence the probability that the $i$th smallest ball is selected is

$$p_i=\sum_{j=1}^n\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}}.$$

I expected different value of $p_i$ for different $i$. But I was astonished that the $p_i$ is same for all $i$.

It seems

$$p_i=\sum_{j=1}^n\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}}=\frac{1}{n}.$$

I checked it by calculating $p_i$ for different $n$ such as $2$ or $3$.

What is the reason that $p_i$ does not depend on $i$?

Why this complex formula is nothing but a uniform probability?

By "it seems ...$=\frac1{n}$", do you mean you went through and proved this equation, or is it just a hunch? If the algebra works out to cancel out all terms with $i$ , then wouldnt that answer your question? But if you are looking for an extra-mathematical reason you would need to provide more context about where the formula came from. — Nap D. Lover, Jul 26 '17 at 13:58
If $i = 1,...,n^2$, then it is not a probability distribution, because $n^2\times1/n = n > 1$ — , Jul 26 '17 at 14:13
@Fedor for a specific $j$, $\sum_{i=1}^{n^2}\binom{i-1}{j-1}\binom{n^2-i}{n-j}=\binom{n^2}{n}$ , confirming the law of total probability. — user 31466, Jul 26 '17 at 14:23
@Fedor $p_i$ is called inclusion probability. Inclusion probability of an element of a population is its probability of becoming part of the sample during the drawing of a single sample. https://en.wikipedia.org/wiki/Sampling_probability — user 31466, Jul 26 '17 at 14:32
@user 31466 I am wondering, is it definitely not $\binom{n^2-1}{n-1}$ in the denominator? This way I may think of a combinatorial interpretation — , Jul 26 '17 at 14:50
@Fedor no. Let me edit the post once more by clarifying how the formula came. — user 31466, Jul 26 '17 at 14:53

Markus Scheuer · Accepted Answer · 2017-07-26T15:49:57.597

4

Here is an algebraic verification. Denoting with $[z^k]$ the coefficient of $z^k$ in a series we can write e.g. \begin{align*} [z^k](1+z)^n=\binom{n}{k} \end{align*}

We obtain \begin{align*} \color{blue}{\sum_{j=1}^n}&\color{blue}{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}\\ &=\sum_{j=0}^\infty [z^{j-1}](1+z)^{i-1}[u^{n-j}](1+u)^{n^2-i}\tag{1}\\ &=[u^n](1+u)^{n^2-i}\sum_{j=0}^\infty u^j[z^j]z(1+z)^{i-1}\tag{2}\\ &=[u^n](1+u)^{n^2-i}u(1+u)^{i-1}\tag{3}\\ &=[u^{n-1}](1+u)^{n^2-1}\tag{4}\\ &\color{blue}{=\binom{n^2-1}{n-1}=\frac{1}{n}\binom{n^2}{n}} \end{align*}

Comment:

In (1) we apply the coefficient of operator twice and set the upper limit of the series to $\infty$ without changing anything since we are adding zeros only.
In (2) we use the linearity of the coefficient of operator and apply the rule $[z^{p-q}]A(z)=[z^p]z^{q}A(z)$.
In (3) we apply the substitution rule of the coefficient of operator with $z:=u$
\begin{align*} A(u)=\sum_{j=0}^\infty a_j u^j=\sum_{j=0}^\infty u^j [z^j]A(z) \end{align*}
In (4) we do some simplifications and are ready to select the coefficient of $[u^{n-1}]$.

edited Jul 26 '17 at 15:49

answered Jul 26 '17 at 15:42

Markus Scheuer

108,315

+1: Very elegant and general answer. I think one can use this method, whenever binomial coefficients are involved. Thank you for sharing your method. – MrYouMath Jul 26 '17 at 16:55
@MrYouMath: You're welcome. This method is sufficiently powerful even for somewhat more challenging problems as shown in this answer. :-) – Markus Scheuer Jul 26 '17 at 16:59
Verified and (+1). – Marko Riedel Jul 26 '17 at 20:16
@MarkoRiedel: Many thanks, Marko. :-) – Markus Scheuer Jul 26 '17 at 20:33

score 3 · Answer 2 · 2017-07-26T15:40:19.873

3

$p_i = \sum^{n}_{j=1} \dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}} = \sum^{n}_{j=1} \dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2-1}{n-1}\times\dfrac{n^2}{n}} = \dfrac{1}{n}\sum^{n}_{j=1} \dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2-1}{n-1}}$.

The formula inside the sum can be interpreted the following way:

Let's say I have $n^2-1$ balls from which $i-1$ are red and $n^2-i$ are blue. If I select $n-1$ balls from this set, the probabilty of getting $j-1$ red balls and $n-j$ blue balls is $\dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2-1}{n-1}}$.

Now we sum probabilities over all possible outcomes, which is equal to $1$, hence $p_i = \dfrac{1}{n}$.

edited Jul 26 '17 at 15:40

answered Jul 26 '17 at 15:25

I think you mean "... $i-1$ are red and $n^2-i$ are blue". – Ramiro Jul 26 '17 at 15:33
@Ramiro Yes, you're right, thank you – Jul 26 '17 at 15:40

What is the reason that $p_i$ does not depend on $i$?

2 Answers2