4

Consider the following formula :

$$p_i=\sum_{j=1}^n\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}},$$

where $i,n$ are positive integer and $i=1,\ldots,n^2$.

Suppose there are $n$ boxes (the boxes are labeled with $1,2,\ldots,n$) each containing $n$ balls. The balls in each box are ordered according to their size. The $i$th $(i=1,2,\ldots,n^2)$ smallest ball among all $n^2$ balls is the $j$th smallest ball $(j=1,2,\ldots,n)$ of a box if and only if (1) the first $(i-1)$ values contain exactly $(j-1)$ 1's. This can be done in $\binom{i-1}{j-1}$ ways, $(2)$ the $i$th value is a $1$, (3) And therefore there are exactly $(n-j)$ 1's distributed among the remaining $(n^2-i)$ values. This can be done in $\binom{n^2-i}{n-j}$ ways.

The total number of ways in which the $n^2$ balls can be randomly allocated to $n$ boxes is $\binom{n^2}{n}$.

Then the probability that the $i$th smallest ball among all $n^2$ balls is the $j$th smallest ball of a box is

$$\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}}.$$

We select a ball if it is the $1$st smallest ball of the $1$st box, or $2$nd smallest ball of the $2$nd box, thus $n$th smallest ball of the $n$th box. It is not possible that the $i$th smallest ball overall was in two boxes, but it had to be in some box. Therefore the event that "the $i$th smallest ball was in box $j$", is an exhaustive partition of the possibilities. Consequently there chances add up. Hence the probability that the $i$th smallest ball is selected is

$$p_i=\sum_{j=1}^n\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}}.$$

I expected different value of $p_i$ for different $i$. But I was astonished that the $p_i$ is same for all $i$.

It seems

$$p_i=\sum_{j=1}^n\frac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}}=\frac{1}{n}.$$

I checked it by calculating $p_i$ for different $n$ such as $2$ or $3$.

What is the reason that $p_i$ does not depend on $i$?

Why this complex formula is nothing but a uniform probability?

  • 2
    By "it seems ...$=\frac1{n}$", do you mean you went through and proved this equation, or is it just a hunch? If the algebra works out to cancel out all terms with $i$ , then wouldnt that answer your question? But if you are looking for an extra-mathematical reason you would need to provide more context about where the formula came from. – Nap D. Lover Jul 26 '17 at 13:58
  • @LoveTooNap29 I have edited my question. – user 31466 Jul 26 '17 at 14:10
  • If $i = 1,...,n^2$, then it is not a probability distribution, because $n^2\times1/n = n > 1$ –  Jul 26 '17 at 14:13
  • @Fedor for a specific $j$, $\sum_{i=1}^{n^2}\binom{i-1}{j-1}\binom{n^2-i}{n-j}=\binom{n^2}{n}$ , confirming the law of total probability. – user 31466 Jul 26 '17 at 14:23
  • @Fedor $p_i$ is called inclusion probability. Inclusion probability of an element of a population is its probability of becoming part of the sample during the drawing of a single sample. https://en.wikipedia.org/wiki/Sampling_probability – user 31466 Jul 26 '17 at 14:32
  • @user 31466 I am wondering, is it definitely not $\binom{n^2-1}{n-1}$ in the denominator? This way I may think of a combinatorial interpretation –  Jul 26 '17 at 14:50
  • 1
    @Fedor no. Let me edit the post once more by clarifying how the formula came. – user 31466 Jul 26 '17 at 14:53
  • I have finished editing. – user 31466 Jul 26 '17 at 15:25
  • 1
    @user31466 I have just posted my answer –  Jul 26 '17 at 15:25

2 Answers2

4

Here is an algebraic verification. Denoting with $[z^k]$ the coefficient of $z^k$ in a series we can write e.g. \begin{align*} [z^k](1+z)^n=\binom{n}{k} \end{align*}

We obtain \begin{align*} \color{blue}{\sum_{j=1}^n}&\color{blue}{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}\\ &=\sum_{j=0}^\infty [z^{j-1}](1+z)^{i-1}[u^{n-j}](1+u)^{n^2-i}\tag{1}\\ &=[u^n](1+u)^{n^2-i}\sum_{j=0}^\infty u^j[z^j]z(1+z)^{i-1}\tag{2}\\ &=[u^n](1+u)^{n^2-i}u(1+u)^{i-1}\tag{3}\\ &=[u^{n-1}](1+u)^{n^2-1}\tag{4}\\ &\color{blue}{=\binom{n^2-1}{n-1}=\frac{1}{n}\binom{n^2}{n}} \end{align*}

Comment:

  • In (1) we apply the coefficient of operator twice and set the upper limit of the series to $\infty$ without changing anything since we are adding zeros only.

  • In (2) we use the linearity of the coefficient of operator and apply the rule $[z^{p-q}]A(z)=[z^p]z^{q}A(z)$.

  • In (3) we apply the substitution rule of the coefficient of operator with $z:=u$
    \begin{align*} A(u)=\sum_{j=0}^\infty a_j u^j=\sum_{j=0}^\infty u^j [z^j]A(z) \end{align*}

  • In (4) we do some simplifications and are ready to select the coefficient of $[u^{n-1}]$.

Markus Scheuer
  • 108,315
3

$p_i = \sum^{n}_{j=1} \dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2}{n}} = \sum^{n}_{j=1} \dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2-1}{n-1}\times\dfrac{n^2}{n}} = \dfrac{1}{n}\sum^{n}_{j=1} \dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2-1}{n-1}}$.

The formula inside the sum can be interpreted the following way:

Let's say I have $n^2-1$ balls from which $i-1$ are red and $n^2-i$ are blue. If I select $n-1$ balls from this set, the probabilty of getting $j-1$ red balls and $n-j$ blue balls is $\dfrac{\binom{i-1}{j-1}\binom{n^2-i}{n-j}}{\binom{n^2-1}{n-1}}$.

Now we sum probabilities over all possible outcomes, which is equal to $1$, hence $p_i = \dfrac{1}{n}$.