we randomly pick $M$ elements from a set of $N$ real numbers, $A=\{ a_1,a_2,\ldots,a_N \}$. Then we sort these $M$ numbers, what is the expected value for the largest element? (let say $N=1000000$, $M=1000$), I am interested in the solution for general $A$, and/or the case that elements of $A$ are taken from a normal distribution.
-
1If yo have sorted increasingly the $a_i$ then $$P(max=a_j)=\frac{C(j-1,M-1)}{C(N,M)}$$ for $j\geq M$. (C is the binomial coefficient) – Kolmo Mar 08 '12 at 19:28
-
Are the $M$ elements a subset of $A$ or are you sampling with replacement? For sampling with replacement, assuming that $a_1 < a_2 < \cdots < a_N$, $$P{\max = a_i} = \left(\frac{i}{N}\right)^M - \left(\frac{i-1}{N}\right)^M$$ from which you can find $$E[\max] = \sum_{i=1}^N a_i\left[\left(\frac{i}{N}\right)^M - \left(\frac{i-1}{N}\right)^M\right].$$ – Dilip Sarwate Mar 08 '12 at 19:28
3 Answers
Note: Since you're concerned with subsets, I'm going to assume that you're choosing your elements without repetition. As per your question, I'll also assume that you're picking your elements with uniform probability (ie completely at random).
Adding a bit of notation, suppose that our chosen subset is $B=\{b_1,b_2,\cdots,b_M\}$ with $b_1<b_2<\cdots <b_M$. If we assume also that $a_1<a_2<\cdots<a_N$, then clearly $P(b_M<a_M)=0$, since the smallest $M$ elements that we can choose from $A$ are $\{a_1,a_2,\cdots,a_M\}$.
Now, $P(b_M=a_M)=\frac{1}{{N \choose M}}$, since the only possible such subset is $\{a_1,a_2,\cdots, a_M\}$.
To compute $P(b_M=a_{M+1})$, the only way this can happen is if $\{b_1,\cdots, b_{M-1}\}\subset \{a_1,a_2,\cdots, a_M\}$, hence $P(b_M=a_{M+1})=\frac{{M \choose M-1}}{{N \choose M}}$.
And by a similar argument, for $1\leq i \leq N-M$, $P(b_M=a_{M+i})=\frac{{M +(i-1) \choose {M-1}}}{{N \choose M}}$.
So, from there, the expected value is $$\frac{1}{{N \choose M}}\sum\limits_{i=0}^{N-M}a_i {{M+(i-1)} \choose {M-1}}$$
If you want to pick $M$ random numbers, you can pick the highest one $m$ in $N$ ways and then the rest in $\binom{m-1}{M-1}$: $$\binom{N}{M} = \sum_m\binom{m-1}{M-1}\,,$$ so the probability that $m$ is the maximum is $$\binom{m-1}{M-1}\Big/\binom{N}{M}\,.$$
- 37,381
-
Seems like the highest one could only be chosen in $N-M+1$ ways. This is because $M-1$ of the numbers will be less than the maximum. – Patrick Mar 12 '12 at 19:57
-
@Patrick Yep, but $\binom{M-2}{M-1} = 0$ and so on, i.e. you could pick the maximum to be the smallest number just to pick the rest in 0 ways ;-) – dtldarek Mar 12 '12 at 20:39
-
Ah, of course. I didn't know the notation $(n \mbox{ over } k)$ was defined when $n<k$. – Patrick Mar 12 '12 at 22:23
Hint 1: For every nonnegative random variable $X$, $\mathrm E(X)=\int\limits_0^{+\infty}\mathrm P(X\geqslant x)\mathrm dx$.
Hint 2: If $M=\max\{\xi_1,\xi_2,\ldots,\xi_n\}$, then, for every $x$, $[M\leqslant x]=\bigcap\limits_{k=1}^n[\xi_k\leqslant x]$.
- 279,727
-
-
Why should I? You closed the question (45 minutes after asking it), did not you? – Did Mar 08 '12 at 22:37
-
well, I posted a new question, but it looks like others thought it was exactly same question, so it was deleted.. Anyways I appreciate your time and help with this question. – mghandi Mar 09 '12 at 18:27