
Suppose a lottery draws 6 distinct numbers from 1 to 40. We know the lowest number drawn, $X_1$, and we let the random variable $X_6$ denote the highest number drawn.

How do I calculate $E(X_6)$ and $VAR(X_6)$ as a function of $X_1$?

I ran a simulation of $10K$ draws (from 1 to 36 rather than 1 to 40, so the title is misleading), and according to the plots there seems to be a linear relation in both cases.

d_e
  • Are the six drawn numbers necessarily distinct? – drhab Jul 05 '16 at 14:13
  • yes. 6 distinct from 1 to 40. I'll add that to the post. – d_e Jul 05 '16 at 14:14
  • $X_6$ can be interpreted as the highest of $5$ distinct numbers drawn from the set $\{X_1+1,X_1+2,\dots,40\}$. – drhab Jul 05 '16 at 14:18
  • I see, thanks. So I guess my question is basically: what are the mean and variance of the highest of 5 numbers drawn? Unfortunately I have no answer to that either at this point. Another hint? – d_e Jul 05 '16 at 14:28
  • Your first graph looks linear, your second possibly not – Henry Jul 05 '16 at 14:47

2 Answers


Hint:

Let $5$ distinct numbers be drawn from the set $\left\{ 1,\dots,n\right\}$, where $n\geq5$, and let $M$ denote the highest number drawn.

There are $\binom{n}{5}$ equiprobable possibilities for drawing $5$ distinct numbers.

Exactly $\binom{k-1}{4}$ of these have $M=k$ (the other four numbers are chosen from $\left\{ 1,\dots,k-1\right\}$), so that for $k=5,\dots,n$: $$P\left(M=k\right)=\frac{\binom{k-1}{4}}{\binom{n}{5}}$$

Apply this with $n=40-X_1$ and find the expectation and variance of $M+X_1$, where $X_1$ is treated as a fixed constant.
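The hint can be checked numerically. What follows is a minimal Python sketch, not part of the original answer; the function names `max_of_five_moments` and `highest_given_lowest` and the `top` parameter are made up for illustration. It sums the probability mass function above with exact fractions and then shifts by $X_1$, which changes the mean but not the variance.

```python
from fractions import Fraction
from math import comb


def max_of_five_moments(n):
    """Exact mean and variance of M, the largest of 5 distinct numbers from {1..n}.

    Assumes n >= 5, as in the hint.
    """
    total = comb(n, 5)
    # P(M = k) = C(k-1, 4) / C(n, 5) for k = 5, ..., n
    pmf = {k: Fraction(comb(k - 1, 4), total) for k in range(5, n + 1)}
    mean = sum(k * p for k, p in pmf.items())
    second_moment = sum(k * k * p for k, p in pmf.items())
    return mean, second_moment - mean**2


def highest_given_lowest(x1, top=40):
    """Mean and variance of the highest number, given the lowest is x1.

    The other 5 numbers are drawn from the top - x1 values above x1,
    so the maximum is x1 + M with n = top - x1 (requires x1 <= top - 5).
    """
    mean_m, var_m = max_of_five_moments(top - x1)
    return x1 + mean_m, var_m


if __name__ == "__main__":
    for x1 in (1, 10, 20, 35):
        mean, var = highest_given_lowest(x1)
        print(x1, float(mean), float(var))
```

Exact `Fraction` arithmetic is used here only so the results can be compared with a closed form without rounding noise; floats would do for plotting.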

drhab

As drhab noted, given the value of $X_1$, the remaining numbers are a uniformly random sample of size $5$ without replacement from the numbers above $X_1$.

To find the mean of the maximum of $k$ distinct numbers drawn from $1$ to $n$, take $n-k$ white balls, $k$ red balls and $1$ yellow ball, arrange them in a uniformly random permutation in a circle, and remove the yellow ball to obtain a linear arrangement, with the positions of the $k$ red balls corresponding to the $k$ numbers drawn. By symmetry, all $k+1$ segments between the $k+1$ coloured balls have the same expected length, so the expected value of the highest number drawn is $n-\frac{n-k}{k+1}$.
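As a worked instance for the lottery in the question (an illustration under the assumption of numbers from $1$ to $40$, so that $k=5$ and $n=40-X_1$ counts the numbers above $X_1$), shifting the maximum back by $X_1$ gives

$$E[X_6\mid X_1]=X_1+\left(n-\frac{n-5}{6}\right)=X_1+\frac{5(41-X_1)}{6}=\frac{X_1+205}{6}\;,$$

which is linear in $X_1$ with slope $\frac16$.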

To find the variance, consider the sum $X=\sum_iX_i$ of $n-k$ indicator variables (unrelated to the drawn numbers $X_1,\dots,X_6$ in the question), where each variable is $1$ if the corresponding white ball is to the right of the last red ball and $0$ if it's to the left. Then the highest number drawn is $n-X$, and $\mathbb E[X]=\frac{n-k}{k+1}$ as derived above. Now calculate

\begin{align} \mathbb E[X^2] &= \mathbb E\left[\left(\sum_iX_i\right)^2\right] \\ &= \mathbb E\left[\sum_iX_i\right]+\mathbb E\left[\sum_{i\ne j}X_iX_j\right] \\ &= \frac{n-k}{k+1}+(n-k)(n-k-1)\mathbb E[X_1X_2] \\ &= \frac{n-k}{k+1}+(n-k)(n-k-1)\frac1{\binom{k+2}2} \\ &= \frac{n-k}{k+1}\left(1+2\cdot\frac{n-k-1}{k+2}\right)\;. \end{align}

Then

\begin{align} \operatorname{Var}[X] &=\mathbb E[X^2]-\mathbb E[X]^2 \\ &= \frac{n-k}{k+1}\left(1+2\cdot\frac{n-k-1}{k+2}-\frac{n-k}{k+1}\right) \\ &= \frac{k(n-k)(n+1)}{(k+1)^2(k+2)}\;, \end{align}

which is also the variance of $n-X$, that is, of the highest number drawn.
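For the lottery in the question, again assuming numbers from $1$ to $40$ so that $k=5$ and $n=40-X_1$, this specialises to

$$\operatorname{Var}[X_6\mid X_1]=\frac{5\,(35-X_1)(41-X_1)}{6^2\cdot7}=\frac{5\,(35-X_1)(41-X_1)}{252}\;,$$

since adding the constant $X_1$ does not change the variance. This is quadratic rather than linear in $X_1$, and it vanishes at $X_1=35$, where the remaining five numbers are forced to be $36,\dots,40$.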

Note that no sums over the distribution had to be computed in the process, only one probability, the probability $\frac1{\binom{k+2}2}$ for two white balls to be to the right of all $k$ red balls.
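The closed forms above can be sanity-checked by brute force for small cases. The following Python sketch is an illustration only; the function names `brute_force_moments` and `closed_form_moments` are hypothetical. It enumerates every $k$-subset of $\{1,\dots,n\}$ and compares the exact mean and variance of the maximum with $n-\frac{n-k}{k+1}$ and $\frac{k(n-k)(n+1)}{(k+1)^2(k+2)}$.

```python
from fractions import Fraction
from itertools import combinations


def brute_force_moments(n, k):
    """Exact mean and variance of the maximum over all k-subsets of {1..n}."""
    maxima = [max(subset) for subset in combinations(range(1, n + 1), k)]
    count = len(maxima)
    mean = Fraction(sum(maxima), count)
    second_moment = Fraction(sum(m * m for m in maxima), count)
    return mean, second_moment - mean**2


def closed_form_moments(n, k):
    """Mean and variance of the maximum from the formulas derived above."""
    mean = Fraction(n) - Fraction(n - k, k + 1)
    variance = Fraction(k * (n - k) * (n + 1), (k + 1) ** 2 * (k + 2))
    return mean, variance


if __name__ == "__main__":
    for n, k in [(5, 5), (8, 3), (12, 5)]:
        print(n, k, brute_force_moments(n, k) == closed_form_moments(n, k))
```

Enumeration grows like $\binom nk$, so this is only practical for small $n$; it is meant as a check on the algebra, not as a way to compute the moments.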

joriki