
Given a set of $N$ i.i.d. samples $x_k$ of a random variable $X$ and the expectation estimator,

$$E[X] \approx \frac{1}{N} \sum_{k=1}^N x_k,$$

how do I find the variance of this estimator (not the variance of $X$) as a function of $N$?

I only want to know the error in the estimate of $E[X]$, and what I would expect is that more samples (larger $N$) drive the variance down, so that

$$\lim_{N \to \infty} \bigg[\frac{1}{N} \sum_{k=1}^N x_k - E[X]\bigg] = 0.$$

Am I right about this? Or perhaps variance is not the correct way to measure the error in the estimate? What I would like to understand is how large $N$ must be, because there is a cost for every extra $x_k$ that is processed.

user827822

1 Answer


Without more knowledge of $X$, we cannot say much. However, if the samples are i.i.d. with mean $\mu$ and variance $\sigma^2$, the sample mean $\hat{\mu} = \frac{1}{N}\sum_{k=1}^{N}x_k$ is unbiased, since

$$E[\hat{\mu}] = E\left[\frac{1}{N}\sum_{k=1}^{N}x_k\right] = \frac{1}{N}\sum_{k=1}^{N}E[x_k] = \frac{N\mu}{N}=\mu\,.$$

Its variance also goes to zero as $N$ increases; since the samples are independent, the variance of the sum is the sum of the variances:

$$V[\hat{\mu}] = V\left[\frac{1}{N}\sum_{k=1}^{N}x_k\right] = \frac{1}{N^2}\sum_{k=1}^{N}V[x_k] = \frac{N\sigma^2}{N^2}=\frac{\sigma^2}{N}\,.$$
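
This formula also addresses the "how many samples" question from a cost perspective. Writing $\varepsilon$ for the acceptable standard error of the estimate, one needs

$$\sqrt{V[\hat{\mu}]} = \frac{\sigma}{\sqrt{N}} \le \varepsilon \quad\Longleftrightarrow\quad N \ge \frac{\sigma^2}{\varepsilon^2}\,,$$

so halving the target error quadruples the number of samples to process; in practice $\sigma^2$ is unknown and is replaced by a sample estimate.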

In summary, the estimator is unbiased and its variance tends to zero as the number of samples grows, so the sample mean converges to the true mean $\mu$ in mean square, and hence in probability: it is a consistent estimator. Note that other modes of convergence (almost sure, in distribution) or other error norms could also be considered.
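
As a quick numerical check, here is a minimal Python sketch (the normal distribution and the values $\mu=2$, $\sigma=3$ are illustrative assumptions) that repeats the averaging experiment many times and compares the empirical variance of $\hat{\mu}$ with the $\sigma^2/N$ prediction:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 2.0, 3.0   # illustrative true mean and standard deviation
trials = 10_000        # independent repetitions of the whole experiment

for N in (10, 100, 1000):
    # Draw `trials` batches of N i.i.d. samples and average each batch,
    # giving `trials` realizations of the estimator mu_hat.
    samples = rng.normal(mu, sigma, size=(trials, N))
    mu_hat = samples.mean(axis=1)
    # Empirical variance of the estimator vs. the sigma^2 / N prediction.
    print(f"N={N:5d}  var(mu_hat)={mu_hat.var():.4f}  sigma^2/N={sigma**2 / N:.4f}")
```

The empirical variance should track $\sigma^2/N$ closely, dropping by a factor of ten for each tenfold increase in $N$.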

Laurent Duval