I was reading a statistics book that said, "Usually, it is expensive to perform an analysis on an entire population; hence, most statistical methods are about drawing conclusions about a population by analyzing a sample." It went on to give this formula for sample variance:

$$\frac{1}{N-1} \sum_{i=1}^{N}(x_i - \mu)^2$$

with the explanation - "The reason to use denominator N-1 for a sample instead of N, is the degrees of freedom."

I looked up "degrees of freedom" and found many explanations. This is what I understood: if we are asked to choose a sample of "n" values knowing the mean has to be some "x", we are actually only free to choose "n-1" values. But I can't seem to extend this understanding to the above calculation of sample variance.

I have a population of size "P" and I randomly choose a sample of size "N". Now that I have already chosen my sample, the variance of this sample, by the definition of variance, should be $\frac{1}{N}\sum_{i=1}^N(x_i - \mu)^2$. If I replace N with N-1, how is that still the variance of the sample?

I might be missing some prerequisite knowledge here; I am not sure.

Thank you in advance!

  • Note the difference between the variance of a set of data and the estimated variance of a distribution from a set of sampling data. – Trebor Jul 12 '21 at 09:33
  • This might be helpful: https://en.wikipedia.org/wiki/Bessel%27s_correction#Source_of_bias – John Gowers Jul 12 '21 at 09:38
  • The average, $\mu$, along with $N-1$ of the values determine the $N$th value so you are not free to choose $N$ values. – John Douma Jul 12 '21 at 09:59
  • @JohnDouma OP said they understood that already. – John Gowers Jul 12 '21 at 10:04
  • @JohnDouma but here $\mu$ is not known prior to choosing our sample, we took a sample first, that we think will give us an estimate of the population. – Pratheek Ponnuru Jul 12 '21 at 12:30
  • @JohnGowers the document offers a better, albeit lengthy, explanation for the (n-1) in the denominator than the "degrees of freedom" explanation, which is ambiguous at best. Thanks for pointing me to it. – Pratheek Ponnuru Jul 12 '21 at 14:37
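
The distinction raised in the comments (variance of a data set versus an estimate of the population variance from a sample) can be checked numerically. The following is only a rough sketch under stated assumptions: the population (integers 1 to 100), the sample size of 5, the number of trials, and the helper names `variance_estimates` and `average_estimates` are all made up for illustration, and sampling is done with replacement. Deviations are measured from the sample mean, since the population mean is treated as unknown. Averaged over many samples, dividing by N should come out too small by a factor of about (N-1)/N, while dividing by N-1 should land close to the population variance.

```python
import random


def variance_estimates(sample):
    """Return (sum of squares / N, sum of squares / (N - 1)) around the sample mean."""
    n = len(sample)
    sample_mean = sum(sample) / n
    sum_sq = sum((x - sample_mean) ** 2 for x in sample)
    return sum_sq / n, sum_sq / (n - 1)


def average_estimates(population, sample_size, trials, seed=0):
    """Average both estimators over many random samples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.choice(population) for _ in range(sample_size)]
        est_n, est_n_minus_1 = variance_estimates(sample)
        total_n += est_n
        total_n_minus_1 += est_n_minus_1
    return total_n / trials, total_n_minus_1 / trials


# Hypothetical population: the integers 1..100 (an assumption for illustration).
population = list(range(1, 101))
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

avg_n, avg_n_minus_1 = average_estimates(population, sample_size=5, trials=20000)

print("population variance:      ", pop_var)
print("average of the /N version:", avg_n)
print("average of the /(N-1) one:", avg_n_minus_1)
```

So the 1/N expression is a perfectly valid description of the spread within the sample itself; the N-1 version is used when the goal is to estimate the population variance without systematic underestimation.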
