
I am having trouble proving the identity below.

$E[\Sigma(y_i-\bar{y})^2]=(n-1)\sigma^2 +\beta_1^2\Sigma(x_i-\bar{x})^2$

where the assumptions are

$Cov[y_i,y_j]=0$ for $i \ne j$

$E[y_i]=\beta_0+\beta_1x_i, Var[y_i]=\sigma^2$

$\hat\beta_0$ and $\hat\beta_1$ are the least squares estimates of $\beta_0$ and $\beta_1$

So far I understand that $$E[\hat\beta_1^2]=\beta_1^2 +\frac{\sigma^2}{\Sigma(x-\bar x)^2}$$

but I seem to be having real trouble understanding the relationship between $x$ and $y$ :(

I am thinking that

$$E\left[n \frac{1}{n}\Sigma(y_i-\bar{y})^2 \right]=n E[Var[Y_i]]= n\sigma^2$$ which looks nothing like the expression...

May I get some help, please?

hyg17
  • I don't see the point of mentioning $\hat\beta_0$ and $\hat\beta_1$ because they aren't part of the formula you want to establish. Can you clarify? – Jean Marie Feb 03 '19 at 13:33
  • That is actually one of the issues that I am having, too... All I know right now is that $y_i = \beta_0 +\beta_1x_i + \epsilon_i$ and that $\hat\beta_0$, $\hat\beta_1$ are the least squares estimates. – hyg17 Feb 03 '19 at 13:48
  • I am sure that it is used to prove that $\hat\sigma^2= \frac{1}{n-2} \Sigma(y_i-\hat\beta_0-\hat\beta_1 x_i)^2$ which is a problem that comes after this one. – hyg17 Feb 03 '19 at 13:49
  • One thing is sure, it is a generalization of the "sample variance formula": ${\displaystyle s^{2}={\frac {n}{n-1}}\sigma_{Y}^{2}={\frac {n}{n-1}}\left({\frac {1}{n}}\sum_{i=1}^{n}\left(Y_{i}-{\overline {Y}}\right)^{2}\right)={\frac {1}{n-1}}\sum_{i=1}^{n}\left(Y_{i}-{\overline {Y}}\right)^{2}}$ (copied from https://en.wikipedia.org/wiki/Variance) – Jean Marie Feb 03 '19 at 15:25

3 Answers


What follows is a complete explanation, with more detail than may strictly be needed for a full understanding.

In the linear regression model $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$ the only random variable on the right-hand side of this equation is $$\epsilon_i \sim \operatorname{Normal}(0, \sigma^2).$$ Everything else is either a parameter ($\beta_0$, $\beta_1$) or a covariate ($x_i$). The left-hand side $y_i$ is therefore a random variable, whose randomness is attributed to the error term. As the errors are independent, so are the responses.

Note there is no parameter estimation mentioned here. $\beta_0$ and $\beta_1$ represent the true parameters for the model, in the sense that if you were to make numerous observations of the response for a given value of $x_i$, you would find that these would be normally distributed with mean $\mu_i = \beta_0 + \beta_1 x_i$ and variance $\sigma^2$.

The best way to understand the random variable $$\sum_{i=1}^n (y_i - \bar y)^2$$ is to first ask, if a single $y_i$ has expectation $\mu_i$, then what is the expectation of the sample mean $\bar y$? This follows easily from linearity of expectation: $$\operatorname{E}[\bar y] = \frac{1}{n} \sum_{i=1}^n \operatorname{E}[y_i] = \frac{1}{n} \sum_{i=1}^n (\beta_0 + \beta_1 x_i) = \beta_0 + \beta_1 \bar x,$$ which is simply the response at the mean value of the covariate. Let's call this value $\mu$.

Now it is easy to partition the sum of squares, knowing that by construction, the mean deviation of $\bar y$ from $\mu$ is zero: $$\begin{align*} \sum_{i=1}^n (y_i - \bar y)^2 &= \sum_{i=1}^n (y_i - \mu + \mu - \bar y)^2 \\ &= \sum_{i=1}^n \left( (y_i - \mu)^2 + 2(y_i - \mu)(\mu - \bar y) + (\mu - \bar y)^2 \right) \\ &= \sum_{i=1}^n (y_i - \mu)^2 + 2 (\mu - \bar y) \sum_{i=1}^n (y_i - \mu) + n(\mu - \bar y)^2 \\ &= \sum_{i=1}^n (y_i - \mu)^2 + 2 (\mu - \bar y)(n \bar y - n \mu) + n(\mu - \bar y)^2 \\ &= \sum_{i=1}^n (y_i - \mu)^2 - n(\mu - \bar y)^2. \end{align*}$$ The expected value $\operatorname{E}[(\bar y - \mu)^2] = \operatorname{Var}[\bar y]$ by definition, and by the independence of responses, $$\operatorname{Var}[\bar y] \overset{\text{ind}}{=} \frac{1}{n^2} \sum_{i=1}^n \operatorname{Var}[y_i] = \frac{\sigma^2}{n}.$$ So all that remains is to compute the expectation of the first term. But $$\operatorname{E}\left[\sum_{i=1}^n (y_i - \mu)^2\right] = \sum_{i=1}^n \operatorname{E}[(y_i - \mu)^2],$$ and since $$\begin{align*}\operatorname{E}[(y_i - \mu)^2] &= \operatorname{E}[(y_i - \mu_i + \mu_i - \mu)^2] \\ &= \operatorname{E}[(y_i - \mu_i)^2] + 2(\mu_i - \mu) \operatorname{E}[y_i - \mu_i] + (\mu_i - \mu)^2 \\ &= \operatorname{Var}[y_i] + (\mu_i - \mu)^2 \\ &= \sigma^2 + (\mu_i - \mu)^2, \end{align*}$$ we obtain after putting everything together $$\begin{align*} \operatorname{E}\left[\sum_{i=1}^n (y_i - \bar y)^2\right] &= n \sigma^2 + \sum_{i=1}^n (\mu_i - \mu)^2 - \sigma^2 \\ &= (n-1)\sigma^2 + \sum_{i=1}^n (\beta_0 + \beta_1 x_i - (\beta_0 + \beta_1 \bar x))^2 \\ &= (n-1)\sigma^2 + \beta_1^2 \sum_{i=1}^n (x_i - \bar x)^2 \end{align*}$$ as claimed.

Throughout our discussion, the only random variables here have been $\epsilon_i$ and any functions of these, such as $y_i$ and $\bar y$. The quantities $\mu_i$ and $\mu$ are not random, being functions of the parameters and covariate. It is important to keep this in mind.
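Not part of the derivation above, but a quick Monte Carlo check makes the identity easy to verify numerically; the particular values of $\beta_0$, $\beta_1$, $\sigma$ and the $x_i$ below are arbitrary illustrative choices, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative choices (not from the question)
beta0, beta1, sigma = 2.0, 0.5, 1.5
x = np.linspace(0.0, 10.0, 20)   # fixed covariates
n = len(x)
n_sims = 200_000

# Simulate y_i = beta0 + beta1 * x_i + eps_i with independent N(0, sigma^2) errors
eps = rng.normal(0.0, sigma, size=(n_sims, n))
y = beta0 + beta1 * x + eps

# Monte Carlo estimate of E[ sum_i (y_i - ybar)^2 ]
ss = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
print("simulated:", ss.mean())

# Claimed value: (n - 1) * sigma^2 + beta1^2 * sum_i (x_i - xbar)^2
print("claimed:  ", (n - 1) * sigma**2 + beta1**2 * ((x - x.mean()) ** 2).sum())
```

The two printed numbers should agree up to simulation noise.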

heropup
  • Thank you for your detailed explanation! I would be able to come back to this page whenever I would like to review this again. – hyg17 Feb 04 '19 at 19:25

\begin{align} \mathbb{E}\sum (y_i - \bar{y} ) ^2&= \mathbb{E}\sum y_i^2 - n\mathbb{E}\bar{y} ^2\\ &=n\operatorname{Var}(y_i)+\sum\mathbb{E}^2(y_i)-n\left(\operatorname{Var}(\bar{y})+\mathbb{E}^2(\bar{y})\right)\\ &=n\sigma^2 + n \beta_0^2 + 2\beta_0\beta_1 \sum x_i + \beta_1^2\sum x^2_i -\sigma^2 - n\beta_0^2 - 2n\beta_0\beta_1\bar{x} - n\beta_1^2\bar{x}^2\\ &= (n-1)\sigma^2 +\beta_1^2\left(\sum x_i^2 - n \bar{x}^2\right)\\ &= (n-1)\sigma^2 +\beta_1^2\sum (x_i - \bar{x})^2. \end{align}
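The first two lines above rely on two standard facts that may be worth spelling out (this is just the algebra behind the answer, not an extra assumption):

$$\sum_{i=1}^n (y_i-\bar y)^2 = \sum_{i=1}^n y_i^2 - 2\bar y\sum_{i=1}^n y_i + n\bar y^2 = \sum_{i=1}^n y_i^2 - n\bar y^2, \qquad \mathbb{E}[X^2] = \operatorname{Var}(X) + \big(\mathbb{E}[X]\big)^2,$$

the latter being applied to $X = y_i$ and $X = \bar y$, together with $\mathbb{E}[\bar y] = \beta_0 + \beta_1\bar x$ and $\operatorname{Var}(\bar y) = \sigma^2/n$.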

V. Vancak
  • Here is what really confuses me: why is $\bar y$ not treated as a constant? Also, if $E[\bar y] = E[y_i]$, then I am thinking that the $\beta_0+\beta_1 x_i$ terms cancel ... – hyg17 Feb 04 '19 at 00:47
  • @hyg17 Please see the edited answer – V. Vancak Feb 04 '19 at 08:25
  • Thank you for the concise explanation. I was able to fully understand the situation that I was conflicted in. – hyg17 Feb 04 '19 at 19:25

We are assuming $y_i = \beta_0+\beta_1 x_i + \varepsilon_i$, where the $\varepsilon$'s are uncorrelated with mean zero and variance $\sigma^2$. Averaging over $i$ gives $\bar y = \beta_0 +\beta_1 \bar x + \bar\varepsilon$. Subtracting these gives $$ y_i - \bar y = \beta_1(x_i-\bar x) + (\varepsilon_i-\bar\varepsilon).\tag1$$

Square: $$ (y_i - \bar y)^2 = \beta_1^2(x_i-\bar x)^2 +2\beta_1(x_i-\bar x)(\varepsilon_i-\bar\varepsilon) + (\varepsilon_i-\bar\varepsilon)^2.\tag2$$ Take expectation. Remember the $x$'s are constant, so the middle term vanishes. [Reason: $E(\varepsilon_i)=0$, so that $E(\bar\varepsilon)=E[\frac1n\sum \varepsilon_i]=0$ also.] We are left with $$ E[(y_i - \bar y)^2] = \beta_1^2(x_i-\bar x)^2 + E[(\varepsilon_i-\bar\varepsilon)^2].\tag3$$ Sum over $i$, using linearity of expectation: $$E\sum_{i=1}^n (y_i - \bar y)^2=\beta_1^2\sum_{i=1}^n(x_i-\bar x)^2 + E\sum_{i=1}^n(\varepsilon_i-\bar\varepsilon)^2.\tag4$$ The rightmost term equals $(n-1)\sigma^2$, by adapting the result for independent variables to the uncorrelated case.
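For completeness, here is one way to carry out that last adaptation, using only $E[\varepsilon_i]=0$, $E[\varepsilon_i^2]=\sigma^2$ and the fact that uncorrelatedness gives $\operatorname{E}[\bar\varepsilon^2]=\operatorname{Var}[\bar\varepsilon]=\sigma^2/n$:

$$E\sum_{i=1}^n(\varepsilon_i-\bar\varepsilon)^2 = E\left[\sum_{i=1}^n \varepsilon_i^2 - n\bar\varepsilon^2\right] = n\sigma^2 - n\operatorname{E}[\bar\varepsilon^2] = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2.$$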

grand_chat