
In the book Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory by Steven M. Kay, on page 19, it says that the mean square error of an estimator, $\hat{\theta}$, given the true value, $\theta$, is defined as

\begin{align} \text{mse}(\hat{\theta}) &= E[(\hat{\theta} - \theta)^2] \\ &= E \left[ \left[\left( \hat{\theta} - E[\hat{\theta}] \right) + \left( E[\hat{\theta}] - \theta \right) \right]^2 \right]\\ &= \text{var}(\hat{\theta}) + b^2(\theta) \end{align} where $b(\theta) = E[\hat{\theta}] - \theta$ is the bias.

But if I write it out myself, it doesn't come out the same:

\begin{align} \text{mse}(\hat{\theta}) &= E[(\hat{\theta} - \theta)^2] \\ &= E \left[ \left[\left( \hat{\theta} - E[\hat{\theta}] \right) + \left( E[\hat{\theta}] - \theta \right) \right]^2 \right]\\ &= E \left[ \left( \hat{\theta} - E[\hat{\theta}] \right)^2 + \left( E[\hat{\theta}] - \theta \right)^2 + 2 \left( \hat{\theta} - E[\hat{\theta}] \right) \left( E[\hat{\theta}] - \theta \right) \right]\\ &= \text{var}(\hat{\theta}) + b^2(\theta) + 2 E \left[\left( \hat{\theta} - E[\hat{\theta}] \right) \left( E[\hat{\theta}] - \theta \right) \right] \end{align}

What am I missing here?

Gilles
Zero

2 Answers


HINT:

$$ E\left[\hat \theta - E\left(\hat \theta \right)\right] = E\left(\hat \theta\right) - E\left(\hat \theta\right) $$

Gilles
  • well now i feel stupid :) thanks anyways – Zero Dec 30 '20 at 16:32
  • It's important that $\theta$ is modeled as a deterministic parameter rather than a random variable. That's why we can take the other factor out of the expectation. – Matt L. Dec 30 '20 at 17:13
  • @MattL. yeah...i just didn't even think about expanding further and stopped like in the question...thanks :) – Zero Jan 01 '21 at 10:50
  • @MattL. It is indeed true that for classical estimation (contrary to Bayesian) $\theta$ is deterministic. – Gilles Jan 01 '21 at 14:13

It's important to point out here that Kay is talking about classical estimation, which means that the parameters to be estimated are unknown but deterministic. Hence, the term $E[\hat{\theta}]-\theta$ is deterministic, and, consequently, the last term in your equation becomes

$$\begin{align} E \left[\left( \hat{\theta} - E[\hat{\theta}] \right) \left( E[\hat{\theta}] - \theta \right) \right]&= \left(E[\hat{\theta}]-\theta\right)E\left[\hat{\theta}-E[\hat{\theta}]\right]\\&=\left(E[\hat{\theta}]-\theta\right)\left(E[\hat{\theta}]-E[\hat{\theta}]\right) \\&=0\end{align}$$
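The identity is easy to check numerically. The sketch below (a hypothetical setup, not from Kay's book) uses a deliberately biased estimator of a deterministic mean $\theta$: the sample mean shrunk by a factor of $0.9$. Note that the decomposition $\text{mse} = \text{var} + b^2$ holds exactly even for the empirical moments, since the cross term cancels algebraically:

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 2.0        # true parameter (deterministic, classical estimation)
sigma = 1.0        # noise standard deviation
n = 10             # samples per experiment
trials = 200_000   # Monte Carlo repetitions

# A deliberately biased estimator: shrink the sample mean toward zero,
# so b(theta) = 0.9*theta - theta = -0.1*theta is nonzero.
x = rng.normal(theta, sigma, size=(trials, n))
theta_hat = 0.9 * x.mean(axis=1)

mse = np.mean((theta_hat - theta) ** 2)
var = np.var(theta_hat)               # empirical var(theta_hat), ddof=0
bias = np.mean(theta_hat) - theta     # empirical b(theta)

print(f"mse          = {mse:.6f}")
print(f"var + bias^2 = {var + bias**2:.6f}")
```

The two printed values agree to machine precision, because with the empirical mean playing the role of $E[\cdot]$ the cross term $2\,b(\theta)\,E[\hat{\theta}-E[\hat{\theta}]]$ is identically zero, exactly as in the derivation above.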

Matt L.