
This question is based on the book *Fundamentals of Statistical Signal Processing* by Steven Kay, Chapter 4, Eq. (4.21), which gives the expression for the variance of the estimated coefficients when the input is a PRN sequence as

$$\textrm{Var}\left(\hat{h}_i\right)= \frac{\sigma_w^2}{Nr_{uu}[0]} \tag{1}$$
where $\sigma_w^2$ is the variance of the measurement noise, $\sigma_u^2$ is the variance of the input, and $r_{uu}[0]$ is the autocorrelation of the input at lag zero (equal to $\sigma_u^2$ for a zero-mean input).

Let the model be: $$x[n] = \sum_{i=1}^m h[i]u[n-i] + w[n] \tag{2}$$

where $u[\cdot]$ is the input process and $w[n]$ is zero-mean white Gaussian noise. $\sigma_{\rm input}^2$ is the variance of the input $u[n]$, and $\sigma_{\rm noise}^2$ is the variance of the measurement noise $w[n]$. The cross-correlation function $R_{x,u}[\cdot]$ between $x[\cdot]$ and $u[\cdot]$ is the periodic convolution of the sequence $h[\cdot]$ with the periodic autocorrelation function $R_{u,u}[\cdot]$.
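To make that last statement explicit: cross-correlating both sides of (2) with the input, and using the fact that $w[n]$ is zero mean and uncorrelated with $u[\cdot]$, gives

$$R_{x,u}[k] = E\big\{x[n]\,u[n-k]\big\} = \sum_{i=1}^m h[i]\,R_{u,u}[k-i],$$

which is the stated convolution of $h[\cdot]$ with the input autocorrelation (periodic when $u[\cdot]$ is periodic).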

Performance is evaluated using the CRLB of the channel coefficients.

  1. Can somebody please clear up my confusion about what the expression for the variance of the estimator, i.e., the Cramer-Rao lower bound for the channel coefficients, will be:

    • (a) when $u[n]$ takes real values. Is it $\frac{m\sigma_{\rm noise}^2}{N \sigma_{\rm input}^2}$?
    • (b) when $u[n]$ takes values from a finite symbol set $\{+1,-1\}$. Is it $\displaystyle \frac{m\sigma^2_{\rm noise}}{N}$, where $\sigma^2_u = 1$ and $r_{uu} = 1$?
  2. The expression in the book is for a PRN sequence, but I don't quite understand what a PRN sequence is. Is it another term for symbolic values (not real values)? By PRN do we mean the values are $\{+1,-1\}$?

EDIT: an update based on the answer and offline discussions.

Just to confirm that I am on the correct track. The general formula is $\mathrm{var}(\mathbf{\Delta h}\mathbf{\Delta h}^T) = \frac{\text{variance of measurement noise}}{\text{normalizing factor}\,\times\,\text{autocorrelation of input}} = \frac{\sigma^2}{D \cdot \text{autocorrelation of input}}$, where $D = N \cdot \text{variance} = N\sigma^2_u$.

Therefore, $\mathrm{var}(\mathbf{\Delta h}\mathbf{\Delta h}^T) = \frac{\sigma^2}{N \sigma^2_u \sum u[n]u^*[n]} = \frac{\sigma^2}{N(\sigma^2_u \sigma^{*2}_u)^2}$ [the autocorrelation having a complex conjugate term]??

For a $\pm 1$ sequence, this formula evaluates to $\mathrm{var}(\mathbf{\Delta h}\mathbf{\Delta h}^T) = \frac{\sigma^2}{N\sigma^2_u}$.

For QAM: $\mathrm{var}(\mathbf{\Delta h}\mathbf{\Delta h}^*) = \frac{\sigma^2}{D \left(\sum u[n] u^*[n]\right)} = \frac{\sigma^2}{\big(N \sigma_u \sigma_u^*\big) \big(\sum u[n] u^*[n]\big)}$

Ria George
  • did you see all my comments in the offline discussion? I believe $D$ is $N \cdot \text{variance}$, not $N \cdot \text{rms}$ – Dan Boschen Feb 28 '17 at 22:21
  • And yes in the computation all squared values are done with complex conjugates so the result will be real. You also omitted your factor of m which is the noise growth from your final filter where m is the number of taps in that filter. – Dan Boschen Feb 28 '17 at 22:24
  • @DanBoschen: I did see all the comments in the offline discussion, and there you wrote $D = N \cdot \text{rms}$. So now $D$ will be $N$ times the variance and there won't be any square root term? If yes, then this makes more sense now. However, do you think there is any direct value to substitute for the autocorrelation of the input, which appears in the denominator for QAM? – Ria George Mar 01 '17 at 00:23
  • Could you please see a new question which is related to this one? I thought of informing you in the confidence that you may be better able to understand what the question is about, as in the new question the input characteristic changes to an ergodic process. The link is http://dsp.stackexchange.com/questions/38012/autocorrelation-for-ergodic-signal – Ria George Mar 01 '17 at 00:37
  • I had commented after the first $D = N \cdot \text{rms}$ that I thought it should be $N \cdot \text{variance}$; if you agree, can you update your post above to avoid confusion? Also, I believe the autocorrelation in your denominator should be at delay = 0 (which is the variance when divided by $N$). – Dan Boschen Mar 01 '17 at 12:39
  • @DanBoschen: I have tried my best to correct it but am still not confident about my understanding. It will be immensely helpful if you can kindly change the update where I have made mistakes. Why do we take the autocorrelation at delay 0? Please forgive me if these questions are lame; I am still learning, and very basic things are impossible to ask from an instructor. – Ria George Mar 01 '17 at 19:55
  • The autocorrelation at delay = 0 is the variance (or proportional to it, depending on how the autocorrelation is normalized). – Dan Boschen Mar 01 '17 at 22:55
  • And I think you are missing a complex conjugate in your first autocorrelation expression? Please do take a closer look at the link I gave you http://dsp.stackexchange.com/questions/31318/compensating-loudspeaker-frequency-response-in-an-audio-signal/31326#31326 as you can use that code with QAM signals and confirm the scaling is correct yourself by measuring the variance on the coefficient solution that you get using the Wiener-Hopf equations. – Dan Boschen Mar 01 '17 at 23:01
  • @DanBoschen: sorry to sound noisy again, but your last comment has confused me a bit. If the autocorrelation at delay 0 is the variance, then does the denominator have a product of variances? Is this the formula then: $\frac{\sigma^2}{N(\sigma^2 \sigma^{*2})^2}$? I have put the complex conjugate term in the first autocorrelation. It would be very helpful if you could kindly update your answer with the part that you wrote in the offline discussion, and the update in the Question which I wrote, to avoid any mistakes. – Ria George Mar 02 '17 at 17:08

1 Answer


A PRN sequence is a Pseudo-Random Noise sequence, often generated using a Linear Feedback Shift Register (LFSR) with the feedback taps chosen according to a primitive irreducible polynomial in GF(2), the Galois field of two elements. When a primitive irreducible polynomial in GF(2) is used, the LFSR will produce a "maximum length sequence", meaning the pattern at its output will be the longest possible for the number of registers in the LFSR.

An example generated from the polynomial $1+x^3+x^4$ is shown in the picture below. This polynomial is primitive and irreducible in GF(2) (and in this case we will use the symbols $-1$ and $+1$ as the two elements of the field), so this LFSR as configured will produce a maximum length sequence: the state of the generator, i.e. the values after each cycle at the output of each register, will cycle through every possible combination pseudo-randomly before repeating, except for the all-zero state. (So in the picture shown the states are $0001$, $0010$, $0011$, $0100$, $\ldots$, $1111$ in pseudo-random order, and the sequence at the output will have $15$ values before repeating.) In general the maximum length is $2^N-1$, where $N$ is the order of the polynomial; the $-1$ is because the all-zero state is not allowed.

For clarity we will refer to each output of the LFSR as a "chip", and to the complete sequence as a "symbol".

[Figure: LFSR for the polynomial $1+x^3+x^4$, showing the four registers and the feedback taps.]
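As a side note, here is a minimal Python sketch of the LFSR just described. The register-to-bit mapping and the $0 \rightarrow +1$, $1 \rightarrow -1$ output convention are illustrative choices, not the only possible ones:

```python
# Minimal sketch of the LFSR for the polynomial 1 + x^3 + x^4.
# Bit 3 of `state` is the first register, bit 0 the last (output) register.

def lfsr_prn(n_chips, seed=0b0001):
    """Generate n_chips of the maximum length sequence (period 2^4 - 1 = 15)."""
    state = seed                            # any nonzero 4-bit seed; all-zero locks up
    chips = []
    for _ in range(n_chips):
        out = state & 1                     # output of stage 4 (the x^4 tap)
        fb = (state ^ (state >> 1)) & 1     # stage 4 XOR stage 3 (the x^3 tap)
        state = (state >> 1) | (fb << 3)    # shift; feedback enters stage 1
        chips.append(1 - 2 * out)           # map 0 -> +1, 1 -> -1
    return chips

seq = lfsr_prn(15)
print(seq)                                  # 15 chips before the pattern repeats
print(lfsr_prn(30)[15:] == seq)             # True: the period is 2^4 - 1 = 15
```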

When you correlate a matched PRN sequence + noise (the input) against the reference PRN, you multiply the input by each coefficient in the sequence and sum the result (correlation = multiply and accumulate). If you are aligned to the code, the multiplication turns the $-1$'s in the input into $+1$'s (since $-1 \times -1 = +1$) and leaves the $+1$'s as $+1$'s, essentially converting the pseudo-random pattern into all $+1$'s. This output of all ones is then accumulated to produce the correlator output. (Below shows the multiplication that would then be fed to an accumulator after each result.)

[Figure: chip-by-chip multiplication of the aligned input with the PRN reference, prior to accumulation.]

The accumulation of the signal component (which is all $1$'s when aligned) will go up in magnitude by a factor of $N$ given $N$ samples in the sequence (assuming one sample per chip).

Assuming the input noise itself is uncorrelated with the PRN sequence, and white within our sample space (meaning each sample of noise is uncorrelated with all the other samples), the multiplication by $+1/-1$ will not change the variance of the noise. Therefore the standard deviation of the noise will increase by $\sqrt{N}$ in the accumulator. (When you add $N$ uncorrelated noise samples, the standard deviation goes up by $\sqrt{N}$.)

So the variance of the noise increases by $N$, but the variance of the signal increases by $N^2$ (since the signal magnitude, not power, went up by a factor of $N$). The end result is that the signal-to-noise ratio (SNR) improves by a factor of $N$ in power, or $10\log_{10}(N)$ dB, and this is referred to as the "processing gain" of the PRN correlator.
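As a quick numerical check of this processing gain, here is a sketch assuming NumPy; the sequence length, noise level, and the use of a random $\pm 1$ sequence as a stand-in for an actual LFSR output are all arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1023                              # chips per symbol (arbitrary)
prn = rng.choice([-1.0, 1.0], N)      # stand-in for a +/-1 PRN sequence
sigma = 0.5                           # noise standard deviation (arbitrary)

trials = 2000
outputs = np.empty(trials)
for k in range(trials):
    rx = prn + sigma * rng.standard_normal(N)   # aligned PRN plus white noise
    outputs[k] = np.sum(rx * prn)               # correlate: multiply and accumulate

print(outputs.mean())   # ~N: the signal magnitude grows by N
print(outputs.std())    # ~sigma*sqrt(N): the noise std grows by sqrt(N)
# SNR in power therefore improves by N^2/N = N, i.e. 10*log10(N) dB
```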

If we normalize the signal level to be the same at input and output, the noise standard deviation will have changed by a factor of $1/\sqrt{N}$, and its variance by $1/N$, matching your equation.

Stating this mathematically, consider our samples at the output of the multiplier as $$X_i\quad\text{for}\quad i=1, 2, \ldots, N$$

The multiplication by $+1/-1$ does not change the distribution of the noise on the input samples (they will still be independent and identically distributed, IID), with the same variance as the input given as $$\textrm{Var}(X_i) = \sigma^2$$

Multiplying by $\pm 1$ with a synchronized PRN causes the mean to be 1 as previously described, $$E(X_i)=1$$

We will denote the output after accumulating over all $N$ samples, divided by $N$ to be consistent with the normalized autocorrelation $r_{uu}=1$, as $Y$: $$Y = \frac{1}{N}\sum_{i=1}^N X_i$$

where $X_i$ is the product of the input waveform with the PRN, so that $E(Y) = r_{uu} = 1$.

This results in the expression for the mean and variance of $Y$ as follows:

\begin{align} E(Y) &= E\left[\frac{1}{N}\left(X_1+X_2+ \ldots +X_N\right)\right] = \frac{1}{N}E\left[(X_1+X_2+ \ldots +X_N)\right]\\ \textrm{Var}(Y) &= \textrm{Var}\left[\frac{1}{N}\left(X_1+X_2+ \ldots +X_N\right)\right] = \frac{1}{N^2}\textrm{Var}\left(X_1+X_2+ \ldots+X_N\right) \end{align}

\begin{align} E\left[X_1+X_2+ \ldots +X_N\right] &= E[X_1]+E[X_2]+ \ldots + E[X_N] = N\\ \textrm{Var}(X_1+X_2+ \ldots +X_N) &= \textrm{Var}(X_1)+\textrm{Var}(X_2) + \ldots + \textrm{Var}(X_N) = N\sigma^2 \end{align}

Therefore \begin{align} E(Y) &= \frac{1}{N}E\left[X_1+X_2+ \ldots +X_N\right] = \frac{N}{N} = 1\quad\text{and}\\ \textrm{Var}(Y) &= \frac{1}{N^2}\textrm{Var}(X_1+X_2+ \ldots +X_N) = \frac{1}{N^2}N\sigma^2 = \frac{\sigma^2}{N} \end{align}
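The same result is easy to confirm numerically; a sketch assuming NumPy, with arbitrary values for $N$ and $\sigma$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma, trials = 255, 2.0, 5000
prn = rng.choice([-1.0, 1.0], N)      # stand-in for a +/-1 PRN sequence

# Y = (1/N) * sum(X_i), with X_i the product of (PRN + noise) and the PRN
Y = np.array([np.mean((prn + sigma * rng.standard_normal(N)) * prn)
              for _ in range(trials)])

print(Y.mean())       # ~1          = E(Y)
print(Y.var())        # ~sigma^2/N  = Var(Y)
print(sigma**2 / N)   # theoretical value, for comparison
```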

Gilles
Dan Boschen
  • Thank you for the illustration. I have one last doubt, which is: is this result $\sigma^2/N$ applicable to any symbolic sequence, say a symbolic sequence taking values 1,2,3,4 or -1,+1,-2,+2? – Ria George Feb 27 '17 at 01:29
  • I would say not as then the noise in each sample becomes weighted, unlike the +1 /-1 case where the "gain" for each summation is 1 (Note if you have a white noise process and you multiply it by +1 or -1 it does not change the distribution or scale of the noise, however multiplying by 2 for some of the samples and 1 for others would bias the result). Do you agree? – Dan Boschen Feb 27 '17 at 02:37
  • I was thinking that if the symbols are equi-probable then $r_{uu}$ = 1? – Ria George Feb 27 '17 at 04:24
  • The general equation for the autocorrelation is $R_{uu}(n) = \frac{1}{N} \sum_{k=0}^{N-1} u(k)\, u(k-n)$. We can see that if each sample is $+1/-1$ then $r_{uu} = 1$, but if your samples are larger than 1, then $r_{uu}$ will not be 1, right? – Dan Boschen Feb 27 '17 at 04:36
  • I see. Just to confirm if my understanding is correct: in general, the expression for the variance of the estimator $\hat{h}$ is $\sigma^2/(N\sigma^2_u r_{uu})$. Depending on the input type $u$, the expression changes to $\sigma^2/(N\sigma^2_u)$ only for input taking values in $\{+1,-1\}$. Otherwise, the expression remains the same for all other input types. Am I correct? – Ria George Feb 27 '17 at 14:54
  • And, would the order number of the model, $m$ be in the numerator as I wrote in the Question? – Ria George Feb 27 '17 at 15:14
  • No it would change for all input types- recognize the averaging operation that is being performed and how the variance changes in an average. For the +/-1 case this was equivalent to a simple average but for the other types you describe they could be viewed as a weighted average. – Dan Boschen Feb 27 '17 at 15:31
  • And yes the order of the model $m$ would be as written in that you are summing over m coefficients each with IID noise, so the final variance would increase by a factor of m – Dan Boschen Feb 27 '17 at 15:58
  • This sounds like a channel estimation/equalization using the Wiener-Hopf equations for example where N chips are used as a training sequence to estimate the channel, and your resulting equalizer filter has m taps. The coefficients in the solution have noise that is the input noise variance /N and then the number of taps in your equalizer is summing these noise errors in power, resulting in the increase by m for m taps. If this is the case it assumes the noise contribution from tap to tap is uncorrelated, which is valid if the initial input is white noise. Is this what you are doing?? – Dan Boschen Feb 27 '17 at 21:11
  • Yes, I am trying to find the expression for the variance of the channel estimates (coefficients) for non-blind channel estimation where the input to the channel is known. The input follows a multinomial distribution. So, I am confused what the expression would be. Then, I want to find the expression for the variance of the channel estimates in a blind setting. I have asked a question for this one here http://dsp.stackexchange.com/questions/37876/what-is-the-correct-way-in-formulating-log-likelihood-expression-and-doubt-if-we Could you please have a glance at that question also? – Ria George Feb 27 '17 at 22:47
  • As you say that the noise distribution from tap to tap is uncorrelated, then would the expression $\sigma^2/N$ hold for input taking in values from a finite alphabet set but other than $\pm1$ and the size of the alphabet set is higher that the binary alphabet set? – Ria George Feb 27 '17 at 22:50
  • Thank you for your continued discussion and helpful insights. – Ria George Feb 27 '17 at 22:51
  • I see, now I understand better what you are doing. The sequence you use for training, with the expanded alphabet, can we assume its autocorrelation is 1 (or can be normalized to 1) at 0 time offset and 0 everywhere else (meaning sufficiently pseudo-random)? – Dan Boschen Feb 27 '17 at 23:36
  • The assumptions that are applied to QAM symbols are applied to the symbols in my application. The source symbols are 64-QAM. I don't know if for QAM we assume autocorrelation 1. My basic is not that strong in communications. I think it maybe safe to assume the symbols are pseudo random but the probability of occurrence of each symbol is different, so the symbols are not equi-probable. – Ria George Feb 28 '17 at 01:13
  • This may help give you further insight as well, notably the code for Octave/Matlab that you could take and apply to your case perhaps: http://dsp.stackexchange.com/questions/31318/compensating-loudspeaker-frequency-response-in-an-audio-signal/31326#31326 – Dan Boschen Feb 28 '17 at 02:42
  • Thanks for the link, very useful. However, I don't know the answer regarding the autocorrelation $r_{u,u}$ for multiple-level symbols. I checked the textbooks as well, but the information is not there. Can you please say what the expression would be? – Ria George Feb 28 '17 at 03:34
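For what it's worth, the autocorrelation definition given in the comments above can be evaluated numerically for a multiple-level alphabet. Here is a sketch assuming NumPy and a 64-QAM constellation normalized to unit average power; the equiprobable-symbol assumption is mine (the OP noted the symbols may not be equiprobable):

```python
import numpy as np

rng = np.random.default_rng(2)

# 64-QAM alphabet: real and imaginary parts drawn from {-7, -5, ..., 5, 7}
levels = np.arange(-7, 8, 2, dtype=float)
alphabet = np.array([a + 1j * b for a in levels for b in levels])
alphabet /= np.sqrt(np.mean(np.abs(alphabet) ** 2))   # unit average power

u = rng.choice(alphabet, 10_000)          # equiprobable symbols (an assumption)
r_uu_0 = np.mean(u * np.conj(u)).real     # autocorrelation at lag 0
print(r_uu_0)                             # ~1 after the power normalization
```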