3

UPDATE 2 Okay, I think I understand now why the defined CRLB is not applying to my use case. In my use case, we have very high SNR, and the the first signal $x_1$ is always the same. So the classical implementation in the literature of $$ x_1 = s(t)+n_1 (t)$$ $$ x_2 = s(t-D)+n_2 (t)$$ is not entirely correct, since this case, I believe, is usually defined for different random noise $n_1$ and $n_2$ at each shot. In my case, however, we always use the same reference $x_1$ for comparison with successive $x_2$, so $n_1$ is fixed. In the case of cross-correlation, i see this as basically having a $x_1$ without noise (so the noisy signal is the reference), and a $n_2$ with noise.

So, normally, cross-correlation would imply $$ x_ 1 *x_2 = R_{xx}(t-D)+s*n_1+s*n_2+n_1*n_2$$ But if consider no noise in $x_1$, I would get $$ x_ 1 *x_2 = R_{xx}(t-D)+s*n_2$$ Which, assuming the $n_1*n_2$ term is negligible, and $s*n_1 \approx s*n_2$would imply about half the influence of noise, correct?

I assume that maybe this would alter how SNR is involved in the cramer-rao. Hopefully I'm not saying anything overly dumb, but if possible I would like to have some insight on this issue.

UPDATE (Discussion in the comments)

I found the reason for why I would get a bias with sinc interpolation. Since in my application I use the same reference several times, and the first measurement is subject to noise, the reference itself has some time displacement D at the start. I tried using as reference (x1) the signal before corrupting with noise and now both values agree. Nevertheless, I wouldn't want this to detract from the question. Without the bias, both the MSE and the Variance are twice that of the cramer rao. I believe the cramer-rao is correct, according to literature, so any thoughts on why this happens?

In summary, I get this :

  • Random noise in $x_1$, random noise in $x_2$ (usual textbook case) : MSE = Variance = CRLB.
  • No noise in $x_1$, random noise in $x_2$ : MSE = Variance, although two times less than the CRLB.
  • Always the same noise in $x_1$, random noise in $x_2$ (my intended case) : Lower MSE, higher variance, which would suggest a biased estimator.

See UPDATE 2 for my conclusions on the matter.


Original Question

Hope this isn't a repeat question, as I've seen a couple regarding the CRLB for cross-correlation but none seems to mention this problem.

According to the following papers : Equation 5/Appendix A and Equation 24/Table 2, the Cross-correlation for a discrete random white Gaussian noise, critically sampled is

$$ \sigma^2_{CRLB} = \frac{3}{N\pi^2}\left(\frac{1+2SNR}{SNR^2}\right) $$

Which, from my understanding, should give a lower bound on the variance of the obtained lags, given the number of samples $N$ and the signal SNR. However, when I attempt to simulate it, I keep getting the variance of my signal about 2.5 times lower, for the case of zero delay.

% Test Cramer Rao

close all, clc, clear all

runs = 100;
MSE  = 0;
CRLB = 0;
for r = 1:runs
Iterations = 1000;
N = 512;
Delay = 0;

% 1 : Generate a Random White Gaussian Signal
% ---
signal = randn(1,N);
signal = repmat(signal,[Iterations,1]);

% 2 : Generate the noise required to obtain a given SNR.
% ---
SNR = 20;
noise_variance = 1/SNR* var(signal(1,:));
noise  = sqrt(noise_variance)*randn(Iterations,N);

% 3 : Calculate the actual SNR for both the zero-mean noise and zero-mean
%     signal
% ---
SNR = (rms(signal(1,:))/rms(noise(1,:)))^2;

% 4 : Corrupt signal with generated noise
% ---
signal_w_noise = signal + noise;

% 5 : Process cross correlations to calculate lags,
%     interpolate as defined 
% ---
x1 = signal_w_noise(1,:);
lags = [];
for seg = 1:Iterations
    x2 = signal_w_noise(seg,:);
    [Correlation, lag_array] = xcov(x1,x2);
    [~,peak_index] = max(Correlation);

    coarse = lag_array(peak_index);
    fine = -(1/2)*(Correlation(peak_index+1) - Correlation(peak_index-1)) / (Correlation(peak_index+1) + Correlation(peak_index-1) - 2*Correlation(peak_index));

    lag = coarse+fine;
    lags = [lags lag];
end
% Calculate the mean square error and compare it to the CRLB
% ---
MSE  = MSE + (mean((lags - Delay).^2))/runs;
CRLB =  CRLB + ((3/((pi.^2)*(N))) * ((1 + 2 .* SNR)./(SNR.^2)))/runs;

end


disp(CRLB/MSE)

In this presented code, I am using the interpolation mentioned in one of the papers cited above, which I get that it's biased, and that may have something to do with it. I assume there is some lack of understanding on my part on what I should expect from the CRLB, and I was hoping some of you could explain to me what I am not understanding.

NOTE 1:

I applied MATLAB's interpft function (which I assume should be unbiased), and I also get wildly changing variances, which after a couple of runs approach 1, although I feel that since the Cramer-Rao constitutes a lower bound, it should be impossible for the variance to go below it. Code here: here.

NOTE 2:

I've seen some discussion on one of the papers here: Direct Correlation (DC) time delay estimation: variance keeps decreasing for increasing SNR?, but I don't think it corresponds to my problem. In my case, the variance of the signal is lower than the CRLB, which should be a lower bound.

Royi
  • 19,608
  • 4
  • 197
  • 238
  • 1
    You can get lucky. A single realization is a random variable. If the average of many realizations is less than the CRLB, there is a problem but any single realization can beat the CRLB –  Dec 17 '17 at 22:03
  • I get that, but even so, when I attempt using the interpft method several times, if I compute the variance (instead of the MSE), it approaches 2.

    Shouldn't the interpft (zero-padding in the frequency domain, or sinc interpolation) method be essentially unbiased, since it comes from the Whittaker-Shannon theorem?

    – Luca Martini Dec 18 '17 at 16:24
  • If you are calculating both MSE and variance, How does the mean and mean error compare? –  Dec 18 '17 at 16:35
  • Interpolation introduces correlation. Did you adjust for the actual degrees of freedom? –  Dec 18 '17 at 19:26
  • Sorry, I don't understand what you mean by Interpolation introduces correlation. Do I need to adjust any degrees of freedom? I'm only correlating a white gaussian signal with itself, with different noises. – Luca Martini Dec 18 '17 at 19:51
  • Try doubling the interpret factor and see if your variance decreases –  Dec 18 '17 at 19:58
  • 1
    tried it at 500 and 10.000, to make sure. I get fairly similar results. I implemented the zero pad on the frequency domain myself, and got similar results. I don't understand why sinc interpolation in the frequency domain would be biased, actually.. – Luca Martini Dec 18 '17 at 21:52
  • Also, to comment on what you asked before, I get something like 1.25*CRLB for MSE, 2 times the CRLB, with the Variance, and +-0.01 for the mean. Obviously there is some bias in the calculation, as it is not zero mean, and the variance is much larger than the CRLB. – Luca Martini Dec 18 '17 at 21:55
  • I found the reason for the bias. It was a stupid reason, since in my application I use the same reference several times, and the first measurement is subject to noise, the reference itself has some time displacement D at the start. I tried using the reference without noise and now both values agree.

    Nevertheless, I wouldn't want this to detract from the question. Without the bias, both the MSE and the Variance are twice that of the cramer rao. I believe the cramer-rao is correct, according to literature, so any thoughts on why this happens?

    – Luca Martini Dec 19 '17 at 08:24
  • CRLB isn’t tight, particularly at low SNR. Ziv Zakai is more realistic –  Dec 19 '17 at 11:31
  • Thanks for the suggestion, I will look into it, although I don't think that's the problem, since my application is at very high SNR. I have some idea of what may be the problem, and I updated it in the question. – Luca Martini Dec 19 '17 at 14:59
  • Are you after the derivation of the CRLB for Shift Estimation? I'm not sure I understood your question. – Royi Jul 29 '18 at 18:02

1 Answers1

2

CRLB is also a function of signal bandwidth (effective bandwidth or second derivative of xcorr peak):

$$\sigma=\dfrac{1}{2\pi F_e\sqrt{\frac{2E}{N_0}}} $$

Also note, that SNR is not RMS/RMS. The reason is very simple, your signal occupies a certain time, while noise is always here. Also, RMS is not std. See Subsample interpolation bias error in time of flight estimation by direct correlation in digital domain.

For $F_e$ calculation $E$ can be sum(signal.^2) if N0=en.^2, en=std(noise)/sqrt(fN) if noise is white and covering up to Nyquist, otherwise use noise BW.

lennon310
  • 3,590
  • 19
  • 24
  • 27