GCC-PHAT (generalized cross-correlation) always peaks at delay = 0 on a real audio signal

Question

I have studied the GCC-PHAT algorithm for estimating the TDOA of audio signals at two microphones.

Here is my MATLAB implementation:

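function cc = freq_xcorr_phat(x,y)
  % GCC-PHAT of x and y; the peak of cc marks the estimated delay.
  n = length(x)+length(y)-1;  % zero-pad to the linear cross-correlation length
  X = fft(x,n);
  Y = fft(y,n);
  R = X.*conj(Y);             % cross-power spectrum
  R = exp(1i*angle(R));       % PHAT weighting: keep the phase, force unit magnitude
  cc = ifft(R);               % back to the lag domain (lag 0 at index 1)
end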

I test this function with a SIMULATED stereo audio signal, where channel 1 is a real audio recording and channel 2 is channel 1 delayed by a fixed number of samples (7 in this case). The resulting GCC-PHAT cc is plotted below and shows the expected result, a peak at -7:

BUT when I test the function with a REAL stereo audio signal (channel 2 delayed by 15 samples), the GCC-PHAT plot looks wrong. It has a peak at -15, but the peak at 0 is stronger:

THE QUESTION IS:

Why does the second plot have a peak at 0 that is stronger than the peak at -15? It doesn't make sense to me.

p.s.

The plot is actually the middle part of fftshift(cc).
My question might be related to this question.
The real stereo audio signal was recorded by an embedded system (MCU).
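
For reference, the lag axis behind the fftshift(cc) plot can be reconstructed as in the sketch below. It is only an illustration under stated assumptions: white noise stands in for the recording, and the sample rate fs is a made-up value used to convert the lag to seconds. The GCC-PHAT lines repeat the function above inline (with real() added to drop rounding-level imaginary parts) so the snippet runs on its own.

```matlab
% Hypothetical test data: white noise stands in for the real recording.
fs = 16000;                          % assumed sample rate in Hz (not from the question)
d  = 7;                              % simulated delay in samples
x  = randn(4096, 1);                 % channel 1
y  = [zeros(d, 1); x(1:end-d)];      % channel 2 = channel 1 delayed by d samples

% GCC-PHAT, same steps as freq_xcorr_phat above.
n  = length(x) + length(y) - 1;
R  = fft(x, n) .* conj(fft(y, n));
cc = real(ifft(exp(1i * angle(R))));

% fftshift moves lag 0 to the centre; build the matching lag axis.
cc_shifted = fftshift(cc);
lags = (-floor(n/2) : ceil(n/2) - 1).';

% Read the delay estimate off the strongest peak.
[~, idx] = max(cc_shifted);
est      = lags(idx);                % lag in samples
est_sec  = est / fs;                 % lag in seconds, given the assumed fs
fprintf('estimated lag: %d samples (%.6f s)\n', est, est_sec);
```

With R = X.*conj(Y) and channel 2 lagging channel 1, the peak appears at a negative lag, which matches the sign seen in the plots.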

MATLAB has a gccphat function in the Phased Array System Toolbox. Have you compared your output with that? — , Feb 14 '19 at 15:48

score 2 · Answer 1 · answered Jun 15 '19 at 12:23

I think this is caused by a DC offset in the signal. A time-domain or frequency-domain plot of the signal would have helped to confirm this. Here are two suggestions:

1. Filter the signals before the FFT, according to the signal band you are interested in.

2. Subtract the mean of the entire signal from the signal (a reasonable way to estimate the DC offset), or subtract the DC offset directly if you know it.

Your signal probably has a small swing amplitude compared with its DC offset, which makes the correlation at lag 0 larger than the correlation at the actual delay.
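
The mean-subtraction suggestion can be tried with a sketch like the one below. Everything in it is hypothetical test data, not the asker's recording: white noise stands in for the audio content, and the offset value is chosen to be much larger than the signal swing so that the effect is easy to see. The anonymous function gcc repeats the steps of freq_xcorr_phat and applies fftshift.

```matlab
% Hypothetical data: zero-mean noise riding on a large DC offset.
d      = 15;                                   % true delay in samples
offset = 1000;                                 % assumed DC offset, far larger than the signal swing
s      = randn(4096, 1);                       % stand-in for the audio content
x      = s + offset;                           % channel 1
y      = [zeros(d, 1); s(1:end-d)] + offset;   % channel 2: delayed content, same offset

n    = length(x) + length(y) - 1;
lags = (-floor(n/2) : ceil(n/2) - 1).';        % lag axis matching fftshift
gcc  = @(a, b) fftshift(real(ifft(exp(1i * angle(fft(a, n) .* conj(fft(b, n)))))));

% Raw channels: the offset common to both channels is included.
[~, i_raw] = max(gcc(x, y));
est_raw    = lags(i_raw);

% Subtract each channel's mean before the FFT.
[~, i_clean] = max(gcc(x - mean(x), y - mean(y)));
est_clean    = lags(i_clean);

fprintf('raw: %d, mean removed: %d\n', est_raw, est_clean);
```

With these assumed values, the raw estimate should land at lag 0 and the mean-removed estimate at -15. Whether the same happens with the real recording depends on how large its offset actually is, so checking mean(x) and mean(y) against the signal swing is a reasonable first step.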
