
Every time I think I have understood the covariance matrix, someone else comes up with a different formulation.

I am currently reading this paper:

J. Benesty, "Adaptive eigenvalue decomposition algorithm for passive acoustic source localization", J. Acoust. Soc. Am. Volume 107, Issue 1, pp. 384-391 (2000)

and I have come across a formulation I do not quite understand. Here, the author is constructing the covariance matrix between two signals, $x_1$, and $x_2$. Those two signals are from different sensors.

For the covariance matrix of one signal, I know that we can get it by forming the regression (data) matrix, multiplying it by its own Hermitian transpose, and dividing by $N$, the length of the original vector. The size of the covariance matrix here can be arbitrary, with the maximum size being $N\times N$.

For the covariance matrix of two spatial signals, if we place the first signal in the first row and the second signal in the second row of a matrix, then multiply that matrix by its Hermitian transpose and divide by $N$, we get a $2\times 2$ covariance matrix of the two spatial signals.
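For concreteness, here is a small numerical sketch of that row-stacking construction (an illustration with made-up data; the variable names and the signal model are mine, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000  # number of samples per sensor

# Two illustrative sensor signals with some cross-correlation
x1 = rng.standard_normal(N)
x2 = 0.6 * x1 + 0.8 * rng.standard_normal(N)

# Place each signal in a row, multiply by the Hermitian transpose,
# and divide by N to get the 2x2 sample covariance matrix
X = np.vstack([x1, x2])          # shape (2, N)
R = X @ X.conj().T / N           # variances on the diagonal,
                                 # cross-covariance off the diagonal
print(R.shape)                   # (2, 2)
```

For zero-mean data this is essentially what `np.cov(X, bias=True)` computes, apart from the sample-mean removal that `np.cov` performs.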

However, in this paper, the author computes what look like four matrices, $R_{11}, R_{12}, R_{21}$, and $R_{22}$, and then assembles them into a larger block matrix and calls that the covariance matrix.

Why is this so? Here is an image of the text:

[image of the relevant excerpt from the paper]


1 Answer


If you have two signal vectors $x_1[n]$ and $x_2[n]$ each of $N$ elements, then there are two different things we can consider.

  1. How do the quantities $\displaystyle\sum_{n=1}^N x_i[n]x_j[n],~ i, j \in \{1,2\}$ compare? In particular, when the signals are noisy and the noises can be considered to be jointly stationary (or jointly wide-sense stationary), these quantities can be used to estimate the noise variances in the two signals as well as the covariance of the noises at any fixed sampling time. This is what you get from the $2\times 2$ covariance matrix $$R_{2\times 2} = \left[\begin{matrix} \sigma_1^2 & C\\ C & \sigma_2^2\end{matrix}\right].$$ The noise in $x_1[n]$ has variance $\sigma_1^2 = R_{1,1}$, which might be different from $R_{2,2} = \sigma_2^2$, the variance of the noise in $x_2[n]$, but the noises are correlated with covariance $R_{1,2}=R_{2,1} = C$. Now, if we plan on doing things with just what happens at $n$, ignoring whatever might be happening at $n-1$ or $n+1$ etc., then this is all the information we need.

  2. Unless the noise is known to be (or assumed to be) white noise, so that noise samples from different sampling instants are independent (and therefore uncorrelated), or we simply assume uncorrelated noise samples, there is information that we are ignoring by not considering the correlation between $x_1[n]$ and $x_1[m]$, samples from the same process at different times or locations, and the correlation between $x_1[n]$ and $x_2[m]$, samples from the two processes at different times or locations. This additional information might lead to a better estimate/solution.

     Now we have a total of $2N$ noise samples, and therefore a $2N \times 2N$ covariance matrix to consider. If we arrange matters the way the authors did, we have $R_{\text{full}} = E[XX^T]$ where $$X = (x_1[1],x_1[2],\ldots,x_1[N],x_2[1],x_2[2],\ldots,x_2[N])^T = (\mathbf x_1,\mathbf x_2)^T$$ and so $$R_{\text{full}} = \left[ \begin{matrix}R_{\mathbf x_1,\mathbf x_1} & R_{\mathbf x_1,\mathbf x_2}\\ R_{\mathbf x_2,\mathbf x_1} & R_{\mathbf x_2,\mathbf x_2}\end{matrix} \right]$$ where $R_{\mathbf x_i,\mathbf x_j} = E[\mathbf x_i\mathbf x_j^T]$. Note that $R_{\mathbf x_i,\mathbf x_j}$ is, in essence, the cross-correlation function of $(x_i[1],x_i[2],\ldots,x_i[N])$ and $(x_j[1],x_j[2],\ldots,x_j[N])$ if $i \neq j$ and the autocorrelation function if $i = j$.

     If the noise processes are white and uncorrelated except when $n = m$, then $$R_{\text{full}} \rightarrow R_{\text{simple}} = \left[ \begin{matrix}\sigma_1^2I & CI\\ CI & \sigma_2^2I \end{matrix}\right]$$ where $I$ is the $N\times N$ identity matrix, and $\sigma_1^2, \sigma_2^2$ and $C$ are as defined in Item 1 above. How realistic this noise model is, is for the end user to determine. If the model is realistic, then nothing is gained by looking at the $2N\times 2N$ matrix $R_{\text{full}}$, since all the information is already there in the $2\times 2$ matrix $R_{2\times 2}$ of Item 1 above. The same holds if the model is unrealistic but we do not intend to (or are unable to) use all the information in the full $2N\times 2N$ matrix $R_{\text{full}}$: we will make do with just $\sigma_1^2, \sigma_2^2$ and $C$ of Item 1, for which we need neither $R_{\text{full}}$ nor $R_{\text{simple}}$, just $R_{2\times 2}$.
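The two covariance structures above can be illustrated numerically. The sketch below (my own toy example, not the paper's algorithm) estimates the $2N \times 2N$ covariance of the concatenated vector $(\mathbf x_1,\mathbf x_2)^T$ for white noise that is correlated across sensors only at equal sampling times, and checks that it approaches the block form $R_{\text{simple}}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4           # short vectors so the matrices are easy to inspect
trials = 20000  # number of noise realizations to average over

# Assumed parameters of the toy noise model (for illustration only)
sigma1_sq, sigma2_sq, C = 1.0, 1.0, 0.5

R_full = np.zeros((2 * N, 2 * N))
for _ in range(trials):
    # White noise in each sensor, correlated across sensors only
    # at equal sampling times
    n1 = rng.standard_normal(N)
    n2 = C * n1 + np.sqrt(sigma2_sq - C**2) * rng.standard_normal(N)
    X = np.concatenate([n1, n2])   # stacked vector (x1, x2)^T
    R_full += np.outer(X, X)
R_full /= trials                   # sample estimate of E[X X^T]

# For this white-noise model, R_full should reduce to the block form
# [[sigma1^2 I, C I], [C I, sigma2^2 I]]
I = np.eye(N)
R_simple = np.block([[sigma1_sq * I, C * I],
                     [C * I, sigma2_sq * I]])
print(np.max(np.abs(R_full - R_simple)))  # small: each N x N block is
                                          # (approximately) a scaled identity
```

If the noise were not white, the off-diagonal entries within each $N \times N$ block would not vanish, and $R_{\text{full}}$ would carry information that the $2\times 2$ matrix cannot.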

Dilip Sarwate
  • Thanks. First, shouldn't the sigma in (1) say from n = 0 to N-1? (Not from i = 1 to n). – Spacey Jan 02 '12 at 22:32
  • I am not sure I still understand what/why we are doing it this way. Are you saying that for (1), since the noises in both vectors are completely independent of each other, we have to use that method, and thus get a 2x2 co-variance matrix, but that in the second case (2), since the noises in the vectors are not independent, we have to concatenate both vectors and then compute their co-variance matrix? Why though? I'm afraid I still do not understand the motivation here... – Spacey Jan 02 '12 at 22:35
  • Thanks I will read it again. Also, the subscript for sigma must be 'n', not 'i'. – Spacey Jan 03 '12 at 01:22
  • I will jot down some more questions/comments tomorrow, but for now, what are the 'official' names of $R_{2\times 2}$, $R_{\text{full}}$, and $R_{\text{simple}}$? I can't imagine they are all called 'co-variance matrices', since this leads to confusion (which was the main motivation for this question). What are they normally referred to as? – Spacey Jan 03 '12 at 07:37
  • "can be considered to be jointly stationary (or jointly wide-sense stationary)": here, do you mean if $x_1$ and $x_2$ are both independent? "estimate the noise variances in the two signals as well as the covariance of the noises at any fixed sampling time.": what do you mean by 'at any fixed sampling time' here? In order to compute the 2x2, we are using all time samples of both signals... "Now we have a total of 2N noise samples,": I guess I do not understand by what 'right' we can simply concatenate two spatial signals like that. Why are we 'allowed' to do that? – Spacey Jan 04 '12 at 02:40
  • Ok - just to be clear - in the first case, are you saying that the noise samples in vector $x_1$, are all independent of each other? noise at sample 0 is completely independent of noise at sample 1, of noise at sample 2, etc etc? – Spacey Jan 04 '12 at 03:17
  • I know what you mean. Since their pdfs don't change over time (stationary), we can use this method (case 1) to calculate the respective correlations between vector $x_1$ and vector $x_2$, as well as the variances of both those vectors. HOWEVER, aren't you saying that IF we know that vectors 1 and 2 are both white (such that each of their temporal samples are independent of each other, $x_1(0)$ is independent from $x_1(12)$, etc.), then the 2x2 is all we need? However, if the temporal samples are in fact correlated, then we should use case 2. Isn't that what you are saying? – Spacey Jan 04 '12 at 05:51
  • Please correct the $\displaystyle\sum_{i=1}^N x_i[n]x_j[n]$ to read $\displaystyle\sum_{n=1}^N x_i[n]x_j[n]$ instead so that I can accept the answer. – Spacey Jan 06 '12 at 07:14
  • @Mohammad The response is corrected. – Phonon Jan 06 '12 at 21:34