
I have read a lot about ICA, but I could not find an answer to why non-Gaussian variables are independent. As I understand it, the Central Limit Theorem states that

the distribution of a sum of independent random variables tends to be more Gaussian than the distributions of the original variables.

The $s_i$ are the original independent sources in ICA, and the ICA model is $x = As$. So we can define $y = w^T x$. The main goal is to find the unmixing vector $w$ that maximizes non-Gaussianity. So my question is: what is non-Gaussianity here, and why is it necessary to maximize it to extract the original sources?

Any enlightenment, please!

M Ali Qadar
    Welcome to DSP.SE! Any random variable can be independent from any other one. It does not depend on whether they are Gaussian distributed or non-Gaussian distributed. Why do you think they should be? – Peter K. Oct 29 '15 at 08:58
  • Well, that's the confusion: how are non-Gaussian variables independent? The Central Limit Theorem says that the distribution of a sum of independent variables tends to be more Gaussian than the original random variables. Also, maximizing non-Gaussianity yields one of the independent components. So what is the connection between non-Gaussianity and the independence of components? I think non-Gaussianity gives a local maximum for that particular component, and based on local maxima we can tell all uncorrelated components – M Ali Qadar Oct 29 '15 at 10:55
  • Please reword your question using the definitions in the comment on your deleted "answer". Please edit the question rather than post another non-answer. – Peter K. Oct 29 '15 at 12:16

2 Answers


The model ICA uses says that there exist some unknown, statistically independent sources, $s_i$ that are non-normally distributed (their distributions are something other than Gaussian):

$$ s_i \sim S(\mu_{s_i}, \sigma^2_{s_i}) $$

where $S$ is some (possibly) known but non-Gaussian distribution with mean $\mu_{s_i}$ and variance $\sigma^2_{s_i}$.

Then it is assumed that what you can actually measure is a linear mixture of these: $$ \mathbf{x} = \mathbf{A}\mathbf{s} $$ where $\mathbf{s}$ is the vector of the $s_i, i=1,\ldots,N$, $\mathbf{A}$ is an $N\times N$ matrix (usually assumed to be invertible) and $\mathbf{x}$ is the vector of the actual measurements.

Because the $x_i$ will just be weighted sums of the $s_i$: $$ x_i = a_{i1} s_1 + a_{i2} s_2 + \cdots + a_{iN} s_N $$ the central limit theorem says that the distribution of $x_i$ will be closer to Gaussian than the distribution of the $s_i$ (provided certain non-restrictive conditions on the true distribution of the $s_i$ are met).
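A quick numerical sketch of this effect (the Laplace sources, the mixing matrix, and the seed below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def excess_kurtosis(v):
    """Sample excess kurtosis: 0 for a Gaussian, positive for heavier tails."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3.0

# Two independent, clearly non-Gaussian (Laplace) sources.
# A Laplace distribution has excess kurtosis 3.
s = rng.laplace(size=(2, n))

# A hypothetical mixing matrix A (any non-trivial weights will do).
A = np.array([[0.6, 0.4],
              [0.5, 0.5]])
x = A @ s  # each x_i is a weighted sum of the sources

print(excess_kurtosis(s[0]))  # roughly 3: far from Gaussian
print(excess_kurtosis(x[0]))  # noticeably smaller: closer to Gaussian
```

The mixtures' excess kurtosis sits between the sources' value and the Gaussian value of 0, which is exactly the central-limit-theorem effect described above.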

Then all ICA really tries to do is to find $\mathbf{W} = \mathbf{A}^{-1}$, the inverse of $\mathbf{A}$, so that the sources can be recovered as $\mathbf{s} = \mathbf{W}\mathbf{x}$.

There are any number of measures for "non-Gaussianity". One of the simplest (?) is to use kurtosis as the measure of how far the sample of a random variable is from Gaussianity.

So, to attempt to answer:

So my question is: what is non-Gaussianity here, and why is it necessary to maximize it to extract the original sources?

There are several different ways one could measure non-Gaussianity. For example, kurtosis can be used as it is known that the kurtosis of a Gaussian is 3. Any distribution with a kurtosis different from 3 is therefore "non-Gaussian" to some extent.
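To see why maximizing a kurtosis-based contrast extracts a source, here is a brute-force sketch. The setup is assumed for illustration: two unit-variance uniform sources (excess kurtosis $-1.2$) mixed by a rotation, so the observed data are already white. Sweeping over unit vectors $w$ and keeping the projection $w^T x$ with the largest $|$excess kurtosis$|$ lines up with one of the original sources:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Unit-variance uniform sources: excess kurtosis = -1.2 (sub-Gaussian).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))

# Mix with a rotation (an orthogonal A keeps the data white).
theta0 = np.deg2rad(30)
A = np.array([[np.cos(theta0), -np.sin(theta0)],
              [np.sin(theta0),  np.cos(theta0)]])
x = A @ s

def excess_kurtosis(y):
    y = y - y.mean()
    return np.mean(y**4) / np.mean(y**2)**2 - 3.0

# Sweep unit vectors w(t) = (cos t, sin t) and keep the projection
# whose |excess kurtosis| is maximal.
angles = np.linspace(0, np.pi, 180, endpoint=False)
best = max(angles,
           key=lambda t: abs(excess_kurtosis(np.cos(t)*x[0] + np.sin(t)*x[1])))
y = np.cos(best) * x[0] + np.sin(best) * x[1]

# The extremal projection lines up with one of the original sources.
corr = [abs(np.corrcoef(y, s[i])[0, 1]) for i in range(2)]
print(max(corr))  # close to 1
```

Any other projection mixes the two sources and, per the central limit theorem, looks more Gaussian, so its $|$kurtosis$|$ is smaller.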

The reason we want non-Gaussianity is because we assumed that the original sources are non-Gaussian.

Peter K.

The last statement in the explanation above is not the actual reason why ICA assumes non-Gaussian sources. After all, you could just as well assume Gaussian sources, and after the mixing process the observed data would look Gaussian, which is not a problem at all.

The reason for assuming non-Gaussian sources was pointed out by Comon (1994): in order to learn a unique factorization $\mathbf{x=As}$, at most one component of $\mathbf{s}$ can be Gaussian. This has to do with the invariance of the Gaussian distribution under rotation. For simplicity, we usually assume non-Gaussian priors, where for instance $p_i(s_{i})\propto \dfrac{1}{\cosh s_{i}}$ yields the nonlinear contrast function, $\tanh$, used in the ICA algorithm.
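As a sketch of how the $\tanh$ contrast is used in practice, here is the classic one-unit fixed-point iteration (the form used in FastICA) on whitened data; the Laplace sources, the mixing rotation, and the seed are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Unit-variance Laplace sources (super-Gaussian), mixed by a rotation so
# that the observed data x are already white (zero mean, identity covariance).
s = rng.laplace(scale=1/np.sqrt(2), size=(2, n))
theta = np.deg2rad(40)
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = A @ s

# One-unit fixed-point iteration with g = tanh, g' = 1 - tanh^2:
#   w <- E[x g(w^T x)] - E[g'(w^T x)] w,  then renormalize.
w = np.array([1.0, 0.0])
for _ in range(200):
    y = w @ x
    g = np.tanh(y)
    w_new = (x * g).mean(axis=1) - (1 - g**2).mean() * w
    w_new /= np.linalg.norm(w_new)
    if abs(w_new @ w) > 1 - 1e-10:  # converged (up to a sign flip)
        w = w_new
        break
    w = w_new

# The recovered component w^T x matches one source up to sign and scale.
y = w @ x
print(max(abs(np.corrcoef(y, s[i])[0, 1]) for i in range(2)))  # close to 1
```

The stable fixed points of this iteration sit at the source directions, which is the algorithmic counterpart of "maximize non-Gaussianity".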

On another note, related to the title of the post ("Why non-Gaussian variables are independent"): independence and non-Gaussianity are two different issues. Informally, independence means that the joint distribution factorizes over its components, $p(s_1,\ldots,s_n)=\prod_{i=1}^n p_i(s_i)$, where the marginals $p_i(s_i)$ may or may not be Gaussian.