Samples of colored noise (taken at different times) generally
are correlated random variables because the
autocorrelation function of the noise process is not a delta function
as it is in the case of white noise. Thus, if we assume a zero-mean
process (noise is generally assumed to be zero-mean regardless of its color),
then the covariance of two samples separated in time by $\tau$ seconds is
$R(\tau)$, where $R(t) = \mathcal F^{-1}\{S(f)\}$ is the autocorrelation
function of the process (the inverse
Fourier transform of the power spectral density). Note that it is possible
for $R(t)$ to be zero for some values of $t$ (e.g. $R(t) = \operatorname{sinc}(t)$ is a valid autocorrelation function),
but it cannot be zero
for all nonzero $t$.
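To make the correlation concrete, here is a minimal numerical sketch (the one-pole filter and its coefficient $a = 0.9$ are my own illustrative choices, not anything from the discussion above): coloring white Gaussian noise with $y[n] = a\,y[n-1] + w[n]$ gives $R(\tau) \propto a^{|\tau|}$, so samples $\tau$ apart have correlation $a^{|\tau|}$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# White Gaussian noise: samples at different times are uncorrelated.
w = rng.standard_normal(N)

# Color it with a one-pole lowpass filter: y[n] = a*y[n-1] + w[n].
a = 0.9
y = np.empty(N)
y[0] = w[0]
for n in range(1, N):
    y[n] = a * y[n - 1] + w[n]

# Empirical correlation between samples separated by lag tau
# (the "-tau or None" slice handles the lag-0 case).
for tau in (0, 1, 2, 5):
    r = np.corrcoef(y[: -tau or None], y[tau:])[0, 1]
    print(f"lag {tau}: correlation ~ {r:.3f} (theory: {a**tau:.3f})")
```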
As for the density function of any sample: if the process is
Gaussian, the sample is Gaussian even if the process has been filtered
with a linear filter before sampling. But if the process is not
Gaussian (it is, let us say, Laplacian), then while each sample will be
Laplacian, the same cannot be said generally of samples of the process
after filtering of any kind. In other words, Gaussianity
survives linear filtering; Laplacianity generally does not.
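A quick numerical illustration of that (the moving-average filter and the use of excess kurtosis as the yardstick are my choices here): a Laplacian density has excess kurtosis $3$, a Gaussian has $0$, and linear filtering pulls the output toward the Gaussian value.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.laplace(size=1_000_000)   # Laplacian white noise, excess kurtosis = 3

# A simple linear filter: length-8 moving average (illustrative choice).
y = np.convolve(x, np.ones(8) / 8, mode="valid")

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2) ** 2 - 3.0

print(excess_kurtosis(x))   # ~3.0, as a Laplacian density should have
print(excess_kurtosis(y))   # ~3/8: the output is no longer Laplacian
```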
So, how does maximum-likelihood estimation work when samples have
correlated noise? Consider the case when we wish to estimate the
unknown mean of a $\mathcal N(\mu, 1)$ random variable, and we
have two observations $x$ and $y$. In the standard case of
independent observations, the likelihood function is
$$L(\mu) = \frac{1}{2\pi}\exp\left(-\frac{1}{2}\left[(x-\mu)^2+(y-\mu)^2\right]\right).$$
The _maximum-likelihood estimator_ for $\mu$ is the number $\hat{\mu}$
that maximizes $L(\mu)$, which works out to be the number $\hat{\mu}$
that minimizes $(x-\mu)^2+(y-\mu)^2$. This is a quadratic in $\mu$
and the maximum-likelihood estimate turns out to be
$\hat{\mu}=\frac{x+y}{2}$. When the observations are correlated with
correlation coefficient $\rho$, then
$$L(\mu) = \frac{1}{2\pi\sqrt{1-\rho^2}}
\exp\left(-\frac{1}{2(1-\rho^2)}
\left[(x-\mu)^2-2\rho(x-\mu)(y-\mu)+(y-\mu)^2\right]\right).$$
Once again we need to find the $\hat{\mu}$ where
$(x-\mu)^2-2\rho(x-\mu)(y-\mu)+(y-\mu)^2$ has a minimum.
We still have a quadratic in $\mu$ but now we get terms
like $xy$ in the coefficients. What $\hat{\mu}$ works out to
be is left as an exercise.
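If you want to check your answer numerically, here is a minimal sketch (the observation values and $\rho$ are made up); maximizing $L(\mu)$ is the same as minimizing the bracketed quadratic, since the factor in front of it is a positive constant.

```python
from scipy.optimize import minimize_scalar

x, y, rho = 1.3, 0.2, 0.7   # made-up observations and correlation

# Maximizing L(mu) is minimizing the quadratic inside the exponent.
def q_indep(mu):
    return (x - mu) ** 2 + (y - mu) ** 2

def q_corr(mu):
    return (x - mu) ** 2 - 2 * rho * (x - mu) * (y - mu) + (y - mu) ** 2

print(minimize_scalar(q_indep).x)   # (x + y)/2 = 0.75, the sample mean
print(minimize_scalar(q_corr).x)    # compare with your pencil-and-paper answer
```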
And what if we have $n$ observations where $n > 2$? All the above
still applies. For independent identically distributed Gaussian noise
in the samples, the sample mean
$n^{-1}\sum_i x_i$ is the maximum-likelihood estimate of $\mu$, but
in the case of correlated Gaussian random variables we get a messier
minimization problem: the quadratic that we are trying to minimize
depends on the inverse of the covariance matrix, and the minimizing
$\hat{\mu}$ is a weighted combination of the data (with weights set by
that inverse) rather than a simple easy-to-remember result like the
sample mean.
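For a concrete sketch of that minimization with known covariance (the AR(1)-style covariance structure and the numbers below are my own assumptions): setting the derivative of $(\mathbf{x}-\mu\mathbf{1})^{\mathsf T}\Sigma^{-1}(\mathbf{x}-\mu\mathbf{1})$ with respect to $\mu$ to zero gives $\hat{\mu} = \mathbf{1}^{\mathsf T}\Sigma^{-1}\mathbf{x}\,/\,\mathbf{1}^{\mathsf T}\Sigma^{-1}\mathbf{1}$.

```python
import numpy as np

# Hypothetical setup: n = 5 unit-variance Gaussian samples with
# correlation rho**|i-j| between samples i and j (AR(1)-style covariance).
rho = 0.8
x = np.array([1.1, 0.4, 0.9, 1.6, 0.7])   # made-up observations
n = len(x)
Sigma = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# mu_hat = (1^T Sigma^{-1} x) / (1^T Sigma^{-1} 1)
ones = np.ones(n)
w = np.linalg.solve(Sigma, ones)           # Sigma^{-1} 1
mu_hat = (w @ x) / (w @ ones)

print(mu_hat)      # weighted combination of the data
print(x.mean())    # generally different from the plain sample mean
```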
What if the noise is not Gaussian? The same principles apply -- set up the
likelihood function and find where it attains its maximum value -- but
the calculations are quite a bit different, depending on
what you assume or know about the joint density of the observations.
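For example (the Laplacian choice here is mine): with i.i.d. Laplacian noise the log-likelihood is $-\sum_i |x_i - \mu|$ plus a constant, so the ML estimate of $\mu$ is the sample median rather than the sample mean. A quick numerical check:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.laplace(loc=2.0, size=1001)   # i.i.d. Laplacian samples, true mean 2

# Maximizing the Laplacian likelihood = minimizing sum_i |x_i - mu|.
nll = lambda mu: np.sum(np.abs(x - mu))

print(minimize_scalar(nll, bounds=(x.min(), x.max()), method="bounded").x)
print(np.median(x))   # the closed-form ML estimate for i.i.d. Laplacian noise
print(x.mean())       # the Gaussian answer; close, but not the ML estimate here
```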
A comment from robert bristow-johnson (Jun 18 '14) adds a concrete example: if you difference uniform white noise samples (generated with, say, a `rand()` function) as so: $$y[n] = x[n] - x[n-1]$$ that is colored noise (less spectrum at low frequencies), but it ain't Gaussian p.d.f., it's triangular p.d.f.
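And a quick check of that comment (the array size and histogram binning are my choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=1_000_000)   # uniform white noise, a la rand()
y = x[1:] - x[:-1]                       # the comment's first-difference filter

# The difference of two independent uniforms has a triangular density on (-2, 2).
hist, _ = np.histogram(y, bins=9, range=(-2, 2), density=True)
print(np.round(hist, 3))   # rises linearly to a peak at 0, then falls off
```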