3

I have a long signal (million of samples) containing a lot of Gaussian peaks, whose standard deviation is random and about $5$ to $50$ samples wide. Sometimes, these peaks overlap, but not often. The signal also contains high frequency noise.

I would like to compute the distribution of standard deviations, in a similar way one can extract the "distribution of frequencies" using FFT. Currently, we do a first pass for detecting the peaks, then do a classical function fit in the peak region. This is not very robust and the distribution fluctuates a lot depending on the fitting algorithm fine tuning.

Is there a more robust approach which would do a kind of a Gaussian transform?

Gilles
  • 3,386
  • 3
  • 21
  • 28
galinette
  • 131
  • 1
  • 1
    Can you please clarify the concept of "distribution of gaussians"? Are you after a " spectrum" whose X-axis is somehow related to the "kind" of Gaussian (thin and tall, fat and long) and the Y-axis is related to the "amount of Gaussian"? Might have to be 2D, one d for the "mean" (position) and one d for the "st.dev" (width). Have you considered some form of matched filtering ? – A_A Dec 29 '16 at 10:36
  • By distribution, I mean on the experimental signal, computing an histogram of the standard deviations. The histogram would have standard deviations as X-axis bins, and energy (gaussian areas) as Y-axis – galinette Dec 30 '16 at 15:13
  • I mean distribution, because from that experimental histogram data, I want to build a continuous distribution, and generate a random signal that will match the standard deviation statistics of the experimental signal. – galinette Dec 30 '16 at 15:14
  • I believe this is called Gaussian Pulse Decomposition? https://www.ncbi.nlm.nih.gov/pubmed/9084830 – endolith Jun 28 '17 at 14:04

2 Answers2

1

The (continuous) Fourier transform of a Gaussian is a Gaussian:

$$ \mathscr{F}\Big\{ x(t) \Big\} \triangleq \int\limits_{-\infty}^{\infty} x(t) \, e^{-j 2 \pi f t} \, dt $$

$$ \mathscr{F}\left\{ e^{-\pi t^2} \right\} = e^{-\pi f^2} $$

$$ \mathscr{F}\left\{ e^{-\pi \alpha t^2} \right\} = \frac{1}{\sqrt{\alpha}} e^{-\frac{\pi}{\alpha} f^2} \qquad \alpha > 0 $$

$$ \mathscr{F}\left\{ e^{-\pi \alpha (t-\tau)^2} \right\} = \frac{1}{\sqrt{\alpha}} e^{-\frac{\pi}{\alpha} f^2} e^{-j 2 \pi f \tau} $$

$$ \mathscr{F}\left\{ \sum_m c_m e^{-\pi \alpha_m (t-\tau_m)^2} \right\} = \sum_m \frac{c_m}{\sqrt{\alpha_m}} e^{-\frac{\pi}{\alpha_m} f^2} e^{-j 2 \pi f \tau_m} $$

note these gaussians have "tails" that go on forever, but gaussians usually die off to very close to zero very rapidly.

assuming no time or frequency aliasing, let $x[n] \triangleq x\big(n/f_\text{s}\big)$ where $f_\text{s}$ is the sample rate. Then the DTFT is

$$\begin{align} X\big(e^{j\omega}\big) &\triangleq \sum_{n=-\infty}^{\infty} x[n] \, e^{-j \omega n} \\ &= f_\text{s} \sum_{n=-\infty}^{\infty} x\big(n/f_\text{s}\big) \, e^{-j f_\text{s} \omega (n/f_\text{s})} \frac{1}{f_\text{s}} \\ & \approx f_\text{s} \int\limits_{-\infty}^{+\infty} x(t) \, e^{-j 2 \pi \frac{f_\text{s} \omega}{2 \pi} t} \, dt \\ &= f_\text{s} \ \mathscr{F}\Big\{ x(t) \Big\} \Bigg|_{f=f_\text{s}\frac{\omega}{2 \pi}} \\ \end{align}$$

so if we let $$x(t) = \sum_m c_m e^{-\pi \alpha_m (t-\tau_m)^2}$$

$$\begin{align} x[n] &= \sum_m c_m e^{-\pi \alpha_m ((n/f_\text{s})-\tau_m)^2} \\ &= \sum_m c_m e^{-\pi (\alpha_m/f_\text{s}^2) (n-f_\text{s}\tau_m)^2} \\ &= \sum_m c_m e^{-\pi \widehat{\alpha_m} (n-\widehat{\tau_m})^2} \\ \end{align}$$

then the DTFT is

$$\begin{align} X\big(e^{j\omega}\big) & \approx f_\text{s} \ \sum_m \frac{c_m}{\sqrt{\alpha_m}} e^{-\frac{\pi}{\alpha_m} f^2} e^{-j2\pi f\tau_m} \Bigg|_{f=f_\text{s}\frac{\omega}{2 \pi}} \\ &= f_\text{s} \ \sum_m \frac{c_m}{\sqrt{\alpha_m}} e^{-\frac{\pi}{\alpha_m} \left(f_\text{s}\frac{\omega}{2 \pi}\right)^2} e^{-j2\pi \left( f_\text{s}\frac{\omega}{2 \pi} \right)\tau_m} \\ &= f_\text{s} \ \sum_m \frac{c_m}{\sqrt{\alpha_m}} e^{-\pi\frac{f_\text{s}^2}{4 \pi^2 \alpha_m} \omega^2} e^{-j \omega (f_\text{s}\tau_m)} \\ \end{align}$$

Then, assuming you choose your DFT length $N$ to be sufficiently large (you're saying in the millions) to cover all of the gaussians with their specific delays $f_\text{s}\tau_m$ in samples and widths of ca. $\frac{f_\text{s}}{\sqrt{\alpha_m}}$. Then the $N$-point DFT will sample this DTFT at $N$ equally spaced points around the unit circle.

$$\begin{align} \mathcal{DFT:} \quad X[k] &= \sum_{n=0}^{N-1} x[n] \, e^{-j 2 \pi \frac{nk}{N}} \\ &= X\big(e^{j\omega}\big) \Bigg|_{\omega = 2\pi\frac{k}{N}} \quad \mathcal{:DTFT}\\ &= f_\text{s} \ \sum_m \frac{c_m}{\sqrt{\alpha_m}} e^{-\pi\frac{f_\text{s}^2}{4 \pi^2 \alpha_m} \omega^2} e^{-j \omega (f_\text{s}\tau_m)} \Bigg|_{\omega = 2\pi\frac{k}{N}}\\ &= \ \sum_m \frac{c_m}{\sqrt{\alpha_m/f_\text{s}^2}} e^{-\pi\frac{f_\text{s}^2/\alpha_m}{4 \pi^2} \left(2\pi\frac{k}{N}\right)^2} e^{-j 2\pi\frac{k}{N} (f_\text{s}\tau_m)} \\ &= \sum_m \frac{c_m}{\sqrt{\widehat{\alpha_m}}} e^{-\frac{\pi}{\widehat{\alpha_m}} \left(\frac{k}{N}\right)^2} e^{-j 2\pi\frac{k}{N} \widehat{\tau_m}} \\ \end{align} $$

So summing up and I will remove the little hats in the discrete-time domain, if you have a collection of gaussian pulses with heights of $c_m$, peak width parameters $\frac{1}{\sqrt{\alpha_m}}$ (in units of samples), and peak positions at $\tau_m$ (also in units of samples), then your input should look like:

$$ x[n] = \sum_m c_m e^{-\pi \alpha_m (n-\tau_m)^2} $$

and the result of the $N$-point DFT should look like:

$$ X[k] \approx \sum_m \frac{c_m}{\sqrt{\alpha_m}} \ e^{-\pi/(N^2\alpha_m) k^2} \ e^{-j 2\pi(\tau_m/N) k} $$

for $-\tfrac{N}{2} < k < \tfrac{N}{2}$ . Remember with the DFT, $X[k+N]=X[k]$ for all integer $k$.

robert bristow-johnson
  • 20,661
  • 4
  • 38
  • 76
0

From your description, you have a signal composed of high-frequency noise (more simply put, white noise) plus of a fluctuating signal whose auto-correlation is about $5$ to $50$ samples. This all seems to be perfectly adapted for a Fourier analysis!

Your fitting method seems right but perhaps your modeling is perhaps wrong. The latter signal can be modeled as a noise term with a given envelope and a random phase. Using a log-normal distribution for instance, $$ \mathcal{E}(f) = W + A\cdot\exp(-\frac{\log(f/f_0)}{2\cdot B^2}) $$ where $W$ is the level of white noise and $A$ is that of the "peaks". From the Parseval theorem, there is an equivalence relation between the auto-correlation's width and that of the spectrum: the wider the spectrum's bandwidth $B$ around the peak $f_0$, the shorter the autocorrelation's width. As a consequence, given a good parameterization of the spectrum, you will easily find a function to perform a good fit within a window where the signal is known to be stationary.

Given that, it is then possible to extract the given peaks, but that is certainly another story...

meduz
  • 816
  • 6
  • 15