Note: this answer is a "Work In Progress"; I intend to address some limitations, including "time aliasing". Interested visitors may wish to "Follow" the answer.
Motivating the metric
We build a metric by considering the following:
- Energy measures information. Aliasing can be lossy or not; either way there's a change in information, and an excellent quantification of information is frequency-domain energy. By the Parseval-Plancherel theorem, a change $\Delta E_\omega$ in frequency-domain energy corresponds to the same change $\Delta E_t$ in time-domain energy. One can verify experimentally on images that this faithfully quantifies aliasing losses.
- Subsampling in time <=> Folding in Fourier. This predicts the effects of subsampling on the spectrum (verified numerically in the sketch after this list).
- Generality: we make an unbiased quantifier by assuming a uniform input spectrum, so as not to favor some frequencies over others, and to enable fair comparison of different filters.
- Best vs worst: the best case is zero aliasing, the worst is maximum aliasing. Ideally, a signal can be subsampled losslessly without filtering, meaning the original sequence can be recovered perfectly with DFT upsampling; the "best" reference hence shall be a signal whose spectrum is, after subsampling, full-band and unchanged by subsampling (except for constant scaling). Likewise, the "worst" shall be full-band before subsampling.
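A quick numerical check of the folding relation (a minimal sketch; all names are illustrative):

import numpy as np

N, M = 256, 8
x = np.random.default_rng(0).standard_normal(N)
lhs = np.fft.fft(x[::M])                         # spectrum of the subsampled signal
rhs = np.fft.fft(x).reshape(M, -1).mean(axis=0)  # spectrum of the original, folded
assert np.allclose(lhs, rhs)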
Let signal length be $N=256$, and subsampling factor $M=8$. We have:
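The two references can be sketched directly in the frequency domain (continuing the snippet above, and assuming, per the construction described, that $x_\text{ref}$'s spectrum is flat over $N/M$ bins and $x_\text{full}$'s over all $N$):

def energy(v):
    return np.sum(np.abs(v)**2)

sub = lambda xf: xf.reshape(M, -1).mean(axis=0)  # frequency-domain subsampling (fold)

xf_full = np.ones(N)                             # worst case: full-band
xf_ref = np.zeros(N)
xf_ref[:N//M] = 1                                # best case: alias-free under ::M

print(energy(xf_full) / energy(xf_ref))             # 8.0
print(energy(sub(xf_full)) / energy(sub(xf_ref)))   # 64.0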
Before subsampling, x_full has 8x as much energy as x_ref; after, it's 64x. So the gain due to aliasing is 64/8 = 8, or 700%.
To show the effects of aliasing on something in-between, we add a sine to x_ref:
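For instance, with a single out-of-band bin (a complex sinusoid; the bin and amplitude below are arbitrary):

xf_mid = xf_ref.copy()
xf_mid[N//2] = 4                                      # out-of-band sine

r_before = energy(xf_mid) / energy(xf_ref)            # 1.5
r_after  = energy(sub(xf_mid)) / energy(sub(xf_ref))  # 1.75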
Building the metric
So far we have, without percent conversion, and in terms of $x$'s energies (computable in either domain, per Parseval-Plancherel):
$$
\texttt{alias}\{x, M\} = r_\text{after} / r_\text{before} \tag{1}
$$
where, with $::M$ denoting subsampling by $M$,
$$
\begin{align}
& r_\text{before} = \|x\|^2 / \|x_\text{ref}\|^2 = \|x\|^2 / (N/M) \tag{2} \\
& r_\text{after} = \|x[::M]\|^2 / \|x_\text{ref}[::M]\|^2 = \|x[::M]\|^2 / (N / M^3)\tag{3}
\end{align}
$$
A problem, which we can see by considering $x=x_\text{full}$, is that for the same $(N/M)$, this scales with $N$: there, $r_\text{before} = M$ and $r_\text{after} = M^2$, so (1) equals $M$, which grows with the sampling rate. We prefer something between 0% and 100% in all cases. The scaling is proportional, so normalizing is easy; that, together with percent conversion, yields
$$
\texttt{alias}\{x, M\} = 100 \cdot \frac{r_\text{after} / r_\text{before} - 1}{M - 1} \tag{4}
$$
Another problem, for our context, is filter-input phase alignment: the worst case is obtained with bin-by-bin alignment of the input with the filter, so we apply frequency-domain subsampling to absolute values of the real and imaginary parts. The complete closed form is shown in the "Full math version" section below, along with minimal code. The measure has the following properties:
$$
\begin{align}
\texttt{alias}\{x_\text{full}\} &= 100\text{%} \tag{5} \\
\texttt{alias}\{x_\text{ref}\} &= 0\text{%} \tag{6} \\
\end{align}
$$
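These can be verified with the earlier sketch (the worst-case-phase step is skipped here; it's a no-op for these real, non-negative spectra):

def alias_pct(xf, M):
    r_before = energy(xf) / energy(xf_ref)
    r_after = energy(sub(xf)) / energy(sub(xf_ref))
    return 100 * (r_after / r_before - 1) / (M - 1)

print(alias_pct(xf_full, M))  # 100.0
print(alias_pct(xf_ref, M))   # 0.0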
The metric isn't flawless and won't handle certain edge-case inputs, but it works well with practical or otherwise full-band/"normal" signals.
Applying the metric
We compare a certain kind of moving average against the Hamming-windowed sinc used by scipy.signal.decimate(ftype='fir'); the moving average is, where $n = 0, 1, \ldots, N-1$ and negative indices are taken mod $N$:
$$
h_M[n] =
\begin{cases}
1/M, & -M/2 < n \leq M/2 \\
0, & \text{otherwise}
\end{cases}
$$
or simply, $M$ non-zero samples, as DFT-centered as possible. A motivation is the case where the filtering stride equals the filter length, i.e. zero overlap between averaging windows.
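A minimal construction of this filter (one way to realize the "DFT-centered" layout; zero-padded to length $N$ for DFT filtering):

h = np.zeros(N)
h[:M] = 1/M
h = np.roll(h, -(M//2 - 1))   # nonzero at n = -M/2+1, ..., M/2 (mod N)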
Results, with the red horizontal line being the "ideal" reference:
As expected, non-overlapping arithmetic means perform awfully.
To get an idea of what the numbers mean, we compare recovery for each filter. Do this by filtering with each - that's xfilt; then subsample xfilt and DFT-upsample it, to get xrecovered. Then plot xfilt vs xrecovered for each, which shows how faithfully the intended portion of the spectrum is captured after decimating. On white Gaussian noise:
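A sketch of this check (assuming circular filtering; scipy.signal.resample performs the DFT upsampling, and rel_dist is one reading of the relative Euclidean distance mentioned below):

from scipy.signal import resample

x = np.random.default_rng(0).standard_normal(N)           # white Gaussian noise
xfilt = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real   # filter circularly
xrecovered = resample(xfilt[::M], N)                       # subsample, DFT-upsample
rel_dist = np.linalg.norm(xfilt - xrecovered) / np.linalg.norm(xfilt)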
The example isn't cherry-picked, nor is it the worst case. In fact, it faithfully reflects the average performance on WGN, per 1,000,000 realizations with $N, M = 256, 8$, according to relative Euclidean distance (see code). The worst case can be found by gradient descent, as I've done in designing my own lowpass filters; I'm not a fan of scipy's decimate in every context.
Full math version + minimal code
$\|\cdot\|^2 = \sum |\cdot|^2$, energy or squared L2 norm. Substituting (2) and (3) into (4), we have
$$
\texttt{alias}\{x, M\} =
100 \cdot
\frac{M^2 \, \|x[::M]\|^2 / \|x\|^2 - 1}{M - 1} \tag{7}
$$
Using the aforementioned and referenced (see $X_\text{sub}$) relations, we have, with $X = \texttt{DFT}\{x\}$:
$$
\texttt{alias}\{x, M\}
= \frac{100}{M - 1}
\left( \frac{
\sum_{k=0}^{N/M - 1} \left|
\sum_{i=0}^{M-1}X\left[k + \frac{N}{M}i\right] \right|^2
}{\sum_{k=0}^{N - 1}|X[k]|^2
} - 1\right) \tag{8}
$$
However, to account for worst case phase alignment, we instead have
$$
\texttt{alias}\{x, M\} =
\frac{100}{M - 1}
\left( \frac{
\sum_{k=0}^{N/M - 1} \left|
\sum_{i=0}^{M-1}\big(|\Re e\{X\}| + j\,|\Im m\{X\}|\big)\left[k + \frac{N}{M}i\right] \right|^2
}{\sum_{k=0}^{N - 1}|X[k]|^2
} - 1\right) \tag{9}
$$
It's far simpler in code; with energy(x) = sum(abs(x)**2), it's just
xf_sub = (abs(xf.real) + 1j*abs(xf.imag)).reshape(M, -1).mean(axis=0)  # worst-case phase, folded
xf_ref_sub = xf_ref.reshape(M, -1).mean(axis=0)
r_before = energy(xf) / energy(xf_ref)
r_after = energy(xf_sub) / energy(xf_ref_sub)
alias = 100 * (r_after / r_before - 1) / (M - 1)
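where the inputs can be set up as in the earlier sketch (names illustrative; x is whatever signal is under test):

import numpy as np

energy = lambda v: np.sum(np.abs(v)**2)
N, M = 256, 8
x = np.random.default_rng(0).standard_normal(N)   # or any signal under test
xf = np.fft.fft(x)
xf_ref = np.zeros(N)
xf_ref[:N//M] = 1                                 # the alias-free reference's spectrum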
Full code
Available on GitHub.
$$ \frac{\int\limits_{f_\mathrm{s}/2}^{\infty} \big|X(f)\big|^2 \, \mathrm{d}f}{\int\limits_{0}^{\infty} \big|X(f)\big|^2 \, \mathrm{d}f} $$
Here we assume that $|X(-f)|=|X(f)| \quad \forall f \in \mathbb{R}$.
– robert bristow-johnson May 02 '23 at 17:16