Mitigating high-frequencies after sampling

Question

I have a dummy signal that I am sampling at $T = 100 \ \mu s$. According to the Nyquist frequency, I can identify frequencies up to $\pi/T = 31416 \ rad/s$, as we can see in the FFT of my signal (blue line of the first figure). For post-processing purposes, I only need the frequencies below $6000 \ rad/s$. So, what can I do to mitigate (or even make it zero?) the frequencies above $6000 \ rad/s$? Low pass filters? Any sort of windowing?

My goal:

I applied dispersion to the sampled signal (blue) and try to recover it by testing some techniques to correct the dispersion (the red profile is after recovering). Nevertheless, my approach cannot "see" any frequencies above $6000 \ rad/s$. Now, when I compared the original sampled signal and the recovered one, they have an annoying difference (the third figure shows the difference between them). Also, the recovered one seems to have some leakage. I would like to know how you would suggest dealing with this issue. I have the freedom to either treat the original signal or the recovered signal (or both).

The signals: original -- recovered

Thanks.

It would help if you shared sample data or code to generate it. — OverLordGoldDragon, Jun 22 '21 at 09:03
How can I add a data file here? I have not seen an option to attach files. — jeb, Jun 22 '21 at 09:15
Link an uploaded file (e.g. ufile, dropbox, drive), or last resort paste raw values, delimited, here. (Also use @ if replying to non-authors of a post to notify.) — OverLordGoldDragon, Jun 22 '21 at 09:29
@OverLordGoldDragon Thanks for the links. I tried with the Pastebin. Here is the link: https://pastebin.com/fK6i2vZ4 . The first column is the original time domain signal, sampled at T = 100 us. The second column is the recovered signal. The plots that I showed correspond to the FFT and the difference (error) between them. — jeb, Jun 22 '21 at 12:10

Dan Boschen · Answer 1 · 2021-06-22T11:19:29.203

It appears that your equalization for dispersion worked well for the positive frequencies if I read the plots correctly (at least on a error magnitude plot, to evaluate that visually better I recommend plotting in dB).

If you only need frequencies below 6000 rad/s, consider decimating your signal to a lower rate as the first step in processing. Decimating to a sampling rate of 14000 to 15000 rad/s would allow reasonable margin for simpler filtering in the decimation process but really simplify downstream processing over the current sampling rate of 62,832 rad/s. The decimation filter when properly designed will also remove all higher frequency noise that may otherwise alias in.

Unless your time domain waveform is a series of square pulses (which I don't recommend as they require infinite frequency support), your first plot may be suffering from significant spectral leakage. If your waveform is square pulses, the subsequent filtering to be suggested further below will help with that but in any case I recommend windowing your signal in time which will provide greater dynamic range (at the expense of frequency resolution). A great window choice for time-frequency localization is the Kaiser Window which comes very close with less computation to the ideal DPSS Window for sampled systems. To proceed with this create a window function in time matching the length of your data, multiply your data sample by sample with that window and then take the FFT. You should see a dramatic improvement in the spectral floor. I demonstrate this below with the DTFT of the Kaiser Window compared to a default rectangular window. The first parameter is the number of samples (so this one below was just 30 samples long) and the second parameter is the $\beta$ factor this window uses; the higher the value for $\beta$ (in this case we used 10) the deeper the side-lobes (higher dynamic range) and wider the main lobe (less frequency resolution).

It also appears that you may have some interpolation distortion (did you up-sample your signal?). If so, filtering can be optimized with multi-band filters at the interpolation images, but the need for that will be clearer once you review your windowed FFT spectrum.

Also it appears that you only wanted to compensate for the positive frequencies of an otherwise real signal. Don't you want to provide your equalization to both the positive and negative frequencies (the regions you showed in red)? The error is not indicating that.

My preferred and simple low pass filtering technique is a least squared linear phase filter, which provides optimum filtering in a least squared sense (will be guaranteed to have the least error from target frequency bands for a given number of filter taps) and is linear phase so will not contribute any further dispersion (importantly).

Least squared filter coefficients can be determined easily from MATLAB, Octave and Python scipy.signal using the firls function. The transition band between pass band and stop band will drive the number of taps needed to achieve a certain rejection requirement.

Finally for doing such equalization, compare your technique to the Wiener-Hopf equations as summarized in this post here which also provides a least-squared solution to the equalization (dispersion correction) problem.

Thanks for the detailed explanation. I put my data in this link https://pastebin.com/fK6i2vZ4 as suggested from a previous post. The 1st column is my original data, sampled at T = 100 us. The second column is the reconstructed data. There is one thing that I did not mention. I sampled at T=100 us because I need to know the signal at this time step. Therefore, I do not think I can downsize the signal, otherwise, I will lose information. I tried to use T = 1ms, and indeed, it works better. — jeb, Jun 22 '21 at 12:18
@jeb To say you need the signal at this time step and that you don't need the high frequencies associated with that time step are counter-statements. It is likely an issue with the processing approach used and you can downsample to my suggested rate with NO loss of information within the bandwidth I gave. What are the specific objectives of your data set both the frequency range of interest and the definition of the distortion you are trying to eliminate or target to get to? — Dan Boschen, Jun 22 '21 at 12:29
I am trying to simulate an algorithm to correct dispersion. So, first, a create a dummy sinusoidal signal (let's call it input) with the sampling rate that I will need (31416 rad/s). Then, I apply dispersion to the signal and try out my algorithm to correct it (let's call it output). But, this algorithm cannot recover all the frequencies (only those below 6000 rad/s). Now, when I try to compare both I will have that difference. One obvious conclusion is: my algorithm cannot see higher frequencies. — jeb, Jun 22 '21 at 13:18
How do I create a signal with that high sampling rate but disregard its high frequencies, so I can properly evaluate my algorithm? I do not know if that is even possible. I tried to apply a very simple low pass filter in the input, and they looked more similar. Still, it looks like I am missing something basic from DSP theory. — jeb, Jun 22 '21 at 13:20
If you are using a tone to measure dispersion, you can only really measure that one frequency (except your imperfect tone due to spectral leakage is “lighting up” the other frequencies to test them as well but that is far from ideal since those levels are so low. I think the issue is your algorithm — Dan Boschen, Jun 22 '21 at 14:35
What is typically done for such "channel estimation" is to use a spectrally rich sounding signal which are often either frequency chirps or pseudo-random sequences, in both cases ideally approximating white noise (flat over the band of interest) source which would properly "probe" each frequency with equal power providing a consistent estimate for each data point in frequency. — Dan Boschen, Jun 22 '21 at 15:47

Mitigating high-frequencies after sampling

1 Answers1