
I'm experimenting with the Hilbert transform in Python and just now understanding how potentially severe the edge effects are. Here is the code I'm using:

import numpy as np
import scipy.signal
import matplotlib.pyplot as plt

option = 1

if option == 1:
    x = np.sin(np.arange(100))
elif option == 2:
    x = np.sin(np.arange(100)*np.pi/10)
elif option == 3:
    x = np.sin(np.arange(100)/4)

hilbert_data = scipy.signal.hilbert(x, axis=0)  # analytic signal
real = hilbert_data.real
imag = hilbert_data.imag
amp_data = np.absolute(hilbert_data)  # instantaneous amplitude (envelope)
phase_data = np.unwrap(np.angle(hilbert_data), axis=0)  # instantaneous phase
freq_data = np.diff(phase_data, axis=0)  # instantaneous frequency (rad/sample)

fig, axs = plt.subplots(2, 2, figsize=(12, 5))
axs[0, 0].plot(x)
axs[0, 0].set_title('Signal')
axs[0, 1].plot(amp_data, 'tab:orange')
axs[0, 1].set_title('Amplitude')
axs[0, 1].set_ylim([0.8, 1.2])
axs[1, 0].plot(phase_data, 'tab:green')
axs[1, 0].set_title('Phase')
axs[1, 1].plot(freq_data, 'tab:red')
axs[1, 1].set_title('Frequency')

These are the outputs when I run it with options 1, 2, and 3, respectively:

Option 1: [plots of signal, amplitude, phase, and frequency]

Option 2: [plots of signal, amplitude, phase, and frequency]

Option 3: [plots of signal, amplitude, phase, and frequency]
What's noticeable is how different the edge effects are in each example. With option 2, there are essentially no edge effects. With option 1, the edge effects on amplitude are oscillatory and frequency spikes up at the ends. With option 3, it's the opposite - amplitude spikes and frequency is oscillatory.

Assuming there is reason to believe that a signal is relatively consistent (e.g., the example above is based on a signal with uniform amplitude/frequency over time), is there a way to use the outputs in the middle to correct the outputs at the ends? In the example above, it might make sense to just take the amplitude/frequency from the centermost point; however, for real signals, that would likely be inaccurate. The difficulty is that the edge effects appear to be highly sensitive to small changes in the input, as the examples above show.
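For reference, here is a rough sketch of the kind of check I have in mind for gauging how far the edge effects reach: embed the option-1 signal in a longer one and compare an interior 100-sample slice computed both ways (the lengths and the 1% threshold are arbitrary choices, just for illustration):

import numpy as np
import scipy.signal

# Build a longer version of the option-1 signal, then compare the envelope of an
# interior 100-sample slice computed two ways: from the slice alone vs. cut out
# of the analytic signal of the long record.
x_long = np.sin(np.arange(300))    # same construction as option 1, just longer
x_slice = x_long[100:200]          # interior 100 samples, far from the long record's ends

amp_from_slice = np.abs(scipy.signal.hilbert(x_slice))
amp_from_long = np.abs(scipy.signal.hilbert(x_long))[100:200]

# Indices where the short transform deviates from the "embedded" result by more than 1%
err = np.abs(amp_from_slice - amp_from_long)
print(np.where(err > 0.01)[0])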

Are there ways to determine the instantaneous amplitude, frequency, and phase other than the Hilbert transform? Any thoughts are appreciated.

SuperCodeBrah
  • What is the problem that you're trying to solve? What kind of signal are you analysing, and what are you planning to use the instantaneous phase, frequency and amplitude for? – Jazzmaniac Jul 29 '22 at 16:17
  • I want to use it for time series prediction for financial markets. The instantaneous values would be inputs to a machine learning model. – SuperCodeBrah Jul 29 '22 at 16:24
  • I hadn't learned Python or numpy or scipy. Are the plots we're seeing with these edge effects frequency-domain graphs of the Hilbert transformer frequency response? What are they? – robert bristow-johnson Jul 29 '22 at 16:37
  • They are plots of the instantaneous amplitude, phase, and frequency values over time. The function scipy.signal.hilbert returns the analytic signal and then the instantaneous values are determined using numpy functions. – SuperCodeBrah Jul 29 '22 at 16:41
    Related. Hilbert transform is a convolution, and there's not much special about it, this is a general question on padding in time-frequency analysis. – OverLordGoldDragon Jul 29 '22 at 16:48
  • @OverLordGoldDragon - Yes, that is helpful. I gave you an upvote on your answer. To get the most accurate estimate, would a sliding window approach make sense? For example, let's say the entire series is 10,000 points and the window has 201 points (i.e., 100 additional points on each side but presumably the more the better) - one could use the central point(s) to get the best approximation that is free of edge effects and then that could be performed over the entire series. – SuperCodeBrah Jul 29 '22 at 17:44
  • @SuperCodeBrah I very much doubt that financial time series have meaningful instantaneous frequency, phase or amplitude properties. And I am extremely skeptical regarding the use of a fundamentally non-local feature like the Hilbert transform in machine learning or financial modelling. Also, if you're using machine learning then you should let the algorithm figure out the important features. That's the point of machine learning. Pre-conditioning or -transforming in whatever way only makes sense if you can make a very strong argument for the necessity of such a step. – Jazzmaniac Jul 29 '22 at 18:19
  • I don't know for a fact that it would be effective. The goal is just to test it to find out, but with financial time series the most recent data is the most important, so I'm just trying to see what's out there to get the most from it. – SuperCodeBrah Jul 29 '22 at 18:37
  • HT is hardly the best tool of choice for instantaneous AM/FM localization; it's a 1D transform. CWT/STFT with reassignment is a better fit, see SSQ (so yes @ sliding window); if the transforms yield multiple overlapping lines and nothing particularly clear, it's strong evidence AM/FM localization is useless. – OverLordGoldDragon Jul 29 '22 at 19:57
  • I mostly agree with @Jazzmaniac, especially if the timeseries are short - this doesn't make time-freq useless (example), but other transforms are better suited, such as scattering (my implem to be released soon). Ideally we'd do pure ML, but that's less of an option with limited data, unless we opt for transfer learning. – OverLordGoldDragon Jul 29 '22 at 20:00
  • The transform can be computed by (a) calculating the FFT, (b) multiplying the positive frequencies by $-i$ and the negative by $i$, making $f(0)=1$, and (c) inverse transforming. Because the FFT is used, the jump between the first and last points in your data acts as a false frequency, since the FFT treats the data as periodic. If you make the data length an integer multiple of $2\pi$, this extra frequency effect is minimised, as it also is when you use many more data points. Preferably, make your data decay smoothly to zero at the ends of the range. – porphyrin Aug 02 '22 at 09:35

1 Answer


The Hilbert Transform as implemented is a filter with a long time-domain response (and on its own is non-causal), so it requires time history and a matching delay in order to produce a result. Given a single sample, neither the Hilbert Transform nor the Analytic Signal (the signal as the real component and the Hilbert Transform as the imaginary component) can produce an instantaneous result for the signal's envelope. Further, the more history we have, the more closely we can approximate a true Hilbert impulse response and therefore improve the accuracy of the result, but using this to estimate the instantaneous phase/frequency/amplitude requires that those parameters be consistent over the memory of the filter (which in turn puts constraints on the total bandwidth of the signal).

Further, it should be noted that the additional errors and artifacts seen with the hilbert function in MATLAB/Octave and Python's scipy.signal are due to the simplistic (and fast!) approach those implementations use: simply nulling the negative-frequency bins (the upper half of the FFT bins). Determining a filter's time-domain impulse response from the IFFT of a desired frequency response is called the "Frequency Sampling" method of filter design. It is the most intuitive method when a specific frequency-domain response is desired (just IFFT the response), but it also has the worst performance compared to other methods such as windowing or the optimized "least-squares" and "equiripple" (Parks-McClellan) algorithms used in the firls and firpm or remez commands. This is because the Frequency Sampling method produces a response that is exact at the bin frequencies used but has more error than the other methods at all frequencies in between the FFT bins, and more error in the time-domain response as well. However, none of this completely resolves the time-requirement issue for the OP when using the Hilbert Transform: an averaging interval is still required before the first "valid" sample can be used.
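To make the bin-nulling point concrete, here is a minimal sketch of that construction (my own paraphrase for illustration; the library source may differ in its details):

import numpy as np
import scipy.signal

# Frequency Sampling construction of the analytic signal: leave DC (and Nyquist
# for even lengths) alone, double the positive-frequency bins, null the
# negative-frequency bins, and inverse transform.
def analytic_via_fft(x):
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0                      # DC bin unchanged
    if N % 2 == 0:
        h[N // 2] = 1.0             # Nyquist bin unchanged (even length)
        h[1:N // 2] = 2.0           # positive frequencies doubled
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)       # negative-frequency bins stay at zero

x = np.sin(np.arange(100))
print(np.allclose(analytic_via_fft(x), scipy.signal.hilbert(x)))   # expected to match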

To minimize errors, consider creating the Hilbert Transform by windowing the ideal time-domain Discrete Hilbert impulse response. Alternatively, the Hilbert can be derived from the coefficients of a half-band filter designed with the optimized algorithms, and Python's scipy.signal.remez function includes an option for designing the Hilbert directly.
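As a rough sketch of both routes (the 91-tap length, the Kaiser beta of 8 and the remez band edges below are arbitrary example values, not recommendations):

import numpy as np
import scipy.signal

numtaps = 91                                  # odd length -> integer group delay
M = (numtaps - 1) // 2
k = np.arange(numtaps) - M

# Ideal Discrete Hilbert impulse response: 2/(pi*k) for odd k, 0 for even k
h_ideal = np.zeros(numtaps)
odd = (k % 2) != 0
h_ideal[odd] = 2.0 / (np.pi * k[odd])
h_kaiser = h_ideal * np.kaiser(numtaps, 8.0)  # windowed design

# Equiripple (Parks-McClellan) Hilbert design via remez
h_remez = scipy.signal.remez(numtaps, [0.03, 0.47], [1], type='hilbert')

Either set of coefficients is then applied as an ordinary FIR filter, with the group delay accounted for as discussed further below.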

Below are the results of recent simulations I did, directly comparing the frequency response achievable with both odd and even lengths (91 and 92 samples) using the hilbert function, versus Hilbert filters created by Kaiser-windowing the ideal Discrete Hilbert impulse response.

The following shows the resulting amplitude ripple in the frequency response of a Hilbert impulse response created with 91 and 92 samples using the hilbert function. Notice the significant difference between even and odd lengths; this is a direct effect of the time-domain aliasing caused by the FFT-bin-zeroing approach the hilbert function uses.

[Figure: Hilbert frequency response using the hilbert function]

For an apples-to-apples comparison, below is the frequency response for the better case (92 samples) of the hilbert-derived response, compared with a Kaiser-windowed Discrete Hilbert impulse response using the same number of samples and the same overall frequency coverage:

[Figure: amplitude ripple, FFT (hilbert) vs Kaiser-windowed design]

For Hilbert filters, odd-length impulse responses may be desired since, as linear-phase solutions, they will have an integer sample delay.
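For example, here is a minimal usage sketch with an odd-length design (again with arbitrary example parameters), showing the integer-sample delay compensation:

import numpy as np
import scipy.signal

numtaps = 91                                            # odd -> integer group delay
delay = (numtaps - 1) // 2
h = scipy.signal.remez(numtaps, [0.03, 0.47], [1], type='hilbert')

x = np.sin(np.arange(500) / 4)                          # option-3 style test signal
imag = scipy.signal.lfilter(h, [1.0], x)[delay:]        # Hilbert channel with delay removed
real = x[:len(imag)]                                    # real channel aligned to it

analytic = real + 1j * imag
amp = np.abs(analytic)
phase = np.unwrap(np.angle(analytic))
freq = np.diff(phase)                                   # rad/sample, as in the question

Note that the first several tens of output samples are still within the filter's startup transient, so an edge region remains, consistent with the averaging interval mentioned above.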

Below is a comparison of what can be achieved with different filter design approaches, where each has been matched to the same frequency coverage (narrower than in the hilbert comparison above, but including another Kaiser-windowed solution):

[Figure: comparison of half-band filter design approaches]

To determine the best estimate of instantaneous amplitude, frequency and phase, consider instead sampling the waveform in quadrature with two ADCs, rather than creating the quadrature channel from a single waveform, and correcting for the time offset between the samples as I detail in this post.

Dan Boschen
  • What do you mean by "only stationary", only pure sines? Also can you clarify "exact Hilbert worst, remez better"? Exact Hilbert is optimal in an important sense that's used in time-frequency analysis, though I don't doubt there's better stuff if we're just sticking with 1D. – OverLordGoldDragon Jul 29 '22 at 22:25
  • @OverLordGoldDragon I am not sure I follow your definition of “exact Hilbert” as the true “Hilbert” has an infinite time domain response so is not realizable. You may be thinking of FFT specific periodicity where the ‘hilbert’ function as done would be “exact”. The stationary comment is with regards to time domain filter settling time and for cases where we want to know the instantaneous phase / frequency and amplitude as the OP described - where we need to consider the bandwidth of the signal and the memory of the Hilbert implementation. – Dan Boschen Jul 29 '22 at 23:53
  • Windowing and least squares approaches will provide a better Hilbert response given the same number of coefficients. This is when you are concerned with creating a Hilbert to cover the maximum bandwidth possible; if the bandwidth needed isn't approaching DC or Nyquist the difference may not be noticeable. – Dan Boschen Jul 29 '22 at 23:55
  • By "exact" I referred to "nulling the negative frequency axis bins". What does stationary have to do with filter settling time? Is this referring to stochastic processes or having AM/FM? – OverLordGoldDragon Jul 30 '22 at 03:52
  • @OverLordGoldDragon AM / FM - using “stationary” caused more confusion than it answered so I deleted that – Dan Boschen Jul 31 '22 at 01:06