1

I want to find the peak amplitude of an audio signal in a given frequency area.
My problem is that if the signal has a strong resonating frequency between 2 fft bins, then both bins are way below the actual amp.

I wrote this example python code, which:

  1. Generates a sine with amp 1 at a frequency, which perfectly fits the 34th bin. As expected the plot shows a peak of 1.

  2. Generates a sine with amp 1 at a frequency, which is exactly between the frequency of the 33rd and 34th bin. The graph only shows a peak of around 0.75 instead of 1.

I tried some different interpolation methods, but none yielded much more than a peak of 0.75. Also note that I don't need the amp of a specific frequency, but rather the peak amp in a given frequency domain.

import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as interp

for i in (34,33.5): freq = i/204824000 readd = np.sin(2np.pifreqnp.arange(4096)/48000)

Y_k = np.fft.fft(readd)[0:2048]/4096 # FFT function from numpy
Y_k[1:] = 2*Y_k[1:]
fft = np.abs(Y_k)

x = range(0,6)
y = fft[31:37]
xnew = np.linspace(0, 5, num=81, endpoint=True)
f = interp.interp1d(x,y,kind='cubic')
plt.plot(x, y, 'o', xnew, interp.krogh_interpolate(x,y,xnew),'-', xnew, f(xnew), 'o')
plt.show()

lennon310
  • 3,590
  • 19
  • 24
  • 27
Potheker
  • 21
  • 1

3 Answers3

1

You're experiencing the phenomenon called spectral leakage. There are plenty of questions/answers on this subject on this website. You can search for spectral leakage in the search bar (most recent: Hilmar's answer to a similar question).

Increase your fft size by zero-padding with $N$ (try N = 8192 for example) zeros (fft.fft(readd,N)), effectively increasing the frequency resolution (interpolated spectrum).

Jdip
  • 5,980
  • 3
  • 7
  • 29
1

I would recommend multiplying the input to the FFT by a good window and zero-padding to double the length.. Perhaps a good Kaiser with $\beta \approx 6$ or $7$.

Perhaps a better window for this would be a Gaussian window with $\sigma \approx \frac{N}{10}$ with $N$ being the length of the FFT. Make $N$ good and large so that several cycles of most input waveforms can be seen in the window.

$$\begin{align} X[k] \circledast W[k] &= \mathcal{DFT} \Big\{ x[n] w[n] \Big\} \\ &= \sum\limits_{n=0}^{N-1} x[n] w[n] \, e^{-j2\pi nk/N} \end{align} $$

Where $w[n]$ is the window

$$ w[n] = \frac{1}{\sigma \sqrt{2\pi} } e^{-(n-N/2)^2/(2\sigma^2)} $$

Then magnitude-square the FFT and take the logarithm of that:

$$ 10 \log_{10}\Big( \Big|X[k] \circledast W[k] \Big|^2\Big) $$

if you want to see this in dB. But here's the deal, consider a single sinusoid, this Gaussian window has another Gaussian function in the frequency domain for each frequency component. Then in the log-magnitude domain, it's a quadratic. Then you can do this quadratic interpolation to accurately get the location of the peak and the height of it, in the log-magnitude domain.

robert bristow-johnson
  • 20,661
  • 4
  • 38
  • 76
0

Zero padding in the time domain is a simple and effective way to interpolate between the existing frequency samples. This can be done directly in the fft command in python as follows:

fft.fft(wfm, m)

Where $m$ is the total length including that added zeros. The resulting fft will have m total samples as an interpolated spectrum.

Dan Boschen
  • 50,942
  • 2
  • 57
  • 135