How do you find a specific frequency's volume corresponding to that signal's FFT?

Question

I've been dabbling in FFTs for a while now and I came across a problem:

We have a signal composed of two different frequencies, A and B. Frequency A is much louder than frequency B; that is, frequency A's average amplitude is much greater than frequency B's in the sample. Also, let's say the content of the two is equal: there is an equivalent amount of frequency A as there is of B.

So we take the FFT of the signal sample, and our FFT graph looks like two spikes of equivalent magnitude around frequencies A and B. Here's the problem, though: if we reversed (inverted) the FFT, it would produce a signal where A and B have equivalent loudness, or equivalent amplitude.

How can we preserve the amplitude or loudness of each signal in the FFT, so that we know which frequency has which loudness or amplitude? If this problem were solved, we could reverse the FFT and frequency A would be louder, or have a greater average amplitude, than B.

So we would have our FFT vector, and a loudness/amplitude vector directly corresponding to the loudness of each frequency in the FFT vector.

How can this be done?

Be careful not to mix up amplitude and loudness, they are very different things. Amplitude is a physical quantity, loudness a perceptual. A can have a lower amplitude than B but still be louder (depending on the frequencies). — Hilmar, Feb 25 '22 at 08:55
what is "amount" of frequencies? That doesn't exist: if you mean "power", "amplitude" or "magnitude" with "amount", then that would be in conflict with one being louder than the other. — Marcus Müller, Feb 25 '22 at 19:55

Dan Boschen · Answer 1 · 2022-02-25T00:57:44.340

"So we take the FFT of the signal sample, and our FFT graph looks like two spikes of equivalent magnitude around frequencies A and B."

This is generally not true if the FFT is done properly, with consideration of resolution bandwidth, scalloping loss and dynamic range. The magnitude at each frequency is directly proportional to the amplitude of that signal component. Every bin in the FFT represents the magnitude and phase of a component given as samples of the basis function $Ke^{j\omega t}$, and every real sinusoid consists of two such basis functions, as given by Euler's formula.
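As a minimal sketch of that proportionality, assuming NumPy and some made-up values (the sample rate, tone frequencies and amplitudes below are hypothetical, and both tones are chosen to fall exactly on bin centers so no window is needed):

```python
import numpy as np

fs = 1000.0               # sample rate in Hz (assumed for illustration)
N = 1000                  # number of samples: T = N / fs = 1 s, bin spacing 1 Hz
t = np.arange(N) / fs

f_a, amp_a = 50.0, 1.0    # "loud" tone A (hypothetical values)
f_b, amp_b = 120.0, 0.1   # "quiet" tone B (hypothetical values)

x = amp_a * np.sin(2 * np.pi * f_a * t) + amp_b * np.sin(2 * np.pi * f_b * t)

X = np.fft.rfft(x)                     # one-sided FFT of a real signal
freqs = np.fft.rfftfreq(N, d=1 / fs)   # frequency of each bin in Hz

# A real sinusoid of amplitude A splits into two complex exponentials of
# amplitude A/2 (Euler's formula), so the one-sided amplitude is 2*|X[k]|/N.
# (The DC and Nyquist bins have no mirror image and would not be doubled.)
amplitude = 2 * np.abs(X) / N

bin_a = int(np.argmin(np.abs(freqs - f_a)))
bin_b = int(np.argmin(np.abs(freqs - f_b)))
est_a = amplitude[bin_a]
est_b = amplitude[bin_b]

print(f"tone A: {freqs[bin_a]:.1f} Hz, amplitude ~ {est_a:.4f}")
print(f"tone B: {freqs[bin_b]:.1f} Hz, amplitude ~ {est_b:.4f}")
```

The two spikes come out with different heights, in the same 10:1 ratio as the amplitudes that went in. The phase of each component is available in the same bins via `np.angle(X)`.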

To distinguish two frequencies properly, we need to ensure they are separated by significantly more than the resolution bandwidth of the FFT, which without further windowing is given as $\Delta F = 1/T$, where $T$ is the time duration in seconds and $\Delta F$ is the resolution bandwidth in Hz. Equivalently, this is $1/N$ in normalized frequency (cycles per sample) for an $N$-sample record, i.e. the spacing of one bin. The issue with not windowing is that the FFT will have significant sidelobes, which limit our ability to discern two tones of significantly different magnitude. Windowing improves this considerably by reducing sidelobes (and reducing scalloping loss, which is the variation in amplitude we would observe in a single bin when the actual frequency falls between bin centers), but at the expense of increasing the resolution bandwidth.
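To illustrate scalloping loss, here is a rough sketch, assuming NumPy; the record length and tone frequencies are arbitrary choices, and `peak_amplitude` is a hypothetical helper written for this example. It estimates the amplitude of a unit-amplitude tone from the largest FFT bin, once with the tone on a bin center and once halfway between two bins, with and without a Hann window:

```python
import numpy as np

def peak_amplitude(x, window):
    """Estimate a tone's amplitude from the largest bin of the windowed FFT."""
    X = np.fft.rfft(x * window)
    # Dividing by the window sum (its coherent gain) undoes the window's
    # attenuation; the factor 2 accounts for the negative-frequency half.
    return 2 * np.abs(X).max() / window.sum()

N = 1024
n = np.arange(N)
rect = np.ones(N)       # rectangular window, i.e. "no windowing"
hann = np.hanning(N)    # NumPy's Hann window

on_bin = np.sin(2 * np.pi * 100.0 * n / N)    # exactly 100 cycles in the record
off_bin = np.sin(2 * np.pi * 100.5 * n / N)   # halfway between bins 100 and 101

rect_on = peak_amplitude(on_bin, rect)
rect_off = peak_amplitude(off_bin, rect)
hann_on = peak_amplitude(on_bin, hann)
hann_off = peak_amplitude(off_bin, hann)

print(f"rectangular: on-bin {rect_on:.3f}, off-bin {rect_off:.3f}")
print(f"Hann:        on-bin {hann_on:.3f}, off-bin {hann_off:.3f}")
```

With these settings, the off-bin estimate should drop to roughly 0.64 without a window (about 3.9 dB of scalloping loss) and to roughly 0.85 with the Hann window (about 1.4 dB). The price is the wider main lobe, so two tones need to be further apart to be resolved.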

These details are covered further in these posts:

How to calculate resolution of DFT with Hamming/Hann window?

What's the ideal FFT window for measuring a group of signals of differing amplitudes but close in frequency?

Specific Frequency Resolution

How do I calculate peak amplitude of the signal components after zero padding and FFT?

score 0 · Answer 2 · answered Feb 24 '22 at 19:07

Assuming the two sine waves are integer periodic in the width of the FFT's aperture (that is, each completes a whole number of cycles in the analyzed record), your assumption in paragraph 2 is incorrect. If the two sine waves have different amplitudes, the "spikes" in a graph of the FFT result will have two different heights.
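A quick way to convince yourself that the amplitudes survive the round trip is to invert the FFT and compare. This is only a sketch, assuming NumPy, with arbitrary bin numbers, amplitudes and phase chosen so that both tones are integer periodic in the record:

```python
import numpy as np

N = 256
n = np.arange(N)
k_a, k_b = 8, 20   # whole cycles per record for tones A and B (arbitrary)

# Tone A has amplitude 1.0, tone B has amplitude 0.2 and some phase offset.
x = 1.0 * np.cos(2 * np.pi * k_a * n / N) + 0.2 * np.cos(2 * np.pi * k_b * n / N + 0.7)

X = np.fft.fft(x)

# Inverting the full FFT gives back the original samples, loud A and quiet B.
x_back = np.fft.ifft(X).real
max_error = np.max(np.abs(x - x_back))

# Keeping only tone B's two bins (positive and negative frequency) and
# inverting recovers tone B alone, still at its original amplitude.
X_b = np.zeros_like(X)
X_b[[k_b, N - k_b]] = X[[k_b, N - k_b]]
b_only = np.fft.ifft(X_b).real
amp_b = np.sqrt(2 * np.mean(b_only ** 2))   # amplitude from RMS of a sinusoid

print(f"round-trip error: {max_error:.2e}")
print(f"recovered amplitude of B: {amp_b:.4f}")
```

No separate loudness vector is needed here: the magnitude of each complex bin already carries the amplitude, and its angle carries the phase.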
