FFT analysis of binned data of different length

Question

I am conducting experiments to collect wind speed data from anemometers mounted on a moving platform. Nearby, a fixed wind mast holds a wind vane. Before analysing the wind data from the moving platform, I binned the data by 1-minute average wind direction. Each bin contains a number of continuous data sets (Set 1, Set 2, Set 3, etc.), which do not necessarily have the same number of data points. For example, Set 1 has 20,000 data points, Set 2 has 6,000 data points, and so on.

I am performing an FFT on each of the sets to detect any periodicities in the signal. I am plotting the real amplitude of the sinusoids present in the signal on the y-axis against the frequency of the sinusoid. The FFT of each data set yields different amplitudes on the y-axis, and I'm not sure whether this is a result of the different lengths of the arrays. I am interested in an average value of the y-axis of the spectrum. However, I don't think it is suitable to simply calculate the average, because of the different array sizes. Can someone please suggest a method? I am not used to analysing signals in this way, so I'm a bit confused. Thanks a lot.

You can use Welch's method to compute the averaged periodogram of each data set with the same FFT size. — ThP, Aug 21 '14 at 14:12
Thanks, but I do not have the same FFT size; that's the issue. — user10881, Aug 21 '14 at 19:47
@user10881: Interesting. Before I suggest anything, can you describe the data set more precisely? That is, what is the reason for the unequal data length in each data set? — Neeks, Aug 22 '14 at 04:34
@Neeks: Thanks for your reply. In short, I am conducting open field experiments over a number of hours. I am binning my data according to wind direction. However, I am interested in continuous wind speed data, so each wind direction bin is composed of several sets, each set being a continuous period of measurement. These sets are of unequal length because, for example, Set 1 had 2500 data points of continuous readings and Set 2 had 1000 data points. I am plotting the real amplitude of the sine wave on the y-axis (fft(data)/N/2). I wish to compare the FFT amplitude of each set. — user10881, Aug 22 '14 at 07:57
@Neeks: However, since they are not of the same length, I believe I need to account for this discrepancy somehow, but I haven't thought of a way yet. — user10881, Aug 22 '14 at 07:58
@user10881: My suggestion, using Welch's method, results in the same FFT length regardless of the data size. — ThP, Aug 22 '14 at 10:38
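
ThP's suggestion can be sketched roughly as follows. This is only an illustration in Python with NumPy, not code from the thread: the helper name `welch_psd`, the sampling rate `fs`, the segment length `nperseg`, the Hann window and the 50 % overlap are all assumptions. Each record is cut into segments of the same length, so the resulting spectrum has the same number of bins no matter how long the record is.

```python
import numpy as np

def welch_psd(x, fs, nperseg):
    """Welch-type PSD estimate with a fixed segment length (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if len(x) < nperseg:
        raise ValueError("record is shorter than one segment")
    window = np.hanning(nperseg)            # symmetric Hann window (assumed choice)
    scale = fs * np.sum(window ** 2)        # power spectral density scaling
    step = max(1, nperseg // 2)             # 50 % overlap between segments (assumed)
    periodograms = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = seg - seg.mean()              # remove the mean of each segment
        p = np.abs(np.fft.rfft(seg * window)) ** 2 / scale
        # fold negative frequencies into the one-sided spectrum
        if nperseg % 2 == 0:
            p[1:-1] *= 2.0                  # DC and Nyquist bins are not doubled
        else:
            p[1:] *= 2.0                    # only the DC bin is not doubled
        periodograms.append(p)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    # average the periodograms of all segments of this record
    return freqs, np.mean(periodograms, axis=0)
```

Calling this with the same `fs` and `nperseg` on Set 1 and Set 2 returns spectra on an identical frequency axis, so they can be compared or averaged bin by bin. Longer records simply contribute more segments to the average.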

score 0 · Accepted Answer · edited Apr 13 '17 at 12:47

Let's say you have several data sets $x_i$ and $N_i$ is the length of the $i$th data set.

First, you're right that the amplitude of the FFT output scales with the length of the data set. To normalize, divide the FFT output of each data set by its length $N_i$. (Note that, depending on the FFT implementation, some scaling might already be applied; check the documentation or see this question.)

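To combine all data sets, append $(\max_i N_i) - N_i$ zeros to each data set before taking the FFT, so that every FFT has length $\max_i N_i$. The normalization still uses the original length $N_i$, not the padded length. The frequency axes are then aligned correctly and you can average the FFT outputs bin by bin.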
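
The two steps above (normalize by $N_i$, zero-pad to $\max_i N_i$) can be sketched in Python with NumPy. The function name `averaged_amplitude_spectrum` and the sampling rate `fs` are assumptions made for illustration. The sketch averages magnitudes rather than complex FFT values, which is one possible reading of "average of all FFT outputs".

```python
import numpy as np

def averaged_amplitude_spectrum(data_sets, fs=1.0):
    """Average the one-sided amplitude spectra of records of different length."""
    n_fft = max(len(x) for x in data_sets)      # common FFT length: max_i N_i
    spectra = []
    for x in data_sets:
        x = np.asarray(x, dtype=float)
        # rfft with n=n_fft appends n_fft - len(x) zeros before transforming
        X = np.fft.rfft(x, n=n_fft)
        # normalize by the true record length N_i, not the padded length
        spectra.append(np.abs(X) / len(x))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)  # shared frequency axis
    return freqs, np.mean(spectra, axis=0)
```

Dividing by the padded length instead of `len(x)` would make short records look weaker than long ones, because the appended zeros add no energy.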

Btw, I would plot the absolute value of the FFT output, not the real part. Sometimes I also find it useful to "smooth" the spectrum by averaging every $M$ neighbouring FFT bins. Lastly, I have assumed that the data in every set were acquired at equidistant time steps; otherwise the FFT isn't applicable at all.
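
A minimal sketch of the smoothing step, assuming that "averaging every $M$ neighbouring bins" means averaging non-overlapping blocks of $M$ bins (a moving average would be another reasonable reading). The function name `smooth_bins` is hypothetical.

```python
import numpy as np

def smooth_bins(freqs, amplitude, m):
    """Average non-overlapping blocks of m neighbouring spectrum bins."""
    freqs = np.asarray(freqs, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    n_blocks = len(amplitude) // m      # trailing bins that do not fill a block are dropped
    trimmed = n_blocks * m
    # each row holds one block of m bins; the row mean is the smoothed value
    f = freqs[:trimmed].reshape(n_blocks, m).mean(axis=1)
    a = amplitude[:trimmed].reshape(n_blocks, m).mean(axis=1)
    return f, a
```

The returned frequency of each block is the mean of the frequencies it contains, so the smoothed curve stays aligned with the original axis. Apply it to the absolute value of the spectrum, not to the complex FFT output.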

OK, this is bad. The only answer, accepted, with a score of -1. Is this the correct answer or not? — giusti, Sep 13 '17 at 19:52
