I should start this off by saying I’m a hobbyist and by no means a student. I’ve been reading “The Audio Programming Book” and am attempting to implement an STFT on the STM32-based Daisy platform.
In order to learn from a working sample, I’ve been using Mutable Instruments’ Shy_fft for FFT processing. So far, I have a plain FFT > IFFT round trip functioning and am beginning to take steps towards an STFT.
I cannot seem to get the STFT functionality working, and am struggling to tell whether I even understand the fundamentals.
Here is the second implementation I tried. The first version was overwritten by the update.
The first version seemed to “work”, but the output was “choppy”, as if frames were being dropped. The second version didn’t seem to output any audio.
EDIT: I should mention that the gist wasn’t updated with a functioning audio output. I’ll try to update it later and remove this edit.
Any tips/comments on where I’m going wrong/how I should be doing things are greatly appreciated.
EDIT 2: See updated NaiveSTFT class here
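For reference, one thing that can produce a “choppy” result like the one described above is a window/hop combination whose overlapped windows do not add up to a constant level. The following is only a minimal sketch for checking that on a desktop machine, assuming a periodic Hann window; `MakeHann` and `OverlapRipple` are hypothetical helper names and are not part of Shy_fft or the NaiveSTFT class.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Periodic Hann window of length n (hypothetical helper, not from Shy_fft).
std::vector<float> MakeHann(std::size_t n) {
    const double kPi = 3.14159265358979323846;
    std::vector<float> w(n);
    for (std::size_t i = 0; i < n; ++i) {
        w[i] = static_cast<float>(0.5 - 0.5 * std::cos(2.0 * kPi * i / n));
    }
    return w;
}

// Adds up copies of the window shifted by `hop` samples and returns the
// ripple (max - min) of that sum over one hop in the fully overlapped
// region. A ripple near zero means overlap-add keeps the level constant;
// a large ripple means the output is amplitude-modulated at the hop rate.
float OverlapRipple(const std::vector<float>& w, std::size_t hop) {
    const std::size_t n = w.size();
    assert(hop > 0 && hop <= n);
    std::vector<float> sum(4 * n, 0.0f);
    for (std::size_t start = 0; start + n <= sum.size(); start += hop) {
        for (std::size_t i = 0; i < n; ++i) {
            sum[start + i] += w[i];
        }
    }
    // Samples [n, n + hop) are covered by every frame that can reach them.
    float lo = sum[n];
    float hi = sum[n];
    for (std::size_t i = n; i < n + hop; ++i) {
        lo = std::min(lo, sum[i]);
        hi = std::max(hi, sum[i]);
    }
    return hi - lo;
}
```

With a 64-sample window, a hop of 32 or 16 should give a ripple close to zero, while a hop of 64 (no overlap) leaves the bare Hann shape on every frame, which would be heard as a level that pulses once per frame.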
fft_buffer_index and ifft_buffer_index, aren't they supposed to be the same through initialisation? Not that this will solve your problem, but if you don't provide some information on how you use the class I am not sure it will be very easy to find where the problem is. Could you provide a minimal working example?
– ZaellixA Sep 17 '23 at 11:22
102, the fft_.Direct() function should be outside the inner for-loop since, if I understood the implementation correctly, it is supposed to act on the whole buffer (which is FFT_SIZE long and offset based on the "hop size" and frame, which seems correct). That means it should be called once the windowing process is finished, which in turn is outside the inner for-loop of line 91. I haven't run the code, so I am not sure this is an issue, but based on my understanding it seems like a problem to me.
– ZaellixA Sep 17 '23 at 11:26
Also, I’ll provide the class in context of its use as well. I should have thought to do that!
– Daniel Lawler Sep 17 '23 at 11:31
Here's the updated class and here's the actual implementation
– Daniel Lawler Sep 17 '23 at 16:02
for-loop starting on line 124)? I believe you should apply an overlap-save/overlap-add technique here instead of windowing the data. This will introduce consecutive Hann windows in the time domain, which will most probably sound like very short successive sounds (since you seem to be using a very short frame size).
– ZaellixA Sep 17 '23 at 19:22
Main.cpp file, IN_L and IN_R are not declared anywhere, so I suppose they are some kind of global variables. Nevertheless, since you do get some audio out, I believe this shouldn't be part of the problem. But I believe that, at least for troubleshooting, you should work on only one channel and with some basic signals whose output is known (so that you have a reference to compare against).
– ZaellixA Sep 17 '23 at 19:24
I think the current issue is twofold. 1: the math behind the hop and frames does not appear to be functioning properly. If you drop the frame size down to 1 and make the buff size and FFT size equal, the signal is almost perfect but still has the underlying digital noise. Which leads us to 2: I cannot seem to figure out where that noise is coming from. I have updated the gists to reflect
– Daniel Lawler Sep 18 '23 at 17:16
Can someone give this one a "final once over" to make sure that I'm not losing my mind, and can assume it's being done correctly?
– Daniel Lawler Sep 19 '23 at 09:25
I want to mention that I'm not doubting you - I just wanted to know if you had any suggestions on how to test. Good points on the buffer size / FFT size making the chops inaudible due to size.
I'm actually not doing any processing, as I wanted to get the signal conversion 1:1 before I try anything in the freq/spec realm, so the signal should be unaltered.
– Daniel Lawler Sep 19 '23 at 10:54
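On the question of how to test: since no spectral processing is applied yet, one option is a null test on a single channel, where a known signal goes through window > forward transform > inverse transform > overlap-add and the result is compared with the input. The sketch below is only an offline illustration under some assumptions: a naive O(N^2) DFT stands in for the real FFT (no Shy_fft calls are used), the window is a periodic Hann, and `Dft`, `Idft`, `StftRoundTrip` and `MaxAbsError` are hypothetical names. Note that each frame is windowed in full before the single forward transform for that frame.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;
using Spectrum = std::vector<std::complex<double>>;

// Naive DFT, used only so the sketch has no dependencies.
Spectrum Dft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    Spectrum out(n);
    for (std::size_t k = 0; k < n; ++k) {
        for (std::size_t t = 0; t < n; ++t) {
            const double phase = -2.0 * kPi * double((k * t) % n) / double(n);
            out[k] += x[t] * std::polar(1.0, phase);
        }
    }
    return out;
}

// Naive inverse DFT, returning the real part scaled by 1/N.
std::vector<double> Idft(const Spectrum& s) {
    const std::size_t n = s.size();
    std::vector<double> out(n, 0.0);
    for (std::size_t t = 0; t < n; ++t) {
        for (std::size_t k = 0; k < n; ++k) {
            const double phase = 2.0 * kPi * double((k * t) % n) / double(n);
            out[t] += (s[k] * std::polar(1.0, phase)).real();
        }
        out[t] /= static_cast<double>(n);
    }
    return out;
}

// Hann-windowed analysis, no processing, overlap-add synthesis.
// Assumes frame / hop is an integer >= 2, so the overlapped windows sum
// to frame / (2 * hop) and `gain` brings the level back to 1.
std::vector<double> StftRoundTrip(const std::vector<double>& in,
                                  std::size_t frame, std::size_t hop) {
    assert(hop > 0 && 2 * hop <= frame && frame % hop == 0);
    assert(frame <= in.size());
    std::vector<double> window(frame), buf(frame), out(in.size(), 0.0);
    for (std::size_t i = 0; i < frame; ++i) {
        window[i] = 0.5 - 0.5 * std::cos(2.0 * kPi * i / frame);
    }
    const double gain = 2.0 * hop / frame;
    for (std::size_t start = 0; start + frame <= in.size(); start += hop) {
        for (std::size_t i = 0; i < frame; ++i) {
            buf[i] = in[start + i] * window[i];   // window the whole frame
        }
        const Spectrum spec = Dft(buf);           // one transform per frame
        // Spectral processing would go here.
        const std::vector<double> y = Idft(spec);
        for (std::size_t i = 0; i < frame; ++i) {
            out[start + i] += gain * y[i];        // overlap-add
        }
    }
    return out;
}

// Largest absolute difference between a and b over samples [from, to).
double MaxAbsError(const std::vector<double>& a, const std::vector<double>& b,
                   std::size_t from, std::size_t to) {
    double worst = 0.0;
    for (std::size_t i = from; i < to; ++i) {
        worst = std::max(worst, std::abs(a[i] - b[i]));
    }
    return worst;
}
```

Feeding in a sine and comparing only the region away from the first and last frame (where fewer windows overlap) should give an error at rounding level if the hop and frame bookkeeping is right. A larger error, or one that repeats once per hop, would point at the indexing rather than at the FFT itself.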