
In the literature on the short-time Fourier transform (STFT) it is often stated or implied that the constant overlap-add (COLA) constraint must be satisfied in order for the STFT to be invertible.

Example: https://ccrma.stanford.edu/~jos/sasp/FBS_Perfect_Reconstruction.html

People seem to go to a lot of trouble to choose a window/overlap combination that satisfies COLA.

I'm challenging the community here to "change my view" that the COLA constraint is sufficient but not necessary. I contend that you can achieve perfect invertibility under much weaker conditions and so have access to a very broad range of windows and overlaps.

I've written this blog post to examine this question. I would love to know if I am missing something.

This script provides an example of a window that is not COLA compliant, but which creates an invertible STFT.
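For readers who cannot reach the linked script, here is a minimal sketch of the same idea (it is not the linked script). It assumes a strictly positive window, frames that tile the signal exactly, and inversion by overlap-add followed by division by the summed windows. The names `stft`, `istft`, `cola_sums`, and the ramp window are illustrative choices, not taken from the blog post.

```python
import numpy as np

def stft(x, window, hop):
    """Windowed DFT of each frame; no padding, frames must tile x exactly."""
    n = len(window)
    return np.array([np.fft.fft(window * x[s:s + n])
                     for s in range(0, len(x) - n + 1, hop)])

def istft(frames, window, hop):
    """Overlap-add the inverse DFTs, then divide by the summed windows."""
    n = len(window)
    length = n + hop * (len(frames) - 1)
    num = np.zeros(length)  # overlap-added windowed segments
    den = np.zeros(length)  # overlap-added windows
    for k, spectrum in enumerate(frames):
        s = k * hop
        num[s:s + n] += np.fft.ifft(spectrum).real
        den[s:s + n] += window
    return num / den  # valid only because den > 0 at every sample

n, hop, num_frames = 16, 6, 11
window = np.linspace(0.2, 1.0, n)  # assumed example: a strictly positive ramp

# COLA would require the hop-shifted copies of the window to sum to a constant.
cola_sums = np.array([window[m::hop].sum() for m in range(hop)])
is_cola = np.allclose(cola_sums, cola_sums[0])

rng = np.random.default_rng(0)
x = rng.standard_normal(n + hop * (num_frames - 1))
x_rec = istft(stft(x, window, hop), window, hop)
max_error = np.max(np.abs(x - x_rec))

print("COLA satisfied:", is_cola)
print("max reconstruction error:", max_error)
```

The division by `den` is what replaces the COLA requirement in this sketch: the summed windows only need to be nonzero at every sample, not constant.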

gauss256
  • 1
    Interesting. I'd suggest showing a non-trivial example where COLA is not satisfied but you can still invert the STFT. – MBaz May 04 '18 at 22:58
  • What does CMV mean, by the way? – Laurent Duval May 04 '18 at 23:23
  • @MBaz I've run dozens of examples like that. I'll post some code. – gauss256 May 05 '18 at 03:03
  • @Laurent Duval CMV = change my view. A reference to a Reddit subreddit that is probably not too helpful in this context! – gauss256 May 05 '18 at 03:04
  • 1
    You have ripped the original statement out of context. STFT is invertible using FBS exactly if the window is COLA. It's trivially obvious that other windows that do not vanish can be inverted by dividing by the window function explicitly. – Jazzmaniac May 05 '18 at 10:13
  • @Jazzmaniac I think we agree that the STFT is trivially invertible without COLA. Where we might disagree is whether anyone is confused about that. I don't see why anyone thinks about COLA at all and yet it shows up in many discussions of the STFT. – gauss256 May 05 '18 at 15:18
  • @gauss256, FBS inversion comes with nice properties that you want to have in many cases. It's the basis for many established algorithms. JOS@CCRMA is quite aware of all that and maybe you should read his texts to understand why "anyone thinks about COLA at all". I don't see any reason for continuing this discussion. – Jazzmaniac May 05 '18 at 16:37
  • @Jazzmaniac It was the books of JOS@CCRMA that led me to pose the question. Of course inversion is useful. But COLA is not needed for inversion, so why is it even mentioned? You are welcome to leave the discussion at any time. I am still interested in an answer to my question. – gauss256 May 05 '18 at 23:58
  • Not just inversion, FBS inversion, which stands for filter band summation. Why are you ignoring this important fact? FBS inversion is what gives STFT inversion the nice properties you'd expect from a filter bank. And again, that is the basis for many useful algorithms. Non-FBS inversion isn't. – Jazzmaniac May 06 '18 at 07:39
  • @Jazzmaniac In https://www.dsprelated.com/freebooks/sasp/Filter_Bank_Summation_FBS.html#chap:fbs, FBS is presented as an alternative interpretation of OLA. They are mathematically equivalent. If so, then FBS inversion is possible if and only if STFT inversion is possible. Are you claiming that the FBS version of COLA is necessary for inversion, even though it is not for OLA? – gauss256 May 06 '18 at 15:57
  • Dividing by the window is not part of critically sampled multi band reconstruction, no matter how you call it. I am saying that you are wrong for ignoring context and definitions and misinterpreting statements. – Jazzmaniac May 06 '18 at 16:14
  • Your blog renders formulae in light grey, which is hard to read. – LIU Qingyuan Oct 14 '20 at 07:06
  • @LIU Qingyuan You're right, sorry about that. The blogging platform seems to have changed the stylesheet it's using. I don't know how to fix it myself, I'd have to work through their tech support. I hope that you can find a way to read it as is. – gauss256 Oct 14 '20 at 21:09
  • Thanks for your article, I agree COLA is often presented as necessary in general rather than FBS only, which is a problem worthy of your post here; I cited you. – OverLordGoldDragon Apr 05 '23 at 12:00

1 Answer


You just triggered an old itch, so this is a partial (in both senses) answer only. For the discrete-time, uniformly sampled STFT, one often uses the usual fixed window with $3/4$, $1/2$, or $1/4$ overlap, together with direct closed-form inverses. However, as long as there is redundancy (and completeness), there are infinitely many possible inverses.
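To spell out why redundancy gives many inverses, here is a short linear-algebra sketch. The notation is introduced only for illustration and is not taken from the papers cited below: $A$ is the analysis STFT written as a tall matrix with full column rank (redundant and complete), and $C$ is an arbitrary matrix of compatible size.

$$
\tilde{A} = (A^H A)^{-1} A^H, \qquad B = \tilde{A} + C\,(I - A\tilde{A}), \qquad BA = \tilde{A}A + C\,(A - A\tilde{A}A) = I .
$$

Every choice of $C$ gives a valid synthesis operator $B$; the pseudo-inverse $\tilde{A}$ is simply the case $C = 0$.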

In a collaborative work (with Jean-Christophe Pesquet and Jérôme Gauthier), following other works on vector frames, we used the formalism of complex oversampled multirate filter banks to model the analysis STFT, with the aims of

  • checking the conditions under which the STFT is invertible, given a hop between time frames,
  • building "better" inverses (better localized in time or frequency) with optimization.

Our motivation at that time was twofold:

  • for data in higher dimensions $d$, reduce the STFT redundancy $r^d$ as far as we could,
  • for 1D data, reduce the number of channels and increase the hop as far as we could at similar quality, for real-time event detection.

Everything was based on polyphase matrices, as we considered the windowed transform only as a special case; the main outcomes were the following:

  • given any redundant analysis filter bank (complex or not, windowed-DFT like or not), check whether it is invertible
  • in most cases, anything redundant is almost surely invertible,
  • windows with negative values can be used, like Kaiser windows; see On the Optimization of Oversampled DFT Filter Banks, 2007,
  • you can reduce the degrees of freedom in the oversampled synthesis to perform better concentrated inverses after selection/denoising for (close-to) real-time computations,
  • ensure the Hermitian character in the inverse for scalar threshold.

There were remaining issues, among them the difficulty of returning to the orthogonal setting when the redundancy tends to one. Weird behaviors were observed, possibly related to Farey sequences, but I have no definitive clue. References can be found in Optimization of Synthesis Oversampled Complex Filter Banks, 2009. Some related code is available in the SURE-LET Optimal oversampled Complex Filter Banks synthesis toolbox. We do plan a better release, but it has remained "upcoming" for months.

However, we did not investigate the non-COLA effect, and the necessary condition is not stated. My belief is that it is not necessary (due to redundancy), but this should be proven.

Laurent Duval
  • 1
    I think that the blog post I linked to in my question proves pretty conclusively that COLA is not necessary, under a wide variety of conditions. – gauss256 May 05 '18 at 03:06