
Working with my own MATLAB implementation of the short-time Fourier transform (STFT), I've managed to write code for the analysis step where a 1D time-domain signal $s[t]$ is progressively windowed, taken into the Fourier domain and arranged in a 2D matrix $S[t,\omega]$. Each element in the 2D matrix is a function of time $t$ and frequency $\omega$.

This presentation is a rather nice overview of the STFT, and gives a number of equations detailing the analysis and re-synthesis steps.

However, I would like to be able to arbitrarily modify $S[t,\omega]$, and then use the re-synthesis to get back $s[t]$.

I believe that I should be able to change $S[t,\omega]$ in whatever way that I want, and then obtain $s[t]$ by the re-synthesis procedure. This seems to be very similar to the idea of the phase vocoder.

As noted in section 3.1 of the presentation, the time-domain signal $s[t]$ can be recomposed using a least-squares procedure. This is given as Equation (6) in the 1984 paper by Griffin and Lim.

The least-squares procedure is needed whenever $S[t,\omega]$ has been modified, because a modified matrix is not guaranteed to be the STFT of any actual time-domain signal.

Question:

  • What does Equation (6) of the Griffin and Lim paper mean?
  • What steps do I follow to numerically implement Equation (6)?

In the presentation, the equation is written in a slightly different way:

$$x(n)=\frac{\sum_{l=-\infty}^\infty w\left(n-lI\right)y\left(lI, n\right)}{\sum_{l=-\infty}^\infty w\left(n-lI\right)^2}$$

Note that $x(n)$ is the re-synthesized time-domain sequence, $w(n)$ is the window function, $I$ is the hop size between successive frames, and $y(lI, n)$ is the time-domain version (the inverse FFT) of the column of the 2D matrix that belongs to frame $l$.

Steps:

From the presentation, here are the steps that I think are required to do the re-synthesis:

  1. Let w_n be the discrete window vector and y_n(:,k) be the time-domain vector computed by applying the IFFT to column k of the 2D matrix. Both w_n and y_n(:,k) have the same length.

  2. Then, using MATLAB syntax, we compute the point-by-point multiplication: w_n .* y_n(:,k)

  • Is this the numerator of the expression above?
  • What happens during steps 3 and 4?
  • What do the infinite summations signify?
Nicholas Kinar
  • Yes, w_n .* y_n(:,k) looks like the numerator, except that perhaps you'll need to make y_n the same length as w_n: y_n(n:n+M,:) where M is the window length. – Peter K. Oct 24 '12 at 13:22
  • @PeterK: Thanks, Peter. So if y_n has a greater length than the window size due to the original time-domain signal being zero-padded, how do I cut w_n .* y_n(:,k)? Why would I want to take the very beginning of the signal y_n(n:n+M,:), and how do I overlap-add the frames to reconstruct the signal? How do I deal with the denominator? – Nicholas Kinar Oct 24 '12 at 14:21
  • Generally, y_n will have a much longer length than the window size --- zero padded or not. Therefore, you need to choose some part of y_n that is the same size as w_n. I probably got the indices wrong: if you choose the front M samples it'll be more like y_n(n-M+1:n,k). – Peter K. Oct 24 '12 at 20:12

1 Answer


The least-squares re-synthesis procedure is very similar to the overlap-add (OLA) procedure.

Let w_n be the discrete window vector and y_n(:,k) be the time-domain vector computed by applying the IFFT to column k of the 2D matrix. Both w_n and y_n(:,k) have the same length.

Then, using MATLAB syntax, we compute the point-by-point multiplication with the window:

w_n .* y_n(:,k)

As mentioned in the comments above, y_n is trimmed to the same length as the window w_n. The product w_n .* y_n(:,k) is then overlap-added, frame by frame, in the same fashion as in the code associated with my previous post on the inverse STFT. The overlap-added sequence is the numerator of the expression shown in the original question above.

The same overlap-add operation is applied to the squared window w_n.^2, using the same hop size and frame positions. This is the denominator of the expression shown in the original question above.
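
To see why the two overlap-added sequences appear, it may help to read the expression as the minimizer of a squared error. The following is a sketch of that reasoning as I understand the Griffin and Lim derivation, written in the notation of the question; the symbol $D$ for the error is my own label. The term $w(n-lI)\,x(n)$ is the windowed frame $l$ that a candidate signal $x(n)$ would produce, and $y(lI,n)$ is the frame actually obtained from the modified matrix:

$$D=\sum_{l=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left[w\left(n-lI\right)x(n)-y\left(lI,n\right)\right]^2$$

Setting the derivative with respect to each sample $x(n)$ to zero gives

$$\frac{\partial D}{\partial x(n)}=2\sum_{l=-\infty}^{\infty}w\left(n-lI\right)\left[w\left(n-lI\right)x(n)-y\left(lI,n\right)\right]=0
\quad\Longrightarrow\quad
x(n)=\frac{\sum_{l=-\infty}^\infty w\left(n-lI\right)y\left(lI, n\right)}{\sum_{l=-\infty}^\infty w\left(n-lI\right)^2}$$

Because the window has finite length, only the few frames that overlap sample $n$ contribute non-zero terms, so the infinite sums reduce in practice to a sum over the overlapping frames, which is exactly what the overlap-add does.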

Then, the final re-synthesized output is simply the point-by-point division of the numerator by the denominator, taken at the samples where the denominator is non-zero. In MATLAB syntax,

numerator ./ denominator
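
The steps above can be sketched end to end as follows. This is a minimal illustration rather than the exact code from my implementation: the window length M, hop size I, frame count K, the hand-built Hann window, the random test signal, and the 1e-10 threshold are all assumptions chosen for the example, and the FFT length is assumed equal to the window length so that no trimming of y_n is needed.

```matlab
% Minimal sketch of least-squares (weighted overlap-add) re-synthesis.
% All sizes and the window choice below are illustrative assumptions.
M = 64;                 % window length
I = 16;                 % hop size between successive frames
K = 20;                 % number of frames
N = (K-1)*I + M;        % number of samples covered by the frames
s = randn(N, 1);        % test signal standing in for s[t]
w_n = 0.5 - 0.5*cos(2*pi*(0:M-1)'/M);   % Hann window, built by hand

% Analysis: window each frame and take its FFT (one column per frame).
S = zeros(M, K);
for k = 1:K
    idx = (k-1)*I + (1:M)';
    S(:,k) = fft(w_n .* s(idx));
end

% (Any modification of S would go here.)

% Back to the time domain, one frame per column. real() discards the
% imaginary part, which is only round-off when S is left unmodified.
y_n = real(ifft(S));

% Overlap-add the numerator and the denominator separately.
numerator   = zeros(N, 1);
denominator = zeros(N, 1);
for k = 1:K
    idx = (k-1)*I + (1:M)';
    numerator(idx)   = numerator(idx)   + w_n .* y_n(:,k);
    denominator(idx) = denominator(idx) + w_n.^2;
end

% Divide only where the summed squared windows are not (near) zero.
x = zeros(N, 1);
ok = denominator > 1e-10;
x(ok) = numerator(ok) ./ denominator(ok);
```

With S left unmodified, x should match s to within round-off at every sample where ok is true. With this particular window the very first sample is excluded, because the window is zero there and no other frame covers it. If S is modified so that its columns are no longer conjugate-symmetric, the real() call silently drops a non-trivial imaginary part, so it is worth checking that case separately.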

Nicholas Kinar