Calculation of the salience function (Klapuri 2006)

Question

I am reading this paper and wanted to check my understanding of the salience function: $$ \hat{s}(\tau)=\sum_{m=1}^M g(\tau, m) \max_{k\in \kappa_{\tau, m}}\lvert Y(k)\rvert $$

Where the set $\kappa_{\tau, m}$ defines a range of frequency bins in the vicinity of the $m^{\rm th}$ overtone partial of the F0 candidate.

We want to find the maximum of $\lvert Y(k)\rvert$ over those $k$. But $k$ is a frequency, and $Y$ is, according to Klapuri, the discrete STFT of the signal. I don't understand what the STFT at a given frequency is.

The STFT is a 2D array, so $Y(k)$ must be a vector, right?
How do we find the max of a list of vectors?
Is something in my understanding wrong?

So let's say we are testing the salience of the period 0.002. We start with $m=1$, and get $g(\tau, m)$. Then what?

score 2 · Accepted Answer · answered Jun 06 '17 at 01:12

You are correct that the STFT is a function of two variables (usually time on the x-axis and frequency on the y-axis). The author is a bit careless in calling $Y(k)$ the STFT. What he really means is that $Y(k)$ is a single column (i.e. a fixed time slice) of the STFT. The same author defines the salience function more carefully in the paper "Multipitch analysis of polyphonic music and speech signals using an auditory model" (Eq. 13), where he uses a subscript $t$ to denote a fixed time slice of the STFT.
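To make the "fixed column" idea concrete, here is a minimal NumPy sketch of evaluating the sum for one period candidate on one time slice. It is only an illustration under stated assumptions: the weight `1/m`, the fixed ±1-bin search range, and all function and parameter names are placeholders I made up, not Klapuri's actual $g(\tau, m)$ or $\kappa_{\tau, m}$, and the period is taken in seconds.

```python
import numpy as np

def stft_magnitude(x, frame_len, hop):
    """Return |STFT| with shape (num_bins, num_frames): rows are frequency bins, columns are time slices."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(num_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def salience(mag_frame, period_s, fs, frame_len, num_partials=10, half_width_bins=1, weight=None):
    """Weighted sum over partials of the largest magnitude found near each partial.

    mag_frame is ONE column |Y_t(k)| of the STFT, i.e. a 1-D array over frequency bins.
    """
    if weight is None:
        # Placeholder weight (assumption): NOT the g(tau, m) from the paper.
        weight = lambda period_s, m: 1.0 / m
    f0 = 1.0 / period_s
    total = 0.0
    for m in range(1, num_partials + 1):
        # Bin closest to the m-th partial; the +/- half_width_bins range stands in for kappa.
        center = int(round(m * f0 * frame_len / fs))
        lo = max(center - half_width_bins, 0)
        hi = min(center + half_width_bins, len(mag_frame) - 1)
        if lo > hi:  # partial lies above the highest available bin
            break
        total += weight(period_s, m) * mag_frame[lo : hi + 1].max()
    return total

# Synthetic test tone: five harmonics of 500 Hz, i.e. a period of 0.002 s.
fs, frame_len, hop = 16000, 1024, 512
n = np.arange(4096)
x = sum((1.0 / h) * np.sin(2 * np.pi * 500.0 * h * n / fs) for h in range(1, 6))

mag = stft_magnitude(x, frame_len, hop)  # 2-D: frequency x time
Y_t = mag[:, 3]                          # one time slice: a 1-D spectrum
s_true = salience(Y_t, 0.002, fs, frame_len)
s_other = salience(Y_t, 0.0031, fs, frame_len)
```

So for each $m$ the `max` is taken over a handful of scalar magnitudes in a 1-D spectrum, not over a list of vectors. With this toy signal, `s_true` should come out larger than `s_other`, since the partials of the 0.002 s candidate line up with the actual harmonics.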

Atul Ingle

Yes, that indeed makes much more sense!!! Thanks. – pavlos163 Jun 06 '17 at 02:01
Therefore, it is in reality $Y_{t}(k)$, which returns an amplitude, right? – pavlos163 Jun 06 '17 at 16:11
Yes. $Y_t(k)$ is the spectrum from a "time slice" at time $t$, which is a 1-D function of frequency and you can look for the maximum magnitude in this vector. – Atul Ingle Jun 06 '17 at 16:40
