When I compute a spectrogram of (say) a piece of music, there is a lot of frequency "smearing." Often we can reasonably expect that the "true" generating process is much sparser in frequency (e.g. maybe 10-20 active frequencies instead of hundreds).
We also know that even plain sine waves become smeared in spectrograms, since the finite, windowed frames cause spectral leakage.
So, can't we model the corrupting "frequency leakage" at a specific timestep as a linear system $$ Ax = b $$ where
- $b$ is the known vector of STFT-computed spectral densities at the various frequency bins
- $x$ is the unknown, sparse vector of generating frequencies
- $A$ is the (computable) "smearing matrix" which maps power from sparse generating frequencies to the "smeared" set of frequencies which the STFT produces (we can just compute this by taking the STFT of various sine wave snippets; see the sketch after this list)
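
Roughly, I imagine building $A$ column by column like this (a minimal sketch; the sample rate, frame length, window, and frequency grid are just placeholder assumptions):

```python
import numpy as np
from scipy.signal.windows import hann

fs = 8000            # sample rate (assumed)
n_fft = 512          # STFT frame length (assumed)
window = hann(n_fft, sym=False)

# Candidate "generating" frequencies: here one per STFT bin,
# but a finer or coarser grid would work the same way.
candidate_freqs = np.fft.rfftfreq(n_fft, d=1/fs)

t = np.arange(n_fft) / fs
# Each column of A is the magnitude spectrum of one windowed, unit-amplitude sinusoid,
# i.e. the "smear" pattern that a single pure frequency produces.
A = np.column_stack([
    np.abs(np.fft.rfft(window * np.cos(2 * np.pi * f * t)))
    for f in candidate_freqs
])
```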
And naturally one could extend this to further encourage sparsity, e.g. $$ \min_{x} \|Ax - b\|_2^2 + \lambda \|x\|_1 $$
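
The per-frame sparse recovery could then be handed to any off-the-shelf $\ell_1$ solver. A rough sketch using scikit-learn's Lasso, purely as an illustration (the function name and $\lambda$ value are made up):

```python
from sklearn.linear_model import Lasso

def sharpen_frame(A, b, lam=0.1):
    """Estimate a sparse, nonnegative frequency vector x such that A @ x ~= b."""
    model = Lasso(alpha=lam, positive=True, max_iter=10000)
    model.fit(A, b)
    return model.coef_

# b would be one column (time frame) of the magnitude spectrogram, e.g.:
# x_sparse = sharpen_frame(A, np.abs(stft_matrix[:, frame_idx]))
```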
I have not seen this approach discussed, but I am very new to DSP. Is this a known technique, or is there some reason why it would not work in "sharpening" a spectrogram?