
I've implemented a function that calculates the zero-crossing rate (ZCR) of a given signal, and I've used the same function to estimate the pitch. To differentiate voiced from unvoiced signals using the ZCR: a high ZCR means that the signal is unvoiced, and a low ZCR means that it is voiced. My question is whether there is a threshold above which a signal can be considered unvoiced.

1 Answer


A plain ZCR criterion is not enough for robust and accurate voiced/unvoiced separation. That said, your threshold should be adaptive: there is no fixed threshold that works well for all speech waveforms. The details depend on the approach you follow, but a statistical threshold (for example, one derived from the distribution of ZCR values in the recording itself) should do the job most of the time. You can search for relevant papers and see whether their approach suits your purpose.
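To make "adaptive" concrete, here is a minimal sketch in Python with NumPy. It is only one possible statistical rule, not a standard one: the function names, the 400-sample frame, the 160-sample hop, and the choice of the midpoint between the 10th and 90th percentiles are all assumptions made for illustration.

```python
import numpy as np

def frame_zcr(signal, frame_len=400, hop=160):
    """Per-frame ZCR: fraction of adjacent sample pairs whose sign differs.

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz).
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    rates = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        signs = np.signbit(frame)
        rates[i] = np.count_nonzero(signs[1:] != signs[:-1]) / (len(frame) - 1)
    return rates

def adaptive_unvoiced_mask(rates):
    """Hypothetical rule: the threshold is the midpoint between the 10th and
    90th percentiles of the ZCR values observed in this recording."""
    threshold = 0.5 * (np.percentile(rates, 10) + np.percentile(rates, 90))
    return rates > threshold, threshold
```

Because the threshold is computed from the recording itself, it moves with the speaker, the sampling rate, and the noise level. It still assumes that the recording contains both voiced and unvoiced frames, and it inherits the weakness mentioned above: ZCR alone will misclassify some frames.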

Just a note: ZCR is not the best choice for pitch estimation. It gives only a rough estimate.
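The reason it is rough can be seen from how such an estimate is usually formed. A pure sinusoid crosses zero twice per period, so the frequency is about half the number of crossings per second. A minimal sketch follows; the function name and signature are hypothetical, not taken from the question.

```python
import numpy as np

def rough_pitch_from_zcr(frame, fs):
    """Estimate frequency in Hz as (zero crossings per second) / 2.

    Exact only for a clean sinusoid; harmonics and noise add extra
    crossings and push the estimate upward.
    """
    frame = np.asarray(frame, dtype=float)
    signs = np.signbit(frame)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    duration = (len(frame) - 1) / fs  # seconds spanned by the sample pairs
    return crossings / (2.0 * duration)
```

On real voiced speech the waveform is not sinusoidal, so the count reflects the harmonic content as much as the fundamental.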

GKH
  • Thank you. In fact, I'm supposed to do a comparison between this method and other methods: autocorrelation, short-time energy and pitch detection algorithms (Yin and SWIPE). – Anass Naqqad Mar 04 '20 at 15:12
  • I see. That's fine then. – GKH Mar 04 '20 at 17:21