Finding the pitch value of a note

Question

I am trying to implement a karaoke scoring algorithm and, for a particular song, have the note information (done by a professional) that indicates what note occurs in which part of the lyric.

When I use a pitch detector and plot the pitch values for the portion of the lyric where a single note is held, the pitch varies a lot in the beginning (for the first few pitch values), hits a steady value (that roughly corresponds with the note that needs to be sung) and again shows a lot of variation before it transitions to the next note. I want to derive a single pitch value that is representative of the pitch value when the note is held steadily.

How do I go about it? Is this a problem of pitch smoothing (median)?

I am using a pitch algorithm based on the G.729 pitch detector with a pitch smoothing algorithm that seems to work really well (so far) on speech and on male singers singing in the presence of background music (very low volume).

http://dsp.stackexchange.com/questions/411/tips-for-improving-pitch-detection — endolith, Oct 18 '11 at 14:44
@endolith: I don't think the pitch estimates need improvement. I am looking for a way to estimate the "steady" pitch value from an utterance sung at a specific note. In other words, the pitch holds a certain value after bouncing around for the first few values during note onset, and it varies again during the note transition. The onset and transition durations are not constant, so I am looking for a way to find the "steady" portion. Sorry if I was not clear. — Sriram, Oct 19 '11 at 06:31
So the issue is how to distinguish the steady regions from the transition regions? Or is it also about how to assign a pitch to the steady regions once they're identified (e.g. by averaging)? — datageist, Oct 19 '11 at 15:01
Both. I need to figure out the region where the pitch steadies itself and then figure out a way to reduce the pitch values in that region to a single value. I thought of a windowed-mode operation to isolate the steady regions. It seems to work decently well given that the window size and shift are figured out. — Sriram, Oct 19 '11 at 15:19
You could take the median or mean/average frequency estimated over the length of the note, possibly discarding values in the first and last 10% of the length of the note. Taking the mean may also help factor out intentional vibrato. — Russell Borogove, Oct 24 '11 at 22:06
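
A minimal sketch of the trimmed estimate suggested in the last comment above. It assumes the pitch track for one note is already available as a list of per-frame frequencies in Hz, with unvoiced frames reported as 0. The function name `note_pitch` and its parameters are illustrative and not taken from any library.

```python
import statistics

def note_pitch(pitches_hz, trim=0.10, use_mean=False):
    """Reduce the pitch track of one note to a single value in Hz.

    pitches_hz: per-frame pitch estimates for the note; values <= 0 are
                treated as unvoiced and skipped (an assumption).
    trim:       fraction of frames discarded at each end of the note.
    use_mean:   average instead of taking the median.
    """
    n = len(pitches_hz)
    k = int(n * trim)
    # Drop the onset and transition frames; keep everything if the note is too short.
    core = pitches_hz[k:n - k] if n - 2 * k > 0 else pitches_hz
    voiced = [p for p in core if p > 0]
    if not voiced:
        return None
    return statistics.fmean(voiced) if use_mean else statistics.median(voiced)

track = [180, 250, 221, 220, 219, 220, 221, 220, 190, 300]
print(note_pitch(track))
```

The median is less sensitive to the few outliers that survive trimming, while the mean tends to average out a roughly symmetric vibrato.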

hotpaw2 · Answer 1 · 2011-10-19T14:11:25.417

3

You may need to ignore the transients between stable pitches (for example, most consonants when singing lyrics), as these may be un-pitched sounds or noises and can thus confuse your pitch estimation algorithm.

The singer(s) may also be producing some vibrato, which should correctly show as pitch variations towards the end of some held notes.
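
One possible way to skip the transients is sketched below. It is an illustration under assumptions, not a method from this answer: the pitch track is a list of per-frame values in Hz with unvoiced frames reported as 0, and the "steady" part of a note is taken to be the longest run of voiced frames whose pitch stays within `max_spread` semitones. The names `steady_region`, `max_spread` and `min_frames` are hypothetical, and the default spread of one semitone is a guess meant to tolerate moderate vibrato.

```python
import math
import statistics

def hz_to_semitones(f_hz):
    # MIDI-style note number, so that pitch distances are measured in semitones.
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)

def steady_region(pitches_hz, max_spread=1.0, min_frames=5):
    """Return (start, end, pitch_hz) for the longest run of voiced frames
    whose max-min pitch spread is at most max_spread semitones.
    end is exclusive; returns None if no run has min_frames frames."""
    semis = [hz_to_semitones(p) if p > 0 else None for p in pitches_hz]
    best = None
    start = 0
    for end, s in enumerate(semis):
        if s is None:            # an unvoiced frame breaks the run
            start = end + 1
            continue
        # Shrink the window from the left until the spread fits.
        while max(semis[start:end + 1]) - min(semis[start:end + 1]) > max_spread:
            start += 1
        if best is None or end + 1 - start > best[1] - best[0]:
            best = (start, end + 1)
    if best is None or best[1] - best[0] < min_frames:
        return None
    lo, hi = best
    return lo, hi, statistics.median(pitches_hz[lo:hi])

track = [0, 150, 260, 221, 220, 219, 220, 221, 220, 219, 180, 0]
print(steady_region(track))
```

Working in semitones rather than Hz keeps the tolerance musically uniform across low and high notes.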

edited Oct 19 '11 at 14:11

answered Oct 18 '11 at 22:01

hotpaw2


is there a method by which we can figure out where the vibrato occurs in a note? – Sriram Oct 19 '11 at 15:17
That sounds like a new and different question on human speech physiology and the physics of music. Perhaps a search of research papers on these topics is in order? – hotpaw2 Oct 19 '11 at 15:25

score 1 · Answer 2 · answered Oct 25 '11 at 07:12


In terms of pitch estimation, few people have published more extensively than Anssi Klapuri. I hope you find these papers useful: http://www.elec.qmul.ac.uk/people/anssik/publications.htm


Dan Barry

