Finding the closest match of a recorded word to a 'corpus' of singly-recorded words.

Question

I am interested in learning more about audio processing and thought it would be helpful to learn by doing a project. The project involves a kind of pseudo-speech recognition: I have a corpus of words, each recorded once (say apple.wav, bat.wav, cream.wav, dog.wav), generated by a text-to-speech program.

I also have an 'unknown' audio file: a recording of a single word spoken by a person, which is one of 'apple', 'bat', 'cream', or 'dog'.

Suppose the 'unknown' audio file is a recording of the word 'dog'. How do I go about matching that recording to the closest text-to-speech recording (i.e. dog.wav)?

(Context: I want to build a system in our chemistry lab to log our waste disposal through speech. For example, the user says "ethanol" to the computer, the computer records that audio and finds the closest recording in the library of chemical audio recordings (ethanol.wav). The file name is then recorded in a database as a log of the disposal of ethanol.)
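
For concreteness, here is a minimal sketch of what the matching step might look like if it were done as template matching with dynamic time warping (DTW). This is only an illustrative assumption, not a tested solution: the function names (frame_log_energies, dtw_distance, closest_word) are hypothetical, and the per-frame log-energy feature is deliberately crude (spectral features such as MFCCs would likely be needed to tell real words apart, especially between a synthetic voice and a human one).

```python
import math


def frame_log_energies(samples, frame_len=400):
    """Split samples into non-overlapping frames; return log-energy per frame.

    Assumption: `samples` is a list of numbers (e.g. decoded PCM values).
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        feats.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return feats


def dtw_distance(a, b):
    """Length-normalised DTW cost between two 1-D feature sequences."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)


def closest_word(unknown, corpus):
    """Return the corpus key whose feature sequence is nearest to `unknown`."""
    return min(corpus, key=lambda name: dtw_distance(unknown, corpus[name]))


# Toy feature sequences standing in for real recordings (made-up numbers).
corpus = {
    "dog.wav": [0.0, 1.0, 3.0, 1.0, 0.0],
    "bat.wav": [5.0, 5.0, 5.0, 5.0],
}
unknown = [0.0, 0.0, 1.0, 3.0, 3.0, 1.0, 0.0]  # a time-stretched "dog"
best = closest_word(unknown, corpus)
```

DTW is used here because the same word spoken at different speeds produces sequences of different lengths, and DTW allows frames to be stretched or repeated when aligning them. In a real version, the samples could be read from the .wav files with the standard-library wave module, and the returned file name is what would be written to the log.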

This might be a duplicate, but I have been searching and can't find anything.

