
What I am trying to do is have PocketSphinx take my mic input, decode it to text, compare the result to a set of predefined words, and then execute a command or script. I have PocketSphinx installed and running on my Raspberry Pi 3, and it works from the terminal, but I am looking for a code template that uses the API with the mic as input. I was able to find C++ code for this, but not C. I have been digging online with no luck. I hope someone can help me put something together with comments; I hope that is not too much to ask. Here is the C++ code: http://www.robotrebels.org/index.php?topic=239.msg1234#msg1234

Thank you in advance.

Vlad

1 Answer


Here is code that gets PocketSphinx to listen to the mic in plain C. Compile it against pocketsphinx and sphinxbase, then run it from a directory that contains custom.lm and custom.dic.

#include <stdio.h>
#include <string.h>
#include <pocketsphinx.h>
#include <sphinxbase/ad.h>
#include <sphinxbase/err.h>

const char * recognize_from_microphone(void);

ps_decoder_t *ps;                  // create pocketsphinx decoder structure
cmd_ln_t *config;                  // create configuration structure
ad_rec_t *ad;                      // create audio recording structure - for use with ALSA functions

int16 adbuf[4096];                 // buffer array to hold audio data
uint8 utt_started, in_speech;      // flags for tracking active speech - has speech started? - is speech currently happening?
int32 k;                           // holds the number of samples read into the audio buffer
char const *hyp;                   // pointer to "hypothesis" (best guess at the decoded result)
char const *decoded_speech;


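int main(int argc, char *argv[]) {

  config = cmd_ln_init(NULL, ps_args(), TRUE,                     // load the configuration structure - ps_args() supplies the default values
    "-hmm", "/usr/local/share/pocketsphinx/model/en-us/en-us",    // path to the standard English acoustic model
    "-lm", "custom.lm",                                           // custom language model (file must be present)
    "-dict", "custom.dic",                                        // custom dictionary (file must be present)
    "-logfn", "/dev/null",                                        // suppress log info from being sent to screen
    NULL);

  ps = ps_init(config);                                           // initialize the pocketsphinx decoder
  if (ps == NULL) {                                               // ps_init fails if a model, .lm or .dic file cannot be loaded
    fprintf(stderr, "Failed to initialize the decoder\n");
    return 1;
  }

  ad = ad_open_dev("sysdefault", (int) cmd_ln_float32_r(config, "-samprate")); // open default microphone at default samplerate
  if (ad == NULL) {                                               // ad_open_dev fails if the audio device cannot be opened
    fprintf(stderr, "Failed to open the audio device\n");
    return 1;
  }

  while (1) {
    decoded_speech = recognize_from_microphone();                 // call the function to capture and decode speech
    printf("You Said: %s\n",                                      // send decoded speech to screen
           decoded_speech ? decoded_speech : "(nothing recognized)"); // the hypothesis can be NULL, which must not be passed to %s
  }

  ad_close(ad);                                                   // close the microphone (not reached while the loop runs forever)
  return 0;
}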

const char * recognize_from_microphone(void) {

  ad_start_rec(ad);                                // start recording
  ps_start_utt(ps);                                // mark the start of the utterance
  utt_started = FALSE;                             // clear the utt_started flag

  while (1) {
    k = ad_read(ad, adbuf, 4096);                  // read up to 4096 samples; k is the number read (negative on error)
    if (k < 0) {                                   // if the microphone read failed
      ps_end_utt(ps);                              // then mark the end of the utterance
      ad_stop_rec(ad);                             // stop recording
      return NULL;                                 // and report that nothing was decoded
    }
    ps_process_raw(ps, adbuf, k, FALSE, FALSE);    // send the audio buffer to the pocketsphinx decoder

    in_speech = ps_get_in_speech(ps);              // test to see if speech is being detected

    if (in_speech && !utt_started) {               // if speech has started and utt_started flag is false
      utt_started = TRUE;                          // then set the flag
    }

    if (!in_speech && utt_started) {               // if speech has ended and the utt_started flag is true
      ps_end_utt(ps);                              // then mark the end of the utterance
      ad_stop_rec(ad);                             // stop recording
      hyp = ps_get_hyp(ps, NULL);                  // query pocketsphinx for "hypothesis" of decoded statement (may be NULL)
      return hyp;                                  // the function returns the hypothesis and control goes back to main
    }
  }
}
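
The question also asks about comparing the result to predefined words and then running a command. The listing above stops at printing the hypothesis, so here is a minimal sketch of that last step. It is only an illustration: the struct, the function names (lookup_command, dispatch_command), the phrases and the echo commands are all made up for this example and are not part of the PocketSphinx API. It also assumes the decoded text matches the phrase exactly, including case, as spelled in custom.dic.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical command table: the phrases and shell commands are placeholders. */
struct voice_command {
    const char *phrase;   /* text expected from the decoder, spelled as in the dictionary */
    const char *action;   /* shell command to run when the phrase is heard */
};

static const struct voice_command commands[] = {
    { "lights on",  "echo turning lights on"  },
    { "lights off", "echo turning lights off" },
};

/* Return the action for a decoded phrase, or NULL if nothing matches. */
const char *lookup_command(const char *hyp) {
    size_t i;

    if (hyp == NULL)                              /* the decoder may return no hypothesis */
        return NULL;

    for (i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(hyp, commands[i].phrase) == 0) /* exact, case-sensitive match */
            return commands[i].action;
    }
    return NULL;
}

/* Run the matching command; return 1 if one was found, 0 otherwise. */
int dispatch_command(const char *hyp) {
    const char *action = lookup_command(hyp);

    if (action == NULL)
        return 0;

    system(action);                               /* hand the command to the shell */
    return 1;
}
```

To use it, call dispatch_command(decoded_speech) in the loop in main, right after the printf. A lookup table keeps the list of phrases in one place, so adding a command is a one-line change.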
Vlad
  • Great work, but I'm not clear on where to place the C file or how to get a successful compilation. I installed at ~/ and then attempted to compile with g++ -O3 -o ps_boilerplate ps_boilerplate.c 'pkg-config --cflags --libs pocketsphinx sphinxbase', but I get these errors: "pkg-config: command not found" and "ps_boilerplate.c:3:26: fatal error: pocketsphinx.h: No such file or directory compilation terminated." – zipzit Mar 28 '17 at 20:02
  • After a bit of searching on compilation stuff... this should help address the issues posted in my previous comment. – zipzit Mar 29 '17 at 14:55
  • You first have to install and build it; look at this guide: https://wolfpaulus.com/embedded/raspberrypi2-sr/. Once it is installed, you can place the C program anywhere you want; you just have to run it from the terminal. Just make sure you have the language model and dictionary files in the same location. – Vlad Mar 29 '17 at 20:33
  • Odd. I'm using an OrangePi Zero with an Armbian OS. I had to manually add both libtool and pkg-config to the dependencies to get up to speed. I also had trouble copying and pasting commands; I presume the wrong character type was pasted. When typed by hand, things work okay. – zipzit Mar 31 '17 at 07:55
  • Vlad, you are my hero. All up and running. The real surprise is how much faster this minimized program is than the sample pocketsphinx_continuous program, even though I'm using the same custom.dic and .lm files. Speech to Text (STT) takes 6 to 8 seconds on the pocketsphinx_continuous program, but only a fraction of a second on your minimalist program. I used the wolfpaulus link, the sourceforge tutorialpocketsphinx link and your posting here to figure this all out. Note: look at that second link to ensure all dependencies are loaded, and also pay attention to -- flags on pkg-config use. – zipzit Mar 31 '17 at 08:15
  • I agree: the full version is very slow, and this version is very quick. I do not understand what you mean by the last sentence of your comment. – Vlad Mar 31 '17 at 18:16