Questions tagged [mfcc]

Related to the calculation, verification, usage, and requirements for Mel Frequency Cepstral Coefficients.

MFCCs are commonly derived as follows:

  1. Take the Fourier transform of (a windowed excerpt of) a signal.
  2. Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
  3. Take the logs of the powers at each of the mel frequencies.
  4. Take the discrete cosine transform of the list of mel log powers, as if it were a signal. The MFCCs are the amplitudes of the resulting spectrum.
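
The four steps above can be sketched in plain Python. This is a minimal illustration for a single frame, not a reference implementation: the Hamming window, the 2595/700 mel formula, the filter count, the unnormalised DCT-II, and the small constant added before the log are all assumptions chosen for the example, and every function name here is hypothetical.

```python
import cmath
import math


def hz_to_mel(f):
    # One common mel formula (assumed here); other variants exist.
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Step 1: Hamming-window the frame, return |DFT|^2 for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Step 2 weights: triangular overlapping filters, evenly spaced in mel."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies (Hz), evenly spaced on the mel scale
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, centre, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft
            if lo < f <= centre:
                row.append((f - lo) / (centre - lo))   # rising slope
            elif centre < f < hi:
                row.append((hi - f) / (hi - centre))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct2(values, n_out):
    """Step 4: unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    spec = power_spectrum(frame)                               # step 1
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    mel_energies = [sum(w * p for w, p in zip(row, spec))      # step 2
                    for row in bank]
    log_energies = [math.log(e + 1e-10) for e in mel_energies]  # step 3
    return dct2(log_energies, n_coeffs)                        # step 4


# Example: one 256-sample frame of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

The naive DFT keeps the sketch dependency-free; for real audio an FFT routine would normally replace it, and the frame would be one of many overlapping frames cut from the signal.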

A useful tutorial on MFCCs: the MFCC tutorial on Practical Cryptography.

179 questions
20
votes
1 answer

Cepstral Mean Normalization

Can anyone please explain Cepstral Mean Normalization and how the equivalence property of convolution affects it? Is it mandatory to do CMN in MFCC-based speaker recognition? Why is the convolution property a fundamental need for MFCC? I am very…
mun
  • 215
  • 1
  • 2
  • 7
6
votes
3 answers

Sinusoidal liftering in implementations of MFCC

Some implementations of MFCC apply sinusoidal liftering as the final step in calculations of MFCC. It is claimed that speech recognition can be significantly improved. For instance, if $\text{MFCC}_i$ is a cepstral coefficient, and $w$ is a lifter,…
Celdor
  • 452
  • 1
  • 7
  • 17
5
votes
1 answer

What is the purpose of the log when computing the MFCC?

The steps for computing the Mel-Frequency Cepstrum Coefficients (MFCC) are: Frame blocking -> Windowing -> abs(DFT) -> Mel filter bank -> Sum coefficients for each filter -> Logarithm -> DCT. But what is the purpose of the logarithm step?
Morten
  • 333
  • 4
  • 13
5
votes
2 answers

What is the mel scale?

I am not sure I understand what the mel scale is. Googling doesn't give me a variety of answers; I seem to get the same response again and again, something like: "The mel scale reflects how people hear musical tone". First of all,…
Bob Burt
  • 359
  • 2
  • 6
  • 18
5
votes
1 answer

Mel filter in MFCC - is it necessary?

Is it necessary to use a filter bank in the MFCC process? Can anyone explain what the Mel filter is? I know that frequency in hertz is converted to the mel scale, but can this formula be applied directly after the Fourier transformation of the speech…
purple
  • 69
  • 1
  • 2
4
votes
2 answers

MFCC - Significance of number of features

I have been doing some reading on the computation of Mel-Frequency Cepstral Coefficients (MFCC) and the further use of Vector Quantizers (VQ) for recognition purposes. I am, however, stumped by the method of computation for those MFCCs regarding the…
Thai Monk
  • 43
  • 1
  • 1
  • 5
3
votes
1 answer

Normalizing a vector with Mel Cepstrum Coefficients, Delta Coefficients and Log Energies for both

This is my very first signal processing exercise, so I'm very new to this space. That being said, the goal is to do isolated voice recognition, identifying someone saying yes or no. As the title suggests, I have a 1x26 vector that looks as follows: …
user481610
  • 143
  • 1
  • 4
3
votes
0 answers

Finding similar fragment in recorded speech

I use the MFCC algorithm to calculate Mel-frequency coefficients for 2 files, and the DTW algorithm with Euclidean distance to find their similarity. I get result values around 400-600 on files with the same word (cut from recorded speech of the…
Vbif
  • 131
  • 3
2
votes
0 answers

Text-dependent speaker verification using MFCC and vector quantization

I am trying to implement a text-dependent speaker verification system. Here each speaker will speak their unique identification number and will then be granted access. I am trying to implement this using MFCC features and Vector Quantization. Till…
Wanderer
  • 41
  • 3
2
votes
0 answers

How to calculate MFCCs over a narrow frequency range?

I have a bunch of brief (~1 second each) wave files, some of which contain a frequency-modulated sound of interest between about 3900 and 4500 Hz. I thought I might run MFCC calculations on these sound files and use the results to pursue…
2
votes
1 answer

Regarding MFCC feature of a speech signal

I am trying to implement speech recognition but I'm stuck on a few questions: There are 12 MFCC coefficients. What are their names? What is the range of values of the MFCC coefficients? If I want to relate MFCC coefficients graphically,…
user117268
  • 21
  • 1
1
vote
0 answers

Chunk a voice and do not save it in each step of processing

When I load a voice recording and chunk it with the librosa library, the data changes to a "Typeless" type, and I cannot process it any further. I need a way around this. I also don't want to save each step of processing to have less…
1
vote
0 answers

Dealing with MFCC Feature Vectors of different sizes

I am working on a project where I am classifying coughs of a patient as either positive or negative for a certain pulmonary illness. What I have at the moment is multiple cough events, segmented from larger recordings. I have extracted various…
Renier Botha
  • 111
  • 2
1
vote
2 answers

Comparing MFCC features: what do they represent?

I know that MFCC features are the spectral envelope of the input signal, but I can't understand what they mean and what they represent. And if I have two people saying the same word, how can I compare the resulting features? For example I…
Lama Sonmez
  • 23
  • 2
  • 6
1
vote
2 answers

Framing an audio signal

The first step in MFCC is to split the signal into frames. Referring to the discussion on MFCC at this link (Help calculating / understanding the MFCCs: Mel-Frequency Cepstrum Coefficients), @pichenettles said that the frames are usually …
Lama Sonmez
  • 23
  • 2
  • 6