Questions tagged [speech-processing]

Speech processing is the study of speech signals and of the methods used to process them. The signals are usually processed in a digital representation, so speech processing can be regarded as a special case of digital signal processing applied to speech signals.

A speech signal is one form of audio signal; others include tones, music and noise. A newly developed signal processing algorithm is usually tested with music, speech, and a mixture of both.

285 questions
3
votes
2 answers

Where can I get a database of speech signals?

For a classification experiment using Matlab, I need a database of male and female voices. Where can I find links to such open databases?
dexterdev
  • 349
  • 6
  • 17
2
votes
1 answer

What is the difference between a neural network and a deep neural network? An example, please

What is the difference between a neural network and a deep neural network in speaker recognition? Can I have a code example in Matlab? Thank you.
Makrem
  • 23
  • 2
1
vote
1 answer

Povey window formula

In Kaldi, "povey" is a window made to be similar to Hamming but to go to zero at the edges: $$g(n) = \left(\frac{1-\cos\left(\frac{2\pi n}{N}\right)}{2}\right)^{0.85}$$ Please, can anyone mathematically explain why we are considering raising…
Rhythm
  • 11
  • 3
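
For readers comparing the two windows in the question above, here is a minimal sketch in plain Python. It assumes the cosine argument is 2πn/(N−1), so that both end samples land on zero; the function names and the default exponent parameter are illustrative and not taken from Kaldi's source.

```python
import math


def povey_window(num_samples, power=0.85):
    """Hann-shaped window raised to `power` (assumed form, see lead-in)."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    # Assumption: N - 1 in the denominator, so n = 0 and n = N - 1 give zero.
    step = 2.0 * math.pi / (num_samples - 1)
    return [(0.5 - 0.5 * math.cos(step * n)) ** power
            for n in range(num_samples)]


def hamming_window(num_samples):
    """Standard Hamming window, which stays at 0.08 at the edges."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    step = 2.0 * math.pi / (num_samples - 1)
    return [0.54 - 0.46 * math.cos(step * n) for n in range(num_samples)]
```

Printing `povey_window(25)[0]` next to `hamming_window(25)[0]` shows the difference at the edges: the first is zero, the second is not.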
1
vote
0 answers

How to deal with variations in amplitudes when doing voiced/silence detection?

I have a collection of wave files, each containing a few well-separated spoken words. My goal is to split them into individual files. I want the program to work on all files, not just one of them, and without tuning for each one. So I started to look at…
Andrew Au
  • 141
  • 3
1
vote
0 answers

Finding the closest match of a recorded word to a 'corpus' of singly-recorded words.

I am interested in learning more about processing audio and thought that it would be helpful to learn while doing a project. The project involves a pseudo-speech recognition where I have a corpus of singly-recorded audio of words, say apple.wav,…
1
vote
1 answer

Single channel speech enhancement

What is the difference between magnitude spectral subtraction and power spectral subtraction in terms of performance?
Kishor
  • 19
  • 1
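
As a reference point for the question above, the two variants differ only in the domain where the noise estimate is subtracted. The sketch below applies both to one frame of spectral magnitudes, with negative results floored at zero (half-wave rectification). The function names are hypothetical, and the sketch shows only the definitions; it says nothing about which variant performs better.

```python
import math


def magnitude_subtract(noisy_mag, noise_mag):
    """Subtract the noise estimate in the magnitude domain, floored at 0."""
    return [max(y - d, 0.0) for y, d in zip(noisy_mag, noise_mag)]


def power_subtract(noisy_mag, noise_mag):
    """Subtract in the power domain, then return to magnitude."""
    return [math.sqrt(max(y * y - d * d, 0.0))
            for y, d in zip(noisy_mag, noise_mag)]
```

In both cases the enhanced magnitude would normally be recombined with the noisy phase before the inverse transform.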
0
votes
2 answers

How to remove salt & pepper noise from speech signal?

I'm having a lot of problems with my project. I added salt-and-pepper noise using a built-in function, then applied the DWT function with the Haar family. Now I want to remove the noise from the speech signal, but I don't know how to do this.
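
One common way to suppress impulsive (salt-and-pepper) noise is a median filter. This is a minimal sketch in plain Python rather than the DWT approach the question mentions; the function name, the default window length, and the edge handling (repeating the first and last samples) are assumptions made for illustration.

```python
from statistics import median


def median_filter(samples, kernel_size=3):
    """Replace each sample by the median of its neighbourhood."""
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    if not samples:
        return []
    half = kernel_size // 2
    # Assumption: pad by repeating the edge samples so every window is full.
    padded = [samples[0]] * half + list(samples) + [samples[-1]] * half
    return [median(padded[i:i + kernel_size]) for i in range(len(samples))]
```

A short window removes isolated spikes while leaving slowly varying parts of the signal mostly intact; longer windows smooth more aggressively.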
0
votes
3 answers

How to determine whether a speech segment is voiced/unvoiced?

I want to determine whether a speech frame is voiced or unvoiced. Of the many methods I found while searching, one says to compute the energy of the frame and, if it is above a certain threshold, mark the frame as voiced. Now, my question is how should I determine…
Anand Mohan
  • 117
  • 9
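
The energy-threshold method described in the question above can be sketched as follows. The threshold here is set relative to the loudest frame, which is only one possible choice and is an assumption of this sketch, as are the function names and the use of non-overlapping frames. Energy alone mostly separates speech from silence, so it is often combined with other features such as the zero-crossing rate.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    last_start = len(samples) - frame_len
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, last_start + 1, frame_len)]


def label_frames(samples, frame_len, rel_threshold=0.1):
    """Mark a frame True when its energy exceeds a fraction of the peak."""
    energies = frame_energies(samples, frame_len)
    peak = max(energies, default=0.0)
    # Assumption: threshold is a fixed fraction of the loudest frame's energy.
    return [e > rel_threshold * peak for e in energies]
```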
0
votes
0 answers

Can you suggest courses for getting started with DSP?

I'm a CS student who is completely new to DSP. Can you recommend the math, physics and other prerequisite courses I need to understand DSP better? Can you also recommend some courses or books for studying DSP itself? Thank you.
0
votes
2 answers

What is the unit of x-axis of a sound signal?

I'm confused about the unit of the x-axis. The signal below has a duration of 3 seconds and a sampling frequency of 44100 Hz, so should I write Time (s) as the unit of the x-axis? However, the graph shows the number of samples, i.e. 44100 * 3 = 132300.
Uygar Uçar
  • 65
  • 3
  • 10
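
The relation between the two axes in the question above is t = n / fs: dividing each sample index by the sampling frequency gives the time in seconds. A minimal sketch, with a hypothetical helper name:

```python
def sample_times(num_samples, sample_rate):
    """Time in seconds of each sample index, t = n / sample_rate."""
    return [n / sample_rate for n in range(num_samples)]


# 3 seconds at 44100 Hz gives 132300 samples, the count shown on the plot.
times = sample_times(3 * 44100, 44100)
```

Plotting the signal against `times` instead of the raw index lets the axis be labelled Time (s).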
0
votes
1 answer

Decoupling consonant phoneme

This question follows from "Designing a sound that localises well". I am developing an experimental system for pitch training. I am giving each of the 12 pitch-classes {C CD D DE E F FG G GA A AB B } an associated consonant phoneme: {d b r ng m v sh z…
P i
  • 1,329
  • 11
  • 24