TY - GEN
T1 - Improved acoustics modeling for speech recognition using transformation techniques
AU - Fung, Carrson
AU - Au, Oscar C.
AU - Wan, Wanggen
AU - Yim, Chi H.
AU - Keung, Cyan L.
PY - 2000/10
Y1 - 2000/10
N2 - In statistical speech recognition, misclassification often occurs when there is a mismatch between the incoming signal and the acoustic model inside the recognizer. To combat this problem, techniques such as Cepstral Mean Subtraction, Vocal Tract Normalization, adaptation, and pronunciation modeling can be used. In this paper, we propose a new approach based on a transformation technique in which the output distribution function of the HMM, a Gaussian probability density function, is transformed to match the estimated distribution of the incoming signal by using a memoryless invertible nonlinear function. Since the new density still has a Gaussian form, the function can be completely characterized by using the Expectation Maximization (EM) algorithm.
AB - In statistical speech recognition, misclassification often occurs when there is a mismatch between the incoming signal and the acoustic model inside the recognizer. To combat this problem, techniques such as Cepstral Mean Subtraction, Vocal Tract Normalization, adaptation, and pronunciation modeling can be used. In this paper, we propose a new approach based on a transformation technique in which the output distribution function of the HMM, a Gaussian probability density function, is transformed to match the estimated distribution of the incoming signal by using a memoryless invertible nonlinear function. Since the new density still has a Gaussian form, the function can be completely characterized by using the Expectation Maximization (EM) algorithm.
UR - http://www.scopus.com/inward/record.url?scp=85009063785&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85009063785
T3 - 6th International Conference on Spoken Language Processing, ICSLP 2000
BT - 6th International Conference on Spoken Language Processing, ICSLP 2000
PB - International Speech Communication Association
T2 - 6th International Conference on Spoken Language Processing, ICSLP 2000
Y2 - 16 October 2000 through 20 October 2000
ER -