Within-class feature normalization for robust speech recognition

Yuan Fu Liao*, Chi Hui Hsu, Chi Min Yang, Jeng Shien Lin, Sen Chia Chang

*Corresponding author for this work

Research output: Conference article › peer-review

Abstract

In this paper, a within-class feature normalization (WCFN) framework operating in a transformed segment-level (instead of frame-level) super-vector space is proposed for robust speech recognition. In this framework, each segment hypothesis in a lattice is represented by a high-dimensional super-vector and projected onto a class-dependent lower-dimensional eigen-subspace to remove unwanted variability due to environmental noise and speaker differences (different SNR values, genders, noise types, and so on). The normalized super-vectors are then verified by a bank of class detectors to further rescore the lattice. Experimental results on the Aurora 2 multi-condition training task showed that the proposed WCFN approach achieved an average word error rate (WER) of 7.45%. WCFN outperformed not only the multi-condition training baseline (Multi-Con., 13.72%) but also the blind ETSI advanced DSR front-end (ETSI-Adv., 8.65%), the histogram equalization (HEQ, 8.66%) and the non-blind reference model weighting (RMW, 7.29%) approaches.
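As a rough illustration of the normalization step described above, the sketch below shows one plausible reading of "projecting a segment-level super-vector onto a class-dependent eigen-subspace to remove unwanted variability": for each class, the dominant within-class scatter directions are estimated from training super-vectors and then projected out of a test segment's super-vector. The function names, the `n_nuisance` parameter, and the choice of removing the top within-class eigen-directions are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def fit_class_subspace(supervectors, n_nuisance=10):
    """Estimate a class-dependent nuisance subspace from training super-vectors.

    supervectors: (N, D) array of segment-level super-vectors for one class.
    Returns the class mean and the top eigenvectors of the within-class
    covariance, treated here as the unwanted-variability directions.
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    cov = centered.T @ centered / max(len(supervectors) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    nuisance = eigvecs[:, -n_nuisance:]         # largest within-class directions
    return mean, nuisance

def normalize_supervector(x, mean, nuisance):
    """Remove the nuisance-subspace component from one segment super-vector."""
    centered = x - mean
    return centered - nuisance @ (nuisance.T @ centered)

# Toy usage: 200 training super-vectors of dimension 512 for one class.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 512))
mean, nuisance = fit_class_subspace(train, n_nuisance=10)
segment = rng.normal(size=512)
normalized = normalize_supervector(segment, mean, nuisance)
```

The normalized super-vectors would then be scored by per-class detectors to rescore the lattice, a stage not shown here.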

Original language: English
Pages (from - to): 1020-1023
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 22 Sep 2008 - 26 Sep 2008
