Abstract
A new one-stage maximum confidence measure (MCM) based interaural phase difference estimation framework for noise masking is proposed to closely integrate the underlying speech models into dual-microphone array noise filtering for robust speech recognition. The main ideas are: (1) using both the speech and filler models of the recognizer to feed back confidence measures (CMs) that indicate the degree of separation between the filtered speech and the interfering noise, and (2) automatically optimizing the parameters of the microphone array with an expectation-maximization (EM) algorithm based on the proposed MCM criterion. Experimental results on a Mandarin voice command task show that the proposed approach significantly improves the final speech recognition rates. Moreover, the observed performance degradation is usually graceful under low signal-to-noise ratio (SNR) and close interfering-noise conditions.
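The abstract does not spell out the masking rule, so the following is only a minimal, generic sketch of interaural phase difference (IPD) masking for one frame of a two-microphone recording, not the authors' method. The function names `ipd_mask` and `apply_mask`, the fixed `threshold` parameter, and the toy spectra are all assumptions made for illustration. In the proposed framework the array parameters are optimized by EM under the MCM criterion, a step this sketch does not attempt to reproduce.

```python
import cmath


def ipd_mask(left, right, threshold):
    """Return a binary mask per frequency bin based on the IPD.

    left, right: complex spectra of one frame from the two microphones.
    threshold: largest |IPD| in radians for which a bin is kept. This is a
    hypothetical, hand-set parameter; the paper instead tunes its array
    parameters with EM under the MCM criterion.
    """
    mask = []
    for l_bin, r_bin in zip(left, right):
        # Phase of the cross-spectrum is the phase difference between channels.
        ipd = cmath.phase(l_bin * r_bin.conjugate())
        mask.append(1.0 if abs(ipd) <= threshold else 0.0)
    return mask


def apply_mask(spectrum, mask):
    """Zero out the bins that the mask attributes to interference."""
    return [m * s for m, s in zip(mask, spectrum)]


# Toy frame with three bins (made-up values): bins 0 and 2 have a small
# phase difference, bin 1 has a large one and is treated as interference.
left = [cmath.rect(1.0, 0.3), cmath.rect(1.0, 1.0), cmath.rect(0.5, -2.0)]
right = [cmath.rect(1.0, 0.25), cmath.rect(1.0, -0.5), cmath.rect(0.5, -2.1)]

mask = ipd_mask(left, right, threshold=0.2)
filtered = apply_mask(left, mask)
```

A hard threshold assumes the target speaker sits roughly equidistant from both microphones, so that target-dominated bins show an IPD near zero. A recognizer-driven scheme such as the one described would presumably replace the hand-set threshold with a value chosen to maximize the confidence measure.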
Original language | English |
---|---|
Pages (from-to) | 473-476 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2011 |
Event | 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy. Duration: 27 Aug 2011 → 31 Aug 2011 |