Binary mask estimation based on frequency modulations

Chung Chien Hsu*, Jen-Tzung Chien, Tai-Shih Chi

*Corresponding author for this work

Research output: Conference article, peer-reviewed

Abstract

In this paper, a binary mask estimation algorithm is proposed based on modulations of speech. A multi-resolution spectrotemporal analytical auditory model is utilized to extract modulation features to estimate the binary mask, which is often used in speech segregation applications. The proposed method estimates noise from the beginning of each test sentence, a common approach seen in many conventional speech enhancement algorithms, to further enhance the modulation features. Experimental results demonstrate that the proposed method outperforms the AMS-GMM system in terms of the HIT-FA rate when estimating the binary mask.
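For readers unfamiliar with the HIT-FA rate mentioned in the abstract, the following is a minimal sketch of how the metric is commonly computed: HIT is the fraction of target-dominant time-frequency units (1s in the ideal binary mask) that the estimated mask also labels 1, FA is the fraction of noise-dominant units (0s in the ideal mask) wrongly labeled 1, and the score is their difference. The function name `hit_fa` and the toy masks are illustrative assumptions, not taken from the paper, and the paper's exact evaluation setup may differ.

```python
def hit_fa(ideal_mask, estimated_mask):
    """Return (hit, fa, hit_minus_fa) as fractions in [0, 1].

    Both masks are 2-D lists (frequency channels x time frames) of 0/1.
    This is an illustrative helper, not the authors' evaluation code.
    """
    hits = target_units = false_alarms = noise_units = 0
    for ideal_row, est_row in zip(ideal_mask, estimated_mask):
        for ideal, est in zip(ideal_row, est_row):
            if ideal == 1:
                # Target-dominant unit: counts toward HIT when estimated as 1.
                target_units += 1
                hits += int(est == 1)
            else:
                # Noise-dominant unit: counts toward FA when estimated as 1.
                noise_units += 1
                false_alarms += int(est == 1)
    hit = hits / target_units if target_units else 0.0
    fa = false_alarms / noise_units if noise_units else 0.0
    return hit, fa, hit - fa


# Toy 2-channel x 4-frame masks (made-up values for illustration only).
ideal = [[1, 1, 0, 0],
         [1, 0, 0, 1]]
estimated = [[1, 0, 0, 1],
             [1, 0, 0, 1]]

hit, fa, score = hit_fa(ideal, estimated)
print(f"HIT={hit:.2f} FA={fa:.2f} HIT-FA={score:.2f}")
```

Subtracting FA from HIT penalizes an estimator that simply labels every unit as target, which would otherwise achieve a perfect HIT rate.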

Original language: English
Pages (from-to): 993-997
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 1 January 2014
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: 14 September 2014 to 18 September 2014
