A Dual-Channel Three-Stage Model for DoA and Speech Enhancement

Meng Hsuan Wu*, Yih Liang Shen*, Hsuan Cheng Chou*, Bo Wun Shih, Tai Shih Chi*

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

Abstract

During the pandemic, teleconferencing became a necessity in daily life, driving demand for an integrated system that can both effectively enhance speech and localize the speaker for video enhancement. In this paper, we propose a neural-network-based composite system that integrates a direction-of-arrival (DoA) estimator and a neural beamformer for dual-channel speech enhancement. The proposed system accomplishes both tasks simultaneously using the signals received by two microphones. The estimated DoA is converted into a spatial angle-related feature, which provides complementary information that boosts the performance of the neural beamformer. The proposed system is evaluated in simulated far-field conditions with reverberation and noise. Simulation results demonstrate that the proposed system outperforms stand-alone baseline systems in each of the two tasks and achieves results comparable to those of the best stand-alone models in each task.
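The abstract does not say how the estimated DoA is turned into a spatial angle-related feature. Purely as an illustration, the sketch below uses one common choice from the multi-channel enhancement literature: the cosine of the difference between the observed inter-channel phase difference and the phase difference expected for a far-field source at the estimated angle. The function names, the 5 cm microphone spacing, the speed of sound, and the toy plane-wave signal are all assumptions made for this example and are not taken from the paper.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def target_phase_difference(doa_deg, freq_hz, mic_distance_m):
    """Expected inter-channel phase difference (radians) for a far-field
    source at doa_deg, measured from the axis through the two microphones."""
    delay_s = mic_distance_m * math.cos(math.radians(doa_deg)) / SPEED_OF_SOUND
    return 2.0 * math.pi * freq_hz * delay_s


def angle_feature(spec_mic1, spec_mic2, freqs_hz, doa_deg, mic_distance_m):
    """Per-frequency cosine of (observed - expected) phase difference.
    Values near 1 mean the bin is consistent with a source at doa_deg."""
    feature = []
    for y1, y2, f in zip(spec_mic1, spec_mic2, freqs_hz):
        observed = cmath.phase(y1 * y2.conjugate())
        expected = target_phase_difference(doa_deg, f, mic_distance_m)
        feature.append(math.cos(observed - expected))
    return feature


def simulate_plane_wave(freqs_hz, doa_deg, mic_distance_m):
    """Toy single-frame spectra (hypothetical test signal):
    microphone 2 receives a delayed copy of microphone 1."""
    delay_s = mic_distance_m * math.cos(math.radians(doa_deg)) / SPEED_OF_SOUND
    mic1 = [complex(1.0, 0.0) for _ in freqs_hz]
    mic2 = [cmath.exp(-2j * math.pi * f * delay_s) for f in freqs_hz]
    return mic1, mic2


freqs = [250.0 * k for k in range(1, 17)]  # 250 Hz to 4 kHz
mic1, mic2 = simulate_plane_wave(freqs, doa_deg=60.0, mic_distance_m=0.05)
matched = angle_feature(mic1, mic2, freqs, doa_deg=60.0, mic_distance_m=0.05)
mismatched = angle_feature(mic1, mic2, freqs, doa_deg=120.0, mic_distance_m=0.05)
print(sum(matched) / len(matched))
print(sum(mismatched) / len(mismatched))
```

In this toy setting the feature averages higher when the assumed angle matches the simulated source than when it does not, which is the kind of directional cue a beamforming network could use alongside the raw spectra.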

Original language: English
Title of host publication: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1064-1068
Number of pages: 5
ISBN (Electronic): 9798350300673
Publication status: Published - 2023
Event: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023 - Taipei, Taiwan
Duration: 31 Oct 2023 to 3 Nov 2023

Publication series

Name: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023

Conference

Conference: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Country/Territory: Taiwan
City: Taipei
Period: 31/10/23 to 3/11/23
