A Dual-Channel Three-Stage Model for DoA and Speech Enhancement

Meng Hsuan Wu*, Yih Liang Shen*, Hsuan Cheng Chou*, Bo Wun Shih, Tai Shih Chi*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

During the pandemic, teleconferencing became a necessity in our daily lives. This drives the demand for an integrated system that can not only enhance speech effectively but also localize the speaker for video enhancement. In this paper, we propose a neural-network-based composite system that integrates a direction-of-arrival (DoA) estimator and a neural beamformer for dual-channel speech enhancement. The proposed system accomplishes both tasks simultaneously using the signals received by two microphones. The estimated DoA is converted into a spatial-angle-related feature, which provides complementary information to boost the performance of the neural beamformer. The proposed system is evaluated in simulated far-field conditions with reverberation and noise. Simulation results demonstrate that the proposed system outperforms stand-alone baseline systems on each of the two tasks and achieves results comparable to the best stand-alone models on each task.

Original language: English
Title of host publication: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1064-1068
Number of pages: 5
ISBN (Electronic): 9798350300673
State: Published - 2023
Event: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023 - Taipei, Taiwan
Duration: 31 Oct 2023 to 3 Nov 2023

Publication series

Name: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023

Conference

Conference: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Country/Territory: Taiwan
City: Taipei
Period: 31/10/23 to 3/11/23
