Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario

Chiang Jen Peng*, Yun Ju Chan, Cheng Yu, Syu Siang Wang, Yu Tsao, Tai Shih Chi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

5 Scopus citations

Abstract

Multi-task learning (MTL) and attention mechanism have been proven to effectively extract robust acoustic features for various speech-related tasks in noisy environments. In this study, we propose an attention-based MTL (ATM) approach that integrates MTL and the attention-weighting mechanism to simultaneously realize a multi-model learning structure that performs speech enhancement (SE) and speaker identification (SI). The proposed ATM system consists of three parts: SE, SI, and attention-Net (AttNet). The SE part is composed of a long-short-term memory (LSTM) model, and a deep neural network (DNN) model is used to develop the SI and AttNet parts. The overall ATM system first extracts the representative features and then enhances the speech signals in LSTM-SE and specifies speaker identity in DNN-SI. The AttNet computes weights based on DNN-SI to prepare better representative features for LSTM-SE. We tested the proposed ATM system on Taiwan Mandarin hearing in noise test sentences. The evaluation results confirmed that the proposed system can effectively enhance speech quality and intelligibility of a given noisy input. Moreover, the accuracy of the SI can also be notably improved by using the proposed ATM system.

Original languageEnglish
Title of host publication2021 IEEE International Symposium on Circuits and Systems, ISCAS 2021 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781728192017
DOIs
StatePublished - May 2021
Event53rd IEEE International Symposium on Circuits and Systems, ISCAS 2021 - Daegu, Korea, Republic of
Duration: 22 May 202128 May 2021

Publication series

NameProceedings - IEEE International Symposium on Circuits and Systems
Volume2021-May
ISSN (Print)0271-4310

Conference

Conference53rd IEEE International Symposium on Circuits and Systems, ISCAS 2021
Country/TerritoryKorea, Republic of
CityDaegu
Period22/05/2128/05/21

Keywords

  • Attention weighting
  • Multi-task learning
  • Neural network
  • Speaker identification
  • Speech enhancement

Fingerprint

Dive into the research topics of 'Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario'. Together they form a unique fingerprint.

Cite this