Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm

Syu Siang Wang, Hsin Te Hwang, Ying Hui Lai, Yu Tsao, Xugang Lu, Hsin Min Wang, Borching Su

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Scopus citations

Abstract

This paper investigates the use of the speech parameter generation (SPG) algorithm, which has been successfully adopted in deep neural network (DNN)-based voice conversion (VC) and speech synthesis (SS), to incorporate temporal information and thereby improve deep denoising auto-encoder (DDAE)-based speech enhancement. Our previous studies confirmed that a DDAE can effectively suppress noise components in noise-corrupted speech. However, because the DDAE converts speech frame by frame, the enhanced speech shows some discontinuity even when context features are used as input to the DDAE. To address this issue, this study proposes using the SPG algorithm as a post-processor that transforms the DDAE-processed feature sequence into one with a smoothed trajectory. Two types of temporal information are investigated with SPG: static-dynamic features and context features. Experimental results show that SPG with context features outperforms both SPG with static-dynamic features and the baseline system, which uses context features without SPG, on standardized objective tests across different noise types and signal-to-noise ratios (SNRs).
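The smoothing step described in the abstract can be illustrated with a minimal sketch. It assumes the parameter generation formulation commonly used in speech synthesis, where a smoothed static trajectory y is obtained from predicted static and delta features o by solving (WᵀΣ⁻¹W) y = WᵀΣ⁻¹ o. The abstract does not give the paper's delta window, variances, or feature type, so the window `(-0.5, 0, 0.5)`, the scalar variances, and the function names `build_window_matrix` and `spg_smooth` are all assumptions made for illustration only.

```python
import numpy as np


def build_window_matrix(num_frames, delta_win=(-0.5, 0.0, 0.5)):
    """Return W of shape (2T, T): an identity (static) block stacked on a delta block.

    The delta window is an assumed central-difference window, not taken from the paper.
    """
    T = num_frames
    static = np.eye(T)
    delta = np.zeros((T, T))
    half = len(delta_win) // 2
    for t in range(T):
        for k, w in enumerate(delta_win):
            tau = min(max(t + k - half, 0), T - 1)  # clamp indices at the sequence edges
            delta[t, tau] += w
    return np.vstack([static, delta])


def spg_smooth(static_pred, delta_pred, static_var=1.0, delta_var=1.0):
    """Solve (W^T P W) y = W^T P o for the smoothed static trajectory y.

    static_pred, delta_pred: arrays of shape (T, D), e.g. frame-wise network outputs.
    static_var, delta_var: assumed scalar variances; P is the diagonal precision matrix.
    """
    T = static_pred.shape[0]
    W = build_window_matrix(T)
    precision = np.concatenate(
        [np.full(T, 1.0 / static_var), np.full(T, 1.0 / delta_var)]
    )
    WtP = W.T * precision  # scales column j of W^T by precision[j], i.e. W^T P
    A = WtP @ W
    b = WtP @ np.vstack([static_pred, delta_pred])
    return np.linalg.solve(A, b)
```

In this sketch, a smaller `delta_var` relative to `static_var` makes the solution trust the predicted dynamics more, which pulls the frame-wise static estimates toward a smoother trajectory. When the static and delta predictions are already mutually consistent, the output equals the static input.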

Original language: English
Title of host publication: 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 365-369
Number of pages: 5
ISBN (Electronic): 9789881476807
State: Published - 19 Feb 2016
Event: 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015 - Hong Kong, Hong Kong
Duration: 16 Dec 2015 → 19 Dec 2015

Publication series

Name: 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015

Conference

Conference: 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015
Country/Territory: Hong Kong
City: Hong Kong
Period: 16/12/15 → 19/12/15
