A Deep Learning Based Approach to Synthesize Intelligible Speech with Limited Temporal Envelope Information

Ching Ju Hsiao, Fei Chen, Ji Yan Han, Wei Zhong Zheng, Ying Hui Lai*

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

Abstract

Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and they carry information that is important for intelligibility in human speech communication. This study investigated whether a deep learning-based model that takes temporal envelope information as its input features could synthesize intelligible speech, and examined how reducing the number of temporal envelopes (from 8 to 2 in this work) affects the intelligibility of the synthesized speech. In the objective evaluation, the synthesized speech of the proposed approach achieved, on average, high short-time objective intelligibility (STOI) scores (i.e., 0.8) in each test condition. In the human listening test, the average word correct rate of eight listeners was higher than 97.5%. These findings indicate that the proposed deep learning-based system is a potential approach for synthesizing highly intelligible speech from limited envelope information in the future.
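As an illustration of the first sentence of the abstract, the following minimal sketch shows one common way to extract temporal envelopes from several frequency bands: band-pass filter, full-wave rectify, then low-pass smooth. It is not the authors' pipeline. The abstract does not specify the filter bank, so the log-spaced band edges (80 to 6000 Hz), the second-order band-pass filter, the 50 Hz envelope cutoff, and all function names here are assumptions made for illustration only.

```python
import math


def band_edges(n_bands, f_min=80.0, f_max=6000.0):
    """Log-spaced band edges in Hz (assumed range, not taken from the paper)."""
    ratio = (f_max / f_min) ** (1.0 / n_bands)
    return [f_min * ratio ** i for i in range(n_bands + 1)]


def bandpass_biquad(signal, fs, f_lo, f_hi):
    """Second-order band-pass filter with unity gain at the centre frequency."""
    f0 = math.sqrt(f_lo * f_hi)            # geometric centre frequency
    q = f0 / (f_hi - f_lo)                 # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                 # b1 is zero for this filter type
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def extract_envelopes(signal, fs, n_bands, env_cutoff=50.0):
    """Return one temporal envelope (list of floats) per frequency band."""
    edges = band_edges(n_bands)
    # One-pole low-pass coefficient for smoothing the rectified band signal.
    smooth = math.exp(-2.0 * math.pi * env_cutoff / fs)
    envelopes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = bandpass_biquad(signal, fs, lo, hi)
        env, state = [], 0.0
        for v in band:
            # Full-wave rectification followed by exponential smoothing.
            state = (1.0 - smooth) * abs(v) + smooth * state
            env.append(state)
        envelopes.append(env)
    return envelopes


if __name__ == "__main__":
    fs = 16000
    # A synthetic 1 kHz tone stands in for a speech signal.
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(1600)]
    for n_bands in (8, 2):
        envs = extract_envelopes(tone, fs, n_bands)
        print(n_bands, "bands ->", len(envs), "envelopes of", len(envs[0]), "samples")
```

Calling `extract_envelopes` with `n_bands` set to 8 or 2 mirrors the two ends of the range studied in the paper; a synthesis model would then take these envelopes as its input features.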

Original language: English
Title of host publication: 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1972-1976
Number of pages: 5
ISBN (Electronic): 9781728127828
Publication status: Published - 2022
Event: 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2022 - Glasgow, United Kingdom
Duration: 11 Jul 2022 to 15 Jul 2022

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
2022-July
ISSN (Print): 1557-170X

Conference

Conference: 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2022
Country/Territory: United Kingdom
City: Glasgow
Period: 11/07/22 to 15/07/22
