A joint-feature learning-based voice conversion system for dysarthric user based on deep learning technology

Ko Chiang Chen, Hsiu Wei Yeh, Ji Yan Hang, Sin Hua Jhang, Wei Zhong Zheng, Ying Hui Lai*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

Speakers with dysarthria have difficulty communicating, and voice conversion (VC) technology is a potential approach for improving their speech quality. This study presents a joint feature learning approach to improve a sub-band deep neural network-based VC system, termed J-SBDNN. A listening test of speech intelligibility was used to confirm the benefits of the proposed J-SBDNN VC system, with several well-known VC approaches used for comparison. The results showed that the J-SBDNN VC system provided higher speech intelligibility than the other VC approaches in most test conditions. This implies that the J-SBDNN VC system could potentially be used as an electronic assistive technology to improve speech quality for dysarthric speakers.

Original language: English
Title of host publication: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1838-1841
Number of pages: 4
ISBN (Electronic): 9781538613115
State: Published - Jul 2019
Event: 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2019 - Berlin, Germany
Duration: 23 Jul 2019 - 27 Jul 2019

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
ISSN (Print): 1557-170X

Conference

Conference: 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2019
Country/Territory: Germany
City: Berlin
Period: 23/07/19 - 27/07/19
