Speech Reconstruction from the Larynx Vibration Feature Captured by Laser-Doppler Vibrometer Sensor

Yi Chieh Lin, Ji Yan Han, Yu Min Lin, Wei Zhong Zheng, Shuenn Tsong Young, Ying Hui Lai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Many deep learning (DL)-based models use contact sensors (e.g., a throat microphone, TM) to reconstruct speech from the vibration signals of the larynx. A TM can capture more robust speech information than an air-conducted microphone (ACM) sensor in noisy environments. However, it requires tight contact with the user's skin, which causes discomfort. We therefore assume that a non-contact sensor offers users a better experience. Following this concept, we propose DL-based models with a non-contact sensor, a laser-Doppler vibrometer (LDV), to reconstruct speech from the vibration signals of the larynx. Notably, the proposed system adopts recognition and speech synthesis modules. The experimental results showed that, on average, the word error rate (WER) of the recognition module in the proposed system was similar to that obtained with the TM in both quiet and noisy testing conditions. Furthermore, the listening test showed that the synthesis module's reconstructed speech achieved a higher preference rate and naturalness than the original speech recorded by the LDV sensor. These results suggest that the proposed system, which applies DL technology to larynx vibration signals captured by a non-contact LDV sensor, is a promising approach to speech reconstruction.

Original language: English
Title of host publication: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 829-835
Number of pages: 7
ISBN (Electronic): 9789881476890
State: Published - 2021
Event: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021 - Tokyo, Japan
Duration: 14 Dec 2021 to 17 Dec 2021

Publication series

Name: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021 - Proceedings

Conference

Conference: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021
Country/Territory: Japan
City: Tokyo
Period: 14/12/21 to 17/12/21
