Consonant classification in Mandarin based on the depth image feature: A pilot study

Han Chi Hsieh, Wei Zhong Zheng, Ko Chiang Chen, Ying Hui Lai

Research output: Conference article, peer-reviewed

1 citation (Scopus)


Consonants are an important element of Mandarin, and different categories of consonants produce different facial movements. Specifically, the facial muscles change during speech, and these changes are closely related to pronunciation: the facial muscles are linked to the hidden articulators, and their effects on the face appear as 3D changes. Most studies, however, analyze facial features during speech using 2D images. A 2D image provides information in only two dimensions (the x- and y-axes), so subtle depth motions (z-axis changes) of the facial muscles during speech are difficult to detect accurately. Hence, this study used depth features of the face (point cloud features), recorded by a time-of-flight 3D camera, to investigate their potential for consonant recognition. We propose an algorithm that recognizes seven categories of Mandarin consonants from the depth features of the speaker's face. The proposed system yielded suitable classification accuracy across the seven categories. This result implies that depth features can be used in speech-processing applications.
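The contrast between 2D and depth information can be illustrated with a toy example. The sketch below is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes a hypothetical input format: a sequence of point-cloud frames in which each frame lists the same facial points in the same order as (x, y, z) tuples. The function name `displacement_features` is likewise invented for illustration.

```python
from math import hypot


def displacement_features(frames):
    """Return one (in_plane, depth) pair per frame after the first.

    Assumption (not from the paper): every frame holds the same points in
    the same order, so points can be compared index by index against the
    first frame, which serves as the resting reference.
    """
    reference = frames[0]
    n = len(reference)
    features = []
    for frame in frames[1:]:
        if len(frame) != n:
            raise ValueError("all frames must contain the same number of points")
        # Mean displacement visible in a 2D image (x-y plane only).
        in_plane = sum(
            hypot(p[0] - r[0], p[1] - r[1]) for p, r in zip(frame, reference)
        ) / n
        # Mean absolute displacement along the depth (z) axis.
        depth = sum(abs(p[2] - r[2]) for p, r in zip(frame, reference)) / n
        features.append((in_plane, depth))
    return features


# Toy data: two points that move only along the z-axis.
rest = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0)]
moved = [(0.0, 0.0, 10.5), (1.0, 0.0, 9.5)]
features = displacement_features([rest, moved])
print(features)
```

In this toy case the in-plane component is zero while the depth component is not, which is the kind of motion a 2D projection would miss. A real pipeline would also need point registration across frames and a classifier on top of such features. Neither is shown here.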

