Prosody modeling of spontaneous Mandarin speech and its application to automatic speech recognition

Cheng Hsien Lin, Meng Chian Wu, Chung Long You, Chen Yu Chiang, Yih-Ru Wang, Sin-Horng Chen

Research output: Conference article, peer-reviewed

5 citations (Scopus)

Abstract

A prosody-assisted ASR approach for spontaneous Mandarin speech is proposed. It uses a previously proposed joint prosody labeling and modeling algorithm to construct a hierarchical prosodic model (HPM), which is then applied in two-stage speech recognition. A word lattice is first generated by the HMM method using a tri-phone AM and a bigram LM. The lattice is then extended by replacing the bigram LM with a trigram LM. In the second stage, a rescoring process sequentially adds the factor POS and PM LMs and the HPM. The method is evaluated on the MCDC database, which comprises 8 dialogues by 16 speakers with a total length of 9.09 hours. Syllable/character/word error rates were reduced from 35.6/40.2/45.1% for the baseline trigram HMM method to 32.4/36.5/41.8% for the proposed method. The improvement is reasonably good considering the WER upper bound of 13.4% for the word lattice, which is due to the high OOV rate of the database. Error analysis shows that many tone recognition errors and word segmentation errors were corrected. In addition, the ASR also yields further information about each test utterance, including the POS of each word, PM, the tone of each syllable, the break type of each syllable juncture, and the prosodic state of each syllable.
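
The reported gains can be restated as absolute and relative error reductions. The short Python sketch below does only that arithmetic on the six error rates quoted in the abstract. The names `REPORTED`, `error_reduction`, and `summarize` are illustrative and do not come from the paper, and the relative reductions are derived here rather than reported by the authors.

```python
# Error rates (%) quoted in the abstract:
# (baseline trigram HMM method, proposed prosody-assisted method).
REPORTED = {
    "syllable": (35.6, 32.4),
    "character": (40.2, 36.5),
    "word": (45.1, 41.8),
}


def error_reduction(baseline, proposed):
    """Return (absolute reduction in points, relative reduction in %),
    each rounded to one decimal place."""
    absolute = baseline - proposed
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


def summarize(rates=REPORTED):
    """Map each unit (syllable/character/word) to its error reductions."""
    return {unit: error_reduction(b, p) for unit, (b, p) in rates.items()}


if __name__ == "__main__":
    for unit, (abs_red, rel_red) in summarize().items():
        print(f"{unit}: {abs_red} points absolute, {rel_red}% relative")
```

Under this calculation the relative error reduction is roughly 9% at the syllable and character levels and about 7% at the word level.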

Original language: English
Pages (from-to): 1034-1037
Number of pages: 4
Journal: Proceedings of the International Conference on Speech Prosody
Volume: 2016-January
Publication status: Published - 2016
Event: 8th Speech Prosody 2016 - Boston, United States
Duration: 31 May 2016 to 3 Jun 2016
