Eigen-prosody analysis for robust speaker recognition under mismatch handset environment

Zi He Chen, Yuan Fu Liao, Yau Tarng Juang

Research output: peer-reviewed

2 citations (Scopus)

Abstract

Most speaker recognition systems use only low-level, short-term spectral features and ignore high-level, long-term information such as prosody and speaking style. This paper presents a novel eigen-prosody analysis (EPA) approach that captures a speaker's long-term prosodic information for robust speaker recognition in mismatched environments. EPA converts the prosodic feature contours of a speaker's speech into sequences of prosody symbols, and thereby transforms speaker recognition into a task similar to full-text document retrieval. Experimental results on the well-known HTIMIT database show that, even when only a small amount of training/test data is available, EPA achieves a remarkable improvement: a relative error rate reduction of about 28.7% compared with the GMM/cepstral mean subtraction (CMS) baseline.
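The idea of turning prosodic contours into symbol sequences and then scoring speakers as if they were documents can be illustrated with a small sketch. The abstract does not specify the prosodic features, the symbol inventory, or the scoring rule, so everything below is an assumption made for illustration: the rise/fall/level symbols, the `threshold` value, the use of symbol bigrams as "words", the SVD of the word-by-speaker count matrix (in the spirit of latent semantic analysis), and all function names are hypothetical and are not taken from the paper.

```python
import numpy as np
from collections import Counter

def contour_to_symbols(f0, threshold=0.02):
    """Map a voiced pitch contour (Hz) to rise/fall/level symbols.
    The three-symbol inventory and the threshold are illustrative assumptions."""
    slope = np.diff(np.log(np.asarray(f0, dtype=float)))
    return ["R" if s > threshold else "F" if s < -threshold else "L" for s in slope]

def symbol_ngrams(symbols, n=2):
    """Treat overlapping symbol n-grams as the 'words' of a prosody document."""
    return ["".join(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]

def count_vector(words, vocab):
    """Count how often each vocabulary word occurs; unseen words are ignored."""
    counts = Counter(words)
    return np.array([counts[w] for w in vocab], dtype=float)

def build_eigen_space(train_docs, rank=2):
    """train_docs maps speaker -> list of words.
    Returns the vocabulary, a low-rank basis, and each speaker's coordinates."""
    vocab = sorted({w for doc in train_docs.values() for w in doc})
    speakers = list(train_docs)
    # Word-by-speaker count matrix: one column per training speaker.
    A = np.column_stack([count_vector(train_docs[s], vocab) for s in speakers])
    U, S, _ = np.linalg.svd(A, full_matrices=False)
    basis = U[:, :min(rank, len(S))]
    coords = {s: basis.T @ A[:, j] for j, s in enumerate(speakers)}
    return vocab, basis, coords

def identify(test_words, vocab, basis, coords):
    """Project the test document into the space and pick the closest speaker
    by cosine similarity."""
    q = basis.T @ count_vector(test_words, vocab)

    def cosine(a, b):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom > 0 else 0.0

    return max(coords, key=lambda s: cosine(q, coords[s]))
```

As a usage example, one could build `train_docs` from `symbol_ngrams(contour_to_symbols(f0))` for each enrolled speaker's pitch track and pass a test utterance's words to `identify`. A real system would likely use richer prosodic features (for example energy and duration) and term weighting, none of which is modeled here.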

Original language: English
Pages: 1421-1424
Number of pages: 4
Publication status: Published - 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, South Korea
Duration: 4 Oct 2004 to 8 Oct 2004

Conference

Conference: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: South Korea
City: Jeju, Jeju Island
Period: 4 Oct 2004 to 8 Oct 2004
