TY - CONF
T1 - Eigen-prosody analysis for robust speaker recognition under mismatch handset environment
AU - Chen, Zi He
AU - Liao, Yuan Fu
AU - Juang, Yau Tarng
N1 - Funding Information:
This work was supported by the National Science Council, Taiwan, under the project with contract NSC 92-2213-E-027-037 and Ministry of Education under the project with contract A-93-E-FA06-4-4.
PY - 2004
Y1 - 2004
N2 - Most speaker recognition systems use only low-level, short-term spectral features and ignore high-level, long-term information such as prosody and speaking style. This paper presents a novel eigen-prosody analysis (EPA) approach that captures a speaker's long-term prosodic information for robust speaker recognition under mismatched environments. It converts the prosodic feature contours of a speaker's speech into sequences of prosody symbols and thereby transforms the speaker recognition problem into a task similar to full-text document retrieval. Experimental results on the well-known HTIMIT database show that, even when only a small amount of training/test data is available, a remarkable improvement can be achieved: a relative error rate reduction of about 28.7% compared with the GMM/cepstral mean subtraction (CMS) baseline.
AB - Most speaker recognition systems use only low-level, short-term spectral features and ignore high-level, long-term information such as prosody and speaking style. This paper presents a novel eigen-prosody analysis (EPA) approach that captures a speaker's long-term prosodic information for robust speaker recognition under mismatched environments. It converts the prosodic feature contours of a speaker's speech into sequences of prosody symbols and thereby transforms the speaker recognition problem into a task similar to full-text document retrieval. Experimental results on the well-known HTIMIT database show that, even when only a small amount of training/test data is available, a remarkable improvement can be achieved: a relative error rate reduction of about 28.7% compared with the GMM/cepstral mean subtraction (CMS) baseline.
UR - http://www.scopus.com/inward/record.url?scp=85009110237&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85009110237
SP - 1421
EP - 1424
T2 - 8th International Conference on Spoken Language Processing, ICSLP 2004
Y2 - 4 October 2004 through 8 October 2004
ER -