Abstract
This paper proposes a sample-based phone boundary detection algorithm. The method first extracts several sample-based acoustic parameters, including six sub-band signal envelopes, a sample-based KL distance, and spectral entropy. The sample-based KL distance is then used to preselect boundary candidates. Finally, a supervised neural network performs the final boundary detection. Experiments on the TIMIT speech corpus yielded equal error rates (EERs) of 13.2% and 15.1% on the training and test sets, respectively. Moreover, at the EER operating point, 43.5% and 88.2% of the detected boundaries fell within 80- and 240-sample error tolerances of the manual labels, respectively.
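
The abstract does not define its acoustic parameters, so the following is only a minimal sketch of how spectral entropy, a KL-type spectral distance, and candidate preselection are commonly computed. Everything here is an assumption made for illustration: the function names (`normalize`, `spectral_entropy`, `symmetric_kl`, `preselect_candidates`), the symmetric form of the KL distance, the local-maximum picking rule, and the threshold value are hypothetical and may differ from the paper's actual method.

```python
import math

def normalize(power, eps=1e-12):
    """Turn a non-negative power spectrum into a probability distribution.
    The small floor `eps` (an assumed value) avoids log(0)."""
    floored = [max(p, 0.0) + eps for p in power]
    total = sum(floored)
    return [p / total for p in floored]

def spectral_entropy(power):
    """Shannon entropy (in nats) of the normalized spectrum."""
    return -sum(p * math.log(p) for p in normalize(power))

def symmetric_kl(power_a, power_b):
    """Symmetric KL distance KL(a||b) + KL(b||a) between two normalized spectra."""
    pa, pb = normalize(power_a), normalize(power_b)
    return sum((a - b) * math.log(a / b) for a, b in zip(pa, pb))

def preselect_candidates(distances, threshold):
    """Indices of local maxima of a distance contour that reach the threshold.
    This picking rule is an assumption, not the paper's stated procedure."""
    return [
        i for i in range(1, len(distances) - 1)
        if distances[i] >= threshold
        and distances[i] > distances[i - 1]
        and distances[i] >= distances[i + 1]
    ]

# Toy inputs: a flat spectrum, a peaky spectrum, and a made-up distance contour.
flat = [1.0, 1.0, 1.0, 1.0]
peaky = [10.0, 0.1, 0.1, 0.1]
contour = [0.1, 0.2, 0.9, 0.3, 0.1, 0.6, 0.2]

print(spectral_entropy(flat) > spectral_entropy(peaky))  # True
print(preselect_candidates(contour, threshold=0.5))      # [2, 5]
```

In a pipeline like the one the abstract outlines, the preselected indices would then be passed to a classifier for the final accept or reject decision; that stage is omitted here because the abstract gives no details of the network.
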
Original language | English |
---|---|
Pages | 1397-1400 |
Number of pages | 4 |
Publication status | Published - Sep 2010 |
Event | 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan. Duration: 26 Sep 2010 → 30 Sep 2010 |
Conference
Conference | 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 |
---|---|
Country/Territory | Japan |
City | Makuhari, Chiba |
Period | 26/09/10 → 30/09/10 |