Sample-based phone-like unit automatic labeling in Mandarin speech

You Yu Lin, Yih-Ru Wang

Research output: Contribution to conference › Paper › peer-review

Abstract

This paper presents a sample-based phone boundary detection algorithm that improves the accuracy of phone boundary labeling in speech signals. Conventional phone labeling methods take a frame-based approach: acoustic features such as MFCCs are extracted frame by frame, and statistical methods are then used to locate phone boundaries from these features. HMM-based forced alignment is the most frequently used method of this kind. The main drawback of the frame-based approach is its inability to model rapid changes in the speech signal; moreover, its time resolution is too coarse for some applications. To overcome these problems, this study proposes a sample-wise phone boundary detection framework. First, sample-wise acoustic features are proposed that can properly model the variation of the speech signal. The sample-based spectral KL distance is first used to pre-select boundary candidates, which reduces the complexity of the sample-based method. Then, a supervised neural network is trained for phone boundary detection. Finally, the effectiveness of the proposed framework is validated on automatic labeling of the TCC-300 speech corpus.
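The abstract does not give the exact form of the sample-based spectral KL distance, so the following is only a minimal sketch of the general idea, under stated assumptions: at each sample position, the normalized power spectra of the windows immediately before and after that sample are compared with a symmetric KL distance, and local peaks above a threshold are kept as boundary candidates. The function names, the window length `half_win`, the symmetric form of the distance, the naive DFT, and the threshold are all illustrative choices, not details taken from the paper.

```python
import cmath
import math


def power_spectrum(window):
    """Normalized power spectrum (sums to 1), computed with a naive DFT."""
    n = len(window)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(window))
        spec.append(abs(acc) ** 2 + 1e-12)  # small floor avoids log(0)
    total = sum(spec)
    return [p / total for p in spec]


def symmetric_kl(p, q):
    """Symmetric KL distance: KL(p||q) + KL(q||p)."""
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def kl_contour(signal, half_win):
    """Distance between the windows before and after each sample position."""
    contour = [0.0] * len(signal)
    for t in range(half_win, len(signal) - half_win + 1):
        left = power_spectrum(signal[t - half_win:t])
        right = power_spectrum(signal[t:t + half_win])
        contour[t] = symmetric_kl(left, right)
    return contour


def boundary_candidates(contour, threshold):
    """Assumed pre-selection rule: local maxima of the contour above a threshold."""
    return [t for t in range(1, len(contour) - 1)
            if contour[t] > threshold
            and contour[t] >= contour[t - 1]
            and contour[t] > contour[t + 1]]
```

In a pipeline like the one the abstract outlines, only the positions returned by `boundary_candidates` would be passed on to the supervised classifier, which is how pre-selection would reduce the cost of working at sample resolution.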

Original language: English
Pages: 137-149
Number of pages: 13
State: Published - 2009
Event: 21st Conference on Computational Linguistics and Speech Processing, ROCLING 2009 - Taichung, Taiwan
Duration: 1 Sep 2009 → 2 Sep 2009

Conference

Conference: 21st Conference on Computational Linguistics and Speech Processing, ROCLING 2009
Country/Territory: Taiwan
City: Taichung
Period: 1/09/09 → 2/09/09

Keywords

  • Phone boundary segmentation
  • Sample-based spectral KL distance
  • Sub-band signal envelope
  • Supervised neural network
