The Speech Labeling and Modeling Toolkit (SLMTK) Version 1.0

Chen Yu Chiang, Wu Hao Li, Yen Ting Lin, Jia Jyu Su, Wei Cheng Chen, Cheng Che Kao, Shu Lei Lin, Pin Han Lin, Shao Wei Hong, Guan Ting Liou, Wen Yang Chang, Jen Chieh Chiang, Yen Ting Lin, Yih-Ru Wang, Sin Horng Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

This paper introduces the Speech Labeling and Modeling Toolkit version 1.0 (SLMTK 1.0), which facilitates automatic labeling of text and speech for constructing text-to-speech (TTS) systems and for speech analysis. SLMTK 1.0 accepts mixed Mandarin-English speech and the associated texts as input. The inputs are processed in seven steps: 1) text analysis, 2) acoustic feature extraction, 3) linguistic-speech alignment, 4) integration of syllable-based linguistic and prosodic-acoustic features, 5) prosody labeling, 6) construction of the prosody generation model, and 7) construction of acoustic models for speech synthesis. The outputs of the seven steps are, respectively, 1) linguistic labels, 2) acoustic features, 3) linguistic-speech alignment, 4) syllable-based linguistic and prosodic-acoustic features, 5) prosody tags, 6) a prosody generation model, and 7) acoustic models for speech synthesis. SLMTK 1.0 has been applied to constructing personalized TTS systems for augmentative and alternative communication. The toolkit has also been applied to phonetic and prosodic labeling of L2 Mandarin speech to facilitate prosody analysis studies. SLMTK 1.0 is available at https://slmtk.ce.ntpu.edu.tw for non-commercial use, and all parties are welcome to enrich its functions.
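The step-to-output correspondence in the abstract can be written down as a small lookup table. The following Python sketch only transcribes the seven steps and their outputs as the abstract words them; the names `Stage`, `SLMTK_STAGES`, and `describe` are hypothetical and are not part of the SLMTK itself, whose actual interface is not described here.

```python
from typing import NamedTuple


class Stage(NamedTuple):
    """One pipeline step (hypothetical container, not an SLMTK type)."""
    step: int
    name: str    # processing step, as worded in the abstract
    output: str  # what that step produces, as worded in the abstract


# The seven steps and their outputs, transcribed from the abstract.
SLMTK_STAGES = [
    Stage(1, "text analysis", "linguistic labels"),
    Stage(2, "acoustic feature extraction", "acoustic features"),
    Stage(3, "linguistic-speech alignment", "linguistic-speech alignment"),
    Stage(4, "integration of syllable-based linguistic and prosodic-acoustic features",
          "syllable-based linguistic and prosodic-acoustic features"),
    Stage(5, "prosody labeling", "prosody tags"),
    Stage(6, "construction of prosody generation model", "prosody generation model"),
    Stage(7, "construction of acoustic models for speech synthesis",
          "acoustic models for speech synthesis"),
]


def describe(stages):
    """Return one 'step N: name -> output' line per stage."""
    return [f"step {s.step}: {s.name} -> {s.output}" for s in stages]


if __name__ == "__main__":
    print("\n".join(describe(SLMTK_STAGES)))
```

Running the module prints one line per step, which makes it easy to check which artifact a given step is expected to produce.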

Original language: English
Title of host publication: 2022 25th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350398564
State: Published - 2022
Event: 25th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2022 - Hanoi, Viet Nam
Duration: 24 Nov 2022 - 26 Nov 2022

Publication series

Name: 2022 25th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2022 - Proceedings

Conference

Conference: 25th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2022
Country/Territory: Viet Nam
City: Hanoi
Period: 24/11/22 - 26/11/22

Keywords

  • annotation
  • labeling
  • Mandarin
  • mixed Mandarin-English
  • prosody
  • text-to-speech
