TY - GEN
T1 - Towards Slovak-English-Mandarin speech recognition using deep learning
AU - Pleva, Matus
AU - Liao, Yuan Fu
AU - Hsu, Wuhua
AU - Hladek, Daniel
AU - Stas, Jan
AU - Viszlay, Peter
AU - Lojka, Martin
AU - Juhar, Jozef
N1 - Publisher Copyright: © Croatian Society Electronics in Marine - ELMAR. All rights reserved.
PY - 2018/11/13
Y1 - 2018/11/13
N2 - This paper describes progress in developing a multilingual speech-enabled interface that explores state-of-the-art deep learning techniques within the bilateral project "Deep Learning for Advanced Speech Enabled Applications". Advances are expected especially in automatic subtitling of broadcast television and radio programs, database creation, indexing, and information retrieval. This requires investigating deep learning techniques in the following sub-tasks: a) multilingual large vocabulary continuous speech recognition, b) audio event detection, c) speaker clustering and diarization, d) spoken discourse, speech, paragraph, and sentence segmentation, e) emotion recognition, f) microphone array/multi-channel speech enhancement, g) data mining, h) multilingual speech synthesis, and i) spoken dialogue user interfaces. The paper presents the current work, the data available in the project, and the results achieved in the first task, a Kaldi-based Slovak speech recognition module using deep learning algorithms.
AB - This paper describes progress in developing a multilingual speech-enabled interface that explores state-of-the-art deep learning techniques within the bilateral project "Deep Learning for Advanced Speech Enabled Applications". Advances are expected especially in automatic subtitling of broadcast television and radio programs, database creation, indexing, and information retrieval. This requires investigating deep learning techniques in the following sub-tasks: a) multilingual large vocabulary continuous speech recognition, b) audio event detection, c) speaker clustering and diarization, d) spoken discourse, speech, paragraph, and sentence segmentation, e) emotion recognition, f) microphone array/multi-channel speech enhancement, g) data mining, h) multilingual speech synthesis, and i) spoken dialogue user interfaces. The paper presents the current work, the data available in the project, and the results achieved in the first task, a Kaldi-based Slovak speech recognition module using deep learning algorithms.
KW - Large Vocabulary Continuous Speech Recognition (LVCSR)
KW - Human Computer Interface (HCI)
KW - Deep Neural Networks (DNNs)
KW - Code-Switching
KW - Bilingual Language Switching
UR - http://www.scopus.com/inward/record.url?scp=85058706057&partnerID=8YFLogxK
U2 - 10.23919/ELMAR.2018.8534661
DO - 10.23919/ELMAR.2018.8534661
M3 - Conference contribution
AN - SCOPUS:85058706057
T3 - Proceedings Elmar - International Symposium Electronics in Marine
SP - 151
EP - 154
BT - Proceedings of ELMAR 2018 - 60th International Symposium
A2 - Grgic, Mislav
A2 - Vitas, Dijana
A2 - Zovko-Cihlar, Branka
A2 - Mustra, Mario
PB - Croatian Society Electronics in Marine - ELMAR
T2 - 60th International Symposium on ELMAR, ELMAR 2018
Y2 - 16 September 2018 through 19 September 2018
ER -