Towards Slovak-English-Mandarin speech recognition using deep learning

Matus Pleva*, Yuan Fu Liao, Wuhua Hsu, Daniel Hladek, Jan Stas, Peter Viszlay, Martin Lojka, Jozef Juhar

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Scopus citations

Abstract

This paper describes progress in developing a multilingual speech-enabled interface using state-of-the-art deep learning techniques, within the bilateral project "Deep Learning for Advanced Speech Enabled Applications". Advances are expected especially in automatic subtitling of broadcast television and radio programs, database creation, indexing, and information retrieval. This requires investigating deep learning techniques in the following sub-tasks: a) multilingual large vocabulary continuous speech recognition, b) audio event detection, c) speaker clustering and diarization, d) spoken discourse, speech, paragraph and sentence segmentation, e) emotion recognition, f) microphone array/multi-channel speech enhancement, g) data mining, h) multilingual speech synthesis, and i) spoken dialogue user interfaces. The paper presents the current work, the data available in the project, and the results achieved in the first task: a Kaldi-based Slovak speech recognition module using deep learning algorithms.

Original language: English
Title of host publication: Proceedings of ELMAR 2018 - 60th International Symposium
Editors: Mislav Grgic, Dijana Vitas, Branka Zovko-Cihlar, Mario Mustra
Publisher: Croatian Society Electronics in Marine - ELMAR
Pages: 151-154
Number of pages: 4
ISBN (Electronic): 9789531842440
State: Published - 13 Nov 2018
Event: 60th International Symposium on ELMAR, ELMAR 2018 - Zadar, Croatia
Duration: 16 Sep 2018 to 19 Sep 2018

Publication series

Name: Proceedings Elmar - International Symposium Electronics in Marine
Volume: 2018-September
ISSN (Print): 1334-2630

Conference

Conference: 60th International Symposium on ELMAR, ELMAR 2018
Country/Territory: Croatia
City: Zadar
Period: 16/09/18 to 19/09/18

Keywords

  • Large Vocabulary Continuous Speech Recognition (LVCSR); Human Computer Interface (HCI); Deep Neural Networks (DNNs); Code-Switching; Bilingual Language Switching
