TY - GEN
T1 - Gated module neural network for multilingual speech recognition
AU - Liao, Yuan Fu
AU - Pleva, Matus
AU - Hladek, Daniel
AU - Stas, Jan
AU - Viszlay, Peter
AU - Lojka, Martin
AU - Juhar, Jozef
N1 - Publisher Copyright:
© 2018 IEEE
PY - 2018/7/2
Y1 - 2018/7/2
AB - For most multilingual large vocabulary continuous speech recognition (LVCSR) systems, performance degrades significantly when multiple languages are allowed at the same time, owing to strong inter-language competition in the decoding phase. To increase the inter-language discrimination capacity, this paper presents a gated module neural network (GMN) approach that adapts a language identification (LID) component to directly assist the final multilingual LVCSR goal. Thanks to an international collaboration, three large-scale speech corpora (Mandarin, English and Slovak, denoted as Zh, En and Sk) were shared for studying this problem. The proposed approach was therefore evaluated on both bilingual (Zh/En and Sk/En) and trilingual (Zh/En/Sk) LVCSR tasks. The experimental results show that the proposed GMN is promising and that the performance of multilingual LVCSR systems is now more comparable with that of the monolingual ones.
KW - Gated module neural networks
KW - Language identification
KW - Multilingual speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85065866203&partnerID=8YFLogxK
U2 - 10.1109/ISCSLP.2018.8706679
DO - 10.1109/ISCSLP.2018.8706679
M3 - Conference contribution
AN - SCOPUS:85065866203
T3 - 2018 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018 - Proceedings
SP - 131
EP - 135
BT - 2018 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018
Y2 - 26 November 2018 through 29 November 2018
ER -