On latent semantic language modeling and smoothing

Jen-Tzung Chien, Meng Sung Wu, Hua Jui Peng

Research output: Paper › peer-review

4 Citations (Scopus)

Abstract

Language modeling plays a critical role in automatic speech recognition. Conventional n-gram language models suffer from a poor representation of the word history and from having to estimate unseen parameters from insufficient training data. In this work, latent semantic information is exploited for language modeling and parameter smoothing. For language modeling, we present a new representation of the word history obtained by retrieving the most likely relevant document. We also develop a novel parameter smoothing method in which the language models of seen and unseen words are estimated by interpolating those of the k nearest seen words in the training corpus. The interpolation coefficients are determined by the closeness of the words in the semantic space. In the experiments, the proposed modeling and smoothing methods significantly reduce the perplexity of the language models at a moderate computational cost.
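The abstract describes smoothing an unseen word's n-gram probability by interpolating the probabilities of its k nearest seen words, with weights derived from closeness in a latent semantic space. The sketch below illustrates this idea under assumed inputs (precomputed latent word vectors `seen_vecs`, a seen-word n-gram probability function `ngram_prob`, and cosine similarity as the closeness measure); it is an illustration of the general technique, not the authors' exact formulation.

```python
import numpy as np

def knn_smoothed_prob(word_vec, history, seen_words, seen_vecs, ngram_prob, k=10):
    """Sketch: interpolate the n-gram probabilities of the k seen words
    closest to `word_vec` in the latent semantic space, with interpolation
    coefficients proportional to cosine similarity (assumed closeness measure)."""
    # Cosine similarity between the target word and every seen word.
    sims = seen_vecs @ word_vec / (
        np.linalg.norm(seen_vecs, axis=1) * np.linalg.norm(word_vec) + 1e-12)
    # Keep the k closest seen words.
    top = np.argsort(-sims)[:k]
    weights = np.clip(sims[top], 0.0, None)
    weights /= weights.sum() + 1e-12
    # Interpolate their n-gram probabilities with the similarity-based weights.
    return float(sum(w * ngram_prob(seen_words[i], history)
                     for w, i in zip(weights, top)))
```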

Original language: English
Pages: 1373-1376
Number of pages: 4
Publication status: Published - October 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 4 October 2004 - 8 October 2004

Conference

Conference: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 4/10/04 - 8/10/04
