This paper presents a flexible topic model based on the nested Indian buffet process (nIBP). The flexibility is achieved by relaxing three constraints: (1) number of topics is fixed, (2) topics are independent, and (3) topic hierarchy for a document is limited by a single tree path. Bayesian nonparametric learning is conducted to build a tree model where the number of topics and the topic hierarchies are automatically learnt from the given data. In particular, we propose the nIBP to construct the topic mixture model for representation of heterogeneous documents where the mixture components are flexibly selected from tree nodes or dishes that a document or customer chooses in Indian buffet process. The selection is performed in a nested and hierarchical manner. The experiments on document representation show the benefits of using the proposed nIBP.
|頁（從 - 到）||1434-1437|
|期刊||Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH|
|出版狀態||Published - 1 一月 2014|
|事件||15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore|
持續時間: 14 九月 2014 → 18 九月 2014