TY - GEN
T1 - Deep Bayesian data mining
AU - Chien, Jen-Tzung
PY - 2020/1/20
Y1 - 2020/1/20
AB - This tutorial addresses the fundamentals and advances in deep Bayesian mining and learning for natural language, with ubiquitous applications ranging from speech recognition [7, 55] to document summarization [8], text classification [5, 75], text segmentation [18], information extraction [50], image caption generation [69, 72], sentence generation [25, 46], dialogue control [22, 76], sentiment classification, recommendation systems, question answering [58] and machine translation [2], to name a few. Traditionally, “deep learning” is taken to be a learning process whose inference or optimization is based on a real-valued deterministic model. The “semantic structure” in words, sentences, entities, actions and documents drawn from a large vocabulary may not be well expressed or correctly optimized in mathematical logic or computer programs. The “distribution function” in a discrete or continuous latent variable model for natural language may not be properly decomposed or estimated. This tutorial addresses the fundamentals of statistical models and neural networks, and focuses on a series of advanced Bayesian models and deep models, including the hierarchical Dirichlet process [61], Chinese restaurant process [4], hierarchical Pitman-Yor process [60], Indian buffet process [35], recurrent neural network (RNN) [26, 41, 48, 65], long short-term memory (LSTM), sequence-to-sequence model [59], variational auto-encoder (VAE) [44], generative adversarial network (GAN) [36], attention mechanism [27, 56], memory-augmented neural network [39, 58], skip neural network [6], temporal difference VAE [40], stochastic neural network [3, 47], stochastic temporal convolutional network [1], predictive state neural network [31], and policy neural network [49, 74]. Enhancing the prior/posterior representation is also addressed [53, 62]. We present how these models are connected and why they work for a variety of applications on symbolic and complex patterns in natural language. Variational inference and sampling methods are formulated to tackle the optimization of complicated models [54]. Word and sentence embeddings, clustering and co-clustering are merged with linguistic and semantic constraints. A series of case studies, tasks and applications are presented to tackle different issues in deep Bayesian mining, searching, learning and understanding. Finally, we point out a number of directions and outlooks for future studies. This tutorial serves three objectives: to introduce novices to major topics within deep Bayesian learning, to motivate and explain a topic of emerging importance for data mining and natural language understanding, and to present a novel synthesis combining distinct lines of machine learning work.
KW - Bayesian learning
KW - Data mining
KW - Deep learning
KW - Information retrieval
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85079553305&partnerID=8YFLogxK
U2 - 10.1145/3336191.3371870
DO - 10.1145/3336191.3371870
M3 - Conference contribution
AN - SCOPUS:85079553305
T3 - WSDM 2020 - Proceedings of the 13th International Conference on Web Search and Data Mining
SP - 865
EP - 868
BT - WSDM 2020 - Proceedings of the 13th International Conference on Web Search and Data Mining
PB - Association for Computing Machinery, Inc.
T2 - 13th ACM International Conference on Web Search and Data Mining, WSDM 2020
Y2 - 3 February 2020 through 7 February 2020
ER -