Bayesian Multi-Temporal-Difference Learning

Jen Tzung Chien*, Yi Chung Chiu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This paper presents a new sequential learning method based on a planning strategy in which future samples are predicted by reflecting on past experiences. Such a strategy is appealing for implementing an intelligent machine that foresees multiple time steps ahead instead of predicting step by step. In particular, a flexible sequential learning scheme is developed to directly predict future states without visiting all intermediate states. A Bayesian approach to a multi-temporal-difference neural network is accordingly proposed to calculate the stochastic belief state for an abstract state machine, so as to capture large-span context and make high-level predictions. Importantly, the sequence data are represented by multiple jumpy states with varying temporal differences. A Bayesian state machine is trained by maximizing the variational lower bound of the log likelihood of the sequence data. A generalized sequence model with a varying number of Markov states is derived, with the previous temporal-difference variational autoencoder as a simplified realization. The predictive states are learned to roll forward with jumps. Experiments show that this approach can be effectively trained to predict jumpy states in various types of sequence data.

Original language: English
Article number: e34
Journal: APSIPA Transactions on Signal and Information Processing
Volume: 11
Issue number: 1
State: Published - 2022

Keywords

  • Bayesian learning
  • sequential learning
  • state machine
  • temporal-difference learning
  • variational autoencoder
