TY - JOUR
T1 - Continuous-time self-attention in neural differential equation
AU - Chien, Jen-Tzung
AU - Chen, Yi-Hsiang
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021/6
Y1 - 2021/6
N2 - The neural differential equation (NDE) has recently been developed as a continuous-time state machine that can faithfully represent irregularly-sampled sequence data. The NDE can be seen as a substantial extension of the recurrent neural network (RNN), which conducts discrete-time state representation for regularly-sampled data. This study presents a new continuous-time attention to improve sequential learning, in which the region of interest in the continuous-time state trajectory, over observed as well as missing samples, is sufficiently attended. However, the attention score, calculated by relating a query to a sequence, is memory-demanding, because self-attention must treat all time observations as query vectors and feed them into an ordinary differential equation (ODE) solver. To deal with this issue, we develop a new form of dynamics for continuous-time attention in which the causality property is adopted, such that the query vector is fed into the ODE solver only up to the current time. Experiments on irregularly-sampled human activities and medical features show that this method obtains desirable performance with efficient memory consumption.
AB - The neural differential equation (NDE) has recently been developed as a continuous-time state machine that can faithfully represent irregularly-sampled sequence data. The NDE can be seen as a substantial extension of the recurrent neural network (RNN), which conducts discrete-time state representation for regularly-sampled data. This study presents a new continuous-time attention to improve sequential learning, in which the region of interest in the continuous-time state trajectory, over observed as well as missing samples, is sufficiently attended. However, the attention score, calculated by relating a query to a sequence, is memory-demanding, because self-attention must treat all time observations as query vectors and feed them into an ordinary differential equation (ODE) solver. To deal with this issue, we develop a new form of dynamics for continuous-time attention in which the causality property is adopted, such that the query vector is fed into the ODE solver only up to the current time. Experiments on irregularly-sampled human activities and medical features show that this method obtains desirable performance with efficient memory consumption.
KW - Attention mechanism
KW - Causal attention
KW - Neural differential equation
KW - Sequential learning
UR - http://www.scopus.com/inward/record.url?scp=85115883290&partnerID=8YFLogxK
U2 - 10.1109/ICASSP39728.2021.9414104
DO - 10.1109/ICASSP39728.2021.9414104
M3 - Conference article
AN - SCOPUS:85115883290
SN - 1520-6149
VL - 2021-June
SP - 3290
EP - 3294
JO - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
JF - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
T2 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021
Y2 - 6 June 2021 through 11 June 2021
ER -