TY - GEN
T1 - Video summarization with frame index vision transformer
AU - Hsu, Tzu Chun
AU - Liao, Yi Sheng
AU - Huang, Chun Rong
N1 - Publisher Copyright: © 2021 MVA Organization.
PY - 2021/7/25
Y1 - 2021/7/25
N2 - In this paper, we propose a novel frame index vision transformer for video summarization. Given training frames, we linearly project the content of the frames to obtain frame embeddings. By combining the frame embeddings with the index embedding and class embedding, the proposed frame index vision transformer can efficiently and effectively learn the importance of the input frames. Experimental results show that the proposed method outperforms state-of-the-art deep learning methods, including recurrent neural network (RNN) and convolutional neural network (CNN) based methods, on both the SumMe and TVSum datasets. In addition, our method achieves real-time computational efficiency during testing.
AB - In this paper, we propose a novel frame index vision transformer for video summarization. Given training frames, we linearly project the content of the frames to obtain frame embeddings. By combining the frame embeddings with the index embedding and class embedding, the proposed frame index vision transformer can efficiently and effectively learn the importance of the input frames. Experimental results show that the proposed method outperforms state-of-the-art deep learning methods, including recurrent neural network (RNN) and convolutional neural network (CNN) based methods, on both the SumMe and TVSum datasets. In addition, our method achieves real-time computational efficiency during testing.
UR - http://www.scopus.com/inward/record.url?scp=85114002659&partnerID=8YFLogxK
U2 - 10.23919/MVA51890.2021.9511350
DO - 10.23919/MVA51890.2021.9511350
M3 - Conference contribution
AN - SCOPUS:85114002659
T3 - Proceedings of MVA 2021 - 17th International Conference on Machine Vision Applications
BT - Proceedings of MVA 2021 - 17th International Conference on Machine Vision Applications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th International Conference on Machine Vision Applications, MVA 2021
Y2 - 25 July 2021 through 27 July 2021
ER -