Temporally-Aggregating Multiple-Discontinuous-Image Saliency Prediction with Transformer-Based Attention

Pin Jie Huang, Chi An Lu, Kuan Wen Chen*

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

3 Citations (Scopus)

Abstract

In this paper, we apply deep saliency prediction to automatic drone exploration, which must consider not just a single image but multiple images from different view angles or locations in order to determine the exploration direction. However, little attention has been paid to saliency prediction over multiple discontinuous images, and no existing method takes temporal information into account, so the current predicted saliency map may be inconsistent with previous predictions. To address this, we propose the Temporally-Aggregating Multiple-Discontinuous-Image Saliency Prediction Network (TA-MSNet). It uses a transformer-based attention module to correlate relative saliency information among multiple discontinuous images and applies a ConvLSTM module to capture temporal information. Experiments show that TA-MSNet produces better and more consistent results than previous works on time-series data.

Original language: English
Title of host publication: 2022 IEEE International Conference on Robotics and Automation, ICRA 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6571-6577
Number of pages: 7
ISBN (Electronic): 9781728196817
DOIs
Publication status: Published - 2022
Event: 39th IEEE International Conference on Robotics and Automation, ICRA 2022 - Philadelphia, United States
Duration: 23 May 2022 to 27 May 2022

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 39th IEEE International Conference on Robotics and Automation, ICRA 2022
Country/Territory: United States
City: Philadelphia
Period: 23/05/22 to 27/05/22
