Enhanced Pedestrian Trajectory Prediction via the Cross-Modal Feature Fusion Transformer

Rashid Ali*, Hsu Feng Hsiao

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

We address the challenge of predicting pedestrian trajectories in videos, a task inherently complex due to the diverse and intricate nature of human motion and interactions within their environment. The accurate anticipation of trajectories necessitates a holistic comprehension of the temporal evolution of past events in videos. Regrettably, existing methods often neglect the fusion of critical features, such as human behavior, motion, and interaction, thereby limiting their efficacy in tackling these challenges. To overcome these limitations, we propose the Cross-modal Feature Fusion Transformer, a novel approach for pedestrian trajectory prediction. Our model seamlessly integrates multimodal features, including human behavior, position, speed, and interaction with surroundings, to effectively encapsulate the temporal progression of observed frames. It consists of transformer-based cross-modal fusion encoder and decoder modules, adeptly melding the interactions between the multimodal features through a multi-head co-attentional mechanism. This enables the precise prediction of future trajectories. Additionally, we incorporate auxiliary self-supervised future prediction losses to learn the temporal evolution of past and future multimodal features. We evaluate our approach on ETH/UCY and ActEV/VIRAT datasets and demonstrate its superior performance compared to state-of-the-art methods.
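As a rough illustration of the co-attentional fusion the abstract mentions, the sketch below lets two modality feature sequences attend to each other with scaled dot-product attention. It is a minimal sketch, not the authors' model: it uses a single head rather than multiple heads, omits learned projections, and the feature names and toy values (`position`, `behavior`) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention.

    Each query frame attends over all frames of the other modality.
    Keys and values are the same vectors here (no learned projections,
    an assumption made for brevity).
    """
    d = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        fused.append(
            [sum(w * kv[c] for w, kv in zip(weights, keys_values)) for c in range(d)]
        )
    return fused


def co_attention(a, b):
    """Attend in both directions: a over b, and b over a."""
    return cross_attention(a, b), cross_attention(b, a)


# Hypothetical toy features: 3 observed frames, 2-dim embedding per modality.
position = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
behavior = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

pos_fused, beh_fused = co_attention(position, behavior)
```

Each fused frame is a weighted average of the other modality's frames, so its values stay within the range spanned by that modality. A multi-head variant would typically run several such attentions on separately projected features and concatenate the results.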

Original language: English
Title of host publication: 2023 IEEE International Conference on Visual Communications and Image Processing, VCIP 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350359855
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Visual Communications and Image Processing, VCIP 2023 - Jeju, South Korea
Duration: 4 Dec 2023 to 7 Dec 2023

Publication series

Name: 2023 IEEE International Conference on Visual Communications and Image Processing, VCIP 2023

Conference

Conference: 2023 IEEE International Conference on Visual Communications and Image Processing, VCIP 2023
Country/Territory: South Korea
City: Jeju
Period: 4/12/23 to 7/12/23
