Anchor-Based Detection for Natural Language Localization in Ego-Centric Videos

Bei Liu, Sipeng Zheng, Jianlong Fu, Wen Huang Cheng

Research output: Conference contribution (peer-reviewed)

1 Citation (Scopus)

Abstract

The Natural Language Localization (NLL) task aims to localize a sentence in a video with starting and ending timestamps. It requires a comprehensive understanding of both language and video. Much work has been conducted on third-person view videos, while the task on ego-centric videos remains under-explored, even though it is critical for understanding the growing volume of ego-centric videos and for facilitating embodied AI tasks. Directly adapting existing NLL methods to ego-centric video datasets is challenging for two reasons. First, there is a temporal duration gap between different datasets. Second, queries in ego-centric videos usually require a better understanding of more complex and long-term temporal orders. For these reasons, we propose an anchor-based detection model for NLL in ego-centric videos.

Original language: English
Title of host publication: 2023 IEEE International Conference on Consumer Electronics, ICCE 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665491303
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 - Las Vegas, United States
Duration: 6 Jan 2023 to 8 Jan 2023

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
2023-January
ISSN (Print): 0747-668X

Conference

Conference: 2023 IEEE International Conference on Consumer Electronics, ICCE 2023
Country/Territory: United States
City: Las Vegas
Period: 6/01/23 to 8/01/23
