Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval

Cheng Huang*, Yi Lun Wu, Hong Han Shuai, Ching Chun Huang

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

Abstract

Given an untrimmed video and a natural language query, video moment retrieval (VMR) aims to retrieve the video moments described by the query. However, most existing VMR methods assume a one-to-one mapping between the input query and the target video moment (single-target VMR), disregarding the possibility that a video may contain multiple target moments that match the query description (multi-target VMR). Previous methods tackle multi-target VMR by incorporating false negative moments with the original target moment for multi-target training. These methods, however, cannot work properly when no false negative moments exist in the video, or when the identified false negative moments are noisy but are still used as pseudo-labels. In this paper, we propose to tackle multi-target VMR by Semantic Fusion Augmentation and Semantic Boundary Detection (SFABD). Specifically, we use feature-level augmentation to generate augmented target moments, along with an intra-video contrastive loss to ensure feature consistency. Meanwhile, we perform semantic boundary detection to adaptively remove all false negatives from the negative set of the contrastive loss to avoid semantic confusion. Extensive experiments conducted on Charades-STA, ActivityNet Captions, and QVHighlights show that our method achieves state-of-the-art performance on both multi-target and single-target metrics. The source code is available at https://github.com/basiclab/SFABD.

Original language: English
Title of host publication: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6769-6778
Number of pages: 10
ISBN (Electronic): 9798350318920
Publication status: Published - 3 Jan 2024
Event: 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 - Waikoloa, United States
Duration: 4 Jan 2024 to 8 Jan 2024

Publication series

Name: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024

Conference

Conference: 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
Country/Territory: United States
City: Waikoloa
Period: 4/01/24 to 8/01/24
