S2SiamFC: Self-supervised Fully Convolutional Siamese Network for Visual Tracking

Chon Hou Sio, Yu Jen Ma, Hong Han Shuai, Jun Cheng Chen, Wen Huang Cheng

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

26 Scopus citations

Abstract

To exploit rich information from unlabeled data, in this work, we propose a novel self-supervised framework for visual tracking which can easily adapt the state-of-the-art supervised Siamese-based trackers into unsupervised ones by utilizing the fact that an image and any cropped region of it can form a natural pair for self-training. Besides common geometric transformation-based data augmentation and hard negative mining, we also propose adversarial masking which helps the tracker to learn other context information by adaptively blacking out salient regions of the target. The proposed approach can be trained offline using images only without any requirement of manual annotations and temporal information from multiple consecutive frames. Thus, it can be used with any kind of unlabeled data, including images and video frames. For evaluation, we take SiamFC as the base tracker and name the proposed self-supervised method as S2SiamFC. Extensive experiments and ablation studies on the challenging VOT2016 and VOT2018 datasets are provided to demonstrate the effectiveness of the proposed method which not only achieves comparable performance to its supervised counterpart and other unsupervised methods requiring multiple frames.

Original languageEnglish
Title of host publicationMM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
PublisherAssociation for Computing Machinery, Inc
Pages1948-1957
Number of pages10
ISBN (Electronic)9781450379885
DOIs
StatePublished - 12 Oct 2020
Event28th ACM International Conference on Multimedia, MM 2020 - Virtual, Online, United States
Duration: 12 Oct 202016 Oct 2020

Publication series

NameMM 2020 - Proceedings of the 28th ACM International Conference on Multimedia

Conference

Conference28th ACM International Conference on Multimedia, MM 2020
Country/TerritoryUnited States
CityVirtual, Online
Period12/10/2016/10/20

Keywords

  • self-supervised learning
  • single object tracking

Fingerprint

Dive into the research topics of 'S2SiamFC: Self-supervised Fully Convolutional Siamese Network for Visual Tracking'. Together they form a unique fingerprint.

Cite this