TY - GEN
T1 - Learning Face Recognition Unsupervisedly by Disentanglement and Self-Augmentation
AU - Lee, Yi Lun
AU - Tseng, Min Yuan
AU - Luo, Yu Cheng
AU - Yu, Dung Ru
AU - Chiu, Wei Chen
N1 - Publisher Copyright:
© 2020 IEEE.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/5
Y1 - 2020/5
N2 - As the growth of smart home, healthcare, and home robot applications, learning a face recognition system which is specific for a particular environment and capable of self-adapting to the temporal changes in appearance (e.g., caused by illumination or camera position) is nowadays an important topic. In this paper, given a video of a group of people, which simulates the surveillance video in a smart home environment, we propose a novel approach which unsuper- visedly learns a face recognition model based on two main components: (1) a triplet network that extracts identity-aware feature from face images for performing face recognition by clustering, and (2) an augmentation network that is conditioned on the identity-aware features and aims at synthesizing more face samples. Particularly, the training data for the triplet network is obtained by using the spatiotemporal characteristic of face samples within a video, while the augmentation network learns to disentangle a face image into identity-aware and identity-irrelevant features thus is able to generate new faces of the same identity but with variance in appearance. With taking the richer training data produced by augmentation network, the triplet network is further fine-tuned and achieves better performance in face recognition. Extensive experiments not only show the efficacy of our model in learning an environment- specific face recognition model unsupervisedly, but also verify its adaptability to various appearance changes.
AB - As the growth of smart home, healthcare, and home robot applications, learning a face recognition system which is specific for a particular environment and capable of self-adapting to the temporal changes in appearance (e.g., caused by illumination or camera position) is nowadays an important topic. In this paper, given a video of a group of people, which simulates the surveillance video in a smart home environment, we propose a novel approach which unsuper- visedly learns a face recognition model based on two main components: (1) a triplet network that extracts identity-aware feature from face images for performing face recognition by clustering, and (2) an augmentation network that is conditioned on the identity-aware features and aims at synthesizing more face samples. Particularly, the training data for the triplet network is obtained by using the spatiotemporal characteristic of face samples within a video, while the augmentation network learns to disentangle a face image into identity-aware and identity-irrelevant features thus is able to generate new faces of the same identity but with variance in appearance. With taking the richer training data produced by augmentation network, the triplet network is further fine-tuned and achieves better performance in face recognition. Extensive experiments not only show the efficacy of our model in learning an environment- specific face recognition model unsupervisedly, but also verify its adaptability to various appearance changes.
UR - http://www.scopus.com/inward/record.url?scp=85092741535&partnerID=8YFLogxK
U2 - 10.1109/ICRA40945.2020.9197348
DO - 10.1109/ICRA40945.2020.9197348
M3 - Conference contribution
AN - SCOPUS:85092741535
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 3018
EP - 3024
BT - 2020 IEEE International Conference on Robotics and Automation, ICRA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE International Conference on Robotics and Automation, ICRA 2020
Y2 - 31 May 2020 through 31 August 2020
ER -