This paper presents the DeepCD framework, which uses deep learning to jointly learn a pair of complementary descriptors for image patch representation. The framework takes any descriptor learning architecture for learning a leading descriptor and augments it with an additional network stream for learning a complementary descriptor. To enforce the complementary property, a new network layer, called the data-dependent modulation (DDM) layer, is introduced for adaptively learning the augmented network stream with emphasis on the training data that are not well handled by the leading stream. By optimizing the proposed joint loss function with late fusion, the obtained descriptors are complementary to each other, and their fusion improves performance. Experiments on several problems and datasets show that the proposed method is simple yet effective, outperforming state-of-the-art methods.
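To make the idea of late fusion concrete, the following is a minimal sketch of how two descriptors per patch might be combined at matching time. It is an illustration under stated assumptions, not the method of this paper: the abstract does not specify the fusion rule, so the product of the two per-stream distances is assumed here, and the names `l2_distance`, `fused_distance`, and `match` are hypothetical. The DDM layer and the joint loss are not modeled.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_distance(lead_a, comp_a, lead_b, comp_b):
    """Combine the leading and complementary distances into one score.

    ASSUMPTION: product fusion is chosen only for illustration; the
    abstract does not state which late-fusion rule is used.
    """
    return l2_distance(lead_a, lead_b) * l2_distance(comp_a, comp_b)


def match(query, candidates):
    """Return the index of the candidate with the smallest fused distance.

    Each patch is a (leading_descriptor, complementary_descriptor) pair.
    """
    q_lead, q_comp = query
    scores = [
        fused_distance(q_lead, q_comp, c_lead, c_comp)
        for c_lead, c_comp in candidates
    ]
    return min(range(len(scores)), key=scores.__getitem__)


# Toy data (made up): candidate 0 is nearest under the leading descriptor
# alone, but the complementary descriptor disagrees strongly.
query = ([0.0, 1.0], [1.0, 0.0])
candidates = [
    ([0.1, 1.0], [0.0, 1.0]),
    ([0.2, 1.0], [1.0, 0.1]),
]
best = match(query, candidates)
```

In this toy case the leading descriptor alone would select candidate 0, while the fused score selects candidate 1. That is the kind of correction a complementary descriptor is meant to provide for patches the leading stream handles poorly.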
|Name||Proceedings of the IEEE International Conference on Computer Vision|