Deep co-occurrence feature learning for visual object recognition

Ya Fang Shih, Yang Ming Yeh, Yen Yu Lin, Ming Fang Weng, Yi Chang Lu, Yung Yu Chuang

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

34 Scopus citations


This paper addresses three issues in integrating part-based representations into convolutional neural networks (CNNs) for object recognition. First, most part-based models rely on a few pre-specified object parts. However, the optimal object parts for recognition often vary from category to category. Second, acquiring training data with part-level annotation is labor-intensive. Third, modeling spatial relationships between parts in CNNs often involves an exhaustive search of part templates over multiple network streams. We tackle the three issues by introducing a new network layer, called the co-occurrence layer. It can extend a convolutional layer to encode the co-occurrence between the visual parts detected by the numerous neurons, instead of a few pre-specified parts. To this end, the feature maps serve as both filters and images, and mutual correlation filtering is conducted between them. The co-occurrence layer is end-to-end trainable. The resultant co-occurrence features are rotation- and translation-invariant, and are robust to object deformation. By applying this new layer to VGG-16 and ResNet-152, we achieve recognition rates of 83.6% and 85.8% on the Caltech-UCSD bird benchmark, respectively. The source code is available at
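The mutual correlation filtering described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the co-occurrence of two channels is summarized as the maximum zero-padded cross-correlation response over all spatial offsets, a detail the abstract does not specify. The function names `cross_correlate_full` and `co_occurrence_features` are hypothetical, and the sketch covers only a forward pass, not the end-to-end training.

```python
import numpy as np


def cross_correlate_full(image, kernel):
    """Zero-padded 2-D cross-correlation over every relative offset."""
    H, W = image.shape
    h, w = kernel.shape
    padded = np.pad(image, ((h - 1, h - 1), (w - 1, w - 1)))
    out = np.empty((H + h - 1, W + w - 1))
    for dy in range(out.shape[0]):
        for dx in range(out.shape[1]):
            out[dy, dx] = np.sum(padded[dy:dy + h, dx:dx + w] * kernel)
    return out


def co_occurrence_features(feature_maps):
    """Treat each channel as a filter applied to every channel (as an image).

    feature_maps: array of shape (C, H, W), e.g. non-negative activations.
    Returns a vector of length C * C. Entry (i, j) is the maximum correlation
    response of channel i (filter) over channel j (image). Taking the maximum
    is an assumption made for this sketch.
    """
    C = feature_maps.shape[0]
    feats = np.empty((C, C))
    for i in range(C):
        for j in range(C):
            response = cross_correlate_full(feature_maps[j], feature_maps[i])
            feats[i, j] = response.max()
    return feats.reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    maps = rng.random((3, 4, 4))  # 3 channels of 4x4 activations
    print(co_occurrence_features(maps).shape)  # 3 * 3 pairwise features
```

Because the maximum is taken over all relative offsets, shifting every channel by the same amount leaves the output unchanged in this sketch, provided the activations stay inside the map. This matches the translation invariance claimed in the abstract. The rotation invariance and deformation robustness claimed there are not demonstrated by this example.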
Original language: American English
Title of host publication: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 10
ISBN (Print): 9781538604571
State: Published - 6 Nov 2017

Publication series

Name: Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017

