Weakly Supervised Salient Object Detection by Learning A Classifier-Driven Map Generator

Kuang Jui Hsu*, Yen-Yu Lin, Yung Yu Chuang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 Scopus citations


Top-down saliency detection aims to highlight the regions of a specific object category, and typically relies on pixel-wise annotated training data. In this paper, we address the high cost of collecting such training data with a weakly supervised approach to object saliency detection, where only image-level labels, indicating the presence or absence of a target object in an image, are available. The proposed framework is composed of two collaborative CNN modules: an image-level classifier and a pixel-level map generator. While the former distinguishes images containing objects of interest from the rest, the latter is trained to generate saliency maps such that images masked by these maps are classified more accurately by the former. In addition to the top-down guidance from class labels, the map generator is derived by also exploring other cues, including the background prior and superpixel- and object proposal-based evidence. The background prior is introduced to reduce false positives. Evidence from superpixels helps preserve sharp object boundaries. Evidence from object proposals improves the integrity of highlighted objects. These different types of cues greatly regularize the training process and reduce the risk of overfitting, which happens frequently when learning CNN models with little training data. Experiments show that our method achieves superior results, even outperforming fully supervised methods.
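The interplay between the classifier and the map generator described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not give the loss functions, so the toy classifier, the border-based background penalty, the weight `lam`, and every function name below are assumptions chosen only to show the idea that a good saliency map keeps the masked image recognizable while staying off the background.

```python
import math

def mask_image(image, saliency):
    """Element-wise product of an image and a saliency map (2-D lists in [0, 1])."""
    return [[p * s for p, s in zip(img_row, sal_row)]
            for img_row, sal_row in zip(image, saliency)]

def toy_classifier(image):
    """Hypothetical stand-in for the image-level CNN classifier.

    Returns a probability that the target object is present, here a
    logistic function of the mean intensity (an assumption for illustration).
    """
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    return 1.0 / (1.0 + math.exp(-10.0 * (mean - 0.1)))

def classification_loss(image, saliency, label, classifier=toy_classifier):
    """Binary cross-entropy of the classifier on the masked image."""
    prob = classifier(mask_image(image, saliency))
    eps = 1e-12
    return -(label * math.log(prob + eps)
             + (1 - label) * math.log(1.0 - prob + eps))

def border_penalty(saliency):
    """Assumed form of a background prior: mean saliency on the image border."""
    h, w = len(saliency), len(saliency[0])
    border = [saliency[r][c] for r in range(h) for c in range(w)
              if r in (0, h - 1) or c in (0, w - 1)]
    return sum(border) / len(border)

def generator_loss(image, saliency, label, lam=1.0):
    """Classifier-driven term plus the assumed background term, weighted by lam."""
    return classification_loss(image, saliency, label) + lam * border_penalty(saliency)

# Toy 4x4 image: a bright 2x2 object in the centre on a dim background.
image = [[1.0 if r in (1, 2) and c in (1, 2) else 0.2 for c in range(4)]
         for r in range(4)]
object_map = [[1.0 if r in (1, 2) and c in (1, 2) else 0.0 for c in range(4)]
              for r in range(4)]
full_map = [[1.0] * 4 for _ in range(4)]   # highlights everything
empty_map = [[0.0] * 4 for _ in range(4)]  # highlights nothing

loss_object = generator_loss(image, object_map, label=1)
loss_full = generator_loss(image, full_map, label=1)
loss_empty = generator_loss(image, empty_map, label=1)
```

In this toy setting the map covering only the object receives the lowest loss: the empty map hides the object from the classifier, while the full map is penalized by the assumed background term. The superpixel and object-proposal cues mentioned in the abstract are not modelled here.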

Original language: English
Article number: 8720239
Pages (from-to): 5435-5449
Number of pages: 15
Journal: IEEE Transactions on Image Processing
Issue number: 11
State: Published - 1 Nov 2019


  • convolutional neural networks
  • top-down object saliency detection
  • weakly supervised learning

