TY - JOUR
T1 - Weakly Supervised Salient Object Detection by Learning A Classifier-Driven Map Generator
AU - Hsu, Kuang-Jui
AU - Lin, Yen-Yu
AU - Chuang, Yung-Yu
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - Top-down saliency detection aims to highlight the regions of a specific object category and typically relies on pixel-wise annotated training data. In this paper, we address the high cost of collecting such training data with a weakly supervised approach to object saliency detection, where only image-level labels, indicating the presence or absence of a target object in an image, are available. The proposed framework is composed of two collaborative CNN modules, an image-level classifier and a pixel-level map generator. While the former distinguishes images with objects of interest from the rest, the latter is learned to generate saliency maps so that images masked by these maps can be better predicted by the former. In addition to the top-down guidance from class labels, the map generator is derived by also exploring other cues, including the background prior and superpixel- and object proposal-based evidence. The background prior is introduced to reduce false positives. Evidence from superpixels helps preserve sharp object boundaries. The cue from object proposals improves the integrity of highlighted objects. These different types of cues greatly regularize the training process and reduce the risk of overfitting, which happens frequently when learning CNN models from limited training data. Experiments show that our method achieves superior results, even outperforming fully supervised methods.
KW - Top-down object saliency detection
KW - convolutional neural networks
KW - weakly supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85071434280&partnerID=8YFLogxK
U2 - 10.1109/TIP.2019.2917224
DO - 10.1109/TIP.2019.2917224
M3 - Article
C2 - 31135360
AN - SCOPUS:85071434280
SN - 1057-7149
VL - 28
SP - 5435
EP - 5449
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 11
M1 - 8720239
ER -