Synthesizing Supervision for Learning Deep Saliency Network without Human Annotation

Dingwen Zhang, Junwei Han, Yu Zhang, Dong Xu

Research output: Contribution to journal › Article › peer-review

91 Scopus citations

Abstract

Salient object detection has recently undergone rapid and remarkable development with the wide use of deep neural networks. Trained on large numbers of images annotated with strong pixel-level ground-truth masks, deep salient object detectors have achieved state-of-the-art performance. However, providing pixel-level ground-truth masks for every training image is expensive and time-consuming. To address this problem, this paper proposes one of the earliest frameworks for learning deep salient object detectors without requiring any human annotation. The supervisory signals used in our learning framework are generated through a novel supervision synthesis scheme, whose key insights are 'knowledge source transition' and 'supervision by fusion'. Specifically, the proposed framework dynamically explores both an external knowledge source and an internal knowledge source to provide informative cues for synthesizing the required supervision, and a two-stream fusion mechanism implements the supervision synthesis process. Comprehensive experiments on four benchmark datasets demonstrate that the deep salient object detector trained with the proposed framework often works well without any human-annotated masks, and even approaches its upper bound obtained under fully supervised learning (within a 3 percent performance gap). In addition, we apply the salient object detector learned with our annotation-free framework to the weakly supervised semantic segmentation task, which demonstrates that our approach can also reduce the heavy supplementary supervision required by existing weakly supervised semantic segmentation frameworks.

Original language: English
Article number: 8645692
Pages (from-to): 1755-1769
Number of pages: 15
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 42
Issue number: 7
State: Published - 1 Jul 2020

Keywords

  • Salient object detection
  • annotation-free
  • supervision synthesis
  • weakly supervised semantic segmentation
