PiCANet: Pixel-Wise Contextual Attention Learning for Accurate Saliency Detection

Nian Liu, Junwei Han, Ming Hsuan Yang

Research output: Contribution to journal › Article › peer-review

117 Scopus citations

Abstract

Existing saliency models typically incorporate contexts holistically. However, for each pixel, usually only part of its context region contributes to saliency prediction, while the other parts are likely noise or distractions. In this paper, we propose a novel pixel-wise contextual attention network (PiCANet) that selectively attends to informative context locations at each pixel. The proposed PiCANet generates an attention map over the contextual region of each pixel and constructs attentive contextual features by selectively incorporating the features of useful context locations. We present three formulations of the PiCANet, obtained by embedding the pixel-wise contextual attention mechanism into pooling and convolution operations that attend to global or local contexts. All three models are fully differentiable and can be trained jointly with convolutional neural networks. In this work, we integrate the proposed PiCANets into a U-Net model for salient object detection. The generated global and local attention maps can learn to incorporate global contrast and regional smoothness, which helps localize and highlight salient objects more accurately and uniformly. Experimental results show that the proposed PiCANets perform effectively for saliency detection compared with state-of-the-art methods. Furthermore, we demonstrate the effectiveness and generalization ability of the PiCANets on semantic segmentation and object detection, where they improve performance.
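The attention-then-pool idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how attention scores are produced or normalized, so the sketch assumes a softmax over the context region and takes the raw scores from a hypothetical callable `score_fn(i, j, y, x)` that stands in for whatever learned sub-network generates them. The names `attentive_context_pooling` and `radius` are likewise illustrative. Passing `radius=None` attends to the whole feature map (global context), and an integer `radius` restricts attention to a local window around each pixel.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_context_pooling(features, score_fn, radius=None):
    """Pool context features at every pixel using per-pixel attention weights.

    features: H x W x C nested lists of floats.
    score_fn: hypothetical stand-in for a learned scorer; returns the raw
              attention score of context location (y, x) for pixel (i, j).
    radius:   None for global context, or r for a (2r+1) x (2r+1) local
              window clipped at the image border.
    """
    height, width = len(features), len(features[0])
    channels = len(features[0][0])
    output = [[None] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            if radius is None:
                rows, cols = range(height), range(width)
            else:
                rows = range(max(0, i - radius), min(height, i + radius + 1))
                cols = range(max(0, j - radius), min(width, j + radius + 1))
            context = [(y, x) for y in rows for x in cols]
            # Assumed normalization: softmax over this pixel's context region.
            weights = softmax([score_fn(i, j, y, x) for y, x in context])
            # Attention-weighted sum of the context features.
            pooled = [0.0] * channels
            for w, (y, x) in zip(weights, context):
                for c in range(channels):
                    pooled[c] += w * features[y][x][c]
            output[i][j] = pooled
    return output


# Toy 2 x 2 single-channel feature map with constant scores.
toy = [[[1.0], [2.0]], [[3.0], [4.0]]]
uniform = attentive_context_pooling(toy, lambda i, j, y, x: 0.0)
```

With constant scores the weights are uniform, so the global variant reduces to plain average pooling over the whole map. A scorer that favors some locations over others makes each pixel draw on a different subset of its context, which is the selective behavior the abstract describes.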

Original language: English
Article number: 9076883
Pages (from-to): 6438-6451
Number of pages: 14
Journal: IEEE Transactions on Image Processing
Volume: 29
State: Published - 2020

Keywords

  • attention network
  • global context
  • local context
  • object detection
  • saliency detection
  • semantic segmentation
