PiCANet: Pixel-Wise Contextual Attention Learning for Accurate Saliency Detection

Nian Liu; Junwei Han; Ming Hsuan Yang

doi:10.1109/TIP.2020.2988568

School of Automation

Research output: Contribution to journal › Article › peer-review

117 Scopus citations

Abstract

Existing saliency models typically incorporate contexts holistically. However, for each pixel, usually only part of its context region contributes to saliency prediction, while other parts are likely either noise or distractions. In this paper, we propose a novel pixel-wise contextual attention network (PiCANet) to selectively attend to informative context locations at each pixel. The proposed PiCANet generates an attention map over the contextual region of each pixel and constructs attentive contextual features by selectively incorporating the features of useful context locations. We present three formulations of the PiCANet by embedding the pixel-wise contextual attention mechanism into the pooling and convolution operations while attending to global or local contexts. All three formulations are fully differentiable and can be integrated with convolutional neural networks for joint training. In this work, we integrate the proposed PiCANets into a U-Net model for salient object detection. The generated global and local attention maps can learn to incorporate global contrast and regional smoothness, which helps localize and highlight salient objects more accurately and uniformly. Experimental results show that the proposed PiCANets perform favorably against state-of-the-art methods for saliency detection. Furthermore, we demonstrate the effectiveness and generalization ability of the PiCANets on semantic segmentation and object detection, where they improve performance.
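
The attentive pooling idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `attentive_global_pooling` is hypothetical, the attention logits are taken as a given input rather than produced by a learned sub-network, and softmax normalization over context locations is assumed. It covers only the global-context case, where every pixel attends over all locations of the feature map.

```python
import numpy as np

def attentive_global_pooling(features, logits):
    """Per-pixel attention-weighted sum over all context locations.

    features: array of shape (H, W, C), the feature map.
    logits:   array of shape (H, W, H*W), unnormalized attention scores;
              logits[y, x, i] scores context location i for pixel (y, x).
              (Assumed to be given; a real model would learn to predict them.)
    """
    h, w, c = features.shape
    assert logits.shape == (h, w, h * w)

    # Softmax over the context axis (shifted by the max for numerical stability).
    shifted = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=-1, keepdims=True)

    # Weighted sum of context features: (H, W, H*W) @ (H*W, C) -> (H, W, C).
    context = features.reshape(h * w, c)
    return weights @ context
```

With all-zero logits the weights are uniform and every pixel receives the global average of the features, which corresponds to holistic context pooling; non-uniform logits let each pixel emphasize some context locations and suppress others.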

Original language	English
Article number	9076883
Pages (from-to)	6438-6451
Number of pages	14
Journal	IEEE Transactions on Image Processing
Volume	29
DOIs	https://doi.org/10.1109/TIP.2020.2988568
State	Published - 2020

Keywords

attention network
global context
local context
object detection
saliency detection
semantic segmentation

Cite this

@article{7049f076f3a8496a9e5209f5ba883e1e,

title = "PiCANet: Pixel-Wise Contextual Attention Learning for Accurate Saliency Detection",

abstract = "Existing saliency models typically incorporate contexts holistically. However, for each pixel, usually only part of its context region contributes to saliency prediction, while other parts are likely either noise or distractions. In this paper, we propose a novel pixel-wise contextual attention network (PiCANet) to selectively attend to informative context locations at each pixel. The proposed PiCANet generates an attention map over the contextual region of each pixel and constructs attentive contextual features by selectively incorporating the features of useful context locations. We present three formulations of the PiCANet by embedding the pixel-wise contextual attention mechanism into the pooling and convolution operations while attending to global or local contexts. All three formulations are fully differentiable and can be integrated with convolutional neural networks for joint training. In this work, we integrate the proposed PiCANets into a U-Net model for salient object detection. The generated global and local attention maps can learn to incorporate global contrast and regional smoothness, which helps localize and highlight salient objects more accurately and uniformly. Experimental results show that the proposed PiCANets perform favorably against state-of-the-art methods for saliency detection. Furthermore, we demonstrate the effectiveness and generalization ability of the PiCANets on semantic segmentation and object detection, where they improve performance.",

keywords = "attention network, global context, local context, object detection, saliency detection, semantic segmentation",

author = "Nian Liu and Junwei Han and Yang, {Ming Hsuan}",

note = "Publisher Copyright: {\textcopyright} 1992-2012 IEEE.",

year = "2020",

doi = "10.1109/TIP.2020.2988568",

language = "English",

volume = "29",

pages = "6438--6451",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - PiCANet

T2 - Pixel-Wise Contextual Attention Learning for Accurate Saliency Detection

AU - Liu, Nian

AU - Han, Junwei

AU - Yang, Ming Hsuan

PY - 2020

Y1 - 2020

N2 - Existing saliency models typically incorporate contexts holistically. However, for each pixel, usually only part of its context region contributes to saliency prediction, while other parts are likely either noise or distractions. In this paper, we propose a novel pixel-wise contextual attention network (PiCANet) to selectively attend to informative context locations at each pixel. The proposed PiCANet generates an attention map over the contextual region of each pixel and constructs attentive contextual features by selectively incorporating the features of useful context locations. We present three formulations of the PiCANet by embedding the pixel-wise contextual attention mechanism into the pooling and convolution operations while attending to global or local contexts. All three formulations are fully differentiable and can be integrated with convolutional neural networks for joint training. In this work, we integrate the proposed PiCANets into a U-Net model for salient object detection. The generated global and local attention maps can learn to incorporate global contrast and regional smoothness, which helps localize and highlight salient objects more accurately and uniformly. Experimental results show that the proposed PiCANets perform favorably against state-of-the-art methods for saliency detection. Furthermore, we demonstrate the effectiveness and generalization ability of the PiCANets on semantic segmentation and object detection, where they improve performance.

KW - attention network

KW - global context

KW - local context

KW - object detection

KW - saliency detection

KW - semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85087731423&partnerID=8YFLogxK

U2 - 10.1109/TIP.2020.2988568

DO - 10.1109/TIP.2020.2988568

M3 - Article

AN - SCOPUS:85087731423

SN - 1057-7149

VL - 29

SP - 6438

EP - 6451

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

M1 - 9076883

ER -
