TY - JOUR
T1 - Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement
AU - Yan, Yijun
AU - Ren, Jinchang
AU - Sun, Genyun
AU - Zhao, Huimin
AU - Han, Junwei
AU - Li, Xuelong
AU - Marshall, Stephen
AU - Zhan, Jin
N1 - Publisher Copyright:
© 2018 Elsevier Ltd
PY - 2018/7
Y1 - 2018/7
AB - Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct types of attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, such as color and location, whilst the top-down mechanism biases our attention based on prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose in this paper an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the bottom-up mechanism is modeled under the guidance of the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure-ground in relation to color and spatial contrast at the level of regions and objects to produce a feature contrast map. The top-down mechanism uses a formal computational model to describe the background connectivity of attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, our results demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging datasets, in terms of higher precision and recall rates, AP (average precision), and AUC (area under curve) values.
KW - Background connectivity
KW - Feature fusion
KW - Gestalt laws guided optimization
KW - Human vision perception
KW - Image saliency detection
UR - http://www.scopus.com/inward/record.url?scp=85044678901&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2018.02.004
DO - 10.1016/j.patcog.2018.02.004
M3 - Article
AN - SCOPUS:85044678901
SN - 0031-3203
VL - 79
SP - 65
EP - 78
JO - Pattern Recognition
JF - Pattern Recognition
ER -