An adaptive multi-sensor visual attention model

Wenbai Chen, Jingchen Li, Haobin Shi, Kao-Shing Hwang

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Emerging recurrent visual attention models mostly rely on a single sensor that repeatedly captures features from the input, which requires a carefully suited sensor design. Researchers usually need many attempts to determine the optimal structure for the sensor and its corresponding modules. In this work, an adaptive multi-sensor visual attention model (AM-MA) is proposed to enhance the recurrent visual attention model. The proposed model uses several sensors to observe the original input recurrently, and the number of sensors can grow adaptively. Each sensor generates a hidden state and is followed by a location network that provides the deployment scheme. We design a self-evaluation mechanism for AM-MA, by which it decides whether to add new sensors during training. In addition, AM-MA leverages a fine-tuning mechanism to avoid a lengthy training process. AM-MA is a parameter-insensitive model: researchers need not pre-train the model to find an optimal structure when the task complexity is unknown. Experimental results show that the proposed AM-MA not only outperforms the renowned sensor-based attention model on image classification tasks, but also achieves satisfactory results when given an inappropriate structure.
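The architecture sketched in the abstract can be illustrated with a minimal, heavily simplified NumPy toy. All dimensions, the random glimpse policy, the linear layers, and the self-evaluation threshold below are hypothetical stand-ins (the paper does not publish these details in the abstract): each sensor extracts a glimpse patch, updates a recurrent hidden state, and its location network proposes where to look next; a crude self-evaluation score decides whether to add another sensor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- not taken from the paper.
IMG, PATCH, HIDDEN = 28, 8, 16

def glimpse(image, loc):
    """Extract a PATCH x PATCH crop at `loc`, clipped to image bounds."""
    r = int(np.clip(loc[0], 0, IMG - PATCH))
    c = int(np.clip(loc[1], 0, IMG - PATCH))
    return image[r:r + PATCH, c:c + PATCH].ravel()

class Sensor:
    """One recurrent sensor: hidden state plus a linear 'location network'."""
    def __init__(self):
        self.W_in = rng.normal(0, 0.1, (HIDDEN, PATCH * PATCH))
        self.W_h = rng.normal(0, 0.1, (HIDDEN, HIDDEN))
        self.W_loc = rng.normal(0, 0.1, (2, HIDDEN))
        self.h = np.zeros(HIDDEN)
        self.loc = rng.uniform(0, IMG - PATCH, 2)

    def step(self, image):
        g = glimpse(image, self.loc)
        # Recurrent hidden-state update from the current glimpse.
        self.h = np.tanh(self.W_in @ g + self.W_h @ self.h)
        # Location network maps the hidden state to the next glimpse location.
        self.loc = (np.tanh(self.W_loc @ self.h) + 1) / 2 * (IMG - PATCH)

image = rng.normal(size=(IMG, IMG))
sensors = [Sensor()]
THRESH = 0.5  # hypothetical self-evaluation threshold

for t in range(4):  # recurrent glimpse steps
    for s in sensors:
        s.step(image)
    # Stand-in self-evaluation: if mean hidden activity is weak, add a sensor.
    score = np.mean([np.abs(s.h).mean() for s in sensors])
    if score < THRESH:
        sensors.append(Sensor())

print(len(sensors), sensors[0].h.shape)
```

In the actual model the sensors and location networks are trained (e.g. with reinforcement-style rewards, as in glimpse-based attention models), and the self-evaluation mechanism is learned rather than a fixed activity threshold; this sketch only shows the data flow of multiple recurrent sensors with adaptive growth.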

Original language: English
Pages (from-to): 7241-7252
Number of pages: 12
Journal: Neural Computing and Applications
Volume: 34
Issue number: 9
State: Published - May 2022

Keywords

  • Attention mechanism
  • Neural network
  • Visual attention model

