An adaptive multi-sensor visual attention model

Wenbai Chen, Jingchen Li, Haobin Shi, Kao Shing Hwang

Research output: Contribution to journal › Article › peer review

1 citation (Scopus)

Abstract

Emerging recurrent visual attention models mostly rely on a single sensor to continuously capture features from the input, which requires a suitable design for that sensor. Researchers usually need many attempts to determine the optimal structure for the sensor and its corresponding modules. In this work, an adaptive multi-sensor visual attention model (AM-MA) is proposed to enhance the recurrent visual attention model. The proposed model uses several sensors to observe the original input recurrently, and more sensors can be added adaptively. Each sensor generates a hidden state and is followed by a location network that provides the deployment scheme. We design a self-evaluation mechanism for AM-MA, by which the model decides whether to add new sensors during training. In addition, AM-MA leverages a fine-tuning mechanism to avoid a lengthy training process. AM-MA is a parameter-insensitive model: researchers do not need to pre-train the model to find the optimal structure when the task complexity is unknown. Experimental results show that the proposed AM-MA not only outperforms the renowned sensor-based attention model on image classification tasks, but also achieves satisfactory results when given an inappropriate structure.
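The self-evaluation idea described in the abstract can be sketched as follows. This is a minimal illustrative sketch only: the function names, the stall window, and the improvement threshold are assumptions for exposition, not the paper's actual criterion or training code.

```python
def should_add_sensor(loss_history, window=3, min_improvement=0.01):
    """Self-evaluation (hypothetical): signal that a new sensor should be
    added when the training loss has stopped improving meaningfully over
    the last `window` epochs."""
    if len(loss_history) < window + 1:
        return False
    recent = loss_history[-(window + 1):]
    improvement = recent[0] - recent[-1]
    return improvement < min_improvement

def train(losses_per_epoch, max_sensors=4):
    """Toy training loop: grows the sensor count whenever the
    self-evaluation mechanism reports a stall, up to max_sensors."""
    num_sensors = 1
    history = []
    for loss in losses_per_epoch:
        history.append(loss)
        if num_sensors < max_sensors and should_add_sensor(history):
            num_sensors += 1
            history = [loss]  # restart the evaluation window after growth
    return num_sensors

# A loss curve that stalls triggers one sensor addition:
print(train([1.0, 0.9, 0.895, 0.893, 0.892]))  # → 2
```

A steadily improving loss curve never triggers the mechanism, so the model keeps its single initial sensor; this mirrors the abstract's claim that structure is grown only when needed.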

Original language: English
Pages (from-to): 7241-7252
Number of pages: 12
Journal: Neural Computing and Applications
Volume: 34
Issue: 9
DOI
Publication status: Published - May 2022
