TY - JOUR
T1 - CS-ViG-UNet
T2 - Infrared small and dim target detection based on cycle shift vision graph convolution network
AU - Lin, Jian
AU - Li, Shaoyi
AU - Yang, Xi
AU - Niu, Saisai
AU - Yan, Binbin
AU - Meng, Zhongjie
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/11/15
Y1 - 2024/11/15
N2 - Infrared small and dim target detection benefits from exploring correlations among targets, neighboring regions, and the background. However, existing methods based on convolutional neural networks and vision transformers cannot effectively capture long-range correlations within images. To overcome this limitation, this paper proposes CS-ViG-UNet, a framework that introduces vision graph convolution for infrared small and dim target detection. Our framework employs a cyclic shift sparse graph attention mechanism to address the issue of reduced expressive power. Meanwhile, the CS-ViG module is designed to construct an effective graph structure from image patches, thereby capturing feature information relevant to target recognition. On the public datasets Sirst AUG and IRSTD-1K, our method obtained F1 scores of 0.8561 and 0.745, respectively, improvements of 3.15% and 4.1% over state-of-the-art methods. On an RTX 3090 GPU with TensorRT acceleration, CS-ViG-UNet processes approximately 357 images of 256 × 256 pixels per second at FP16 precision. For detailed information, please visit our homepage: https://linaom1214.github.io/CSViG-UNet.
KW - Infrared small and dim target
KW - Infrared target detection
KW - U-shape architecture
KW - Vision graph network
UR - http://www.scopus.com/inward/record.url?scp=85195388306&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.124385
DO - 10.1016/j.eswa.2024.124385
M3 - Article
AN - SCOPUS:85195388306
SN - 0957-4174
VL - 254
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 124385
ER -