CS-ViG-UNet: Infrared small and dim target detection based on cycle shift vision graph convolution network

Jian Lin; Shaoyi Li; Xi Yang; Saisai Niu; Binbin Yan; Zhongjie Meng

doi:10.1016/j.eswa.2024.124385

School of Astronautics

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Infrared small and dim target detection benefits from the exploration of correlations among targets, neighboring regions, and the background. However, existing methods that rely on convolutional neural networks and vision transformers cannot effectively capture long-range information correlations within images. To overcome this limitation, this paper proposes CS-ViG-UNet, a framework that introduces vision graph convolution for infrared small and dim target detection. Our framework employs a cyclic shift sparse graph attention mechanism to address the issue of reduced expressive power. Meanwhile, the CS-ViG module is designed to construct an effective graph structure using image patches, thereby capturing feature information relevant to target recognition. On the public datasets Sirst AUG and IRSTD-1K, our method obtained F1 scores of 0.8561 and 0.745, respectively, showing an improvement of 3.15 % and 4.1 % compared to the state-of-the-art methods. On the RTX3090 with TensorRT acceleration, CS-ViG-UNet can process approximately 357 images of size 256 × 256 pixels per second at FP16 precision. For detailed information, please visit our homepage: https://linaom1214.github.io/CSViG-UNet.

Original language	English
Article number	124385
Journal	Expert Systems with Applications
Volume	254
DOIs	https://doi.org/10.1016/j.eswa.2024.124385
State	Published - 15 Nov 2024

Keywords

Infrared small and dim target
Infrared target detection
U-shape architecture
Vision graph network

Cite this

@article{c79ec77728654a43b950ba7f0ce7c6ac,

title = "CS-ViG-UNet: Infrared small and dim target detection based on cycle shift vision graph convolution network",

abstract = "Infrared small and dim target detection benefits from the exploration of correlations among targets, neighboring regions, and the background. However, existing methods that rely on convolutional neural networks and vision transformers cannot effectively capture long-range information correlations within images. To overcome this limitation, this paper proposes CS-ViG-UNet, a framework that introduces vision graph convolution for infrared small and dim target detection. Our framework employs a cyclic shift sparse graph attention mechanism to address the issue of reduced expressive power. Meanwhile, the CS-ViG module is designed to construct an effective graph structure using image patches, thereby capturing feature information relevant to target recognition. On the public datasets Sirst AUG and IRSTD-1K, our method obtained F1 scores of 0.8561 and 0.745, respectively, showing an improvement of 3.15 % and 4.1 % compared to the state-of-the-art methods. On the RTX3090 with TensorRT acceleration, CS-ViG-UNet can process approximately 357 images of size 256 × 256 pixels per second at FP16 precision. For detailed information, please visit our homepage: https://linaom1214.github.io/CSViG-UNet.",

keywords = "Infrared small and dim target, Infrared target detection, U-shape architecture, Vision graph network",

author = "Jian Lin and Shaoyi Li and Xi Yang and Saisai Niu and Binbin Yan and Zhongjie Meng",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Ltd",

year = "2024",

month = nov,

day = "15",

doi = "10.1016/j.eswa.2024.124385",

language = "English",

volume = "254",

journal = "Expert Systems with Applications",

issn = "0957-4174",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - CS-ViG-UNet

T2 - Infrared small and dim target detection based on cycle shift vision graph convolution network

AU - Lin, Jian

AU - Li, Shaoyi

AU - Yang, Xi

AU - Niu, Saisai

AU - Yan, Binbin

AU - Meng, Zhongjie

PY - 2024/11/15

Y1 - 2024/11/15

AB - Infrared small and dim target detection benefits from the exploration of correlations among targets, neighboring regions, and the background. However, existing methods that rely on convolutional neural networks and vision transformers cannot effectively capture long-range information correlations within images. To overcome this limitation, this paper proposes CS-ViG-UNet, a framework that introduces vision graph convolution for infrared small and dim target detection. Our framework employs a cyclic shift sparse graph attention mechanism to address the issue of reduced expressive power. Meanwhile, the CS-ViG module is designed to construct an effective graph structure using image patches, thereby capturing feature information relevant to target recognition. On the public datasets Sirst AUG and IRSTD-1K, our method obtained F1 scores of 0.8561 and 0.745, respectively, showing an improvement of 3.15 % and 4.1 % compared to the state-of-the-art methods. On the RTX3090 with TensorRT acceleration, CS-ViG-UNet can process approximately 357 images of size 256 × 256 pixels per second at FP16 precision. For detailed information, please visit our homepage: https://linaom1214.github.io/CSViG-UNet.

KW - Infrared small and dim target

KW - Infrared target detection

KW - U-shape architecture

KW - Vision graph network

UR - http://www.scopus.com/inward/record.url?scp=85195388306&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2024.124385

DO - 10.1016/j.eswa.2024.124385

M3 - Article

AN - SCOPUS:85195388306

SN - 0957-4174

VL - 254

JO - Expert Systems with Applications

JF - Expert Systems with Applications

M1 - 124385

ER -
