UAVSeg: Dual-Encoder Cross-Scale Attention Network for UAV Images Semantic Segmentation

Zhen Wang, Zhuhong You, Nan Xu, Chuanlei Zhang, De Shuang Huang

Research output: Contribution to journal › Article › peer-review

Abstract

Thanks to their powerful feature extraction and feature correlation modeling capabilities, convolutional neural networks (CNNs) and Transformer models have been widely used for semantic segmentation of unmanned aerial vehicle (UAV) images. However, the ground objects in aerial images carry feature information at different scales, and existing methods directly cascade low-level visual features and high-level semantic features without further processing, resulting in low segmentation precision. To address these challenges, we propose a Dual-Encoder Cross-Scale Attention Network, which efficiently extracts local and global context information from aerial images and performs fine-grained fusion of multi-scale features to improve semantic segmentation performance. First, we introduce the Dual-CNNs-Transformer Encoder, which embeds the Scan-Focus Window Transformer (SFWT) into CNNs as an auxiliary encoder to supplement the local feature information lost during global context extraction. Second, we design the Cross-Scale Lightweight Integration (CSLI) module, which uses a Light Dot-Product Attention Mechanism (DPAM) to fuse multi-scale features and reduce the number of model parameters. Finally, a Linear Multi-Layer Perceptron (LMLP) restores the feature map resolution while expanding the deconvolution receptive field. To validate the effectiveness of the proposed method, we conducted extensive experiments on real aerial scene datasets, including UAVid, Urban Drone, and Aeroscapes. The experimental results show that our method achieves state-of-the-art performance while maintaining superior real-time efficiency. The implementation code will be available at https://github.com/darkseid-arch/UAVSeg.
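
The abstract does not specify how the Light Dot-Product Attention Mechanism is built, so the following is only a generic illustration of the underlying idea: fusing a fine-scale and a coarse-scale feature map with plain scaled dot-product attention. The function name `cross_scale_attention`, the tensor shapes, the residual addition, and the absence of learned projections are all assumptions made for this sketch, not details of UAVSeg.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_scale_attention(high_res, low_res):
    """Hypothetical cross-scale fusion via scaled dot-product attention.

    high_res: (H, W, C) fine-scale features, used as queries.
    low_res:  (h, w, C) coarse-scale features, used as keys and values.
    Returns an (H, W, C) map: high_res plus attended coarse context.
    No learned projections are applied (an assumption of this sketch).
    """
    H, W, C = high_res.shape
    q = high_res.reshape(-1, C)         # (H*W, C) queries
    kv = low_res.reshape(-1, C)         # (h*w, C) keys and values
    scores = q @ kv.T / np.sqrt(C)      # (H*W, h*w) similarity scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    context = weights @ kv              # (H*W, C) coarse context per pixel
    return high_res + context.reshape(H, W, C)


rng = np.random.default_rng(0)
fine = rng.standard_normal((8, 8, 16))
coarse = rng.standard_normal((4, 4, 16))
fused = cross_scale_attention(fine, coarse)
print(fused.shape)  # (8, 8, 16)
```

Each fine-scale position attends over all coarse-scale positions, so the output keeps the fine resolution while mixing in coarse context. A lightweight variant would typically lower the cost of the `scores` matrix, but how UAVSeg does so is not stated in this record.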

Original language: English
Article number: 3502401
Journal: IEEE Transactions on Geoscience and Remote Sensing
Publication status: Accepted/In press - 2024
