Bridging CNN and Transformer with Cross-Attention Fusion Network for Hyperspectral Image Classification

Fulin Xu, Shaohui Mei, Ge Zhang, Nan Wang, Qian Du

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

Feature representation is crucial for hyperspectral image (HSI) classification. However, existing convolutional neural network (CNN)-based methods are limited by the local receptive field of the convolution kernel and focus only on local features, causing them to ignore the global properties of HSIs. Transformer-based networks can compensate for this limitation because they emphasize the global features of HSIs. Combining the advantages of these two networks in feature extraction is therefore of great importance for improving classification accuracy. To this end, a cross-attention fusion network bridging CNN and Transformer (CAF-Former) is proposed, which fully exploits the strengths of CNNs in local feature extraction and of Transformers in learning long-range dependencies for HSI classification. To fully explore the local and global information within an HSI, a Dynamic-CNN branch is proposed to effectively encode the local features of pixels, while a Gaussian Transformer branch is constructed to accurately model global features and long-range dependencies. Moreover, to allow local and global features to interact fully, a cross-attention fusion (CAF) module is proposed as a bridge that fuses the features extracted by the two branches. Experiments on several benchmark datasets demonstrate that the proposed CAF-Former significantly outperforms both CNN-based and Transformer-based state-of-the-art networks for HSI classification.
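
The abstract describes a two-branch design whose local (CNN) and global (Transformer) features are fused by cross-attention. As a rough illustration only, not the authors' implementation, the following PyTorch sketch shows how a generic cross-attention fusion block could let local tokens and global tokens attend to each other before classification; all module names, dimensions, and tensor shapes below are assumptions made for clarity.

# Illustrative sketch (not the CAF-Former code): bidirectional cross-attention
# between a "local" token stream (e.g., from a CNN branch) and a "global"
# token stream (e.g., from a Transformer branch).
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Fuse two feature streams via bidirectional multi-head cross-attention."""

    def __init__(self, dim: int = 64, num_heads: int = 4):
        super().__init__()
        # Local tokens query the global tokens, and global tokens query the local tokens.
        self.local_to_global = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.global_to_local = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_local = nn.LayerNorm(dim)
        self.norm_global = nn.LayerNorm(dim)
        self.proj = nn.Linear(2 * dim, dim)  # merge the two enriched streams

    def forward(self, local_tokens: torch.Tensor, global_tokens: torch.Tensor) -> torch.Tensor:
        # local_tokens:  (B, N_l, dim), e.g., flattened CNN patch features
        # global_tokens: (B, N_g, dim), e.g., Transformer output tokens
        local_enriched, _ = self.local_to_global(
            query=local_tokens, key=global_tokens, value=global_tokens
        )
        global_enriched, _ = self.global_to_local(
            query=global_tokens, key=local_tokens, value=local_tokens
        )
        local_out = self.norm_local(local_tokens + local_enriched)
        global_out = self.norm_global(global_tokens + global_enriched)
        # Pool each stream and merge into a single fused representation.
        fused = torch.cat([local_out.mean(dim=1), global_out.mean(dim=1)], dim=-1)
        return self.proj(fused)  # (B, dim), ready for a classification head


if __name__ == "__main__":
    caf = CrossAttentionFusion(dim=64, num_heads=4)
    local_feats = torch.randn(2, 49, 64)   # e.g., 7x7 spatial patch tokens
    global_feats = torch.randn(2, 16, 64)  # e.g., spectral Transformer tokens
    print(caf(local_feats, global_feats).shape)  # torch.Size([2, 64])

The key idea mirrored here is that each branch's tokens serve as queries against the other branch's tokens, so local and global representations are mutually refined before being merged for the final classifier.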

Original language: English
Article number: 5522214
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 62
DOI
Publication status: Published - 2024
