TY - JOUR
T1 - DGT
T2 - Deformable Graph Transformer for Hyperspectral Image Classification
AU - Lu, Yingjie
AU - Wang, Xiaofei
AU - Mei, Shaohui
AU - Xu, Fulin
AU - Ma, Mingyang
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Transformers can model global context to enhance the performance of hyperspectral image (HSI) classification. However, the explored global information is generally confined to the spatial neighborhood of target pixels. To fully leverage global correlation across broader areas, a deformable graph transformer (DGT) is proposed for HSI classification, in which the global information within an entire image is explored to improve classification performance. Specifically, DGT layers are designed to adaptively sample virtual nodes at varying distances from an initial graph constructed from an image, by which the global spatial information can be explored using a deformable graph self-attention mechanism. Moreover, a learnable absolute position encoding (LAPE) module is constructed to enhance the spatial context awareness of DGT by integrating positional information into the graph nodes. Additionally, graph structure encoding and graph topology encoding are further designed as inductive biases for the graph, by which both local structural information and global topological information of the HSI are captured to enhance the feature extraction capability of the DGT layer. Ultimately, through the stacking of multiple DGT layers, a composite feature fusion learning (CFFL) module is employed to fully utilize both the simple low-level and the complex abstract high-level features extracted from different layers. Extensive experiments on four datasets demonstrate the superiority and robustness of the proposed DGT over several state-of-the-art methods in terms of various evaluation criteria.
AB - Transformers can model global context to enhance the performance of hyperspectral image (HSI) classification. However, the explored global information is generally confined to the spatial neighborhood of target pixels. To fully leverage global correlation across broader areas, a deformable graph transformer (DGT) is proposed for HSI classification, in which the global information within an entire image is explored to improve classification performance. Specifically, DGT layers are designed to adaptively sample virtual nodes at varying distances from an initial graph constructed from an image, by which the global spatial information can be explored using a deformable graph self-attention mechanism. Moreover, a learnable absolute position encoding (LAPE) module is constructed to enhance the spatial context awareness of DGT by integrating positional information into the graph nodes. Additionally, graph structure encoding and graph topology encoding are further designed as inductive biases for the graph, by which both local structural information and global topological information of the HSI are captured to enhance the feature extraction capability of the DGT layer. Ultimately, through the stacking of multiple DGT layers, a composite feature fusion learning (CFFL) module is employed to fully utilize both the simple low-level and the complex abstract high-level features extracted from different layers. Extensive experiments on four datasets demonstrate the superiority and robustness of the proposed DGT over several state-of-the-art methods in terms of various evaluation criteria.
KW - deformable graph self-attention
KW - graph structure and topology encoding
KW - graph transformer
KW - hyperspectral image classification
UR - http://www.scopus.com/inward/record.url?scp=85207141895&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3476327
DO - 10.1109/TGRS.2024.3476327
M3 - Article
AN - SCOPUS:85207141895
SN - 0196-2892
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
ER -