Progressive Learning Vision Transformer for Open Set Recognition of Fine-Grained Objects in Remote Sensing Images

Yimin Fu, Zhunga Liu, Zuowei Zhang

Research output: Journal article, peer-reviewed

13 citations (Scopus)

Abstract

Open set recognition (OSR) aims to classify known classes and recognize unknown classes simultaneously. Existing OSR methods have primarily focused on learning decision boundaries based on overall feature representations, and have achieved good performance on various coarse-grained image datasets. However, the overall feature representations of objects in fine-grained image datasets are highly similar, making it difficult to distinguish between known and unknown classes by overall feature-based decision boundaries. To address this problem, we propose a progressive learning vision transformer (PLViT) with a coarse-to-fine optimization strategy. In PLViT, the overall feature representations are first optimized in the distance space to learn the initial decision boundaries. Then, a context-aware patch selection module is designed to locate the discriminative part regions. Afterward, the multilayer representations of each selected patch are aggregated according to the self-attention weights, and input into the last transformer layer to extract local feature representations. Finally, overall and local feature representations are adaptively fused and optimized in the angular space to further refine the decision boundaries. Experimental results on four fine-grained remote sensing object recognition datasets show that PLViT outperforms state-of-the-art methods.
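The abstract's pipeline can be loosely illustrated with a toy sketch. This is not the authors' implementation: the function names, the plain top-k rule standing in for the context-aware patch selection module, the normalized weighted average standing in for the attention-based multilayer aggregation, and the fixed distance threshold are all assumptions made for illustration. The adaptive fusion and angular-space optimization steps are not modeled.

```python
import math


def select_patches(cls_attention, k):
    """Return indices of the k highest-scoring patches (assumed top-k rule)."""
    ranked = sorted(range(len(cls_attention)),
                    key=lambda i: cls_attention[i], reverse=True)
    return ranked[:k]


def aggregate_layers(layer_features, layer_weights):
    """Weighted average of one patch's per-layer feature vectors.

    layer_features: one feature vector per transformer layer.
    layer_weights:  one attention-derived weight per layer (assumed given).
    """
    total = sum(layer_weights)
    dim = len(layer_features[0])
    return [
        sum(w * f[d] for w, f in zip(layer_weights, layer_features)) / total
        for d in range(dim)
    ]


def open_set_predict(feature, prototypes, threshold):
    """Label of the nearest class prototype, or 'unknown' if it is too far.

    prototypes: dict mapping known class name -> prototype vector.
    threshold:  hypothetical rejection radius in Euclidean distance.
    """
    label, dist = min(
        ((name, math.dist(feature, proto)) for name, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else "unknown"
```

In this sketch a sample is rejected as unknown whenever it lies farther than `threshold` from every known-class prototype, which is the basic idea of a distance-based decision boundary; how PLViT actually learns and refines its boundaries is described only at a high level in the abstract.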

Original language: English
Article number: 5215113
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 61
Publication status: Published - 2023
