RIFormer: Learning Rotation-Invariant Features Via Transformer

Chao Song, Shaohui Mei, Mingyang Ma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Recently, Transformers have been widely used in many computer vision tasks and have shown promising results. However, like convolutional neural networks (CNNs), Transformers cannot handle rotational variations well, which hinders their further application in the field of remote sensing. In this paper, we design a rotation-invariant Transformer (RIFormer) to alleviate this problem. Moreover, we propose a novel rotation-invariant position embedding (RIPE) to encode the positional information of features; the position-dependent features learned by RIPE are robust to rotations. Experimental results show that the proposed RIFormer with RIPE learns rotation-invariant features effectively compared with state-of-the-art methods while using a limited number of parameters. An open-source implementation of our method is publicly available at https://github.com/psychAo/RIFormer.

Original language: English
Title of host publication: IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5399-5402
Number of pages: 4
ISBN (Electronic): 9798350320107
DOIs
State: Published - 2023
Event: 2023 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2023 - Pasadena, United States
Duration: 16 Jul 2023 - 21 Jul 2023

Publication series

Name: International Geoscience and Remote Sensing Symposium (IGARSS)
Volume: 2023-July

Conference

Conference: 2023 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2023
Country/Territory: United States
City: Pasadena
Period: 16/07/23 - 21/07/23

Keywords

  • feature learning
  • position embedding
  • remote sensing
  • rotation-invariant
  • Transformer
