Exploring Efficient Open-Vocabulary Segmentation in the Remote Sensing

  • Bingyu Li
  • Haocheng Dong
  • Da Zhang
  • Zhiyuan Zhao
  • Hao Sun
  • Junyu Gao
  • University of Science and Technology of China
  • Institute of Artificial Intelligence (TeleAI)
  • Northwestern Polytechnical University Xian

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

Open-Vocabulary Remote Sensing Image Segmentation (OVRSIS), an emerging task that adapts Open-Vocabulary Segmentation (OVS) to the remote sensing (RS) domain, remains underexplored due to the absence of a unified evaluation benchmark and the domain gap between natural and RS images. To bridge these gaps, we first establish a standardized OVRSIS benchmark (OVRSISBench) based on widely used RS segmentation datasets, enabling consistent evaluation across methods. Using this benchmark, we comprehensively evaluate several representative OVS/OVRSIS models and reveal their limitations when directly applied to remote sensing scenarios. Building on these insights, we propose RSKT-Seg, a novel open-vocabulary segmentation framework tailored for remote sensing. RSKT-Seg integrates three key components: (1) a Multi-Directional Cost Map Aggregation (RS-CMA) module that captures rotation-invariant visual cues by computing vision-language cosine similarities across multiple directions; (2) an Efficient Cost Map Fusion (RS-Fusion) transformer, which jointly models spatial and semantic dependencies with a lightweight dimensionality reduction strategy; and (3) a Remote Sensing Knowledge Transfer (RS-Transfer) module that injects pre-trained knowledge and facilitates domain adaptation via enhanced upsampling. Extensive experiments on the benchmark show that RSKT-Seg consistently outperforms strong OVS baselines by +3.8 mIoU and +5.9 mACC, while achieving 2× faster inference through efficient aggregation.
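
The abstract describes RS-CMA only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the "multiple directions" are rotations by multiples of 90 degrees and that the per-direction cost maps are averaged; both choices, along with the names `cosine_cost_map`, `multi_directional_cost_map`, and the stand-in `toy_encode`, are hypothetical and not taken from the paper.

```python
import numpy as np

def cosine_cost_map(pixel_feats, text_embeds, eps=1e-8):
    """Cosine similarity between every pixel feature and every class embedding.

    pixel_feats: (H, W, D) array, text_embeds: (C, D) array -> (H, W, C) array.
    """
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=-1, keepdims=True) + eps)
    t = text_embeds / (np.linalg.norm(text_embeds, axis=-1, keepdims=True) + eps)
    return p @ t.T

def multi_directional_cost_map(image, text_embeds, encode, ks=(0, 1, 2, 3)):
    """Average cost maps computed on rotated copies of the input.

    Assumption: directions are 90-degree rotations and fusion is a plain mean.
    """
    maps = []
    for k in ks:
        rotated = np.rot90(image, k=k, axes=(0, 1))
        cost = cosine_cost_map(encode(rotated), text_embeds)
        # Rotate the cost map back so all directions align spatially.
        maps.append(np.rot90(cost, k=-k, axes=(0, 1)))
    return np.mean(maps, axis=0)

def toy_encode(img):
    """Hypothetical direction-sensitive encoder standing in for a vision backbone."""
    return img + 0.5 * np.roll(img, 1, axis=1)

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8, 16))   # toy "image" that already carries D=16 features
txt = rng.normal(size=(5, 16))      # toy text embeddings for C=5 class names
agg = multi_directional_cost_map(img, txt, toy_encode)
```

Because the four rotations form a closed group, rotating the input by 90 degrees simply rotates the aggregated map in this sketch, even though `toy_encode` itself is direction-sensitive. That is one simple way to obtain the rotation robustness the abstract refers to; the paper's module may achieve it differently.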

Original language: English
Pages (from-to): 5982-5991
Number of pages: 10
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 40
Issue number: 8
DOIs
State: Published - 2026
Event: 40th AAAI Conference on Artificial Intelligence, AAAI 2026 - Singapore, Singapore
Duration: 20 Jan 2026 to 27 Jan 2026
