Abstract
Multimodal cross-city semantic segmentation aims to adapt a network trained on multiple labeled source domains (MSDs) from one city to multiple unlabeled target domains (MTDs) in another city, where the multiple domains correspond to different sensor modalities. However, remote sensing data from different sensors enlarges the domain shift in the fused domain space, making feature alignment more challenging. Meanwhile, traditional fusion methods consider only the complementarity within MSDs (or MTDs), which wastes relevant cross-domain information and neglects control over domain shift. To address these issues, we propose a similarity-inspired fusion and invertible transformation learning network (SFITNet) for multimodal cross-city semantic segmentation. To alleviate the increased alignment difficulty in multimodal fused domains, an invertible transformation learning strategy (ITLS) is proposed, which adopts a topological perspective on unsupervised domain adaptation. After feature fusion, this strategy uses invertible neural networks (INNs) to approximate the underlying distribution transformation function between the MSD and the MTD, thereby performing distribution alignment independently within the two feature spaces. A cross-domain similarity-inspired information interaction module (CDSiM) is also designed; it considers the correspondence between the MSD and the MTD in the fusion stage, effectively exploits multimodal complementary information, and facilitates the subsequent alignment of the fused domains. Semantic segmentation experiments are conducted on the public C2Seg-AB dataset and a new multimodal cross-city Su-Wu dataset. Compared with several state-of-the-art techniques, the experimental results demonstrate the superiority of the proposed SFITNet.
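The abstract does not state which INN architecture ITLS uses. As a minimal sketch of the general idea of an invertible feature transformation, the example below uses an affine coupling layer, one common INN building block. This is an assumption for illustration, not the authors' implementation. The function names, the toy conditioner, and the weights `W_S` and `W_T` are all hypothetical. The point it shows is that a feature vector mapped forward can be recovered exactly by the inverse pass.

```python
import math


def _scale_shift(x_a, w_s, w_t):
    """Toy conditioner (hypothetical): scale and shift computed from the untouched half."""
    s = [math.tanh(sum(w * v for w, v in zip(row, x_a))) for row in w_s]
    t = [sum(w * v for w, v in zip(row, x_a)) for row in w_t]
    return s, t


def coupling_forward(x, w_s, w_t):
    """Map x -> y: the first half passes through, the second half is scaled and shifted."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = _scale_shift(x_a, w_s, w_t)
    y_b = [b * math.exp(si) + ti for b, si, ti in zip(x_b, s, t)]
    return x_a + y_b


def coupling_inverse(y, w_s, w_t):
    """Map y -> x: recompute scale and shift from the unchanged half, then undo them."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    s, t = _scale_shift(y_a, w_s, w_t)
    x_b = [(b - ti) * math.exp(-si) for b, si, ti in zip(y_b, s, t)]
    return y_a + x_b


# Hypothetical fixed weights; in a trained network these would be learned.
W_S = [[0.5, -0.3], [0.1, 0.8]]
W_T = [[1.0, 0.2], [-0.4, 0.6]]

# A stand-in for one fused source-domain feature vector (illustrative values).
source_feature = [0.7, -1.2, 2.0, 0.3]

mapped = coupling_forward(source_feature, W_S, W_T)
recovered = coupling_inverse(mapped, W_S, W_T)

# Round-trip error should be at floating-point precision.
max_error = max(abs(a - b) for a, b in zip(source_feature, recovered))
```

Because the inverse is available in closed form, a transformation of this kind can carry features from one domain's space to the other's and back without losing information, which is the property an invertible mapping between MSD and MTD feature spaces relies on.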
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| State | Accepted/In press - 2025 |
Keywords
- Domain shift
- invertible transformation learning
- multimodal cross-city semantic segmentation
- similarity-inspired
- unsupervised domain adaptation
Title
Multimodal Cross-City Semantic Segmentation Based on Similarity-Inspired Fusion and Invertible Transformation Learning Network