RGB-T Object Detection via Group Shuffled Multi-receptive Attention and Multi-modal Supervision

Jinzhong Wang, Xuetao Tian, Shun Dai, Tao Zhuo, Haorui Zeng, Hongjuan Liu, Jiaqi Liu, Xiuwei Zhang, Yanning Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multispectral object detection, which uses both visible (RGB) and thermal infrared (T) modalities, has attracted significant attention for its robust performance across diverse weather and lighting conditions. However, effectively exploiting the complementarity between the RGB and thermal modalities while maintaining efficiency remains a critical challenge. In this paper, a simple Group Shuffled Multi-receptive Attention (GSMA) module is proposed to extract and combine multi-scale RGB and thermal features. The extracted multi-modal features are then directly integrated with a multi-level path aggregation neck, which significantly improves both fusion quality and efficiency. Meanwhile, multi-modal object detection often adopts union annotations for both modalities. This kind of supervision is insufficient and unfair, since objects observed in one modality may not be visible in the other. To solve this issue, Multi-modal Supervision (MS) is proposed to sufficiently supervise RGB-T object detection. Comprehensive experiments on two challenging benchmarks, KAIST and DroneVehicle, demonstrate that the proposed model achieves state-of-the-art accuracy while maintaining competitive efficiency.
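
The concern about union annotations can be illustrated with a small sketch. It is not the paper's Multi-modal Supervision method, whose details the abstract does not give; it only shows why a union label set can ask one modality to detect objects it cannot observe. The object IDs and the helper names `union_annotations` and `unseen_in` are hypothetical.

```python
# Hypothetical object IDs visible in each modality for one RGB-T image pair.
rgb_visible = {"car_1", "person_2"}
thermal_visible = {"person_2", "person_3"}


def union_annotations(rgb, thermal):
    """Union labelling: every object seen in at least one modality."""
    return rgb | thermal


def unseen_in(modality_visible, labels):
    """Labelled objects that the given modality does not actually observe."""
    return labels - modality_visible


labels = union_annotations(rgb_visible, thermal_visible)
rgb_unseen = unseen_in(rgb_visible, labels)          # labelled, but not visible in RGB
thermal_unseen = unseen_in(thermal_visible, labels)  # labelled, but not visible in thermal

print(sorted(labels))
print(sorted(rgb_unseen))
print(sorted(thermal_unseen))
```

Under these assumed inputs, each modality is supervised with one label for an object it cannot see, which is the mismatch the abstract describes as insufficient and unfair.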

Original language: English
Title of host publication: Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 284-298
Number of pages: 15
ISBN (Print): 9783031784460
State: Published - 2025
Event: 27th International Conference on Pattern Recognition, ICPR 2024 - Kolkata, India
Duration: 1 Dec 2024 → 5 Dec 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15317 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Pattern Recognition, ICPR 2024
Country/Territory: India
City: Kolkata
Period: 1/12/24 → 5/12/24

Keywords

  • Attention mechanism
  • Group shuffle
  • Multi-modal supervision
  • Multispectral object detection
