
CDFNet: Cross-dimension fusion network with dual feature enhancement for multimodal object detection

  • Wencong Wu
  • Xiuwei Zhang
  • Hanlin Yin
  • Haorui Zeng
  • Chenxu Wei
  • Lei Yu
  • Yanning Zhang
  • Northwestern Polytechnical University Xian

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Multimodal object detection aims to utilize the complementarity between different modalities to improve detection results. However, most existing methods only enhance intermodality features by leveraging the interaction of spatial information while neglecting the interaction of channel information between multimodalities, resulting in insufficient enhancement of cross-modal features. Moreover, many detection models fuse multimodal features within a single feature dimension, failing to consider the use of multi-dimensional information, which means that multimodal feature information has not been fully exploited. To solve these drawbacks, we propose a cross-dimension fusion network with dual feature enhancement (CDFNet) for visible and infrared object detection. Specifically, a dual feature enhancement module (DFEM) is designed to enhance cross-modal representations by modeling multiplicative interactions at both spatial and channel levels. Furthermore, a cross-dimension feature fusion module (CDFFM) is developed for fully integrating the enhanced features by capturing different dimensional dependencies to obtain a more discriminative fused feature. Extensive experiments demonstrate that our proposed CDFNet achieves a 1.8% higher mAP detection accuracy on the LLVIP dataset compared to the state-of-the-art detection method, and exhibits more competitive network complexity than transformer-based and mamba-based models. The code of our CDFNet is released at https://github.com/WenCongWu/CDFNet.

Original language: English
Article number: 132380
Journal: Expert Systems with Applications
Volume: 322
DOI
Publication status: Published - 1 Aug 2026
