MRM-RETrack: Hybrid Multi-scale Residual and Mamba for RGB-Event Tracking

  • Yuting He
  • Bin Fan
  • Zhexiong Wan
  • Zhiyuan Zhang
  • Yuchao Dai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In recent years, RGB-event object tracking has made significant progress, with steadily improving perception and tracking capabilities in dynamic scenes. However, existing methods are predominantly based on CNN or Transformer architectures, which typically incur high computational complexity and memory overhead. The emerging Mamba architecture preserves the ability to model long-range dependencies while significantly reducing memory consumption, opening new avenues for the design of efficient tracking models. Nevertheless, current Mamba-based RGB-event tracking methods still suffer from insufficient feature learning and a lack of cross-modal alignment, which degrade tracking accuracy and overall robustness. This paper proposes a novel RGB-event tracking framework that aims to achieve high-performance, low-memory cross-modal object tracking. Specifically, we introduce a hierarchical local-global feature extraction strategy that integrates a Multi-Scale Residual Module (MSRM) and a Gated Mamba Module (GMM) to jointly enhance fine-grained local feature extraction and long-range dependency capture. Furthermore, we develop an efficient Aligned Difference-Enhanced Mamba module (ADE-Mamba), which explicitly aligns complementary contextual features by focusing on inter-modal discrepancies. To further boost tracking performance, we design an adaptive dual-modal tracking head that dynamically adjusts and fuses the contributions of the RGB and event modalities, enabling precise target localization. Extensive experiments on multiple benchmark datasets demonstrate that our method achieves superior performance in both short-term and long-term tracking tasks.
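
The abstract does not specify how the adaptive dual-modal tracking head weights the two modalities. As a rough illustration of the general idea of input-dependent fusion only, the following sketch scores each modality, normalizes the two scores with a softmax, and mixes the features accordingly. The function names, the projection vectors `score_rgb` and `score_event`, and the softmax gating are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(
    rgb_feat: Sequence[float],
    event_feat: Sequence[float],
    score_rgb: Sequence[float],
    score_event: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse two same-length feature vectors with input-dependent weights.

    score_rgb / score_event stand in for learned projection vectors.
    They are hypothetical and set by hand here, not taken from the paper.
    """
    n = len(rgb_feat)
    if not (len(event_feat) == len(score_rgb) == len(score_event) == n):
        raise ValueError("all inputs must have the same length")
    # One scalar score per modality: dot product with its projection vector.
    s_rgb = sum(f * w for f, w in zip(rgb_feat, score_rgb))
    s_evt = sum(f * w for f, w in zip(event_feat, score_event))
    # Softmax turns the two scores into mixing weights that sum to 1.
    w_rgb, w_evt = softmax([s_rgb, s_evt])
    # Weighted element-wise sum of the two modality features.
    fused = [w_rgb * r + w_evt * e for r, e in zip(rgb_feat, event_feat)]
    return fused, [w_rgb, w_evt]


if __name__ == "__main__":
    fused, weights = adaptive_fuse(
        [1.0, 2.0], [3.0, 4.0], [1.0, 1.0], [0.0, 0.0]
    )
    print("weights:", weights)
    print("fused:", fused)
```

With equal scores the two modalities are simply averaged; a larger score for one modality shifts the fused feature toward it. In a trained tracker, a scoring function of this kind would be learned rather than fixed.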

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Proceedings
Editors: Josef Kittler, Hongkai Xiong, Weiyao Lin, Jian Yang, Xilin Chen, Jiwen Lu, Jingyi Yu, Weishi Zheng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 162-177
Number of pages: 16
ISBN (Print): 9789819557639
DOIs
State: Published - 2026
Event: 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025 - Shanghai, China
Duration: 15 Oct 2025 - 18 Oct 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16289 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025
Country/Territory: China
City: Shanghai
Period: 15/10/25 - 18/10/25

Keywords

  • Event Camera
  • Mamba
  • Multimodal Fusion
  • Object Tracking
