Enhancing visual inertial odometry with efficient dynamic PerceptionNet and consistency improvement fusion

  • Ganchao Liu
  • Haozhe Tian
  • Qiang Gao
  • Yuan Yuan

Research output: Contribution to journal › Article › peer-review

Abstract

Visual-Inertial Odometry (VIO) has become a key technology for autonomous systems by fusing visual and inertial data for robust self-localization. However, traditional VIO methods suffer from heavy parameter tuning, high computational cost, and limited robustness in dynamic environments. To address these issues, we propose a lightweight VIO framework that integrates two core components: an efficient dynamic perception network and a cross-modal consistency enhancement module. The standard convolutions in FlowNet are replaced with a dynamic perception network that leverages a two-stream feature generation module and a spatial-channel cooperative gating mechanism to capture long-range spatial dependencies while maintaining high computational efficiency. Furthermore, a novel fusion module is introduced to reduce latent discrepancies between heterogeneous visual and inertial modalities through a learnable shared mechanism. By adaptively aligning inertial features with visual features, this module enhances cross-modal complementarity and improves overall localization accuracy. Extensive experiments on multiple benchmark datasets demonstrate that the proposed framework achieves state-of-the-art performance while maintaining low complexity. Specifically, the method improves trajectory estimation precision by 61.6% compared with the FlowNet-based baseline on KITTI.
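The abstract describes two mechanisms without architectural detail: a spatial-channel cooperative gating block inside the dynamic perception network, and a fusion module that adaptively aligns inertial features with visual features in a shared latent space. The following is a minimal PyTorch sketch of what such blocks could look like; the class names, the squeeze-and-excitation-style channel gate, the 7×7 spatial gate, and the gated alignment are all illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn as nn


class SpatialChannelGate(nn.Module):
    """Hypothetical spatial-channel cooperative gating.

    Combines a channel gate (global pooling -> bottleneck MLP) with a
    spatial gate (large-kernel conv -> 1-channel map) and multiplies
    both attention maps into the input feature map.
    """

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel weights are (B, C, 1, 1); spatial weights are (B, 1, H, W).
        return x * self.channel_gate(x) * self.spatial_gate(x)


class ConsistencyFusion(nn.Module):
    """Hypothetical cross-modal consistency fusion.

    Projects visual and inertial features into a shared latent space
    with learnable projections, then pulls the inertial feature toward
    the visual one via a learned gate before concatenation -- one
    plausible reading of the "learnable shared mechanism".
    """

    def __init__(self, visual_dim: int, inertial_dim: int, shared_dim: int):
        super().__init__()
        self.vis_proj = nn.Linear(visual_dim, shared_dim)
        self.imu_proj = nn.Linear(inertial_dim, shared_dim)
        self.align_gate = nn.Sequential(
            nn.Linear(2 * shared_dim, shared_dim),
            nn.Sigmoid(),
        )

    def forward(self, visual: torch.Tensor, inertial: torch.Tensor) -> torch.Tensor:
        v = self.vis_proj(visual)    # (B, shared_dim)
        i = self.imu_proj(inertial)  # (B, shared_dim)
        g = self.align_gate(torch.cat([v, i], dim=-1))
        i_aligned = g * v + (1.0 - g) * i  # adaptively align IMU to vision
        return torch.cat([v, i_aligned], dim=-1)


if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 32)
    print(SpatialChannelGate(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
    fused = ConsistencyFusion(512, 256, 128)(torch.randn(2, 512), torch.randn(2, 256))
    print(fused.shape)  # torch.Size([2, 256])
```

Both blocks are drop-in `nn.Module`s: the gating block preserves the feature-map shape, so it can replace or wrap a standard convolution stage, while the fusion block outputs a fixed-size vector suitable for a downstream pose regressor.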

Original language: English
Article number: 112779
Journal: Pattern Recognition
Volume: 173
DOIs
State: Published - May 2026

Keywords

  • Consistency improvement fusion
  • Deep learning
  • Dynamic PerceptionNet
  • Pose estimation
  • Visual inertial odometry
