Transformer Based Visual Inertial Odometry

Sicheng Fei, Jingfeng Li, Lei Li, Jie Liang, Jinwen Hu, Dingwen Zhang, Junwei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Visual inertial odometry (VIO) is a sensor fusion technology used for positioning and navigation. It combines information from visual and inertial sensors to estimate the motion and location of an unmanned aerial vehicle (UAV) in real time. In recent years, deep learning based VIO approaches have outperformed traditional geometric methods. However, VIO tasks usually need to capture long-range feature dependencies to keep the estimated camera trajectory continuous and consistent over time. In this study, we introduce a new end-to-end transformer based VIO framework, named VIO-former, that enables the model to better understand motion features in video sequences. Comprehensive quantitative and qualitative evaluations are conducted on the KITTI dataset to test our method. The experimental results show that our approach achieves superior performance compared with existing methods.

Original language: English
Title of host publication: Advances in Guidance, Navigation and Control - Proceedings of 2024 International Conference on Guidance, Navigation and Control Volume 17
Editors: Liang Yan, Haibin Duan, Yimin Deng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 567-575
Number of pages: 9
ISBN (Print): 9789819622634
State: Published - 2025
Event: International Conference on Guidance, Navigation and Control, ICGNC 2024 - Changsha, China
Duration: 9 Aug 2024 to 11 Aug 2024

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 1353 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: International Conference on Guidance, Navigation and Control, ICGNC 2024
Country/Territory: China
City: Changsha
Period: 9/08/24 to 11/08/24

Keywords

  • Sensor fusion
  • Transformer
  • Visual inertial odometry
