Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles

Chengze Wang; Yuan Yuan; Qi Wang

doi:10.1109/ICASSP.2019.8683446

Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles

Chengze Wang, Yuan Yuan, Qi Wang

School of Artificial Intelligence, OPtics and Electronics

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

In this paper, we present iDVO (inertia-embedded deep visual odometry), a self-supervised learning based monocular visual odometry (VO) for road vehicles. When modelling the geometric consistency within adjacent frames, most deep VO methods ignore the temporal continuity of the camera pose, which results in a very severe jagged fluctuation in the velocity curves. With the observation that road vehicles tend to perform smooth dynamic characteristics in most of the time, we design the inertia loss function to describe the abnormal motion variation, which assists the model to learn the consecutiveness from long-term camera ego-motion. Based on the recurrent convolutional neural network (RCNN) architecture, our method implicitly models the dynamics of road vehicles and the temporal consecutiveness by the extended Long Short-Term Memory (LSTM) block. Furthermore, we develop the dynamic hard-edge mask to handle the non-consistency in fast camera motion by blocking the boundary part and which generates more efficiency in the whole non-consistency mask. The proposed method is evaluated on the KITTI dataset, and the results demonstrate state-of-the-art performance with respect to other monocular deep VO and SLAM approaches.

Original language	English
Title of host publication	2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	2252-2256
Number of pages	5
ISBN (Electronic)	9781479981311
DOIs	https://doi.org/10.1109/ICASSP.2019.8683446
State	Published - May 2019
Event	44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom Duration: 12 May 2019 → 17 May 2019

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2019-May
ISSN (Print)	1520-6149

Conference

Conference	44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory	United Kingdom
City	Brighton
Period	12/05/19 → 17/05/19

Keywords

Inertia
RCNN
Self-supervised Learning
Visual Odometry

Access to Document

10.1109/ICASSP.2019.8683446

Cite this

Wang, C., Yuan, Y., & Wang, Q. (2019). Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings (pp. 2252-2256). Article 8683446 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2019.8683446

Wang, Chengze ; Yuan, Yuan ; Wang, Qi. / Learning by Inertia : Self-supervised Monocular Visual Odometry for Road Vehicles. 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 2252-2256 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{f4ab4ed11b144a59852658d919002f93,

title = "Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles",

abstract = "In this paper, we present iDVO (inertia-embedded deep visual odometry), a self-supervised learning based monocular visual odometry (VO) for road vehicles. When modelling the geometric consistency within adjacent frames, most deep VO methods ignore the temporal continuity of the camera pose, which results in a very severe jagged fluctuation in the velocity curves. With the observation that road vehicles tend to perform smooth dynamic characteristics in most of the time, we design the inertia loss function to describe the abnormal motion variation, which assists the model to learn the consecutiveness from long-term camera ego-motion. Based on the recurrent convolutional neural network (RCNN) architecture, our method implicitly models the dynamics of road vehicles and the temporal consecutiveness by the extended Long Short-Term Memory (LSTM) block. Furthermore, we develop the dynamic hard-edge mask to handle the non-consistency in fast camera motion by blocking the boundary part and which generates more efficiency in the whole non-consistency mask. The proposed method is evaluated on the KITTI dataset, and the results demonstrate state-of-the-art performance with respect to other monocular deep VO and SLAM approaches.",

keywords = "Inertia, RCNN, Self-supervised Learning, Visual Odometry",

author = "Chengze Wang and Yuan Yuan and Qi Wang",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 ; Conference date: 12-05-2019 Through 17-05-2019",

year = "2019",

month = may,

doi = "10.1109/ICASSP.2019.8683446",

language = "英语",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "2252--2256",

booktitle = "2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings",

}

Wang, C, Yuan, Y & Wang, Q 2019, Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles. in 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings., 8683446, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2019-May, Institute of Electrical and Electronics Engineers Inc., pp. 2252-2256, 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, Brighton, United Kingdom, 12/05/19. https://doi.org/10.1109/ICASSP.2019.8683446

Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles. / Wang, Chengze; Yuan, Yuan ; Wang, Qi.
2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 2252-2256 8683446 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

TY - GEN

T1 - Learning by Inertia

T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019

AU - Wang, Chengze

AU - Yuan, Yuan

AU - Wang, Qi

PY - 2019/5

Y1 - 2019/5

N2 - In this paper, we present iDVO (inertia-embedded deep visual odometry), a self-supervised learning based monocular visual odometry (VO) for road vehicles. When modelling the geometric consistency within adjacent frames, most deep VO methods ignore the temporal continuity of the camera pose, which results in a very severe jagged fluctuation in the velocity curves. With the observation that road vehicles tend to perform smooth dynamic characteristics in most of the time, we design the inertia loss function to describe the abnormal motion variation, which assists the model to learn the consecutiveness from long-term camera ego-motion. Based on the recurrent convolutional neural network (RCNN) architecture, our method implicitly models the dynamics of road vehicles and the temporal consecutiveness by the extended Long Short-Term Memory (LSTM) block. Furthermore, we develop the dynamic hard-edge mask to handle the non-consistency in fast camera motion by blocking the boundary part and which generates more efficiency in the whole non-consistency mask. The proposed method is evaluated on the KITTI dataset, and the results demonstrate state-of-the-art performance with respect to other monocular deep VO and SLAM approaches.

AB - In this paper, we present iDVO (inertia-embedded deep visual odometry), a self-supervised learning based monocular visual odometry (VO) for road vehicles. When modelling the geometric consistency within adjacent frames, most deep VO methods ignore the temporal continuity of the camera pose, which results in a very severe jagged fluctuation in the velocity curves. With the observation that road vehicles tend to perform smooth dynamic characteristics in most of the time, we design the inertia loss function to describe the abnormal motion variation, which assists the model to learn the consecutiveness from long-term camera ego-motion. Based on the recurrent convolutional neural network (RCNN) architecture, our method implicitly models the dynamics of road vehicles and the temporal consecutiveness by the extended Long Short-Term Memory (LSTM) block. Furthermore, we develop the dynamic hard-edge mask to handle the non-consistency in fast camera motion by blocking the boundary part and which generates more efficiency in the whole non-consistency mask. The proposed method is evaluated on the KITTI dataset, and the results demonstrate state-of-the-art performance with respect to other monocular deep VO and SLAM approaches.

KW - Inertia

KW - RCNN

KW - Self-supervised Learning

KW - Visual Odometry

UR - http://www.scopus.com/inward/record.url?scp=85068956321&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2019.8683446

DO - 10.1109/ICASSP.2019.8683446

M3 - 会议稿件

AN - SCOPUS:85068956321

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 2252

EP - 2256

BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 12 May 2019 through 17 May 2019

ER -

Wang C, Yuan Y , Wang Q. Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 2252-2256. 8683446. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP.2019.8683446

Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles

Abstract

Publication series

Conference

Keywords

Access to Document

Other files and links

Fingerprint

Cite this