A method to build multi-scene datasets for CNN for camera pose regression

Yuhao Ma; Hao Guo; Hong Chen; Mengxiao Tian; Xin Huo; Chengjiang Long; Shiye Tang; Xiaoyu Song; Qing Wang

doi:10.1109/AIVR.2018.00022

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Convolutional neural networks (CNNs) have been shown to be useful for camera pose regression, and they are robust to challenging conditions such as lighting changes, motion blur, and scenes with many textureless surfaces. Additionally, PoseNet shows that a deep learning system can interpolate the camera pose in the space between training images. In this paper, we explore how different dataset-processing strategies affect pose regression and propose a method for building multi-scene datasets for training such neural networks. We demonstrate that a single neural network can remember the locations of several scenes. By combining multiple scenes, we found that the position errors of the neural network do not decrease significantly as the distance between the cameras increases, which means that we do not need to train several models as the number of scenes increases. We also explore the factors that influence the accuracy of models for multi-scene camera pose regression, which can help merge several scenes into one dataset more effectively. We have released our code and datasets publicly to support further research.

Original language	English
Title of host publication	Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	108-115
Number of pages	8
ISBN (Electronic)	9781538692691
DOIs	https://doi.org/10.1109/AIVR.2018.00022
State	Published - 2 Jul 2018
Externally published	Yes
Event	1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018 - Taichung, Taiwan, Province of China; Duration: 10 Dec 2018 → 12 Dec 2018

Publication series

Name	Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

Conference

Conference	1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
Country/Territory	Taiwan, Province of China
City	Taichung
Period	10/12/18 → 12/12/18

Keywords

Camera pose Estimation
Convolutional neural network
Dataset
Visual localization

Cite this

Ma, Y., Guo, H., Chen, H., Tian, M., Huo, X., Long, C., Tang, S., Song, X., & Wang, Q. (2018). A method to build multi-scene datasets for CNN for camera pose regression. In Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018 (pp. 108-115). Article 8613641 (Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/AIVR.2018.00022

Ma, Yuhao ; Guo, Hao ; Chen, Hong et al. / A method to build multi-scene datasets for CNN for camera pose regression. Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 108-115 (Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018).

@inproceedings{fbed12ef3f524487848d094f3a3f88d8,

title = "A method to build multi-scene datasets for CNN for camera pose regression",

abstract = "Convolutional neural networks (CNNs) have been shown to be useful for camera pose regression, and they are robust to challenging conditions such as lighting changes, motion blur, and scenes with many textureless surfaces. Additionally, PoseNet shows that a deep learning system can interpolate the camera pose in the space between training images. In this paper, we explore how different dataset-processing strategies affect pose regression and propose a method for building multi-scene datasets for training such neural networks. We demonstrate that a single neural network can remember the locations of several scenes. By combining multiple scenes, we found that the position errors of the neural network do not decrease significantly as the distance between the cameras increases, which means that we do not need to train several models as the number of scenes increases. We also explore the factors that influence the accuracy of models for multi-scene camera pose regression, which can help merge several scenes into one dataset more effectively. We have released our code and datasets publicly to support further research.",

keywords = "Camera pose Estimation, Convolutional neural network, Dataset, Visual localization",

author = "Yuhao Ma and Hao Guo and Hong Chen and Mengxiao Tian and Xin Huo and Chengjiang Long and Shiye Tang and Xiaoyu Song and Qing Wang",

note = "Publisher Copyright: {\textcopyright} 2018 IEEE.; 1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018 ; Conference date: 10-12-2018 Through 12-12-2018",

year = "2018",

month = jul,

day = "2",

doi = "10.1109/AIVR.2018.00022",

language = "English",

series = "Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "108--115",

booktitle = "Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018",

}

Ma, Y, Guo, H, Chen, H, Tian, M, Huo, X, Long, C, Tang, S, Song, X & Wang, Q 2018, A method to build multi-scene datasets for CNN for camera pose regression. in Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018., 8613641, Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018, Institute of Electrical and Electronics Engineers Inc., pp. 108-115, 1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018, Taichung, Taiwan, Province of China, 10/12/18. https://doi.org/10.1109/AIVR.2018.00022

A method to build multi-scene datasets for CNN for camera pose regression. / Ma, Yuhao; Guo, Hao; Chen, Hong et al.
Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018. Institute of Electrical and Electronics Engineers Inc., 2018. p. 108-115 8613641 (Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018).

TY - GEN

T1 - A method to build multi-scene datasets for CNN for camera pose regression

AU - Ma, Yuhao

AU - Guo, Hao

AU - Chen, Hong

AU - Tian, Mengxiao

AU - Huo, Xin

AU - Long, Chengjiang

AU - Tang, Shiye

AU - Song, Xiaoyu

AU - Wang, Qing

PY - 2018/7/2

Y1 - 2018/7/2

N2 - Convolutional neural networks (CNNs) have been shown to be useful for camera pose regression, and they are robust to challenging conditions such as lighting changes, motion blur, and scenes with many textureless surfaces. Additionally, PoseNet shows that a deep learning system can interpolate the camera pose in the space between training images. In this paper, we explore how different dataset-processing strategies affect pose regression and propose a method for building multi-scene datasets for training such neural networks. We demonstrate that a single neural network can remember the locations of several scenes. By combining multiple scenes, we found that the position errors of the neural network do not decrease significantly as the distance between the cameras increases, which means that we do not need to train several models as the number of scenes increases. We also explore the factors that influence the accuracy of models for multi-scene camera pose regression, which can help merge several scenes into one dataset more effectively. We have released our code and datasets publicly to support further research.

AB - Convolutional neural networks (CNNs) have been shown to be useful for camera pose regression, and they are robust to challenging conditions such as lighting changes, motion blur, and scenes with many textureless surfaces. Additionally, PoseNet shows that a deep learning system can interpolate the camera pose in the space between training images. In this paper, we explore how different dataset-processing strategies affect pose regression and propose a method for building multi-scene datasets for training such neural networks. We demonstrate that a single neural network can remember the locations of several scenes. By combining multiple scenes, we found that the position errors of the neural network do not decrease significantly as the distance between the cameras increases, which means that we do not need to train several models as the number of scenes increases. We also explore the factors that influence the accuracy of models for multi-scene camera pose regression, which can help merge several scenes into one dataset more effectively. We have released our code and datasets publicly to support further research.

KW - Camera pose Estimation

KW - Convolutional neural network

KW - Dataset

KW - Visual localization

UR - http://www.scopus.com/inward/record.url?scp=85062191359&partnerID=8YFLogxK

U2 - 10.1109/AIVR.2018.00022

DO - 10.1109/AIVR.2018.00022

M3 - Conference contribution

AN - SCOPUS:85062191359

T3 - Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

SP - 108

EP - 115

BT - Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018

Y2 - 10 December 2018 through 12 December 2018

ER -

Ma Y, Guo H, Chen H, Tian M, Huo X, Long C et al. A method to build multi-scene datasets for CNN for camera pose regression. In Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018. Institute of Electrical and Electronics Engineers Inc. 2018. p. 108-115. 8613641. (Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018). doi: 10.1109/AIVR.2018.00022
