TY - GEN
T1 - A method to build multi-scene datasets for CNN for camera pose regression
AU - Ma, Yuhao
AU - Guo, Hao
AU - Chen, Hong
AU - Tian, Mengxiao
AU - Huo, Xin
AU - Long, Chengjiang
AU - Tang, Shiye
AU - Song, Xiaoyu
AU - Wang, Qing
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Convolutional neural networks (CNNs) have been shown to be useful for camera pose regression, and they are robust to challenging scenarios such as lighting changes, motion blur, and scenes with many textureless surfaces. Additionally, PoseNet shows that a deep learning system can interpolate camera poses between training images. In this paper, we explore how different dataset-processing strategies affect pose regression and propose a method for building multi-scene datasets for training such neural networks. We demonstrate that the locations of several scenes can be remembered by a single neural network. By combining multiple scenes, we found that the position error of the neural network does not increase significantly as the distance between cameras grows, which means we do not need to train a separate model for each additional scene. We also examine the factors that influence the accuracy of models for multi-scene camera pose regression, which helps us merge several scenes into one dataset more effectively. We have released our code and datasets to the public to facilitate further research.
AB - Convolutional neural networks (CNNs) have been shown to be useful for camera pose regression, and they are robust to challenging scenarios such as lighting changes, motion blur, and scenes with many textureless surfaces. Additionally, PoseNet shows that a deep learning system can interpolate camera poses between training images. In this paper, we explore how different dataset-processing strategies affect pose regression and propose a method for building multi-scene datasets for training such neural networks. We demonstrate that the locations of several scenes can be remembered by a single neural network. By combining multiple scenes, we found that the position error of the neural network does not increase significantly as the distance between cameras grows, which means we do not need to train a separate model for each additional scene. We also examine the factors that influence the accuracy of models for multi-scene camera pose regression, which helps us merge several scenes into one dataset more effectively. We have released our code and datasets to the public to facilitate further research.
KW - Camera pose estimation
KW - Convolutional neural network
KW - Dataset
KW - Visual localization
UR - http://www.scopus.com/inward/record.url?scp=85062191359&partnerID=8YFLogxK
U2 - 10.1109/AIVR.2018.00022
DO - 10.1109/AIVR.2018.00022
M3 - Conference contribution
AN - SCOPUS:85062191359
T3 - Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
SP - 108
EP - 115
BT - Proceedings - 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st IEEE International Conference on Artificial Intelligence and Virtual Reality, AIVR 2018
Y2 - 10 December 2018 through 12 December 2018
ER -
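
Note: the abstract's pose-regression setup builds on PoseNet, whose training objective combines a translation error with a weighted orientation (quaternion) error. Below is a minimal Python/PyTorch sketch of that standard PoseNet-style loss for context only; it is not the paper's specific implementation, and the function name, tensor layout, and beta value are illustrative assumptions.

    import torch

    def posenet_loss(pred_xyz, pred_quat, gt_xyz, gt_quat, beta=500.0):
        # Translation error in metres (L2 norm per sample).
        pos_err = torch.norm(pred_xyz - gt_xyz, dim=-1)
        # Normalise the predicted quaternion to unit length before comparing.
        pred_quat = pred_quat / torch.norm(pred_quat, dim=-1, keepdim=True)
        rot_err = torch.norm(pred_quat - gt_quat, dim=-1)
        # beta balances metres against quaternion units; it is scene dependent.
        return (pos_err + beta * rot_err).mean()

    # Usage with a batch of 4 hypothetical predictions and ground truths.
    pred_xyz, gt_xyz = torch.randn(4, 3), torch.randn(4, 3)
    pred_q = torch.randn(4, 4)
    gt_q = torch.nn.functional.normalize(torch.randn(4, 4), dim=-1)
    print(posenet_loss(pred_xyz, pred_q, gt_xyz, gt_q))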