APOLLOCAR3D: A large 3D car instance understanding benchmark for autonomous driving

Xibin Song; Peng Wang; Dingfu Zhou; Rui Zhu; Chenye Guan; Yuchao Dai; Hao Su; Hongdong Li; Ruigang Yang

doi:10.1109/CVPR.2019.00560

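School of Electronics and Information

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

146 Citations (Scopus)

Abstract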

Autonomous driving has attracted remarkable attention from both industry and academia. An important task is to estimate the 3D properties (e.g., translation, rotation, and shape) of a moving or parked vehicle on the road. This task, while critical, is still under-researched in the computer vision community, partially owing to the lack of a large-scale, fully annotated 3D car database suitable for autonomous driving research. In this paper, we contribute the first large-scale database suitable for 3D car instance understanding: ApolloCar3D. The dataset contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. This dataset is more than 20× larger than PASCAL3D+ and KITTI, the current state of the art. To enable efficient labelling in 3D, we build a pipeline that considers 2D-3D keypoint correspondences for a single instance and 3D relationships among multiple instances. Equipped with this dataset, we build various baseline algorithms with state-of-the-art deep convolutional neural networks. Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress its 3D pose and shape based on a deformable 3D car model, with or without using semantic keypoints. We show that using keypoints significantly improves fitting performance. Finally, we develop a new 3D metric that jointly considers 3D pose and 3D shape, allowing for comprehensive evaluation and ablation studies.
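
The abstract mentions fitting a 3D car model through 2D-3D keypoint correspondences. The sketch below is not the authors' pipeline; it only illustrates the quantity such a fit typically minimises, the reprojection error between projected model keypoints and annotated image keypoints. The function names, the pinhole intrinsics (fx, fy, cx, cy) and the yaw-only rotation are all assumptions made for brevity.

```python
import math


def project_keypoints(points_3d, yaw, translation, fx, fy, cx, cy):
    """Project model-frame 3D keypoints into the image with a pinhole camera.

    Assumption: the pose is simplified to a rotation about the camera's
    y-axis (yaw) followed by a translation. A full pose has three angles.
    """
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    pixels = []
    for x, y, z in points_3d:
        # Rotate about the y-axis, then translate into the camera frame.
        xc = cos_y * x + sin_y * z + tx
        yc = y + ty
        zc = -sin_y * x + cos_y * z + tz
        # Perspective division followed by the intrinsic mapping.
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


def mean_reprojection_error(points_3d, observed_2d, yaw, translation,
                            fx, fy, cx, cy):
    """Mean pixel distance between projected and observed keypoints."""
    projected = project_keypoints(points_3d, yaw, translation, fx, fy, cx, cy)
    total = sum(
        math.hypot(u - uo, v - vo)
        for (u, v), (uo, vo) in zip(projected, observed_2d)
    )
    return total / len(points_3d)
```

A pose search would evaluate `mean_reprojection_error` for candidate values of `yaw` and `translation` and keep the pose with the lowest error; in practice a dedicated PnP solver is the usual choice for this step.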

Original language	English
Title of host publication	Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Publisher	IEEE Computer Society
Pages	5447-5457
Number of pages	11
ISBN (Electronic)	9781728132938
DOI	https://doi.org/10.1109/CVPR.2019.00560
Publication status	Published - Jun 2019
Event	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States. Duration: 16 Jun 2019 → 20 Jun 2019

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume	2019-June
ISSN (Print)	1063-6919

Conference

Conference	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/Territory	United States
City	Long Beach
Period	16/06/19 → 20/06/19

Access to document

10.1109/CVPR.2019.00560

Other files and links

Link to publication in Scopus

Cite this

Song, X., Wang, P., Zhou, D., Zhu, R., Guan, C., Dai, Y., Su, H., Li, H., & Yang, R. (2019). APOLLOCAR3D: A large 3D car instance understanding benchmark for autonomous driving. In Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 (pp. 5447-5457). Article 8954083 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2019-June). IEEE Computer Society. https://doi.org/10.1109/CVPR.2019.00560

Song, Xibin ; Wang, Peng ; Zhou, Dingfu et al. / APOLLOCAR3D : A large 3D car instance understanding benchmark for autonomous driving. Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society, 2019. pp. 5447-5457 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

@inproceedings{fc7a02dcadc54771a85671accedb9f96,

title = "APOLLOCAR3D: A large 3D car instance understanding benchmark for autonomous driving",

abstract = "Autonomous driving has attracted remarkable attention from both industry and academia. An important task is to estimate the 3D properties (e.g., translation, rotation, and shape) of a moving or parked vehicle on the road. This task, while critical, is still under-researched in the computer vision community, partially owing to the lack of a large-scale, fully annotated 3D car database suitable for autonomous driving research. In this paper, we contribute the first large-scale database suitable for 3D car instance understanding: ApolloCar3D. The dataset contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. This dataset is more than 20× larger than PASCAL3D+ and KITTI, the current state of the art. To enable efficient labelling in 3D, we build a pipeline that considers 2D-3D keypoint correspondences for a single instance and 3D relationships among multiple instances. Equipped with this dataset, we build various baseline algorithms with state-of-the-art deep convolutional neural networks. Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress its 3D pose and shape based on a deformable 3D car model, with or without using semantic keypoints. We show that using keypoints significantly improves fitting performance. Finally, we develop a new 3D metric that jointly considers 3D pose and 3D shape, allowing for comprehensive evaluation and ablation studies.",

keywords = "3D from Single Image, Datasets and Evaluation, Robotics + Driving",

author = "Xibin Song and Peng Wang and Dingfu Zhou and Rui Zhu and Chenye Guan and Yuchao Dai and Hao Su and Hongdong Li and Ruigang Yang",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 ; Conference date: 16-06-2019 Through 20-06-2019",

year = "2019",

month = jun,

doi = "10.1109/CVPR.2019.00560",

language = "English",

series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

publisher = "IEEE Computer Society",

pages = "5447--5457",

booktitle = "Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019",

}

Song, X, Wang, P, Zhou, D, Zhu, R, Guan, C, Dai, Y, Su, H, Li, H & Yang, R 2019, APOLLOCAR3D: A large 3D car instance understanding benchmark for autonomous driving. in Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019., 8954083, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2019-June, IEEE Computer Society, pp. 5447-5457, 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, United States, 16/06/19. https://doi.org/10.1109/CVPR.2019.00560

APOLLOCAR3D: A large 3D car instance understanding benchmark for autonomous driving. / Song, Xibin; Wang, Peng; Zhou, Dingfu et al.
Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society, 2019. p. 5447-5457 8954083 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2019-June).

TY - GEN

T1 - APOLLOCAR3D

T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

AU - Song, Xibin

AU - Wang, Peng

AU - Zhou, Dingfu

AU - Zhu, Rui

AU - Guan, Chenye

AU - Dai, Yuchao

AU - Su, Hao

AU - Li, Hongdong

AU - Yang, Ruigang

PY - 2019/6

Y1 - 2019/6

N2 - Autonomous driving has attracted remarkable attention from both industry and academia. An important task is to estimate the 3D properties (e.g., translation, rotation, and shape) of a moving or parked vehicle on the road. This task, while critical, is still under-researched in the computer vision community, partially owing to the lack of a large-scale, fully annotated 3D car database suitable for autonomous driving research. In this paper, we contribute the first large-scale database suitable for 3D car instance understanding: ApolloCar3D. The dataset contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. This dataset is more than 20× larger than PASCAL3D+ and KITTI, the current state of the art. To enable efficient labelling in 3D, we build a pipeline that considers 2D-3D keypoint correspondences for a single instance and 3D relationships among multiple instances. Equipped with this dataset, we build various baseline algorithms with state-of-the-art deep convolutional neural networks. Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress its 3D pose and shape based on a deformable 3D car model, with or without using semantic keypoints. We show that using keypoints significantly improves fitting performance. Finally, we develop a new 3D metric that jointly considers 3D pose and 3D shape, allowing for comprehensive evaluation and ablation studies.

KW - 3D from Single Image

KW - Datasets and Evaluation

KW - Robotics + Driving

UR - http://www.scopus.com/inward/record.url?scp=85078797765&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2019.00560

DO - 10.1109/CVPR.2019.00560

M3 - Conference contribution

AN - SCOPUS:85078797765

T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

SP - 5447

EP - 5457

BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

PB - IEEE Computer Society

Y2 - 16 June 2019 through 20 June 2019

ER -

Song X, Wang P, Zhou D, Zhu R, Guan C, Dai Y et al. APOLLOCAR3D: A large 3D car instance understanding benchmark for autonomous driving. In Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society. 2019. p. 5447-5457. 8954083. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR.2019.00560
