Joint 3D instance segmentation and object detection for autonomous driving

Dingfu Zhou; Jin Fang; Xibin Song; Liu Liu; Junbo Yin; Yuchao Dai; Hongdong Li; Ruigang Yang

doi:10.1109/CVPR42600.2020.00191

School of Electronics and Information

Research output: Contribution to journal › Conference article › peer-review

98 Scopus citations

Abstract

Currently, in Autonomous Driving (AD), most of the 3D object detection frameworks (either anchor- or anchor-free-based) consider the detection as a Bounding Box (BBox) regression problem. However, this compact representation is not sufficient to explore all the information of the objects. To tackle this problem, we propose a simple but practical detection framework to jointly predict the 3D BBox and instance segmentation. For instance segmentation, we propose a Spatial Embeddings (SEs) strategy to assemble all foreground points into their corresponding object centers. Based on the SE results, the object proposals can be generated with a simple clustering strategy. For each cluster, only one proposal is generated. Therefore, the Non-Maximum Suppression (NMS) process is no longer needed here. Finally, with our proposed instance-aware ROI pooling, the BBox is refined by a second-stage network. Experimental results on the public KITTI dataset show that the proposed SEs can significantly improve the instance segmentation results compared with other feature embedding-based methods. Meanwhile, it also outperforms most of the 3D object detectors on the KITTI testing benchmark.

Original language	English
Article number	9156967
Pages (from-to)	1836-1846
Number of pages	11
Journal	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
DOIs	https://doi.org/10.1109/CVPR42600.2020.00191
State	Published - 2020
Event	2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 - Virtual, Online, United States; Duration: 14 Jun 2020 → 19 Jun 2020

Cite this

@article{a946a89e9e2443d1a05daa9bd777a55c,

title = "Joint 3D instance segmentation and object detection for autonomous driving",

abstract = "Currently, in Autonomous Driving (AD), most of the 3D object detection frameworks (either anchor- or anchor-free-based) consider the detection as a Bounding Box (BBox) regression problem. However, this compact representation is not sufficient to explore all the information of the objects. To tackle this problem, we propose a simple but practical detection framework to jointly predict the 3D BBox and instance segmentation. For instance segmentation, we propose a Spatial Embeddings (SEs) strategy to assemble all foreground points into their corresponding object centers. Based on the SE results, the object proposals can be generated with a simple clustering strategy. For each cluster, only one proposal is generated. Therefore, the Non-Maximum Suppression (NMS) process is no longer needed here. Finally, with our proposed instance-aware ROI pooling, the BBox is refined by a second-stage network. Experimental results on the public KITTI dataset show that the proposed SEs can significantly improve the instance segmentation results compared with other feature embedding-based methods. Meanwhile, it also outperforms most of the 3D object detectors on the KITTI testing benchmark.",

author = "Dingfu Zhou and Jin Fang and Xibin Song and Liu Liu and Junbo Yin and Yuchao Dai and Hongdong Li and Ruigang Yang",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 ; Conference date: 14-06-2020 Through 19-06-2020",

year = "2020",

doi = "10.1109/CVPR42600.2020.00191",

language = "English",

pages = "1836--1846",

journal = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

issn = "1063-6919",

publisher = "IEEE Computer Society",

}

TY - JOUR

T1 - Joint 3D instance segmentation and object detection for autonomous driving

AU - Zhou, Dingfu

AU - Fang, Jin

AU - Song, Xibin

AU - Liu, Liu

AU - Yin, Junbo

AU - Dai, Yuchao

AU - Li, Hongdong

AU - Yang, Ruigang

PY - 2020

Y1 - 2020

AB - Currently, in Autonomous Driving (AD), most of the 3D object detection frameworks (either anchor- or anchor-free-based) consider the detection as a Bounding Box (BBox) regression problem. However, this compact representation is not sufficient to explore all the information of the objects. To tackle this problem, we propose a simple but practical detection framework to jointly predict the 3D BBox and instance segmentation. For instance segmentation, we propose a Spatial Embeddings (SEs) strategy to assemble all foreground points into their corresponding object centers. Based on the SE results, the object proposals can be generated with a simple clustering strategy. For each cluster, only one proposal is generated. Therefore, the Non-Maximum Suppression (NMS) process is no longer needed here. Finally, with our proposed instance-aware ROI pooling, the BBox is refined by a second-stage network. Experimental results on the public KITTI dataset show that the proposed SEs can significantly improve the instance segmentation results compared with other feature embedding-based methods. Meanwhile, it also outperforms most of the 3D object detectors on the KITTI testing benchmark.

UR - http://www.scopus.com/inward/record.url?scp=85094814239&partnerID=8YFLogxK

U2 - 10.1109/CVPR42600.2020.00191

DO - 10.1109/CVPR42600.2020.00191

M3 - Conference article

AN - SCOPUS:85094814239

SN - 1063-6919

SP - 1836

EP - 1846

JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

M1 - 9156967

T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020

Y2 - 14 June 2020 through 19 June 2020

ER -
