TY - JOUR
T1 - NAS-FCOS: Fast Neural Architecture Search for Object Detection
T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
AU - Wang, Ning
AU - Gao, Yang
AU - Chen, Hao
AU - Wang, Peng
AU - Tian, Zhi
AU - Shen, Chunhua
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020
Y1 - 2020
AB - The success of deep neural networks relies on significant architecture engineering. Recently neural architecture search (NAS) has emerged as a promise to greatly reduce manual effort in network design by automatically searching for optimal architectures, although typically such algorithms need an excessive amount of computational resources, e.g., a few thousand GPU-days. To date, on challenging vision tasks such as object detection, NAS, especially fast versions of NAS, is less studied. Here we propose to search for the decoder structure of object detectors with search efficiency being taken into consideration. To be more specific, we aim to efficiently search for the feature pyramid network (FPN) as well as the prediction head of a simple anchor-free object detector, namely FCOS, using a tailored reinforcement learning paradigm. With carefully designed search space, search algorithms and strategies for evaluating network quality, we are able to efficiently search a top-performing detection architecture within 4 days using 8 V100 GPUs. The discovered architecture surpasses state-of-the-art object detection models (such as Faster R-CNN, RetinaNet and FCOS) by 1.5 to 3.5 points in AP on the COCO dataset, with comparable computation complexity and memory footprint, demonstrating the efficacy of the proposed NAS for object detection.
UR - http://www.scopus.com/inward/record.url?scp=85094846553&partnerID=8YFLogxK
U2 - 10.1109/CVPR42600.2020.01196
DO - 10.1109/CVPR42600.2020.01196
M3 - Conference article
AN - SCOPUS:85094846553
SN - 1063-6919
SP - 11940
EP - 11948
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
M1 - 9156326
Y2 - 14 June 2020 through 19 June 2020
ER -