TY - JOUR
T1 - Mutual-Assistance Learning for Object Detection
AU - Xie, Xingxing
AU - Lang, Chunbo
AU - Miao, Shicheng
AU - Cheng, Gong
AU - Li, Ke
AU - Han, Junwei
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2023/12/1
Y1 - 2023/12/1
N2 - Object detection is a fundamental yet challenging task in computer vision. Despite the great strides made over recent years, modern detectors may still produce unsatisfactory performance due to certain factors, such as non-universal object features and a single regression manner. In this paper, we draw on the idea of mutual-assistance (MA) learning and accordingly propose a robust one-stage detector, referred to as MADet, to address these weaknesses. First, the spirit of MA is manifested in the head design of the detector. Decoupled classification and regression features are reintegrated to provide shared offsets, avoiding inconsistency between feature-prediction pairs induced by zero or erroneous offsets. Second, the spirit of MA is captured in the optimization paradigm of the detector. Both anchor-based and anchor-free regression fashions are utilized jointly to boost the capability to retrieve objects with various characteristics, especially for large aspect ratios, occlusion from similar-sized objects, etc. Furthermore, we meticulously devise a quality assessment mechanism to facilitate adaptive sample selection and loss term reweighting. Extensive experiments on standard benchmarks verify the effectiveness of our approach. On MS-COCO, MADet achieves 42.5% AP with a vanilla ResNet50 backbone, dramatically surpassing multiple strong baselines and setting a new state of the art.
AB - Object detection is a fundamental yet challenging task in computer vision. Despite the great strides made over recent years, modern detectors may still produce unsatisfactory performance due to certain factors, such as non-universal object features and a single regression manner. In this paper, we draw on the idea of mutual-assistance (MA) learning and accordingly propose a robust one-stage detector, referred to as MADet, to address these weaknesses. First, the spirit of MA is manifested in the head design of the detector. Decoupled classification and regression features are reintegrated to provide shared offsets, avoiding inconsistency between feature-prediction pairs induced by zero or erroneous offsets. Second, the spirit of MA is captured in the optimization paradigm of the detector. Both anchor-based and anchor-free regression fashions are utilized jointly to boost the capability to retrieve objects with various characteristics, especially for large aspect ratios, occlusion from similar-sized objects, etc. Furthermore, we meticulously devise a quality assessment mechanism to facilitate adaptive sample selection and loss term reweighting. Extensive experiments on standard benchmarks verify the effectiveness of our approach. On MS-COCO, MADet achieves 42.5% AP with a vanilla ResNet50 backbone, dramatically surpassing multiple strong baselines and setting a new state of the art.
KW - One-stage object detection
KW - bounding box regression
KW - feature alignment
KW - mutual-assistance learning
UR - http://www.scopus.com/inward/record.url?scp=85173053972&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2023.3319634
DO - 10.1109/TPAMI.2023.3319634
M3 - Article
C2 - 37756169
AN - SCOPUS:85173053972
SN - 0162-8828
VL - 45
SP - 15171
EP - 15184
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 12
ER -