TY - JOUR
T1 - GSDet
T2 - Object Detection in Aerial Images Based on Scale Reasoning
AU - Li, Wei
AU - Wei, Wei
AU - Zhang, Lei
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - Variations in both object scale and style under different capture scenes (e.g., downtown, port) greatly increase the difficulty of object detection in aerial images. Although ground sample distance (GSD) provides an obvious cue for addressing this issue, no existing object detection method has considered using this prior knowledge. In this paper, we propose the first object detection network to incorporate GSD into the object detection modeling process. More specifically, built on a two-stage detection framework, we adopt a GSD identification subnet that converts GSD regression into a probability estimation process, then combine the GSD information with the sizes of Regions of Interest (RoIs) to determine the physical size of objects. The estimated physical size provides a powerful prior for detection by reweighting the weights from the classification layer of each category to produce RoI-wise enhanced features. Furthermore, to improve the discriminability among categories of similar size and make the inference process more adaptive, scene information is also considered. The pipeline is flexible enough to be stacked on any modern two-stage detection framework. The improvement over existing two-stage object detection methods on the DOTA dataset demonstrates the effectiveness of our method.
KW - Object detection
KW - aerial images
KW - ground sample distance
KW - reasoning
UR - http://www.scopus.com/inward/record.url?scp=85104629939&partnerID=8YFLogxK
U2 - 10.1109/TIP.2021.3073319
DO - 10.1109/TIP.2021.3073319
M3 - Article
C2 - 33886468
AN - SCOPUS:85104629939
SN - 1057-7149
VL - 30
SP - 4599
EP - 4609
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9411691
ER -