TY - JOUR
T1 - Visual Consistency Enhancement for Multiview Stereo Reconstruction in Remote Sensing
AU - Zhang, Wei
AU - Li, Qiang
AU - Yuan, Yuan
AU - Wang, Qi
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Learnable multiview stereo (MVS) aerial image depth estimation has achieved great success in 3-D digital urban reconstruction. Currently, most depth estimation methods for large-scale scenes rely heavily on adapting the general MVS framework. However, these methods often overlook the cross-view interval and limited viewpoints inherent in aerial image data. In this article, we introduce AggrMVS, a learning-based MVS method for aerial image depth estimation that enhances visual consistency to address the insufficient accuracy caused by the characteristics of aerial image data. First, an optical flow-guided feature extraction module is introduced to map the dynamic relationship between reference and source images. It explicitly captures edge information of different depth components to guide the cost volume regularization. Second, a cross-view volume fusion module is proposed to enhance the interaction among reference volumes, further improving the aggregation ability of the source volume. Furthermore, AggrMVS achieves refined aerial image depth estimation results with a lightweight cascade architecture. Since low-altitude oblique aerial datasets are currently lacking, we reconstruct a multicategory synthetic aerial scene benchmark from general MVS datasets. The benchmark dataset is available at https://github.com/ToscW/BlendedUAV. Experiments on public and proposed datasets confirm that AggrMVS outperforms other MVS depth estimation methods in both qualitative and quantitative terms.
AB - Learnable multiview stereo (MVS) aerial image depth estimation has achieved great success in 3-D digital urban reconstruction. Currently, most depth estimation methods for large-scale scenes rely heavily on adapting the general MVS framework. However, these methods often overlook the cross-view interval and limited viewpoints inherent in aerial image data. In this article, we introduce AggrMVS, a learning-based MVS method for aerial image depth estimation that enhances visual consistency to address the insufficient accuracy caused by the characteristics of aerial image data. First, an optical flow-guided feature extraction module is introduced to map the dynamic relationship between reference and source images. It explicitly captures edge information of different depth components to guide the cost volume regularization. Second, a cross-view volume fusion module is proposed to enhance the interaction among reference volumes, further improving the aggregation ability of the source volume. Furthermore, AggrMVS achieves refined aerial image depth estimation results with a lightweight cascade architecture. Since low-altitude oblique aerial datasets are currently lacking, we reconstruct a multicategory synthetic aerial scene benchmark from general MVS datasets. The benchmark dataset is available at https://github.com/ToscW/BlendedUAV. Experiments on public and proposed datasets confirm that AggrMVS outperforms other MVS depth estimation methods in both qualitative and quantitative terms.
KW - 3-D reconstruction
KW - dense image matching
KW - multiview stereo (MVS)
KW - vision consistency
UR - http://www.scopus.com/inward/record.url?scp=85207337589&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3482697
DO - 10.1109/TGRS.2024.3482697
M3 - Article
AN - SCOPUS:85207337589
SN - 0196-2892
VL - 62
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5646011
ER -