TY - JOUR
T1 - MT-MVSNet
T2 - A lightweight and highly accurate convolutional neural network based on mobile transformer for 3D reconstruction of orchard fruit tree branches
AU - Zeng, Xilei
AU - Wan, Hao
AU - Fan, Zeming
AU - Yu, Xiaojun
AU - Guo, Hengrong
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2025/4/5
Y1 - 2025/4/5
N2 - Accurate and efficient three-dimensional (3D) reconstruction of fruit tree branches is crucial for path planning and obstacle avoidance in autonomous fruit-harvesting robots. However, the intensive computational loads and manual interventions required by existing 3D reconstruction methods largely hinder their real-time application on mobile platforms. To address these issues, an end-to-end 3D reconstruction network, MT-MVSNet, is proposed for object 3D reconstruction based on Multi-View Stereo (MVS) using RGB images. Specifically, the proposed MT-MVSNet consists of a novel mobile transformer block for capturing global contextual path information, a feature fusion module with feature attention edge, and an efficient depth search strategy for both completeness enhancement and computational complexity optimization. In addition, a branch-based semantic segmentation technique is devised for precise noise filtering during depth map fusion. Extensive experiments with 100 sets of self-customized orchard fruit trees and publicly available datasets were conducted to verify the effectiveness of MT-MVSNet. Comparisons with existing methods showed that MT-MVSNet achieved an overall score of 0.312 mm on the DTU benchmark and an F-score of 54.18% on the Tanks & Temples dataset, while requiring only 3118 MB of memory at a speed of 5.68 frames per second for images of 1152 × 864 pixels. These results indicate that MT-MVSNet outperforms mainstream existing methods in terms of balanced reconstruction accuracy, processing speed and computational efficiency, making it an appropriate candidate for real-time deployment on memory-constrained mobile robots.
AB - Accurate and efficient three-dimensional (3D) reconstruction of fruit tree branches is crucial for path planning and obstacle avoidance in autonomous fruit-harvesting robots. However, the intensive computational loads and manual interventions required by existing 3D reconstruction methods largely hinder their real-time application on mobile platforms. To address these issues, an end-to-end 3D reconstruction network, MT-MVSNet, is proposed for object 3D reconstruction based on Multi-View Stereo (MVS) using RGB images. Specifically, the proposed MT-MVSNet consists of a novel mobile transformer block for capturing global contextual path information, a feature fusion module with feature attention edge, and an efficient depth search strategy for both completeness enhancement and computational complexity optimization. In addition, a branch-based semantic segmentation technique is devised for precise noise filtering during depth map fusion. Extensive experiments with 100 sets of self-customized orchard fruit trees and publicly available datasets were conducted to verify the effectiveness of MT-MVSNet. Comparisons with existing methods showed that MT-MVSNet achieved an overall score of 0.312 mm on the DTU benchmark and an F-score of 54.18% on the Tanks & Temples dataset, while requiring only 3118 MB of memory at a speed of 5.68 frames per second for images of 1152 × 864 pixels. These results indicate that MT-MVSNet outperforms mainstream existing methods in terms of balanced reconstruction accuracy, processing speed and computational efficiency, making it an appropriate candidate for real-time deployment on memory-constrained mobile robots.
KW - Branch reconstruction
KW - Deep learning
KW - Mobile harvesting robot
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85213535939&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.126220
DO - 10.1016/j.eswa.2024.126220
M3 - Article
AN - SCOPUS:85213535939
SN - 0957-4174
VL - 268
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 126220
ER -