TY - JOUR
T1 - Spatial-temporal context-aware network for 3D-Craft generation
AU - Ji, Ruyi
AU - Wang, Qunbo
AU - Wang, Boying
AU - Zhang, Hangu
AU - Zhang, Wentao
AU - Dai, Lin
AU - Wang, Yanni
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025/5
Y1 - 2025/5
N2 - The generative modeling of 3D objects in the real world is an interesting but challenging task commonly constrained by process and order. Most existing methods focus on spatial relations to address this issue, neglecting the rich information in temporal sequences. To close this gap, we propose a spatial-temporal context-aware network that predicts ordered actions for 3D object construction. Specifically, our approach consists mainly of two modules: the spatial-context module and the temporal-context module. The spatial-context module is designed to learn the physical constraints in 3D object construction, such as spatial constraints and gravity. Meanwhile, the temporal-context module integrates the temporal context of the action history on the fly for more accurate predictions. The features of the two modules are then merged to produce the final prediction of the next action’s position and block type. The entire model is optimized end to end with stochastic gradient descent (SGD). Extensive experiments on the 3D-Craft dataset demonstrate that the proposed method surpasses state-of-the-art methods by a large margin, improving ACC@1, ACC@5, and ACC@10 by 4.5%, 3.3%, and 4.1% absolute, respectively. Moreover, comprehensive ablation studies and analysis further validate the effectiveness of the proposed method.
AB - The generative modeling of 3D objects in the real world is an interesting but challenging task commonly constrained by process and order. Most existing methods focus on spatial relations to address this issue, neglecting the rich information in temporal sequences. To close this gap, we propose a spatial-temporal context-aware network that predicts ordered actions for 3D object construction. Specifically, our approach consists mainly of two modules: the spatial-context module and the temporal-context module. The spatial-context module is designed to learn the physical constraints in 3D object construction, such as spatial constraints and gravity. Meanwhile, the temporal-context module integrates the temporal context of the action history on the fly for more accurate predictions. The features of the two modules are then merged to produce the final prediction of the next action’s position and block type. The entire model is optimized end to end with stochastic gradient descent (SGD). Extensive experiments on the 3D-Craft dataset demonstrate that the proposed method surpasses state-of-the-art methods by a large margin, improving ACC@1, ACC@5, and ACC@10 by 4.5%, 3.3%, and 4.1% absolute, respectively. Moreover, comprehensive ablation studies and analysis further validate the effectiveness of the proposed method.
KW - 3D object
KW - 3D-Craft generation
KW - Graph neural network
KW - Spatial-temporal context
UR - http://www.scopus.com/inward/record.url?scp=105001012638&partnerID=8YFLogxK
U2 - 10.1007/s10489-025-06468-4
DO - 10.1007/s10489-025-06468-4
M3 - Article
AN - SCOPUS:105001012638
SN - 0924-669X
VL - 55
JO - Applied Intelligence
JF - Applied Intelligence
IS - 7
M1 - 579
ER -