TY - JOUR
T1 - Phase shift guided dynamic view synthesis from monocular video
AU - Zhao, Chuyue
AU - Huang, Xin
AU - Wang, Xue
AU - Zhou, Guoqing
AU - Wang, Qing
N1 - Publisher Copyright:
© 2025 Elsevier B.V.
PY - 2025/10
Y1 - 2025/10
N2 - This paper addresses the challenge of synthesizing novel views from monocular videos featuring moving objects, particularly in complex scenes with non-rigid deformations. Existing implicit representations rely on motion estimation in the spatial domain and often struggle to capture correct temporal dynamics under such conditions. To mitigate this drawback, we propose a dynamic positional encoding that represents temporal dynamics as learnable phase shifts, and we leverage an implicit neural representation (INR) network for iterative optimization. Using the optimized phase shifts as guidance enhances the representational capability of the dynamic radiance field, thereby alleviating motion ambiguity and reducing artifacts around moving objects in novel views. This paper also introduces a more rational evaluation metric, referred to as “dynamic only+”, for quantitatively assessing rendering quality in novel views, focusing on dynamic objects and the surrounding regions affected by motion. Experimental results on multiple challenging datasets demonstrate the favorable performance of the proposed approach over state-of-the-art dynamic view synthesis methods.
AB - This paper addresses the challenge of synthesizing novel views from monocular videos featuring moving objects, particularly in complex scenes with non-rigid deformations. Existing implicit representations rely on motion estimation in the spatial domain and often struggle to capture correct temporal dynamics under such conditions. To mitigate this drawback, we propose a dynamic positional encoding that represents temporal dynamics as learnable phase shifts, and we leverage an implicit neural representation (INR) network for iterative optimization. Using the optimized phase shifts as guidance enhances the representational capability of the dynamic radiance field, thereby alleviating motion ambiguity and reducing artifacts around moving objects in novel views. This paper also introduces a more rational evaluation metric, referred to as “dynamic only+”, for quantitatively assessing rendering quality in novel views, focusing on dynamic objects and the surrounding regions affected by motion. Experimental results on multiple challenging datasets demonstrate the favorable performance of the proposed approach over state-of-the-art dynamic view synthesis methods.
KW - Dynamic scene representation
KW - Learnable phase shift
KW - Monocular video
KW - Neural rendering
KW - Novel view synthesis
UR - https://www.scopus.com/pages/publications/105013561699
U2 - 10.1016/j.imavis.2025.105702
DO - 10.1016/j.imavis.2025.105702
M3 - Article
AN - SCOPUS:105013561699
SN - 0262-8856
VL - 162
JO - Image and Vision Computing
JF - Image and Vision Computing
M1 - 105702
ER -