TY - JOUR
T1 - VDG
T2 - Vision-Only Dynamic Gaussian for Driving Simulation
AU - Li, Hao
AU - Li, Jingfeng
AU - Zhang, Dingwen
AU - Wu, Chenming
AU - Shi, Jieqi
AU - Zhao, Chen
AU - Feng, Haocheng
AU - Ding, Errui
AU - Wang, Jingdong
AU - Han, Junwei
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2025
Y1 - 2025
AB - Recent advances in dynamic Gaussian splatting have significantly improved scene reconstruction and novel-view synthesis. However, existing methods often rely on pre-computed camera poses and Gaussian initialization using Structure from Motion (SfM) or other costly sensors, limiting their scalability. In this letter, we propose Vision-only Dynamic Gaussian (VDG), a novel method that, for the first time, integrates self-supervised visual odometry (VO) into a pose-free dynamic Gaussian splatting framework. Because the estimated poses are not accurate enough to support self-decomposition of dynamic scenes, we design dedicated motion supervision, enabling precise static-dynamic decomposition and modeling of dynamic objects via dynamic Gaussians. Extensive experiments on urban driving datasets, including KITTI and Waymo, show that VDG consistently outperforms state-of-the-art dynamic view synthesis methods in both reconstruction accuracy and pose prediction using only image input.
KW - computer vision for transportation
KW - intelligent transportation systems
KW - Simulation and animation
UR - http://www.scopus.com/inward/record.url?scp=105003088534&partnerID=8YFLogxK
U2 - 10.1109/LRA.2025.3555938
DO - 10.1109/LRA.2025.3555938
M3 - Article
AN - SCOPUS:105003088534
SN - 2377-3766
VL - 10
SP - 5138
EP - 5145
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 5
ER -