Skip to main navigation Skip to search Skip to main content

GGRt: Towards Pose-Free Generalizable 3D Gaussian Splatting in Real-Time

  • Hao Li
  • , Yuanyuan Gao
  • , Chenming Wu
  • , Dingwen Zhang
  • , Yalun Dai
  • , Chen Zhao
  • , Haocheng Feng
  • , Errui Ding
  • , Jingdong Wang
  • , Junwei Han
  • Northwestern Polytechnical University Xian
  • Baidu Inc
  • Nanyang Technological University

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

5 Scopus citations

Abstract

This paper presents GGRt, a novel approach to generalizable novel view synthesis that alleviates the need for real camera poses, complexity in processing high-resolution images, and lengthy optimization processes, thus facilitating stronger applicability of 3D Gaussian Splatting (3D-GS) in real-world scenarios. Specifically, we design a novel joint learning framework that consists of an Iterative Pose Optimization Network (IPO-Net) and a Generalizable 3D-Gaussians (G-3DG) model. With the joint learning mechanism, the proposed framework can inherently estimate robust relative pose information from the image observations and thus primarily alleviate the requirement of real camera poses. Moreover, we implement a deferred back-propagation mechanism that enables high-resolution training and inference, overcoming the resolution constraints of previous methods. To enhance the speed and efficiency, we further introduce a progressive Gaussian cache module that dynamically adjusts during training and inference. As the first pose-free generalizable 3D-GS framework, GGRt achieves inference at ≥5 FPS and real-time rendering at ≥100 FPS. Through extensive experimentation, we demonstrate that our method outperforms existing NeRF-based pose-free techniques in terms of inference speed and effectiveness. It can also approach the real pose-based 3D-GS methods. Our contributions provide a significant leap forward for the integration of computer vision and computer graphics into practical applications, offering state-of-the-art results on LLFF, KITTI, and Waymo Open datasets and enabling real-time rendering for immersive experiences. Project page: https://3d-aigc.github.io/GGRt.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2024 - 18th European Conference, Proceedings
EditorsAleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol
PublisherSpringer Science and Business Media Deutschland GmbH
Pages325-341
Number of pages17
ISBN (Print)9783031732089
DOIs
StatePublished - 2025
Event18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sep 20244 Oct 2024

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume15129 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference18th European Conference on Computer Vision, ECCV 2024
Country/TerritoryItaly
CityMilan
Period29/09/244/10/24

Keywords

  • Generalizable 3D-GS
  • Pose-Free
  • Real-time Rendering

Fingerprint

Dive into the research topics of 'GGRt: Towards Pose-Free Generalizable 3D Gaussian Splatting in Real-Time'. Together they form a unique fingerprint.

Cite this