LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control

Delin Qu; Qizhi Chen; Pingrui Zhang; Xianqiang Gao; Bin Zhao; Zhigang Wang; Dong Wang; Xuelong Li

Research output: Contribution to journal › Conference article › Peer-reviewed

Abstract

This paper scales object-level reconstruction to complex scenes, advancing interactive scene reconstruction. We introduce two datasets, OmniSim and InterReal, featuring 28 scenes with multiple interactive objects. To tackle the challenge of inaccurate interactive motion recovery in complex scenes, we propose LiveScene, a scene-level language-embedded interactive radiance field that efficiently reconstructs and controls multiple objects. By decomposing the interactive scene into local deformable fields, LiveScene enables separate reconstruction of individual object motions, reducing memory consumption. Additionally, our interaction-aware language embedding localizes individual interactive objects, allowing for arbitrary control using natural language. Our approach demonstrates significant superiority in novel view synthesis, interactive scene control, and language grounding performance through extensive experiments. Project page: https://livescenes.github.io.

Original language	English
Journal	Advances in Neural Information Processing Systems
Volume	37
Publication status	Published - 2024
Externally published	Yes
Event	38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada. Duration: 9 Dec 2024 → 15 Dec 2024

Cite this

@article{3043b5bf5cb1452ba21d50d3ad7a213a,

title = "LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control",

abstract = "This paper scales object-level reconstruction to complex scenes, advancing interactive scene reconstruction. We introduce two datasets, OmniSim and InterReal, featuring 28 scenes with multiple interactive objects. To tackle the challenge of inaccurate interactive motion recovery in complex scenes, we propose LiveScene, a scene-level language-embedded interactive radiance field that efficiently reconstructs and controls multiple objects. By decomposing the interactive scene into local deformable fields, LiveScene enables separate reconstruction of individual object motions, reducing memory consumption. Additionally, our interaction-aware language embedding localizes individual interactive objects, allowing for arbitrary control using natural language. Our approach demonstrates significant superiority in novel view synthesis, interactive scene control, and language grounding performance through extensive experiments. Project page: https://livescenes.github.io.",

author = "Delin Qu and Qizhi Chen and Pingrui Zhang and Xianqiang Gao and Bin Zhao and Zhigang Wang and Dong Wang and Xuelong Li",

note = "Publisher Copyright: {\textcopyright} 2024 Neural information processing systems foundation. All rights reserved.; 38th Conference on Neural Information Processing Systems, NeurIPS 2024 ; Conference date: 09-12-2024 Through 15-12-2024",

year = "2024",

language = "English",

volume = "37",

journal = "Advances in Neural Information Processing Systems",

issn = "1049-5258",

publisher = "Neural information processing systems foundation",

}

TY - JOUR

T1 - LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control

T2 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024

AU - Qu, Delin

AU - Chen, Qizhi

AU - Zhang, Pingrui

AU - Gao, Xianqiang

AU - Zhao, Bin

AU - Wang, Zhigang

AU - Wang, Dong

AU - Li, Xuelong

PY - 2024

Y1 - 2024

N2 - This paper scales object-level reconstruction to complex scenes, advancing interactive scene reconstruction. We introduce two datasets, OmniSim and InterReal, featuring 28 scenes with multiple interactive objects. To tackle the challenge of inaccurate interactive motion recovery in complex scenes, we propose LiveScene, a scene-level language-embedded interactive radiance field that efficiently reconstructs and controls multiple objects. By decomposing the interactive scene into local deformable fields, LiveScene enables separate reconstruction of individual object motions, reducing memory consumption. Additionally, our interaction-aware language embedding localizes individual interactive objects, allowing for arbitrary control using natural language. Our approach demonstrates significant superiority in novel view synthesis, interactive scene control, and language grounding performance through extensive experiments. Project page: https://livescenes.github.io.

AB - This paper scales object-level reconstruction to complex scenes, advancing interactive scene reconstruction. We introduce two datasets, OmniSim and InterReal, featuring 28 scenes with multiple interactive objects. To tackle the challenge of inaccurate interactive motion recovery in complex scenes, we propose LiveScene, a scene-level language-embedded interactive radiance field that efficiently reconstructs and controls multiple objects. By decomposing the interactive scene into local deformable fields, LiveScene enables separate reconstruction of individual object motions, reducing memory consumption. Additionally, our interaction-aware language embedding localizes individual interactive objects, allowing for arbitrary control using natural language. Our approach demonstrates significant superiority in novel view synthesis, interactive scene control, and language grounding performance through extensive experiments. Project page: https://livescenes.github.io.

UR - http://www.scopus.com/inward/record.url?scp=105000466085&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:105000466085

SN - 1049-5258

VL - 37

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

Y2 - 9 December 2024 through 15 December 2024

ER -
