Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs

  • Wenke Xia
  • Dong Wang
  • Xincheng Pang
  • Zhigang Wang
  • Bin Zhao
  • Di Hu
  • Xuelong Li
  • Gaoling School of Artificial Intelligence
  • Shanghai Artificial Intelligence Laboratory
  • China Telecommunications

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

16 citations (Scopus)

Abstract

Generalizable articulated object manipulation is essential for home-assistant robots. Recent efforts focus on imitation learning from demonstrations or reinforcement learning in simulation. However, due to the prohibitive costs of real-world data collection and precise object simulation, it remains challenging for these approaches to achieve broad adaptability across diverse articulated objects. Recently, many works have tried to use the strong in-context learning ability of Large Language Models (LLMs) to achieve generalizable robotic manipulation, but most of this research focuses on high-level task planning and sidelines low-level robotic control. In this work, building on the idea that the kinematic structure of an object determines how we can manipulate it, we propose a kinematic-aware prompting framework that prompts LLMs with kinematic knowledge of objects to generate low-level motion trajectory waypoints, supporting the manipulation of various objects. To effectively prompt LLMs with the kinematic structure of different objects, we design a unified kinematic knowledge parser, which represents various articulated objects as a unified textual description containing kinematic joints and contact location. Building upon this unified description, a kinematic-aware planner model is proposed to generate precise 3D manipulation waypoints via a designed kinematic-aware chain-of-thoughts prompting method. Our evaluation spans 48 instances across 16 distinct categories and shows that our framework not only outperforms traditional methods on 8 seen categories but also has a powerful zero-shot capability on 8 unseen articulated object categories with only 17 demonstrations. Moreover, real-world experiments on 7 different object categories demonstrate our framework's adaptability in practical scenarios. Code is released at https://github.com/GeWu-Lab/LLM-articulated-object-manipulation.
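
To make the idea of a unified textual description more concrete, the following is a minimal Python sketch of how an articulated object's joints and contact location might be serialized into text and placed in a prompt. It is an illustration only: the names Joint, ArticulatedObject, describe, and build_prompt, the field layout, and the prompt wording are all hypothetical and are not taken from the paper or its released code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Joint:
    """Hypothetical joint record; the fields are assumptions, not the paper's schema."""
    name: str
    joint_type: str  # e.g. "revolute" or "prismatic"
    axis: Vec3       # direction of the joint axis
    origin: Vec3     # a point on the joint axis


@dataclass
class ArticulatedObject:
    """Hypothetical container for the kinematic knowledge of one object."""
    category: str
    joints: List[Joint]
    contact_location: Vec3


def fmt(v: Vec3) -> str:
    """Format a 3D vector with two decimals so the text stays compact."""
    return "[" + ", ".join(f"{x:.2f}" for x in v) + "]"


def describe(obj: ArticulatedObject) -> str:
    """Serialize the object into one textual description, one line per fact."""
    lines = [f"object: {obj.category}"]
    for j in obj.joints:
        lines.append(
            f"joint {j.name}: type={j.joint_type}, "
            f"axis={fmt(j.axis)}, origin={fmt(j.origin)}"
        )
    lines.append(f"contact location: {fmt(obj.contact_location)}")
    return "\n".join(lines)


def build_prompt(obj: ArticulatedObject, instruction: str) -> str:
    """Combine the description with a task instruction (assumed prompt wording)."""
    return (
        describe(obj)
        + f"\ninstruction: {instruction}\n"
        + "Reason step by step about the joint motion, then output 3D waypoints."
    )


# Example object with made-up geometry: a cabinet door on a vertical hinge.
cabinet = ArticulatedObject(
    category="cabinet",
    joints=[Joint("door_hinge", "revolute", (0.0, 0.0, 1.0), (0.4, 0.3, 0.0))],
    contact_location=(0.4, -0.2, 0.5),
)
prompt = build_prompt(cabinet, "open the door")
print(prompt)
```

The resulting string would be sent to an LLM of the reader's choice; how the model's reply is parsed into waypoints is left out because the abstract does not specify it.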

Original language: English
Title of host publication: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2073-2080
Number of pages: 8
ISBN (electronic): 9798350384574
DOI
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan
Duration: 13 May 2024 to 17 May 2024

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (print): 1050-4729

Conference

Conference: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Country/Territory: Japan
City: Yokohama
Period: 13/05/24 to 17/05/24
