
Utilizing Large Language Models for Robot Skill Reward Shaping in Reinforcement Learning

  • Northwestern Polytechnical University, Xi'an
  • Beijing Institute of Tracking and Telecommunications Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

In this paper, we examine the integration of LLMs into the design of reward functions for reinforcement learning (RL) to enhance robotic applications with minimal human input. In RL, the reward function is pivotal: it guides the agent's learning trajectory by evaluating the desirability of behaviors within specific environments. Traditional reward functions are often sparse, leading to slow convergence because agents require extensive interactions to learn effectively. By leveraging an LLM's ability to generate code from task semantics, we propose a new method that reduces the complexity of reward design, allowing even non-experts to create effective reward policies from semantic prompts. We use the Soft Actor-Critic (SAC) algorithm, known for its efficiency and stability, to train agents under these conditions. To validate the efficacy of our method, we compare it with traditional techniques such as Trajectory-ranked Reward Extrapolation (T-REX). Our findings indicate that LLM-generated rewards enable quicker convergence and are as effective as those crafted through conventional methods, demonstrating the potential of LLMs to revolutionize reward shaping in RL. Furthermore, we transferred the robot door-opening task from the simulation environment to a real robot, achieving sim-to-real transfer. This approach allows for the rapid deployment of robotic systems, making sophisticated robotics technology more accessible and feasible for a wider range of applications. This study underscores the transformative impact of integrating advanced language models into robotics and RL, opening new avenues for future research and application.
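The abstract does not reproduce any of the generated reward code. As an illustrative sketch only, the following shows the kind of dense, code-level reward function an LLM might produce from a semantic prompt for the door-opening task; the observation keys, goal angle, and weighting here are hypothetical assumptions, not taken from the paper.

```python
import numpy as np

def door_opening_reward(obs):
    """Hypothetical dense reward an LLM might generate from the prompt
    'reward the robot for reaching the handle and opening the door'.
    Assumed observation layout: gripper position, handle position,
    and door hinge angle in radians."""
    gripper_pos = np.asarray(obs["gripper_pos"], dtype=float)
    handle_pos = np.asarray(obs["handle_pos"], dtype=float)
    door_angle = float(obs["door_angle"])

    # Dense shaping term: encourage the gripper to approach the handle.
    reach_reward = -np.linalg.norm(gripper_pos - handle_pos)

    # Task term: reward progress toward a (assumed) 90-degree open goal.
    open_reward = 10.0 * min(door_angle / (np.pi / 2), 1.0)

    return reach_reward + open_reward
```

A dense reward of this form replaces the sparse "success/failure" signal, so an off-the-shelf SAC implementation can receive informative gradients from the first interactions onward.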

Original language: English
Title of host publication: Intelligent Robotics and Applications - 17th International Conference, ICIRA 2024, Proceedings
Editors: Xuguang Lan, Xuesong Mei, Caigui Jiang, Fei Zhao, Zhiqiang Tian
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 3-17
Number of pages: 15
ISBN (Print): 9789819607822
DOI
Publication status: Published - 2025
Event: 17th International Conference on Intelligent Robotics and Applications, ICIRA 2024 - Xi'an, China
Duration: 31 Jul 2024 → 2 Aug 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15208 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Intelligent Robotics and Applications, ICIRA 2024
Country/Territory: China
City: Xi'an
Period: 31/07/24 → 2/08/24
