Hybrid Reinforcement Learning based on Human Preference and Advice for Efficient Robot Skill Learning

Bingqian Li, Xing Liu, Zhengxiong Liu, Panfeng Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The key to deploying robots in the real world is to design intelligent robots with a degree of autonomous skill-learning ability. Reinforcement learning (RL) is a feasible solution. However, two important challenges limit the application of RL methods in robotics: the difficulty of hand-designing reward functions and long training times. We therefore study hybrid RL methods, which use human knowledge to assist agent learning. First, we propose a reward learning method based on a human preference model for robot skill learning, which has better robustness and convergence than traditional RL with a human-designed reward. Then, we combine it with Episode-Fuzzy-COACH, our previous work, to build a hybrid RL method based on human preference and advice. In this method, the preference model is used to infer the reward function, and human advice is used to speed up the policy learning process. The method achieves efficient robot skill learning without a human-designed reward function. Its learning efficiency is shown to be 73.3% higher than that of the reward learning method that uses only the preference model.
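The abstract does not state which preference model is used. As a minimal sketch only, the following assumes a Bradley-Terry style model, a common choice in preference-based RL, with a linear reward over state features. The function names, the linear reward form, the simulated "human" labels, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import math
import random


def segment_return(weights, segment):
    """Sum of the linear reward w . s over every state in a trajectory segment."""
    return sum(sum(w * x for w, x in zip(weights, state)) for state in segment)


def preference_probability(weights, seg_a, seg_b):
    """Bradley-Terry probability that seg_a is preferred over seg_b (assumed model)."""
    diff = segment_return(weights, seg_a) - segment_return(weights, seg_b)
    diff = max(-50.0, min(50.0, diff))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-diff))


def fit_reward(comparisons, dim, lr=0.1, epochs=200):
    """Fit reward weights by stochastic gradient ascent on the preference log-likelihood.

    comparisons: list of (seg_a, seg_b, label); label is 1.0 if seg_a was
    preferred and 0.0 if seg_b was preferred.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for seg_a, seg_b, label in comparisons:
            p = preference_probability(weights, seg_a, seg_b)
            for i in range(dim):
                # Gradient of the log-likelihood w.r.t. weight i:
                # (label - p) times the difference in summed feature i.
                feat_diff = sum(s[i] for s in seg_a) - sum(s[i] for s in seg_b)
                weights[i] += lr * (label - p) * feat_diff
    return weights


# Simulated data: a hidden reward stands in for the human's judgement.
random.seed(0)
true_w = [1.0, -1.0]  # hypothetical ground truth, used only to generate labels


def random_segment(length=5, dim=2):
    """Random segment of `length` states, each with `dim` features in [-1, 1]."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(length)]


comparisons = []
for _ in range(50):
    a, b = random_segment(), random_segment()
    label = 1.0 if segment_return(true_w, a) > segment_return(true_w, b) else 0.0
    comparisons.append((a, b, label))

learned = fit_reward(comparisons, dim=2)
print("learned reward weights:", learned)
```

In a setup like this, the learned weights would replace a hand-designed reward when training the policy. The human-advice component (Episode-Fuzzy-COACH) is not sketched here, because the abstract gives no detail on how advice is applied.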

Original language: English
Title of host publication: ICARM 2024 - 2024 9th IEEE International Conference on Advanced Robotics and Mechatronics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 655-661
Number of pages: 7
ISBN (Electronic): 9798350385724
Publication status: Published - 2024
Event: 9th IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2024 - Tokyo, Japan
Duration: 8 Jul 2024 to 10 Jul 2024

Publication series

Name: ICARM 2024 - 2024 9th IEEE International Conference on Advanced Robotics and Mechatronics

Conference

Conference: 9th IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2024
Country/Territory: Japan
City: Tokyo
Period: 8/07/24 to 10/07/24
