TY - GEN
T1 - Robust Quadrupedal Locomotion via Risk-Averse Policy Learning
AU - Shi, Jiyuan
AU - Bai, Chenjia
AU - He, Haoran
AU - Han, Lei
AU - Wang, Dong
AU - Zhao, Bin
AU - Zhao, Mingguo
AU - Li, Xiu
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The robustness of legged locomotion is crucial for quadrupedal robots in challenging terrains. Recently, Reinforcement Learning (RL) has shown promising results in legged locomotion, and various methods attempt to integrate privileged distillation, scene modeling, and external sensors to improve the generalization and robustness of locomotion policies. However, these methods struggle to handle uncertain scenarios such as abrupt terrain changes or unexpected external forces. In this paper, we consider a novel risk-sensitive perspective to enhance the robustness of legged locomotion. Specifically, we employ a distributional value function learned by quantile regression to model the aleatoric uncertainty of environments, and perform risk-averse policy learning by optimizing the worst-case scenarios via a risk distortion measure. Extensive experiments in both simulation environments and on a real Aliengo robot demonstrate that our method is efficient in handling various external disturbances, and the resulting policy exhibits improved robustness in harsh and uncertain situations in legged locomotion.
AB - The robustness of legged locomotion is crucial for quadrupedal robots in challenging terrains. Recently, Reinforcement Learning (RL) has shown promising results in legged locomotion, and various methods attempt to integrate privileged distillation, scene modeling, and external sensors to improve the generalization and robustness of locomotion policies. However, these methods struggle to handle uncertain scenarios such as abrupt terrain changes or unexpected external forces. In this paper, we consider a novel risk-sensitive perspective to enhance the robustness of legged locomotion. Specifically, we employ a distributional value function learned by quantile regression to model the aleatoric uncertainty of environments, and perform risk-averse policy learning by optimizing the worst-case scenarios via a risk distortion measure. Extensive experiments in both simulation environments and on a real Aliengo robot demonstrate that our method is efficient in handling various external disturbances, and the resulting policy exhibits improved robustness in harsh and uncertain situations in legged locomotion.
UR - http://www.scopus.com/inward/record.url?scp=85194490892&partnerID=8YFLogxK
U2 - 10.1109/ICRA57147.2024.10610086
DO - 10.1109/ICRA57147.2024.10610086
M3 - Conference contribution
AN - SCOPUS:85194490892
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 11459
EP - 11466
BT - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Y2 - 13 May 2024 through 17 May 2024
ER -