TY - JOUR
T1 - Robotic Locomotion Skill Learning Using Unsupervised Reinforcement Learning With Controllable Latent Space Partition
AU - He, Ziming
AU - Chen, Pengyu
AU - Shi, Haobin
AU - Li, Jingchen
AU - Hwang, Kao-Shing
N1 - Publisher Copyright:
© 2005-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Effective skill learning in an unsupervised manner is one of the capabilities an intelligent agent or robot should have. The discovered task-agnostic skills can be fine-tuned for downstream long-horizon tasks to improve execution efficiency. Unfortunately, the self-learning of locomotion skills, which occurs naturally in infancy, has been slow to develop in robotics. The instability of existing skill-learning methods makes them difficult to apply directly to complex control tasks, such as humanoid robots. To acquire reliable robotic locomotion skills, this article proposes a controllable latent space partition framework to assist reinforcement learning in accomplishing practicability-oriented unsupervised skill discovery (PoSD). Specifically, we use a distance similarity measure in the trajectory feature space to introduce indicative information from expert demonstrations into the partitioning and mapping of the latent space. In addition, intrinsic subrewards based on contrastive learning and particle entropy are designed to promote skill diversity and encourage exploration. Finally, reinforcement learning generates a skill-conditioned policy driven by composite intrinsic rewards. The performance of our method is evaluated on five robots with more than 15 skills. The results indicate that PoSD achieves noticeable improvements in adaptation efficiency and practicability compared with other state-of-the-art (SOTA) unsupervised skill discovery methods.
KW - Deep reinforcement learning (DRL)
KW - robotic control
KW - skill discovery
KW - unsupervised reinforcement learning (URL)
UR - http://www.scopus.com/inward/record.url?scp=85207462074&partnerID=8YFLogxK
U2 - 10.1109/TII.2024.3468453
DO - 10.1109/TII.2024.3468453
M3 - Article
AN - SCOPUS:85207462074
SN - 1551-3203
VL - 21
SP - 902
EP - 911
JO - IEEE Transactions on Industrial Informatics
JF - IEEE Transactions on Industrial Informatics
IS - 1
ER -