TY - GEN
T1 - Deep Reinforcement Learning-based End-to-End Navigation of Mobile Robots With Reward Shaping
AU - Li, Yufeng
AU - Gao, Jian
AU - Chen, Yimin
AU - He, Yaozhen
AU - Min, Boxu
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This paper proposes an end-to-end autonomous navigation algorithm for unknown environments based on deep reinforcement learning (DRL), which maps the lidar data collected by the robot into control commands. The proposed LM-TD3 algorithm uses the Twin Delayed Deep Deterministic (TD3) policy gradient network as the backbone to generate robot action control in continuous spaces. On this basis, a Long Short-Term Memory (LSTM) neural network is introduced into the actor and critic networks, allowing the model to store long-term navigation experiences and improving its ability to perceive and handle surrounding obstacles. Furthermore, a novel DRL reward function is designed to smooth the motion pose of the robot while controlling it to achieve target tracking. Finally, to improve the early learning efficiency of the DRL network, a Hindsight Experience Replay (HER) strategy is designed specifically for the autonomous navigation system, accelerating the convergence of the algorithm. To validate the effectiveness of the LM-TD3 algorithm in simulation experiments, scenarios of varying complexity are designed to verify its navigation ability. Compared with the TD3 algorithm, the proposed LM-TD3 method generates shorter paths with enhanced obstacle avoidance capabilities while maintaining more stable robot posture control.
AB - This paper proposes an end-to-end autonomous navigation algorithm for unknown environments based on deep reinforcement learning (DRL), which maps the lidar data collected by the robot into control commands. The proposed LM-TD3 algorithm uses the Twin Delayed Deep Deterministic (TD3) policy gradient network as the backbone to generate robot action control in continuous spaces. On this basis, a Long Short-Term Memory (LSTM) neural network is introduced into the actor and critic networks, allowing the model to store long-term navigation experiences and improving its ability to perceive and handle surrounding obstacles. Furthermore, a novel DRL reward function is designed to smooth the motion pose of the robot while controlling it to achieve target tracking. Finally, to improve the early learning efficiency of the DRL network, a Hindsight Experience Replay (HER) strategy is designed specifically for the autonomous navigation system, accelerating the convergence of the algorithm. To validate the effectiveness of the LM-TD3 algorithm in simulation experiments, scenarios of varying complexity are designed to verify its navigation ability. Compared with the TD3 algorithm, the proposed LM-TD3 method generates shorter paths with enhanced obstacle avoidance capabilities while maintaining more stable robot posture control.
KW - Autonomous navigation
KW - Deep reinforcement learning
KW - Mobile robot
KW - Robot pose
UR - http://www.scopus.com/inward/record.url?scp=85215534693&partnerID=8YFLogxK
U2 - 10.1109/INDIN58382.2024.10774473
DO - 10.1109/INDIN58382.2024.10774473
M3 - Conference contribution
AN - SCOPUS:85215534693
T3 - IEEE International Conference on Industrial Informatics (INDIN)
BT - Proceedings - 2024 IEEE 22nd International Conference on Industrial Informatics, INDIN 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd IEEE International Conference on Industrial Informatics, INDIN 2024
Y2 - 18 August 2024 through 20 August 2024
ER -