
Linear Attention-Driven DRL with Sparse Expert Fusion: A Dynamic Optimization Algorithm of Underwater Manipulator for Long Horizon Tasks

  • Northwestern Polytechnical University Xian

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Deep Reinforcement Learning (DRL), leveraging the nonlinear approximation capabilities of deep neural networks, maps high-dimensional perception information to robotic control commands and has been widely applied to various continuous control tasks. However, for long-horizon tasks, manipulators often face challenges such as high-dimensional exploration spaces and sparse rewards, making it difficult to learn effective strategies and even leading to potentially dangerous actions. Additionally, collecting a large volume of high-quality expert demonstrations for underwater manipulators is challenging. To overcome these limitations, this study proposes a DRL algorithm that integrates a small amount of expert experience for long-horizon control of underwater manipulators. Firstly, a DRL dynamic optimization strategy based on expert experience is designed, featuring a dual-buffer dynamic sampling mechanism that enables efficient early-stage learning. Secondly, a linear attention mechanism network is developed to aggregate global task features while maintaining low computational complexity, allowing the model to effectively process high-dimensional sensory inputs and share features across multiple tasks. Furthermore, a staged reward function is designed to steadily learn skills for each phase, ultimately completing the entire long-horizon task. To validate the effectiveness of the proposed algorithm, an underwater simulation environment is constructed in Gazebo. In this simulation environment, the manipulator is trained on various long-horizon tasks, including grasp-place, grasp-stack, and grasp-insert. Training results demonstrate that the proposed algorithm outperforms existing methods in terms of task success rate and policy stability. Additionally, real-world experiments in a water tank environment verify the algorithm's generalization and robustness in practical underwater operations. 
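The abstract does not give the architecture of the linear attention network, so the following is only a minimal sketch of generic kernelized linear attention, assuming the commonly used `elu(x) + 1` feature map. The function names and shapes are illustrative assumptions, not the authors' implementation. It shows why the cost stays linear in the number of tokens: keys and values are summarized once into a small `d x d_v` matrix instead of forming an `n x n` attention matrix.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map: elu(x) + 1 (not taken from the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v, eps=1e-6):
    """Kernelized attention in O(n * d * d_v) time.

    q, k: arrays of shape (n, d); v: array of shape (n, d_v).
    Returns an array of shape (n, d_v).
    """
    phi_q = feature_map(q)
    phi_k = feature_map(k)
    kv = phi_k.T @ v          # (d, d_v) global summary of keys and values
    z = phi_k.sum(axis=0)     # (d,) normalizer summary
    num = phi_q @ kv          # (n, d_v)
    den = phi_q @ z + eps     # (n,)
    return num / den[:, None]


def quadratic_reference(q, k, v, eps=1e-6):
    """Same result computed the O(n^2) way, for checking only."""
    scores = feature_map(q) @ feature_map(k).T   # (n, n)
    return (scores @ v) / (scores.sum(axis=1, keepdims=True) + eps)
```

Both functions compute the same quantity; only the order of the matrix products differs, which is where the saving comes from when the number of tokens `n` is much larger than the feature dimension `d`.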
Note to Practitioners - Underwater robotic manipulation is increasingly vital in complex applications such as offshore maintenance, deep-sea archaeology, and marine resource extraction. However, underwater manipulators often suffer from limited data, sparse rewards, and difficulty in executing multi-stage tasks with precision. This paper proposes a practical solution for enabling underwater manipulators to autonomously complete long-horizon tasks (e.g., grasping, stacking, inserting) through a Deep Reinforcement Learning (DRL) framework enhanced by expert demonstrations and a novel linear attention mechanism. The integration of a dynamic dual-buffer sampling strategy ensures sample efficiency and learning robustness, while the attention-based subtask sharing improves generalization across tasks. Practitioners can apply this method directly in Gazebo-based simulations or real-world underwater operations using conventional 6-DoF manipulators and depth cameras. The trained model demonstrates strong transferability without the need for task-specific fine-tuning, simplifying deployment in unpredictable marine environments. This work bridges the gap between high-level decision-making and low-level control, offering an adaptable and scalable learning pipeline for real-time underwater robotic applications.
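The record does not specify how the dual-buffer dynamic sampling adjusts its mix, so the sketch below is one plausible reading rather than the published method. It assumes a fixed expert buffer, a bounded agent buffer, and a linearly decaying expert share per batch. The class name `DualBufferSampler`, the parameters `start_ratio`, `end_ratio`, and `decay_steps`, and the linear schedule are all hypothetical.

```python
import random
from collections import deque


class DualBufferSampler:
    """Mixes expert and agent transitions with a step-dependent ratio.

    Hypothetical illustration: the linear decay schedule is an assumption.
    """

    def __init__(self, capacity=100_000, start_ratio=0.5, end_ratio=0.1,
                 decay_steps=10_000, seed=None):
        self.expert = []                      # small, fixed set of demonstrations
        self.agent = deque(maxlen=capacity)   # online experience, oldest dropped first
        self.start_ratio = start_ratio
        self.end_ratio = end_ratio
        self.decay_steps = decay_steps
        self.rng = random.Random(seed)

    def expert_ratio(self, step):
        # Linear interpolation from start_ratio to end_ratio, then held constant.
        frac = min(max(step / self.decay_steps, 0.0), 1.0)
        return self.start_ratio + frac * (self.end_ratio - self.start_ratio)

    def sample(self, batch_size, step):
        if not self.expert and not self.agent:
            raise ValueError("both buffers are empty")
        n_expert = int(round(batch_size * self.expert_ratio(step)))
        if not self.agent:        # before any rollouts: demonstrations only
            n_expert = batch_size
        if not self.expert:       # no demonstrations available
            n_expert = 0
        n_agent = batch_size - n_expert
        batch = []
        if n_expert:
            batch += self.rng.choices(self.expert, k=n_expert)
        if n_agent:
            batch += self.rng.choices(list(self.agent), k=n_agent)
        self.rng.shuffle(batch)
        return batch
```

Under these assumptions, early batches lean on demonstrations to offset sparse rewards, and later batches are dominated by the agent's own experience. Sampling is with replacement, so a small expert buffer can still fill its share of each batch.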

Original language: English
Pages (from-to): 9695-9708
Number of pages: 14
Journal: IEEE Transactions on Automation Science and Engineering
Volume: 23
Publication status: Published - 2026
