
A Deep Reinforcement Learning-Based Whittle Index Policy for Multibeam Allocation

  • Yuhang Hao
  • Zengfu Wang
  • Jing Fu
  • Quan Pan
  • Northwestern Polytechnical University Xian
  • Royal Melbourne Institute of Technology University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

This paper proposes a non-myopic beam scheduling policy for multi-target tracking (MTT) in a phased-array radar network. The policy seeks to minimize the discounted sum of the targets' tracking errors and thereby improve long-term tracking performance. The Whittle index policy, based on the restless multi-armed bandit (RMAB) model, decomposes the state space of the underlying optimization problem into independent spaces of reduced size. We take the tracking error covariance (TEC) matrix as the state of each target (arm); it evolves according to the Kalman filter. However, for real-world MTT, computing the Whittle index exactly for multi-dimensional states is challenging. A neural network is therefore built to extract features from the TEC states and learn the corresponding Whittle index. The network is trained with deep reinforcement learning (DRL), leveraging the threshold property of the Whittle index policy and interacting with a single-target tracking environment. The resulting DRL-based Whittle index policy, named DRLWI, addresses the beam allocation problem for MTT with multi-dimensional TEC states. This approach mitigates both the exponential computational complexity of classical dynamic programming approaches and the slow convergence caused by large joint state and action spaces when DRL algorithms are applied directly. Numerical results demonstrate that the proposed DRLWI policy outperforms DRL algorithms and myopic policies.
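
The mechanics described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a hypothetical constant-velocity target with a scalar position measurement, and all numeric values (`F`, `Q`, `H`, `R`) are made up for illustration. Each target's TEC matrix is propagated by the Kalman filter covariance recursion, with the measurement update applied only when a beam is allocated. An index-based allocator then tracks the `k` targets with the largest index. The index function is passed in as a placeholder: in DRLWI it would be the learned neural network, whereas the usage note below substitutes the trace of the TEC matrix purely as a stand-in.

```python
from typing import Callable, List, Sequence, Set

Matrix = List[List[float]]

# Hypothetical model parameters (assumptions, not taken from the paper).
F: Matrix = [[1.0, 1.0], [0.0, 1.0]]   # constant-velocity state transition
Q: Matrix = [[0.1, 0.0], [0.0, 0.1]]   # process noise covariance
H: List[float] = [1.0, 0.0]            # scalar measurement of position only
R: float = 1.0                         # measurement noise variance


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def trace(a: Matrix) -> float:
    return sum(a[i][i] for i in range(len(a)))


def predict(P: Matrix) -> Matrix:
    """Kalman prediction of the covariance: F P F^T + Q."""
    FPFt = matmul(matmul(F, P), transpose(F))
    return [[x + q for x, q in zip(rp, rq)] for rp, rq in zip(FPFt, Q)]


def update(P: Matrix) -> Matrix:
    """Kalman update for a scalar measurement: P - P H^T (H P H^T + R)^-1 H P."""
    n = len(P)
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]  # P H^T
    HP = [sum(H[i] * P[i][j] for i in range(n)) for j in range(n)]   # H P
    s = sum(H[i] * PHt[i] for i in range(n)) + R                     # innovation variance
    return [[P[i][j] - PHt[i] * HP[j] / s for j in range(n)] for i in range(n)]


def step(P: Matrix, tracked: bool) -> Matrix:
    """One time step of TEC evolution; the update runs only if a beam is allocated."""
    P_pred = predict(P)
    return update(P_pred) if tracked else P_pred


def select_targets(states: Sequence[Matrix],
                   index_fn: Callable[[Matrix], float],
                   k: int) -> Set[int]:
    """Index policy: allocate the k beams to the targets with the largest index."""
    ranked = sorted(range(len(states)), key=lambda i: index_fn(states[i]), reverse=True)
    return set(ranked[:k])
```

As a usage example, `select_targets(states, trace, k)` ranks targets by the trace of their TEC matrices, and `step(P, i in chosen)` then advances each target's state. Because the index of a target depends only on that target's own state, the allocator never has to enumerate the joint state space, which is the decomposition the abstract attributes to the Whittle index policy.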

Original language: English
Title of host publication: FUSION 2024 - 27th International Conference on Information Fusion
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781737749769
DOI
Publication status: Published - 2024
Event: 27th International Conference on Information Fusion, FUSION 2024 - Venice, Italy
Duration: 7 Jul 2024 to 11 Jul 2024

Publication series

Name: FUSION 2024 - 27th International Conference on Information Fusion

Conference

Conference: 27th International Conference on Information Fusion, FUSION 2024
Country/Territory: Italy
City: Venice
Period: 7/07/24 to 11/07/24
