Abstract
Optimal multi-target tracking in colocated multiple-input multiple-output radar systems with limited beam resources has been widely studied over the past decades but, in general, remains an open problem because of its high complexity. Here, we aim to minimize a measure of the overall error covariance of the target kinematic state estimates by appropriately allocating the beam resources among the targets. We model the beam scheduling problem as a restless multi-armed bandit problem, which is PSPACE-hard in general, with the objective of minimizing the expected total discounted cost over an infinite time horizon. We improve upon the Whittle relaxation technique by proposing a more stringent method to decompose the correlated restless bandit processes. This leads to a relaxed version of the original optimization problem with a tighter performance bound than the Whittle relaxation. Moreover, unlike the Lagrangian dynamic program, which attaches an independent Lagrangian multiplier to each decision epoch and is inapplicable to infinite-horizon objectives, our method trades off the number of Lagrangian multipliers against the tightness of the relaxation. The proposed method allows different relaxation levels to be exploited and results in a more efficient and effective policy. Numerical experiments demonstrate the effectiveness of the proposed policy.
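
For orientation, the classical restless-bandit formulation and its Whittle relaxation, which the abstract takes as its starting point, can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: $N$ targets, $M$ beams per decision epoch, discount factor $\beta \in (0,1)$, state $s_n(t)$ of target $n$, binary action $a_n(t)$ (1 if a beam is assigned), per-target cost $c_n$, and a single Lagrangian multiplier $\nu$. The paper's own decomposition is not reproduced here.

```latex
% Original problem: the beam budget must hold at every decision epoch.
\begin{align}
  \min_{\pi}\;& \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{\infty} \beta^{t}
      \sum_{n=1}^{N} c_n\bigl(s_n(t), a_n(t)\bigr)\right] \\
  \text{s.t.}\;& \sum_{n=1}^{N} a_n(t) = M
      \quad \text{for all } t = 0, 1, 2, \dots
\end{align}

% Whittle relaxation: the budget is only required to hold in
% expected discounted total, so one constraint replaces infinitely many.
\begin{equation}
  \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{\infty} \beta^{t}
      \sum_{n=1}^{N} a_n(t)\right] = \frac{M}{1-\beta}
\end{equation}

% Dualizing that single constraint with multiplier \nu decouples the
% problem into N independent single-target problems.
\begin{equation}
  L(\nu) = \min_{\pi}\, \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{\infty} \beta^{t}
      \sum_{n=1}^{N} \Bigl( c_n\bigl(s_n(t), a_n(t)\bigr) + \nu\, a_n(t) \Bigr)\right]
      - \frac{\nu M}{1-\beta}
\end{equation}
```

Under this reading, the Whittle relaxation uses one multiplier for the whole horizon, while a per-epoch Lagrangian scheme would need one multiplier for every $t$; the trade-off described in the abstract lies between these two extremes.
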
Original language | English |
---|---|
Journal | IEEE Transactions on Aerospace and Electronic Systems |
DOI | |
Publication status | Accepted/In press - 2025 |