TY - JOUR
T1 - Multi-UAV collaborative positioning based on deep reinforcement learning in denied environments
AU - Wan, Kaifang
AU - Wu, Zhilin
AU - Wu, Yunhui
AU - Qiang, Haozhi
AU - Wu, Yibo
AU - Li, Bo
N1 - Publisher Copyright:
© 2025 Chinese Society of Astronautics. All rights reserved.
PY - 2025/4/25
Y1 - 2025/4/25
N2 - In strongly adversarial scenarios, unmanned aerial vehicles (UAVs) often lose GPS because of interference, which makes it difficult to obtain their accurate positions. Since UAVs often operate in formations or clusters, this paper proposes a strategy in which UAVs within the formation measure relative spatial positions and locate one another, allowing a UAV to update its position information in real time even after GPS signal loss. First, for the GPS-denied environment, the theory of the Partially Observable Markov Decision Process (POMDP) is introduced and its elements are analyzed to establish a POMDP decision model for collaborative positioning and scheduling. A belief state update method based on the Extended Kalman Filter (EKF) and a Q-value estimation method based on the Deep Q-Network (DQN) from deep reinforcement learning are then proposed to achieve accurate collaborative real-time positioning. Application tests in different scenarios show that the proposed model can efficiently manage and schedule UAVs in formation, and can direct UAVs with working GPS to cooperatively locate UAVs whose GPS has failed, which verifies the effectiveness of the model.
AB - In strongly adversarial scenarios, unmanned aerial vehicles (UAVs) often lose GPS because of interference, which makes it difficult to obtain their accurate positions. Since UAVs often operate in formations or clusters, this paper proposes a strategy in which UAVs within the formation measure relative spatial positions and locate one another, allowing a UAV to update its position information in real time even after GPS signal loss. First, for the GPS-denied environment, the theory of the Partially Observable Markov Decision Process (POMDP) is introduced and its elements are analyzed to establish a POMDP decision model for collaborative positioning and scheduling. A belief state update method based on the Extended Kalman Filter (EKF) and a Q-value estimation method based on the Deep Q-Network (DQN) from deep reinforcement learning are then proposed to achieve accurate collaborative real-time positioning. Application tests in different scenarios show that the proposed model can efficiently manage and schedule UAVs in formation, and can direct UAVs with working GPS to cooperatively locate UAVs whose GPS has failed, which verifies the effectiveness of the model.
KW - collaborative positioning
KW - deep reinforcement learning
KW - GPS-denied
KW - Markov decision process
KW - multiple UAVs
UR - http://www.scopus.com/inward/record.url?scp=105006720752&partnerID=8YFLogxK
U2 - 10.7527/S1000-6893.2024.31024
DO - 10.7527/S1000-6893.2024.31024
M3 - Article
AN - SCOPUS:105006720752
SN - 1000-6893
VL - 46
JO - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
JF - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
IS - 8
M1 - 331024
ER -