TY - JOUR
T1 - Lightweight Obstacle Avoidance for Fixed-Wing UAVs Using Entropy-Aware PPO
AU - Su, Meimei
AU - Chai, Haochen
AU - Zhao, Chunhui
AU - Lyu, Yang
AU - Hu, Jinwen
N1 - Publisher Copyright:
© 2025 by the authors.
PY - 2025/9
Y1 - 2025/9
N2 - Obstacle avoidance during high-speed, low-altitude flight remains a significant challenge for unmanned aerial vehicles (UAVs), particularly in unfamiliar environments where prior maps and heavy onboard sensors are unavailable. To address this, we present an entropy-aware deep reinforcement learning framework that enables fixed-wing UAVs to navigate safely using only monocular onboard cameras. Our system features a lightweight, single-frame depth estimation module optimized for real-time execution on edge computing platforms, followed by a reinforcement learning controller equipped with a novel reward function that balances goal-reaching performance with path smoothness under fixed-wing dynamic constraints. To enhance policy optimization, we incorporate high-quality experiences from the replay buffer into the gradient computation, introducing a soft imitation mechanism that encourages the agent to align its behavior with previously successful actions. To further balance exploration and exploitation, we integrate an adaptive entropy regularization mechanism into the Proximal Policy Optimization (PPO) algorithm. This module dynamically adjusts policy entropy during training, leading to improved stability, faster convergence, and better generalization to unseen scenarios. Extensive software-in-the-loop (SITL) and hardware-in-the-loop (HITL) experiments demonstrate that our approach outperforms baseline methods in obstacle avoidance success rate and path quality, while remaining lightweight and deployable on resource-constrained aerial platforms.
AB - Obstacle avoidance during high-speed, low-altitude flight remains a significant challenge for unmanned aerial vehicles (UAVs), particularly in unfamiliar environments where prior maps and heavy onboard sensors are unavailable. To address this, we present an entropy-aware deep reinforcement learning framework that enables fixed-wing UAVs to navigate safely using only monocular onboard cameras. Our system features a lightweight, single-frame depth estimation module optimized for real-time execution on edge computing platforms, followed by a reinforcement learning controller equipped with a novel reward function that balances goal-reaching performance with path smoothness under fixed-wing dynamic constraints. To enhance policy optimization, we incorporate high-quality experiences from the replay buffer into the gradient computation, introducing a soft imitation mechanism that encourages the agent to align its behavior with previously successful actions. To further balance exploration and exploitation, we integrate an adaptive entropy regularization mechanism into the Proximal Policy Optimization (PPO) algorithm. This module dynamically adjusts policy entropy during training, leading to improved stability, faster convergence, and better generalization to unseen scenarios. Extensive software-in-the-loop (SITL) and hardware-in-the-loop (HITL) experiments demonstrate that our approach outperforms baseline methods in obstacle avoidance success rate and path quality, while remaining lightweight and deployable on resource-constrained aerial platforms.
KW - collision avoidance
KW - deep reinforcement learning
KW - depth estimation
KW - monocular vision
KW - navigation
UR - https://www.scopus.com/pages/publications/105017422263
U2 - 10.3390/drones9090598
DO - 10.3390/drones9090598
M3 - Article
AN - SCOPUS:105017422263
SN - 2504-446X
VL - 9
JO - Drones
JF - Drones
IS - 9
M1 - 598
ER -