TY - JOUR
T1 - Attention Enhanced Multi-Agent Reinforcement Learning for Cooperative Spectrum Sensing in Cognitive Radio Networks
AU - Gao, Ang
AU - Wang, Qinyu
AU - Wang, Yongze
AU - Du, Chengyuan
AU - Hu, Yansu
AU - Liang, Wei
AU - Ng, Soon Xin
N1 - Publisher Copyright:
© 1967-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Cooperative spectrum sensing (CSS) has been widely studied to enhance spectrum sharing efficiency, both spatially and temporally, in cognitive radio networks (CRNs), where secondary users (SUs) can opportunistically reuse channels already licensed to primary users (PUs) by sensing spectrum holes. By cooperating with one another, SUs gain global awareness of channel states without sweeping across the whole frequency band. Since the channel occupancy of PUs changes dynamically, accurate sensing and swift information sharing are crucial for CRNs. This paper proposes a multi-agent deep reinforcement learning (DRL) based CSS method that helps SUs efficiently find a vacant channel through cooperation with their partners. 1) Two partner selection algorithms, named Reliable Partner CSS and Adaptive Partner CSS, are proposed. The former selects partners based on the historical sensing accuracy of SUs, while the latter jointly considers both the reliability and the geographical distribution of SUs to further improve sensing accuracy. 2) Multi-agent deep deterministic policy gradient (MADDPG) is adopted to cope with the dynamically varying channel conditions as well as the high-dimensional solution space. With its 'centralized training and decentralized execution' paradigm, each SU learns to interact with the environment and select a vacant channel for transmission from its partial observation, which greatly reduces the communication overhead caused by cooperative spectrum sensing. 3) Numerical simulations demonstrate the convergence and effectiveness of the proposed algorithms. With either Reliable Partner CSS or Adaptive Partner CSS, the sensing accuracy is greatly enhanced compared with non-cooperative or centralized learning approaches. In addition, an attention mechanism is introduced into MADDPG for Adaptive Partner CSS to reveal the behavior of SUs through visualization of attention weights, which helps to partially interpret the 'black box' of DRL.
KW - Cognitive radio networks (CRNs)
KW - cooperative spectrum sensing (CSS)
KW - deep reinforcement learning
KW - multi-agent deep deterministic policy gradient
KW - multi-head attention
UR - http://www.scopus.com/inward/record.url?scp=85190172894&partnerID=8YFLogxK
U2 - 10.1109/TVT.2024.3384393
DO - 10.1109/TVT.2024.3384393
M3 - Article
AN - SCOPUS:85190172894
SN - 0018-9545
VL - 73
SP - 10464
EP - 10477
JO - IEEE Transactions on Vehicular Technology
JF - IEEE Transactions on Vehicular Technology
IS - 7
ER -