TY - JOUR
T1 - Energy efficient transmission in underlay CR-NOMA networks enabled by reinforcement learning
AU - Liang, Wei
AU - Ng, Soon Xin
AU - Shi, Jia
AU - Li, Lixin
AU - Wang, Dawei
N1 - Publisher Copyright:
© 2013 China Institute of Communications.
PY - 2020/12
Y1 - 2020/12
N2 - To improve the energy efficiency (EE) of underlay cognitive radio (CR) networks, a power allocation strategy based on actor-critic reinforcement learning is proposed, in which a cluster of cognitive users (CUs) can simultaneously access the same primary spectrum band under the interference constraints of the primary user (PU) by employing the non-orthogonal multiple access (NOMA) technique. In the proposed scheme, the power allocation is formulated as a non-convex optimization problem. Additionally, the power allocation for different CUs is based on the actor-critic reinforcement learning model, in which the weighted data rate is set as the reward function, and the generated action strategy (i.e., the power allocation) is iteratively criticized and updated. Both the CUs' spectral efficiency and the PU's interference constraints are considered in the training of the actor-critic reinforcement learning model. Furthermore, the first-order Taylor approximation and other manipulations are adopted to solve the power allocation optimization problem so that conventional channel conditions can be taken into account. According to the simulation results, our scheme can achieve a higher spectral efficiency for the CUs than both a benchmark scheme without a learning process and the existing Q-learning based method, while the resultant interference affecting the PU transmission is kept within a given tolerated limit.
AB - To improve the energy efficiency (EE) of underlay cognitive radio (CR) networks, a power allocation strategy based on actor-critic reinforcement learning is proposed, in which a cluster of cognitive users (CUs) can simultaneously access the same primary spectrum band under the interference constraints of the primary user (PU) by employing the non-orthogonal multiple access (NOMA) technique. In the proposed scheme, the power allocation is formulated as a non-convex optimization problem. Additionally, the power allocation for different CUs is based on the actor-critic reinforcement learning model, in which the weighted data rate is set as the reward function, and the generated action strategy (i.e., the power allocation) is iteratively criticized and updated. Both the CUs' spectral efficiency and the PU's interference constraints are considered in the training of the actor-critic reinforcement learning model. Furthermore, the first-order Taylor approximation and other manipulations are adopted to solve the power allocation optimization problem so that conventional channel conditions can be taken into account. According to the simulation results, our scheme can achieve a higher spectral efficiency for the CUs than both a benchmark scheme without a learning process and the existing Q-learning based method, while the resultant interference affecting the PU transmission is kept within a given tolerated limit.
KW - cognitive radio network
KW - non-orthogonal multiple access scheme
KW - power allocation
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85099134481&partnerID=8YFLogxK
U2 - 10.23919/JCC.2020.12.005
DO - 10.23919/JCC.2020.12.005
M3 - Article
AN - SCOPUS:85099134481
SN - 1673-5447
VL - 17
SP - 66
EP - 79
JO - China Communications
JF - China Communications
IS - 12
M1 - 9312792
ER -