TY - GEN
T1 - A Dynamic Power Allocation Scheme in Power-Domain NOMA using Actor-Critic Reinforcement Learning
AU - Zhang, Shaomin
AU - Li, Lixin
AU - Yin, Jiaying
AU - Liang, Wei
AU - Li, Xu
AU - Chen, Wei
AU - Han, Zhu
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Non-orthogonal multiple access (NOMA) is one of the most promising technologies for next-generation cellular communication. However, effective power allocation remains an open problem in power-domain NOMA. In this paper, we propose a reinforcement learning (RL) method to solve the power allocation problem. In particular, in power-domain NOMA, the base station (BS) transmits data to all users simultaneously under a sum-power constraint. Since the power allocated by the BS to each user can be tuned to optimize the energy efficiency (EE) of the entire system, we propose an Actor-Critic RL framework to dynamically select the power allocation coefficients. A parameterized policy is constructed in the Actor part, the Critic part evaluates it, and the Actor part then adjusts the policy according to the Critic's feedback. Numerical results indicate that the proposed scheme efficiently improves the EE of the entire system.
AB - Non-orthogonal multiple access (NOMA) is one of the most promising technologies for next-generation cellular communication. However, effective power allocation remains an open problem in power-domain NOMA. In this paper, we propose a reinforcement learning (RL) method to solve the power allocation problem. In particular, in power-domain NOMA, the base station (BS) transmits data to all users simultaneously under a sum-power constraint. Since the power allocated by the BS to each user can be tuned to optimize the energy efficiency (EE) of the entire system, we propose an Actor-Critic RL framework to dynamically select the power allocation coefficients. A parameterized policy is constructed in the Actor part, the Critic part evaluates it, and the Actor part then adjusts the policy according to the Critic's feedback. Numerical results indicate that the proposed scheme efficiently improves the EE of the entire system.
KW - Actor-Critic
KW - energy efficiency
KW - NOMA
KW - power allocation
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85063085424&partnerID=8YFLogxK
U2 - 10.1109/ICCChina.2018.8641248
DO - 10.1109/ICCChina.2018.8641248
M3 - Conference contribution
AN - SCOPUS:85063085424
T3 - 2018 IEEE/CIC International Conference on Communications in China, ICCC 2018
SP - 719
EP - 723
BT - 2018 IEEE/CIC International Conference on Communications in China, ICCC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE/CIC International Conference on Communications in China, ICCC 2018
Y2 - 16 August 2018 through 18 August 2018
ER -