RIS-Assisted UAV-D2D Communications Exploiting Deep Reinforcement Learning

Qian You; Qian Xu; Xin Yang; Tao Zhang; Ming Chen

doi:10.12142/ZTECOM.202302009

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Device-to-device (D2D) communications underlying cellular networks enabled by unmanned aerial vehicles (UAVs) have been regarded as promising techniques for next-generation communications. To mitigate the strong interference caused by the line-of-sight (LoS) air-to-ground channels, we deploy a reconfigurable intelligent surface (RIS) to rebuild the wireless channels. A joint optimization problem of the transmit power of the UAV, the transmit power of the D2D users, and the RIS phase configuration is investigated to maximize the achievable rate of the D2D users while satisfying the quality of service (QoS) requirement of the cellular users. Due to the high channel dynamics and the coupling among the cellular users, the RIS, and the D2D users, it is challenging to find a proper solution. Thus, a RIS softmax deep double deterministic (RIS-SD3) policy gradient method is proposed, which can smooth the optimization space as well as reduce the number of local optima. Specifically, the SD3 algorithm maximizes the reward of the agent by training the agent to maximize the value function after the softmax operator is introduced. Simulation results show that the proposed RIS-SD3 algorithm can significantly improve the rate of the D2D users while controlling the interference to the cellular user. Moreover, the proposed RIS-SD3 algorithm has better robustness than the twin delayed deep deterministic (TD3) policy gradient algorithm in a dynamic environment.
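As an illustration of the softmax operator mentioned in the abstract, the following Python sketch computes a softmax-weighted (Boltzmann) value over a set of sampled Q-values. It is not taken from the paper: the function name `softmax_value`, the temperature parameter `beta`, and the sample values are assumptions chosen for illustration, and the actual RIS-SD3 implementation may differ.

```python
import numpy as np


def softmax_value(q_values, beta=1.0):
    """Softmax-weighted average of Q-values (hypothetical helper).

    beta = 0 gives the plain mean; a large beta approaches the max.
    """
    q = np.asarray(q_values, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(beta * (q - q.max()))
    weights /= weights.sum()
    return float(np.dot(weights, q))


# Illustrative Q-value estimates for a few sampled actions (made-up numbers).
q_samples = [1.0, 2.0, 3.0]
print(softmax_value(q_samples, beta=0.0))   # equals the mean of the samples
print(softmax_value(q_samples, beta=50.0))  # close to the largest sample
```

The result always lies between the mean and the maximum of the inputs, so it acts as a smoothed stand-in for the hard max when forming a value estimate.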

Original language	English
Pages (from-to)	61-69
Number of pages	9
Journal	ZTE Communications
Volume	21
Issue number	2
DOI	https://doi.org/10.12142/ZTECOM.202302009
Publication status	Published - 13 Jun 2023

Cite this

@article{a645139d0a8749c4a21c004f1bfa8ee1,

title = "RIS-Assisted UAV-D2D Communications Exploiting Deep Reinforcement Learning",

abstract = "Device-to-device (D2D) communications underlying cellular networks enabled by unmanned aerial vehicles (UAVs) have been regarded as promising techniques for next-generation communications. To mitigate the strong interference caused by the line-of-sight (LoS) air-to-ground channels, we deploy a reconfigurable intelligent surface (RIS) to rebuild the wireless channels. A joint optimization problem of the transmit power of the UAV, the transmit power of the D2D users, and the RIS phase configuration is investigated to maximize the achievable rate of the D2D users while satisfying the quality of service (QoS) requirement of the cellular users. Due to the high channel dynamics and the coupling among the cellular users, the RIS, and the D2D users, it is challenging to find a proper solution. Thus, a RIS softmax deep double deterministic (RIS-SD3) policy gradient method is proposed, which can smooth the optimization space as well as reduce the number of local optima. Specifically, the SD3 algorithm maximizes the reward of the agent by training the agent to maximize the value function after the softmax operator is introduced. Simulation results show that the proposed RIS-SD3 algorithm can significantly improve the rate of the D2D users while controlling the interference to the cellular user. Moreover, the proposed RIS-SD3 algorithm has better robustness than the twin delayed deep deterministic (TD3) policy gradient algorithm in a dynamic environment.",

keywords = "deep reinforcement learning, device-to-device communications, reconfigurable intelligent surface, softmax deep double deterministic policy gradient",

author = "Qian You and Qian Xu and Xin Yang and Tao Zhang and Ming Chen",

year = "2023",

month = jun,

day = "13",

doi = "10.12142/ZTECOM.202302009",

language = "English",

volume = "21",

pages = "61--69",

journal = "ZTE Communications",

issn = "1673-5188",

publisher = "ZTE Communications",

number = "2",

}

TY - JOUR

T1 - RIS-Assisted UAV-D2D Communications Exploiting Deep Reinforcement Learning

AU - You, Qian

AU - Xu, Qian

AU - Yang, Xin

AU - Zhang, Tao

AU - Chen, Ming

PY - 2023/6/13

Y1 - 2023/6/13

AB - Device-to-device (D2D) communications underlying cellular networks enabled by unmanned aerial vehicles (UAVs) have been regarded as promising techniques for next-generation communications. To mitigate the strong interference caused by the line-of-sight (LoS) air-to-ground channels, we deploy a reconfigurable intelligent surface (RIS) to rebuild the wireless channels. A joint optimization problem of the transmit power of the UAV, the transmit power of the D2D users, and the RIS phase configuration is investigated to maximize the achievable rate of the D2D users while satisfying the quality of service (QoS) requirement of the cellular users. Due to the high channel dynamics and the coupling among the cellular users, the RIS, and the D2D users, it is challenging to find a proper solution. Thus, a RIS softmax deep double deterministic (RIS-SD3) policy gradient method is proposed, which can smooth the optimization space as well as reduce the number of local optima. Specifically, the SD3 algorithm maximizes the reward of the agent by training the agent to maximize the value function after the softmax operator is introduced. Simulation results show that the proposed RIS-SD3 algorithm can significantly improve the rate of the D2D users while controlling the interference to the cellular user. Moreover, the proposed RIS-SD3 algorithm has better robustness than the twin delayed deep deterministic (TD3) policy gradient algorithm in a dynamic environment.

KW - deep reinforcement learning

KW - device-to-device communications

KW - reconfigurable intelligent surface

KW - softmax deep double deterministic policy gradient

UR - http://www.scopus.com/inward/record.url?scp=85192487476&partnerID=8YFLogxK

U2 - 10.12142/ZTECOM.202302009

DO - 10.12142/ZTECOM.202302009

M3 - Article

AN - SCOPUS:85192487476

SN - 1673-5188

VL - 21

SP - 61

EP - 69

JO - ZTE Communications

JF - ZTE Communications

IS - 2

ER -
