TY - JOUR
T1 - Cooperative defense of autonomous surface vessels with quantity disadvantage using behavior cloning and deep reinforcement learning
AU - Sun, Siqing
AU - Li, Tianbo
AU - Chen, Xiao
AU - Dong, Huachao
AU - Wang, Xinjing
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/10
Y1 - 2024/10
N2 - Autonomous Surface Vessels (ASVs) excel at undertaking hazardous tasks and have garnered significant attention recently. In particular, ASV cooperative defense is a crucial application for protecting harbors and combating smugglers, in which ASVs intercept intruders before they reach a protected region. Unlike most research, which assumes defenders with a numerical advantage, this work considers a more practical defense mission with fewer defenders, possible damage to defenders, and intruders employing evasion strategies. These conditions introduce additional interception challenges, including underactuated ASV dynamics, a limited interception time window, and environmental nonstationarity, so directly applying existing defense methods to such missions may not succeed. To surmount these challenges, we propose an ASV decision-making framework that integrates supervised learning and deep reinforcement learning. Initially, supervised learning uses actions from a bi-level controller to train the ASVs, addressing underactuated dynamics and aiding policy convergence. Subsequently, deep reinforcement learning explores more effective policies to improve interception rates. Furthermore, hybrid rewards are carefully designed to drive policy optimization while mitigating environmental nonstationarity. Finally, numerical simulations verify the effectiveness of the proposed approach.
AB - Autonomous Surface Vessels (ASVs) excel at undertaking hazardous tasks and have garnered significant attention recently. In particular, ASV cooperative defense is a crucial application for protecting harbors and combating smugglers, in which ASVs intercept intruders before they reach a protected region. Unlike most research, which assumes defenders with a numerical advantage, this work considers a more practical defense mission with fewer defenders, possible damage to defenders, and intruders employing evasion strategies. These conditions introduce additional interception challenges, including underactuated ASV dynamics, a limited interception time window, and environmental nonstationarity, so directly applying existing defense methods to such missions may not succeed. To surmount these challenges, we propose an ASV decision-making framework that integrates supervised learning and deep reinforcement learning. Initially, supervised learning uses actions from a bi-level controller to train the ASVs, addressing underactuated dynamics and aiding policy convergence. Subsequently, deep reinforcement learning explores more effective policies to improve interception rates. Furthermore, hybrid rewards are carefully designed to drive policy optimization while mitigating environmental nonstationarity. Finally, numerical simulations verify the effectiveness of the proposed approach.
KW - Autonomous surface vessels
KW - Behavior cloning
KW - Cooperative defense
KW - Deep reinforcement learning
KW - Reward design
UR - http://www.scopus.com/inward/record.url?scp=85198558605&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2024.111968
DO - 10.1016/j.asoc.2024.111968
M3 - Article
AN - SCOPUS:85198558605
SN - 1568-4946
VL - 164
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 111968
ER -