Cooperative defense of autonomous surface vessels with quantity disadvantage using behavior cloning and deep reinforcement learning

Siqing Sun; Tianbo Li; Xiao Chen; Huachao Dong; Xinjing Wang

doi:10.1016/j.asoc.2024.111968

School of Marine Science and Technology

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Autonomous Surface Vessels (ASVs) excel at undertaking hazardous tasks, garnering significant attention recently. Particularly, ASV cooperative defense is a crucial application for protecting harbors and combating smugglers. Here, ASVs intercept intruders from reaching a protected region. Unlike most research, which assumes defenders with numerical advantages, this work considers a more practical defense mission with fewer defenders, defender damages, and intruders employing evasion strategies. However, interception challenges are also introduced, including ASV underactuated dynamics, a limited interception time window, and environmental nonstationarity. Directly applying existing defense methods to such missions may not achieve success. To surmount the challenges, we propose an ASV decision-making framework by integrating supervised learning and deep reinforcement learning. Initially, supervised learning uses actions from a bi-level controller to train ASVs, addressing underactuated dynamics and aiding policy convergence. Subsequently, deep reinforcement learning explores more effective policies to enhance interception rates. Furthermore, hybrid rewards are meticulously designed to drive policy optimizations while mitigating environmental nonstationarity. Finally, numerical simulations are carried out to verify the effectiveness of our approach.

Original language	English
Article number	111968
Journal	Applied Soft Computing
Volume	164
DOIs	https://doi.org/10.1016/j.asoc.2024.111968
State	Published - Oct 2024

Keywords

Autonomous surface vessels
Behavior cloning
Cooperative defense
Deep reinforcement learning
Reward design

Cite this

@article{f87d2791a98f477a8feeda68fcceaedb,
title = "Cooperative defense of autonomous surface vessels with quantity disadvantage using behavior cloning and deep reinforcement learning",
abstract = "Autonomous Surface Vessels (ASVs) excel at undertaking hazardous tasks, garnering significant attention recently. Particularly, ASV cooperative defense is a crucial application for protecting harbors and combating smugglers. Here, ASVs intercept intruders from reaching a protected region. Unlike most research, which assumes defenders with numerical advantages, this work considers a more practical defense mission with fewer defenders, defender damages, and intruders employing evasion strategies. However, interception challenges are also introduced, including ASV underactuated dynamics, a limited interception time window, and environmental nonstationarity. Directly applying existing defense methods to such missions may not achieve success. To surmount the challenges, we propose an ASV decision-making framework by integrating supervised learning and deep reinforcement learning. Initially, supervised learning uses actions from a bi-level controller to train ASVs, addressing underactuated dynamics and aiding policy convergence. Subsequently, deep reinforcement learning explores more effective policies to enhance interception rates. Furthermore, hybrid rewards are meticulously designed to drive policy optimizations while mitigating environmental nonstationarity. Finally, numerical simulations are carried out to verify the effectiveness of our approach.",

keywords = "Autonomous surface vessels, Behavior cloning, Cooperative defense, Deep reinforcement learning, Reward design",
author = "Siqing Sun and Tianbo Li and Xiao Chen and Huachao Dong and Xinjing Wang",
note = "Publisher Copyright: {\textcopyright} 2024 Elsevier B.V.",
year = "2024",
month = oct,
doi = "10.1016/j.asoc.2024.111968",
language = "English",
volume = "164",
journal = "Applied Soft Computing",
issn = "1568-4946",
publisher = "Elsevier B.V.",
}

TY - JOUR
T1 - Cooperative defense of autonomous surface vessels with quantity disadvantage using behavior cloning and deep reinforcement learning
AU - Sun, Siqing
AU - Li, Tianbo
AU - Chen, Xiao
AU - Dong, Huachao
AU - Wang, Xinjing
PY - 2024/10
Y1 - 2024/10
N2 - Autonomous Surface Vessels (ASVs) excel at undertaking hazardous tasks, garnering significant attention recently. Particularly, ASV cooperative defense is a crucial application for protecting harbors and combating smugglers. Here, ASVs intercept intruders from reaching a protected region. Unlike most research, which assumes defenders with numerical advantages, this work considers a more practical defense mission with fewer defenders, defender damages, and intruders employing evasion strategies. However, interception challenges are also introduced, including ASV underactuated dynamics, a limited interception time window, and environmental nonstationarity. Directly applying existing defense methods to such missions may not achieve success. To surmount the challenges, we propose an ASV decision-making framework by integrating supervised learning and deep reinforcement learning. Initially, supervised learning uses actions from a bi-level controller to train ASVs, addressing underactuated dynamics and aiding policy convergence. Subsequently, deep reinforcement learning explores more effective policies to enhance interception rates. Furthermore, hybrid rewards are meticulously designed to drive policy optimizations while mitigating environmental nonstationarity. Finally, numerical simulations are carried out to verify the effectiveness of our approach.

KW - Autonomous surface vessels
KW - Behavior cloning
KW - Cooperative defense
KW - Deep reinforcement learning
KW - Reward design
UR - http://www.scopus.com/inward/record.url?scp=85198558605&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2024.111968
DO - 10.1016/j.asoc.2024.111968
M3 - Article
AN - SCOPUS:85198558605
SN - 1568-4946
VL - 164
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 111968
ER -
