SPACECRAFT RELATIVE ON-OFF CONTROL VIA REINFORCEMENT LEARNING

S. V. Khoroshylov; C. Wang

doi:10.15407/knit2024.02.003

School of Automation

National Academy of Sciences of Ukraine

Research output: Contribution to journal › Article › peer-review

Abstract

The article investigates spacecraft relative control using reactive actuators whose output has two states, “on” or “off”. For cases where the resolution of the thrusters does not allow a pulse-width thrust modulator to approximate linear control laws accurately, the possibility of applying reinforcement learning methods to directly find control laws that map the state vector to on-off thruster commands is investigated. To implement this approach, a model of the controlled relative motion of two satellites was obtained in the form of a Markov decision process. The intelligent agent is represented by “actor” and “critic” neural networks, and the architecture of these modules is defined. A cost function with variable weights of control actions is proposed, which allows the number of thruster firings to be optimized explicitly. To improve control performance, an extended input vector is proposed for the “actor” and “critic” neural networks, which, in addition to the state vector, includes the control action at the previous control step and the control step number. To reduce training time, the agent was pre-trained on data obtained using conventional control algorithms. Numerical results demonstrate that the reinforcement learning methodology allows the agent to outperform the linear controller with the pulse-width modulator in terms of control accuracy, response time, and number of thruster firings.
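The two ideas in the abstract that lend themselves to a concrete illustration are the extended input vector and the cost with variable weights of control actions. The Python sketch below is only one possible reading, not the authors' formulation: the function names, the weight values, the quadratic state term, and the choice to charge an extra penalty when a thruster switches from "off" to "on" are all assumptions made for illustration.

```python
from typing import List, Sequence


def extended_observation(state: Sequence[float],
                         prev_action: Sequence[int],
                         step: int,
                         horizon: int) -> List[float]:
    """Build a hypothetical network input: the state vector, the previous
    on-off commands, and the control step number normalized by the horizon."""
    return list(state) + [float(a) for a in prev_action] + [step / horizon]


def count_firings(prev_action: Sequence[int], action: Sequence[int]) -> int:
    """Count thrusters that switch from off (0) to on (1) at this step."""
    return sum(1 for p, a in zip(prev_action, action) if p == 0 and a == 1)


def step_cost(state: Sequence[float],
              prev_action: Sequence[int],
              action: Sequence[int],
              w_state: float = 1.0,
              w_on: float = 0.1,
              w_fire: float = 0.5) -> float:
    """Assumed per-step cost: quadratic state error, a charge for each
    thruster that is on, and an extra charge for each new firing. The
    effective weight of an "on" command therefore varies with the
    previous command, which is the assumption being illustrated."""
    state_term = w_state * sum(x * x for x in state)
    on_term = w_on * sum(action)
    fire_term = w_fire * count_firings(prev_action, action)
    return state_term + on_term + fire_term
```

With such a cost, keeping a thruster on for two consecutive steps is cheaper than switching it off and on again, so an agent minimizing the accumulated cost is pushed toward fewer firings. The previous action appears in the observation so that this cost can be predicted from the network input alone.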

Original language	English
Pages (from-to)	3-14
Number of pages	12
Journal	Space Science and Technology
Volume	30
Issue number	2
DOIs	https://doi.org/10.15407/knit2024.02.003
State	Published - 2024

Keywords

actor
critic
neural network
on-off control
reinforcement learning
spacecraft relative control
thruster firing

Cite this

@article{4995bd7b49a04723a5da67139979a9f0,

title = "SPACECRAFT RELATIVE ON-OFF CONTROL VIA REINFORCEMENT LEARNING",

abstract = "The article investigates the task of spacecraft relative control using reactive actuators, the output of which has two states, “on” or “off”. For cases where the resolution of the thrusters does not provide an accurate approximation of linear control laws using a pulse-width thrust modulator, the possibility of applying reinforcement learning methods for direct finding of control laws that map the state vector and the on-off thruster commands has been investigated. To implement such an approach, a model of controlled relative motion of two satellites in the form of a Markov decision process was obtained. The intelligent agent is presented in the form of “actor” and “critic” neural networks, and the architecture of these modules is defined. It is proposed to use a cost function with variable weights of control actions, which allows for optimizing the number of thruster firings explicitly. To improve the control performance, it is proposed to use an extended input vector for the “actor” and “critic” neural networks of the intelligent agent, which, in addition to the state vector, also includes information about the control action on the previous control step and the control step number. To reduce the training time, the agent was pre-trained on the data obtained using conventional control algorithms. Numerical results demonstrate that the reinforcement learning methodology allows the agent to outperform the results provided by the linear controller with the pulse-width modulator in terms of control accuracy, response time, and number of thruster firings.",

keywords = "actor, critic, neural network, on-off control, reinforcement learning, spacecraft relative control, thruster firing",

author = "Khoroshylov, {S. V.} and C. Wang",

note = "Publisher Copyright: {\textcopyright} Publisher PH «Akademperiodyka» of the NAS of Ukraine, 2023.",

year = "2024",

doi = "10.15407/knit2024.02.003",

language = "English",

volume = "30",

pages = "3--14",

journal = "Space Science and Technology",

issn = "1561-8889",

publisher = "Publishing House Akademperiodyka",

number = "2",

}

TY - JOUR

T1 - SPACECRAFT RELATIVE ON-OFF CONTROL VIA REINFORCEMENT LEARNING

AU - Khoroshylov, S. V.

AU - Wang, C.

N1 - Publisher Copyright: © Publisher PH «Akademperiodyka» of the NAS of Ukraine, 2023.

PY - 2024

Y1 - 2024

N2 - The article investigates the task of spacecraft relative control using reactive actuators, the output of which has two states, “on” or “off”. For cases where the resolution of the thrusters does not provide an accurate approximation of linear control laws using a pulse-width thrust modulator, the possibility of applying reinforcement learning methods for direct finding of control laws that map the state vector and the on-off thruster commands has been investigated. To implement such an approach, a model of controlled relative motion of two satellites in the form of a Markov decision process was obtained. The intelligent agent is presented in the form of “actor” and “critic” neural networks, and the architecture of these modules is defined. It is proposed to use a cost function with variable weights of control actions, which allows for optimizing the number of thruster firings explicitly. To improve the control performance, it is proposed to use an extended input vector for the “actor” and “critic” neural networks of the intelligent agent, which, in addition to the state vector, also includes information about the control action on the previous control step and the control step number. To reduce the training time, the agent was pre-trained on the data obtained using conventional control algorithms. Numerical results demonstrate that the reinforcement learning methodology allows the agent to outperform the results provided by the linear controller with the pulse-width modulator in terms of control accuracy, response time, and number of thruster firings.

AB - The article investigates the task of spacecraft relative control using reactive actuators, the output of which has two states, “on” or “off”. For cases where the resolution of the thrusters does not provide an accurate approximation of linear control laws using a pulse-width thrust modulator, the possibility of applying reinforcement learning methods for direct finding of control laws that map the state vector and the on-off thruster commands has been investigated. To implement such an approach, a model of controlled relative motion of two satellites in the form of a Markov decision process was obtained. The intelligent agent is presented in the form of “actor” and “critic” neural networks, and the architecture of these modules is defined. It is proposed to use a cost function with variable weights of control actions, which allows for optimizing the number of thruster firings explicitly. To improve the control performance, it is proposed to use an extended input vector for the “actor” and “critic” neural networks of the intelligent agent, which, in addition to the state vector, also includes information about the control action on the previous control step and the control step number. To reduce the training time, the agent was pre-trained on the data obtained using conventional control algorithms. Numerical results demonstrate that the reinforcement learning methodology allows the agent to outperform the results provided by the linear controller with the pulse-width modulator in terms of control accuracy, response time, and number of thruster firings.

KW - actor

KW - critic

KW - neural network

KW - on-off control

KW - reinforcement learning

KW - spacecraft relative control

KW - thruster firing

UR - http://www.scopus.com/inward/record.url?scp=85192773591&partnerID=8YFLogxK

U2 - 10.15407/knit2024.02.003

DO - 10.15407/knit2024.02.003

M3 - Article

AN - SCOPUS:85192773591

SN - 1561-8889

VL - 30

SP - 3

EP - 14

JO - Space Science and Technology

JF - Space Science and Technology

IS - 2

ER -