Deterministic convergence analysis for regularized long short-term memory and its application to regression and multi-classification problems

Qian Kang; Dengxiu Yu; Kang Hao Cheong; Zhen Wang

doi:10.1016/j.engappai.2024.108444

Institute of Optoelectronics and Intelligence

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the gradient disappearance and explosion problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L₂ regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.
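As a rough illustration of the idea in the abstract (an L₂ penalty added to the objective and minimized with full-batch gradient steps), the sketch below uses a small linear model as a stand-in. It is not the authors' LSTM or their algorithm; the function names, the toy data, the penalty weight `lam`, and the step size `eta` are all assumptions made for this example. It records the regularized objective at every step so that the monotone decrease can be checked on this toy problem.

```python
# Illustrative stand-in only: a linear model, NOT the paper's regularized LSTM.
# All names and constants here (lam, eta, steps, the toy data) are assumptions.

def predict(w, x):
    """Linear prediction w . x for one sample."""
    return sum(wi * xi for wi, xi in zip(w, x))


def objective(w, xs, ys, lam):
    """Mean squared error over the full batch plus the L2 penalty lam * ||w||^2."""
    n = len(xs)
    mse = sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / n
    return mse + lam * sum(wi * wi for wi in w)


def gradient(w, xs, ys, lam):
    """Gradient of the regularized objective with respect to w."""
    n = len(xs)
    grad = [2.0 * lam * wi for wi in w]  # contribution of the L2 penalty
    for x, y in zip(xs, ys):
        residual = predict(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * residual * xj / n  # contribution of the data term
    return grad


def batch_gradient_descent(xs, ys, lam=0.01, eta=0.05, steps=200):
    """Run full-batch gradient descent and record the objective at each step."""
    w = [0.0] * len(xs[0])
    history = [objective(w, xs, ys, lam)]
    for _ in range(steps):
        g = gradient(w, xs, ys, lam)
        w = [wi - eta * gi for wi, gi in zip(w, g)]
        history.append(objective(w, xs, ys, lam))
    return w, history


# Toy data generated from y = 1 + 2 * t, with a constant feature for the bias.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
w, history = batch_gradient_descent(xs, ys)
```

On this toy problem the recorded objective values never increase from one step to the next, which is the kind of monotonicity property the abstract refers to. The penalty pulls the fitted weights slightly toward zero relative to the unregularized fit.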

Original language	English
Article number	108444
Journal	Engineering Applications of Artificial Intelligence
Volume	133
DOI	https://doi.org/10.1016/j.engappai.2024.108444
Publication status	Published - Jul 2024

Cite this

@article{40d42a57230a4173a176fc744fc94b67,

title = "Deterministic convergence analysis for regularized long short-term memory and its application to regression and multi-classification problems",

abstract = "Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the gradient disappearance and explosion problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L2 regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.",

keywords = "Batch gradient algorithm, Convergence, Long short-term memory, Regularization",

author = "Qian Kang and Dengxiu Yu and Cheong, {Kang Hao} and Zhen Wang",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Ltd",

year = "2024",

month = jul,

doi = "10.1016/j.engappai.2024.108444",

language = "English",

volume = "133",

journal = "Engineering Applications of Artificial Intelligence",

issn = "0952-1976",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - Deterministic convergence analysis for regularized long short-term memory and its application to regression and multi-classification problems

AU - Kang, Qian

AU - Yu, Dengxiu

AU - Cheong, Kang Hao

AU - Wang, Zhen

PY - 2024/7

Y1 - 2024/7

N2 - Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the gradient disappearance and explosion problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L2 regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.

AB - Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the gradient disappearance and explosion problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L2 regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.

KW - Batch gradient algorithm

KW - Convergence

KW - Long short-term memory

KW - Regularization

UR - http://www.scopus.com/inward/record.url?scp=85191483607&partnerID=8YFLogxK

U2 - 10.1016/j.engappai.2024.108444

DO - 10.1016/j.engappai.2024.108444

M3 - Article

AN - SCOPUS:85191483607

SN - 0952-1976

VL - 133

JO - Engineering Applications of Artificial Intelligence

JF - Engineering Applications of Artificial Intelligence

M1 - 108444

ER -
