TY - JOUR
T1 - Deterministic convergence analysis for regularized long short-term memory and its application to regression and multi-classification problems
AU - Kang, Qian
AU - Yu, Dengxiu
AU - Cheong, Kang Hao
AU - Wang, Zhen
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/7
Y1 - 2024/7
N2 - Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the vanishing and exploding gradient problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L2 regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.
AB - Long short-term memory (LSTM) is a recurrent neural network (RNN) framework designed to solve the vanishing and exploding gradient problems of traditional RNNs. In recent years, LSTM has become a state-of-the-art model for solving various machine-learning problems. This paper proposes a novel regularized LSTM based on the batch gradient method. Specifically, the L2 regularization is appended to the objective function as a systematic external force, effectively controlling the excessive growth of weights in the network and preventing the overfitting phenomenon. In addition, a rigorous convergence analysis of the proposed method is carried out, i.e., monotonicity, weak convergence, and strong convergence results are obtained. Finally, comparative simulations are conducted on the benchmark data set for regression and classification problems, and the simulation results verify the effectiveness of the method.
KW - Batch gradient algorithm
KW - Convergence
KW - Long short-term memory
KW - Regularization
UR - http://www.scopus.com/inward/record.url?scp=85191483607&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2024.108444
DO - 10.1016/j.engappai.2024.108444
M3 - Article
AN - SCOPUS:85191483607
SN - 0952-1976
VL - 133
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 108444
ER -