TY - JOUR
T1 - Enhancing classification efficiency in capsule networks through windowed routing
T2 - tackling gradient vanishing, dynamic routing, and computational complexity challenges
AU - Chen, Gangqi
AU - Mao, Zhaoyong
AU - Shen, Junge
AU - Hou, Dongdong
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2025/1
Y1 - 2025/1
N2 - Capsule networks overcome two drawbacks of convolutional neural networks: weak recognition of rotated objects and poor spatial discrimination. However, they still struggle with complex images, where they incur high computational cost and achieve limited accuracy. This work addresses these challenges. First, a novel windowed dynamic up-and-down attention routing process is introduced, which reduces the computational complexity from quadratic to linear order. A novel deconvolution-based decoder further reduces the computational complexity. Next, a novel LayerNorm strategy pre-processes neuron values in the squash function, which prevents saturation and mitigates the gradient vanishing problem. In addition, a novel gradient-friendly network structure is developed to facilitate the extraction of complex features with deeper networks. Experiments show that the proposed methods are effective and competitive, outperforming existing techniques.
AB - Capsule networks overcome two drawbacks of convolutional neural networks: weak recognition of rotated objects and poor spatial discrimination. However, they still struggle with complex images, where they incur high computational cost and achieve limited accuracy. This work addresses these challenges. First, a novel windowed dynamic up-and-down attention routing process is introduced, which reduces the computational complexity from quadratic to linear order. A novel deconvolution-based decoder further reduces the computational complexity. Next, a novel LayerNorm strategy pre-processes neuron values in the squash function, which prevents saturation and mitigates the gradient vanishing problem. In addition, a novel gradient-friendly network structure is developed to facilitate the extraction of complex features with deeper networks. Experiments show that the proposed methods are effective and competitive, outperforming existing techniques.
KW - Capsule Network
KW - Gradient Vanishing Problem
KW - Image Classification
KW - Window Attention
UR - http://www.scopus.com/inward/record.url?scp=85209576046&partnerID=8YFLogxK
U2 - 10.1007/s40747-024-01640-8
DO - 10.1007/s40747-024-01640-8
M3 - Article
AN - SCOPUS:85209576046
SN - 2199-4536
VL - 11
JO - Complex and Intelligent Systems
JF - Complex and Intelligent Systems
IS - 1
M1 - 45
ER -