TY - JOUR
T1 - CACNN
T2 - Capsule Attention Convolutional Neural Networks for 3D Object Recognition
AU - Sun, Kai
AU - Zhang, Jiangshe
AU - Xu, Shuang
AU - Zhao, Zixiang
AU - Zhang, Chunxia
AU - Liu, Junmin
AU - Hu, Junying
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2025
Y1 - 2025
N2 - Recently, view-based approaches, which recognize a 3D object through its projected 2D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate view-wise features, which usually leads to a loss of visual information. To tackle this problem, we propose a novel layer, the capsule attention layer (CAL), which uses an attention mechanism to fuse the features expressed by capsules. Specifically, instead of the dynamic routing algorithm, we use an attention module to transmit information from lower-level capsules to higher-level capsules, which noticeably improves the speed of capsule networks. In particular, the view pooling layer of the multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights take certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.
AB - Recently, view-based approaches, which recognize a 3D object through its projected 2D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate view-wise features, which usually leads to a loss of visual information. To tackle this problem, we propose a novel layer, the capsule attention layer (CAL), which uses an attention mechanism to fuse the features expressed by capsules. Specifically, instead of the dynamic routing algorithm, we use an attention module to transmit information from lower-level capsules to higher-level capsules, which noticeably improves the speed of capsule networks. In particular, the view pooling layer of the multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights take certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.
KW - 3D object recognition
KW - capsule attention convolutional neural networks (CACNNs)
KW - capsule attention layer (CAL)
KW - view-based approaches
UR - http://www.scopus.com/inward/record.url?scp=86000431163&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2023.3326606
DO - 10.1109/TNNLS.2023.3326606
M3 - Article
AN - SCOPUS:86000431163
SN - 2162-237X
VL - 36
SP - 4091
EP - 4102
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 3
ER -