CACNN: Capsule Attention Convolutional Neural Networks for 3D Object Recognition

Kai Sun; Jiangshe Zhang; Shuang Xu; Zixiang Zhao; Chunxia Zhang; Junmin Liu; Junying Hu

doi:10.1109/TNNLS.2023.3326606

School of Mathematics and Statistics

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Recently, view-based approaches, which recognize a 3D object through its projected 2D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate view-wise features, which usually leads to a loss of visual information. To tackle this problem, we propose a novel layer, the capsule attention layer (CAL), which uses an attention mechanism to fuse the features expressed by capsules. Specifically, instead of the dynamic routing algorithm, we use an attention module to transmit information from lower-level capsules to higher-level capsules, which clearly improves the speed of capsule networks. In particular, the view pooling layer of the multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights take certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.
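The abstract's remark that view pooling is a special case of attention-based fusion can be illustrated with a small sketch. This is not the CAL formulation from the paper, which this record does not give; it is a generic softmax-weighted fusion over per-view feature vectors, and the function names and the `temperature` parameter are hypothetical. It assumes that view pooling means an element-wise reduction across views (MVCNN is commonly described as using an element-wise max). With all scores equal, the fusion reduces to mean pooling; with very sharp scores, it approaches element-wise max pooling.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(view_features, temperature):
    """Fuse per-view feature vectors with per-dimension softmax weights.

    Hypothetical illustration only: the score of each view in dimension d
    is temperature * feature value, so temperature = 0 gives uniform
    weights (mean pooling) and a large temperature concentrates the
    weight on the largest value (approaching element-wise max pooling).
    """
    dim = len(view_features[0])
    fused = []
    for d in range(dim):
        column = [v[d] for v in view_features]
        weights = softmax([temperature * x for x in column])
        fused.append(sum(w * x for w, x in zip(weights, column)))
    return fused


# Three views, each described by a 2-dimensional feature vector.
views = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]

mean_like = attention_pool(views, temperature=0.0)   # uniform weights
max_like = attention_pool(views, temperature=50.0)   # sharply peaked weights
```

In this toy setting the weights are fixed by a temperature, whereas in a trainable layer they would be learned. The max is recovered only in the limit of large temperature, not exactly.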

Original language	English
Pages (from-to)	4091-4102
Number of pages	12
Journal	IEEE Transactions on Neural Networks and Learning Systems
Volume	36
Issue number	3
DOIs	https://doi.org/10.1109/TNNLS.2023.3326606
State	Published - 2025

Keywords

3D object recognition
capsule attention convolutional neural networks (CACNNs)
capsule attention layer (CAL)
view-based approaches

Cite this

@article{0836db414e2a4ac89904ee02cc2378c8,

title = "CACNN: Capsule Attention Convolutional Neural Networks for 3D Object Recognition",

abstract = "Recently, view-based approaches, which recognize a 3D object through its projected 2-D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate viewwise features, which usually leads to the visual information loss. To tackle this problem, we propose a novel layer called capsule attention layer (CAL) by using attention mechanism to fuse the features expressed by capsules. In detail, instead of dynamic routing algorithm, we use an attention module to transmit information from the lower level capsules to higher level capsules, which obviously improves the speed of capsule networks. In particular, the view pooling layer of multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights are chosen on some certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.",

keywords = "3D object recognition, capsule attention convolutional neural networks (CACNNs), capsule attention layer (CAL), view-based approaches",

author = "Kai Sun and Jiangshe Zhang and Shuang Xu and Zixiang Zhao and Chunxia Zhang and Junmin Liu and Junying Hu",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.",

year = "2025",

doi = "10.1109/TNNLS.2023.3326606",

language = "English",

volume = "36",

pages = "4091--4102",

journal = "IEEE Transactions on Neural Networks and Learning Systems",

issn = "2162-237X",

publisher = "IEEE Computational Intelligence Society",

number = "3",

}

TY - JOUR

T1 - CACNN

T2 - Capsule Attention Convolutional Neural Networks for 3D Object Recognition

AU - Sun, Kai

AU - Zhang, Jiangshe

AU - Xu, Shuang

AU - Zhao, Zixiang

AU - Zhang, Chunxia

AU - Liu, Junmin

AU - Hu, Junying

PY - 2025

Y1 - 2025

N2 - Recently, view-based approaches, which recognize a 3D object through its projected 2-D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate viewwise features, which usually leads to the visual information loss. To tackle this problem, we propose a novel layer called capsule attention layer (CAL) by using attention mechanism to fuse the features expressed by capsules. In detail, instead of dynamic routing algorithm, we use an attention module to transmit information from the lower level capsules to higher level capsules, which obviously improves the speed of capsule networks. In particular, the view pooling layer of multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights are chosen on some certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.

AB - Recently, view-based approaches, which recognize a 3D object through its projected 2-D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate viewwise features, which usually leads to the visual information loss. To tackle this problem, we propose a novel layer called capsule attention layer (CAL) by using attention mechanism to fuse the features expressed by capsules. In detail, instead of dynamic routing algorithm, we use an attention module to transmit information from the lower level capsules to higher level capsules, which obviously improves the speed of capsule networks. In particular, the view pooling layer of multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights are chosen on some certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.

KW - 3D object recognition

KW - capsule attention convolutional neural networks (CACNNs)

KW - capsule attention layer (CAL)

KW - view-based approaches

UR - http://www.scopus.com/inward/record.url?scp=86000431163&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2023.3326606

DO - 10.1109/TNNLS.2023.3326606

M3 - Article

AN - SCOPUS:86000431163

SN - 2162-237X

VL - 36

SP - 4091

EP - 4102

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

IS - 3

ER -