P2TC: A Lightweight Pyramid Pooling Transformer-CNN Network for Accurate 3D Whole Heart Segmentation

Hengfei Cui, Yifan Wang, Fan Zheng, Yan Li, Yanning Zhang, Yong Xia

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

Cardiovascular disease is a leading global cause of death, requiring accurate heart segmentation for diagnosis and surgical planning. Deep learning methods have been demonstrated to achieve superior performances in cardiac structures segmentation. However, there are still limitations in 3D whole heart segmentation, such as inadequate spatial context modeling, difficulty in capturing long-distance dependencies, high computational complexity, and limited representation of local high-level semantic information. To tackle the above problems, we propose a lightweight Pyramid Pooling Transformer-CNN (P2TC) network for accurate 3D whole heart segmentation. The proposed architecture comprises a dual encoder-decoder structure with a 3D pyramid pooling Transformer for multi-scale information fusion and a lightweight large-kernel Convolutional Neural Network (CNN) for local feature extraction. The decoder has two branches for precise segmentation and contextual residual handling. The first branch is used to generate segmentation masks for pixel-level classification based on the features extracted by the encoder to achieve accurate segmentation of cardiac structures. The second branch highlights contextual residuals across slices, enabling the network to better handle variations and boundaries. Extensive experimental results on the Multi-Modality Whole Heart Segmentation (MM-WHS) 2017 challenge dataset demonstrate that P2TC outperforms the most advanced methods, achieving the Dice scores of 92.6% and 88.1% in Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities respectively, which surpasses the baseline model by 1.5% and 1.7%, and achieves state-of-the-art segmentation results.

Original languageEnglish
JournalIEEE Journal of Biomedical and Health Informatics
DOIs
StateAccepted/In press - 2025

Keywords

  • Whole heart segmentation
  • bidirectional residual perception
  • large-kernel volumetric convolution
  • pyramid pooling Transformer

Fingerprint

Dive into the research topics of 'P2TC: A Lightweight Pyramid Pooling Transformer-CNN Network for Accurate 3D Whole Heart Segmentation'. Together they form a unique fingerprint.

Cite this