TY - JOUR
T1 - Improving multiple instance contrastive learning via sparse transformer for whole slide image classification
AU - Liu, Zhaoyang
AU - Lu, Mengkang
AU - Xia, Yong
AU - Liu, Wei
AU - Shu, Minglei
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2026/2
Y1 - 2026/2
N2 - Slide-level pathology diagnosis using whole slide images (WSIs) is typically formulated as a weakly supervised classification task, which can be effectively addressed through multiple instance learning (MIL). Motivated by the limitations of conventional MIL frameworks, we seek to fully exploit self-supervised learning to enhance instance-level feature extraction, while enabling efficient multi-instance aggregation that explicitly accounts for inter-instance correlations. In this paper, we present MICL++, an enhanced multiple instance contrastive learning framework tailored for WSI classification. MICL++ builds upon a sparse transformer backbone and comprises two key components. First, the Pathology-Specific Contrastive Learning Extraction (PSCLE) module generates discriminative instance-level features optimized for pathological image understanding. Second, the Efficient Sparse Transformer Aggregation (ESTA) module models long-range dependencies among instances with improved computational efficiency. Our method achieves state-of-the-art performance on the CAMELYON16 dataset and the TCGA lung cancer dataset, significantly surpassing prior MIL approaches. Additionally, on five widely used MIL benchmark datasets (MUSK1, MUSK2, ELEPHANT, FOX, and TIGER), our framework consistently outperforms existing methods, demonstrating strong generalization across both clinical and standard MIL scenarios.
AB - Slide-level pathology diagnosis using whole slide images (WSIs) is typically formulated as a weakly supervised classification task, which can be effectively addressed through multiple instance learning (MIL). Motivated by the limitations of conventional MIL frameworks, we seek to fully exploit self-supervised learning to enhance instance-level feature extraction, while enabling efficient multi-instance aggregation that explicitly accounts for inter-instance correlations. In this paper, we present MICL++, an enhanced multiple instance contrastive learning framework tailored for WSI classification. MICL++ builds upon a sparse transformer backbone and comprises two key components. First, the Pathology-Specific Contrastive Learning Extraction (PSCLE) module generates discriminative instance-level features optimized for pathological image understanding. Second, the Efficient Sparse Transformer Aggregation (ESTA) module models long-range dependencies among instances with improved computational efficiency. Our method achieves state-of-the-art performance on the CAMELYON16 dataset and the TCGA lung cancer dataset, significantly surpassing prior MIL approaches. Additionally, on five widely used MIL benchmark datasets (MUSK1, MUSK2, ELEPHANT, FOX, and TIGER), our framework consistently outperforms existing methods, demonstrating strong generalization across both clinical and standard MIL scenarios.
KW - Multiple instance learning
KW - Whole slide image
UR - https://www.scopus.com/pages/publications/105017614039
U2 - 10.1016/j.bspc.2025.108714
DO - 10.1016/j.bspc.2025.108714
M3 - Article
AN - SCOPUS:105017614039
SN - 1746-8094
VL - 112
JO - Biomedical Signal Processing and Control
JF - Biomedical Signal Processing and Control
M1 - 108714
ER -