TY - JOUR
T1 - UniMiSS+
T2 - Universal Medical Self-Supervised Learning From Cross-Dimensional Unpaired Data
AU - Xie, Yutong
AU - Zhang, Jianpeng
AU - Xia, Yong
AU - Wu, Qi
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Self-supervised learning (SSL) opens up huge opportunities for medical image analysis, a field well known for its lack of annotations. However, aggregating massive (unlabeled) 3D medical images such as computerized tomography (CT) scans remains challenging due to high imaging costs and privacy restrictions. In our pilot study, we advocated using the wealth of available 2D images, such as X-rays, to compensate for the lack of 3D data, aiming to build a universal medical self-supervised representation learning framework called UniMiSS. Specifically, we designed a pyramid U-like medical Transformer (MiT) as the backbone so that UniMiSS can perform SSL with both 2D and 3D images. UniMiSS surpasses current 3D-specific SSL methods in effectiveness and versatility, excelling in various downstream tasks and overcoming the limitations of dimensionality. However, the initial version did not fully explore the anatomical correlations between 2D and 3D images due to the absence of paired multi-modal patient data. In this extension, we introduce UniMiSS+, which leverages digitally reconstructed radiograph (DRR) technology to simulate X-rays from CT volumes, providing access to paired data. Benefiting from the paired data, we introduce an extra pair-wise constraint to boost cross-modality correlation learning, which can also be adopted as a cross-dimension regularization to further improve the representations. We conduct extensive experiments on multiple 3D/2D medical image analysis tasks, including segmentation and classification. The results show that UniMiSS+ achieves promising performance on various downstream tasks, not only outperforming ImageNet pre-training and other advanced SSL counterparts but also improving on its predecessor, UniMiSS pre-training.
KW - Medical image analysis
KW - self-supervised learning
KW - transformer
KW - universal learning
UR - http://www.scopus.com/inward/record.url?scp=85200246009&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2024.3436105
DO - 10.1109/TPAMI.2024.3436105
M3 - Article
C2 - 39083391
AN - SCOPUS:85200246009
SN - 0162-8828
VL - 46
SP - 10021
EP - 10035
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 12
ER -