UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier

Yutong Xie; Jianpeng Zhang; Yong Xia; Qi Wu

doi:10.1007/978-3-031-19803-8_33

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

49 Citations (Scopus)

Abstract

Self-supervised learning (SSL) opens up huge opportunities for medical image analysis, which is well known for its lack of annotations. However, aggregating massive (unlabeled) 3D medical images like computerized tomography (CT) remains challenging due to their high imaging cost and privacy restrictions. In this paper, we advocate bringing a wealth of 2D images like chest X-rays as compensation for the lack of 3D data, aiming to build a universal medical self-supervised representation learning framework, called UniMiSS. The resulting problem is how to break the dimensionality barrier, i.e., how to make it possible to perform SSL with both 2D and 3D images. To achieve this, we design a pyramid U-like medical Transformer (MiT). It is composed of the switchable patch embedding (SPE) module and Transformers. The SPE module adaptively switches to either 2D or 3D patch embedding, depending on the input dimension. The embedded patches are converted into a sequence regardless of their original dimensions. The Transformers model the long-term dependencies in a sequence-to-sequence manner, thus enabling UniMiSS to learn representations from both 2D and 3D images. With the MiT as the backbone, we train UniMiSS in a self-distillation manner. We conduct extensive experiments on six 3D/2D medical image analysis tasks, including segmentation and classification. The results show that the proposed UniMiSS achieves promising performance on various downstream tasks, outperforming ImageNet pre-training and other advanced SSL counterparts substantially. Code is available at https://github.com/YtongXie/UniMiSS-code.
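
The switchable patch embedding described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal NumPy illustration that assumes non-overlapping cubic or square patches and random linear projections. The class name, the `patch` and `dim` parameters, and the weight names are hypothetical. It only shows how inputs of different dimensionality might end up as token sequences of the same width.

```python
import numpy as np


class SwitchablePatchEmbedding:
    """Toy SPE sketch: chooses a 2D or 3D patch projection from the input rank.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, patch=4, dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.patch = patch
        self.dim = dim
        # One projection per dimensionality; both map into the same token width.
        self.w2d = rng.standard_normal((patch ** 2, dim)) * 0.02
        self.w3d = rng.standard_normal((patch ** 3, dim)) * 0.02

    def __call__(self, x):
        p = self.patch
        if any(size % p for size in x.shape):
            raise ValueError("every axis must be divisible by the patch size")
        if x.ndim == 2:  # (H, W) image, e.g. a chest X-ray
            h, w = x.shape
            # Split each axis into (blocks, within-block), then group the blocks.
            patches = x.reshape(h // p, p, w // p, p).transpose(0, 2, 1, 3)
            return patches.reshape(-1, p * p) @ self.w2d
        if x.ndim == 3:  # (D, H, W) volume, e.g. a CT scan
            d, h, w = x.shape
            patches = x.reshape(d // p, p, h // p, p, w // p, p)
            patches = patches.transpose(0, 2, 4, 1, 3, 5)
            return patches.reshape(-1, p ** 3) @ self.w3d
        raise ValueError(f"expected a 2D or 3D array, got {x.ndim}D")


spe = SwitchablePatchEmbedding(patch=4, dim=8)
tokens_2d = spe(np.zeros((16, 16)))     # sequence of 2D patch tokens
tokens_3d = spe(np.zeros((8, 16, 16)))  # sequence of 3D patch tokens
```

Both calls return an array of shape (number of patches, `dim`), so a sequence model placed after the embedding would not need to know whether the tokens came from an image or a volume.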

Original language	English
Title of host publication	Computer Vision – ECCV 2022 - 17th European Conference, Proceedings
Editors	Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	558-575
Number of pages	18
ISBN (Print)	9783031198021
DOI	https://doi.org/10.1007/978-3-031-19803-8_33
Publication status	Published - 2022
Event	17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel. Duration: 23 Oct 2022 → 27 Oct 2022

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13681 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	17th European Conference on Computer Vision, ECCV 2022
Country/Territory	Israel
City	Tel Aviv
Period	23/10/22 → 27/10/22

Access to document

10.1007/978-3-031-19803-8_33

Other files and links

Link to publication in Scopus

Cite this

Xie, Y., Zhang, J., Xia, Y., & Wu, Q. (2022). UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. In S. Avidan, G. Brostow, M. Cissé, G. M. Farinella, & T. Hassner (Eds.), Computer Vision – ECCV 2022 - 17th European Conference, Proceedings (pp. 558-575). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13681 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19803-8_33

Xie, Yutong; Zhang, Jianpeng; Xia, Yong et al. / UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. Computer Vision – ECCV 2022 - 17th European Conference, Proceedings. ed. / Shai Avidan; Gabriel Brostow; Moustapha Cissé; Giovanni Maria Farinella; Tal Hassner. Springer Science and Business Media Deutschland GmbH, 2022. pp. 558-575 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{d43737e65256489e8236a845cf7dad64,

title = "UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier",

abstract = "Self-supervised learning (SSL) opens up huge opportunities for medical image analysis, which is well known for its lack of annotations. However, aggregating massive (unlabeled) 3D medical images like computerized tomography (CT) remains challenging due to their high imaging cost and privacy restrictions. In this paper, we advocate bringing a wealth of 2D images like chest X-rays as compensation for the lack of 3D data, aiming to build a universal medical self-supervised representation learning framework, called UniMiSS. The resulting problem is how to break the dimensionality barrier, i.e., how to make it possible to perform SSL with both 2D and 3D images. To achieve this, we design a pyramid U-like medical Transformer (MiT). It is composed of the switchable patch embedding (SPE) module and Transformers. The SPE module adaptively switches to either 2D or 3D patch embedding, depending on the input dimension. The embedded patches are converted into a sequence regardless of their original dimensions. The Transformers model the long-term dependencies in a sequence-to-sequence manner, thus enabling UniMiSS to learn representations from both 2D and 3D images. With the MiT as the backbone, we train UniMiSS in a self-distillation manner. We conduct extensive experiments on six 3D/2D medical image analysis tasks, including segmentation and classification. The results show that the proposed UniMiSS achieves promising performance on various downstream tasks, outperforming ImageNet pre-training and other advanced SSL counterparts substantially. Code is available at https://github.com/YtongXie/UniMiSS-code.",

keywords = "Cross-dimension, Medical image analysis, Self-supervised learning, Transformer",

author = "Yutong Xie and Jianpeng Zhang and Yong Xia and Qi Wu",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 17th European Conference on Computer Vision, ECCV 2022 ; Conference date: 23-10-2022 Through 27-10-2022",

year = "2022",

doi = "10.1007/978-3-031-19803-8_33",

language = "English",

isbn = "9783031198021",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "558--575",

editor = "Shai Avidan and Gabriel Brostow and Moustapha Ciss{\'e} and Farinella, {Giovanni Maria} and Tal Hassner",

booktitle = "Computer Vision – ECCV 2022 - 17th European Conference, Proceedings",

}

Xie, Y, Zhang, J, Xia, Y & Wu, Q 2022, UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. in S Avidan, G Brostow, M Cissé, GM Farinella & T Hassner (eds), Computer Vision – ECCV 2022 - 17th European Conference, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13681 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 558-575, 17th European Conference on Computer Vision, ECCV 2022, Tel Aviv, Israel, 23/10/22. https://doi.org/10.1007/978-3-031-19803-8_33

UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. / Xie, Yutong; Zhang, Jianpeng; Xia, Yong et al.
Computer Vision – ECCV 2022 - 17th European Conference, Proceedings. ed. / Shai Avidan; Gabriel Brostow; Moustapha Cissé; Giovanni Maria Farinella; Tal Hassner. Springer Science and Business Media Deutschland GmbH, 2022. pp. 558-575 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13681 LNCS).

TY - GEN

T1 - UniMiSS

T2 - 17th European Conference on Computer Vision, ECCV 2022

AU - Xie, Yutong

AU - Zhang, Jianpeng

AU - Xia, Yong

AU - Wu, Qi

PY - 2022

Y1 - 2022

AB - Self-supervised learning (SSL) opens up huge opportunities for medical image analysis, which is well known for its lack of annotations. However, aggregating massive (unlabeled) 3D medical images like computerized tomography (CT) remains challenging due to their high imaging cost and privacy restrictions. In this paper, we advocate bringing a wealth of 2D images like chest X-rays as compensation for the lack of 3D data, aiming to build a universal medical self-supervised representation learning framework, called UniMiSS. The resulting problem is how to break the dimensionality barrier, i.e., how to make it possible to perform SSL with both 2D and 3D images. To achieve this, we design a pyramid U-like medical Transformer (MiT). It is composed of the switchable patch embedding (SPE) module and Transformers. The SPE module adaptively switches to either 2D or 3D patch embedding, depending on the input dimension. The embedded patches are converted into a sequence regardless of their original dimensions. The Transformers model the long-term dependencies in a sequence-to-sequence manner, thus enabling UniMiSS to learn representations from both 2D and 3D images. With the MiT as the backbone, we train UniMiSS in a self-distillation manner. We conduct extensive experiments on six 3D/2D medical image analysis tasks, including segmentation and classification. The results show that the proposed UniMiSS achieves promising performance on various downstream tasks, outperforming ImageNet pre-training and other advanced SSL counterparts substantially. Code is available at https://github.com/YtongXie/UniMiSS-code.

KW - Cross-dimension

KW - Medical image analysis

KW - Self-supervised learning

KW - Transformer

UR - http://www.scopus.com/inward/record.url?scp=85141817423&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-19803-8_33

DO - 10.1007/978-3-031-19803-8_33

M3 - Conference contribution

AN - SCOPUS:85141817423

SN - 9783031198021

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 558

EP - 575

BT - Computer Vision – ECCV 2022 - 17th European Conference, Proceedings

A2 - Avidan, Shai

A2 - Brostow, Gabriel

A2 - Cissé, Moustapha

A2 - Farinella, Giovanni Maria

A2 - Hassner, Tal

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 23 October 2022 through 27 October 2022

ER -

Xie Y, Zhang J, Xia Y, Wu Q. UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. In Avidan S, Brostow G, Cissé M, Farinella GM, Hassner T, editors, Computer Vision – ECCV 2022 - 17th European Conference, Proceedings. Springer Science and Business Media Deutschland GmbH. 2022. p. 558-575. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-19803-8_33
