MedIM: Boost Medical Image Representation via Radiology Report-Guided Masking

Yutong Xie, Lin Gu, Tatsuya Harada, Jianpeng Zhang, Yong Xia, Qi Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Masked image modelling (MIM)-based pre-training shows promise in improving image representations with limited annotated data by randomly masking image patches and reconstructing them. However, random masking may not be suitable for medical images due to their unique pathology characteristics. This paper proposes Masked medical Image Modelling (MedIM), which is, to our knowledge, the first approach that masks and reconstructs discriminative areas guided by radiological reports, encouraging the network to explore stronger semantic representations from medical images. We introduce two mutually comprehensive masking strategies: knowledge word-driven masking (KWM) and sentence-driven masking (SDM). KWM uses Medical Subject Headings (MeSH) words unique to radiology reports to identify discriminative cues mapped to MeSH words and guide the mask generation. SDM considers that reports usually have multiple sentences, each of which describes different findings, and therefore integrates sentence-level information to identify discriminative regions for mask generation. MedIM integrates both strategies by simultaneously restoring the images masked by KWM and SDM for a more robust and representative medical visual representation. Our extensive experiments on various downstream tasks, covering multi-label/class image classification, medical image segmentation, and medical image-text analysis, demonstrate that MedIM with report-guided masking achieves competitive performance. Our method substantially outperforms ImageNet pre-training, MIM-based pre-training, and medical image-report pre-training counterparts. Code is available at https://github.com/YtongXie/MedIM.
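
The contrast between random masking and report-guided masking can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how discriminative regions are scored, so the sketch assumes each image patch and the report text already have embedding vectors and uses cosine similarity as a stand-in score. The names `guided_mask`, `random_mask`, and `mask_ratio` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def guided_mask(patch_embs, text_emb, mask_ratio=0.5):
    """Mask the patches most similar to the text embedding.

    Assumption: similarity to a word- or sentence-level report embedding
    is used as the 'discriminative' score. Returns a 0/1 list, 1 = masked.
    """
    scores = [cosine(p, text_emb) for p in patch_embs]
    n_mask = int(round(mask_ratio * len(patch_embs)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    masked = set(ranked[:n_mask])
    return [1 if i in masked else 0 for i in range(len(patch_embs))]


def random_mask(n_patches, mask_ratio=0.5, seed=0):
    """Baseline MIM-style masking: choose patches uniformly at random."""
    rng = random.Random(seed)
    n_mask = int(round(mask_ratio * n_patches))
    masked = set(rng.sample(range(n_patches), n_mask))
    return [1 if i in masked else 0 for i in range(n_patches)]


# Toy example: four 2-D patch embeddings and one text embedding.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
text = [1.0, 0.0]
print(guided_mask(patches, text, mask_ratio=0.5))  # masks patches 0 and 2
print(random_mask(len(patches), mask_ratio=0.5))
```

Under the same assumptions, calling `guided_mask` once with a word-level embedding and once with a sentence-level embedding would give two masks, loosely mirroring how the abstract describes restoring images masked by both KWM and SDM.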

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 - 26th International Conference, Proceedings
Editors: Hayit Greenspan, Anant Madabhushi, Parvin Mousavi, Septimiu Salcudean, James Duncan, Tanveer Syeda-Mahmood, Russell Taylor
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 13-23
Number of pages: 11
ISBN (Print): 9783031439063
DOIs
State: Published - 2023
Event: 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023 - Vancouver, Canada
Duration: 8 Oct 2023 to 12 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14220 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023
Country/Territory: Canada
City: Vancouver
Period: 8/10/23 to 12/10/23
