TY - JOUR
T1 - PICK
T2 - Predict and Mask for Semi-supervised Medical Image Segmentation
AU - Zeng, Qingjie
AU - Lu, Zilin
AU - Xie, Yutong
AU - Xia, Yong
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025
Y1 - 2025
AB - Pseudo-labeling and consistency-based co-training are established paradigms in semi-supervised learning. Pseudo-labeling focuses on selecting reliable pseudo-labels, while co-training emphasizes sub-network diversity for complementary information extraction. However, both paradigms struggle with the inevitable erroneous predictions from unlabeled data, which pose a risk to task-specific decoders and ultimately impact model performance. To address this challenge, we propose a PredICt-and-masK (PICK) model for semi-supervised medical image segmentation. PICK operates by masking and predicting pseudo-label-guided attentive regions to exploit unlabeled data. It features a shared encoder and three task-specific decoders. Specifically, PICK employs a primary decoder supervised solely by labeled data to generate pseudo-labels, identifying potential targets in unlabeled data. The model then masks these regions and reconstructs them using a masked image modeling (MIM) decoder, optimizing through a reconstruction task. To reconcile segmentation and reconstruction, an auxiliary decoder is further developed to learn from the reconstructed images; its predictions are constrained by those of the primary decoder. We evaluate PICK on five medical benchmarks, including single organ/tumor segmentation, multi-organ segmentation, and domain-generalized tasks. Our results indicate that PICK outperforms state-of-the-art methods. The code is available at https://github.com/maxwell0027/PICK.
KW - Attentive region masking
KW - Medical image segmentation
KW - Reconstruction
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85214081072&partnerID=8YFLogxK
U2 - 10.1007/s11263-024-02328-9
DO - 10.1007/s11263-024-02328-9
M3 - Article
AN - SCOPUS:85214081072
SN - 0920-5691
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
M1 - 102530
ER -