UAE: Universal Anatomical Embedding on multi-modality medical images

Xiaoyu Bai; Fan Bai; Xiaofei Huo; Jia Ge; Jingjing Lu; Xianghua Ye; Minglei Shu; Ke Yan; Yong Xia

doi:10.1016/j.media.2025.103562

School of Computer Science

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying anatomical structures (e.g., lesions or landmarks) is crucial for medical image analysis. Exemplar-based landmark detection methods are gaining attention as they allow the detection of arbitrary points during inference without needing annotated landmarks during training. These methods use self-supervised learning to create a discriminative voxel embedding and match corresponding landmarks via nearest-neighbor searches, showing promising results. However, current methods still face challenges in (1) differentiating voxels with similar appearance but different semantic meanings (e.g., two adjacent structures without clear borders); (2) matching voxels with similar semantics but markedly different appearance (e.g., the same vessel before and after contrast injection); and (3) cross-modality matching (e.g., CT-MRI landmark-based registration). To overcome these challenges, we propose a Unified framework for learning Anatomical Embeddings (UAE). UAE is designed to learn appearance, semantic, and cross-modality anatomical embeddings. Specifically, UAE incorporates three key innovations: (1) semantic embedding learning with prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning. We thoroughly evaluated UAE across intra- and inter-modality tasks, including one-shot landmark detection, lesion tracking on longitudinal CT scans, and CT-MRI affine/rigid registration with varying fields of view. Our results suggest that UAE outperforms state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks. Code and trained models are available at: https://github.com/alibaba-damo-academy/self-supervised-anatomical-embedding-v2.
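The nearest-neighbor matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes each voxel already has an embedding vector and that cosine similarity is the matching score. The names `l2_normalize` and `nearest_neighbor_match` and the toy data are hypothetical, chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def nearest_neighbor_match(template_embedding, query_embeddings):
    """Find the query voxel whose embedding is most similar to the template.

    template_embedding: embedding of the landmark picked on the template image.
    query_embeddings: dict mapping a voxel coordinate (z, y, x) to its embedding.
    Returns (best_coordinate, best_cosine_similarity).
    """
    t = l2_normalize(template_embedding)
    best_coord, best_sim = None, -math.inf
    for coord, emb in query_embeddings.items():
        q = l2_normalize(emb)
        sim = sum(a * b for a, b in zip(t, q))
        if sim > best_sim:
            best_coord, best_sim = coord, sim
    return best_coord, best_sim


# Toy example (hypothetical 3-dimensional embeddings for three query voxels).
template = [0.9, 0.1, 0.0]
query = {
    (0, 0, 0): [0.0, 1.0, 0.0],
    (0, 0, 1): [1.0, 0.2, 0.0],
    (0, 1, 0): [0.0, 0.0, 1.0],
}
coord, sim = nearest_neighbor_match(template, query)
print(coord, sim)
```

In practice the embeddings would come from a trained network and the search would run over a dense 3D volume with vectorized operations; the exhaustive loop here only shows the matching logic.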

Original language	English
Article number	103562
Journal	Medical Image Analysis
Volume	103
DOIs	https://doi.org/10.1016/j.media.2025.103562
State	Published - Jul 2025

Keywords

Anatomical embedding learning
Landmark matching
Multi-modality image alignment

Cite this

@article{f9de80a916e64333a782ddc10a0358e2,

title = "UAE: Universal Anatomical Embedding on multi-modality medical images",

abstract = "Identifying anatomical structures (e.g., lesions or landmarks) is crucial for medical image analysis. Exemplar-based landmark detection methods are gaining attention as they allow the detection of arbitrary points during inference without needing annotated landmarks during training. These methods use self-supervised learning to create a discriminative voxel embedding and match corresponding landmarks via nearest-neighbor searches, showing promising results. However, current methods still face challenges in (1) differentiating voxels with similar appearance but different semantic meanings (e.g., two adjacent structures without clear borders); (2) matching voxels with similar semantics but markedly different appearance (e.g., the same vessel before and after contrast injection); and (3) cross-modality matching (e.g., CT-MRI landmark-based registration). To overcome these challenges, we propose a Unified framework for learning Anatomical Embeddings (UAE). UAE is designed to learn appearance, semantic, and cross-modality anatomical embeddings. Specifically, UAE incorporates three key innovations: (1) semantic embedding learning with prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning. We thoroughly evaluated UAE across intra- and inter-modality tasks, including one-shot landmark detection, lesion tracking on longitudinal CT scans, and CT-MRI affine/rigid registration with varying fields of view. Our results suggest that UAE outperforms state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks. Code and trained models are available at: https://github.com/alibaba-damo-academy/self-supervised-anatomical-embedding-v2.",

keywords = "Anatomical embedding learning, Landmark matching, Multi-modality image alignment",

author = "Xiaoyu Bai and Fan Bai and Xiaofei Huo and Jia Ge and Jingjing Lu and Xianghua Ye and Minglei Shu and Ke Yan and Yong Xia",

note = "Publisher Copyright: {\textcopyright} 2025 Elsevier B.V.",

year = "2025",

month = jul,

doi = "10.1016/j.media.2025.103562",

language = "English",

volume = "103",

journal = "Medical Image Analysis",

issn = "1361-8415",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - UAE

T2 - Universal Anatomical Embedding on multi-modality medical images

AU - Bai, Xiaoyu

AU - Bai, Fan

AU - Huo, Xiaofei

AU - Ge, Jia

AU - Lu, Jingjing

AU - Ye, Xianghua

AU - Shu, Minglei

AU - Yan, Ke

AU - Xia, Yong

PY - 2025/7

Y1 - 2025/7

N2 - Identifying anatomical structures (e.g., lesions or landmarks) is crucial for medical image analysis. Exemplar-based landmark detection methods are gaining attention as they allow the detection of arbitrary points during inference without needing annotated landmarks during training. These methods use self-supervised learning to create a discriminative voxel embedding and match corresponding landmarks via nearest-neighbor searches, showing promising results. However, current methods still face challenges in (1) differentiating voxels with similar appearance but different semantic meanings (e.g., two adjacent structures without clear borders); (2) matching voxels with similar semantics but markedly different appearance (e.g., the same vessel before and after contrast injection); and (3) cross-modality matching (e.g., CT-MRI landmark-based registration). To overcome these challenges, we propose a Unified framework for learning Anatomical Embeddings (UAE). UAE is designed to learn appearance, semantic, and cross-modality anatomical embeddings. Specifically, UAE incorporates three key innovations: (1) semantic embedding learning with prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning. We thoroughly evaluated UAE across intra- and inter-modality tasks, including one-shot landmark detection, lesion tracking on longitudinal CT scans, and CT-MRI affine/rigid registration with varying fields of view. Our results suggest that UAE outperforms state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks. Code and trained models are available at: https://github.com/alibaba-damo-academy/self-supervised-anatomical-embedding-v2.

AB - Identifying anatomical structures (e.g., lesions or landmarks) is crucial for medical image analysis. Exemplar-based landmark detection methods are gaining attention as they allow the detection of arbitrary points during inference without needing annotated landmarks during training. These methods use self-supervised learning to create a discriminative voxel embedding and match corresponding landmarks via nearest-neighbor searches, showing promising results. However, current methods still face challenges in (1) differentiating voxels with similar appearance but different semantic meanings (e.g., two adjacent structures without clear borders); (2) matching voxels with similar semantics but markedly different appearance (e.g., the same vessel before and after contrast injection); and (3) cross-modality matching (e.g., CT-MRI landmark-based registration). To overcome these challenges, we propose a Unified framework for learning Anatomical Embeddings (UAE). UAE is designed to learn appearance, semantic, and cross-modality anatomical embeddings. Specifically, UAE incorporates three key innovations: (1) semantic embedding learning with prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning. We thoroughly evaluated UAE across intra- and inter-modality tasks, including one-shot landmark detection, lesion tracking on longitudinal CT scans, and CT-MRI affine/rigid registration with varying fields of view. Our results suggest that UAE outperforms state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks. Code and trained models are available at: https://github.com/alibaba-damo-academy/self-supervised-anatomical-embedding-v2.

KW - Anatomical embedding learning

KW - Landmark matching

KW - Multi-modality image alignment

UR - http://www.scopus.com/inward/record.url?scp=105002128356&partnerID=8YFLogxK

U2 - 10.1016/j.media.2025.103562

DO - 10.1016/j.media.2025.103562

M3 - Article

AN - SCOPUS:105002128356

SN - 1361-8415

VL - 103

JO - Medical Image Analysis

JF - Medical Image Analysis

M1 - 103562

ER -
