TY - GEN
T1 - Enjoying Information Dividend
T2 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
AU - Wang, Zhisong
AU - Ye, Yiwen
AU - Chen, Ziyang
AU - Xia, Yong
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Weakly supervised semantic segmentation (WSSS) in medical imaging struggles with effectively using sparse annotations. One promising direction for WSSS leverages gaze annotations, captured via eye trackers that record regions of interest during diagnostic procedures. However, existing gaze-based methods, such as GazeMedSeg, do not fully exploit the rich information embedded in gaze data. In this paper, we propose GradTrack, a framework that utilizes physicians’ gaze track, including fixation points, durations, and temporal order, to enhance WSSS performance. GradTrack comprises two key components: (1) the Gaze Track Map Generation module for creating hierarchical attention maps, and (2) the Track Attention module for integrating attention features, which collaboratively enable progressive feature refinement through multi-level gaze supervision during the decoding process. Experiments on the Kvasir-SEG and NCI-ISBI datasets demonstrate that our GradTrack consistently outperforms existing gaze-based methods, achieving Dice score improvements of 3.21% and 2.61%, respectively. Moreover, GradTrack significantly narrows the performance gap with fully supervised models, such as nnUNet.
KW - Eye-tracking
KW - Gaze Supervision
KW - Segmentation
UR - https://www.scopus.com/pages/publications/105018040236
U2 - 10.1007/978-3-032-05127-1_20
DO - 10.1007/978-3-032-05127-1_20
M3 - Conference contribution
AN - SCOPUS:105018040236
SN - 9783032051264
T3 - Lecture Notes in Computer Science
SP - 202
EP - 212
BT - Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, Proceedings
A2 - Gee, James C.
A2 - Hong, Jaesung
A2 - Sudre, Carole H.
A2 - Golland, Polina
A2 - Park, Jinah
A2 - Alexander, Daniel C.
A2 - Iglesias, Juan Eugenio
A2 - Venkataraman, Archana
A2 - Kim, Jong Hyo
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 September 2025 through 27 September 2025
ER -