Eye-Gaze-Guided Vision Transformer for Rectifying Shortcut Learning

Chong Ma, Lin Zhao, Yuzhong Chen, Sheng Wang, Lei Guo, Tuo Zhang, Dinggang Shen, Xi Jiang, Tianming Liu

Research output: Journal article › peer-reviewed

22 Citations (Scopus)

Abstract

Learning harmful shortcuts such as spurious correlations and biases prevents deep neural networks from learning meaningful and useful representations, thus jeopardizing the generalizability and interpretability of what is learned. The situation becomes even more serious in medical image analysis, where clinical data are limited and scarce while high reliability, generalizability, and transparency are required of the learned model. To rectify harmful shortcuts in medical imaging applications, in this paper we propose a novel eye-gaze-guided vision transformer (EG-ViT) model, which infuses the visual attention of radiologists to proactively guide the vision transformer (ViT) to focus on regions with potential pathology rather than on spurious correlations. To do so, the EG-ViT model takes as input the masked image patches that lie within the radiologists' regions of interest, while adding a residual connection to the last encoder layer to maintain interactions among all patches. Experiments on two medical imaging datasets demonstrate that the proposed EG-ViT model can effectively rectify harmful shortcut learning and improve the interpretability of the model. Meanwhile, infusing experts' domain knowledge also improves the large-scale ViT model's performance over all compared baseline methods when only limited samples are available. In general, EG-ViT leverages the power of deep neural networks while rectifying harmful shortcut learning with human experts' prior knowledge. This work also opens new avenues for advancing current artificial intelligence paradigms by infusing human intelligence.
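The gaze-guided input-masking step described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration only; the function name, the attention threshold, and the choice to zero out non-gazed patches are assumptions for exposition, not the paper's actual implementation:

```python
import numpy as np

def gaze_mask_patches(image, gaze_heatmap, patch_size=16, gaze_thresh=0.1):
    """Keep only image patches that overlap the radiologist's gaze.

    A patch is kept when the mean gaze-attention value inside it reaches
    `gaze_thresh`; all other patches are zeroed before being fed to the
    ViT encoder. Returns the masked image and a boolean keep-map over
    the patch grid.
    """
    h, w = image.shape[:2]
    keep = np.zeros((h // patch_size, w // patch_size), dtype=bool)
    masked = np.zeros_like(image)
    for i in range(h // patch_size):
        for j in range(w // patch_size):
            ys, xs = i * patch_size, j * patch_size
            patch_gaze = gaze_heatmap[ys:ys + patch_size, xs:xs + patch_size]
            if patch_gaze.mean() >= gaze_thresh:
                keep[i, j] = True
                masked[ys:ys + patch_size, xs:xs + patch_size] = \
                    image[ys:ys + patch_size, xs:xs + patch_size]
    return masked, keep
```

In the full model, the residual connection from the unmasked patch embeddings to the last encoder layer (as described above) would restore cross-patch interactions that this masking removes.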

Original language: English
Pages (from-to): 3384-3394
Number of pages: 11
Journal: IEEE Transactions on Medical Imaging
Volume: 42
Issue number: 11
DOI
Publication status: Published - 1 Nov 2023
