Spot the Difference: Difference Visual Question Answering with Residual Alignment

Zilin Lu, Yutong Xie, Qingjie Zeng, Mengkang Lu, Qi Wu, Yong Xia

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

Difference Visual Question Answering (DiffVQA) is a new task that aims to understand and answer questions about the differences between two images. Unlike traditional medical VQA tasks, DiffVQA closely mirrors the diagnostic practice of radiologists, who frequently make longitudinal comparisons of images taken at different time points for a given patient. The task therefore emphasizes the discrepancies between images captured at different times. To better capture these variations, this paper proposes a novel Residual Alignment model (ReAl) tailored for DiffVQA. ReAl is designed to produce flexible and accurate answers by analyzing the discrepancies between chest X-ray images of the same patient at different time points. Compared to the previous method, ReAl adds a residual input branch that takes the residual of the two images as its input. In addition, a Residual Feature Alignment (RFA) module is introduced to ensure that ReAl effectively captures and learns the differences between corresponding images. Experiments on the MIMIC-Diff-VQA dataset show that ReAl consistently outperforms previous state-of-the-art methods. Ablation experiments further validate the effectiveness of the RFA module in enhancing the model’s attention to differences. The code implementation of the proposed approach will be made available.
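
The abstract does not specify how the residual of the two images is computed. As a minimal illustration only, assuming the residual is a pixel-wise difference between two same-sized grayscale images, it could be sketched as follows. The function name `residual_image` and the toy pixel values are hypothetical and are not taken from the paper or its code.

```python
from typing import List

# Assumption: an image is a 2D list of grayscale intensities.
Image = List[List[float]]


def residual_image(reference: Image, follow_up: Image) -> Image:
    """Return the pixel-wise difference follow_up - reference.

    Hypothetical helper for illustration; the paper's actual residual
    computation is not described in the abstract.
    """
    same_rows = len(reference) == len(follow_up)
    same_cols = all(len(r) == len(f) for r, f in zip(reference, follow_up))
    if not (same_rows and same_cols):
        raise ValueError("images must have the same shape")
    return [
        [f - r for r, f in zip(ref_row, fol_row)]
        for ref_row, fol_row in zip(reference, follow_up)
    ]


# Toy 2x2 "images" standing in for an earlier and a later study.
reference = [[0.0, 0.5], [1.0, 0.25]]
follow_up = [[0.0, 0.75], [0.5, 0.25]]

residual = residual_image(reference, follow_up)
print(residual)  # [[0.0, 0.25], [-0.5, 0.0]]
```

Unchanged pixels map to zero, so only the regions that differ between the two time points carry signal in an input of this kind.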

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 - 27th International Conference, Proceedings
Editors: Marius George Linguraru, Qi Dou, Aasa Feragen, Stamatia Giannarou, Ben Glocker, Karim Lekadir, Julia A. Schnabel
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 649-658
Number of pages: 10
ISBN (Print): 9783031720857
DOI
Publication status: Published - 2024
Event: 27th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2024 - Marrakesh, Morocco
Duration: 6 Oct 2024 to 10 Oct 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15005 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2024
Country/Territory: Morocco
City: Marrakesh
Period: 6 Oct 2024 to 10 Oct 2024
