TY - JOUR
T1 - ChestXRayBERT
T2 - A Pretrained Language Model for Chest Radiology Report Summarization
AU - Cai, Xiaoyan
AU - Liu, Sen
AU - Han, Junwei
AU - Yang, Libin
AU - Liu, Zhenguo
AU - Liu, Tianming
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2023
Y1 - 2023
AB - Automatically generating the 'impression' section of a radiology report from its 'findings' section condenses the salient information of the 'findings' into a concise summary, promoting more effective communication between radiologists and referring physicians. To reduce the workload of radiologists, we develop and evaluate a novel abstractive summarization framework that automatically generates the 'impression' section of chest radiology reports. Despite recent advances in natural language processing (NLP), such as BERT and its variants, existing abstractive summarization models cannot be applied directly to radiology reports, partly because of domain-specific radiology terminology. In response, we develop a pre-trained language model for the chest radiology domain, named ChestXRayBERT, to summarize chest radiology reports automatically. Specifically, we first collect radiology-related scientific papers as a pre-training corpus and pre-train ChestXRayBERT on it. We then propose an abstractive summarization model consisting of the pre-trained ChestXRayBERT and a Transformer decoder, and fine-tune it on chest X-ray reports for the summarization task. When evaluated on the publicly available OPEN-I and MIMIC-CXR datasets, the proposed model achieves significant improvements over other neural network-based abstractive summarization models. Overall, ChestXRayBERT demonstrates the feasibility and promise of tailoring and extending advanced NLP techniques to medical imaging and radiology, and to the broader biomedicine and healthcare fields in the future.
KW - abstractive summarization
KW - chest radiology report
KW - Pre-trained language model
UR - http://www.scopus.com/inward/record.url?scp=85121395984&partnerID=8YFLogxK
DO - 10.1109/TMM.2021.3132724
M3 - Article
AN - SCOPUS:85121395984
SN - 1520-9210
VL - 25
SP - 845
EP - 855
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -