TransLSTD: Augmenting hierarchical disease risk prediction model with time and context awareness via disease clustering

Tao You; Qiaodong Dang; Qing Li; Peng Zhang; Guanzhong Wu; Wei Huang

doi:10.1016/j.is.2024.102390

School of Computer Science

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

The use of electronic health records has become widespread, providing a valuable source of information for predicting disease risk. While deep neural network models have been proposed and shown to be effective in this task, supplemented with medical domain knowledge for interpretability, several limitations still exist. Firstly, there is often a lack of differentiation between chronic and acute diseases, leading to biased modeling of diseases. Secondly, the extraction of patient single-layer temporal patterns is limited, which hinders comprehensive representation and predictive power. Thirdly, weak interpretability based on deep neural networks prevents the extraction of valuable medical knowledge, limiting practical applications. To overcome these challenges, we propose TransLSTD, a hierarchical model that incorporates time awareness and context awareness while distinguishing between long-term and short-term diseases. TransLSTD uses clustering algorithms to classify disease types based on the occurrence feature matrix of diseases from the EHR dataset and updates disease representations at the code level while creating patient visit embeddings. The model utilizes query vectors to incorporate visit context information and combines time data to capture the patient's overall health status. Finally, the prediction module generates outcomes and provides effective interpretations. We demonstrate the effectiveness of TransLSTD using two real-world datasets, outperforming state-of-the-art models in terms of both AUC and F1 values. The data and code are released at https://github.com/DangQD/TransLSTD-master.

Original language	English
Article number	102390
Journal	Information Systems
Volume	124
DOI	https://doi.org/10.1016/j.is.2024.102390
Publication status	Published - Sep 2024

Cite this

@article{fdb8912766234080b8c1d5512902117c,

title = "TransLSTD: Augmenting hierarchical disease risk prediction model with time and context awareness via disease clustering",

abstract = "The use of electronic health records has become widespread, providing a valuable source of information for predicting disease risk. While deep neural network models have been proposed and shown to be effective in this task, supplemented with medical domain knowledge for interpretability, several limitations still exist. Firstly, there is often a lack of differentiation between chronic and acute diseases leading to biased modeling of diseases. Secondly, the extraction of patient single-layer temporal patterns is limited, which hinders comprehensive representation and predictive power. Thirdly, weak interpretability based on deep neural networks prevents the extraction of valuable medical knowledge, limiting practical applications. To overcome these challenges, we propose TransLSTD, a hierarchical model that incorporates time awareness and context awareness while distinguishing between long-term and short-term diseases. TransLSTD uses clustering algorithms to classify disease types based on the occurrence feature matrix of diseases from EHR dataset and updates disease representation at the code level while creating patient visit embeddings. The model utilizes query vectors to incorporate visit context information and combines time data to capture the patient's overall health status. Finally, the prediction module generates outcomes and provides effective interpretations. We demonstrate the effectiveness of TransLSTD using two real-world datasets, outperforming state-of-the-art models in terms of both AUC and F1 values. The data and code are released at https://github.com/DangQD/TransLSTD-master.",

keywords = "Data mining, Disease classification, Disease risk prediction, Electronic health records, Interpretability",

author = "Tao You and Qiaodong Dang and Qing Li and Peng Zhang and Guanzhong Wu and Wei Huang",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Ltd",

year = "2024",

month = sep,

doi = "10.1016/j.is.2024.102390",

language = "English",

volume = "124",

journal = "Information Systems",

issn = "0306-4379",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - TransLSTD

T2 - Augmenting hierarchical disease risk prediction model with time and context awareness via disease clustering

AU - You, Tao

AU - Dang, Qiaodong

AU - Li, Qing

AU - Zhang, Peng

AU - Wu, Guanzhong

AU - Huang, Wei

PY - 2024/9

Y1 - 2024/9

AB - The use of electronic health records has become widespread, providing a valuable source of information for predicting disease risk. While deep neural network models have been proposed and shown to be effective in this task, supplemented with medical domain knowledge for interpretability, several limitations still exist. Firstly, there is often a lack of differentiation between chronic and acute diseases leading to biased modeling of diseases. Secondly, the extraction of patient single-layer temporal patterns is limited, which hinders comprehensive representation and predictive power. Thirdly, weak interpretability based on deep neural networks prevents the extraction of valuable medical knowledge, limiting practical applications. To overcome these challenges, we propose TransLSTD, a hierarchical model that incorporates time awareness and context awareness while distinguishing between long-term and short-term diseases. TransLSTD uses clustering algorithms to classify disease types based on the occurrence feature matrix of diseases from EHR dataset and updates disease representation at the code level while creating patient visit embeddings. The model utilizes query vectors to incorporate visit context information and combines time data to capture the patient's overall health status. Finally, the prediction module generates outcomes and provides effective interpretations. We demonstrate the effectiveness of TransLSTD using two real-world datasets, outperforming state-of-the-art models in terms of both AUC and F1 values. The data and code are released at https://github.com/DangQD/TransLSTD-master.

KW - Data mining

KW - Disease classification

KW - Disease risk prediction

KW - Electronic health records

KW - Interpretability

UR - http://www.scopus.com/inward/record.url?scp=85191309013&partnerID=8YFLogxK

U2 - 10.1016/j.is.2024.102390

DO - 10.1016/j.is.2024.102390

M3 - Article

AN - SCOPUS:85191309013

SN - 0306-4379

VL - 124

JO - Information Systems

JF - Information Systems

M1 - 102390

ER -
