DPCIPI: A pre-trained deep learning model for predicting cross-immunity between drifted strains of Influenza A/H3N2

Yiming Du; Zhuotian Li; Qian He; Thomas Wetere Tulu; Kei Hang Katie Chan; Lin Wang; Sen Pei; Zhanwei Du; Zhen Wang; Xiao Ke Xu; Xiao Fan Liu

doi:10.1016/j.jai.2025.03.004


School of Cybersecurity

Research output: Contribution to journal › Article › peer-review

Abstract

Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective because lab data for model training are scarce and because crucial features are overshadowed when sequences are concatenated. The current work proposes a less data-consuming model that incorporates a pre-trained gene sequence model and a mutual information inference operator. Our methodology uses gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on the differences between the sequences in each input gene pair. The resulting model, the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences alone. For binary cross-immunity prediction, the improvement is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
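
The four metrics reported in the abstract can be illustrated with a minimal sketch. The abstract does not state how per-class scores are averaged, so support-weighted averaging is an assumption here; under that scheme recall is mathematically equal to accuracy, which would be consistent with the identical recall and accuracy improvements reported. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (precision, recall, f1, accuracy) using support-weighted averaging.

    Assumption: the averaging scheme is not given in the abstract; weighted
    averaging is used here purely for illustration.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # True positives and predicted positives for this class
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # class weight proportional to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return precision, recall, f1, accuracy


# Toy labels, made up for illustration: 0 = no cross-immunity, 1 = cross-immunity
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0, 1, 1]
precision, recall, f1, accuracy = weighted_metrics(y_true, y_pred)
print(precision, recall, f1, accuracy)
```

The same function works unchanged for multilevel labels, since it iterates over whatever classes appear in `y_true`.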

Original language	English
Journal	Journal of Automation and Intelligence
DOIs	https://doi.org/10.1016/j.jai.2025.03.004
State	Accepted/In press - 2025

Keywords

Cross-immunity prediction
Deep learning
Hemagglutination inhibition
Influenza strains
Pre-trained model

Access to Document

10.1016/j.jai.2025.03.004

Cite this

@article{d87442e5f5d14eb5b4a2de53d7a21265,

title = "DPCIPI: A pre-trained deep learning model for predicting cross-immunity between drifted strains of Influenza A/H3N2",

abstract = "Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, could be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, i.e., DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences only. Improvement in binary cross-immunity prediction is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in Accuracy. For multilevel cross-immunity improvements, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in Accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.",

keywords = "Cross-immunity prediction, Deep learning, Hemagglutination inhibition, Influenza strains, Pre-trained model",

author = "Yiming Du and Zhuotian Li and Qian He and Tulu, {Thomas Wetere} and Chan, {Kei Hang Katie} and Lin Wang and Sen Pei and Zhanwei Du and Zhen Wang and Xu, {Xiao Ke} and Liu, {Xiao Fan}",

note = "Publisher Copyright: {\textcopyright} 2025 The Authors",

year = "2025",

doi = "10.1016/j.jai.2025.03.004",

language = "English",

journal = "Journal of Automation and Intelligence",

issn = "2949-8554",

publisher = "KeAi Communications Co.",

}

TY - JOUR

T1 - DPCIPI

T2 - A pre-trained deep learning model for predicting cross-immunity between drifted strains of Influenza A/H3N2

AU - Du, Yiming

AU - Li, Zhuotian

AU - He, Qian

AU - Tulu, Thomas Wetere

AU - Chan, Kei Hang Katie

AU - Wang, Lin

AU - Pei, Sen

AU - Du, Zhanwei

AU - Wang, Zhen

AU - Xu, Xiao Ke

AU - Liu, Xiao Fan

PY - 2025

Y1 - 2025

N2 - Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, could be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, i.e., DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences only. Improvement in binary cross-immunity prediction is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in Accuracy. For multilevel cross-immunity improvements, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in Accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.

AB - Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, could be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, i.e., DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences only. Improvement in binary cross-immunity prediction is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in Accuracy. For multilevel cross-immunity improvements, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in Accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.

KW - Cross-immunity prediction

KW - Deep learning

KW - Hemagglutination inhibition

KW - Influenza strains

KW - Pre-trained model

UR - http://www.scopus.com/inward/record.url?scp=105002757287&partnerID=8YFLogxK

U2 - 10.1016/j.jai.2025.03.004

DO - 10.1016/j.jai.2025.03.004

M3 - Article

AN - SCOPUS:105002757287

SN - 2949-8554

JO - Journal of Automation and Intelligence

JF - Journal of Automation and Intelligence

ER -
