TY - JOUR
T1 - Specific feature recognition on group specific networks (SFR-GSN)
T2 - a biomarker identification model for cancer stages
AU - Chen, Bolin
AU - Wang, Yuxin
AU - Zhang, Jinlei
AU - Han, Yourui
AU - Benhammouda, Hamza
AU - Bian, Jun
AU - Kang, Ruiming
AU - Shang, Xuequn
N1 - Publisher Copyright: Copyright © 2024 Chen, Wang, Zhang, Han, Benhammouda, Bian, Kang and Shang.
PY - 2024/5
Y1 - 2024/5
N2 - Background and Objective: Accurate identification of cancer stages is challenging due to the complexity and heterogeneity of the disease. Current clinical diagnosis methods primarily rely on phenotypic observations, which may not accurately capture early molecular-level changes. Methods: In this study, a novel biomarker recognition method tailored to cancer stages was proposed, which considers changes in gene expression relationships. Using sample-specific information and protein-protein interaction networks, group specific networks were constructed to address the limited specificity of potential biomarkers. Then, a specific feature recognition method was proposed based on these group specific networks, which employed the random forest algorithm for initial screening followed by a recursive feature elimination process to identify the optimal biomarker subset. While searching for the optimal result, a strategy termed the Cost-Benefit Ratio was devised to facilitate the identification of stage-specific biomarkers. Results: Comparative experiments were conducted on lung adenocarcinoma and breast cancer datasets to validate the method’s efficacy and generalizability. The results showed that the identified biomarkers were highly stage-specific, and the F1 scores for predicting cancer stages were significantly improved. For the lung adenocarcinoma dataset, the F1 score reached 97.68%, and for the breast cancer dataset, it reached 96.87%. These F1 scores significantly surpassed those of three conventional methods. Moreover, from the perspective of biological functions, the biomarkers were shown to play an important role in cancer stage evolution. Conclusion: The proposed method demonstrated its effectiveness in identifying stage-related biomarkers. By using these biomarkers as features, accurate prediction of cancer stages was achieved. Furthermore, the method exhibited potential for biomarker identification in subtype analyses, offering novel perspectives for cancer prognosis.
KW - biomarker
KW - cancer stages
KW - edge feature
KW - group specific network
KW - multi classification tasks
UR - http://www.scopus.com/inward/record.url?scp=85195393843&partnerID=8YFLogxK
U2 - 10.3389/fgene.2024.1407072
DO - 10.3389/fgene.2024.1407072
M3 - Article
AN - SCOPUS:85195393843
SN - 1664-8021
VL - 15
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 1407072
ER -