TY - JOUR
T1 - Mining algorithm for breast cancer candidate disease module based on key node groups
AU - Wang, Yibin
AU - Cheng, Yongmei
AU - Zhang, Shaowu
N1 - Publisher Copyright:
© 2016, Editorial Department of Journal of Southeast University. All rights reserved.
PY - 2016/3/20
Y1 - 2016/3/20
AB - To address the limited quantity, incompleteness, noise, and bias of gene expression profiles in breast cancer disease module mining, a candidate disease module mining algorithm based on key node groups and local node fitness constraints, the key node groups and local fitness (KNGLF) algorithm, is proposed. First, the topological overlap similarity score and the functional similarity score between candidate genes and pathogenic genes are fused into a single fusion score. Key nodes are selected by comparing the fusion score with a threshold value, and key node groups are constructed from them. Then, breast cancer candidate disease modules are mined under the local fitness constraints, with different decision criteria applied to different nodes. Finally, the candidate disease gene modules are identified according to the enrichment analysis results. The experimental results show that, compared with existing mining algorithms for breast cancer modules, the key node selection algorithm in KNGLF achieves a smaller mean rank ratio (MRR) and a greater area under the curve (AUC). The KNGLF algorithm identifies fifteen breast cancer candidate gene modules of significant biological relevance. The algorithm can also be extended to identify candidate modules related to other diseases.
KW - Breast cancer
KW - Candidate gene score
KW - Disease module mining
KW - Key node groups
KW - Local fitness
UR - http://www.scopus.com/inward/record.url?scp=84964317970&partnerID=8YFLogxK
U2 - 10.3969/j.issn.1001-0505.2016.02.007
DO - 10.3969/j.issn.1001-0505.2016.02.007
M3 - Article
AN - SCOPUS:84964317970
SN - 1001-0505
VL - 46
SP - 265
EP - 270
JO - Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)
JF - Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)
IS - 2
ER -