Mining algorithm for breast cancer candidate disease module based on key node groups

Yibin Wang; Yongmei Cheng; Shaowu Zhang

doi:10.3969/j.issn.1001-0505.2016.02.007

School of Automation

Northwestern Polytechnical University, Xi'an

Research output: Contribution to journal › Article › peer-review

Abstract

To address the limited quantity, incompleteness, noise, and bias of the gene expression profiles used in breast cancer disease module mining, a mining algorithm for candidate disease modules based on key node groups and local node fitness constraints, the key node groups and local fitness (KNGLF) algorithm, is proposed. First, the topological overlap similarity score and the functional similarity score between the candidate genes and the pathogenic genes are fused into a single fusion score. Key nodes are selected by comparing the fusion score with a threshold, and the key node groups are constructed. Then, the breast cancer candidate disease modules are mined using the local fitness constraints, with different decision criteria applied to different nodes. Finally, the candidate disease gene modules are identified according to the enrichment analysis results. The experimental results show that, compared with other existing mining algorithms for breast cancer modules, the key node selection algorithm in KNGLF achieves a smaller mean rank ratio (MRR) and a larger area under the curve (AUC). The KNGLF algorithm identifies fifteen biologically significant breast cancer candidate gene modules. In addition, the KNGLF algorithm can be extended to identify candidate modules related to other diseases.
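The key node selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two similarity scores are fused, so the weighted average below, the weight `alpha`, the threshold value, and the gene identifiers are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Dict, List


def fuse_scores(topo: Dict[str, float], func: Dict[str, float],
                alpha: float = 0.5) -> Dict[str, float]:
    """Fuse two per-gene similarity scores into one fusion score.

    ASSUMPTION: a weighted average with weight `alpha`; the abstract
    does not specify the actual fusion rule.
    """
    shared = topo.keys() & func.keys()  # genes that have both scores
    return {g: alpha * topo[g] + (1.0 - alpha) * func[g] for g in shared}


def select_key_nodes(fused: Dict[str, float], threshold: float) -> List[str]:
    """Return genes whose fusion score reaches the threshold, best first."""
    selected = [g for g, s in fused.items() if s >= threshold]
    return sorted(selected, key=lambda g: (-fused[g], g))


# Hypothetical candidate genes and scores, for illustration only.
topo_similarity = {"geneA": 0.9, "geneB": 0.2, "geneC": 0.6}
func_similarity = {"geneA": 0.7, "geneB": 0.4, "geneC": 0.8}

fused = fuse_scores(topo_similarity, func_similarity)
key_nodes = select_key_nodes(fused, threshold=0.6)
print(key_nodes)  # ['geneA', 'geneC']
```

In this toy run, only the candidates whose fused score reaches the threshold are kept as key nodes; grouping them and growing modules under local fitness constraints are later steps not sketched here.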

Original language	English
Pages (from-to)	265-270
Number of pages	6
Journal	Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)
Volume	46
Issue number	2
DOIs	https://doi.org/10.3969/j.issn.1001-0505.2016.02.007
State	Published - 20 Mar 2016

Keywords

Breast cancer
Candidate gene score
Disease module mining
Key node groups
Local fitness

Cite this

@article{fe70717197434fccb990c72dd3b910dc,

title = "Mining algorithm for breast cancer candidate disease module based on key node groups",

abstract = "To address the limited quantity, incompleteness, noise, and bias of the gene expression profiles used in breast cancer disease module mining, a mining algorithm for candidate disease modules based on key node groups and local node fitness constraints, the key node groups and local fitness (KNGLF) algorithm, is proposed. First, the topological overlap similarity score and the functional similarity score between the candidate genes and the pathogenic genes are fused into a single fusion score. Key nodes are selected by comparing the fusion score with a threshold, and the key node groups are constructed. Then, the breast cancer candidate disease modules are mined using the local fitness constraints, with different decision criteria applied to different nodes. Finally, the candidate disease gene modules are identified according to the enrichment analysis results. The experimental results show that, compared with other existing mining algorithms for breast cancer modules, the key node selection algorithm in KNGLF achieves a smaller mean rank ratio (MRR) and a larger area under the curve (AUC). The KNGLF algorithm identifies fifteen biologically significant breast cancer candidate gene modules. In addition, the KNGLF algorithm can be extended to identify candidate modules related to other diseases.",

keywords = "Breast cancer, Candidate gene score, Disease module mining, Key node groups, Local fitness",

author = "Yibin Wang and Yongmei Cheng and Shaowu Zhang",

year = "2016",

month = mar,

day = "20",

doi = "10.3969/j.issn.1001-0505.2016.02.007",

language = "English",

volume = "46",

pages = "265--270",

journal = "Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)",

issn = "1001-0505",

publisher = "Southeast University",

number = "2",

}

TY - JOUR

T1 - Mining algorithm for breast cancer candidate disease module based on key node groups

AU - Wang, Yibin

AU - Cheng, Yongmei

AU - Zhang, Shaowu

PY - 2016/3/20

Y1 - 2016/3/20

N2 - To address the limited quantity, incompleteness, noise, and bias of the gene expression profiles used in breast cancer disease module mining, a mining algorithm for candidate disease modules based on key node groups and local node fitness constraints, the key node groups and local fitness (KNGLF) algorithm, is proposed. First, the topological overlap similarity score and the functional similarity score between the candidate genes and the pathogenic genes are fused into a single fusion score. Key nodes are selected by comparing the fusion score with a threshold, and the key node groups are constructed. Then, the breast cancer candidate disease modules are mined using the local fitness constraints, with different decision criteria applied to different nodes. Finally, the candidate disease gene modules are identified according to the enrichment analysis results. The experimental results show that, compared with other existing mining algorithms for breast cancer modules, the key node selection algorithm in KNGLF achieves a smaller mean rank ratio (MRR) and a larger area under the curve (AUC). The KNGLF algorithm identifies fifteen biologically significant breast cancer candidate gene modules. In addition, the KNGLF algorithm can be extended to identify candidate modules related to other diseases.

KW - Breast cancer

KW - Candidate gene score

KW - Disease module mining

KW - Key node groups

KW - Local fitness

UR - http://www.scopus.com/inward/record.url?scp=84964317970&partnerID=8YFLogxK

U2 - 10.3969/j.issn.1001-0505.2016.02.007

DO - 10.3969/j.issn.1001-0505.2016.02.007

M3 - Article

AN - SCOPUS:84964317970

SN - 1001-0505

VL - 46

SP - 265

EP - 270

JO - Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)

JF - Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)

IS - 2

ER -
