Software defect prediction based on classifiers ensemble

Tao Wang; Weihua Li; Haobin Shi; Zun Liu

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

68 Citations (Scopus)

Abstract

Software defect prediction using classification algorithms has been advocated by many researchers. However, several recent studies report a performance bottleneck when a single classifier is applied. Classifier ensembles, on the other hand, can effectively improve classification performance over a single classifier. Motivated by these two observations, which indicate that defect prediction using classifier ensemble methods has not been fully exploited, we conduct a comparative study of various ensemble methods from a taxonomic perspective. The methods include Bagging, Boosting, Random trees, Random forest, Random subspace, Stacking, and Voting. We also compare these ensemble methods with a single classifier, Naive Bayes. A series of benchmarking experiments on the public-domain MDP datasets shows that applying classifier ensemble methods to defect prediction can achieve better performance than using a single classifier. In particular, among the seven ensemble methods involved in our experiments, Voting and Random forest clearly outperformed the others, and Stacking also showed better generalization ability.
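As a rough illustration of the Voting idea named in the abstract, the following minimal sketch combines the predictions of several base classifiers by majority vote. It is not the authors' implementation: the function names `majority_vote` and `accuracy`, the three base classifiers, and the toy labels are all assumptions made up for this example, and the classifiers' errors are deliberately placed on different modules so that the vote can correct them.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    `predictions` holds one list of labels per base classifier; all lists
    must have the same length. With an odd number of binary classifiers
    there are no ties.
    """
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy data (not from the paper): 1 = defective, 0 = clean.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
clf_a = [1, 0, 0, 1, 0, 1, 1, 0]  # wrong on modules 2 and 5
clf_b = [1, 1, 1, 1, 0, 0, 0, 0]  # wrong on modules 1 and 6
clf_c = [0, 0, 1, 1, 1, 0, 1, 0]  # wrong on modules 0 and 4

voted = majority_vote([clf_a, clf_b, clf_c])
for name, pred in [("A", clf_a), ("B", clf_b), ("C", clf_c), ("Vote", voted)]:
    print(name, accuracy(pred, actual))
```

The vote only helps here because the base classifiers' mistakes do not overlap; when base classifiers make the same errors, majority voting cannot correct them.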

Original language	English
Pages (from-to)	4241-4254
Number of pages	14
Journal	Journal of Information and Computational Science
Volume	8
Issue number	16
Publication status	Published - Dec 2011

Other files and links

Link to publication in Scopus

Cite this

@article{d02e0e356bd744ecbfd8b8b6e5ff1d8a,

title = "Software defect prediction based on classifiers ensemble",

abstract = "Software defect prediction using classification algorithms was advocated by many researchers. However, several new literatures show the performance bottleneck by applying a single classifier recent years. On the other hand, classifiers ensemble can effectively improve classification performance than a single classifier. Motivated by above two reasons which indicate that defect prediction using classifiers ensemble methods have not fully be exploited, we conduct a comparative study of various ensemble methods with perspective of taxonomy. These methods included Bagging, Boosting, Random trees, Random forest, Random subspace, Stacking, and Voting. We also compared these ensemble methods to a single classifier Naive Bayes. A series of benchmarking experiments on public-domain datasets MDP show that applying classifiers ensemble methods to predict defect could achieve better performance than using a single classifier. Specially, in all seven ensemble methods evolved by our experiments, Voting and Random forest had obvious performance superiority than others, and Stacking also had better generalization ability.",

keywords = "Classifiers ensemble, Ensemble methodology, Software defect prediction",

author = "Tao Wang and Weihua Li and Haobin Shi and Zun Liu",

year = "2011",

month = dec,

language = "English",

volume = "8",

pages = "4241--4254",

journal = "Journal of Information and Computational Science",

issn = "1548-7741",

publisher = "Binary Information Press",

number = "16",

}

TY - JOUR

T1 - Software defect prediction based on classifiers ensemble

AU - Wang, Tao

AU - Li, Weihua

AU - Shi, Haobin

AU - Liu, Zun

PY - 2011/12

Y1 - 2011/12

N2 - Software defect prediction using classification algorithms was advocated by many researchers. However, several new literatures show the performance bottleneck by applying a single classifier recent years. On the other hand, classifiers ensemble can effectively improve classification performance than a single classifier. Motivated by above two reasons which indicate that defect prediction using classifiers ensemble methods have not fully be exploited, we conduct a comparative study of various ensemble methods with perspective of taxonomy. These methods included Bagging, Boosting, Random trees, Random forest, Random subspace, Stacking, and Voting. We also compared these ensemble methods to a single classifier Naive Bayes. A series of benchmarking experiments on public-domain datasets MDP show that applying classifiers ensemble methods to predict defect could achieve better performance than using a single classifier. Specially, in all seven ensemble methods evolved by our experiments, Voting and Random forest had obvious performance superiority than others, and Stacking also had better generalization ability.

AB - Software defect prediction using classification algorithms was advocated by many researchers. However, several new literatures show the performance bottleneck by applying a single classifier recent years. On the other hand, classifiers ensemble can effectively improve classification performance than a single classifier. Motivated by above two reasons which indicate that defect prediction using classifiers ensemble methods have not fully be exploited, we conduct a comparative study of various ensemble methods with perspective of taxonomy. These methods included Bagging, Boosting, Random trees, Random forest, Random subspace, Stacking, and Voting. We also compared these ensemble methods to a single classifier Naive Bayes. A series of benchmarking experiments on public-domain datasets MDP show that applying classifiers ensemble methods to predict defect could achieve better performance than using a single classifier. Specially, in all seven ensemble methods evolved by our experiments, Voting and Random forest had obvious performance superiority than others, and Stacking also had better generalization ability.

KW - Classifiers ensemble

KW - Ensemble methodology

KW - Software defect prediction

UR - http://www.scopus.com/inward/record.url?scp=84855432881&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84855432881

SN - 1548-7741

VL - 8

SP - 4241

EP - 4254

JO - Journal of Information and Computational Science

JF - Journal of Information and Computational Science

IS - 16

ER -
