Software defect prediction based on classifiers ensemble

Tao Wang; Weihua Li; Haobin Shi; Zun Liu

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

68 Scopus citations

Abstract

Many researchers have advocated software defect prediction using classification algorithms. However, several recent studies report that applying a single classifier has hit a performance bottleneck. Classifier ensembles, on the other hand, can achieve better classification performance than a single classifier. Because these two observations suggest that ensemble methods have not been fully exploited for defect prediction, we conduct a comparative study of various ensemble methods from a taxonomic perspective. The methods include Bagging, Boosting, Random trees, Random forest, Random subspace, Stacking, and Voting. We also compare these ensemble methods with a single classifier, Naive Bayes. A series of benchmarking experiments on the public-domain MDP datasets shows that applying classifier ensembles to defect prediction can achieve better performance than using a single classifier. In particular, among the seven ensemble methods involved in our experiments, Voting and Random forest clearly outperformed the others, and Stacking also showed better generalization ability.
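
As a rough illustration of the Voting idea named in the abstract, the sketch below combines several base classifiers by majority vote. It is not the authors' experimental setup: the three threshold rules, the feature order (lines of code, cyclomatic complexity, Halstead volume), and every threshold value are hypothetical stand-ins for trained base learners.

```python
from collections import Counter
from typing import Callable, Sequence

# A "classifier" here is any function mapping a module's metrics to a label
# (1 = defect-prone, 0 = clean). The rules below are hypothetical stand-ins
# for trained base learners; their thresholds are made up for illustration.
Classifier = Callable[[Sequence[float]], int]


def loc_rule(module: Sequence[float]) -> int:
    """Flag modules with more than 100 lines of code."""
    return int(module[0] > 100)


def complexity_rule(module: Sequence[float]) -> int:
    """Flag modules whose cyclomatic complexity exceeds 10."""
    return int(module[1] > 10)


def volume_rule(module: Sequence[float]) -> int:
    """Flag modules whose Halstead volume exceeds 1000."""
    return int(module[2] > 1000)


def majority_vote(classifiers: Sequence[Classifier], module: Sequence[float]) -> int:
    """Return the label chosen by the largest number of base classifiers.

    With an odd number of binary classifiers a tie cannot occur.
    """
    votes = Counter(clf(module) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [loc_rule, complexity_rule, volume_rule]

# Hypothetical modules: (lines of code, cyclomatic complexity, Halstead volume)
modules = [
    (250, 14, 800),   # two of three rules fire
    (40, 3, 200),     # no rule fires
    (80, 12, 1500),   # two of three rules fire
    (120, 4, 300),    # only one rule fires
]

predictions = [majority_vote(ensemble, m) for m in modules]
print(predictions)  # [1, 0, 1, 0]
```

The last module shows the intended effect: one base rule flags it, but the ensemble overrules that single vote. In a real study each rule would be replaced by a classifier trained on labelled defect data.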

Original language	English
Pages (from-to)	4241-4254
Number of pages	14
Journal	Journal of Information and Computational Science
Volume	8
Issue number	16
State	Published - Dec 2011

Keywords

Classifiers ensemble
Ensemble methodology
Software defect prediction

Cite this

@article{d02e0e356bd744ecbfd8b8b6e5ff1d8a,

title = "Software defect prediction based on classifiers ensemble",

keywords = "Classifiers ensemble, Ensemble methodology, Software defect prediction",

author = "Tao Wang and Weihua Li and Haobin Shi and Zun Liu",

year = "2011",

month = dec,

language = "English",

volume = "8",

pages = "4241--4254",

journal = "Journal of Information and Computational Science",

issn = "1548-7741",

publisher = "Binary Information Press",

number = "16",

}

TY - JOUR

T1 - Software defect prediction based on classifiers ensemble

AU - Wang, Tao

AU - Li, Weihua

AU - Shi, Haobin

AU - Liu, Zun

PY - 2011/12

Y1 - 2011/12

KW - Classifiers ensemble

KW - Ensemble methodology

KW - Software defect prediction

UR - http://www.scopus.com/inward/record.url?scp=84855432881&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84855432881

SN - 1548-7741

VL - 8

SP - 4241

EP - 4254

JO - Journal of Information and Computational Science

JF - Journal of Information and Computational Science

IS - 16

ER -