TY - JOUR
T1 - Code Multiview Hypergraph Representation Learning for Software Defect Prediction
AU - Qiu, Shaojian
AU - Huang, Mengyang
AU - Liang, Yun
AU - Peng, Chaoda
AU - Yuan, Yuan
N1 - Publisher Copyright: © 1963-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Software defect prediction technology aids the reliability assurance team in identifying defect-prone code and assists the team in reasonably allocating limited testing resources. Recently, researchers assumed that the topological associations among code fragments could be harnessed to construct defect prediction models. Nevertheless, existing graph-based methods only concentrate on features of single-view association, which fail to fully capture the rich information hidden in the code. In addition, software defects may involve multiple code fragments simultaneously, but traditional binary graph structures are insufficient for representing these multivariate associations. To address these two challenges, this article proposes a multiview hypergraph representation learning approach (MVHR-DP) to amplify the potency of code features in defect prediction. MVHR-DP initiates by creating hypergraph structures for each code view, which are then amalgamated into a comprehensive fusion hypergraph. Following this, a hypergraph neural network is established to extract code features from multiple views and intricate associations, thereby enhancing the comprehensiveness of representation in the modeling data. Empirical study shows that the prediction model utilizing features generated by MVHR-DP exhibits superior area under the curve (AUC), F-measure, and Matthews correlation coefficient (MCC) results compared to baseline approaches across within-project, cross-version, and cross-project prediction tasks.
AB - Software defect prediction technology aids the reliability assurance team in identifying defect-prone code and assists the team in reasonably allocating limited testing resources. Recently, researchers assumed that the topological associations among code fragments could be harnessed to construct defect prediction models. Nevertheless, existing graph-based methods only concentrate on features of single-view association, which fail to fully capture the rich information hidden in the code. In addition, software defects may involve multiple code fragments simultaneously, but traditional binary graph structures are insufficient for representing these multivariate associations. To address these two challenges, this article proposes a multiview hypergraph representation learning approach (MVHR-DP) to amplify the potency of code features in defect prediction. MVHR-DP initiates by creating hypergraph structures for each code view, which are then amalgamated into a comprehensive fusion hypergraph. Following this, a hypergraph neural network is established to extract code features from multiple views and intricate associations, thereby enhancing the comprehensiveness of representation in the modeling data. Empirical study shows that the prediction model utilizing features generated by MVHR-DP exhibits superior area under the curve (AUC), F-measure, and Matthews correlation coefficient (MCC) results compared to baseline approaches across within-project, cross-version, and cross-project prediction tasks.
KW - Code multiview fusion
KW - code representation learning
KW - hypergraph construction
KW - software defect prediction
KW - software reliability
UR - http://www.scopus.com/inward/record.url?scp=85193289974&partnerID=8YFLogxK
U2 - 10.1109/TR.2024.3393415
DO - 10.1109/TR.2024.3393415
M3 - Article
AN - SCOPUS:85193289974
SN - 0018-9529
VL - 73
SP - 1863
EP - 1876
JO - IEEE Transactions on Reliability
JF - IEEE Transactions on Reliability
IS - 4
ER -