TY - JOUR
T1 - Boosting algorithm framework for ensemble neural networks based on coordinate descent
AU - He, Guanxiong
AU - Luo, Sheng
AU - Gao, Dengwei
AU - Nie, Feiping
AU - Li, Guoxu
N1 - Publisher Copyright: © 2026 Elsevier Inc.
PY - 2026/9/5
Y1 - 2026/9/5
N2 - Traditional Gradient Boosting Decision Trees (GBDTs) predominantly rely on greedy forward stagewise optimization and decision tree base learners, which often restrict the model to suboptimal local solutions and limit feature representation capabilities. In this paper, we propose a novel framework termed Coordinate Descent Gradient Boosting (CDGB), which fundamentally alters this paradigm. First, we replace standard decision trees with sparse single-hidden-layer neural networks, incorporating ℓ2,1-norm regularization to adaptively learn compact and diverse subnetworks. Second, unlike the traditional greedy approach that fixes previous learners, we introduce a coordinate descent optimization strategy that enables the joint refinement of all base learners and their weights, theoretically guaranteeing monotonic convergence. Extensive experiments on 14 real-world datasets (covering both classification and regression tasks) demonstrate the robustness of CDGB. Statistical analysis confirms that CDGB achieves the best average rank (ranging from 1.00 to 1.14) across 10 evaluation metrics (e.g., Accuracy, MSE, AUC), significantly outperforming classical baselines (e.g., AdaBoost, Random Forest) and demonstrating superior ranking consistency compared to state-of-the-art libraries such as XGBoost and LightGBM.
KW - Coordinate descent
KW - Ensemble learning
KW - Gradient boosting
KW - Regularization
KW - Sparse neural networks
UR - https://www.scopus.com/pages/publications/105038325409
U2 - 10.1016/j.ins.2026.123559
DO - 10.1016/j.ins.2026.123559
M3 - Article
AN - SCOPUS:105038325409
SN - 0020-0255
VL - 749
JO - Information Sciences
JF - Information Sciences
M1 - 123559
ER -