Machine learning for phase prediction of high entropy carbide ceramics from imbalanced data

  • Xuemeng Zhang
  • Jia Sun
  • Yuyu Zhang
  • Kaifei Fan
  • Zhixiang Zhang
  • Yujia Zhang
  • Keke Wu
  • Laura Feldmann
  • Lianwei Wu
  • Ralf Riedel
  • Hejun Li

Research output: Contribution to journal › Article › peer-review

Abstract

High-entropy carbide ceramics (HECCs) possess promising properties for extreme high-temperature applications. Machine learning offers an effective pathway to accelerate the discovery of novel HECCs, but data imbalance poses challenges for predictive performance. Here, we integrate Borderline-SMOTE with machine learning algorithms to address this issue. A dataset containing 251 samples was established from the literature, experimental synthesis, and synthetic oversampling. Key features influencing phase formation were selected via a four-step feature selection strategy. Ten common machine learning models were trained and optimized, with the random forest (RF) model identified as the most suitable for predicting HECC phase formation ability. Eight HECC compositions with high uncertainty were experimentally validated, and the results were incorporated back into the dataset to iteratively improve model accuracy. This work provides an efficient strategy for predicting phase formation in HECCs, particularly for small or imbalanced datasets, facilitating the accelerated design and reliable prediction of new HECCs.
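
As a rough illustration of the oversampling idea named in the abstract, the sketch below implements a simplified Borderline-SMOTE (variant 1) using only the Python standard library. It is not the authors' implementation: the function names `k_nearest` and `borderline_smote`, the parameters `m`, `k`, and `n_new`, and the two-feature toy data are all assumptions made for illustration. The idea is that only minority samples lying near the class boundary (the "danger" set) are used as seeds, and new samples are interpolated between a seed and one of its nearest minority neighbours.

```python
import math
import random


def k_nearest(i, X, pool, k):
    """Indices from `pool` of the k points closest to X[i], excluding i itself."""
    others = [j for j in pool if j != i]
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def borderline_smote(X, y, minority=1, m=5, k=3, n_new=10, seed=0):
    """Simplified Borderline-SMOTE (variant 1): oversample only 'danger' points."""
    rng = random.Random(seed)
    all_idx = range(len(X))
    min_idx = [i for i in all_idx if y[i] == minority]

    # A minority point is in 'danger' when at least half, but not all,
    # of its m nearest neighbours belong to another class.
    danger = []
    for i in min_idx:
        n_other = sum(y[j] != minority for j in k_nearest(i, X, all_idx, m))
        if m / 2 <= n_other < m:
            danger.append(i)

    synthetic = []
    while danger and len(synthetic) < n_new:
        i = rng.choice(danger)                    # seed point near the boundary
        neighbours = k_nearest(i, X, min_idx, k)  # minority-only neighbours
        if not neighbours:
            break
        j = rng.choice(neighbours)
        gap = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic


# Hypothetical toy data: six majority samples (label 0), three minority (label 1).
X = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.0), (1.2, 1.1), (2.0, 0.1), (0.1, 2.0),
     (1.8, 1.6), (2.5, 2.4), (3.0, 3.1)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

synthetic = borderline_smote(X, y, minority=1, m=3, k=2, n_new=10)
for point in synthetic:
    print(point)
```

In this toy set only the minority point (1.8, 1.6) sits close enough to the majority class to count as a danger point, so every synthetic sample lies on a segment between it and another minority sample. In practice one would more likely use an existing library implementation, such as `BorderlineSMOTE` in imbalanced-learn, and then train the classifier on the augmented training split only.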

Original language: English
Article number: 22
Journal: npj Computational Materials
Volume: 12
Issue number: 1
State: Published - Dec 2026
