
EDAER: Entropy-Driven Approach for Entity and Relation Extraction in Chinese Cyber Threat Intelligence

  • Data Communication Technology Research Institute
  • Northwestern Polytechnical University Xian
  • Beijing Jiaotong University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Cyber threat intelligence (CTI) has been explored as a way to strengthen system security by taking raw threat data from various sources and transforming it into actionable insights that enable organizations to predict, detect, and respond to cyber threats. Named entity recognition (NER) and relation extraction (RE) are the key tasks of CTI data mining. However, current CTI NER and RE research focuses mainly on English CTI, and its methods are not directly transferable to Chinese CTI because of fundamental linguistic and terminological differences. Moreover, the limited existing studies on Chinese CTI do not effectively address prediction uncertainty in low-resource scenarios where entities and relations are sparse. This work aims to improve the performance of NER and RE in low-resource Chinese CTI scenarios and makes two major contributions. First, we construct a Chinese CTI dataset that includes 16 entity types and 9 relation types, more than the existing open-source Chinese CTI dataset. Second, we propose an entropy-driven approach for entity and relation (EDAER) extraction. EDAER is the first to combine RoBERTa_wwm, Mamba, RDCNN, and CRF for NER. In addition, EDAER is the first to apply entropy to quantify the uncertainty of the model's predictions in NER and RE tasks in Chinese CTI scenarios. Moreover, EDAER is the first to apply contrastive learning in Chinese CTI scenarios, learning meaningful features by maximizing the similarity between positive samples and minimizing the similarity between negative samples. Extensive experiments on public datasets and our own dataset demonstrate that the proposed approach performs best.
These results show that (1) RoBERTa_wwm significantly outperforms BERT on both NER and RE tasks; (2) Mamba outperforms BiLSTM on the NER task; (3) the entropy-based dynamic gating mechanism contributes to performance improvements in both NER and RE tasks; and (4) the uncertainty-guided contrastive learning mechanism helps improve performance on the NER task.
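
The abstract does not state how entropy is computed or how the gate is built, so the following is only an illustrative sketch of the general idea of using entropy to quantify prediction uncertainty. It assumes Shannon entropy over a softmax label distribution, normalized by log K for K labels. The function names and the `gate_weight` formula are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the result lies in [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)

def gate_weight(probs):
    """Hypothetical gate (an assumption, not the paper's formula):
    close to 1 for a confident prediction, 0 for a uniform one."""
    return 1.0 - normalized_entropy(probs)

# Two toy label distributions over four entity types.
confident = softmax([8.0, 0.5, 0.2, 0.1])
uncertain = softmax([1.0, 1.0, 1.0, 1.0])

print("confident:", normalized_entropy(confident), gate_weight(confident))
print("uncertain:", normalized_entropy(uncertain), gate_weight(uncertain))
```

Under these assumptions, a peaked distribution yields entropy near 0 and a uniform one yields entropy near 1, so a quantity like `gate_weight` could be used to down-weight features or samples on which the model is unsure.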

Original language: English
Article number: 261
Journal: Entropy
Volume: 28
Issue number: 3
DOI
Publication status: Published - Mar 2026
