WeStcoin: Weakly-Supervised Contextualized Text Classification with Imbalance and Noisy Labels

Yupei Zhang, Yaya Zhou, Shuhui Liu, Wenxin Zhang, Min Xiao, Xuequn Shang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

The joint problem of imbalanced samples and noisy labels challenges current text classifiers in real-world applications. Existing approaches mostly address either the former or the latter and fail to handle the two together. This paper introduces a novel weakly-supervised framework, dubbed WeStcoin, that accounts for the cost sensitivity of misclassifications between classes and seeks seed words for noisy-label correction. After BERT creates a contextualized corpus, WeStcoin learns a predicted label vector from the contextualized samples while calculating a pseudo probability vector from seed words; it then projects the concatenated representation into an output space and multiplies the result by a cost-sensitive matrix. WeStcoin is ultimately trained to reduce the residual between the model outputs and the noisy labels, with the seed words updated iteratively. Extensive experiments and ablation studies on two public text datasets demonstrate that the proposed model outperforms the state-of-the-art model in text classification with imbalanced samples and noisy labels. Code is available at https://github.com/ypzhaang.
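
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the seed-word counting rule, the softmax after the projection, the squared residual, and every function and variable name below are assumptions made for illustration. The predicted label vector is passed in directly as a stand-in for a classifier running on BERT embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def seed_word_probs(tokens, seed_words):
    """Pseudo probability vector: share of seed-word hits per class.

    Assumption: simple hit counting, uniform when no seed word occurs.
    """
    counts = [sum(tok in seeds for tok in tokens) for seeds in seed_words]
    total = sum(counts)
    if total == 0:
        return [1.0 / len(seed_words)] * len(seed_words)
    return [c / total for c in counts]


def forward(pred_label_vec, tokens, seed_words, projection, cost_matrix):
    """Concatenate both vectors, project to K classes, apply the cost matrix."""
    pseudo = seed_word_probs(tokens, seed_words)
    concatenated = list(pred_label_vec) + pseudo       # length 2K
    projected = softmax(matvec(projection, concatenated))  # softmax is assumed
    return matvec(cost_matrix, projected)


def squared_residual(model_out, noisy_label):
    """Residual between model output and a (possibly noisy) one-hot label."""
    return sum((o - y) ** 2 for o, y in zip(model_out, noisy_label))


if __name__ == "__main__":
    seeds = [{"goal", "match"}, {"stock", "market"}]   # hypothetical seed words
    tokens = ["the", "stock", "market", "fell"]
    projection = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # K x 2K
    cost = [[2.0, 0.0], [0.0, 1.0]]  # errors on class 0 weighted twice as much
    out = forward([0.4, 0.6], tokens, seeds, projection, cost)
    print(out, squared_residual(out, [0.0, 1.0]))
```

In a trained model the projection would be learned by minimizing the residual, and the seed-word sets would be refreshed between training rounds, as the abstract describes; the fixed matrices here only make the shapes concrete.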

Original language: English
Title of host publication: 2022 26th International Conference on Pattern Recognition, ICPR 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2451-2457
Number of pages: 7
ISBN (Electronic): 9781665490627
DOI
Publication status: Published - 2022
Event: 26th International Conference on Pattern Recognition, ICPR 2022 - Montreal, Canada
Duration: 21 Aug 2022 to 25 Aug 2022

Publication series

Name: Proceedings - International Conference on Pattern Recognition
Volume: 2022-August
ISSN (Print): 1051-4651

Conference

Conference: 26th International Conference on Pattern Recognition, ICPR 2022
Country/Territory: Canada
City: Montreal
Period: 21/08/22 to 25/08/22
