BLM-17m: A Large-Scale Dataset for Black Lives Matter Topic Detection on Twitter

Hasan Kemik, Nusret Ozates, Meysam Asgari-Chenaghlou, Yang Li, Erik Cambria

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Protecting human rights is one of the most important problems of the modern world. In this paper, we construct a Twitter dataset covering one of the most significant human rights controversies of recent years, one that affected the whole world: the George Floyd incident. We propose a labeled dataset for topic detection that contains about 17 million tweets. The tweets were collected from 25 May 2020 to 21 August 2020, covering about 90 days from the start of the incident. We labeled the dataset by monitoring the most trending news topics from global and local newspapers, and we used TF-IDF and LDA as baselines. We evaluated the results of these two methods at three different values of k, reporting precision, recall, and F1-score.
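As an illustration of the kind of evaluation the abstract describes, the following is a minimal sketch of a TF-IDF keyword baseline scored with precision, recall, and F1 at k. It is not the authors' implementation. The abstract does not define k, so the sketch assumes k is the number of top-ranked terms kept. The function names, the TF-IDF weighting variant, and the toy tweets are all hypothetical and are not drawn from the dataset.

```python
import math
from collections import Counter


def tfidf_top_k(docs, k):
    """Return the k highest-scoring terms across a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for term, count in tf.items():
            # Plain tf * log(N / df); a term present in every document scores 0.
            scores[term] += (count / len(doc)) * math.log(n_docs / df[term])
    # Sort by score (descending), then alphabetically for a deterministic order.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def precision_recall_f1_at_k(predicted, relevant, k):
    """Score the top-k predicted terms against a set of relevant terms."""
    top_k = predicted[:k]
    hits = sum(1 for term in top_k if term in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f1


# Hypothetical toy input: three already-tokenized "tweets" and a made-up label set.
toy_docs = [
    ["protest", "city", "march"],
    ["protest", "police", "reform"],
    ["protest", "vote"],
]
toy_labels = {"vote", "reform"}

for k in (1, 3, 5):  # three k values, chosen arbitrarily for the sketch
    terms = tfidf_top_k(toy_docs, k)
    print(k, terms, precision_recall_f1_at_k(terms, toy_labels, k))
```

Running the same scoring function over an LDA baseline would only require replacing `tfidf_top_k` with a function that returns that model's ranked topic terms.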

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Editors: Jihe Wang, Yi He, Thang N. Dinh, Christan Grant, Meikang Qiu, Witold Pedrycz
Publisher: IEEE Computer Society
Pages: 736-743
Number of pages: 8
ISBN (Electronic): 9798350381641
Publication status: Published - 2023
Event: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23
