BLM-17m: A Large-Scale Dataset for Black Lives Matter Topic Detection on Twitter

  • Hasan Kemik
  • Nusret Ozates
  • Meysam Asgari-Chenaghlou
  • Yang Li
  • Erik Cambria

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Protection of human rights is one of the most important problems of the modern world. In this paper, we construct a Twitter dataset that covers one of the most significant human rights contradictions in recent years, which affected the whole world: the George Floyd incident. We propose a labeled dataset for topic detection that contains about 17 million tweets. The tweets were collected from 25 May 2020 to 21 August 2020, covering about 90 days from the start of the incident. We labeled the dataset by monitoring the most trending news topics from global and local newspapers, and used TF-IDF and LDA as baselines. We evaluated the results of these two methods with three different k values for precision, recall, and F1-score.
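As an illustration of the evaluation mentioned in the abstract, the following is a minimal sketch of precision, recall, and F1-score computed over the top-k keywords of a ranked list. It is not the authors' implementation: the function name `precision_recall_f1_at_k`, the sample keywords, and the k values used here are all hypothetical, and the paper's exact metric definitions are not given in this record.

```python
def precision_recall_f1_at_k(predicted, relevant, k):
    """Score the top-k predicted keywords against reference keywords.

    predicted: ranked list of keywords (best first), e.g. from TF-IDF or LDA.
    relevant:  iterable of reference keywords for the labeled topic.
    k:         cutoff; only the first k predictions are scored.
    """
    top_k = set(predicted[:k])          # de-duplicate so a repeat is not counted twice
    relevant_set = set(relevant)
    hits = len(top_k & relevant_set)    # predictions that match a reference keyword

    precision = hits / k if k else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical data, for illustration only.
predicted = ["protest", "police", "floyd", "music", "weather"]
relevant = {"floyd", "protest", "justice", "police"}

for k in (1, 3, 5):  # assumed cutoffs; the record does not list the paper's k values
    p, r, f = precision_recall_f1_at_k(predicted, relevant, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Raising k typically trades precision for recall, which is one reason to report several cutoffs rather than a single one.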

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Editors: Jihe Wang, Yi He, Thang N. Dinh, Christan Grant, Meikang Qiu, Witold Pedrycz
Publisher: IEEE Computer Society
Pages: 736-743
Number of pages: 8
ISBN (Electronic): 9798350381641
DOIs
State: Published - 2023
Event: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23

Keywords

  • AI
  • BLM
  • BlackLivesMatter
  • Natural Language Processing
  • Sentiment Analysis
  • Social Media
