Gemes, K. A., Kovacs, A., Reichel, M., & Recski, G. (2021). Offensive text detection on English Twitter with deep learning models and rule-based systems. In P. Mehta, T. Mandl, P. Majumder, & M. Mitra (Eds.), FIRE-WN 2021 [FIRE 2021 Working Notes] (pp. 283–296). CEUR-WS.org. https://doi.org/10.34726/4342
E194-04 - Forschungsbereich Data Science E194 - Institut für Information Systems Engineering
Published in: FIRE-WN 2021 [FIRE 2021 Working Notes]
Volume: 3159
Date (published): 2021
Event name: Forum for Information Retrieval Evaluation
Event date: 13-Dec-2021 - 17-Dec-2021
Event place: Gandhinagar, India
Number of Pages: 14
Publisher: CEUR-WS.org
Peer reviewed: Yes
Keywords: social media data; hate speech detection; rule-based methods; deep learning; text classification
Abstract:
This paper describes the systems the TUW-Inf team submitted for the HASOC 2021 shared task on identifying offensive comments in social media. Besides a simple BERT-based classifier that achieved one of the highest F-scores on the binary classification task, we build a high-precision rule-based classifier using a custom framework for human-in-the-loop learning. We evaluate both approaches qualitatively through manual analysis of 150 tweets, which also highlights possible controversies in the ground truth labels of the HASOC dataset.