Gemes, K. A., Kovacs, A., Reichel, M., & Recski, G. (2021). Offensive text detection on English Twitter with deep learning models and rule-based systems. In P. Mehta, T. Mandl, P. Majumder, & M. Mitra (Eds.), FIRE-WN 2021 [FIRE 2021 Working Notes] (pp. 283–296). CEUR-WS.org. https://doi.org/10.34726/4342
E194-04 - Research Unit Data Science; E194 - Institute of Information Systems Engineering
Published in:
FIRE-WN 2021 [FIRE 2021 Working Notes]
Volume:
3159
Date (published):
2021
Event name:
Forum for Information Retrieval Evaluation
Event period:
13-Dec-2021 - 17-Dec-2021
Event location:
Gandhinagar, India
Number of pages:
14
Publisher:
CEUR-WS.org
Peer Reviewed:
Yes
Keywords:
social media data; hate speech detection; rule-based methods; deep learning; text classification
Abstract:
This paper describes the systems the TUW-Inf team submitted for the HASOC 2021 shared task on identifying offensive comments in social media. Besides a simple BERT-based classifier that achieved one of the highest F-scores on the binary classification task, we also build a high-precision rule-based classifier using a custom framework for human-in-the-loop learning. Both approaches are evaluated qualitatively through manual analysis of 150 tweets, which also highlights possible controversies in the ground truth labels of the HASOC dataset.