<div class="csl-bib-body">
<div class="csl-entry">Gemes, K. A., Kovacs, A., & Recski, G. (2023). Offensive text detection across languages and datasets using rule-based and hybrid methods. In G. Drakopoulos & E. Kafeza (Eds.), <i>CIKM-WS 2022. Proceedings of the CIKM 2022 Workshops</i>. CEUR-WS.org. https://doi.org/10.34726/4341</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/177653
-
dc.identifier.uri
https://doi.org/10.34726/4341
-
dc.description.abstract
We investigate the potential of rule-based systems for the task of offensive text detection in English and German, and demonstrate their effectiveness in low-resource settings, as an alternative or addition to transfer learning across tasks and languages. Task definitions and annotation guidelines used by existing datasets show great variety, hence state-of-the-art machine learning models do not transfer well across datasets or languages. Furthermore, such systems lack explainability and pose a critical risk of unintended bias. We present simple rule systems based on semantic graphs for classifying offensive text in two languages and provide both quantitative and qualitative comparison of their performance with deep learning models on 5 datasets across multiple languages and shared tasks.
en
dc.language.iso
en
-
dc.relation.ispartofseries
CEUR Workshop Proceedings
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
offensive text
en
dc.subject
rule-based methods
en
dc.subject
human in the loop learning
en
dc.title
Offensive text detection across languages and datasets using rule-based and hybrid methods