<div class="csl-bib-body">
<div class="csl-entry">Pachinger, P., Goldzycher, J., Planitzer, A. M., Kusa, W., Hanbury, A., & Neidhardt, J. (2024). AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection. In <i>The 62nd Annual Meeting of the Association for Computational Linguistics : Findings of the Association for Computational Linguistics: ACL 2024</i> (pp. 11990–12001). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.713</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/209938
-
dc.description.abstract
Model interpretability in toxicity detection greatly profits from token-level annotations. However, currently, such annotations are only available in English. We introduce a dataset annotated for offensive language detection sourced from a news forum, notable for its incorporation of the Austrian German dialect, comprising 4,562 user comments. In addition to binary offensiveness classification, we identify spans within each comment constituting vulgar language or representing targets of offensive statements. We evaluate fine-tuned Transformer models as well as large language models in a zero- and few-shot fashion. The results indicate that while fine-tuned models excel in detecting linguistic peculiarities such as vulgar dialect, large language models demonstrate superior performance in detecting offensiveness in AustroTox.
en
dc.description.sponsorship
WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds
-
dc.description.sponsorship
Christian Doppler Forschungsgesellschaft
-
dc.language.iso
en
-
dc.subject
Offensive Language Detection
en
dc.subject
Austrian German Dialect
en
dc.subject
Transformer Models and Large Language Models
en
dc.title
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.publication
The 62nd Annual Meeting of the Association for Computational Linguistics : Findings of the Association for Computational Linguistics: ACL 2024
-
dc.contributor.affiliation
University of Zurich, Switzerland
-
dc.contributor.affiliation
University of Vienna, Austria
-
dc.relation.isbn
979-8-89176-099-8
-
dc.description.startpage
11990
-
dc.description.endpage
12001
-
dc.relation.grantno
ICT20-015
-
dc.relation.grantno
CDL Neidhardt
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
The 62nd Annual Meeting of the Association for Computational Linguistics : Findings of the Association for Computational Linguistics: ACL 2024
-
tuw.relation.publisher
Association for Computational Linguistics
-
tuw.project.title
Transparente Automatisierte Inhaltsmoderation
-
tuw.project.title
Christian Doppler Labor für Weiterentwicklung des State-of-the-Art von Recommender-Systemen in mehreren Domänen
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-04 - Forschungsbereich Data Science
-
tuw.publication.orgunit
E056-23 - Fachbereich Innovative Combinations and Applications of AI and ML (iCAIML)
-
tuw.publisher.doi
10.18653/v1/2024.findings-acl.713
-
dc.description.numberOfPages
12
-
tuw.author.orcid
0000-0002-0706-810X
-
tuw.author.orcid
0000-0001-8181-6615
-
tuw.author.orcid
0000-0003-4420-4147
-
tuw.author.orcid
0000-0002-7149-5843
-
tuw.author.orcid
0000-0001-7184-1841
-
tuw.event.name
The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)
en
tuw.event.startdate
11-08-2024
-
tuw.event.enddate
16-08-2024
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Bangkok
-
tuw.event.country
TH
-
tuw.event.presenter
Pachinger, Pia
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Wirtschaftswissenschaften
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
5020
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.openairetype
conference paper
-
item.fulltext
no Fulltext
-
item.languageiso639-1
en
-
item.grantfulltext
none
-
item.cerifentitytype
Publications
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
University of Zurich
-
crisitem.author.dept
University of Vienna
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.orcid
0000-0002-0706-810X
-
crisitem.author.orcid
0000-0001-8181-6615
-
crisitem.author.orcid
0000-0003-4420-4147
-
crisitem.author.orcid
0000-0002-7149-5843
-
crisitem.author.orcid
0000-0001-7184-1841
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.project.funder
WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds