Sertkan, M., Althammer, S., & Hofstätter, S. (2023). Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation. In D. Bollegala (Ed.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) (pp. 581–587). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-demo.56
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/192941
-
dc.description.abstract
In this paper, we introduce Ranger, a toolkit to facilitate the easy use of effect-size-based meta-analysis for multi-task evaluation in NLP and IR. We observed that our communities often face the challenge of aggregating results over incomparable metrics and scenarios, which makes conclusions and take-away messages less reliable. With Ranger, we aim to address this issue by providing a task-agnostic toolkit that combines the effect of a treatment on multiple tasks into one statistical evaluation, allowing for comparison of metrics and computation of an overall summary effect. Our toolkit produces publication-ready forest plots that enable clear communication of evaluation results over multiple tasks. Our goal with the ready-to-use Ranger toolkit is to promote robust, effect-size-based evaluation and improve evaluation standards in the community. We provide two case studies for common IR and NLP settings to highlight Ranger's benefits.
en
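The following is a minimal, self-contained sketch of the effect-size-based meta-analysis described in the abstract, not the actual Ranger API: it pools illustrative per-task effect sizes into one summary effect with a random-effects (DerSimonian-Laird) model and renders a simple forest plot. All task names, numbers, and function choices below are assumptions for illustration only.

import numpy as np
import matplotlib.pyplot as plt

# Illustrative per-task effect sizes (e.g., standardized mean differences) and variances.
# These values are made up for demonstration; Ranger computes them from real evaluations.
tasks = ["Task A", "Task B", "Task C", "Task D"]
effects = np.array([0.30, 0.10, 0.45, 0.20])
variances = np.array([0.02, 0.03, 0.05, 0.01])

# DerSimonian-Laird estimate of between-task heterogeneity (tau^2).
w_fixed = 1.0 / variances
mu_fixed = np.sum(w_fixed * effects) / np.sum(w_fixed)
q = np.sum(w_fixed * (effects - mu_fixed) ** 2)
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (q - (len(effects) - 1)) / c)

# Random-effects weights, pooled summary effect, and a 95% confidence interval.
w = 1.0 / (variances + tau2)
summary = np.sum(w * effects) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
ci = (summary - 1.96 * se, summary + 1.96 * se)

# Forest plot: one row per task plus the pooled summary effect at the bottom.
fig, ax = plt.subplots(figsize=(6, 3))
y = np.arange(len(tasks), 0, -1)
ax.errorbar(effects, y, xerr=1.96 * np.sqrt(variances), fmt="s", color="black")
ax.errorbar([summary], [0], xerr=[[summary - ci[0]], [ci[1] - summary]],
            fmt="D", color="firebrick", label="summary effect")
ax.axvline(0.0, linestyle="--", linewidth=1)
ax.set_yticks(list(y) + [0])
ax.set_yticklabels(tasks + ["Summary"])
ax.set_xlabel("Effect size")
ax.legend(loc="lower right")
fig.tight_layout()
fig.savefig("forest_plot.png")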
dc.description.sponsorship
Christian Doppler Forschungsgesellschaft
-
dc.language.iso
en
-
dc.subject
Effect-Size-Based Meta-Analysis
en
dc.subject
Multi-Task Evaluation
en
dc.subject
Comparative Metrics Analysis
en
dc.subject
Natural Language Processing (NLP)
en
dc.subject
Information Retrieval (IR)
en
dc.title
Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.description.startpage
581
-
dc.description.endpage
587
-
dc.relation.grantno
CDL Neidhardt
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
-
tuw.container.volume
Volume 3: System Demonstrations
-
tuw.peerreviewed
true
-
tuw.relation.publisher
Association for Computational Linguistics
-
tuw.project.title
Christian Doppler Labor für Weiterentwicklung des State-of-the-Art von Recommender-Systemen in mehreren Domänen
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-04 - Forschungsbereich Data Science
-
tuw.publisher.doi
10.18653/v1/2023.acl-demo.56
-
dc.description.numberOfPages
7
-
tuw.author.orcid
0000-0003-0984-5221
-
tuw.event.name
61st Annual Meeting of the Association for Computational Linguistics (ACL’23)
en
tuw.event.startdate
10-07-2023
-
tuw.event.enddate
12-07-2023
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Toronto
-
tuw.event.country
CA
-
tuw.event.presenter
Sertkan, Mete
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Informatik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.value
100
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.grantfulltext
none
-
item.fulltext
no Fulltext
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.orcid
0000-0003-0984-5221
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering