Sertkan, M., Althammer, S., & Hofstätter, S. (2023). Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation. In D. Bollegala (Ed.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) (pp. 581–587). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-demo.56
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/192941
-
dc.description.abstract
In this paper, we introduce Ranger, a toolkit to facilitate the easy use of effect-size-based meta-analysis for multi-task evaluation in NLP and IR. We observed that our communities often face the challenge of aggregating results over incomparable metrics and scenarios, which makes conclusions and take-away messages less reliable. With Ranger, we aim to address this issue by providing a task-agnostic toolkit that combines the effect of a treatment on multiple tasks into one statistical evaluation, allowing for comparison of metrics and computation of an overall summary effect. Our toolkit produces publication-ready forest plots that enable clear communication of evaluation results over multiple tasks. Our goal with the ready-to-use Ranger toolkit is to promote robust, effect-size-based evaluation and improve evaluation standards in the community. We provide two case studies for common IR and NLP settings to highlight Ranger's benefits.
en
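The following is a minimal, self-contained sketch of the effect-size-based meta-analysis described in the abstract, not the actual Ranger API: it pools illustrative per-task effect sizes into one summary effect with a random-effects (DerSimonian-Laird) model and renders a simple forest plot. All task names, numbers, and function choices below are assumptions for illustration only.

import numpy as np
import matplotlib.pyplot as plt

# Illustrative per-task effect sizes (e.g., standardized mean differences) and variances.
# These values are made up for demonstration; Ranger computes them from real evaluations.
tasks = ["Task A", "Task B", "Task C", "Task D"]
effects = np.array([0.30, 0.10, 0.45, 0.20])
variances = np.array([0.02, 0.03, 0.05, 0.01])

# DerSimonian-Laird estimate of between-task heterogeneity (tau^2).
w_fixed = 1.0 / variances
mu_fixed = np.sum(w_fixed * effects) / np.sum(w_fixed)
q = np.sum(w_fixed * (effects - mu_fixed) ** 2)
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (q - (len(effects) - 1)) / c)

# Random-effects weights, pooled summary effect, and a 95% confidence interval.
w = 1.0 / (variances + tau2)
summary = np.sum(w * effects) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
ci = (summary - 1.96 * se, summary + 1.96 * se)

# Forest plot: one row per task plus the pooled summary effect at the bottom.
fig, ax = plt.subplots(figsize=(6, 3))
y = np.arange(len(tasks), 0, -1)
ax.errorbar(effects, y, xerr=1.96 * np.sqrt(variances), fmt="s", color="black")
ax.errorbar([summary], [0], xerr=[[summary - ci[0]], [ci[1] - summary]],
            fmt="D", color="firebrick", label="summary effect")
ax.axvline(0.0, linestyle="--", linewidth=1)
ax.set_yticks(list(y) + [0])
ax.set_yticklabels(tasks + ["Summary"])
ax.set_xlabel("Effect size")
ax.legend(loc="lower right")
fig.tight_layout()
fig.savefig("forest_plot.png")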
dc.description.sponsorship
Christian Doppler Forschungsgesellschaft
-
dc.language.iso
en
-
dc.subject
Effect-Size-Based Meta-Analysis
en
dc.subject
Multi-Task Evaluation
en
dc.subject
Comparative Metrics Analysis
en
dc.subject
Natural Language Processing (NLP)
en
dc.subject
Information Retrieval (IR)
en
dc.title
Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.description.startpage
581
-
dc.description.endpage
587
-
dc.relation.grantno
CDL Neidhardt
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
-
tuw.container.volume
Volume 3: System Demonstrations
-
tuw.peerreviewed
true
-
tuw.relation.publisher
Association for Computational Linguistics
-
tuw.project.title
Christian Doppler Labor für Weiterentwicklung des State-of-the-Art von Recommender-Systemen in mehreren Domänen
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-04 - Forschungsbereich Data Science
-
tuw.publisher.doi
10.18653/v1/2023.acl-demo.56
-
dc.description.numberOfPages
7
-
tuw.author.orcid
0000-0003-0984-5221
-
tuw.event.name
61st Annual Meeting of the Association for Computational Linguistics (ACL’23)
en
tuw.event.startdate
10-07-2023
-
tuw.event.enddate
12-07-2023
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Toronto
-
tuw.event.country
CA
-
tuw.event.presenter
Sertkan, Mete
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Informatik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.value
100
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.grantfulltext
none
-
item.fulltext
no Fulltext
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.orcid
0000-0003-0984-5221
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering