TU Wien at SemEval-2024 Task 6: Unifying Model-Agnostic and Model-Aware Techniques for Hallucination Detection

Arzt, Varvara; Azarbeik, Mohammad Mahdi; Lasy, Ilya; Kerl, Tilman; Recski, Gábor

doi:10.18653/v1/2024.semeval-1.173

DC Field

Value

Language

dc.contributor.author

Arzt, Varvara

dc.contributor.author

Azarbeik, Mohammad Mahdi

dc.contributor.author

Lasy, Ilya

dc.contributor.author

Kerl, Tilman

dc.contributor.author

Recski, Gábor

dc.contributor.editor

Ojha, Atul Kumar

dc.contributor.editor

Dogruöz, A. Seza

dc.contributor.editor

Tayyar Madabushi, Harish

dc.contributor.editor

Da San Martino, Giovanni

dc.contributor.editor

Rosenthal, Sara

dc.contributor.editor

Rosá, Aiala

dc.date.accessioned

2025-01-28T16:16:26Z

dc.date.available

2025-01-28T16:16:26Z

dc.date.issued

2024-06

dc.identifier.citation

Arzt, V., Azarbeik, M. M., Lasy, I., Kerl, T., & Recski, G. (2024). TU Wien at SemEval-2024 Task 6: Unifying Model-Agnostic and Model-Aware Techniques for Hallucination Detection. In A. K. Ojha, A. S. Dogruöz, H. Tayyar Madabushi, G. Da San Martino, S. Rosenthal, & A. Rosá (Eds.), Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024) (pp. 1183–1196). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.semeval-1.173

dc.identifier.uri

http://hdl.handle.net/20.500.12708/209896

dc.description.abstract

This paper addresses a central challenge in Natural Language Generation (NLG): neural networks that produce fluent but incorrect output, known as “hallucinations”. The SHROOM shared task covers the output of Large Language Models on various tasks, and our methodology employs both model-agnostic and model-aware approaches to hallucination detection. We address the limited availability of labeled training data through automatic label generation strategies. Model-agnostic methods include word alignment and fine-tuning a BERT-based pretrained model, while model-aware methods rely on separate classifiers trained on the LLMs’ internal data (layer activations and attention values). Ensemble methods combine the individual outputs through techniques such as regression metamodels, voting, and probability fusion. Our best-performing systems achieved an accuracy of 80.6% on the model-aware track and 81.7% on the model-agnostic track, ranking 3rd and 8th among all systems, respectively.

dc.language.iso

dc.subject

Hallucination Detection

dc.subject

model-aware approaches

dc.subject

model-agnostic approaches

dc.subject

Transparency

dc.subject

Large Language Models

dc.title

TU Wien at SemEval-2024 Task 6: Unifying Model-Agnostic and Model-Aware Techniques for Hallucination Detection

dc.type

Inproceedings

dc.type

Conference paper

dc.relation.publication

Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

dc.contributor.affiliation

TU Wien, Austria