E194 - Institut für Information Systems Engineering
-
Date (published):
2025
-
Number of Pages:
83
-
Keywords:
Information Retrieval; Natural Language Processing; Fact Checking; Retrieval; Reranking; Large Language Models
Abstract:
With the growing influence of social media, ensuring the accuracy of online information has become increasingly important. Automated fact-checking involves multiple stages, including claim detection, prioritization, retrieval of evidence, veracity prediction, and explanation generation. A crucial yet often overlooked component is retrieving previously fact-checked claims, which helps combat misinformation by matching new claims with existing fact-checks. In this work, we develop a multilingual and crosslingual fact-checked claim retrieval system based on a hybrid retrieval pipeline that combines lexical and dense retrieval models. We systematically evaluate different retrieval and reranking strategies, demonstrating that hybrid ensembles effectively balance efficiency and effectiveness, outperforming individual retrievers. While reranking significantly enhances crosslingual retrieval, its impact in monolingual settings remains limited, highlighting the effectiveness of well-designed ensembling over increasingly complex ranking layers. Additionally, we analyze the impact of preprocessing steps, compare models in terms of retrieval performance, execution time, number of parameters, and memory usage, and conduct an error analysis to identify key limitations. Finally, we discuss potential improvements and future research directions to enhance multilingual fact-check retrieval. Our approach was applied to SemEval-2025 Task 7, and we present the results and insights gained from our participation.
Additional information:
Thesis not yet received at the library - data not checked. Title deviates from the original according to the author's translation.