Recski, G. (2024, November 19). Fact-checking LLMs with explainable information extraction [Conference Presentation]. Language Intelligence 2024, Austria. https://doi.org/10.34726/8540
Large Language Models (LLMs) have become the most commonly used tool in natural language processing (NLP), but practitioners have quickly discovered their shortcomings. LLMs make mistakes that can be neither predicted nor prevented. They are black boxes whose behavior cannot be configured or explained. Their exact training process is unknown, and they are prone to unwanted bias. Their computational cost is enormous, creating both financial and environmental burdens. And unless they are developed in-house, their use raises privacy and security issues. All of these problems severely limit the applicability of LLMs in domains that require a high degree of transparency and trustworthiness, such as legal, medical, or financial NLP applications. The NLP group at TU Wien has developed POTATO, an explainable information extraction framework. Validated across complex domains and multiple languages, POTATO is an open-source engine for rule-based information extraction that can be used either as an alternative or as an addition to LLM-based solutions. In particular, POTATO can be deployed as a fact-checker for Retrieval-Augmented Generation (RAG) systems that answer user queries based on reliable and relevant documents.
Research Areas:
Information Systems Engineering: 70%
Beyond TUW-research focus: 30%
Science Branches:
6020 - Languages and Literature: 30%
1020 - Computer Sciences: 70%