<div class="csl-bib-body">
<div class="csl-entry">Bauer, J. J., Eiter, T., Higuera Ruiz, N. N., & Oetsch, J. (2023, November 21). <i>Neuro-Symbolic Visual Graph Question Answering with LLMs for Language Parsing</i> [Conference Presentation]. TAASP23: Workshop on Trends and Applications of Answer Set Programming, Potsdam, Germany. https://doi.org/10.34726/5462</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/193865
-
dc.identifier.uri
https://doi.org/10.34726/5462
-
dc.description.abstract
Images containing graph-based structures are a ubiquitous and popular form of data representation that, to the best of our knowledge, has not yet been considered in the domain of Visual Question Answering (VQA). We provide a novel dataset for this task and present a modular neuro-symbolic approach as a first baseline. Our dataset extends CLEGR, an existing dataset for question answering on graphs inspired by metro networks. Notably, the graphs in CLEGR are given in symbolic form, whereas we consider the more challenging problem of taking images of graphs as input. Our solution combines optical graph recognition for graph parsing, a pre-trained optical character recognition neural network for parsing node labels, and answer-set programming for reasoning. The model achieves an overall average accuracy of 73% on the dataset. While regular expressions are sufficient to parse the natural-language questions, we also study various large language models to obtain a more robust solution that generalises well to variants of questions that are not part of the dataset. Our evaluation provides further evidence of the potential of modular neuro-symbolic systems, in particular those built on pre-trained models, to solve complex VQA tasks.
en
dc.language.iso
en
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
Neurosymbolic
en
dc.subject
Visual Question Answering
en
dc.title
Neuro-Symbolic Visual Graph Question Answering with LLMs for Language Parsing
en
dc.type
Presentation
en
dc.type
Vortrag
de
dc.rights.license
Creative Commons Namensnennung 4.0 International
de
dc.rights.license
Creative Commons Attribution 4.0 International
en
dc.identifier.doi
10.34726/5462
-
dc.contributor.affiliation
TU Wien, Austria
-
dc.type.category
Conference Presentation
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E192-03 - Forschungsbereich Knowledge Based Systems
-
tuw.author.orcid
0000-0001-6003-6345
-
dc.rights.identifier
CC BY 4.0
de
dc.rights.identifier
CC BY 4.0
en
tuw.event.name
TAASP23: Workshop on Trends and Applications of Answer Set Programming
en
tuw.event.startdate
20-11-2023
-
tuw.event.enddate
21-11-2023
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Potsdam
-
tuw.event.country
DE
-
tuw.event.institution
Universität Potsdam
-
tuw.event.presenter
Higuera Ruiz, Nelson Nicolas
-
tuw.event.track
Single Track
-
wb.sciencebranch
Informatik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.value
100
-
item.openairecristype
http://purl.org/coar/resource_type/c_18cp
-
item.openaccessfulltext
Open Access
-
item.openairetype
conference paper not in proceedings
-
item.fulltext
with Fulltext
-
item.mimetype
application/pdf
-
item.languageiso639-1
en
-
item.grantfulltext
open
-
item.cerifentitytype
Publications
-
crisitem.author.dept
TU Wien
-
crisitem.author.dept
E192 - Institut für Logic and Computation
-
crisitem.author.dept
E192-03 - Forschungsbereich Knowledge Based Systems
-
crisitem.author.dept
E192-03 - Forschungsbereich Knowledge Based Systems