Hidden Schema Networks

Sanchez, Ramses; Conrads, Lukas; Welke, Pascal; Cvejoski, Kostadin; Ojeda, Cesar

doi:10.18653/v1/2023.acl-long.263

DC Field

Value

Language

dc.contributor.author

Sanchez, Ramses

dc.contributor.author

Conrads, Lukas

dc.contributor.author

Welke, Pascal

dc.contributor.author

Cvejoski, Kostadin

dc.contributor.author

Ojeda, Cesar

dc.date.accessioned

2023-09-12T09:32:04Z

dc.date.available

2023-09-12T09:32:04Z

dc.date.issued

2023-07

dc.identifier.citation

Sanchez, R., Conrads, L., Welke, P., Cvejoski, K., & Ojeda, C. (2023). Hidden Schema Networks. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 4764–4798). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.263

dc.identifier.uri

http://hdl.handle.net/20.500.12708/188226

dc.description.abstract

Large, pretrained language models infer powerful representations that encode rich semantic and syntactic content, albeit implicitly. In this work we introduce a novel neural language model that enforces, via inductive biases, explicit relational structures which allow for compositionality onto the output representations of pretrained language models. Specifically, the model encodes sentences into sequences of symbols (composed representations), which correspond to the nodes visited by biased random walkers on a global latent graph, and infers the posterior distribution of the latter. We first demonstrate that the model is able to uncover ground-truth graphs from artificially generated datasets of random token sequences. Next, we leverage pretrained BERT and GPT-2 language models as encoder and decoder, respectively, to infer networks of symbols (schemata) from natural language datasets. Our experiments show that (i) the inferred symbols can be interpreted as encoding different aspects of language, such as topics or sentiments, and that (ii) GPT-2-like models can effectively be conditioned on symbolic representations. Finally, we explore training autoregressive, random walk “reasoning” models on schema networks inferred from commonsense knowledge databases, and using the sampled paths to enhance the performance of pretrained language models on commonsense If-Then reasoning tasks.

dc.description.sponsorship

WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds

dc.language.iso

dc.subject

language models

dc.subject

reasoning

dc.subject

neuro-symbolic computation

dc.title

Hidden Schema Networks

dc.type

Inproceedings

dc.type

Conference contribution

dc.contributor.affiliation

University of Bonn, Germany

dc.contributor.affiliation

University of Bonn, Germany

dc.contributor.affiliation

Fraunhofer Institute for Intelligent Analysis and Information Systems, Germany

dc.contributor.affiliation

University of Potsdam, Germany

dc.description.startpage

4764

dc.description.endpage

4798

dc.relation.grantno

ICT22-059

dc.type.category

Full-Paper Contribution

tuw.booktitle

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

tuw.relation.publisher

Association for Computational Linguistics

tuw.project.title

Structured Data Learning with Generalized Similarities

tuw.researchTopic.id

tuw.researchTopic.name

Information Systems Engineering

tuw.researchTopic.value

100

tuw.linking

https://2023.aclweb.org/#:~:text=Toronto%2C%20Canada%20July%209-14%2C%202023%20Photo%20%40%20Wallpaper,14th%2C%202023.%20More%20information%20will%20be%20announced%20soon.

tuw.linking

https://aclanthology.org/2023.acl-long.263.pdf

tuw.publication.orgunit

E194-06 - Forschungsbereich Machine Learning

tuw.publication.orgunit

E194 - Institut für Information Systems Engineering

tuw.publisher.doi

10.18653/v1/2023.acl-long.263

dc.description.numberOfPages

tuw.event.name

61st Annual Meeting of the Association for Computational Linguistics

tuw.event.startdate

09-07-2023

tuw.event.enddate

14-07-2023

tuw.event.online

On Site

tuw.event.type

Event for scientific audience

tuw.event.place

Toronto

tuw.event.country

tuw.event.presenter

Welke, Pascal

tuw.event.track

Multi Track

wb.sciencebranch

Computer Science

wb.sciencebranch.oefos

1020

wb.sciencebranch.value

100

item.openairecristype

http://purl.org/coar/resource_type/c_5794

item.languageiso639-1

item.fulltext

no Fulltext

item.grantfulltext

none

item.openairetype

conference paper

item.cerifentitytype

Publications

crisitem.project.funder

WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds

crisitem.project.grantno

ICT22-059

crisitem.author.dept

University of Bonn

crisitem.author.dept

University of Bonn

crisitem.author.dept

E194-06 - Forschungsbereich Machine Learning

crisitem.author.dept

Fraunhofer Institute for Intelligent Analysis and Information Systems

crisitem.author.dept

University of Potsdam

crisitem.author.orcid

0000-0002-2123-3781

crisitem.author.parentorg

E194 - Institut für Information Systems Engineering

Appears in Collections:

Conference Paper


