<div class="csl-bib-body">
<div class="csl-entry">Pavlović, A. (2025). <i>The Knowledge Graph Divide - connecting machine learning, databases, and the semantic web</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.128683</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2025.128683
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/214156
-
dc.description.abstract
Over the past decade, Knowledge Graphs (KGs) have received enormous interest from industry and academia. However, three key research communities, namely the Machine Learning (ML), Database (DB), and Semantic Web (SW) communities, study KGs with major gaps between them. This dissertation is about bridging their divisions:
Reasoning Divide. KGs are inherently incomplete. Therefore, the ML community has proposed Knowledge Graph Embedding Models (KGEs), which achieve promising results for predicting missing links. Key data properties in the DB and SW fields are typically represented via logical rules. However, no current KGE can capture vital rules, i.e., infer missing links while adhering to such rules. Capturing (i) general composition and (ii) composition and hierarchy rules jointly are crucial open problems. To bridge this division, we introduce the ExpressivE model, which embeds pairs of entities as points and relations as hyper-parallelograms in the virtual triple space ℝ^{2d}. This design allows ExpressivE to capture a rich set of logical rules while offering an intuitive and consistent geometric interpretation of ExpressivE embeddings and their captured rules.
Scalability Divide. Moreover, the SW and DB communities provide massive KGs, calling for efficient KGEs. However, most contemporary ML-based KGEs require high-dimensional embeddings or complex embedding spaces for competitive prediction results, drastically raising their space and time requirements. Thus, developing efficient KGEs is another central open problem dividing the SW, DB, and ML fields. Facing this challenge, we propose SpeedE, a Euclidean KGE that (i) has strong inference capabilities, (ii) is competitive with state-of-the-art KGEs, significantly outperforming them on the YAGO3-10 and WN18RR benchmarks, and (iii) dramatically increases efficiency, needing on WN18RR only a fifth of the training time and a fourth of the parameters of the best-performing model (ExpressivE) to reach the same link prediction performance.
Data Management Divide. Above all, the DB and SW communities have driven classical KG research. However, there remains a divide between approaches from these two fields. For instance, while languages such as SQL or Datalog are widely used in the DB area, a vastly different set of languages, such as SPARQL and OWL, is used in the SW area. This mismatch makes blending KGs from both communities a complex endeavor, rendering the interoperability between DB and SW technologies a pressing open challenge. Thus, we present the SparqLog system, a uniform and consistent KG management framework meeting essential requirements from the SW and DB fields.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
machine learning
en
dc.subject
artificial intelligence
en
dc.subject
knowledge graphs
en
dc.subject
graph embeddings
en
dc.subject
databases
en
dc.subject
semantic web
en
dc.subject
query answering
en
dc.subject
efficiency
en
dc.subject
scalability
en
dc.subject
geometric interpretation
en
dc.title
The Knowledge Graph Divide - connecting machine learning, databases, and the semantic web
en
dc.title.alternative
Der Knowledge Graph Divide - Überwinden der Hürden zwischen Machine Learning, Databases und dem Semantic Web
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2025.128683
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Aleksandar Pavlović
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E192 - Institut für Logic and Computation
-
dc.type.qualificationlevel
Doctoral
-
dc.identifier.libraryid
AC17493329
-
dc.description.numberOfPages
189
-
dc.thesistype
Dissertation
de
dc.thesistype
Dissertation
en
tuw.author.orcid
0000-0001-6887-9515
-
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
item.openairecristype
http://purl.org/coar/resource_type/c_db06
-
item.fulltext
with Fulltext
-
item.openaccessfulltext
Open Access
-
item.mimetype
application/pdf
-
item.languageiso639-1
en
-
item.grantfulltext
open
-
item.openairetype
doctoral thesis
-
item.cerifentitytype
Publications
-
crisitem.author.dept
E192-02 - Forschungsbereich Databases and Artificial Intelligence