While the Web of Data in principle offers access to a wide range of interlinked data, the architecture of the Semantic Web today relies mostly on data providers to maintain access to their data through SPARQL endpoints. Several studies, however, have shown that such endpoints often experience downtime, meaning that the data they maintain becomes inaccessible. While decentralized systems based on Peer-to-Peer (P2P) technology have previously been shown to increase the availability of knowledge graphs, even when a large proportion of the nodes fail, processing queries in such a setup can be expensive, since the data necessary to answer a single query might be distributed over multiple nodes. In this paper, we therefore propose LOTHBROK, an approach to optimizing SPARQL queries over decentralized knowledge graphs. While there are potentially many aspects to consider when optimizing such queries, we focus on three: cardinality estimation, locality awareness, and data fragmentation. We empirically show that LOTHBROK achieves significantly faster query processing than the state of the art, both when processing challenging queries and when the network is under high load.
en
dc.language.iso
en
-
dc.publisher
IOS Press
-
dc.relation.ispartof
Semantic Web
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
cardinality estimation
en
dc.subject
data locality
en
dc.subject
decentralization
en
dc.subject
knowledge graphs
en
dc.subject
Peer-to-Peer
en
dc.subject
query optimization
en
dc.subject
RDF
en
dc.subject
SPARQL
en
dc.title
Optimizing SPARQL queries over decentralized knowledge graphs