NPCS: Native Provenance Computation for SPARQL

Asma, Zubaria; Hernández, Daniel; Galárraga, Luis; Flouris, Giorgos; Fundulaki, Irini; Hose, Katja

doi:10.1145/3589334.3645557

Record link:

http://hdl.handle.net/20.500.12708/210848

Title:

NPCS: Native Provenance Computation for SPARQL

Citation:

Asma, Z., Hernández, D., Galárraga, L., Flouris, G., Fundulaki, I., & Hose, K. (2024). NPCS: Native Provenance Computation for SPARQL. In WWW ’24: Proceedings of the ACM Web Conference 2024 (pp. 2085–2093). ACM. https://doi.org/10.1145/3589334.3645557

CatalogPlus:

AC17426715

Publisher DOI:

10.1145/3589334.3645557

Publication Type:

Inproceedings - Full-Paper Contribution

Language:

English

Authors:

Asma, Zubaria
Hernández, Daniel
Galárraga, Luis
Flouris, Giorgos
Fundulaki, Irini
Hose, Katja

Organisational Unit:

E192-02 - Research Unit of Databases and Artificial Intelligence

Published in:

WWW '24: Proceedings of the ACM Web Conference 2024

ISBN:

979-8-4007-0171-9

DOI of the book:

10.1145/3589334

Date (published):

13-May-2024

Event name:

WWW '24: The ACM Web Conference 2024

Event date:

13-May-2024 - 17-May-2024

Event place:

Singapore, Singapore

Number of Pages:

9

Publisher:

ACM, New York, NY, USA

Peer reviewed:

Yes

Keywords:

data provenance; how-provenance; knowledge graphs; rdf; sparql

Abstract:

The popularity of Knowledge Graphs (KGs) in both industry and academia stems from their flexible data model, which is well suited to integrating data from multiple sources. Several KG-based applications, such as trust assessment or view maintenance on dynamic data, rely on the ability to compute provenance explanations for query results. The how-provenance of a query result is an expression that encodes the records (triples or facts) that explain its inclusion in the result set. This article proposes NPCS, a Native Provenance Computation approach for SPARQL queries. NPCS annotates query results with their how-provenance. By building upon spm-provenance semirings, NPCS supports both monotonic and non-monotonic SPARQL queries. Thanks to its reliance on query rewriting techniques, the approach is directly applicable to already deployed SPARQL engines using different reification schemes, including RDF-star. Our experimental evaluation on two popular SPARQL engines (GraphDB and Stardog) shows that our novel query rewriting brings a significant runtime improvement over existing query rewriting solutions, scaling to RDF graphs with billions of triples.
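
Illustration (not part of the original record): the notion of how-provenance described in the abstract can be made concrete with a small Python sketch. This is not the NPCS query rewriting itself, and it covers only a monotonic join with projection. The triple identifiers (t1 to t4), the toy graph, and the function names are hypothetical and chosen purely for illustration. A join multiplies the annotations of the triples it combines, and alternative derivations of the same result are summed.

```python
from collections import defaultdict

# Toy graph: each triple carries a hypothetical identifier (t1, t2, ...).
TRIPLES = {
    "t1": ("alice", "worksAt", "acme"),
    "t2": ("alice", "worksAt", "globex"),
    "t3": ("acme", "locatedIn", "berlin"),
    "t4": ("globex", "locatedIn", "berlin"),
}


def match(predicate):
    """Yield (subject, object, annotation) for triples with this predicate."""
    for tid, (s, p, o) in TRIPLES.items():
        if p == predicate:
            yield s, o, tid


def cities_of_employment():
    """Evaluate ?person worksAt ?org . ?org locatedIn ?city, projecting out ?org."""
    provenance = defaultdict(list)
    for person, org, a in match("worksAt"):
        for org2, city, b in match("locatedIn"):
            if org == org2:
                # Join: both triples are needed, so their annotations are multiplied.
                provenance[(person, city)].append(f"{a}*{b}")
    # Projection: alternative derivations of the same result are summed.
    return {row: " + ".join(sorted(terms)) for row, terms in provenance.items()}


if __name__ == "__main__":
    for row, expr in cities_of_employment().items():
        print(row, "<-", expr)
```

In this sketch the single result (alice, berlin) receives an expression with two alternative derivations, each requiring two triples. An application such as trust assessment could then evaluate that expression under its own interpretation of the sum and product.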

Project (external):

European Union’s Horizon 2020
COST Action
Deutsche Forschungsgemeinschaft

Project ID:

860801
CA19134
EXC 2120/1 – 390831618 ; SPP 1921 – 318363223 (COFFEE STA 572_15-2)

Research Areas:

Logic and Computation: 100%

Science Branch:

1020 - Computer Sciences: 80%
1010 - Mathematics: 20%

License:

CC BY 4.0