Mining Validating Shapes for Large Knowledge Graphs via Dynamic Reservoir Sampling

Lissandrini, Matteo; Rabbani, Kashif; Hose, Katja

doi:10.34726/8213

DC Field

Value

Language

dc.contributor.author

Lissandrini, Matteo

dc.contributor.author

Rabbani, Kashif

dc.contributor.author

Hose, Katja

dc.contributor.editor

Atzori, Maurizio

dc.contributor.editor

Ciaccia, Paolo

dc.contributor.editor

Ceci, Michelangelo

dc.contributor.editor

Mandreoli, Federica

dc.contributor.editor

Malerba, Donato

dc.contributor.editor

Sanguinetti, Manuela

dc.date.accessioned

2025-01-13T17:13:52Z

dc.date.available

2025-01-13T17:13:52Z

dc.date.issued

2024

dc.identifier.citation

Lissandrini, M., Rabbani, K., & Hose, K. (2024). Mining Validating Shapes for Large Knowledge Graphs via Dynamic Reservoir Sampling. In M. Atzori, P. Ciaccia, M. Ceci, F. Mandreoli, D. Malerba, & M. Sanguinetti (Eds.), Proceedings of the 32nd Symposium on Advanced Database Systems (SEBD 2024) (pp. 25–34). https://doi.org/10.34726/8213

dc.identifier.uri

http://hdl.handle.net/20.500.12708/208561

dc.identifier.uri

https://doi.org/10.34726/8213

dc.description.abstract

Knowledge Graphs (KGs) are databases that model knowledge from heterogeneous domains using the graph data model. Shape constraint languages have been adopted in KGs to ensure their data quality. They encode the equivalent of a schema in the Resource Description Framework (RDF). Unfortunately, few KGs are accompanied by a corresponding set of validating shapes. When validating shapes are missing, the solution is to extract them from the graph via mining techniques. Current shape extraction methods are often incomplete, not scalable, and generate spurious shapes. Thus, in this discussion paper, we present our recent contribution: a novel Quality Shapes Extraction (QSE) method for large graphs. QSE computes confidence and support for shape constraints via a novel Dynamic Reservoir Sampling method, enabling the identification of informative and reliable shapes. QSE is the first method (validated on WikiData and DBpedia) to extract a complete set of shapes from large real-world KGs.

dc.language.iso

dc.rights.uri

http://creativecommons.org/licenses/by/4.0/

dc.subject

Knowledge Graphs

dc.subject

Data Mining

dc.subject

Data Quality

dc.title

Mining Validating Shapes for Large Knowledge Graphs via Dynamic Reservoir Sampling

dc.type

Inproceedings

dc.type

Konferenzbeitrag

dc.rights.license

Creative Commons Namensnennung 4.0 International

dc.rights.license

Creative Commons Attribution 4.0 International

dc.identifier.doi

10.34726/8213

dc.contributor.affiliation

University of Verona, Italy

dc.contributor.affiliation

Aalborg University, Denmark

dc.contributor.editoraffiliation

University of Bari Aldo Moro, Italy

dc.contributor.editoraffiliation

University of Modena and Reggio Emilia, Italy

dc.contributor.editoraffiliation

University of Bari Aldo Moro, Italy

dc.contributor.editoraffiliation

University of Cagliari, Italy

dc.relation.issn

1613-0073

dc.description.startpage

dc.description.endpage

dc.rights.holder

dc.type.category

Full-Paper Contribution

tuw.booktitle

Proceedings of the 32nd Symposium on Advanced Database Systems (SEBD 2024)

tuw.container.volume

3741

tuw.peerreviewed

true

tuw.researchTopic.id

tuw.researchTopic.name

Logic and Computation

tuw.researchTopic.value

100

tuw.publication.orgunit

E192-02 - Forschungsbereich Databases and Artificial Intelligence

dc.identifier.libraryid

AC17407983

dc.description.numberOfPages

tuw.author.orcid

0000-0001-7922-5998

tuw.author.orcid

0000-0002-6984-2121

tuw.author.orcid

0000-0001-7025-8099

dc.rights.identifier

CC BY 4.0

tuw.editor.orcid

0000-0001-6112-7310

tuw.editor.orcid

0000-0002-1794-6244

tuw.editor.orcid

0000-0002-6690-7583

tuw.editor.orcid

0000-0002-8043-8787

tuw.editor.orcid

0000-0001-8432-4608

tuw.editor.orcid

0000-0002-0147-2208

tuw.event.name

Symposium on Advanced Database Systems 2024 (SEBD 2024)

tuw.event.startdate

23-06-2024

tuw.event.enddate

26-06-2024

tuw.event.online

On Site

tuw.event.type

Event for scientific audience

tuw.event.place

Villasimius

tuw.event.country

tuw.event.presenter

Hose, Katja

wb.sciencebranch

Informatik

wb.sciencebranch

Mathematik

wb.sciencebranch.oefos

1020

wb.sciencebranch.oefos

1010

wb.sciencebranch.value

item.languageiso639-1

item.openairetype

conference paper

item.openairecristype

http://purl.org/coar/resource_type/c_5794

item.grantfulltext

open

item.cerifentitytype

Publications

item.fulltext

with Fulltext

item.mimetype

application/pdf

item.openaccessfulltext

Open Access

crisitem.author.dept

University of Verona, Italy

crisitem.author.dept

Aalborg University

crisitem.author.dept

E192-02 - Forschungsbereich Databases and Artificial Intelligence

crisitem.author.orcid

0000-0001-7922-5998

crisitem.author.orcid

0000-0002-6984-2121

crisitem.author.orcid

0000-0001-7025-8099

crisitem.author.parentorg

E192 - Institut für Logic and Computation

Appears in Collections:

Conference Paper

Fulltext (Version of Record (published version))

Adobe PDF

(1.56 MB)

Mining Validating Shapes for Large Knowledge Graphs via Dynamic Reservoir Sampling

CC BY 4.0
