<div class="csl-bib-body">
<div class="csl-entry">Christensen, M. P., Leventidis, A., Lissandrini, M., Di Rocco, L., Miller, R. J., & Hose, K. (2025). Fantastic Tables and Where to Find Them: Table Search in Semantic Data Lakes. In <i>Advances in Database Technology - Volume 28 Proceedings 28th International Conference on Extending Database Technology (EDBT 2025)</i> (pp. 397–410). OpenProceedings.org. https://doi.org/10.48786/edbt.2025.32</div>
</div>
In data lakes, one of the core challenges remains finding relevant tables. We introduce the notion of semantic data lakes, i.e.,
repositories where datasets are linked to concepts and entities described in a knowledge graph (KG). We formalize the problem
of semantic table search, i.e., retrieving tables containing information semantically related to a given set of entities, and provide
the first formal definition of semantic relatedness of a dataset to tuples of entities. Our solution offers the first general framework
to compute the semantic relevance of the contents of a table with respect to entity tuples, as well as efficient algorithms
(exploiting semantic signals, such as entity types and embeddings) to scale the
semantic search to repositories with hundreds of thousands of distinct tables. Our extensive experiments on both real-world and
synthetic benchmarks show that our approach is able to retrieve more relevant tables (up to 5.4 times higher recall) in comparison
to existing methods while ensuring fast response times (up to 17 times faster with LSH).
en
dc.language.iso
en
-
dc.subject
Semantic Data Lakes
en
dc.subject
Knowledge graphs
en
dc.subject
Semantic table search
en
dc.title
Fantastic Tables and Where to Find Them: Table Search in Semantic Data Lakes
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Aalborg University, Denmark
-
dc.contributor.affiliation
University of Verona, Italy
-
dc.relation.isbn
978-3-89318-098-1
-
dc.relation.issn
2367-2005
-
dc.description.startpage
397
-
dc.description.endpage
410
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Advances in Database Technology - Volume 28 Proceedings 28th International Conference on Extending Database Technology (EDBT 2025)
-
tuw.peerreviewed
true
-
tuw.relation.publisher
OpenProceedings.org
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
70
-
tuw.researchTopic.value
30
-
tuw.publication.orgunit
E192-02 - Forschungsbereich Databases and Artificial Intelligence
-
tuw.publisher.doi
10.48786/edbt.2025.32
-
dc.description.numberOfPages
14
-
tuw.author.orcid
0000-0003-3168-6810
-
tuw.author.orcid
0000-0001-7025-8099
-
tuw.event.name
28th International Conference on Extending Database Technology (EDBT 2025)
en
tuw.event.startdate
25-03-2025
-
tuw.event.enddate
28-03-2025
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Barcelona
-
tuw.event.country
ES
-
tuw.event.presenter
Christensen, Martin Pekár
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
80
-
wb.sciencebranch.value
20
-
item.grantfulltext
none
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.openairetype
conference paper
-
item.languageiso639-1
en
-
item.cerifentitytype
Publications
-
item.fulltext
no Fulltext
-
crisitem.author.dept
Aalborg University, Denmark
-
crisitem.author.dept
Aalborg University
-
crisitem.author.dept
E192-02 - Forschungsbereich Databases and Artificial Intelligence