<div class="csl-bib-body">
<div class="csl-entry">Helali, M., Monjazeb, N., Vashisth, S., Carrier, P., Helal, A., Cavalcante, A., Ammar, K., Hose, K., & Mansour, E. (2024). KGLiDS: A Platform for Semantic Abstraction, Linking, and Automation of Data Science. In <i>2024 IEEE 40th International Conference on Data Engineering (ICDE)</i> (pp. 179–192). IEEE. https://doi.org/10.1109/ICDE60146.2024.00021</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/208548
-
dc.description.abstract
In recent years, we have witnessed the growing interest from academia and industry in applying data science technologies to analyze large amounts of data. In this process, a myriad of artifacts (datasets, pipeline scripts, etc.) are created. However, there has been no systematic attempt to holistically collect and exploit all the knowledge and experiences that are implicitly contained in those artifacts. Instead, data scientists recover information and expertise from colleagues or learn via trial and error. Hence, this paper presents a scalable platform, KGLiDS, that employs machine learning and knowledge graph technologies to abstract and capture the semantics of data science artifacts and their connections. Based on this information, KGLiDS enables various downstream applications, such as data discovery and pipeline automation. Our comprehensive evaluation covers use cases in data discovery, data cleaning, transformation, and AutoML. It shows that KGLiDS is significantly faster with a lower memory footprint than the state-of-the-art systems while achieving comparable or better accuracy.
en
dc.language.iso
en
-
dc.subject
Data Discovery
en
dc.subject
Data Integration
en
dc.subject
Graph Neural Networks
en
dc.subject
Knowledge Graphs
en
dc.subject
Linked Data Science
en
dc.subject
Semantic Abstraction
en
dc.title
KGLiDS: A Platform for Semantic Abstraction, Linking, and Automation of Data Science
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.publication
2024 IEEE 40th International Conference on Data Engineering (ICDE)
-
dc.contributor.affiliation
Concordia University, Canada
-
dc.contributor.affiliation
Concordia University, Canada
-
dc.contributor.affiliation
Concordia University, Canada
-
dc.contributor.affiliation
Concordia University, Canada
-
dc.contributor.affiliation
Concordia University, Canada
-
dc.contributor.affiliation
Borealis (Austria), Austria
-
dc.contributor.affiliation
University of Waterloo, Canada
-
dc.contributor.affiliation
Concordia University, Canada
-
dc.relation.isbn
979-8-3503-1715-2
-
dc.relation.doi
10.1109/ICDE60146.2024
-
dc.relation.issn
1063-6382
-
dc.description.startpage
179
-
dc.description.endpage
192
-
dc.type.category
Full-Paper Contribution
-
dc.relation.eissn
2375-026X
-
tuw.booktitle
2024 IEEE 40th International Conference on Data Engineering (ICDE)
-
tuw.peerreviewed
true
-
tuw.relation.publisher
IEEE
-
tuw.relation.publisherplace
Piscataway
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E192-02 - Forschungsbereich Databases and Artificial Intelligence
-
tuw.publisher.doi
10.1109/ICDE60146.2024.00021
-
dc.description.numberOfPages
14
-
tuw.author.orcid
0000-0001-7490-5011
-
tuw.author.orcid
0000-0003-1782-6637
-
tuw.author.orcid
0000-0003-0279-2729
-
tuw.author.orcid
0000-0003-0009-9197
-
tuw.author.orcid
0000-0001-7025-8099
-
tuw.event.name
IEEE 40th International Conference on Data Engineering (ICDE)
en
tuw.event.startdate
13-05-2024
-
tuw.event.enddate
16-05-2024
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Utrecht
-
tuw.event.country
NL
-
tuw.event.presenter
Helali, Mossad
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
80
-
wb.sciencebranch.value
20
-
item.grantfulltext
none
-
item.fulltext
no Fulltext
-
item.openairetype
conference paper
-
item.languageiso639-1
en
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.cerifentitytype
Publications
-
crisitem.author.dept
Concordia University, Canada
-
crisitem.author.dept
Concordia University, Canada
-
crisitem.author.dept
Concordia University, Canada
-
crisitem.author.dept
Concordia University, Canada
-
crisitem.author.dept
Concordia University, Canada
-
crisitem.author.dept
Borealis (Austria), Austria
-
crisitem.author.dept
University of Waterloo, Canada
-
crisitem.author.dept
E192-02 - Forschungsbereich Databases and Artificial Intelligence
-
crisitem.author.dept
Concordia University, Canada