<div class="csl-bib-body">
<div class="csl-entry">Verbruggen, C. R. R., Netz, L., Glaser, P.-L., Scholz, M., Huemer, C., Calamo, M., Rumpe, B., Snoeck, M., & Bork, D. (2025). Toward a Community-Curated Golden Dataset of UML Models. In <i>2025 ACM/IEEE 28th International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C)</i> (pp. 43–50). IEEE. https://doi.org/10.1109/MODELS-C68889.2025.00012</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/225650
-
dc.description.abstract
Datasets of Unified Modeling Language (UML) models are becoming increasingly valuable for education, empirical research, and tool development in model-driven engineering (MDE) and conceptual modeling. In recent years, several datasets have emerged, mostly compiled through automated crawling of open platforms such as GitHub and GenMyModel. While these efforts have improved access to real-world modeling artifacts, the resulting collections often suffer from serious quality issues: they include syntactically invalid models, semantically incorrect structures, and placeholder or dummy content. Moreover, most models are not accompanied by textual domain descriptions, which are essential for understanding the intent behind the model and assessing its semantic soundness. Therefore, these model datasets are far from ideal as a source for modeling exercises or empirical MDE research. This paper presents an initial step toward a community-curated golden dataset of UML models, designed to address these limitations. Our contribution includes i) a curated set of UML models, each paired with a natural language description of the modeled domain requirements, ii) a publicly accessible web platform for exploring and querying the dataset, and iii) a structured process for community-based contribution and evaluation to support sustainable growth and quality assurance of the dataset. By fostering community involvement and providing high-quality, semantically grounded models, this work lays the foundation for a widely accepted benchmark dataset in UML-based research and education.
en
dc.language.iso
en
-
dc.subject
Model repository
en
dc.subject
Dataset
en
dc.subject
UML
en
dc.subject
Open models
en
dc.subject
Curation
en
dc.subject
Community
en
dc.subject
Education
en
dc.subject
Machine learning
en
dc.title
Toward a Community-Curated Golden Dataset of UML Models
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
RWTH Aachen University, Germany
-
dc.contributor.affiliation
Sapienza University of Rome, Italy
-
dc.contributor.affiliation
RWTH Aachen University, Germany
-
dc.contributor.affiliation
KU Leuven, Belgium
-
dc.relation.isbn
979-8-3315-7990-6
-
dc.description.startpage
43
-
dc.description.endpage
50
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
2025 ACM/IEEE 28th International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C)
-
tuw.peerreviewed
true
-
tuw.relation.publisher
IEEE
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-03 - Forschungsbereich Business Informatics
-
tuw.publisher.doi
10.1109/MODELS-C68889.2025.00012
-
dc.description.numberOfPages
8
-
tuw.author.orcid
0000-0003-0418-2633
-
tuw.author.orcid
0000-0003-2013-2919
-
tuw.author.orcid
0009-0006-2602-9604
-
tuw.author.orcid
0000-0002-2147-1966
-
tuw.author.orcid
0000-0002-3824-3214
-
tuw.author.orcid
0000-0001-8259-2297
-
tuw.event.name
28th International Conference on Model Driven Engineering Languages and Systems (MODELS 2025)
en
tuw.event.startdate
05-10-2025
-
tuw.event.enddate
10-10-2025
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Grand Rapids, Michigan
-
tuw.event.country
US
-
tuw.event.presenter
Verbruggen, Charlotte Roos R.
-
tuw.presentation.online
Online
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Wirtschaftswissenschaften
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
5020
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.fulltext
no Fulltext
-
item.languageiso639-1
en
-
item.grantfulltext
none
-
item.openairetype
conference paper
-
item.cerifentitytype
Publications
-
crisitem.author.dept
E194-03 - Forschungsbereich Business Informatics
-
crisitem.author.dept
RWTH Aachen University, Germany
-
crisitem.author.dept
E194-03 - Forschungsbereich Business Informatics
-
crisitem.author.dept
E194-03 - Forschungsbereich Business Informatics
-
crisitem.author.dept
E194-03 - Forschungsbereich Business Informatics
-
crisitem.author.dept
Sapienza University of Rome, Italy
-
crisitem.author.dept
RWTH Aachen University, Germany
-
crisitem.author.dept
KU Leuven, Belgium
-
crisitem.author.dept
E194-03 - Forschungsbereich Business Informatics
-
crisitem.author.orcid
0000-0003-0418-2633
-
crisitem.author.orcid
0000-0003-2013-2919
-
crisitem.author.orcid
0009-0006-2602-9604
-
crisitem.author.orcid
0000-0002-2147-1966
-
crisitem.author.orcid
0000-0002-3824-3214
-
crisitem.author.orcid
0000-0001-8259-2297
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering