<div class="csl-bib-body">
<div class="csl-entry">Glaser, P.-L. (2025). <i>Encoding Semantic Information in Conceptual Models for Machine Learning Applications</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.119285</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2025.119285
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/216680
-
dc.description
Thesis not yet received by the library - data not verified
-
dc.description
Title differs according to the author's translation
-
dc.description.abstract
The integration of Conceptual Modeling (CM) and Machine Learning (ML) has given rise to a growing research field known as Machine Learning for Conceptual Modeling (ML4CM), where ML techniques are applied to support modeling tasks such as classification, completion, or repair. A crucial factor in these applications is the transformation of conceptual models into ML-compatible representations, called encodings. A wide variety of encoding strategies exist that draw on different information sources within conceptual models, depending on the specific use case. However, existing ML4CM studies tend to treat encodings as fixed and focus predominantly on tuning ML algorithms or hyperparameters. Consequently, encoding strategies and their internal configuration options receive limited scrutiny during evaluation, making it difficult for researchers and practitioners to select and adapt optimal encodings for specific tasks. This thesis addresses this gap by developing and evaluating a set of configurable semantic encodings for conceptual models. Specifically, it investigates how semantic information (e.g., names, types, contextual relationships) within models can be systematically extracted and transformed into ML-compatible representations. The work adopts the Design Science Research methodology and extends the CM2ML framework with an ArchiMate parser and four semantic encoders: Bag-of-Words (BoW), Term Frequency (TF), Embeddings, and Triples. Each encoder captures distinct semantic aspects and supports extensive configurability to enable experimentation and task-specific adaptation.
Furthermore, all encodings can be interactively visualized within the framework, offering real-time insight into parameter effects and traceability that links encoded features back to their source model elements. To evaluate the proposed encodings, the thesis combines a qualitative comparison based on defined criteria with a quantitative assessment through two representative ML tasks. The first task, dummy classification, employs TF encodings to distinguish dummy views from valid ones and explores the impact of common NLP parameters and weighting schemes. The second task, node classification, aims to predict element types based on local context, using triple encodings enriched with word embeddings for element names and one-hot vectors for types. The results demonstrate that the encodings are suitable for specific ML4CM tasks and that certain encoding configurations can substantially influence model performance.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
conceptual modeling
en
dc.subject
encoding
en
dc.subject
machine learning
en
dc.title
Encoding Semantic Information in Conceptual Models for Machine Learning Applications
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2025.119285
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Philipp-Lorenz Glaser
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Ali, Syed Juned
-
tuw.publication.orgunit
E194 - Institut für Information Systems Engineering
-
dc.type.qualificationlevel
Diploma
-
dc.identifier.libraryid
AC17573736
-
dc.description.numberOfPages
111
-
dc.thesistype
Diplomarbeit
de
dc.thesistype
Diploma Thesis
en
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
tuw.assistant.staffStatus
staff
-
tuw.advisor.orcid
0000-0001-8259-2297
-
tuw.assistant.orcid
0000-0003-1221-0278
-
item.languageiso639-1
en
-
item.grantfulltext
open
-
item.openairetype
master thesis
-
item.openaccessfulltext
Open Access
-
item.mimetype
application/pdf
-
item.openairecristype
http://purl.org/coar/resource_type/c_bdcc
-
item.cerifentitytype
Publications
-
item.fulltext
with Fulltext
-
crisitem.author.dept
E194-03 - Forschungsbereich Business Informatics
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering