<div class="csl-bib-body">
<div class="csl-entry">Felder, J. (2025). <i>Predicting bugs in source code : A machine learning approach for predicting faults by utilizing code and change metrics</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.124198</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2025.124198
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/213269
-
dc.description
Abstract in German
-
dc.description.abstract
Bug detection plays a critical role in software engineering, offering significant time and cost savings for organizations and developers alike. With the exponential growth in code volume and the availability of data surrounding its development, bug prediction has become increasingly important. This thesis focuses on combining code metrics, especially ones based on code changes, and machine learning techniques to address the challenge of identifying buggy software files. The thesis leverages a dataset comprising 34 open-source projects and utilizes more than 37 code metrics, ranging from basic measures such as Lines of Code to advanced metrics rooted in object-oriented programming principles. A CatBoost classifier was employed to develop a predictive model capable of classifying files as buggy or non-buggy and assigning a corresponding Risk Score -- a numerical indicator of the likelihood that a given file contains bugs. The model achieved an average accuracy of 84.1% and a recall rate of 83%, demonstrating its reliability and effectiveness in identifying buggy files. The analysis further examined the importance of individual code metrics in driving the model's predictions. Feature Importance Analysis identified complexity metrics and the Bus Factor as the most influential in predicting buggy files, offering valuable insights into key contributors to software quality. Additionally, a Logistic Regression-based approach, which achieved an accuracy of 61%, was evaluated to contrast its performance with advanced non-linear models like CatBoost, demonstrating the latter's superior predictive capabilities for bug prediction. This work contributes to the field of software engineering by demonstrating the efficacy of combining machine learning with metric-driven approaches for bug prediction. The results provide a foundation for future research and practical applications aimed at enhancing software reliability and development efficiency.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Machine Learning
en
dc.subject
Bug Prediction
en
dc.subject
Code Metrics
en
dc.subject
Change Metrics
en
dc.subject
Gradient Boosting
en
dc.subject
CatBoost
en
dc.subject
Classification
en
dc.title
Predicting bugs in source code : A machine learning approach for predicting faults by utilizing code and change metrics
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2025.124198
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Jodok Felder
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Schatten, Alexander
-
tuw.publication.orgunit
E194 - Institut für Information Systems Engineering