Machine learning in credit default risk

Petrov, Alexander

doi:10.34726/hss.2022.95042

DC Field

Value

Language

dc.contributor.advisor

Filzmoser, Peter

dc.contributor.author

Petrov, Alexander

dc.date.accessioned

2022-03-30T07:51:03Z

dc.date.issued

2022

dc.date.submitted

2022-03

dc.identifier.citation

Petrov, A. (2022). Machine learning in credit default risk [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2022.95042

dc.identifier.uri

https://doi.org/10.34726/hss.2022.95042

dc.identifier.uri

http://hdl.handle.net/20.500.12708/19819

dc.description

Title differs according to the author's translation

dc.description.abstract

As the amount of data collected by banks increases exponentially, the introduction of sophisticated machine learning models becomes inevitable in order to keep up with the times. The European Banking Authority (EBA) published a discussion paper in November 2021 which might open new possibilities for the estimation of the risk parameters by the internal ratings-based (IRB) approach. This thesis aims to compare the performance of different machine learning algorithms in the field of credit risk and, more specifically, in the discrimination of good and bad customers as a part of the probability of default (PD) estimation. The data consists of the corporate customers of a European bank and their balance sheet positions, enriched by region and industry information, with the 12-month default flag as the target variable. The binary classification algorithms are described from the theoretical point of view and then applied using R packages. The data pre-processing pipeline, including an extensive missing-data treatment and an outlier detection method, plays a decisive role because of the significant noise level in the sample; it also addresses the problem of imbalanced data through undersampling and overweighting. A cross-validation procedure ensures that an adequate out-of-time generalization is achieved. The results show that some of the advanced machine learning techniques outperform ordinary logistic regression and its regularized modifications, while others, such as the support vector machine, deliver comparable performance. A plain neural network with one hidden layer provides the best predictions in terms of the Gini coefficient on the holdout sample using a uniform quantile transformation. Random forest achieves the best performance with the untransformed data, although the interpretation of the results and the implementation of the model in a production environment are less straightforward than in the case of logistic regression.
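The abstract ranks models by their Gini coefficient on a holdout sample. As an illustration only, the following is a minimal Python sketch of that metric, assuming the common credit-scoring convention Gini = 2 · AUC − 1. It is not taken from the thesis, which works with R packages; the function names `auc` and `gini` and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 = default (bad customer), 0 = no default (good customer).
    scores: predicted default scores; higher means riskier.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    # Count (defaulter, non-defaulter) pairs where the defaulter is
    # scored higher; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Gini coefficient under the assumed convention Gini = 2 * AUC - 1."""
    return 2.0 * auc(labels, scores) - 1.0


# Hypothetical toy data: six customers, three of whom defaulted.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
print(round(gini(labels, scores), 3))
```

Under this convention a perfectly separating model has a Gini of 1 and a model that scores all customers identically has a Gini of 0. The pairwise loop is quadratic in the sample size, so a rank-based AUC would be preferable for a portfolio of realistic size.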

dc.language

English

dc.language.iso

dc.rights.uri

http://rightsstatements.org/vocab/InC/1.0/

dc.subject

Neural networks

dc.subject

Random forests

dc.subject

Logistic regression

dc.title

Machine learning in credit default risk

dc.title.alternative

Maschinelles Lernen beim Kreditausfallsrisiko

dc.type

Thesis

dc.type

Hochschulschrift

dc.rights.license

In Copyright

dc.rights.license

Urheberrechtsschutz

dc.identifier.doi

10.34726/hss.2022.95042

dc.contributor.affiliation

TU Wien, Österreich

dc.rights.holder

Alexander Petrov

dc.publisher.place

Wien

tuw.version

vor

tuw.thesisinformation

Technische Universität Wien

tuw.publication.orgunit

E105 - Institut für Stochastik und Wirtschaftsmathematik

dc.type.qualificationlevel

Diploma

dc.identifier.libraryid

AC16486821

dc.description.numberOfPages

dc.thesistype

Diplomarbeit

dc.thesistype

Diploma Thesis

dc.rights.identifier

In Copyright

dc.rights.identifier

Urheberrechtsschutz

tuw.advisor.staffStatus

staff

tuw.advisor.orcid

0000-0002-8014-4682

item.languageiso639-1

item.openairetype

master thesis

item.grantfulltext

open

item.fulltext

with Fulltext

item.cerifentitytype

Publications

item.mimetype

application/pdf

item.openairecristype

http://purl.org/coar/resource_type/c_bdcc

item.openaccessfulltext

Open Access

crisitem.author.dept

TU Wien

Appears in Collections:

Thesis

Fulltext (Version of Record (published version))

Adobe PDF

(1.26 MB)

In Copyright
