Title: Identification of Credit Default Drivers via Lasso Estimation in the Logistic Regression Model
Language: English
Authors: Blasko, Peter 
Qualification level: Diploma
Keywords: credit default; Lasso; logistic regression
Advisor: Schneider, Ulrike 
Issue Date: 2020
Number of Pages: 65
Abstract: 
In this work, a binary logistic regression model for two-year default probabilities has been estimated on a data set containing information on 150,000 clients, available from Kaggle's "GiveMeSomeCredit" competition. The model has been selected using the Lasso estimator, which chooses a subset of continuous, categorical and ordinal variables reflecting sociodemographic and behavioral properties of the clients as well as characteristics of their loans. The non-linear dependence of default probabilities on the regressors has been addressed by discretizing the regressors using a version of the fused Lasso in a multivariate setting. We find that the model provides an excellent fit to the data, reaching an average out-of-sample AUC of over 86%, independent of the model selection criterion (AIC, BIC or CV). This value lies in the upper range of the industry standard and is comparable to that of more complicated modeling approaches such as the one in Wang et al. (2015). We see that the estimator gives the strongest weights to behavioral variables such as past-due status and limit utilization, while sociodemographic variables and loan properties are much less significant.
URI: https://doi.org/10.34726/hss.2020.66160
http://hdl.handle.net/20.500.12708/15075
DOI: 10.34726/hss.2020.66160
Library ID: AC15676004
Organisation: E105 - Institut für Stochastik und Wirtschaftsmathematik 
Publication Type: Thesis
Appears in Collections: Thesis
