<div class="csl-bib-body">
<div class="csl-entry">Schwendinger, B. (2024). <i>An optimization approach to generalized linear models</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.123068</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2024.123068
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/199308
-
dc.description.abstract
Statistical model building is a difficult and time-consuming process. Modelers need to balance various competing goals, such as predictive power, sparsity and fairness. Currently, practitioners follow model-building rules and only check afterwards whether a model has the desired properties. These model selection techniques often rely on heuristics. As a result, final models often do not meet the desired criteria, let alone fulfill them satisfactorily. In this work, we investigate an optimization approach to Generalized Linear Models (GLMs), resulting in Holistic Generalized Linear Models (HGLMs). This class of models allows additional constraints, such as sparsity, limited multicollinearity or linear constraints, to be included directly in the maximum likelihood estimation problem for the corresponding GLM. To model and solve the resulting optimization problems, we use (mixed-integer) conic optimization or (mixed-integer) linear optimization while linearly approximating the objective function. Furthermore, we provide the software package holiglm for conveniently fitting HGLMs. Adding constraints to the underlying optimization problem ensures that the desired properties of the statistical model, such as sparsity or non-negative coefficients, are always satisfied. The usual method for fitting GLMs is Maximum Likelihood Estimation via Iteratively Reweighted Least Squares (IRLS). However, when linear constraints are present, which can occur for certain family/link combinations even without additional constraints, IRLS tends to converge slowly or does not converge at all. Conic optimization, as a generalization of linear optimization, offers a unified and reliable way of fitting HGLMs. In cases where the log-likelihood cannot be represented by the available cones, we resort to linear approximations. Since conic and linear optimization are exact optimization methods, the solution they return is a global optimum of the objective function.
In conclusion, the proposed class of HGLMs makes the process of model building less manual and more holistic, since it is driven by optimization. Moreover, the holistic constraints can be used to incorporate additional knowledge or to enable automated model selection by including information criteria in the objective function. Through our linear approximation scheme for differentiable log-likelihoods, we can fit HGLMs for a wide range of family/link combinations.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Holistische verallgemeinerte lineare Modelle
de
dc.subject
Auswahl der besten Teilmengen
de
dc.subject
Merkmalsauswahl
de
dc.subject
Gemischt-ganzzahlige konische Optimierung
de
dc.subject
Verallgemeinerte lineare Modelle
de
dc.subject
Holistic Generalized Linear Models
en
dc.subject
Best Subset Selection
en
dc.subject
Feature Selection
en
dc.subject
Mixed-Integer Conic Optimization
en
dc.subject
Generalized Linear Models
en
dc.title
An optimization approach to generalized linear models
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2024.123068
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Benjamin Schwendinger
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Hoch, Ralph
-
tuw.publication.orgunit
E384 - Institut für Computertechnik
-
dc.type.qualificationlevel
Doctoral
-
dc.identifier.libraryid
AC17245063
-
dc.description.numberOfPages
141
-
dc.thesistype
Dissertation
de
dc.thesistype
Dissertation
en
tuw.author.orcid
0000-0003-3315-8114
-
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
tuw.assistant.staffStatus
staff
-
item.languageiso639-1
en
-
item.openairetype
doctoral thesis
-
item.openairecristype
http://purl.org/coar/resource_type/c_db06
-
item.grantfulltext
open
-
item.cerifentitytype
Publications
-
item.fulltext
with Fulltext
-
item.mimetype
application/pdf
-
item.openaccessfulltext
Open Access
-
crisitem.author.dept
E384-01 - Forschungsbereich Software-intensive Systems