<div class="csl-bib-body">
<div class="csl-entry">Schwendinger, B., Schwendinger, F., & Vana Gür, L. (2022, June 22). <i>Holistic generalized linear models</i> [Poster Presentation]. useR! 2022, United States of America.</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/190692
-
dc.description.abstract
Selecting a sensible model from the set of all reasonable models is an essential but typically time-consuming step in data analysis. To simplify this step, Bertsimas & King 2015 and Bertsimas & Li 2020 introduce the holistic linear model (HLM). The HLM is a constrained linear regression model whose constraints aim to automate model selection by means of quadratic mixed-integer optimization. The integer constraints are used to place cardinality constraints on the linear regression model. Placing a cardinality constraint on the total number of variables allowed in the final model leads to the classical best subset selection problem (Miller 2002): $\min_{\beta} \frac{1}{2} \lVert y - X\beta \rVert_2^2$ subject to $\lVert \beta \rVert_0 \le k$.
Cardinality constraints on user-defined groups of variables can be used to limit pairwise multicollinearity or to select the best (non-linear) transformation of a variable. Additionally, the HLM supports constraints on global multicollinearity and linear constraints on the parameters.
This work introduces holiglm, an R package for formulating and fitting holistic generalized linear models (HGLMs). To our knowledge, we are the first to suggest using conic optimization to extend the results presented for linear regression by Bertsimas et al. to the class of generalized linear models. The holiglm package provides a flexible infrastructure for automatically translating constrained generalized linear models into conic optimization problems. The optimization problems are solved by utilizing the R optimization infrastructure package ROI (Theußl, Schwendinger & Hornik 2020). Using ROI makes it possible for the user to choose from a wide range of commercial and open-source optimization solvers. Additionally, a high-level interface is provided, which can be used as a drop-in replacement for the stats::glm() function. Using conic optimization instead of iteratively reweighted least squares (IRLS) has the advantage that no starting values are needed, the results are more reliable (proven optimality) and the solvers are designed to handle constraints. These advantages come at the cost of a longer runtime. However, as shown by Schwendinger, Grün & Hornik 2021, for some GLMs the speed of the conic formulation is similar to the IRLS implementation.
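To make the best subset selection problem stated in the abstract concrete, the following minimal sketch solves it by exhaustive enumeration for a tiny linear regression. This is not the mixed-integer or conic formulation used by the HLM or by holiglm; it is a brute-force stand-in that only illustrates the objective and the cardinality constraint. The function names `solve_linear`, `objective`, and `best_subset` and the toy data are hypothetical, and the sketch assumes that the columns of every candidate subset are linearly independent.

```python
from itertools import combinations


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is square and non-singular (an assumption of this sketch).
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def objective(X, y, beta):
    """Return 1/2 * ||y - X beta||_2^2."""
    return 0.5 * sum(
        (yi - sum(xij * bj for xij, bj in zip(row, beta))) ** 2
        for row, yi in zip(X, y)
    )


def best_subset(X, y, k):
    """Enumerate all supports with at most k variables; keep the best fit."""
    p = len(X[0])
    best_beta = [0.0] * p  # the empty model is always feasible
    best_obj = objective(X, y, best_beta)
    for size in range(1, k + 1):
        for support in combinations(range(p), size):
            # Normal equations restricted to the chosen columns.
            A = [[sum(r[i] * r[j] for r in X) for j in support] for i in support]
            b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in support]
            coef = solve_linear(A, b)
            beta = [0.0] * p
            for j, c in zip(support, coef):
                beta[j] = c
            obj = objective(X, y, beta)
            if obj < best_obj:
                best_beta, best_obj = beta, obj
    return best_beta, best_obj


# Toy data: y is exactly 2 * (column 0) - 1 * (column 2).
X = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [2, 1, 3], [1, 3, 2]]
y = [1, -1, 2, 1, 0]
beta, obj = best_subset(X, y, k=2)
```

On this toy data, with k = 2 the search selects the first and third columns. The number of candidate supports grows combinatorially with the number of variables, which is why enumeration is only practical for very small problems and why the abstract turns to mixed-integer and conic optimization instead.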
en
dc.language.iso
en
-
dc.subject
Generalized Linear Models
en
dc.subject
Algorithmic regression
en
dc.subject
best subset selection
en
dc.subject
conic programming
en
dc.subject
holistic constraints
en
dc.title
Holistic generalized linear models
en
dc.type
Presentation
en
dc.type
Vortrag
de
dc.type.category
Poster Presentation
-
tuw.researchTopic.id
A4
-
tuw.researchTopic.id
A3
-
tuw.researchTopic.name
Mathematical Methods in Economics
-
tuw.researchTopic.name
Fundamental Mathematics Research
-
tuw.researchTopic.value
50
-
tuw.researchTopic.value
50
-
tuw.publication.orgunit
E384-01 - Forschungsbereich Software-intensive Systems