<div class="csl-bib-body">
<div class="csl-entry">Parzer, R., Vana Gür, L., & Filzmoser, P. (2023). <i>Sparse Projected Averaged Regression for High-Dimensional Data</i>. arXiv. https://doi.org/10.34726/5489</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/193951
-
dc.identifier.uri
https://doi.org/10.34726/5489
-
dc.description.abstract
We examine the linear regression problem in a challenging high-dimensional setting with correlated predictors, where the goal is to explain and predict relevant quantities, and we explicitly allow the regression coefficient vector to range from sparse to dense. Most classical high-dimensional regression estimators require some degree of sparsity. We discuss the more recent concepts of variable screening and random projection as computationally fast dimension reduction tools, and propose a new random projection matrix tailored to the linear regression problem, with a theoretical bound on the gain in expected prediction error over conventional random projections.
Around this new random projection, we build the Sparse Projected Averaged Regression (SPAR) method, which combines probabilistic variable screening steps with random projection steps to obtain an ensemble of small linear models. In contrast to existing methods, we introduce a thresholding parameter to obtain some degree of sparsity.
In extensive simulations and two real data applications, we walk through the elements of this method and compare its prediction and variable selection performance to various competitors. For prediction, our method performs at least as well as the best competitors in most settings with a high number of truly active variables, while variable selection remains a hard task for all methods in high dimensions.
en
dc.description.sponsorship
FWF - Österr. Wissenschaftsfonds
-
dc.language.iso
en
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
High-dimensional regression
en
dc.subject
Dimension Reduction
en
dc.subject
Random Projection
en
dc.subject
Screening
en
dc.title
Sparse Projected Averaged Regression for High-Dimensional Data
en
dc.type
Preprint
en
dc.type
Preprint
de
dc.rights.license
Creative Commons Namensnennung 4.0 International
de
dc.rights.license
Creative Commons Attribution 4.0 International
en
dc.identifier.doi
10.34726/5489
-
dc.identifier.arxiv
2312.00130
-
dc.relation.grantno
ZK 35-G
-
tuw.project.title
Hochdimensionales statistisches Lernen: Neue Methoden zur Förderung der Wirtschafts- und Nachhaltigkeitspolitik (High-dimensional statistical learning: new methods to advance economic and sustainability policy)