<div class="csl-bib-body">
<div class="csl-entry">Parzer, R., Filzmoser, P., &amp; Vana Gür, L. (2025). Sparse data-driven random projection in regression for high-dimensional data. <i>Journal of Data Science, Statistics, and Visualisation</i>, <i>5</i>(5). https://doi.org/10.52933/jdssv.v5i5.138</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/221864
-
dc.description.abstract
We examine the linear regression problem in a challenging high-dimensional setting with correlated predictors where the degree of sparsity of the coefficients is unknown and can vary from sparse to dense. In this setting, we propose a combination of probabilistic variable screening with random projection tools as a computationally efficient approach. In particular, we introduce a new data-driven random projection for dimension reduction in linear regression, which is motivated by a theoretical bound on the gain in expected prediction error over conventional random projections when using information about the true coefficient. The variables to be included in the projection are screened by considering the correlation of the predictors. To reduce the dependence on fine-tuning choices, we aggregate over an ensemble of linear models. A threshold parameter is introduced to obtain a higher degree of sparsity, which can be chosen together with the number of models in the ensemble by cross-validation. In extensive simulations, we compare the proposed method with other random projection tools and with well-known methods, and show that it is competitive in terms of prediction in a variety of scenarios with different sparsity and predictor covariance settings, while most competitors are targeted at either sparse or dense settings. Finally, we illustrate the method on two data applications.
en
dc.description.sponsorship
FWF - Österr. Wissenschaftsfonds
-
dc.language.iso
en
-
dc.publisher
International Association for Statistical Computing (IASC)
-
dc.relation.ispartof
Journal of data science, statistics, and visualisation
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
High-dimensional regression
en
dc.subject
Dimension reduction
en
dc.subject
Random Projection
en
dc.title
Sparse data-driven random projection in regression for high-dimensional data
en
dc.type
Article
en
dc.type
Artikel
de
dc.rights.license
Creative Commons Attribution 4.0 International
en
dc.rights.license
Creative Commons Namensnennung 4.0 International
de
dc.relation.grantno
ZK 35-G
-
dc.type.category
Original Research Article
-
tuw.container.volume
5
-
tuw.container.issue
5
-
tuw.peerreviewed
false
-
tuw.project.title
High-dimensional statistical learning: New methods to advance economic and sustainability policies