Robust linear regression for high-dimensional data: an overview

Filzmoser, Peter; Nordhausen, Klaus

doi:10.1002/wics.1524

Record citation link:

http://hdl.handle.net/20.500.12708/137149

Title:

Robust linear regression for high-dimensional data: an overview

Citation:

Filzmoser, P., & Nordhausen, K. (2021). Robust linear regression for high-dimensional data: an overview. Wiley Interdisciplinary Reviews: Computational Statistics. https://doi.org/10.1002/wics.1524

Publisher DOI:

10.1002/wics.1524

Publication type:

Article - Research article

Language:

English

Authors:

Filzmoser, Peter
Nordhausen, Klaus

Organizational unit:

E105-06 - Research Unit Computational Statistics

Journal:

Wiley Interdisciplinary Reviews: Computational Statistics

ISSN:

1939-0068

Date (published):

2021

Extent:

Publisher:

WILEY

Peer Reviewed:

Keywords:

Statistics and Probability

Abstract:

Digitization, the process of converting information into numbers, leads to bigger and more complex data sets, bigger also with respect to the number of measured variables. This makes it harder or even impossible for the practitioner to identify outliers or observations that are inconsistent with an underlying model. Classical least-squares based procedures can be affected by those outliers. In the regression context, this means that the parameter estimates are biased, with consequences for the validity of the statistical inference, for regression diagnostics, and for the prediction accuracy. Robust regression methods aim at assigning appropriate weights to observations that deviate from the model. While robust regression techniques are widely known in the low-dimensional case, researchers and practitioners might still not be very familiar with developments in this direction for high-dimensional data. Recently, different strategies have been proposed for robust regression in the high-dimensional case, typically based on dimension reduction, on shrinkage, including sparsity, and on combinations of such techniques. A very recent concept is downweighting single cells of the data matrix rather than complete observations, with the goal of making better use of the model-consistent information and thus achieving higher efficiency of the parameter estimates.
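The idea of assigning weights to observations that deviate from the model can be illustrated with a minimal sketch. It is not taken from the article and does not implement any of the high-dimensional methods it reviews; it assumes a classical low-dimensional Huber M-estimator computed by iteratively reweighted least squares. The function name `huber_irls`, the tuning constant `c = 1.345`, the MAD-based residual scale, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate via iteratively reweighted least squares (sketch).

    Hypothetical helper for illustration only; c and the MAD scale are
    conventional choices, not values prescribed by the article.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Start from the ordinary least-squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    weights = np.ones(len(y))
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale: normalized median absolute deviation.
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, c/u for large ones.
        weights = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        # Weighted least squares via row scaling by sqrt(weights).
        sw = np.sqrt(weights)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # The weights are those used in the last weighted fit.
    return beta, weights

# Simulated example (assumed data): true intercept 1, true slope 2.
rng = np.random.default_rng(0)
n = 200
x = np.linspace(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)
y[::10] += 30.0  # shift every 10th response: 10% vertical outliers

beta_true = np.array([1.0, 2.0])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_rob, w = huber_irls(X, y)

print("OLS error:   ", np.linalg.norm(beta_ols - beta_true))
print("Robust error:", np.linalg.norm(beta_rob - beta_true))
print("Largest weight on a contaminated row:", w[::10].max())
```

In this sketch the contaminated rows receive weights close to zero, so they contribute little to the final fit. Note that the weights apply to whole observations; the cellwise downweighting mentioned at the end of the abstract would instead treat individual entries of the data matrix.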

Research focus areas:

Computational Materials Science: 100%

Field of science:

Mathematics
Computer science

Appears in collections:

Article
