Chemometrics; FTIR spectra; High-dimensional data analysis; Robust classification; Robust regression
Abstract:
Data sets derived from practical experiments often pose challenges for (robust) statistical methods. In high-dimensional data sets, more variables than observations are recorded, and often some observations do not follow the structure of the majority of the data. To handle such outlying observations, a variety of robust regression and classification methods have been developed for low-dimensional data. The high-dimensional case, however, is more challenging, and the variety of robust methods is much more limited. The choice of method depends on the specific data structure, and numerical problems are more likely to occur. We give an overview of selected robust methods and their implementations, and demonstrate their application on two high-dimensional data sets from tribology. We show that robust statistical methods, combined with appropriate pre-processing and sampling strategies, yield increased prediction performance and insight into data that differ from the majority.
Project title:
Feature recognition in multidimensional data sets of lubricated contacts (Merkmalerkennung in mehrdimensionalen Datensätzen von geschmierten Kontakten): RV-TUW-01 (FFG - Österr. Forschungsförderungsgesellschaft mbH)