Mayrhofer, M. (2024). Robustness and Explainable Outlier Detection for Multivariate, Matrix-variate, and Functional Settings [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.109582
E105 - Institut für Stochastik und Wirtschaftsmathematik
Date (published):
2024
Extent:
203
Keywords:
Robustness; Outliers; Explainable AI
Abstract:
This work addresses the challenges of robust covariance estimation and interpretable outlier detection for multivariate, matrix-variate, and functional data. The goal is to develop methods that enhance both the robustness and interpretability in these settings.
For outlier interpretability, we propose a novel approach that combines robust Mahalanobis distances with Shapley values to decompose multivariate outlyingness into variable-specific contributions. We present this decomposition in the multivariate setting and demonstrate how our method reduces the exponential computational complexity in the number of variables to linear complexity, while preserving the key properties of the Shapley value. This approach is also extended to the matrix-variate and functional setting, respectively.
For robust location and covariance estimation in the matrix-variate setting, we define the Matrix Minimum Covariance Determinant (MMCD) estimators and prove that they are consistent in the class of matrix-variate elliptical distributions. We show that these estimators are matrix affine equivariant and achieve a higher breakdown point than the maximum attainable for any multivariate affine equivariant covariance estimator applied to vectorized data. We demonstrate that the incorporation of an additional reweighting step improves the efficiency, and finally present and implement a fast algorithm with convergence guarantees.
The MMCD approach naturally extends to the setting of multivariate Functional Data Analysis (FDA), where data are represented using basis functions and coefficient matrices. We establish a connection between stochastic processes with a separable covariance structure and the corresponding matrix-variate distribution of their basis representations. In combination with a multivariate functional Mahalanobis (semi-)distance, the MMCD approach can be used to robustly estimate the mean and covariance functions for multivariate functional data.
The combined use of robust Mahalanobis distances, MMCD estimators, and Shapley value-based outlyingness decomposition offers a comprehensive framework for robust and interpretable data analysis across multivariate, matrix-variate, and functional data structures, with substantial theoretical and practical benefits, verified through simulations and real-world examples.
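The reduction from exponential to linear complexity mentioned in the abstract can be illustrated with a small Python sketch. It is not taken from the dissertation: it assumes a value function in which coordinates outside a coalition are set to the location estimate, and it uses fixed example values for `mu` and `cov` where a robust estimate would normally be plugged in. Under that assumption, the brute-force Shapley values of the squared Mahalanobis distance coincide with a closed form that needs only one matrix-vector product. All function names are hypothetical.

```python
import itertools
import math

import numpy as np


def md2_masked(x, mu, prec, subset):
    """Squared Mahalanobis distance when only the coordinates in `subset`
    keep their observed values and all others are set to `mu` (assumed
    value function for this sketch)."""
    z = np.zeros(len(x), dtype=float)
    idx = np.array(subset, dtype=int)
    z[idx] = x[idx] - mu[idx]
    return float(z @ prec @ z)


def shapley_bruteforce(x, mu, prec):
    """Shapley values by enumerating all coalitions (exponential in p)."""
    p = len(x)
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for r in range(p):
            # Standard Shapley weight for coalitions of size r.
            w = math.factorial(r) * math.factorial(p - r - 1) / math.factorial(p)
            for s in itertools.combinations(others, r):
                gain = md2_masked(x, mu, prec, s + (j,)) - md2_masked(x, mu, prec, s)
                phi[j] += w * gain
    return phi


def shapley_closed_form(x, mu, prec):
    """Closed form under the same value function: one matrix-vector product."""
    d = x - mu
    return d * (prec @ d)


# Illustrative values only; in practice mu and cov would be robust estimates.
mu = np.zeros(3)
cov = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.0, 0.2],
                [0.3, 0.2, 1.5]])
prec = np.linalg.inv(cov)
x = np.array([3.0, -1.0, 0.5])

phi_slow = shapley_bruteforce(x, mu, prec)
phi_fast = shapley_closed_form(x, mu, prec)
md2 = float((x - mu) @ prec @ (x - mu))
```

In this sketch the contributions in `phi_fast` sum to the squared Mahalanobis distance `md2`, which mirrors the efficiency property of the Shapley value.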
Additional information:
Thesis not yet received at the library - data not checked