Multivariate outlier explanations using Shapley values and Mahalanobis distances

Mayrhofer, Marcus; Filzmoser, Peter

doi:10.48550/ARXIV.2210.10063

Record link:

http://hdl.handle.net/20.500.12708/137092
https://doi.org/10.34726/3163

Title:

Multivariate outlier explanations using Shapley values and Mahalanobis distances

Citation:

Mayrhofer, M., & Filzmoser, P. (2022). Multivariate outlier explanations using Shapley values and Mahalanobis distances. arXiv. https://doi.org/10.34726/3163

reposiTUm DOI:

10.34726/3163

CatalogPlus:

AC17203126

Publisher DOI:

10.48550/ARXIV.2210.10063

Publication Type:

Preprint

Language:

English

Authors:

Mayrhofer, Marcus
Filzmoser, Peter

Organisational Unit:

E105-06 - Research Unit Computational Statistics

ArXiv ID:

arXiv:2210.10063

Date (published):

18-Oct-2022

Preprint Server:

arXiv

Keywords:

Shapley value; anomaly detection; cellwise outliers; Mahalanobis distance

Abstract:

For the purpose of explaining multivariate outlyingness, it is shown that the squared Mahalanobis distance of an observation can be decomposed into outlyingness contributions originating from single variables. The decomposition is obtained using the Shapley value, a well-known concept from game theory that became popular in the context of Explainable AI. In addition to outlier explanation, this concept also relates to the recent formulation of cellwise outlyingness, where Shapley values can be employed to obtain variable contributions for outlying observations with respect to their “expected” position given the multivariate data structure. In combination with squared Mahalanobis distances, Shapley values can be calculated at a low numerical cost, making them even more attractive for outlier interpretation. Simulations and real-world data examples demonstrate the usefulness of these concepts.
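
The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes that the value of a coalition of variables is the squared Mahalanobis distance of the observation after every variable outside the coalition is reset to its mean; under that assumption the Shapley values of this quadratic form reduce to the elementwise product of (x - mu) and Sigma^{-1}(x - mu), which needs only one linear solve. The function names, the example data, and the brute-force check are illustrative and are not taken from the authors' software.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_closed_form(x, mu, sigma):
    """Per-variable contributions: (x - mu) * Sigma^{-1} (x - mu), elementwise."""
    d = x - mu
    return d * np.linalg.solve(sigma, d)


def coalition_value(x, mu, sigma_inv, subset):
    """Squared Mahalanobis distance with variables outside `subset` set to
    their mean (the value function assumed for this sketch)."""
    d = np.array([x[k] - mu[k] if k in subset else 0.0 for k in range(len(x))])
    return float(d @ sigma_inv @ d)


def shapley_brute_force(x, mu, sigma):
    """Textbook Shapley values by enumerating all coalitions (exponential cost)."""
    p = len(x)
    sigma_inv = np.linalg.inv(sigma)
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for s in combinations(others, size):
                gain = (coalition_value(x, mu, sigma_inv, s + (j,))
                        - coalition_value(x, mu, sigma_inv, s))
                phi[j] += weight * gain
    return phi


# Hypothetical example: three correlated variables, one observation.
mu = np.array([0.0, 0.0, 0.0])
sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])
x = np.array([3.0, -1.0, 0.5])

phi = shapley_closed_form(x, mu, sigma)
md2 = float((x - mu) @ np.linalg.solve(sigma, x - mu))

print("contributions:", phi)
print("sum of contributions:", phi.sum(), "squared distance:", md2)
```

The contributions sum to the squared Mahalanobis distance, and individual contributions may be negative when a variable's deviation is offset by its correlation with the others. In practice mu and sigma would be replaced by estimates of location and covariance; which estimator to use is left open here.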

Project title:

Automotive Intelligence for/at Connected Shared Mobility: 101007326-2 - AI4CSM (European Commission)

Research Areas:

Mathematical and Algorithmic Foundations: 100%

Science Branch:

1010 - Mathematics: 100%

License:

CC BY 4.0