<div class="csl-bib-body">
<div class="csl-entry">Kalodikis, D. M. (2024). <i>Signature graphs in the context of compositional data</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.119449</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2024.119449
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/197367
-
dc.description
Title differs from the original, following the author's translation
-
dc.description.abstract
In recent years, graph signal processing has become a rich toolbox for treating data living on irregular domains. Graphs can be used to model pairwise relationships and thus provide high flexibility in modelling the structure of various problems. Many proposals have been made to extend the traditional concept of graphs in order to capture a problem-inherent structure even more accurately. Naturally, each of these graph classes requires a suitably adapted mathematical framework. One novel graph class, recently proposed by Dittrich and Matz, is that of so-called signature graphs. These graphs model relationships in data by a non-negative scalar and a vector of signs, capturing an overall distance/correlation in the data by the scalar part, as well as relationships of different features in the data, each described by one sign in the sign vector. This model was shown to provide an advantage over ordinary weighted signed graphs in clustering tasks. Recently, it was discovered that the usefulness of signature graphs can be leveraged by combining them with involutions that describe symmetries in the data at hand. In this thesis, we develop a framework to use signature graphs with compositional data, that is, datasets in which each datapoint describes a composition, e.g., the chemical compounds of a sample. Since only proportions matter for such data, the statistical treatment of compositional data should be invariant to scaling. Therefore, traditional methods based on Euclidean geometry cannot be applied for a meaningful analysis.
Aitchison laid the foundations of compositional data analysis by defining a new geometry (Aitchison geometry) which respects the principle of scale invariance, among other advantageous properties. After giving a more thorough introduction to the fundamentals of both graph signal processing and compositional data, we introduce novel linear and affine involutions in compositional data and propose a new type of transform to the Euclidean vector space, which allows a convenient description of the involutions in question. These involutions are parametrised, which makes them flexible in adapting to the considered data, but also necessitates knowledge of these parameters for meaningful application. Thus, we proceed by developing methods to estimate the parameters in two scenarios: First, we assume that pairwise relations of a few datapoints are known a priori, i.e., the signature that is later characterised by the signature graph. Based on this assumption, we develop an involution estimator, which is ultimately targeted at clustering applications. Then, we propose a method following the concepts of blind source separation, which relies on prior information about the existence and non-existence of statistical correlations between datapoints, targeted at interpolation tasks in a scenario where we have a good understanding of the problem's topology. Based on the identified involution, we propose a method for learning a signature graph from compositional data. Two equivalent formulations are stated and rated according to their computational costs. We then proceed to describe the clustering of the learned graph, going into the peculiarities of signature graphs and the concept of balancedness in signature graphs. A numerical study demonstrates the advantage of using signature graphs over ordinary graphs as a basis for classification. Finally, we deal with the problem of reconstructing graph signals on balanced signature graphs from incomplete observations, i.e., interpolation.
We start out by elucidating how the edge weights of a signature graph can be learned from correlations in observed data. Furthermore, we discuss the issue of balancing an unbalanced signature graph, before we present methods for bandlimited and Laplacian reconstruction on signature graphs and show how they can be simplified if the graph is balanced. Numerical experiments confirm the usefulness of our proposed methods.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
graph signal processing
en
dc.subject
compositional data
en
dc.subject
clustering
en
dc.title
Signature graphs in the context of compositional data