Title: Multi-class clustering with cluster label redundancy
Other Titles: Mehrklassen-Clustering mit Clustermarken-Redundanz
Language: English
Authors: Rezaei, Iman 
Qualification level: Diploma
Advisor: Matz, Gerald  
Assisting Advisor: Dittrich, Thomas 
Issue Date: 2020
Number of Pages: 58
Qualification level: Diploma
Big data analysis has attracted enormous attention during the last decade. In this field, graph signal processing is widely used in order to analyze data on graphs using a signal processing perspective. One of the key problems in this area is clustering, i.e., partitioning the data in homogeneous subgroups (clusters, communities) corresponding to subsets of nodes in the graph. The nodes within a cluster usually tend to be connected by edges that represent similarity relations. The ultimate goal in the clustering problem is to keep the number of nodes that are attributed to a wrong cluster as low as possible. Semi-supervised clustering is a special case of this problem in which some of the cluster labels are assumed to be known in advance and this prior knowledge is exploited to label (cluster) the rest of the data. In this work, we explore the use of error correcting codes in the context of semi-supervised multi-class clustering problems. The key idea here is to employ error-correcting codes for the cluster (class) labels in connection with various clustering schemes and to use the decoder of the error correcting code in order to detect and correct miss-classified data. The proposed algorithms for clustering with error-correcting code labels are studied experimentally in several different scenarios (graph size and topology, number and size of clusters). The performance of these methods is compared to conventional semi-supervised spectral clustering with one-hot encoded labels. The results show that with some modifications of the codeword structure of one-hot encoding, our error correcting approach outperforms existing clustering methods.
Keywords: graph signal processing; clustering
URI: https://doi.org/10.34726/hss.2020.84714
DOI: 10.34726/hss.2020.84714
Library ID: AC16071778
Organisation: E389 - Telecommunications 
Publication Type: Thesis
Appears in Collections:Thesis

Files in this item:

Show full item record

Page view(s)

checked on May 12, 2021


checked on May 12, 2021

Google ScholarTM


Items in reposiTUm are protected by copyright, with all rights reserved, unless otherwise indicated.