<div class="csl-bib-body">
<div class="csl-entry">Iglesias Vázquez, F., Zseby, T., & Zimek, A. (2025). Parameterization-free clustering with sparse data observers. <i>Information Systems</i>, <i>133</i>, Article 102562. https://doi.org/10.1016/j.is.2025.102562</div>
</div>
-
dc.identifier.issn
0306-4379
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/216068
-
dc.description.abstract
Given a set of data points, clustering serves to discover groups based on pairwise similarities and the shapes drawn by the data in the feature space. In other words, it is a tool to describe data and reveal their intrinsic nature in terms of patterns or groups. In this paper, we review the methodology of clustering when used to explore a priori unknown data, i.e., we do not know how data spaces are manipulated, how algorithms are tuned, and how results are validated. Under this practical approach, we examine the advantages of SDOclust, a clustering method that stands out for its simplicity, lightness, no need for parameterization and not being subject to traditional clustering limitations. We test SDOclust and main established alternatives (HDBSCAN, k-means--, Fuzzy C-means, Hierarchical Clustering, CLASSIX, and N2D Deep Clustering) by extensive experimentation with more than 200 datasets, both real and synthetic, that have been collected from the literature on evaluation and represent different data analysis challenges. We submit only SDOclust to unfavorable testing conditions by denying it a parameter tuning phase. Nevertheless, its overall performance is excellent and positions it as one of the best general-purpose alternatives.
With deep clustering as the consolidation of a new paradigm, trends in clustering consist mainly in projecting data into spaces that are easier to dissect. Therefore, in cases where the original space does not show clustering-friendly structures and when we can assume transformation costs, SDOclust easily adapts and is a most natural choice to perform the partitioning task.
en
dc.language.iso
en
-
dc.publisher
PERGAMON-ELSEVIER SCIENCE LTD
-
dc.relation.ispartof
Information Systems
-
dc.subject
Clustering
en
dc.subject
Sparse data observers
en
dc.subject
Unsupervised learning
en
dc.subject
Data Analysis
en
dc.title
Parameterization-free clustering with sparse data observers
en
dc.type
Article
en
dc.type
Artikel
de
dc.contributor.affiliation
University of Southern Denmark, Denmark
-
dc.type.category
Original Research Article
-
tuw.container.volume
133
-
tuw.journal.peerreviewed
true
-
tuw.peerreviewed
true
-
wb.publication.intCoWork
International Co-publication
-
tuw.publication.invited
invited
-
tuw.researchTopic.id
C4
-
tuw.researchTopic.id
I2
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Mathematical and Algorithmic Foundations
-
tuw.researchTopic.name
Computer Engineering and Software-Intensive Systems