Ganian, R., Hamm, T., Korchemna, V., Okrasa, K., & Simonov, K. (2022). The Complexity of k-Means Clustering when Little is Known. In Proceedings of the 39th International Conference on Machine Learning (pp. 6960–6987). https://doi.org/10.34726/3070
E192-01 - Forschungsbereich Algorithms and Complexity
-
Erschienen in:
Proceedings of the 39th International Conference on Machine Learning
-
Band:
162
-
Datum (veröffentlicht):
2022
-
Veranstaltungsname:
39th International Conference on Machine Learning
en
Veranstaltungszeitraum:
17-Jul-2022 - 23-Jul-2022
-
Veranstaltungsort:
Baltimore, Vereinigte Staaten von Amerika
-
Umfang:
28
-
Peer Reviewed:
Ja
-
Keywords:
nfpc; ParAI
en
Abstract:
In the area of data analysis and arguably even in machine learning as a whole, few approaches have been as impactful as the classical k-means clustering. Here, we study the complexity of k-means clustering in settings where most of the data is not known or simply irrelevant. To obtain a more fine-grained understanding of the tractability of this clustering problem, we apply the parameterized complexity paradigm and obtain three new algorithms for k-means clustering of incomplete data: one for the clustering of bounded-domain (i.e., integer) data, and two incomparable algorithms that target real-valued data. Our approach is based on exploiting structural properties of a graphical encoding of the missing entries, and we show that tractability can be achieved using significantly less restrictive parameterizations than in the complementary case of few missing entries.
en
Projekttitel:
Parameterisierte Analyse in der Künstlichen Intelligenz: Y1329-N (Fonds zur Förderung der wissenschaftlichen Forschung (FWF)) New Frontiers for Parameterized Complexity: P31336-N35 (Fonds zur Förderung der wissenschaftlichen Forschung (FWF))