Wiedner, W. (2024). Variational inference for Bayesian mixture models with a random number of components [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.107561
Bayesian inference; Bayesian estimation; Bayesian model; Mixture of finite mixtures; Bayesian mixture model; clustering; variational inference; variational Bayes; statistical signal processing; Gaussian mixture; probabilistic machine learning; exponential family
en
Abstract:
Finite mixture models, i.e., mixture models with a fixed number of components, have a long tradition in statistical modeling and are a well-established tool for exploring structure in complex data. In scenarios where the number of components is unknown, choosing an appropriate number of components is a crucial and often challenging modeling decision. To formalize this modeling decision in a Bayesian fashion, we investigate the mixture of finite mixtures (MFM) model. The MFM model extends the traditional finite mixture model to a Bayesian mixture model with a random number of components. The MFM model makes it possible to group data into meaningful subpopulations and estimate the model parameters without specifying the number of components a priori. We discuss equivalent representations of the MFM model, such as the stick-breaking representation, and relevant distributions, such as the exchangeable partition probability function. For Bayesian inference of the model parameters, we propose a computationally efficient coordinate-ascent variational inference (CAVI) algorithm for MFM models and provide detailed derivations of the corresponding update equations. Subsequently, we focus on mixtures of multivariate Gaussian component distributions with unknown means and known covariance matrices, resulting in a novel CAVI algorithm for the static mixture of finite Gaussian mixtures (MFGM) model. We evaluate the clustering performance of our CAVI algorithm on synthetic data generated according to a finite Gaussian mixture and observe high accuracy for suitably chosen hyperparameters. Furthermore, we apply an existing CAVI algorithm for Dirichlet process mixture (DPM) models, which are frequently used in scenarios with an unknown (but finite) number of components, to the same data. The comparison reveals that the proposed CAVI algorithm for static MFGMs outperforms the CAVI algorithm for DPMs, especially for large datasets.
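To illustrate the general flavor of coordinate-ascent variational inference for a Gaussian mixture with known covariance, the sketch below implements the standard CAVI updates for a *fixed*-K Gaussian mixture with unknown means, unit covariance, a zero-mean Gaussian prior on the means, and uniform weights. This is a textbook setting, not the thesis's MFM/MFGM algorithm (which additionally places a prior on the number of components); all function and variable names are illustrative assumptions.

```python
import numpy as np

def cavi_gmm(X, K, prior_var=100.0, n_iter=50, seed=0):
    """Standard CAVI for a finite Gaussian mixture with unknown means,
    known unit covariance, and uniform weights (illustrative sketch;
    not the thesis's MFM algorithm)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Variational factors: q(mu_k) = N(m_k, s2_k * I), q(z_i) = Categorical(phi_i)
    m = X[rng.choice(n, size=K, replace=False)]       # init means at random data points
    s2 = np.ones(K)
    for _ in range(n_iter):
        # Responsibility update: E_q[log p(x_i | mu_k)] up to constants
        logits = X @ m.T - 0.5 * (np.sum(m**2, axis=1) + d * s2)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        phi = np.exp(logits)
        phi /= phi.sum(axis=1, keepdims=True)
        # Component-mean update: Gaussian posterior with effective counts
        Nk = phi.sum(axis=0)
        s2 = 1.0 / (1.0 / prior_var + Nk)
        m = (phi.T @ X) * s2[:, None]
    return m, s2, phi

# Synthetic data from a finite Gaussian mixture, as in the evaluation setup
rng = np.random.default_rng(1)
true_means = np.array([[-5.0, -5.0], [0.0, 0.0], [5.0, 5.0]])
X = np.vstack([mu + rng.standard_normal((100, 2)) for mu in true_means])
m, s2, phi = cavi_gmm(X, K=3)
```

The key contrast with the MFM setting is that `K` here is fixed in advance; the thesis's contribution is precisely to avoid that choice by treating the number of components as random.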