Ghalebi, E. (2020). Modeling and analysis of time-evolving sparse networks [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.81081
Information Cascades; Nonparametric Bayesian; Dynamic Network; Generative Model; Compressive Sensing
en
Abstract:
An important problem that arises in many applications of large networks is using observational data to infer the interactions (i.e., edges) between individuals (or vertices) which leads to complex collective behavior. In our work, we focus on time-evolving networks where basic assumptions about the shape or size of the underlying network topology do not hold. As the availability and importance of temporal interaction data such as email communication increases, it becomes increasingly important to not only analyze their data structure but also to generate a model to capture their properties. When analyzing these kinds of networks, we observe interactions between a set of entities and we wish to extract informative representations that are useful for making predictions about the entities and their relationships. We then develop generative models that explain the probabilistic distributions governing the dynamic networks, fit such models to real networks, and use them to generate realistic graphs. A central focus of this thesis is on applications where the edges of their dynamic network might not be observed, but instead we can observe the dynamics of stochastic cascading processes (e.g., information diffusion, virus propagation) occurring over the unobserved network. Further, we assume that the inferred edges construct a multigraph there might be multiple edges between two entities in a multigraph. We generate a probabilistic model for such data, where a tree construction phase is proposed to extract the most probable edges of the network. Using such a model allows us to infer the network given observations from the stochastic cascading process. Another important aspect of our research is the study of sparse real networks. most realworld multigraphs, such as email interaction datasets, are typically sparse in practice and they exhibit a modular structure with the distributions that evolve over time. We take the advantage of nonparametric models for interaction multigraphs which combines the sparsity of edge-exchangeable multigraphs coupled with the dynamic clustering patterns that tend to reinforce recent behavioral patterns. Finally, the problem of identifying central entities from the information point of view in a network is addressed by considering a prominent centrality measure in the networks. In addition to providing algorithms and theoretical analyses, we present extensive empirical evaluation of our approaches on several synthetic and real-world graphs.